Enhance documentation
- Moved "Model integration" section to separate page.
- Updated README to include a new "Model integration" link and improved the structure of the "Supported models" section.
- Introduced environment variable examples in the CLI documentation for better user guidance.
- Removed unused dotenv variable handling from the CLI utils.
- Minor adjustments to the evaluation script and example files for consistency.

These changes aim to improve user experience and clarity in the documentation and CLI usage.
krasserm committed Jan 23, 2025
1 parent 46aa683 commit 7993f3c
Showing 10 changed files with 165 additions and 128 deletions.
14 changes: 6 additions & 8 deletions README.md
@@ -15,11 +15,9 @@ A lightweight library for code-action based agents.

- [Introduction](#introduction)
- [Key capabilities](#key-capabilities)
- [Supported models](#supported-models)
- [Quickstart](#quickstart)
- [Evaluation](#evaluation)

The `freeact` documentation is available [here](https://gradion-ai.github.io/freeact/).
- [Supported models](#supported-models)

## Introduction

@@ -35,10 +33,6 @@ The library's architecture emphasizes extensibility and transparency, avoiding t

`freeact` executes all code actions within [`ipybox`](https://gradion-ai.github.io/ipybox/), a secure execution environment built on IPython and Docker that can also be deployed locally. This ensures safe execution of dynamically generated code while maintaining full access to the Python ecosystem. Combined with its lightweight and extensible architecture, `freeact` provides a robust foundation for building adaptable AI agents that can tackle real-world challenges requiring dynamic problem-solving approaches.

## Supported models

In addition to the models we [evaluated](#evaluation), `freeact` also supports any model from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python), including open models deployed locally on [ollama](https://ollama.com/) or [TGI](https://huggingface.co/docs/text-generation-inference/index), for example. See [Model integration](https://gradion-ai.github.io/freeact/models/#model-integration) for details.

## Quickstart

Install `freeact` using pip:
@@ -57,7 +51,7 @@ ANTHROPIC_API_KEY=...
GOOGLE_API_KEY=...
```

Launch a `freeact` agent with generative Google Search skill using the CLI
Launch a `freeact` agent with generative Google Search skill using the [CLI](https://gradion-ai.github.io/freeact/cli/):

```bash
python -m freeact.cli \
@@ -117,3 +111,7 @@ When comparing our results with smolagents using Claude 3.5 Sonnet on [m-ric/age
[<img src="docs/eval/eval-plot-comparison.png" alt="Performance comparison" width="60%">](docs/eval/eval-plot-comparison.png)

Interestingly, these results were achieved using zero-shot prompting in `freeact`, while the smolagents implementation utilizes few-shot prompting. To ensure a fair comparison, we employed identical evaluation protocols and tools. You can find all evaluation details [here](evaluation).

## Supported models

In addition to the models we [evaluated](#evaluation), `freeact` also supports the [integration](https://gradion-ai.github.io/freeact/integration/) of new models from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python), including open models deployed locally with [ollama](https://ollama.com/) or [TGI](https://huggingface.co/docs/text-generation-inference/index).
68 changes: 68 additions & 0 deletions docs/cli.md
@@ -18,3 +18,71 @@ The `freeact` CLI supports entering messages that span multiple lines in two way
To submit a multiline message, simply press `Enter`.

![Multiline input](img/multiline.png)

## Environment variables

The CLI reads environment variables from a `.env` file in the current directory and passes them to the [execution environment](installation.md#execution-environment). API keys required for an agent's code action model must be defined in the `.env` file, passed as command-line arguments, or set directly as variables in the shell.
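The third option, setting the variables directly in the shell before launching the CLI, can be sketched as follows (the key values are placeholders):

```bash
# Placeholder values, substitute your real keys
export ANTHROPIC_API_KEY=your-anthropic-api-key
export GOOGLE_API_KEY=your-google-api-key
```

Variables exported this way are inherited by a subsequent `python -m freeact.cli` invocation in the same shell session.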

### Example 1

The [quickstart](quickstart.md) example requires `ANTHROPIC_API_KEY` and `GOOGLE_API_KEY` to be defined in a `.env` file in the current directory. The `ANTHROPIC_API_KEY` is needed for the `claude-3-5-sonnet-20241022` code action model, while the `GOOGLE_API_KEY` is required for the `freeact_skills.search.google.stream.api` skill in the execution environment. Given a `.env` file with the following content:

```env title=".env"
# Required for Claude 3.5 Sonnet
ANTHROPIC_API_KEY=your-anthropic-api-key
# Required for generative Google Search via Gemini 2
GOOGLE_API_KEY=your-google-api-key
```

the following command will launch an agent with `claude-3-5-sonnet-20241022` as the code action model, configured with a generative Google Search skill implemented by the module `freeact_skills.search.google.stream.api`:

```bash
python -m freeact.cli \
--model-name=claude-3-5-sonnet-20241022 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```

The API key can alternatively be passed as a command-line argument:

```bash
python -m freeact.cli \
--model-name=claude-3-5-sonnet-20241022 \
--api-key=your-anthropic-api-key \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```

### Example 2

To use models from other providers, such as [accounts/fireworks/models/deepseek-v3](https://fireworks.ai/models/fireworks/deepseek-v3) hosted by [Fireworks](https://fireworks.ai/), you can either provide all required environment variables in a `.env` file:

```env title=".env"
# Required for DeepSeek V3 hosted by Fireworks
DEEPSEEK_BASE_URL=https://api.fireworks.ai/inference/v1
DEEPSEEK_API_KEY=your-deepseek-api-key
# Required for generative Google Search via Gemini 2
GOOGLE_API_KEY=your-google-api-key
```

and launch the agent with

```bash
python -m freeact.cli \
--model-name=accounts/fireworks/models/deepseek-v3 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```

or pass the base URL and API key directly as command-line arguments:

```bash
python -m freeact.cli \
--model-name=accounts/fireworks/models/deepseek-v3 \
--base-url=https://api.fireworks.ai/inference/v1 \
--api-key=your-deepseek-api-key \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```
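Conceptually, the `.env` handling in both examples boils down to reading key-value pairs from a file and passing them through as a dictionary. The stdlib-only sketch below illustrates the idea; `parse_dotenv` is a hypothetical helper written for illustration, while the actual CLI relies on the python-dotenv package:

```python
def parse_dotenv(text: str) -> dict[str, str]:
    """Minimal .env parser: skips blank lines and comments,
    splits each remaining line on the first '='."""
    env: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


dotenv_text = """\
# Required for DeepSeek V3 hosted by Fireworks
DEEPSEEK_BASE_URL=https://api.fireworks.ai/inference/v1
DEEPSEEK_API_KEY=your-deepseek-api-key
"""
print(parse_dotenv(dotenv_text))
```

The resulting dictionary is what ends up in the execution environment, so any key defined in the file becomes visible to skills running there.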
10 changes: 5 additions & 5 deletions docs/index.md
@@ -19,16 +19,16 @@ The library's architecture emphasizes extensibility and transparency, avoiding t
## Next steps

- [Quickstart](quickstart.md) - Launch your first `freeact` agent and interact with it on the command line
- [Installation](installation.md) - Installation instructions and configuration of execution environments
- [Building blocks](blocks.md) - Learn about the essential components of a `freeact` agent system
- [Tutorials](tutorials/index.md) - Tutorials demonstrating the `freeact` building blocks

## Further reading

- [Installation](installation.md) - Detailed instructions for building custom execution environments
- [Command line](cli.md) - Minimalistic command-line interface for running `freeact` agents
- [Supported models](models.md) - Overview of evaluated models and how to [integrate new ones](models.md#model-integration).
- [Streaming protocol](streaming.md) - Protocol for streaming model responses and execution results
- [Evaluation results](evaluation.md) - Evaluation of `freeact` performance incl. a smolagents comparison
- [Command line interface](cli.md) - Guide to using `freeact` agents on the command line
- [Supported models](models.md) - Overview of models [evaluated](evaluation.md) with `freeact`
- [Model integration](integration.md) - Guidelines for integrating new models into `freeact`
- [Streaming protocol](streaming.md) - Specification for streaming model responses and execution results

## Status

80 changes: 80 additions & 0 deletions docs/integration.md
@@ -0,0 +1,80 @@
# Model integration

`freeact` provides both a low-level and high-level API for integrating new models.

- The [low-level API](api/model.md) defines the `CodeActModel` interface and related abstractions
- The [high-level API](api/generic.md) provides a `GenericModel` class based on the [OpenAI Python SDK](https://github.com/openai/openai-python)

## Low-level API

The low-level API is not further described here. For implementation examples, see the [`freeact.model.claude`](https://github.com/gradion-ai/freeact/tree/main/freeact/model/claude) or [`freeact.model.gemini`](https://github.com/gradion-ai/freeact/tree/main/freeact/model/gemini) packages.

## High-level API

The high-level API supports models from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python). To use a model, you need to provide prompt templates that guide it to generate code actions. You can reuse existing templates or create your own, and then either instantiate `GenericModel` directly or subclass it.

The following subsections demonstrate this using Qwen 2.5 Coder 32B Instruct as an example, showing how to use it both via the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index) and locally with [ollama](https://ollama.com/).

### Prompt templates

Start with model-specific prompt templates that guide Qwen 2.5 Coder Instruct models to generate code actions. For example:

```python title="freeact/model/qwen/prompt.py"
--8<-- "freeact/model/qwen/prompt.py"
```

!!! Note

These prompt templates are still experimental. They work reasonably well for larger Qwen 2.5 Coder models, but need optimization for smaller ones.

!!! Tip

While tested with Qwen 2.5 Coder Instruct, these prompt templates can also serve as a good starting point for other models (as we did for DeepSeek V3, for example).

### Model definition

Although we could instantiate `GenericModel` directly with these prompt templates, `freeact` provides a `QwenCoder` subclass for convenience:

```python title="freeact/model/qwen/model.py"
--8<-- "freeact/model/qwen/model.py"
```

### Model usage

Here's a Python example that uses `QwenCoder` as the code action model in a `freeact` agent. The model is accessed via the Hugging Face Inference API:

```python title="freeact/examples/qwen.py"
--8<-- "freeact/examples/qwen.py"
```

1. Your Hugging Face [user access token](https://huggingface.co/docs/hub/en/security-tokens)

2. Interact with the agent via a CLI

Run it with:

```bash
HF_TOKEN=<your-huggingface-token> python -m freeact.examples.qwen
```

Alternatively, use the default `freeact` [CLI](cli.md) directly:

```bash
python -m freeact.cli \
--model-name=Qwen/Qwen2.5-Coder-32B-Instruct \
--base-url=https://api-inference.huggingface.co/v1/ \
--api-key=<your-huggingface-token> \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```

To use the same model deployed locally with [ollama](https://ollama.com/), modify `--model-name`, `--base-url`, and `--api-key` to match your local deployment:

```bash
python -m freeact.cli \
--model-name=qwen2.5-coder:32b-instruct-fp16 \
--base-url=http://localhost:11434/v1 \
--api-key=ollama \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```
83 changes: 3 additions & 80 deletions docs/models.md
@@ -1,9 +1,5 @@
# Supported models

In addition to the models we evaluated, `freeact` also supports any model from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python), including open models deployed locally on [ollama](https://ollama.com/) or [TGI](https://huggingface.co/docs/text-generation-inference/index), for example. See [Model integration](#model-integration) for details.

## Evaluated models

The following models have been [evaluated](evaluation.md) with `freeact`:

- Claude 3.5 Sonnet (20241022)
@@ -12,85 +8,12 @@ The following models have been [evaluated](evaluation.md) with `freeact`:
- Qwen 2.5 Coder 32B Instruct
- DeepSeek V3

For these models, `freeact` provides model-specific prompt templates.

!!! Tip

For best performance, we recommend using Claude 3.5 Sonnet. Support for Gemini 2.0 Flash, Qwen 2.5 Coder and DeepSeek V3 is still experimental. The Qwen 2.5 Coder integration is described in [Model integration](#model-integration). The DeepSeek V3 integration follows the same pattern using a custom model class.

## Model integration

`freeact` provides both a low-level and high-level API for integrating new models.

- The [low-level API](api/model.md) defines the `CodeActModel` interface and related abstractions
- The [high-level API](api/generic.md) provides a `GenericModel` implementation of `CodeActModel` using the [OpenAI Python SDK](https://github.com/openai/openai-python)

### Low-level API

The low-level API is not further described here. For implementation examples see packages [claude](https://github.com/gradion-ai/freeact/tree/main/freeact/model/claude) or [gemini](https://github.com/gradion-ai/freeact/tree/main/freeact/model/gemini).

### High-level API

The high-level API support usage of any model from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python), including models deployed locally on [ollama](https://ollama.com/) or [TGI](https://huggingface.co/docs/text-generation-inference/index), for example. This is shown in the following for Qwen 2.5 Coder 32B Instruct.

#### Prompt templates

Start with model-specific prompt templates that guide Qwen 2.5 Coder Instruct models to generate code actions:

```python title="freeact/model/qwen/prompt.py"
--8<-- "freeact/model/qwen/prompt.py"
```
For these models, `freeact` provides model-specific prompt templates.

!!! Note

These prompt templates are still experimental.
In addition to the models we evaluated, `freeact` also supports the [integration](integration.md) of new models from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python), including open models deployed locally with [ollama](https://ollama.com/) or [TGI](https://huggingface.co/docs/text-generation-inference/index).

!!! Tip

While tested with Qwen 2.5 Coder Instruct, these prompt templates can also serve as a good starting point for other models (as we did for DeepSeek V3, for example).

#### Model definition

Although we could instantiate `GenericModel` directly with these prompt templates, `freeact` provides a `QwenCoder` subclass for convenience.

```python title="freeact/model/qwen/model.py"
--8<-- "freeact/model/qwen/model.py"
```

#### Model usage

Here's a Python example that uses `QwenCoder` in an interactive CLI:

```python title="freeact/examples/qwen.py"
--8<-- "freeact/examples/qwen.py"
```

1. Your Hugging Face [user access token](https://huggingface.co/docs/hub/en/security-tokens)

Run it with:

```bash
HF_TOKEN=<your-huggingface-token> python -m freeact.examples.qwen
```

Or use the `freeact` CLI directly:

```bash
python -m freeact.cli \
--model-name=Qwen/Qwen2.5-Coder-32B-Instruct \
--base-url=https://api-inference.huggingface.co/v1/ \
--api-key=<your-huggingface-token> \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```

For using the same model deployed locally on [ollama](https://ollama.com/), for example, change `--model-name`, `--base-url` and `--api-key` to match your local deployment:

```bash
python -m freeact.cli \
--model-name=qwen2.5-coder:32b-instruct-fp16 \
--base-url=http://localhost:11434/v1 \
--api-key=ollama \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```
For best performance, we recommend Claude 3.5 Sonnet, with DeepSeek V3 as a close second. Support for Gemini 2.0 Flash, Qwen 2.5 Coder, and DeepSeek V3 remains experimental as we continue to optimize their prompt templates.
2 changes: 1 addition & 1 deletion docs/tutorials/basics.md
@@ -69,7 +69,7 @@ The Python example above is part of the `freeact` package and can be run with:
python -m freeact.examples.basics
```

For formatted and colored console output, as shown in the [example conversation](#example-conversation), you can use the `freeact` CLI:
For formatted and colored console output, as shown in the [example conversation](#example-conversation), you can use the `freeact` [CLI](../cli.md):

```shell
--8<-- "freeact/examples/commands.txt:cli-basics-claude"
2 changes: 0 additions & 2 deletions evaluation/evaluate.py
@@ -25,7 +25,6 @@
QwenCoder,
execution_environment,
)
from freeact.cli.utils import dotenv_variables

app = typer.Typer()

@@ -224,7 +223,6 @@ async def run_agent(
executor_key="agent-evaluation",
ipybox_tag="ghcr.io/gradion-ai/ipybox:eval",
log_file=Path("logs", "agent-evaluation.log"),
env_vars=dotenv_variables(),
) as env:
skill_sources = await env.executor.get_module_sources(
["google_search.api", "visit_webpage.api"],
31 changes: 0 additions & 31 deletions freeact/cli/utils.py
@@ -1,11 +1,9 @@
import platform
from contextlib import asynccontextmanager
from pathlib import Path
from typing import Dict

import aiofiles
import prompt_toolkit
from dotenv import dotenv_values
from PIL import Image
from prompt_toolkit.key_binding import KeyBindings
from rich.console import Console
@@ -19,36 +17,7 @@
CodeActAgentTurn,
CodeActModelTurn,
CodeExecution,
CodeExecutionContainer,
CodeExecutor,
)
from freeact.logger import Logger


def dotenv_variables() -> dict[str, str]:
return {k: v for k, v in dotenv_values().items() if v is not None}


@asynccontextmanager
async def execution_environment(
executor_key: str = "default",
ipybox_tag: str = "ghcr.io/gradion-ai/ipybox:minimal",
env_vars: dict[str, str] = dotenv_variables(),
workspace_path: Path | str = Path("workspace"),
log_file: Path | str = Path("logs", "agent.log"),
):
async with CodeExecutionContainer(
tag=ipybox_tag,
env=env_vars,
workspace_path=workspace_path,
) as container:
async with CodeExecutor(
key=executor_key,
port=container.port,
workspace=container.workspace,
) as executor:
async with Logger(file=log_file) as logger:
yield executor, logger


async def stream_conversation(agent: CodeActAgent, console: Console, show_token_usage: bool = False, **kwargs):
2 changes: 1 addition & 1 deletion freeact/examples/qwen.py
@@ -23,7 +23,7 @@ async def main():
)

agent = CodeActAgent(model=model, executor=env.executor)
await stream_conversation(agent, console=Console())
await stream_conversation(agent, console=Console()) # (2)!


if __name__ == "__main__":