diff --git a/README.md b/README.md
index e34d0c2..2308181 100644
--- a/README.md
+++ b/README.md
@@ -15,11 +15,9 @@ A lightweight library for code-action based agents.
 
 - [Introduction](#introduction)
 - [Key capabilities](#key-capabilities)
-- [Supported models](#supported-models)
 - [Quickstart](#quickstart)
 - [Evaluation](#evaluation)
-
-The `freeact` documentation is available [here](https://gradion-ai.github.io/freeact/).
+- [Supported models](#supported-models)
 
 ## Introduction
 
@@ -35,10 +33,6 @@ The library's architecture emphasizes extensibility and transparency, avoiding t
 
 `freeact` executes all code actions within [`ipybox`](https://gradion-ai.github.io/ipybox/), a secure execution environment built on IPython and Docker that can also be deployed locally. This ensures safe execution of dynamically generated code while maintaining full access to the Python ecosystem. Combined with its lightweight and extensible architecture, `freeact` provides a robust foundation for building adaptable AI agents that can tackle real-world challenges requiring dynamic problem-solving approaches.
 
-## Supported models
-
-In addition to the models we [evaluated](#evaluation), `freeact` also supports any model from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python), including open models deployed locally on [ollama](https://ollama.com/) or [TGI](https://huggingface.co/docs/text-generation-inference/index), for example. See [Model integration](https://gradion-ai.github.io/freeact/models/#model-integration) for details.
-
 ## Quickstart
 
 Install `freeact` using pip:
@@ -57,7 +51,7 @@ ANTHROPIC_API_KEY=...
 GOOGLE_API_KEY=...
 ```
 
-Launch a `freeact` agent with generative Google Search skill using the CLI
+Launch a `freeact` agent with generative Google Search skill using the [CLI](https://gradion-ai.github.io/freeact/cli/):
 
 ```bash
 python -m freeact.cli \
@@ -117,3 +111,7 @@ When comparing our results with smolagents using Claude 3.5 Sonnet on [m-ric/age
 [Performance comparison](docs/eval/eval-plot-comparison.png)
 
 Interestingly, these results were achieved using zero-shot prompting in `freeact`, while the smolagents implementation utilizes few-shot prompting. To ensure a fair comparison, we employed identical evaluation protocols and tools. You can find all evaluation details [here](evaluation).
+
+## Supported models
+
+In addition to the models we [evaluated](#evaluation), `freeact` also supports the [integration](https://gradion-ai.github.io/freeact/integration/) of new models from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python), including open models deployed locally with [ollama](https://ollama.com/) or [TGI](https://huggingface.co/docs/text-generation-inference/index), for example.
diff --git a/docs/cli.md b/docs/cli.md
index bda7889..934d966 100644
--- a/docs/cli.md
+++ b/docs/cli.md
@@ -18,3 +18,71 @@ The `freeact` CLI supports entering messages that span multiple lines in two way
 To submit a multiline message, simply press `Enter`.
 
 ![Multiline input](img/multiline.png)
+
+## Environment variables
+
+The CLI reads environment variables from a `.env` file in the current directory and passes them to the [execution environment](installation.md#execution-environment). API keys required for an agent's code action model must be defined in the `.env` file, passed as command-line arguments, or set directly as shell variables.
+
+### Example 1
+
+The [quickstart](quickstart.md) example requires `ANTHROPIC_API_KEY` and `GOOGLE_API_KEY` to be defined in a `.env` file in the current directory. The `ANTHROPIC_API_KEY` is needed for the `claude-3-5-sonnet-20241022` code action model, while the `GOOGLE_API_KEY` is required for the `freeact_skills.search.google.stream.api` skill in the execution environment. Given a `.env` file with the following content:
+
+```env title=".env"
+# Required for Claude 3.5 Sonnet
+ANTHROPIC_API_KEY=your-anthropic-api-key
+
+# Required for generative Google Search via Gemini 2
+GOOGLE_API_KEY=your-google-api-key
+```
+
+the following command launches an agent with `claude-3-5-sonnet-20241022` as the code action model, configured with a generative Google Search skill implemented by the module `freeact_skills.search.google.stream.api`:
+
+```bash
+python -m freeact.cli \
+  --model-name=claude-3-5-sonnet-20241022 \
+  --ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
+  --skill-modules=freeact_skills.search.google.stream.api
+```
+
+The API key can alternatively be passed as a command-line argument:
+
+```bash
+python -m freeact.cli \
+  --model-name=claude-3-5-sonnet-20241022 \
+  --api-key=your-anthropic-api-key \
+  --ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
+  --skill-modules=freeact_skills.search.google.stream.api
+```
+
+### Example 2
+
+To use models from other providers, such as [accounts/fireworks/models/deepseek-v3](https://fireworks.ai/models/fireworks/deepseek-v3) hosted by [Fireworks](https://fireworks.ai/), you can either provide all required environment variables in a `.env` file:
+
+```env title=".env"
+# Required for DeepSeek V3 hosted by Fireworks
+DEEPSEEK_BASE_URL=https://api.fireworks.ai/inference/v1
+DEEPSEEK_API_KEY=your-deepseek-api-key
+
+# Required for generative Google Search via Gemini 2
+GOOGLE_API_KEY=your-google-api-key
+```
+
+and launch the agent with:
+
+```bash
+python -m freeact.cli \
+  --model-name=accounts/fireworks/models/deepseek-v3 \
+  --ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
+  --skill-modules=freeact_skills.search.google.stream.api
+```
+
+or pass the base URL and API key directly as command-line arguments:
+
+```bash
+python -m freeact.cli \
+  --model-name=accounts/fireworks/models/deepseek-v3 \
+  --base-url=https://api.fireworks.ai/inference/v1 \
+  --api-key=your-deepseek-api-key \
+  --ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
+  --skill-modules=freeact_skills.search.google.stream.api
+```
diff --git a/docs/index.md b/docs/index.md
index c2e6aef..a78dfa5 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -19,16 +19,16 @@ The library's architecture emphasizes extensibility and transparency, avoiding t
 
 ## Next steps
 
 - [Quickstart](quickstart.md) - Launch your first `freeact` agent and interact with it on the command line
+- [Installation](installation.md) - Installation instructions and configuration of execution environments
 - [Building blocks](blocks.md) - Learn about the essential components of a `freeact` agent system
 - [Tutorials](tutorials/index.md) - Tutorials demonstrating the `freeact` building blocks
 
 ## Further reading
 
-- [Installation](installation.md) - Detailed instructions for building custom execution environments
-- [Command line](cli.md) - Minimalistic command-line interface for running `freeact` agents
-- [Supported models](models.md) - Overview of evaluated models and how to [integrate new ones](models.md#model-integration).
-- [Streaming protocol](streaming.md) - Protocol for streaming model responses and execution results
-- [Evaluation results](evaluation.md) - Evaluation of `freeact` performance incl. a smolagents comparison
+- [Command line interface](cli.md) - Guide to using `freeact` agents on the command line
+- [Supported models](models.md) - Overview of models [evaluated](evaluation.md) with `freeact`
+- [Model integration](integration.md) - Guidelines for integrating new models into `freeact`
+- [Streaming protocol](streaming.md) - Specification for streaming model responses and execution results
 
 ## Status
diff --git a/docs/integration.md b/docs/integration.md
new file mode 100644
index 0000000..fda6c6a
--- /dev/null
+++ b/docs/integration.md
@@ -0,0 +1,80 @@
+# Model integration
+
+`freeact` provides both a low-level and high-level API for integrating new models.
+
+- The [low-level API](api/model.md) defines the `CodeActModel` interface and related abstractions
+- The [high-level API](api/generic.md) provides a `GenericModel` class based on the [OpenAI Python SDK](https://github.com/openai/openai-python)
+
+### Low-level API
+
+The low-level API is not further described here. For implementation examples, see the [`freeact.model.claude`](https://github.com/gradion-ai/freeact/tree/main/freeact/model/claude) or [`freeact.model.gemini`](https://github.com/gradion-ai/freeact/tree/main/freeact/model/gemini) packages.
+
+### High-level API
+
+The high-level API supports models from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python). To use a model, you need to provide prompt templates that guide it to generate code actions. You can either reuse existing templates or create your own. Then, you can either create an instance of `GenericModel` or subclass it.
+
+The following subsections demonstrate this using Qwen 2.5 Coder 32B Instruct as an example, showing how to use it both via the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index) and locally with [ollama](https://ollama.com/).
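As a reviewer's note on what such code-action prompt templates typically look like, here is a minimal, self-contained sketch. The template wording and the `render_system_template` helper are illustrative assumptions, not the contents of freeact's actual `freeact/model/qwen/prompt.py`:

```python
# Illustrative sketch only: freeact's real templates live in
# freeact/model/qwen/prompt.py; names and wording here are hypothetical.
SYSTEM_TEMPLATE = """You are an agent that solves tasks by generating Python code actions.
Wrap each code action in a fenced Python code block.

You can import the following skill modules in your code actions:

{python_modules}
"""


def render_system_template(skill_sources: str) -> str:
    # Fill the placeholder with the skill module sources shown to the model.
    return SYSTEM_TEMPLATE.format(python_modules=skill_sources)


system_prompt = render_system_template("freeact_skills.search.google.stream.api")
```

The essential point is that the template injects the available skill sources into the system prompt, so the model knows which modules its generated code actions may import.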
+
+#### Prompt templates
+
+Start with model-specific prompt templates that guide Qwen 2.5 Coder Instruct models to generate code actions. For example:
+
+```python title="freeact/model/qwen/prompt.py"
+--8<-- "freeact/model/qwen/prompt.py"
+```
+
+!!! Note
+
+    These prompt templates are still experimental. They work reasonably well for larger Qwen 2.5 Coder models, but need optimization for smaller ones.
+
+!!! Tip
+
+    While tested with Qwen 2.5 Coder Instruct, these prompt templates can also serve as a good starting point for other models (as we did for DeepSeek V3, for example).
+
+#### Model definition
+
+Although we could instantiate `GenericModel` directly with these prompt templates, `freeact` provides a `QwenCoder` subclass for convenience:
+
+```python title="freeact/model/qwen/model.py"
+--8<-- "freeact/model/qwen/model.py"
+```
+
+#### Model usage
+
+Here's a Python example that uses `QwenCoder` as the code action model in a `freeact` agent. The model is accessed via the Hugging Face Inference API:
+
+```python title="freeact/examples/qwen.py"
+--8<-- "freeact/examples/qwen.py"
+```
+
+1. Your Hugging Face [user access token](https://huggingface.co/docs/hub/en/security-tokens)
+
+2. Interact with the agent via a CLI
+
+Run it with:
+
+```bash
+HF_TOKEN= python -m freeact.examples.qwen
+```
+
+Alternatively, use the default `freeact` [CLI](cli.md) directly:
+
+```bash
+python -m freeact.cli \
+  --model-name=Qwen/Qwen2.5-Coder-32B-Instruct \
+  --base-url=https://api-inference.huggingface.co/v1/ \
+  --api-key= \
+  --ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
+  --skill-modules=freeact_skills.search.google.stream.api
+```
+
+To use the same model deployed locally with [ollama](https://ollama.com/), modify `--model-name`, `--base-url` and `--api-key` to match your local deployment:
+
+```bash
+python -m freeact.cli \
+  --model-name=qwen2.5-coder:32b-instruct-fp16 \
+  --base-url=http://localhost:11434/v1 \
+  --api-key=ollama \
+  --ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
+  --skill-modules=freeact_skills.search.google.stream.api
+```
diff --git a/docs/models.md b/docs/models.md
index 8b53309..b707325 100644
--- a/docs/models.md
+++ b/docs/models.md
@@ -1,9 +1,5 @@
 # Supported models
 
-In addition to the models we evaluated, `freeact` also supports any model from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python), including open models deployed locally on [ollama](https://ollama.com/) or [TGI](https://huggingface.co/docs/text-generation-inference/index), for example. See [Model integration](#model-integration) for details.
-
-## Evaluated models
-
 The following models have been [evaluated](evaluation.md) with `freeact`:
 
 - Claude 3.5 Sonnet (20241022)
@@ -12,85 +8,12 @@ The following models have been [evaluated](evaluation.md) with `freeact`:
 - Qwen 2.5 Coder 32B Instruct
 - DeepSeek V3
 
-For these models, `freeact` provides model-specific prompt templates.
-
-!!! Tip
-
-    For best performance, we recommend using Claude 3.5 Sonnet. Support for Gemini 2.0 Flash, Qwen 2.5 Coder and DeepSeek V3 is still experimental. The Qwen 2.5 Coder integration is described in [Model integration](#model-integration). The DeepSeek V3 integration follows the same pattern using a custom model class.
-
-## Model integration
-
-`freeact` provides both a low-level and high-level API for integrating new models.
-
-- The [low-level API](api/model.md) defines the `CodeActModel` interface and related abstractions
-- The [high-level API](api/generic.md) provides a `GenericModel` implementation of `CodeActModel` using the [OpenAI Python SDK](https://github.com/openai/openai-python)
-
-### Low-level API
-
-The low-level API is not further described here. For implementation examples see packages [claude](https://github.com/gradion-ai/freeact/tree/main/freeact/model/claude) or [gemini](https://github.com/gradion-ai/freeact/tree/main/freeact/model/gemini).
-
-### High-level API
-
-The high-level API support usage of any model from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python), including models deployed locally on [ollama](https://ollama.com/) or [TGI](https://huggingface.co/docs/text-generation-inference/index), for example. This is shown in the following for Qwen 2.5 Coder 32B Instruct.
-
-#### Prompt templates
-
-Start with model-specific prompt templates that guide Qwen 2.5 Coder Instruct models to generate code actions:
-
-```python title="freeact/model/qwen/prompt.py"
---8<-- "freeact/model/qwen/prompt.py"
-```
+For these models, `freeact` provides model-specific prompt templates.
 
 !!! Note
 
-    These prompt templates are still experimental.
+    In addition to the models we evaluated, `freeact` also supports the [integration](integration.md) of new models from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python), including open models deployed locally with [ollama](https://ollama.com/) or [TGI](https://huggingface.co/docs/text-generation-inference/index), for example.
 
 !!! Tip
 
-    While tested with Qwen 2.5 Coder Instruct, these prompt templates can also serve as a good starting point for other models (as we did for DeepSeek V3, for example).
-
-#### Model definition
-
-Although we could instantiate `GenericModel` directly with these prompt templates, `freeact` provides a `QwenCoder` subclass for convenience.
-
-```python title="freeact/model/qwen/model.py"
---8<-- "freeact/model/qwen/model.py"
-```
-
-#### Model usage
-
-Here's a Python example that uses `QwenCoder` in an interactive CLI:
-
-```python title="freeact/examples/qwen.py"
---8<-- "freeact/examples/qwen.py"
-```
-
-1. Your Hugging Face [user access token](https://huggingface.co/docs/hub/en/security-tokens)
-
-Run it with:
-
-```bash
-HF_TOKEN= python -m freeact.examples.qwen
-```
-
-Or use the `freeact` CLI directly:
-
-```bash
-python -m freeact.cli \
-  --model-name=Qwen/Qwen2.5-Coder-32B-Instruct \
-  --base-url=https://api-inference.huggingface.co/v1/ \
-  --api-key= \
-  --ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
-  --skill-modules=freeact_skills.search.google.stream.api
-```
-
-For using the same model deployed locally on [ollama](https://ollama.com/), for example, change `--model-name`, `--base-url` and `--api-key` to match your local deployment:
-
-```bash
-python -m freeact.cli \
-  --model-name=qwen2.5-coder:32b-instruct-fp16 \
-  --base-url=http://localhost:11434/v1 \
-  --api-key=ollama \
-  --ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
-  --skill-modules=freeact_skills.search.google.stream.api
-```
+    For best performance, we recommend Claude 3.5 Sonnet, with DeepSeek V3 as a close second. Support for Gemini 2.0 Flash, Qwen 2.5 Coder, and DeepSeek V3 remains experimental as we continue to optimize their prompt templates.
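The CLI invocations throughout these docs differ only in `--model-name`, `--base-url`, and `--api-key`. As a reviewer's sketch, the provider settings used in the examples can be captured in a small lookup table. The `ProviderConfig` helper is hypothetical and not part of freeact, and the `OLLAMA_API_KEY` variable name is an assumption (ollama accepts any placeholder key):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProviderConfig:
    """Hypothetical helper bundling per-provider CLI settings; not freeact API."""

    model_name: str
    base_url: Optional[str]  # None means: use the provider's default endpoint
    api_key_env: str         # environment variable expected to hold the API key


# Settings mirroring the examples in these docs.
PROVIDERS = {
    "huggingface": ProviderConfig(
        model_name="Qwen/Qwen2.5-Coder-32B-Instruct",
        base_url="https://api-inference.huggingface.co/v1/",
        api_key_env="HF_TOKEN",
    ),
    "ollama": ProviderConfig(
        model_name="qwen2.5-coder:32b-instruct-fp16",
        base_url="http://localhost:11434/v1",
        api_key_env="OLLAMA_API_KEY",  # any placeholder key works for ollama
    ),
    "fireworks": ProviderConfig(
        model_name="accounts/fireworks/models/deepseek-v3",
        base_url="https://api.fireworks.ai/inference/v1",
        api_key_env="DEEPSEEK_API_KEY",
    ),
}
```

This mirrors how an OpenAI-SDK-compatible endpoint is fully described by a model name, a base URL, and an API key.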
diff --git a/docs/tutorials/basics.md b/docs/tutorials/basics.md
index 3fe13c6..4dd7a0d 100644
--- a/docs/tutorials/basics.md
+++ b/docs/tutorials/basics.md
@@ -69,7 +69,7 @@ The Python example above is part of the `freeact` package and can be run with:
 python -m freeact.examples.basics
 ```
 
-For formatted and colored console output, as shown in the [example conversation](#example-conversation), you can use the `freeact` CLI:
+For formatted and colored console output, as shown in the [example conversation](#example-conversation), you can use the `freeact` [CLI](../cli.md):
 
 ```shell
 --8<-- "freeact/examples/commands.txt:cli-basics-claude"
diff --git a/evaluation/evaluate.py b/evaluation/evaluate.py
index 39bc8e0..7430f1e 100644
--- a/evaluation/evaluate.py
+++ b/evaluation/evaluate.py
@@ -25,7 +25,6 @@
     QwenCoder,
     execution_environment,
 )
-from freeact.cli.utils import dotenv_variables
 
 app = typer.Typer()
 
@@ -224,7 +223,6 @@ async def run_agent(
         executor_key="agent-evaluation",
         ipybox_tag="ghcr.io/gradion-ai/ipybox:eval",
         log_file=Path("logs", "agent-evaluation.log"),
-        env_vars=dotenv_variables(),
     ) as env:
         skill_sources = await env.executor.get_module_sources(
             ["google_search.api", "visit_webpage.api"],
diff --git a/freeact/cli/utils.py b/freeact/cli/utils.py
index bec9a61..3a83cc2 100644
--- a/freeact/cli/utils.py
+++ b/freeact/cli/utils.py
@@ -1,11 +1,9 @@
 import platform
-from contextlib import asynccontextmanager
 from pathlib import Path
 from typing import Dict
 
 import aiofiles
 import prompt_toolkit
-from dotenv import dotenv_values
 from PIL import Image
 from prompt_toolkit.key_binding import KeyBindings
 from rich.console import Console
@@ -19,36 +17,7 @@
     CodeActAgentTurn,
     CodeActModelTurn,
     CodeExecution,
-    CodeExecutionContainer,
-    CodeExecutor,
 )
-from freeact.logger import Logger
-
-
-def dotenv_variables() -> dict[str, str]:
-    return {k: v for k, v in dotenv_values().items() if v is not None}
-
-
-@asynccontextmanager
-async def execution_environment(
-    executor_key: str = "default",
-    ipybox_tag: str = "ghcr.io/gradion-ai/ipybox:minimal",
-    env_vars: dict[str, str] = dotenv_variables(),
-    workspace_path: Path | str = Path("workspace"),
-    log_file: Path | str = Path("logs", "agent.log"),
-):
-    async with CodeExecutionContainer(
-        tag=ipybox_tag,
-        env=env_vars,
-        workspace_path=workspace_path,
-    ) as container:
-        async with CodeExecutor(
-            key=executor_key,
-            port=container.port,
-            workspace=container.workspace,
-        ) as executor:
-            async with Logger(file=log_file) as logger:
-                yield executor, logger
 
 
 async def stream_conversation(agent: CodeActAgent, console: Console, show_token_usage: bool = False, **kwargs):
diff --git a/freeact/examples/qwen.py b/freeact/examples/qwen.py
index 565894b..0dcbace 100644
--- a/freeact/examples/qwen.py
+++ b/freeact/examples/qwen.py
@@ -23,7 +23,7 @@ async def main():
     )
 
     agent = CodeActAgent(model=model, executor=env.executor)
-    await stream_conversation(agent, console=Console())
+    await stream_conversation(agent, console=Console())  # (2)!
 
 
 if __name__ == "__main__":
diff --git a/mkdocs.yml b/mkdocs.yml
index 1767a2f..52fd9a9 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -95,6 +95,7 @@ nav:
     - Skill development: tutorials/skills.md
     - System extensions: tutorials/extend.md
    - Advanced topics:
+      - Model integration: integration.md
       - Streaming protocol: streaming.md
       - Evaluation results: evaluation.md
   - API Documentation:
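Reviewer's note: the `execution_environment` helper removed here from `freeact/cli/utils.py` follows a standard pattern, a single `asynccontextmanager` that nests several async resources and tears them down in reverse order. A self-contained sketch of that pattern with stand-in resources (the `DummyResource` class is illustrative, not freeact code):

```python
import asyncio
from contextlib import asynccontextmanager


class DummyResource:
    """Stand-in for the container/executor/logger resources in the removed helper."""

    def __init__(self, name: str):
        self.name = name
        self.open = False

    async def __aenter__(self):
        self.open = True
        return self

    async def __aexit__(self, *exc):
        self.open = False


@asynccontextmanager
async def execution_environment(executor_key: str = "default"):
    # Nested `async with` blocks guarantee teardown in reverse order,
    # mirroring container -> executor -> logger in the removed helper.
    async with DummyResource("container") as _container:
        async with DummyResource(executor_key) as executor:
            async with DummyResource("logger") as logger:
                yield executor, logger


async def main():
    async with execution_environment("agent-evaluation") as (executor, logger):
        assert executor.open and logger.open
        return executor.name


print(asyncio.run(main()))  # prints "agent-evaluation"
```

Moving such a helper into the package proper (so callers like `evaluation/evaluate.py` import it from one place) avoids duplicating the nesting and teardown logic in every entry point.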