Use LiteLLM for unified model access (#41)

krasserm authored Feb 20, 2025
1 parent fddb35e commit f7eb46d

Showing 45 changed files with 1,493 additions and 1,791 deletions.
6 changes: 2 additions & 4 deletions .github/workflows/test.yml
@@ -13,9 +13,7 @@ jobs:
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
QWEN_API_KEY: ${{ secrets.TEST_QWEN_CODER_API_KEY }}
QWEN_BASE_URL: ${{ secrets.TEST_QWEN_CODER_BASE_URL }}
QWEN_MODEL_NAME: ${{ secrets.TEST_QWEN_CODER_MODEL_NAME }}
QWEN_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}

steps:
- uses: actions/checkout@v4
@@ -55,4 +53,4 @@ jobs:
shell: bash -l {0}
run: |
docker pull ghcr.io/gradion-ai/ipybox:basic
poetry run pytest tests/integration --no-flaky-report
poetry run pytest tests/integration
10 changes: 4 additions & 6 deletions DEVELOPMENT.md
@@ -34,18 +34,16 @@ Install pre-commit hooks:
invoke precommit-install
```

Create a `.env` file with [Anthropic](https://console.anthropic.com/settings/keys) and [Gemini](https://aistudio.google.com/app/apikey) API keys:
Create a `.env` file with [Anthropic](https://console.anthropic.com/settings/keys), [Gemini](https://aistudio.google.com/app/apikey) and [Fireworks](https://fireworks.ai/account/api-keys) API keys:

```env title=".env"
# Required for Claude 3.5 Sonnet
# Required for integration tests with Claude 3.5 Haiku
ANTHROPIC_API_KEY=...
# Required for generative Google Search via Gemini 2
# Required for integration tests with Gemini 2.0 Flash
GOOGLE_API_KEY=...
# Required to run integration tests using Qwen models via HuggingFace API
QWEN_MODEL_NAME=Qwen/Qwen2.5-Coder-32B-Instruct
QWEN_BASE_URL=https://api-inference.huggingface.co/v1/
# Required for integration tests with Qwen 2.5 Coder
QWEN_API_KEY=...
```
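As a rough illustration of how such a `.env` file is consumed: freeact reads it with a dotenv-style loader, whose behavior for simple files boils down to the sketch below. The parser here is a simplified, hypothetical stand-in, not the actual loading code.

```python
import os


def parse_env(text: str) -> dict:
    """Minimal .env parser: KEY=VALUE lines; blanks and '#' comments ignored."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


example = """\
# Required for integration tests with Claude 3.5 Haiku
ANTHROPIC_API_KEY=sk-ant-example
# Required for integration tests with Qwen 2.5 Coder
QWEN_API_KEY=fw-example
"""

parsed = parse_env(example)
os.environ.update(parsed)  # expose the keys to the current process
```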

4 changes: 2 additions & 2 deletions README.md
@@ -51,7 +51,7 @@ Launch a `freeact` agent with generative Google Search skill using the [CLI](htt

```bash
python -m freeact.cli \
--model-name=claude-3-5-sonnet-20241022 \
--model-name=anthropic/claude-3-5-sonnet-20241022 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```
@@ -75,7 +75,7 @@ async def main():
module_names=["freeact_skills.search.google.stream.api"],
)

model = Claude(model_name="claude-3-5-sonnet-20241022", logger=env.logger)
model = Claude(model_name="anthropic/claude-3-5-sonnet-20241022")
agent = CodeActAgent(model=model, executor=env.executor)
await stream_conversation(agent, console=Console(), skill_sources=skill_sources)

5 changes: 0 additions & 5 deletions docs/api/generic.md

This file was deleted.

9 changes: 9 additions & 0 deletions docs/api/litellm.md
@@ -0,0 +1,9 @@
::: freeact.model.litellm.model
options:
show_root_heading: false
members:
- LiteLLMBase
- LiteLLM
- LiteLLMTurn
- LiteLLMResponse
::: freeact.model.litellm.utils
29 changes: 15 additions & 14 deletions docs/cli.md
@@ -17,7 +17,7 @@ The `freeact` CLI supports entering messages that span multiple lines in two way
1. **Copy-paste**: You can directly copy and paste multiline content into the CLI
2. **Manual entry**: Press `Alt+Enter` (Linux/Windows) or `Option+Enter` (macOS) to add a new line while typing

To submit a multiline message, simply press `Enter`.
To submit a multiline message for processing, simply press `Enter`.

![Multiline input](img/multiline.png)

@@ -42,42 +42,44 @@ This is shown by example in the following two subsections.

### Example 1

The [quickstart](quickstart.md) example requires `ANTHROPIC_API_KEY` and `GOOGLE_API_KEY` to be defined in a `.env` file in the current directory. The `ANTHROPIC_API_KEY` is needed for the `claude-3-5-sonnet-20241022` code action model, while the `GOOGLE_API_KEY` is required for the `freeact_skills.search.google.stream.api` skill in the execution environment. Given a `.env` file with the following content:
The [quickstart](quickstart.md) example requires `ANTHROPIC_API_KEY` and `GOOGLE_API_KEY` to be defined in a `.env` file in the current directory. The `ANTHROPIC_API_KEY` is needed for the `anthropic/claude-3-5-sonnet-20241022` code action model, while the `GOOGLE_API_KEY` is required for the `freeact_skills.search.google.stream.api` skill in the execution environment. Given the following `.env` file

```env title=".env"
# Required for Claude 3.5 Sonnet
# Required by agents that use a Claude 3.5 model as code action model
ANTHROPIC_API_KEY=...
# Required for generative Google Search via Gemini 2
# Required for Google Search via Gemini 2 in the execution environment
GOOGLE_API_KEY=...
```

the following command will launch an agent with `claude-3-5-sonnet-20241022` as code action model configured with a generative Google search skill implemented by module `freeact_skills.search.google.stream.api`:
you can launch an agent with `anthropic/claude-3-5-sonnet-20241022` as code action model, configured with a generative Google search skill implemented by module `freeact_skills.search.google.stream.api`, with the following command:

```bash
python -m freeact.cli \
--model-name=claude-3-5-sonnet-20241022 \
--model-name=anthropic/claude-3-5-sonnet-20241022 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```

The API key can alternatively be passed as command-line argument:
The API key for the code action model can alternatively be passed as a command-line argument:

```bash
python -m freeact.cli \
--model-name=claude-3-5-sonnet-20241022 \
--model-name=anthropic/claude-3-5-sonnet-20241022 \
--api-key=$ANTHROPIC_API_KEY \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```

!!! note
Valid model names are those accepted by [LiteLLM](https://www.litellm.ai/).

### Example 2

To use models from other providers, such as [accounts/fireworks/models/deepseek-v3](https://fireworks.ai/models/fireworks/deepseek-v3) hosted by [Fireworks](https://fireworks.ai/), you can either provide all required environment variables in a `.env` file:
To use models from other providers, such as [fireworks_ai/accounts/fireworks/models/deepseek-v3](https://fireworks.ai/models/fireworks/deepseek-v3), you can either provide all required environment variables in a `.env` file:

```env title=".env"
# Required for DeepSeek V3 hosted by Fireworks
DEEPSEEK_BASE_URL=https://api.fireworks.ai/inference/v1
DEEPSEEK_API_KEY=...
# Required for generative Google Search via Gemini 2
@@ -88,17 +90,16 @@ and launch the agent with

```bash
python -m freeact.cli \
--model-name=accounts/fireworks/models/deepseek-v3 \
--model-name=fireworks_ai/accounts/fireworks/models/deepseek-v3 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```

or pass the base URL and API key directly as command-line arguments:
or pass the API key directly as a command-line argument:

```bash
python -m freeact.cli \
--model-name=accounts/fireworks/models/deepseek-v3 \
--base-url=https://api.fireworks.ai/inference/v1 \
--model-name=fireworks_ai/accounts/fireworks/models/deepseek-v3 \
--api-key=$DEEPSEEK_API_KEY \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
8 changes: 4 additions & 4 deletions docs/installation.md
@@ -6,10 +6,10 @@
pip install freeact
```

## Development installation
## Development setup

The development installation is described in the [Development Guide](https://github.com/gradion-ai/freeact/blob/main/DEVELOPMENT.md).
For development setup instructions, please refer to our [Development Guide](https://github.com/gradion-ai/freeact/blob/main/DEVELOPMENT.md).

## Execution environment
## Execution environments

For creating custom execution environments with your own dependency requirements, see [Execution environment](environment.md).
For an overview of available execution environments and instructions for creating execution environments with custom dependencies pre-installed, see our [Execution environment](environment.md) guide.
36 changes: 17 additions & 19 deletions docs/integration.md
@@ -3,41 +3,41 @@
`freeact` provides both a low-level and high-level API for integrating new models.

- The [low-level API](api/model.md) defines the `CodeActModel` interface and related abstractions
- The [high-level API](api/generic.md) provides a `GenericModel` class based on the [OpenAI Python SDK](https://github.com/openai/openai-python)
- The [high-level API](api/litellm.md) provides a `LiteLLM` class based on the [LiteLLM Python SDK](https://docs.litellm.ai/docs/#litellm-python-sdk)

### Low-level API

The low-level API is not further described here. For implementation examples, see the [`freeact.model.claude`](https://github.com/gradion-ai/freeact/tree/main/freeact/model/claude) or [`freeact.model.gemini`](https://github.com/gradion-ai/freeact/tree/main/freeact/model/gemini) packages.
The low-level API is not further described here. For implementation examples, see the [`freeact.model.litellm.model`](https://github.com/gradion-ai/freeact/tree/main/freeact/model/litellm/model.py) or [`freeact.model.gemini.live`](https://github.com/gradion-ai/freeact/tree/main/freeact/model/gemini/live.py) modules.

### High-level API

The high-level API supports usage of models from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python). To use a model, you need to provide prompt templates that guide it to generate code actions. You can either reuse existing templates or create your own.
The high-level API supports usage of models from any provider that is compatible with the [LiteLLM Python SDK](https://docs.litellm.ai/docs/#litellm-python-sdk). To use a model, you need to provide prompt templates that guide it to generate code actions. You can either reuse existing templates or create your own.

The following subsections demonstrate this using Qwen 2.5 Coder 32B Instruct as an example, showing how to use it both via the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index) and locally with [ollama](https://ollama.com/).
The following subsections demonstrate this using Qwen 2.5 Coder 32B Instruct as an example, showing how to use it both via the [Fireworks](https://docs.fireworks.ai/) API and locally with [ollama](https://ollama.com/).

#### Prompt templates

Start with model-specific prompt templates that guide Qwen 2.5 Coder Instruct models to generate code actions. For example:

```python title="freeact/model/qwen/prompt.py"
`````python title="freeact/model/qwen/prompt.py"
--8<-- "freeact/model/qwen/prompt.py"
```
`````

!!! Tip

While tested with Qwen 2.5 Coder Instruct, these prompt templates can also serve as a good starting point for other models (as we did for DeepSeek V3, for example).
While tested with Qwen 2.5 Coder Instruct, these prompt templates can also serve as a starting point for other models.
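The actual templates live in `freeact/model/qwen/prompt.py` (included via the snippet directive above). As a hypothetical, heavily simplified illustration of the shape such code-action templates take; the names and wording below are invented for this sketch:

```python
# Hypothetical stand-in for a code-action prompt template; the real
# templates in freeact/model/qwen/prompt.py differ in wording and detail.
SYSTEM_TEMPLATE = """\
You are a coding agent. Solve the user's request by responding with
Python code actions in fenced code blocks. Available skills:

{skill_sources}
"""

USER_TEMPLATE = "{user_query}\n\nRespond with a single code action."


def render_system_prompt(skill_sources: str) -> str:
    """Fill the skill sources into the system template."""
    return SYSTEM_TEMPLATE.format(skill_sources=skill_sources)


system_prompt = render_system_prompt("def google_search(query: str) -> str: ...")
user_prompt = USER_TEMPLATE.format(user_query="What is the weather in Berlin?")
```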

#### Model definition

Although we could instantiate `GenericModel` directly with these prompt templates, `freeact` provides a `QwenCoder` subclass for convenience:
Although we could instantiate `LiteLLM` directly with these prompt templates, `freeact` provides a `QwenCoder` subclass for convenience:

```python title="freeact/model/qwen/model.py"
--8<-- "freeact/model/qwen/model.py"
```

#### Model usage

Here's a Python example that uses `QwenCoder` as code action model in a `freeact` agent. The model is accessed via the Hugging Face Inference API:
Here's a Python example that uses `QwenCoder` as code action model in a `freeact` agent. The model is accessed via the Fireworks API:

```python title="examples/qwen.py"
--8<-- "examples/qwen.py"
@@ -50,27 +50,25 @@ Here's a Python example that uses `QwenCoder` as code action model in a `freeact
Run it with:

```bash
HF_TOKEN=... python -m freeact.examples.qwen
FIREWORKS_API_KEY=... python -m freeact.examples.qwen
```

Alternatively, use the `freeact` [CLI](cli.md) directly:

```bash
python -m freeact.cli \
--model-name=Qwen/Qwen2.5-Coder-32B-Instruct \
--base-url=https://api-inference.huggingface.co/v1/ \
--api-key=$HF_TOKEN \
--model-name=fireworks_ai/accounts/fireworks/models/qwen2p5-coder-32b-instruct \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
--skill-modules=freeact_skills.search.google.stream.api \
--api-key=$FIREWORKS_API_KEY
```

For using the same model deployed locally with [ollama](https://ollama.com/), modify `--model-name`, `--base-url` and `--api-key` to match your local deployment:
To use the same model deployed locally with [ollama](https://ollama.com/), change `--model-name`, remove `--api-key`, and set `--base-url` to match your local deployment:

```bash
python -m freeact.cli \
--model-name=qwen2.5-coder:32b-instruct-fp16 \
--base-url=http://localhost:11434/v1 \
--api-key=ollama \
--model-name=ollama/qwen2.5-coder:32b-instruct-fp16 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
--skill-modules=freeact_skills.search.google.stream.api \
--base-url=http://localhost:11434
```
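Conceptually, the CLI flags above map onto keyword arguments of a LiteLLM-style completion call. The mapping below is a rough, illustrative sketch (no network call is made; the exact kwargs freeact passes internally may differ), using `api_base`, which is LiteLLM's name for a custom endpoint URL:

```python
from typing import Optional


def completion_kwargs(model_name: str,
                      api_key: Optional[str] = None,
                      base_url: Optional[str] = None) -> dict:
    """Hypothetical mapping from freeact CLI flags to LiteLLM-style kwargs."""
    kwargs = {"model": model_name}
    if api_key is not None:
        kwargs["api_key"] = api_key
    if base_url is not None:
        kwargs["api_base"] = base_url  # LiteLLM's name for the endpoint URL
    return kwargs


# Equivalent of the ollama CLI invocation above: no API key, local base URL.
ollama_kwargs = completion_kwargs(
    "ollama/qwen2.5-coder:32b-instruct-fp16",
    base_url="http://localhost:11434",
)
```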
31 changes: 14 additions & 17 deletions docs/models.md
@@ -6,18 +6,18 @@ For the following models, `freeact` provides model-specific prompt templates.
|-----------------------------|------------|-----------|--------------|
| Claude 3.5 Sonnet | 2024-10-22 || optimized |
| Claude 3.5 Haiku | 2024-10-22 || optimized |
| Gemini 2.0 Flash | 2024-02-05 |[^1] | draft |
| Gemini 2.0 Flash | 2024-02-05 |[^1] | experimental |
| Gemini 2.0 Flash Thinking | 2024-02-05 || experimental |
| Qwen 2.5 Coder 32B Instruct | || draft |
| DeepSeek V3 | || draft |
| Qwen 2.5 Coder 32B Instruct | || experimental |
| DeepSeek V3 | || experimental |
| DeepSeek R1[^2] | || experimental |

[^1]: We evaluated Gemini 2.0 Flash Experimental (`gemini-2.0-flash-exp`), released on 2024-12-11.
[^2]: DeepSeek R1 wasn't trained on agentic tool use but demonstrates strong performance with code actions, even surpassing Claude 3.5 Sonnet on the GAIA subset in our [evaluation](evaluation.md). See [this article](https://krasserm.github.io/2025/02/05/deepseek-r1-agent/) for further details.

!!! Info
!!! info

`freeact` additionally supports the [integration](integration.md) of new models from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python), including open models deployed locally with [ollama](https://ollama.com/) or [TGI](https://huggingface.co/docs/text-generation-inference/index), for example.
`freeact` supports the [integration](integration.md) of any model that is compatible with the [LiteLLM](https://www.litellm.ai/) Python SDK.
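LiteLLM routes requests by a provider prefix in the model name (e.g. `anthropic/`, `gemini/`, `fireworks_ai/`, `ollama/`), which is why the model names throughout this commit gained such prefixes. A minimal sketch of that naming convention — illustrative only, not LiteLLM's actual resolution logic:

```python
def split_model_name(model_name: str) -> tuple:
    """Split a 'provider/model' name at the first slash.

    Loosely mirrors how LiteLLM-style prefixed names are read; names
    without a prefix are returned with an empty provider.
    """
    provider, sep, model = model_name.partition("/")
    if not sep:
        return "", model_name
    return provider, model


examples = [
    "anthropic/claude-3-5-sonnet-20241022",
    "gemini/gemini-2.0-flash",
    "fireworks_ai/accounts/fireworks/models/deepseek-v3",
    "ollama/qwen2.5-coder:32b-instruct-fp16",
]
providers = [split_model_name(name)[0] for name in examples]
```

Note that the remainder after the first slash may itself contain slashes, as in the Fireworks model path.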

## Command line

@@ -28,7 +28,7 @@ This section demonstrates how you can launch `freeact` agents with these models
GOOGLE_API_KEY=...
```

API keys and base URLs for code action models are provided as `--api-key` and `--base-url` arguments, respectively. Code actions are executed in a Docker container created from the [prebuilt](environment.md#prebuilt-docker-images) `ghcr.io/gradion-ai/ipybox:basic` image, passed as `--ipybox-tag` argument.
API keys for code action models are provided via the `--api-key` argument. Code actions are executed in a Docker container created from the [prebuilt](environment.md#prebuilt-docker-images) `ghcr.io/gradion-ai/ipybox:basic` image, passed as `--ipybox-tag` argument.

!!! Info

@@ -38,7 +38,7 @@ API keys and base URLs for code action models are provided as `--api-key` and `-

```bash
python -m freeact.cli \
--model-name=claude-3-5-sonnet-20241022 \
--model-name=anthropic/claude-3-5-sonnet-20241022 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--api-key=$ANTHROPIC_API_KEY
@@ -48,7 +48,7 @@ python -m freeact.cli \

```bash
python -m freeact.cli \
--model-name=claude-3-5-haiku-20241022 \
--model-name=anthropic/claude-3-5-haiku-20241022 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--api-key=$ANTHROPIC_API_KEY
@@ -58,7 +58,7 @@ python -m freeact.cli \

```bash
python -m freeact.cli \
--model-name=gemini-2.0-flash \
--model-name=gemini/gemini-2.0-flash \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--api-key=$GOOGLE_API_KEY
@@ -68,7 +68,7 @@ python -m freeact.cli \

```bash
python -m freeact.cli \
--model-name=gemini-2.0-flash-thinking-exp \
--model-name=gemini/gemini-2.0-flash-thinking-exp-01-21 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--api-key=$GOOGLE_API_KEY
@@ -78,31 +78,28 @@ python -m freeact.cli \

```bash
python -m freeact.cli \
--model-name=Qwen/Qwen2.5-Coder-32B-Instruct \
--model-name=fireworks_ai/accounts/fireworks/models/qwen2p5-coder-32b-instruct \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--base-url=https://api-inference.huggingface.co/v1/ \
--api-key=$HF_TOKEN
--api-key=$FIREWORKS_API_KEY
```

### DeepSeek R1

```bash
python -m freeact.cli \
--model-name=accounts/fireworks/models/deepseek-r1 \
--model-name=fireworks_ai/accounts/fireworks/models/deepseek-r1 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--base-url=https://api.fireworks.ai/inference/v1 \
--api-key=$FIREWORKS_API_KEY
```

### DeepSeek V3

```bash
python -m freeact.cli \
--model-name=accounts/fireworks/models/deepseek-v3 \
--model-name=fireworks_ai/accounts/fireworks/models/deepseek-v3 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--base-url=https://api.fireworks.ai/inference/v1 \
--api-key=$FIREWORKS_API_KEY
```
5 changes: 4 additions & 1 deletion docs/quickstart.md
@@ -20,7 +20,7 @@ Launch a `freeact` agent with generative Google Search skill using the [CLI](cli

```bash
python -m freeact.cli \
--model-name=claude-3-5-sonnet-20241022 \
--model-name=anthropic/claude-3-5-sonnet-20241022 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```
@@ -31,6 +31,9 @@ or an equivalent Python script:
--8<-- "examples/quickstart.py"
```

!!! note
Valid model names are those accepted by [LiteLLM](https://www.litellm.ai/).

Once launched, you can start interacting with the agent:

<video width="100%" controls>