Use LiteLLM for unified model access (#41)

krasserm authored Feb 20, 2025
1 parent fddb35e commit f7eb46d

Showing 45 changed files with 1,493 additions and 1,791 deletions.
6 changes: 2 additions & 4 deletions .github/workflows/test.yml
@@ -13,9 +13,7 @@ jobs:
env:
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
GOOGLE_API_KEY: ${{ secrets.GOOGLE_API_KEY }}
QWEN_API_KEY: ${{ secrets.TEST_QWEN_CODER_API_KEY }}
QWEN_BASE_URL: ${{ secrets.TEST_QWEN_CODER_BASE_URL }}
QWEN_MODEL_NAME: ${{ secrets.TEST_QWEN_CODER_MODEL_NAME }}
QWEN_API_KEY: ${{ secrets.FIREWORKS_API_KEY }}

steps:
- uses: actions/checkout@v4
@@ -55,4 +53,4 @@ jobs:
shell: bash -l {0}
run: |
docker pull ghcr.io/gradion-ai/ipybox:basic
poetry run pytest tests/integration --no-flaky-report
poetry run pytest tests/integration
10 changes: 4 additions & 6 deletions DEVELOPMENT.md
@@ -34,18 +34,16 @@ Install pre-commit hooks:
invoke precommit-install
```

Create a `.env` file with [Anthropic](https://console.anthropic.com/settings/keys) and [Gemini](https://aistudio.google.com/app/apikey) API keys:
Create a `.env` file with [Anthropic](https://console.anthropic.com/settings/keys), [Gemini](https://aistudio.google.com/app/apikey) and [Fireworks](https://fireworks.ai/account/api-keys) API keys:

```env title=".env"
# Required for Claude 3.5 Sonnet
# Required for integration tests with Claude 3.5 Haiku
ANTHROPIC_API_KEY=...
# Required for generative Google Search via Gemini 2
# Required for integration tests with Gemini 2.0 Flash
GOOGLE_API_KEY=...
# Required to run integration tests using Qwen models via HuggingFace API
QWEN_MODEL_NAME=Qwen/Qwen2.5-Coder-32B-Instruct
QWEN_BASE_URL=https://api-inference.huggingface.co/v1/
# Required for integration tests with Qwen 2.5 Coder
QWEN_API_KEY=...
```
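As a rough illustration of how such a `.env` file is consumed: freeact reads it with a dotenv-style loader, whose behavior for simple files boils down to the sketch below. The parser here is a simplified, hypothetical stand-in, not the actual loading code.

```python
import os


def parse_env(text: str) -> dict:
    """Minimal .env parser: KEY=VALUE lines; blanks and '#' comments ignored."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


example = """\
# Required for integration tests with Claude 3.5 Haiku
ANTHROPIC_API_KEY=sk-ant-example
# Required for integration tests with Qwen 2.5 Coder
QWEN_API_KEY=fw-example
"""

parsed = parse_env(example)
os.environ.update(parsed)  # expose the keys to the current process
```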

4 changes: 2 additions & 2 deletions README.md
@@ -51,7 +51,7 @@ Launch a `freeact` agent with generative Google Search skill using the [CLI](htt

```bash
python -m freeact.cli \
--model-name=claude-3-5-sonnet-20241022 \
--model-name=anthropic/claude-3-5-sonnet-20241022 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```
@@ -75,7 +75,7 @@ async def main():
module_names=["freeact_skills.search.google.stream.api"],
)

model = Claude(model_name="claude-3-5-sonnet-20241022", logger=env.logger)
model = Claude(model_name="anthropic/claude-3-5-sonnet-20241022")
agent = CodeActAgent(model=model, executor=env.executor)
await stream_conversation(agent, console=Console(), skill_sources=skill_sources)

5 changes: 0 additions & 5 deletions docs/api/generic.md

This file was deleted.

9 changes: 9 additions & 0 deletions docs/api/litellm.md
@@ -0,0 +1,9 @@
::: freeact.model.litellm.model
options:
show_root_heading: false
members:
- LiteLLMBase
- LiteLLM
- LiteLLMTurn
- LiteLLMResponse
::: freeact.model.litellm.utils
29 changes: 15 additions & 14 deletions docs/cli.md
@@ -17,7 +17,7 @@ The `freeact` CLI supports entering messages that span multiple lines in two way
1. **Copy-paste**: You can directly copy and paste multiline content into the CLI
2. **Manual entry**: Press `Alt+Enter` (Linux/Windows) or `Option+Enter` (macOS) to add a new line while typing

To submit a multiline message, simply press `Enter`.
To submit a multiline message for processing, simply press `Enter`.

![Multiline input](img/multiline.png)

@@ -42,42 +42,44 @@ This is shown by example in the following two subsections.

### Example 1

The [quickstart](quickstart.md) example requires `ANTHROPIC_API_KEY` and `GOOGLE_API_KEY` to be defined in a `.env` file in the current directory. The `ANTHROPIC_API_KEY` is needed for the `claude-3-5-sonnet-20241022` code action model, while the `GOOGLE_API_KEY` is required for the `freeact_skills.search.google.stream.api` skill in the execution environment. Given a `.env` file with the following content:
The [quickstart](quickstart.md) example requires `ANTHROPIC_API_KEY` and `GOOGLE_API_KEY` to be defined in a `.env` file in the current directory. The `ANTHROPIC_API_KEY` is needed for the `anthropic/claude-3-5-sonnet-20241022` code action model, while the `GOOGLE_API_KEY` is required for the `freeact_skills.search.google.stream.api` skill in the execution environment. Given the following `.env` file

```env title=".env"
# Required for Claude 3.5 Sonnet
# Required by agents that use a Claude 3.5 model as code action model
ANTHROPIC_API_KEY=...
# Required for generative Google Search via Gemini 2
# Required for Google Search via Gemini 2 in the execution environment
GOOGLE_API_KEY=...
```

the following command will launch an agent with `claude-3-5-sonnet-20241022` as code action model configured with a generative Google search skill implemented by module `freeact_skills.search.google.stream.api`:
you can launch an agent with `anthropic/claude-3-5-sonnet-20241022` as code action model, configured with a generative Google search skill implemented by module `freeact_skills.search.google.stream.api`, with the following command:

```bash
python -m freeact.cli \
--model-name=claude-3-5-sonnet-20241022 \
--model-name=anthropic/claude-3-5-sonnet-20241022 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```

The API key can alternatively be passed as command-line argument:
The API key for the code action model can alternatively be passed as a command-line argument:

```bash
python -m freeact.cli \
--model-name=claude-3-5-sonnet-20241022 \
--model-name=anthropic/claude-3-5-sonnet-20241022 \
--api-key=$ANTHROPIC_API_KEY \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```

!!! note
Valid model names are those accepted by [LiteLLM](https://www.litellm.ai/).

### Example 2

To use models from other providers, such as [accounts/fireworks/models/deepseek-v3](https://fireworks.ai/models/fireworks/deepseek-v3) hosted by [Fireworks](https://fireworks.ai/), you can either provide all required environment variables in a `.env` file:
To use models from other providers, such as [fireworks_ai/accounts/fireworks/models/deepseek-v3](https://fireworks.ai/models/fireworks/deepseek-v3), you can either provide all required environment variables in a `.env` file:

```env title=".env"
# Required for DeepSeek V3 hosted by Fireworks
DEEPSEEK_BASE_URL=https://api.fireworks.ai/inference/v1
DEEPSEEK_API_KEY=...
# Required for generative Google Search via Gemini 2
@@ -88,17 +90,16 @@ and launch the agent with

```bash
python -m freeact.cli \
--model-name=accounts/fireworks/models/deepseek-v3 \
--model-name=fireworks_ai/accounts/fireworks/models/deepseek-v3 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```

or pass the base URL and API key directly as command-line arguments:
or pass the API key directly as a command-line argument:

```bash
python -m freeact.cli \
--model-name=accounts/fireworks/models/deepseek-v3 \
--base-url=https://api.fireworks.ai/inference/v1 \
--model-name=fireworks_ai/accounts/fireworks/models/deepseek-v3 \
--api-key=$DEEPSEEK_API_KEY \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
8 changes: 4 additions & 4 deletions docs/installation.md
@@ -6,10 +6,10 @@
pip install freeact
```

## Development installation
## Development setup

The development installation is described in the [Development Guide](https://github.com/gradion-ai/freeact/blob/main/DEVELOPMENT.md).
For development setup instructions, please refer to our [Development Guide](https://github.com/gradion-ai/freeact/blob/main/DEVELOPMENT.md).

## Execution environment
## Execution environments

For creating custom execution environments with your own dependency requirements, see [Execution environment](environment.md).
For an overview of available execution environments and instructions for creating execution environments with custom dependencies pre-installed, see our [Execution environment](environment.md) guide.
36 changes: 17 additions & 19 deletions docs/integration.md
@@ -3,41 +3,41 @@
`freeact` provides both a low-level and high-level API for integrating new models.

- The [low-level API](api/model.md) defines the `CodeActModel` interface and related abstractions
- The [high-level API](api/generic.md) provides a `GenericModel` class based on the [OpenAI Python SDK](https://github.com/openai/openai-python)
- The [high-level API](api/litellm.md) provides a `LiteLLM` class based on the [LiteLLM Python SDK](https://docs.litellm.ai/docs/#litellm-python-sdk)

### Low-level API

The low-level API is not further described here. For implementation examples, see the [`freeact.model.claude`](https://github.com/gradion-ai/freeact/tree/main/freeact/model/claude) or [`freeact.model.gemini`](https://github.com/gradion-ai/freeact/tree/main/freeact/model/gemini) packages.
The low-level API is not further described here. For implementation examples, see the [`freeact.model.litellm.model`](https://github.com/gradion-ai/freeact/tree/main/freeact/model/litellm/model.py) or [`freeact.model.gemini.live`](https://github.com/gradion-ai/freeact/tree/main/freeact/model/gemini/live.py) modules.

### High-level API

The high-level API supports usage of models from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python). To use a model, you need to provide prompt templates that guide it to generate code actions. You can either reuse existing templates or create your own.
The high-level API supports usage of models from any provider that is compatible with the [LiteLLM Python SDK](https://docs.litellm.ai/docs/#litellm-python-sdk). To use a model, you need to provide prompt templates that guide it to generate code actions. You can either reuse existing templates or create your own.

The following subsections demonstrate this using Qwen 2.5 Coder 32B Instruct as an example, showing how to use it both via the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index) and locally with [ollama](https://ollama.com/).
The following subsections demonstrate this using Qwen 2.5 Coder 32B Instruct as an example, showing how to use it both via the [Fireworks](https://docs.fireworks.ai/) API and locally with [ollama](https://ollama.com/).

#### Prompt templates

Start with model-specific prompt templates that guide Qwen 2.5 Coder Instruct models to generate code actions. For example:

```python title="freeact/model/qwen/prompt.py"
`````python title="freeact/model/qwen/prompt.py"
--8<-- "freeact/model/qwen/prompt.py"
```
`````

!!! Tip

While tested with Qwen 2.5 Coder Instruct, these prompt templates can also serve as a good starting point for other models (as we did for DeepSeek V3, for example).
While tested with Qwen 2.5 Coder Instruct, these prompt templates can also serve as a starting point for other models.
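The actual templates live in `freeact/model/qwen/prompt.py` (included via the snippet directive above). As a hypothetical, heavily simplified illustration of the shape such code-action templates take; the names and wording below are invented for this sketch:

```python
# Hypothetical stand-in for a code-action prompt template; the real
# templates in freeact/model/qwen/prompt.py differ in wording and detail.
SYSTEM_TEMPLATE = """\
You are a coding agent. Solve the user's request by responding with
Python code actions in fenced code blocks. Available skills:

{skill_sources}
"""

USER_TEMPLATE = "{user_query}\n\nRespond with a single code action."


def render_system_prompt(skill_sources: str) -> str:
    """Fill the skill sources into the system template."""
    return SYSTEM_TEMPLATE.format(skill_sources=skill_sources)


system_prompt = render_system_prompt("def google_search(query: str) -> str: ...")
user_prompt = USER_TEMPLATE.format(user_query="What is the weather in Berlin?")
```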

#### Model definition

Although we could instantiate `GenericModel` directly with these prompt templates, `freeact` provides a `QwenCoder` subclass for convenience:
Although we could instantiate `LiteLLM` directly with these prompt templates, `freeact` provides a `QwenCoder` subclass for convenience:

```python title="freeact/model/qwen/model.py"
--8<-- "freeact/model/qwen/model.py"
```

#### Model usage

Here's a Python example that uses `QwenCoder` as code action model in a `freeact` agent. The model is accessed via the Hugging Face Inference API:
Here's a Python example that uses `QwenCoder` as code action model in a `freeact` agent. The model is accessed via the Fireworks API:

```python title="examples/qwen.py"
--8<-- "examples/qwen.py"
@@ -50,27 +50,25 @@ Here's a Python example that uses `QwenCoder` as code action model in a `freeact
Run it with:

```bash
HF_TOKEN=... python -m freeact.examples.qwen
FIREWORKS_API_KEY=... python -m freeact.examples.qwen
```

Alternatively, use the `freeact` [CLI](cli.md) directly:

```bash
python -m freeact.cli \
--model-name=Qwen/Qwen2.5-Coder-32B-Instruct \
--base-url=https://api-inference.huggingface.co/v1/ \
--api-key=$HF_TOKEN \
--model-name=fireworks_ai/accounts/fireworks/models/qwen2p5-coder-32b-instruct \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
--skill-modules=freeact_skills.search.google.stream.api \
--api-key=$FIREWORKS_API_KEY
```

For using the same model deployed locally with [ollama](https://ollama.com/), modify `--model-name`, `--base-url` and `--api-key` to match your local deployment:
To use the same model deployed locally with [ollama](https://ollama.com/), change `--model-name`, remove `--api-key`, and set `--base-url` to match your local deployment:

```bash
python -m freeact.cli \
--model-name=qwen2.5-coder:32b-instruct-fp16 \
--base-url=http://localhost:11434/v1 \
--api-key=ollama \
--model-name=ollama/qwen2.5-coder:32b-instruct-fp16 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
--skill-modules=freeact_skills.search.google.stream.api \
--base-url=http://localhost:11434
```
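Conceptually, the CLI flags above map onto keyword arguments of a LiteLLM-style completion call. The mapping below is a rough, illustrative sketch (no network call is made; the exact kwargs freeact passes internally may differ), using `api_base`, which is LiteLLM's name for a custom endpoint URL:

```python
from typing import Optional


def completion_kwargs(model_name: str,
                      api_key: Optional[str] = None,
                      base_url: Optional[str] = None) -> dict:
    """Hypothetical mapping from freeact CLI flags to LiteLLM-style kwargs."""
    kwargs = {"model": model_name}
    if api_key is not None:
        kwargs["api_key"] = api_key
    if base_url is not None:
        kwargs["api_base"] = base_url  # LiteLLM's name for the endpoint URL
    return kwargs


# Equivalent of the ollama CLI invocation above: no API key, local base URL.
ollama_kwargs = completion_kwargs(
    "ollama/qwen2.5-coder:32b-instruct-fp16",
    base_url="http://localhost:11434",
)
```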
31 changes: 14 additions & 17 deletions docs/models.md
@@ -6,18 +6,18 @@ For the following models, `freeact` provides model-specific prompt templates.
|-----------------------------|------------|-----------|--------------|
| Claude 3.5 Sonnet | 2024-10-22 || optimized |
| Claude 3.5 Haiku | 2024-10-22 || optimized |
| Gemini 2.0 Flash | 2024-02-05 |[^1] | draft |
| Gemini 2.0 Flash | 2024-02-05 |[^1] | experimental |
| Gemini 2.0 Flash Thinking | 2024-02-05 || experimental |
| Qwen 2.5 Coder 32B Instruct | || draft |
| DeepSeek V3 | || draft |
| Qwen 2.5 Coder 32B Instruct | || experimental |
| DeepSeek V3 | || experimental |
| DeepSeek R1[^2] | || experimental |

[^1]: We evaluated Gemini 2.0 Flash Experimental (`gemini-2.0-flash-exp`), released on 2024-12-11.
[^2]: DeepSeek R1 wasn't trained on agentic tool use but demonstrates strong performance with code actions, even surpassing Claude 3.5 Sonnet on the GAIA subset in our [evaluation](evaluation.md). See [this article](https://krasserm.github.io/2025/02/05/deepseek-r1-agent/) for further details.

!!! Info
!!! info

`freeact` additionally supports the [integration](integration.md) of new models from any provider that is compatible with the [OpenAI Python SDK](https://github.com/openai/openai-python), including open models deployed locally with [ollama](https://ollama.com/) or [TGI](https://huggingface.co/docs/text-generation-inference/index), for example.
`freeact` supports the [integration](integration.md) of any model that is compatible with the [LiteLLM](https://www.litellm.ai/) Python SDK.
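LiteLLM routes requests by a provider prefix in the model name (e.g. `anthropic/`, `gemini/`, `fireworks_ai/`, `ollama/`), which is why the model names throughout this commit gained such prefixes. A minimal sketch of that naming convention — illustrative only, not LiteLLM's actual resolution logic:

```python
def split_model_name(model_name: str) -> tuple:
    """Split a 'provider/model' name at the first slash.

    Loosely mirrors how LiteLLM-style prefixed names are read; names
    without a prefix are returned with an empty provider.
    """
    provider, sep, model = model_name.partition("/")
    if not sep:
        return "", model_name
    return provider, model


examples = [
    "anthropic/claude-3-5-sonnet-20241022",
    "gemini/gemini-2.0-flash",
    "fireworks_ai/accounts/fireworks/models/deepseek-v3",
    "ollama/qwen2.5-coder:32b-instruct-fp16",
]
providers = [split_model_name(name)[0] for name in examples]
```

Note that the remainder after the first slash may itself contain slashes, as in the Fireworks model path.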

## Command line

@@ -28,7 +28,7 @@ This section demonstrates how you can launch `freeact` agents with these models
GOOGLE_API_KEY=...
```

API keys and base URLs for code action models are provided as `--api-key` and `--base-url` arguments, respectively. Code actions are executed in a Docker container created from the [prebuilt](environment.md#prebuilt-docker-images) `ghcr.io/gradion-ai/ipybox:basic` image, passed as `--ipybox-tag` argument.
API keys for code action models are provided via the `--api-key` argument. Code actions are executed in a Docker container created from the [prebuilt](environment.md#prebuilt-docker-images) `ghcr.io/gradion-ai/ipybox:basic` image, passed as `--ipybox-tag` argument.

!!! Info

@@ -38,7 +38,7 @@ API keys and base URLs for code action models are provided as `--api-key` and `-

```bash
python -m freeact.cli \
--model-name=claude-3-5-sonnet-20241022 \
--model-name=anthropic/claude-3-5-sonnet-20241022 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--api-key=$ANTHROPIC_API_KEY
@@ -48,7 +48,7 @@ python -m freeact.cli \

```bash
python -m freeact.cli \
--model-name=claude-3-5-haiku-20241022 \
--model-name=anthropic/claude-3-5-haiku-20241022 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--api-key=$ANTHROPIC_API_KEY
@@ -58,7 +58,7 @@ python -m freeact.cli \

```bash
python -m freeact.cli \
--model-name=gemini-2.0-flash \
--model-name=gemini/gemini-2.0-flash \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--api-key=$GOOGLE_API_KEY
@@ -68,7 +68,7 @@ python -m freeact.cli \

```bash
python -m freeact.cli \
--model-name=gemini-2.0-flash-thinking-exp \
--model-name=gemini/gemini-2.0-flash-thinking-exp-01-21 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--api-key=$GOOGLE_API_KEY
@@ -78,31 +78,28 @@ python -m freeact.cli \

```bash
python -m freeact.cli \
--model-name=Qwen/Qwen2.5-Coder-32B-Instruct \
--model-name=fireworks_ai/accounts/fireworks/models/qwen2p5-coder-32b-instruct \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--base-url=https://api-inference.huggingface.co/v1/ \
--api-key=$HF_TOKEN
--api-key=$FIREWORKS_API_KEY
```

### DeepSeek R1

```bash
python -m freeact.cli \
--model-name=accounts/fireworks/models/deepseek-r1 \
--model-name=fireworks_ai/accounts/fireworks/models/deepseek-r1 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--base-url=https://api.fireworks.ai/inference/v1 \
--api-key=$FIREWORKS_API_KEY
```

### DeepSeek V3

```bash
python -m freeact.cli \
--model-name=accounts/fireworks/models/deepseek-v3 \
--model-name=fireworks_ai/accounts/fireworks/models/deepseek-v3 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api \
--base-url=https://api.fireworks.ai/inference/v1 \
--api-key=$FIREWORKS_API_KEY
```
5 changes: 4 additions & 1 deletion docs/quickstart.md
@@ -20,7 +20,7 @@ Launch a `freeact` agent with generative Google Search skill using the [CLI](cli

```bash
python -m freeact.cli \
--model-name=claude-3-5-sonnet-20241022 \
--model-name=anthropic/claude-3-5-sonnet-20241022 \
--ipybox-tag=ghcr.io/gradion-ai/ipybox:basic \
--skill-modules=freeact_skills.search.google.stream.api
```
@@ -31,6 +31,9 @@ or an equivalent Python script:
--8<-- "examples/quickstart.py"
```

!!! note
Valid model names are those accepted by [LiteLLM](https://www.litellm.ai/).

Once launched, you can start interacting with the agent:

<video width="100%" controls>