Feature/ollama extension #1652

Open · wants to merge 7 commits into main
3 changes: 2 additions & 1 deletion extensions/llms/litellm/pandasai_litellm/litellm.py
@@ -1,9 +1,10 @@
+import logging
+
 from litellm import completion
 
 from pandasai.agent.state import AgentState
 from pandasai.core.prompts.base import BasePrompt
 from pandasai.llm.base import LLM
-import logging
 
 
 class LiteLLM(LLM):
93 changes: 93 additions & 0 deletions extensions/llms/ollama/README.md
@@ -0,0 +1,93 @@
# PandaAI Ollama LLM Extension

This extension integrates Ollama language models with PandaAI. It allows you to use Ollama's LLMs as the backend for generating Python code in response to natural language queries on your dataframes.
Contributor:

There is an inconsistency in the spelling of the project name between the title and the text. The title uses "PandasAI" while the text in line 3 and later sections refers to "PandaAI". For clarity and consistency, please standardize the naming to one consistent form.

Suggested change:
- This extension integrates Ollama language models with PandaAI. It allows you to use Ollama's LLMs as the backend for generating Python code in response to natural language queries on your dataframes.
+ This extension integrates Ollama language models with PandasAI. It allows you to use Ollama's LLMs as the backend for generating Python code in response to natural language queries on your dataframes.

Author:

Thank you, I will take all of that into account and correct the issues.

Author:

Pull Request Changes Description:

Overview

  • Uses a consistent naming convention ("PandaAI") across the code and documentation.
  • Provides clearer installation instructions in the README.
  • Supports a customizable base URL (normalized to end with /v1) and flexible model parameters (e.g., temperature, max_tokens).
  • Updates the integration tests to assert expected outputs rather than just printing responses.

Key Changes

Naming Consistency:

  • Updated the README and code comments so that the project is consistently referred to as "PandaAI".

Installation Instructions:

  • Improved README installation instructions to clearly show how to install and configure the extension.

Configuration & Parameters:

  • The OllamaLLM class now uses a dummy API key (since Ollama does not require one) and accepts a custom base URL via the ollama_base_url parameter.
  • Parameters such as model, temperature, and max_tokens are correctly forwarded to the underlying OpenAI client.

Testing Improvements:

  • Updated the tests in extensions/llms/ollama/tests/test_ollama.py to include proper assertions (e.g., asserting that the response is not None); a minimal sketch of this assertion style is shown below.
  • Removed the temporary test_1.py file that was used for manual testing.
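
For reference, here is a minimal sketch of the assertion style described above. This is a hypothetical test body, not the PR's actual test file; the import path is taken from the README's usage example and the assertions reflect the base URL normalization and `type` property in `ollama.py`:

```python
# Sketch of an assertion-based test; the actual tests in the PR may differ.
from extensions.llms.ollama.pandasai_ollama.ollama import OllamaLLM


def test_ollama_llm_defaults():
    llm = OllamaLLM(api_key="ollama", model="llama3.2:latest", temperature=0.7)
    # The base URL should be normalized to end with /v1.
    assert llm.base_url.endswith("/v1")
    assert llm.type == "ollama"
```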

Known Issues:

  • There is still a known issue where the generated LLM response may not always return the expected Python code snippet (resulting in either a NoCodeFoundError or an ExecuteSQLQueryNotUsed error). This is due to the inherent variability in model outputs and will be improved in future updates; a rough retry workaround is sketched below.
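
As a stopgap, callers can retry the query a few times. This is a rough illustration only; the exception import paths are assumptions and may differ across PandaAI versions:

```python
# Hypothetical retry wrapper; exception names/paths assumed, adjust to the actual API.
from pandasai.exceptions import ExecuteSQLQueryNotUsed, NoCodeFoundError


def chat_with_retry(df, question: str, attempts: int = 3):
    """Retry df.chat() a few times, since model output is non-deterministic."""
    last_err = None
    for _ in range(attempts):
        try:
            return df.chat(question)
        except (NoCodeFoundError, ExecuteSQLQueryNotUsed) as err:
            last_err = err
    raise last_err
```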

How to Test

  1. Installation:

    • Clone the repository and navigate to the extension directory.
    • Run:
      poetry install
      poetry add pandasai-ollama
  2. Configuration & Usage:

    • Configure via environment variables or directly in code (see updated README).
  3. Run Tests:

    • Execute:
      poetry run pytest extensions/llms/ollama/tests

Please review my latest changes and let me know if further modifications are required.


## Features

- **Ollama Integration:** Leverage Ollama's powerful language models (e.g., `llama2`, `llama3.2`) within PandaAI.
- **Customizable Base URL:** Easily change the Ollama base URL (default is `http://localhost:11434/v1`) to point to your own Ollama server.
- **Flexible Model Parameters:** Configure model parameters such as `temperature`, `max_tokens`, `top_p`, `frequency_penalty`, etc.
- **Chat & Non-Chat Modes:** Supports both conversational (chat) mode and standard completion mode.

## Installation

1. **Clone the Repository** (if you haven't already):
```bash
git clone https://github.com/sinaptik-ai/pandas-ai.git
cd pandas-ai
```

2. **Navigate to the Ollama Extension Directory:**
```bash
cd extensions/llms/ollama
```

3. **Install the Extension Dependencies:**
```bash
poetry install

poetry add pandasai-ollama
```

> **Note:** If you encounter packaging issues, ensure that this directory contains a valid `README.md` file. This README file is required for Poetry to install the project.

## Configuration

### Environment Variables

You can configure the extension by setting the following environment variables:

- **`OLLAMA_API_KEY`**
A dummy API key is required by the extension (the key itself is not used).
*Default:* `ollama`

- **`OLLAMA_BASE_URL`**
  The base URL of your Ollama server; the extension normalizes it to end with `/v1`.
  *Default:* `http://localhost:11434`
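
For example, in a shell (the values shown are the defaults, included for illustration):

```bash
# Both variables are optional; these are the defaults the extension falls back to.
export OLLAMA_API_KEY="ollama"                   # dummy key; Ollama does not validate it
export OLLAMA_BASE_URL="http://localhost:11434"  # normalized by the extension to .../v1
```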

### Code-Based Configuration

You can also override configuration options directly in your code when setting up PandaAI:

```python
import pandasai as pai
from extensions.llms.ollama.pandasai_ollama.ollama import OllamaLLM

# For Ollama, we use a dummy API key ("ollama") since it isn’t used.
pai.api_key.set("ollama")

# Set the global configuration to use the Ollama LLM
pai.config.set(
    {
        "llm": OllamaLLM(
            api_key="ollama",
            ollama_base_url="http://localhost:11434",  # Custom URL if needed
            model="llama3.2:latest",  # Specify the model (can be overridden)
            temperature=0.7,
            max_tokens=150,
        )
    }
)
```

## Usage

Once you have configured the extension, you can use PandaAI’s DataFrame interface to interact with your data. For example:

```python
import pandasai as pai

# Create a sample DataFrame
df = pai.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy"],
    "revenue": [5000, 3200, 2900, 4100, 2300]
})

# Ask a natural language question that expects a Python code answer
response = df.chat("Which are the top 5 countries by sales?")
print("Response from Ollama:", response)
```

The extension sends your prompt (and any conversation context) to the Ollama LLM backend. The LLM is expected to return a valid Python code snippet that, when executed, produces the desired result.
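
For illustration only, a well-formed response might look roughly like the following hypothetical snippet (the exact contract, including the `execute_sql_query` helper and the `result` dictionary, is defined by PandaAI's prompt templates, not by this extension):

```python
# Hypothetical example of a snippet the LLM might return; PandaAI executes it in a
# sandbox where `execute_sql_query` is provided by the runtime.
sql_query = "SELECT country, revenue FROM df ORDER BY revenue DESC LIMIT 5"
top_countries = execute_sql_query(sql_query)
result = {"type": "dataframe", "value": top_countries}
```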


115 changes: 115 additions & 0 deletions extensions/llms/ollama/pandasai_ollama/ollama.py
@@ -0,0 +1,115 @@
import os
from typing import Any, Dict, Optional

import openai

from pandasai.agent.state import AgentState
from pandasai.core.prompts.base import BasePrompt
from pandasai.core.prompts.generate_system_message import GenerateSystemMessagePrompt
from pandasai.helpers import load_dotenv
from pandasai.llm.base import LLM

# Load .env if present
load_dotenv()


class OllamaLLM(LLM):
    _type: str = "ollama"
    model: str = "llama2"  # default model, can be overridden

    def __init__(
        self,
        api_key: Optional[str] = None,
        ollama_base_url: Optional[str] = None,
        **kwargs: Any,
    ) -> None:
        # For Ollama, a dummy API key is required (but not used)
        self.api_key = api_key or os.getenv("OLLAMA_API_KEY", "ollama")
        # Get base URL from parameter or env, default to localhost
        base = ollama_base_url or os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
        # Ensure the base URL ends with "/v1"
        if not base.rstrip("/").endswith("/v1"):
            base = base.rstrip("/") + "/v1"
        self.base_url = base

        # Set additional parameters (e.g. model, temperature, etc.)
        self._set_params(**kwargs)

        # Assume chat mode by default
        self._is_chat_model = True
        self.client = openai.OpenAI(
            api_key=self.api_key,
            base_url=self.base_url,
            **self._client_params,
        ).chat.completions

    def _set_params(self, **kwargs: Any) -> None:
        valid_params = [
            "model",
            "temperature",
            "max_tokens",
            "top_p",
            "frequency_penalty",
            "presence_penalty",
            "stop",
            "n",
            "best_of",
            "request_timeout",
            "max_retries",
            "seed",
        ]
        for key, value in kwargs.items():
            if key in valid_params:
                setattr(self, key, value)

    @property
    def _client_params(self) -> Dict[str, Any]:
        return {
            "timeout": getattr(self, "request_timeout", None),
            "max_retries": getattr(self, "max_retries", 2),
        }

    @property
    def _default_params(self) -> Dict[str, Any]:
        params: Dict[str, Any] = {
            "temperature": getattr(self, "temperature", 0),
            "top_p": getattr(self, "top_p", 1),
            "frequency_penalty": getattr(self, "frequency_penalty", 0),
            "presence_penalty": getattr(self, "presence_penalty", 0),
            "n": getattr(self, "n", 1),
        }
        if hasattr(self, "max_tokens") and self.max_tokens is not None:
            params["max_tokens"] = self.max_tokens
        if hasattr(self, "stop") and self.stop is not None:
            params["stop"] = [self.stop]
        if hasattr(self, "best_of") and self.best_of > 1:
            params["best_of"] = self.best_of
        return params

    def call(self, instruction: BasePrompt, context: AgentState = None) -> str:
        # Get the base prompt string from the user instruction.
        prompt_str = instruction.to_string()
        # If a context is provided with conversation memory,
        # prepend the system prompt (generated via GenerateSystemMessagePrompt).
        if context and context.memory:
            system_prompt = GenerateSystemMessagePrompt(memory=context.memory)
            prompt_str = system_prompt.to_string() + "\n" + prompt_str

        if self._is_chat_model:
            response = self.client.create(
                model=self.model,
                messages=[{"role": "user", "content": prompt_str}],
                **self._default_params,
            )
            return response.choices[0].message.content
        else:
            response = self.client.create(
                model=self.model,
                prompt=prompt_str,
                **self._default_params,
            )
            return response.choices[0].text

    @property
    def type(self) -> str:
        return self._type