
How can I evaluate Ollama models such as Llama 3.3 using RAGAS? #1857

Open
HaniaMo opened this issue Jan 20, 2025 · 4 comments
Labels
question Further information is requested

Comments


HaniaMo commented Jan 20, 2025

[ ] I checked the documentation and related resources and couldn't find an answer to my question.

HaniaMo added the question (Further information is requested) label on Jan 20, 2025
Vidit-Ostwal (Contributor) commented

@dosu Can you solve this?


dosubot bot commented Jan 20, 2025

You can use RAGAS with Ollama models with the following code:

from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings

ollama_llm = "mistral"
model = ChatOllama(model=ollama_llm)
embeddings_ollama = OllamaEmbeddings(model="mistral")

This solution was suggested in a similar closed issue and received positive feedback [1].
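
For reference, a minimal end-to-end sketch of wiring these objects into evaluate() might look like the following. This is an assumption-laden sketch rather than something taken from the thread: it presumes ragas' LangchainLLMWrapper / LangchainEmbeddingsWrapper classes and a local Ollama server with the mistral model pulled, and the question/answer/contexts values are placeholders.

from datasets import Dataset
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from ragas import evaluate
from ragas.metrics import faithfulness
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper

# Wrap the local Ollama model so ragas can use it as judge LLM and for embeddings
ollama_llm = LangchainLLMWrapper(ChatOllama(model="mistral"))
ollama_embeddings = LangchainEmbeddingsWrapper(OllamaEmbeddings(model="mistral"))

# Placeholder single-row evaluation dataset
dataset = Dataset.from_dict({
    "question": ["What is Ollama?"],
    "answer": ["Ollama runs large language models locally."],
    "contexts": [["Ollama is a tool for running LLMs on your own machine."]],
})

results = evaluate(dataset=dataset, metrics=[faithfulness],
                   llm=ollama_llm, embeddings=ollama_embeddings)
print(results)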


HaniaMo (Author) commented Jan 22, 2025

I successfully ran RAGAS to evaluate an Ollama model. However, I get a timeout error during evaluation and NaN values for the metrics, even after increasing the timeout in the RunConfig.

Evaluating: 0%| | 0/1 [00:00<?, ?it/s]ERROR:ragas.executor:Exception raised in Job[0]: TimeoutError()
{'faithfulness': nan}

This is the function I used to calculate the evaluation metrics:

def RAGAS_metrics(user_query, model_response, contexts_list):
    # 6.2 - Run RAGAS metrics
    print("\nRunning RAGAS evaluation metrics...")
    # Build the dataset dict
    data = {
        "question": [user_query],        # user_query is a string of length 182
        "answer": [model_response],      # model_response is a string of length 3361
        "contexts": contexts_list,       # list of contexts: [[size=10]]
    }
    dataset = Dataset.from_dict(data)
    results = evaluate(dataset=dataset, metrics=[faithfulness], llm=llm_factory(),
                       embeddings=embedding_factory(),
                       run_config=RunConfig(max_workers=8, timeout=1000, log_tenacity=True))
    print("Evaluation Results from ragas:")
    print(results)
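
One thing worth checking here (my assumption, not something confirmed in the thread): llm_factory() and embedding_factory() construct ragas' default judge LLM and embeddings rather than the local Ollama model, so the faithfulness calls may be going to a different, possibly unreachable or slow, endpoint and timing out. A hedged sketch of passing the local Ollama model explicitly instead; the model name and dataset values are placeholders:

from datasets import Dataset
from langchain_community.chat_models import ChatOllama
from langchain_community.embeddings import OllamaEmbeddings
from ragas import evaluate
from ragas.metrics import faithfulness
from ragas.run_config import RunConfig
from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper

# Hypothetical judge built from the same local Ollama model being evaluated
judge_llm = LangchainLLMWrapper(ChatOllama(model="llama3.3"))
judge_embeddings = LangchainEmbeddingsWrapper(OllamaEmbeddings(model="llama3.3"))

# Placeholder dataset in the same shape RAGAS_metrics builds above
dataset = Dataset.from_dict({
    "question": ["..."],
    "answer": ["..."],
    "contexts": [["..."]],
})

results = evaluate(dataset=dataset, metrics=[faithfulness],
                   llm=judge_llm, embeddings=judge_embeddings,
                   run_config=RunConfig(max_workers=8, timeout=1000, log_tenacity=True))
print(results)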

jjmachan (Member) commented

Sadly, this is a duplicate of #1170.
Today we don't support Ollama models, but we will get this fixed in the coming weeks.
