
How to solve the ("All samples must be of the same type") issue? #1875

Closed
tvsathish opened this issue Jan 23, 2025 · 5 comments
Labels: answered 🤖 (The question has been answered. Will be closed automatically if no new comments), bug (Something isn't working), question (Further information is requested)

Comments

tvsathish commented Jan 23, 2025

[ ] I have checked the documentation and related resources and couldn't resolve my bug.

Describe the bug
I am getting the following error when manually creating SingleTurnSamples from my dataset:
("All samples must be of the same type")
How do I find the DataFrame row that produces the mismatched sample record?

Ragas version: 0.2.12
Python version: 3.9

Code to Reproduce

import pprint
import re

import pandas as pd
from langchain_openai import AzureChatOpenAI
from langchain_openai import AzureOpenAIEmbeddings
from ragas import SingleTurnSample, EvaluationDataset
from ragas import evaluate
from ragas.embeddings import LangchainEmbeddingsWrapper
from ragas.llms import LangchainLLMWrapper
from ragas.metrics import LLMContextRecall, Faithfulness, FactualCorrectness, SemanticSimilarity

from urllib.request import urlopen
from bs4 import BeautifulSoup

from concurrent.futures import ThreadPoolExecutor, as_completed

def empty_nan_value(cell_value):
    return '' if pd.isna(cell_value) else cell_value


def create_turn_sample(row):
    url = re.split(r'[,\n ]+', empty_nan_value(row['reference']))[0]
    page = ''
    try:
        page = urlopen(url)
    except ValueError:
        return
    soup = BeautifulSoup(page, features='lxml')
    return SingleTurnSample(
        user_input=row['user_input'],
        retrieved_contexts=[empty_nan_value(row['context1']), empty_nan_value(row['context2']),
                            empty_nan_value(row['context3']),
                            empty_nan_value(row['context4'])],
        response=empty_nan_value(row['response']),
        reference=soup.get_text())


df = pd.read_excel("Test Automation Result.xlsx")

with ThreadPoolExecutor(max_workers=50) as executor:
    future_to_row = {
        executor.submit(create_turn_sample, row): index for index, (idx, row) in enumerate(df.iterrows(), start=0)
    }

samples = []
for future in as_completed(future_to_row):
    status = future_to_row[future]
    samples.append(future.result())

pprint.pprint(samples)

eval_dataset = EvaluationDataset(samples)

# other configuration
azure_config = {
    "base_url": <BASE_URL>,
    "model_deployment": <DEPLOYMENT_NAME>,
    "model_name": "gpt-4o"  # your model name
}

evaluator_llm = LangchainLLMWrapper(AzureChatOpenAI(
    openai_api_version="2024-08-01-preview",
    azure_endpoint=azure_config["base_url"],
    azure_deployment=azure_config["model_deployment"],
    model=azure_config["model_name"],
    validate_base_url=False,
))

metrics = [
    LLMContextRecall(llm=evaluator_llm),
    FactualCorrectness(llm=evaluator_llm),
    Faithfulness(llm=evaluator_llm)
]
results = evaluate(dataset=eval_dataset, metrics=metrics)
pprint.pprint(results)

df = results.to_pandas()
df.head()

I can't share the Excel sheet itself for privacy reasons.

Error trace
Traceback (most recent call last):

  File "<HOME_DIR>/Desktop/rag_eval.py", line 53, in <module>
    eval_dataset = EvaluationDataset(samples)
  File "<string>", line 4, in __init__
  File "<PROJECT_DIR>/.venv/lib/python3.9/site-packages/ragas/dataset_schema.py", line 173, in __post_init__
    self.samples = self.validate_samples(self.samples)
  File "<PROJECT_DIR>/.venv/lib/python3.9/site-packages/ragas/dataset_schema.py", line 193, in validate_samples
    raise ValueError("All samples must be of the same type")
ValueError: All samples must be of the same type

Expected behavior
I expected the samples to be created properly and evaluation to start

Additional context
Please help me find the troublesome sample record and where the problem is. At the moment, the error message by itself is not enough to spot the bad record among the many samples created.

@tvsathish tvsathish added the bug Something isn't working label Jan 23, 2025
@dosubot dosubot bot added the question Further information is requested label Jan 23, 2025
@sahusiddharth
Collaborator

Hi @tvsathish,

The error you're encountering is likely due to a mismatch in the types of items in the list you're using to create the EvaluationDataset. When you create the object like this:

eval_dataset = EvaluationDataset(samples)

it internally calls the validate_samples function, which ensures that all samples are of the same type. The relevant part of the code is:

def validate_samples(self, samples: t.List[Sample]) -> t.List[Sample]:
    """Validates that all samples are of the same type."""
    if len(samples) == 0:
        return samples

    first_sample_type = type(samples[0])
    if not all(isinstance(sample, first_sample_type) for sample in samples):
        raise ValueError("All samples must be of the same type")

    return samples

This function checks that each sample in the list is of the same type as the first one. If there's any inconsistency in the sample types, it raises a ValueError.

To fix the error, please ensure that all the samples are of the same type before passing them to the EvaluationDataset.
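One quick way to locate the offending entries, since create_turn_sample returns None when urlopen raises ValueError, so the list can mix None with SingleTurnSample objects: a small helper that reports every index whose type differs from the first element's. This is a hedged sketch (find_mismatched is a hypothetical name, not part of ragas):

```python
def find_mismatched(samples):
    """Return (index, type name) for every entry whose type differs
    from the type of the first entry. Empty list means all match."""
    if not samples:
        return []
    expected = type(samples[0])
    return [
        (i, type(s).__name__)
        for i, s in enumerate(samples)
        if not isinstance(s, expected)
    ]
```

Calling `find_mismatched(samples)` just before constructing the EvaluationDataset should print the positions of any None (or otherwise mistyped) entries, which you can then map back to rows of your DataFrame.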

@tvsathish
Author

Dear @sahusiddharth, I understand the validation happening here. The question is: how do I find out which record, among the many I created, is the mismatch? Can you please suggest a way?
Thanks,
Paddy

@sahusiddharth
Collaborator

Hi @tvsathish,

I understand your concern. To find the mismatched records, you can add type checking and logging inside the create_turn_sample function. This will help identify and log any records that don’t match the expected types, making it easier to spot the problematic ones.

def empty_nan_value(cell_value, default=''):
    """Helper function to return empty string if NaN, else the cell value."""
    return '' if pd.isna(cell_value) else cell_value

def create_turn_sample(row):
    # Extract URL and safely handle it
    url = re.split(r'[,\n ]+', empty_nan_value(row['reference']))[0]
    
    try:
        page = urlopen(url)
    except ValueError:
        return
    
    # Parse the page with BeautifulSoup
    soup = BeautifulSoup(page, features='lxml')
    
    # Get user input (ensure it's a string)
    user_input = str(row['user_input']) if isinstance(row['user_input'], str) else ''
    
    # Get the list of contexts, ensuring each is a string
    retrieved_contexts = [str(empty_nan_value(row.get(f'context{i}'))) for i in range(1, 5)]
    
    # Get response (ensure it's a string)
    response = str(empty_nan_value(row['response']))
    
    # Get reference text (ensure it's a string)
    reference = str(soup.get_text())
    
    # Return the created SingleTurnSample
    return SingleTurnSample(
        user_input=user_input,
        retrieved_contexts=retrieved_contexts,
        response=response,
        reference=reference
    )
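Note that this version still returns None when the URL fetch fails, so those entries should be filtered out before constructing the EvaluationDataset, otherwise the same ValueError will recur. A minimal sketch, assuming the futures loop from the original report (drop_failed is a hypothetical helper, not part of ragas):

```python
def drop_failed(results):
    """Filter out None entries produced when create_turn_sample
    bails out on a bad URL, keeping only real sample objects."""
    return [s for s in results if s is not None]
```

For example, `samples = drop_failed(future.result() for future in as_completed(future_to_row))` would hand EvaluationDataset a homogeneous list.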

@Vidit-Ostwal
Contributor

Vidit-Ostwal commented Jan 24, 2025

@sahusiddharth @tvsathish

I think a better solution would be to change the validate_samples function to raise an error saying which sample is of the wrong type, so the user knows where to look.

Kind of like this

def validate_samples(self, samples: t.List[Sample]) -> t.List[Sample]:
    """Validates that all samples are of the same type."""
    if len(samples) == 0:
        return samples

    first_sample_type = type(samples[0])
    for i, sample in enumerate(samples):
        if not isinstance(sample, first_sample_type):
            raise ValueError(f"Sample at index {i} is of type {type(sample)}, expected {first_sample_type}")

    return samples
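To illustrate the improved message outside ragas, here is a minimal, self-contained sketch using stand-in classes (A and B are placeholders for sample types, not ragas classes):

```python
class A:  # stand-in for one sample type
    pass

class B:  # stand-in for a different sample type
    pass

def validate_samples(samples):
    """Raise a ValueError that names the index of the first mismatched sample."""
    if len(samples) == 0:
        return samples
    first_sample_type = type(samples[0])
    for i, sample in enumerate(samples):
        if not isinstance(sample, first_sample_type):
            raise ValueError(
                f"Sample at index {i} is of type {type(sample)}, "
                f"expected {first_sample_type}"
            )
    return samples
```

Running `validate_samples([A(), A(), B()])` now fails with a message pointing at index 2, instead of the opaque "All samples must be of the same type".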

jjmachan pushed a commit that referenced this issue Jan 24, 2025
Changed the validate_samples functionality to also tell which indexed
sample is causing the issue. #1875

Co-authored-by: Vidit Ostwal <[email protected]>
@jjmachan
Member

hey @Vidit-Ostwal, that would be a better error message, like you suggest 🙂
thanks again for the PR and helping out @tvsathish

@jjmachan jjmachan added the answered 🤖 The question has been answered. Will be closed automatically if no new comments label Jan 24, 2025