Testset generation lost when a single generation fails #2022

Open
kevinmessiaen opened this issue Sep 4, 2024 · 2 comments
Labels
enhancement New feature or request good first issue Good for newcomers

Comments

@kevinmessiaen
Member

Issue Type

Bug

Source

source

Giskard Library Version

2.15.0

OS Platform and Distribution

macOS

Python version

3.12.4

Installed python packages

No response

Current Behaviour?

When generating a testset, it takes around 45-60 minutes to go through a document and produce the topics.

Then it starts generating the testset, but it fails at around 80% because of a JSONDecode error, and I have to restart from scratch, including the whole topic generation step.

We should implement a way to save progress, since errors like this are very likely when generating a testset from very long documents.
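
To illustrate what saving progress could look like for the slow topic step, here is a minimal sketch. It is not Giskard's API: `load_or_compute` and the `compute` callable are hypothetical names, and it assumes the topics can be serialized to JSON.

```python
import json
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_compute(cache_path: Path, compute: Callable[[], T]) -> T:
    """Return the cached result if the file exists, otherwise compute and save it.

    Hypothetical helper: `compute` stands in for the expensive topic step
    and must return something JSON-serializable.
    """
    if cache_path.exists():
        return json.loads(cache_path.read_text(encoding="utf-8"))

    result = compute()

    # Write to a temporary file first, then rename it, so an interrupted
    # write does not leave a half-written cache file behind.
    tmp_path = cache_path.with_name(cache_path.name + ".tmp")
    tmp_path.write_text(json.dumps(result), encoding="utf-8")
    tmp_path.replace(cache_path)
    return result
```

With something like this wrapped around the topic step, a failure later in the testset generation would only cost the generation time on restart, not the 45-60 minutes of topic extraction.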

Standalone code OR list down the steps to reproduce the issue

TODO

Relevant log output

No response

@elsatch

elsatch commented Sep 5, 2024

Thanks for registering this issue. I've seen this behavior in a lot of projects that create large datasets on the assumption that the main process won't fail.

Besides catching the errors and skipping them so the process can finish, I think it would be very useful to append the generated results to a local file. That way, if the process fails, you still have some output, and it is easier to pinpoint the problem in the input data. Otherwise, everything just stays in memory, and when it fails... it's gone.
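
A rough sketch of that idea, to make it concrete. Everything here is an assumption for illustration: `generate_with_checkpoint` and `generate_one` are hypothetical names, not part of Giskard, and each input is assumed to carry a unique string key so a rerun can skip what was already written.

```python
import json
from pathlib import Path
from typing import Any, Callable, Iterable


def _load_done_keys(out_path: Path) -> set[str]:
    """Collect the keys already saved in the JSONL file, if it exists."""
    done: set[str] = set()
    if out_path.exists():
        for line in out_path.read_text(encoding="utf-8").splitlines():
            try:
                done.add(json.loads(line)["key"])
            except (json.JSONDecodeError, KeyError, TypeError):
                continue  # skip blank or malformed lines
    return done


def generate_with_checkpoint(
    items: Iterable[tuple[str, Any]],
    generate_one: Callable[[Any], dict],
    out_path: Path,
) -> list[str]:
    """Append each result to a JSONL file as soon as it is produced.

    Items whose key is already in the file are skipped, and items that
    raise are recorded and skipped. Returns the list of failed keys.
    """
    done = _load_done_keys(out_path)
    failed: list[str] = []
    with out_path.open("a", encoding="utf-8") as f:
        for key, item in items:
            if key in done:
                continue
            try:
                result = generate_one(item)
            except Exception as exc:  # broad on purpose for this sketch
                failed.append(key)
                print(f"skipping {key}: {exc!r}")
                continue
            f.write(json.dumps({"key": key, "result": result}) + "\n")
            f.flush()  # push each line out of Python's buffer right away
    return failed
```

Rerunning with the same output file would then only retry the items that failed or were never reached, and the returned keys point at the inputs worth inspecting.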

@henchaves
Member

Hey @elsatch, thanks a lot for your suggestion.
Would you be interested in implementing this?

@henchaves henchaves added enhancement New feature or request good first issue Good for newcomers labels Oct 14, 2024
Development

No branches or pull requests

3 participants