Add explicit warnings for zero-identity comparison runs #145

widdowquinn · 2019-10-01T16:47:11Z

Summary:

Several issues have been raised (#93, #72, #73, etc.) where the underlying problem is that a comparison returned zero identity and errors were then thrown downstream. Throwing a specific error might be one way to alert users that the problem is in the dataset, not the analysis code.

baileythegreen · 2021-07-11T21:17:28Z

Where is the best place for such an error to be thrown so that it is helpful for the user? Looking at the Issues you cite, both anim.py and graphics.py seem like possibilities.

widdowquinn · 2021-07-12T12:23:37Z

I might think of doing it when parsing the input file. We have options here, I think. Do we, for instance:

- log a WARNING and continue, treating the input as zero identity and substituting in dummy values
- log an ERROR, raise an exception and halt, forcing the user to attend to or remove one or other input file
- log an INFO message and continue, treating the input as zero identity and substituting in dummy values

If we're not halting, then we could warn on file parsing and/or when comparisons are complete, so that the individual comparison is flagged and the total influence on the comparisons is noted (e.g. "6/n comparisons had zero identity values, please check that the input for [list of files] is what you expect").

I think it might be helpful to report the associated comparison commands so that the user can investigate more easily.
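The options above could be sketched roughly as follows. This is a minimal illustration only, not pyani code: the `Comparison` record, `ZeroIdentityError`, and `check_zero_identity` are hypothetical names, and the `halt` flag simply switches between the "log an ERROR and halt" and "log a WARNING and continue" behaviours. The INFO option would differ only in the logging level used.

```python
import logging
from dataclasses import dataclass
from typing import Iterable, List

logger = logging.getLogger("zero_identity_sketch")


class ZeroIdentityError(ValueError):
    """Hypothetical exception for a zero-identity comparison."""


@dataclass
class Comparison:
    # Hypothetical record; these fields are assumptions, not pyani's data model.
    query: str
    subject: str
    identity: float
    command: str  # command line that produced this comparison


def check_zero_identity(
    comparisons: Iterable[Comparison], halt: bool = False
) -> List[Comparison]:
    """Flag each zero-identity comparison, then log a summary of the total."""
    comparisons = list(comparisons)
    flagged = []
    for comp in comparisons:
        if comp.identity != 0:
            continue
        detail = f"{comp.query} vs {comp.subject} (command: {comp.command})"
        if halt:
            # Option: log an ERROR, raise an exception and halt.
            logger.error("Zero identity for %s", detail)
            raise ZeroIdentityError(f"Zero identity for {detail}")
        # Option: log a WARNING and continue.
        logger.warning("Zero identity for %s", detail)
        flagged.append(comp)
    if flagged:
        # Summary once all comparisons are complete.
        files = sorted({name for c in flagged for name in (c.query, c.subject)})
        logger.warning(
            "%d/%d comparisons had zero identity values, please check that "
            "the input for %s is what you expect",
            len(flagged),
            len(comparisons),
            ", ".join(files),
        )
    return flagged
```

Returning the flagged comparisons, commands included, would let the caller decide how to report them or substitute dummy values.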

Maybe we need a new log table in the database to associate messages like this with the originating run?
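As a rough sketch of what such a table might look like, the example below uses the standard-library `sqlite3` module so that it is self-contained. The table name `run_logs`, its columns, and the helper functions are all assumptions made for illustration; they are not part of pyani's existing schema.

```python
import sqlite3
from datetime import datetime, timezone


def create_log_table(conn: sqlite3.Connection) -> None:
    """Create a hypothetical table linking log messages to a run."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS run_logs (
            log_id INTEGER PRIMARY KEY,
            run_id INTEGER NOT NULL,
            level TEXT NOT NULL,
            message TEXT NOT NULL,
            created TEXT NOT NULL
        )
        """
    )


def log_to_run(conn: sqlite3.Connection, run_id: int, level: str, message: str) -> None:
    """Store one message against the run that produced it."""
    conn.execute(
        "INSERT INTO run_logs (run_id, level, message, created) VALUES (?, ?, ?, ?)",
        (run_id, level, message, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


def messages_for_run(conn: sqlite3.Connection, run_id: int) -> list:
    """Return (level, message) pairs for a run, in insertion order."""
    rows = conn.execute(
        "SELECT level, message FROM run_logs WHERE run_id = ? ORDER BY log_id",
        (run_id,),
    )
    return rows.fetchall()
```

Keying each row on a run identifier is what keeps messages tied to the originating run without needing separate log files.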

baileythegreen · 2021-07-12T13:17:22Z

I think adding logs to the database is a really good idea; it is a good way of keeping them associated with a particular run without needing to keep a bunch of log files (if one chooses not to).

My only concern would be with the choice of what to store; storing everything for a run may be a lot and there may be a point at which it is too much to be very useful. (Like how I feel about the pytest logging output.)

widdowquinn · 2021-07-12T14:11:28Z

It will be a case of trying things out to see if they're proportionate, I think.

We certainly need the versions of tools, the command line used, and the date and time: all the things you'd want for reproducibility/reporting in a publication.
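A minimal sketch of gathering that information might look like the following. The function name `run_metadata` and the dictionary keys are hypothetical, and the third-party tool versions are assumed to be supplied by the caller, since how they are obtained depends on each tool.

```python
import platform
import sys
from datetime import datetime, timezone
from typing import Dict, List, Optional


def run_metadata(
    tool_versions: Dict[str, str], argv: Optional[List[str]] = None
) -> Dict[str, object]:
    """Collect reproducibility details for one run (hypothetical layout)."""
    return {
        # Command line as invoked; defaults to the current process's arguments.
        "command_line": " ".join(sys.argv if argv is None else argv),
        # Date and time of the run, in UTC.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "python_version": platform.python_version(),
        # Versions of external tools, passed in by the caller.
        "tool_versions": dict(tool_versions),
    }
```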

For other logs a natural choice would be: anything that could be an error and might have affected the outputs/results, which I think would include the "zero identity" case above.
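One way to apply that rule of thumb, sketched here with the standard `logging` module, is a handler that keeps only records at WARNING level or above. The class name `RunLogCollector` is hypothetical, and the WARNING threshold is an assumption about where "might have affected the results" begins.

```python
import logging


class RunLogCollector(logging.Handler):
    """Keep only messages severe enough to have affected the results."""

    def __init__(self, level: int = logging.WARNING) -> None:
        # The handler's level filters out anything below the threshold.
        super().__init__(level=level)
        self.records = []

    def emit(self, record: logging.LogRecord) -> None:
        # Store the level name and the fully formatted message.
        self.records.append((record.levelname, record.getMessage()))
```

Attached to a logger, the collector would ignore INFO and DEBUG output, and its retained records could then be written out in one go at the end of a run.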

I'm running out of ideas after that. I expect more candidates will present themselves.

widdowquinn self-assigned this Oct 1, 2019

widdowquinn added the "enhancement" label (something we'd like pyani to do that it doesn't already) Oct 1, 2019

widdowquinn added this to the 0.3.0 milestone May 29, 2020

widdowquinn added the "interface" label (issues related to how the user tells pyani to do something) May 29, 2020
