-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DataCalibrated -- The worst name possible, and how we can make it better #208
Comments
TimeLine is semantically wrong, see I would propose
|
The group of operations labeled as "calibration" varies in size and scope. E.g. the process of extracting the "charge" from a "time_series" is sometimes called calibration, c.f. MARS/Callisto. The problem, as I understand @relleums, was not that he was confusing a time_series with a single charge value. So in a sense, the "time_series" was clear in the context and thus could be considered noise. I think the proposal by @maxnoe is not sufficient, since it would not have solved the confusion described by @relleums. He new it was "voltages", he did not know if these voltages were already corrected for a certain effect or not. The problem with naming these time_series as I see it is this. We have to perform a certain number of "calibration"-steps on the "raw" adc_time_series as they are written into these files ending in ".fits.fz", these include:
and possibly more. one could store the time series before and after each of these steps in this "DataItemMapDictThing" and try to find apropriate names to convey to a passerby which step produces a certain time_series, e.g.
But "after_jump_removal" does not tell me if the "drs_time_calibration" has been performed already or not yet, since one might not know the order of these steps. writing the entire history of steps into the name of a time_series is utterly out of question. So I wonder ... is there a solution at all? Or do we have to face the fact, that any name must omit some part of information? And that @relleums premise to assume he had a chance to understand what calibration steps were done or not done, by looking at the name of the thing, was simply wrong. Our system is so complicated that one can only understand what operations have been done on a certain time_series in the DataItem by carefully reading the code and the xml, since people might bypass certain processors by funny usage of keys for the processors in their xml. As a solution I would propose this. We have some experience after 5 years now and we kind of know what steps we want to perform on a time series before using it for anything. So this means there are two kinds of people, those who want to operate on a "ready to use" time_series. They should simply see the key "time_series" ... no need to show them any of the intermediate results. And then there are people who are interested in optimizing one of these steps ... be it jump finding or drs-time-calibration ... these people I would consider experts ... they might need access to each and every intermediate step of the calibration-series in order to find out where a certain artifact was introduced. For them "adc_time_series" and "voltage_time_series" does not even start to solve any problem. Fazit: Let's remove all time_series from the DataItemThingy but the single one, that is useful. Let's call that one: "time_series" |
I don't think this is worth the effort. It's only internal and never written out. Please open again, if you have a strong different opinion. |
Again I fell into the pit (#197) of assuming that the 'calibrated' in
DataCalibrated
is about the calibrations I am thinking of. In this particular case it turns out that the single pulse gain calibration, which mainly corrects for the individual pixel pre-amp magnification factors, is not yet applied toDataCalibrated
. BothData
andCalibrated
are imho very ambiguous names and must be replaced for future students. I propose to rename the variables and change the processing flow a bit:We introduce a single pulse (pre-amp) gain calibration service as I started it in PR #207 .
We remove the single pulse (pre-amp) gain correction from
BasicExtraction
into a processor of its own maybe calledfact.datacorrection.SinglePulseGainCorrection
.Personally, I prefer
TimeLines
overTimeSerieses
because of the plural 's' confusion.Btw. Why is
<fact.filter.DrsTimeCalibration outputKey="timeCalibConst" />
not a calibration service OR the other way round: Why are not all calibration services processors?This topic is now hot again because there are now to processors which need the single pulse (pre-amp) gain correction (
BasicExtraction
andSinglePulseExtraction
).We have to unify the loading or in-code providing of calibration constants. Since there are currently different solutions to this in use I ask you: Which one do you prefer? How can we tidy this up?
The text was updated successfully, but these errors were encountered: