[MRG] Initial implementation of Hilbert detector #23
Conversation
@@ -79,8 +84,144 @@ def _write_json(fname, dictionary, overwrite=False, verbose=False):
        print(os.linesep + f"Writing '{fname}'..." + os.linesep)
        print(json_output)

def _band_zscore_detect(signal, sfreq, band_idx, l_freq, h_freq, n_times,
I did not quite fix the loop issue here, but I figured out how this works and tried to document it better.
mne_hfo/utils.py (Outdated)
@@ -139,26 +282,166 @@ def compute_line_length(signal, win_size=6):
    return data[start:-stop]


def threshold_std(signal, threshold):
def compute_hilbert(signal, extra_params):
This one is a bit different because you compute the metric per freq band, so the return ends up being an ndarray
Can you document what the array axes mean, so I can take a closer look?
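To make the axes question concrete, here is a minimal sketch of a per-band envelope computation; the function name, signature, and (n_bands, n_times) layout are assumptions for illustration, not necessarily what compute_hilbert does in this PR:

import numpy as np
from scipy.signal import hilbert
from mne.filter import filter_data

def compute_hilbert_sketch(signal, freq_cutoffs, sfreq):
    """Hilbert envelope per frequency band.

    Returns an array of shape (n_bands, n_times): axis 0 indexes the band
    (consecutive pairs in ``freq_cutoffs``), axis 1 indexes time samples.
    """
    n_bands = len(freq_cutoffs) - 1
    hfx = np.empty((n_bands, len(signal)))
    for idx, (l_freq, h_freq) in enumerate(zip(freq_cutoffs[:-1],
                                               freq_cutoffs[1:])):
        filt = filter_data(signal.astype(float), sfreq=sfreq,
                           l_freq=l_freq, h_freq=h_freq, verbose=False)
        # the envelope is the magnitude of the analytic signal
        hfx[idx] = np.abs(hilbert(filt))
    return hfx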
                cycles_threshold, gap_threshold,
                zscore_threshold]
        tdetects.append(_band_zscore_detect(args))

From here, we should have band indices, start and stop times, and max values during that band
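Purely for illustration, that would make each tdetects row look something like the following; the exact column order is an assumption inferred from this thread (band index checked at column 0, amplitude read at column 3 further down):

import numpy as np

# one row per supra-threshold event:
# [band_idx, start_sample, stop_sample, max_envelope_value]
tdetects = np.array([
    [3, 1200, 1450, 8.7],
    [4, 1210, 1460, 9.1],
])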
mne_hfo/utils.py (Outdated)
    return tdetects


def _run_detect_branch(detects, det_idx, HFO_outline):
I believe all this does is merge overlapping events
Yeah, let's see if we can refactor this then to use our version, which imo is more explicit.
This recursion imo is completely unnecessary and hard to decipher heh
mne_hfo/utils.py (Outdated)
    # corresponding to start and stop times
    outlines = []
    if len(tdetects):
        while sum(tdetects[:, 0] != 0):
I need to look more into what this function's purpose is... I don't see how tdetects is changing, so I'm not sure how this loop would ever end.
mne_hfo/utils.py (Outdated)
            max_amplitude = max(outline[:, 3])
            ch_hfos.append((start, stop, freq_min, freq_max,
                            frq_at_max, max_amplitude))

This is where the additional info comes in. Hilbert calculates start and stop times, the freq band that the detection was found in, the max amplitude of the signal during this window, and the freq where that max amplitude occurred. Should we try to work this info into the output data structures?
So based on a threshold applied, the start/stop times can get converted to our hfo_event_arr_, while we can store in parallel a same-size array with the freq band (l_freq and h_freq) for each index, another same-size array with the max_amplitude, and another array with the freq of the max amplitude.
We can either use these downstream, or scrap these and pipe them into a cohesive dataframe. This is TBD, since I'm not sure how this would look, so better to be explicit rn.
Just to be clear, you are proposing that a HilbertDetector would have additional objects assigned to it at the end of the class. We put start/stop times into the hfo_event_arr_ that we have been using. Then we have another array for freq bands and another array for amplitudes. Then we compare them just by matching row indices? Seems easy enough.
Yep. These can all be "additional" data structures that are unique to just Hilbert detectors. We can later figure out how to best combine these if one would need programmatic access to them when doing post-hoc analysis.
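As a hypothetical sketch of what those "additional" Hilbert-only attributes could look like (all attribute names other than hfo_event_arr_ are made up here):

import numpy as np

class HilbertDetectorSketch:
    def fit(self, X):
        n_chs, n_times = X.shape
        # structure already shared by all detectors
        self.hfo_event_arr_ = np.zeros((n_chs, n_times))
        # parallel, same-size structures unique to the Hilbert detector
        self.freq_band_arr_ = np.zeros((n_chs, n_times, 2))   # (l_freq, h_freq)
        self.max_amplitude_arr_ = np.zeros((n_chs, n_times))
        self.freq_at_max_arr_ = np.zeros((n_chs, n_times))
        # ...detection would fill matching indices in each array...
        return self

Matching row/column indices across these arrays is then all that is needed to relate a detection to its band, amplitude, and peak frequency.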
mne_hfo/detect.py (Outdated)
             verbose: bool = False):

    def __init__(self,
                 sfreq: float,
don't need.
mne_hfo/detect.py (Outdated)
        X = mne.filter.filter_data(X, sfreq=self.sfreq,
                                   l_freq=self.l_freq,
                                   h_freq=self.h_freq,
                                   method='iir', verbose=self.verbose)
Can you remind me why IIR vs FIR for this one? Do they use IIR in the original implementation?
If so, just document it here, perhaps as an inline comment.
Yes, that is what was in the original implementation, so I can add a comment.
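For example, the inline comment could sit right next to the filter call; shown here in a hypothetical standalone helper rather than the actual detect.py method:

import mne

def _bandpass_iir(X, sfreq, l_freq, h_freq, verbose=False):
    # IIR is used (rather than FIR) to match the original Hilbert detector
    # implementation that this port is based on.
    return mne.filter.filter_data(X, sfreq=sfreq, l_freq=l_freq,
                                  h_freq=h_freq, method='iir',
                                  verbose=verbose)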
So right now, ...
Would you mind explaining to me with some more detail here? I'm not quite following.

So Hilbert rn requires 3 thresholds: i) zscore of the Hilbert transform envelope, ii) number of cycles found, and iii) gaps (idr what gaps were, so maybe we need to document that better). The issue is ...

Some thoughts: My impression is that Hilbert only needs to refactor the ...

If thresholding needs to be a distinct step in and of itself, then we can have each ...

If I'm missing anything, lmk.
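For reference, here is a rough sketch of how I read the three thresholds interacting within a single band; the helper name, and especially the interpretation of gap_threshold as a number of cycles at l_freq, are assumptions:

import numpy as np

def _band_threshold_sketch(envelope, sfreq, l_freq, zscore_threshold,
                           cycles_threshold, gap_threshold):
    """Apply the three Hilbert thresholds to one band's envelope."""
    # i) z-score the Hilbert envelope and keep supra-threshold samples
    z = (envelope - envelope.mean()) / envelope.std()
    above = z > zscore_threshold

    # group contiguous supra-threshold samples into candidate events
    edges = np.diff(above.astype(int))
    starts = np.where(edges == 1)[0] + 1
    stops = np.where(edges == -1)[0] + 1
    if above[0]:
        starts = np.r_[0, starts]
    if above[-1]:
        stops = np.r_[stops, len(above)]

    # iii) merge events separated by a gap shorter than ``gap_threshold``
    #      cycles (units assumed; see note above)
    min_gap = int(gap_threshold * sfreq / l_freq)
    merged = []
    for s, e in zip(starts, stops):
        if merged and s - merged[-1][1] < min_gap:
            merged[-1][1] = e
        else:
            merged.append([s, e])

    # ii) keep only events lasting at least ``cycles_threshold`` cycles
    min_len = int(cycles_threshold * sfreq / l_freq)
    return [(s, e) for s, e in merged if e - s >= min_len]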
So the basic idea of ...

Once the overlapping events are merged, the HFO is defined as [min start time, max end time] and the freq band is defined as [l_freq(band_idx), h_freq(band_idx + (n-1))] for n overlapping events.

Unfortunately I don't think computational efficiency is going to be good for this no matter what, but this is generally the new idea: ...

And the merging step would be something simple like: ...

Then outlines should only have distinct HFO events.
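As a non-recursive illustration of that merging idea (this is a sketch of my reading, not the code in the PR; the [band_idx, start, stop, max_env] row layout is assumed):

def merge_band_detections_sketch(tdetects, freq_cutoffs):
    """Merge per-band detections that overlap in time, without recursion."""
    # sort by start time so each detection only needs to be compared
    # against the most recently opened outline
    tdetects = sorted(tdetects, key=lambda d: d[1])
    outlines = []
    for band_idx, start, stop, max_env in tdetects:
        if outlines and start <= outlines[-1]['stop']:
            # overlaps the current outline in time: extend it
            cur = outlines[-1]
            cur['stop'] = max(cur['stop'], stop)
            cur['band_min'] = min(cur['band_min'], band_idx)
            cur['band_max'] = max(cur['band_max'], band_idx)
            if max_env > cur['max_env']:
                cur['max_env'] = max_env
                cur['freq_at_max'] = freq_cutoffs[int(band_idx)]
        else:
            outlines.append(dict(start=start, stop=stop,
                                 band_min=band_idx, band_max=band_idx,
                                 max_env=max_env,
                                 freq_at_max=freq_cutoffs[int(band_idx)]))
    # each outline then spans [min start, max stop] with freq band
    # [l_freq(band_min), h_freq(band_max)]
    return outlines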
        print(f'Using {threshold_method} to perform HFO '
              f'thresholding.')
        if method == "time-windows":
            return detections
we don't merge time-windows?
We do, but we do it upstream. I left a TODO where it's done to refactor it here if you wanted. I think refactoring will slow it down a bit, which is why I held off.
Hmm okay I will take a look
mne_hfo/detect.py (Outdated)
        if band_method == 'log':
            low_fc = float(filter_band[0])
            high_fc = float(filter_band[1])
            freq_cutoffs = np.logspace(0, np.log10(high_fc), n_bands)
            self.freq_cutoffs = freq_cutoffs[(freq_cutoffs > low_fc) &
                                             (freq_cutoffs < high_fc)]
            self.freq_span = len(freq_cutoffs) - 1
        elif band_method == 'linear':
            self.freq_cutoffs = np.arange(filter_band[0], filter_band[1])
            self.freq_span = (filter_band[1] - filter_band[0]) - 1
This looks good, but it will not work because it goes against the sklearn-style API. Stick it inside fit() at the beginning instead.
Meaning, sklearn imposes that one does not define additional things inside __init__ except what is passed in. Not sure why, but it's just something to do with its API compliance.
Should I stick it in Hilbert's _compute_hfo_statistic function? That way we avoid a bunch of if statements in the base fit method.
Sure, that makes sense, unless there's any logic in fit that relies on this before calling _compute_hfo_statistic.
Nope, it just calls _check_input_raw, then goes right into _compute_hfo_statistic.
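Concretely, the move being discussed could look roughly like this: compute the cutoffs in a small helper called at the top of _compute_hfo_statistic instead of inside __init__ (the helper name is hypothetical; the log/linear logic mirrors the snippet quoted above):

import numpy as np

def _compute_freq_cutoffs(filter_band, band_method, n_bands):
    low_fc, high_fc = float(filter_band[0]), float(filter_band[1])
    if band_method == 'log':
        cutoffs = np.logspace(0, np.log10(high_fc), n_bands)
        cutoffs = cutoffs[(cutoffs > low_fc) & (cutoffs < high_fc)]
    elif band_method == 'linear':
        cutoffs = np.arange(low_fc, high_fc)
    else:
        raise ValueError(f'Unknown band_method: {band_method}')
    return cutoffs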
Some changes, but overall looks good, especially if unit tests are still passing.
Can you also add the general flow of an algorithm to the docstring of Detector? This way, any downstream Detectors follow this abstraction (a possible wording is sketched below this comment).
E.g.
Step 1: compute a statistic on the data
Step 2: ...
etc.
Once you also add unit tests for Hilbert and validate that this works, then LGTM.
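For instance, the docstring addition could read something like this (wording is only a suggestion, not the final text):

class Detector:
    """Base class for HFO detectors.

    Every detector follows the same general flow:

    Step 1: compute a statistic on the data (e.g. line length, RMS,
            or a per-band Hilbert envelope).
    Step 2: threshold the statistic to obtain candidate detections.
    Step 3: merge/post-process candidates (e.g. join overlapping
            events) into the final HFO annotations.
    """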
mne_hfo/utils.py (Outdated)
    # return the absolute value of the Hilbert transform.
    # (i.e. the envelope)
    hfx = np.abs(hilbert(signal))
This step is what was throwing the memory error
If you fix the CI and resolve the addressed comments, then I can take another review. We can merge and then move on to other high-priority tasks.
Codecov Report
@@ Coverage Diff @@
## master #23 +/- ##
==========================================
- Coverage 70.96% 70.00% -0.96%
==========================================
Files 13 13
Lines 1054 1167 +113
==========================================
+ Hits 748 817 +69
- Misses 306 350 +44
Continue to review full report at Codecov.
Looks like ...
No, it is used by ...
Codecov Report
@@ Coverage Diff @@
## master #23 +/- ##
==========================================
+ Coverage 70.96% 72.23% +1.26%
==========================================
Files 13 13
Lines 1054 1167 +113
==========================================
+ Hits 748 843 +95
- Misses 306 324 +18
Continue to review full report at Codecov.
PR Description
Addresses part of: #18
Refactor the detector into a 3-step process of calculate, threshold, and merge, and implement the Hilbert detector in that scheme.
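For instance, the intended sklearn-style usage would look roughly like this; the import path, constructor arguments, and whether fit takes an array or a Raw object are assumptions based on the discussion above, not the final API:

import numpy as np
from mne_hfo.detect import HilbertDetector  # import path assumed

sfreq = 2000.
X = np.random.randn(2, int(30 * sfreq))  # 2 channels, 30 s of fake data

detector = HilbertDetector(filter_band=(80, 250), band_method='log',
                           n_bands=300, threshold=3)
detector.fit(X)                    # compute statistic -> threshold -> merge
print(detector.hfo_event_arr_)     # per-channel detections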
Merge checklist
Maintainer, please confirm the following before merging: