Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix path handling for non-conforming datasets #81

Merged
merged 11 commits into from
Dec 19, 2024
Merged

Conversation

GreenK173
Copy link
Contributor

This PR replaces PR #79 because I accidentally created that one on the main branch of my fork.

Fixes #78. The matter was also recently dealt with in the SigMF specification repo in issue #321 and PR #322. In particular this PR addresses the issue raised by @lgoix in this comment.

The get_dataset_filename_from_metadata function returns the full file path for non-conforming datasets, not just the file name. Furthermore an unit test verifying this, plus a couple of typos described in the above issue fixed.

@Teque5
Copy link
Collaborator

Teque5 commented Dec 14, 2024

While rewriting your test I observed that to use a non-compliant data file we have to do something like this:

meta = SigMFFile(metadata=ncd_metadata, data_file=data_path)
# tell SigMF that the data is noncompliant
meta.set_global_field(SigMFFile.DATASET_KEY, data_path.name)

but really I think the SigMFFile should detect that data_path.suffix != '.sigmf-data' and set that key & value automatically.

We can either merge this and create another PR, or if you want you can add this improvement to this PR.

Copy link
Collaborator

@Teque5 Teque5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we never want temp files to exist after testing I used tempfile and also make some simplifications.

@GreenK173 GreenK173 mentioned this pull request Dec 19, 2024
@GreenK173
Copy link
Contributor Author

Thanks for all the feedback.

I've implemented the automatic detection of NCD discussed above, and also made a small fix in get_dataset_filename_from_metadata.

However there's another issue: the behavior of os.remove differs between Windows and Unix platforms (see the linked doc). I'm guessing that you're using Linux where it should work fine, however on Windows (which I'm using) it crashes because the dataset file is still being used by self._memmap of both SigMFFile instances and the deletion isn't allowed. Since there seams to be no good way how to close np.memmap (see numpy/numpy#13510 for more details), the only way how I was able to solve it is to get rid of both SigMFFile instances so that they get garbage collected.

Since I've also noticed another Windows/Linux problem elsewere, I've opened issue #89 regarding this.

@Teque5 Teque5 added the bug Something isn't working label Dec 19, 2024
@@ -558,6 +558,12 @@ def set_data_file(self, data_file=None, data_buffer=None, skip_checksum=False, o
self._memmap = raveled.reshape(mapped_reshape)
self.shape = self._memmap.shape if (self._return_type is None) else self._memmap.shape[:-1]

if self.data_file is not None:
file_name = path.split(self.data_file)[1]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love path handling like this, but I opened #90 and we can address it in the future.

@Teque5 Teque5 merged commit 9809de0 into sigmf:main Dec 19, 2024
3 checks passed
@Teque5
Copy link
Collaborator

Teque5 commented Dec 19, 2024

Thank you @GreenK173 for your work on NCD handling and this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

get_dataset_filename_from_metadata should return full path for non-conforming datasets + couple of typos
2 participants