Rename multiplier to frames_per_event and move to first dim of shape #726

Open
wants to merge 72 commits into
base: main

thomashopkins32
Contributor

@thomashopkins32 thomashopkins32 commented Jan 8, 2025

This PR does the following:

  • Renames multiplier -> frames_per_event
  • Adds frames_per_event as the first dimension of DataKey.shape
  • Ensures that the index provided by DetectorWriter.get_indices_written() and DetectorWriter.observe_indices_written() is divided by frames_per_event, so that each index corresponds to the correct number of exposures (except for PandA, which explicitly says it only has 1 "frame" per event)
  • Adds unit tests showing that describe() works as intended
  • Adds unit tests showing that stream resources are actually batches of exposures
  • Reorders self._writer.open() and self._writer.get_indices_written(). The writer needs to be opened before it can report the indices written. Otherwise, it has no idea what frames_per_event to use when returning the last index written.
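
The index conversion described in the third bullet can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `frames_to_event_index` is a hypothetical helper name, and the use of floor division (so that a partially written event is not reported) is an assumption.

```python
def frames_to_event_index(frames_written: int, frames_per_event: int) -> int:
    """Convert a raw count of frames written into a count of complete events.

    Hypothetical helper for illustration only. Assumes that an event is
    reported only once all of its frames have been written.
    """
    if frames_per_event < 1:
        raise ValueError("frames_per_event must be at least 1")
    # Floor division drops any trailing, incomplete batch of frames.
    return frames_written // frames_per_event
```

With `frames_per_event = 5`, ten frames on disk would be reported as index 2, and seven frames as index 1.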

I could not actually add tests that use bluesky plans and inspect the data afterward, because TriggerInfo is hardcoded in StandardDetector. I think that is a separate issue that should be raised, since it would expand the scope of this PR. I will open an issue for this soon and mention it below.

Otherwise, I have a few open questions regarding my understanding of ophyd-async, as well as the implementation, which I will also leave as review comments. Please see below.

Closes #576

jwlodek and others added 30 commits September 4, 2024 13:16
Contributor Author

@thomashopkins32 thomashopkins32 left a comment

Question about shapes when frames_per_event is 1: do we want to always "squeeze" the shape?
I.e. there are a couple of options:

  • For 2d arrays:
    • [1, h, w] -> [h, w] when frames_per_event = 1
    • [frames_per_event, h, w] when frames_per_event > 1
  • For scalar values:
    • [1,] -> [] when frames_per_event = 1
    • [frames_per_event,] when frames_per_event > 1

Currently, it is set up such that if the result would be a single scalar value, the shape would be replaced with []. Otherwise, the shape always contains the extra dim.

src/ophyd_async/epics/adcore/_core_writer.py (outdated, resolved)
src/ophyd_async/epics/adcore/_core_writer.py (outdated, resolved)
src/ophyd_async/epics/adcore/_core_writer.py (outdated, resolved)
Member

@jwlodek jwlodek left a comment

I think this is close, just a few minor notes. We should try setting this up in the lab and running a test w/ collecting data from different devices w/ different frames_per_event to make sure it behaves as expected (and also to maybe work out the needed changes to the consolidators).

src/ophyd_async/core/_detector.py (outdated, resolved)
src/ophyd_async/core/_hdf_dataset.py (outdated, resolved)
src/ophyd_async/epics/adcore/_core_writer.py (outdated, resolved)
src/ophyd_async/epics/adcore/_core_writer.py (outdated, resolved)
tests/core/test_flyer.py (outdated, resolved)
src/ophyd_async/epics/adcore/_core_writer.py (outdated, resolved)
tests/epics/adaravis/test_aravis.py (outdated, resolved)
@jwlodek
Member

jwlodek commented Jan 13, 2025

Question about shapes when frames_per_event is 1: do we want to always "squeeze" the shape? I.e. there are a couple of options:

* For 2d arrays:
  
  * `[1, h, w]` -> `[h, w]` when `frames_per_event = 1`
  * `[frames_per_event, h, w]` when `frames_per_event > 1`

* For scalar values:
  
  * `[1,]` -> `[]` when `frames_per_event = 1`
  * `[frames_per_event,]` when `frames_per_event > 1`

Currently, it is set up such that if the result would be a single scalar value, the shape would be replaced with []. Otherwise, the shape always contains the extra dim.

I think I'd be in favor of avoiding such squeezing, because then we'd need a separate parameter to let us know if it had been squeezed or not. Say we have a frames_per_event of 1 w/ a dataset that's 10 x 10. If we squeeze we get [10, 10] as the shape, but there's no way of telling if this is actually a 1D dataset of size 10 w/ 10 frames per event.
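
To make the ambiguity concrete, here is a minimal sketch. `squeezed_shape` is a hypothetical helper written only for this illustration; it is not part of ophyd-async, and the squeezing rule it applies (drop the leading dimension when it is 1) is the option under discussion, not the merged behavior.

```python
def squeezed_shape(frames_per_event: int, frame_shape: tuple[int, ...]) -> tuple[int, ...]:
    """Hypothetical squeezing rule: drop the leading dim when it equals 1."""
    full = (frames_per_event, *frame_shape)
    return full[1:] if frames_per_event == 1 else full


# One 10 x 10 image per event, with the leading 1 squeezed away.
image_single = squeezed_shape(1, (10, 10))

# Ten 1D arrays of length 10 per event, nothing squeezed.
line_batch = squeezed_shape(10, (10,))

# Both descriptors report the same shape, so a reader cannot tell them apart.
ambiguous = image_single == line_batch
```

Keeping the leading dimension in every case gives `(1, 10, 10)` for the first dataset and `(10, 10)` for the second, which stay distinguishable.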

@thomashopkins32
Contributor Author

@jwlodek So the current squeezing behavior for the shape is:

  • For 2d arrays:
    • [frames_per_event, h, w] (even if frames_per_event is 1)
  • For scalar values:
    • [1,] -> [] when frames_per_event = 1
    • [frames_per_event,] when frames_per_event > 1

The final change I would make based on your comment would be to remove the squeezing on scalar values from [1,] -> [].

@thomashopkins32
Contributor Author

Should be ready to review once more. The new shape behavior is that frames_per_event is always the first dimension of shape. If len(shape) > 1, then the dtype is an array; otherwise, it's a number.
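
The rule above can be sketched as follows. This is a minimal illustration under stated assumptions: `describe_shape` is a hypothetical function name, and the real logic lives in the writer code touched by this PR.

```python
def describe_shape(frames_per_event: int, frame_shape: tuple[int, ...] = ()) -> tuple[list[int], str]:
    """Return (shape, dtype) with frames_per_event always as the first dim.

    Hypothetical helper for illustration. An empty frame_shape stands for a
    scalar signal.
    """
    shape = [frames_per_event, *frame_shape]
    # No squeezing: even a single scalar per event keeps its leading dim.
    dtype = "array" if len(shape) > 1 else "number"
    return shape, dtype
```

For example, a 480 x 640 image with one frame per event yields `([1, 480, 640], "array")`, while a scalar collected five times per event yields `([5], "number")`.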

@thomashopkins32
Contributor Author

thomashopkins32 commented Jan 24, 2025

@jwlodek, @jennmald, and I did some testing on actual devices and found a few more issues that need to be resolved. We didn't get through all of the testing we planned, so we will most likely continue next week.

For ophyd-async (completed in a5b1f27):

  • PandA needs to be able to handle frames_per_event > 1. @coretl do you know why this was limited to only being 1 for PandA?
  • The computed total_number_of_triggers needs to be multiplied by frames_per_event
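
As a rough sketch of the second point: the function name and the list-of-batches input below are assumptions made for illustration, not the actual TriggerInfo API.

```python
def total_frames_expected(events_per_batch: list[int], frames_per_event: int) -> int:
    """Total number of frames the detector must collect.

    Hypothetical helper: events_per_batch lists how many events each
    batch of triggers should produce.
    """
    # Each event consists of frames_per_event exposures, so the frame total
    # is the event total scaled by that factor.
    return sum(events_per_batch) * frames_per_event
```

For two batches of 3 and 2 events with `frames_per_event = 4`, the detector needs 20 frames rather than 5.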

For bluesky:

  • ConsolidatorBase needs to be reworked based on the new assumption that frames_per_event is the first dim of datum_shape.

For tiled:

  • TBD

That covers pretty much everything that we tested and debugged on the devices so far. We will see if changes to tiled are necessary in further testing.

@coretl
Collaborator

coretl commented Jan 28, 2025

  • PandA needs to be able to handle frames_per_event > 1. @coretl do you know why this was limited to only being 1 for PandA?

I'm not sure, I don't think that's a real restriction. If you remove it, what breaks?

@jwlodek
Member

jwlodek commented Jan 28, 2025

  • PandA needs to be able to handle frames_per_event > 1. @coretl do you know why this was limited to only being 1 for PandA?

I'm not sure, I don't think that's a real restriction. If you remove it, what breaks?

Nothing, actually. We removed it and got everything to work as expected, just with the data going into separate streams. We're going to make sure they can fit into the same stream this week.

Member

@jwlodek jwlodek left a comment

We've now tested this and it is working for us, pending the Consolidator PR.

@thomashopkins32
Contributor Author

Actually, we decided that it does not work well with tiled just yet. We want tiled to have frames_per_event explicitly in the shape, which is causing issues with reading the data back from the files (due to how chunking works).

Basically, ophyd-async has the descriptor shape with the first dim as frames_per_event. bluesky's consolidators use this to figure out the proper chunking of the data prior to writing it to the HDF5 file. Then tiled also needs to understand this chunking in order to read the data back from the file and unpack it properly.

The shape of the data from the user perspective should always be (num_events, frames_per_event, ...).
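
A minimal sketch of that mapping, assuming the file stores all frames stacked along a single leading axis (this layout and the name `user_shape` are assumptions for illustration, not the consolidator or tiled implementation):

```python
def user_shape(file_shape: tuple[int, ...], frames_per_event: int) -> tuple[int, ...]:
    """Map a flat on-disk shape to (num_events, frames_per_event, ...).

    Hypothetical helper: assumes file_shape is (total_frames, *frame_shape).
    """
    total_frames, *frame_shape = file_shape
    if total_frames % frames_per_event:
        raise ValueError("file holds an incomplete event")
    num_events = total_frames // frames_per_event
    return (num_events, frames_per_event, *frame_shape)
```

Under this assumption, a file dataset of shape `(20, 480, 640)` written with `frames_per_event = 4` would be presented to the user as `(5, 4, 480, 640)`.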

Successfully merging this pull request may close these issues.

Remove multiplier from resource docs, add it instead as an extra first dimension in descriptor shape
3 participants