
Move from mongodb to tiled-backed catalog #339

Merged
16 commits merged into main from rdb on Jan 8, 2025
Conversation

@canismarko (Contributor) commented on Jan 6, 2025:

The bluesky project is moving from storing raw scan documents in a mongo database via databroker to storing tabular data in a Tiled catalog backed by a relational (sqlite or postgres) database. This PR updates Haven and Firefly to use the Tiled catalog by default.

This change is necessary to enable ophyd-async detector data to be read back out by Tiled.

The aim of this PR is to keep top-level features (e.g. the run browser) the same as with the old database structure. This PR does not add features that will be useful with the new catalog, such as filtering by beamline; these will be added in a separate PR.

To avoid losing data, the raw documents should still be saved in the MongoDB database. That way, when we are ready to fully switch over, we can create a new Tiled catalog and stream the MongoDB documents into it. I suggest we keep the old MongoDB instance running until we are confident we no longer need it.
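For context, reading data back out of the Tiled catalog would look roughly like the sketch below, using the standard tiled client. The URI, API key, and scan UID are placeholders, not Haven's actual configuration:

```python
from tiled.client import from_uri

# Connect to the Tiled server fronting the SQLite/PostgreSQL-backed catalog.
# The URI and api_key are illustrative placeholders.
catalog = from_uri("http://localhost:8000", api_key="secret")

# Look up a run by its scan UID; streams appear as child containers whose
# exact layout depends on how the writer organizes documents.
run = catalog["<scan-uid>"]
print(list(run))          # e.g. ["primary", "baseline", ...]
primary = run["primary"]  # drill further down to reach the tabular data
```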

Things to do before merging:

  • add tests
  • write docs
  • update iconfig_testing.toml
  • flake8, black, and isort
  • test at the beamline

@canismarko (Contributor, Author) commented:

As written, the Tiled writer exists as a run engine callback. Previously, we put documents on the Kafka topic and read them out using mongo_consumer.py. Do we want to do that here as well?

Maybe that should be a different PR, where the run_engine factory lets you attach the Kafka producer instead of attaching the Tiled writer and databroker separately?
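For reference, subscribing a Tiled writer to the run engine as a callback looks roughly like this, assuming bluesky's TiledWriter callback and a tiled client; the URI and api_key are placeholders:

```python
from bluesky import RunEngine
from bluesky.callbacks.tiled_writer import TiledWriter
from tiled.client import from_uri

RE = RunEngine()

# Client pointed at the Tiled server (placeholder URI/api_key).
client = from_uri("http://localhost:8000", api_key="secret")

# Every (name, document) pair emitted by the run engine gets written
# into the Tiled catalog by this callback.
tiled_writer = TiledWriter(client)
RE.subscribe(tiled_writer)
```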

@canismarko (Contributor, Author) commented:

Also, I think documentation for this belongs on the wiki, since it extends beyond the scope of Haven.

@canismarko canismarko marked this pull request as ready for review January 6, 2025 21:10
@canismarko canismarko requested a review from Cathyhjj January 7, 2025 19:39
@canismarko (Contributor, Author) commented:

> As written, the Tiled writer exists as a run engine callback. Previously, we put documents on the Kafka topic and read them out using mongo_consumer.py. Do we want to do that here as well?

I added a TiledConsumer module to the queueserver package that reads documents from Kafka and sends them to Tiled. I will do a separate PR that adds a Kafka producer to the run engine (the queueserver does this automatically).
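For illustration only (not the actual TiledConsumer added in this PR), a consumer like that could look roughly like the sketch below, assuming bluesky-kafka's BlueskyConsumer base class and bluesky's TiledWriter callback. The topic name, bootstrap servers, and group id are placeholders, and the exact BlueskyConsumer signature may differ:

```python
from bluesky.callbacks.tiled_writer import TiledWriter
from bluesky_kafka import BlueskyConsumer
from tiled.client import from_uri

# Placeholders: point these at the real Kafka brokers and Tiled server.
client = from_uri("http://localhost:8000", api_key="secret")
tiled_writer = TiledWriter(client)


class TiledConsumerSketch(BlueskyConsumer):
    """Forward each (name, document) pair read from Kafka into Tiled."""

    def process_document(self, topic, name, doc):
        tiled_writer(name, doc)
        return True  # keep polling for more documents


consumer = TiledConsumerSketch(
    topics=["haven.documents"],
    bootstrap_servers="localhost:9092",
    group_id="tiled-consumer",
)
consumer.start()  # blocks, reading from Kafka and writing to Tiled
```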

Review thread on src/firefly/run_browser/widgets.py (outdated, resolved)
A collaborator commented:

Thank you for changing the data structure for me!

Review thread on src/queueserver/tiled_consumer.py (outdated, resolved)
@canismarko canismarko merged commit 2645708 into main Jan 8, 2025
1 check passed
@canismarko canismarko deleted the rdb branch January 8, 2025 00:28