Completing pygrb xml to hdf5 transition #4419

Closed
3 of 7 tasks
pannarale opened this issue Jun 29, 2023 · 13 comments

@pannarale
Contributor

pannarale commented Jun 29, 2023

Comparing the latest results webpage with the most complete one generated from the old xml results files, the following items are missing:

  • tables of missed and quiet injections (A31 and A32)
  • follow-ups of the 10 loudest quiet injections (sections 3.02 and 3.04; these are done for each injection set)
  • all of section 4, i.e., loudest offsource events distributions, table, and follow-ups of the 10 loudest
  • all of section 5 (exclusion distances)

With the exception of the follow-ups, which we will handle later, these correspond to 3 scripts that require updating:

  • pycbc_pygrb_page_tables
  • pycbc_pygrb_plot_stats_distribution
  • pycbc_pygrb_efficiency

At the moment these still assume the input files will be xml files, so it’s a matter of getting them to read the hdf5 files we have now and then adjusting the scripts so they navigate those files accordingly to produce the plots/tables.
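As a rough illustration of what "navigating" an hdf5 file looks like compared with parsing xml tables, here is a minimal sketch using h5py. The group and dataset names (`network`, `end_time`, `reweighted_snr`) and the helper `load_columns` are hypothetical placeholders, not the actual layout of the PyGRB results files, which should be checked against a real file first.

```python
import h5py
import numpy as np


def load_columns(path, group, columns):
    """Read the requested datasets of `group` into a dict of numpy arrays.

    `group` and `columns` are hypothetical names here; inspect a real
    results file to find the actual ones. If the group is absent, empty
    arrays are returned so downstream code can still run.
    """
    with h5py.File(path, "r") as h5file:
        if group not in h5file:
            return {name: np.array([]) for name in columns}
        # [()] reads the whole dataset into memory as a numpy array
        return {name: h5file[group][name][()] for name in columns}
```

A call such as `load_columns("results.hdf", "network", ["end_time", "reweighted_snr"])` would then return plain arrays that the plotting and table code can slice directly.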

@pannarale pannarale added the PyGRB PyGRB development label Jun 29, 2023
@github-project-automation github-project-automation bot moved this to In Progress in PyGRB Development Jun 29, 2023
@pannarale
Contributor Author

Old command lines are available under the plots and tables in the webpage linked in the issue description. Take one of those, edit it to point to hdf5 files and then start upgrading the scripts.

Make sure you run in an environment up to date with gwastro/pycbc/master as it is today.

A possible bump in the road is that so far with the new code we have run short, small tests, so I don’t think we have results files with enough background nor files with missed injections. This means your first plots and tables should be empty! However, we do want the codes to be able to produce empty output if the input is not interesting: after we get there, we can run a longer test run and make sure we have meaningful information to display.
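To make the "empty output" requirement concrete, here is a minimal sketch of table-building logic that returns a valid header with zero rows when there are no events. The function name `loudest_events_table` and the column names are hypothetical and are not taken from the existing scripts.

```python
def loudest_events_table(times, stats, n_loudest=10):
    """Return (header, rows) for the `n_loudest` events ranked by statistic.

    With no input events, `rows` is simply an empty list, so the caller
    can still write out a (header-only) table instead of failing.
    """
    if len(times) != len(stats):
        raise ValueError("times and stats must have the same length")
    header = ("Rank", "GPS time", "Statistic")  # hypothetical column names
    # Sort by statistic, loudest first, and keep at most n_loudest entries
    ranked = sorted(zip(stats, times), reverse=True)[:n_loudest]
    rows = [(rank, time, stat)
            for rank, (stat, time) in enumerate(ranked, start=1)]
    return header, rows
```

The point of the design is that the empty case goes through the same code path as the populated one, so a short test run and a long one produce tables with the same structure.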

You can find input files on CIT. Ask me if you need detailed paths.

@pannarale
Contributor Author

@MarcoCusinato is an assignee too, but right now the search in the assignees box is not picking up his user name.

@pannarale
Contributor Author

@MarcoCusinato , @ETVincent , see #4427: this is relevant for you as well.

@pannarale
Contributor Author

A practical way to approach this is to search for glue.ligolw imports (these appear in the three executables mentioned above and in pycbc/results/pygrb_postprocessing_utils.py), remove them, and replace anything that relies on them.
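For anyone who prefers a script to a text search, the sketch below uses the standard-library `ast` module to list the lines on which `glue.ligolw` (or one of its submodules) is imported. The helper name `find_ligolw_imports` is made up for this example; it only inspects the source text and does not need `glue` to be installed.

```python
import ast


def find_ligolw_imports(source, target="glue.ligolw"):
    """Return the sorted line numbers of imports of `target` or its submodules."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # e.g. `import glue.ligolw.utils as utils`
            candidates = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            # e.g. `from glue.ligolw import lsctables` or `from glue import ligolw`
            candidates = [node.module]
            candidates += [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        if any(name == target or name.startswith(target + ".")
               for name in candidates):
            hits.append(node.lineno)
    return sorted(hits)
```

It can be run on each file in turn, for example `find_ligolw_imports(open(path).read())`, which works for the executables too even though they have no `.py` extension.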

@jakeb245
Contributor

jakeb245 commented Sep 7, 2023

I have gotten pycbc_pygrb_efficiency to run using HDF trigger files (examples here). The full changes live on this branch. Unfortunately, it's currently an ugly mess of various branches. Some of these branches have PRs and some don't.

My intent is to get the currently open PRs merged in (those being #4427 and #4443), and then open one for the rest of the changes.

I had to update two functions (ppu.load_time_slides and ppu.load_segment_dict) that are common to all three of the executables listed above, so hopefully that helps move things along.

Getting these executables running in the workflow will be another task!

@ETVincent
Contributor

ETVincent commented Oct 4, 2023

Putting this here to keep the conversation going.
I have a branch here with pycbc_pygrb_plot_stats_distribution that I believe does what we need for what I was assigned. It is based on Jacob's updates via the above-mentioned PRs. My only caveat is that I have not tested it on veto files.

@ETVincent
Contributor

After the call today, @pannarale mentioned that I needed a clean branch so that I would be able to pull. I have the branch stats_dist_clean that updates only pycbc_pygrb_plot_stats_distribution. Note that it still requires Jacob's updates in the above PRs to function properly.

@pannarale
Contributor Author

Thanks, @ETVincent. Could you open a PR, please?

@ETVincent
Contributor

Yes of course, #4538

@pannarale
Contributor Author

For the record, ongoing development for page_tables is happening on https://github.com/MarcoCusinato/pycbc/tree/page_tables

@pannarale
Contributor Author

@MarcoCusinato, @jakeb245's relevant PRs are through: can you set up your PR for pycbc_pygrb_page_tables please?

@pannarale
Contributor Author

#4649 completes the xml to hdf5 switch for pycbc_pygrb_page_tables

@pannarale
Contributor Author

I am closing this issue. The remaining tasks pertain to the webpage in general, rather than the xml to hdf5 transition. They have been copied over to issue #3660.
