Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JP-1649: Add option for user-defined sky levels in skymatch #9053

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

emolter
Copy link
Collaborator

@emolter emolter commented Jan 7, 2025

Resolves JP-1649

Closes #5262

This PR adds the option to pass in user-defined sky levels to the skymatch step.

Tasks

  • request a review from someone specific, to avoid making the maintainers review every PR
  • add a build milestone, i.e. Build 11.3 (use the latest build if not sure)
  • Does this PR change user-facing code / API? (if not, label with no-changelog-entry-needed)
    • write news fragment(s) in changes/: echo "changed something" > changes/<PR#>.<changetype>.rst (see below for change types)
    • update or add relevant tests
    • update relevant docstrings and / or docs/ page
    • start a regression test and include a link to the running job (click here for instructions)
      • Do truth files need to be updated ("okified")?
        • after the reviewer has approved these changes, run okify_regtests to update the truth files
  • if a JIRA ticket exists, make sure it is resolved properly
news fragment change types...
  • changes/<PR#>.general.rst: infrastructure or miscellaneous change
  • changes/<PR#>.docs.rst
  • changes/<PR#>.stpipe.rst
  • changes/<PR#>.datamodels.rst
  • changes/<PR#>.scripts.rst
  • changes/<PR#>.fits_generator.rst
  • changes/<PR#>.set_telescope_pointing.rst
  • changes/<PR#>.pipeline.rst

stage 1

  • changes/<PR#>.group_scale.rst
  • changes/<PR#>.dq_init.rst
  • changes/<PR#>.emicorr.rst
  • changes/<PR#>.saturation.rst
  • changes/<PR#>.ipc.rst
  • changes/<PR#>.firstframe.rst
  • changes/<PR#>.lastframe.rst
  • changes/<PR#>.reset.rst
  • changes/<PR#>.superbias.rst
  • changes/<PR#>.refpix.rst
  • changes/<PR#>.linearity.rst
  • changes/<PR#>.rscd.rst
  • changes/<PR#>.persistence.rst
  • changes/<PR#>.dark_current.rst
  • changes/<PR#>.charge_migration.rst
  • changes/<PR#>.jump.rst
  • changes/<PR#>.clean_flicker_noise.rst
  • changes/<PR#>.ramp_fitting.rst
  • changes/<PR#>.gain_scale.rst

stage 2

  • changes/<PR#>.assign_wcs.rst
  • changes/<PR#>.badpix_selfcal.rst
  • changes/<PR#>.msaflagopen.rst
  • changes/<PR#>.nsclean.rst
  • changes/<PR#>.imprint.rst
  • changes/<PR#>.background.rst
  • changes/<PR#>.extract_2d.rst
  • changes/<PR#>.master_background.rst
  • changes/<PR#>.wavecorr.rst
  • changes/<PR#>.srctype.rst
  • changes/<PR#>.straylight.rst
  • changes/<PR#>.wfss_contam.rst
  • changes/<PR#>.flatfield.rst
  • changes/<PR#>.fringe.rst
  • changes/<PR#>.pathloss.rst
  • changes/<PR#>.barshadow.rst
  • changes/<PR#>.photom.rst
  • changes/<PR#>.pixel_replace.rst
  • changes/<PR#>.resample_spec.rst
  • changes/<PR#>.residual_fringe.rst
  • changes/<PR#>.cube_build.rst
  • changes/<PR#>.extract_1d.rst
  • changes/<PR#>.resample.rst

stage 3

  • changes/<PR#>.assign_mtwcs.rst
  • changes/<PR#>.mrs_imatch.rst
  • changes/<PR#>.tweakreg.rst
  • changes/<PR#>.skymatch.rst
  • changes/<PR#>.exp_to_source.rst
  • changes/<PR#>.outlier_detection.rst
  • changes/<PR#>.tso_photometry.rst
  • changes/<PR#>.stack_refs.rst
  • changes/<PR#>.align_refs.rst
  • changes/<PR#>.klip.rst
  • changes/<PR#>.spectral_leak.rst
  • changes/<PR#>.source_catalog.rst
  • changes/<PR#>.combine_1d.rst
  • changes/<PR#>.ami.rst

other

  • changes/<PR#>.wfs_combine.rst
  • changes/<PR#>.white_light.rst
  • changes/<PR#>.cube_skymatch.rst
  • changes/<PR#>.engdb_tools.rst
  • changes/<PR#>.guider_cds.rst

Copy link

codecov bot commented Jan 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 76.93%. Comparing base (03ca10f) to head (e72324f).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9053      +/-   ##
==========================================
+ Coverage   76.92%   76.93%   +0.01%     
==========================================
  Files         498      498              
  Lines       45810    45831      +21     
==========================================
+ Hits        35240    35261      +21     
  Misses      10570    10570              
Flag Coverage Δ *Carryforward flag
nightly 77.38% <ø> (-0.01%) ⬇️ Carriedforward from e3d263f

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@emolter emolter added this to the Build 11.3 milestone Jan 7, 2025
@emolter emolter requested review from goudfroo and drlaw1558 January 7, 2025 17:46
@@ -25,7 +25,7 @@
log.setLevel(logging.DEBUG)


def match(images, skymethod='global+match', match_down=True, subtract=False):
def skymatch(images, skymethod='global+match', match_down=True, subtract=False, skylist=None):
Copy link
Collaborator Author

@emolter emolter Jan 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slightly beyond the PR scope, but I went ahead and changed the name of this function because it violates https://docs.astral.sh/ruff/rules/builtin-variable-shadowing/

Happy to revert if desired, though

Copy link
Member

@mcara mcara Jan 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is match a reserved/built-in word?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +126 to +128
``skymethod`` parameter to "user". Note that the skylist will be applied on a
per-image basis irrespective of the groups defined in the association, and the
order of the list must match the order of the input images.
Copy link
Collaborator Author

@emolter emolter Jan 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious whether this is the desired behavior for asn groups @drlaw1558 @goudfroo

@emolter emolter marked this pull request as ready for review January 7, 2025 17:50
@emolter emolter requested a review from a team as a code owner January 7, 2025 17:50
@emolter
Copy link
Collaborator Author

emolter commented Jan 7, 2025

initial round of regression tests started here: https://github.com/spacetelescope/RegressionTests/actions/runs/12656574738

edit: all passing

@mcara
Copy link
Member

mcara commented Jan 7, 2025

  1. I think user-supplied sky values could be provided the same way custom group_id or tweakreg_catalog are currently provided in the ASN. This way all these values are bundled with the corresponding input cal file, which, IMO is more robust than a separate skylist.
  2. I am not sure this PR is even needed as currently users could simply edit input cal files and set:
cal_model.meta.background.method = "user"  # optional
cal_model.meta.background.level = custom_sky_value
cal_model.meta.background.subtracted = False

Then save the model and disable sky subtraction step.

Actually, you could add sky_subtracted word as well so that users could indicate whether the sky value has been subtracted.

@emolter
Copy link
Collaborator Author

emolter commented Jan 8, 2025

Thanks for the suggestions Mihai. There's a relevant discussion on the JIRA ticket: https://jira.stsci.edu/browse/JP-1649

This PR follows David Law's suggestion at the bottom: add a "user" option and a parameter encoding a list of sky levels. Let's call that suggestion (0).
Your suggestion (1) of including this in the association was also raised by Paul, and your suggestion (2) of just skipping skymatch after editing the cal files is similar to the workaround Paul ended up using.

I think ultimately any of 0, 1, or 2 would work, but they should be documented, including (especially?) if (2) ends up being our suggested workflow.

Wondering if @tapastro and @melanieclarke have opinions on this.

Copy link
Collaborator

@melanieclarke melanieclarke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, since it's just a list of floating point values, it's most easily and directly provided via a parameter, as implemented here. Option 0 sounds like the most user friendly to me.

jwst/skymatch/skymatch.py Outdated Show resolved Hide resolved
@mcara
Copy link
Member

mcara commented Jan 8, 2025

I think, since it's just a list of floating point values, it's most easily and directly provided via a parameter, as implemented here. Option 0 sounds like the most user friendly to me.

There are several problems with this approach:

  1. When users will eventually want to add custom regions for sky matching, we will introduce a new parameter: skyreglist, and then another parameter and so on.
  2. As I mentioned above, one issue is that a separate list can be or become out of order with the input image list. For example, take a look at this code:
    for group_index, (group_id, group_inds) in enumerate(library.group_indices.items()):
    sky_images = []
    for index in group_inds:
    model = library.borrow(index)
    try:
    sky_images.append(self._imodel2skyim(model, index))
    finally:
    library.shelve(model, index, modify=False)
    if len(sky_images) == 1:
    images.extend(sky_images)
    else:
    images.append(SkyGroup(sky_images, id=group_index))

    Let's assume input images belong to different groups and are not sorted by the group ID:
    [image1_g1, image2_g2, image3_g1, image4_g2] with sky values: [sky1, sky2, sky3, sky4]
    
    Then the code above will create two sky groups: SkyGroup1 (with images 1 and 3) and SkyGroup2 (with images 2 and 4). After flattening these groups in
    # unpack groups into individual images. assumption is that user-defined background values
    # have been figured out on a per-model basis, agnostic of group membership.
    images_flattened = []
    for im in images:
    if isinstance(im, SkyGroup):
    images_flattened.extend(im)
    else:
    images_flattened.append(im)
    the order will be: [image1_g1, image3_g1, image2_g2, image4_g2]. This does not correspond to the user-provided order for sky values: [sky1, sky2, sky3, sky4]. To make things worse, it will be difficult for the users to even know they need to check the log file to detect this kind of errors.
  3. Even if we consider that the previous issue could be solved - see, for example, the skyfile parameter in the drizzlepac - creation of a SkyImage or SkyGroup is expensive: it involves creating spherical polygons for the outlines of each image (and for groups, computing the union of these regions). Why do all of this? The purpose of skymatch is to figure out background values and the culmination of this process is in setting computed values in input model's meta:
    dm.meta.background.method = str(self.skymethod)
    dm.meta.background.level = sky
    dm.meta.background.subtracted = self.subtract
    if self.subtract:
    dm.data[...] = sky_image.image[...]
    Why not leave skymatch alone and simply in the Step class read sky values and update meta and return. There is nothing else to do here. In fact, I am not even sure this belongs in this step as there is no computation performed.

@drlaw1558
Copy link
Collaborator

Jumping in to clarify quickly; the aim of this change is to make it simpler for users to handle the case where backgrounds are complex and hard to match. Specifying particular regions for matching seems like a rabbit hole I'd be reluctant to explore, as it could become arbitrarily complicated. Rather, the intent is for users to measure offset values in whatever complex manner they want outside the pipeline, and for us to then give a simple way within the pipeline of applying those values (rather than hacking cal files). Might a two-element list of [filename, value] deal with the ordering issue?

@emolter
Copy link
Collaborator Author

emolter commented Jan 8, 2025

Melanie and I discussed this more at our stand-up today. I think we agree with Mihai that it might be good to avoid touching the skymatch() (formerly match()) routine itself, and we agree with David that the present UI is the best solution given that the "minimum competency" for a user is to change pipeline parameters and not necessarily to directly modify datamodels. So I will work on a way to move the fix into the top-level Step while retaining the current UI. David, I'll use the [filename, value] syntax if I find it's unavoidable (or preferred) to do it that way.

@mcara
Copy link
Member

mcara commented Jan 8, 2025

Might a two-element list of [filename, value] deal with the ordering issue?

Yes, it might. That is exactly how it works in the drizzlepac. But wouldn't it be just as easy to open asn table in an editor and add sky values to each member? We already have the machinery to handle this for group_id and tweakreg_catalog. Why not support 1-2 new keywords?

@mcara
Copy link
Member

mcara commented Jan 8, 2025

Here is the alternative that I would propose: https://github.com/spacetelescope/jwst/compare/main...mcara:jwst:add-custom-sky?expand=1

@emolter
Copy link
Collaborator Author

emolter commented Jan 8, 2025

But wouldn't it be just as easy to open asn table in an editor and add sky values to each member?

I think the concern is, no, it would not be just as easy. It would require a user to understand the structure of an association file. While I agree that it isn't particularly complicated to change, the expectation is usually that the asn files will "just work".

With regards to adding 1-2 new keywords to the asn, it could be just as easily argued that there's no harm adding 1-2 new input parameters to skymatch. Plus, step parameters are typically easier for a user to find in the documentation.

@mcara
Copy link
Member

mcara commented Jan 8, 2025

Actually, for tweakreg there is already such a parameter: catfile. However, tweakreg supports both custom catalogs via this argument and ASN tables.

@emolter
Copy link
Collaborator Author

emolter commented Jan 9, 2025

With the most recent commits I have moved handling for the "user" option out of skymatch() and into the top-level SkyMatchStep(). I agree with Mihai that this is the best place for it, since the "user` option really bypasses the major functionality of the step. Plus, we are unlikely to want to move handling of the "user" option into stcal because it primarily updates datamodel attributes, and stcal is not supposed to care about the datamodel structure.

The new code does exactly what Mihai suggests - it just updates the relevant metadata values in the datamodels and then returns from the step. There is no need to create a SkyImage.

With that refactor, it is not necessary to make the skylist parameter a list of tuples, because the code that groups by asn group membership is also bypassed.

I think this is the simplest possible way to support the desired functionality. I think we could decide separately whether to also allow background levels to be defined in the association as is done in Mihai's proposed alternative above.

Please do let me know your thoughts everyone!

@drlaw1558
Copy link
Collaborator

Sounds reasonable to me! I'd be inclined to keep this simple and just have the one input method for the time being.

@emolter emolter requested a review from a team as a code owner January 10, 2025 19:17
Copy link
Collaborator

@melanieclarke melanieclarke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me! Simple and clear.

I'll hold off on approving, to make sure we give the science stakeholders a chance to review, but I have only one small question/suggestion below.

Have regression tests been run?

jwst/skymatch/skymatch_step.py Outdated Show resolved Hide resolved
@emolter
Copy link
Collaborator Author

emolter commented Jan 10, 2025

regtests here, rerun just started
https://github.com/spacetelescope/RegressionTests/actions/runs/12656574738

Copy link
Collaborator

@melanieclarke melanieclarke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, one more thing - the arguments documentation for skymatch needs to be updated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

skymatch would be improved by offering an option to input user-defined sky values
5 participants