-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reconstruct CRS from WKT only #461
base: main
Are you sure you want to change the base?
Conversation
Add a test for loading an area from a CF NetCDF file encoded only with a CRS WKT. The test currently fails as I did not yet implement anything.
Refactor out _get_type_of_grid_mapping in utils/cf.py in preparation for getting the grid mapping from the WKT / CRS directly. Tolerate absence of grid mapping name if there is a crs_wkt attribute.
@gerritholl pin me when this is merged or mostly done so that I can test it on some of my use cases.
I will wait your PR before proceeding .... |
Side note: |
@ghiggi I am putting the work for this PR on hold. For what I need to achieve, I can and will use pytroll/satpy#2236 instead. Therefore, fixing pytroll/satpy#2226 for the equirectangular projection is no longer a priority from my side. |
I see that this PR seems stalled. If someone starts working on this again later, I want to note that the CF convention (CF-1.10) clearly states that: The crs_wkt attribute is intended to act as a supplement to other single-property CF grid mapping attributes (as described in Appendix F); it is not intended to replace those attributes. If data producers omit the single-property grid mapping attributes in favour of the crs_wkt attribute, software which cannot interpret crs_wkt will be unable to use the grid_mapping information. Therefore the CRS should be described as thoroughly as possible with the single-property grid mapping attributes as well as by crs_wkt. Thus, if we should load a CRS from wkt only, it should not be done silently by from_cf() or load_cf_area(). I think in the case described above, the user should actively (e.g. using a keyword) bypass decoding the single-property CF attributes, and rather use the WKT encoding. |
@TomLav I disagree with this.
This is specific to data producers, not consumers. We are consumers in this case. If the WKT does not match the single-property CF attributes then that is a fault of the data producer or of the CF format as a whole. Edit: We are allowed to just use the WKT attribute as consumers. I think the above point is pretty clear, but I'd also like to add that I strongly disagree with CF's choice of making these separate CF attributes and for making crs_wkt optional and supplemental. WKT is fully descriptive and the CF attributes are not. The CF attributes (and the CF standard) do not support every projection and the standard as a whole has to be updated to support new projections when they come along. Conversations in the xarray ecosystem and the OGC Zarr world have started leaning towards the idea of being CF-compatible-ish because of this and only using the CRS WKT attribute in their zarr standards. |
Hei @djhoese, I am not unaware of the history and tensions between the different communities around the CF attributes and the WKT definitions. Maybe these things will converge with time. I was not saying that our from_cf() should bark and not load the crs from WKT, but I do think that this should not happen silently either. I do not know if this should be controlled via a keyword or a warning, but I think there is value for a routine called from_cf() to have the option to strictly follow the CF convention. We can e.g. have a from_cf(strict=True|False). This being said pyresample load_cf_area() only checks if the :grid_mapping_name attribute is present and has a valid value, it then diverts to pyproj.from_cf() the decoding of the crs attributes and/or wkt. If pyproj adopts a loose approach, pyresample will inherit it silently. You also state that you disagree with a quote that I took from the CF convention, which is fine, but there is little point discussing the CF convention here. |
No, sorry. I disagree with your interpretation of the quote. The standard is saying that we can load from WKT without looking at the single-property attributes (as consumers of the data, not producers).
Why not? The CF convention gives us the
We are following the CF convention by being a consumer of the data and using the |
I see. We disagree on the interpretation of the CF convention. You understand from the convention that files where the CRS is only specified with wkt are CF compliant. I understand the convention that the CRS must first be defined in their single-attribute form, then we can duplicate and enrich with WKT. For reference, my interpretation stems from the sentence: The crs_wkt attribute is intended to act as a supplement to other single-property CF grid mapping attributes (as described in Appendix F); it is not intended to replace those attributes. What follows with data producers does not change my interpretation of this first sentence. I don't think we can settle this from Pytroll, this must have been discussed somewhere at CF, I will give this a look. For the record: if it is so that CF allows a crs to only be defined by its WKT then of course from_cf() should silently do its job and load this. If CF does not allow this, I still think we should help the user and load the CRS, but we should also give the user the control to use non-strict checks. Do you agree? |
Nope. I understand that PRODUCERS must always include the single-property attributes. The CRS WKT is supplemental. BUT there is nothing saying you/we/pyresample, as CONSUMERS of the data, should not use the CRS WKT. If it is there, we can use it. The WKT will never be less accurate than the CF attributes. So if the CRS WKT and single-property attributes are both available, we should always use/prefer the WKT...as CONSUMERS. |
For later reference, it seems that CF (as of CF-1.10) will not accept a crs variable defined only by its wkt description. If the text of the convention is possibly ambiguous (see discussion above), the conformance document is clear (*): The grid mapping variables must have the grid_mapping_name attribute. The legal values for the grid_mapping_name attribute are contained in Appendix F. Which still leaves us all latitude as to what pyresample should do in such cases. As I stated above I believe it should try to find its way around and produce the area_def. But I wanted to clarify (at least for my own sake) what was CF-compliant or not(*). I see that pyproj.CRS.to_cf() happily loads the CRS whether its definition uses only the WKT description (case B below), or both the CF-attribute and WKT descriptions (case A), or only the CF-attribute description (case C). The current behaviour of pyproj seems to be to load the CRS from crs_wkt if present (A and B) and decode the CF attributes only if there are no crs_wkt (case C). I think this is the behaviour you want for pyresample, @djhoese, so all good.
Thus, it should be easy for pyresample load_cf_area() to work smoothly in all 3 cases, we just need to decide how to warn/inform the user that we found a non-compliancy but marched on (in case B). Turning this (*) of course the conformance document could be wrong and not be aligned with CF convention. But what counts to me is that the conformance checkers will implement the conformance document. |
As mentioned on slack from our discussion yesterday and today, after looking at the pyresample CF code it is clear pyresample is doing a lot more validation of CF standards than I thought it was. In those cases it makes sense to have a I would be ok with changing that |
Load an AreaDefinition from a CF NetCDF file when the grid mapping variable is encoding only the Coordinate Reference System (CRS) in Well Known Text (WKT), but lacks other metadata.
Work in progress.
git diff origin/main **/*py | flake8 --diff