Unable to read catalog as SELF_CONTAINED when written as RELATIVE_PUBLISHED #137
This is an interesting case, and it raises a question for me: is a RELATIVE_PUBLISHED catalog that is copied to a location that does not match its root link HREF a valid STAC? According to the spec, I don't believe so. So in this case PySTAC should either handle this error case better, or there should be a clear way to accomplish what you're trying to do through some other means.
Currently PySTAC uses the root link of the catalog to resolve HREFs for relative links, which seems appropriate for a RELATIVE_PUBLISHED catalog in its original location. I can see why there'd be an expectation that you'd be able to traverse relative links based on the relative paths of the actual file locations.
One option for handling this would be for PySTAC to always override the root link with the file location it read the catalog from when you're reading a catalog from a file directly. However, I'm not sure there is logic that could determine whether the root HREF, when it differs from the catalog read path, represents the same catalog or a different one.
I think this speaks to a need for a more consistent way to copy STACs around. This is a particular case where there is only one link's difference (the 'root' link of the root catalog) between a RELATIVE_PUBLISHED and a SELF_CONTAINED catalog, the latter being able to be copied to different locations without a problem. However, if a user wanted to copy an ABSOLUTE_PUBLISHED catalog, this becomes more complicated: all of the absolute link HREFs would need to be modified.
A catalog-to-catalog transfer could happen consistently via this code snippet:
import pystac
cat = pystac.read_file('/original/catalog/location')
cat.normalize_and_save('/new/catalog/location', pystac.CatalogType.SELF_CONTAINED)
With this method the catalog will always be written as a valid SELF_CONTAINED catalog at the desired location. I will note that this may change the layout of the STAC to be canonical, which may not be desired.
In building a CLI of utility methods as part of #119, I think we should consider this as a subcommand to make this easier, something like:
This way there'd be a consistent way to copy STACs around, which could even have options like copying the assets as well (addressing some points raised in #61).
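To make the resolution problem described above concrete, here is a minimal sketch using only the Python standard library. It is not PySTAC's actual resolution code; the bucket URL, local path, and child HREF are hypothetical, and `resolve_href` is an illustrative helper. It shows how the same relative link resolves to different targets depending on whether the base is the published HREF stored in the catalog or the location the file was actually read from.

```python
from urllib.parse import urljoin


def resolve_href(relative_href: str, base_href: str) -> str:
    """Resolve a relative link HREF against a base HREF (URL or POSIX path)."""
    return urljoin(base_href, relative_href)


# Hypothetical locations, for illustration only.
published_root = "https://example-bucket.s3.amazonaws.com/stac/catalog.json"
local_copy = "/home/user/stac/catalog.json"
child = "./collection/collection.json"

# Resolved against the published HREF, the child points back at the remote bucket.
print(resolve_href(child, published_root))

# Resolved against the local file location, the child points at the downloaded copy.
print(resolve_href(child, local_copy))
```

A reader that resolves against the stored HREF will try to fetch children from the original published location, even though identical files sit next to the local copy.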
Copying STACs was implemented in stactools. See https://stactools.readthedocs.io/en/latest/cli.html#stac-copy for the command line version and https://stactools.readthedocs.io/en/latest/api.html#copying-and-moving for the library functionality.
I published a catalog to a remote private S3 bucket after writing it with RELATIVE_PUBLISHED, so that the catalog root contained an absolute self link.
I then wanted to read and walk the catalog locally in order to write a derived downstream catalog using some of the items in it, so I downloaded it to my machine and read it in with:
catalog = pystac.Catalog.from_file('path/to/catalog.json')
where this catalog.json is the root containing the absolute self link. Since all of the catalog links remain relative (aside from the root absolute ref), I would expect to be able to walk and read items locally from the catalog as entirely relative links, as if it were SELF_CONTAINED. However, I was unable to do so:
It appears there's a make_all_links_relative() function that might do what I want, but that throws the same error:
Perhaps a few potential solutions here:
Edit: As a manual workaround, I just deleted the absolute self link from the root catalog.json after I downloaded it.
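The manual workaround above could be scripted along these lines. This is a minimal sketch that edits the JSON directly with the standard library rather than going through PySTAC; `strip_self_links` is a hypothetical helper name, and it assumes the file has the usual top-level `links` array of objects with a `rel` field. Whether removing the self link is sufficient may depend on the PySTAC version in use.

```python
import json
from pathlib import Path


def strip_self_links(catalog_path: str) -> int:
    """Remove every link with rel == "self" from a STAC JSON file, in place.

    Returns the number of links removed.
    """
    path = Path(catalog_path)
    doc = json.loads(path.read_text(encoding="utf-8"))

    links = doc.get("links", [])
    kept = [link for link in links if link.get("rel") != "self"]
    doc["links"] = kept

    # Overwrite the downloaded copy; the published original is untouched.
    path.write_text(json.dumps(doc, indent=2), encoding="utf-8")
    return len(links) - len(kept)
```

Calling `strip_self_links('path/to/catalog.json')` on the downloaded root catalog before reading it leaves the other links (root, child, item) as they were.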