Upgrade PyCSW #608

Closed
adborden opened this issue Mar 2, 2019 · 18 comments
@adborden
Contributor

adborden commented Mar 2, 2019

We're currently on a fork of 1.8.5 from Oct 2015. 2.2.0 is available from March 2018.

@adborden
Contributor Author

adborden commented Mar 2, 2019

They're now on python 3, so we might consider separating the pycsw virtualenv from ckan.

@adborden
Contributor Author

adborden commented Mar 2, 2019

Scratch that, it still works with python 2, but we should still put it in its own virtualenv.

@adborden
Contributor Author

adborden commented Mar 2, 2019

I took a quick look at this and it's a low-to-medium effort. Our customizations to pycsw are all for the ckan loading, which exists in ckanext-spatial. So the plan becomes:

  1. Pull in any relevant updates to ckanext-spatial.
  2. Remove our fork of pycsw and use 2.2.0.
  3. Reconfigure the load job to use ckan-pycsw from ckanext-spatial.

@adborden
Contributor Author

adborden commented Mar 6, 2019

This is blocked by ckan/ckanext-spatial#49.

@adborden
Contributor Author

adborden commented Mar 6, 2019

But we can still upgrade our fork of pycsw.

@adborden adborden self-assigned this Mar 6, 2019
@adborden
Contributor Author

https://github.com/geopython/pycsw/releases/tag/2.4.0 I don't see a changelog or similar, so I don't know exactly what the update brings.

@adborden
Contributor Author

Upgrading PyCSW to provide a more stable instance of CSW for FGDC.

@adborden
Contributor Author

Testing in staging shows we'll need to upgrade requests and OWSLib. I don't know the minimum versions required, but the latest versions, requests==2.22.0 and OWSLib==0.18.0, work in my testing. We still need to check that the harvester works.

Moving to the latest OWSLib, we should be able to drop our fork https://github.com/GSA/OWSLib
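
A quick way to confirm that an environment matches the versions tested above is sketched below. This is only an illustration: the helper name `check_pins` is hypothetical, it relies on `importlib.metadata` (Python 3.8+, so it would not run in the Python 2.7 virtualenv shown elsewhere in this thread), and the pins are the "known to work" versions from the comment, not confirmed minimums.

```python
from importlib import metadata  # Python 3.8+

# Versions reported as working in staging (taken from the comment above).
TESTED_PINS = {"requests": "2.22.0", "OWSLib": "0.18.0"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    `get_version` is injectable so the check can be exercised without
    touching the real environment.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in sorted(check_pins(TESTED_PINS).items()):
        print("{}: expected {}, found {}".format(name, expected, installed))
```

An empty result means both packages are installed at exactly the tested versions.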

@adborden
Contributor Author

Just ran into this :/

2019-07-31 00:31:12,346 [pycsw-ckan] DEBUG: Fetching url=https://catalog-datagov.dev-ocsit.bsp.gsa.gov//harvest/object/68f3852b-53af-4953-b6d7-df9b67e48072
2019-07-31 00:31:12,378 [urllib3.connectionpool] DEBUG: https://catalog-datagov.dev-ocsit.bsp.gsa.gov:443 "GET //harvest/object/68f3852b-53af-4953-b6d7-df9b67e48072 HTTP/1.1" 200 3031
2019-07-31 00:31:12,379 [pycsw-ckan] DEBUG: Open Data JSON detected. Converting to ISO XML: de6c0602-115e-4d0b-8292-08d1e3ced8c0
Traceback (most recent call last):
  File "/usr/lib/ckan/bin/pycsw-ckan.py", line 7, in <module>
    exec(compile(f.read(), __file__, 'exec'))
  File "/usr/lib/ckan-new/src/pycsw/bin/pycsw-ckan.py", line 423, in <module>
    __reconcile(gathered_records, existing_records)
  File "/usr/lib/ckan-new/src/pycsw/bin/pycsw-ckan.py", line 281, in __reconcile
    record = get_record(ckan_api, CONTEXT, repo, CKAN_URL, ckan_id, ckan_info)
  File "/usr/lib/ckan-new/src/pycsw/bin/pycsw-ckan.py", line 143, in get_record
    template = env.get_template(tmpl)
  File "/usr/lib/ckan/lib/python2.7/site-packages/jinja2/environment.py", line 719, in get_template
    return self._load_template(name, self.make_globals(globals))
  File "/usr/lib/ckan/lib/python2.7/site-packages/jinja2/environment.py", line 693, in _load_template
    template = self.loader.load(self, name, globals)
  File "/usr/lib/ckan/lib/python2.7/site-packages/jinja2/loaders.py", line 115, in load
    source, filename, uptodate = self.get_source(environment, name)
  File "/usr/lib/ckan/lib/python2.7/site-packages/jinja2/loaders.py", line 180, in get_source
    raise TemplateNotFound(template)
jinja2.exceptions.TemplateNotFound: datajson2iso.xml

@adborden
Contributor Author

Worked around this by updating the path; it's using the source install path, not the virtualenv.
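
One way to make this kind of lookup less fragile is to check several candidate directories for the template before handing one to Jinja2. The sketch below is an assumption about how that could look, not what pycsw-ckan.py actually does; `find_template_dir` is a hypothetical helper and the candidate directories are left to the caller because the thread does not say where `datajson2iso.xml` is installed.

```python
import os


def find_template_dir(candidates, template_name="datajson2iso.xml"):
    """Return the first directory in `candidates` that contains `template_name`.

    Raises IOError if none of the directories has the template, so the
    failure names every path that was tried.
    """
    for directory in candidates:
        if os.path.isfile(os.path.join(directory, template_name)):
            return directory
    raise IOError(
        "{} not found in any of: {}".format(template_name, ", ".join(candidates))
    )
```

The returned directory could then be passed to `jinja2.FileSystemLoader`, which also accepts a list of search paths if trying them in order is enough.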

@adborden
Contributor Author

There's no migration for updating the database, so the tables have to be recreated:

  1. The delete_records command gets killed (likely OOM), so drop the tables with psql instead.
  2. Make sure PostGIS is enabled: CREATE EXTENSION postgis;
  3. Create the tables: $venv/bin/pycsw-ckan.py -c setup_db -f /etc/ckan/pycsw-all.cfg
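
The steps above can be sketched as a small helper that prints the statements and command to run, without executing anything. This is a rough illustration under assumptions: the thread does not name the tables that were dropped, so the sketch assumes the only one is the table named in the `[repository]` section of the pycsw config (`table=records` in pycsw's sample config); `rebuild_plan` and the example virtualenv path are hypothetical.

```python
import configparser


def rebuild_plan(cfg_text, venv, cfg_path):
    """Build the ordered list of steps for recreating the pycsw tables.

    Assumes the table to drop is the one named by `[repository] table`
    in the pycsw config; nothing is executed here.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    table = parser.get("repository", "table")
    return [
        # Step 1: run in psql, since delete_records was killed.
        'DROP TABLE IF EXISTS "{}";'.format(table),
        # Step 2: run in psql, harmless if PostGIS is already enabled.
        "CREATE EXTENSION IF NOT EXISTS postgis;",
        # Step 3: run in a shell, the setup_db command from the comment above.
        "{}/bin/pycsw-ckan.py -c setup_db -f {}".format(venv, cfg_path),
    ]


if __name__ == "__main__":
    sample = "[repository]\ntable=records\n"
    for step in rebuild_plan(sample, "/path/to/venv", "/etc/ckan/pycsw-all.cfg"):
        print(step)
```

Printing the plan rather than running it keeps the destructive `DROP TABLE` a deliberate manual step.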

@adborden
Contributor Author

This was shipped!

@kalxas

kalxas commented Sep 26, 2019

I see INSPIRE support enabled but no service metadata populated. I think it should be disabled.

@adborden
Contributor Author

Thanks @kalxas, can you clarify? Our config is based on the default example configuration provided by pycsw. The documentation for the extension merely says to set enabled to true; it does not mention that any additional configuration is required.

Is that not the case? Should I report this upstream?

@kalxas

kalxas commented Sep 26, 2019

I would recommend setting this to false, since INSPIRE applies to EU countries :)
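
For reference, a minimal sketch of flipping that setting programmatically, assuming the option lives in a `[metadata:inspire]` section with an `enabled` key as in pycsw's sample configuration. The helper name `disable_inspire` is hypothetical, and note that `configparser` drops comments when it rewrites a file, so editing the config by hand may be preferable.

```python
import configparser
import io


def disable_inspire(cfg_text):
    """Return the config text with INSPIRE support switched off.

    Assumes the `[metadata:inspire]` section and `enabled` key used in
    pycsw's sample config; configs without that section are returned
    with their settings unchanged (comments are not preserved).
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if parser.has_section("metadata:inspire"):
        parser.set("metadata:inspire", "enabled", "false")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()
```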

@adborden
Contributor Author

Thanks @kalxas! I opened #986 to address the configuration. Is there any documentation on INSPIRE you can link to? I'm not familiar with it.

@kalxas

kalxas commented Sep 26, 2019
