Changing seed objects access rights to dark #10

Open
edsu opened this issue May 29, 2022 · 7 comments
Labels
web archiving 2022 web archiving work cycle

Comments

@edsu
Contributor

edsu commented May 29, 2022

When the access rights for a Seed Object change to Dark, its URL should no longer be accessible via swap.stanford.edu. If the rights are made World, any restriction on that URL should be removed. Rights changes associated with Crawl Objects are considered separately from this issue (see #27).

pywb has embargo and access controls that can be configured to block access based on the URL itself.

The controls are stored in an ACLJ file. The wb-manager command line tool can be used to modify ACLJ files:

$ wb-manager acl -h
usage: wb-manager acl [-h] {add,remove,list,validate,match,importtxt} ...

positional arguments:
  {add,remove,list,validate,match,importtxt}

options:
  -h, --help            show this help message and exit
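
As a rough sketch of how the World / Dark change could map onto this tool, the Ruby snippet below builds the argument list for a wb-manager acl call. It assumes that the add subcommand takes a collection name, a URL and an access value, and that remove takes a collection name and a URL; this should be checked against the installed pywb version. The build_acl_command helper, the "was" collection name and the default access value of block (rather than exclude) are illustrative assumptions, not decisions made in this issue.

```ruby
# Hypothetical helper: map a rights value to wb-manager arguments.
# Assumes `acl add <collection> <url> <access>` and
# `acl remove <collection> <url>`; verify against your pywb version.
def build_acl_command(collection, url, rights, access: 'block')
  case rights.to_s.downcase
  when 'dark'
    # Dark: add a rule so the URL is no longer served
    ['wb-manager', 'acl', 'add', collection, url, access]
  when 'world'
    # World: remove any existing rule for the URL
    ['wb-manager', 'acl', 'remove', collection, url]
  else
    raise ArgumentError, "unsupported rights value: #{rights.inspect}"
  end
end

# Passing the array to Kernel#system avoids shell interpolation of the URL:
#   system(*build_acl_command('was', 'https://example.org/', 'dark'))
```

Returning an argument array rather than a command string keeps the helper easy to test without pywb installed.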

The current thinking is to write a daemon in Ruby, similar to the rolling_index in dor_indexing_app, which listens to RabbitMQ for objects that are changing and inspects them to see if they are Seed Objects. On finding a rights change to a Seed Object, the daemon will make the corresponding ACLJ update.

This daemon can be part of the was-pywb repository, and should get deployed and started by Capistrano.
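
The decision the daemon would make for each message can be sketched as a small pure function. The message shape used here (a hash with type, previous_rights and rights keys) is entirely hypothetical, since the actual RabbitMQ payload is not described in this issue, and acl_action_for is likewise an illustrative name.

```ruby
# Hypothetical sketch: decide what to do with one "object changed" message.
# Returns :block, :unblock, or nil when nothing needs to happen.
def acl_action_for(message)
  # Ignore anything that is not a Seed Object
  return nil unless message['type'] == 'seed'
  # Ignore changes that did not touch the access rights
  return nil if message['rights'] == message['previous_rights']

  case message['rights']
  when 'dark'  then :block    # URL should stop being served
  when 'world' then :unblock  # any restriction should be lifted
  end                         # other rights values fall through to nil
end
```

Keeping this logic free of RabbitMQ and pywb calls means it can be unit tested on its own, which may help with the concern about a rarely exercised service failing quietly.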

@edsu added the "web archiving" (2022 web archiving work cycle) and "needs analysis" (cannot proceed with this issue without analysis) labels on May 29, 2022
@lwrubel
Contributor

lwrubel commented Jun 9, 2022

As noted in #5, there does not appear to be a pywb endpoint for these updates, and other IIPC users are making configuration changes manually.

@edsu
Contributor Author

edsu commented Jun 9, 2022

I spoke with @ikreymer who recommended against Option 1 given the potential complexity of the change. He suggested Option 2 might be feasible, especially since the service could be a small wrapper around the wb-manager system calls, and could potentially be written in Ruby. Option 3 is still out there, and is really just a question of how feasible it is to add YAML editing on a share into how rights are managed elsewhere in SDR.

@edsu
Contributor Author

edsu commented Jun 9, 2022

I've updated the description to better reflect that we are only talking about changes to Seed Objects in this issue.

@edsu changed the title from "Manage remote configuration changes" to "Changing access rights for Seed Objects" on Jun 9, 2022
@edsu removed the "needs analysis" (cannot proceed with this issue without analysis) label on Jun 17, 2022
@edsu
Contributor Author

edsu commented Jun 17, 2022

I've updated the description in this issue to reflect a conversation with @jcoyne about a possible architectural design for this. The current thinking is we will use RabbitMQ to listen for changes to seed objects, and make the relevant ACLJ updates.

@lwrubel
Contributor

lwrubel commented Jun 20, 2022

The focus of this issue is being able to change World rights to Dark (and vice versa) via Argo and have the ACLJ file updated accordingly. Adding Stanford-only visibility is not part of this work.

@edsu
Contributor Author

edsu commented Jun 20, 2022

Updated this issue to focus on the binary World / Dark change. Stanford-only access rights changes have been moved to #46.

@edsu changed the title from "Changing access rights for Seed Objects" to "Changing seed objects access rights to dark" on Jun 20, 2022
@edsu
Contributor Author

edsu commented Jun 22, 2022

@lwrubel and I had a quick conversation with @justinlittman about this, and he indicated that we should gauge how often we expect changes like this to be made before we create automation around it. The thought is that a new daemon service that isn't used very often could fail in hard-to-diagnose ways. Are there obvious downsides to continuing to do the exclusions manually? These are largely questions for @peterchanws & @andrewjbtw to consider before we move forward with implementing this.
