Add user-agent to openlibrary API calls
Why are these changes being introduced:

* OpenLibrary API calls were failing because OpenLibrary blocks
  requests that do not set a User-Agent

Relevant ticket(s):

* https://mitlibraries.atlassian.net/browse/TCO-109

How does this address that need:

* Follow OpenLibrary documentation to add a User Agent for our
  requests, including an email address they can contact if issues
  arise with our calls

Document any side effects to this change:

* Created a new moira list for tacos maintainers (tacos-help) and added
  engx-lib as the owners
* Updated Unpaywall code to use the same email (from ENV) as OpenLibrary
  to avoid setting the same value to two different variables
* Added some VCR cassette scrubbing to keep some "not sensitive but
  probably best to not store in cassettes" data out of the recordings
* Confirmed that deleting cassettes regenerates them with the data
  scrubbed, without needing to change ENV between runs as some other
  apps require (I'm not sure why, to be honest).
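
The shared-email change above can be sketched in isolation. This is a minimal illustration, not the application's code: `ContactEmail`, `open_library_headers`, and `unpaywall_url` are hypothetical names introduced here, and only the `TACOS_EMAIL` variable, the User-Agent string, and the Unpaywall URL shape are taken from this commit.

```ruby
# Hypothetical helper: reads the single contact address shared by both APIs.
module ContactEmail
  # Raises KeyError when TACOS_EMAIL is unset, mirroring ENV.fetch in the commit.
  def self.fetch(env = ENV)
    env.fetch('TACOS_EMAIL')
  end
end

# Headers for an OpenLibrary request, including the User-Agent with contact email.
def open_library_headers(email)
  {
    'Accept' => 'application/json',
    'Content-Type' => 'application/json',
    'User-Agent' => "MITL TACOS (#{email})"
  }
end

# Unpaywall takes the contact address as a query parameter instead of a header.
def unpaywall_url(doi, email)
  "https://api.unpaywall.org/v2/#{doi}?email=#{email}"
end
```

Passing a plain hash as `env` (for example `ContactEmail.fetch({ 'TACOS_EMAIL' => 'dev@example.org' })`) makes it possible to exercise both builders without touching the real environment.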
JPrevost committed Jan 13, 2025
1 parent 161862b commit f8335b3
Showing 15 changed files with 105 additions and 81 deletions.
2 changes: 1 addition & 1 deletion .env.test
@@ -1,3 +1,3 @@
 DETECTOR_VERSION=1
 LINKRESOLVER_BASEURL=https://mit.primo.exlibrisgroup.com/discovery/openurl?institution=01MIT_INST&rfr_id=info:sid/mit.tacos.api&vid=01MIT_INST:MIT
-UNPAYWALL_EMAIL=timdex@mit.edu
+TACOS_EMAIL=tacos-help@mit.edu
2 changes: 1 addition & 1 deletion README.md
@@ -65,7 +65,7 @@ changes, this is the signal which indicates that terms need to be re-evaluated.

 `ORIGINS`: comma-separated list of domains allowed to connect to (and thus query or contribute to) the application. Be sure to specify the port number if a connecting application is not using the standard ports (this applies mostly to local development). If not defined, no external connections will be permitted.

-`UNPAYWALL_EMAIL`: email address to include in API call as required in their [documentation](https://unpaywall.org/products/api). Your personal email is appropriate for development. Deployed and for tests, use the timdex moira list email.
+`TACOS_EMAIL`: email address to include in API calls or contact information. Currently used in API calls to [Unpaywall](https://unpaywall.org/products/api) and [OpenLibrary](https://openlibrary.org/developers/api). Your personal email is appropriate for development. Deployed and for tests, use the tacos-help moira list email.

 ### Optional

2 changes: 1 addition & 1 deletion app.json
@@ -11,7 +11,7 @@
     "LINKRESOLVER_BASEURL": {
       "required": false
     },
-    "UNPAYWALL_EMAIL": {
+    "TACOS_EMAIL": {
       "required": false
     }
   },
2 changes: 1 addition & 1 deletion app/models/lookup_doi.rb
@@ -30,7 +30,7 @@ def extract_metadata(external_data)
   end

   def url(doi)
-    "https://api.unpaywall.org/v2/#{doi}?email=#{ENV.fetch('UNPAYWALL_EMAIL')}"
+    "https://api.unpaywall.org/v2/#{doi}?email=#{ENV.fetch('TACOS_EMAIL')}"
   end

   def fetch(doi)
5 changes: 4 additions & 1 deletion app/models/lookup_isbn.rb
@@ -36,7 +36,10 @@ def fetch_authors(isbn_json)
   end

   def parse_response(url)
-    resp = HTTP.headers(accept: 'application/json', 'Content-Type': 'application/json').follow.get(url)
+    email = ENV.fetch('TACOS_EMAIL')
+    resp = HTTP.headers(accept: 'application/json',
+                        'Content-Type': 'application/json',
+                        'User-Agent': "MITL TACOS (#{email})").follow.get(url)

     if resp.status == 200
       JSON.parse(resp.to_s)
22 changes: 22 additions & 0 deletions test/test_helper.rb
@@ -17,6 +17,28 @@
 VCR.configure do |config|
   config.cassette_library_dir = 'test/vcr_cassettes'
   config.hook_into :webmock
+
+  # Filter TACOS email. It's not sensitive, but keeping it out of code is still good practice to avoid spam
+  config.filter_sensitive_data('FAKE_TACOS_EMAIL') do
+    ENV.fetch('TACOS_EMAIL', nil).to_s
+  end
+
+  config.before_record do |interaction|
+    header = interaction.response&.headers&.[]('Report-To')
+    header&.each do |redacted_text|
+      interaction.filter!(redacted_text, '<REDACTED_REPORT_TO>')
+    end
+
+    header = interaction.response&.headers&.[]('Reporting-Endpoints')
+    header&.each do |redacted_text|
+      interaction.filter!(redacted_text, '<REDACTED_REPORTING_ENDPOINT>')
+    end
+
+    header = interaction.response&.headers&.[]('Nel')
+    header&.each do |redacted_text|
+      interaction.filter!(redacted_text, '<REDACTED_NEL>')
+    end
+  end
 end

 module ActionDispatch
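
The header scrubbing in the hunk above can be approximated in plain Ruby to see what it does to a recorded response. This is a sketch under stated assumptions, not VCR itself: `scrub_headers` and `REDACTIONS` are hypothetical names, the headers are modeled as a hash of header name to array of strings, and unlike `interaction.filter!` (which substitutes the matched text wherever it appears in the interaction) this stand-in only rewrites the header values.

```ruby
# Placeholder per header, using the same names and placeholders as the commit.
REDACTIONS = {
  'Report-To' => '<REDACTED_REPORT_TO>',
  'Reporting-Endpoints' => '<REDACTED_REPORTING_ENDPOINT>',
  'Nel' => '<REDACTED_NEL>'
}.freeze

# Returns a new hash in which every value of a listed header is replaced by
# its placeholder; headers that are not listed are passed through unchanged.
def scrub_headers(headers, redactions = REDACTIONS)
  headers.to_h do |name, values|
    placeholder = redactions[name]
    [name, placeholder ? values.map { placeholder } : values]
  end
end
```

A lookup table like this is one possible way to express the three near-identical blocks in `before_record` as a single loop; whether that is preferable to the explicit version is a judgment call.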
16 changes: 8 additions & 8 deletions test/vcr_cassettes/doi_10_1038/d41586-023-03497-2.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

16 changes: 8 additions & 8 deletions test/vcr_cassettes/doi_not_found.yml

32 changes: 16 additions & 16 deletions test/vcr_cassettes/isbn_978-0-08-102133-0.yml

12 changes: 6 additions & 6 deletions test/vcr_cassettes/isbn_not_found.yml

15 changes: 7 additions & 8 deletions test/vcr_cassettes/issn_1078-8956.yml

8 changes: 4 additions & 4 deletions test/vcr_cassettes/issn_not_found.yml
