Non-indexing transcript PDFs and/or XML#60 #3

Open
labradford opened this issue Mar 9, 2023 · 2 comments

@labradford
Collaborator

I'm trying to track down some functional inconsistencies and log errors. I have been seeing an issue when trying to display the XML transcript for items.

Next to the session item (bottom half of the page, on the right), clicking the 'play' icon should open an XML box with the indexed transcript. The first case below shows the expected behavior; the second does not.

Proper behavior:
https://oralhistory.library.ucla.edu/catalog/21198-zz002kd7t9

Non-working behavior:
https://oralhistory.library.ucla.edu/catalog/21198-zz002kpjm4
JavaScript error:
GET blob:https://oralhistory.library.ucla.edu/8c6c73a5-57a4-44c5-a22d-df6760328a7f net::ERR_FILE_NOT_FOUND

I suspect the XML is not getting indexed for some reason. Are there any hard-coded paths or quirks that might cause this issue?
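One way to test that suspicion would be to compare the Solr documents for the working and non-working items and see which fields the broken one lacks. The following is only a rough sketch: the core name `blacklight-core`, the `SOLR_URL` environment variable, and the assumption that the catalog identifier is also the Solr `id` are all guesses, not taken from the application.

```ruby
require "json"
require "net/http"
require "uri"

# Assumption: host and port come from the error log in this issue;
# the core name "blacklight-core" is a guess and may differ.
SOLR_BASE = ENV.fetch("SOLR_URL", "http://solr:8983/solr/blacklight-core")

# Build the Solr select URL for a single record id.
def solr_doc_uri(id, base = SOLR_BASE)
  uri = URI("#{base}/select")
  uri.query = URI.encode_www_form(q: "id:\"#{id}\"", wt: "json", rows: 1)
  uri
end

# Fetch the first matching document as a Hash, or {} if nothing is indexed.
def fetch_solr_doc(id, base = SOLR_BASE)
  body = Net::HTTP.get(solr_doc_uri(id, base))
  JSON.parse(body).dig("response", "docs", 0) || {}
end

# Field names present in the working document but absent from the broken one.
def missing_fields(working_doc, broken_doc)
  (working_doc.keys - broken_doc.keys).sort
end

# Example usage (requires network access to Solr):
#   working = fetch_solr_doc("21198-zz002kd7t9")
#   broken  = fetch_solr_doc("21198-zz002kpjm4")
#   puts missing_fields(working, broken)
```

If a transcript-related field shows up in the output, that would support the idea that the XML was never indexed for the second item. An empty result would point toward the front end instead.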

Possibly related, though perhaps not: we have a fair number of PDF indexing errors in the log. For example:

job_class: IndexPdfTranscriptJob
job_id: 5cf81f52-27ae-4db6-909e-e625f6e7826d
provider_job_id:
queue_name: default
priority:
arguments:

  • 21198-zz002knr6c
  • https://static.library.ucla.edu/oralhistory/pdf/submasters/21198-zz002knr6c-3-submaster.pdf

executions: 0
locale: en
Last Error
Failed to open TCP connection to solr:8983 (No route to host - connect(2) for "solr" port 8983)
/usr/local/rvm/rubies/ruby-2.5.8/lib/ruby/2.5.0/net/http.rb:939:in `rescue in block in connect'
/usr/local/rvm/rubies/ruby-2.5.8/lib/ruby/2.5.0/net/http.rb:936:in `block in connect'
/usr/local/rvm/rubies/ruby-2.5.8/lib/ruby/2.5.0/timeout.rb:93:in `block in timeout'
/usr/local/rvm/rubies/ruby-2.5.8/lib/ruby/2.5.0/timeout.rb:103:in `timeout'
/usr/local/rvm/rubies/ruby-2.5.8/lib/ruby/2.5.0/net/http.rb:935:in `connect'
/usr/local/rvm/rubies/ruby-2.5.8/lib/ruby/2.5.0/net/http.rb:920:in `do_start'
/usr/local/rvm/rubies/ruby-2.5.8/lib/ruby/2.5.0/net/http.rb:909:in `start'
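
The "No route to host" message suggests the job worker could not reach the `solr` host at all, which is separate from any problem with the PDF itself. A minimal sketch of a connectivity check that could be run from the worker's environment follows. The helper name `tcp_reachable?` is made up for illustration; only the host and port come from the log above.

```ruby
require "socket"

# Hypothetical helper: returns true if a TCP connection to host:port
# opens within `timeout` seconds, false on any connection failure.
def tcp_reachable?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Covers refused connections, unreachable hosts, timeouts,
  # and host names that do not resolve.
  false
end

# Example usage, with host and port taken from the error message:
#   puts tcp_reachable?("solr", 8983)
```

If this returns false from the worker but true from the web container, the failed jobs are probably a networking or service-naming problem rather than an indexing bug.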
@aprilrieger aprilrieger moved this to Ready for Development in oral_history Mar 27, 2023
@aprilrieger aprilrieger self-assigned this Mar 27, 2023
@aprilrieger

This is solved with 17-index-pdf-transcript-metadata

@aprilrieger aprilrieger moved this from Ready for Development to Code Review in oral_history Mar 28, 2023
@aprilrieger aprilrieger moved this from Code Review to SoftServ QA in oral_history Mar 28, 2023
@aprilrieger aprilrieger moved this from SoftServ QA to Ready for Development in oral_history Apr 7, 2023
@aprilrieger aprilrieger moved this from Ready for Development to Deploy to Staging in oral_history Sep 22, 2023
@aprilrieger aprilrieger moved this from Deploy to Staging to Client QA in oral_history Sep 23, 2023
@aprilrieger

aprilrieger commented Dec 7, 2023

This needs rework, which will be done during the OAI feed update.

@aprilrieger aprilrieger moved this from Client QA to Ready for Development in oral_history Dec 7, 2023
Projects
Status: Ready for Development