Add a test to audit tool: fail if volume has no images #84

Open
TBRC-Travis opened this issue Dec 3, 2020 · 10 comments
@TBRC-Travis

TBRC-Travis commented Dec 3, 2020

Audit tool should trigger an error if a volume only contains scan request files and no images

This issue arose out of https://github.com/buda-base/library-issues/issues/315

@TBRC-Travis TBRC-Travis self-assigned this Dec 7, 2020
@TBRC-Travis
Author

Write use case for this scenario, and consider making tests configurable per user since this test case isn't necessary for all users

@TBRC-Travis
Author

This test applies only to the "images" folder of a given work. If an image group folder under the "images" directory contains only the 2 scan request files and no images, the audit test should fail. The reason: if we sync a work with a volume that contains only scan request files, the website shows that volume as having images when it has none.
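A minimal sketch of the check described above, in Python for illustration only (not the audit tool's actual code). It assumes a volume is "empty" when its image group folder holds no more files than the two scan request pages; the function name and the default count are hypothetical.

```python
from pathlib import Path

# Assumption: every image group starts with exactly 2 scan request files.
SCAN_REQUEST_FILE_COUNT = 2


def image_groups_without_images(work_dir, scan_request_count=SCAN_REQUEST_FILE_COUNT):
    """Return the names of image group folders under <work_dir>/images that
    contain no more files than the expected scan request pages."""
    images_dir = Path(work_dir) / "images"
    if not images_dir.is_dir():
        return []
    failing = []
    for group in sorted(p for p in images_dir.iterdir() if p.is_dir()):
        file_count = sum(1 for f in group.iterdir() if f.is_file())
        if file_count <= scan_request_count:
            failing.append(group.name)
    return failing
```

A non-empty result would mean the audit fails for that work. Note that this counts files only; it cannot tell a scan request page from a real image.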

@jimk-bdrc
Collaborator

@TBRC-Travis How can a machine distinguish between scan request images and regular images? What if an image group contains only two monochrome images?

@TBRC-Travis
Author

I suppose we can't unless we want to start embedding some kind of distinguishing metadata in our scan request files.

IMHO this is a highly improbable edge case, so my vote is to let the test fail; if there really is a two-page volume with monochrome images, the test can be disabled for it after the fact.

@TBRC-Travis TBRC-Travis assigned jimk-bdrc and unassigned TBRC-Travis Feb 4, 2021
@jimk-bdrc
Collaborator

Resolved: create a test for a minimum number of files in ArchiveParent and DerivedImageGroup parent. Test property is in shell.properties. Error can also be a warning.
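As a rough illustration of a configurable minimum-file-count test, here is a Python sketch. The property keys `minimumFileCount` and `minimumFileCountIsWarning` are hypothetical; the real keys in shell.properties are not given in this thread.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def check_minimum_files(file_count, props):
    """Return "pass", "warning", or "error" for one image group folder."""
    # Hypothetical keys and defaults, chosen only for this sketch.
    minimum = int(props.get("minimumFileCount", "3"))
    as_warning = props.get("minimumFileCountIsWarning", "false").lower() == "true"
    if file_count >= minimum:
        return "pass"
    return "warning" if as_warning else "error"
```

With `minimumFileCount=3`, a folder holding only the two scan request files fails, and the second key decides whether that failure is reported as an error or a warning.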

@eroux

eroux commented Feb 4, 2021

I had to sort scan request pages for thumbnail generation; one easy way to check is the width and height in pixels.
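The pixel-size idea could be sketched as below, assuming the image dimensions have already been read by some other means. The actual scan request dimensions are not stated in this thread, so `scan_request_sizes` is a caller-supplied, hypothetical set.

```python
def only_scan_request_sizes(dimensions, scan_request_sizes):
    """Return True if the group has at least one image and every
    (width, height) pair matches a known scan request page size."""
    dims = list(dimensions)
    return bool(dims) and all(d in scan_request_sizes for d in dims)
```

A group for which this returns True would be treated as containing scan request pages only. An empty group returns False here and would need to be caught by a separate file-count check.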

@jimk-bdrc
Collaborator

jimk-bdrc commented Feb 4, 2021 via email

@jimk-bdrc
Collaborator

There is a very reliable heuristic that doesn't involve examining a file's internals: the eXists database's IG XML DOC. See archive-ops #256. The response for a work lists its image groups:

<work:volumeMap>
<work:volume imagegroup="I1PD34157" num="1"/>
<work:volume imagegroup="I1PD34158" num="2"/>
</work:volumeMap>

Querying each image group with the same query, e.g. https://www.tbrc.org/xmldoc?rid=I1PD34157, gives the files in the image group as well as the "intro images":

<imagegroup:images tbrcintro="2" total="0"/>

It should be simple enough to capture this output and make decisions based on what is there. (Note there's a variant result which could have <imagegroup:images tbrcintro="2" total="2"/>, in which case there's a pipe-delimited list of images, which could be taken to be the scan request images.)
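A minimal Python sketch of "capture this output and make decisions" follows. It works on response text that has already been fetched and uses regular expressions, because the fragments quoted above do not show their namespace declarations. The rule "real images exist only when `total` exceeds `tbrcintro`" is an assumption drawn from the two variants described above, not a documented contract, and the function names are hypothetical.

```python
import re

VOLUME = re.compile(r'<work:volume\b[^>]*\bimagegroup="([^"]+)"')
IMAGES_TAG = re.compile(r"<imagegroup:images\b([^>]*)>")
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def image_group_ids(work_xml):
    """List the image group IDs named in a work's volumeMap."""
    return VOLUME.findall(work_xml)


def image_counts(group_xml):
    """Return (tbrcintro, total) from the imagegroup:images element, or None."""
    match = IMAGES_TAG.search(group_xml)
    if match is None:
        return None
    attrs = dict(ATTRIBUTE.findall(match.group(1)))
    return int(attrs.get("tbrcintro", 0)), int(attrs.get("total", 0))


def has_real_images(group_xml):
    """Assumed rule: images beyond the intro pages exist only if total > tbrcintro."""
    counts = image_counts(group_xml)
    if counts is None:
        return False
    intro, total = counts
    return total > intro
```

Under this rule both `total="0"` and `total="2"` with `tbrcintro="2"` would fail the audit, since neither leaves any images beyond the scan request pages.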

@jimk-bdrc
Collaborator

Can we use a BUDA equivalent?

@jimk-bdrc
Collaborator

jimk-bdrc commented Jun 2, 2021
