Problem: It isn't easy to create reusable code in mets-reader-writer (metsrw) to use only the (user) needed PREMIS containers #1304

Open
ross-spencer opened this issue Sep 3, 2020 · 2 comments
Labels
Ⓜ️ mets/premis METS/PREMIS issues Request: discussion The path towards resolving the issue is unclear and opinion is sought from other community members. Status: refining The issue needs additional details to ensure that requirements are clear. Type: feature New functionality. Wellcome Wellcome Trust

ross-spencer commented Sep 3, 2020

Please describe the problem you'd like to be solved

Creating a PREMIS event might be done as follows:

    premis_data = (
        "event",
        PREMIS_META,
        (
            "event_identifier",
            ("event_identifier_type", ID_TYPE),
            ("event_identifier_value", event.event_id),
        ),
        ("event_type", event.event_type),
        ("event_date_time", event.event_datetime),
        ("event_detail_information", ("event_detail", event.event_detail)),
        (
            "event_outcome_information",
            ("event_outcome", event.event_outcome),
            (
                "event_outcome_detail",
                ("event_outcome_detail_note", event.event_outcome_detail),
            ),
        ),
    )
    for agent in event.agents.all():
        premis_data += (
            (
                "linking_agent_identifier",
                ("linking_agent_identifier_type", agent.identifiertype),
                ("linking_agent_identifier_value", agent.identifiervalue),
            ),
        )

    for linking_object_uuid in linking_object_uuids:
        premis_data += (
            (
                "linkingObjectIdentifier",
                ("linking_object_identifier_type", ID_TYPE),
                ("linking_object_identifier_value", linking_object_uuid),
                ("linking_object_role", SOURCE_ROLE),
            ),
        )

    return metsrw.plugins.premisrw.data_to_premis(
        premis_data, premis_version=PREMIS_META["version"]
    )

That's going to satisfy me for pretty much every event, so I can reuse it. But events use containers differently. Below is a very rough summary of the events I've audited (in Archivematica) that output empty containers (containers that are not mandatory in the PREMIS schema):

Ingestion
---------

    <premis:eventDetailInformation>
      <premis:eventDetail></premis:eventDetail>
    </premis:eventDetailInformation>
    
    
    <premis:eventOutcomeInformation>
      <premis:eventOutcome></premis:eventOutcome>
      <premis:eventOutcomeDetail>
        <premis:eventOutcomeDetailNote></premis:eventOutcomeDetailNote>
      </premis:eventOutcomeDetail>
    </premis:eventOutcomeInformation>
                        
Registration
------------

    <premis:eventDetailInformation>
      <premis:eventDetail></premis:eventDetail>
    </premis:eventDetailInformation>            
    
    <premis:eventOutcomeInformation>
      <premis:eventOutcome></premis:eventOutcome>
      <premis:eventOutcomeDetail>
        <premis:eventOutcomeDetailNote>accession#DemoCSV1</premis:eventOutcomeDetailNote>
      </premis:eventOutcomeDetail>
    </premis:eventOutcomeInformation>    
    
Fixity check
------------

    <premis:eventOutcomeDetail>
      <premis:eventOutcomeDetailNote></premis:eventOutcomeDetailNote>
    </premis:eventOutcomeDetail>
  
Metadata extraction
-------------------

    <premis:eventDetailInformation>
      <premis:eventDetail></premis:eventDetail>
    </premis:eventDetailInformation>
    <premis:eventOutcomeInformation>
      <premis:eventOutcome></premis:eventOutcome>
      <premis:eventOutcomeDetail>
        <premis:eventOutcomeDetailNote>"METS-tools.15e219c3-0f51-4d32-80f4-577edfeceb05.xml#xpointer(id('techMD_1').xml"</premis:eventOutcomeDetailNote>
      </premis:eventOutcomeDetail>
    </premis:eventOutcomeInformation>  
    
Name cleanup
------------

    <premis:eventOutcomeInformation>
      <premis:eventOutcome></premis:eventOutcome>    
      
Normalization
-------------

    <premis:eventOutcomeInformation>
      <premis:eventOutcome></premis:eventOutcome>      
      
Creation
--------

    <premis:eventDetailInformation>
      <premis:eventDetail></premis:eventDetail>
    </premis:eventDetailInformation>
    <premis:eventOutcomeInformation>
      <premis:eventOutcome></premis:eventOutcome>
      <premis:eventOutcomeDetail>
        <premis:eventOutcomeDetailNote></premis:eventOutcomeDetailNote>
      </premis:eventOutcomeDetail>
    </premis:eventOutcomeInformation>

As a user, I might want to output containers conditionally, depending on whether they are used, but the nesting of tuples makes the structure a) difficult to construct conditionally and b) difficult to filter once created without more thorough processing.

Describe the solution you'd like to see implemented

I want to be able to signal to metsrw that I don't want something: either to construct a condensed PREMIS representation more easily and with less code repetition, or to ask metsrw to output a condensed PREMIS representation when required.
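
To make the "construct more easily" option concrete, here is a minimal sketch of two small builder helpers that drop empty values while the tuples are being assembled. The names `leaf`, `container` and `build_event_outcome_information` are hypothetical and are not part of metsrw. The sketch only assumes the nested-tuple layout used in the example above, and it treats `None` and `""` as "empty".

```python
def leaf(name, value):
    """Return (name, value), or None when the value is empty (None or "")."""
    return (name, value) if value not in (None, "") else None


def container(name, *children):
    """Return (name, child, ...) with None children removed.

    Returns None when no children are left, so that an empty container
    disappears from its own parent as well.
    """
    kept = tuple(child for child in children if child is not None)
    return (name,) + kept if kept else None


def build_event_outcome_information(outcome, detail_note):
    """Hypothetical example: build eventOutcomeInformation without empty parts."""
    return container(
        "event_outcome_information",
        leaf("event_outcome", outcome),
        container(
            "event_outcome_detail",
            leaf("event_outcome_detail_note", detail_note),
        ),
    )
```

With helpers like these, the caller would only append the result to `premis_data` when it is not `None`, which keeps the conditional logic in one place instead of repeating it per container.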

Describe alternatives you've considered

For now, folks can use the verbose constructor to achieve this. From what I can see, though, where the nesting gets deeper and more complex, parts of functions will need to be duplicated in other helper methods, which is redundant and less clean to write.

Additional context

Related artefactual-labs/mets-reader-writer#43


For Artefactual use:

Before you close this issue, you must check off the following:

  • All pull requests related to this issue are properly linked
  • All pull requests related to this issue have been merged
  • A testing plan for this issue has been implemented and passed (testing plan information should be included in the issue body or comments)
  • Documentation regarding this issue has been written and merged
  • Details about this issue have been added to the release notes
@ross-spencer ross-spencer added Type: feature New functionality. Ⓜ️ mets/premis METS/PREMIS issues Wellcome Wellcome Trust labels Sep 3, 2020

ross-spencer commented Sep 3, 2020

The situation might not be as dire as described, but I'm not sure. This is one approach that works (it took some thinking backwards and less linearly):

    premis_data = (
        "event",
        PREMIS_META,
        (
            "event_identifier",
            ("event_identifier_type", ID_TYPE),
            ("event_identifier_value", event.event_id),
        ),
        ("event_type", event.event_type),
        ("event_date_time", event.event_datetime),
    )

    if event.event_detail:
        premis_data += (
            ("event_detail_information", ("event_detail", event.event_detail)),
        )

    # Collect the children of eventOutcomeInformation, skipping empty ones.
    # Each child is wrapped in a one-element tuple so that += appends it as
    # a nested element instead of flattening it into the parent tuple.
    event_outcome_info = ()
    if event.event_outcome:
        event_outcome_info += (("event_outcome", event.event_outcome),)
    if event.event_outcome_detail:
        event_outcome_info += (
            (
                "event_outcome_detail",
                ("event_outcome_detail_note", event.event_outcome_detail),
            ),
        )
    if event_outcome_info:
        premis_data += (("event_outcome_information",) + event_outcome_info,)


    for agent in event.agents.all():
        premis_data += (
            (
                "linking_agent_identifier",
                ("linking_agent_identifier_type", agent.identifiertype),
                ("linking_agent_identifier_value", agent.identifiervalue),
            ),
        )

    for linking_object_uuid in linking_object_uuids:
        premis_data += (
            (
                "linkingObjectIdentifier",
                ("linking_object_identifier_type", ID_TYPE),
                ("linking_object_identifier_value", linking_object_uuid),
                ("linking_object_role", SOURCE_ROLE),
            ),
        )

    return metsrw.plugins.premisrw.data_to_premis(
        premis_data, premis_version=PREMIS_META["version"]
    )

Ultimately it might still be better to teach metsrw to understand empty or null values as instructions to leave a field out, or some other solution as described above.
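
As a rough illustration of what "treat empty or null values as an instruction to leave a field out" could look like, the sketch below filters an already-built tuple structure before it is handed to `data_to_premis`. The function `prune_empty` is hypothetical and is not an existing metsrw API. It assumes each element is a tuple of the form `(name, child, ...)`, that a `dict` child (such as `PREMIS_META`) holds attributes, and that `None` and `""` count as empty.

```python
def prune_empty(data):
    """Recursively drop empty leaves and any containers they leave empty.

    Returns the pruned tuple, or None if nothing but the name (and
    possibly an attribute dict) would remain.
    """
    name, *children = data
    kept = []
    for child in children:
        if isinstance(child, dict):
            # Attribute dicts are kept as-is; they are not content.
            kept.append(child)
        elif isinstance(child, tuple):
            pruned = prune_empty(child)
            if pruned is not None:
                kept.append(pruned)
        elif child not in (None, ""):
            kept.append(child)
    # An element with no content besides attributes counts as empty.
    if not any(not isinstance(child, dict) for child in kept):
        return None
    return (name, *kept)


example = (
    "event",
    {"version": "3.0"},
    ("event_type", "registration"),
    ("event_detail_information", ("event_detail", "")),
    (
        "event_outcome_information",
        ("event_outcome", ""),
        (
            "event_outcome_detail",
            ("event_outcome_detail_note", "accession#DemoCSV1"),
        ),
    ),
)
condensed = prune_empty(example)
```

For the registration-style input above, `condensed` keeps `event_type` and the `event_outcome_detail` note, and drops the empty `event_detail_information` container and the empty `event_outcome`. Whether the pruned result is still valid against the PREMIS schema for every event type is not checked here.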


sevein commented Sep 3, 2020

Relates to #743.

@ross-spencer ross-spencer added the Request: discussion The path towards resolving the issue is unclear and opinion is sought from other community members. label Sep 4, 2020
@sromkey sromkey added the Status: refining The issue needs additional details to ensure that requirements are clear. label Sep 10, 2020