Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use inheritance instead of key matching? #31

Open
no-reply opened this issue Jul 12, 2022 · 5 comments
Open

Use inheritance instead of key matching? #31

no-reply opened this issue Jul 12, 2022 · 5 comments

Comments

@no-reply
Copy link

in, e.g, https://github.com/samvera-labs/houndstooth/blob/main/examples/UCSD.yaml we have definitions like:

properties: # add to spec?
  "agent:contributor": # I think the quotes here are causing problems - which might be the colon in the property names is an issue...
    display_label: "Contributor"
    definition: "A person or organization responsible for making contributions to the resource."
    usage_guidelines: # I think the formatting of this list is wrong, and I need to either put everything in quotes, or turn it into a list of key/value pairs
      General: "Use as the broadest possible role, i.e. when lacking specific information on an agent's role, or the role is unclear or unknown. If a more specific role is available, prefer that, e.g. editor, compiler, illustrator."
      RDCP: "Use for institutions, centers in addition to individuals that contribute to a resource (e.g. SIO, CAICE)"
      "DLDP/SCA": "null"
      Music: "null"
      CLR: "null"

it's not clear to me in this case what "General", "RDCP", etc... represent, in terms of M3 abstractions. in other cases we have:

  usage_guidelines: 
      default: "Record in lastname, firstname order."
      Geospatial: "some other value"

where it's more clear that "Geospatial" is a class. but i'm still concerned that these examples involve a matching step after parsing to decide which parts of a property definition apply to which objects.

i'd propose an alternative:

properties: # add to spec?
  agent_contributor:
    display_label: "Contributor"
    available_on:
      - General
    definition: "A person or organization responsible for making contributions to the resource."
    usage_guidelines: "Use as the broadest possible role, i.e. when lacking specific information on an agent's role, or the role is unclear or unknown. If a more specific role is available, prefer that, e.g. editor, compiler, illustrator."

  "agent:contributor:geospatial":
    <<: *agent_contributor
    available_on:
      - Geospatial    
    usage_guidelines:  "some other value"

this gives a general solution for any case where a given property needs to be overridden per class. and avoids introducing a separate abstraction for applying these overrides. on the implementation side, it has the benefit that you can use base YAML parsing to determine what properties apply to which classes. once you've done this, you can apply them wholesale without the need to filter contextually.

@marrus-sh
Copy link

marrus-sh commented Jul 12, 2022

I like the overall idea here, although i have some questions/concerns :—

  1. Although the description defines available_on as an array of class names, the actual JSON schema expects it to be an object with a class property (which is then an array of class names), leaving an open (unrealized) potential for other ways of determining available_on. I think we might need more clarity about what available_on actually means and whether supporting other kinds of availability other than “classes” is actually a door we want to leave open.

  2. I worry that the subclassing here will make it harder to do generic code: Since General objects will have an agent:contributor property but Geospatial objects will have an agent:contibutor:geospatial property, it becomes more complicated writing code which handles both. Plus, it requires a stronger linkage between classes in code and in the metadata schema—it would be a problem if a class considered itself to be both a Geospatial and a General, because it would then have the property twice.

    In an ideal world, I’d probably rather things look like this :—

    properties:
    
    - &agent:contributor
      name: agent:contributor # the name for computers
      label: Contributor # the name for humans
      available_on:
        class: # ???
        - General
      usage_guidelines: Some general guidelines
    
    - <<: *agent:contributor
      # has the same name as above
      # it is up to implementations to resolve this conflict
      available_on:
        class: # ???
        - Geospatial
      usage_guidelines: Some specific guidelines

    Then an implementation could easily prioritize the agent:contributor defined on Geospatial but fall back to properties defined on General when no Geospatial property is defined. However, this is a much bigger change to the model.

  3. << is a YAML 1.1 feature which is supported in Ruby, but it was dropped from YAML 1.2. I don’t think this is an argument against using it, but we should maybe be conscious that YAML 1.2‐based implementations might need to do a little more work.

@no-reply
Copy link
Author

Although the description defines available_on as an array of class names, the actual JSON schema expects it to be an object with a class property (which is then an array of class names), leaving an open (unrealized) potential for other ways of determining available_on. I think we might need more clarity about what available_on actually means and whether supporting other kinds of availability other than “classes” is actually a door we want to leave open.

i think we should seriously consider foreclosing this. right now, it's possible to describe M3's basic function in terms of two abstractions:

  • Class
  • Property (or maybe property specifications)
    we can say M3 defines schemas by listing a set of Classes and a set of Property specs, and mapping the latter onto the former. if this is a good description (is it?) introducing another way of applying properties complicates things for us.

if this sounds reasonable, i'd like to open a ticket to somehow formalize this way of looking at what M3 does.

@no-reply
Copy link
Author

I worry that the subclassing here will make it harder to do generic code

i think you're right about this, and the kind of solution you're pointing to is the kind i would have in mind. i think the assumption that the YAML key defines the ruby method name, or the "property name" is problematic in general. if M3 is interested in defining property names, i think it's helpful to do so explicitly with a name key (and ideally allow implementations to make their own judgements about whether those are appropriate internal property keys).

@no-reply
Copy link
Author

<< is a YAML 1.1 feature which is supported in Ruby, but it was dropped from YAML 1.2.

if it's okay to say "M3 supports YAML 1.1" (i hope this is okay), i think we should do that. YAML 1.2 support can be a question for another day.

@marrus-sh
Copy link

i think we should seriously consider foreclosing this.

This would be my preference too.

if M3 is interested in defining property names, i think it's helpful to do so explicitly with a name key

The main motivation here being a general sense of “an implementation should support at most one definition with this name on any given object”, and nothing stricter than that. I think that is the requirement we are looking for with this usecase.

if it's okay to say "M3 supports YAML 1.1" (i hope this is okay), i think we should do that.

I think this is okay, and YAML 1.2 processors which don’t support << magic would (I think) just interpret it as a key named <<. I think that’s a fine fallback; it would just mean those implementations would have to check for that key and carry forward any properties available there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants