Use inheritance instead of key matching? #31

no-reply · 2022-07-12T21:00:27Z

in, e.g, https://github.com/samvera-labs/houndstooth/blob/main/examples/UCSD.yaml we have definitions like:

properties: # add to spec?
  "agent:contributor": # I think the quotes here are causing problems - which might be the colon in the property names is an issue...
    display_label: "Contributor"
    definition: "A person or organization responsible for making contributions to the resource."
    usage_guidelines: # I think the formatting of this list is wrong, and I need to either put everything in quotes, or turn it into a list of key/value pairs
      General: "Use as the broadest possible role, i.e. when lacking specific information on an agent's role, or the role is unclear or unknown. If a more specific role is available, prefer that, e.g. editor, compiler, illustrator."
      RDCP: "Use for institutions, centers in addition to individuals that contribute to a resource (e.g. SIO, CAICE)"
      "DLDP/SCA": "null"
      Music: "null"
      CLR: "null"

it's not clear to me in this case what "General", "RDCP", etc... represent, in terms of M3 abstractions. in other cases we have:

  usage_guidelines: 
      default: "Record in lastname, firstname order."
      Geospatial: "some other value"

where it's more clear that "Geospatial" is a class. but i'm still concerned that these examples involve a matching step after parsing to decide which parts of a property definition apply to which objects.

i'd propose an alternative:

properties: # add to spec?
  agent_contributor:
    display_label: "Contributor"
    available_on:
      - General
    definition: "A person or organization responsible for making contributions to the resource."
    usage_guidelines: "Use as the broadest possible role, i.e. when lacking specific information on an agent's role, or the role is unclear or unknown. If a more specific role is available, prefer that, e.g. editor, compiler, illustrator."

  "agent:contributor:geospatial":
    <<: *agent_contributor
    available_on:
      - Geospatial    
    usage_guidelines:  "some other value"

this gives a general solution for any case where a given property needs to be overridden per class. and avoids introducing a separate abstraction for applying these overrides. on the implementation side, it has the benefit that you can use base YAML parsing to determine what properties apply to which classes. once you've done this, you can apply them wholesale without the need to filter contextually.

marrus-sh · 2022-07-12T21:43:58Z

I like the overall idea here, although i have some questions/concerns :—

Although the description defines available_on as an array of class names, the actual JSON schema expects it to be an object with a class property (which is then an array of class names), leaving an open (unrealized) potential for other ways of determining available_on. I think we might need more clarity about what available_on actually means and whether supporting other kinds of availability other than “classes” is actually a door we want to leave open.
I worry that the subclassing here will make it harder to do generic code: Since General objects will have an agent:contributor property but Geospatial objects will have an agent:contibutor:geospatial property, it becomes more complicated writing code which handles both. Plus, it requires a stronger linkage between classes in code and in the metadata schema—it would be a problem if a class considered itself to be both a Geospatial and a General, because it would then have the property twice.

In an ideal world, I’d probably rather things look like this :—
```
properties:

- &agent:contributor
  name: agent:contributor # the name for computers
  label: Contributor # the name for humans
  available_on:
    class: # ???
    - General
  usage_guidelines: Some general guidelines

- <<: *agent:contributor
  # has the same name as above
  # it is up to implementations to resolve this conflict
  available_on:
    class: # ???
    - Geospatial
  usage_guidelines: Some specific guidelines
```
Then an implementation could easily prioritize the agent:contributor defined on Geospatial but fall back to properties defined on General when no Geospatial property is defined. However, this is a much bigger change to the model.
<< is a YAML 1.1 feature which is supported in Ruby, but it was dropped from YAML 1.2. I don’t think this is an argument against using it, but we should maybe be conscious that YAML 1.2‐based implementations might need to do a little more work.

no-reply · 2022-07-12T22:16:42Z

Although the description defines available_on as an array of class names, the actual JSON schema expects it to be an object with a class property (which is then an array of class names), leaving an open (unrealized) potential for other ways of determining available_on. I think we might need more clarity about what available_on actually means and whether supporting other kinds of availability other than “classes” is actually a door we want to leave open.

i think we should seriously consider foreclosing this. right now, it's possible to describe M3's basic function in terms of two abstractions:

Class
Property (or maybe property specifications)
we can say M3 defines schemas by listing a set of Classes and a set of Property specs, and mapping the latter onto the former. if this is a good description (is it?) introducing another way of applying properties complicates things for us.

if this sounds reasonable, i'd like to open a ticket to somehow formalize this way of looking at what M3 does.

no-reply · 2022-07-12T22:19:31Z

I worry that the subclassing here will make it harder to do generic code

i think you're right about this, and the kind of solution you're pointing to is the kind i would have in mind. i think the assumption that the YAML key defines the ruby method name, or the "property name" is problematic in general. if M3 is interested in defining property names, i think it's helpful to do so explicitly with a name key (and ideally allow implementations to make their own judgements about whether those are appropriate internal property keys).

no-reply · 2022-07-12T22:20:18Z

<< is a YAML 1.1 feature which is supported in Ruby, but it was dropped from YAML 1.2.

if it's okay to say "M3 supports YAML 1.1" (i hope this is okay), i think we should do that. YAML 1.2 support can be a question for another day.

marrus-sh · 2022-07-12T23:26:17Z

i think we should seriously consider foreclosing this.

This would be my preference too.

if M3 is interested in defining property names, i think it's helpful to do so explicitly with a name key

The main motivation here being a general sense of “an implementation should support at most one definition with this name on any given object”, and nothing stricter than that. I think that is the requirement we are looking for with this usecase.

if it's okay to say "M3 supports YAML 1.1" (i hope this is okay), i think we should do that.

I think this is okay, and YAML 1.2 processors which don’t support << magic would (I think) just interpret it as a key named <<. I think that’s a fine fallback; it would just mean those implementations would have to check for that key and carry forward any properties available there.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use inheritance instead of key matching? #31

Use inheritance instead of key matching? #31

no-reply commented Jul 12, 2022

marrus-sh commented Jul 12, 2022 •

edited

Loading

no-reply commented Jul 12, 2022

no-reply commented Jul 12, 2022

no-reply commented Jul 12, 2022

marrus-sh commented Jul 12, 2022

Use inheritance instead of key matching? #31

Use inheritance instead of key matching? #31

Comments

no-reply commented Jul 12, 2022

marrus-sh commented Jul 12, 2022 • edited Loading

no-reply commented Jul 12, 2022

no-reply commented Jul 12, 2022

no-reply commented Jul 12, 2022

marrus-sh commented Jul 12, 2022

marrus-sh commented Jul 12, 2022 •

edited

Loading