Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Killing the space-separated-token-string meta-type #149

Closed
stuartpb opened this issue Feb 7, 2017 · 13 comments
Closed

Killing the space-separated-token-string meta-type #149

stuartpb opened this issue Feb 7, 2017 · 13 comments

Comments

@stuartpb
Copy link
Member

stuartpb commented Feb 7, 2017

So, I'm looking at JSON Schema right now (#146), and it looks like it's probably robust enough to represent everything that's in the loosely-defined non-schema profiles use right now.

However, one thing that would be clumsy to support, in its current form, is the "space-separated string of keywords" pattern that a bunch of values use. The only real validation JSON Schema has for string values is regular-expression-based pattern matching, and that would mean we can't enforce deeper validation of the string, like unique values. An enforced uniqueness constraint would be allowable if these sequences were represented as arrays, and that's what I want to get at here.

There isn't really any value, outside of a little bit of authoring-side syntax lightness, to keeping these sequences as strings. Yes, most of the time, the value is just one string, but the thing is, that makes this pattern more likely to be problematic, not less. It means that implementations that should be checking against value.split(' ').indexOf('item') > -1 don't encounter errors when they check against value == 'item', which works 90% of the time. Using an array consistently forces them to test via .contains, and it saves the .split step that any reasonable test would have to do anyway.

And it's not like it's really harder to write these values in square braces separated by commas than it is to do them separated by spaces. Really, the main difference that this introduces is that it forces anything that could support multiple values to actively admit it supports multiple values. On both sides, making this notion more visible - and making the concern less deferrable from an extensible-schema-authorship standpoint - this is a win.

@stuartpb
Copy link
Member Author

stuartpb commented Feb 7, 2017

One thing that I think should be addressed is what should then become of the "two values together" union type. My thoughts on this are that there are really only two viable options:

  • Explicitly document possible combinations in a canonical way, as tokens in and of themselves, keeping the plus-separated style. This lets combination values be tested against as plain string values, instead of requiring deeper code to suss out tuple values etc, and preserves the decomposable nature of "you can represent this by splitting on the plusses and presenting a union of types you already have assets for". The downside is that it's less extensible.
  • Introduce a second layer of array, which is interpreted as an "and" relationship. This makes code for handling these sequences potentially a lot more complex, but makes it so every single valid combination doesn't have to be spelled out in the schema.

@stuartpb
Copy link
Member Author

stuartpb commented Feb 9, 2017

On this note, password.change.reauth and company should be an array, not a string, with [] as the "none" value - since it's possible for a password change or username change to require multiple kinds of reauthentication.

EDIT: This is now handled naturally by separate fields under "form" having a "required" value.

@stuartpb
Copy link
Member Author

stuartpb commented Feb 9, 2017

Again, when deciding if a thing being designed will be an enum-string or a tag-array, the question will need to be "is any possible value for this field mutually exclusive, including values which may be yet to come?"

Also, I'm starting to find that many places where the answer is "no" (sessions.invalidate, usability, now reauth) are a leaky headache that is ultimately better off with multiple distinct fields (like the way #120 fell apart into #127, or reauth fell apart in favor of form).

@stuartpb
Copy link
Member Author

stuartpb commented Feb 10, 2017

I think this'll probably get accepted soon with the "strings-with-plus-in-them-as-tokens" option to handle compound tuple values described in #149 (comment), and when that happens, a new issue should be filed arguing for the latter option, because I'm not certain the plus-method is more desireable, and am really only picking it because it seems like the path of least resistance from an authorship side right now.

@stuartpb
Copy link
Member Author

I'm working on this now.

I think when I originally envisioned this type, it was with the idea that I'd be cooler with translations from the YAML to JSON or whatever build format than I ultimately ended up being (like, I was thinking consumers would build their own translations into a local format as a matter of course). As I mentioned above, it grants a little bit of authorship ease YAML-side, but really, it's not all that worth it, especially when you consider how authors ought to be aware of what's array-based as much as consumers (ie. seeing a bunch of written templates have one item might mislead one who's never read the docs to think "hmm, I have to pick one or the other for this").

@stuartpb
Copy link
Member Author

This is also going to ship a couple updates to the enums:

  • dob (as seen in hrblock.com's profile, concatenated to a bunch of other factors) has been swapped for birthdate, mirroring the terminology established in Date-of-birth input in registration.form #182.
  • token, really a legacy value from the early days of "does it send a token, or does it send a randomized password" that was retrofitted to mean "code", has become the much-less-easy-to-mistake (and less confusing, considering that I call enum values "tokens") code.
  • the undocumented password.reset.flow.open.expects: question-answer from odesk.com has become knowledge (and is now documented), anticipating Refactoring questions #207. (Also, both values for this are swinging around with Profiling lists of steps to reach a form from the initial URL #222 and Rethinking password.reset.flow.open.expects #228, but that's neither here nor there.)
  • the domain enum for password.reset.flow.request.form.account.accepts and company on some domain registrars is getting a line of documentation explaining what it entails.

@stuartpb
Copy link
Member Author

stuartpb commented Feb 16, 2017

You know, I'm pretty sure the "this plus that" pattern, akin to what I was describing above about leaky abstractions where values aren't mutually exclusive, is really just a bad solution for a problem that would better be solved along another axis, like a "requires" field (or just some input: required fields in form, and/or identifies: account under some elements under form).

I think that's probably the right solution, ultimately: revisiting these and seeing if a clearer pattern can be pulled out that isn't implying extra stuff is specifying the account (even if it means abandoning the form.account field altogether for some profiles).

@stuartpb
Copy link
Member Author

Yeah, actually, this is kind of a legacy structure that doesn't apply any more, and certainly not in the context of account.

That's a sensible point, but I'm getting pretty tired, so it'll have to ship in a second PR from this one, with new documentation for fields like password.reset.flow.request.form.domain (and some brainpower and instance consideration to figure out if password.reset.flow.request.form.email.input should get thrown into the mix).

@stuartpb
Copy link
Member Author

stuartpb commented Feb 16, 2017

Also, in a bit of irony, due to a test I wrote up to make this revision easier to TDD and to catch me from forgetting I committed it when authoring, this previously-schema-hostile type is actually going to be the first part of profiles to undergo validation in testing.

@stuartpb
Copy link
Member Author

I think the account input handing will be "if there are two things that are like account, like how Project Euler requires both username and password, then account will be left out."

Of course, Project Euler these days is a whole other kind of headache, so that specific example isn't great, but it's along the right lines. Meanwhile, a site with email+domain username+domain will just become a site with account.accepts: [email, username] and domain.input: required. (And email+other+stuff will have account.accepts: [email] and other form fields for other and stuff.)

@stuartpb
Copy link
Member Author

birthdate isn't in the documentation as a token, but yeah, it's not gonna be, so nbd.

@stuartpb
Copy link
Member Author

Okay, these are represented with separate input: required forms now - the couple that represented very complex entries as alternatives to an account identifier have been documented as notes.

These will probably evolve into caveats (#137) or an all-new object / array for documenting last-resort identifier combinations (if the pattern keeps coming up).

@stuartpb
Copy link
Member Author

stuartpb commented Nov 15, 2017

Per #308, future discussion about schema issues should be held under https://github.com/opws/opws-schemata/issues/. Describing how to address the data lost in the refactors from this issue (in light of notes being de-sanctioned with opws/opws-schemata#1) is being discussed at opws/opws-schemata#14.

@opws opws locked and limited conversation to collaborators Nov 15, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

1 participant