Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generation of anyOf elements in schema should be oneOf #2666

Closed
marcprux opened this issue Jul 15, 2017 · 16 comments
Closed

Generation of anyOf elements in schema should be oneOf #2666

marcprux opened this issue Jul 15, 2017 · 16 comments

Comments

@marcprux
Copy link
Member

In modernizing our application that generates an API from the vega-lite JSON schema, I notice that vega-lite 1.2.1+ no longer uses oneOf elements in favor of anyOf elements. Many of these seem to define illogical schema constraints which, in many cases, would be impossible to represent in a strongly-typed language. It appears to be a result of the change made in YousefED/typescript-json-schema#47.

For example, at https://github.com/vega/schema/blob/master/vega-lite/v2.0.0-beta.8.json#L4172, the bin field is anyOf a boolean or #/definitions/Bin. Given that an element can only possibly satisfy one or the other of these constraints, wouldn't a oneOf constraint be more appropriate?

A more overt example is https://github.com/vega/schema/blob/master/vega-lite/v2.0.0-beta.8.json#L1401, where legend is anyOf a null or #/definitions/Legend. Clearly, it can't be both; wouldn't this be better represented by simply having legend be a non-required property?

This is a blocking issue for our project and possibly for other projects such as vegas-viz/Vegas#83, since a JSON-Schema-to-strongly-typed-language code generator isn't able to logically represent these anyOf constraints.

Would it be possible to revert to oneOf constraints for the vega-lite 2.0 release, or would it break something else? This issue appears to have been discussed and addressed in #1706, but the anyOf constraints continue to be generated in subsequent 2.0 alpha/beta releases.

@kanitw
Copy link
Member

kanitw commented Jul 15, 2017

If I remember correctly, oneOf was problematic for union type where each type has all properties optional. In such cases, it would satisfy multiple types.

@domoritz may have better idea why we changed this behavior.

@marcprux
Copy link
Member Author

Ambiguous schema satisfaction is problematic for both oneOf and anyOf. I would think that it would make more sense to add a required disambiguation property to the class in these cases. If anyone recalls which specific classes/schemas this affected, I'd be interested in taking a look to see if there is a sensible solution at the TS class definition level.

@domoritz
Copy link
Member

domoritz commented Jul 15, 2017 via email

@marcprux
Copy link
Member Author

marcprux commented Jul 15, 2017

Potential ambiguous schemas are problematic, both for anyOf and oneOf, because how should the ambiguous JSON representation be deserialized back into classes? For example, take the following TS classes:

export interface Human {
  pet: Dog | Cat
}

export interface Dog {
  name: string
  meows?: boolean
}

export interface Cat {
  name: string
  barks?: boolean
}

Regardless of whether pet is represented using anyOf or oneOf, the schema is still ambiguous. How should the following JSON document be deserialized back into a Human instance? As a Dog-owner or Cat-owner? Or, if it is represented by a schema of anyOf, as an array of both a Dog and Cat?

{ "pet": { "name": "snowball" } }

That's why I suggest that @kanitw's mention of the issue of objects with all optional (nullable) properties be addressed by adding some disambiguation property that would permit the JSON document to be parsed unambiguously. In the example above, this could be done like:

export interface Human {
  pet: Dog | Cat
}

export interface Dog {
  species: 'dog'
  name: string
  meows?: boolean
}

export interface Cat {
  species: 'cat'
  name: string
  barks?: boolean
}

I'd be happy to add a configuration option for specifying either oneOf or anyOf schemas for union types to the https://github.com/YousefED/typescript-json-schema project if you think you might use it.

@domoritz
Copy link
Member

domoritz commented Jul 15, 2017 via email

@marcprux
Copy link
Member Author

What do you mean by "an actual instance in the schema"?

@domoritz
Copy link
Member

domoritz commented Jul 16, 2017 via email

@marcprux
Copy link
Member Author

The links in my original report identify a couple of problematic areas. An obvious one, at https://github.com/vega/schema/blob/master/vega-lite/v2.0.0-beta.8.json#L4172:

        "bin": {
          "anyOf": [
            {
              "type": "boolean"
            },
            {
              "$ref": "#/definitions/Bin"
            }
          ],
          "description": "Flag for binning a `quantitative` field, or a bin property object\nfor binning parameters."
        },

This schema is nonsensical, since a piece of JSON could not possibly conform to both a boolean and a #/definitions/Bin. This is an clear case where oneOf is the correct option.

Another example at https://github.com/vega/schema/blob/master/vega-lite/v2.0.0-beta.8.json#L1137:

        "sort": {
          "anyOf": [
            {
              "$ref": "#/definitions/SortField"
            },
            {
              "$ref": "#/definitions/SortOrder"
            }
          ],
          "description": "Sort order for a field with discrete domain.\nThis can be `\"ascending\"`, `\"descending\"`, `null`, or a [sort field definition object](sort.html#sort-field) for sorting by an aggregate calculation of a specified sort field.\n\n__Note:__ For fields with continuous domain, please use `\"scale\": {\"reverse\": true}` to flip the scale instead."
        },

It isn't possible for an element to conform to both SortOrder and SortField, hence it makes no sense for this to be an anyOf.

As I look over the schema, I can't identify a single instance of anyOf that shouldn't be oneOf, which makes me suspect that there is some misunderstanding of what anyOf means: it indicates that an element can conform to one or more of the schemas, whereas oneOf indicates that an element must conform to exactly one of the schemas. A union field logically implies a oneOf, not an anyOf.

Perhaps if you could point out the case where anyOf solved some problem or issue, I'd better be able to understand why the change was made in the first place.

@kanitw kanitw modified the milestone: 2.x.x Features & Patches Jul 28, 2017
@marcprux
Copy link
Member Author

marcprux commented Nov 8, 2017

Also, can anyone offer insight into this piece of the schema:

    "Aggregate": {
      "anyOf": [
        {
          "$ref": "#/definitions/AggregateOp"
        }
      ]
    }

Why not just have "Aggregate" be a direct reference to "AggregateOp" (or eliminate it altogether)?

@domoritz
Copy link
Member

domoritz commented Nov 8, 2017

That one doesn't make much sense. I will fix it.

The reason that is happening is that we are hiding parts of the schema that would otherwise expose unreleased features.

/** @hide */
export type HiddenCompositeAggregate = CompositeAggregate;

export type Aggregate = AggregateOp | HiddenCompositeAggregate;

@domoritz
Copy link
Member

domoritz commented Nov 8, 2017

About the anyOf vs oneOf. It doesn't make much sense for non-objects but for objects but the semantics of | in Typescript is clearly that you can satisfy more than one object. I am not going to do my own type inference to figure out what would be a tighter typing.

@domoritz
Copy link
Member

domoritz commented Nov 8, 2017

Cleaned up the output in #3194

@marcprux
Copy link
Member Author

marcprux commented Nov 9, 2017

Are there an cases in vega-lite where | is used to intentionally express a non-disjoint union? While it is possible that there could be ambiguity in some objects that match both sides of the union (e.g., objects who have no required properties), I feel like these ought to be treated as potentially error conditions.

Doing a quick audit of the 132 anyOf definitions, 64 of them are unambiguously disjoint, like:

         "anyOf": [
            {
              "type": "string"
            },
            {
              "$ref": "#/definitions/RepeatRef"
            }
          ]

There are a few that could theoretically be non-disjoint, like:

     "anyOf": [
        {
          "$ref": "#/definitions/UrlData"
        },
        {
          "$ref": "#/definitions/InlineData"
        },
        {
          "$ref": "#/definitions/NamedData"
        }
      ]

But looking at the types, their required properties ensure that a single piece of JSON will never match more than one. I've audited the rest, and the only one that I found to be potentially ambiguous is the somewhat odd reference to VgBinding in SingleSelection and SingleSelectionConfig:

"anyOf": [
            {
              "$ref": "#/definitions/VgBinding"
            },
            {
              "additionalProperties": {
                "$ref": "#/definitions/VgBinding"
              },
              "type": "object"
            }
          ]

I'm wondering, if some or all of the anyOfs turned into oneOfs, would anything break? If not, why not err on the side of the strictness gained by declaring them as oneOfs, and then individually changing back any that should indeed be declared as anyOf?

I press the issue because when interacting with vega-lite from strongly-typed languages, oneOf is logically expressible, whereas the correct representation anyOf is (arguably) inexpressible in a correct way.

@domoritz
Copy link
Member

domoritz commented Nov 9, 2017

I'm wondering, if some or all of the anyOfs turned into oneOfs, would anything break? If not, why not err on the side of the strictness gained by declaring them as oneOfs, and then individually changing back any that should indeed be declared as anyOf?

The schema is generated from https://github.com/vega/ts-json-schema-generator so whatever reasoning we want to do, we need to do in there. I'd be happy to accept a pull request that changes some of the anyOf to oneOf.

I press the issue because when interacting with vega-lite from strongly-typed languages, oneOf is logically expressible, whereas the correct representation anyOf is (arguably) inexpressible in a correct way.

Can you give an example? In typescript, the semantics is anyOf, not oneOf, right?

I'm hesitant to change for three reasons:

  • We once had a bug in Vega because of using oneOf instead of anyOf: oneOf -> anyOf vega-parser#60
  • How does this change affect auto completion?
  • Changing all occurrences causes the schema validation for our examples to fail right now. So we want to be careful.

@domoritz domoritz reopened this Nov 9, 2017
@domoritz
Copy link
Member

domoritz commented Nov 9, 2017

Reopen as this issue is not about cleaning up the output alone.

@kanitw
Copy link
Member

kanitw commented May 9, 2018

Closing as this is not an issue specific to Vega-Lite, but rather an issue for https://github.com/vega/ts-json-schema-generator. If you still think this should be changed, please file an issue in the ts-json-schema-generator repo.

@kanitw kanitw closed this as completed May 9, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants