Clarify allowed-values constraints #413

Merged
20 commits
1886e33
Update allowed-values overview
aj-stein-nist Aug 18, 2023
efad916
Correct poorly worded assembly constraint explanation.
aj-stein-nist Aug 22, 2023
98aa212
Per PR feedback, integrate allow-other bullet 3 into 1.
aj-stein-nist Aug 23, 2023
049664b
Take Dave's suggestion to make L24 sentence informational.
aj-stein-nist Aug 23, 2023
797f794
Tighten up definition of what loose (vs strict) is for constraints.
aj-stein-nist Aug 23, 2023
23a6dcd
Add suggestion for better informational constraint description
aj-stein-nist Aug 30, 2023
fcef640
Add terminology description first.
aj-stein-nist Aug 30, 2023
d30d744
Add MUST requirement for validation error.
aj-stein-nist Aug 30, 2023
e194de9
Add more comprehensive treatment of allowed-values
aj-stein-nist Aug 31, 2023
cee26fd
Remove leftover strict or loose wording.
aj-stein-nist Aug 31, 2023
63ed30d
Apply missed suggestions from draft sync
aj-stein-nist Aug 31, 2023
7d80a92
Hyphenate type-appropriate, per PR review.
aj-stein-nist Sep 5, 2023
d508431
Edit target to targeting set, per PR review.
aj-stein-nist Sep 5, 2023
b2bafdd
Add `@id`, `@level`, and `@target` for #411.
aj-stein-nist Oct 16, 2023
0169854
Adjust `@id` and `@level` wording per PR feedback.
aj-stein-nist Oct 17, 2023
f22ea1b
[WIP] Tighten wording and reorganization reqs.
aj-stein-nist Oct 20, 2023
984508a
More reorganization of common constraint data section and intro.
aj-stein-nist Nov 17, 2023
095817f
Tighten up enumerated value processing section.
aj-stein-nist Nov 17, 2023
e712397
Finalize inventory of different definition types.
aj-stein-nist Nov 17, 2023
f445e53
Fix typo in target attribute explanation
aj-stein-nist Dec 5, 2023
208 changes: 191 additions & 17 deletions website/content/specification/syntax/constraints.md
@@ -8,37 +8,211 @@ weight: 50

**Note: This section of the specification is still a work in progress.**

TODO: P3: Address issue https://github.com/usnistgov/metaschema/issues/325
Metaschema modules can define different kinds of constraints to support data validation within and between document instances.

The types of constraints allowed for a given definition depend on the definition type, as listed in the following sections.

## `<define-flag>` constraints

The following constraint types are allowed for `<define-flag>` definitions.

- [`<allowed-values>`](#enumerated-values)
- `<matches>`
- `<index-has-key>`
- `<expect>`

For each of these constraint types, use of the `@target` attribute is prohibited. This is because a flag constraint may only target the flag, since a flag has no child nodes.

## `<define-field>` constraints

The following constraint types are allowed for `<define-field>` definitions.

- [`<allowed-values>`](#enumerated-values)
- `<matches>`
- `<index-has-key>`
- `<expect>`

## `<define-assembly>` constraints

The following constraint types are allowed for `<define-assembly>` definitions.

- [`<allowed-values>`](#enumerated-values)
- `<matches>`
- `<index-has-key>`
- `<expect>`
- `<index>`
- `<is-unique>`
- `<has-cardinality>`
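
As a non-normative illustration, the three lists above can be captured in a lookup table. This is a minimal Python sketch; the names `ALLOWED_CONSTRAINTS` and `check_constraint` are hypothetical and not part of any Metaschema tooling.

```python
# Constraint types permitted per definition type, transcribed from the lists above.
COMMON = frozenset({"allowed-values", "matches", "index-has-key", "expect"})

ALLOWED_CONSTRAINTS = {
    "define-flag": COMMON,
    "define-field": COMMON,
    "define-assembly": COMMON | {"index", "is-unique", "has-cardinality"},
}


def check_constraint(definition_type, constraint_type, has_target=False):
    """Return a list of problems for one constraint declaration (hypothetical helper)."""
    problems = []
    if constraint_type not in ALLOWED_CONSTRAINTS[definition_type]:
        problems.append(f"<{constraint_type}> is not allowed on <{definition_type}>")
    # Flag constraints may only target the flag itself, so @target is prohibited.
    if definition_type == "define-flag" and has_target:
        problems.append("@target is prohibited on <define-flag> constraints")
    return problems
```

For example, `check_constraint("define-flag", "index")` reports a problem, while the same constraint type on `define-assembly` does not.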

## Common Constraint Data

Each individual constraint allows the following data.

### `@id`

A constraint MAY have an OPTIONAL `@id` attribute, which provides an identifier for the constraint.

Metaschema processors MAY use the identifier for processing constraints and/or referencing them in output for later analysis.

### `@level`

A constraint MAY have an OPTIONAL `@level` attribute, which identifies the severity level of a violation of the constraint.

If defined, a `@level` MUST have one of the following values: `INFORMATIONAL`, `WARNING`, `ERROR`, or `CRITICAL`.

Metaschema processors MAY perform conditional processing and/or presentation of constraint violations based on the level value.
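
One way a processor might represent the level values is with an enumeration. The sketch below is illustrative only; `Level` and `parse_level` are hypothetical names, and returning `None` for an absent attribute is an assumption, since this section does not state a default.

```python
from enum import Enum


class Level(Enum):
    """The four @level values named above."""
    INFORMATIONAL = "INFORMATIONAL"
    WARNING = "WARNING"
    ERROR = "ERROR"
    CRITICAL = "CRITICAL"


def parse_level(raw):
    """Return the Level for a declared @level, or None when the attribute is absent.

    Raises ValueError for any value outside the four permitted ones.
    """
    if raw is None:
        return None  # @level is OPTIONAL; no default is assumed here
    return Level(raw)
```

Because the lookup is by exact value, a lowercase `warning` is rejected rather than silently normalized.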

### `@target`

The *target* of a constraint identifies the content nodes that a constraint applies to.

Not all constraint types require a `@target`. Each constraint type defines if the `@target` is required, optional, or implicit.

When provided, the value of a `@target` MUST be a valid Metapath expression.

If a `@target` value is not defined for a [field](#define-field-constraints) or [flag](#define-flag-constraints) constraint, a Metaschema processor MUST process the value as `target="."`, which is the current context of that constraint definition in a module.

A *target* can apply to any node(s) in the document instance(s). There is no guarantee the constraint *target* is a child of its respective assembly, field, or flag. Thus, a Metaschema processor MUST resolve the Metapath expression to identify the actual target nodes that the constraint applies to. If no resulting target nodes are identified, then the constraint MUST be ignored.
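
The defaulting and "ignore when empty" rules above could be sketched as follows. This is a hedged Python illustration: `evaluate_metapath` and `check` are caller-supplied stand-ins, not a real Metapath API, and constraints are modeled as plain dictionaries for brevity.

```python
def effective_target(declared_target):
    """Return the Metapath expression to evaluate for a constraint's @target."""
    # An undefined @target is processed as target=".".
    return "." if declared_target is None else declared_target


def apply_constraint(constraint, focus, evaluate_metapath, check):
    """Resolve target nodes and check each one.

    `evaluate_metapath(expr, focus)` and `check(constraint, node)` are
    caller-supplied stand-ins; neither is a real library API.
    """
    targets = evaluate_metapath(effective_target(constraint.get("target")), focus)
    # With no resulting target nodes the constraint is ignored: nothing is checked.
    return [check(constraint, node) for node in targets]
```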

## Constraint Processing

In a Metaschema-based document instance, each node in the document instance is associated with a definition in a Metaschema module. Thus, a given content node has one and only one associated definition.

A constraint is defined relative to an assembly, field, or flag [definition](../definitions/) in a Metaschema module.

All constraints associated with a definition MUST be evaluated against all associated content nodes.

Constraints may be declared internally within a definition or as an external set of constraints associated with a definition. To determine the evaluation order, internal and external constraints associated with a definition need to be combined.

Declaration order MUST be determined in the following way.

1. Internal constraints defined directly in the definition are ordered first according to their original order.
2. External constraints are appended in the order the external constraints were provided to the processor.

Each constraint MUST be evaluated in declaration order.
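
The ordering rule can be illustrated with a short Python sketch. The function name and the list-based inputs are assumptions made for this example.

```python
def declaration_order(internal, external_sets):
    """Combine the constraints for one definition into declaration order.

    `internal` is the list declared in the definition, in its original order.
    `external_sets` is a list of lists, in the order given to the processor.
    """
    ordered = list(internal)  # 1. internal constraints first
    for constraint_set in external_sets:
        ordered.extend(constraint_set)  # 2. then external ones, as provided
    return ordered
```

With internal constraints `["a", "b"]` and external sets `[["x"], ["y", "z"]]`, the result is `["a", "b", "x", "y", "z"]`.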

For example:

Given the Metaschema module definitions below:

```
Assembly(name="asmA")
Field(name="fldX")
Flag(name="flgS")
Assembly(name="asmB")
Flag(name="flgT")
```

Constraint evaluation would be handled depth-first as follows:

- When document node `asmA` is processed, constraints defined on that node's definition will be evaluated.
- When document node `fldX` is processed, constraints defined on that node's definition will be evaluated.
- and so on...

Note: The target of the constraint does not affect this evaluation order, but may affect what resulting node the constraint applies to.
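
A depth-first walk of this kind might look like the Python sketch below. The nesting of the example tree is an assumption (the listing above does not show it), and the `(name, children)` tuple shape is purely illustrative.

```python
def evaluation_sequence(node):
    """Yield node names in depth-first (pre-order) order.

    `node` is a (name, children) tuple; this shape is illustrative only.
    """
    name, children = node
    yield name  # constraints on this node's definition are evaluated here
    for child in children:
        yield from evaluation_sequence(child)


# Assumed nesting for the definitions listed above.
tree = ("asmA", [("fldX", [("flgS", [])]), ("asmB", [("flgT", [])])])
order = list(evaluation_sequence(tree))
```

Under that assumed nesting, `order` is `["asmA", "fldX", "flgS", "asmB", "flgT"]`.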

When a constraint is evaluated against the associated content node, this node is considered the constraint's *evaluation focus*.

### Processing Error Handling

Processing errors occur when a defect in the constraint definition causes an unintended error to occur during constraint processing. This differs from a validation error that results from not meeting the requirement of a constraint.

- If a processing error occurs while processing a constraint, which can result from evaluating a Metapath expression, the error SHOULD be reported.
- If a processing error occurs while processing a constraint, then the document instance being validated MUST NOT be considered valid. This is because no conclusion about validity can be drawn when some constraints were not validated due to errors.
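
The distinction between processing errors and validation errors could be sketched as follows. This Python example is an illustration under stated assumptions: `ProcessingError`, `validate`, and the `evaluate` callback are hypothetical, and every validation finding is treated as invalidating, which ignores `@level`.

```python
class ProcessingError(Exception):
    """A defect in a constraint definition, e.g. a Metapath expression that fails."""


def validate(constraints, evaluate):
    """Return (is_valid, findings, processing_errors).

    `evaluate(constraint)` is a caller-supplied stand-in that returns a list of
    validation findings or raises ProcessingError.
    """
    findings, processing_errors = [], []
    for constraint in constraints:
        try:
            findings.extend(evaluate(constraint))
        except ProcessingError as exc:
            processing_errors.append(str(exc))  # SHOULD be reported
    # Any processing error means no conclusion about validity can be drawn,
    # so the document is not considered valid.
    is_valid = not findings and not processing_errors
    return is_valid, findings, processing_errors
```

Note that the loop continues after a processing error so that the remaining constraints are still evaluated and reported; that choice is a design assumption, not a requirement stated above.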

## Enumerated values

Additionally, flags may be constrained to a set of known values listed in advance.
The `allowed-values` constraint is a type of Metaschema constraint that restricts field or flag value(s) based on an enumerated set of permitted values.

Each `allowed-values` constraint has a *source* that will be either:

- **model:** The constraint is defined *in* a Metaschema module, i.e. an internal constraint.
- **external:** The constraint is defined *outside* a Metaschema module, i.e. an external constraint.

The `@target` of an `<allowed-values>` constraint specifies the node(s) in a document instance whose value is restricted by the constraint.

### Enumerated value processing

Metaschema processors MUST process `<allowed-values>` constraints.

The constraint's `@target` is a Metapath expression that identifies the node values the constraint applies to.

When evaluating the `@target` Metapath expression, the Metapath focus MUST be the constraint's *evaluation focus*. Thus, the targets are determined in the context where the constraint is declared.

The sequence of nodes that results from Metapath evaluation is the constraint's *target node(s)*.

The nodes resulting from evaluating an `<allowed-values>` `@target` are intended to be *field* or *flag* nodes, which have a value. If these nodes are an instance of an *assembly*, a Metaschema processor error SHOULD be raised.

Multiple `<allowed-values>` constraints can apply to a given *target node*, which may be declared by constraints defined on different content nodes. Implementations will need a means to determine the complete set of `<allowed-values>` constraints that apply to a given *target node*, which is referred to as the *target node's* "*applicable set*".

This may be handled using a two-phase evaluation that first resolves the `<allowed-values>` constraints associated with each *target node* to determine the *applicable set*, and then evaluates the *applicable set* for each *target node*. Other implementations may be possible and are allowed if they result in the same effective behavior.
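
The first phase of such a two-phase approach might be sketched in Python as below. The data shapes are assumptions for illustration: constraints are dictionaries with a `target` key, nodes are hashable identifiers, and `evaluate_metapath` is a caller-supplied stand-in rather than a real API.

```python
from collections import defaultdict


def build_applicable_sets(declarations, evaluate_metapath):
    """Phase one: map each target node to the constraints that apply to it.

    `declarations` is a list of (focus_node, constraint) pairs and
    `evaluate_metapath(expr, focus)` returns target nodes; both are stand-ins.
    """
    applicable = defaultdict(list)
    for focus, constraint in declarations:
        for node in evaluate_metapath(constraint["target"], focus):
            applicable[node].append(constraint)
    return dict(applicable)
```

Phase two would then iterate over the returned mapping and evaluate each node's value against its *applicable set*. Constraints whose target resolves to no nodes never enter the mapping.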

The *applicable set* of `<allowed-values>` constraints is verified for correctness using the `@extension` attribute on each set member.

For each `<allowed-values>` in the *applicable set*, the `@allow-other` attribute is used to determine the *expected value set* for a given content value.

This restriction can be either:
The following subsections detail the processing requirements for the `@extension` and `@allow-other` attributes.

1. strict (values must be in the list for document validity with `allow-other="no"` attribute for an `allowed-values` element) or
2. loose (i.e. for documentation only, no effect in schemas, with `allow-other="yes"`).
#### `@extension`

If an `allowed-values` constraint does not have the `allow-other` attribute defined, the default is `allow-other="no"`, resulting in strict validation where the only valid values are those in the list.
For each `<allowed-values>` constraint in the *applicable set*, the `@extension` attribute MUST be one of the following values.

Within `allowed-values` of a `constraint`, an `enum` element's `@value` attribute assigns the permissible value, while its data content provides documentation. For example:
- **`none`:** There can be no other matching `<allowed-values>` constraint for the same target value. This is the least permissive option.

- **`model`:** (default) Multiple matching `<allowed-values>` constraints are allowed for the same target value as long as the constraints are defined in the same model. Constraints sourced from outside the model are not allowed.

All `<allowed-values>` constraints declared within a Metaschema model matching the same `@target` can be combined. If a matching constraint within the model has `allow-other="no"`, then constraints declared externally from the model are not allowed. **This is the implicit default value if no `@extension` is provided.**

- **`external`:** Multiple matching `<allowed-values>` constraints are allowed for the same target value, which can be sourced from within the model or externally through a set of external constraints.

All `<allowed-values>` constraints, declared within the model and externally through an extension, that match the same `@target` can be combined. This is the most permissive option.

One of the following requirements MUST apply when processing a value's *applicable set* to validate it.

1. The *applicable set* MUST contain a single `<allowed-values>` constraint with the `@extension` attribute value `none`.
1. All `<allowed-values>` constraints in the *applicable set* MUST have the `@extension` attribute value `model` and originate from a *model* source.
1. All `<allowed-values>` constraints in the *applicable set* MUST have the `@extension` attribute value `external` and originate from either a *model* or *external* source.
1. Otherwise, an error MUST be raised indicating the *applicable set* is invalid.
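
The four cases above can be read as a decision procedure. The following Python sketch shows one possible reading, not a normative algorithm: the dictionary shape is an assumption, and the first case is interpreted as "the set has exactly one member, and that member has `none`".

```python
def check_applicable_set(applicable_set):
    """Return the matched case, or raise ValueError when the set is invalid.

    Each member is a dict with an optional "extension" ("none", "model", or
    "external") and a "source" ("model" or "external"); this shape is assumed.
    """
    # "model" is the implicit default when @extension is absent.
    extensions = [c.get("extension", "model") for c in applicable_set]
    sources = [c["source"] for c in applicable_set]
    if len(applicable_set) == 1 and extensions == ["none"]:
        return "none"
    if applicable_set and all(e == "model" for e in extensions) \
            and all(s == "model" for s in sources):
        return "model"
    if applicable_set and all(e == "external" for e in extensions):
        return "external"
    raise ValueError("invalid applicable set for <allowed-values>")
```

Under this reading, a set that mixes `none` with any other constraint, or a `model` constraint that comes from an external source, falls through to the error case.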

#### `@allow-other`

The *expected value set* can be considered *open* or *closed*.

- **open:** In an open set, the actual value can be any value. The *expected value set* provides suggested values.
- **closed:** In a closed set, the actual value is expected to match a value in the *expected value set*.

For each `<allowed-values>` constraint, the `@allow-other` attribute MUST be one of the following values.

- **`yes`:** Identifies the *expected value set* as *open*, as long as no other `<allowed-values>` constraint in the *applicable set* has `@allow-other="no"` declared explicitly or implicitly.
- **`no`:** (default) Identifies the *expected value set* as *closed*. **This is the implicit default value if no `@allow-other` is provided.**

One of the following requirements MUST apply when processing a value's *applicable set* of `<allowed-values>` constraints to determine the *expected value set*.

1. At least one `<allowed-values>` constraint in the *applicable set* MUST have the `@allow-other` attribute value `no`. The *expected value set* is *closed*.

The actual value MUST match one of the enumerated values declared on any of the `<allowed-values>` constraints in the *applicable set*. Otherwise, an error MUST be produced to indicate that the value doesn't match one of the enumerated values.

It is possible to require a value that does not align with the value node's Metaschema data type. This creates a situation where the data type and the closed value requirement cannot both be met. In such cases, the constraint processor MUST report this as an error.

2. All `<allowed-values>` constraints in the *applicable set* MUST have the `@allow-other` attribute value `yes`. The *expected value set* is *open*.

Any type-appropriate actual value MUST be allowed. A warning MAY be produced to indicate that the value doesn't match one of the enumerated values.
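
The open and closed cases might be implemented along the lines of the Python sketch below. It is a simplified illustration: the dictionary shape and the `check_value` name are assumptions, and the data type check mentioned above is deliberately left out.

```python
def check_value(value, applicable_set):
    """Return (is_valid, message) for a value against its applicable set.

    Each member is a dict with "values" (the enum values) and an optional
    "allow_other" ("yes" or "no"); this shape is an assumption.
    """
    expected = set()
    for constraint in applicable_set:
        expected.update(constraint["values"])
    # The set is closed when any member says "no", explicitly or by default.
    closed = any(c.get("allow_other", "no") == "no" for c in applicable_set)
    if value in expected:
        return (True, None)
    if closed:
        return (False, f"{value!r} is not one of {sorted(expected)}")
    # Open set: the value is allowed; reporting a warning is optional.
    return (True, f"{value!r} is not one of the suggested values {sorted(expected)}")
```

For instance, with one constraint listing `laptop` and `desktop` under `allow_other="yes"`, the value `phone` is valid with an optional warning. Adding a second constraint that omits `allow_other` closes the set, and `phone` becomes an error.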

A Metaschema processor MAY use the text value of the `enum`'s XML element as documentation for a given allowed value enumeration. Below is an example.

```xml
<!-- Example: a flag whose suggested values are hash algorithm names -->
<define-flag name="algorithm" datatype="string">
  <formal-name>Hash algorithm</formal-name>
  <description>Method by which a hash is derived</description>
  <constraint>
    <allowed-values allow-other="yes">
      <enum value="SHA-224">Documentation for one permissible option.</enum>
      <enum value="SHA-256">Documentation for another permissible option.</enum>
    </allowed-values>
  </constraint> ...
</define-flag>
<!-- Example: a flag whose suggested values are computer form factors -->
<define-flag name="form-factor">
  <formal-name>Computer Form Factor</formal-name>
  <description>The type of computer in the example application's data model.</description>
  <constraint>
    <allowed-values allow-other="yes">
      <enum value="laptop">this text value documents the domain and information model's meaning of a laptop</enum>
      <enum value="desktop">this text value documents the domain and information model's meaning of a desktop</enum>
    </allowed-values>
  </constraint> ...
</define-flag>
```

## `define-flag` constraints

## `define-field` constraints
## External Constraints

## `define-assembly` constraints
TBD