These guidelines provide recommendations for defining JSON data at Zalando. JSON here refers to {RFC-7159}[RFC 7159] (which updates {RFC-4627}[RFC 4627]), the "application/json" media type and custom JSON media types defined for APIs. The guidelines clarify some specific cases to allow Zalando JSON data to have an idiomatic form across teams and services.
Use JSON ({RFC-7159}[RFC 7159]) to represent structured (resource) data passed with HTTP requests and responses as body payload. The JSON payload must use a JSON object as top-level data structure (if possible) to allow for future extension. This also applies to collection resources, where you might otherwise use an array; see also [110].
Additionally, the JSON payload must comply with the more restrictive Internet JSON ({RFC-7493}[RFC 7493]), particularly
- {RFC-7493}#section-2.1[Section 2.1] on encoding of characters, and
- {RFC-7493}#section-2.3[Section 2.3] on object constraints.
As a consequence, a JSON payload must
- use {RFC-7493}#section-2.1[UTF-8 encoding],
- consist of {RFC-7493}#section-2.1[valid Unicode strings], i.e. must not contain non-characters or surrogates, and
- contain only {RFC-7493}#section-2.3[unique member names] (no duplicate names).
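The constraints above could be checked on the receiving side roughly as follows. This is a minimal Python sketch, not a complete RFC 7493 validator: the helper names `reject_duplicates` and `parse_strict_json` are hypothetical, and the check for non-characters and surrogates inside string values is deliberately left out.

```python
import json


def reject_duplicates(pairs):
    """object_pairs_hook that fails on repeated member names (RFC 7493, section 2.3)."""
    seen = {}
    for name, value in pairs:
        if name in seen:
            raise ValueError(f"duplicate member name: {name!r}")
        seen[name] = value
    return seen


def parse_strict_json(body: bytes) -> dict:
    """Parse a payload, enforcing UTF-8, unique names and a top-level object."""
    # UnicodeDecodeError (a ValueError subclass) is raised for non-UTF-8 input.
    text = body.decode("utf-8")
    data = json.loads(text, object_pairs_hook=reject_duplicates)
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")
    return data
```

By default Python's `json` module silently keeps the last of several duplicate members, which is why the explicit hook is needed here.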
Non-JSON media types may be supported, if you stick to a business object specific standard format for the payload data, for instance, image data format (JPG, PNG, GIF), document format (PDF, DOC, ODF, PPT), or archive format (TAR, ZIP).
Generic structured data interchange formats other than JSON (e.g. XML, CSV) may be provided, but only in addition to JSON as the default format, using content negotiation, for specific use cases where clients may not interpret the payload structure.
You should use standard media types (defined in the {media-types}[media type registry] of the Internet Assigned Numbers Authority (IANA)) as content-type (or accept) header information. More specifically, for JSON payload you should use the standard media type application/json (or application/problem+json for [176]).
You should avoid using custom media types like application/x.zalando.article+json. Custom media types beginning with x bring no advantage compared to the standard media type for JSON, and make automated processing more difficult.
Exception: Custom media types should only be used in situations where you need to provide API endpoint versioning (with content negotiation) due to incompatible changes.
Names of arrays should be pluralized to indicate that they contain multiple values. This implies in turn that object names should be singular.
Property names are restricted to ASCII snake_case strings matching regex ^[a-z_][a-z_0-9]*$. The first character must be a lower case letter, or an underscore, and subsequent characters can be a letter, an underscore, or a number.
Examples:
customer_number, sales_order_number, billing_address
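The naming rule can be checked mechanically. The following Python sketch uses the regex from the rule above; the function name `is_valid_property_name` is a hypothetical helper, and `fullmatch` is used so that a trailing newline is not accepted by the `$` anchor.

```python
import re

# Regex taken from the rule above.
PROPERTY_NAME = re.compile(r"^[a-z_][a-z_0-9]*$")


def is_valid_property_name(name: str) -> bool:
    """Return True if the name is an ASCII snake_case property name."""
    return PROPERTY_NAME.fullmatch(name) is not None


for candidate in ("customer_number", "billing_address", "customerNumber", "2nd_line"):
    print(candidate, is_valid_property_name(candidate))
```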
Rationale: No established industry standard exists, but many popular Internet companies prefer snake_case: e.g. GitHub, Stack Exchange, Twitter. Others, like Google and Amazon, use both - but not only camelCase. It’s essential to establish a consistent look and feel such that JSON looks as if it came from the same hand.
Enumerations should be represented as string typed OpenAPI definitions of request parameters or model properties. Enum values (using enum or {x-extensible-enum}) need to consistently use the upper-snake case format, e.g. VALUE or YET_ANOTHER_VALUE. This approach makes it possible to clearly distinguish values from properties or other elements.
Exception: This rule does not apply to case sensitive values sourced from outside the API definition scope, e.g. language codes from {ISO-639-1}[ISO 639-1], or when declaring possible values for a sort parameter (see rule 137).
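A linting step for enum values might look like the sketch below. The guideline only gives examples of the upper-snake case format, so the exact pattern used here is an assumption, and `is_upper_snake_case` is a hypothetical helper. Values covered by the exception (such as language codes) would have to be excluded before applying it.

```python
import re

# Assumed pattern: upper-case letter first, then upper-case letters or digits,
# with single underscores separating non-empty segments.
UPPER_SNAKE = re.compile(r"[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*")


def is_upper_snake_case(value: str) -> bool:
    """Return True if an enum value follows the upper-snake case format."""
    return UPPER_SNAKE.fullmatch(value) is not None


for value in ("VALUE", "YET_ANOTHER_VALUE", "yet_another_value", "Value"):
    print(value, is_upper_snake_case(value))
```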
Dates and date-time properties should end with _at to distinguish them from boolean properties which otherwise would have very similar or even identical names:
- {created_at} rather than {created},
- {modified_at} rather than {modified},
- occurred_at rather than occurred, and
- returned_at rather than returned.
Note: {created} and {modified} were mentioned in an earlier version of the guideline and are therefore still accepted for APIs that predate this rule.
A "map" here is a mapping from string keys to some other type. In JSON this is represented as an object, the key-value pairs being represented by property names and property values. In OpenAPI schema (as well as in JSON schema) they should be represented using additionalProperties with a schema defining the value type. Such an object should normally have no other defined properties.
The map keys don’t count as property names in the sense of rule 118, and can follow whatever format is natural for their domain. Please document this in the description of the map object’s schema.
Here is an example for such a map definition (the translations property):
components:
  schemas:
    Message:
      description:
        A message together with translations in several languages.
      type: object
      properties:
        message_key:
          type: string
          description: The message key.
        translations:
          description:
            The translations of this message into several languages.
            The keys are [IETF BCP-47 language tags](https://tools.ietf.org/html/bcp47).
          type: object
          additionalProperties:
            type: string
            description:
              the translation of this message into the language identified by the key.
An actual JSON object described by this might then look like this:
{ "message_key": "color",
  "translations": {
    "de": "Farbe",
    "en-US": "color",
    "en-GB": "colour",
    "eo": "koloro",
    "nl": "kleur"
  }
}
Schema based JSON properties that are by design booleans must not be presented as nulls. A boolean is essentially a closed enumeration of two values, true and false. If the content has a meaningful null value, strongly prefer to replace the boolean with an enumeration of named values or statuses. For example, accepted_terms_and_conditions with true or false can be replaced with terms_and_conditions with values yes, no and unknown.
OpenAPI 3.x allows you to mark properties as required and as nullable to specify whether properties may be absent ({}) or null ({"example":null}).
If a property is defined to be not required and nullable (see the 2nd row in the table below), this rule demands that both cases must be handled in the exact same manner by specification.
The following table shows all combinations and whether the examples are valid:
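{CODE-START}required{CODE-END} | {CODE-START}nullable{CODE-END} | {CODE-START}{}{CODE-END} | {CODE-START}{"example":null}{CODE-END} |
---|---|---|---|
{CODE-START}true{CODE-END} | {CODE-START}true{CODE-END} | {NO} | {YES} |
{CODE-START}false{CODE-END} | {CODE-START}true{CODE-END} | {YES} | {YES} |
{CODE-START}true{CODE-END} | {CODE-START}false{CODE-END} | {NO} | {NO} |
{CODE-START}false{CODE-END} | {CODE-START}false{CODE-END} | {YES} | {NO} |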
While API designers and implementers may be tempted to assign different semantics to both cases, we explicitly decide against that option, because we think that any gain in expressiveness is far outweighed by the risk of clients misunderstanding the subtle differences and implementing them incorrectly.
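On the consuming side, treating both cases identically can be as simple as the following Python sketch. The helper name `read_optional` is hypothetical and the property name `example` is taken from the table above.

```python
import json


def read_optional(payload: dict, name: str, default=None):
    """Read a non-required, nullable property; absent and null yield the same result."""
    value = payload.get(name)  # None for an absent key as well as for JSON null
    return default if value is None else value


absent = json.loads('{}')
explicit_null = json.loads('{"example": null}')

# Both payloads are handled in exactly the same manner.
print(read_optional(absent, "example", default="n/a"))
print(read_optional(explicit_null, "example", default="n/a"))
```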
As an example, an API that provides the ability for different users to coordinate on a time schedule, e.g. a meeting, may have a resource for options in which every user has to make a choice. The difference between undecided and decided against any of the options could be modeled as absent and null respectively. It would be safer to express the null case with a dedicated Null object, e.g. {} compared to {"id":"42"}.
Moreover, many major libraries have little to no support for a null/absent pattern (see Gson, Moshi, Jackson, JSON-B). Strongly-typed languages especially suffer from this since a new composite type is required to express the third state. Nullable Option/Optional/Maybe types could be used but having nullable references of these types completely contradicts their purpose.
The only exception to this rule is JSON Merge Patch ({RFC-7396}[RFC 7396]), which uses null to explicitly indicate property deletion while absent properties are ignored, i.e. not modified.
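To illustrate why null and absent differ in this one case, here is a short Python sketch that follows the MergePatch pseudocode given in RFC 7396. The function name `merge_patch` is our own, and unlike the pseudocode it returns a new object instead of modifying the target in place.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch document to a target value (RFC 7396, section 2)."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target entirely.
        return patch
    result = dict(target) if isinstance(target, dict) else {}
    for name, value in patch.items():
        if value is None:
            # null means: delete this property if it exists.
            result.pop(name, None)
        else:
            result[name] = merge_patch(result.get(name), value)
    # Properties absent from the patch are left unmodified.
    return result


original = {"a": "b", "c": {"d": "e", "f": "g"}}
patched = merge_patch(original, {"a": "z", "c": {"f": None}})
print(patched)
```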
Empty array values can unambiguously be represented as the empty list, [].
You must use common field names and semantics whenever applicable. Common fields are idiomatic, create consistency across APIs and support common understanding for API consumers.
We define the following common field names:
- {id}: the identity of the object. If used, IDs must be opaque strings and not numbers. IDs are unique within some documented context, are stable and don’t change for a given object once assigned, and are never recycled across entities.
- {xyz_id}: an attribute within one object holding the identifier of another object must use a name that corresponds to the type of the referenced object or the relationship to the referenced object followed by _id (e.g. partner_id not partner_number, or parent_node_id for the reference to a parent node from a child node, even if both have the type Node). Exception: We use customer_number instead of customer_id for customer facing identification of customers due to legacy reasons. (Hint: customer_id used to be defined as internal only, technical integer key, see Naming Decision: customer_number vs customer_id [internal link]).
- {type}: the kind of thing this object is. If used, the type of this field should be a string. Types allow runtime information on the entity provided that otherwise requires examining the OpenAPI file.
- {etag}: the ETag of an embedded sub-resource. It may be used to carry the {ETag} for subsequent {PUT}/{PATCH} calls (see [etag-in-result-entities]).
Further common fields are defined in {SHOULD} name date/time properties with _at suffix.
The following guidelines define standard objects and fields:
Example JSON schema:
tree_node:
  type: object
  properties:
    id:
      description: the identifier of this node
      type: string
    created_at:
      description: when this node was created
      type: string
      format: 'date-time'
    modified_at:
      description: when this node was last updated
      type: string
      format: 'date-time'
    type:
      type: string
      enum: [ 'LEAF', 'NODE' ]
    parent_node_id:
      description: the identifier of the parent node of this node
      type: string
  example:
    id: '123435'
    created_at: '2017-04-12T23:20:50.52Z'
    modified_at: '2017-04-12T23:20:50.52Z'
    type: 'LEAF'
    parent_node_id: '534321'
Address structures play a role in different business and use-case contexts, including country variances. All attributes that relate to address information must follow the naming and semantics defined below.
addressee:
  description: a (natural or legal) person that gets addressed
  type: object
  properties:
    salutation:
      description: |
        a salutation and/or title used for personal contacts to some
        addressee; not to be confused with the gender information!
      type: string
      example: Mr
    first_name:
      description: |
        given name(s) or first name(s) of a person; may also include the
        middle names.
      type: string
      example: Hans Dieter
    last_name:
      description: |
        family name(s) or surname(s) of a person
      type: string
      example: Mustermann
    business_name:
      description: |
        company name of the business organization. Used when a business is
        the actual addressee; for personal shipments to office addresses, use
        `care_of` instead.
      type: string
      example: Consulting Services GmbH
  required:
    - first_name
    - last_name
address:
  description:
    an address of a location/destination
  type: object
  properties:
    care_of:
      description: |
        (aka c/o) the person that resides at the address, if different from
        addressee. E.g. used when sending a personal parcel to the
        office / someone else's home where the addressee resides temporarily
      type: string
      example: Consulting Services GmbH
    street:
      description: |
        the full street address including house number and street name
      type: string
      example: Schönhauser Allee 103
    additional:
      description: |
        further details like building name, suite, apartment number, etc.
      type: string
      example: 2. Hinterhof rechts
    city:
      description: |
        name of the city / locality
      type: string
      example: Berlin
    zip:
      description: |
        zip code or postal code
      type: string
      example: '14265'
    country_code:
      description: |
        the country code according to
        [iso-3166-1-alpha-2](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2)
      type: string
      example: DE
  required:
    - street
    - city
    - zip
    - country_code
Grouping and cardinality of fields in specific data types may vary based on the specific use case (e.g. combining addressee and address fields into a single type when modeling an address label vs distinct addressee and address types when modeling users and their addresses).
Use the following common money structure:
link:../models/money-1.0.0.yaml[role=include]
APIs are encouraged to include a reference to the global schema for Money.
SalesOrder:
  properties:
    grand_total:
      $ref: 'https://opensource.zalando.com/restful-api-guidelines/models/money-1.0.0.yaml#/Money'
Please note that APIs have to treat Money as a closed data type, i.e. it’s not meant to be used in an inheritance hierarchy. That means the following usage is not allowed:
{
  "amount": 19.99,
  "currency": "EUR",
  "discounted_amount": 9.99
}
- Violates the Liskov Substitution Principle
- Breaks existing library support, e.g. Jackson Datatype Money
- Less flexible since both amounts are coupled together, e.g. mixed currencies are impossible
A better approach is to favor composition over inheritance:
{
  "price": {
    "amount": 19.99,
    "currency": "EUR"
  },
  "discounted_price": {
    "amount": 9.99,
    "currency": "EUR"
  }
}
- No inheritance, hence no issue with the substitution principle
- Makes use of existing library support
- No coupling, i.e. mixed currencies are an option
- Prices are now self-describing, atomic values
Please be aware that some business cases (e.g. transactions in Bitcoin) call for a higher precision, so applications must be prepared to accept values with unlimited precision, unless explicitly stated otherwise in the API specification.
Examples for correct representations (in EUR):
- 42.20 or 42.2 = 42 Euros, 20 Cent
- 0.23 = 23 Cent
- 42.0 or 42 = 42 Euros
- 1024.42 = 1024 Euros, 42 Cent
- 1024.4225 = 1024 Euros, 42.25 Cent
Make sure that you don’t convert the "amount" field to float / double types when implementing this interface in a specific language or when doing calculations. Otherwise, you might lose precision. Instead, use exact formats like Java’s BigDecimal. See Stack Overflow for more info.
Some JSON parsers (NodeJS’s, for example) convert numbers to floats by default. After discussing the pros and cons we’ve decided on "decimal" as our amount format. It is not a standard OpenAPI format, but should help us to avoid parsing numbers as float / doubles.
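As an illustration of avoiding float parsing, Python's standard `json` module lets you supply an exact numeric type for number literals. This is a minimal sketch; `parse_money` is a hypothetical helper, and the payload mirrors the Money examples above.

```python
import json
from decimal import Decimal


def parse_money(text: str) -> dict:
    """Parse a Money object, keeping every JSON number as an exact Decimal."""
    # parse_float / parse_int receive the raw number literal as a string,
    # so no binary floating point conversion takes place.
    return json.loads(text, parse_float=Decimal, parse_int=Decimal)


price = parse_money('{"amount": 1024.4225, "currency": "EUR"}')
print(price["amount"])  # exact Decimal built from the literal, not a float

# Binary floats accumulate representation error; Decimal does not.
print(0.1 + 0.2 == 0.3)
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))
```

Note that `parse_int=Decimal` applies to every integer in the parsed document, so this approach fits best when parsing the Money object on its own or when all numbers in the payload should be exact.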