SJOT: Schemas for JSON Objects
==============================

<p align="right"><i>by Robert van Engelen, September 28, 2016.<br>Updated November 25, 2016.</i></p>

The [JSON schema](http://json-schema.org) draft was an important move forward to make JSON more useful with APIs and other systems that require JSON data validation.  However, working with JSON schema can be daunting and defeats the simplicity of JSON.

There is a simpler alternative to JSON schema that is more compact and easier to use.  We call it *Schemas for JSON Objects* or simply *SJOT*.  SJOT aims at quick validation of JSON data with lightweight schemas and compact validators.

SJOT schemas are valid JSON, just like JSON schema.  But SJOT schemas are more compact and intuitive.  A SJOT schema of an object can be as simple as a *JSON object template*.  Because a SJOT schema has the look and feel of a template, it is readable and understandable by humans.

So why another schema format?  Why not stick to JSON Schema?
I will try to sum up the problems with JSON Schema briefly:

- JSON schema is **non-strict by default**, meaning that all object properties are optional and any additional properties are permitted by default, that is, schemas accept almost anything by default.  For example, JSON data with typos in property names will not be rejected by a JSON Schema validator.
- JSON schemas are **not extensible**: you can only add more constraints when combining schemas.  There is no easy way to achieve schema object inheritance.  Worse, combining schemas may lead to a schema that rejects too much or even rejects everything.
- Checking if a JSON schema's constraints reject everything is an **NP-complete problem**.  Worse, constraints may depend on property values in the JSON data, not just property occurrences.
- JSON schema **violates the encapsulation principle** because it permits referencing local schema types, such as nested objects, via JSON Pointer, which means that you cannot update local types without breaking all the schemas that point to the updated local type structures.
- JSON schema design **violates the orthogonality principle** for several constructs.  For example, `[` and `]` can sometimes be used to indicate choices, but in other cases they cannot (perhaps `oneOf` should be used, but that has its own problems).
- The **principle of least surprise** does not apply to JSON schema: a construct may work well in one case while the same construct causes problems elsewhere.  For example, using `oneOf` to select among primitive types, say "string" and "number", makes sense, but using `oneOf` to select schemas may not always work and leads to surprising rejections (consider the simple case when the input data is an empty array that matches both the "array of strings" and "array of numbers" schemas).
- JSON schema validation performance is **not scalable** because validation cost may be higher than linear time (linear in the size of the input), in the worst case taking exponential time or memory to validate constraints, see the [exploding JSON Schema states](JSON-schema-sucks) examples.
- JSON schema permits constraining primitive type value ranges, but offers **few predeclared primitive types** to choose from, even though almost all programming languages offer byte, short, int, and float (single precision) types.
- JSON schema is **verbose**, doubling the nesting level compared to JSON data.

SJOT uses annotations in property names of JSON objects.  These annotations add constraints, for example to make a property optional instead of required by default, to restrict array lengths, to enumerate values, to extend base object types, and more.

SJOT example {#example}
------------

Say we have a simple JSON representation of a company product, similar to the json-schema.org [example](http://json-schema.org/example1.html) of a JSON schema (which is over 40 lines long!) that describes a company product API.  An example product in this API is:

[json]

    {
      "id": 1,
      "name": "A green door",
      "price": 12.50
    }

The product properties `id`, `name`, and `price` are considered the bare minimum properties of a product and should therefore be required.  Other products may contain optional `tags`, `dimensions`, and a `warehouseLocation`:

[json]

    {
      "id": 2,
      "name": "An ice sculpture",
      "price": 12.50,
      "tags": ["cold", "ice"],
      "dimensions": {
        "length": 7.0,
        "width": 12.0,
        "height": 9.5
      },
      "warehouseLocation": {
        "latitude": -78.75,
        "longitude": 20.4
      }
    }

Let's give this a SJOT *(comments are added for clarity and are not part of SJOT)*:

[json]

    {
      "@id": "http://example.com/product.json",  ← identify the schema (this is optional)
      "@note": "A company product",              ← describe what is defined
      "product": {                               ← define a product object that has...
        "id": "number",                          ← a required id number
        "name": "string",                        ← a required name string
        "price": "<0.0..",                       ← a required decimal price greater than 0.0
        "tags?": "string{1,}",                   ← an optional tags array of unique strings (a non-empty set)
        "dimensions?": {                         ← optional dimensions, when provided has..
          "length": "number",                    ← a required length numeric dimension
          "width": "number",                     ← a required width numeric dimension
          "height": "number"                     ← a required height numeric dimension
        },
        "warehouseLocation?": "http://example.com/geo.json#location"
                                                 ← an optional warehouseLocation with a type defined by another SJOT
      }
    }

Similar to the json-schema.org example, we decided to represent the location in a separate schema, here in SJOT:

[json]

    {
      "@id": "http://example.com/geo.json",
      "location": {                              ← define location that has...
        "latitude": "float",                     ← a required latitude single precision float
        "longitude": "float"                     ← a required longitude single precision float
      }
    }

SJOT types include the JSON types `"string"`, `"number"`, `"boolean"`, `"null"`, `"object"`, and `"array"` but also more specific types, such as `"char[0,6]"`, `"float"`, ranges `"0..10"`, and arrays of these, such as `"string[1,10]"`.

To create an array using the inline style (without requiring named types), simply use a pair of `[` `]` brackets to enclose the type.  For example, `[{"id":"number"}]` is an array of objects with numeric `id` properties.

The json-schema.org example schema actually defines an array of products.  In SJOT, the product array type is referred to by a SJOT type reference `"http://example.com/product.json#product[]"`.  This reference uses an array annotation and this suffices to describe and validate a JSON array of products.

Type referencing of the form *URI#name* is used to refer to a named type in a schema, such as `http://example.com/geo.json#location` that references the `location` object type defined in the `http://example.com/geo.json` schema.  As you can see, a SJOT type reference is very simple and clean.
A type reference string contains a `#` reference to a global type in a schema without requiring deeper multi-hop paths (no JSON pointers or paths).  A reference to a type in the current schema (for example, a schema that has no `@id` attribute property) is simply written as *#name* with an empty URI.  A reference to the root type in a schema is simply written as *URI#* and *#* for the root type of the current schema.

Multiple schemas can be combined in a list of schemas, each schema with a unique `@id`.  Types can be referenced between these schemas and the schemas in the list are used to validate JSON data.

See the [examples](#examples) in this article and check out our live [demo](get-sjot.html#demo) of SJOT in action.

SJOT can be translated to JSON schema draft v4 without loss of details.  See the [SJOT to JSON schema converter](get-sjot.html#demo).

SJOT basics {#basics}
-----------

A SJOT schema defines one or more types.  Each type is either atomic (i.e. a primitive type), an object type, or an array type.  The following example defines a `Name` type as a string and a `Person` type as an object with `firstname` and `lastname` properties:

[json]

    {
      "@id": "http://example.com/sjot.json",
      "@root": "#Person",
      "Name": "string",
      "Person": {
        "firstname": "#Name",
        "lastname": "#Name"
      }
    }

The `@root` attribute property indicates the root type of the JSON data to validate, which in this case is a `Person` object.  If a schema has only one named type then the `@root` attribute can be removed, because it is redundant.

A SJOT schema may include an `@id` property to declare a *namespace URI* to identify this schema.  The URI does not need to be a URL, but using a URL to identify a schema can be useful when external schemas must be loaded by a validator.

To refer to a named type we use SJOT *type references* that are of the form *URI#name*, or simply *#name* to refer to the named type in the current schema.
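
To make the reference forms concrete, here is a minimal Python sketch of how a *URI#name* reference could be split and looked up in a list of schemas.  This is not the SJOT implementation: the function names `split_type_reference` and `resolve` are hypothetical, array and set suffixes such as `[]` are ignored, and the root type is assumed to be stored under `@root`.

```python
def split_type_reference(ref, current_id=None):
    """Split a 'URI#name' reference into (uri, name).

    An empty URI stands for the current schema (current_id) and an
    empty name stands for the root type (assumed to be under "@root").
    """
    uri, sep, name = ref.partition("#")
    if not sep:
        raise ValueError(f"not a type reference: {ref!r}")
    return (uri or current_id, name or "@root")


def resolve(ref, schemas, current_id=None):
    """Return the type that a reference names, searching a list of schemas."""
    uri, name = split_type_reference(ref, current_id)
    for schema in schemas:
        if schema.get("@id") == uri:
            return schema[name]
    raise KeyError(f"no schema with @id {uri!r}")


# The two schemas used as examples in this article.
schemas = [
    {
        "@id": "http://example.com/geo.json",
        "location": {"latitude": "float", "longitude": "float"},
    },
    {
        "@id": "http://example.com/sjot.json",
        "@root": "#Person",
        "Name": "string",
        "Person": {"firstname": "#Name", "lastname": "#Name"},
    },
]

print(resolve("http://example.com/geo.json#location", schemas))
print(resolve("#Name", schemas, current_id="http://example.com/sjot.json"))
```

Because a reference names a global type directly, the lookup is a single dictionary access once the schema is found; no path traversal is needed.
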
Spaghetti references are not allowed: a type reference must refer to a type and cannot refer to yet another type reference.

Note that the `firstname` and `lastname` of a `Person` object refer to a `Name` instead of just a string, which is intentional: if we decide later to restrict the string content of names then we only have to do this once, for example:

[json]

    "Name": "(\\w(\\w|\\s)*)",

where `\\w` matches a letter, digit, or underscore and `\\s` matches a space.

The `@root` in this example schema is a type reference, but a `@root` may also directly define the type of the JSON data to validate.  For example:

[json]

    { "@root": "string[0,999]" }

This schema validates JSON arrays of strings, where the array can contain up to 999 items.  A `@root` can be assigned any type, a type reference, or multiple types with a union.

SJOT types {#types}
----------

SJOT has a list of built-in primitive types that are commonly used, besides `"boolean"`, `"number"`, `"string"`, and `"null"`.  Objects, arrays, sets, tuples, and unions are simply defined in a SJOT schema using an inline style.

It only takes two tables to list all SJOT schema constructs.  A SJOT type is one of:

[json]

    "any"                   any type (wildcard)
    "atom"                  any primitive non-null value (boolean, number, or string)
    "boolean"               Boolean with value true or false
    "true"                  Boolean with fixed value true
    "false"                 Boolean with fixed value false
    "byte"                  8-bit integer
    "short"                 16-bit integer
    "int"                   32-bit integer
    "long"                  64-bit integer
    "ubyte"                 8-bit unsigned integer
    "ushort"                16-bit unsigned integer
    "uint"                  32-bit unsigned integer
    "ulong"                 64-bit unsigned integer
    "integer"               integer (unconstrained)
    "float"                 single precision decimal
    "double"                double precision decimal
    "number"                decimal number (unconstrained)
    "n,m,..."               integer/number enumeration (one of integer/decimal constants)
    "n..m"                  inclusive numeric range (n, m are optional integer/decimal constants)
    "<n..m>"                exclusive numeric range (n, m are optional integer/decimal constants)
    "string"                string
    "base64"                string with base64 content
    "hex"                   string with hexadecimal content
    "uuid"                  string with UUID content, optionally starting with urn:uuid:
    "date"                  string with RFC 3339 date YYYY-MM-DD
    "time"                  string with RFC 3339 time with optional time zone HH:MM:SS[.s][[+|-]HH:MM|Z]
    "datetime"              string with RFC 3339 datetime with optional time zone
    "duration"              string with ISO-8601 duration PnYnMnDTnHnMnS
    "char"                  string with a single character (ASCII, Unicode, UTF-8, etc.)
    "char[n,m]"             string with length between n and m (n, m are optional)
    "(regex)"               string with text that matches the regex
    "type[]"                array of a named type
    "type[n,m]"             array with at least n items and at most m items (n, m are optional)
    "type{}"                set of atoms (an array of unique primitive values)
    "type{n,m}"             set of atoms with at least n items and at most m items (n, m are optional)
    "URI#name"              reference to a named type in the schema identified by "@id": "URI"
    "#name"                 reference to a named type in the current schema
    "object"                object (with any properties), same as {}
    "array"                 array (of any type of item), same as []
    "null"                  null type with only one value null
    [ type ]                array of a type
    [ n, type, m ]          array with at least n items and at most m items (n, m, type are optional)
    [ type, ..., type ]     tuple
    { "name": type, ... }   object type with properties
    [[ type, ..., type ]]   union of types (a choice of distinct types)

The property names of object types can be annotated to make them optional or match a pattern:

[json]

    "name"                  property is required
    "name?"                 property is optional
    "name?value"            property with a default value (primitive types only)
    "(regex)"               property name(s) that match the regex

If the character `?` is to be part of a property name, then we write it as a regex `(who\\?)`, with a double backslash to escape the `?` (a single backslash will be removed by most JSON parsers).  Likewise, if a property name starts with a `(` then we write it as a regex.

### Objects with required, optional, and default properties

An example object type with a required, optional, and default property is:

[json]

    {
      "Widget": {                ← a widget has...
        "id": "string",          ← a required id
        "tags?": "string{1,}",   ← an optional non-empty array of unique string tags
        "counter?1": "ulong"     ← an optional counter with default value 1
      }
    }

To disallow additional properties, add the `"@final": true` attribute property.  To permit optional properties to occur depending on other optional properties, see the SJOT [dependencies](#deps) described further below.

An object with any properties is `"object"` or just `{}`.  An empty object that does not permit any properties is `{ "@final": true }`.

### Regex properties and values

Regex anchoring with `^` and `$` is unnecessary (JSON and SJOT are language and regex library neutral: regex patterns match entire strings).  For example, this dictionary object maps words to words:

[json]

    { "(\\w+)": "(\\w+)" }

To match strings partially, add `.*` at either end of the regex.
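
The whole-string matching rule can be illustrated with a short Python sketch.  It is only an approximation under stated assumptions: Python's `re.fullmatch` stands in for whatever regex library a SJOT validator uses, regex flavors differ in details (for instance, Python's `\w` also matches non-ASCII letters), and the helper name `is_word_dictionary` is hypothetical.

```python
import json
import re

# The dictionary schema from the text; JSON unescapes "\\w" to the regex \w.
schema = json.loads(r'{ "(\\w+)": "(\\w+)" }')
name_regex, value_regex = next(iter(schema.items()))


def is_word_dictionary(obj):
    """Check each property name and string value against the whole-string regexes."""
    return all(
        isinstance(value, str)
        and re.fullmatch(name_regex, name) is not None
        and re.fullmatch(value_regex, value) is not None
        for name, value in obj.items()
    )


print(is_word_dictionary({"hello": "world"}))      # True
print(is_word_dictionary({"hello": "big world"}))  # False: the space is not matched
# A partial match needs explicit .* at the ends of the regex:
print(re.fullmatch(r"(.*door.*)", "a green door") is not None)  # True
```
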
Additional types with constraints can be easily added to a SJOT schema, for example the ISO 6709 Annex H latitude and longitude type values (see the Google [JSON Style Guide](https://google.github.io/styleguide/jsoncstyleguide.xml?showone=Latitude/Longitude_Property_Values#Latitude/Longitude_Property_Values)):

[json]

    {
      "@id": "http://example.com/iso-6709.json",
      "@note": "ISO 6709 Annex H latitude and longitude location",
      "LatLon": "([+-]\\d{2}(\\.\\d+)?[+-]\\d{3}(\\.\\d+)?)"
    }

Special string types such as ID, URI, email, hostname, and so on can be easily defined with a regex and put in a schema for reuse.

### Tuples

A tuple is a fixed-length list of values, such as `[ "point", true ]`, which is defined by the tuple type:

[json]

    [ "string", "boolean" ]

### Arrays and sets

Arrays of named types are simply defined by `"type[]"` without bounds and `"type[n,m]"` with bounds.  The lower and upper bounds are optional, so `"type[n,]"` and `"type[,m]"` can be used.  Use `"type[n]"` for a fixed-size array.

The inline style for arrays is `[type]` without bounds and `[n, type, m]` with bounds, where `n` and `m` are non-negative integers.  The lower and upper bounds are optional, so `[n, type]` and `[type, m]` can be used.  The `type` is also optional and is `"any"` when omitted.  Thus, `[]` is an array of any type with any length, `[0]` is an empty array, `[2]` is an array with two items, and `[1,3]` is an array of one to three items of any type.

For example, extending a Widget object type example to include an array of quantity-price objects:

[json]

    {
      "Widget": {                ← a widget has...
        "id": "string",          ← a required id
        "tags?": "string{1,}",   ← an optional non-empty array of unique string tags
        "counter?1": "ulong",    ← an optional counter with default value 1
        "pricing?": [            ← an optional array of quantity-price objects
          {
            "quantity": "1..",   ← quantity
            "price": "<0.0.."    ← price per quantity
          }
        ]
      }
    }

Sets of named types `"type{}"` without bounds and `"type{n,m}"` with bounds are essentially arrays of atomic values that are unique.  The lower and upper bounds are optional.

Uniqueness of atomic values is well defined.  By contrast, object equality is often semantic instead of structural.  That is, two objects may still be considered equivalent when structurally different, such as when extra properties are to be ignored.  Therefore, SJOT does not admit sets of non-atomic values.  This requirement makes sorting stable and validation of sets (with sorting) fast.

### Enumerations

To enumerate numbers for a numeric type, use constants and ranges:

[json]

    "Composite": "4,6,8..10,12,14..16"

To enumerate strings, use regex alternations:

[json]

    "Color": "(RED|GREEN|YELLOW|BLUE)"

Enumerations of mixed types are modeled with a union:

[json]

    "TrueOrColorOrByte": [[ "true", "(RED|GREEN|YELLOW|BLUE)", "byte" ]]

### Unions

A union of types describes the range of possible types that a value may have.  For example, this union represents a string or a number value:

[json]

    [[ "string", "number" ]]

Unions of object and array types are permitted.  Array types and object types in the union must be *distinct*.  Objects are distinct if they do not share properties.  For example, the following union has two distinct object types:

[json]

    [[ { "a": "number" }, { "b": "string" } ]]

To combine objects that share properties in a union, add an outer property name that is a unique tag:

[json]

    [[ { "t1": { "a": "string", "b": "number" } }, { "t2": { "b": "string" } } ]]

Why is this necessary?  The goal of SJOT is to make validation fast and scalable with predictable validation times (similar to XML schema validators and XML data bindings).  Therefore, the SJOT validator must be able to determine the type of the value efficiently among the choices in the union, *using constant algorithmic complexity*.
By contrast, JSON schema's "oneOf" and "anyOf" are not always efficient because the validator may have to revisit the data multiple times.

Arrays in a union are distinct if the item types are distinct.  This takes care of notorious problems with JSON schema when using "oneOf" instead of "anyOf" for type choices.  A "oneOf" over *M* arrays of length *N* may require *M* x *N* time to validate while SJOT takes at most *M*+*N* time.  Worse, validation with this JSON schema "oneOf" fails for an empty array because it matches all arrays in the "oneOf" (surprise!).

You may have guessed by now that a union is a smart combination of "oneOf" and "anyOf".  The validator applies "anyOf" semantics for efficiency, but the restriction on distinct types essentially forces "oneOf" semantics by avoiding ambiguity.

Instead of using unions of objects, consider using the `@one` attribute property:

[json]

    {
      "t1?": { "a": "string", "b": "number" },
      "t2?": { "b": "string" },
      "@one": [ [ "t1", "t2" ] ]
    }

Finally, unions should not be nested, either directly or indirectly via a type reference to another union or array of unions.

SJOT in JSON {#embed}
------------

A SJOT schema can be embedded within a JSON object by using the `@sjot` property.  The embedded schema describes and validates that object.  For example:

[json]

    {
      "@sjot": {
        "Person": {
          "@note": "Person with a first name and a last name",
          "firstname": "string",
          "lastname": "string"
        }
      },
      "firstname": "Jason",
      "lastname": "Bourne"
    }

When embedded, the SJOT schema should have only one type or define a `@root` object type (in case several types are defined) that describes the JSON data.  In this example the `Person` object type describes the JSON data.  The JSON data is valid because it includes the required `firstname` and `lastname` properties of a `Person` object type.

An embedded SJOT may refer to an external schema's root using `URL#`.
For example, the same object above with a schema reference:

[json]

    {
      "@sjot": "http://example.com/sjot.json#",
      "firstname": "Jason",
      "lastname": "Bourne"
    }

The `@sjot` URL points to a SJOT schema that has a `Person` object type as the root, such as the SJOT schema that we [described earlier](#basics) in this article.

An embedded SJOT may refer to a specific type in a schema:

[json]

    {
      "@sjot": "http://example.com/sjot.json#Person",
      "firstname": "Jason",
      "lastname": "Bourne"
    }

When you invoke the validator with a specific type and schema, then only that type and schema are used to validate the data.  Use `null` as a type when invoking the validator to permit an embedded `@sjot` to override the type.

A `@sjot` in a JSON object may occur anywhere in JSON data, not just the root-level object.

A `@sjot` may contain an array of schemas, each identified with a unique `@id`.

SJOT attribute properties {#props}
-------------------------

A `@sjot` attribute property of an object in JSON data contains an embedded SJOT that defines the JSON object.  An embedded `@sjot` value can be a type reference to a SJOT schema.  If multiple types are defined in the embedded SJOT schema, the type that defines the JSON object should be named `@root`.

An `@id` attribute property in a SJOT schema identifies the schema by a URI namespace string.

A `@note` attribute property can be added to a SJOT schema and to the object types that the schema defines.  The `@note` value should be a string.

A `@root` attribute property refers to the root type of the schema.  An embedded SJOT should have a `@root` attribute property or the schema should define only one type.

A `@one`, `@any`, `@all`, or `@dep` attribute property of an object type in a SJOT schema restricts the use of optional object properties.  See the SJOT [dependencies](#deps) described further below.

An `@extends` attribute property of an object type in a SJOT schema introduces a derived object type.
A derived object type includes the properties of a base object type.  We will discuss the use of base and derived object types below.

A `@final` attribute property declares an object type final, meaning that it cannot be extended.  Also, extra properties for this object in JSON data are not permitted.

SJOT base and derived object types {#extend}
----------------------------------

You can extend a base object by adding properties to define a derived object.  The `@extends` attribute property in an object type refers to a base object type that is extended.  For example:

[json]

    {
      "@id": "http://www.example.com/sjot.json",
      "@note": "Schema to store personal information",
      "Person": {
        "@note": "Person with a first name and a last name",
        "firstname": "string",
        "lastname": "string"
      },
      "PersonDetails": {
        "@note": "Person with optional age and gender",
        "@extends": "http://www.example.com/sjot.json#Person",
        "age?": "0..",
        "gender?": "(MALE|FEMALE)"
      }
    }

The `age?` property is optional and has a non-negative integer value.  The `gender?` property is optional and has one of the two string values `MALE` or `FEMALE`.

When creating derived object types, it is not permitted to override the base properties.  Only new properties that are not already in the base object type can be added to create a derived object type.  This ensures that a derived object can be used in place of a base object in JSON data and will pass validation by ignoring the extra properties in the derived object.  This permits upgrading of a JSON API with backward compatibility to a base API.

A derived object type can change a base property from optional to required by using a `@one` singleton propset with that property name.

SJOT final object types {#final}
----------------------

A `@final` object cannot have any extra properties that are not defined in the schema.
Consider the `PersonDetails` type from the previous example, now declared `@final`:

[json]

    {
      "PersonDetails": {
        "@note": "Person with optional age and gender",
        "@extends": "http://www.example.com/sjot.json#Person",
        "@final": true,
        "age?": "0..",
        "gender?": "(MALE|FEMALE)"
      }
    }

Additional properties that are used in a JSON `PersonDetails` object will cause the validator to reject this JSON data.

SJOT any, one, and all dependencies {#deps}
-----------------------------------

When object type properties are optional, you can make their use dependent on the presence of other properties in the object.  You can enforce one property of a set of properties to be present.  Or force any property of a set to be present.  Or all properties as a group to be present or none of that group.  More specific property dependencies can be enforced as well.

### SJOT one

The SJOT `@one` attribute property of an object type is a list of sets of object property names.  Each property set defines the properties that should be exclusive, meaning only one of the properties may be present.

For example, the `choices` object type defined below has one of the properties `a`, `b`, or `c`, and one of the properties `x` or `y`:

[json]

    {
      "choices": {
        "a?": "int",
        "b?": "int",
        "c?": "int",
        "x?": "float",
        "y?": "float",
        "@one": [ [ "a", "b", "c" ], [ "x", "y" ] ]
      }
    }

The property sets in the `@one` list should be mutually disjoint and only refer to properties that are optional (without default values) in the schema.

### SJOT any

The SJOT `@any` attribute property of an object type is a list of sets of object property names.  Each property set defines the properties of which one or more should be used in this object.
For example, the `anyabc` object type defined below must have at least one of the properties `a`, `b`, and `c` and therefore cannot be empty:

[json]

    {
      "anyabc": {
        "a?": "int",
        "b?": "int",
        "c?": "int",
        "@any": [ [ "a", "b", "c" ] ]
      }
    }

The property sets in the `@any` list should be mutually disjoint and only refer to properties that are optional (without default values) in the schema.

### SJOT all

The SJOT `@all` attribute property of an object type is a list of sets of object property names.  Each property set defines which properties should all be included when at least one of them is used, meaning that all properties should be present or none of them at all.

For example, the `allornone` object type defined below must have both of the properties `x` and `y` or none of them:

[json]

    {
      "allornone": {
        "x?": "int",
        "y?": "int",
        "@all": [ [ "x", "y" ] ]
      }
    }

The property sets in the `@all` list should be mutually disjoint and only refer to properties that are optional (without default values) in the schema.

### SJOT dep

The SJOT `@dep` attribute property of an object type enforces properties to be present when a specific property is present.

For example, the `ifxthenyz` object type defined below must have properties `y` and `z` if property `x` is present:

[json]

    {
      "ifxthenyz": {
        "x?": "int",
        "y?": "int",
        "z?": "int",
        "@dep": { "x": [ "y", "z" ] }
      }
    }

To simplify this notation, if a property list has only one property, the property name can be directly used instead of the singleton list.

The property sets in each `@dep` list should only refer to properties that are optional (without default values) in the schema.

Note that the `@all` attribute property enforces the *N* dependencies for a group of *N* properties that are all dependent on each other.

SJOT validation {#validation}
---------------

Validation proceeds recursively over objects, arrays, and tuples.
Primitive values (atoms) are verified against the value type constraints that are imposed on a value by using the type information in the SJOT schema.

The property names of an object are matched against the property names of a SJOT object type.  For each matching property name the value is recursively validated.  If a property is required but is absent, validation fails.  If a property is optional and is absent or its value is `null`, validation succeeds, meaning that `null` is equivalent to absent for optional properties.  In this case the `null` property can be deleted by the validator.  If an optional property has a default value and is absent or its value is `null`, the default value is assumed and the default value can be assigned to this property by the validator.

The `@one`, `@any`, `@all`, and `@dep` constraints on object properties are enforced.  For the `@one` constraints, exactly one property must occur for each property set specified.  For the `@any` constraints, at least one of the properties must occur for each property set specified.  For the `@all` constraints, all or none of the properties must occur for each property set specified.  For the `@dep` constraints, if an optional property is present then the properties in the specified property set must all be present.

Extra properties of an object are ignored unless the object type is `@final`.  Validation fails when extra properties are present in a final object.

An array is validated by checking constraints on its length and the uniqueness of atomic items in case of a set.  In case of a set of atoms `atom{}`, it is assumed that integers and floating point values are compared based on their mathematical value, not their type.  So a set cannot contain both 0 and 0.0.  A `null` value in an array is converted when validated against a primitive type.  The result is `false` for Boolean, `0` for numeric types, and `""` for string types.
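
The two array rules just stated (uniqueness by mathematical value and `null` conversion for primitive item types) can be sketched in Python as follows.  This is an illustration, not the SJOT validator: the helper names are hypothetical, only the named numeric types from the type table are covered, and treating `true` as distinct from `1` is an assumption made here because JSON Booleans and numbers are different types.

```python
# Named numeric types from the SJOT type table (ranges and enumerations omitted).
NUMERIC_TYPES = {
    "byte", "short", "int", "long", "ubyte", "ushort", "uint", "ulong",
    "integer", "float", "double", "number",
}


def replace_nulls(items, type_name):
    """Replace null (None) items in an array of a primitive type by a zero value."""
    if type_name == "boolean":
        zero = False
    elif type_name in NUMERIC_TYPES:
        zero = 0
    elif type_name == "string":
        zero = ""
    else:
        raise ValueError(f"not a primitive type handled by this sketch: {type_name!r}")
    return [zero if item is None else item for item in items]


def has_unique_atoms(items):
    """Check set uniqueness, comparing numbers by value so that 0 equals 0.0."""
    seen = set()
    for item in items:
        # bool is a subclass of int in Python, so tag Booleans separately
        # to keep true distinct from 1 (an assumption of this sketch).
        key = ("boolean", item) if isinstance(item, bool) else ("value", item)
        if key in seen:
            return False
        seen.add(key)
    return True


print(has_unique_atoms([0, 1, 2]))          # True
print(has_unique_atoms([0, 0.0]))           # False: same mathematical value
print(replace_nulls([1, None, 3], "int"))   # [1, 0, 3]
```
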
An array of objects, arrays, or tuples cannot contain `null` values; a `null` item triggers a validation error.

A tuple is validated by validating its members, with the same validation rule for `null` as for arrays stated above.  Tuple sizes are fixed.  Validation fails when tuples are not of the correct size.

An object that is validated against the types `any` or `object` is validated using its embedded `@sjot` schema, when present.

SJOT examples {#examples}
-------------

### Vehicle data with embedded schema

[json]

    {
      "@sjot": {
        "vehicle": {
          "color?": "(WHITE|GRAY|BLACK)",
          "rgb?": "([0-9a-fA-F]{6})",
          "make": "string",
          "year?": "1970..",
          "@one": [ [ "color", "rgb" ] ]
        }
      },
      "rgb": "D71E1E",
      "make": "Honda",
      "year": 2006
    }

### Product catalog with embedded schemas

[json]

    {
      "@sjot": [
        {
          "@id": "http://example.com/product.json",
          "@note": "Company product catalog",
          "@root": { "products": "http://example.com/product.json#product[]" },
          "product": {
            "@note": "A company product",
            "id": "number",
            "name": "string",
            "price": "<0.0..",
            "tags?": "string{1,}",
            "dimensions?": {
              "length": "number",
              "width": "number",
              "height": "number"
            },
            "warehouseLocation?": "http://example.com/geo.json#location"
          }
        },
        {
          "@id": "http://example.com/geo.json",
          "location": {
            "latitude": "float",
            "longitude": "float"
          }
        }
      ],
      "products": [
        {
          "id": 1,
          "name": "A green door",
          "price": 12.50
        },
        {
          "id": 2,
          "name": "An ice sculpture",
          "price": 12.50,
          "tags": ["cold", "ice"],
          "dimensions": {
            "length": 7.0,
            "width": 12.0,
            "height": 9.5
          },
          "warehouseLocation": {
            "latitude": -78.75,
            "longitude": 20.4
          }
        }
      ]
    }

SJOT chameleon objects: trick or treat? {#trick}
---------------------------------------

A tricky situation arises when a derived object type extends a base object type that is defined in another schema.
If one or more of the base object properties refer to a *type* in the base schema by using a local *#type* reference, then the scope of these type references changes, because the base object properties are literally imported into the derived object. We call this type of base object a *chameleon object*. A chameleon object (ab)uses local type references and tricks its properties into changing shape!

An example chameleon object is the `Base` object type in the first of the following two SJOT schemas:

[json]

    [
      {
        "@id": "http://example.com/base.json",
        "Base": { "id": "#ID" },
        "ID": "any"
      },
      {
        "@id": "http://example.com/derived.json",
        "Derived": { "@extends": "http://example.com/base.json#Base" },
        "ID": "string"
      }
    ]

The `id` property of the `Base` object changes type, from `"any"` to `"string"`, when it is imported into `Derived` with the SJOT `@extends` attribute property. To see why, consider the derived schema that results after the import, where the `#ID` type reference now resolves to the `ID` type of the derived schema:

[json]

    {
      "@id": "http://example.com/derived.json",
      "Derived": { "id": "#ID" },
      "ID": "string"
    }

Chameleons allow us to define *type generics* that change shape via local type references. This is a real treat for the expressiveness of SJOT. However, danger lurks here! When a JSON API relies on a base object with fixed property types and this base is a chameleon, then the use of a derived object in place of the base object may cause validation failures.

To be safe, a *#type* reference should only be used when the current schema has no `@id`, so that this schema cannot be referenced. If an `@id` is used and the resulting chameleon type generics are extended, then it makes sense for local type references to be generic types, such as `any`, `atom`, or `object`.

Want to give it a SJOT? {#ps}
-----------------------

SJOT for JS and C/C++ is licensed under the BSD3 license. It is available for download from [GitHub](https://github.com/Genivia/SJOT) and is also available as an [npm](https://www.npmjs.com/package/sjot) package. You can also try a [live demo](get-sjot.html#demo) of SJOT in action.

If you are interested in contributing to SJOT then <a href="mailto:engelen@genivia.com?subject=Help%20with%20SJOT%20development&body=I'd%20like%20to%20help%20developing%20SJOT">drop me an email</a> and I will respond as soon as possible.

APPENDIX A: Exploding JSON Schema states {#JSON-schema-sucks}
----------------------------------------

The first "ping-pong" JSON schema example randomly alternates between a "ping" and a "pong" schema for nested objects `x` until we find a boolean `y` that is a final "pong":

[json]

    {"x":{"x":{"x":{"x":{"x":{"x":{"y":true}}}}}}}

If the nesting level exceeds 16 then JSON schema validators can take minutes (or crash) using the following schema:

[json]

    {
      "$schema": "http://json-schema.org/draft-04/schema#",
      "$ref": "#/definitions/ping",
      "definitions": {
        "ping": {
          "type": "object",
          "properties": {
            "x": {
              "anyOf": [
                { "$ref": "#/definitions/ping" },
                { "$ref": "#/definitions/pong" }
              ]
            }
          },
          "additionalProperties": false
        },
        "pong": {
          "type": "object",
          "properties": {
            "x": {
              "anyOf": [
                { "$ref": "#/definitions/ping" },
                { "$ref": "#/definitions/pong" }
              ]
            },
            "y": { "type": "boolean" }
          },
          "additionalProperties": false
        }
      }
    }

For the second example, let's implement a finite state machine in a JSON schema. The JSON Schema has *N* definitions. The "words" we validate with the schema are defined by the regular expression `(a{N}|a(a|b+){0,N-1}b)*x` that describes a sequence of `a` and `b` ending in `x`. The word `abbx` is represented by the JSON pointer `a/b/b/x` which is `{"a":{"b":{"b":{"x":true}}}}`.
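The word encoding described above can be sketched in a few lines of Python. This is only an illustration and not part of SJOT: the helper name `word_to_json` is hypothetical, and the sketch assumes, as in the `abbx` example, that every letter becomes a single-property object and that the last letter maps to `true`.

```python
import json

def word_to_json(word):
    """Nest one single-property object per letter; the last letter maps to true."""
    if not word:
        raise ValueError("word must not be empty")
    *path, last = word
    node = {last: True}
    # Wrap the innermost object with the remaining letters, from right to left.
    for letter in reversed(path):
        node = {letter: node}
    return node

# Compact separators reproduce the notation used in the text.
print(json.dumps(word_to_json("abbx"), separators=(",", ":")))
# prints {"a":{"b":{"b":{"x":true}}}}
```

A generator like this makes it easy to produce long test words when measuring how a validator behaves on the schema that follows.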
The first definition for "0" has the following schema:

[json]

    {
      "$schema": "http://json-schema.org/draft-04/schema#",
      "$ref": "#/definitions/0",
      "definitions": {
        "0": {
          "type": "object",
          "properties": {
            "a": { "$ref": "#/definitions/1" },
            "x": { "type": "boolean" }
          },
          "additionalProperties": false
        },

Then we add *N*-1 definitions `<DEF>` to the schema, enumerated "1", "2", "3", ... "*N*-1":

[json]

        "<DEF>": {
          "type": "object",
          "properties": {
            "a": { "$ref": "#/definitions/<DEF>+1" },
            "b": {
              "anyOf": [
                { "$ref": "#/definitions/0" },
                { "$ref": "#/definitions/<DEF>" }
              ]
            }
          },
          "additionalProperties": false
        },

where "`<DEF>`+1" wraps back to "0" when `<DEF>` is equal to *N*-1.

This "NFA" on a two-letter alphabet has *N* states, with only one initial and one final state. Its equivalent minimal DFA has 2^*N* (2 to the power *N*) states. In the worst case, a validator that uses this JSON schema either takes 2^*N* time or uses 2^*N* memory "cells" to validate the input.

[![To top](images/go-up.png) To top](#)

APPENDIX B: Tips and tricks {#tricks}
---------------------------

### What does SJOT stand for?

<b>S</b>chemas for <b>J</b>SON <b>O</b>bjec<b>t</b>s. <b>To JS</b> spelled backwards.

### How to define a schema for JSON data of different types

If the different types are distinguishable and you must use the same schema for validation, then use a union as the schema root:

[json]

    { "@root": [[ type1, type2, type3, ... ]] }

### How to define a property with a ? in the name

Use a regex:

[json]

    "(PropWithA\\?InItsName)": "string",

This regex property is optional. To make the property required, see below. Use the same approach when a property name starts with a `(`.

### How to make regex properties required instead of optional

Regex properties are optional by design. If the property is required, add an `@any` attribute property to force its presence:

[json]

    "(PropWithA\\?InItsName)": "string",
    "@any": [ ["PropWithA?InItsName"], ... ]

### How to define a property with a default empty string value

Because `null` is converted to an empty string when used as a string type, use `null` as the default value for a property that needs an empty string default value:

[json]

    "name?null": "string"

By contrast, `"name?"` is an optional property without a default value.

### How to define a singleton tuple

Use unit lower and upper bounds:

[json]

    [1, type, 1]

By contrast, `[type]` denotes an array of any length, not a singleton tuple.

### How to define an array of tuples

Use an array lower bound and/or upper bound:

[json]

    [0, [type1, type2] ]

By contrast, `[[ type1, type2 ]]` denotes a union.

### How to define an object that rejects additional properties

Use the `@final` attribute property to restrict the object type:

[json]

    { "@final": true, "name": "string" }

This validates objects with a required `"name"` property that is a string and rejects all objects that include other properties.

An object type may have regex properties, which means that additional properties are permitted when they match the regex:

[json]

    { "@final": true, "name": "string", "(extra.*)": "any" }

This permits additional properties with names that start with `"extra"`.

### How to define an empty object

Use the following:

[json]

    { "@final": true }

By contrast, `"object"` and `{}` denote extensible object types.

### How to define an empty array

Use the following:

[json]

    [0]

By contrast, `"array"` and `[]` denote arrays of any type and of any length.

[![To top](images/go-up.png) To top](#)

<p align="right"><i>Copyright (c) 2016, Robert van Engelen, Genivia Inc. All rights reserved.</i></p>