Structured Log Message Schemas

Overview

All message schemas are described using the JSON schema language. These message schemas make it possible to validate that events in the log match their message schema. These schemas also contain some custom elements that give more control over the code that is generated.

Because these JSON schemas are overly complex for most use cases, we offer a simplified schema format. These simplified schemas are written with a similar structure to JSON schemas, but they are less verbose and lack some of the redundant and error-prone elements of JSON schemas.

Simplified schemas can be written in JSON, but we also offer a more human-readable alternative: Python dictionary format (file extensions .py and .schema). Unlike JSON, Python dictionary format allows comments, strings split over multiple lines, and trailing commas. Note that only Python literals can be used in this format; you will get an error if you try to call a function or calculate anything. We may offer additional formats in the future.
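
The literal-only restriction can be illustrated with Python's standard ast.literal_eval() function, which accepts comments, adjacent string literals, and trailing commas but rejects function calls. This is only a sketch: this document does not say how the real tools load these files, and the load_simplified_schema() helper is a hypothetical name.

```python
import ast

# A tiny schema fragment in Python dictionary format (contents are illustrative).
SCHEMA_TEXT = """\
{
    # Comments are allowed.
    "name": "example.structuredlog",
    "description": "Strings can be split"
                   " over multiple lines.",
    "flags": [],  # Trailing commas are fine too.
}
"""


def load_simplified_schema(text):
    """Parse Python dictionary format text, accepting literals only.

    Hypothetical helper: the real loader may be implemented differently.
    """
    schema = ast.literal_eval(text)
    if not isinstance(schema, dict):
        raise ValueError("a schema must be a dictionary at the top level")
    return schema


schema = load_simplified_schema(SCHEMA_TEXT)
print(schema["description"])

# Anything beyond a literal, such as a function call, is rejected.
try:
    load_simplified_schema('{"name": str(1)}')
except ValueError as error:
    print("rejected:", error)
```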

Some field names may not be used in a schema. These reserved field names are all used in the CloudEvents wrapper for each event. Some transmission protocols may require that the event be flattened, which requires that no collision occurs with these names. The following names may not be used in an event schema:

  • id or _id.

  • session or s_session.

  • time or ts_created.

  • specversion.

  • type or s_type.

  • source or s_source.

  • dataschema or s_dataschema.

  • data.
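
A schema author could check for collisions before running the code generator with something like the following sketch. The find_reserved_names() helper is hypothetical, and it only inspects the top-level properties of each event, since this document does not say whether nested object properties are also restricted.

```python
# Reserved names copied from the list above.
RESERVED_FIELD_NAMES = frozenset({
    "id", "_id", "session", "s_session", "time", "ts_created",
    "specversion", "type", "s_type", "source", "s_source",
    "dataschema", "s_dataschema", "data",
})


def find_reserved_names(schema):
    """Return (event name, property name) pairs that use a reserved name.

    Assumption: only top-level event properties are checked.
    """
    collisions = []
    for event_name, event in schema.get("events", {}).items():
        for prop_name in event.get("properties", {}):
            if prop_name in RESERVED_FIELD_NAMES:
                collisions.append((event_name, prop_name))
    return collisions


# Illustrative schema fragment with one offending property.
example = {
    "events": {
        "startup": {
            "properties": {
                "startupTime": {"type": "uint64"},
                "time": {"type": "uint64"},  # collides with a reserved name
            }
        }
    }
}
print(find_reserved_names(example))
```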

The structure of a simplified schema in python dictionary format is demonstrated in the following example:

{
    "name": "example.structuredlog",
    "version": "1.1",
    "namespace": "com.nvidia.carbonite.example.structuredlog",
    "description": "Example schema to demonstrate how to use IStructuredLog."
                   " This sends a dummy startup time event.",
    "flags": [ "fSchemaFlagAnonymizeEvents" ],
    "events": {
        "startup": {
            "privacy": {
                "category": "performance",
                "description": "example event, so categorization is arbitrary"
            },
            "description": "Marks when an app was started and how long startup took."
                           " This data is gathered to demonstrate the telemetry system.",
            "flags": [],
            "properties": {
                "startupTime": {
                    "type": "uint64",
                    "description": "time to startup in nanoseconds"
                },
                "registerTime": {
                    "type": "uint64",
                    "description": "time to register this schema in nanoseconds"
                },
                "exampleString": {
                    "type": "string",
                    "description": "an example string parameter"
                },
                "example.namespace": {
                    "type": "string",
                    "description": "a test value."
                },
                "Resources.list": {
                    "type": "object",
                    "description": "test object",
                    "properties": {
                        "app.namespace": {
                            "type": "string",
                            "description": "namespace value"
                        },
                        "app.name": {
                            "type": "string",
                            "description": "app name"
                        },
                        "app.instance.id": {
                            "type": "string",
                            "description": "instance ID"
                        },
                        "app.version": {
                            "type": "string",
                            "description": "app version"
                        }
                    }
                }
            }
        },
        "standardStreamOut": {
            "privacy": {
                "category": "performance",
                "description": "example event, so categorization is arbitrary"
            },
            "description": "Demonstrates how to output an event just to stdout.",
            "flags": [ "fEventFlagOutputToStdout", "fEventFlagSkipLog" ],
            "properties": {
                "exampleString": {
                    "type": "string",
                    "description": "an arbitrary test string."
                }
            }
        },
        # Multiple events can be added to the events dictionary.
        # Comments and trailing commas are fine in this format.
    }
}

The following elements can be placed at the top level of the schema. Additional elements will be copied directly into the JSON schema.

  • name: The name of this schema. The output JSON schema will have this value stored in #/schemaMeta/clientName. This should be as unique as possible among other schemas, but should also not be too long or overly descriptive. This name will be used to generate both the helper class names, and the log file names. This property is required and may not be an empty string. Note that this name is not schemaName so that it maintains compatibility with NvTelemetry schemas.

  • version: The version of this schema. The output JSON schema will have this value stored in #/schemaMeta/schemaVersion. This must be expressed as a <major>.<minor> version number in a string value. This version number is used to differentiate different versions of a schema when attempting to match and validate a message. This may not be an empty string.

  • description: Documentation for this schema. The output JSON schema will have this value stored in #/schemaMeta/description. Documentation is required to describe the usage of this schema for the legal approval process.

  • namespace: This specifies the common prefix that will be added to all event names in the schema. The output JSON schema will have this value stored in #/schemaMeta/eventPrefix. For example, the startup event in the example schema becomes com.nvidia.carbonite.example.structuredlog.startup. This may not be an empty string.

  • oldEventsThreshold: This specifies an ‘old events threshold’ value at the schema level. This is given in days relative to the current time. If set to 0, the ‘old events threshold’ will be disabled. If a non-zero value is specified, it will represent the number of days before the current transmission time where messages start to be considered old. This can be overridden if an ‘old events threshold’ is set at the global level (i.e. with the setting "/telemetry/oldEventsThreshold") with a smaller number of days. The smallest non-zero number of days will always be used from among the various levels. This defaults to 0 days (i.e. checking for old events is disabled, but this can still be overridden at a higher level).

  • flags: Array of flags that alter the behavior of this schema. The output JSON schema will have this value stored in #/schemaMeta/omniverseFlags. Some of these flags are passed to omni::structuredlog::IStructuredLog::allocSchema() and others just affect the behavior of the generated code. Other flags are only consumed by the telemetry transmitter app. All flags specified at the schema level affect all events in the schema.

    Allowed flags are:

    • fSchemaFlagKeepLogOpen: Passes omni::structuredlog::fSchemaFlagKeepLogOpen to omni::structuredlog::IStructuredLog::allocSchema().

    • fSchemaFlagPseudonymizeEvents: All events generated on this schema should be pseudonymized. This step is done during transmission, so events in the log will show up with the user name. This will replace the user ID in the event’s ‘source’ property with a hash of the current global user ID string. This hash will be updated if the global user ID is ever changed. This allows the events for the user to still be connected together, even between sessions. However, it does greatly reduce the possibility of matching the activity to a specific user’s account. This flag will be overridden by fSchemaFlagAnonymizeEvents if both are present. This flag is only used by the telemetry transmitter; the user ID will still be present in the logs on the local machine.

    • fSchemaFlagAnonymizeEvents: All events generated on this schema should be anonymized. This step is done during transmission, so events in the log will show up with the user name. This will replace the user ID in the event’s ‘source’ property with a random number. This random user ID will remain constant for the duration of the process so that all events generated in the session can still be connected together. Any changes to the global user ID through IStructuredLogSettings::setUserId() will simply be ignored for this schema. The same random user ID will be used for all schemas that use this flag in the process. This flag will take precedence over fSchemaFlagPseudonymizeEvents if both are present. This flag is only used by the telemetry transmitter; the user ID will still be present in the logs on the local machine.

    • fSchemaFlagNoLogging: This flag disables all logging in the generated code. This flag is only needed when hooking up an omni::log::ILog consumer to structured logging. An omni::log::ILog consumer is already provided in omni::structuredlog::IStructuredLogFromILog.

    • fSchemaFlagLogWithProcessId: Passes omni::structuredlog::fSchemaFlagLogWithProcessId to omni::structuredlog::IStructuredLog::allocSchema().

    • fSchemaFlagIgnoreOldEvents: This flag causes the transmitter to discard messages that are older than the current ‘old events’ threshold even if they may validate against a schema. The threshold can be set in one of three ways:

      • On a global level by adding the command line or settings file setting "/telemetry/oldEventsThreshold".

      • On a per-schema level by adding the #/oldEventsThreshold property to the schema.

      • On a per-event level by adding the #/events/<eventName>/oldEventsThreshold property to an event.

      This is useful if old data isn’t interesting for analysis or if transmitting an old event would possibly violate a data retention policy. If this flag is specified but no ‘old events threshold’ has been set, it will be ignored. If this flag is not specified and an old events threshold has been set, the default behaviour will be to anonymize the old event.

    • fSchemaFlagPseudonymizeOldEvents: When this flag is used, the fSchemaFlagIgnoreOldEvents flag is not used, and an ‘old events threshold’ has been set (see above for how to set it), events that are considered ‘old’ will be pseudonymized instead of anonymized. This behaviour will be overridden if the fSchemaFlagIgnoreOldEvents flag is used.

    • fSchemaFlagUseObjectPointer: When this flag is used, the generated code will use C++ pointers to objects as parameters to the various ‘send’ functions for all events in the schema instead of the default reference to the object. This flag will override any similar per-event flags that are used in the schema.

    • fSchemaFlagOutputToStdout: This flag indicates that all events in the schema should be output to the stdout file. If the fSchemaFlagSkipLog flag is not also used, this output will be in addition to the normal output to the schema’s log file. If the fSchemaFlagSkipLog flag is used, the normal log file will not be written to. The default behavior is to only write the event to the schema’s log file. This can be combined with the fSchemaFlagOutputToStderr and fSchemaFlagSkipLog flags.

    • fSchemaFlagOutputToStderr: This flag indicates that all events in the schema should be output to the stderr file. If the fSchemaFlagSkipLog flag is not also used, this output will be in addition to the normal output to the schema’s log file. If the fSchemaFlagSkipLog flag is used, the normal log file will not be written to. The default behavior is to only write the event to the schema’s log file. This can be combined with the fSchemaFlagOutputToStdout and fSchemaFlagSkipLog flags.

    • fSchemaFlagSkipLog: This flag indicates that none of the events in this schema should be output to the schema’s specified log file. This flag is intended to be used in combination with the fEventFlagOutputToStdout and fEventFlagOutputToStderr flags to control which destination(s) each event message would be written to. The default behavior is to write each event message to the schema’s specified log file. Note that if this flag is used and neither the fEventFlagOutputToStderr nor fEventFlagOutputToStdout flags are used, it is effectively the same as disabling the event since there is no set destination for the message.

  • events: The table of structured log events. Each entry in this table defines one structured log event that can be emitted from this schema.

    Each event in this table will be inserted into #/definitions/events/ in the output JSON schema.

    Each event has the following properties:

    • properties: This is a table describing the data to be sent. The ordering of the properties in the table will be the same as the ordering of the parameters to the generated functions.

      This table contains the following properties:

      • type: The data type of the field in the event. The structured log system produces JSON, but the data types are always statically typed to minimize the performance cost of emitting an event and to make it easier to process the data. The type field is mandatory unless const or enum is specified.

        The simplified schema format specifies a custom set of data types to allow maximum flexibility with the C++ code.

        The following types are supported. For any of these types, except binary, you can append a [] to the type name (e.g. int32[]) to make the field an array.

        • bool: boolean value. bool in C++. Can be true or false. Note that True and False are capitalized in python dictionary format.

        • int32: 32 bit signed integer. int32_t in C++.

        • uint32: 32 bit unsigned integer. uint32_t in C++.

        • int64: 64 bit signed integer. int64_t in C++. Note that emitting 64 bit numbers may cause interoperability issues with the JSON library in Javascript, since it stores integers as doubles; if you need to use Javascript in your pipeline, you should use a JSON library with BigInt support.

        • uint64: 64 bit unsigned integer. uint64_t in C++.

        • float32: 32 bit float. float in C++.

        • float64: 64 bit float. double in C++.

        • string: A string. The C++ parameter type is omni::structuredlog::StringView, but the data stored in the queue is a const char *. Any string property must also contain an examples array. This is to provide clear documentation of real-world values that the string field could potentially contain. There must be at least two examples provided in the examples array. If only a single real-world example is possible, the use of the property should likely be re-evaluated.

        • binary: A base64 encoded string. const uint8_t * in C++. The binary blob that’s passed in is converted to base64 before being written to the log.

        • object: A hierarchical data structure. An object type must have a properties table that defines its layout (this is identical in structure to the properties table of an event, which is currently being described). This allows you to define hierarchical data. It is recommended to avoid having data hierarchy for most types of data analysis.

      • const: A constant value for a field. These values will be inserted into the event when it is written into the log, but they will not be parameters to the event. Constant values do not need a type field, since their type can be deduced from their data. Constant fields cannot be an object, since that functionality has not been added to the code generator yet. You can specify an object with only constant fields if you need a constant object in your event.

                        "two": {
                            "const": 32,
                            "description": "Description of two."
                        },
        
      • enum: This switches the parameter to operate as an enum which selects from a list of values. For the example shown, the parameter would be an enum that has 3 possible values that correspond to "a", "b" or "c". The value written to the log will be one of "a", "b" or "c". Note that list entries may not differ in type; mixing strings with integers, for example, will result in the code generator rejecting your schema. Mixed numeric literals will also be switched to a single type with C++ style promotions.

                        "one": {
                            "enum": ["a", "b", "c"],
                            "description": "Description of one."
                        },
        
    • privacy: This is a table that describes the information disclosed by this telemetry event. This is required as part of the legal review process. This table is stored in #/definitions/events/<eventName>/eventMeta/privacy in the output JSON schema.

      This contains the following fields:

      • category: The categorization of privacy for this event. This must be one of:

        • performance

        • personalization

        • usage

      • description: An explanation of why this event meets the definition of the specific category.

    • description: A description of the event. This should be an overview of what data is being collected, when it is being collected and the usage of that data. Descriptions not only provide documentation in the schema itself, but are also used to document the generated C++ code. Having the schema well documented helps ensure that the generated C++ code is readable and helpful to all developers, not just the schema writer.

      This tag is stored in #/definitions/events/<eventName>/description in the output JSON schema.

    • oldEventsThreshold: This specifies an ‘old events threshold’ value at the event level. This is given in days relative to the current time. If set to 0, the ‘old events threshold’ will be disabled unless overridden by a higher level setting. If a non-zero value is specified, it will represent the number of days before the current transmission time where messages start to be considered old. This can be overridden if an ‘old events threshold’ is set at the global level (i.e. with the setting "/telemetry/oldEventsThreshold") or at the schema level (i.e. with the property #/oldEventsThreshold) with a smaller number of days. The smallest non-zero number of days will always be used from among the various levels. This defaults to 0 days (i.e. checking for old events is disabled, but this can still be overridden at a higher level).

    • flags: Array of flags that alter the behavior of this event. This array is stored in #/definitions/events/<eventName>/eventMeta/omniverseFlags in the output JSON schema. Some of these flags are passed to omni::structuredlog::IStructuredLog::commitSchema() and others just affect the behavior of the generated code.

      Allowed flags are:

      • fEventFlagUseLocalLog: Adds omni::structuredlog::fEventFlagUseLocalLog to the event list that’s passed to omni::structuredlog::IStructuredLog::commitSchema().

      • fEventFlagCriticalEvent: Adds omni::structuredlog::fEventFlagCriticalEvent to the event list that’s passed to omni::structuredlog::IStructuredLog::commitSchema().

      • fEventFlagPseudonymize: Marks that the event should be pseudonymized. This will replace the user ID in the event’s ‘source’ property with a hash of the current global user ID string before transmission. This hash will be updated if the global user ID is ever changed. This allows the events for the user to still be connected together, even between sessions. However, it does greatly reduce the possibility of matching the activity to a specific user’s account. This flag will be overridden by fEventFlagAnonymize if both are present. This flag will also be overridden if the schema’s flags used fSchemaFlagAnonymizeEvents. This flag will have no effect if the schema’s flags used fSchemaFlagPseudonymizeEvents. This flag is only used by the telemetry transmitter; the user ID will still be present in the logs on the local machine.

      • fEventFlagAnonymize: Marks that the event should be anonymized. This will replace the user ID in the event’s ‘source’ property with a random number. This random user ID will remain constant for the duration of the session so that all events generated in the session can still be connected together. This random ID will never match the ID that would be used for the pseudonymized ID for that same user. Any changes to the global user ID through IStructuredLogSettings::setUserId() will simply be ignored for this schema. The same random user ID will be used for all schemas that use this flag in the process. This flag will take precedence over fEventFlagPseudonymize if both are present. This flag will also take precedence over fSchemaFlagPseudonymizeEvents in the schema’s flags. This flag is only used by the telemetry transmitter; the user ID will still be present in the logs on the local machine.

      • fEventFlagExplicitFlags: This will expose an omni::structuredlog::AllocFlags parameter to the telemetry event macros. These flags are currently only needed by schemas that are part of the telemetry core.

      • fEventFlagIgnoreOldEvents: This flag causes the transmitter to discard messages that are older than the current ‘old events’ threshold even if they may validate against a schema. The threshold can be set in one of three ways:

        • On a global level by adding the command line or settings file setting "/telemetry/oldEventsThreshold".

        • On a per-schema level by adding the #/oldEventsThreshold property to the schema.

        • On a per-event level by adding the #/events/<eventName>/oldEventsThreshold property to an event.

        This is useful if old data isn’t interesting for analysis or if transmitting an old event would possibly violate a data retention policy. If this flag is specified but no ‘old events threshold’ has been set, it will be ignored. If this flag is not specified and an old events threshold has been set, the default behaviour will be to anonymize the old event.

      • fEventFlagPseudonymizeOldEvents: When this flag is used, the fSchemaFlagIgnoreOldEvents flag is not used, and an ‘old events threshold’ has been set (see above for how to set it), events that are considered ‘old’ will be pseudonymized instead of anonymized. This behaviour will be overridden if the fSchemaFlagIgnoreOldEvents flag is used.

      • fEventFlagUseObjectPointer: When this flag is used, any object properties that are passed as parameters to the ‘send event’ functions for this event in the generated C++ code will use pointers instead of the default references. This behaviour will always be overridden by the fSchemaFlagUseObjectPointer flag if it is used at the schema level.

      • fEventFlagOutputToStdout: This flag indicates that this event should be output to the stdout file. If the fEventFlagSkipLog flag is not also used, this output will be in addition to the normal output to the schema’s log file. If the fEventFlagSkipLog flag is used, the normal log file will not be written to. The default behavior is to only write the event to the schema’s log file. This can be combined with the fSchemaFlagOutputToStderr and fSchemaFlagSkipLog flags.

      • fEventFlagOutputToStderr: This flag indicates that this event should be output to the stderr file. If the fEventFlagSkipLog flag is not also used, this output will be in addition to the normal output to the schema’s log file. If the fEventFlagSkipLog flag is used, the normal log file will not be written to. The default behavior is to only write the event to the schema’s log file. This can be combined with the fSchemaFlagOutputToStdout and fSchemaFlagSkipLog flags.

      • fEventFlagSkipLog: This flag indicates that this event should not be output to the schema’s specified log file. This flag is intended to be used in combination with the fEventFlagOutputToStdout and fEventFlagOutputToStderr flags to control which destination(s) each event log message would be written to. The default behavior is to write each event message to the schema’s specified log file. Note that if this flag is used and neither the fEventFlagOutputToStderr nor fEventFlagOutputToStdout flags are used, it is effectively the same as disabling the event since there is no set destination for the message.
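
The interaction between the three ‘old events threshold’ levels and the old-event flags described above can be summarized with a short sketch. It is not the transmitter's implementation: the function names are hypothetical, schema-level and event-level flags are assumed to be passed in as one combined list, and an event whose age equals the threshold is assumed to count as old.

```python
IGNORE_FLAGS = {"fSchemaFlagIgnoreOldEvents", "fEventFlagIgnoreOldEvents"}
PSEUDONYMIZE_FLAGS = {
    "fSchemaFlagPseudonymizeOldEvents",
    "fEventFlagPseudonymizeOldEvents",
}


def effective_threshold(global_days=0, schema_days=0, event_days=0):
    """Return the smallest non-zero threshold in days, or 0 if none is set."""
    non_zero = [d for d in (global_days, schema_days, event_days) if d > 0]
    return min(non_zero) if non_zero else 0


def old_event_action(age_days, flags, threshold_days):
    """Return how an event of the given age would be treated.

    The result is "normal", "discard", "pseudonymize" or "anonymize".
    """
    # Assumption: an event exactly at the threshold counts as old.
    if threshold_days == 0 or age_days < threshold_days:
        return "normal"
    flags = set(flags)
    if flags & IGNORE_FLAGS:
        return "discard"  # ignoring old events overrides pseudonymizing them
    if flags & PSEUDONYMIZE_FLAGS:
        return "pseudonymize"
    return "anonymize"  # default when a threshold is set but no flag is used


threshold = effective_threshold(global_days=30, schema_days=0, event_days=7)
print(threshold)
print(old_event_action(10, ["fEventFlagPseudonymizeOldEvents"], threshold))
```

In this example the event-level value of 7 days wins because it is the smallest non-zero value, so a 10 day old event is treated as old.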

JSON Schema format

This is the JSON schema that gets baked from the simplified schema shown above:

{
    "generated": "This was generated from example.structuredlog.schema.",
    "anyOf": [
        {
            "$ref": "#/definitions/events/com.nvidia.carbonite.example.structuredlog.startup"
        },
        {
            "$ref": "#/definitions/events/com.nvidia.carbonite.example.structuredlog.standardStreamOut"
        }
    ],
    "$schema": "http://json-schema.org/draft-07/schema#",
    "schemaMeta": {
        "clientName": "example.structuredlog",
        "schemaVersion": "1.1",
        "eventPrefix": "com.nvidia.carbonite.example.structuredlog",
        "definitionVersion": "1.0",
        "omniverseFlags": [
            "fSchemaFlagAnonymizeEvents"
        ],
        "description": "Example schema to demonstrate how to use IStructuredLog. This sends a dummy startup time event."
    },
    "definitions": {
        "events": {
            "com.nvidia.carbonite.example.structuredlog.startup": {
                "eventMeta": {
                    "service": "telemetry",
                    "privacy": {
                        "category": "performance",
                        "description": "example event, so categorization is arbitrary"
                    },
                    "omniverseFlags": []
                },
                "type": "object",
                "additionalProperties": false,
                "required": [
                    "startupTime",
                    "registerTime",
                    "exampleString",
                    "example.namespace",
                    "Resources.list"
                ],
                "properties": {
                    "startupTime": {
                        "type": "integer",
                        "omniverseFormat": "uint64",
                        "description": "time to startup in nanoseconds"
                    },
                    "registerTime": {
                        "type": "integer",
                        "omniverseFormat": "uint64",
                        "description": "time to register this schema in nanoseconds"
                    },
                    "exampleString": {
                        "type": "string",
                        "description": "an example string parameter"
                    },
                    "example.namespace": {
                        "type": "string",
                        "description": "a test value."
                    },
                    "Resources.list": {
                        "type": "object",
                        "properties": {
                            "app.namespace": {
                                "type": "string",
                                "description": "namespace value"
                            },
                            "app.name": {
                                "type": "string",
                                "description": "app name"
                            },
                            "app.instance.id": {
                                "type": "string",
                                "description": "instance ID"
                            },
                            "app.version": {
                                "type": "string",
                                "description": "app version"
                            }
                        },
                        "required": [
                            "app.namespace",
                            "app.name",
                            "app.instance.id",
                            "app.version"
                        ],
                        "description": "test object"
                    }
                },
                "description": "Marks when an app was started and how long startup took. This data is gathered to demonstrate the telemetry system."
            },
            "com.nvidia.carbonite.example.structuredlog.standardStreamOut": {
                "eventMeta": {
                    "service": "telemetry",
                    "privacy": {
                        "category": "performance",
                        "description": "example event, so categorization is arbitrary"
                    },
                    "omniverseFlags": [
                        "fEventFlagOutputToStdout",
                        "fEventFlagSkipLog"
                    ]
                },
                "type": "object",
                "additionalProperties": false,
                "required": [
                    "exampleString"
                ],
                "properties": {
                    "exampleString": {
                        "type": "string",
                        "description": "an arbitrary test string."
                    }
                },
                "description": "Demonstrates how to output an event just to stdout."
            }
        }
    },
    "description": "Example schema to demonstrate how to use IStructuredLog. This sends a dummy startup time event."
}
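
The relationship between the top-level simplified fields and the JSON output above can be sketched as follows. This is not the code generator itself: the function names are hypothetical, and fields that appear in the output but are not described in this document (such as definitionVersion and service) are left out.

```python
def build_schema_meta(simplified):
    """Map top-level simplified fields to the #/schemaMeta object."""
    return {
        "clientName": simplified["name"],
        "schemaVersion": simplified["version"],
        "eventPrefix": simplified["namespace"],
        "omniverseFlags": list(simplified.get("flags", [])),
        "description": simplified["description"],
    }


def full_event_names(simplified):
    """Prefix each short event name with the schema's namespace."""
    prefix = simplified["namespace"]
    return [f"{prefix}.{name}" for name in simplified["events"]]


def build_any_of(simplified):
    """Build the top-level array that references every event definition."""
    return [
        {"$ref": f"#/definitions/events/{name}"}
        for name in full_event_names(simplified)
    ]


# Trimmed copy of the simplified example schema (event bodies omitted).
simplified = {
    "name": "example.structuredlog",
    "version": "1.1",
    "namespace": "com.nvidia.carbonite.example.structuredlog",
    "description": "Example schema.",
    "flags": ["fSchemaFlagAnonymizeEvents"],
    "events": {"startup": {}, "standardStreamOut": {}},
}
print(build_schema_meta(simplified)["eventPrefix"])
print(build_any_of(simplified))
```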

This schema includes all of the important components that are required for Omniverse telemetry, as well as some optional features. All parts of the JSON schema language could theoretically work, but in order for a schema to be considered valid for a messaging schema, it must meet the following criteria:

  • when used for validation, a message must match exactly one event described in the schema. This can be done by having a oneOf or anyOf array at the top level of the schema. See Exactly One Match below for more information on this array.

  • it must provide a schemaMeta object at the top level that specifies some extra information about the schema. This information is used in both generating the helper class with omni.structuredlog and in matching a schema to messages during validation and transmission of messages.

  • there must be a #/definitions/events/ object at the top level that describes each of the events that are part of the schema. See below for more information on this object. The JSON schema language does allow for the events to be described directly in the oneOf array, but for now, the omni.structuredlog tool is expecting to find the #/definitions/events/ object.

  • everything described in the schema’s #/definitions/events/ object must be of type object. All properties described within each event object’s properties object must also be typed using the type property. All JSON schema primitive types are supported. However, some combinations of arrays and objects are not supported. For example, arrays with mixed member types are not allowed, and arrays of arrays are not allowed.

  • all event objects in the #/definitions/events/ object for the schema must share a common prefix in the name. This prefix is defined in the #/schemaMeta/eventPrefix property.

  • each object in the #/definitions/events/ object must have its additionalProperties property set to false. This is a requirement from Legal to ensure that extra unapproved information cannot be piggybacked on an existing event. If additional properties are found in a message, it will simply fail to validate against a schema and be rejected.

  • Each properties tag must be followed by a required tag listing all properties.

If any of these conditions are not met in the schema, the omni.structuredlog code generator will fail out when processing it.
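
Several of these criteria are mechanical enough to check with a few lines of code before running the generator. The following is a rough sketch with hypothetical function names; it covers only the structural criteria listed above and does not reproduce the checks made by the omni.structuredlog code generator. The com.example.demo names are made up for the demonstration.

```python
def check_properties(path, node, problems):
    """Check that 'required' lists every property and that each one is typed."""
    properties = node.get("properties", {})
    if sorted(node.get("required", [])) != sorted(properties):
        problems.append(f"{path}: 'required' must list all properties")
    for name, prop in properties.items():
        if "type" not in prop:
            problems.append(f"{path}/{name}: missing 'type'")
        elif prop["type"] == "object":
            check_properties(f"{path}/{name}", prop, problems)


def check_schema(schema):
    """Return a list of problems; an empty list means none were detected."""
    problems = []
    if "oneOf" not in schema and "anyOf" not in schema:
        problems.append("missing top-level oneOf or anyOf array")
    prefix = schema.get("schemaMeta", {}).get("eventPrefix")
    if not prefix:
        problems.append("missing #/schemaMeta/eventPrefix")
    events = schema.get("definitions", {}).get("events")
    if not isinstance(events, dict):
        problems.append("missing #/definitions/events object")
        return problems
    for name, event in events.items():
        if prefix and not name.startswith(prefix):
            problems.append(f"{name}: does not start with the event prefix")
        if event.get("type") != "object":
            problems.append(f"{name}: event must be of type object")
        if event.get("additionalProperties") is not False:
            problems.append(f"{name}: additionalProperties must be false")
        check_properties(name, event, problems)
    return problems


# Hypothetical minimal schema used only to exercise the checks.
minimal = {
    "anyOf": [{"$ref": "#/definitions/events/com.example.demo.ping"}],
    "schemaMeta": {"eventPrefix": "com.example.demo"},
    "definitions": {"events": {"com.example.demo.ping": {
        "type": "object",
        "additionalProperties": False,
        "required": ["count"],
        "properties": {"count": {"type": "integer"}},
    }}},
}
print(check_schema(minimal))
```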

Note that type doesn’t work the same way as in the simplified schemas. Type must be one of the JSON types boolean, integer, number, string, array or object. For C++ types, integer maps to int32_t and number maps to double. The omniverseFormat field is added in addition to type to specify that an alternate C++ type should be used. omniverseFormat can be set to uint32, int64 or uint64 when type is integer, float32 when type is number, and binary when type is string.
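As a rough illustration of the combinations listed above, the following sketch checks whether a single property description uses a permitted type and omniverseFormat pair. The helper name check_property_type and the table names are hypothetical; the allowed values are taken directly from the paragraph above.

```python
# JSON types accepted for the type property.
JSON_TYPES = {"boolean", "integer", "number", "string", "array", "object"}

# omniverseFormat values permitted for each JSON type, as listed above.
ALLOWED_FORMATS = {
    "integer": {"uint32", "int64", "uint64"},
    "number": {"float32"},
    "string": {"binary"},
}


def check_property_type(prop):
    """Return True if the property uses a valid type/omniverseFormat pair."""
    json_type = prop.get("type")
    if not isinstance(json_type, str) or json_type not in JSON_TYPES:
        return False
    fmt = prop.get("omniverseFormat")
    if fmt is None:
        # Without omniverseFormat the default C++ mapping applies.
        return True
    return fmt in ALLOWED_FORMATS.get(json_type, set())


print(check_property_type({"type": "integer", "omniverseFormat": "uint64"}))
print(check_property_type({"type": "string", "omniverseFormat": "float32"}))
```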

Schema Naming and Code Generation

When writing your schema file, keep in mind that all of the event and property names will show up in the generated C++ or python code, so you should try to use names that are valid C++ and python identifiers. In particular, you should avoid using reserved keywords, such as for and while. The code generator has some ability to replace invalid characters, such as ., but if you insert unusual characters, like ! or 🎂, the generated code will not compile.
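One way to catch problematic names early is a simple check like the sketch below. It is deliberately stricter than the code generator (it rejects names containing ".", which the generator can replace), the helper name is_safe_name is hypothetical, and the C++ keyword set is a small, non-exhaustive sample chosen for illustration.

```python
import keyword

# A small, non-exhaustive sample of C++ reserved keywords (assumption:
# extend this set to cover the full list for real use).
CPP_KEYWORDS = {"for", "while", "class", "int", "return", "namespace", "template"}


def is_safe_name(name):
    """Return True if name is a plain ASCII identifier and not a reserved keyword."""
    return (
        name.isascii()
        and name.isidentifier()
        and not keyword.iskeyword(name)
        and name not in CPP_KEYWORDS
    )


for candidate in ("startupTime", "for", "class", "done!", "cake🎂"):
    print(candidate, is_safe_name(candidate))
```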

When creating the JSON schema file, the same file naming conventions as any other header or source file should be used. Specifically, PascalCase (also called TitleCase) should be used. Further, the script bindings header file should be named to include the name of the scripting language it binds to. For example, for a sample schema, the following names should be used for the various schema-related files:

Filename                          Purpose

“FancyStuff.schema”               Python dictionary format schema file name.

“FancyStuff.json”                 JSON schema file name.

“FancyStuff.gen.h”                Generated C++ header file.

“FancyStuff.bindings.python.h”    Generated Python bindings header file.

Once a JSON schema file has been created, it needs to be added to the project build. See Generating C++ Code for instructions on generating code.

Exactly One Match

In order for a schema to be useful as a means of validating event messages, it needs to be able to verify that a given message matches the expected event type. If the message matches zero events, it is considered rejected by the schema. With a oneOf array, the message is considered validated by the schema when exactly one event matches. With an anyOf array, the message is considered validated by the schema when one or more events match.

anyOf is the preferred validation type, since it’s possible to have one event that is a subset of another even with the use of constant fields. The simplified schema format always generates an anyOf array for its events.
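The difference between the two array types can be sketched with plain predicate functions standing in for real JSON schema validation. Everything in this example (the validate helper, the two matcher functions and the "level" field) is hypothetical and only illustrates the counting rule; the second event is a subset of the first because it pins the field to a constant value.

```python
def validate(message, matchers, mode):
    """Apply the oneOf/anyOf counting rule to a list of event matchers."""
    matches = sum(1 for matcher in matchers if matcher(message))
    if mode == "oneOf":
        return matches == 1   # exactly one event must match
    if mode == "anyOf":
        return matches >= 1   # one or more events must match
    raise ValueError(f"unknown mode: {mode}")


def matches_log(message):
    # Hypothetical event: a single string field named "level".
    return set(message) == {"level"} and isinstance(message["level"], str)


def matches_error(message):
    # Hypothetical event: the same field, constrained to a constant value.
    return message == {"level": "error"}


matchers = [matches_log, matches_error]
message = {"level": "error"}  # satisfies both events

print(validate(message, matchers, "oneOf"))
print(validate(message, matchers, "anyOf"))
```

A message that satisfies both events is rejected under oneOf but accepted under anyOf, which is why overlapping events are only safe with anyOf.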

Changing a Schema After Release

When a schema and the events it can be used to generate have been included in a released product (i.e., used in a plugin that has been released to the public), that schema is considered released and must remain supported as it is for that particular version number for all future releases. Changes may still occur on the schema, but they must always be accompanied by a version number change.

Once a given version of a schema has been released, it must never be changed. If a functional change (i.e., something that affects the output of an event) needs to occur on a given schema, a new version of it (with a new version number) must be released instead. The only changes that are allowed without bumping the version number are modifications to the “documentation” fields in the schema. All other changes are considered functional and must include a version number change. This includes changes such as modifying schema or event flags, or adding or removing required fields.

Further, backward compatibility needs to be considered when modifying a schema. This means that the type of any given field may not change once it has been released. For example, if a field is defined to be a string in a schema, a future version of the schema may not change that field’s type to an object or number. Instead, a new field with the different type or an entirely new event should be introduced in the new version of the schema.
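A check along these lines could be used to spot type changes between two versions of an event before release. This is only a sketch: find_type_changes is a hypothetical helper, and it assumes each argument is the properties object of the same event taken from the old and new JSON schema.

```python
def find_type_changes(old_properties, new_properties):
    """Return {field: (old_type, new_type)} for fields whose type changed."""
    changes = {}
    for name, old_prop in old_properties.items():
        new_prop = new_properties.get(name)
        # Fields missing from the new version are not type changes; only
        # fields present in both versions are compared here.
        if new_prop is not None and new_prop.get("type") != old_prop.get("type"):
            changes[name] = (old_prop.get("type"), new_prop.get("type"))
    return changes


old = {"userName": {"type": "string"}, "count": {"type": "integer"}}
new = {"userName": {"type": "object"}, "count": {"type": "integer"}}

print(find_type_changes(old, new))
```

An empty result means no released field changed type; any entry points at a field that should instead be introduced under a new name or in a new event.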

To increase the version number of a schema when making a change, the following simple change is needed. The location of the change depends on whether a simplified schema or a JSON schema is in use:

  • simplified schema: simply increase the minor version of the #/version field of the schema by one.

  • JSON schema: simply increase the minor version of the #/schemaMeta/schemaVersion field of the schema by one.

For example, if the current version of the schema is “1.4”, this would become “1.5” after making the required changes to the rest of the schema. The version only needs to be increased once per release of a schema. A single version change may cover changes to multiple events or fields in the schema, as long as they are all released at the same time.
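The version bump itself is simple enough to express in a few lines. The sketch below assumes the version is always a "major.minor" string, as in the examples in this document; bump_minor is a hypothetical helper, not part of omni.structuredlog.

```python
def bump_minor(version):
    """Increase the minor part of a "major.minor" version string by one."""
    major, minor = version.split(".")
    # The minor part is compared numerically, so "1.9" becomes "1.10".
    return f"{major}.{int(minor) + 1}"


print(bump_minor("1.4"))
```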