Omniverse Telemetry Data Collection
Data Types, Collection, and Storage
Telemetry data can take one of a few forms:
metrics: a metric is an aggregated piece of information that is sent periodically. This usually takes the form of a counter, timer, or other statistic. Examples include how many times the user clicked the ‘save’ button in the last five minutes, or how often GUI buttons are used compared with keyboard shortcuts. A metric value is usually collected internally within a component and then sent as a telemetry event periodically (e.g. once per second, minute, or hour). Metrics are preferred over events in situations where each independent occurrence is not important, but the overall usage of something is. A metric has the benefit of requiring significantly less storage and processing time on the data servers.
events: an event is a single occurrence of something of interest. It is accompanied by a timestamp and other information relating to either the occurrence itself or the state of other systems at the time of the occurrence. For example, when a large load operation finishes it might be useful to know how much free RAM or VRAM is left. Each event arrives at the data servers with all of its information and can be fully differentiated from all other occurrences of the same type. This has the benefit of carrying more information with each event, making it possible to analyze each occurrence and to follow a pattern of behaviour even down to a single user. It does, however, require significantly more storage space on the data servers.
logs: a log is a listing of local events that have occurred up to a certain point in the execution of an app. Logs are most frequently sent with crash reports. Carbonite and Omni logging messages can be redirected to structured logging events using the
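The difference between a metric and an event can be sketched in a few lines of Python. This is only an illustration of the two data shapes described above, not the Carbonite structured logging API: the `MetricAggregator` class, the `make_event` helper, and all field names are hypothetical.

```python
import time
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class MetricAggregator:
    """Hypothetical helper: counts occurrences locally, emits one summary per flush."""

    counts: Counter = field(default_factory=Counter)

    def record(self, name: str) -> None:
        # Individual occurrences are only counted, never stored.
        self.counts[name] += 1

    def flush(self) -> dict:
        # Called periodically (e.g. once per minute) to build a single message.
        payload = {
            "type": "metric",
            "timestamp": time.time(),
            "counts": dict(self.counts),
        }
        self.counts.clear()
        return payload


def make_event(name: str, **details) -> dict:
    """Hypothetical helper: one message per occurrence, with its own timestamp."""
    return {
        "type": "event",
        "name": name,
        "timestamp": time.time(),
        "details": details,
    }


aggregator = MetricAggregator()
for _ in range(3):
    aggregator.record("save_button")
aggregator.record("save_shortcut")

# One small message summarizes four occurrences.
metric_message = aggregator.flush()

# One message describes a single occurrence in detail.
event_message = make_event("load_finished", free_ram_mb=2048, free_vram_mb=512)
```

The metric message stays the same size no matter how many clicks occurred, while each event message carries its own timestamp and details. That is the storage trade-off described in the list above.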
Data points of interest can come in many forms. They range from something as simple as a signal generated when an app starts up successfully, to hardware information, to personal user information, to tracking behavioural habits of the user. Each of these could potentially be useful to analyze or store to improve products. Collecting this information can be relatively simple and quick. However, nothing can be done with that data unless it is approved by Legal first (see Legal Requirements for more information). The Carbonite structured log core and generated schema code are intended to be the main collection system for Omniverse apps.
Once the data has been collected and sent to NVIDIA’s data servers, it will need to be stored in some way. Fresh data that has just arrived will typically be stored directly for more detailed analysis. To more easily comply with data collection laws such as the GDPR, some stored event data will be aggregated after a certain amount of time (e.g. 25-30 days). This allows us to hold on to the bulk data for a stream of events almost indefinitely, since it has been fully anonymized and is no longer linked to any single user. Once aggregated, the events also take up much less space and require less time and effort to analyze. The downside, however, is that the per-occurrence detail of the original data is lost.
Any Omniverse component or app may emit an event message to its local log file(s). This does not require approval from Legal. However, once you find an event message that you would like sent to NVIDIA’s data servers for general analysis or storage, that event and all of its contents MUST be approved by Legal. Because NVIDIA is a global company, many aspects of each data collection need to be taken into account before an event can be approved. The most important is whether the data being collected is considered personally identifying information, or “PII”. For the most part, Legal models NVIDIA’s global data policy on the strictest data collection rules, such as the General Data Protection Regulation (GDPR) in the EU and the Brazilian General Data Protection Law (LGPD). Even if a developer considers some event data benign, they are not allowed to make that decision on their own.
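The aggregation step can be illustrated with a short sketch. It assumes a 30-day retention window (the text only gives 25-30 days as an example) and a hypothetical event layout with `name`, `user`, and `timestamp` fields. It shows the idea of replacing old per-user records with counts, not NVIDIA's actual storage pipeline.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Assumption: a 30-day window; the real retention period is set by policy.
RETENTION = timedelta(days=30)


def aggregate_old_events(events, now):
    """Split events into recent raw records and per-day counts for older ones."""
    recent = []
    aggregated = Counter()
    for event in events:
        if now - event["timestamp"] > RETENTION:
            # Keep only the event name and the day it happened;
            # the user field and exact time are discarded.
            day = event["timestamp"].date().isoformat()
            aggregated[(event["name"], day)] += 1
        else:
            recent.append(event)
    return recent, aggregated


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
events = [
    {"name": "load_finished", "user": "u1", "timestamp": now - timedelta(days=40)},
    {"name": "load_finished", "user": "u2", "timestamp": now - timedelta(days=40)},
    {"name": "load_finished", "user": "u1", "timestamp": now - timedelta(days=2)},
]

recent, aggregated = aggregate_old_events(events, now)
```

After this step the two older events survive only as a single count for that day, so it is no longer possible to tell which user produced them. This is the loss of detail described above.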
The process of getting a set of events (called a schema) approved by Legal for Omniverse will be managed by Halldor Fannar (email@example.com). He will help shepherd the set of events through the approvals process with Legal. Generally this process will require filling out a questionnaire on exactly what data is to be collected, answering questions about the intended purposes and sources of this data, and discussing and documenting exactly how the data will be stored and used. This same process applies to both new schemas and existing ones that are just being modified.
The end result of the legal approval process is that the schema is approved by Legal. Once this happens, the schema will be ‘installed’ in a location (online) that the message transmission system can download it from. Each installed schema will describe a set of one or more events that is allowed to be sent to the data servers. The message is matched to its schema through the dataschema property in the message’s top level block - the dataschema name in the message must match #/schemaMeta/schemaVersion separated by a - character. It is left up to the message transmission system to decide how to discover all the dataschema names for each installed schema.
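The matching step can be sketched as a simple lookup. This assumes the transmission system has already discovered the dataschema names of the installed schemas and holds them in a dictionary. The dictionary layout, the example name, and the function names are hypothetical; only the top-level dataschema property comes from the description above.

```python
# Assumption: installed schemas keyed by their already-discovered dataschema
# name. The name and the schema contents here are made up for illustration.
installed_schemas = {
    "example.app-1.0": {"events": ["load_finished", "startup"]},
}


def find_schema(message: dict, installed: dict):
    """Return the installed schema named by the message, or None."""
    name = message.get("dataschema")  # top-level property of the message
    if name is None:
        return None
    return installed.get(name)


def may_transmit(message: dict, installed: dict) -> bool:
    """A message is only eligible for sending if its schema is installed."""
    return find_schema(message, installed) is not None


approved = {"dataschema": "example.app-1.0", "data": {"event": "startup"}}
unknown = {"dataschema": "other.app-2.3", "data": {"event": "startup"}}
missing = {"data": {"event": "startup"}}

results = [may_transmit(m, installed_schemas) for m in (approved, unknown, missing)]
```

Messages whose dataschema name does not match any installed schema, or that carry no dataschema property at all, are treated as not approved in this sketch and would stay in the local log.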