Making the case for a standard Device Information Model
Standardize the model to differentiate on value
A standardized information model enables devices and applications to understand each other.
Background: Why do we need device drivers for operational technology (OT)?
In a previous article in this series, I explained how the concept of a device driver for IoT devices helps suppliers provide value-added solutions based on data instead of competing at the edge on undifferentiated data collection technology.
Here, I focus on the need for a Device Information Model to describe the ‘things’ that are connected to the internet.
Purpose of an Information Model
The concept of an Information Model as a way of giving structure to data is not new in the IT world and is used heavily in enterprise integration scenarios. Typically, the options are to define a common normalized data structure and map all heterogeneous data sources into it (i.e. build a canonical model), or to map data sources to destinations in an ad-hoc mapping layer. Each has advantages and disadvantages that have been explored in detail for decades.
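The canonical-model option can be sketched in a few lines. This is only an illustration: the vendor names, field names, and units below are invented for the example and do not come from any real product.

```python
from typing import Any, Callable, Dict

# Hypothetical canonical model: every source is mapped to these names and units.
CANONICAL_FIELDS = ("device_id", "temperature_c", "speed_rpm")


def from_vendor_a(raw: Dict[str, Any]) -> Dict[str, Any]:
    # Invented vendor A reports temperature in Fahrenheit under "tempF".
    return {
        "device_id": raw["serial"],
        "temperature_c": round((raw["tempF"] - 32) * 5 / 9, 2),
        "speed_rpm": raw["rpm"],
    }


def from_vendor_b(raw: Dict[str, Any]) -> Dict[str, Any]:
    # Invented vendor B reports speed in revolutions per second.
    return {
        "device_id": raw["id"],
        "temperature_c": raw["temp_celsius"],
        "speed_rpm": raw["rev_per_s"] * 60,
    }


# One mapper per source; adding a source means adding one entry here.
MAPPERS: Dict[str, Callable[[Dict[str, Any]], Dict[str, Any]]] = {
    "vendor_a": from_vendor_a,
    "vendor_b": from_vendor_b,
}


def to_canonical(source: str, raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map a vendor-specific record into the canonical model."""
    record = MAPPERS[source](raw)
    missing = [name for name in CANONICAL_FIELDS if name not in record]
    if missing:
        raise ValueError(f"mapper for {source} did not produce: {missing}")
    return record
```

With a canonical model, each new source needs one mapper; with ad-hoc mapping, each source/destination pair would need its own.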
This raises the question of why structure is needed specifically in the Industrial Internet-of-Things (IIoT) context. In simple terms, the purpose of IIoT can be summarized as collecting data from things (simple sensors or complex powertrains) and sending it securely to the cloud for storage and processing.
The value for companies in this space lies not so much in the raw data as in the inferences they can draw from it to provide value-added services, ranging from simple applications like asset management and health monitoring to condition monitoring and predictive maintenance solutions. The more data collected over time and the more contextual information supplied with it, the more value the analytical models can provide.
Cleansing the incoming data before it can actually be processed is said to account for around 60% of the overall time and processing power.
This cleansing is typically required due to:
- Unreliable sources (missing, estimated, or false data)
- Unknown or changing structure of the data, i.e. evolving hardware leads to changed data structures
- Semantic differences in data structures from different vendors
- Input normalization required by common ML models that are vendor-neutral or vendor-agnostic
A well-defined Information Model embedded in the device itself, or in its data aggregator and processor on the edge, eliminates or significantly reduces the need for additional cleansing in the cloud analytics pipeline, because the model can be looked up and the data interpreted semantically.
Advantages of an Information Model
As explained above, adherence to a well-publicized and structured model allows any data processing pipeline, on the edge or in the cloud, to validate the incoming data stream. It can then either filter out invalid data, thus preventing it from tarnishing the results, or flag the data with additional metadata that describes the quality of the measurement, e.g. that the data is extrapolated, interpreted, aggregated, averaged, or staged.
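A minimal sketch of such a validation step might look as follows. The model fields and the default "measured" flag are assumptions made for this example; the other quality flags are the ones named above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Hypothetical model: expected fields and their Python types.
MODEL = {"device_id": str, "temperature_c": float, "timestamp": float}

# Quality flags from the text, plus an assumed default of "measured".
KNOWN_QUALITY = {
    "measured", "extrapolated", "interpreted",
    "aggregated", "averaged", "staged",
}


@dataclass
class Result:
    valid: bool
    quality: str = "measured"
    errors: List[str] = field(default_factory=list)


def validate(record: Dict[str, Any]) -> Result:
    """Check a record against MODEL and pass its quality flag along."""
    errors: List[str] = []
    for name, expected in MODEL.items():
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], expected):
            errors.append(f"wrong type for {name}")
    quality = record.get("quality", "measured")
    if quality not in KNOWN_QUALITY:
        errors.append(f"unknown quality flag: {quality}")
    return Result(valid=not errors, quality=quality, errors=errors)
```

A pipeline could drop records where `valid` is false and keep the `quality` flag as metadata on the rest, so that downstream analytics can weigh an averaged value differently from a measured one.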
The information model may also contain the manufacturer's technical specification, which allows a model to detect violations of specified limits. Imagine a maximum ambient operating temperature set by the vendor which, when exceeded, could void the warranty for the device. This information cannot be inferred from a raw telemetry data stream.
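As a rough illustration, a limit check against a specification carried in the model could look like this. The field name and the limit values are made up for the example and do not describe any real device.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Limit:
    minimum: float
    maximum: float


# Hypothetical manufacturer specification stored in the information model.
SPEC: Dict[str, Limit] = {
    "ambient_temperature_c": Limit(minimum=-10.0, maximum=50.0),
}


def limit_violations(reading: Dict[str, float]) -> List[str]:
    """Return the names of all fields that fall outside their specified limits."""
    violations = []
    for name, limit in SPEC.items():
        value = reading.get(name)
        if value is None:
            continue  # field not reported in this reading
        if value < limit.minimum or value > limit.maximum:
            violations.append(name)
    return violations
```

The point is that the limits live next to the data definition, so the check needs no vendor-specific knowledge in the pipeline itself.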
The application of a data quality processing model can further detect and flag abnormal values as outliers, stale, or even fake.
Besides the data quality advantages of an Information Model (IM), there are operational advantages such as compatibility between similar devices. Imagine a universally agreed-upon information model of a drive. The device driver together with the IM enables an efficient plug-and-play model for any drive type from any vendor.
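The plug-and-play idea can be sketched with a shared interface that each vendor's driver implements. Everything here is hypothetical: the class names, the raw register names, and the two fields chosen for the common drive model.

```python
import math
from typing import Dict, Protocol


class DriveModel(Protocol):
    """Assumed common drive model: speed in rpm and torque in Nm."""

    def read(self) -> Dict[str, float]: ...


class VendorXDrive:
    # Invented vendor X exposes speed as "n" (rpm) and torque as "M" (Nm).
    def __init__(self, raw: Dict[str, float]) -> None:
        self._raw = raw

    def read(self) -> Dict[str, float]:
        return {"speed_rpm": self._raw["n"], "torque_nm": self._raw["M"]}


class VendorYDrive:
    # Invented vendor Y exposes speed in revolutions per second.
    def __init__(self, raw: Dict[str, float]) -> None:
        self._raw = raw

    def read(self) -> Dict[str, float]:
        return {
            "speed_rpm": self._raw["rps"] * 60,
            "torque_nm": self._raw["torque"],
        }


def shaft_power_kw(drive: DriveModel) -> float:
    """Application code written once against the common model."""
    data = drive.read()
    # Power = torque * angular velocity; rpm is converted to rad/s via 2*pi/60.
    return data["torque_nm"] * data["speed_rpm"] * 2 * math.pi / 60 / 1000
```

The application function never sees vendor-specific names or units, which is what makes swapping one drive for another a driver change rather than an application change.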
The information model can be enriched with multiple aspects of the same device to create a Digital Twin (topic of another post), thereby combining access to all-around information about a specific thing, its type, and, more importantly, its context. Typically, devices don’t live in isolation; they are part of a system or installation. Their function may only be a side input to the larger combined system, e.g. a powertrain as a combination of a motor and a drive provides more useful information to the consumer than the sum of the individual parts. Looking at a device without this context is like reading your car’s gauges without knowing the exact specification and role of each part.
The concept of different aspects of the same thing is further detailed and explored in the context of Industry 4.0 and its Asset Administration Shell (AAS).
Lastly, the information model does not only describe the data that the device emits; it also contains information on how to talk to the device and manage its entire lifecycle remotely. As such, the use of an IM is not limited to devices. It can also model other entities, such as buildings and people, with operations and attributes to manage them. This is the full potential of the Thing in IoT.
Approaches by various vendors
There are many ways to define an Information Model, e.g.
- Simple key/value pair with namespace convention (e.g. Ayla Networks)
- OPC UA (OPC Foundation, common industry standard)
- Schema-based, e.g. JSON Schema (ABB et al.)
In the end, it boils down to the use cases and requirements of the customers. Often, a static model defined once can suffice for a vertical solution when replicability is not a concern. In larger enterprise environments with heterogeneous components, vendors, and partners, a common and unified model quickly becomes complex but necessary, in which case a schema-based approach would be preferred.
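To make the difference between the key/value and schema-based approaches listed above more concrete, the sketch below turns flat, namespaced keys into the nested structure that a schema would typically describe. The dotted naming convention is an assumption for this example and is not taken from any particular vendor's format.

```python
from typing import Any, Dict


def unflatten(flat: Dict[str, Any], sep: str = ".") -> Dict[str, Any]:
    """Turn {'drive.speed_rpm': 1500} into {'drive': {'speed_rpm': 1500}}."""
    nested: Dict[str, Any] = {}
    for key, value in flat.items():
        parts = key.split(sep)
        node = nested
        # Walk or create the intermediate namespaces.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                raise ValueError(f"{key!r} conflicts with an existing value")
        leaf = parts[-1]
        if isinstance(node.get(leaf), dict):
            raise ValueError(f"{key!r} conflicts with an existing namespace")
        node[leaf] = value
    return nested
```

The flat form is easy to emit from a constrained device, while the nested form makes structure explicit and can be checked against a schema; which one fits depends on the use case, as noted above.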
Great post, Tim Diekmann. This is indeed an important problem to solve in order to advance IoT in industry. At Omnio.net we are already working together with ABB through SynerLeap to solve it. To date we have mapped thousands of industrial devices, including many from ABB. Main device categories are drives, switchgear, meters, etc.
Onedm.org
Nice post, Tim Diekmann. Did you check out Eclipse Vorto as tooling to define Information Models? https://www.eclipse.org/vorto/