IOT Stack: Opportunities and Consequences

Everyone talks up the IOT hype. But what does it take to make it real? And to make it real, what are the likely outcomes and consequences from all these opportunities?

To answer these questions, we need to look at what makes up the full IOT stack as an implementation and deployment model; from technology infrastructures to platforms, to components, devices, data, and business verticals building on top of these stack layers.

Here is a diagrammatic representation of the "full" (though still incomplete) IOT stack:

There are a lot of pieces to pull together to get IOT working end to end as smart connected "things," and the diagram is not even complete. But even in this "partial" stack, a number of unintended consequences are already evident.

Fragmented security: likely piecemeal, with security concerns trending toward vertical IOT silos

Ideally, security should be seamless throughout the stack, from application authentication and authorization down to each device's own security protocols. But given the non-uniformity of security protocols across devices, infrastructure, and data platforms (whether and how data are encrypted and obfuscated in flight as well as at rest), a piecemeal security and data-privacy implementation is highly likely.

This problem is not unique to IOT, whose complexities only exacerbate the exposure and risk. Even in the application world, the way authentication and authorization are often implemented today just to handle a password reset is frequently rife with exposure to denial-of-service attacks. Worse, an illegitimate "owner" who can complete a password reset on an account may then be automatically logged into that account. Now imagine plugging such security "holes" across a fragmented IOT security stack.
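As a rough illustration, a safer reset flow issues a single-purpose, time-limited token and never converts a successful reset into an automatic login. The sketch below uses only Python's standard library; all names are hypothetical:

```python
import hashlib
import hmac
import os
import time

SECRET = os.urandom(32)  # server-side signing key; in practice, from a secrets manager
TOKEN_TTL = 15 * 60      # 15-minute validity window

def issue_reset_token(user_id, now=None):
    """Issue a time-limited, single-purpose password-reset token."""
    ts = str(int(now if now is not None else time.time()))
    sig = hmac.new(SECRET, f"{user_id}:{ts}".encode(), hashlib.sha256).hexdigest()
    return f"{user_id}:{ts}:{sig}"

def verify_reset_token(token, now=None):
    """Return the user_id if the token is authentic and unexpired, else None.

    A valid token grants only the right to set a new password; it must
    never be exchanged for a live session (no auto-login).
    """
    try:
        user_id, ts, sig = token.rsplit(":", 2)
    except ValueError:
        return None
    expected = hmac.new(SECRET, f"{user_id}:{ts}".encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None
    age = (now if now is not None else time.time()) - int(ts)
    return user_id if 0 <= age <= TOKEN_TTL else None
```

A real implementation would also persist a one-time-use flag so a token cannot be replayed, and would rate-limit reset requests to blunt the denial-of-service angle.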

For example, in the device monitoring-and-control world, the use of NTP, the Network Time Protocol, to synchronize clocks between servers, devices, and controllers is quite common. But NTP is quite old by today's technology standards, and is probably not as battle-tested as some of the protocols in the day-to-day of web "things." Now, with IOT's popular emergence, NTP may suddenly find itself at the forefront of potential security breaches.

In fact, based on a security notice from the Network Time Foundation, specially-crafted packets from hackers can be sent to devices, machines, and controllers running NTP to perform remote code execution. While the NTP daemon typically runs under a less-privileged account, attackers can exploit privilege-escalation techniques to gain root privilege and do wide-ranging damage. Attacks of this kind go well beyond stealing credit card or other personal information.
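Part of hardening NTP-facing code is treating every inbound packet as hostile and validating it before use. As a small illustrative sketch (not a full NTP client), the standard 48-byte response layout puts the transmit timestamp at bytes 40-47, counted in seconds since 1900 plus a 32-bit fraction:

```python
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01)
NTP_UNIX_DELTA = 2208988800

def ntp_to_unix(seconds, fraction):
    """Convert an NTP timestamp (seconds + 32-bit fraction) to Unix time."""
    return seconds - NTP_UNIX_DELTA + fraction / 2**32

def parse_transmit_timestamp(packet):
    """Extract the transmit timestamp from a 48-byte NTP response packet.

    Rejects truncated packets outright rather than reading past the end,
    a minimal example of defensive parsing of untrusted input.
    """
    if len(packet) < 48:
        raise ValueError("truncated NTP packet")
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return ntp_to_unix(seconds, fraction)
```

This only shows the timestamp arithmetic; authenticating the peer (e.g. NTP's keyed-MAC options) is a separate, larger problem.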

Geo-location data: likely the nugget of all things good and evil

Of all the IOT data, geo-location probably ranks as one of the nuggets at the very top of IOT big data. It's a way to label and thus identify a target for action, whether passive, such as determining where the target is located at any point in time to gain more meaningful contextual information and insight about it, or active, such as directly controlling the target's functional states and operations.

Short of collecting a user or device ID, name, and other directly identifiable information, geo-location is a way to reap big-data benefits without directly and explicitly violating personal privacy or device-proprietary information. Today, knowingly and unknowingly, we have increasingly accepted the risks and benefits of disclosing our location, just to let our smartphone map take us readily and conveniently from point A to point B without guessing, or to find a restaurant of our preference nearby when we need it most.

In the equivalent device world, take smart meters: collecting the latitude and longitude of each meter, along with those of the connected transformers, feeders, and substations, would allow utilities to determine in real time the total power consumed at any point and balance it against the total power output by that connected part of the smart grid, achieving efficient and optimal power generation and operation.

There are big benefits to doing this. Optimal power generation means energy efficiency and extended machine life for power plants, with lower carbon emissions. We cannot optimize any machine if we can't measure its performance metrics, and we can't measure meaningful end-to-end metrics without understanding the device-to-machine topology. No geo-location data means no topological data. This is why geo-location is one of the linchpins of IOT big data. Its benefits are phenomenal.
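To make the topology argument concrete, here is a hedged sketch (the meter names and hierarchy are made up) of rolling instantaneous meter loads up a meter-to-transformer-to-substation topology; without the topology mapping, the aggregate totals simply cannot be computed:

```python
from collections import defaultdict

# Hypothetical topology derived from geo-located assets:
# meter id -> (transformer, substation)
TOPOLOGY = {
    "m1": ("t1", "sub-A"),
    "m2": ("t1", "sub-A"),
    "m3": ("t2", "sub-A"),
}

def rollup_load(readings):
    """Aggregate per-meter kW readings up the grid topology.

    readings: dict of meter id -> instantaneous load in kW.
    Returns (load per transformer, load per substation), the figures a
    utility would balance against generation output in real time.
    """
    per_transformer = defaultdict(float)
    per_substation = defaultdict(float)
    for meter, kw in readings.items():
        transformer, substation = TOPOLOGY[meter]
        per_transformer[transformer] += kw
        per_substation[substation] += kw
    return dict(per_transformer), dict(per_substation)
```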

But geo-location is also a double-edged sword.

Devices can be controllers in the industrial world, controlling critical pieces of equipment that can be the Achilles' heel of an entire operation, such as a critical section of a major factory. When there is a data breach, data-analytic techniques can be used to define and map the specific "data signatures" of the breached data to a small set of possible critical machines in that critical section.

By further correlating those signatures with the device's geo-location, the device's likely role in the factory's overall command-and-control environment can be readily discovered. That makes such cyber attacks far more damaging than mere financial losses once remote code is launched to make critical devices perform non-specified functions. Think the likes of Stuxnet.

Fragmented connectivity protocols: likely resulting in either a mishmash network of protocol gateways with limited on-demand connectivity, or vertical and often proprietary IOT networks, or yet another option using cloud pub/sub

The fragmentation is intrinsic to IOT being all things to all devices. Some devices are required to be very low power and deliver low data rates at regular intervals, while others may need to support video. Medical devices and industrial IOT typically fit the first category. These different device characteristics put constraints on the types of network protocols and topologies that can support them. Even within the IEEE 802.15 family designed for such device networks, at least three different standards cover the different use cases: 802.15.4 for low power and low data rates; 802.15.1 and 802.15.3 for medium to higher data rates.

But these standard options cover only the MAC and PHY layers. Different network topologies are then defined for different needs. The standard design also specifies a "device controller" functioning as a "gateway" to connect to other device clusters (PANs). Zigbee, for example, is an implementation of the application and networking layers on top of 802.15.4; it's relatively slow and is often used in buildings and homes to connect switches and appliances. Depending on the application, these PANs may need to connect with each other and ultimately with a data platform that provides smartness via analytics. Additional gateways are often needed to map these protocols to cellular, Ethernet, and so on.

Generalizing this design for connectivity flexibility and openness is often complex, most likely leading to more vertical IOTs for specific industries, with additional ad hoc customized interfaces to support peripheral devices that broaden the initial vertical "smartness." We are already seeing this today with many startups filling that role: cloud-ready embedded WiFi systems, mobile data synchronization and backup systems to connect devices and applications, and so on.

IOT, again being all things to all devices, also includes mobile phones, tablets, game consoles, and so on. These devices often host applications with high message volumes, making general gateways even harder to build. Thus the emergence of cloud-based real-time "gateway messaging services" with publish/subscribe-style REST interfaces presents yet another viable connectivity option for these types of devices.

Such real-time data-stream clouds typically hide the underlying network and infrastructure complexities behind a simple set of APIs to initialize a connection channel, publish and subscribe to that channel, and disconnect from it, with a selection of SDK libraries to support the different device protocols like WebSockets, SignalR, WebRTC, and so on.
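That API surface can be sketched with an in-memory stand-in. Real services differ in naming, transport, and delivery guarantees, so treat this only as the shape of the interface:

```python
class ChannelHub:
    """In-memory stand-in for a cloud pub/sub messaging service.

    Mirrors the typical API surface: subscribe to a channel, publish to
    it, and disconnect, with the hub hiding how messages actually move.
    """

    def __init__(self):
        self._subscribers = {}  # channel name -> list of callbacks

    def subscribe(self, channel, callback):
        """Register a callback for messages published on a channel."""
        self._subscribers.setdefault(channel, []).append(callback)

    def publish(self, channel, message):
        """Fan a message out to every current subscriber of the channel."""
        for callback in self._subscribers.get(channel, []):
            callback(message)

    def disconnect(self, channel, callback):
        """Remove a subscriber from a channel."""
        self._subscribers.get(channel, []).remove(callback)
```

A device would publish sensor readings to its channel while applications subscribe, never needing to know each other's network protocols.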

Truly a mishmash of connectivity options.

Device-to-device or M2M smartness: likely difficult to attain for a while

Some of today's devices come with a somewhat hefty hardware stack, making edge processing and device-to-device interaction possible and, in principle, an advantage over having to send data to a data platform in the cloud every time, particularly if the data has to traverse a mishmash of gateways just to get there. But edge processing is non-trivial, and not something every developer is well-equipped to do. It is far easier for data and application developers to get data from the cloud and let someone else worry about the mishmash of network mappings.

For example, most devices do not come with pre-installed software runtime environments. Just getting a Linux baseline and a required runtime set up on a bare-metal device involves quite a few steps and a fair amount of technical know-how. A fair number of libraries and tools, including a build environment, are also required just to stand up the baseline Linux, the required runtime, and a software framework for code development and execution. That's a lot of prerequisite work before doing the real "thing."

Writing edge code also harkens back to the days of development for PC DOS, with limited memory and processing power, plus an additional challenge PC DOS developers never faced: power consumption. Unlike desktop machines, most devices are power-constrained to preserve battery life. The device hardware stack may be designed for low power consumption, but edge code that constantly drains the battery defeats the purpose. Besides making code work on a device with limited resources, developers need to write energy-efficient code, which is not necessarily a cakewalk for any Java developer.
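One common energy-saving pattern is to buffer readings and wake the radio only for batched transmissions, trading latency for battery life. A minimal sketch, with the radio cost modeled as a callback (all names hypothetical):

```python
class BatchingSender:
    """Buffer sensor readings and transmit them in batches.

    Each call to `transmit` models one costly radio power-up, so larger
    batches mean fewer wakeups and longer battery life, at the price of
    higher reporting latency.
    """

    def __init__(self, batch_size, transmit):
        self.batch_size = batch_size
        self.transmit = transmit  # callable taking a list of readings
        self.buffer = []

    def record(self, reading):
        """Buffer a reading; transmit only when a full batch accumulates."""
        self.buffer.append(reading)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        """Force out whatever is buffered (e.g. before deep sleep)."""
        if self.buffer:
            self.transmit(self.buffer)
            self.buffer = []
```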

Consider a smart-home scenario where we would simply like to use a smart phone to control lighting and shutters, with limited device smartness to determine when to notify the phone to take certain actions. Even for such a relatively simple scenario, a developer would have to understand the control interfaces of light switches like Verve Living Systems, of hand-held switches like Illumra, of BSC mode control, and of energy metering. These are not common knowledge for most software developers. Adding edge-processing code for device-to-device smartness only complicates the task further.

Thus device-to-device autonomy is unlikely for a while. Connected smartness will more likely come from device to cloud, where streams of device data are ingested into the cloud data platform for analysis.

Data platforms: likely the centerpiece for device data analysis, orchestration and control

This is an area rich with technology options to make it a key enabler for IOT cloud-connected smartness. Despite the aforementioned connectivity protocol challenges, as long as each device can get its data to the platform in the cloud via whatever protocol it is optimized to use in its own "channel," the platform can ingest all the data for analysis. Data analysis provides data relationships and inferences to give "smartness" to the devices via "device-to-platform-to-device" intelligence.

Such intelligence comes from a combination of contextual insights from historical data and real-time event processing and monitoring of device data streams. The event stream is the now: the state of what the collective devices are telling us. Contextual insight is the recent past: direct data plus a variety of indirect but related data used to establish correlations between the recent past and the now. This bi-directional data semantic is what gives cloud-connected smartness to the collectively dumb devices.

Additional data semantics can come from a data platform's ability to handle streams as entity relations, allowing, for example, a large number of parallel streams from similar but distinct devices to be grouped and then inspected, monitored, and analyzed together as a single entity. Doing so gives better contextual and aggregated views of the devices' state than looking at millions of them discretely and independently. These are examples of the kind of smartness such connected devices can gain via the data platform as a service in the cloud.
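As an illustrative sketch of that grouping, many per-device streams can be collapsed into a single fleet-level time series by windowing the events. Everything here (the event layout, the window size) is an assumption for illustration:

```python
from collections import defaultdict
from statistics import mean

def fleet_view(events, window=60):
    """Collapse many per-device streams into one aggregated fleet entity.

    events: iterable of (device_id, unix_ts, value) tuples from any
    number of similar devices.
    Returns {window_start: {"devices": n, "avg": mean value}}, so
    millions of discrete streams can be inspected as a single series.
    """
    buckets = defaultdict(list)
    devices = defaultdict(set)
    for device_id, ts, value in events:
        start = int(ts // window) * window
        buckets[start].append(value)
        devices[start].add(device_id)
    return {
        start: {"devices": len(devices[start]), "avg": mean(values)}
        for start, values in buckets.items()
    }
```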

The platform technologies of choice to deliver such bi-directional data semantics are likely a combination of Hadoop, Spark, Akka, and some of the data ecosystems surrounding them, all built for massive scalability and performance, containerized in a cloud somewhere.

Despite the shared open-source components, there are key differences between the platform implementations, as each vendor does its best to gain an advantage. Some expand the basic Hadoop architecture by adding real-time capabilities. Some choose their own NoSQL offerings as the underlying databases. Some create a time-series layer on top to optimize the platform components for IOT data. Some use Spark's RDDs to process IOT data streams. The bottom line: even at the data platform level, don't expect uniformity and consistency between IOT data clouds, which makes interoperability between IOT clusters for data sharing anything but straightforward.

Data platform cloud services: likely further fragmentation between public and private data clouds

Large enterprises will probably favor their own data platform implementation, either as a private data cloud service or as a fully walled-off on-premise installation, all to take the necessary steps to protect data privacy and heighten data security. Some of the core technology components are described in the section above.

However, for speed, cost, and simplicity, smaller enterprises or tech groups will likely choose cloud providers like Google and Azure to take advantage of their readily available application scripting engines and storage.

Instead of spending heavily in time and talent to develop an entire data platform, even when using some of the open-source technologies described above, these groups can focus primarily on the client-side code: interface with devices, capture their time/data pairs, do some data transformation as required, and package the resultant data into proper HTTP headers and payloads to POST and PUT to a Google or Azure URL, where further cloud-side processing can be done via the specific cloud platform's scripting engine. If more real-time robustness is needed, WebSockets can be used.
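The client-side packaging step might look like the following sketch, where the ingest URL and the bearer-token auth scheme are assumptions rather than any specific provider's API; the request is built but deliberately not sent:

```python
import json
import urllib.request

INGEST_URL = "https://example-iot-cloud.test/ingest"  # hypothetical endpoint

def build_ingest_request(device_id, samples, api_key):
    """Package (timestamp, value) pairs as a JSON POST to a cloud ingest URL.

    samples: list of (unix_ts, value) tuples captured from the device.
    Returns a prepared urllib Request; the caller decides when (and
    whether) to actually send it with urllib.request.urlopen.
    """
    payload = {
        "device": device_id,
        "samples": [{"ts": ts, "value": v} for ts, v in samples],
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        INGEST_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # auth scheme is an assumption
        },
    )
```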

Is this approach any less secure than a full data platform implementation? Not necessarily, and it will likely be one of the options some enterprises choose to implement their own IOT cloud data platform.

Platform architectural patterns: likely full of variations

The above diagram represents the full IOT stack, not the implementation architecture. The same stack can be implemented in many different ways, and one is not necessarily better than another; it depends on what an implementation attempts to optimize architecturally to meet its use cases.

As such, IOT integration, or interoperability at the data platform level, also presents a challenge. If the data platform's architectural pattern is based on a publish/subscribe model, with devices communicating over MQTT to a local message broker like RabbitMQ that in turn communicates with a remote RMQ broker for the platforms, the resultant implementation is much more flexible in allowing the addition of other devices and platforms.

Naturally it takes more work, and the implementation is not as "tight" as a "synchronous" one. But it's much more resilient, flexible, and scalable than the typical "device to platform" pattern.

The "device to platform" architectural pattern couples devices directly to platforms, either by having the devices poll the platforms or by having the platforms push to the devices. This pattern may expose more security risks through open ports. The RMQ broker-to-broker pattern tends to have less such risk exposure.
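Much of the broker pattern's decoupling comes from topic-based routing: the platform subscribes to topic filters rather than to individual devices. Assuming standard MQTT wildcard semantics ('+' matches one level, '#' matches all remaining levels), the matching rule can be sketched as:

```python
def topic_matches(filter_str, topic):
    """Check an MQTT-style topic against a subscription filter.

    '+' matches exactly one topic level; '#' (valid only as the last
    level) matches all remaining levels. This is what lets a platform
    subscribe once to e.g. 'factory/+/temperature' or 'factory/#'
    without knowing every device up front.
    """
    flevels = filter_str.split("/")
    tlevels = topic.split("/")
    for i, f in enumerate(flevels):
        if f == "#":
            return i == len(flevels) - 1
        if i >= len(tlevels):
            return False
        if f != "+" and f != tlevels[i]:
            return False
    return len(flevels) == len(tlevels)
```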

There are other architectural patterns to consider. The bottom line: variations even in the full-stack platform implementation make IOT platform-platform interoperability non-trivial.

Impedance between device data: highly likely

And since each device or group of devices "channels" its data to the cloud in its own way, even sharing a common JSON format is not automatic. In some cases, we may need to deserialize and transform JSON into an appropriate XML format to share IOT cluster data. Further complexity comes from sharing legacy system data with IOT data, sort of like sharing or correlating relational tuples and NoSQL log-file documents with time-series data.
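A minimal sketch of that JSON-to-XML hop, assuming a flat payload of scalar fields (the element names are made up; a real exchange would follow a schema agreed with the receiving system):

```python
import json
import xml.etree.ElementTree as ET

def json_reading_to_xml(payload):
    """Transform a flat JSON device reading into an XML document.

    Assumes a flat object of scalar fields; nested structures would
    need recursion and an agreed schema on the receiving side.
    """
    record = json.loads(payload)
    root = ET.Element("reading")
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")
```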

Thus, before we can even speak of any cloud-connected smartness for devices, a great deal of attention needs to go into the upfront data extraction, cleansing, loading, and transformation required to get the diverse data into meaningful semantic tuples or indexes that enable powerful queries and searches.

In fact, the quality of this upfront data-scrubbing work can determine how good your analytic outcome is. The so-called Internet of Things is an outcome of the Internet of data scrubbing; otherwise, you would likely end up with a vertical IOT solution whose narrow smartness runs primarily along the synapses of your own data pathway within your own data silo, with little other data variety to enrich your insight.

I'd say most of today's IOT solutions fall into this category due to a combination of the above factors: differences in network and connectivity protocols, plus differences in data definitions and semantic impedance. It is simply far easier to provide connected smartness within a device's own data domain.

Data scrubbing is "dirty" work, but it is work that needs to be done, and ideally done only once so the results are available to all other and future data consumers. This is where a data platform can have a broader scope than what was defined above: it needs to be the platform that delivers "data as a service" for master data to all data consumers.

In the IOT context, master data now expands beyond the traditional enterprise supply chain view of a single source of truth for the multi-dimensional relationships between "product," "customer," "transaction," and "supplier." That old model contains no sensor data. The IOT model needs to add quantitative metrics and KPIs to those traditional supply chain data views for efficiency measurement.

Effectively that "old" single source of truth now contains many more data resources to represent the new truth. The new single source of truth is now much more complex in dimensions and semantics, involving structured, non-structured, and now time-series data. In addition to more data producers, there are now also more data consumers even within an enterprise, spanning and tightly binding every aspect of an enterprise from nuts to bolts, customers to sales, marketing, engineering, manufacturing, shipping, and to support services.

Data impedance only gets worse unless there is a concerted focus on enabling data as a service via a properly designed data platform. In addition to what's defined above, a properly-designed data platform provides all the data logistics, pathways, manipulations, representations, access and control, manifestations of all relevant data relationships and actions on all data semantics, all done under the scope of a well-formed data stewardship and data model. It's only then that the data can be made available as a service to all data consumers within and in some cases outside an enterprise.

Data as a service: probably a stretch goal for IOT in the near term

Based on the above challenges, the immediate opportunity for IOT will likely be sub-optimal IOT silos, where efficiency gains are reached locally via sensor data and measurement metrics, simply because this is much easier to do within a vertical silo where metrics tend to be more compatible for measurement and efficiency gains.

The next stage may be to connect such IOT silos to obtain a bigger "local" optimum. Locality here refers to the direct functional and operational relatedness of these silos. Given their functional and operational proximity, there is also likely more data relatedness and compatibility among them, making it easier to bridge the data differences and provide data as a service between the silos.

But in going from such direct first-degree functional and operational relatedness to second- and third-degree relatedness and connectedness in silo functions and operations, data impedance tends to increase significantly across silo boundaries and takes much more work to map and transform, making data as a service for general IOT data consumers likely a stretch goal for most IOT deployments.

Custom development services: likely an area in high demand

Like every challenge, this one abounds with opportunities. Given the above data and interface impedance, custom development services are almost always needed to implement an IOT strategy, much like the high demand for web designers and implementers in the early days of the web, before technology caught up years later and made building and launching a web site much simpler.

So there are going to be IOT technology enablers, each with its own enabling set of capabilities; IOT value-added services, each built on top of selected enabling technologies; and IOT integrators, each linking IOT data providers and consumers to power IOT enterprise silos that are often embedded in selected critical sections of an overall enterprise to achieve local optimization and efficiency gains via IOT data measurement and analytics.

More articles by Charlie Sum
