Enterprise Data and Analytics Integration

Many manufacturing companies are investing in large data and analytics infrastructures. These deployments lead to very different architectures, and the question often is: how do you design a large-scale data/analytics solution? Or, what is best practice?

Since all the components, such as servers, interfaces and buffer systems, are modular, you'll find very different architectures, which mostly fall into three categories:

1. Centralized: all data are pushed to a central location

2. Decentralized: data remain local

3. Any combination of the above

There are pros and cons to each configuration, which can be summarized as follows:

Centralized: In this scenario, all data are pushed to a central server and all client applications run on this system. Cloud systems also fall into this category. The main advantages are:

· Low cost and maintenance

· Accessibility: data and system access

· Data can be merged from different subsystems

· Scalability

On the flip side you will find that:

· Lower reliability: every component you add reduces the overall reliability

· Latency: data must be pushed from the source to the destination system

· Data sequence: buffering can lead to artifacts in the data flow

· Data quality: enterprise-level data are typically resampled or flattened into SQL tables

The opposite is true for decentralized solutions, where data are processed at the local or plant level. Most of these systems sit on the automation layer, which is difficult to access and demands higher security settings. But since you are closer to the source, you get minimal latency, higher system reliability and excellent data quality.
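The reliability argument can be made concrete: in a serial data pipeline, the overall availability is the product of the component availabilities, so every hop you add lowers the total. A minimal sketch, where the component figures are illustrative assumptions, not measurements:

```python
# Availability of a serial chain of components: all must be up, so the
# overall figure is the product of the individual availabilities.

def serial_availability(availabilities):
    """Overall availability of components chained in series."""
    total = 1.0
    for a in availabilities:
        total *= a
    return total

# Centralized path: sensor -> buffer -> network -> central server
centralized = serial_availability([0.999, 0.999, 0.995, 0.999])

# Decentralized path: sensor -> local edge node
decentralized = serial_availability([0.999, 0.999])

print(f"centralized:   {centralized:.4f}")   # lower, more hops
print(f"decentralized: {decentralized:.4f}")
```

Each added buffer, gateway or network segment multiplies in another factor below 1.0, which is why the shorter decentralized chain comes out ahead.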



The challenge is often that there is no clear separation of project types, and therefore of their specific data needs, which leads to one data architecture that must serve all projects. In a common scenario, you would separate your projects into at least two categories:

Enterprise Level Analytics:

· Abstract data, such as data aggregates

· Slow-moving data: time frame of minutes, hours or days

· Application: optimize business processes and operations

Process Analytics:

· Specific or raw data, e.g. sensor or QC data

· Fast-moving data: time frame of milliseconds, seconds or minutes

· Application: process tracking and optimization

Depending on the data requirements, the resulting data architecture would then be a blend of decentralized and centralized solutions.
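The two categories above can be sketched as a simple routing rule: which tier a project's data belongs to, based on its cadence. The 60-second threshold is an illustrative assumption.

```python
# Route a project to the enterprise or process-analytics tier by the
# cadence of its data, following the two categories described above.

def analytics_tier(sample_interval_seconds: float) -> str:
    """Fast-moving data (sub-minute) stays on the plant level;
    slow-moving data goes to the enterprise-level system."""
    if sample_interval_seconds < 60:
        return "process"      # ms/s cadence: keep close to the source
    return "enterprise"       # minutes/hours/days: aggregate centrally

print(analytics_tier(0.01))   # raw sensor data
print(analytics_tier(3600))   # hourly aggregates
```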

A good way to structure projects is to collect metrics such as:

· How many source systems need to be merged?

· What type of data and model will be used?

· Data density, in data points per minute or hour

· What is the required uptime/system reliability? How many 9s do you need?

· How often does the model need to run/update?

· Model execution and update time are important parameters

· and others …

These metrics will help you decide where to place models/applications and which data streams are required.
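As a sketch of how these metrics could drive the placement decision: collect them per project and derive a placement hint. The field names, thresholds and the nines-to-downtime conversion below are illustrative assumptions, not a prescribed method.

```python
from dataclasses import dataclass

@dataclass
class ProjectMetrics:
    source_systems: int          # how many systems must be merged?
    points_per_minute: float     # data density
    required_nines: int          # uptime, e.g. 3 -> 99.9 %
    model_period_seconds: float  # how often the model runs/updates

def downtime_per_year_hours(nines: int) -> float:
    """Allowed downtime implied by an availability of n nines."""
    return 8760 * 10 ** -nines   # 8760 h in a year

def placement(m: ProjectMetrics) -> str:
    # Tight update cycles and strict uptime favor the plant level;
    # merging many sources favors the central system.
    if m.model_period_seconds < 60 or m.required_nines >= 4:
        return "local"
    if m.source_systems > 1:
        return "central"
    return "local"

# A slow reporting model over five ERP/MES systems vs. a fast QC model:
print(placement(ProjectMetrics(5, 10, 2, 3600)))
print(placement(ProjectMetrics(1, 6000, 4, 1)))
```

Here three nines, for example, translate to roughly 8.8 hours of allowed downtime per year, which is a quick way to sanity-check whether a centralized path can meet the requirement.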


