Making the case: Why you should be using a dedicated layer for data modeling
Tony Paine
is the CEO of HighByte, focused on the company's vision and ability to execute to plan. For 20 years, Tony immersed himself in industrial software development and strategy at Kepware, where, as CEO, he led the company through its successful acquisition by PTC in 2016 before founding HighByte in 2018. Tony has contributed to a variety of technical working groups, helping to shape the direction of standards used within the automation industry over the past two decades. He received a Bachelor of Science in Electrical Engineering with a concentration in Computer Software and Hardware Design and a Master of Business Administration with a concentration in Data Analytics, both from the University of Maine.
In my last post, “An intro to industrial data modeling”, I shared my definition of a data model and why data modeling is important for Industry 4.0. I’d like to take that a step further in this post by explaining why you need a dedicated abstraction layer for data modeling to achieve a data infrastructure that can really scale.
A single environment
A dedicated abstraction layer (I'll call this the DataOps layer) is essential because not every application conforms to a single standard. Industry standards exist, but their coverage is limited. As a result, vendors continue to create their own schemas to model rich information in the context of their applications. Fortunately, vendors typically provide some level of API for pushing and pulling data in the application's expected format, which a dedicated data modeling layer can then leverage.
By orchestrating these integrations within a dedicated layer, we can begin to genericize data modeling so that a user can work in a single environment to model any number of things. That layer then becomes responsible for transforming data into the specific data modeling schemas of all consuming applications. This is game changing for users who need to collect, merge, transform, and share information with many applications that live on premises and in the cloud. Users can build out or deploy new applications over time and take advantage of previous work. When data modeling is managed in a centralized location, users can add, delete, and edit parameters for connected applications without breaking existing integrations.
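To make "model once, transform per consumer" concrete, here is a minimal Python sketch. This is not HighByte's actual API; the model instance, field names, and both target schemas are hypothetical, stand-ins for whatever a historian or cloud platform actually expects.

```python
from datetime import datetime, timezone

# A single, generic model instance built in the DataOps layer.
# Structure and field names are illustrative, not a real standard.
pump_001 = {
    "assetId": "Pump-001",
    "site": "Portland",
    "flowRateGpm": 42.7,
    "timestamp": datetime.now(timezone.utc).isoformat(),
}

def to_historian_schema(model: dict) -> dict:
    """Flatten the model into the tag/value pairs a historian expects."""
    return {
        f"{model['site']}.{model['assetId']}.FlowRate": model["flowRateGpm"],
        f"{model['site']}.{model['assetId']}.Timestamp": model["timestamp"],
    }

def to_cloud_schema(model: dict) -> dict:
    """Nest the model the way a hypothetical cloud analytics API expects."""
    return {
        "device": {"id": model["assetId"], "location": model["site"]},
        "telemetry": {"flow_rate_gpm": model["flowRateGpm"]},
        "ts": model["timestamp"],
    }

# One model, two consumer-specific payloads -- no per-application remodeling.
print(to_historian_schema(pump_001))
print(to_cloud_schema(pump_001))
```

The point is that the model is defined once; each consuming application only ever sees data already shaped to its own schema.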
Visibility and flexibility
A single environment provides visibility. We often speak with automation engineers who know they have hardware and software on the plant floor producing and collecting raw data, but they don't know who is connecting to those systems or what data is being shared. A centralized location allows OT to easily view where and how data flows in and out of the plant floor. They know who is producing the raw data, who is consuming the information, and how changes to the DataOps layer will impact the rest of the enterprise. A dedicated layer adds resiliency and flexibility to the vast ecosystem of technology found in most manufacturing facilities.
One-step data prep
A centralized location also reduces data preparation redundancy and decreases system integration time. Information can be automatically propagated to any vendor's application without touching each application individually. Modeling data once is far faster than modeling it again and again in each application.
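One way to picture why that pays off (purely illustrative Python; the mapping structure and names are hypothetical, not HighByte's format): each consumer's schema can live as a declarative mapping against one central model, so schema changes are made in this one place rather than inside every application.

```python
# Hypothetical, declarative mappings: each consumer's schema is data,
# not code, so changes happen here once instead of in every application.
MAPPINGS = {
    "historian": {"FlowRate": "flowRateGpm", "Pressure": "pressureKpa"},
    "cloud": {"flow_rate_gpm": "flowRateGpm", "pressure_kpa": "pressureKpa"},
}

model = {"flowRateGpm": 42.7, "pressureKpa": 601.9}

def render(consumer: str) -> dict:
    """Build a consumer-specific payload purely from its declared mapping."""
    return {target: model[source] for target, source in MAPPINGS[consumer].items()}

for consumer in MAPPINGS:
    print(consumer, render(consumer))
```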
Passive connectivity
A DataOps layer provides passive connectivity, meaning users won't need to schedule downtime or rewire integrations to establish communication with the solution. An Industrial DataOps solution like HighByte Intelligence Hub can passively drop in, connect to existing data sources, pull data, transform it, add context, and then push real-time modeled information out to running applications using their respective APIs.
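As a rough illustration of the pattern (not the product's actual mechanism; HighByte Intelligence Hub is configured rather than coded, and the broker address, topics, and field names below are hypothetical), a passive integration can simply subscribe to data a source is already publishing and republish it in modeled form:

```python
import json
import paho.mqtt.client as mqtt  # third-party: pip install paho-mqtt

BROKER = "broker.example.local"        # assumed existing plant broker
RAW_TOPIC = "plc/line1/raw"            # data the source already publishes
MODELED_TOPIC = "dataops/line1/pump"   # modeled output for consumers

def on_message(client, userdata, msg):
    """Transform each raw message and republish it for consumers."""
    raw = json.loads(msg.payload)
    modeled = {
        "assetId": "Pump-001",
        "flowRateGpm": raw.get("flow"),   # add naming context to a raw tag
        "sourceTopic": msg.topic,
    }
    client.publish(MODELED_TOPIC, json.dumps(modeled))

client = mqtt.Client()  # paho-mqtt 1.x constructor; 2.x also takes a CallbackAPIVersion
client.on_message = on_message
client.connect(BROKER)
client.subscribe(RAW_TOPIC)
client.loop_forever()   # the source keeps running untouched; no downtime needed
```

Nothing about the existing publisher changes; the layer attaches alongside it, which is what makes the connectivity "passive."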
Data quality and accuracy
The DataOps layer transforms raw data before making it available to consuming applications, so there is less chance of errors being introduced downstream, like attaching the wrong units of measure to a data point. DataOps reduces the opportunity for human error by providing a central location to manage conversions and transformations. If there is an error, it's detected quickly and fixed easily, without troubleshooting each application or digging through custom code.
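For example, a unit conversion can be defined once, centrally, so every application receives the same converted and labeled value. A minimal sketch, with hypothetical names:

```python
def psi_to_kpa(psi: float) -> float:
    """One conversion, defined once in the DataOps layer."""
    return psi * 6.894757

raw_pressure_psi = 87.3  # raw reading from the device, in PSI

# Every consuming application receives the same converted, labeled value,
# so no application can accidentally attach the wrong unit of measure.
modeled = {"pressure": round(psi_to_kpa(raw_pressure_psi), 1), "unit": "kPa"}
print(modeled)  # {'pressure': 601.9, 'unit': 'kPa'}
```

If the conversion factor were ever wrong, it would be wrong in exactly one place, and fixed in exactly one place.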
For data that includes time characteristics, the DataOps layer helps ensure time stamps are consistent and accurate across all applications. When applications collect data directly and independently, they may capture different time samples depending on how quickly the underlying data changes. By using a DataOps layer instead, users ensure all applications receive the same time sample and stay in sync with one another, down to millisecond resolution.
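A minimal sketch of that idea (field names are hypothetical): the layer stamps one authoritative sample time onto the merged payload, so every consumer sees the identical instant rather than each polling on its own schedule.

```python
from datetime import datetime, timezone

def merge_and_stamp(readings: dict) -> dict:
    """Merge values read in one pass and stamp a single sample time."""
    return {
        **readings,
        # One timestamp for the whole sample, at millisecond resolution,
        # shared by every consuming application.
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="milliseconds"),
    }

sample = merge_and_stamp({"flowRateGpm": 42.7, "pressureKpa": 601.9})
print(sample)
```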
Security
Finally, when industrial companies manage data modeling and integration in a dedicated DataOps solution, they bolster their defense-in-depth strategy. The modeling environment enables only authorized individuals to determine which applications should receive data and exactly what data they should receive. Consuming applications no longer have unfettered access to raw data sources; the DataOps solution abstracts away that direct connection. And rather than being buried in custom code, integrations are visible to authorized users, helping protect potentially critical infrastructure.
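One way to picture that abstraction (purely illustrative; in practice this is managed through the solution's configuration, and the consumer names and fields below are invented): the layer sends each consumer only the fields it has been explicitly granted.

```python
# Hypothetical per-consumer allow-lists managed in the DataOps layer.
GRANTS = {
    "mes":       {"assetId", "flowRateGpm", "timestamp"},
    "analytics": {"assetId", "pressureKpa", "timestamp"},
}

model = {
    "assetId": "Pump-001",
    "flowRateGpm": 42.7,
    "pressureKpa": 601.9,
    "controlSetpoint": 45.0,   # never exposed: no consumer is granted it
    "timestamp": "2024-01-01T00:00:00.000+00:00",
}

def payload_for(consumer: str) -> dict:
    """Return only the fields this consumer is authorized to receive."""
    allowed = GRANTS.get(consumer, set())
    return {k: v for k, v in model.items() if k in allowed}

print(payload_for("mes"))        # no pressure, no raw setpoint
print(payload_for("analytics"))  # no flow rate, no raw setpoint
```

No consumer ever touches the raw source; the grant list is the single, auditable place where access is decided.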
Next steps
If you would like to learn more about how a dedicated DataOps solution can provide a more efficient and secure means of merging, modeling, and moving your industrial data, I invite you to try HighByte Intelligence Hub for yourself. Join the trial program to get access to our latest release, product resources, and a HighByte team member to help guide you through your assessment.
Get started today!
Join the free trial program to get hands-on access to all the features and functionality within HighByte Intelligence Hub and start testing the software in your unique environment.