The Data Scientist’s View of the Manufacturing Data Hub

This article explores how a Manufacturing Data Hub makes data-science work in manufacturing more effective, efficient, and enjoyable.

In short:

  • High-quality data and governance protect your analysis from the classic problem of Garbage In, Garbage Out.
  • The ISA-95 ontology uses semantics designed specifically to represent a manufacturing operation, including every entity, relationship, and aggregation.
  • The ontology represents a structured graph of all knowledge about the manufacturing operation, providing the golden input for various applications of AI.

Here we use “data scientist” as a catch-all term, ranging from business analysis to unsupervised deep learning. If you work with manufacturing data to drive operational and business efficiencies, these topics apply to you!

Challenges of turning operational data into analytical data

A manufacturing operation generates a lot of data: resource-planning documents, execution-system workflows, process measurements, device messages, aggregations, reports, and many more digital artifacts are all critical parts of an operation. The promise of this abundance is that it contains hidden value that analysis can unlock. And it’s true: data science can help drive new efficiencies and help stakeholders better understand their business.

In practice, however, the process of making operational data analytically useful is full of pitfalls and traps, with challenges at every step:

  1. Consolidating, storing, and cleaning operational data.
  2. Understanding its context.
  3. Evaluating its quality and integrity.
  4. Mining the data for information.
  5. Communicating insights to others.

A Manufacturing Data Hub can greatly reduce these challenges, which are inherent in a complex digital manufacturing system. The following sections explain why.

High-quality data with less cleaning

Data quality must be the foundation of useful data analysis. No level of algorithmic sophistication can save you from bad assumptions about the data itself. The ontological model of the Manufacturing Data Hub solves the typical issues of data governance and quality that plague most manufacturing operations.

All data-science process frameworks include some early steps to “scrub” out inaccuracies and abnormalities. Quite often, this process is the most frustrating and manually intensive part of data analysis, and it all happens before the analysis even starts.

Because the Manufacturing Data Hub is built for data quality and governance, it greatly reduces the initial cleaning burden for the analyst. The key is the ontological model, whose language fully represents any manufacturing operation. Once the operations team maps their use cases to the model that already exists, data is ingested into a structure that brings a high degree of clarity and context.

Besides modeling, there are also technical reasons the Manufacturing Data Hub has high-quality data. Its durable architecture brings type safety, referential integrity, and validation on write. The robustness of the model and architecture also reduces the upstream work of consolidation and ETL.
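To make validation on write concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the Manufacturing Data Hub's actual implementation: the table names, columns, and the helpers open_hub and try_write are hypothetical, chosen only to show how a store can reject badly typed or dangling records before an analyst ever sees them.

```python
import sqlite3

# Hypothetical, heavily simplified schema. These names are illustrative
# assumptions, not the Manufacturing Data Hub's real tables.
SCHEMA = """
CREATE TABLE material_definition (
    id   TEXT PRIMARY KEY,
    unit TEXT NOT NULL
);
CREATE TABLE material_actual (
    id          INTEGER PRIMARY KEY,
    material_id TEXT NOT NULL REFERENCES material_definition(id),
    quantity    REAL NOT NULL
                CHECK (typeof(quantity) = 'real' AND quantity >= 0)
);
"""


def open_hub():
    """Create an in-memory database with constraints enforced."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless asked to.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def try_write(conn, material_id, quantity):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO material_actual (material_id, quantity) "
                "VALUES (?, ?)",
                (material_id, quantity),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With a single definition row for a material such as 'resin-a', a write of 12.5 is accepted, while a write against an unknown material, a negative quantity, or a text value like "n/a" is refused at insert time rather than discovered during cleaning.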

High context, high semantics

Analysis works better when it comes from a deep understanding of its problem domain. Even with high-quality data, a data scientist must understand what the data represents to properly analyze it. Only a Manufacturing Data Hub provides data that captures the full Deep Context of Manufacturing.

The manufacturing ontological model also provides excellent semantics for the data itself. Learning the model reduces the time you spend asking data owners what each field and relation means. With an ISA-95-backed database, questions like “how much of material X was wasted during process Y at plant Z?” translate easily into a precise query. Your operations team will likely also use the model to define their own KPIs, which you can use to align measures of efficiency across operational and analytical teams.
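As an illustration of how such a question might become a query, the sketch below uses Python's sqlite3 module against a single flattened table. The table material_actual, its columns, the 'scrap' use type, and the sample rows are all assumptions made for this example; a real ISA-95 schema spreads this information across several related entities.

```python
import sqlite3


def demo_connection():
    """Build a tiny in-memory dataset with hypothetical column names."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE material_actual ("
        "plant TEXT, process TEXT, material TEXT, "
        "use_type TEXT, quantity_kg REAL)"
    )
    conn.executemany(
        "INSERT INTO material_actual VALUES (?, ?, ?, ?, ?)",
        [
            ("plant-z", "process-y", "material-x", "consumed", 100.0),
            ("plant-z", "process-y", "material-x", "scrap", 1.5),
            ("plant-z", "process-y", "material-x", "scrap", 2.0),
            ("plant-z", "process-w", "material-x", "scrap", 9.0),
        ],
    )
    return conn


def wasted_quantity(conn, material, process, plant):
    """Total scrapped quantity (kg) of one material in one process at one plant."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(quantity_kg), 0.0)
        FROM material_actual
        WHERE material = ? AND process = ? AND plant = ?
          AND use_type = 'scrap'
        """,
        (material, process, plant),
    ).fetchone()
    return row[0]
```

Each part of the question (material X, process Y, plant Z, "wasted") maps onto one filter in the WHERE clause, which is the property that shared semantics are meant to provide.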

ISA-95 represents data at different granularities, from high-level operational schedules to events triggered by a single device measurement. The model also provides entities for classification and aggregation, which a data scientist can use to group records and compare representative samples while isolating a single variable. For example, classes and segments categorize entities; instances, events, and actuals record facts about them.
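The split between categorizing entities and fact-recording entities can be sketched in a few lines of Python. The Equipment and Actual types, their fields, and the cycle-time measure below are hypothetical simplifications for illustration, not ISA-95 definitions.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Equipment:
    """An equipment instance tagged with the class that categorizes it."""
    id: str
    equipment_class: str


@dataclass(frozen=True)
class Actual:
    """A recorded fact about one run on one piece of equipment."""
    equipment_id: str
    cycle_time_s: float


def mean_cycle_time_by_class(equipment, actuals):
    """Aggregate fine-grained actuals up to the class level."""
    class_of = {e.id: e.equipment_class for e in equipment}
    grouped = defaultdict(list)
    for actual in actuals:
        grouped[class_of[actual.equipment_id]].append(actual.cycle_time_s)
    return {cls: mean(values) for cls, values in grouped.items()}
```

Grouping by class lets you compare, say, all mixers against all extruders without hand-maintaining a list of which machine is which.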

Ontologies: golden data for AI

Besides understandable semantics, the ISA-95 data model also provides a complete knowledge graph of the operation itself. This means that you can reason about the data with a high degree of certainty about its meaning in the wider operation. For deep learning, a knowledge graph also provides an outstanding structure for various extraction, inference, and prediction models.
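A knowledge graph can be pictured as a set of subject-predicate-object triples. The sketch below is a toy illustration in plain Python; the node names, predicates, and the reachable helper are invented for this example and are not taken from ISA-95 or from any particular graph library.

```python
from collections import deque

# Hypothetical triples describing a tiny slice of an operation.
TRIPLES = [
    ("plant-z", "contains", "line-1"),
    ("line-1", "contains", "mixer-7"),
    ("mixer-7", "executes", "process-y"),
    ("process-y", "consumes", "material-x"),
]


def reachable(triples, start):
    """Return every node reachable from start by following edges forward."""
    adjacency = {}
    for subject, _predicate, obj in triples:
        adjacency.setdefault(subject, []).append(obj)

    seen = set()
    queue = deque([start])
    while queue:  # breadth-first traversal
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen
```

Starting from a plant, the traversal surfaces the lines, equipment, processes, and materials connected to it, which is the kind of context a model or analyst can draw on when every record sits inside one connected graph.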

Considering that the scope of the model encompasses the entire manufacturing domain, the data can be mined widely and deeply.

Closing the flywheel

This article focuses on how a Manufacturing Data Hub provides a unique advantage for data scientists in the manufacturing domain. The datasets you can analyze are high-quality, high-context, and rich in scope. As we’ve written before, ISA-95 is your path and model to quality data.

With quality data, you can drive new insights and efficiencies. With that better understanding, the operations team can plan and design systems that improve future operations. As always, effective automation pays compound interest.

Examples: