Rhize Up Podcast: Episode 13 – The Defining Episode


Rhize Up Podcast

Episode 13 Transcript

[David]

All right, let’s fire this thing off. So good morning, good afternoon, good evening, and welcome to the Rhize Up podcast. Today this is going to be our defining episode.

And so what we’re going to attempt to do is actually define some of the words and phrases and concepts that get thrown around out there. And of course, every time I start thinking about how do we define things, I always remember, boy, sometimes we see simple things like furniture and toys. Those are very difficult for us to define.

And I’m going to channel my inner Potter Stewart, the U.S. Supreme Court Justice whose famous line was, I know it when I see it. So what we’re going to land on is maybe not a definition, but we’re certainly going to give some characteristics around a lot of the terms that get used a lot, especially within digital transformation, Industry 4.0 and the like. So today I’m joined by a gentleman named Tom Hollingworth. I’ll let him make an introduction himself, but it’s his first time on the podcast.

So I’m very excited to have him. When I was first introduced to him, he was described as the leprechaun riding on the back of a unicorn. So Tom, with that, tell us a little bit about who you are.

[Tom]

Yeah. Hi guys. Tom Hollingworth here.

Mechatronics engineer, as you’ll probably hear, Australian, but I am based in North Carolina. So I’ve been with Rhize for about six years now, implementing projects and services in that layer three space. I was previously a controls engineer doing like PLC SCADA stuff in a lot of high compliance industries.

So animal health, specialty chem, nuclear medicine, those kinds of industries. So today I basically help manufacturers develop a standardized data foundation to enable citizen development, operational excellence, and executing on use cases. So yeah, a little bit about me.

[David]

Yeah. And I would say it’s not just the integration of the, you know, the level two, level three systems. It’s a rich knowledge base around data and data analytics, because you were doing some instruction for some of the products that are out there as well.

So yeah.

[Tom]

I did a lot of SIG stuff at one point, a bit of a SIG trainer there in Australia for a little bit. So yeah.

[David]

Excellent. Well, Tom, great, great to have you here. Glad, glad you were able to carve out some time and spend some time chatting, thinking about some things.

So let’s get on with it. One of the very first episodes we did here on the podcast was an attempt to define the unified namespace. And I would say that what we’re going to do today is maybe an extension of that. Just as a recap: we look at it as an approach for an event driven architecture.

It uses a pub sub technology, and it also includes a data ops tool that defines the data models and the topics where that information is going to be exchanged. So, you know, even in that definition, there are some things we want to get to and talk about: what exactly does that mean? But when we start talking about unified namespace, the first thing that seems to come up is this concept of a broker.

Broker?

So when we say broker, what do we mean by that? Do we just mean an MQTT broker, or is it something else there, Tom? So, you know, let’s talk a little bit about a broker, but, you know, it doesn’t necessarily mean having less money than somebody else.

There’s something else.

[Tom]

I mean, usually when you hear it, you think financial broker, right, or some sort of, you know, making markets or something like that. But in manufacturing, I think we think about events, and we think about someone sort of brokering those connections and facilitating a lot of those messages between systems. So, yeah, I mean, typically today MQTT seems to be the flavor, but with a really simple definition like that, you could probably consider something like OPC UA to be a broker just as well, right?

[David]

Yeah, I mean, it really is a broker. It’s just that instead of brokering dollars, it’s brokering information. I would say that, yeah, MQTT is the most popular flavor, especially for the OT space.

But there’s DNP3 in terms of different protocols. I know there’s AMQP; that’s not as prominent, and I would characterize that more as a proper message queue. And what I mean by proper message queue is this: with MQTT, the relationship is basically between the client and the broker, and there’s really no way to guarantee end-to-end message delivery or anything like that, which in some applications, like telemetry data, is perfect.

But for something else where I want to guarantee delivery, then I’m going to want something that ensures that delivery took place, and a message queue like AMQP, or at least that protocol in that context, certainly makes sense.
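
To make the broker idea concrete, here is a minimal publish/subscribe sketch using the Python paho-mqtt client (1.x-style API). The broker address and topic are placeholder assumptions, and the QoS flag shown is MQTT’s own client-to-broker delivery setting, not an end-to-end guarantee.

```python
# Minimal MQTT pub/sub sketch (paho-mqtt, 1.x-style API).
# Broker address and topic are illustrative placeholders.
import json
import paho.mqtt.client as mqtt

TOPIC = "acme/site1/line1/filler/temperature"  # hypothetical UNS-style topic

def on_message(client, userdata, msg):
    # React to a published event rather than polling the source.
    print(msg.topic, json.loads(msg.payload))

client = mqtt.Client()
client.on_message = on_message
client.connect("localhost", 1883)      # assumes a broker running locally
client.subscribe(TOPIC, qos=1)         # QoS 1: "at least once" to this client

client.publish(TOPIC, json.dumps({"value": 72.4, "uom": "degC"}), qos=1)
client.loop_forever()
```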

[Tom]

Yeah, it’s got a lot of parallels to what I’d call an enterprise service bus, if you think about early 2000s IT, right? I did a project with TIBCO, TIBCO EMS, their enterprise message service, I believe, where you’ve got all these different protocols, whether it be XML or REST or something like that, all going onto the bus.

And these enterprise service buses would transform those messages and deliver them from certain producers to certain consumers based on different rules, occasionally even doing protocol conversion. So I see a lot of similarities from that sort of enterprise architecture space to what we’re talking about today with the UNS.

[David]
Perfect.
[Tom]

We’ve got some parallels as well from like, bottom up as well, you know, when we think about MQTT being the UNS, you’ve got these like topics and structures. It’s almost as if you’re, you’re taking your PLC controller tags, right? That sort of, that sort of, I guess you’d call it like tag pub sub, right within a controller, and then exposing that to the enterprise.

[David]
Okay.
[Tom]
You know, thinking of a structure of topics and tags and, you know, bringing that out.

[David]

I think really, when we start talking about a broker, the important thing is to make sure we all understand the context. Generally speaking, I would say we’re probably talking MQTT, but there are other ways that term can be used. You know, one of the biggest fallacies of communication is assuming that the other person understood what you were saying. I might be butchering that; I think there was a really smart person that said something along those lines.

So I’ll certainly give credit where it’s due, even though I’m butchering it.

Schemas and Validations

So with that, you know, one of the things we talk about within DataOps is that we’re going to be sending this defined payload structure. And sometimes we call that a schema. But there are a lot of different ways you can use the word schema: there’s the database schema, there’s a payload schema.

You know, what do we mean by schema? What is that word? Can you unpack that?
[Tom]

Yeah, usually we’re trying to describe, and it’s usually language specific, as you sort of touched on, right: a database schema, maybe a transport schema like a JSON schema or an XML schema. It’s usually very language specific, and it defines the sorts of properties, fields and attributes that should exist within a given blob of data. I guess you could say that data could be an event, or it could be stored on disk in the context of a database. So when we talk about schema, it’s really around, what is the data structure that we’re talking about?

Right?

[David]

So it’s kind of the, you know, what information might be here. So going back to, like, a JSON, or even maybe an XML: there’s the key, the property that’s associated with it, and there’s the value of it. So it’s that key value pair, but you can have a ton of things that are in there.

You know, XML is kind of the same concept. But, you know, I think it’s also much like broker: when you’re talking about schema, are you meaning the schema of the data that’s flowing?

[Tom]

Or are you talking about the schema that’s at rest, and you did certainly touch on that a little bit with the database. Go ahead. Maybe it would be like the storage structure, right? Whether it’s storing an event, how that packet of data is stored in bits and bytes, how it’s physically stored and organized in a particular, you know, system or database.

[David]

Yeah. Yeah. And in the database schema, it’s the, you know, we’re looking at you have a table, you have primary keys, foreign keys, you know, what’s the relationship of everything?

And how does all that fit together? And certainly, in that context, it’s important to understand, when you’re going to retrieve information, how the schema and all that data relate. That’s where it’s going to come in important, that type of thing. So, yeah, schema is, you know, we’re going to have a broker, we’re going to publish a schema, probably a JSON payload. But, you know, one of the things that we talked about on our last, go ahead, you have something you want to add there?

[Tom]

There are a lot of other things that we talk about when we say broker. You know, we did touch on routing messages, but there are things like security, metrics, governance, maybe even describing what schemas should exist in those messages, and potentially even protocol conversion, right?

[David]

So, yeah, perfect segue right into what I was going to ask. On a previous podcast, your owner and I were talking about transactional data, you know, the event data: you have the event, and then there’s some data that’s moved back and forth. But let me get into schema validation. And, you know, you can probably intuit exactly what that means.

But it’s what your owner and I talked a little bit about, and why that’s so important. When we get into schema validation, I think you already touched a little bit on it, but, you know, there’s some data governance aspect to that.

So what do we mean by that?

[Tom]

So schema validation, basically, you know, we’ve got this payload of data: does it conform to our expected structure, our data structures, right? Is the number between 100 and 200, is the string a certain shape of string, right? So it’s about defining or receiving something and making sure that it conforms to our schema, right?

It’s just expected data contracts. You know, I said, give me A. Is it giving me A, or is it giving me B, Z, one, two, three, kind of thing? It’s about checking that what’s there is what they say it is.

Well, I think there’s a couple of different things out there. But, you know, we can’t really trust that the person sending the data is always going to send it in the right format, right? You can’t just hope that everyone’s going to send data in the right format.

And that it’s going to fit every time. What was that, Murphy’s law? I’ve got to think about these things now.

What’s the one about user interfaces? If there’s an opportunity to break it, they certainly will.

[David]

Yeah, exactly.
[Tom]
It’s escaping me right now.

[David]

Yeah, no, yeah. And I’m with you. There have been some technology implementations where there are literally people in these manufacturing and process plants who have nothing better to do than to see if they can break the new technology.

And they’re going to go out of their way to do it. And people ask, will people really do that? It’s like, yeah, they’re going to do it.

And that’s not all bad, because it does help your product.

[Tom]
You know, whatever can go wrong will go wrong, right?

[David]

Yeah, absolutely. Absolutely. There’s always going to be, you know, something in there.

We try to think of all the edge cases that where that might occur. But, you know, and that’s why it’s so important that, yes, I’m using DataOps. Yes, I’m defining these payloads that are supposed to be sent.

And it’s all there. But, you know, guess what? Sometimes things get a little wonky.

Maybe the source of one of the data doesn’t come back.

[Tom]

And, you know, it’s not the known unknowns that will get you. It’s the unknown unknowns that will get you, right?

[David]
Yeah, exactly.
[Tom]
There’s cases that you didn’t even think were possible.
[David]
Yeah, absolutely. Seems like there’s a Donald Rumsfeld line in there somewhere, right? You know, going back to the early 2000s.

[Tom]

So, yeah, schema validation is about making sure that the sender is, you know, conforming to the data contract, right? So if we can build a bunch of test cases around our schema, and we know we’re receiving data in the format we expect because we’re doing schema validation, we can lower our risk of errors happening and of introducing bugs into the system.
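
As a concrete sketch of that data contract idea, here is schema validation in Python with the jsonschema package; the schema and field names are invented for illustration, not a Rhize contract.

```python
# Schema validation sketch (pip install jsonschema).
# Field names, limits, and the payload itself are illustrative.
from jsonschema import validate, ValidationError

temperature_event_schema = {
    "type": "object",
    "properties": {
        "equipmentId": {"type": "string"},
        "value": {"type": "number", "minimum": 100, "maximum": 200},
        "uom": {"type": "string"},
    },
    "required": ["equipmentId", "value", "uom"],
}

payload = {"equipmentId": "pump-123", "value": 150.2, "uom": "degC"}

try:
    validate(instance=payload, schema=temperature_event_schema)
    print("payload conforms to the data contract")
except ValidationError as err:
    print("rejected before it goes any further:", err.message)
```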

[David]

So another concept that’s closely related, that I’ll hear every now and again in the context of schema validation, is data validation. And sometimes I hear them used interchangeably, though I think they’re a little bit different. So what say you?

[Tom]

What say me? I’d say they’re pretty similar. Schema validation sounds like a smaller context than maybe data validation.

With data validation you may go a little bit further, where you’re checking that these relationships do exist. Other times you’re just checking that the payload conforms, right? Maybe in messaging, I’m just checking that there is an ID where I’m referencing some sort of equipment or work master or something like that.

But maybe in the other scenario, I’m checking not just that I said something there, but that the thing actually exists in my system, right? So maybe it’s going beyond just the simple, does the payload look good? Does the data look good, right?

Is this a valid work master I know about, piece of equipment I know about in my system?

[David]

Yeah, it’s a fine line between them. I mean, you’re going to have the schema validation. It’s not only going to ensure the right structure.

So the key values are there. If you’re familiar with JSON formats, you’re going to have this collection of objects and these collections of arrays. So ensuring that if you’re expecting an element, it’s actually there, that’s part of that schema validation.

Data validation, as it starts digesting and processing that information, is also going to ensure that it can actually do the parsing of, say, that JSON payload that’s there. Maybe the schema was good, but as we start going through the data, yeah, there’s a problem here. Like, I’m going to have an operations request that’s made, and I’m going to ask you to make some product.

Well, as I’m validating that data, I’m going to find out, oh, you know what? We don’t have a material definition here. So we’re going to have to kick it back.

That’d be part of that data validation that might occur in there.

[Tom]
Yeah. Yep. It’s kind of that next step, right?
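
A rough sketch of that next step: after the payload passes schema validation, data validation checks that the things it references actually exist. The master data sets and field names below are hypothetical stand-ins.

```python
# Data validation sketch: the payload is well-formed, but do the
# referenced entities actually exist? (master data is a stand-in here)
KNOWN_MATERIAL_DEFINITIONS = {"WIDGET-STD-1", "WIDGET-DLX-2"}
KNOWN_EQUIPMENT = {"line-1", "line-2"}

def validate_operations_request(request: dict) -> list[str]:
    errors = []
    if request["materialDefinitionId"] not in KNOWN_MATERIAL_DEFINITIONS:
        errors.append(f"unknown material definition {request['materialDefinitionId']!r}")
    if request["equipmentId"] not in KNOWN_EQUIPMENT:
        errors.append(f"unknown equipment {request['equipmentId']!r}")
    return errors

request = {"materialDefinitionId": "WIDGET-XL-9", "equipmentId": "line-1"}
problems = validate_operations_request(request)
if problems:
    print("kick it back:", problems)   # e.g. no material definition on file
```
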
[David]

Sure. Yeah. And oddly enough, whenever we start processing our data, one of the first things we do is schema validation.

And then we do some data validation, just to make sure that things are happy. Because the last thing we want to do is get noise in the database.

It’s easier to keep it out than to try to get it out once it’s there.

[Tom]
Yes.
[David]

That can be a mess.
[Tom]
So to make gold, one must start with gold.

[David]
Absolutely. Absolutely. Here we go.

Ontology

All right. So since we talked a little bit about when you’re getting data, and you’re going to persist data, you want to ensure that it’s, you know, in the right, you know, in the right format. I think we want to talk a little bit about ontology.

And that’s a word that I’m starting to hear more and more about. Oh, we need an ontology. So what’s an ontology?

[Tom]

Yeah. So an ontology is more about defining the relationships and the meaning in those relationships between things, right? Within a specific knowledge domain, right?

So in manufacturing, we talk about ontology, right? A classic example I give people when thinking about ontology is a family tree, right? And the relationships that exist in a family tree, right?

With the emphasis being more so on like the relationships than the actual, sometimes the entities themselves. So if I were to say I have a sister, you inherently know what I mean by that, right? You know, it’s a person, likely female, and we’re related through parents, right?

Yeah, common parents. So that’s the idea: ontology is around describing these relationships between the data, and the emphasis is more on the relationships than on how it’s necessarily stored, like a schema, right?

So a schema is talking about how it’s stored, the structure of it. Whereas an ontology is sort of an abstraction above your data that talks about the meaning in the relationships and the entities.

[David]

Okay. So if I want to talk about my second cousin, three times removed, you know what that means. That relationship is very clear and well-defined.

That’s it. Family tree ontology, yeah.

[Tom]

Like I’m just trying to think of an example in manufacturing. Maybe you could say like a test sample has a test specification. Well, the test specification on that test sample, right?

You may inherently know what I mean by that, right? It’s what we’re trying to test against, and maybe it contains the limits of what we expect our sample to be for a test result, for example.

[David]

Yeah. And I think, you know, coming off of the concept of a UNS, the data ops tool is not just defining a payload; as you’re adding things, as additional intelligence is produced and consumed, that gives you context. But what you may not have is this ontology of how all this data relates to one another.

And I think that becomes one of the fundamental differences. Mostly when I get into applications like genealogy and track and trace, I might know and be able to add the context of a material lot that’s associated with the manufacturing process, but I might not understand the relationship of how that lot came to be, right? At this current point in time, it came in as a raw product, there was a raw lot, and then it got moved into maybe a sub lot, and then it got moved through all these ways that material can be transformed throughout a process. And I think the ontology helps describe those relationships and that type of thing.

[Tom]

Yeah, that all hooks together.

[David]
Perfect.
[Tom]

Yeah.

Event Driven Architecture

[David]

Yeah. So, you know, continuing on with the whole UNS, as we said in the recap, it’s an approach for an event driven architecture. In some of the earlier podcasts we talked about some of the patterns; I’ve referenced Martin Fowler and the four patterns that emerge there.

But, you know, when we say event driven architecture, I’ll even talk about UNS being an event driven architecture for OT, what exactly is an event driven architecture?

[Tom]

Yeah, that’s a good question. Event driven architecture: obviously we’ve got events and they are doing the driving, right? It’s right there in the name, event driven architecture.

So, you know, one event could kind of fire off multiple events which fire off multiple events, that kind of thing, right? So we’ve got a bunch of different applications or systems or even within the one application, right? Reacting and subscribing to events and maybe, you know, the temperature fires off in Fahrenheit and, you know, because I subscribe to all the Fahrenheit events, I then transform it into Celsius so then publish out Celsius, right?

So it’s about reacting to those events that sort of come about throughout your system and then firing off other events. I mean, you could keep up a workflow, work orders, orders, all sorts in manufacturing.
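
A minimal in-process sketch of that chaining, just to show the shape of it: one handler subscribes to Fahrenheit events, converts, and republishes a Celsius event that something else reacts to. Topic names are made up; a real system would sit on a broker.

```python
# Tiny in-memory event bus to illustrate events firing off events.
from collections import defaultdict

subscribers = defaultdict(list)

def subscribe(topic, handler):
    subscribers[topic].append(handler)

def publish(topic, value):
    for handler in subscribers[topic]:
        handler(value)

# One service converts Fahrenheit events and republishes Celsius events.
subscribe("line1/temp/fahrenheit", lambda f: publish("line1/temp/celsius", (f - 32) * 5 / 9))
# Another service reacts to the Celsius events.
subscribe("line1/temp/celsius", lambda c: print(f"celsius event: {c:.1f}"))

publish("line1/temp/fahrenheit", 212.0)   # prints "celsius event: 100.0"
```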

[David]

Yeah, I think, you know, at least within the framework or context of an OT environment, historically we’ve always talked about this poll-response type mechanism. And the idea, and this is coming back to the broker that we were talking about earlier, is that within MQTT, yes, there’s always going to be some scanning that occurs at whatever that source is, but unless there’s a change that’s greater than what we’ve determined counts as a change, you know, we’re only going to report by exception.

Well, that becomes that event driven element: we’re not going to do anything if there’s really no new information here, and we’re going to architect that system to respond only to those events. So I’m not going to have all these systems that are going to be interrogating that source of data.

We’re going to have one thing that tells us, and I think really that’s the fundamental piece, at least within the context of manufacturing and OT.

[Tom]

Yeah, maybe it’s best described by its alternatives as well. So you gave request-response as one alternative. Another one typically is periodic, right?

Where maybe the ERP drops a flat file every day of the upcoming orders, right? So as opposed to the flat file coming down once a day, maybe whenever that order is created, it immediately gets sent on to the next system. And then, conversely, you’ve got your request-response pattern there as well.

[David]

Yeah, and I’ve even seen a combination of both, where you’ll have all that. Ultimately, the major event, when that large payload comes through, that’s more event driven. So for instance, I could be doing this poll-response, and when that poll gets a response that says something changed, now I’m going to trigger the main payload event, like, there is a new manufacturing work order that has just arrived, type of thing.

But otherwise, I’m just looking for changes, and, oh, look, there’s a change now; here’s all the information that’s associated with that. Yeah, so there’ll be a combination, because there are some that I’ve done where it’s both: there’s a period where this occurs, you know, or it’s just at a certain time, that type of thing.

But, you know, ultimately the big movement is that event. So, you know, with that, there’s these two concepts that come together going back again to the UNS conversation we had. We spent a little bit of time talking about orchestration and choreography.

Orchestration/Choreography

And I think these are really important in event driven architectures. Orchestration, if you think about an orchestra, which is where the word comes from, means you’re going to have a conductor that is always going to bring a part in. It’s going to look at you and say, here, it’s now your time to go do this thing. Versus choreography: if I’m doing, say, a dance routine, and I can tell you, you don’t want to see me doing a dance routine, but if for some reason I was, now I’m just looking to see what somebody else is doing.

And that’s the cue for me to do the thing that I need to do. And that’s great, because you don’t have maybe the overhead of the orchestration. But boy, if I get off cue, or if I’m wrong about what it is that I’m doing, that could be bad.

So with orchestration and choreography, I mean, how do you use these? What do you see in manufacturing?

[Tom]

Um, so I guess I see a lot more orchestration when it’s more pipelines. There are probably three examples of orchestration I typically see. It’s either a data pipeline, right?

Maybe something happens, we need to grab in some master data, do something else, and then publish something out. Another example of orchestration would be like an MES or SAP, right? Where we’ve got those standard manufacturing instructions, where it’s all defined in sort of one system, and we are just orchestrating through that workflow, pulling out to the control system, right, where it makes sense.

So, I mean, like I said, I do a lot in sort of high-compliance industries, and so orchestration is definitely what we prefer for certain things, you know, making pharmaceutical batches and things like that. Choreography, I mean, that plays more into the UNS, right?

Where you’re a little bit less governed on what should exist, and systems are sort of publishing and subscribing to each other. So, yeah.
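
A very rough sketch of the difference, with invented step names: an orchestrator owns the sequence and calls each step, while in choreography each service just reacts to the previous event and publishes the next one.

```python
# Orchestration vs choreography, in miniature (step names invented).
from collections import defaultdict

def weigh(x):  print("weigh", x); return x
def mix(x):    print("mix", x);   return x
def fill(x):   print("fill", x);  return x

# Orchestration: one coordinator calls each step in a defined order.
def orchestrate_batch(order):
    fill(mix(weigh(order)))

# Choreography: each service reacts to an event and emits its own,
# with no central coordinator.
subscribers = defaultdict(list)
def subscribe(topic, fn): subscribers[topic].append(fn)
def publish(topic, data):
    for fn in subscribers[topic]:
        fn(data)

subscribe("order/released", lambda o: publish("batch/weighed", weigh(o)))
subscribe("batch/weighed",  lambda b: publish("batch/mixed", mix(b)))
subscribe("batch/mixed",    lambda b: fill(b))

orchestrate_batch("ORDER-1")            # orchestrated path
publish("order/released", "ORDER-2")    # choreographed path
```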

[David]
And they’re both approaches. It’s just like everything: it’s not that one’s right or wrong.

It’s just that there are different approaches you can take for what it is you’re trying to do. And there are risks of not using orchestration. I sort of mentioned it earlier: with orchestration there can be some overhead, whereas with choreography there really isn’t any. It’s, hey, you just let the system go and hope that it gets there and implements some of the governance that’s there. You mentioned data pipelines.

Can you talk a little bit about data pipelines?

[Tom]

Um, sure. I kind of see three different types of workflow in manufacturing. I think there is the data pipeline flow, right?

We’ve got data in one system. We’re transforming, republishing, that kind of thing. Maybe bringing in an ID schema or normalizing the data into some sort of other format.

We’ve got the sort of workflow of an MES per se, like the steps of an SOP, right? That’s a sort of workflow. And then we’ve also got this process flow about the process itself.

So, you know, the baking-a-cake example: mix our dry ingredients, mix our wet ingredients, mix them together, put it into a cake tin, bake, and then decorate, or something like that, right? So those are the sort of three different workflows that I see in manufacturing.

[David]

Okay. Yeah. And then just going back on this orchestration piece, you were talking about the workflows.

I mean, so I always think about, say, work instructions of once I get to this step, now here’s what I’m going to do next. It’s like putting together a Lego, you know, a Lego assembly. It’s there’s my work instructions, if you will.

Well, that’s that orchestration piece of you’re going to do this. Now you’re going to move on the next piece and do this set. It almost seems like Legos, building a Lego something or other would be a great way to demonstrate a lot of what this stuff is.

[Tom]

Yeah. Yeah. I agree.

I mean, you could always think of, like, choreography. I’m trying to think of a good choreography example. Maybe the way that some manufacturing sites can arrange their lines is more choreographed, right?

All I know is that my bottling line accepts bottles, right? Whatever I plug up to that, it doesn’t really matter, right? No one’s orchestrated that you must connect this particular machine that makes bottles into the bottling line.

Yeah, I think those are choreographed, versus the actual order that’s being produced.

[David]

Yeah, I think those Rube Goldberg machines or when they set up all those dominoes, you kick it off once and that rest of that thing’s choreographed and you hope to God that the mouse does what it’s supposed to do. Exactly.

[Tom] Yeah, right.

Microservices

[David]

Because you wouldn’t want to get that wrong. But hey, what do you do? So another piece that I run into a lot, especially in this space: Rick Bullotta was on for the Uber broker episode, and we talked a little bit about microservices.

And this is one of those terms that gets thrown around a lot. And it’s like, I know there are services that run on, you know, servers and they’re doing certain things. But now we talk about microservices.

So is it just a smaller service? Something very lightweight that does something unique? How does that fit into this event-driven architecture and orchestration?

[Tom]

Yeah, I guess it ties into choreography and, like, nodes in the ecosystem, right? So a microservice is a little node that’s maybe doing a very small bit of functionality, right? What is it?

Linux, you know, Linux is built on a bunch of little utilities that do one thing and do that one thing very well, right? And then it’s up to the user to orchestrate those things together. It’s a similar sort of thing with microservices.

You could build a bunch of little applications that pub and subscribe, right, to your ecosystem, where each one does one thing and does it really well. Yeah.

[David]

Now, when we talked about microservices, especially at the edge, you know, maybe I’m doing some kind of machine learning at the edge every 15 minutes, five minutes, some period. In this case, I’m going to look to see what’s going on, I’m going to run it, and then if there’s something of interest, if the result is true, then go do something with that result.

[Tom]
Yeah.
[David]
Type of thing. [Tom]

You know, the alternative is what they call a monolith. And that’s basically putting all your code into one code base and then publishing that whole big app. If we can compose our app into little individual services and split them out into different apps, the idea is that you can deploy these little apps individually, as long as they fulfill that data contract, right?

With whoever they serve or whatever they need to do. As long as that data contract doesn’t change, you can just, you know, innovate on that one little microservice independently of deploying a whole system, right?

It also reduces your testing, because now you have to test the interface and the boundaries rather than having to test the system as a whole. So you can get sort of faster innovation through a microservices architecture.
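
As a sketch of that data contract point, here is a single-purpose service built on an assumed FastAPI and pydantic stack; the endpoint and fields are invented. As long as the request and response models stay stable, the internals can be redeployed on their own.

```python
# Single-purpose microservice sketch (assumed stack: FastAPI + pydantic).
# Endpoint name and fields are illustrative, not a Rhize API.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class OeeRequest(BaseModel):    # the data contract coming in
    availability: float
    performance: float
    quality: float

class OeeResponse(BaseModel):   # the data contract going out
    oee: float

@app.post("/oee", response_model=OeeResponse)
def compute_oee(req: OeeRequest) -> OeeResponse:
    # One thing, done well: multiply the three OEE factors.
    return OeeResponse(oee=req.availability * req.performance * req.quality)

# Run with: uvicorn this_module:app
```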

[David]

Yeah, absolutely. And I think just, you know, really the intent here, and we’ve talked about it earlier, is it’s not one, it’s not bad versus good. It’s that these are all the things that you need to consider and think about in every way.

We’re starting to connect data, connect systems. These are all the things that go into it. But I think it’s important at least to understand, you know, if it’s the microservice versus the monolith, what do those mean?

What are the risks? What are the upsides and those types of things?

[Tom]

Yeah, maintainability is a big one, I’d probably say. And microservices is a way to, when you have very large teams and teams of teams sort of thing, to sort of carve out a piece of functionality and make the whole team responsible for it, rather than monolith, where maybe you’ve just got the one team.

[David]

Yeah, yeah. But it’s to your point, if you make one change, then you have to deploy and undeploy the whole thing. You don’t get the chance of, whoops, let’s just roll back this one thing.

Yeah, that can be a challenge itself. 

Part 2

 

Inheritance vs Composition

So let’s talk a little bit more about some data modeling. You know, this is more the data ops piece.

We talked a little bit about it before, but it’s the idea of inheritance and composition. Inheritance is used a lot in programming, but I think data ops is really the context we want for it here, and the same with composition. It’s another one of these, I wouldn’t call it a continuum, but the idea is that there are going to be some things that are inherited and some things that are composed.

But what do those terms mean within the context of manufacturing data?

[Tom]

Yeah, I think OSIsoft PI’s Asset Framework is probably a great example of inheritance, right? Where you’ve got templates, and you’re going to have a template that is a production line, right?

And then maybe for my site, I’ve got my Charlotte production line, right? That inherits from the global template for a production line. Maybe at the global level, all production lines, you know, must have a speed and a speed set point.

Maybe at Charlotte, they’ve got a speed, a speed set point and like a temperature or something, right? I’m extending that template with additional properties as I sort of go down that inheritance tree. Then you can apply those classes or templates onto instances.

So if I’ve got five production lines, I could say, hey, you’re each a Charlotte production line, right? And you then inherit those properties. So those production lines, all five now have those three properties because I’ve assigned that class to these five production lines and they inherit the properties.

So typically it’s a class of something with properties, behaviors, rules; it’s not necessarily limited to properties, it could be some type of behavior. And then you’d have a top-level item that inherits down, and then you can eventually apply that to an instance.
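
A minimal sketch of that template idea in plain Python classes, mirroring the production-line example; the property names and values are illustrative.

```python
# Inheritance sketch: a site-level template extends a global template.
class ProductionLine:                               # global template
    def __init__(self, speed, speed_setpoint):
        self.speed = speed
        self.speed_setpoint = speed_setpoint

class CharlotteProductionLine(ProductionLine):      # site-level template
    def __init__(self, speed, speed_setpoint, temperature):
        super().__init__(speed, speed_setpoint)
        self.temperature = temperature              # extends the base

# Five instances all inherit the three properties from the templates.
lines = [CharlotteProductionLine(120, 125, 68.0) for _ in range(5)]
print(vars(lines[0]))   # {'speed': 120, 'speed_setpoint': 125, 'temperature': 68.0}
```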

[David]

Okay, yeah, I generally do inheritance like if I’m working with asset models. So I might have just my basic pump, which is telling me I have a pressure and a temperature and maybe a flow, but then I might have other complex pumps that have a lot more information to it. Maybe I’m going to throw in a motor there as well.

But if I make a change to that base model, that now gets inherited by every model that’s based off of it. And it also works really well, so I’m a big fan of Inductive Automation and their Ignition product at level two, especially for building these, since we’re talking a lot about the UNS.

There are parameters that are associated with what they call UDTs, which are, you could call them a class, if you will, and you can instantiate them, but I can inherit those parameters as well. So if there are certain things that I want to pass through, it makes it very easy to build some very complex models. But the downside of doing that is where we get into the whole composition piece, because maybe I don’t want all of those things there.

So how does composition differ in when you want to use one versus the other?

[Tom]

You know, sometimes inheritance is very singular, like your instance can only inherit from one template. But when we talk about composition, maybe we have classes that compose all different bits of functionality. So maybe you’ve got a class for OEE, you’ve got a class for SPC, you’ve got a class for vibration analysis, right?

And then I have my instances in my world: this motor has vibration analysis, and it has SPC, right, on flow rate or something, or some sort of quality point.

So, you know, with composition, you can compose all these classes to bring in certain behaviors and start to apply them to specific instances. You’ve also got composition from another point of view, I guess, and that’s the data modeling point of view, where you may have entities that only exist in the context of their parent.

So there’s two types of composition. You can sort of do modeling with composition or you can do like, what would be an example? Like equipment properties only exist in the context of the equipment.

You model and use, you know, compose behavior to generate your final results. So your equipment instance could be composed of multiple classes. That’s one way of talking about composition.

The other way is in the sort of the way that you model your schema through composition. Or the way you message through, you know, these data structures. So I could have equipment properties and they only exist in the context of equipment.

So temperature 12 doesn’t mean anything to anybody, right? Temperature 12 on, you know, pump one, one, two, three does mean something, right? So you could say that that temperature only exists in the context of the motor through the relationship called composition.

The other one to that is an aggregation where these entities can exist outside of each other. So an example of that could be, maybe I’ve got pump one, two, three, and I’ve also got maybe a work master perhaps, right? Where these things can exist independently of each other but also be related, right?

So I could see this work master has an equipment specification, which is that motor one, two, three. So yeah, there are three different ways you could talk about composition there.
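
And a matching sketch of composition, where capabilities like SPC or vibration analysis are separate classes that an instance is assembled from rather than a single inheritance chain; the class and property names are made up.

```python
# Composition sketch: an instance is composed of capability classes.
class SPC:
    def check(self, value, lcl, ucl):
        return lcl <= value <= ucl                    # simple control-limit check

class VibrationAnalysis:
    def rms(self, samples):
        return (sum(s * s for s in samples) / len(samples)) ** 0.5

class Motor:
    def __init__(self, name, *capabilities):
        self.name = name
        self.capabilities = list(capabilities)        # composed, not inherited

motor = Motor("pump-123-motor", SPC(), VibrationAnalysis())
spc = motor.capabilities[0]
print(spc.check(value=150, lcl=100, ucl=200))         # True
```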

[David]

Yeah, and in some ways I’ll also use these in this data governance concept, especially when I’m starting to build equipment models: what do I want to make, you know, we’ll call it a folder. I’ll just reference back for people that are familiar with Ignition.

Do I want to make it a folder? Do I want to make it a UDT with all these nested UDTs that are part of it? And that’s the balance of what you want to have inherited, what you want to force and govern and say you must do this thing, versus here you’re going to compose. You know, if you’re at a work center, you’re going to compose what some of the work units associated with it are, and what some of the other enforced data models associated with it are. So, for instance, you were talking about, do I have an SPC measurement here? Do I have an OEE measurement that exists within that?

So you can compose it, but once you have those composed, then you can also have different flavors, if you will, of that pump, different flavors of, you know, OEE, those types of things. So, you know, it’s one of the balances.

Aggregation

So you brought this word up here, it’s called aggregation.

And when I think of aggregation there’s some really great data tools that are analytic tools that are out there. Actually Flow Software, we had Lenny from there coming in talking about their new time-based historian on one of our time series podcasts. But when I think about what Flow Software does, you know, it’s aggregations and event context.

So, but in that, you know, for me there it’s I’m now taking some time series data and I’m looking at min, max, average, you know, there’s a lot of things I can do. Those are all aggregations that I can add. But, you know, what else can aggregation mean within the context of manufacturing data?

[Tom]

Yeah, well, we talked about data modeling aggregations. That’s where entities can exist, you know, maybe it’s an aggregation relationship between entities, and they can exist outside of each other. So I could talk about and share information about a piece of equipment, or I could talk about and share information relating to a work master, right?

So they can exist separately. So that would be an example of, I guess, aggregation. But then, like you said, you can aggregate data itself as opposed to talking about data relationships.

And yeah, min, max, deltas, moving averages, moving range. That’s it, the SPC stuff’s coming back to me.

Oh yeah, yeah.

[David]

There’s a lot of things you can do in there. You know, what was the, you know, yeah.

[Tom]

Totals, sums, you know, a bunch of different math.
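
For the time-series sense of aggregation, a short sketch using pandas as an assumed tool (not how Flow does it internally): raw readings are rolled up into min, max, and mean per interval, plus a moving average of the kind SPC rules sit on.

```python
# Time-series aggregation sketch with pandas (values are illustrative).
import pandas as pd

readings = pd.Series(
    [71.2, 71.9, 73.4, 72.8, 74.1, 73.0],
    index=pd.date_range("2024-01-01 08:00", periods=6, freq="10s"),
)

# Roll raw values up into 30-second buckets: min, max, mean.
print(readings.resample("30s").agg(["min", "max", "mean"]))

# A 3-sample moving average, another common aggregation.
print(readings.rolling(window=3).mean())
```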

[David]

Yeah, you can do all your rules in there as well. All the Nelson rules of, you know, all the various things that occur within SPC. So you have used an ISA 95 term.

And in case people that are listening to this aren’t as familiar with ISA 95, go back to the very beginning of our podcast; you’ll pick up a whole lot about ISA 95. But the concept of a work master, since you’ve used it a couple of times, can you break that down real fast?

[Tom]

Yeah, sure. It is product definition to the level required to make it on site, basically. And when I say product definition, I mean, what are the steps involved?

What are the resources? And when I say resources, I mean, you know, people, equipment, physical assets, and materials. And then what’s all the steps required to make it?

It’s that detailed SOP with resources applied.

[David]

Yeah, so as a teaser: it’s taking the Part 3 activity models, and then, from the plan, all the resource models and information models that are associated with getting it all the way through, to that level of detail. And work instructions are a part of a work master, if I have that right.

[Tom]

So, yeah. Yeah, it’s the steps required to actually make it on site. It usually relates either to an operations definition, operations segment or process segment, depending on where you’ve modeled your product definition.

Usually your ERP system has some sort of concept of how to make something. Maybe it knows you just need a production line and has a BOM. Maybe it could just be one manufacturing process step.

Maybe it knows you’ve got to make and package. Maybe it knows you’ve got to, you know, mix, bake, package. So it depends on where you put the definition.

Do you put it up in your ERP? Do you put it down in your layer three space? And then what’s that common understanding between business applications and between the business of how to make something?

And that’s usually your process segment there. But the work definition is, you know, from layer three down, like it’s the actual detail on how to make it. So you might find it’s a one-to-one, you know, work master, the operation definition, maybe it’s a one-to-many, many-to-many.

It really depends. And sometimes your work masters are also reusable. Maybe you’ve got like an inspection work master, right?

And you’ve got 15 different inspections throughout making something. So there’s a little level of reusability in that as well.
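
A loose sketch of what a work master might carry as a data structure: the steps plus the resource specifications needed to execute them on site. The fields are illustrative and only roughly ISA-95 shaped, not the actual Rhize or ISA-95 object model.

```python
# Work master sketch (illustrative fields, loosely ISA-95 shaped).
from dataclasses import dataclass, field

@dataclass
class ResourceSpec:
    kind: str          # "personnel" | "equipment" | "material"
    reference: str     # e.g. an equipment class or material definition id
    quantity: float = 1.0

@dataclass
class WorkMasterStep:
    description: str
    resources: list[ResourceSpec] = field(default_factory=list)

@dataclass
class WorkMaster:
    id: str
    operations_definition_id: str      # what it relates back to
    steps: list[WorkMasterStep]

inspection = WorkMaster(
    id="WM-INSPECT-01",
    operations_definition_id="OD-WIDGET-STD",
    steps=[WorkMasterStep("Visual inspection",
                          [ResourceSpec("personnel", "QA-Tech")])],
)
print(inspection.id, len(inspection.steps), "step(s)")
```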

[David]
Okay. Yeah. So just keep on the lookout for some upcoming ISA95 training we’re going to offer. We’re going to get a lot into work masters and what that means. And to me, they’re the glue that’s just so critical to getting everything together. It’s like the orchestration of your manufacturing process.

[Tom]
More like the ontology. Yeah.

[David]
Well, there you go. Exactly. Exactly.

Context

So, last word, or just one thing I’ve noticed I’ve used a lot: it’s context. And it’s kind of funny that we just throw these words around, and here we are in a podcast, you know, in the context of... So when we say context, really from a data standpoint, the thing we always talk about is how we’re adding context to the data moving up. So for instance, if I just have a number, well, that doesn’t mean anything until I give it a unit of measure.

Well, that’s now adding context. So what are some other examples of what we mean by context? It’s within the boundaries, or within the frame, of something.

[Tom]

Yeah. I mean, like you said, it’s basically the frame of reference for a particular bit of information, right? Like you said, that data value changing may mean something, but when you add a unit of measure, it means something else.

When you associate that with maybe a P&ID tag, that adds more context. When you know it’s, like, TI99, and TI99 is part of bioreactor one, that adds even more context, right? And then you add the context that the bioreactor is producing something right now, producing in the context of order PO1234, whatever it happens to be, right?

So it’s just about adding more context. And maybe order PO1234 is being made to work master, you know, standard widget one, right? And standard widget one has the context that that temperature should be between 100 and 200 degrees or something like that, right? So it’s all these layers of context that you can start to add, right?

And there’s this frame of reference under which that value could mean something important or not, right? And that’s why we talk about context. So is the value 100 important?

I mean, who knows, right? We need more information. We need to know what it actually is, what do we plan it to be, and what is our standard definition of what it should be, right?
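
A small sketch of those layers as data: the same raw value gains meaning as each layer of context is attached. All of the names and limits below are invented.

```python
# Layers of context around a single raw value (all names invented).
reading = {
    "value": 100,                       # on its own, meaningless
    "uom": "degC",                      # unit of measure
    "tag": "TI99",                      # P&ID tag
    "equipment": "bioreactor-1",        # where it was measured
    "order": "PO1234",                  # what was being made at the time
    "workMaster": "standard-widget-1",  # how it was supposed to be made
    "limits": {"low": 100, "high": 200},
}

in_spec = reading["limits"]["low"] <= reading["value"] <= reading["limits"]["high"]
print("is 100 important?", "in spec" if in_spec else "out of spec")
```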

[David]

Yeah, and a lot of these, I mean, going back to the very first thing we talked about, broker: it’s like, well, it depends on the context. So we may mean, in what use or within what capacity are you using this?

So I think there’s a lot of ways that context can come in. I mean, even there was a post on LinkedIn a while back about, what does SCADA mean to you? And it’s a different thing.

If you’re an oil and gas versus in a refinery versus in a manufacturing, like if you’re doing discrete manufacturing, SCADA has very different meanings. Well, it’s within the context of what that is. So it’s always one of those funny words we all seem to kind of understand, yet we use it, and we’ve even used it on this podcast.

And here we are attempting to define it. Well, we just know it when we see it, I think. Yeah, it’s a frame of reference, I guess you could say.

Frame of reference. Great, great, great adder there for that.

Applications of the word 'Graph'

So the last thing I want to talk a little bit about is this concept of graph, GraphQL, you know, dGraph.

I mean, we hear graph a lot, especially when we start talking data ops and data persistence. And it’s these things that I don’t know that people rightly understand that, you know, believe this or not, Graph and GraphQL are just as different as Java and JavaScript, right? So unpack this for us, Tom, what are these things?

[Tom]

Well, if you’re paying attention, GraphQL, that QL on the end is probably, you know, the giveaway there. GraphQL is a query language. That’s what the QL stands for, kind of like SQL, GraphQL, right?

It is purely a query language. It can sit above different databases, right? It doesn’t necessarily have to be a graph database.

So typically when people talk about graph and graph structures, maybe they’re referring to a graph database. Like there’s Neo4j, there’s Rhize, there’s dGraph as well. But GraphQL is just a query language.

So there’s nothing stopping me from putting GraphQL on top of Postgres or a relational database. It’s purely a query language. It is nice as a query language.

It’s as if sort of REST and SQL had a baby, I would say. You can do a HTTP post with your query as the body and get a response, which is really nice. So it works very well with web applications, right?

It is JSON forward as well. And so also it works very well with the web. The other cool benefit is that you can navigate relationships very easy with GraphQL.

If you’ve been in manufacturing for a while, doing the reporting side of manufacturing in SQL, you’ll end up with these 400-, 500-line SQL scripts just to do your report or get your data structure, right? And you’re always limited, usually, to just one table at the end, right? Here is my final aggregate result.

It took me these 500 lines, right? In GraphQL, relationships are as easy as just like nesting the values you want within the context of the thing you’re querying. So if I want equipment properties, I could just query equipment and then properties and on the properties, I maybe just want to get the ID, right?

And so you can describe joins really easy in GraphQL, which is really nice. No more left, right, join, inner join, outer join, nested tables. All that kind of falls away because you can just navigate these relationships really nicely.

The other benefit is you can describe, like kind of like SQL, you can describe what your resulting data structure looks like. And so you can do like, I guess I’ll put an equivalent for SQL. You can do like nested tables within your return result.

It’s very crude, crude analogy, but hey, give me equipment, give me their properties. Now give me that, maybe equipment, equipment actuals, their job response, maybe the job order, then the work master, right? So you can navigate all these structures quite easily in a GraphQL query, which is really nice to grab out the data set that you’re interested in.

Yeah, and importantly, as you said, it also prevents over-fetching. In, like, traditional REST APIs, there’s usually a standard structure that it responds with every time.

So with, like, an OpenAPI spec, you’ll say, hey, get equipment, here’s what you get, right? And you’re always going to get all of that, even if you needed just that one value, or just that other value down there, or the third piece of equipment in, right? Whenever you do REST, the publisher of the data is already telling you what information you’re going to get and the format you’re going to get it in.

In GraphQL, you can specify just the fields you want, which prevents what they call over-fetching.
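
As an illustration of that nesting, here is the shape of a GraphQL query posted over HTTP from Python. The endpoint URL and field names are hypothetical, not Rhize’s actual schema; the point is that the query itself declares both the joins and exactly which fields come back.

```python
# GraphQL-over-HTTP sketch (pip install requests).
# Endpoint and field names are hypothetical, not an actual schema.
import requests

query = """
{
  equipment(id: "pump-123") {
    id
    properties { id value }
    equipmentActuals {
      jobResponse { jobOrder { workMaster { id } } }
    }
  }
}
"""

resp = requests.post("http://localhost:8080/graphql", json={"query": query})
print(resp.json())   # only the fields asked for: no over-fetching
```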

[David]

Yeah, I was going to say, I think the important thing here is that it’s the query language piece. It’s just the syntax, it’s how you go about retrieving data that could be in any kind of data sets, whether it’s Postgres or SQL. It’s just, it’s the language that gets you there, I think is really the key piece.

So you mentioned you don’t have to use GraphQL with a graph database. So what is a graph database then? So you mentioned like dgraph and Neo4j.

You know, it’s in a generic sense, it’s a directed graph, but what exactly is it? How is that different and what does that mean?

[Tom]

Yeah, so in a graph database, we’re typically storing nodes and edges, really. That’s the key difference, right? In a relational database, you’ve got tables and rows, right?

In a graph database, we’re storing nodes and edges, right? And you can still define schemas in graph databases; we do that with Rhize.

But yeah, at its core, it’s not storing like a table file with rows, it’s storing a list of nodes and basically all their attributes and then the edges. And then you can put data technically on those relationships as well. So you can add weights and things like that, right?

[David]

Okay, and the edges are the relationships in the data. Okay, yeah, perfect. The way it was described to me is: imagine you’re on a football pitch, and you have the players that are out there, and what’s the relationship between all of those players at any one moment in time?

And so the player is going to be that, is the node and it’s going to have certain characteristics or attributes about that particular player. And then where is that person in relation to the position, in this case of all the other players there. And then you put in the ball, then you put in the coach and it just makes it even more challenging.

[Tom]

Yeah, the technical example is like social networks, right? Friends and, you know, you’re a node, you’re a person and you’ve got friends, but you can also create multiple relationships to the same, same node. And so it’s quite performant in traversing those relationships.

Yeah, it’s the equivalent of a lot of joins. So, like, join, join, join, maybe iterative joins, right, based on the same table.

Whereas with graph, you can navigate those structures way more performantly than you probably could with a SQL database or relational database.

[David]

Yeah, I mean, I think that’s where knowledge graphs or the directed graphs come in. It’s just that complex relationship where it’s, you know, yes, you’re a friend of, but you’re also a customer of, and then you’re also a supplier too. And, you know, that’s how we define those edges of what are all the relationships here and how all that stuff fits.

You know, very quickly you can find out, remember this old game we used to play about there’s a name of a person that lives on a street that has a particular color telephone. Well, you know, the knowledge graph helps put all these things together.

[Tom]
Yeah, and I think that was it, the impedance mismatch is also low. It’s a good thing.

[David]

Yeah, we’re going to talk about impedance mismatch coming up: why it matters, why it’s so important, and some history on what a directed graph is. And maybe we’ll get a little in the weeds, a little too much into the territory of all the people who wear propeller hats. But I think it’s very important for us to understand, because this is something you’re going to run into. Going back to the episode where we talked about why your backend matters and the different storage mechanisms: when you have highly complex relationships within data, you have to use certain storage formats, like a graph database, versus, you know, columnar data, tabular data, or row-based data.

Yeah, yeah.
[Tom]
Yeah, finding the shortest path between two nodes as well, right?

[David]

Yeah, there’s a lot that goes into it, but once you’ve unlocked that, it makes it very, very quick to retrieve a lot of different data as well.
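
A tiny sketch of the nodes-and-edges idea and the shortest-path point: the graph below is just an adjacency list of invented entities, and a breadth-first search finds the shortest hop count between two of them.

```python
# Nodes and edges as an adjacency list, plus a shortest-path walk.
from collections import deque

edges = {
    "pump-123": ["motor-123", "line-1"],
    "motor-123": ["vibration-sensor-9"],
    "line-1": ["work-order-77"],
    "work-order-77": ["material-lot-A"],
    "vibration-sensor-9": [],
    "material-lot-A": [],
}

def shortest_path(start, goal):
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in edges.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])

print(shortest_path("pump-123", "material-lot-A"))
# ['pump-123', 'line-1', 'work-order-77', 'material-lot-A']
```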

Final Thoughts and Data Calculations

So we’ve covered a lot of ground here in the last hour or so. Any other terms or things that you hear a lot that, well, don’t really get used properly? Or other things where it’s like, hmm, yeah, I don’t know.

I don’t think that word means what you think it means kind of thing.

[Tom]

Yeah, I had an interesting discussion with a colleague this morning on data calculations in manufacturing, and there was definitely a misunderstanding around that one, around the different types of data calculations you’ll do. Sometimes it’s stateless: converting Celsius to Fahrenheit, maybe within the context of a property.

Sometimes you are using, you’re doing a calculation that requires some level of history on other tags or other properties, right? So maybe I want to calculate the rolling average of some particular value and then recalculate and republish that. And sometimes you’re calculating not only history of other tags, but of the calculation itself, right?

And there’s this whole concept of, well, where do these different calculations live within your ecosystem, right? So just the idea of data calcs, and trying to tease apart that, you know, when we say data calculations or data transformations, sometimes they’re simple. Sometimes we’re talking about just simple math.

Sometimes it’s a lookup table, right? Maybe we’re converting one system’s, what do I call it, enum or enumeration of machine state into another system’s states, or the order status into a different order status, right?

That everyone can understand. Sometimes they’re simple. Sometimes they’re complex, right?

And knowing the best place to put them is, you know, you can’t just generalize and treat them all the same. Yeah, exactly.
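
A short sketch of those three flavors, with made-up values and mappings: a stateless conversion, a rolling calculation that needs history, and a lookup-table mapping between two systems' enumerations.

```python
# Three flavors of data calculation (values and mappings are made up).
from collections import deque

# 1. Stateless: simple math on a single value.
def c_to_f(celsius):
    return celsius * 9 / 5 + 32

# 2. Stateful: needs history of the tag (a rolling average).
window = deque(maxlen=5)
def rolling_average(value):
    window.append(value)
    return sum(window) / len(window)

# 3. Lookup table: map one system's machine-state enum onto another's.
PLC_STATE_TO_MES_STATE = {1: "RUNNING", 2: "IDLE", 3: "FAULTED"}

print(c_to_f(100))                                  # 212.0
print(rolling_average(10), rolling_average(20))     # 10.0 15.0
print(PLC_STATE_TO_MES_STATE[3])                    # FAULTED
```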

[David]

Yeah. What you’re talking about, we’re running into this a lot, is this what’s known as sensor to signal. So here’s what I’m measuring, but that’s not actually the thing that I want.

So how do I take the five things that I have and slice and dice that in order to get the value or the signal, the thing that I’m actually looking for?

[Tom]
Yeah. And then do you want to like persist that? Do you want to backfill persist that?
If you change that, what happens? If you redefine that, what happens? Yeah, there’s a lot. That was a little topic that came up just this morning, actually.
[David]

Yeah, no, there’s a lot that goes into it, certainly with data. But all right, cool. Well, I think we’ve covered some really great ground here in terms of starting to define, or at least describe, what some of these terms are and what we mean by them. But, you know, I think really the key takeaway here is that it all depends on the context in which the words are used, and making sure that when a word gets used, we do a level set so that people can at least understand each other.

Let’s make sure we’re on the same page when we’re talking about these things. Hopefully it was helpful to understand what some of these terms mean. And of course, some of these concepts somewhat exist in pairs; it’s not that one is good and one is bad, it’s just that there are all these considerations for how you go about solving these problems and how you want to deploy the technology.

[Tom]
So yeah, right tool for the right job.

[David]

Right tool for the right job. So perfect. Well, Tom, thank you so much again for being on the Rhize Up podcast.

Look forward to having you back in a future episode. And for everybody who is listening, thanks for stopping by and checking out the defining episode of the Rhize Up podcast.

[Tom]
Appreciate it, David. Thanks.