This blog provides a quick introduction to the tradeoffs between industrial historians and time-series databases from a practical, manufacturing-centered perspective. Why are we writing this? If you search the internet, you’ll find no shortage of blog posts about this topic. But, as it turns out, these posts are usually written by vendors of historians and time-series databases. As you can imagine, the arguments always lean strongly towards one side or the other.
At Rhize, we’re in the business of being a contextualization hub for a variety of data sources. We have customers who use time-series databases, historians, or both. We also have extensive operational experience on both sides. So we can discuss this topic with less vested interest.
What’s the difference?
For manufacturers, historians and time-series databases serve essentially the same goals: collect and store operational data and provide an interface for real-time monitoring and analysis. So what’s the difference?
Let’s be brief with our definitions:
- Time-series database. A database designed to store and serve time-series data. Many implementations exist, all of which make some tradeoff between read performance and write performance. Examples include InfluxDB, Timescale, and QuestDB.
- Industrial historian. A time-series database designed for industrial data that typically comes with a suite of software to support manufacturing analysis. Examples include AVEVA PI (formerly OSIsoft PI), AspenTech InfoPlus, and Canary Historian.
Besides implementation details, the other major difference is the culture from which these two systems come. Historians have their roots in manufacturing. Some solutions that are still on the market began development in the 1980s. Time-series databases are conventionally developed as open-source solutions for a variety of applications, like IT-service monitoring and financial-data analysis.
This cultural difference also explains the main benefits and drawbacks that each system brings for the manufacturing integrator.
Historians: venerable systems for manufacturing context
Industrial historians have been around for a long time, and they have been widely used in manufacturing systems around the world. While their codebases might not reflect the bleeding edge of software development, they absolutely can still bring value for manufacturing operations. And this specialization and history bring many advantages out of the box.
Purpose-built for operational data
The essential function of a historian is to ingest and store high-volume streams of operational data from the plant floor. This specialization means that historians come with many out-of-the-box features that would require much customization and development to achieve with a more general time-series database:
- Built-in data validation. Historians typically evaluate the data that they ingest against some known quality range. This check adds a built-in level of cleanup to make analysis more meaningful.
- Built-in contextualization designed for manufacturing. Historians store data in a representation that provides a way to view a data point in the context of its place in the wider operation (typically, this maps to the equipment structure of the plant). It is easy to address plant data in a context that makes sense for the integrator, for example, querying by “this batch, this unit procedure, this vessel.”
- Ready integration with manufacturing systems. From the view of general-purpose software makers, PLC, OPC UA, and SCADA are niche data sources (if known at all). For manufacturing, these systems and data streams are foundational. Makers of historians pay employees to make these integrations work. While you might be able to find a community plug-in or even write your own, manufacturing data sources rarely have such first-class support in a general-purpose time-series database.
- Modules for analysis. As mentioned in our definition, a historian is usually not only a database; it also has modules to analyze and visualize data to serve different manufacturing use cases.
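To make the first two items in this list more concrete, here is a rough sketch of the kind of logic an integrator might end up writing on top of a general-purpose time-series database. It is not the API of any historian or database. Every name in it (`Reading`, `QUALITY_RANGES`, `validate`, `query`, and the tag names) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone

# Hypothetical acceptable ranges per tag. A historian usually ships with
# this kind of configuration; here we would have to maintain it ourselves.
QUALITY_RANGES = {
    "reactor1.temperature_c": (0.0, 150.0),
    "reactor1.pressure_bar": (0.0, 10.0),
}


@dataclass(frozen=True)
class Reading:
    tag: str
    value: float
    timestamp: datetime
    # Manufacturing context stored alongside the value.
    batch: str
    unit_procedure: str
    vessel: str
    quality: str = "unchecked"


def validate(reading):
    """Return a copy of the reading flagged 'good', 'bad', or 'unknown'."""
    limits = QUALITY_RANGES.get(reading.tag)
    if limits is None:
        quality = "unknown"  # no range configured for this tag
    else:
        low, high = limits
        quality = "good" if low <= reading.value <= high else "bad"
    return replace(reading, quality=quality)


def query(readings, **context):
    """Select readings whose context fields match, e.g. batch='B-42'."""
    return [
        r for r in readings
        if all(getattr(r, key) == wanted for key, wanted in context.items())
    ]


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
raw = [
    Reading("reactor1.temperature_c", 72.5, now, "B-42", "heat", "V-101"),
    Reading("reactor1.temperature_c", 900.0, now, "B-42", "heat", "V-101"),
    Reading("reactor1.pressure_bar", 2.1, now, "B-43", "charge", "V-102"),
]
checked = [validate(r) for r in raw]
# "This batch, this unit procedure, this vessel", keeping only good data.
selected = query(checked, batch="B-42", unit_procedure="heat",
                 vessel="V-101", quality="good")
```

Even this toy version leaves open questions that a historian has typically already answered, such as where the ranges come from and how the equipment hierarchy is kept in sync with the plant.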
Stable and socially safe
It’s not uncommon for a historian to remain in a plant for at least a decade. This long-term usage forces a certain commitment to backwards compatibility. Open-source databases, on the other hand, exist in a landscape that is more fluid and more vulnerable to breaking changes. As an example, InfluxDB is now in its third version, having introduced and deprecated its Flux query language and completely rewritten its codebase in Rust. In manufacturing, which develops on a much longer time scale than, say, a cloud-based web application, this instability can make people wary.
That leads to another point: in manufacturing, industrial historians still represent the “safer” bet for many operational cultures. Just as IT departments in large banks won’t get fired for buying IBM, integrators in legacy manufacturing operations won’t get fired for buying AVEVA. While this sounds cynical, the truth is that, unless a manufacturing firm is ready to commit to investing resources and brainpower in more modern solutions, it is better served by the built-in capabilities of a historian.
Time-series DBs: free, extensible, and new
If historians are strong because they support manufacturing specificity, time-series databases are strong because they support the general-purpose demands of modern computing.
Free and extensible
Time-series databases are usually open source. While maintaining the DevOps infrastructure to support high volumes of time-series streams is not trivial, running your own stack also gives you much more flexibility. Compared to the onerous licensing contracts of industrial historians, time-series databases provide many more ways to ingest operational data in the way that makes sense for your organization’s particular constraints in time, money, and IT expertise.
The open-source aspect also provides many more ways to customize and improve your deployment as you need. Want to write your own compression algorithm or build an API layer for your specialized screens? The code is there to modify. Want to understand a particular bottleneck? The code is there to study.
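As one small illustration of the kind of customization meant here, the sketch below shows a deadband filter, a simple lossy way to thin out a stream before storing it. This is an assumption-laden example, not the compression scheme of any particular database or historian, and the function name and parameters are hypothetical.

```python
def deadband_compress(points, deadband):
    """Drop points that barely differ from the last point that was kept.

    points: (timestamp, value) pairs in time order.
    deadband: minimum absolute change in value needed to keep a point.
    The first and last points are always kept so the time range survives.
    """
    points = list(points)
    if not points:
        return []
    kept = [points[0]]
    for point in points[1:-1]:
        # Compare against the last kept value, not the previous sample,
        # so that slow drift is eventually recorded.
        if abs(point[1] - kept[-1][1]) > deadband:
            kept.append(point)
    if len(points) > 1:
        kept.append(points[-1])
    return kept


samples = [(0, 20.0), (1, 20.1), (2, 20.2), (3, 21.0), (4, 21.05), (5, 21.1)]
compressed = deadband_compress(samples, deadband=0.5)
```

With a deadband of 0.5, the six samples above reduce to three: the first point, the jump at timestamp 3, and the final point. Whether that loss of detail is acceptable depends entirely on the process being measured.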
Modernized for data-intensive purposes
Time-series databases have the benefit of hindsight, and their code has optimizations and architectural foundations that work better in this containerized, data-intensive world. They adapt better to diverse environments, from clusters to embedded systems, and their code is often high-quality. They likely have well-supported integrations with IT monitoring solutions, and the larger players have robust communities to extend and support your use case.
Time-series databases also exist amid intense competition. These days, a new time-series database appears almost as often as a new JavaScript framework. While this constant churn of novelty can be exhausting, it also means that their codebases are frequently developing and improving. And users have many options to choose a solution that fits their use case.
Industrial historians still have their place
Things that have worked a long time stay working. That being said, there’s no doubt that some cracks are starting to show in the way traditional historians fit in the modern landscape. If their advantages are in their 40 years of experience, their disadvantages are that they bring 40 years of baggage:
- They are closed source.
- Their licensing can be expensive.
- Their design limits support for horizontal scalability.
For Rhize users, time-series data forms a valuable part of the data hub, and our default recommendation for customers is to go with an open-source database. But we are not dogmatic, and the manufacturing data hub integrates with legacy systems. Don’t get us wrong: the industrial historian space is ripe for disruption. But for manufacturers to use more modern solutions correctly, they need a team that can combine DevOps with manufacturing know-how. And maybe one day, a special-purpose industrial time-series database will take root, and a robust community will develop from it.
Until then, industrial historians will be around.