21 Replies Latest reply on Oct 26, 2018 7:05 AM by Asle Frantzen

    Discussion: Will the data historian die?

    Asle Frantzen

      I just read this article and found it quite interesting: Will the Data Historian Die in a Wave of IIoT Disruption?


      A lot of traditional IT companies (Microsoft, IBM, etc.) are on the move into the cloud, and although we all know that a time series database is the best location for time series data, there's no doubt that they will be trying to position themselves in the market traditionally belonging to automation/data historians.


      Loads of new functionality is being released every day around IoT (and IIoT - the Industrial Internet of Things, as mentioned in the article), analytics and cloud. I think something like 35 new features have been added to Microsoft's Azure in just the first five months of the year. With sensors connecting directly to the internet, will people still go for the traditional "data historian"?


      What do you think? Is OSIsoft moving fast enough, and in the right direction, to stay up there during the coming explosion of devices, tools and cloud-based infrastructures?


      I particularly noticed the quote "Splunk in 7 years has more revenue than OSISoft in 35".

        • Re: Discussion: Will the data historian die?
          Rick Davin

          I certainly hope not, but that's just my own selfish viewpoint.  I see IIoT changing our business, but not killing it.

          • Re: Discussion: Will the data historian die?

            Thanks for the link, Asle; it was an interesting read.  The author raises some good points in the article.  How OSIsoft and other data historian companies respond to the emerging technologies will be something that everyone will be watching.

            • Re: Discussion: Will the data historian die?

              Very interesting topic, but as an OSIsoft employee I'll try to stay out of it!

              As an aside, have you guys looked into Qi? It is (or will be) a cloud-based sequential (not necessarily time-series) data historian, and will be more flexible and scalable for large numbers (IIoT-scale) of distributed data sources:

              UC 2016 Presentation

              Additionally, OMF and the PI Connector Relay will certainly help with the large increase in usage of IoT devices:

              UC 2016 Presentation

                • Re: Discussion: Will the data historian die?
                  Asle Frantzen

                  Thanks for the links, Michael.


                  I haven't looked at Qi yet; I think one of my colleagues tried it out. I wasn't at this year's UC, so I don't know about its current or future status, but upon introduction it was in an alpha version and not something we'd use in customer projects. The other link is interesting, but is this stuff released? I didn't watch the videos, but I couldn't see any mention in the PDF.

                    • Re: Discussion: Will the data historian die?

                      I believe that OMF and the PI Connector Relay are currently in a closed beta, which should give us a better understanding of which direction to take them before releasing them as an open beta and, eventually, a full release.

                      Currently, there really aren't many customers with large numbers of IIoT devices speaking protocols that OSIsoft doesn't already 'speak', but I definitely expect many new and current IoT-based protocols and embedded devices to start popping up, and that is when and where OMF and the PI Connector Relay will really come in handy.

                      Not to mention, we currently have, and will always have, the UFL Connector and the PI Web API for RESTful data flow to and from the PI System.
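                      As a rough illustration of that RESTful path (the endpoint shape, host name, and WebId below are placeholders and assumptions on my part; check the PI Web API documentation for your version before relying on them), assembling a single-value write might look something like this:

```python
import json

def build_write_request(server, web_id, timestamp, value):
    """Assemble (but deliberately don't send) a PI Web API-style
    write request.  The URL shape is illustrative only -- verify it
    against your PI Web API version's documentation."""
    url = "https://{0}/piwebapi/streams/{1}/value".format(server, web_id)
    body = json.dumps({"Timestamp": timestamp, "Value": value})
    headers = {"Content-Type": "application/json"}
    return url, headers, body

url, headers, body = build_write_request(
    "myserver.example.com",   # placeholder host
    "P0abcDEF",               # placeholder WebId
    "2016-06-01T00:00:00Z", 42.0)
```

                      A real client would then POST `body` to `url` with an authenticated session; I've stopped short of sending anything here.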

                  • Re: Discussion: Will the data historian die?
                    Roger Palmen

                    The key questions and my opinions are below. They fall somewhat in line with the article (I wrote my response before reading the article...).


                    1) Will Historians die?

                    TV did not kill radio, but we use radio for a smaller set of cases than before the rise of TV. For a lot of uses a historian can and will stay the best choice for many years to come, but the position where PI can be placed will become narrower, even as it deepens. PI used to be the only one handling high throughput, but it's not anymore. Why not have a historian in every car? In every PLC or DCS? You won't, or can't, hook up every device to the cloud 24*7, so you need ways to store data streams locally.
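                    The "historian in every device" idea above is basically local store-and-forward buffering. A toy sketch in Python (all names and the capacity are invented for illustration) of a device-local buffer that records while disconnected and flushes when a link is available:

```python
from collections import deque

class LocalBuffer:
    """Tiny store-and-forward buffer: keep samples locally,
    flush them upstream whenever a connection is available."""

    def __init__(self, capacity=10000):
        # Bounded deque: the oldest samples are dropped first if we
        # stay offline longer than the buffer can hold.
        self.samples = deque(maxlen=capacity)

    def record(self, timestamp, value):
        self.samples.append((timestamp, value))

    def flush(self, send):
        """Push buffered samples upstream; samples that fail to send
        stay buffered for the next attempt."""
        sent = 0
        while self.samples:
            sample = self.samples[0]
            if not send(sample):   # send() returns False when offline
                break
            self.samples.popleft()
            sent += 1
        return sent

buf = LocalBuffer(capacity=3)
for i in range(5):
    buf.record(i, i * 1.5)         # only the 3 newest survive
delivered = buf.flush(lambda s: True)
```

                    The same shape appears (far more robustly) in real buffering subsystems; the point is only that local storage plus deferred forwarding is what a per-device historian would do.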

                    We already see PI as the "buffer subsystem" for enterprise analytics! I expect that view to grow over time (PI as a Data Infrastructure; an integrator for BI).


                    2) Is OSIsoft moving fast enough?

                    Yes and no. Yes, because the industry has a slow pace, so you can't and won't keep up with the fast developments. No, because the fast pace of external parties is eating away at the market where PI could be used but does not offer a competitive advantage. So more or less the same answer to this one.

                    2 of 2 people found this helpful
                    • Re: Discussion: Will the data historian die?

                      IoT and IIoT are useful technologies; there is ample evidence of that; however, I do not believe they will displace data historians.

                      IoT and IIoT are, by nature, bandwidth intensive, and industry will have to upgrade its bandwidth to make full use of these technologies. The reality is that established businesses will work with a model of "if it's not broken, don't fix it." Business is also very aware of data breaches and leery of exposing data via the cloud - they don't want to expose their data to competitors or saboteurs.

                      For those who have been in the industry for a period of time: what has happened to the volumes of code written in COBOL or Fortran? It still exists... why? Because it works.

                      As long as OSIsoft continues to improve their products (and, on that note, a big shout-out to future data and the direction of Coresight), they will stay on the radar.

                      Are there items I'd like to see them address? Yes, absolutely! But I, just like they, have a backlog of items, and I have to prioritize them by end-user and economic value.

                      Is it a 'nice to have' or a 'will save time and money'? In the end, stability and fiscal advantage are linchpins of reasoning.

                      • Re: Discussion: Will the data historian die?

                        Yes, this is a good article. But I think there is a misconception that more data = more information.


                        "In an IIoT world, time-series heat exchanger and emissions data is just one type of data. It is being used in conjunction with structured transactional business system data and unstructured real world data. This is to deliver next-generation analytics and applications in a focused use case like energy and emissions optimization."


                        I would argue that currently less than 1% of historian data is actively used in analytics, and even that is optimistic. The reason is most likely that tools and skills haven't kept up with the growing demand. I read somewhere that data scientists are now asking for upwards of $200K in annual salary, which is a strong indicator that demand has outgrown supply.

                        Adding more data sources will only add complexity and widen the gap even further.


                        The other challenge I foresee for IIoT is that you lose context by making devices independent. This might work for maintenance models, but might not work for process models.

                        • Re: Discussion: Will the data historian die?

                          The world is a fast place.

                          • Re: Discussion: Will the data historian die?

                            I don't think that historians will disappear in the medium term in the traditional industries. One of the issues with traditional historians is the relatively high barrier to entry (upfront cost and "complex" setups) compared to some of the newer cloud-based offerings. This is a problem I've discussed with OSIsoft in relation to PI Cloud Connect. Most vendors are balking at the thought of setting up a PI system. Qi looks very interesting for this.

                            • Re: Discussion: Will the data historian die?

                              I would expect the data historian to be more integrated with other databases in the future. Five years ago, when NoSQL was the hype, a similar question could have been asked: "Will SQL die?", yet today, even large web companies like LinkedIn/Facebook still use Oracle/MySQL.


                              What's different however, is that instead of a single transactional data warehouse, modern web 2.0 companies realized that in order to scale in terms of number of users, data types, and query patterns (what is now marketed as the three, four (or five?) V's), the data needed to be duplicated and persisted in many different ways (session cache, social graph, relational store, indexed search, commit log, HDFS, document store, key-value store, time-series, etc.). In other words, what Martin Fowler calls polyglot persistence. I would expect modern enterprise architectures to follow suit, given that web 2.0 architectures have tended to show the future. PI Integrators should aim to position PI in this polyglot world.
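                              As a toy illustration of the polyglot persistence idea above, one incoming event can be fanned out to several specialized stores. In this sketch (all stores are plain in-memory stand-ins, and every name is invented), a key-value cache, a search index, and a time-series archive each get their own copy of the data:

```python
class PolyglotWriter:
    """Fan one event out to several specialized 'stores' -- here just
    dicts/lists standing in for a cache, a search index, and a
    time-series archive."""

    def __init__(self):
        self.kv_cache = {}       # latest value per stream (fast lookups)
        self.search_index = []   # flattened docs for tag/text search
        self.timeseries = {}     # full sample history per stream

    def write(self, stream, timestamp, value, tags):
        self.kv_cache[stream] = value
        self.search_index.append({"stream": stream, "tags": tags})
        self.timeseries.setdefault(stream, []).append((timestamp, value))

w = PolyglotWriter()
w.write("pump01.flow", 0, 12.5, ["pump", "flow"])
w.write("pump01.flow", 10, 13.1, ["pump", "flow"])
```

                              Each "store" answers a different query pattern cheaply, which is the whole argument for duplicating the data in the first place.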

                              • Re: Discussion: Will the data historian die?

                                Hi Asle,


                                I’m going to take a stab at responding to this as a technologist and not just as an OSIsoft employee.  Keep in mind then that this is just my view (vs. official company messaging), so it’s subject to disagreement from other more official quarters either in part or in full.  But hear my comments out and see if they make sense to you.  Be forewarned:  I'm notorious for being wordy, but here goes:


                                First off, we should distinguish between “Data Historian” as a technology and as an application.  In many discussions, including the article that you cite, “historian” is really narrowly defined as a technology used for storing, processing, and querying time series sensor data.  Certainly, the Big Data technology revolution has provided a lot of tools that allow someone to store and process vast amounts of data.  Whereas PI used to be the only game in town once you reached a certain scale with time series data, there are currently several technologies out there that can credibly claim to be able to store the amounts of data that PI stores and to query the stored data at rates on par with PI.  On certain dimensions (albeit with a few caveats), some can even claim to scale beyond PI.


                                But PI as an historian technology isn’t really what OSIsoft sells.  That’s partially why we’ve been shying away from saying that “PI” is simply an historian; instead, for a number of years now, we’ve been trying to position it as a “data infrastructure” to mark the difference.  Another way to think of that is that “PI” is a data historian application.  That is, historian as an application is really about managing the full life cycle of the data from when the data is generated at the data sources to when that data is consumed by the different users of an organization.


                                I was speaking with a colleague the other day and he came up with a pretty nifty analogy.  Think of historian as a technology as a typewriter at the dawn of the PC age vs. historian as an application, which is more akin to the words produced on the page that the typewriter outputs.  One is a building block for producing work product, whereas the other is the work product that is actually consumed by the reader.


                                So, what does historian as an application do that historian as a technology doesn’t?  To answer that, I suggest looking at the major stages of the life cycle of industrial data.  Off the top of my head, here are the stages that come to my mind, recognizing that others might have a slightly different view.


                                Stage 1 - Data Acquisition:  What are the sources of data and how does that data get into your historian as technology to begin with?  To mis- (mal-?) appropriate the Hadoop imagery, in order to be useful, *somebody* has to feed the elephant.  That is, whatever storage and processing technology you use might be the greatest thing since sliced bread.  But it all comes to nothing more than a science project unless and until data is fed into these tools.


                                Industrial data presents a special challenge to the problem of data acquisition for a number of reasons.  The ones that come to mind are non-uniformity of assets, non-uniformity of data reporting, the limited accessibility of high value assets, and the long life cycle of these assets.


                                On the topic of non-uniformity of assets, this is the problem that gives the most heartburn to users. To be sure, there are a number of industrial standards and protocols that try to homogenize how data is presented to consuming applications (think OPC, OPC-UA, Modbus, EthernetIP, BACNet, CAN, WITSML, C37.118, MQTT, etc., etc.), but they’re all imperfect in one way or another, and most (if not all) are broad enough that consuming applications still have to do a lot of heavy lifting, code-wise, to make use of the underlying data.  Take OPC-DA as a case in point.  Ultimately, it’s a specification, not an implementation. Vendors have been known to interpret parts of the standard differently.  Some have implemented the standard as they best understood it but added extensions unique to themselves to try to provide added value to their customers.  Additionally, simply by being the product of different hands, each vendor has bugs that are unique to itself.  To properly acquire data from any OPC-DA data source, you have to take this into account.  I’ve found it fairly unusual for an end customer to jettison millions, or tens of millions, of dollars in capital investment that’s been working perfectly fine for what it was bought for simply because their historian technology couldn’t get data off the data source properly.


                                But let’s assume for the sake of argument that you’ve got a perfect site where all the assets talk perfectly according to the OPC standard.  Even in that case, the data sources will still report data somewhat non-uniformly.  There may be consistency in how the data is presented (the data types that are supported, the messaging structure, that sort of thing).  But what about the data itself and how it comes in?  Most industrial standards are fairly lenient about payload content; indeed, the more generally a standard is supposed to be applied, the more latitude is given to data sources as to what they can encode into their payloads.  You can see this problem writ large in proposed protocols for general IoT communications (e.g. MQTT), which everything from water coolers to connected cars could use. But what data is sent from each data source, with what specific descriptors, and at what frequency?  A 2016 model car from one vendor might have a given set of sensors, but a 2017 from another vendor might have an overlapping or differently labeled set.  How is an automated program supposed to know how to interpret the data payloads from each car?  Standards and protocols like MQTT are silent on this topic.
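                                The labeling problem above (one vendor’s “temperature” is another’s “T”) is typically handled with per-source field mappings that someone who knows each device has to write and maintain by hand. A hedged sketch, with payload shapes, field names, and the mappings all invented for illustration:

```python
# Two hypothetical MQTT-style payloads for the same physical quantity,
# labeled and unit-ed differently by different (invented) vendors.
payload_vendor_a = {"ts": 1000, "temperature": 72.4, "unit": "F"}
payload_vendor_b = {"t": 1000, "T": 22.4, "u": "C"}

# Per-source mapping: this is exactly the knowledge the standard
# does not carry, so a human has to encode it per device type.
FIELD_MAPS = {
    "vendor_a": {"time": "ts", "temp": "temperature", "unit": "unit"},
    "vendor_b": {"time": "t",  "temp": "T",           "unit": "u"},
}

def normalize(source, payload):
    """Translate a vendor-specific payload into one canonical shape."""
    m = FIELD_MAPS[source]
    temp = payload[m["temp"]]
    if payload[m["unit"]] == "F":        # normalize units too
        temp = (temp - 32) * 5 / 9
    return {"time": payload[m["time"]], "temp_c": temp}

a = normalize("vendor_a", payload_vendor_a)
b = normalize("vendor_b", payload_vendor_b)
```

                                Multiply this by every asset model, firmware revision, and vendor extension on a site, and the scale of the data acquisition problem becomes clearer.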


                                Even if we add the simplification that all data comes in labeled uniformly (that is, temperature is always “temperature” vs. “temp” or “T”, and we don’t have multiple temperatures T1, T2…TN on any assets), we nevertheless need to understand that data still isn’t reported uniformly to the storage and processing engines of historian as a technology.  Most data streams are continuous, but some aren’t.  In the more typical case where they are, they often come in at different intervals.  Some data sources report data at very even, predictable rates (e.g. every 100ms, 1s, 5s, 1min, 1hr, etc.), but the frequencies between data sources might be very different.  I might get an environmental temperature reading every 10 minutes to an hour, but my voltage might be reported every second.  And some systems might only report when there’s a “significant” change in value, or some combination of factors like a significant change plus some specific set of other conditions being met.  But just because I didn’t have a temperature reading for 15 minutes doesn’t mean there wasn’t a temperature at that time.
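                                “Report only on significant change” is usually an exception deadband on the source side. A minimal sketch of what such a filter might look like (the threshold and data are invented); the point is that gaps in the stored stream do not mean the value didn’t exist in between:

```python
def exception_filter(samples, deadband):
    """Keep a sample only if it differs from the last *reported*
    value by more than the deadband.  This thins the stream at the
    source -- so the archive records changes, not every instant."""
    reported = []
    last = None
    for ts, value in samples:
        if last is None or abs(value - last) > deadband:
            reported.append((ts, value))
            last = value
    return reported

raw = [(0, 20.0), (1, 20.1), (2, 20.2), (3, 21.5), (4, 21.6)]
kept = exception_filter(raw, deadband=0.5)
# Only the first sample and the jump at t=3 survive.
```

                                A consumer who doesn’t know this filter was applied will misread the gaps; one who does knows the value stayed within the deadband between reports.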


                                If I need situational awareness at any given time after the fact (e.g. forensic analysis of an accident), I need to be able to understand how to interpret the raw data itself.  Historian as a technology lets you store anything at scale.  But unless your use of that data is limited to knowing what was sampled from time A to time B for these streams, you still need Historian as an Application to understand how to interpret the data.  As I mentioned in the previous paragraph, data comes in differently by data source, even if everything happens to follow a governing standard.  How it comes in is often defined by a combination of the design choices and configurations made by the individual equipment manufacturers, the system installers, and the floor operators. The way the data is reported is typically configurable or programmed in (ladder logic, anyone?) by those parties because... who doesn’t want to be flexible?  But the problem with this is that the users trying to make use of the data are often very disconnected from the people who configured how the data comes in for their process.  The raw data tells you that the values were 1, 2, 3, 4 at times T1, T2, T3, and T4. Great.  What was the value at T1+15 seconds?  Well, that depends on the nature of that data stream.  Is it from a continuous data source?  If so, was the stream representing an analog signal, a digital one, or something more exotic?  (Real use case: there was one system that spat out 777 as a value to denote an error, and you just had to know that 777 was an error and not, say, the actual temperature or pressure of what was measured.)  Here, historian as a technology can’t do much for you, but historian as an application is crucial to making heads or tails of the underlying data.
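                                The “777 means error” story is a good example of why interpretation needs per-stream metadata. A hedged sketch of a `value_at` helper that only works once someone has encoded that knowledge (the metadata fields and data are invented for illustration):

```python
def value_at(samples, t, meta):
    """Interpret raw (time, value) samples at time t using per-stream
    metadata: sentinel error codes, and stepped vs. linear streams.
    Without `meta`, the raw archive alone cannot answer this question."""
    # Strip sentinel values that encode errors, not measurements.
    clean = [(ts, v) for ts, v in samples if v not in meta["error_codes"]]
    before = [(ts, v) for ts, v in clean if ts <= t]
    after = [(ts, v) for ts, v in clean if ts >= t]
    if not before or not after:
        return None                    # no data bracketing t
    (t0, v0), (t1, v1) = before[-1], after[0]
    if t0 == t1 or meta["stepped"]:
        return v0                      # digital/state stream: hold last value
    # Analog stream: linear interpolation between bracketing samples.
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)

samples = [(0, 1.0), (10, 2.0), (20, 777), (30, 4.0)]
meta = {"stepped": False, "error_codes": {777}}
v = value_at(samples, 15, meta)
```

                                The same raw samples give a different answer if the stream is flagged as stepped, and a nonsense answer if nobody recorded that 777 is a sentinel; that gap between raw storage and meaning is the application’s job.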


                                On the topic of accessibility of high value assets, this is a big barrier to building a historian as an application for the industrial space on whatever technology.  In industry, the assets of most interest tend to be really expensive and difficult to procure.  No one is going to eBay, Amazon, or Best Buy and running a credit card to get hold of a jet engine, turbine, oil rig, or transformer.  Even PLCs can be pretty big dollar purchases.  But without access to the underlying equipment, writing proper data acquisition is a real challenge.  With luck, there might be a simulator for that equipment, but even then, simulators have been known not only to be pricey but also to lack fidelity when simulating the assets of different vendors.  So, how does one provide confidence to an end customer that a data collector for their equipment has gone through enough QA? Real use case:  at one of the Digital Bond security conferences that I attended a few years ago, the hosts acquired real equipment from different vendors and pen-tested it; the result was that asking the “wrong” question or giving the “wrong” response sometimes caused catastrophic failures in the system.


                                Regarding the longevity of assets, the problem here is really that data acquisition from industrial equipment is not—and will probably never be—sexy.  High value assets aren’t like consumer devices and trendy things like iPhones. The assets whose data are of most interest tend to be big capital purchases.  Things that require a CFO to sign off typically can’t be replaced in a year or two.  In our space, we are talking about having to support assets not for months or years, but DECADES.  I saw a slide the other day where one of our customers’ inventories of transformers showed that the average age (not life expectancy!) was over 40 years! Supporting this poses some unique challenges that historian as technology doesn’t address (and isn’t looking to address).


                                Here, few companies are well equipped to support this reality, and the open source community perhaps least of all. Things like transformers and industrial motors and sewage pumps, etc., are items of niche interest even when they’re new. What is the interest in supporting that brand new 2016 whatever when it’s 2056?  And yet, the data acquisition needs to be supported for the life of the equipment.  That especially means keeping up with (and adapting to, as needed!) upgrades and patches (especially security patches).  This is not an area that businesses in the Big Data space are eager to jump into.  Open source is particularly problematic because if you survey public source repositories, you’ll see not only hot and well-supported projects like Linux, Hadoop, Spark, Cassandra, etc., but also a wasteland of abandoned projects.  I hunted around the internet to try to get some statistics on this, and the most recent that I found was a 2012 report from what is now OpenHub.net, which showed that 90% of projects didn’t have a line of code committed in the preceding 12 months, and 80% didn’t have a commit within the previous 2 years!


                                This is where the unsexiness of data acquisition from industrial assets poses a serious challenge to doing away with established Historian as an Application vendors in favor of do-it-yourself solutions built on a mix of open-source Big Data and Cloud. The cautionary tale of OpenSSL and the Heartbleed bug from a few years ago comes to mind.  Here is OpenSSL, by no means a fringe project, but one that underpins the security of a good chunk of the Internet.  Yet, as important as it was, it was esoteric and didn’t attract a whole lot of contributors.  When the offending bug got introduced, the project was being maintained largely by two guys named Steve, who’d never met one another.


                                I don’t think this can be allowed with software that operates in the industrial space, particularly in critical infrastructure.  In this age of Stuxnet and ransomware, we have to be mindful of very professional Advanced Persistent Threat actors (either state actors or organized crime players with access to lots of resources).  Process networks have been used to operating under the protection of an air gap between them and the nasty Internet.  But with IoT, that reliance is becoming ever more unfounded, if it ever really was founded.  In general, a network is as secure as its weakest link.  Data collectors whose development is effectively abandoned are therefore serious liabilities to industrial facilities, where not just money is at stake but health and safety.


                                The other implication of the longevity of critical assets is that it makes compatibility with future technologies difficult to manage.  It’s one thing to build a data collector for the Kafka queue of 2016, but what about the Kafka queue of 2020, much less 2056?  What if you want to move from Kafka to Microsoft IoT Hub or Amazon Kinesis, or even just need to materially change the properties of the data ingest point (partition sizes, connection strings, end point identifiers, etc.)?  Because industrial assets are not like iPhones, one often can’t simply push out updates at will to the hardware level, especially if you’re not the asset manufacturer.  Here Historian as an Application, whatever building blocks it’s developed on top of, needs to be in place so the users can get to the data.
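                                One hedged sketch of how a collector might insulate itself from the ingest technology of the day: code against a minimal endpoint interface (invented here purely for illustration) and keep the Kafka/Kinesis/IoT Hub specifics behind a swappable implementation, so a queue migration doesn’t mean a field upgrade:

```python
class IngestEndpoint:
    """Minimal interface a collector codes against, so the concrete
    queue (Kafka today, something else in 2056) can be swapped out
    without touching fielded collectors."""
    def publish(self, stream, payload):
        raise NotImplementedError

class InMemoryEndpoint(IngestEndpoint):
    """Stand-in implementation; a real one would wrap a queue client
    and own the connection strings, partitions, retries, etc."""
    def __init__(self):
        self.messages = []
    def publish(self, stream, payload):
        self.messages.append((stream, payload))

def collector(endpoint, readings):
    # The collector never names Kafka/Kinesis/IoT Hub directly.
    for stream, value in readings:
        endpoint.publish(stream, {"value": value})

ep = InMemoryEndpoint()
collector(ep, [("motor01.rpm", 1800), ("motor01.temp", 65.2)])
```

                                The abstraction doesn’t solve the decades-long support problem by itself, but it localizes the churn to one replaceable component.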


                                Stage 2 - Data Ingest, Storage, and Real-Time Processing:

                                But enough about data acquisition.  Let’s move on to the functions that people more classically associate with the term “historian”.  This is the bulk of what IoT technologies really focus on, and probably where they most shine.  The typical IoT story that I encounter begins at ingest.  Typically, there is some sort of ingest point (Kafka, Kinesis, IoT Hub, etc.) that is highly scalable and provides various degrees of resiliency and availability guarantees.  This ingest point acts as a broker to other components that need access to the data. There’s a piece (or pieces) that stores the data for the long term and serves it up for either real-time viewing and/or batch operations.  And there’s a piece that takes input as it comes in and does complex event processing/real-time analyses on that input.  Basically, this is what is called the Lambda Architecture.
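                                The Lambda Architecture being described can be sketched minimally as one ingest point fanning each message out to a batch (long-term) layer and a speed (real-time) layer. Everything below is an in-memory stand-in, not any particular product:

```python
class ToyLambda:
    """Sketch of the Lambda shape: one ingest point fans each message
    out to a batch store and an incrementally maintained speed layer."""

    def __init__(self):
        self.batch_store = []    # stands in for long-term storage
        self.running_max = {}    # stands in for a real-time view

    def ingest(self, stream, ts, value):
        # Batch layer: append-only archive of everything, for later
        # reprocessing and batch queries.
        self.batch_store.append((stream, ts, value))
        # Speed layer: a view updated per message, instantly queryable.
        cur = self.running_max.get(stream)
        if cur is None or value > cur:
            self.running_max[stream] = value

lam = ToyLambda()
for ts, v in enumerate([3.0, 7.5, 5.2]):
    lam.ingest("boiler.temp", ts, v)
```

                                The real systems add partitioning, replication, and replay; the shape, though, is just this fan-out from a single ingest point.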


                                What’s missing in the discussion of these components is that they are all just building blocks that a Historian as an Application might want to use; they aren’t Historians in and of themselves.  There’s still a lot of work that needs to be done to stitch these components together and to have them properly handle the data in such a way that users can make sense of it.  The ingest queue can ingest and route messages of arbitrary type.  But what does the payload look like in these messages?  What is the complex event processing piece (e.g. Spark) supposed to do with that message? I can use something like Spark to set up an analytic on one or more streams, but which parts of the message am I supposed to look at within a given stream, much less across a diverse set of streams?  What parts of the message are relevant to any given analysis, and what determines that? Something like Spark is an engine, but someone needs to configure that engine and write the code that tells it what to do with the messages that come into it.


                                Similarly, when storing the data and serving it up, something needs to know which parts of the message need to be encoded and persisted, and how questions from users translate into ways of interpreting the data for consumption.  If I want to know what data samples were taken from midnight to noon today for this data set, that is easy and can be done more or less out of the box.  And many of the IoT/Big Data technologies provide relatively good, and in many cases great, performance even if I’m asking for the data sampled over the last 1, 5, 10, or 20 years.


                                Stage 3:  Data Consumption

                                However, if my data consumption needs go beyond just getting what values were sampled within a given period, then I need to look to an application above these IoT/Big Data technologies.  For example, if I need to know what the situation was at a specific time during, say, a catastrophic accident, I will likely need to know how the points in between the data samples are interpolated, because it’s highly unlikely for industrial data to all be evenly spaced across all streams across all the relevant equipment. What was the value of any given stream in the gaps between samples?  At first blush, this seems like a problem of simple interpolation.  But that assumes that different data streams can or should be interpolated the same way.  This is not really the case.  I may get a series of times and values like {(T1, 1), (T2, 2), (T3, 3), (T4, 777)}. But what do those values mean within a single stream?  Are they temperatures, pressures, voltages, running states, GPS coordinates, portions of GPS coordinates, error codes?  Are the data values continuous or discontinuous?  Are they discrete numbers that are stepped between samples, or do they represent samples from a continuous signal?  Are they discontinuous error codes, or a combination of the above? For example, I’ve encountered systems that output values from an analog signal except when they encounter certain errors—in which case, they output a value seemingly out of left field (e.g. something like 777, not kidding!), which makes no sense unless the user reading the data understands the peculiar semantics of those systems.  But that’s the world we live in.


                                Now, the interesting thing about that reality is that how to interpret the raw data is not something that the person who uses the data typically understands.  He or she is simply too far removed from the data sources and the intimate knowledge of how those data sources work and report data. This seems to be particularly true of data scientists looking to do heavy analytics.  My experience is that we can reliably expect data scientists to have strong math and statistics skills.  They probably know the Big Data/Analytics/Machine Learning tools they plan on using to analyze the data.  They may even know the process that they’re interested in analyzing. However, it’s extremely unusual for them to know the nuances of how Caterpillar vehicles report data (versus a specific Foxboro skid installed in 1994, versus a Rockwell PLC installed in 2000, versus a Siemens power meter installed in 2016, etc.).  This information is known by the Equipment Manufacturers, System Installers, and Operators.


                                Historian as an Application is needed, regardless of the underlying technologies on which it’s built, if we want our data scientists spending their time as data scientists, analyzing data.  Without it to encode the understanding that the manufacturers, system integrators/installers, and operators have, the data scientists are left to rediscover the nature of the data in order to properly interpret it for analyses.  Worse, they will have to encode that information into their analyses in the form of code to handle the incoming data.  And even worse than that, this process may have to be repeated again and again over the course of decades.  This is because the analytics tools will likely evolve over time, the Big Data technologies will advance or get replaced, and additional types of analyses may be added, and yet the high value equipment can be expected to operate year after year, decade after decade.



                                All the application functions that I describe in the sections above can, of course, be implemented on top of the emerging IoT/Big Data technologies and set to run on the Cloud. As a software guy, to me it’s just a question of putting the right 1’s and 0’s together.  But does it make sense to build vs. buy if writing this kind of software is not your business’s core competency, and you don’t want to deal with issues like maintenance and providing continuity on the data collection side throughout the entire life cycle of your assets?  For some businesses, maybe it’s worthwhile to build their own, but I don’t believe this is true of most industries.  Thus, I believe that, whatever the technological foundations, the Data Historian as an Application will need to exist in one form or another.


                                As for the future of “PI” specifically, I don’t see the rise of these new technologies as competitive. Rather, these technologies provide additional tools for the developer toolkit, and OSIsoft will continue to evaluate them as they come along.  I expect that we’ll absorb those things that our leadership thinks make sense for the product and take a pass on things that don’t.  From where I sit, I can’t predict with 100% confidence what the PI product will look like in 5 or 10 years and what modifications will be made to the design, or even if it’ll keep the PI name.  But I think I can say with a high degree of confidence that it will exist in one form or another for a long time to come.  As I hope I’ve demonstrated in this admittedly wordy essay, the need is there.



                                Hoa Tram

                                Partner Solutions Architect

                                OSIsoft, LLC

                                • Re: Discussion: Will the data historian die?
                                  Bryan Owen

                                  A view recently published in Plant Engineering positions historians as a first step towards a fully connected enterprise.

                                  Process historians can be an integral part of the IIoT | Plant Engineering http://www.plantengineering.com/single-article/process-historians-can-be-an-integral-part-of-the-iiot/67bdba3b0e9c2882284b72f1f252b28b.html?OCVALIDATE=


                                  The view of IIoT expressed by Emerson pro Mike Boudreaux in Plant Services suggests monitoring without control could develop as a separate ecosystem.

                                  What a smart device ecosystem looks like


                                  Both approaches will thrive so long as value is derived from the use cases. Since automation is necessary for modern plants, it seems the integrated approach will continue for years to come!

                                  • Re: Discussion: Will the data historian die?

                                    Another interesting article, Internet of Things: Five truths you need to know to succeed - TechRepublic, from a Teradata presentation at the Strata + Hadoop conference, explains some challenges that are well known to PI System users and that the PI System actually alleviates.


                                    Another way to look at this is through total cost of ownership. Many of the IoT solutions, especially the ones based on Hadoop, can be very attractive when the devices monitored are all the same (e.g., monitoring traffic-light status for 1k+ devices in a city). But as soon as you have a very diverse set of assets (think of a refinery), configuring the streaming individually for each asset may take weeks in some of the newer solutions, while it can be done very simply with the PI System.  So there are, and will be for some time, use cases where Data Historian applications/infrastructure will still be the best technical and economical choice.
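                                    To make the configuration-cost point concrete, here is a purely illustrative Python sketch (not the PI AF API; the class names and attribute lists are invented) of why a template-based approach scales where per-asset wiring does not: one definition is written once, and instantiating a thousand assets becomes a loop over a naming convention.

```python
# Illustrative sketch only: template-based asset configuration.
# Names here (AssetTemplate, instantiate) are invented for this example
# and are not part of any actual PI System API.

from dataclasses import dataclass


@dataclass
class AssetTemplate:
    """One definition shared by every asset of a given class."""
    name: str
    attributes: list  # attribute names every instance exposes


def instantiate(template: AssetTemplate, asset_id: str) -> dict:
    """Create a concrete asset config from the template; per-asset
    effort is reduced to following a naming convention."""
    return {
        "asset": asset_id,
        "streams": {attr: f"{asset_id}.{attr}" for attr in template.attributes},
    }


pump = AssetTemplate("Pump", ["Flow", "Pressure", "Status"])

# Configuring 1,000 pumps is one loop, not 1,000 hand-written mappings.
fleet = [instantiate(pump, f"PUMP-{i:04d}") for i in range(1000)]
print(len(fleet), fleet[0]["streams"]["Flow"])
```

                                    With a fully heterogeneous asset base, by contrast, each asset needs its own bespoke mapping, which is where the weeks of configuration work come from.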

                                      • Re: Discussion: Will the data historian die?

                                        All of that is true with today's technology.  But to come back around to the original question, it may not be with tomorrow's.  Today's devices were never designed to have any storage, aggregation, or analytic capability.  The only way to get what you wanted easily was through the monolithic historian that captured everything.  But this creates silos and is ultimately not useful outside of those silos.  Today we struggle to connect these silos, one at a time.


                                        I see in the future sensors and devices (and absolutely control systems) that have some of these capabilities, and aggregation devices that can pull large amounts of data from a shifting number of sources.   Memory and processors are cheap - almost disposable.   Security won't be handled at the historian level, but at various data gates.  Edge computing will be how things are done, just like the PC brought computing out of the mainframes.


                                        The first wave of the internet distributed the users.  The next will distribute the real power: data and analytics.

                                          • Re: Discussion: Will the data historian die?
                                            Bryan Owen

                                            Future process plants will need all of the above innovations, and probably more, when you consider what's really involved from a plant instrumentation perspective.  A counter to the mainstream hype is a story about the IoT challenge told from an ex-Microsoft, IoT-centric perspective.

                                            How my views on "The IoT challenge" shifted from Microsoft to Honeywell | Kajal Deepak | LinkedIn


                                            "The real challenge in IoT is identifying the business use cases that justify the instrumenting of the data capture in the first place by delivering clear value from get go, and coupling that with the benefits of analytics and prediction, for when they arrive over a longer term. Without that, the adoption can’t happen, because the technologists of IoT are working in a different plane vs where the customer lives."


                                            • Re: Discussion: Will the data historian die?

                                              I have a slightly different take.  Although I can see the benefits, in principle, of having the Edge be more intelligent and able to talk directly to other Edge devices, I really question the practicality of it in the foreseeable future.  The problem isn't the technology per se or the computing power.  It's structural.  That is, in order for devices to talk to one another, they first need to understand one another.  This can happen in one of two ways:


                                              1.  There can be some sort of universal standard that covers all devices, and every device (at least within a given system) agrees to adhere to it so they all speak one language.  They need conventions in place so that they understand the semantics of each device's communications, not just the overall structure.  For instance, devices need to agree that "status" and the values status can take mean the same thing across the board; the same descriptor can't mean one thing here and another thing there, and the values can't be subject to interpretation based on very specific knowledge of the device.  Otherwise,


                                              2.  Each individual device will need to understand the communication mechanism and semantics of all the other devices' messages in order to make heads or tails of what's being said.  The more heterogeneous or fragmented the ecosystem, the more complex the code that has to run on each device and be maintained across generations as devices evolve over time.


                                              In the industrial space, where equipment can have shelf lives running into the decades while device generations may be produced every few years and standards come and go, either scenario is a lot to ask device manufacturers to put on their Edge devices.
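                                              A minimal Python sketch of what scenario 1 demands (the field names, enum values, and schema below are invented for illustration, not taken from any real standard): every message must carry agreed fields, and "status" may only take values from a shared vocabulary, so no receiver ever needs device-specific interpretation code.

```python
# Hypothetical sketch of scenario 1: a shared schema with agreed semantics.
# The REQUIRED_FIELDS set and Status values are invented for this example.

import json
from enum import Enum


class Status(Enum):
    OK = "OK"
    DEGRADED = "DEGRADED"
    FAULT = "FAULT"


REQUIRED_FIELDS = {"device_id", "status", "timestamp"}


def validate(raw: str) -> dict:
    """Reject any message that doesn't follow the shared convention,
    so receivers never need per-device interpretation logic."""
    msg = json.loads(raw)
    missing = REQUIRED_FIELDS - msg.keys()
    if missing:
        raise ValueError(f"missing fields: {missing}")
    msg["status"] = Status(msg["status"])  # raises on unknown status values
    return msg


msg = validate('{"device_id": "T-100", "status": "FAULT", "timestamp": 1718000000}')
print(msg["status"])
```

                                              Even this toy version hints at the structural cost: every manufacturer must agree on, implement, and maintain the convention across device generations, which is exactly what makes scenario 1 hard in practice.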


                                              Finally, is it even desirable to have devices from different manufacturers talking directly with one another in the first place, without a mediating component?  This requires that equipment manufacturers all take cybersecurity and integration testing much more seriously than they do now.  Ideally they will, but as we're seeing right now with the IoT gold rush, the first priority seems to be to get something working, not necessarily something written with security and robustness in mind.  The pressure to go to market as quickly as possible, and the relative lack of education on writing secure software given to developers entering the field, doesn't give me a lot of confidence to let devices talk to one another in a free-for-all and hope that all the players do the "right" thing.


                                              My $0.02...


                                              Hoa Tram

                                              Partner Solutions Architect

                                              OSIsoft, LLC

                                          • Re: Discussion: Will the data historian die?

                                            Has anyone produced a white paper on the topic of PI vs. Splunk?  Or maybe "How PI and Splunk coexist for different reasons"?  I have customers who are challenged by SysAdmins asking why they need PI when they have Splunk.