This post is aimed at any of the following users:
- Those struggling to visualize how data is stored in PI
- Those struggling to shape the table returned by the SQL query to be used with the PI Interface For RDBMS
- Those wanting to deepen their understanding of data storage representations
This post will compare and contrast "normal" intuitive tables versus the PI Asset Framework and PI Points in the PI Data Archive, and it will walk through the conversion between the 2 formats.
"Normal" tables versus PI's internal "table"
Even if you have never touched a database before, you probably understand how tables, such as those below, work. They are how most of us would organize data.
|Equipment Type||Equipment Number||Model||Timestamp||Flow Rate (m³/s)||Run Mode|
|Year||Exchange Rate (CAD/USD) (from Annual Exchange Rates - Bank of Canada)|
However, PI (specifically, the PI Data Archive) stores all of its time-series data in a single "table" that looks like this:
"Tag Name" (or just "tag") is the name of the PI Point. Each PI Point stores the value of a single type of data/signal over time. An event is a timestamp-value combination—a measurement in time. While we usually talk about PI as storing data in a bunch of PI Points, the single-table representation will be more convenient for this post.
How do we get our tables of data to fit into the PI table? None of the columns even match! It can always be done. It just won't be pretty.
Converting any table to a PI table
The PI table is very basic, so to understand it, we need to revisit the basics. In our tables of data, what exactly is our data? For example, is "1.2986" data? "1.2986" is a value, but without any context, it is meaningless. "1.2986" is the exchange rate (CAD/USD) for 2017.
If any piece of data (datum) is just a value with context, then any piece of data can be put into the table below. The 1st row is filled with our "1.2986" example.
|2017 exchange rate (CAD/USD)||1.2986|
To convert this to a PI table, we just split the Context into its time and non-time components:
|Non-time portion of Context||Time portion of Context||Value|
Easy! However, before we populate the table, we must make sure that we understand each of these columns and how the table is used.
PI table in detail
In PI, the value will be the measurement itself; the thing that can vary over time; the thing that we want to record and use. It is not necessarily numeric. For example, "Full-Power" and "Power-Saving" are values in the 1st table.
A timestamp is an instant in time. It can represent a time range., but it cannot be a time range. For example, the year 2017 is not a timestamp; it is the time range from 2017-01-01 00:00:00 (inclusive) to 2018-01-01 00:00:00 (exclusive). Usually, we represent a time range as its beginning, so 2017 would be represented as 2017-01-01 00:00:00 in PI. The end of the time range should be either documented or obvious. For example, we can say that the time range ends at the timestamp of the next event, or that the time range ends 1 year after the timestamp.
Why an instant in time? Why the start of a time range? PI is intended to store mostly measurements over time, or more specifically, samples of signals that are continuous over time. These signals have values at every point in time, and PI just records the values at some points in time and interpolates or extrapolates the rest.
If the user requests the value of a PI Point after the latest stored event, PI will assume that the value has not changed since that event. In the graph below, our latest stored event was 2 days ago.
(To clarify, by the "latest stored event", I mean the "stored historical event that has the latest timestamp" and not the "event that was most recently stored". The order in which events are sent to and saved in PI does not necessarily match the order of the timestamps of those events.)
If the "Step" attribute of the PI Point is off, PI will linearly interpolate a value based on the timestamp and value of the 2 stored events that the requested timestamp sits between. This is appropriate for signals with continuous values.
If the "Step" attribute of the PI Point is on, PI assumes that the value at any point in time without a stored event is that of the previously stored event. This is appropriate for signals with discrete, nominal, or ordinal values or values that apply to time ranges (specifically, values that apply from its timestamp up to and excluding the timestamp of the next stored event). The graphs above and below use the same stored events, but the former uses Step = off and the latter uses Step = on.
With this in mind, we would store the data in the examples into PI like so:
|Pump 1 Flow Rate (m³/s)||2020-12-31 00:00:00||5|
|Pump 1 Flow Rate (m³/s)||2020-12-31 01:00:00||6|
|Pump 2 Flow Rate (m³/s)||2020-12-31 00:00:00||2|
|Pump 2 Flow Rate (m³/s)||2020-12-31 01:00:00||4|
|Fan 1 Flow Rate (m³/s)||2020-12-31 00:00:00||9|
|Fan 1 Flow Rate (m³/s)||2020-12-31 01:00:00||6|
|Fan 2 Flow Rate (m³/s)||2020-12-31 00:00:00||7|
|Fan 2 Flow Rate (m³/s)||2020-12-31 01:00:00||8|
|Pump 1 Run Mode||2020-12-31 00:00:00||Full-Power|
|Pump 1 Run Mode||2020-12-31 01:00:00||Full-Power|
|Pump 2 Run Mode||2020-12-31 00:00:00||Power-Saving|
|Pump 2 Run Mode||2020-12-31 01:00:00||Full-Power|
|Fan 1 Run Mode||2020-12-31 00:00:00||Full-Power|
|Fan 1 Run Mode||2020-12-31 01:00:00||Power-Saving|
|Fan 2 Run Mode||2020-12-31 00:00:00||Power-Saving|
|Fan 2 Run Mode||2020-12-31 01:00:00||Power-Saving|
|Exchange Rate (CAD/USD)||2017-01-01 00:00:00||1.2986|
|Exchange Rate (CAD/USD)||2018-01-01 00:00:00||1.2957|
|Exchange Rate (CAD/USD)||2019-01-01 00:00:00||1.3269|
Of course, PI Points have Description and Eng. Units attributes that can be used, instead of the tag name, to store some of the details. The tag name does not necessarily need to mention all of the non-time context; it just needs to represent it. Also, notice that a value gets its context not only from other information in the same row in the original table, but also from the column name.
The Flow Rate PI Points would have Step = off because they have a continuous range of possible values.
The Run Mode PI Points would have Step = on because their values are discrete and nominal.
The Exchange Rate PI Point would have Step = on because its values are constant across the time ranges whose beginnings are stored as the timestamps of the events.
We can now save the time-series data into PI! Now to deal with the model number…
Time-independent (constant) data in PI
We excluded the model number from the tag name because it describes the equipment and not the value, and it does not change over time. The equipment is already part of the value's context, and so mentioning the model number would be redundant. We do not want to put it in the Description of the PI Points, since we would be repeating ourselves in cases where a piece of equipment has multiple PI Points (in our case, all of them). So where can we put it?
OSIsoft realized this limitation of the PI table a long time ago and developed the PI Asset Framework (PI AF), a database that allows you to create a hierarchy of elements (equipment, buildings, business entities, etc.) to group together related PI Points. Elements can also have a Description, so we can put it there. We could have a hierarchy that looks like this:
|Pumps||Used to group pumps together|
|Pumps||Pump 1||Model: AX|
|Pumps||Pump 2||Model: BX|
|Fans||Used to group fans together|
|Fans||Fan 1||Model: CX|
|Fans||Fan 2||Model: CX|
If our original table contained more information about each equipment, such as the manufacturer or the vendor, it would become difficult to extract any 1 piece of information from the description if we are dealing with equipment in bulk, and it would be difficult enforce a consistent way to write the information (e.g. "Model: AX, Vendor: ABC" versus "vendor - ABC; model AX").
Fortunately, we can use AF Tables, which allow you to create as many tables as you want with whatever column names you want. We could create an AF Table like the one below and populate it, and then our job is done.
|Equipment Type||Equipment Number||Model||Manufacturer||Vendor|
It comes full circle…
You may be thinking: if PI supports creating your own tables, why not just create AF Tables in the format of our original tables? The answer is performance. AF Tables are not optimized for receiving or returning lots of data over time. Similarly, a relational (a "normal", "table-based" database) could be used instead of the PI Data Archive for saving time-series data if it could keep up, but it can't. In fact, the PI Asset Framework uses a relational database (SQL Server) behind the scenes.
It is a matter of trade-offs. Relational databases are good at organizing data but not at handling time-series data, and the PI Data Archive is the other way around. This is why you need to use both the PI Data Archive and the PI Asset Framework to get the best of both worlds.
Bonus: constants without the PI Asset Framework
If your organization does not use the PI Asset Framework and you need to store constants in PI that are not appropriate to be stored as part of the Description or Eng. Units of a PI Point (e.g. unit conversion factors), what can you do? Recall the PI table from earlier:
|Non-time portion of Context||Time portion of Context||Value|
You could say that constants do not have a time portion of their context. Alternatively, you could say that constants have the same value all the time. Also, recall that PI assumes that the value of a PI Point after the latest stored event is the value of that event. This means that we can save each constant in its own dedicated PI Point as a single event, where the timestamp is the earliest time that PI supports: 1970-01-01 00:00:00 (UTC). Extrapolation cause that value to be the value of the PI Point at any time after.