
This post gives an overview of an internship project designed to showcase the value of the PI System in a Smart City context. The project was undertaken at the Montreal OSIsoft office by two interns, Georges Khairallah and Zachary Zoldan, both of whom are studying Mechanical Engineering at Concordia University in Montreal.

As a proof of concept for the PI System as a tool for the management of smart cities, we chose to bring in data from public bike sharing systems. We then set about collecting data from other sources that may affect bike share usage – data such as weather, festival and events, public transit disruptions and traffic data. In undertaking this project, we hope to better understand the loading patterns of bike sharing systems, analyze how they react to changes in their environment, and be able to predict future loading.




What is a Smart City?


A smart city implements technology to continuously improve its citizens’ daily lives. By analyzing data in real time, from the movement of people in traffic to noise and pollution levels, a Smart City is able to understand its environment and quickly adjust and act accordingly to ensure it maximizes the usage of its resources. A smart city’s assets can all be monitored in real time, something which is becoming easier with the emergence of IoT technologies.


We chose to monitor bike sharing systems due to the abundance of live and historical open data, and the numerous analyses which can be run on the system. Bike sharing systems represent one aspect of a smart city, and by demonstrating what can be done with just this data, we can show the potential value of the PI System in a Smart City context.




What is a bike sharing system?


A bike sharing system is a service which allows users to rent bikes on a short term basis. It consists of a network of stations which are distributed around a city. Each station consists of a pay station, bike docks, and bikes that the user can take out, and drop off at any other station.  Bikes are designed to be durable, resistant to inclement weather, and to fit the majority of riders.

Users can be classified as “members” or “casual users”. Members are those who pay an annual, semi-annual, or monthly fee for unlimited access to the bike sharing network. Members access the network with a personal key fob, allowing the bike sharing system to keep a record of each member’s trips. Casual users pay for a one-way, 24-hour or 72-hour pass using their credit card at the pay station.




Every bike is equipped with a unique identifier in the form of an RFID tag, which keeps track of the station it was taken out from and returned to. Coupled with a member’s key fob identification, or a 4-digit code generated by the pay station for a casual user, we can track each trip and determine whether it was taken by a member or a casual user.



What datasets were available to us?


We had access to live open data from numerous bike sharing systems, from cities such as Montreal, New York City, Philadelphia, Boston and Toronto. Each bike sharing system posts data in the form of a JSON document, updated every 10-15 minutes, containing data pertaining to individual bike stations. Each document reports the number of docks available at a station and the number of bikes available at a station, as well as station metadata such as station name, longitude/latitude positioning and station ID.

Example Live Data.png
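To make the shape of this live data concrete, here is a minimal Python sketch of reading one such JSON document. The sample feed and every field name in it (`stations`, `bikes_available`, `docks_available` and so on) are assumptions made up for illustration; each real bike sharing system uses its own field names.

```python
import json

# Hypothetical sample document; real feeds use their own field names.
SAMPLE_FEED = """
{"stations": [
  {"id": 1, "name": "Station A", "lat": 45.50, "lon": -73.57,
   "bikes_available": 4, "docks_available": 11},
  {"id": 2, "name": "Station B", "lat": 45.52, "lon": -73.58,
   "bikes_available": 0, "docks_available": 19}
]}
"""

def parse_stations(text):
    """Return one flat record per station from a JSON feed document."""
    records = []
    for s in json.loads(text)["stations"]:
        records.append({
            "station_id": s["id"],
            "name": s["name"],
            "position": (s["lat"], s["lon"]),
            "bikes": s["bikes_available"],
            "docks": s["docks_available"],
        })
    return records

stations = parse_stations(SAMPLE_FEED)
# Stations with no bikes left are candidates for rebalancing.
empty = [s["name"] for s in stations if s["bikes"] == 0]
print(empty)
```

Each record here corresponds roughly to what would become one AF element (the station) with a handful of attributes.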

We also had access to historical trip data from different bike sharing systems. This data came in the form of CSV files, with each row entry representing a single bike trip. Each individual trip is tracked using an RFID chip embedded in the bike, and can tell us the start time/end time of a trip, start station/end station and whether the trip was taken by a member or a casual user.

The specifics of the historical data vary from system to system. Montreal’s BIXI system only offers the trip duration and account type on top of the trip information, whereas New York City’s CITIBike also offers data on members’ gender and age.




We then brought in data sets which we thought might affect bike usage – data such as weather data, festival and events data, traffic data, and public transit disruptions. Weather data was collected from an open weather API, and public transit data was collected using a web scraper and the PI Connector for UFL.

Importing data into the PI System:

Much of our data was text-based, so we chose to use the PI Connector for UFL to import the data into our PI System. The Connector was set up using the REST endpoint configuration for most of our data sources. It was also set up to create PI AF elements and our PI AF hierarchy.

We stored most of our live data as PI Points, while the historical trip data was stored as Event Frames.


To learn more about using the PI Connector for UFL to create event frames, click here.

To learn more about using the PI Connector for UFL as a REST endpoint, click here.

To learn more about creating PI AF elements with the PI Connector for UFL, click here.


Historical Data Analyses:


The historical bike trips have a designated start and end time, so it seemed to us that the best way to store this data was as Event Frames, with one Event Frame for each bike trip. We used the PI Connector for UFL to create Event Frames from our CSV files, and loaded more than 5 million Event Frames into our PI System!

We used the PI Integrator for BA to publish various “Event Views”, and then used a Business Intelligence tool, Microsoft Power BI, to access and visualize the data.


June 2015 Trips vs Rain.png

Above, we can clearly see that the days with the highest usage had zero rain events; in other words, they were sunny days (as shown by the variable “temp_numerical”). The days with the fewest trips had the most rain events.



Above, we have a dashboard that represents the trips from June 2015. It holds a total of 556,000 trips.

On the top left, we can observe bubbles that represent the location of the stations. The color of each bubble shows which neighborhood the station belongs to, and the size of the bubble indicates the number of trips made from that station.

On the bottom right, we can determine that “Le Plateau” neighborhood is the most popular, with “Downtown” coming in second place.

Next, we can slice through “Le Plateau” using the donut chart on the top right, to determine whether all of the stations were equally used or whether some stations outperformed others.



It looks like some stations outperform others. Why is that? Two nearby stations can differ widely: one recorded 7140 trips, whereas the other recorded only 2378. The station nearest to a metro station sees the most trips because it is the path of least resistance.




The same was true for the “Downtown” neighborhood!

Since the bike share companies know approximately how much revenue each station generates, the city can make better financial decisions about withdrawing under-performing stations, because at some point maintenance costs outweigh profit.




Live Data Analyses:


Using AF analyses and the live data available, we created several KPIs in order to keep track of usage at each station throughout a city’s bike sharing network. These include station utilization (how empty a station is), and 4/8/12/24 hour combined in/out traffic events.
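As a rough illustration of how such KPIs could be derived from the live feed, here is a Python sketch. The definitions are assumptions and not the exact AF analyses used in the project: utilization is taken as the share of empty docks, and "traffic events" as the sum of absolute changes in the bike count between successive samples. Because the feed updates only every 10-15 minutes, a bike taken and one returned between two samples cancel out and are not counted.

```python
from datetime import datetime, timedelta

def utilization(bikes, docks):
    """Share of empty docks (assumed definition: 0 = full of bikes, 1 = no bikes)."""
    total = bikes + docks
    return docks / total if total else 0.0

def traffic_events(samples, end, hours=4):
    """Sum of absolute changes in bike count over the window ending at `end`.

    samples: list of (timestamp, bikes_available), sorted by timestamp.
    """
    start = end - timedelta(hours=hours)
    window = [(t, b) for t, b in samples if start <= t <= end]
    return sum(abs(b1 - b0) for (_, b0), (_, b1) in zip(window, window[1:]))

# Hypothetical samples for one station on one morning.
samples = [
    (datetime(2016, 7, 18, 7, 0), 10),
    (datetime(2016, 7, 18, 7, 15), 8),
    (datetime(2016, 7, 18, 7, 30), 9),
    (datetime(2016, 7, 18, 11, 0), 3),
]
four_hour_total = traffic_events(samples, datetime(2016, 7, 18, 11, 0))
print(four_hour_total, utilization(bikes=3, docks=9))
```

Note that the value stamped at 11:00 summarizes the preceding four hours, which is why a rolling metric like this peaks at the end of a busy period rather than in the middle of it.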


Let’s take a look at the 4 hour combined in/out traffic events in Montreal, over one weekday:



4 Hr total traffic - Montreal only.png

We see that there are two characteristic peaks, one at 11 am (representing traffic from 7 am - 11 am) and another at 8 pm (representing traffic from 4 pm - 8 pm). As expected, this corresponds to commuting times for the standard 9-5 workday. But is this trend repeated on the weekends?

Let’s look at the same metric, but over one week. Different colors indicate different days. (Saturday-Friday, Left-Right)

We can see the two characteristic weekday peaks which were present in the previous graph, but the usage on weekends tells a different story. There is only one usage peak on the weekend, since users are not using the bikes to commute to work.


Can we use the PI System to explain why the midday usage on Monday has dropped off compared to other weekdays? Let’s bring in our weather data and see if it had any effect on the usage.


The added purple line, which varies from 0 to 1, marks when it was raining in Montreal, with "1" representing rainy conditions. Now we have a clear explanation for why usage dropped during the day on Monday, July 18th: it was raining!






Predicting future demand:


A key issue which bike share system managers must deal with is the need to ensure bikes are available to users at all times. This means they often need to redistribute bikes from one station to another to ensure bike availability. System administrators need to be able to predict usage at a station level and on the system level to account for this.

Using data collected with the PI System, we can predict future usage at the station level and at the network level. We looked at the value at a given time, say 3:00 PM on a Monday, and then averaged the values from the same time over the past three weeks. We were able to predict future usage within a margin of error of 2%-5%. We store the forecasted usage using AF analyses and the PI System’s “future data” option.
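The averaging approach can be sketched in a few lines of Python. This is only an illustration of the idea, not the AF analysis itself; the `history` dictionary and its values are made up.

```python
from datetime import datetime, timedelta

def forecast(history, target, weeks=3):
    """Average the values seen at the same weekday and time in the previous weeks.

    history: dict mapping datetime -> observed value.
    """
    past = []
    for k in range(1, weeks + 1):
        stamp = target - timedelta(weeks=k)
        if stamp in history:
            past.append(history[stamp])
    if not past:
        raise ValueError("no history for this time slot")
    return sum(past) / len(past)

# Hypothetical usage values for the three previous Mondays at 3:00 PM.
history = {
    datetime(2016, 7, 18, 15, 0): 120,
    datetime(2016, 7, 11, 15, 0): 100,
    datetime(2016, 7, 4, 15, 0): 110,
}
target = datetime(2016, 7, 25, 15, 0)  # the following Monday, 3:00 PM
predicted = forecast(history, target)
print(predicted)  # 110.0
```

Averaging the same time slot across weeks keeps the weekday commuting pattern intact, which a plain moving average over recent hours would smear out.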

Mapping Smart City assets with the PI Integrator for ESRI ArcGIS and ESRI Online:

To learn more about mapping smart city assets with the PI Integrator for ESRI ArcGIS, click here.


From the beginning of the project, we knew that the geographical location of our assets would be incredibly important for our analysis. The usage of any bike station is directly tied to its location – those located closer to major office buildings, pedestrian malls, and points of interest will experience an increase in usage. By mapping our assets (along with all associated PI points and KPIs), we can view any geographical trends that might emerge.


We started off by mapping our AF assets (bike stations in Montreal) using ESRI Online. The examples below show how we can view our elements in a much more intuitive way, and combine secondary data sets such as the location of bike paths, subway stations and bike accidents. We set the marker size to be tied to a specific station attribute, the 4-hour total in/out traffic events.


We can click on any circle and bring up a configured pop-up showing all relevant AF attributes and KPIs.



We can then further enrich this data by adding other map layers. For example, we can see that most bike stations coincide with the location of bicycle paths (left) and the location of subway stations (right).







In this post, we were able to show the value of the PI System in analyzing bike sharing systems, which are just one small aspect of a smart city. The PI System excelled at data collection, time-based analysis, and helping us visualize numerous datasets. We could have performed a much more in-depth analysis if we had access to other closed smart city datasets such as power consumption, waste management and more. The potential for further analysis is there; all we need is access to the data.




Demo Video


OSIsoft's New Offices

Posted by chuck Employee Aug 14, 2016

Have you seen the progress of construction for OSIsoft's new offices in the new San Leandro Tech Campus (SLTC) ?


Just this month the base of the sculpture Truth in Beauty was installed.  The building is all closed in and interior finishing is progressing rapidly.  The parking structure is quickly coming together.

If interested in watching the progress, there are some videos on YouTube you can enjoy! 


OSIsoft Headquarters (new) construction - view 1 -

OSIsoft Headquarters (new) construction - view 2 -

Parking structure construction -

View from the new office building - currently the tallest building in San Leandro

Sculpture installation - part 1 -


... and more information from the official website for the San Leandro Tech Campus -


( No I didn't make any of the videos, but I am really impressed with what others have done to capture SLTC construction progress! )

Last time we showed how to choose a database and print some information. Today we will look at how to get an attribute value.

So we reached the database already:

PISystems myPIsystems = new PISystems();
AFDatabases myDatabases = myPIsystems["W-SRV1"].Databases;
AFDatabase myDatabase = myDatabases["test"];

How to reach the element

I have this structure in my database:

If I would like to reach the Element1, I need this code:

AFElements myElements = myDatabase.Elements;
AFElement myElement = myElements["Element1"];

This is OK, but what to do, if I would like to reach child_Element1?


There are two options. The first is to do the same again and select the element from the elements under Element1:

AFElements myElements2 = myElement.Elements;
AFElement myElement2 = myElements2 ["child_Element1"];

Or use the path in the beginning:

AFElements myElements = myDatabase.Elements;
AFElement myElement = myElements["Element1\\child_Element1"];

NOTE: The \ symbol has a special function (escaping) inside a string, so we need to write \\, which is translated to a single \ when the string is used.
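To illustrate both points without the AF SDK, here is a small Python sketch. Python handles `\\` in a regular string the same way as C# does in this case. The nested dictionary is only a stand-in for the element tree and `find` is a hypothetical helper, not an AF SDK call.

```python
# Mock hierarchy as nested dicts; a stand-in for the AF element tree.
database = {"Element1": {"child_Element1": {}, "child_Element2": {}}}

def find(elements, path):
    """Walk one level down per backslash-separated name in the path."""
    node = elements
    for name in path.split("\\"):
        node = node[name]
    return node

# Option 1: step by step, one collection at a time.
step_by_step = database["Element1"]["child_Element1"]

# Option 2: a single path; "\\" in the source is one backslash in the string.
path = "Element1\\child_Element1"
by_path = find(database, path)

print(path)                     # shows a single backslash
print(by_path is step_by_step)  # both options reach the same element
```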

How to reach the attribute

Now we have myElement and we would like to get an attribute under this element. This works the same way as before:

AFAttributes myAttributes = myElement.Attributes;
AFAttribute myAttribute = myAttributes["Attribute1"];

As you see nothing changed, we still use the same logic as with elements.


How to get the value

To get the value we first need to get it as AFValue:

AFValue myAFValue = myAttribute.GetValue();

This is the current value. How to get a value from a specific time will be shown in the next chapter.

We can now get the float value from the AFValue. The Value property of an AFValue is of type object, so I cast it to float (my value is a float and I know it).

float myValue = (float) myAFValue.Value; 

Printing information

Now we can print some information:

Console.WriteLine("Attribute path: {0}", myAttribute.GetPath());
Console.WriteLine("Value: {0}", myValue);
Console.WriteLine("Timestamp: {0}", myAFValue.Timestamp);
Console.WriteLine("Value type: {0}", myAFValue.ValueType);
Console.WriteLine("Source PI Point: {0}", myAFValue.PIPoint);



So here is the code with shortcuts

AFElement myElement = myDatabase.Elements["Element1\\child_Element1"];
AFAttribute myAttribute = myElement.Attributes["Attribute1"];
AFValue myAFValue = myAttribute.GetValue();
float myValue = (float) myAFValue.Value;


So now we are able to get the current value of an attribute and some other information about it. How to get a value from a specific time will be the topic of the next chapter.


Hardware LifeCycles (OpenVMS)

Posted by chuck Employee Mar 29, 2016

A couple years ago I posted articles inquiring as to the state of OpenVMS and PI System dependencies on OpenVMS within our PI System user community.  Now that a couple of years or so have passed - just the other day I was thinking about this again.  There still remain a number of PI systems, mostly PI3 servers, which depend on interface nodes or application nodes running OpenVMS.  A goodly number of these OpenVMS systems, based on VAX and Alpha platforms, are considered mission or business critical.


Just how long can we expect this stuff to run that we type on and archive PI data with?  Let's think about just the disk drives for a moment… 

Disk drives for “modern” OpenVMS systems advertised very long lifetimes and smart hardware.  Drives in OpenVMS systems that were new in the late 1990s and early 2000s were quite capable: they implemented advanced RAID and caching features in the drive hardware itself, whereas other computing systems of the same era may have implemented such features in disk controllers, in microcode on processor boards, or in operating system disk device drivers.


A typical Wide SCSI drive from this era might have advertised 800000 hours MTBF (nearly 100 years!), however… That was a hugely big "However":  MTBF was affected by handling, storage, power up method (and power fail), error rate and so on.

  • Nearly 20% of storage capacity of such drives was reserved for system/hardware functions such as engineering partition, dedicated head landing zone, sectors reserved for error replacement, and auto calibration/head alignment functions.
  • These disks anticipated an average of as many as 100 sectors failing per year of operation. 
  • The way DC power was applied to the drive and the manner of application of the first SCSI controller commands after power up affected drive life and lifecycle.

As a consequence of these features, disk life was expected to be only 5 years in normal use.  Here at OSIsoft we are experiencing average lifetime of these disks to range from 5-8 years.
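The gap between the advertised MTBF and the observed 5-8 year life is easier to see with the arithmetic written out. The sketch below converts the 800000 hour figure to years and, assuming a constant failure rate (an assumption of this example, not a claim from the drive's specification sheet), to an approximate annual failure probability.

```python
import math

HOURS_PER_YEAR = 24 * 365.25

mtbf_hours = 800000
mtbf_years = mtbf_hours / HOURS_PER_YEAR

# With a constant failure rate, the chance that a given drive fails
# within one year of continuous operation is 1 - exp(-t / MTBF).
annual_failure_probability = 1 - math.exp(-HOURS_PER_YEAR / mtbf_hours)

print(round(mtbf_years, 1))
print(round(100 * annual_failure_probability, 2), "% per year")
```

Read this way, MTBF describes the failure rate across a large population of drives during their service life, not how long any single drive is expected to last.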


Similarly limited lifetimes apply to our OpenVMS servers' disk controllers, memory boards, processor boards and so on.  The last VAX/VMS system was built in September and shipped in December of 2000.

And we know the last Alpha OpenVMS system was ordered in April 2007 and shipped later that year.  We can expect nearly all VAX and Alpha based OpenVMS systems to be past end of life at this point in time.


Seagate Barracuda ST15150N specification sheet and reference manual

Hi all,


I have created a .Net application that can help in the maintenance of PI Datalink Reports in Excel.   Some of you may find this extremely useful.


This program was developed out of a need on a project I was working on, where we are renaming around 100,000 PI tags as part of a PI tag standardization effort.  Renaming PI tags has a detrimental effect on PI Datalink reports and can also impact VBA code that contains PI tag names.


I have a screen shot of the GUI below that shows where you select an input folder which contains all the Excel files that you want to convert, a conversion file which contains all the old and new PI tag names, and an output folder to store the converted files.  I have tested this with 100,000 tag names.  You can easily handle all of the Excel reports in your system in a single batch, test them, and then reload them to your production folder afterwards.
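The core idea of such a conversion can be sketched in Python. This is not the author's .NET implementation: the conversion table, tag names and server name below are made up, and the sketch only handles tag names that appear as double-quoted strings inside a formula.

```python
import re

# Hypothetical conversion table: old tag name -> new tag name.
conversion = {
    "FIC101.PV": "UNIT1.FLOW.FIC101.PV",
    "TI205": "UNIT2.TEMP.TI205",
}

def rename_tags(formula, mapping):
    """Replace double-quoted strings that exactly match an old tag name."""
    def swap(match):
        name = match.group(1)
        return '"%s"' % mapping.get(name, name)
    return re.sub(r'"([^"]*)"', swap, formula)

old_formula = '=PICurrVal("FIC101.PV",0,"MYSERVER")'
new_formula = rename_tags(old_formula, conversion)
print(new_formula)
```

Matching whole quoted strings, rather than searching for the old name anywhere in the text, avoids renaming a tag such as TI2050 just because its name starts with TI205. A real tool would also need to handle case-insensitive tag names and tag names stored in cells that formulas reference.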


You can get a trial version of the software at






The LazyPI project was created to provide an API that feels more like the native AFSDK. The project has come along nicely and is ready for testing. Testing requires the development of Unit Tests and access to a WebAPI server. The currently implemented pieces should allow for basic management of AFElements, AFEventFrames, and AFAttributes. LazyPI will simplify interactions with the WebAPI and reduce the required knowledge of WebAPIs.


Contributors Needed

Anyone that might be interested in helping to test or expand on the library please see the GitHub page. If you would be interested in contributing in any way please fill out this super short application.


Outstanding Tasks

  • Find and report bugs
  • Create Unit Tests for all implemented pieces
  • Implement "Controller" objects that are missing (ex. AFTable, Calculations)
  • Iron out design flaws



This code is still not ready for production as of Feb. 15, 2016. Please do not use LazyPI in its current state for important projects.

There have been a few posts about inputting data into the PI system from devices such as the Raspberry PI or Arduino using the UFL interface and PI Web API (see here, here and here).  In Carlos’s post he mentions projects using the HTML interface too, though I couldn’t find any details on it, so I thought I would share some of my experiences using the HTML interface to bring data into the PI system from a Raspberry PI.




A few months ago I had some free time and wanted to learn more about both Python and SQL, so I started playing around with both on my Raspberry PI at home. Whilst I was waiting for a temperature sensor to arrive (postage to Australia can be slow!), I started by using Python to measure some performance statistics about my Raspberry PI. I used Python to read the performance statistics and write them to a MySQL database on the Raspberry PI.  I then used PHP to access the data in MySQL from a webpage, see below:


Fast forward a little while to the new year period and I again had some free time on my hands so I decided to configure a HTML interface to read this data into my PI server, so the data flow now looks like this:



The Raspberry PI is a model B running Raspbian, the HTML interface is version and the PI server (AF and Data Archive) are 2015 R2.




Once MySQL and Apache were installed, I used the following Python script to read the performance statistics and write them to the MySQL database. I used crontab to run the script every minute.


#! /usr/bin/env python
# Reads Raspberry PI performance statistics and inserts them into MySQL.
# Written to run under Python 2.7 or Python 3.

import subprocess, os, datetime, sys
import MySQLdb as mdb

def get_ram():
  "Returns a tuple (total ram, available ram) in megabytes"
  # Assumes the older `free` layout, where the third line is "-/+ buffers/cache"
  try:
    s = subprocess.check_output(["free","-m"], universal_newlines=True)
    lines = s.split('\n')
    return ( int(lines[1].split()[1]), int(lines[2].split()[3]) )
  except Exception:
    return ( 0 , 0 )

def get_process_count():
  "Returns the number of processes"
  try:
    s = subprocess.check_output(["ps","-e"], universal_newlines=True)
    return len(s.split('\n'))
  except Exception:
    return 0

def get_up_stats():
  "Returns a tuple (uptime, 5 min loading average)"
  try:
    s = subprocess.check_output(["uptime"], universal_newlines=True)
    load_split = s.split('load average: ')
    load_five = float(load_split[1].split(',')[1])
    up = load_split[0]
    up_pos = up.rfind(',',0,len(up)-4)
    up = up[:up_pos].split('up ')[1]
    return ( up , load_five)
  except Exception:
    return ( '' , 0 )

def get_connections():
  "Returns the number of network connections"
  try:
    s = subprocess.check_output(["netstat","-tun"], universal_newlines=True)
    return len([x for x in s.split() if x == 'ESTABLISHED'])
  except Exception:
    return 0

def get_temperature():
  "Returns the temperature in DegC"
  try:
    s = subprocess.check_output(["/opt/vc/bin/vcgencmd","measure_temp"], universal_newlines=True)
    return float(s.split('=')[1][:-3])
  except Exception:
    return 0

def get_ipaddress():
  "Returns the current IP address"
  arg='ip route list'
  p = subprocess.Popen(arg, shell=True, stdout=subprocess.PIPE, universal_newlines=True)
  data = p.communicate()
  split_data = data[0].split()
  ipaddr = split_data[split_data.index('src')+1]
  return ipaddr

def get_cpu_speed():
  "Returns the current CPU speed"
  f = os.popen('/opt/vc/bin/vcgencmd get_config arm_freq')
  cpu = f.read()
  return cpu

def get_current_time():
  "Returns the current time as a string"
  ctime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
  return ctime

def get_free_MB():
  "Returns the number of available MB on the SD card and % free"
  x = os.statvfs(".")
  freeMbytes = x.f_bsize * x.f_bavail
  totalMbytes = x.f_bsize * x.f_blocks
  # convert bytes to MB
  freeMbytes = freeMbytes // (1024*1024)
  totalMbytes = totalMbytes // (1024*1024)
  PCfree = 100 * (float(freeMbytes)/float(totalMbytes))
  return ( freeMbytes , PCfree )

FreeRam = get_ram()[1]
Processes = get_process_count()
UpTime = get_up_stats()[0]
Connections = get_connections()
Temp = get_temperature()
Time = get_current_time()
FMB = get_free_MB()
FreeMB = FMB[0]
PCFreeMB = FMB[1]

# Insert values into SQL
con = None
try:
  con = mdb.connect('localhost', 'XXXXXX', 'XXXXXX', 'perfmon')
  cur = con.cursor()
  cur.execute("INSERT INTO NalaStats (id,timestamp,freeram,processes,uptime,connections,temp,freeMB,PCfreeMB) VALUES (NULL,%s,%s,%s,%s,%s,%s,%s,%s);", (Time,FreeRam,Processes,UpTime,Connections,Temp,FreeMB,PCFreeMB))
  # a commit is needed unless autocommit is enabled on the connection
  con.commit()
except mdb.Error as e:
  print("Error %d: %s" % (e.args[0],e.args[1]))
  sys.exit(1)
finally:
  if con:
    con.close()


I also set up the following script to run every week to delete data in MySQL which was more than 14 days old so as not to fill up all the space on the SD card.


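#! /usr/bin/env python
# Deletes rows older than TimeRange days from the NalaStats table.

import datetime, sys
import MySQLdb as mdb

# Number of days to store data
TimeRange = 14

# Delete historical data from SQL
con = None
try:
  con = mdb.connect('localhost', 'XXXXXX', 'XXXXXX', 'perfmon')
  cur = con.cursor()
  # query parameters are passed as a tuple, even when there is only one
  cur.execute("DELETE FROM NalaStats WHERE TIMESTAMPDIFF(DAY,timestamp,NOW()) > %s;", (TimeRange,))
  # a commit is needed unless autocommit is enabled on the connection
  con.commit()
except mdb.Error as e:
  print("Error %d: %s" % (e.args[0],e.args[1]))
  sys.exit(1)
finally:
  if con:
    con.close()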


The code for the webpage is below. It grabs the data from MySQL and also does some aggregate calculations (though I don’t read them into PI).


  <h1>Nala Performance Monitor</h1>
  <?php
  // Connection information
  $server ="localhost";
  $username = "XXXXXX";
  $password = "XXXXXX";
  $dbname = "perfmon";

  // Create connection and select database
  try {
  $DBH = new PDO("mysql:host=$server;dbname=$dbname",$username,$password);
  } catch(Exception $e){
  echo $e->getMessage();
  // Stop here: nothing below can work without a connection
  exit;
  }

  // Prepare to query MySQL

  // Get most recent values from MySQL
  $STH_Rec = $DBH->query("SELECT * FROM NalaStats ORDER BY id DESC LIMIT 1");
  // Get averages from MySQL
  $STH_1hrAvg = $DBH->query("SELECT AVG(temp) FROM (SELECT temp FROM NalaStats ORDER BY id DESC LIMIT 60) AS temperature");
  $STH_24hrAvg = $DBH->query("SELECT AVG(temp) FROM (SELECT temp FROM NalaStats ORDER BY id DESC LIMIT 1440) AS temperature");

  // Set the fetch mode (rows are returned as arrays keyed by column name)
  $STH_Rec->setFetchMode(PDO::FETCH_ASSOC);
  $STH_1hrAvg->setFetchMode(PDO::FETCH_ASSOC);
  $STH_24hrAvg->setFetchMode(PDO::FETCH_ASSOC);

  // Fetch data
  $Rec = $STH_Rec->fetch();
  $Temp1hrAvg = $STH_1hrAvg->fetch();
  $Temp24hrAvg = $STH_24hrAvg->fetch();

  // Print values
  print "<P>Measurement #: <B>" . $Rec['id'] . "</B></P>";
  print "<P>Timestamp: <B>" . $Rec['timestamp'] . "</B></P>";
  print "<P>Free RAM: <B>" . $Rec['freeram'] . "</B> MB</P>";
  print "<P>Free disk space: <B>" . $Rec['freeMB'] . "</B> MB</P>";
  print "<P>Free disk space: <B>" . round($Rec['PCfreeMB'],1) . "</B> %</P>";
  print "<P>Processes: <B>" . $Rec['processes'] . "</B></P>";
  print "<P>System Uptime: <B>" . $Rec['uptime'] . "</B></P>";
  print "<P>Connections: <B>" . $Rec['connections'] . "</B></P>";
  print "<P>Temperature: <B>" . $Rec['temp'] . "</B> DegC</P>";
  print "<P>Temperature 1 hour average: <B>" . round($Temp1hrAvg['AVG(temp)'],1) . "</B> DegC</P>";
  print "<P>Temperature 24 hour average: <B>" . round($Temp24hrAvg['AVG(temp)'],1) . "</B> DegC</P>";

  // Cleanup
  $DBH = null;
  ?>




Here is an AF element template I created for my Raspberry PI performance monitor. It is configured so that the tags can be created automatically: to roll out an element from this template, the user just needs to enter the data in the ‘Tag Configuration’ attributes and press ‘Create or Update Data Reference’, and all tags will be created.

Here are the current tag values. Note that 226693 measurements at 1 measurement/minute is around 157 days.  The longest up-time I have had so far is around 100 days, showing that the platform is fairly stable (being in my living room, the biggest threats it faces are me doing the vacuuming and my cat's curiosity).

An interesting thing I found was that it is possible to determine when I had the air conditioning on by looking at the CPU temperature of the Raspberry PI. You can see in the plot below that the temperature drops markedly (more than 3 degrees in about 15 minutes) when the air conditioning is turned on.


Another interesting observation is the behaviour of the free disk space on the Raspberry PI. Data is inserted into a SQL table every minute and old data is deleted once a week, and you can see this pattern reflected in the free space available.


Final Thoughts

  1. It was interesting to see how I could accurately determine whether the air conditioning was on based on a measurement of the core temperature rather than measuring the air temperature directly.  It goes to show how much information you can get from data if it is available.  I think my next steps may be to use analyses and event frames to see what else I can glean from this.
  2. The timestamp of my measurements is consistently 2 seconds past the minute though it is scheduled to run on the minute.  This probably reflects a combination of the relatively inefficient way I have implemented this and the limitations of the Raspberry PI.
  3. Configuring the HTML interface was really easy because I had complete control of both the interface configuration and the data source (the web page). Having this degree of control is relatively uncommon in my experience, but it helped a lot in this case.  The reason the values on the webpage are in bold (first figure above) is that I used the bold HTML markers to find the values with the HTML interface.
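The idea of using the bold markers as anchors can be illustrated with a short Python sketch. This is not how the HTML interface itself is configured; `bold_value` is a hypothetical helper and the page fragment is a shortened version of what the PHP page prints.

```python
import re

# Shortened sample of the page produced by the PHP script.
page = ('<P>Free RAM: <B>312</B> MB</P>'
        '<P>Temperature: <B>47.2</B> DegC</P>')

def bold_value(html, label):
    """Return the text inside the <B> tag that follows `label`, or None."""
    match = re.search(re.escape(label) + r'\s*<B>(.*?)</B>', html)
    return match.group(1) if match else None

temperature = float(bold_value(page, "Temperature:"))
free_ram = int(bold_value(page, "Free RAM:"))
print(temperature, free_ram)
```

Because each value sits between a fixed label and a pair of `<B>` tags, the extraction stays stable even if the surrounding layout of the page changes.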

There is no way to send a negative number directly to a digital PI tag. Or more accurately, sending a negative integer to a digital state tag will result in a system digital state being written to the tag. For more details on why that happens, see this KB article.


Scott Robertson recently introduced me to a clever work-around for this issue. Instead of sending the negative number directly to the digital state tag you can instead do the following:

  1. Store the value in an integer PI Tag. We will call this PI Tag JANET.
  2. Reference JANET in an AF attribute. We will call the AF attribute CHRISSY.
  3. Use an enumeration set for CHRISSY that translates the integer into a string, and then you have two options:
    1. Write the string from CHRISSY back to a second digital PI Tag using a PI Analysis Service Expression. We will call the second PI Tag JACK.
    2. Use CHRISSY directly and save yourself a PI Tag.


If you do not have PI AF Server at your site, then you have one option:

  1. Store the integer value in JANET like before.
  2. Create JACK as a digital Performance Equation (PE) tag that uses a long, nested IF THEN ELSE statement to perform the translation from the integer to the digital state.
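The translation step itself is small. As a sketch, here it is in Python, once as a lookup table (the role played by the enumeration set) and once as a nested conditional (the shape a Performance Equation would take). The integer values and state names are made up for illustration.

```python
# Hypothetical enumeration set: integer value -> state name.
STATES = {-1: "Fault", 0: "Stopped", 1: "Running"}

def to_state(value, default="Unknown"):
    """Lookup-table translation, as an enumeration set would do."""
    return STATES.get(value, default)

def to_state_nested(value):
    """The same translation written as a nested IF THEN ELSE chain."""
    if value == -1:
        return "Fault"
    else:
        if value == 0:
            return "Stopped"
        else:
            if value == 1:
                return "Running"
            else:
                return "Unknown"

for raw in (-1, 0, 1, 7):
    print(raw, to_state(raw), to_state_nested(raw))
```

The lookup table is easier to extend, which is one reason the enumeration set route is more comfortable than a long nested expression when there are many states.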


If you are new to enumeration sets, then these videos will help you get started:

Most users assume that setting compression deviation to 0 will mean that even the smallest change in a tag's value is still archived and that repeated values are discarded. This assumption is valid for the vast majority of data. However, there are two special cases where that assumption is not true. In an earlier blog post I described why a user can see repeated values in his or her archive even when compression deviation is set to 0. In this post I will address the other special case where changes in a tag's value are not archived even when the compression deviation is set to 0.


Let's say we have incoming data from an integer tag that looks like this:


We would expect to see each value in the archive because the value is changing. What we actually see in the archive is the following:


The values 2, 3, and 4 are compressed out because they fall exactly on the line between 1 and 5. In other words, their deviation from the slope of the line is 0, which is equal to the specified compression deviation; therefore, they do not pass compression. This is the key difference between turning compression off and setting the compression deviation to 0. With compression off, all values are archived. If compression is on and the compression deviation is set to 0, then all values whose deviation* is greater than the compression deviation are kept.


Note that we have not lost any information after compression. If we were to interpolate for the value at 00:00:02, then we would get a value of 2, which is exactly the value that entered the snapshot and was compressed out.


*that is the deviation from the line drawn between the most recent archive value and the newest value
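The test described in this post can be sketched in Python. This is a simplified illustration of the rule as stated here (compare the held snapshot against the line from the last archived value to the newest value); it is not the PI Data Archive's actual compression implementation, and the one-second timestamps are assumed from the example.

```python
def compress(events, comp_dev=0.0):
    """Simplified compression test.

    events: list of (time, value) pairs sorted by time.
    Returns (archive, snapshot): archived events and the newest, unarchived one.
    """
    if not events:
        return [], None
    archive = [events[0]]  # the first event is always archived
    snapshot = None        # newest event received, not yet archived
    for new in events[1:]:
        if snapshot is not None:
            t0, v0 = archive[-1]
            ts, vs = snapshot
            tn, vn = new
            # Value of the line (last archived -> newest) at the snapshot time.
            on_line = v0 + (vn - v0) * (ts - t0) / (tn - t0)
            # Keep the snapshot only if it deviates by MORE than comp_dev.
            if abs(vs - on_line) > comp_dev:
                archive.append(snapshot)
        snapshot = new
    return archive, snapshot

ramp = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5)]
archive, snapshot = compress(ramp, comp_dev=0.0)
print(archive, snapshot)
```

With the ramp above, only the first value is archived and 5 remains in the snapshot; the values 2, 3 and 4 deviate from the line by exactly 0, which is not greater than 0, so they are dropped.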

First, if you haven't seen OSIsoft: Exception and Compression Full Details - YouTube, then go watch it now because the rest of this post assumes you have seen it.


I have gotten this question a few times from customers, "I see repeated values in my archive even though I have compression turned on. I thought compression prevented repeated values from occurring. What is wrong with my compression settings?" Chances are, there isn't anything wrong with your compression settings.


There are two possible reasons you are seeing repeated values:

  1. Your compression maximum value is taking effect.
  2. The values in the archive are necessary to avoid losing information.


The CompMax value for a tag determines the maximum amount of time between data points in the archive. By default it is set to eight hours. If no events pass the compression test for eight hours, then the current snapshot will be archived as soon as the next value is received. This means that the time between values in the archive will always be less than or equal to eight hours because the timestamp of the current snapshot must be at or before the eight hour mark. If you are looking in your archive and seeing identical values whose timestamps are almost but not quite 8 hours apart, then you are probably exceeding CompMax.
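
As a rough illustration of the CompMax rule, the following Python sketch models a tag whose value never changes, so that only CompMax forces values into the archive. It is an assumption-laden simplification and not the PI Data Archive's implementation: the function name `archive_with_compmax`, the use of plain seconds for timestamps, and the exact comparison (strictly more than CompMax since the last archived event) are all choices made for the example.

```python
COMP_MAX_SECONDS = 8 * 60 * 60  # default CompMax described above: eight hours


def archive_with_compmax(event_times, comp_max=COMP_MAX_SECONDS):
    """Return archived timestamps for a tag whose value never changes.

    event_times: increasing timestamps in seconds. Because the value is
    constant, no event passes the compression test, so only CompMax
    forces a value into the archive.
    """
    archived = [event_times[0]]
    snapshot = event_times[0]
    for t in event_times[1:]:
        # Assumption: the check is "more than comp_max since the last
        # archived event"; the previous snapshot is then archived.
        if t - archived[-1] > comp_max and snapshot != archived[-1]:
            archived.append(snapshot)
        snapshot = t
    return archived


# One identical value every 50 minutes for a day.
times = list(range(0, 24 * 3600 + 1, 50 * 60))
archived = archive_with_compmax(times)
print([(b - a) / 3600 for a, b in zip(archived, archived[1:])])  # [7.5, 7.5, 7.5]
```

The archived values end up 7.5 hours apart, which is the "almost but not quite 8 hours" spacing to look for.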


If the spacing between repeated values is much less than your CompMax setting, then you are probably running into the second scenario. As an example, let's say you have a tag writing a value of 1 or 0 once a second, and your archive data looks like this:


And your question is, "Why is the data 1,1,0,0,1,1 instead of 1,0,1?" If we plot the first data set that has the repeated values, and the Step attribute for the tag is set to OFF, then we should see a plot that looks like this:

Note that I have altered the spacing on the x-axis to highlight the transitions between 1's and 0's. If we plot the second data set that does not have repeated values, then we should see a plot that looks like this:

Now we have lost information. The first plot showed that the value held steady for an hour, changed suddenly, held steady at the new value, and then changed suddenly back. In the second plot, the values changed gradually over the course of an hour, and then gradually changed back.
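
The difference between the two plots can also be checked numerically. The sketch below linearly interpolates both data sets half an hour in. The timestamps are assumptions chosen to match the description (one value per second, each level held for an hour), and `interpolate` is a helper written for this example, not a PI function.

```python
def interpolate(archive, t):
    """Linearly interpolate a value at time t from (time, value) events."""
    for (t0, v0), (t1, v1) in zip(archive, archive[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    raise ValueError("t is outside the archived time range")


HOUR = 3600
# Archive with the repeated values kept (1,1,0,0,1,1).
with_repeats = [(0, 1), (HOUR - 1, 1), (HOUR, 0),
                (2 * HOUR - 1, 0), (2 * HOUR, 1), (3 * HOUR - 1, 1)]
# Archive with the repeated values removed (1,0,1).
without_repeats = [(0, 1), (HOUR, 0), (2 * HOUR, 1)]

print(interpolate(with_repeats, HOUR // 2))     # 1.0: still holding at 1
print(interpolate(without_repeats, HOUR // 2))  # 0.5: looks like a gradual decline
```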


Many users think that setting compression deviation to 0 and leaving compression on will eliminate all repeated values. This is not without cause: at 10:03 in the video mentioned at the beginning of the article, the narrator states, "[Setting] compression to on [and setting] the compression deviation to zero... means that successive identical values will not be archived." This statement is true the vast majority of the time, but not always. In our first plot above, only two values in an entire hour's worth of data were archived; the majority of repeated values were discarded. We had to keep the two values at the start and end of each horizontal line in order to show that the value did not change for the whole hour. As we saw in the second plot, discarding the end point of each horizontal section completely alters the information stored in the archive.


And that is why you see repeated values in your archive even when compression deviation is set to zero.

I’ve seen a lot of questions about URLs from notifications to Coresight, so I decided to write a blog post on the topic.

During my tests I found two main ways of doing this, but I have to say I prefer the built-in “Web Link” feature in notifications.

The second option is to specify the URL in an attribute and add the attribute value to the notification message. This works if you don’t have any spaces in your server or database name but that wasn’t the case in my environment so this is how it turned out:

How it’s done

First of all, I created a ProcessBook ERD display. As you may know, you can set the element of interest in Coresight by specifying it in the URL with “?CurrentElement=\\SERVER\DATABASE\PathToElement\”. (If you don’t want to use a ProcessBook ERD display, just use “?Asset=\\SERVER\DATABASE\PathToElement\”.)


First, publish your ProcessBook display to Coresight.

Then navigate to that display in order to get the correct URL.

Copy the URL and start configuring the notification.

On the right-hand side, click “Add->Web Link->Other”.

Display name: Open in Coresight

Link address: http://complete/url/to/coresight/display (for example, “https://democor/coresight/#/PBDisplays/55/”)


Then add a parameter.
For a ProcessBook display, the parameter name is “CurrentElement”.
For a Coresight display, the parameter name is “asset”.
The value is simply the Target (path).

Just click OK and drag the link into the notification message





Using these kinds of URL parameters in notifications also makes it possible to specify the start and end time for Coresight (parameters: StartTime=*-1h&EndTime=*). You could, for example, use the start time of the notification, or create an analysis that writes the start and end times to an attribute before sending the e-mail.
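
For the attribute-based approach, where the full URL has to be assembled as a string, the link could be built along the following lines. This is only a sketch: the server and database names are made up, `coresight_link` is a hypothetical helper, and whether Coresight accepts percent-encoded values in this position is an assumption that should be verified against your own installation.

```python
from urllib.parse import quote, urlencode


def coresight_link(display_url, element_path, start_time=None, end_time=None,
                   element_param="CurrentElement"):
    """Build a display URL with the element and optional time range."""
    params = {element_param: element_path}
    if start_time is not None:
        params["StartTime"] = start_time
    if end_time is not None:
        params["EndTime"] = end_time
    # quote_via=quote encodes spaces as %20 rather than "+".
    return display_url + "?" + urlencode(params, quote_via=quote)


url = coresight_link(
    "https://democor/coresight/#/PBDisplays/55/",
    r"\\MY SERVER\MY DATABASE\PathToElement",  # hypothetical path with spaces
    start_time="*-1h",
    end_time="*",
)
print(url)
```

Encoding the path removes the literal spaces that break a plain-text link in an e-mail. For a Coresight display, pass `element_param="asset"` instead.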

A customer stopped by the PI ProcessBook pod on Wednesday asking how to grab just the M/DD part of a timestamp from a Value symbol with VBA. Here's a quick example when you have a value symbol on your display called Value1.


Sub Test()
    Debug.Print showMonthAndDayOnly(Value1)
End Sub

'takes a value symbol and returns M/DD string for the timestamp
Function showMonthAndDayOnly(valueSymbol As Value) As String
    Dim vValue As Variant
    Dim vTime As Variant
    Dim vStatus As Variant

    'GetValue returns the value and fills in the timestamp and status
    vValue = valueSymbol.GetValue(vTime, vStatus)

    Dim i As Integer

    'finds location of 2nd "/" in the timestamp's date
    i = InStr(1, vTime, "/")
    i = InStr(i + 1, vTime, "/")

    'keeps everything before the 2nd "/"
    showMonthAndDayOnly = Mid(vTime, 1, i - 1)
End Function

We have made our PI Developer Technologies more easily available. Anyone with an account on the Tech Support website or PI Square is now able to download the PI AF SDK, PI Web API, the PI OPC DA and HDA Servers, and all products in the PI SQL Framework (PI OLEDB Provider, PI OLEDB Enterprise, PI JDBC Driver and PI ODBC Driver). We are doing this to make it easier for you to build applications that leverage PI System data. See this blog post for details.


Wednesday, March 4th

Session 1: 6 am PT / 9 am ET / 3 pm CET
Session 2: 10 am PT / 1 pm ET / 7 pm CET

Keith Pierce, OSIsoft
Global Solutions Group
Chris Crosby
Industry Principal


  • Why condition monitoring and CBM are high value initiatives.
  • Best practices for designing a condition-based maintenance (CBM) strategy.
  • Hear directly from PI System customers about their implementations

This complimentary webinar will deliver content to those interested in:

  • Preventing equipment failures, improving uptime, and optimizing operations.
  • Implementing a real-time data infrastructure for asset health and CBM.
  • Integration with computerized maintenance management systems.


Over the decades we’ve seen PI Systems used in many creative and valuable ways. Sharing tips and tricks discovered along the way is kind of fun.  So welcome to PI Square and let the fun begin!


One of the ‘geekier’ (is that a word?) use cases involves mining system log data. OT systems have become more complex than ever. Monitoring the associated performance indicators and log events can provide an edge for operational mission assurance.


Like process data, system logs come in a variety of formats and access methods. There are some generally applicable tips and tricks for handling log data. This post explores the Windows firewall as representative of nuances related to logging based on clear text flat files. A PI System with at least one UFL interface node is a classic approach for processing file based data sources.


First we notice that although the Windows firewall is enabled by default, logging is disabled by default. Right or wrong, having to enable ‘extra’ or ‘verbose’ logging is actually fairly common for all kinds of systems. In this case, logging is enabled with a few clicks of the mouse; here are the equivalent console commands:


     netsh advfirewall set allprofiles logging droppedconnections enable

     netsh advfirewall set allprofiles logging allowedconnections enable


The default setting creates two files in the %systemroot%\system32\logfiles\firewall folder: ‘pfirewall.log’ and ‘pfirewall.old’. Current events are appended to the log file. When the log file reaches its size limit (4 MB by default), it rolls over and overwrites the old file.


You might have guessed a fairly common ‘monkey wrench’ with this pattern. Windows firewall keeps the current log open which blocks UFL from processing the file. Getting the events from the old log is viable but adds significant delay waiting for the current log to fill.


A small script can generate files for UFL. The trick is to copy the current log so it’s easy to measure the number of lines in the file. The script selects only the lines that were appended (including checking for a rollover) and outputs them to a unique file for processing by UFL. Extending the script to copy logs from remote systems is left to a future post.


The Measure-Object -Line and Select-Object -Last PowerShell cmdlets made this script kind of fun. The biggest trick was Out-File -Encoding “Default”. Without specifying “Default”, the file will be written as Unicode, which isn’t compatible with UFL.
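
The original PowerShell script is not reproduced here, so the following Python sketch only illustrates the same pattern: copy the open log, count its lines, keep the lines appended since the last run (resetting on a rollover), and write them to a uniquely named file. The function name, the file names, and the use of cp1252 as a stand-in for the “Default” ANSI encoding are assumptions for the example. A rollover is detected only when the new log is shorter than the previous copy.

```python
import shutil
import time
from pathlib import Path


def export_new_lines(log_path, work_dir, out_dir):
    """Copy the live log, then write only the lines appended since last run."""
    work_dir, out_dir = Path(work_dir), Path(out_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    out_dir.mkdir(parents=True, exist_ok=True)

    copy_path = work_dir / "pfirewall.copy"         # hypothetical file names
    state_path = work_dir / "last_line_count.txt"

    shutil.copyfile(log_path, copy_path)  # work on a copy, not the open log
    lines = copy_path.read_text(encoding="cp1252", errors="replace").splitlines()

    previous = int(state_path.read_text()) if state_path.exists() else 0
    if len(lines) < previous:
        previous = 0  # fewer lines than last time: the log rolled over
    new_lines = lines[previous:]
    state_path.write_text(str(len(lines)))

    if not new_lines:
        return None  # nothing appended since the last run
    out_path = out_dir / f"firewall_{time.time_ns()}.txt"  # unique per run
    out_path.write_text("\n".join(new_lines) + "\n", encoding="cp1252")
    return out_path
```

Run on a schedule, each call drops one new file containing only the unseen events into the folder that the UFL interface watches.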


Firewall log entries have a simple space delimited structure as shown in the file header:

#Version: 1.5
#Software: Microsoft Windows Firewall
#Time Format: Local
#Fields: date time action protocol src-ip dst-ip src-port dst-port size tcpflags tcpsyn tcpack tcpwin icmptype icmpcode info path


The action field notation is “ALLOW” or “DROP” and the path field notation is “SEND” or “RECEIVE”. Four corresponding UFL message filters are as follows:







UFL message filters provide a natural way to store data in different PI points depending on the type of event (i.e. protocol, source IP address, source port, destination port and size in points named for inbound dropped traffic).
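
To make the field layout and the four action/path combinations concrete, here is a Python sketch that splits one log line using the #Fields header and routes the values to point names by event type. It is not UFL configuration. The point-name prefixes and the sample line are invented for illustration.

```python
FIELDS = ("date time action protocol src-ip dst-ip src-port dst-port size "
          "tcpflags tcpsyn tcpack tcpwin icmptype icmpcode info path").split()

# Hypothetical point-name prefixes, one per action/path combination.
PREFIXES = {
    ("DROP", "RECEIVE"): "FW.Inbound.Dropped",
    ("DROP", "SEND"): "FW.Outbound.Dropped",
    ("ALLOW", "RECEIVE"): "FW.Inbound.Allowed",
    ("ALLOW", "SEND"): "FW.Outbound.Allowed",
}


def to_int(text):
    """Convert a numeric field; the log uses "-" when a field is empty."""
    return int(text) if text.isdigit() else None


def route_event(line):
    """Parse one log line and return {point_name: value}, or None to skip."""
    if not line.strip() or line.startswith("#"):
        return None  # header or blank line
    parts = line.split()
    if len(parts) != len(FIELDS):
        return None  # unexpected layout
    event = dict(zip(FIELDS, parts))
    prefix = PREFIXES.get((event["action"], event["path"]))
    if prefix is None:
        return None
    return {
        f"{prefix}.Protocol": event["protocol"],
        f"{prefix}.SourceIP": event["src-ip"],
        f"{prefix}.SourcePort": to_int(event["src-port"]),
        f"{prefix}.DestPort": to_int(event["dst-port"]),
        f"{prefix}.Size": to_int(event["size"]),
    }


# Made-up sample line in the format given by the #Fields header.
sample = ("2015-03-01 12:00:00 DROP TCP 10.0.0.5 10.0.0.9 "
          "51000 445 52 S 123456 0 8192 - - - RECEIVE")
print(route_event(sample))
```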









PI points are configured as follows: ports and size are INT32 points; IP addresses and protocol are STRING points. Optionally, protocols may be set up as a digital state set (e.g. use IANA codes per %systemroot%\system32\drivers\etc\protocol).


It’s especially handy to monitor dropped traffic both for troubleshooting and even as an attempted security breach indicator. Even with the limitations in this example, one typically discovers there is more activity affecting a system than is immediately obvious.


In summary, monitoring the host firewall on a mission critical system makes sense but we are only getting started – there is more fun to come!


TL;DR  System logs come in many sizes and shapes. There can be subtle nuances in mining events from file based logging systems. This post introduced a general pattern for copying from open log files and processing with UFL. A future post in this series extends the script used in this pattern to gather log files from remote systems.
