INTRODUCTION

 

I’ve found a need to run statistical models against PI System data for a number of different use cases, such as predicting PI System usage, so I thought it would be helpful to give a brief example of using Python pandas DataFrames with PI. The statistical models used in this tutorial most likely will not apply to your data model, but the how-to part may help you get started.

 

For a quick Python summary (and use of Python with things like the PI Web API), Barry has created an excellent post on the subject: https://pisquare.osisoft.com/community/developers-club/blog/2015/06/04/using-pi-web-api-with-python

 

We’re going to be doing three things here:

  1. Pulling data from a PI System via the AFSDK into Python
  2. Getting the data into a Python Pandas Dataframe and running statistical models on the time series data to predict future time series data values
  3. Exporting predicted data to a future data PI tag via the AFSDK.

 

However, these techniques do not need to be limited to future data. It may also be useful to pull in past data and backtest statistical models against it. Python (and R) have large statistical libraries.
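To make the backtesting idea a little more concrete, here is a minimal sketch using plain Python lists. The helper name `backtest_persistence`, the sample numbers, and the "repeat the last known value" forecast are all assumptions made for illustration; in practice you would swap in whatever model you are evaluating and feed it values pulled from PI.

```python
# Hypothetical hold-out backtest. The persistence forecast is only a stand-in
# for a real statistical model.
def backtest_persistence(values, holdout):
    """Hide the last `holdout` points, forecast them, and return (forecast, MAE)."""
    if not 0 < holdout < len(values):
        raise ValueError("holdout must be between 1 and len(values) - 1")
    train, actual = values[:-holdout], values[-holdout:]
    # Persistence forecast: repeat the last value seen in the training window.
    forecast = [train[-1]] * holdout
    # Mean absolute error between what happened and what was forecast.
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / float(holdout)
    return forecast, mae


history = [20.0, 21.0, 22.0, 23.0, 24.0, 25.0]  # made-up readings
forecast, mae = backtest_persistence(history, holdout=2)
```

The error figure gives you a baseline: a model that cannot beat "repeat the last value" on past data is probably not worth writing to a future tag.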

We’re using Python, but if you’re interested in R – this post might help to get you started: https://pisquare.osisoft.com/message/53504#53504

 

ENVIRONMENT SETUP

 

Requirements on the PI side:

  • PI Data Archive 2015 (if using future data)
  • AFSDK read/write access to your PI Data Archive

I’m using Microsoft Visual Studio as an IDE and Python 2.7 with IPython (Jupyter) as the interactive shell.

 

In the example, we’ll use the default PI tag BA:Temp.1 as our historical data tag. We’ll write our predictions to an empty future data tag, BA:Temp.1_Future, which we created beforehand.

 

CODE

Pulling data from a PI System via the AFSDK into Python (IPython):

 

#***************************************************************************************
# ©2009-2015 OSIsoft, LLC. All Rights Reserved. 
# No Warranty or Liability.  The OSIsoft Samples contained herein are licensed “AS IS” without any warranty of any kind.  
# Licensee bears all risk of use. OSIsoft DISCLAIMS ALL EXPRESS AND IMPLIED WARRANTIES,INCLUDING BUT NOT LIMITED TO THE 
# IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE and NONINFRINGEMENT. In no event will OSIsoft
# be liable to Licensee or to any third party for damages of any kind arising from Licensee’s use of the OSIsoft Samples 
# OR OTHERWISE, including but not limited to direct, indirect, special, incidental, lost profits and consequential 
# damages, and Licensee expressly assumes the risk of all such damages. This limitation applies to any claims related to
# Licensee’s use of the OSIsoft Samples and claims for breach of contract, breach of warranty, guarantee or condition, 
# strict liability, negligence or other tort to the extent permitted by applicable law.  This limitation applies even if
# OSIsoft knew or should have known about the possibility of the damages. FURTHER, THE OSIsoft SAMPLES ARE NOT ELIGIBLE 
# FOR SUPPORT UNDER EITHER OSISOFT’S STANDARD OR ENTERPRISE LEVEL SUPPORT AGREEMENTS.
#****************************************************************************************
# IPython
import clr 
clr.AddReference(r"C:\Program Files (x86)\PIPC\AF\PublicAssemblies\4.0\OSIsoft.AFSDK")
from OSIsoft import AF  
import datetime  # the code below calls datetime.datetime.strptime, so the module is imported under its own name
import time
from pandas.stats.moments import ewma
import pandas as pd
import statsmodels.api as sm


# Connect to the default PI Data Archive (piDB holds the server, not an AF database) and look up the source tag
piDB = AF.PI.PIServers().DefaultPIServer
piPoint = AF.PI.PIPoint.FindPIPoint(piDB,"BA:TEMP.1")
#Set timerange to pull data, using PI Time
startTime = AF.Time.AFTime("T-10d")
endTime = AF.Time.AFTime("T")
timeRange = AF.Time.AFTimeRange(startTime, endTime)
#Set AF boundary type
boundaryType = AF.Data.AFBoundaryType.Inside

#Set our prediction times(using PI time)
startPredictTime = AF.Time.AFTime("T")
sp = datetime.datetime.strptime(startPredictTime.LocalTime.ToString(),'%m/%d/%Y %I:%M:%S %p')
endPredictTime = AF.Time.AFTime("T+10d")
ep = datetime.datetime.strptime(endPredictTime.LocalTime.ToString(),'%m/%d/%Y %I:%M:%S %p')

#Get the data from the PI Data Archive
#maxCount(50000 here) is not optional
recordedValues = piPoint.RecordedValues(timeRange,boundaryType,"",False,50000)
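A note on the timestamp conversion used above: the .NET timestamp is turned into a string and then parsed back with `strptime`. The sketch below isolates that parsing step so it can be tried without a PI System. The sample string and the names `PI_LOCAL_FORMAT` and `parse_local_timestamp` are made up for illustration, and the format string assumes the machine renders dates US-style (month/day/year with AM/PM); on a machine with different regional settings the format would need to change.

```python
import datetime

# Assumed format: US-style date with a 12-hour clock, as used in the tutorial code.
PI_LOCAL_FORMAT = '%m/%d/%Y %I:%M:%S %p'


def parse_local_timestamp(text):
    """Parse a US-style local timestamp string into a Python datetime."""
    return datetime.datetime.strptime(text, PI_LOCAL_FORMAT)


# Made-up sample string; strptime accepts months, days and hours without zero padding.
parsed = parse_local_timestamp('6/4/2015 1:30:00 PM')
```

If the parse raises a `ValueError`, the first thing to check is whether the string really has this shape on your machine.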

 

Once we have the data from PI via the AFSDK, we’re going to reshape it so that it can be loaded into a Python pandas DataFrame. Then we'll smooth the series and fit an autoregressive (AR) model to it to predict future values.
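As a rough illustration of the reshaping step, here is a small sketch that runs without a PI System. The readings are made up, and it uses `resample(...).mean()` from more recent pandas releases, which is an assumption about your installed version rather than what the tutorial code itself calls.

```python
import datetime

import pandas as pd

# Made-up readings standing in for the timestamp -> value dictionary built from PI.
readings = {
    datetime.datetime(2015, 6, 1, 0, 0): 70.0,
    datetime.datetime(2015, 6, 1, 0, 30): 72.0,
    datetime.datetime(2015, 6, 1, 1, 0): 74.0,
}

# list(...) keeps this working on Python 3, where items() is a view and not a list.
frame = pd.DataFrame(list(readings.items()), columns=['TimeStamp', 'Value'])
frame['TimeStamp'] = pd.to_datetime(frame['TimeStamp'])

# Dictionaries do not guarantee order on older Pythons, so sort after indexing.
indexed = frame.set_index('TimeStamp').sort_index()

# Average the readings into regular one-hour buckets.
hourly = indexed.resample('60min').mean()
```

Having a regular hourly index matters because the time series models expect evenly spaced observations.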

 

Here is the Stats Documentation for our example:

http://statsmodels.sourceforge.net/devel/examples/notebooks/generated/tsa_arma.html

And a Stats Example:

http://statsmodels.sourceforge.net/devel/examples/notebooks/generated/tsa_arma_0.html

 

There are a ton of statistical packages for Python out there, and here we're using just one of them.
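If you have not met autoregressive models before, the idea is that each value is predicted from the values that came before it. The sketch below fits the simplest case, a single lag, by ordinary least squares in plain Python. It is only an illustration: the function names and data are made up, and statsmodels uses many lags and its own estimation methods, so do not expect matching numbers.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    if len(series) < 3:
        raise ValueError("need at least three observations")
    x, y = series[:-1], series[1:]  # previous values and the values that followed
    n = float(len(x))
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x)
    if var_x == 0:
        raise ValueError("series must vary to fit an AR(1) model")
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = cov_xy / var_x
    return mean_y - phi * mean_x, phi


def forecast_ar1(series, steps):
    """Roll the fitted one-lag model forward `steps` times from the last observation."""
    c, phi = fit_ar1(series)
    predictions, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        predictions.append(last)
    return predictions


history = [1.0, 2.0, 4.0, 8.0, 16.0]  # made-up series that doubles at each step
predicted = forecast_ar1(history, steps=2)
```

Note how each forecast is fed back in as the input for the next one; errors therefore compound the further out you predict.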

# Create a dictionary to hold timestamp -> value pairs
recordedValuesDict = dict()

#Add all the timestamps and values found to a dict in order to import to pandas
for event in recordedValues:  
    dt = datetime.datetime.strptime(event.Timestamp.LocalTime.ToString(),'%m/%d/%Y %I:%M:%S %p')
    recordedValuesDict[dt] = event.Value

#load the data
df = pd.DataFrame(recordedValuesDict.items(), columns=['TimeStamp', 'Value'])
# Convert the TimeStamp column to datetimes, then use it as the index
df['TimeStamp'] = pd.to_datetime(df['TimeStamp'])
indexed_df = df.set_index(['TimeStamp'])

# Exponential smoothing, resampled to hourly data
dfewma = pd.ewma(indexed_df,span=1,freq='H')
dfewma.plot()
ar_model = sm.tsa.AR(dfewma, freq='H')
pandas_ar_res = ar_model.fit(maxlag=200, method='cmle', disp=-1)

# The prediction window needs to overlap the fitted data (TODO: derive the start time automatically above)
pred = pandas_ar_res.predict(start=str(sp), end=str(ep))

# Plot the prediction to take a look at it
pred.plot()

 

This gives you a plot with the current PI System data (BA:Temp.1) in blue and the predicted data (to be written to BA:Temp.1_Future) in green.
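A side note on the smoothing step: pandas converts `span` into a smoothing factor alpha = 2 / (span + 1), so `span=1` gives alpha = 1, which amounts to no smoothing at all and leaves essentially just the hourly resampling. The plain-Python sketch below shows the effect of different spans. It uses the simple recursive form of the average, whereas pandas applies an adjusted weighting by default, so values for spans above 1 will not match pandas exactly; the function and data are made up for illustration.

```python
def ewma(values, span):
    """Recursive exponentially weighted moving average with alpha = 2 / (span + 1)."""
    if span < 1:
        raise ValueError("span must be >= 1")
    alpha = 2.0 / (span + 1.0)
    smoothed = []
    for value in values:
        if not smoothed:
            # Seed the average with the first observation.
            smoothed.append(float(value))
        else:
            smoothed.append(alpha * value + (1.0 - alpha) * smoothed[-1])
    return smoothed


raw = [1.0, 2.0, 3.0]            # made-up readings
unsmoothed = ewma(raw, span=1)   # alpha = 1: output equals input
smoothed = ewma(raw, span=3)     # alpha = 0.5: output lags behind the input
```

If your data is noisy, trying a larger span before fitting the model is a cheap experiment.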

Then we can export this predicted data to a future data PI tag and save it back to the PI Data Archive:

 

#Send Data to the PI Data Archive
# Get the PI point to send data to; it should already exist as a future data point.
# Alternatively, you could create the point programmatically and then search for it.
piPointPredict = AF.PI.PIPoint.FindPIPoint(piDB,"BA:Temp.1_Future")

#Convert data from Pred data frame to AF Values
newValues = AF.Asset.AFValues()

#TimeStamps and Values
for index,Value in enumerate(pred):
    newValue = AF.Asset.AFValue()
    newValue.Timestamp = AF.Time.AFTime(pred.index[index].to_datetime().strftime('%m/%d/%Y %I:%M:%S %p'))
    newValue.Value = float(Value)
    newValues.Add(newValue)
    
#AFUpdateOption
updateOption = AF.Data.AFUpdateOption.InsertNoCompression
#AFBufferOption
bufferOption = AF.Data.AFBufferOption.BufferIfPossible
#Write data from AF Values into PI
piPointPredict.UpdateValues(newValues, updateOption, bufferOption)
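One thing that may be worth guarding against before the write: if the prediction contains NaN entries, `float(Value)` will happily pass them along. Whether that happens depends on your model and data, so treat it as an assumption. The sketch below shows one way to prepare clean (timestamp string, value) pairs in plain Python before building the AF values; `to_pi_events` and the sample data are hypothetical names made up for illustration.

```python
import datetime
import math

# Same US-style format the tutorial code uses when building AFTime strings.
PI_LOCAL_FORMAT = '%m/%d/%Y %I:%M:%S %p'


def to_pi_events(pairs):
    """Turn (datetime, value) pairs into (timestamp string, float) pairs, skipping NaN."""
    events = []
    for stamp, value in pairs:
        number = float(value)
        if math.isnan(number):
            continue  # leave gaps rather than writing NaN to the tag
        events.append((stamp.strftime(PI_LOCAL_FORMAT), number))
    return events


sample = [
    (datetime.datetime(2015, 6, 4, 13, 0), 71.5),
    (datetime.datetime(2015, 6, 4, 14, 0), float('nan')),  # made-up bad prediction
]
events = to_pi_events(sample)
```

Each resulting pair can then be fed into the loop that creates the AF values, with the string going to the timestamp and the float to the value.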

 

No guarantees as to the statistical accuracy of these predictions.

 

That’s it! Hope it helps!

-Eric