# PI Data and .NET - A graphical statistical case study

Blog Post created by bperry on Apr 18, 2012

The title and timing of this post are inspired by Ahmad's excellent series of posts on PI Data and R. Ahmad shows an implementation of R hosted in Excel (RExcel) and he uses PI DataLink to pull in data. The simplicity is brilliant. What, though, about automating calculations like that?

The first place I turn for automating PI stuff is PI ACE, since the scheduling/triggering, buffering, graceful degradation, failover, management, and contextualization are all taken care of out-of-the-box. The downside for casual coders is that writing ACE code requires a full copy of Visual Studio, but since we're here on vCampus, I'll assume that won't be a problem for this audience! Not too long ago, I worked up a scenario exploring the use of an external math library within ACE. The library I chose was Alglib, but the beta of Math.NET also caught my eye. There's even an R wrapper that looks interesting though there's a caveat about multiple instances of that particular implementation. Anyway - choices abound - but I've already got some code and screenshots showing Alglib so I'll stick with that here. Also, because ACE is a creature of the PISDK (many of its methods return PISDK data types, such as PIValues and PITime) we kill two birds with one stone if we look at some of the tricks involved with using the PI SDK + ACE + external library.

Imagine the case of pump performance: the control system has a calculation for the desired flow through the pumps, and a flow meter tells us the actual pump flow. So - how closely does actual flow correspond to desired flow? One indicator might be the correlation coefficient of the data - also known as the r-value.
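For readers who want to see what the r-value actually computes, here is a minimal hand-rolled sketch of the standard Pearson formula in VB. The module and function names (`PearsonSketch`, `PearsonR`) and the sample flow numbers are made up for illustration; this is not Alglib's implementation, just the textbook calculation.

```vb
Option Infer On
Imports System
Imports System.Linq

Public Module PearsonSketch
    ' Hypothetical helper (not part of Alglib or the PI SDK):
    ' Pearson correlation coefficient for two equal-length arrays.
    ' Returns NaN if either input has zero variance.
    Public Function PearsonR(x As Double(), y As Double()) As Double
        If x.Length <> y.Length OrElse x.Length < 2 Then
            Throw New ArgumentException("Need two arrays of equal length with at least 2 points.")
        End If

        Dim meanX As Double = x.Average()
        Dim meanY As Double = y.Average()
        Dim sxy As Double = 0.0
        Dim sxx As Double = 0.0
        Dim syy As Double = 0.0

        For i As Integer = 0 To x.Length - 1
            Dim dx As Double = x(i) - meanX
            Dim dy As Double = y(i) - meanY
            sxy += dx * dy   ' co-deviation
            sxx += dx * dx   ' squared deviation of x
            syy += dy * dy   ' squared deviation of y
        Next

        Return sxy / Math.Sqrt(sxx * syy)
    End Function

    Public Sub Main()
        ' Made-up desired vs. actual flow values
        Dim desired As Double() = New Double() {10.0, 20.0, 30.0, 40.0}
        Dim actual As Double() = New Double() {11.0, 19.0, 32.0, 39.0}
        Console.WriteLine(PearsonR(desired, actual))
    End Sub
End Module
```

A value near 1 means actual flow tracks desired flow closely; a value near 0 means there is little linear relationship between the two.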

[Figures: low correlation versus higher correlation]

## Example: Setting up a complex statistics calculation in PI ACE

We will look at a similar use case: calculating the correlation between temperature and concentration in a bioreactor. Creating an ACE calculation with these inputs/outputs will look something like:

[Figure: ACE calculation setup with these inputs/outputs]

As mentioned before, I hunted around for a VB-wrapped library that could calculate the Pearson correlation coefficient and came up with Alglib. I believe Alglib targets .NET Framework 2, but that's always something to check - it's easy to forget, and ACE can give some cryptic errors when you're referencing an assembly targeting a higher .NET Framework version. In this blog post, I use .NET 3.5 because there are some features we'll be using, such as LINQ and lambda expressions.

After adding references to Alglib, OSIsoft.PISDK, OSIsoft.PISDKCommon, and OSIsoft.PITimeServer (which we'll need by the time this adventure is done), we're ready to dig in. Alglib's correlation method has a rather straightforward signature (copied from the C# documentation):

```
double pearsoncorr2(double[] x, double[] y)
```

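The inputs are vectors of corresponding value pairs: in the pump example, expectedFlowValues[] and the corresponding actualFlowValues[]; in our bioreactor case, the temperature values and the corresponding concentration values.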

## Retrieving PI archive values the simple way from PI ACE

We'll start by getting values for concentration using built-in functionality from PI ACE:

```
'Get values for Concentration
Dim concPIValuesCOM As PIValues = BA_Conc_1.Values("*-1h", "*", BoundaryTypeConstants.btInside)
```

Notice the PIACEPoint.Values method returns a PISDK datatype (full of more PISDK datatypes like PISDK.PIValue and PITimeServer.PITime), so we're already seeing direct involvement with the PISDK. Not a bad thing, but the PIValues collection is not truly IEnumerable, so the first thing we'll do is a copy operation to get ourselves into the world of enumerables. For those of you not native to .NET, the IEnumerable interface is the foundation of a wealth of extensions and time-saving functionality. Other collection types build on it - List(Of T), for example, implements IEnumerable and adds ordering and indexing.

```
'Native .NET data structures to hold PI value query results
Dim concPIValues As List(Of PIValue) = New List(Of PIValue)

'We would like to enumerate/filter the PIValues, but
'PISDK.PIValues only likes enumerating in ForEach loops.
'Other attempts at using this as an IEnumerable are futile.
'So here, we manually add the contents to a .NET List(Of PIValue)
For Each v As PIValue In concPIValuesCOM
    concPIValues.Add(v)
Next
```

Recall the correlation inputs needed - concentration values and corresponding temperature values. So, we will interpolate the temperature values for each concentration value's timestamp. Our first use of that List(of PIValue) will be to grab those Concentration timestamps by themselves. Instead of writing a for-each loop just to extract an array of timestamps from the value objects, the LINQ Select method does this in one line:

```
'Get timestamps from the Concentration values, at which we wish to interpolate Temperature values
Dim concTimes = From v In concPIValues Select (v.TimeStamp)
```
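If the query syntax looks unfamiliar, the following self-contained sketch shows the same projection in both query syntax and method syntax with a lambda. The `Sample` class is a made-up stand-in for PISDK.PIValue (so this runs without the PI SDK), and `SelectSketch` and `BuildSamples` are hypothetical names.

```vb
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

Public Module SelectSketch
    ' Stand-in for PISDK.PIValue (hypothetical, for illustration only)
    Public Class Sample
        Public TimeStamp As DateTime
        Public Value As Double
    End Class

    ' Build three made-up samples, ten minutes apart
    Public Function BuildSamples() As List(Of Sample)
        Dim t0 As New DateTime(2012, 4, 18, 8, 0, 0)
        Dim samples As New List(Of Sample)
        For n As Integer = 0 To 2
            Dim item As New Sample()
            item.TimeStamp = t0.AddMinutes(10 * n)
            item.Value = 1.0 + n
            samples.Add(item)
        Next
        Return samples
    End Function

    Public Sub Main()
        Dim samples = BuildSamples()

        ' Query syntax, in the style used in the post
        Dim timesQuery = From v In samples Select v.TimeStamp

        ' Equivalent method syntax with a lambda
        Dim timesMethod = samples.Select(Function(entry) entry.TimeStamp)

        ' Both forms yield the same sequence of timestamps
        Console.WriteLine(timesQuery.SequenceEqual(timesMethod))
    End Sub
End Module
```

Either form is lazily evaluated, which is why the post calls ToArray() before handing the timestamps to the PI SDK.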

## Retrieving interpolated values using the PI SDK directly

The built-in PI ACE function for getting raw archive values from concentration was nicely simple, but now we need to do the temperature interpolation. Because the PIACEPoint object is a simpler, friendlier subset of the raw PISDK.PIPoint, we'll need to switch over to the latter for getting interpolated values. The same PIValues enumerable caveat holds, so we'll copy these interpolated values to an enumerable List too.

Module/class-level properties to hold the PISDK object and the PIPoint object:

```
Private PISDKroot As PISDK.PISDK
Private BA_Temp_1_PIPoint As PIPoint
```

During initialization, find the raw PISDK PIPoint underneath the PIACEPoint wrapper:

```
'Get PIPoint object corresponding to Temperature PIACEPoint
PISDKroot = New PISDK.PISDK
BA_Temp_1_PIPoint = PISDKroot.Servers(BA_Temp_1.Server).PIPoints(BA_Temp_1.Tag)
```

During evaluation, make a PISDK data call to the raw PIPoint:

```
'Get interpolated temperature at each concentration value timestamp
Dim tempPIValuesCOM As PIValues = BA_Temp_1_PIPoint.Data.TimedValues(concTimes.ToArray())

'Native .NET data structures to hold PI value query results
Dim tempPIValues As List(Of PIValue) = New List(Of PIValue)

'Copy the COM collection into the .NET list, as we did for concentration
For Each v As PIValue In tempPIValuesCOM
    tempPIValues.Add(v)
Next
```

By default, PI ACE is courteous and adjusts for clock drift between server and client. In contrast, the PISDK by default does not adjust for clock drift. Because in this case we end up using the timestamps only for matching temperature and concentration values, I don't care if the values end up as server time or client time - just as long as they're uniform. The quickest path around this is to disable clock offset adjustment for concentration in PI ACE:

```
'Disable clock offset adjustment for PI ACE point wrapper
```

## Filtering PI Values with LINQ

Now that we have values from PI for temperature and concentration, at matching timestamps, we're ready to go, right? No! It turns out Alglib isn't very happy if any of the data is bad, if we even get past the process of casting the values as Double (the desired datatype). So we need to filter the data down to the pairs where temperature and concentration are both good.

Lambda expressions (someone's name for inline anonymous functions) allow us to define ad-hoc functions with minimal extra code. When doing something simple like filtering a set based on some conditions, this is invaluable. The VB incarnation of lambdas is a bit restricted versus the full C# implementation, and not as widely documented, but there's a good introductory blog post and Microsoft's bank of LINQ examples is a good implicit introduction. Below, we're using some friendly enough syntax to filter the temperature/concentration value collections to only the items where both temperature and concentration have a valid value, i.e. when IsGood() is true for both. The two-argument form of the lambda hands us each item's index i, which we use to look up the matching item in the other list. Because both filters apply the same joint condition, the two results stay aligned item for item.

```
'Filter out bad values
Dim goodTempPIValues = tempPIValues.Where(Function(temp, i) temp.IsGood() And concPIValues(i).IsGood())
Dim goodConcPIValues = concPIValues.Where(Function(conc, i) conc.IsGood() And tempPIValues(i).IsGood())
```
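One possible variation on the two filtering passes above is to pair the values by index first and then filter once. The sketch below is only an illustration under stated assumptions: `Sample` is a made-up stand-in for PISDK.PIValue, and `Pair`, `GoodPairs` and `PairFilterSketch` are hypothetical names. Enumerable.Zip only arrived in .NET Framework 4, so to stay on .NET 3.5 this uses the index-aware overload of Select to do the pairing.

```vb
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

Public Module PairFilterSketch
    ' Stand-in for PISDK.PIValue (hypothetical): a value plus a good/bad flag
    Public Class Sample
        Public Value As Double
        Public Good As Boolean
        Public Sub New(v As Double, isGoodFlag As Boolean)
            Value = v
            Good = isGoodFlag
        End Sub
        Public Function IsGood() As Boolean
            Return Good
        End Function
    End Class

    ' One matched temperature/concentration pair
    Public Class Pair
        Public Temp As Sample
        Public Conc As Sample
    End Class

    ' Pair items by index, then keep only the pairs where both are good.
    ' Assumes both lists have the same length and ordering.
    Public Function GoodPairs(temps As List(Of Sample), concs As List(Of Sample)) As List(Of Pair)
        Dim paired = temps.Select(Function(t, i) New Pair With {.Temp = t, .Conc = concs(i)})
        Return paired.Where(Function(p) p.Temp.IsGood() AndAlso p.Conc.IsGood()).ToList()
    End Function

    Public Sub Main()
        Dim temps As New List(Of Sample)
        temps.Add(New Sample(20.0, True))
        temps.Add(New Sample(21.0, False))  ' bad temperature
        temps.Add(New Sample(22.0, True))
        temps.Add(New Sample(23.0, True))

        Dim concs As New List(Of Sample)
        concs.Add(New Sample(1.0, True))
        concs.Add(New Sample(1.1, True))
        concs.Add(New Sample(1.2, False))   ' bad concentration
        concs.Add(New Sample(1.3, True))

        Dim pairs = GoodPairs(temps, concs)

        ' Split the surviving pairs back into two aligned arrays
        Dim goodTemps = pairs.Select(Function(p) p.Temp.Value).ToArray()
        Dim goodConcs = pairs.Select(Function(p) p.Conc.Value).ToArray()
        Console.WriteLine(goodTemps.Length & " good pairs")
    End Sub
End Module
```

The pairing makes the alignment between the two arrays explicit, and the good/bad test runs once per pair rather than once per list.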

## Calling the statistical correlation function

Okay, NOW we're ready - we have collections of synchronized, good PI values and can call Alglib's Pearson correlation function.

```
'Generate value arrays of ONLY the actual data values (float32) stripped out of their PIValue containers
Dim goodTempValues = goodTempPIValues.Select(Function(v) CDbl(v.Value)).ToArray()
Dim goodConcValues = goodConcPIValues.Select(Function(v) CDbl(v.Value)).ToArray()

'Call Alglib Pearson correlation function
Dim r As Double = alglib.pearsoncorr2(goodTempValues, goodConcValues)

'Write output value to PI
BA_Correlation_1.Value = r
```

There you have it - an external math library included in a scheduled PI ACE calculation. This example was contrived to demonstrate functionality, so it isn't the soundest of engineering feats. We aren't applying any sort of weighting to the values, so the calculation is event-weighted: every archived event counts equally, no matter how long its value was in effect. We could instead have retrieved both data sets as interpolations over the same range and interval, which would return us to the realm of time-weighted calculations. In calculations you create, don't underestimate the magnitude of the effects that time vs. event weighting can have. And there are other ways to work the LINQ magic which might end up more efficient - perhaps zipping the temperature/concentration values side-by-side and then doing a single filtering pass. If anyone plays with this and has an improvement, please share in the comments!
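To get a feel for how much time vs. event weighting can matter, here is a small sketch comparing the two on a plain average (a simpler statistic than correlation, but the effect is the same in kind). The data is made up, the names `WeightingSketch`, `EventWeightedMean` and `TimeWeightedMean` are hypothetical, and the time-weighted version assumes linear interpolation between events (trapezoidal rule).

```vb
Option Infer On
Imports System
Imports System.Linq

Public Module WeightingSketch
    ' Event-weighted mean: every archived event counts equally
    Public Function EventWeightedMean(values As Double()) As Double
        Return values.Average()
    End Function

    ' Time-weighted mean, assuming linear interpolation between events.
    ' minutes() holds each event's time offset from the start of the range.
    Public Function TimeWeightedMean(minutes As Double(), values As Double()) As Double
        Dim area As Double = 0.0
        For i As Integer = 1 To values.Length - 1
            Dim dt As Double = minutes(i) - minutes(i - 1)
            area += 0.5 * (values(i) + values(i - 1)) * dt
        Next
        Return area / (minutes(minutes.Length - 1) - minutes(0))
    End Function

    Public Sub Main()
        ' Three events clustered in the first two minutes, then one an hour in
        Dim minutes As Double() = New Double() {0, 1, 2, 60}
        Dim values As Double() = New Double() {10, 10, 10, 50}

        Console.WriteLine("Event-weighted: " & EventWeightedMean(values))
        Console.WriteLine("Time-weighted:  " & TimeWeightedMean(minutes, values))
    End Sub
End Module
```

With this data the event-weighted mean is 20, while the time-weighted mean comes out noticeably higher, because the long ramp from 10 to 50 dominates the hour even though it contributes only one extra event.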

## Full code

```
Imports OSIsoft.PI.ACE
Imports PISDK
Imports PISDKCommon
Imports PITimeServer
Imports System.Linq
Imports System.Collections.Generic

'Project references:
'OSIsoft.PISDK.dll
'OSIsoft.PISDKCommon.dll
'OSIsoft.PITimeServer.dll
'alglibnet2.dll

Public Class Correlation
Inherits PIACENetClassModule
Private BA_Temp_1 As PIACEPoint
Private BA_Correlation_1 As PIACEPoint
Private BA_Conc_1 As PIACEPoint
'
'      Tag Name/VB Variable Name Correspondence Table
' Tag Name                                VB Variable Name
' ------------------------------------------------------------
' BA:Conc.1                               BA_Conc_1
' BA:Correlation.1                        BA_Correlation_1
' BA:Temp.1                               BA_Temp_1
'

Private PISDKroot As PISDK.PISDK
Private BA_Temp_1_PIPoint As PIPoint

Public Overrides Sub ACECalculations()

'Native .NET data structures to hold PI value query results
Dim concPIValues As List(Of PIValue) = New List(Of PIValue)
Dim tempPIValues As List(Of PIValue) = New List(Of PIValue)

'Get values for Concentration using friendly PI ACE functionality
Dim concPIValuesCOM As IPIValues2 = BA_Conc_1.Values("*-1h", "*", BoundaryTypeConstants.btInside)

'We would like to enumerate/filter the PIValues, but
'PISDK.PIValues only likes enumerating in ForEach loops.
'Other attempts at using this as an IEnumerable are futile.
'So here, we manually add the contents to a .NET List(Of PIValue)
For Each v As PIValue In concPIValuesCOM
    concPIValues.Add(v)
Next

'Use PI SDK directly to get interpolated temperature value at each concentration value timestamp
Dim concTimes = From v In concPIValues Select (v.TimeStamp)
Dim tempPIValuesCOM As IPIValues2 = BA_Temp_1_PIPoint.Data.TimedValues(concTimes.ToArray())

For Each v As PIValue In tempPIValuesCOM
    tempPIValues.Add(v)
Next

Dim goodTempPIValues = tempPIValues.Where(Function(temp, i) temp.IsGood() And concPIValues(i).IsGood())
Dim goodConcPIValues = concPIValues.Where(Function(conc, i) conc.IsGood() And tempPIValues(i).IsGood())

'Generate value arrays of ONLY the actual data values (float32) stripped out of their PIValue containers
Dim goodTempValues = goodTempPIValues.Select(Function(v) CDbl(v.Value)).ToArray()
Dim goodConcValues = goodConcPIValues.Select(Function(v) CDbl(v.Value)).ToArray()

'Call Alglib Pearson correlation function
Dim r As Double = alglib.pearsoncorr2(goodTempValues, goodConcValues)

'Write output value to PI
BA_Correlation_1.Value = r

End Sub

Protected Overrides Sub InitializePIACEPoints()
BA_Conc_1 = GetPIACEPoint("BA_Conc_1")
BA_Correlation_1 = GetPIACEPoint("BA_Correlation_1")
BA_Temp_1 = GetPIACEPoint("BA_Temp_1")
End Sub

'
' User-written module dependent initialization code
'
Protected Overrides Sub ModuleDependentInitialization()

'Disable clock offset adjustment for PI ACE point wrapper

'Get PIPoint object corresponding to Temperature PIACEPoint
PISDKroot = New PISDK.PISDK
BA_Temp_1_PIPoint = PISDKroot.Servers(BA_Temp_1.Server).PIPoints(BA_Temp_1.Tag)

End Sub

'
' User-written module dependent termination code
'
Protected Overrides Sub ModuleDependentTermination()

'Dispose of COM objects
BA_Temp_1_PIPoint = Nothing
PISDKroot = Nothing

End Sub
End Class
```