Some customers have the occasional but pronounced need to retrieve millions upon millions of recorded values for a given PIPoint.  There is no method directly with the PI AF SDK to address this need.  Virtually every data method within the SDK has 2 known limitations:

 

  1. An operations timeout should a single data call take too long to fulfill, and
  2. No more than ArcMaxCollect values may be returned in a single data call, where ArcMaxCollect is a tuning parameter on your PI Data Archive.

 

One should not modify ArcMaxCollect on a whim.  There's a reason why the defaults are what they are (PI Server 2012+ the default is 1.5 million, earlier versions are 150K).  You would not be increasing it just for you.  The change applies to all users.  How confident are you that your own users won't be trying to fetch 200 million data values at once?  There is a workaround that you may prudently use in your code in lieu of increasing ArcMaxCollect.

 

GitHub - Rick-at-OSIsoft/pipoint-getlargerecordedvalues: A workaround that removes the limitation of ArcMaxCollect, and …

 

The GitHub repository has code versions for C# and VB.NET.

 

In a nutshell, how does this workaround get past the 2 known limitations?  By retrieving the AFValues in pages of 100K values at a time.  This is well below the default value for ArcMaxCollect, and retrieving 100K values is easily satisfied within the operations timeout.  Remember: you would want to use the GetLargeRecordedValues method when you know you have PIPoint(s) with millions and millions of recorded values.

 

Usage

Given you have a PIPoint instance in an object named tag and a defined AFTimeRange in timeRange:

 

C# Example

var values = tag.GetLargeRecordedValues(timeRange);
foreach (var value in values)
 {
    // do something with value
}

 

 

VB.NET Example

Dim values = tag.GetLargeRecordedValues(timeRange)
For Each value As AFValue in values
    ' do something with value
Next 

 

 

Cautions and What to Avoid

The biggest caution to acknowledge is that you have massive amounts of data.  The traditional best practices such as using bulk PIPointList will no longer apply.  You are in a different world if you want to loop over a tag that has 200 million data values.  There are considerations well outside the scope of AF SDK.  Two notable instances are timing and memory.  Retrieving 200 million values will be done as quickly as possible, but you have to accept that it still takes some time to consume that much data.  So the concept of "fast" goes out the window.  And if you attempt to retrieve those values into a list or array, .NET will most likely give you an out-of-memory exception long before all the values are retrieved.

 

Therefore it's best to avoid memory-hogging calls such as LINQ's ToList() or ToArray().  The best performance is to consume the values as they are being streamed without any attempt to persist to an indexed collection.  Another performance killer is using Count().  I have seen a lot of traditional applications that attempt to show a value count before looping over the values.  Since its a streamed enumerable set that isn't persisted in memory, issuing a Count() has a very negative performance consequence of retrieving and counting all the AFValues.  Subsequent passes through a loop will then require the time-consuming process of retrieving the data a second time!  If you want to display a count in your logs because it's something nice to do, you should maintain your independent count while you are looping the first time, and then display that nice count at the end rather than the beginning.

 

C# Example

var values = tag.GetLargeRecordedValues(timeRange);
int count = 0;
foreach (var value in values)
{
     ++count;
    // do something with value
}
Console.WriteLine("{0} count = {1:N0}", tag.Name, count);