
AF SDK performance, serial vs. parallel vs. bulk

Question asked by PetterBrodin on Dec 15, 2014
Latest reply on Dec 18, 2014 by PetterBrodin

AF SDK version: 2.6.1.6238

PI Server version: Latest 2012 (3.4.390.16 patched to 3.4.390.18)

 

After seeing some unexpected performance behaviour in an application we're developing, partially outlined in this thread, I decided to do some more extensive testing of how bulk calls actually perform, in particular RecordedValues and Snapshot.

 

According to Ryan Gilbert in the other thread, "The bulk calls are all enabled on 390.16 and greater, except for the summaries calls. Due to an issue with error reporting, they were disabled on the client side until version 390.19."


This matches what can be found in the AF SDK reference (2.6.0.5843) about PIPointList.RecordedValues:

"Important

This method will use a single bulk Remote Procedure Call if the PI Server supports it; otherwise it will issue individual RPCs in parallel. Results are available for enumeration as they are returned from the PI Server."


Test setup

I created a small C# program using the AF SDK that does the following:

  1. Connects to a PIServer
  2. Creates 100 PIPoints, named 1 through 100, if they don't already exist
  3. After creating the PIPoints, inserts a total of 6.7 million events into the points. The data for each point simulates a sine wave, where the sample rate, frequency, number of bad values and duration vary from point to point
  4. Reads the config to see whether it should use a bulk call, parallel calls or serial calls
  5. Reads the config to see how many times it should execute each call
  6. Queries the snapshot value for each of the points, using the call type specified in step 4, repeated the number of times specified in step 5. To make sure there's nothing left to be enumerated, each snapshot value is written to a StringBuilder. Logs the time each execution takes in milliseconds
  7. Queries all the data from 2013-01-01 to 2014-01-01 for all the points, using RecordedValues with the call type specified in step 4, repeated the number of times specified in step 5. To make sure there's nothing left to be enumerated, the name of each PIPoint and its number of values are written to a StringBuilder. Logs the time each execution takes in milliseconds
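For reference, the per-point test data described in step 3 can be sketched roughly as below. The real program is C# against the AF SDK; this is a language-neutral Python sketch, and every parameter name and value here is invented for illustration, not taken from the actual test:

```python
import math
import random

def generate_point_data(sample_interval_s, frequency_hz, bad_value_ratio,
                        duration_s, seed=0):
    """Simulate one PI point's archive: a sine wave sampled at a fixed
    interval, with a fraction of samples replaced by a bad-value marker."""
    rng = random.Random(seed)
    events = []
    t = 0.0
    while t < duration_s:
        if rng.random() < bad_value_ratio:
            value = None  # stand-in for a bad/system value
        else:
            value = math.sin(2 * math.pi * frequency_hz * t)
        events.append((t, value))
        t += sample_interval_s
    return events

# Each point would get its own sample rate, frequency, bad-value ratio
# and duration, so the points don't all have identical archives.
events = generate_point_data(sample_interval_s=60.0, frequency_hz=1 / 3600,
                             bad_value_ratio=0.01, duration_s=86400, seed=42)
print(len(events))  # 1440 one-minute samples over one day
```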

 

I've run the program six times: once for each call type with the program on my local machine connecting to the PI Server across a network with some latency, and once for each call type with the program running directly on the PI Server.
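The timing itself needs nothing fancy. A minimal sketch of the repeat-and-log loop from steps 5-7 (function names are mine; the real C# program would use something like .NET's Stopwatch, which is an assumption on my part):

```python
import time

def time_runs(action, runs):
    """Run `action` the configured number of times and record each
    execution's elapsed time in milliseconds."""
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        action()  # e.g. one full snapshot or RecordedValues pass
        elapsed = (time.perf_counter() - start) * 1000.0
        timings_ms.append(elapsed)
        print(f"{elapsed:.1f} ms")
    return timings_ms

# Stand-in workload; the real action would be the AF SDK query.
timings = time_runs(lambda: sum(range(100_000)), runs=5)
```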

 

Results

Snapshot on server

Run | Parallel | Serial | Bulk with parallel reading
1 | 259 | 334 | 283
2 | 20 | 100 | 9
3 | 23 | 113 | 8
4 | 21 | 90 | 8
5 | 20 | 90 | 11
Total | 343 | 727 | 319
Average | 68.6 | 145.4 | 63.8

 

This behaves as I'd expect: with low latency from the program running directly on the server, parallel doesn't lag far behind bulk, and serial is somewhat slower. There also appears to be some caching or query optimization going on with all three methods, as the first call is consistently slower.

 

Snapshot over network

Run | Parallel | Serial | Bulk with parallel reading
1 | 437 | 402 | 96
2 | 250 | 259 | 11
3 | 249 | 231 | 13
4 | 259 | 234 | 12
5 | 272 | 247 | 11
Total | 1467 | 1371 | 143
Average | 293.3 | 274.6 | 28.6


Here we can really see the value of bulk calls. The parallel and serial calls behave similarly because both need expensive round-trips (though I expected parallel to outperform serial), while bulk leaves them both in the dust. As with the previous example, there seems to be some magic happening behind the scenes that makes subsequent calls faster. With further testing, one could try querying just a subset of all the tags in each call.
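A back-of-envelope model makes the round-trip argument concrete. The numbers below are invented, not measured, and this is not how the AF SDK actually schedules requests internally; it just shows why one round-trip for all tags beats one round-trip per tag once latency dominates:

```python
# Hypothetical numbers: 2 ms network round-trip, 0.2 ms of server work
# per tag, and (say) 8 in-flight requests for the parallel case.
TAGS = 100
RTT_MS = 2.0
WORK_MS = 0.2
CONCURRENCY = 8

serial_ms = TAGS * (RTT_MS + WORK_MS)                     # 100 RPCs, one at a time
parallel_ms = (TAGS / CONCURRENCY) * (RTT_MS + WORK_MS)   # RPCs overlap, still 100 round-trips
bulk_ms = RTT_MS + TAGS * WORK_MS                         # one round-trip covers every tag

print(round(serial_ms), round(parallel_ms, 1), round(bulk_ms))  # 220 27.5 22
```

Under this toy model, parallel only divides the latency cost by the concurrency level, while bulk pays it exactly once, which matches the shape of the snapshot-over-network results.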

 

RecordedValues on server

Run | Parallel | Serial | Bulk with parallel reading
1 | 7004 | 21663 | 21605
2 | 7084 | 20460 | 21454
3 | 6651 | 21250 | 21718
4 | 7654 | 21265 | 21838
5 | 8355 | 19634 | 21536
Total | 36748 | 104272 | 108151
Average | 7349.6 | 20854.4 | 21630.2

 

 

This is not expected at all. Not only does the bulk call get soundly beaten by parallel calls, it performs as badly as serial calls, which calls into question what the documentation says about bulk calls falling back to parallel RPCs only when the server doesn't support bulk.

 

RecordedValues over network

Run | Parallel | Serial | Bulk with parallel reading
1 | 11805 | 23259 | 20388
2 | 11575 | 22754 | 20113
3 | 11144 | 21303 | 20053
4 | 11731 | 21246 | 19091
5 | 10867 | 26302 | 19662
Total | 57122 | 114864 | 99307
Average | 11424.4 | 22972.8 | 19861.4

 

This is pretty much the same result as above, with a few seconds tacked on due to network latency.

 

So, what is going on here? Aren't bulk RPCs enabled for RecordedValues calls, or is there some other voodoo magic at work?


Thanks in advance for all help!
