Aggregating Event Frame Data Part 2 of 9 - Let's Start at the End

Blog post created by rdavin on May 10, 2017

The Advanced AF SDK lab at UC SF 2017 was on this very topic.  The material in this 9-part series follows much of that lab which showcases AFEventFrameSearch methods new to PI AF SDK 2.9.


Blog Series: Aggregating Event Frame Data

Part 1 - Introduction

Part 2 - Let's Start at the End

Part 3 - Setting up the App

Part 4 - Classical FindEventFrames

Part 5 - Lightweight FindObjectFields

Part 6 - Summary per Model

Part 7 - GroupedSummary per Manufacturer

Part 8 - Compound AFSummaryRequest

Part 9 - Conclusion


The Final Output Report

In Part 3, we'll cover more about the AF objects being used by the 5 different applications, all of which will produce the exact same report but use completely different methods to do so.  The report we want to generate will show the count of event frames and average duration per manufacturer and model.  Here's what the desired report looks like:


Manufacturer  Model            Count Avg Duration

------------- ------------ --------- ----------------

Cervantes     DQ-M0L           8,136 03:53:21.4859882

Sailr         Nimbus 2000      1,499 03:44:28.8192128

Sailr         SWTG-3.6        13,678 03:53:35.3165667

------------- ------------ --------- ----------------

            2            3    23,313


Keep in mind this is a simple example with a small dataset.  Granted, each Model currently happens to be unique across the entire dataset, but we are going to assume that in the near future Cervantes may offer a "Nimbus 2000" model of its own.  This means we can't be lazy in our programming.  And by lazy, I don't mean deferred execution, but rather sloppy.  Instead, we must be rigorous in our programming and truly count by a Model within a Manufacturer.  You are invited to think about how this would be done in AF 2.8, and in Part 4 we will go over one such implementation.
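Whatever retrieval method is used, being rigorous here simply means aggregating on the composite key of Manufacturer plus Model rather than Model alone.  A minimal LINQ sketch of that grouping, using a hypothetical FrameRow record standing in for the values pulled from each event frame (the record and method names are mine, not from the lab):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened record; in the real app these values come from the
// event frame's Manufacturer and Model attributes and its Duration.
public record FrameRow(string Manufacturer, string Model, TimeSpan Duration);

public static class ReportBuilder
{
    // Group on the composite key (Manufacturer, Model) -- never Model alone --
    // so a future Cervantes "Nimbus 2000" lands in its own row.
    public static void Print(IEnumerable<FrameRow> rows)
    {
        var report = rows
            .GroupBy(r => (r.Manufacturer, r.Model))
            .OrderBy(g => g.Key.Manufacturer).ThenBy(g => g.Key.Model)
            .Select(g => new
            {
                g.Key.Manufacturer,
                g.Key.Model,
                Count = g.Count(),
                AvgDuration = TimeSpan.FromTicks((long)g.Average(r => r.Duration.Ticks))
            });

        foreach (var line in report)
            Console.WriteLine($"{line.Manufacturer,-13} {line.Model,-12} {line.Count,9:N0} {line.AvgDuration}");
    }
}
```

The same composite-key discipline applies no matter which of the five retrieval methods supplies the rows.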


Different Methods to Produce Same Report

In Parts 4 - 8 we are going to dedicate each part to a different method that produces the exact same report.  Mind you, calling 5 different methods means we must consider 5 different ways of formulating the call to the respective method, plus the different types of objects returned from each one.  In order, the 5 apps will be dedicated to these 5 AFEventFrameSearch methods:

  1. FindEventFrames (Part 4)
  2. FindObjectFields (Part 5)
  3. Summary (Part 6)
  4. GroupedSummary (Part 7)
  5. Composite AFSummaryRequest call (Part 8 and not really an AFEventFrameSearch call)


These apps will not necessarily be presented in order of slowest to fastest, nor from biggest resource hog to skimpiest.  Rather, I present them in terms of how many rows of data are returned:

  1. FindEventFrames, returns all records with fairly heavyweight objects
  2. FindObjectFields, returns all records but with lightweight classes of just the desired values
  3. Summary, issues one aggregate call per Model (total of 3 calls for sample dataset)
  4. GroupedSummary, issues one aggregate call per Manufacturer (total of 2 calls)
  5. Composite AFSummaryRequest call, issues one aggregate call for entire dataset (total of 1 call)
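As a rough preview of the shape these calls take, here is a sketch of setting up the search and making the heavyweight Part 4 call.  The server name, database name, and query string are my own placeholders, and the exact overloads should be checked against the AF SDK 2.9 reference; this is not code from the lab:

```csharp
using System;
using OSIsoft.AF;
using OSIsoft.AF.EventFrame;
using OSIsoft.AF.Search;

// Placeholder connection: substitute your own AF server and database names.
AFDatabase database = new PISystems().DefaultPISystem.Databases["MyDatabase"];

// Placeholder query string; yours should match your event frame template.
var search = new AFEventFrameSearch(database, "DurationReport", "Template:'Downtime'");
search.CacheTimeout = TimeSpan.FromMinutes(10); // opt in to server-side caching

// Part 4 style: heavyweight, fully loaded AFEventFrame objects.
foreach (AFEventFrame ef in search.FindEventFrames(fullLoad: true))
{
    // Each frame arrives with every attribute, property, and referenced element.
}
```

The later parts swap the FindEventFrames loop for FindObjectFields, Summary, GroupedSummary, and finally the compound AFSummaryRequest.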


If you're wondering why we don't just skip to the last one, since making a single call seems to make the most sense, I would caution that it is a bit more complicated to call.  Besides, each of the other methods offers different benefits and will definitely have a place in your bag of tricks, and we would miss out on the opportunity for such comparisons if we skipped straight to the end.


Based on my output requirements, I will be shoehorning Summary and GroupedSummary to conform to my contrived output needs.  You will see in their respective parts that I am forced to make multiple calls.  In this respect, it's not a great use case.  To make up for this, I will show some bonus code and metrics in those parts for use cases more tailor-made to the respective methods.


Metrics Comparison

The numbers below are from a 2-core VM running a Release x64 build.  Smaller values are better.  Be careful that the units of measure sometimes differ between MB and KB; I will call out KB where needed.


Resource Usage:

Values displayed are in MB unless noted otherwise


Method            Total GC Memory (MB)  Working Set Memory (MB)  Network Bytes Sent  Network Bytes Received

----------------  --------------------  -----------------------  ------------------  ----------------------

FindEventFrames                 145.48                   257.08             9.13 MB               190.08 MB

FindObjectFields                  1.28                    65.55             5.00 KB                 3.68 MB

Summary                           2.54                    55.35             8.58 KB               261.81 KB

GroupedSummary                    9.86                    64.28             6.24 KB                 1.98 MB

AFSummaryRequest                  7.29                    65.36             5.00 KB                 3.68 MB



Method            Client RPC Calls  Client Duration (ms)  Server RPC Calls  Server Duration (ms)  Elapsed Time

----------------  ----------------  --------------------  ----------------  --------------------  ------------


The above tables are quite enlightening, but don't jump to premature conclusions.  For instance, one may be tempted to proclaim that the GroupedSummary method is faster than the Summary method.  That's not true.  You will see later that my application requires me to make 2 GroupedSummary calls versus 3 Summary calls, but there is an extra method call involved.  I also tested my app issuing 3 GroupedSummary calls in lieu of Summary, and it took 5 seconds longer.  The lesson here is to make as few calls to the server as possible.  What if we had 50 Manufacturers, each with 3 Models?  Then we would need 150 calls for Summary and 50 calls for GroupedSummary, but still only 1 call for the compound AFSummaryRequest.  My best advice is to avoid making too many calls when a better way is available.


Later in Part 6, I will temporarily change my output requirements to show bonus numbers where I issue one and only one Summary call.  Bottom line: it takes only 534 ms for the client RPC duration, and a blazing 1.1 seconds total elapsed time.  Still think Summary is slow?  Not for the right use case, it isn't.


I offer a similar bonus in Part 7 as well, where I issue one and only one GroupedSummary call.  Client RPC duration is 1971 ms and total elapsed time is 2.6 seconds.


Explaining the Performance Boost

The first time the performance metrics were shown in the lab, a hearty discussion followed.  Why the big difference?  It's not due to caching.  You will see later that every exercise, including FindEventFrames, implements server-side caching.  The question isn't really why the new methods are faster, but rather why the older method is slower.


The older method is very heavy.  All we need from each event frame is its Manufacturer (string), Model (string), and Duration (AFTimeSpan).  But FindEventFrames(fullLoad: true) brings back so much more: every property, every attribute, and every referenced element for every event frame.  And because our event frames were generated from an Analysis, it also brings over the Analysis, which includes parsing every expression in the Analysis, where some time is spent deciding whether something enclosed in single quotes is an 'attribute' or a 'time'.  Serializing all of that out of SQL Server and materializing it as AF objects in the client workspace is a performance drain.


The newer methods are far more lightweight.  You get back only the skinny bit of data you asked for.  That explains the reduced RPC traffic, and therefore the reduced execution time, as well as the smaller resource footprint.
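To make "just the skinny bit of data" concrete, here is how a FindObjectFields call might look.  Treat this strictly as an assumption-laden sketch: the "|Attribute" field-string syntax, the returned row shape, and the placeholder connection details are from my reading of the AF SDK 2.9 documentation and should be verified against the official reference:

```csharp
using System;
using System.Collections.Generic;
using OSIsoft.AF;
using OSIsoft.AF.Search;

// Placeholder connection and query, as elsewhere in this series sketch.
AFDatabase database = new PISystems().DefaultPISystem.Databases["MyDatabase"];
var search = new AFEventFrameSearch(database, "SkinnySearch", "Template:'Downtime'");

// Assumed syntax: "|Name" tokens request attribute values; plain tokens such
// as Duration request event frame properties.
foreach (IList<object> row in search.FindObjectFields("|Manufacturer |Model Duration"))
{
    var manufacturer = (string)row[0];
    var model = (string)row[1];
    var duration = (TimeSpan)row[2];  // only these three values cross the wire
}
```

Compare this with fullLoad FindEventFrames, which drags the entire object graph across the network for every frame.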


Lose Wait Now, Ask Me How!

I'm teasing you because there's lots more to cover in this series, and you have a lot more reading time to invest.  Hopefully the metrics savings I've shown will convince you to stay tuned for more.  The next parts get longer, and we have 2 more to cover before we even reach the new methods!  But now you have full expectations of the benefit that can be realized from these new methods, so it should be worth sticking around.


Up next in Part 3, we discuss the AF objects we will be working with in all the later parts.