Aggregating Event Frame Data Part 7 of 9 - GroupedSummary per Manufacturer

Blog Post created by rdavin Employee on May 11, 2017

The Advanced AF SDK lab at UC SF 2017 was on this very topic.  The material in this 9-part series follows much of that lab which showcases AFEventFrameSearch methods new to PI AF SDK 2.9.

 

Blog Series: Aggregating Event Frame Data

Part 1 - Introduction

Part 2 - Let's Start at the End

Part 3 - Setting up the App

Part 4 - Classical FindEventFrames

Part 5 - Lightweight FindObjectFields

Part 6 - Summary per Model

Part 7 - GroupedSummary per Manufacturer

Part 8 - Compound AFSummaryRequest

Part 9 - Conclusion

 

Query One Level Up

GroupedSummary and Summary have something in common.  They both require a priori knowledge of what you will be summarizing before you can actually summarize it.  For Summary, this meant summarizing in the inner loop per Model, which required 3 calls (one for each of our 3 models).  For GroupedSummary, we can reduce the number of calls to the server by making the call in the outer loop per Manufacturer.  While we do need to know the manufacturers to filter upon for GroupedSummary, we don't need to know the models.

 

We will do something similar to what we did with Summary in Part 6:

  • Get a priori list of Manufacturers
  • Build a new token for the given Manufacturer
  • Issue the GroupedSummary call
  • Peel back the results to feed to my DurationStats and StatsTracker

 

public void GetSummaryByMfrAndModel(StatsTracker summary, AFDatabase database, IList<AFSearchToken> baseTokens)
{
    //Absolutely critical to have a priori list of Manufacturers
    var mfrList = summary.Keys.ToList();

    foreach (var mfr in mfrList)
    {
        var tokens = baseTokens.ToList();
        tokens.Add(new AFSearchToken(AFSearchFilter.Value, mfr, "|Manufacturer"));

        using (var search = new AFEventFrameSearch(database, "GroupedSummary Example", tokens))
        {
            //Opt-in to server side caching
            search.CacheTimeout = TimeSpan.FromMinutes(5);

            //While we eventually want an average, it will be calculated from Total and Count.
            var desiredSummaryTypes = AFSummaryTypes.Count | AFSummaryTypes.Total;
            var groupedField = "|Model";
            var summaryField = "Duration";

            var perMfr = search.GroupedSummary(groupedField, summaryField, desiredSummaryTypes);

            foreach (var grouping in perMfr.GroupedResults)
            {
                var model = grouping.Key.ToString();
                var totalVal = grouping.Value[AFSummaryTypes.Total];
                var countVal = grouping.Value[AFSummaryTypes.Count];

                var stats = new DurationStats();

                if (countVal.IsGood)
                {
                    stats.Count = countVal.ValueAsInt32();
                    if (totalVal.IsGood)
                    {
                        stats.TotalDuration = ((AFTimeSpan)totalVal.Value).ToTimeSpan();
                    }
                    summary.AddToSummary(mfr, model, stats.TotalDuration, stats.Count);
                }
            }
        }
    }
}

 

While there are some similarities, where we invoke the server call is very different.  Here with GroupedSummary, we make the call in our outer loop, so we make fewer trips to the server.  For Summary in Part 6, we made the call inside the inner loop.  The returned results are also quite different, though the concept of what we do with them is the same: peel back the returned dictionary and conform its contents to my output objects.
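The listing above requests only Count and Total and leaves the average for later.  Here is a minimal sketch of that arithmetic, written without any AF SDK dependency.  The DurationMath class and its method names are my own illustration (assumptions), not part of the series' DurationStats or StatsTracker.  One likely benefit of keeping Total and Count is that partial results can be rolled up by simple addition, which a stored average alone would not allow.

```csharp
using System;

public static class DurationMath
{
    // Average = Total / Count, computed in ticks to keep full TimeSpan precision.
    // Returns null when there is nothing to average.
    public static TimeSpan? Average(TimeSpan total, int count)
    {
        if (count <= 0) return null;
        return TimeSpan.FromTicks(total.Ticks / count);
    }

    // Partial (Total, Count) pairs roll up by simple addition,
    // e.g. one pair per Model combined into a figure per Manufacturer.
    public static (TimeSpan Total, int Count) Combine(
        (TimeSpan Total, int Count) a, (TimeSpan Total, int Count) b)
    {
        return (a.Total + b.Total, a.Count + b.Count);
    }
}
```

For example, a Total of 6 hours over a Count of 3 event frames gives an average of 2 hours, and combining (1 hour, 2) with (3 hours, 4) gives (4 hours, 6).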

 

The metrics shown in Part 2 might make you think GroupedSummary is faster than Summary.  In general, that is not true.  It holds for my particular use case only because my app makes more server calls with Summary than with GroupedSummary.  Do not walk away thinking you should avoid Summary.  Instead, do not hesitate to use it where the use case fits.
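To make the call-count argument concrete, here is a small sketch that counts the summary calls each approach would issue for a given Manufacturer-to-Models lookup.  The CallCounter class and the lookup shape are hypothetical, and the manufacturer and model names in the usage note are placeholders rather than the data set used in this series.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CallCounter
{
    // Summary sits in the inner loop: one call per (Manufacturer, Model) pair.
    public static int SummaryCalls(IDictionary<string, IList<string>> modelsByMfr)
    {
        return modelsByMfr.Values.Sum(models => models.Count);
    }

    // GroupedSummary sits in the outer loop: one call per Manufacturer,
    // no matter how many Models that Manufacturer has.
    public static int GroupedSummaryCalls(IDictionary<string, IList<string>> modelsByMfr)
    {
        return modelsByMfr.Count;
    }
}
```

With a placeholder lookup of "Mfr A" having two models and "Mfr B" having one, SummaryCalls returns 3 and GroupedSummaryCalls returns 2.  The gap widens as the number of models per manufacturer grows, and it disappears when every manufacturer has exactly one model.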

 

Metrics Comparison (from Part 2)

The numbers below are from a 2-core VM using Release x64 mode.  Smaller values are better.  Note that the network columns mix units of measure (MB and KB), so check the unit shown with each value.

 

Resource Usage:

Values displayed are in MB unless noted otherwise

Method            Total GC Memory (MB)  Working Set Memory (MB)  Network Bytes Sent  Network Bytes Received
----------------  --------------------  -----------------------  ------------------  ----------------------
FindEventFrames   145.48                257.08                   9.13 MB             190.08 MB
FindObjectFields  1.28                  65.55                    5.00 KB             3.68 MB
Summary           2.54                  55.35                    8.58 KB             261.81 KB
GroupedSummary    9.86                  64.28                    6.24 KB             1.98 MB
AFSummaryRequest  7.29                  65.36                    5.00 KB             3.68 MB

 

Performance:

Method            Client RPC Calls  Client Duration (ms)  Server RPC Calls  Server Duration (ms)  Elapsed Time
----------------  ----------------  --------------------  ----------------  --------------------  ------------
FindEventFrames   120               63337.0               110               39118.1               02:27.8
FindObjectFields  10                5360.8                11                4547.6                00:06.0
Summary           15                9484.6                16                9310.9                00:10.1
GroupedSummary    12                5527.2                13                4938.5                00:06.2
AFSummaryRequest  10                2992.2                10                2222.2                00:03.7

 

 

BONUS: GroupedSummary Using ONE Call

Let's come up with a better use case where we only need to issue one call.  Allow me once again to temporarily change my requirements on the end report, purely for illustration purposes.  Let's imagine I am no longer interested in the averages and counts per manufacturer and model.  Instead I want to summarize the same data set as a whole, but I only care about models.  In this new scenario I have absolutely no concern about manufacturers.  The new report would look like:

 

Manufacturer  Model            Count Avg Duration
------------- ------------ --------- ----------------
<Any>         DQ-M0L           8,136 03:53:21.4859882
<Any>         Nimbus 2000      1,499 03:44:28.8192128
<Any>         SWTG-3.6        13,678 03:53:35.3165667
------------- ------------ --------- ----------------
            1            3    23,313

 

For the code to do that, I don't need to initialize my summary object to populate itself from an AFTable.

 

//I still use StatsTracker for conformity but we don't need to initialize this from our AFTable  
var summary = new StatsTracker();  

 

That is the summary instance I will pass to my new method, which now eliminates 1 level of looping.  However, I will need to later sort the results so I am going to pass summary by ref.

 

public void GetSummaryByMfrAndModel(ref StatsTracker summary, AFDatabase database, IList<AFSearchToken> tokens)
{
    summary = new StatsTracker();

    //In this bonus test, we want to only issue one GroupedSummary call.
    //Rather than rigorously issue separate calls per Manufacturer, I instead issue one call grouped on Model for all Manufacturers.
    //The downside is I lose the individual Manufacturer names.

    using (var search = new AFEventFrameSearch(database, "GroupedSummary Example", tokens))
    {
        //Opt-in to server side caching
        search.CacheTimeout = TimeSpan.FromMinutes(5);

        //While we eventually want an average, it will be calculated from Total and Count.
        var desiredSummaryTypes = AFSummaryTypes.Count | AFSummaryTypes.Total;
        var groupedField = "|Model";
        var summaryField = "Duration";

        var groupedSummary = search.GroupedSummary(groupedField, summaryField, desiredSummaryTypes);

        foreach (var grouping in groupedSummary.GroupedResults)
        {
            var model = grouping.Key.ToString();
            var totalVal = grouping.Value[AFSummaryTypes.Total];
            var countVal = grouping.Value[AFSummaryTypes.Count];

            var stats = new DurationStats();

            if (countVal.IsGood)
            {
                stats.Count = countVal.ValueAsInt32();
                if (totalVal.IsGood)
                {
                    stats.TotalDuration = ((AFTimeSpan)totalVal.Value).ToTimeSpan();
                }
                summary.AddToSummary("<Any>", model, stats.TotalDuration, stats.Count);
            }
        }
    }
    //Sort the results.  They have the same Manufacturer "<Any>" but the Models will be alphabetical.
    summary = summary.SortByKeys();
}

 

I am expecting to get back 3 rows, so I sort the results before returning from my method.  Let's review the metrics for making that one bonus GroupedSummary call and compare them to making one bonus Summary call.

 

Metric                   Summary    GroupedSummary
-----------------------  ---------  --------------
Total GC Memory (MB)     4.48       12.24
Working Set Memory (MB)  52.48      61.84
Network Bytes Sent       4.77 KB    4.85 KB
Network Bytes Received   260.02 KB  1.98 KB
Client RPC Calls         10         10
Client Duration (ms)     534.0      1913.7
Server RPC Calls         10         10
Server Duration (ms)     353.8      1472.8
Elapsed Time             00:01.1    00:02.6

 

For the right use cases, both of these methods are extremely fast, and should be welcome in your tool bag.  Don't shy away from using Summary or GroupedSummary because one table shows sluggish performance.  Use your noggin and pick the right tool for the right job.  The emphasis should be on producing the desired results with the fewest trips to the server.

 

 

Up Next: Name That Tune In One Call

Putting aside the bonus section, let's return to the original report by Manufacturer and Model.  Here again is the pattern you should have witnessed in the progression of each method in Parts 4 - 7:

Part 4: Heavy detail records

Part 5: Light detail records

Part 6: Aggregation per inner loop

Part 7: Aggregation per outer loop

 

For each successive example we were making fewer calls or receiving fewer records.  The good news is that Summary and GroupedSummary are downright miserly on resources consumed.  The bad news is the a priori knowledge requirement, along with the multiple calls, which degrade performance.  Wouldn't it be great to make only ONE call, and to do so without knowing ahead of time what we want to summarize?  That will be covered in Part 8.
