14 Replies Latest reply on Jan 14, 2015 1:37 AM by ling

    Certain FilteredSummaries results in AFSDK aren't matching PISDK results

    kilgored

      I have an application that has been under test and recently some of the tests failed miserably, as the resulting values do not match expectations. After reviewing the code and not understanding why things could be wrong I did the unthinkable - I challenged the validity of the results from the AFSDK. In doing so, I have found certain cases where FilteredSummaries calls, on either AFAttribute or PIPoint, do not produce the same results as the PISDK.

       

      Here are some details, please show me where my blind spot is in finding the real problem, or confirm that it is truly an issue with the AFSDK RDA methods.

       

      The problem exists in both .NET 4.0 and 4.5 projects that use the AFSDK 2012 (2.5.2.5396) or 2014 (2.6.0.5843) when compared to several PISDK versions 1.4.0.416 and 1.4.2.445. I can reproduce it on my DEV system (PI 2012 - 3.4.390.16, AF 2012 - 2.4.0.4431) and another TEST system (PI 2012 - 3.4.0.390.16, AF 2014 - 2.6.0.5843). In each case, regardless of SDK versions, the PISDK always returns the same result - which is always different from the AFSDK results (that also match themselves from version to version).

       

      The scenario is getting a filtered average for a single tag where the filter is a boundary check on that tag itself. In my DEV system, I am using CDT158, in the TEST system it is a contrived dataset specific to the test being run. The filter is simple, using tag expression formatting it is:

       

      'cdt158' >= 60 and 'cdt158' < 100
      

       

      Where the limits (just for illustration) of 60 and 100 are values that get substituted in, based upon the measurement criteria of the application and test. The goal of the result is to produce a time weighted average that excludes outliers. We also request the Count summary, which provides us with a gauge of how much data was bad or unreasonable during this period.

       

      The results between the AFSDK and PISDK match perfectly as long as the data is all good and all within limits, or as long as only bad data is intermixed with data that is within the limits. However, the results from a period where outliers exist are a different story - the AFSDK results appear to be proportionally skewed towards the outliers. For example, if the limits are 1 and 2, and some data within the time range is -10000, the AFSDK result becomes a negative number - whereas the PISDK result is within the range of 1-2 (which makes sense because all of the other data should have been filtered from the averaging).

       

      So, I created a simple C# query test application to make it simpler to verify my theory. It isn't pretty or factored for error handling, but it is effective.

       

      using System;
      using System.Linq;
      using OSIsoft.AF.Asset;
      using OSIsoft.AF.Data;
      using OSIsoft.AF.Time;
      
      namespace QueryTest
      {
          class Program
          {
              static void Main(string[] args)
              {
                  string coreFilter = "'{0}' >= {1} AND '{0}' < {2}";
      
                  string attPath = args[0];
                  string startTime = args[1];
                  string endTime = args[2];
                  string min = args[3];
                  string max = args[4];
      
                  var afAtt = AFAttribute.FindAttribute(attPath, null) as AFAttribute;
                  var afPoint = afAtt.PIPoint;
                  var piPoint = afAtt.RawPIPoint as PISDK.PIPoint;
                  var ptData = piPoint.Data as PISDK.IPIData2;
                  
                  var afRange = new AFTimeRange(startTime, endTime);
                  var afSpan = new AFTimeSpan(afRange.Span);
                  var afFilter = string.Format(coreFilter, ".", min, max);
                  var ptFilter = string.Format(coreFilter, piPoint.Name, min, max);
      
                  AFValues afAvg, afPct;
                  PISDK.PIValues piAvg, piPct;
      
                  var attDict = afAtt.Data.FilteredSummaries(afRange, afSpan, afFilter, AFSummaryTypes.Average | AFSummaryTypes.Count, AFCalculationBasis.TimeWeighted, AFSampleType.ExpressionRecordedValues, AFTimeSpan.Zero, AFTimestampCalculation.Auto);
                  afAvg = attDict[AFSummaryTypes.Average];
                  afPct = attDict[AFSummaryTypes.Count];
                  Console.WriteLine("{0} :: {1} @ {2}", afAtt.GetPath(), afFilter, afAvg.First().Timestamp);
                  Console.WriteLine("{0:0.000} :: {1:0.000}", afAvg.First().Value, afPct.First().Value);
                  
                  var ptDict = afPoint.FilteredSummaries(afRange, afSpan, ptFilter, AFSummaryTypes.Average | AFSummaryTypes.Count, AFCalculationBasis.TimeWeighted, AFSampleType.ExpressionRecordedValues, AFTimeSpan.Zero, AFTimestampCalculation.Auto);
                  afAvg = ptDict[AFSummaryTypes.Average];
                  afPct = ptDict[AFSummaryTypes.Count];
                  Console.WriteLine("{0} :: {1} @ {2}", afPoint.Name, ptFilter, afAvg.First().Timestamp);
                  Console.WriteLine("{0:0.000} :: {1:0.000}", afAvg.First().Value, afPct.First().Value);
      
                  var ptNVS = ptData.FilteredSummaries(afRange.StartTime.UtcSeconds, afRange.EndTime.UtcSeconds, null, ptFilter, PISDK.ArchiveSummariesTypeConstants.asAll, PISDK.CalculationBasisConstants.cbTimeWeighted, PISDK.FilterSampleTypeConstants.fstPIPointRecordedValues);
                  piAvg = ptNVS["Average"].Value as PISDK.PIValues;
                  piPct = ptNVS["Count"].Value as PISDK.PIValues;
                  Console.WriteLine("{0} :: {1} @ {2}", piPoint.Name, ptFilter, piAvg[1].TimeStamp.LocalDate);
                  Console.WriteLine("{0:0.000} :: {1:0.000}", piAvg[1].Value, piPct[1].Value);            
              }
          }
      }
      

       

      When run from the command line with the parameters attribute path, start time, end time, minimum, and maximum, it queries the attribute and underlying point using each of the three possible methods within the two SDKs and outputs the results to the screen. For example:

       

      QueryTest "\\myafserver\mydatabase\myelement|myattribute" "7-jan-15 11:00" "7-jan-15 11:10" 60 100
      

       

      Note that the quotes around the attribute path are required due to the use of the pipe symbol, and around the timestamps because they contain spaces.

       

      Any ideas on what I'm missing or where I'm going wrong?

       

      Thanks,

       

      Dennis

        • Re: Certain FilteredSummaries results in AFSDK aren't matching PISDK results
          pthivierge

          Hi Dennis,

           

          Thanks for reporting this.

          I will contact members of our PI Data Access Team and ask them to have a look.

          And by the way I really like your code snippet, that will be very useful, thanks for that

          • Re: Certain FilteredSummaries results in AFSDK aren't matching PISDK results
            gregor

            Hi Dennis,

             

            I've tested your code against the RDA version of AF SDK 2.6.1.6238 and PISDK 1.4.4.484 with some slight modifications of your code. I am starting to become old and need better visibility.

            With the limit of 1 and 2 and an archive event of -10000 within the time range the result I am getting looks as follows:

            [Image: 39785_01.jpg]

            The message returned by AF SDK, "[-11101] All data events are filtered in summary calculation", is pretty clear. All events have been filtered out, so there's no data for calculation.

            The -2147219947 returned by PISDK translates to "Calculation failed" and is returned for the same reason.

            Can you please verify if you are getting the same results?

            CDT158 is serviced by Random interface and by nature reports differently on each PI Data Archive. It's always difficult to compare results of summary calls without knowing the raw data exactly. If you are able to reproduce any inconsistency, can you please share the corresponding raw data to make it easier for us to understand what's going on or to even reproduce the issue?

              • Re: Certain FilteredSummaries results in AFSDK aren't matching PISDK results
                kilgored

                Gregor,

                 

                If I use this dataset:

                Value        Timestamp
                65.94934     07-Jan-15 22:58:14
                60.52119     07-Jan-15 23:00:14
                65.7514      07-Jan-15 23:01:14
                61.83245     07-Jan-15 23:02:14
                61.73742     07-Jan-15 23:04:14
                -1000000     07-Jan-15 23:04:44
                59.79598     07-Jan-15 23:08:44
                60.9754      07-Jan-15 23:10:44

                 

                I get this result when I query the "7-Jan-15 23:00" to "7-Jan-15 23:10" period with limits of 60 and 100:

                \\DEVServ\TestDB\Element1|Attribute1 :: '.' >= 60 AND '.' < 100 @ 1/7/2015 11:00:00 PM

                -52757.719 :: 284.000

                CDT158 :: 'CDT158' >= 60 AND 'CDT158' < 100 @ 1/7/2015 11:00:00 PM

                -52757.719 :: 284.000

                CDT158 :: 'CDT158' >= 60 AND 'CDT158' < 100 @ 1/7/2015 11:00:00 PM

                62.443 :: 284.000

                 

                Whereas, when I change the -1000000 in that dataset to 0, the results become:

                \\DEVServ\TestDB\Element1|Attribute1 :: '.' >= 60 AND '.' < 100 @ 1/7/2015 11:00:00 PM

                59.182 :: 284.000

                CDT158 :: 'CDT158' >= 60 AND 'CDT158' < 100 @ 1/7/2015 11:00:00 PM

                59.182 :: 284.000

                CDT158 :: 'CDT158' >= 60 AND 'CDT158' < 100 @ 1/7/2015 11:00:00 PM

                62.443 :: 284.000

                 

                The PISDK results do not change based upon the outlier, however the AFSDK results do - in relation to the magnitude of the outlier.

              • Re: Certain FilteredSummaries results in AFSDK aren't matching PISDK results
                bshang

                Hi Dennis,

                 

                I was able to get similar behavior. Here was my setup. Using Tag "testFilter" with Step=Off

                 

                [Image: values.PNG]

                 

                Output:

                 

                \\BSHANGE6430S\TestFilter\Element1|Attribute1 :: '.' >= 5 AND '.' < 15 @ 1/8/2015 12:00:00 AM

                9.286 :: 25200.000

                testFilter :: 'testFilter' >= 5 AND 'testFilter' < 15 @ 1/8/2015 12:00:00 AM

                9.286 :: 25200.000

                testFilter :: 'testFilter' >= 5 AND 'testFilter' < 15 @ 1/8/2015 12:00:00 AM

                10.000 :: 25200.000

                 

                 

                The PISDK filters out the "0" and computes the time average as 70/(7 hrs) = 10.

                 

                The AFSDK is doing some interpolation near the 0 and is probably computing the average as (10+10+10+5+10+10+10)/7 = 9.286.

                 

                I'm not exactly sure if this type of behavior is by design but I know there are some subtleties regarding how interpolation is performed when a value is filtered depending on the method that is called.

                 

                I suspect we might get an agreement if we used expressions instead.

                PI SDK - IPICalculation.ExpressionSummaries()

                AF SDK - AF.Data.AFCalculation.CalculateSummaries()

                 

                Perhaps Arnold Woodall or Fred Zhang can chime in.

                 

                Barry

                  • Re: Certain FilteredSummaries results in AFSDK aren't matching PISDK results
                    kilgored

                    Barry,

                     

                    Thanks for confirming the behavior, now to understand the reason for it!

                     

                    The goal is one RPC round trip to get the correct answer, because this is being done for hundreds of Attributes/Points in a List object. I don't know that I agree with the expressions approach as a solution, and don't think that it would provide the right result.


                    Plus, with decades of PISDK operation I would consider that its result is de facto correct and that the AFSDK should match its behavior - or at least provide a parameter to let the caller decide the algorithm to be used.


                    FYI - I also have step=0 for the tags being referenced.


                    Dennis

                  • Re: Certain FilteredSummaries results in AFSDK aren't matching PISDK results
                    bshang

                    The difference appears to be related to the sampleType argument used in the PISDK and AFSDK's call to their respective filtered summaries methods.

                     

                    The AF SDK's AFSampleType enumeration has the members ExpressionRecordedValues and Interval.

                     

                    The PISDK's FilterSampleTypeConstants enumeration has the members fstPIPointRecordedValues (which Dennis is using) and fstExpRecordedValues (among others). For me, using the latter sample type brings the PI SDK result in line with the AF SDK result.

                     

                    Dennis, can you confirm on your side? I believe you just need to modify the line to the below:

                    var ptNVS = ptData.FilteredSummaries(afRange.StartTime.UtcSeconds, afRange.EndTime.UtcSeconds, null, ptFilter, PISDK.ArchiveSummariesTypeConstants.asAll, PISDK.CalculationBasisConstants.cbTimeWeighted, PISDK.FilterSampleTypeConstants.fstExpRecordedValues);

                     

                    Reading the PI SDK help for FilterSampleTypeConstants though, I don't think there should be a difference in our test cases here between using fstPIPointRecordedValues and fstExpRecordedValues, since the same tag is used as the "source" and in the filter expression. Clearly, however, each of them is assuming a different step behavior for the tag near the filtered value. I'll try to dig a bit deeper to see if there's an equivalent way of getting the desired average with the AF SDK.

                      • Re: Certain FilteredSummaries results in AFSDK aren't matching PISDK results
                        kilgored

                        Okay, by changing the PISDK FilterSampleType to be fstExpRecordedValues - the PISDK and AFSDK results match. I agree that it feels intuitive that the results should match regardless of the setting since there is only one tag in play here, but I'll be interested to see what you are able to dig up regarding the low level differences in the behavior.

                         

                        Here is an interesting perspective, which I suppose is more of how I expected things to behave - perhaps naively. Taking the dataset that I provided to Gregor above, if I use "Shutdown" for my outlier in lieu of -1000000 the AFSDK returns 62.443, which is the same as what the PISDK returns with fstPIPointRecordedValues - regardless of what value or state the outlier is.

                         

                        The issue that I'm trying to get straight in my own head is how can an average of values between 60 and 100 result in a value less than 60? The PISDK fstPIPointRecordedValues and bad data cases with AFSDK seem to use more of a truncation approach - which intuitively feels more correct, rather than an interpolation approach.

                          • Re: Certain FilteredSummaries results in AFSDK aren't matching PISDK results
                            bshang

                            Hi Dennis,

                             

                            I'd like to provide some further details regarding the issue you discovered here. You are absolutely correct in your findings. The differences we found reveal some interesting behaviors, and hopefully I can describe them more clearly here.

                             

                            As mentioned, the differences in the filtered summary arise from the sampleType enumeration being used.

                             

                            For the AF SDK, the AFSampleType enumeration includes the two options ExpressionRecordedValues and Interval.

                             

                            The PI SDK equivalents are

                            AFSampleType.ExpressionRecordedValues  ---> FilterSampleTypeConstants.fstExpRecValWithMinSampTime

                            AFSampleType.Interval ---> FilterSampleTypeConstants.fstInterval.

                             

                            Unfortunately, the AF SDK does not include an equivalent for FilterSampleTypeConstants.fstPIPointRecordedValues, which gives the desired behavior.

                             

                            fstPIPointRecordedValues will persist the value of the event right before the filter (i.e. the truncation behavior you described), whereas the other sample types will interpolate to the outlier value, skewing the average.

                             

                            This problem with the summary calculation also manifests itself in going from DataLink 4.x to 5.x, when the transition from PI SDK to AF SDK was made. We're aware of this limitation but unfortunately don't have the equivalent functionality in AF SDK yet.

                             

                            A possible workaround we've found is to try using Step=1 for the tag. Since we probably don't want to turn Step=1 permanently (it affects compression, trending, etc. for all clients), it may be worth trying to turn it On/Off programmatically if we need to use the AF SDK to get the correct average.

                             

                            I've posted sample code below that does this. With it, I was able to get the desired average from the AF SDK, matching the PI SDK result.

                             

                            using System;
                            using System.Collections.Generic;
                            using System.Linq;
                            using System.Text;
                            using System.Threading.Tasks;
                            using OSIsoft.AF;
                            using OSIsoft.AF.Data;
                            using OSIsoft.AF.PI;
                            using OSIsoft.AF.Asset;
                            using OSIsoft.AF.Time;
                            using PISDK;
                            using PISDKCommon;
                            using PITimeServer;
                            
                            
                            namespace TestFilter
                            {
                                class Program
                                {
                            
                                    static void Main(string[] args)
                                    {
                                        string coreFilter = "'{0}' >= {1} AND '{0}' < {2}";
                            
                            
                                        string attPath = args[0];
                                        string startTime = args[1];
                                        string endTime = args[2];
                                        string min = args[3];
                                        string max = args[4];
                            
                            
                                        var afAtt = AFAttribute.FindAttribute(attPath, null) as AFAttribute;
                                        var afPoint = afAtt.PIPoint;
                                        var piPoint = afAtt.RawPIPoint as PISDK.PIPoint;
                                        var ptData = piPoint.Data as PISDK.IPIData2;
                            
                            
                                        var afRange = new AFTimeRange(startTime, endTime);
                                        var afSpan = new AFTimeSpan(afRange.Span);
                                        var afFilter = string.Format(coreFilter, ".", min, max);
                                        var ptFilter = string.Format(coreFilter, piPoint.Name, min, max);
                            
                            
                                        AFValues afAvg, afPct, afStepAvg, afStepPct;
                                        PISDK.PIValues piAvg, piPct;
                            
                            
                                        var attDict = afAtt.Data.FilteredSummaries(afRange, afSpan, afFilter, AFSummaryTypes.Average | AFSummaryTypes.Count, AFCalculationBasis.TimeWeighted, AFSampleType.ExpressionRecordedValues, AFTimeSpan.Zero, AFTimestampCalculation.Auto);
                                        afAvg = attDict[AFSummaryTypes.Average];
                                        afPct = attDict[AFSummaryTypes.Count];
                                        Console.WriteLine("{0} :: {1} @ {2}", afAtt.GetPath(), afFilter, afAvg.First().Timestamp);
                                        Console.WriteLine("{0:0.000} :: {1:0.000}", afAvg.First().Value, afPct.First().Value);
                            
                            
                                        //Programmatically turn step on for the summary call; the finally block turns it off again even if the call throws.
                                        piPoint.PointAttributes.ReadOnly = false;
                                        PISDK.PointAttribute ptatr = piPoint.PointAttributes["Step"];
                                        IDictionary<AFSummaryTypes, AFValues> attStepDict;
                                        ptatr.Value = true;
                                        try
                                        {
                                            attStepDict = afAtt.Data.FilteredSummaries(afRange, afSpan, afFilter, AFSummaryTypes.Average | AFSummaryTypes.Count, AFCalculationBasis.TimeWeighted, AFSampleType.ExpressionRecordedValues, AFTimeSpan.Zero, AFTimestampCalculation.Auto);
                                        }
                                        finally
                                        {
                                            ptatr.Value = false;
                                        }
                                        afStepAvg = attStepDict[AFSummaryTypes.Average];
                                        afStepPct = attStepDict[AFSummaryTypes.Count];
                                        Console.WriteLine("{0} :: {1} @ {2}", afAtt.GetPath(), afFilter, afStepAvg.First().Timestamp);
                                        Console.WriteLine("{0:0.000} :: {1:0.000}", afStepAvg.First().Value, afStepPct.First().Value);
                            
                            
                                        var ptNVS = ptData.FilteredSummaries(afRange.StartTime.UtcSeconds, afRange.EndTime.UtcSeconds, null, ptFilter, PISDK.ArchiveSummariesTypeConstants.asAll, PISDK.CalculationBasisConstants.cbTimeWeighted, PISDK.FilterSampleTypeConstants.fstPIPointRecordedValues);
                                        piAvg = ptNVS["Average"].Value as PISDK.PIValues;
                                        piPct = ptNVS["Count"].Value as PISDK.PIValues;
                                        Console.WriteLine("{0} :: {1} @ {2}", piPoint.Name, ptFilter, piAvg[1].TimeStamp.LocalDate);
                                        Console.WriteLine("{0:0.000} :: {1:0.000}", piAvg[1].Value, piPct[1].Value);
                            
                            
                                    }
                                }
                            }
                            
                            
                              • Re: Certain FilteredSummaries results in AFSDK aren't matching PISDK results
                                gachen

                                Just wanted to point out that you do not necessarily need to change the tag to stepped. The same behavior can be achieved by using the AFCalculationBasis "TimeWeightedDiscrete". So if you use this in the call, it should essentially be the same as treating the tag as stepped:

                                 

                                var ptDict = afPoint.FilteredSummaries(afRange, afSpan, ptFilter, AFSummaryTypes.Average | AFSummaryTypes.Count, AFCalculationBasis.TimeWeightedDiscrete, AFSampleType.ExpressionRecordedValues, AFTimeSpan.Zero, AFTimestampCalculation.Auto);
                                
                                

                                 

                                (This replaces the afPoint.FilteredSummaries call, line 40 of Dennis' original code. The same change should apply to AFData.FilteredSummaries.)

                                  • Re: Certain FilteredSummaries results in AFSDK aren't matching PISDK results
                                    bshang

                                    I considered TimeWeightedDiscrete as well, but I think that will interpret the entire set of events as stepped, which will differ from the desired PI SDK's fstPIPointRecordedValues behavior (interpolate outside the filter range). Using Step=1 actually doesn't affect time-weighted averages per 20480SOSI8. It will continue to interpolate outside of the filter boundary. However, I think the difference is that using Step=1 prevents interpolation right before the filter and instead persists the value of the last unfiltered event, so it acts like the PI SDK's setting. I'll probably have to do some further tests with more realistic data to confirm the behavior though. My math is getting rusty and I've been hesitant to work out the arithmetic to confirm some of these averages manually

                                      • Re: Certain FilteredSummaries results in AFSDK aren't matching PISDK results
                                        gachen

                                        Hmm... I tested with Dennis' original data. With the tag stepped, all three functions (AFData.FilteredSummaries, PIPoint.FilteredSummaries and IPIData2.FilteredSummaries) return the same value as [AFData/PIPoint].FilteredSummaries with TimeWeightedDiscrete and step off. So it would seem that using TimeWeightedDiscrete is essentially the same as making the tag stepped, unless I'm missing something else?

                                         

                                        Also, the WI that you pointed out is for PIAdvCalcExpDat, which if I am not mistaken, actually corresponds with AFCalculation.CalculateSummaries (and since the WI is for DataLink and not AFSDK, I'm not sure that it applies to the underlying AFSDK function).

                              • Re: Certain FilteredSummaries results in AFSDK aren't matching PISDK results
                                bshang

                                Thanks everyone for their patience with this issue. We've created TS case 603690 to track the issues encountered and are working with the developers to provide better documentation. We'll share those here as soon as they become available.

                                  • Re: Certain FilteredSummaries results in AFSDK aren't matching PISDK results
                                    ling

                                     

                                    The PISDK fstExpRecordedValues sample type is the same as the AFSDK ExpressionRecordedValues sample type. Both calls are implemented on the PI Server side. PISDK fstPIPointRecordedValues is implemented by calling RecordedValues with the filter expression and then performing the summary calculation on the client side. The fstPIPointRecordedValues sample type was created primarily to provide filterSummary support for older versions of the PI Server (before PI 3.4). Since the filter expression is evaluated at the archived events of the summary tag, not at those of the input tags of the expression, it is less accurate in determining the filter state change if the expression's data is independent of the summary tag. For example, filtering a product flow rate based on a product type tag would not be accurate with fstPIPointRecordedValues.

                                     

                                    For a filter expression containing only the summary tag, both fstPIPointRecordedValues and fstExpRecordedValues should compute exactly the same filter transition points, i.e. the timestamps where the filter expression changes state. The difference in summary results between these two sample types is primarily caused by the way interpolation is done between the last good archive event and the filter off-to-on transition point. For that time period, the fstPIPointRecordedValues sample type uses the last good archive value and flat-lines up to the transition point. For fstExpRecordedValues, interpolation is used between the last good value and the archive value at the transition point. If the transition point is accurate (like a discrete tag in a logic expression) or if the filter expression is independent of the summary tag, using interpolation is more accurate. But for a data cleansing scenario, it is counterintuitive to include interpolated values that violate the filter expression, as with a filter expression of ‘sinusoid’ > 50.
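
As a rough check of the flat-line versus interpolation difference described above, here is a minimal Python sketch (plain arithmetic, no PI SDK or AF SDK calls) run against the dataset Dennis posted earlier in the thread. The function name `filtered_average` and the `flat_before_filtered` flag are made up for illustration. The model assumes the filter is evaluated only at recorded events and at the interpolated range boundaries; it is not the actual implementation of either SDK.

```python
# Simplified model of a time-weighted filtered average (not SDK code).
# Times are seconds relative to the query start, 07-Jan-15 23:00:00.
EVENTS = [(-106, 65.94934), (14, 60.52119), (74, 65.7514), (134, 61.83245),
          (254, 61.73742), (284, -1000000.0), (524, 59.79598), (644, 60.9754)]


def interpolate(p0, p1, t):
    """Linearly interpolate the value at time t between two events."""
    (t0, v0), (t1, v1) = p0, p1
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def filtered_average(events, start, end, lo, hi, flat_before_filtered):
    """Return (average, seconds that passed the filter lo <= value < hi)."""
    # Sample points: events inside the range plus interpolated boundary values.
    points = [p for p in events if start <= p[0] <= end]
    for p0, p1 in zip(events, events[1:]):
        for t in (start, end):
            if p0[0] < t < p1[0]:
                points.append((t, interpolate(p0, p1, t)))
    points.sort()

    area = seconds = 0.0
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if not lo <= v0 < hi:
            continue  # filter is false at the start of this segment: skip it
        dt = t1 - t0
        if lo <= v1 < hi or not flat_before_filtered:
            area += dt * (v0 + v1) / 2  # trapezoid: interpolate to next event
        else:
            area += dt * v0  # hold the last good value up to the filtered event
        seconds += dt
    if seconds == 0:
        raise ValueError("all data events are filtered")
    return area / seconds, seconds


flat_avg, flat_secs = filtered_average(EVENTS, 0, 600, 60, 100, True)
interp_avg, interp_secs = filtered_average(EVENTS, 0, 600, 60, 100, False)
print(f"{flat_avg:.3f} :: {flat_secs:.3f}")
print(f"{interp_avg:.3f} :: {interp_secs:.3f}")
```

Under these assumptions the flat-line variant gives 62.443 and the interpolating variant gives -52757.719, both over 284 seconds, which lines up with the fstPIPointRecordedValues and ExpressionRecordedValues outputs Dennis reported for the 60 to 100 filter.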

                                     

                                    This issue is not unique to filterSummary. When cleansing data, simply removing the “bad” event is ambiguous. The entire time period between the previous good value and the next good value after the “bad” event is affected because of interpolation. Should the system interpolate between the two good values, use the last good event, or substitute a limit (high/low) for the bad event?

                                     

                                    So the workarounds for replacing filterSummary fstPIPointRecordedValues when doing data cleansing in AFSDK are:

                                     

                                    1. Use the Interval sample type and provide a small enough interval to ensure an acceptably accurate filter state transition timestamp.
                                    2. Another alternative is to use the ExpressionSummary call and put the filter expression into the calculation expression of the call. In this case, the user can specify what to use in the summary calculation when the filter is violated. For example, the expression could be “if ‘tag’ > 500 then 500 else ‘tag’” to substitute a limit for the bad event. To get the fstPIPointRecordedValues behavior, i.e. substituting the previous good value for the bad value over the period between the last good value and the filter state change, use the expression “if ‘tag’ > 500 then “filtered” else ‘tag’”. The expression “if ‘tag’ > 500 then “badInput” else ‘tag’” will use the average value over the part of the time range where data is good as the value for the bad range.
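
To illustrate the limit-substitution idea in the second workaround, here is a small Python sketch (again plain arithmetic, no SDK calls). The series is hypothetical illustration data, and the helper names `time_weighted_average` and `clamp_high` are invented for this example. It assumes the substitution is applied only at recorded events and that values are linearly interpolated between them, which may differ from how the server evaluates a real expression.

```python
# Toy model of "if 'tag' > 500 then 500 else 'tag'" applied before averaging.
# The series below is hypothetical illustration data, not from a PI archive.
SERIES = [(0, 100.0), (60, 200.0), (120, 9000.0), (180, 300.0), (240, 100.0)]


def time_weighted_average(points):
    """Trapezoidal time-weighted average over (seconds, value) points."""
    area = sum((t1 - t0) * (v0 + v1) / 2
               for (t0, v0), (t1, v1) in zip(points, points[1:]))
    return area / (points[-1][0] - points[0][0])


def clamp_high(points, limit):
    """Substitute the limit for any recorded value above it."""
    return [(t, min(v, limit)) for t, v in points]


raw_avg = time_weighted_average(SERIES)
clamped_avg = time_weighted_average(clamp_high(SERIES, 500.0))
print(raw_avg, clamped_avg)
```

With this toy series the single 9000 outlier pulls the unfiltered average up to 2400, while substituting the limit brings it down to 275. The substituted value still contributes to the average, which is the trade-off of this approach compared with holding the last good value.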

                                      

                                     

                                     
