8 Replies Latest reply on Jun 13, 2013 6:43 PM by jlakumb

    Performance of PI data summary calculations

    AlistairFrith

      We have a project that requires summary calculations (min/max/average) on data that has been retrieved from PI and potentially then adjusted. The way we are doing this is...

      1. Obtain the target AF asset and attribute.
      2. Request time-series data for the attribute.
      3. Potentially perform manipulations on the data retrieved (eg removing spikes)
      4. Calculate a selection of average, max, min, count above, etc. on the adjusted data, using the PIValues.Summary() method. There are some calculations that are not supported by Summary() and we obviously have to do those ourselves.

      Trouble is, this is hugely slow! We suspect that PIValues.Summary() actually sends the values back to the PI server each time, in which case it may be more efficient for us to implement the calculations ourselves.

       

      Does anyone have any advice on the most efficient way to do this? It is not unusual for the users to ask for a report summarising a year's worth of data for 7000 assets, with average, minimum and maximum for 2 data streams at each site. This report takes several hours to run!

       

      --- Alistair.

        • Re: Performance of PI data summary calculations

          Hello Alistair,

           

          What kind of data are we talking about?

           

          Alistair Frith

          Potentially perform manipulations on the data retrieved (eg removing spikes)

           

          Where are those spikes from? Is there any reason to not fully trust the collected data? Is this 'manipulated' data written back to PI tags - keyword substituted?

           

          Alistair Frith

          Does anyone have any advice on the most efficient way to do this

           

          Are you asking how to calculate minimum, maximum and average for data that you do have in memory on the client side already? Is a time or event weighted average required?

           

          Alistair Frith

          It is not unusual for the users to ask for a report summarising a year's worth of data for 7000 assets, with average, minimum and maximum for 2 data streams at each site. This report takes several hours to run!

           

          Is the resulting report including additional data aggregation? What's the resulting format and size of this report?

           

           

            • Re: Performance of PI data summary calculations
              AlistairFrith

              Gregor Beck

              Where are those spikes from? Is there any reason to not fully trust the collected data? Is this 'manipulated' data written back to PI tags - keyword substituted?

               

              Spikes etc. are due to unreliability in the source instrumentation. One example (not of a spike but another anomaly) is tap-changer signals. There is an instrument that should record '2' when a tap change is in progress and '1' when not. However, sometimes the sensor 'sticks' on '2'. Tap changes typically last a few seconds, and if they really lasted longer than 20 seconds, there would literally be an explosion. So when we see 1,2,2,1 where there is a gap of several minutes between the '2's, we know that there were actually at least 2 tap changes and possibly more, but the sensor was stuck. If we want to calculate the average tap-change duration, we should only use the case where a '2' is followed by a '1', so we should remove the offending '2' values.

               

              Likewise, there are some sensors that do 'spike'. If we see a single value that is way outside the standard operating values, we should remove (ignore) it when calculating an average temperature. The cleansed data is not written back to PI since the raw data must remain unaltered and it would not be cost effective to duplicate tags (although I suppose we could use annotations but the end result would be the same: we would need to manipulate the data before running standard calculations).
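
              The two cleansing rules described above could be sketched roughly as follows. This is only an illustration on plain in-memory data: the Sample type, the Cleansing class and the low/high operating limits are hypothetical names that are not part of PI or AF SDK, and the values are assumed to have been copied out of PI and sorted by time already.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample type: a timestamped reading already copied out of PI.
public struct Sample
{
    public DateTime Time;
    public double Value;

    public Sample(DateTime time, double value)
    {
        Time = time;
        Value = value;
    }
}

public static class Cleansing
{
    // Returns a new list without readings outside [low, high].
    // The raw input is not modified, so the original data stays intact.
    public static List<Sample> RemoveSpikes(IEnumerable<Sample> raw, double low, double high)
    {
        return raw.Where(s => s.Value >= low && s.Value <= high).ToList();
    }

    // Returns one duration per tap change, using only a '2' sample that is
    // immediately followed by a '1' sample. Earlier '2's in a run (the
    // "stuck" readings) are ignored rather than counted as one long change.
    public static List<TimeSpan> TapChangeDurations(IList<Sample> samples)
    {
        var durations = new List<TimeSpan>();
        for (int i = 0; i + 1 < samples.Count; i++)
        {
            if (samples[i].Value == 2 && samples[i + 1].Value == 1)
            {
                durations.Add(samples[i + 1].Time - samples[i].Time);
            }
        }
        return durations;
    }
}
```

              For the sequence 1,2,2,1 only the second '2' is paired with the following '1', so a single short duration is reported instead of one spanning several minutes.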

               

              Gregor Beck

              Are you asking how to calculate minimum, maximum and average for data that you do have in memory on the client side already? Is a time or event weighted average required?

               

              I suppose so. The calculation itself is straightforward; we just want to know which way is most efficient. Advice on the best algorithms is always appreciated, though!

               

              Gregor Beck

              Is the resulting report including additional data aggregation? What's the resulting format and size of this report?

               

              Some of the reports do further calculations such as detecting correlations between one tag rising steeply when another one is falling or looking for further events on other assets at the same time. The end report is simply a CSV text file.

               

              --- Alistair.

               

               

               

               

               

               

                • Re: Performance of PI data summary calculations

                  Hello Alistair,

                   

                  My approach would be to evaluate the event count (list length) first. Then I would loop through the list of events and evaluate minimum, maximum and average within that loop. To get the (event-weighted) average, divide each single value by the count and sum the resulting quotients; accumulating the scaled values rather than the raw values avoids running into an overflow. Here is an example for illustration that intentionally doesn't involve anything PI-specific:

                   

                   

                   
                  using System;
                  using System.Collections.Generic;
                  using System.Linq;
                  using System.Text;
                  
                  namespace CustomAverage
                  {
                      class Program
                      {
                          static void Main(string[] args)
                          {
                              List<double> myValues = new List<double>();
                              Random myRandom = new Random();
                              Int32 iListLength;
                              // Don't know anything about the value range
                              // => initialize Min & Max with the extremes of Double so that
                              //    the first value in the list always replaces them
                              Double myMin = Double.MaxValue;
                              Double myMax = Double.MinValue;
                              Double myAvg = 0;
                              // Preparation: Fill the list with random values
                              for (int i = 1; i <= 10000; i++)
                              {
                                  myValues.Add(myRandom.NextDouble());
                              }
                              iListLength = myValues.Count;
                              // Get the results (minimum, maximum and average)
                              foreach (Double myValue in myValues)
                              {
                                  if (myMin > myValue) { myMin = myValue; }
                                  if (myMax < myValue) { myMax = myValue; }
                                  myAvg = myAvg + myValue / iListLength;
                              }
                              Console.WriteLine("Minimum: " + myMin.ToString());
                              Console.WriteLine("Maximum: " + myMax.ToString());
                              Console.WriteLine("Average: " + myAvg.ToString());
                              Console.WriteLine();
                              Console.WriteLine("Press any key to quit ...");
                              Console.ReadKey();
                          }
                      }
                  }
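
                  For comparison with the event-weighted average above, a time-weighted average could be sketched as follows. This is a minimal sketch under the assumption that each value holds until the next timestamp (stepped behaviour), which may differ from how PI weights values; the Averages class and its TimeWeighted method are hypothetical names and not part of any PI library.

```csharp
using System;
using System.Collections.Generic;

public static class Averages
{
    // Time-weighted mean of samples that are sorted by timestamp.
    // Assumption: each value holds until the next timestamp, and the
    // last value holds until 'end'.
    public static double TimeWeighted(IList<KeyValuePair<DateTime, double>> samples, DateTime end)
    {
        if (samples.Count == 0)
        {
            throw new ArgumentException("No samples.");
        }

        double weightedSum = 0;
        double totalSeconds = 0;
        for (int i = 0; i < samples.Count; i++)
        {
            // The interval covered by sample i ends at the next sample, or at 'end'.
            DateTime next = (i + 1 < samples.Count) ? samples[i + 1].Key : end;
            double seconds = (next - samples[i].Key).TotalSeconds;
            weightedSum += samples[i].Value * seconds;
            totalSeconds += seconds;
        }

        if (totalSeconds <= 0)
        {
            throw new ArgumentException("Time range is empty.");
        }
        return weightedSum / totalSeconds;
    }
}
```

                  With a value of 10 held for 30 seconds and a value of 20 held for 10 seconds, this returns 12.5, whereas the event-weighted average of the same two values is 15.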
                  

                   

                    • Re: Performance of PI data summary calculations
                      Marcos Vainer Loeff

                      Hello Alistair,

                       

                       You may also consider using PI AF SDK 2012 (or 2.5) with Rich Data Access in order to get better performance. PI AF 2.5 was the first version of the PI AF SDK that allows direct access to PI time series data. In PI AF 2.5, two versions of the PI AF SDK are provided; one based on .NET 4 and one based on .NET 3.5.  The .NET 4 version allows direct access to PI time series data whereas the .NET 3.5 version continues to use the PI SDK to access PI time series data.  As such, the .NET 3.5 version provides backward compatibility with legacy data references that many vCampus users have built in the past.

                       

                       

                       

                      Testing has shown that AFSDK (.NET 4) in AF 2.5 retrieves PI time series data faster than PI SDK in many scenarios and also significantly reduces memory usage.  AFSDK avoids STA/MTA issues commonly seen in PISDK development and also eliminates COM interop overhead. There is a webinar about this topic if you are interested in learning more.

                       

                       

                       

                      Below is a code snippet for you to use as a reference that shows two options. The first part connects to the PI Server and gets data from a PI Point, while the second part connects to the AF Server and gets data from an attribute. Both methods use only PI AF SDK (and not PI SDK).

                       

                       

                       

                       

                       
                      using System;
                      using System.Collections.Generic;
                      using System.Linq;
                      using System.Text;
                      using OSIsoft.AF;
                      using OSIsoft.AF.PI;
                      using OSIsoft.AF.Time;
                      using OSIsoft.AF.Asset;
                      using OSIsoft.AF.Data;
                      
                      namespace ConsoleApplication1
                      {
                          class Application
                          {
                              public void Run()
                              {
                                  PIServerSummary();
                                  AFServerSummary();
                              }
                      
                              private void PIServerSummary()
                              {
                                  Console.WriteLine("Getting values from a PI tag");
                      
                                  PIServers MyPIServers = new PIServers();
                                  PIServer MyPiServer = MyPIServers.DefaultPIServer;
                                  MyPiServer.Connect();
                                  PIPoint MyTag = PIPoint.FindPIPoint(MyPiServer, "sinusoid");
                      
                                  AFTime startTime = new AFTime("*-1d");
                                  AFTime endTime = new AFTime("*");
                                  AFTimeRange timeRange = new AFTimeRange(startTime, endTime);
                                  IDictionary<AFSummaryTypes, AFValue> MaxValue = MyTag.Summary(timeRange, OSIsoft.AF.Data.AFSummaryTypes.Maximum, OSIsoft.AF.Data.AFCalculationBasis.EventWeighted, OSIsoft.AF.Data.AFTimestampCalculation.Auto);
                                  IDictionary<AFSummaryTypes, AFValue> MinValue = MyTag.Summary(timeRange, OSIsoft.AF.Data.AFSummaryTypes.Minimum, OSIsoft.AF.Data.AFCalculationBasis.EventWeighted, OSIsoft.AF.Data.AFTimestampCalculation.Auto);
                                  IDictionary<AFSummaryTypes, AFValue> AvgValue = MyTag.Summary(timeRange, OSIsoft.AF.Data.AFSummaryTypes.Average, OSIsoft.AF.Data.AFCalculationBasis.EventWeighted, OSIsoft.AF.Data.AFTimestampCalculation.Auto);
                      
                                  Console.WriteLine("The maximum value is " + MaxValue.First().Value);
                                  Console.WriteLine("The minimum value is " + MinValue.First().Value);
                                  Console.WriteLine("The average is " + AvgValue.First().Value);
                                  Console.ReadKey();
                              
                              }
                      
                              private void AFServerSummary()
                              {
                                  Console.WriteLine("Getting values from an AF attribute");
                      
                                  PISystems MyPISystems = new PISystems();
                                  PISystem MyPISystem = MyPISystems.DefaultPISystem;
                                  MyPISystem.Connect();
                                  AFAttribute myAttribute = AFObject.FindObject(@"\\marc-pi2012\Piperfmon\TagsValue|Sinusoid") as AFAttribute;
                       
                      
                                  AFTime startTime = new AFTime("*-1d");
                                  AFTime endTime = new AFTime("*");
                                  AFTimeRange timeRange = new AFTimeRange(startTime, endTime);
                                  IDictionary<AFSummaryTypes, AFValue> MaxValue = myAttribute.Data.Summary(timeRange, OSIsoft.AF.Data.AFSummaryTypes.Maximum, OSIsoft.AF.Data.AFCalculationBasis.EventWeighted, OSIsoft.AF.Data.AFTimestampCalculation.Auto);
                                  IDictionary<AFSummaryTypes, AFValue> MinValue = myAttribute.Data.Summary(timeRange, OSIsoft.AF.Data.AFSummaryTypes.Minimum, OSIsoft.AF.Data.AFCalculationBasis.EventWeighted, OSIsoft.AF.Data.AFTimestampCalculation.Auto);
                                  IDictionary<AFSummaryTypes, AFValue> AvgValue = myAttribute.Data.Summary(timeRange, OSIsoft.AF.Data.AFSummaryTypes.Average, OSIsoft.AF.Data.AFCalculationBasis.EventWeighted, OSIsoft.AF.Data.AFTimestampCalculation.Auto);
                      
                                  Console.WriteLine("The maximum value is " + MaxValue.First().Value);
                                  Console.WriteLine("The minimum value is " + MinValue.First().Value);
                                  Console.WriteLine("The average is " + AvgValue.First().Value);
                                  Console.ReadKey();
                      
                              }
                          }
                      }
                      

                       

                       

                       

                       

                      Let me know if you have any questions,

                        • Re: Performance of PI data summary calculations
                          cmanhard

                          Please request all the summaries in a single call (the three AF SDK calls should be one). This is a lot faster and less stressful on the PI Server as well.

                           

                          IDictionary<AFSummaryTypes, AFValue> Results = myAttribute.Data.Summary(timeRange, OSIsoft.AF.Data.AFSummaryTypes.Maximum | OSIsoft.AF.Data.AFSummaryTypes.Minimum | OSIsoft.AF.Data.AFSummaryTypes.Average, OSIsoft.AF.Data.AFCalculationBasis.EventWeighted, OSIsoft.AF.Data.AFTimestampCalculation.Auto);

                           

                          Console.WriteLine("The max value is {0}", Results[AFSummaryType.Maximum].Value);

                           

                           

                           

                           

                            • Re: Performance of PI data summary calculations
                              Marcos Vainer Loeff

                              Hello Alistair,

                               

                              Remember that the examples above perform the calculations on the server. If you want to make the calculations on the client, you should use the AFValues.Summary() method. Please refer to the code snippet below:

                               

                               

                               
                                          PISystems MyPISystems = new PISystems();
                                          PISystem MyPISystem = MyPISystems.DefaultPISystem;
                                          MyPISystem.Connect();
                                          AFAttribute myAttribute = AFObject.FindObject(@"\\marc-pi2012\Piperfmon\TagsValue|Sinusoid") as AFAttribute;
                                          AFTime startTime = new AFTime("*-1d");
                                          AFTime endTime = new AFTime("*");
                                          AFTimeRange timeRange = new AFTimeRange(startTime, endTime);
                                          AFValues MyValues = myAttribute.Data.RecordedValues(timeRange,AFBoundaryType.Inside,null,"",true,0);
                                          IDictionary<AFSummaryTypes, AFValue> Results = MyValues.Summary(timeRange, (OSIsoft.AF.Data.AFSummaryTypes.Maximum | AFSummaryTypes.Minimum | AFSummaryTypes.Average),AFCalculationBasis.EventWeighted,AFTimestampCalculation.Auto);
                                          Console.WriteLine("The maximum value is " + Results[OSIsoft.AF.Data.AFSummaryTypes.Maximum].Value);
                                          Console.WriteLine("The minimum value is " + Results[OSIsoft.AF.Data.AFSummaryTypes.Minimum].Value);
                                          Console.WriteLine("The average is " + Results[AFSummaryTypes.Average].Value);
                                          Console.ReadKey();
                              

                               

                               

                              By the way, the last line of the post above is missing an 's'. You should use AFSummaryTypes.Maximum (and not AFSummaryType.Maximum).

                                • Re: Performance of PI data summary calculations
                                  AlistairFrith

                                  Those last few posts are really interesting. We had independently come to the conclusion that we should loop through once and do all the calculations within that one loop, as suggested by Gregor. I am currently implementing this.

                                   

                                  Using AF SDK 2012 sounds good, but are there compatibility considerations? The customer is running AF Server 2010 (2.4.0.44331) and AF SDK 2.4.0.331. Do I just install AF SDK 2.5 and not worry about upgrading the server?

                                   

                                  I did not realise we could 'or' the calculation types together. I suppose that is simply getting the SDK to do what Gregor is suggesting we do in our own code.
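
                                  To illustrate the idea of 'or'-ing calculation types, here is a minimal sketch of a flags enum driving a single pass over the data. SummaryKinds and OnePass are hypothetical stand-ins written for this example; they are not the AF SDK types, and nothing here claims to show how the SDK computes its summaries internally.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a flags enum such as AFSummaryTypes.
// Each member uses its own bit so that members can be combined with '|'.
[Flags]
public enum SummaryKinds
{
    None = 0,
    Minimum = 1,
    Maximum = 2,
    Average = 4
}

public static class OnePass
{
    // Walks the values once and returns only the statistics requested in 'wanted'.
    public static Dictionary<SummaryKinds, double> Summarise(IList<double> values, SummaryKinds wanted)
    {
        if (values.Count == 0)
        {
            throw new ArgumentException("No values.");
        }

        double min = double.MaxValue;
        double max = double.MinValue;
        double sum = 0;
        foreach (double v in values)
        {
            if (v < min) { min = v; }
            if (v > max) { max = v; }
            sum += v;
        }

        // Test each bit of the combined request and report the matching result.
        var results = new Dictionary<SummaryKinds, double>();
        if ((wanted & SummaryKinds.Minimum) != 0) { results[SummaryKinds.Minimum] = min; }
        if ((wanted & SummaryKinds.Maximum) != 0) { results[SummaryKinds.Maximum] = max; }
        if ((wanted & SummaryKinds.Average) != 0) { results[SummaryKinds.Average] = sum / values.Count; }
        return results;
    }
}
```

                                  A call such as OnePass.Summarise(values, SummaryKinds.Minimum | SummaryKinds.Average) returns a dictionary with just those two entries.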

                                   

                                  Unfortunately, I have very little time to investigate these various options! For now, I will try Gregor's suggestion. I have a performance profiler and will see how much this improves things.

                                   

                                  --- Alistair.

                                    • Re: Performance of PI data summary calculations
                                      jlakumb

                                      Regarding PI AF SDK compatibility, this is a common question/confusion.  Here is what we state in PI AF Client 2012 SP1 release notes:

                                       

                                      PI AF Client 2012 supports access to any PI AF 2.x server, although all features, bug fixes and performance enhancements may not be available with older PI AF Servers.  PI AF Server 2012 also supports access from any PI AF 2.x client.  Likewise, to take advantage of all of the latest features, bug fixes and performance enhancements requires the PI AF Client 2012.  Running the latest version of both the PI AF Client and PI AF Server is recommended.