2 Replies Latest reply on Mar 1, 2017 1:49 PM by gregor

    Improve Summary Service Performance?


      Hey Community,

      I am facing performance issues in our production environment with the summary service (GET streams/{webId}/summary) when getting the maximum value of a PI Point over a range of one year or more.

      Indexing is running for the PI data, and the average time to get summary data for one point is 1 to 1.5 seconds, which is really slow given how large the data set is.

      Is there any workaround we could use to speed up summary data retrieval?




        • Re: Improve Summary Service Performance?
          Rick Davin

          Hi Wissam,


          By your own admission, the size of your data is big and you are looking at a range of over a year.  If your archives were saving 1 reading every minute, that would be 1,440 per day, or 525,600 in 365 days.  To find the Max, all 525,600 values must be read from your hard drives.  Taking 1.5 seconds to do that does not seem unreasonable, considering your Data Archive is also busy with lots of new data coming in as well as handling data requests from other users.


          For a Max, you could cache pre-calculated summaries in new PI points using Analytics.  If you calculated the Max for each day, you would only need to read 365 values to find the yearly max. That should speed things up a bit.  This solution works fine for Max or Min, but gets trickier for Total or Average: you must account for time-weighting, and a yearly average computed as an average of daily averages is generally not the true yearly average.
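
          A minimal sketch of why the daily roll-up is safe for Max but not for Average. The readings below are made up for illustration, and the recombination uses plain event counts; a time-weighted average as PI computes it would need durations in place of counts.

```python
from statistics import mean

# Hypothetical raw readings grouped by day (made-up values; day2 has fewer events).
raw_by_day = {
    "day1": [10.0, 12.0, 11.0, 13.0],
    "day2": [30.0, 50.0],
}
all_values = [v for vals in raw_by_day.values() for v in vals]

# Max of the daily maxima equals the max over all raw values.
daily_max = {day: max(vals) for day, vals in raw_by_day.items()}
yearly_max = max(daily_max.values())

# Average of daily averages is generally NOT the overall average,
# because days with few events get the same weight as days with many.
daily_avg = [mean(vals) for vals in raw_by_day.values()]
avg_of_avgs = mean(daily_avg)
true_avg = mean(all_values)

# Storing (sum, count) per day instead lets the average be recombined exactly.
daily_sum_count = [(sum(vals), len(vals)) for vals in raw_by_day.values()]
recombined_avg = sum(s for s, _ in daily_sum_count) / sum(c for _, c in daily_sum_count)

print(yearly_max, avg_of_avgs, true_avg, recombined_avg)
```

          The same idea applies to Total: store daily totals and add them up, rather than deriving them from averages.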


          Good luck,


          • Re: Improve Summary Service Performance?

            Hello Wissam,


            Performance is relative, and getting the summary for a PI Point for a period of more than 1 year means that all archived values for that period need to be retrieved and processed by the PI Archive Subsystem. Can you tell how many archived values this PI Point has for the specified period? To find out, just change your summary query to return the count instead of the maximum.
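
            As a rough sketch, the only change between the two requests is the summary type. This assumes the summary endpoint takes startTime, endTime and summaryType query parameters; the server name and WebID below are placeholders, not real values.

```python
from urllib.parse import quote, urlencode

def summary_url(base_url, web_id, summary_type, start_time="*-1y", end_time="*"):
    """Build a GET streams/{webId}/summary URL for one summary type."""
    query = urlencode({
        "startTime": start_time,      # PI time syntax: one year before now
        "endTime": end_time,          # "*" means now
        "summaryType": summary_type,  # e.g. "Maximum" or "Count"
    })
    return f"{base_url.rstrip('/')}/streams/{quote(web_id, safe='')}/summary?{query}"

BASE = "https://myserver/piwebapi"  # placeholder server (assumption)
WEB_ID = "P0abc123"                 # placeholder WebID (assumption)

max_url = summary_url(BASE, WEB_ID, "Maximum")
count_url = summary_url(BASE, WEB_ID, "Count")
print(count_url)
```

            The "*" in the time expressions is percent-encoded as %2A in the query string, which the server decodes back to "*".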


            Often, when users complain about performance, we find that PI Points are configured without proper settings for Exception and, even more important, Compression, causing far more events to be archived than necessary. Disk space can be considered cheap these days, so the price is paid not in storage but when data must be retrieved over longer periods of time. You just need the maximum for a specified period, and this is actually a very good example of why it is important to find proper settings for Compression and Exception.


            Another factor in performance is whether the data for a PI Point is written with sub-second timestamps. Without sub-seconds, PI Data Archive stores just the time offset to the previous event, but with sub-seconds the complete timestamp is stored. With millions of events, the performance impact becomes noticeable.


            To figure out where the time is spent, you may want to compare PI Web API performance against the performance you get by using a client like PI DataLink or another Developer Technology. I'd suggest using AF SDK for the comparison because this is what recent clients are using and what other Developer Technologies use under the hood.
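
            One way to make such a comparison repeatable is a small timing harness, sketched here under assumptions: the function passed in stands for whatever call you are measuring (a PI Web API request, or a script that drives another client), and nothing here is part of any PI library. Taking the median of several runs reduces the influence of one-off cache or network effects.

```python
import time
from statistics import median

def time_call(fn, repeats=5):
    """Run fn() `repeats` times and return the median wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return median(durations)

def placeholder_query():
    # Hypothetical stand-in for the real summary request being measured.
    time.sleep(0.01)

if __name__ == "__main__":
    print(f"median: {time_call(placeholder_query):.3f} s")
```

            Run the same query range and point through each client and compare the medians to see whether the time is spent in PI Web API or in the Data Archive itself.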


            Please note that the PI Index Search Crawler that comes with PI Web API indexes object names to improve performance when searching for data items. There is no indexing on PI Point data, but there are options to create such an index, for example by using Asset Analytics. By setting up an Analysis to report the maximum of the past day and persisting this information to a PI Point, you would be able to query this PI Point instead of the raw data PI Point. You can imagine that finding the maximum out of 365 values (assuming 1 year as the query period) happens in the blink of an eye.
