4 Replies Latest reply on Jan 8, 2015 7:30 PM by bshang

    Questions about RecordedValues bulk data delivery

    ritchiecarroll

      Using PIPointList.RecordedValues I can quickly get an enumeration of AFValues - i.e., all values for the specified time range for each PI point (as a collection of AFValues) in the PIPointList. I haven't stressed this function yet and I assume there are limits.

       

      In my initial testing, all data points appear to arrive as one big batch when the method returns: I call the RecordedValues method and I get "all" the data. I just want to confirm that this is the expected behavior. I am not sure whether this is a single RPC call or multiple calls behind the scenes that together provide all the expected data to the programmer.

       

      In particular, what happens if I query a very large time range or a very large number of points? Will data continue to queue up in memory until all of it has been received before the call to RecordedValues returns? Or is RecordedValues implemented with "yield return", so that it hands data back (per RPC return) as the server produces it, in more of a streaming fashion, e.g., I would receive multiple AFValues for a given point over time?

       

      I noticed this note in the documentation:

       

      "The PI Server imposes a limit on the maximum number of events that can be returned with a single call. By default this is set at 150,000. This behavior can be changed on the server by editing the server's PITimeout table and adding or editing the value associated with the parameter ArcMaxCollect."

       

      Does this mean that no matter how much data I request, I will only ever receive 150,000 (i.e., ArcMaxCollect) events from a RecordedValues call?

       

      Also, does the PIPagingConfiguration control how the data is received from the RecordedValues function or is this a parameter to control server-side query behavior?

       

      My goal is to know how to properly retrieve very large data sets using AF-SDK in a "streaming" fashion.

       

      Thanks for any help in clarification on proper use of SDK to perform this task.

       

      J. Ritchie Carroll

        • Re: Questions about RecordedValues bulk data delivery
          Mike Zboray

          The PIPagingConfiguration determines how the data is returned to the client. The data is returned to the client in chunks based on this configuration; however, the client can consume it as a single stream. You can configure it to make more or fewer trips to the server by changing the page size. Ryan Gilbert has a good explanation of PIPagingConfiguration. From his answer there:

           

          The PI Server will never return partial results for a single tag even if you page by Event Count; it will always finish collecting events for the current tag even if that threshold has been reached.

           

          That means you have to be careful if the tags you are querying are potentially dense. You can still get an error or a timeout even if you've specified conservative paging parameters.
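To make the dense-tag caveat concrete, here is a minimal Python sketch of paging by event count. It is only an assumed model of the behaviour quoted above, not the PI Server's actual algorithm, and `pages_by_event_count` and its inputs are hypothetical names.

```python
def pages_by_event_count(event_counts, page_size):
    """Group tags into pages, closing a page once its event total reaches
    page_size. A tag is never split, so the current tag is always finished
    even if that pushes the page far past the threshold.

    event_counts: list of (tag_name, number_of_events) pairs (hypothetical).
    Returns a list of (tags_in_page, events_in_page) tuples.
    """
    pages, current, total = [], [], 0
    for tag, n_events in event_counts:
        current.append(tag)   # the whole tag goes into the current page
        total += n_events
        if total >= page_size:
            pages.append((current, total))
            current, total = [], 0
    if current:               # flush the last, partially filled page
        pages.append((current, total))
    return pages


# One dense tag makes the first page far larger than the requested size.
pages = pages_by_event_count([("a", 10), ("b", 500000), ("c", 5)], page_size=100)
```

With a requested page size of 100 events, the page containing the dense tag "b" still carries all 500,000 of its events, which is why conservative paging parameters alone do not rule out a timeout.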

          • Re: Questions about RecordedValues bulk data delivery
            bshang

            Hi James,

             

            Here are some details regarding the behavior of the bulk data methods.

             

            When a bulk RecordedValues call is made, the AF SDK partitions the points in the PIPointList by PI Data Archive (Server). Then, if the PI Data Archive is 3.4.390 (2012) or higher, it will execute a bulk call (one RPC) per PI Data Archive. If the version is less than 390, then the AF SDK will internally make an RPC call per PIPoint.
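As a rough illustration of that partitioning, the Python sketch below groups points by server and plans one call per server when the version supports bulk calls, or one call per point otherwise. It only models the description above; `plan_rpcs`, the `(server, tag)` tuples and the version tuples are hypothetical and are not AF SDK types.

```python
from collections import defaultdict

# Minimum PI Data Archive version that supports bulk calls, per the text above.
BULK_MIN_VERSION = (3, 4, 390)


def plan_rpcs(points, server_versions):
    """Return the list of (server, tags) calls that would be issued.

    points: list of (server_name, tag_name) pairs (hypothetical shape).
    server_versions: dict mapping server_name to a version tuple.
    """
    by_server = defaultdict(list)
    for server, tag in points:
        by_server[server].append(tag)  # partition the point list by server

    calls = []
    for server, tags in by_server.items():
        if server_versions[server] >= BULK_MIN_VERSION:
            calls.append((server, tags))  # one bulk RPC for the whole server
        else:
            calls.extend((server, [tag]) for tag in tags)  # one RPC per point
    return calls


calls = plan_rpcs(
    [("new", "t1"), ("old", "t2"), ("new", "t3"), ("old", "t4")],
    {"new": (3, 4, 390), "old": (3, 4, 385)},
)
```

In this example the two points on the newer server share a single call, while the two points on the older server each get their own.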

             

            For now, I'll assume that the PI Data Archive is 3.4.390+ and the PIPointList contains only points from this one server.

             

            Regarding how the data is returned to the client, the data is "paged" back to the client. When a PIPointList.RecordedValues call is made, an internal Task is created that executes the bulk RPC query. This task receives the paged results from the PI Data Archive, does any post-processing (e.g., UOM conversion), and adds the results to a .NET ConcurrentQueue. This queue is then exposed to the AF SDK client as an IEnumerable<AFValues> collection.

             

            The client application simply iterates over and consumes the resulting collection in a separate thread. All the work of the page processor task is hidden from the end user so the developer only needs to iterate over the collection to obtain the results. If the collection is empty by the time the next page is requested, then the foreach loop will "block" until the next page arrives from the server, is processed, and is added to the queue/collection.
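The page-processor pattern described above can be sketched in Python as a background thread that fills a queue while the caller iterates over it. This is an analogy under stated assumptions, not the AF SDK implementation: `fetch_pages` is a hypothetical stand-in for the bulk RPC, and `recorded_values` here is an illustrative function, not the SDK method.

```python
import queue
import threading
import time

_DONE = object()  # sentinel that marks the end of the result stream


def fetch_pages(points, page_size):
    """Hypothetical stand-in for the bulk RPC: yields one page at a time,
    each page being a list of (point, values) pairs."""
    for start in range(0, len(points), page_size):
        time.sleep(0.01)  # simulate the network round trip for this page
        page = points[start:start + page_size]
        yield [(p, [f"{p}-v{i}" for i in range(3)]) for p in page]


def recorded_values(points, page_size=2):
    """Start a background page processor and return a lazy iterator."""
    results = queue.Queue()

    def page_processor():
        try:
            for page in fetch_pages(points, page_size):
                for item in page:
                    results.put(item)  # post-processing would happen here
        finally:
            results.put(_DONE)  # always unblock the consumer

    threading.Thread(target=page_processor, daemon=True).start()

    def consume():
        while True:
            item = results.get()  # blocks until the next result is queued
            if item is _DONE:
                return
            yield item

    return consume()


# The caller just iterates; paging is hidden behind the iterator.
out = list(recorded_values(["A", "B", "C", "D", "E"], page_size=2))
```

The consumer never sees page boundaries. When the queue is empty it simply waits in `results.get()` until the next page has been processed, which mirrors the "blocking" foreach loop described above.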

             

            The AF SDK programmer's guide has a good discussion of these topics in the "List / Bulk Data Methods Overview" section. The thread "AF SDK performance, serial vs. parallel vs. bulk" is also worth reading to get an idea of bulk data retrieval behaviors. Also, this presentation is worth going over to find out more about the different data retrieval patterns.

             

            Regarding ArcMaxCollect, for server version 3.4.380 or higher, the default was changed to 1,500,000. We generally don't recommend increasing this value beyond the 1.5 million default unless there is a need to, as the limit exists to prevent overloading the network and server.

             

            Regarding your last question about retrieving data in a "streaming" fashion, are you looking to obtain real-time events from the PI Data Archive? If so, then AFDataPipe should be used instead of RecordedValues (which is more for historical data access). The presentation linked above and the AF SDK programmer's guide both have more details on getting real-time data.
