4 Replies Latest reply on May 9, 2016 5:49 PM by bshang

    Handling Errors returned by GetObserverEvents() method of an AFDataPipe

    nkumar_CAT

      Hello there,

      I am new to the OSI PI AF SDK and am trying to read real-time data using AFDataPipe. I know that myDataPipe.GetObserverEvents() returns an AFErrors<TKey> object if any errors occurred.

      I can store the PIPoint that errored, together with the timestamp at which the error occurred, and retrieve the missing values for that PIPoint via a batch job that calls GetRecordedValues().

      For my batch job to read the missing data for this PIPoint, I need to know the end timestamp up to which I have to capture values.

      One approach I have in mind is to keep checking the error collection until the data pipe has resumed reading values for the errored PIPoint. I can then capture that timestamp and use it as the end of the interval for which I retrieve the missing recorded values.

      Please let me know if you have a better or more efficient way to handle this.

        • Re: Handling Errors returned by GetObserverEvents() method of an AFDataPipe
          gregor

          Hello Nitesh,

           

          Do you see the AFErrors collection come back as something other than empty or null? If so, can you share some details about the errors it contains?

           

          The first thing to do is to log the errors, which will allow further investigation. If the same set of items always returns errors, that may indicate a systematic error or a configuration issue. A retry mechanism will be of no use in that case.

          • Re: Handling Errors returned by GetObserverEvents() method of an AFDataPipe
            bshang

            Hi Nitesh,

             

            There are at least 2 approaches here:

             

            Approach 1 (simpler):

            Inspect AFDataLossException (see the AFDataLossException Class reference).

            This exception is returned in the PIServerErrors collection of AFErrors whenever possible data loss is detected (expired consumer, server-side overflow, exception). It provides StartTime and EndTime properties that the client can keep track of per PI Data Archive. Persist these error periods and have the batch job or thread process the time ranges to recover the data.

             

            Note that the server will persist the consumer's update queue for about 10 minutes in the event of disconnection. AF SDK re-establishes the connection automatically, and if that happens before the queue expires, you should be able to recover the data automatically (i.e. no data loss exception is returned).
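
            As a rough illustration of "persist these error periods and have the batch job process the time ranges", here is a minimal sketch. It deliberately does not touch AF SDK: LossPeriod and LossPeriodStore are hypothetical helper types, the server is identified by a plain string, and storage is in memory only. In a real client the start and end times would be copied from each AFDataLossException found in PIServerErrors, and the list would be written to durable storage.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not an AF SDK type): one possible-data-loss window for one server.
public sealed class LossPeriod
{
    public string Server { get; }
    public DateTime StartUtc { get; }
    public DateTime EndUtc { get; }

    public LossPeriod(string server, DateTime startUtc, DateTime endUtc)
    {
        if (endUtc < startUtc) throw new ArgumentException("End time precedes start time.");
        Server = server;
        StartUtc = startUtc;
        EndUtc = endUtc;
    }
}

// Hypothetical in-memory store; a real one would persist to a file or database.
public sealed class LossPeriodStore
{
    private readonly List<LossPeriod> _periods = new List<LossPeriod>();

    // Call once per data loss exception, with its start and end times.
    public void Record(string server, DateTime startUtc, DateTime endUtc)
    {
        _periods.Add(new LossPeriod(server, startUtc, endUtc));
    }

    // Returns the windows for one server, sorted by start time, with
    // overlapping or touching windows merged so the batch job does not
    // query the same range twice.
    public IReadOnlyList<LossPeriod> GetMergedPeriods(string server)
    {
        var merged = new List<LossPeriod>();
        foreach (var p in _periods.Where(x => x.Server == server).OrderBy(x => x.StartUtc))
        {
            int lastIndex = merged.Count - 1;
            if (lastIndex >= 0 && p.StartUtc <= merged[lastIndex].EndUtc)
            {
                var last = merged[lastIndex];
                var end = p.EndUtc > last.EndUtc ? p.EndUtc : last.EndUtc;
                merged[lastIndex] = new LossPeriod(server, last.StartUtc, end);
            }
            else
            {
                merged.Add(p);
            }
        }
        return merged;
    }
}
```

            The batch job would then loop over GetMergedPeriods(...) for each server and make one historical call per merged window. Merging is a design choice for this sketch, not something the SDK requires.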

             

            Approach 2 (more complex but offers more control):

            1) Maintain a checkpoint timestamp per attribute and persist this state externally. When a new event comes in, move the checkpoint forward to the new time. Assume that everything before the checkpoint, except for explicitly tracked error periods, has been received through the pipe successfully.

            2) If AFErrors returns an error for an attribute that suggests data loss*, start an error period denoting a potential data loss range, and set the start time to the current checkpoint. Persist the error period state externally. For server-level errors, start the error period for all attributes under that server.

            3) When the next event comes in, set the end time for the error period and update the checkpoint. A batch job can later make historical calls passing in the error period time intervals to recover the data.

            4) When we receive a new event but can't process it successfully (e.g. the 3rd party DB we are writing to is down), place the unprocessed event in a persisted queue to try again later, but still update the checkpoint.

            5) On startup, create an error period from each persisted checkpoint to the first new value received (on a per-attribute basis) so we can backfill over the period when the application was not running.

             

            *If we only require at-least-once processing, we can be fairly broad in defining what constitutes an error period.

             

            Assumptions:

            We are receiving in-order data. If we miss an event due to an exception, we assume we missed events that are timestamped within that exception time window, so missed out-of-order events timestamped outside the window cannot be recovered easily.
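
            Steps 1, 2, 3 and 5 of Approach 2 could be sketched roughly as below. This is only an illustration under several assumptions: CheckpointTracker and ErrorPeriod are hypothetical names, attributes are identified by plain strings rather than AFAttribute objects, state is kept in memory instead of being persisted externally, and the retry queue from step 4 is left out. For a server-level error, the caller would invoke OnError for every attribute under that server.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: a potential data loss range for one attribute.
public sealed class ErrorPeriod
{
    public string Attribute { get; }
    public DateTime StartUtc { get; }
    public DateTime? EndUtc { get; private set; } // null while the period is still open

    public ErrorPeriod(string attribute, DateTime startUtc)
    {
        Attribute = attribute;
        StartUtc = startUtc;
    }

    public void Close(DateTime endUtc) { EndUtc = endUtc; }
}

// Hypothetical tracker; a real one would persist checkpoints and periods externally.
public sealed class CheckpointTracker
{
    private readonly Dictionary<string, DateTime> _checkpoints = new Dictionary<string, DateTime>();
    private readonly Dictionary<string, ErrorPeriod> _open = new Dictionary<string, ErrorPeriod>();
    private readonly List<ErrorPeriod> _closed = new List<ErrorPeriod>();

    // Closed periods are what the batch job would backfill with historical calls.
    public IReadOnlyList<ErrorPeriod> ClosedPeriods => _closed;

    // Step 5: on startup, load a checkpoint saved by the previous run and open
    // an error period from it, covering the time the application was down.
    public void RestoreCheckpoint(string attribute, DateTime checkpointUtc)
    {
        _checkpoints[attribute] = checkpointUtc;
        OnError(attribute);
    }

    // Step 2: open an error period that starts at the current checkpoint.
    public void OnError(string attribute)
    {
        if (_open.ContainsKey(attribute)) return; // already tracking a period
        if (!_checkpoints.TryGetValue(attribute, out var start)) return; // no checkpoint yet, so no known start
        _open[attribute] = new ErrorPeriod(attribute, start);
    }

    // Steps 1 and 3: close any open error period at the event time, then
    // move the checkpoint forward. Assumes in-order events, as stated above.
    public void OnEvent(string attribute, DateTime eventTimeUtc)
    {
        if (_open.TryGetValue(attribute, out var period))
        {
            period.Close(eventTimeUtc);
            _closed.Add(period);
            _open.Remove(attribute);
        }
        if (!_checkpoints.TryGetValue(attribute, out var current) || eventTimeUtc > current)
        {
            _checkpoints[attribute] = eventTimeUtc;
        }
    }

    public bool TryGetCheckpoint(string attribute, out DateTime checkpointUtc)
    {
        return _checkpoints.TryGetValue(attribute, out checkpointUtc);
    }
}
```

            Each entry in ClosedPeriods gives the batch job a start and end time to pass to a historical call for that attribute. An error reported before any event has been seen for an attribute is ignored here because there is no checkpoint to start from; a real implementation might fall back to the application start time instead.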

              • Re: Handling Errors returned by GetObserverEvents() method of an AFDataPipe
                nkumar_CAT

                Thanks for the reply, Barry!

                In Approach 1, how can I catch this AFDataLossException? I am iterating through the errors as follows:

                private void writeErrorLogs(AFErrors<AFAttribute> errors)
                {
                    if (errors.PIServerErrors != null && errors.PIServerErrors.Count > 0)
                    {
                        foreach (var item in errors.PIServerErrors)
                        {
                            // save to db
                        }
                    }
                }

                Also, please confirm whether AFDataLossException is only available through PIServerErrors. How would you recover data if the error is not a PISystem or PIServer error?

                 

                Thanks again for your help.