12 Replies Latest reply on Jul 17, 2015 6:45 PM by skwan

    Calculation accuracy using snapshots


      Bit of a left-field question here....


      If I have an AF Analysis adding two input tags from separate scan classes, an inaccuracy creeps in due to stale snapshots. Whenever the calculation occurs, one of the snapshot values will be from sometime in the past. The actual value at the evaluation time is not known until the tag receives its next snapshot.


      No-one has complained about this until now, but our current customer is trying to replace a cumbersome spreadsheet-based system that is actually more accurate. They have compression turned off (I know!), so every snapshot gets stored, and when they run a calculation no stale snapshots are used because the offending input values are all interpolated. Their spreadsheet only ever looks at historical data.


      I could see a mechanism whereby this could be fixed, but I may be being naive. If any input tag's snapshot is stale at the time of calculation, the engine could revisit the calculation once all the tags have received new snapshots. It could then fetch more accurate interpolated values, re-run the calculation, and adjust the output.


      Does this make sense? Could it work? And is such a mechanism being considered?

        • Re: Calculation accuracy using snapshots
          Mike Zboray

          You may want to try adjusting the CalculationWaitTimeInSeconds parameter (LiveLibrary docs) to tweak how long the Analysis Service waits before running calculations on new snapshot data. The default is 5 seconds. Unfortunately, this is a global parameter and you can't tweak it for an individual analysis/template.

          • Re: Calculation accuracy using snapshots

            Hi Alistair,


            Let me make sure I get this right.  So you have two inputs; for example, one input on a 4-second scan class and the other on a 300-second scan class.  If you're using event-triggered scheduling for your analysis, you may trigger on a new value for input1 while input2 may be 1 to 299 seconds old, and since you don't have a later value for input2 to interpolate against, the analysis will use input2's snapshot value.  If I have described your use case correctly, there is currently no way around that.  In the same scenario, your analysis could trigger on a new value for input2 while input1 may be 1-3 seconds old.  Either way you're using snapshot values, not interpolated values, for all your inputs.  Unless the scan classes match for all your inputs, it's impossible to have fresh values for every input.


            Options: you can always run in backfill mode.  That is a manual process, but it mimics your client's spreadsheet-based solution, as both methods act on historical data.  You can also ask your client to match the scan classes.


            BTW, the fact that your client doesn't use compression really doesn't factor into this discussion; what matters is that your client's spreadsheet only works on historical data and is thus able to interpolate between two values.



            Steve Kwan

              • Re: Calculation accuracy using snapshots

                Hi Steve,


                This gives me some solid context and proof about the same issue we are facing! I can show this to my team to explain the situation we are in. Thanks!


                I have posted the link about the issue we are facing in one of my replies here:


                PI Analysis Incorrect Results due to Delay


                We are in a big dilemma as to how to fix this because, in our case, it's theoretical gas calculations for upstream wells (mostly Daily Gas/Condensate Volumes) that are being impacted by the delay. All results are coming up as incorrect.


                Could you please help with the best workaround for this?

                  • Re: Calculation accuracy using snapshots
                    Rhys Kirk

                    Nikhil Kaul, for your gas calculations you can use AF 2.7 to introduce a per-analysis wait time rather than setting something global. You can have a 1-hour delay for those periodic calculations. In your case you need 07:00 calculations, but some of your inputs may be stale at 07:00. To allow a 1-hour late arrival of data (an arbitrary time window), you could make your calculation schedule periodic at 08:00, alter your expression to use "TagVal(...., 't+7h')", and change the output timestamp to t+7h. The "delay" is then introduced only for this analysis and not globally. This is one of the reasons we harassed Stephen Kwan for the feature to alter output timestamps.
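                    As a sketch of what this could look like (the attribute names and the final sum are hypothetical; your real analysis would use its own inputs and expression), with the analysis clock-scheduled daily at 08:00:

```
// Variable: GasVol -- interpolated value at 07:00 today ('t' = midnight today)
TagVal('GasRate', 't+7h')

// Variable: CondVol
TagVal('CondRate', 't+7h')

// Variable: Result -- mapped to the output attribute,
// with the output timestamp (Advanced settings) set to t+7h
GasVol + CondVol
```

                    By 08:00 every input has data past 07:00, so TagVal returns properly interpolated values, and the result is written back at the 07:00 timestamp the business expects.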


                    It gets a little tricky for an event triggered input, but you can use an intermediate Analysis to calculate when the main Analysis should run...hint, you can use "NoOutput()" to influence the main Analysis from the intermediate Analysis, but this really does depend on individual circumstances for a calculation.

                • Re: Calculation accuracy using snapshots

                  Hi Steve, yes you have the scenario essentially correct except that most of the tags are on a 1-minute scan class, just at different offsets, so matching the scan classes is a possibility, as might be Mike's suggestion of adjusting the CalculationWaitTimeInSeconds. I am tempted to suggest changing this to 65 seconds. They are generally looking at historical rather than live data so waiting a minute for a calculation result should not be an issue.


                  Are there any other implications for adjusting the CalculationWaitTimeInSeconds to 65, say for cascading calculations or for performance?


                  My comment about compression was just referring to the fact that analyses will tend to lose a little accuracy when using historical rather than snapshot values, due to the 'lossy' nature of the compression algorithm.

                  • Re: Calculation accuracy using snapshots
                    Rhys Kirk

                    I think that from your description you could in theory replicate this in Abacus 2.7 onwards.

                    If your slowest inputs are on a 400-second scan class then you could change your input variables to retrieve historical data from 400+ seconds ago, then shift the output timestamp of the analysis to 400+ seconds ago. In effect you are running your analysis constantly in the past to ensure all data is present, as per your client's spreadsheet. You would probably need to be clock-scheduled rather than event-triggered.
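                    A hedged sketch of this "run in the past" pattern (attribute names and the 500-second lag are illustrative assumptions), with the analysis clock-scheduled and every input read far enough back that at least one later snapshot exists for interpolation:

```
// Variable: In1 -- interpolated value from 500 seconds ago
TagVal('Input1', '*-500s')

// Variable: In2
TagVal('Input2', '*-500s')

// Variable: Result -- mapped to the output attribute,
// with the output timestamp set to '*-500s' (AF 2.7+ advanced output timestamp)
In1 + In2
```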

                    • Re: Calculation accuracy using snapshots

                      You could get analyses to check how old the timestamp is before running the calculation, i.e. build a filter on top of something like this:


                      Int('*') - Int(PrevEvent('Tag', '*+1m'))
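                      For instance (the tag name and the 60-second threshold are illustrative assumptions), the age check could be split into two analysis variables, with the second acting as the guard:

```
// Variable: StaleSecs -- seconds since the last event on 'Tag'
Int('*') - Int(PrevEvent('Tag', '*+1m'))

// Variable: Result (mapped to the output attribute) -- skip when stale
If StaleSecs > 60 Then NoOutput() Else TagVal('Tag', '*')
```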


                      • Re: Calculation accuracy using snapshots

                        This is an interesting discussion. The way I look at it, there are two fundamentally different ways to think about event triggering in the context of a real-time calculation engine:

                        1. Perform the calculation as soon as a new event for any of the triggering inputs is received (i.e. use whatever latest data is available for other inputs), or
                        2. Perform the calculation for a trigger event, say with time stamp T1, only after an event with time stamp T>T1 has been received for each triggering input (including the input for which we received T1 to begin with).


                        Option (1) sacrifices accuracy in favor of timeliness (or low latency), while (2) guarantees accuracy at the expense of high latency, and could mean that your calculations could be significantly delayed. Both approaches have valid use cases, and it really depends on what you are trying to do.


                        We take a hybrid approach for Asset analytics (and also for PI ACE for that matter), which probably is closer to (1), where we try to evaluate as soon as a new trigger event is received, while allowing the user to specify a wait time (CalculationWaitTimeInSeconds) depending on their data pattern. So as Mike Zboray suggested, setting an appropriate wait time should work in this case.


                        A few things you need to be careful about:

                        • This is a global parameter, and would introduce latency to all other calculations.
                        • You would need to make sure that this does not cause any skipping, as by default the calculation engine would only queue 50 events (EvaluationsToQueueBeforeSkipping) when load shedding is enabled (IsLoadSheddingEnabled).   
                        • This would slightly increase the memory footprint, as you are holding on to more things - but should not be that significant.




                        • Re: Calculation accuracy using snapshots

                          Hi Alistair,


                          Though this is a month later, I think I am going through the same trouble:


                          PI Analysis Incorrect Results due to Delay


                          I haven't read through this thread entirely, but from what I have read it looks like the wait time is the only setting we can manipulate. As that is a global setting, though, it might lead to skips for some other calculations. So what is the correct way out?