We have a PI tag that records an emissions rate (values are roughly 1 minute apart). The 30-day average of the emissions rate has to stay under a limit, say "x". I am trying to build an AF analysis that estimates how long it will be until the 30-day average exceeds the limit "x", assuming that the current emissions rate continues perpetually. This would only be relevant if the current 30-day average is beneath the limit, but the current rate is above the limit, of course.

The math for this is rather simple. However, the analysis I've created (outlined below) takes far too long to evaluate over 30 days' worth of data. I fear that when I backfill this calculation, it will produce faulty results, or will cause problems when displaying or updating in real time.

When I tried this using only an hour of data, on the other hand, it evaluated almost instantaneously. This makes me believe the problem with my approach is how computationally expensive it is. As an amateur "programmer", I have little sense of how computationally expensive calculations are in general. I hope someone here has an idea of how to make this analysis less expensive, or whether building a custom data reference (CDR) instead is the better approach.

Here is the analysis:

EndTime := '*'

StartTime := EndTime - 3600*720 (3600 seconds in 1 hour, 720 hours in 30 days)

Limit := .12 (The "x" as outlined above)

CurValue := TagVal('Tag','*')

CompressedValues := RecordedValues('Tag',StartTime, EndTime)

TWACurValPerpetuity :=

MapData(CompressedValues, ((TimeStamp($val)-StartTime)*CurValue + (EndTime-TimeStamp($val))*TagAvg('Tag',TimeStamp($val),EndTime))/(EndTime-StartTime))

FilterOnlyExceedTimes := FilterData(TWACurValPerpetuity, $val>Limit)

TimeUntilExceed := TimeStamp(FilterOnlyExceedTimes[1])-StartTime

The variable "TWACurValPerpetuity" calculates, at each emissions-rate data point's timestamp, a projected 30-day average in which the oldest values in the window are replaced with the current value. The last two variables then identify the first timestamp at which this extrapolated, time-weighted average exceeds the Limit variable. Sorry if this is poorly described; I will try to elaborate if asked.
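To make the cost concrete, here is a plain-Python sketch of roughly what the MapData/FilterData pair computes. The names and the sample layout are illustrative, not AF syntax. Note that every candidate timestamp re-averages the remaining window, so the work grows roughly with the square of the sample count; at 1-minute resolution, 30 days is about 43,000 points.

```python
def first_exceed_seconds(samples, cur_value, limit, window_s=30 * 24 * 3600):
    """samples stands in for CompressedValues: (seconds_since_StartTime, value)
    pairs sorted by time over [0, window_s]. Returns the first offset t where
    the projected average exceeds `limit` when the slice [0, t] of the window
    is replaced by cur_value, or None if it never does."""
    for i, (t, _) in enumerate(samples):
        # Time-weighted average of the not-yet-replaced samples, each value
        # weighted by the interval until the next sample (mirrors TagAvg).
        tail = samples[i:]
        weighted = 0.0
        for j, (tj, vj) in enumerate(tail):
            t_next = tail[j + 1][0] if j + 1 < len(tail) else window_s
            weighted += vj * (t_next - tj)
        tail_avg = weighted / (window_s - t) if window_s > t else cur_value
        projected = (t * cur_value + (window_s - t) * tail_avg) / window_s
        if projected > limit:
            return t
    return None
```

The inner loop mirrors TagAvg being re-evaluated at every point, which is what makes the analysis expensive.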

I also tried replacing the TagAvg() piece inside "TWACurValPerpetuity" with a simple Avg() of the "CompressedValues" variable, thinking this might reduce the number of times the analysis had to pull the tag's archive values. Unfortunately, even with this adjustment, the analysis still had not finished evaluating after ~10 minutes.

My idea for the CDR approach would be to first pull the compressed tag values, then replace each value in the array one-by-one with the current value, until the average of the new array surpasses the limit.
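That loop can be made cheap by updating a running sum incrementally instead of re-averaging the whole array after each replacement. A hedged sketch of the idea in plain Python (illustrative names; an actual CDR would work against the AF SDK, and a simple average is only exact for evenly spaced samples):

```python
def replacements_until_exceed(values, cur_value, limit):
    """Number of oldest samples that must be replaced with cur_value
    before the simple average of the array exceeds `limit`; None if the
    average never exceeds it. O(n) via an incrementally updated sum."""
    total = sum(values)
    n = len(values)
    for i in range(n + 1):
        if total / n > limit:
            return i
        if i < n:
            total += cur_value - values[i]  # swap out the oldest remaining sample
    return None
```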

Any thoughts or advice would be greatly appreciated. Thanks.

Gregory,

Please keep a few general principles in mind as you develop this. Always be aware of data density and data rate when you design or build an analysis. In your case, you're recalculating a 30-day rolling average every time the analysis executes. Depending on your data density and rate, that can be a lot of individual values, and it happens on every single evaluation. In addition, the number of values needed for your 30-day rolling average almost certainly exceeds the size of the Data Cache, so every time the analysis executes, the software has to make a network round trip to the PI Data Archive to query and retrieve all those values. That is why it takes so long.

Secondly, a custom data reference and a scheduled analysis will show little difference in performance here, because your bottleneck is data density, data rate, and the network round trip. At the end of the day, the actual math will "cost" the same whether you use a CDR or a scheduled analysis. As a side note, a CDR will almost always perform worse than a scheduled analysis (calculations performed by the PI Analysis Service).

So let's look at how to improve your analysis. If you must have a 30-day running average in perpetuity, create a standalone analysis that does nothing but compute the 30-day running average on an event trigger and store the output to a separate PI Point. This analysis could be as simple as: TagAvg(attribute1, '*-30d', '*') => OutputPIPoint. Just let it run forever. It is fast because each new real-time execution works mostly with cached data, and, most importantly, you now have a dedicated output PI Point that stores the 30-day running average. Whenever you need that average, you make a single query to this output PI Point for a single value, which is extremely fast. By backfilling this analysis, you can also look back in history and find the last time the 30-day running average exceeded a certain value, simply by using the FindGT() function.
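As a side note on the math: once the current 30-day average is available as a single value A, the projection no longer needs MapData at all. If you approximate the not-yet-replaced portion of the window as still averaging A (the same simplification as using a plain Avg of the window) and assume the current rate r holds, then solving ((T - t)*A + t*r)/T = x for t gives t = T*(x - A)/(r - A). A minimal Python sketch, with illustrative names; this is an approximation, not the exact point-by-point answer the MapData version computes:

```python
def days_until_exceed(avg_30d, cur_rate, limit, window_days=30.0):
    """Days until the projected rolling average reaches `limit`, assuming
    cur_rate holds from now on and the rest of the window averages avg_30d.
    Returns None when the average can never exceed the limit this way."""
    if cur_rate <= limit:
        return None            # a steady rate at/below the limit never pushes the average over
    if avg_30d > limit:
        return 0.0             # already over the limit
    # Solve ((T - t) * A + t * r) / T = x  for t.
    return window_days * (limit - avg_30d) / (cur_rate - avg_30d)
```

With the 0.12 limit from the original analysis, an average of 0.10 and a current rate of 0.20 projects to roughly 6 days.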

For your use case, I don't think it's necessary to gather all of the past 30 days of data into a RecordedValues() array. That's a lot of unnecessary work and is less than optimal.

I hope this gets you going in the right direction. I'm happy to continue the conversation if you need more help. You're welcome to post a screenshot of your analysis configuration here if that helps.