
Asset Analytics Best Practices - Part 3: Input Attributes

Blog Post created by sraposo on Apr 22, 2019

If you're looking for Part 1: Use Variables, it's right here.

If you're looking for Part 2: Data Density and Data Pattern, it's right here.

 

Asset Analytics Best Practices Blog Posts:

Over the upcoming months I will be publishing several blog posts on different Asset Analytics best practices.

 

The goal of these blog posts is to help prevent customers from falling into common pitfalls that lead to poor performance of the PI Analysis Service. This will be done by:

  1. Increasing the visibility of the PI Analysis Service Best Practices knowledge base article.
  2. Providing concrete examples that show why it's important to follow our best practices.

 

To show the performance of various setups, I will be using the logs of the PI Analysis Processor (pianalysisprocessor.exe) with the Performance Evaluation logger set to Trace. If you are not familiar with the logs, loggers and logging levels, more information can be found in our documentation here. Alternatively, there is also a full troubleshooting walkthrough video on YouTube here. The video shows a troubleshooting approach using the PI Analysis Processor logs; it does not, however, go over best practices.

 

Asset Analytics Best Practices - Part 3: Input Attributes

 

There are two examples in this blog post. I know it's a long one! The second example doesn't build on top of the first.

 

Example 1, Table Lookup:

 

Suppose I need to set up an analysis in AF to calculate the volume of a certain product in a tank. Suppose that the tank manufacturer provided me with a table that lists all the tank models with their width and length. For the purpose of this blog post, there are a little over 500k tank models. I know I should have picked a different asset to make this a little more realistic; I only thought of it afterwards!

 

The table in SQL with the tank information looks like this:

 

 

There are exactly 1 124 335 rows in this table.
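(The screenshot of the table isn't reproduced here. Conceptually, each row holds one tank model and its dimensions, along the lines of the sketch below; the column names and values are made up for illustration and are reused in the query sketches later in this post.)

Manufacturer    Model        Width    Length
AcmeTanks       TK-000001    2.5      6.0
AcmeTanks       TK-000002    3.0      8.0
...             ...          ...      ...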

 

The table in AF is configured as a linked table with a cache interval of 1 minute. The query for the table is simply Select * from TableName

 

There is a PI Point in the PI Data Archive to which the current level of the product is written in real time.

 

The analysis template on the tank element template is:

 

 

The attribute templates are configured as follows:

 

 

I have to deploy 1 000 elements based on this element template. After deploying one, I want to double-check the performance of the analysis, as I noticed that PI System Explorer (PSE) is extremely slow. It takes about 30s to 1min to load the values of all the attributes on the element:

 

 

I also noticed that visualizing the table in PSE is extremely slow; it also takes about 30s to 1min to view the table.

 

To check the performance, I note down the path of one of my analyses by right-clicking it, selecting "Copy" and pasting the path into a text editor. I then enable Trace on the Performance Evaluation logger. If you're not familiar with the procedure, please watch the YouTube video here. In the case of the performance evaluation logs, you need to escape the backslashes in the analysis path with an extra "\" character.
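For example, a path copied from PSE such as

\\AFServer\AFDatabase\TableLookupExample\Tank1\Analyses[VolumeCalc]

needs its backslashes doubled when used to filter the performance evaluation logs:

\\\\AFServer\\AFDatabase\\TableLookupExample\\Tank1\\Analyses[VolumeCalc]

This matches the AnalysisName format you will see in the log entries below.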

 

Here are two consecutive evaluations; the first one is from immediately after the service was started:

 

2019-04-16 11:01:31.5395|TRACE|ANPerformanceTrace:PerformanceEvaluation|Type: AnalysisEvaluated, Data: {
"AnalysisName": "\\\\AFServer\\AFDatabase\\TableLookupExample\\Tank1\\Analyses[VolumeCalc]",
"AnalysisId": "2183a29b-5c97-11e9-914b-001ddab729f5",
"GroupId": "2a5d5f94-0a5e-4427-9ec6-4fb0e555bda0",
"TimeclassId": "5a7629bd-7f36-47e4-83bb-0d4262607cea",
"Status": "Success",
"ExeTime": "2019-04-16T14:59:18Z",
"Detail": "",
"ElapsedMilliSeconds": 98283.746500000008,
"LagMilliSeconds": 103396.47970000001
}

 

2019-04-16 11:01:31.5395|TRACE|ANPerformanceTrace:PerformanceEvaluation|Type: AnalysisEvaluated, Data: {
"AnalysisName": "\\\\AFServer\\AFDatabase\\TableLookupExample\\Tank1\\Analyses[VolumeCalc]",
"AnalysisId": "2183a29b-5c97-11e9-914b-001ddab729f5",
"GroupId": "2a5d5f94-0a5e-4427-9ec6-4fb0e555bda0",
"TimeclassId": "5a7629bd-7f36-47e4-83bb-0d4262607cea",
"Status": "Success",
"ExeTime": "2019-04-16T14:59:48Z",
"Detail": "",
"ElapsedMilliSeconds": 1.1489,
"LagMilliSeconds": 103404.0447
}

 

So the first time the analysis evaluated, it took a little over 98s; the second time, it took about 1ms. What a huge difference! I know that all analyses should ideally evaluate within 100ms, and the smaller the evaluation time (ElapsedMilliSeconds), the better. I need to figure out why the first evaluation took so long. Maybe the logs will give me an idea. Looking at the next evaluations:

 

2019-04-16 11:01:31.5395|TRACE|ANPerformanceTrace:PerformanceEvaluation|Type: AnalysisEvaluated, Data: {
"AnalysisName": "\\\\AFServer\\AFDatabase\\TableLookupExample\\Tank1\\Analyses[VolumeCalc]",
"AnalysisId": "2183a29b-5c97-11e9-914b-001ddab729f5",
"GroupId": "2a5d5f94-0a5e-4427-9ec6-4fb0e555bda0",
"TimeclassId": "5a7629bd-7f36-47e4-83bb-0d4262607cea",
"Status": "Success",
"ExeTime": "2019-04-16T15:00:18Z",
"Detail": "",
"ElapsedMilliSeconds": 3.182,
"LagMilliSeconds": 43471.3462
}

 

2019-04-16 11:01:31.5545|TRACE|ANPerformanceTrace:PerformanceEvaluation|Type: AnalysisEvaluated, Data: {
"AnalysisName": "\\\\AFServer\\AFDatabase\\TableLookupExample\\Tank1\\Analyses[VolumeCalc]",
"AnalysisId": "2183a29b-5c97-11e9-914b-001ddab729f5",
"GroupId": "2a5d5f94-0a5e-4427-9ec6-4fb0e555bda0",
"TimeclassId": "5a7629bd-7f36-47e4-83bb-0d4262607cea",
"Status": "Success",
"ExeTime": "2019-04-16T15:00:48Z",
"Detail": "",
"ElapsedMilliSeconds": 1.0629,
"LagMilliSeconds": 43473.3406
}

 

2019-04-16 11:04:03.4763|TRACE|ANPerformanceTrace:PerformanceEvaluation|Type: AnalysisEvaluated, Data: {
"AnalysisName": "\\\\AFServer\\AFDatabase\\TableLookupExample\\Tank1\\Analyses[VolumeCalc]",
"AnalysisId": "2183a29b-5c97-11e9-914b-001ddab729f5",
"GroupId": "2a5d5f94-0a5e-4427-9ec6-4fb0e555bda0",
"TimeclassId": "5a7629bd-7f36-47e4-83bb-0d4262607cea",
"Status": "Success",
"ExeTime": "2019-04-16T15:01:48Z",
"Detail": "",
"ElapsedMilliSeconds": 100274.82620000001,
"LagMilliSeconds": 105366.4189
}

 

The logs show the same pattern over and over. Taking into account that each long evaluation takes ~1 minute and 40s, the long evaluations appear to be about 1 minute apart. This coincides with the cache interval of 1 minute specified on the table.

 

Let's see how long the query takes in SQL. We first need to run the query from the table configuration, and then the query from the table lookup data reference.

 

 

So it takes 48s to run the query that retrieves the table. A similar query is then run against the copy of the table that lives in the memory of the PI Analysis Service (pianalysisprocessor.exe). Afterwards, the values are cached until the cache interval expires.
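To make the two steps concrete, here is roughly what the two queries look like, using the made-up column names from the table sketch above (the second query is the SQL-like query the table lookup data reference builds; its exact form depends on the attribute configuration):

-- Query 1: configured on the AF linked table and run against SQL Server;
-- it loads the full 1 124 335-row table into the PI Analysis Service's memory.
Select * from TableName

-- Query 2: run by the table lookup data reference against the in-memory copy
-- of the table, once per attribute lookup; roughly equivalent to:
Select Width from TableName where Model = 'TK-000001'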

 

OK, since this table is static, I could set the cache refresh interval to Manual and only encounter the poor performance once, when the service is started. Is this the best solution though? What's the point of loading data for 500k tank models when I only need data for 1 000 tanks? If only there were a way to fetch just the data I need...

 

I could write an AF SDK application to fetch the data from SQL Server and write it to an attribute with a data reference of <none>. This isn't a problem, since the value will never change once it is set, and I would definitely get really good performance by not using a data reference. But then, if I need to deploy more tanks in AF, I have to modify my code, recompile and rerun it (or, with a PowerShell script, modify the script and rerun it). If only there were a way to do this in AF...

 

Well there is! We can use parametrized queries. The documentation can be found here.

 

So I can reconfigure my table with the following query:

 

 

 

The query uses two parameters. I can also change the cache refresh interval from 1 minute to Manual.
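The screenshot of the reconfigured query isn't reproduced here, but a parametrized query with two parameters would look something like this (again with made-up table, column and parameter names):

Select Width, Length from TableName where Manufacturer = @Manufacturer and Model = @Model

The idea is that each element supplies its own parameter values (through the attribute configuration shown next), so only the rows relevant to that tank are fetched and cached, instead of all 1 124 335 rows.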

 

I need to change the configuration of the attribute templates to use the parameters:

 

 

Now if I look at the performance, it is much better.

Looking at the logs:

 

2019-04-16 13:16:24.4727|TRACE|ANPerformanceTrace:PerformanceEvaluation|Type: AnalysisEvaluated, Data: {
"AnalysisName": "\\\\SRAPOSO-AF2105\\PISquare_Blog\\TableLookupExample\\Tank1\\Analyses[VolumeCalc]",
"AnalysisId": "2183a29b-5c97-11e9-914b-001ddab729f5",
"GroupId": "2a5d5f94-0a5e-4427-9ec6-4fb0e555bda0",
"TimeclassId": "6e242e58-8744-4857-9f3f-5c2dded8afca",
"Status": "Success",
"ExeTime": "2019-04-16T17:16:18Z",
"Detail": "",
"ElapsedMilliSeconds": 1010.8520000000001,
"LagMilliSeconds": 6192.359
}

 

2019-04-16 13:16:53.1701|TRACE|ANPerformanceTrace:PerformanceEvaluation|Type: AnalysisEvaluated, Data: {
"AnalysisName": "\\\\SRAPOSO-AF2105\\PISquare_Blog\\TableLookupExample\\Tank1\\Analyses[VolumeCalc]",
"AnalysisId": "2183a29b-5c97-11e9-914b-001ddab729f5",
"GroupId": "2a5d5f94-0a5e-4427-9ec6-4fb0e555bda0",
"TimeclassId": "6e242e58-8744-4857-9f3f-5c2dded8afca",
"Status": "Success",
"ExeTime": "2019-04-16T17:16:48Z",
"Detail": "",
"ElapsedMilliSeconds": 1.9079000000000002,
"LagMilliSeconds": 5106.0063
}

 

2019-04-16 13:17:23.3506|TRACE|ANPerformanceTrace:PerformanceEvaluation|Type: AnalysisEvaluated, Data: {
"AnalysisName": "\\\\SRAPOSO-AF2105\\PISquare_Blog\\TableLookupExample\\Tank1\\Analyses[VolumeCalc]",
"AnalysisId": "2183a29b-5c97-11e9-914b-001ddab729f5",
"GroupId": "2a5d5f94-0a5e-4427-9ec6-4fb0e555bda0",
"TimeclassId": "6e242e58-8744-4857-9f3f-5c2dded8afca",
"Status": "Success",
"ExeTime": "2019-04-16T17:17:18Z",
"Detail": "",
"ElapsedMilliSeconds": 0.5207,
"LagMilliSeconds": 5117.1971
}

 

2019-04-16 13:17:53.2791|TRACE|ANPerformanceTrace:PerformanceEvaluation|Type: AnalysisEvaluated, Data: {
"AnalysisName": "\\\\SRAPOSO-AF2105\\PISquare_Blog\\TableLookupExample\\Tank1\\Analyses[VolumeCalc]",
"AnalysisId": "2183a29b-5c97-11e9-914b-001ddab729f5",
"GroupId": "2a5d5f94-0a5e-4427-9ec6-4fb0e555bda0",
"TimeclassId": "6e242e58-8744-4857-9f3f-5c2dded8afca",
"Status": "Success",
"ExeTime": "2019-04-16T17:17:48Z",
"Detail": "",
"ElapsedMilliSeconds": 0.4928,
"LagMilliSeconds": 5106.5498000000007
}

 

2019-04-16 13:19:53.1951|TRACE|ANPerformanceTrace:PerformanceEvaluation|Type: AnalysisEvaluated, Data: {
"AnalysisName": "\\\\SRAPOSO-AF2105\\PISquare_Blog\\TableLookupExample\\Tank1\\Analyses[VolumeCalc]",
"AnalysisId": "2183a29b-5c97-11e9-914b-001ddab729f5",
"GroupId": "2a5d5f94-0a5e-4427-9ec6-4fb0e555bda0",
"TimeclassId": "6e242e58-8744-4857-9f3f-5c2dded8afca",
"Status": "Success",
"ExeTime": "2019-04-16T17:19:18Z",
"Detail": "",
"ElapsedMilliSeconds": 0.40800000000000003,
"LagMilliSeconds": 5115.1077000000005
}

 

Wow, the performance is so much better! The first evaluation, where the table is loaded, took ~1s compared to ~100s, an improvement of two orders of magnitude! Even the subsequent evaluations are faster.

 

Why? Instead of loading one gigantic table into the PI Analysis Service's memory and then performing multiple lookups against it, we are loading many tiny tables and performing each lookup against its own tiny table. We no longer load data we don't need into memory.

 

Key takeaways:

 

  1. Don't bring data you don't need into the PI System. In the case of table lookups, you can use parametrized queries to fetch only the data that will be used.
  2. One query is run to retrieve the table for the PI Analysis Service (or any AF client), and the table is then stored in the memory of the PI Analysis Service. A second query is run against the table that lives in memory. If either of these queries is expensive, the performance of the analysis will suffer.
  3. Be mindful of the cache refresh interval. Never refresh more than necessary.
  4. PSE is a good indicator of how well a table lookup is doing. Remember that at the beginning of the example I stated that PSE was slow at retrieving the values. If PSE is slow, Asset Analytics will most likely be slow as well.
  5. Not exactly related to this post, but I've seen this so many times that I will mention it here: never use Asset Analytics to bring operational time-series data into the PI Data Archive. We have the PI Interface for Relational Database (RDBMS) to do that.

 

Example 2, Formula and Analysis Data Reference:

 

If you haven't read this KB already, I would recommend reading it, as it explains quite well how the Analysis Data Reference (Analysis DR) works.

 

What's really important for the example below is that the Analysis DR, like any data reference, is evaluated client side. That means the PI Analysis Service needs to calculate the value returned by the Analysis DR in order to determine the input value to the analysis.

 

Let's consider the following analysis, named OffendingMinimumAverage. To check the performance, I will use the new UI features introduced in the 2018 version of Asset Analytics.

 

 

This analysis is periodic and runs once daily at midnight.

 

Here is the performance of this analysis:

 

(Max and Avg are the same as there is only 1 evaluation for 't' to '*' now that the analysis is periodic daily).

 

This is very, very bad performance. Analyses should evaluate within 100ms, and definitely never take over 1s. This analysis takes about 97s to evaluate on average. Since it runs once daily, we could maybe get away with the bad performance for one analysis. At scale, though, if we create more analyses from the same analysis template, this will affect the performance of the service as a whole.

 

Why is the performance so bad? A summary function over 1 week's worth of data shouldn't take that long, even if the input data is quite dense.

 

Since the configuration of the analysis doesn't provide any more insight into why the performance is so bad, let's look at the input.

 

Here are the attributes on this element:

 

 

The OffendingMinimum input to my analysis is actually an Analysis DR. I know that its value will need to be calculated before I can do the TagAvg('OffendingMinimum','*-7d','*').

 

Let's look at the configuration of this analysis data reference; the name of the analysis it is mapped to is FindOffendingMinimum:

 

 

This looks to be pretty expensive for an analysis data reference. For every event in any of the inputs (DataPoint_1, DataPoint_2 and DataPoint_3), we need to calculate all of these minimums and maximums, as well as check the value of another input called 'CheckRange'. This analysis should probably write to a PI Point rather than be configured as an Analysis DR.

 

Let's have a look at the performance using the UI again:

 

 

OK, so the performance of our TagAvg('OffendingMinimum','*-7d','*') can now be roughly explained.

 

On average, there is a value every 280.2s from the FindOffendingMinimum Analysis Data Reference (Analysis DR). This means that for 1 week we have about 2 159 evaluations: 60s * 60min * 24h * 7 days = 604 800s per week, and 604 800s / 280.2s ≈ 2 158.45.

 

The amount of time it takes to perform all of these evaluations is therefore 2 159 * 35.2ms = 75 996.8ms ≈ 76s.

 

So it takes more than a minute simply to get the data needed to calculate the TagAvg(). Clearly, if this data were readily available in a PI Point rather than being recalculated on every evaluation, the performance would be much better.

 

However, even if the above analysis DR were converted to an analysis that outputs to a PI Point, it would still have some performance issues, as it sometimes takes as much as 508ms to evaluate.

 

Now that it outputs to a PI Point, we can change the scheduling of this FindOffendingMinimum analysis to periodic (daily), which should considerably reduce the load. However, an evaluation time of 508ms is still quite high. Let's see if we can improve that.

 

We know that the DataPoint_* attributes are PI Point data references. For these, there is not much investigation to be done: the performance should be great as long as the PI Data Archive's performance is great.

 

The 'CheckRange' attribute used in the vCheck variable, though, is a Formula data reference. This is also evaluated client side. The equation is:

 

If A > 0 then 1 else 0, where A is another Formula attribute called 'Range'.

 

The equation of 'Range' is:

 

A - B, where A ('Maximum') and B ('Minimum') are analysis data references... we are getting pretty deep into a nested dependency chain here.

 

The configurations of Maximum and Minimum, respectively, are:

 

 

 

All of the inputs here are PI Points.

 

So every time 'OffendingMinimum' (level 1) is called, we need to call 'CheckRange' (level 2), which calls 'Range' (level 3), which calls 'Minimum' and 'Maximum' (level 4). Once 'Minimum' and 'Maximum' are calculated, we can calculate 'Range'; once that's done, 'CheckRange' can be calculated; afterwards, 'OffendingMinimum' can be calculated; and finally, once this has been done over the 7-day period, we can calculate TagAvg('OffendingMinimum','*-7d','*').
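Laid out as a dependency tree, the chain described above looks like this:

TagAvg('OffendingMinimum','*-7d','*')  (OffendingMinimumAverage analysis)
  - 'OffendingMinimum' (Analysis DR - FindOffendingMinimum)  -> level 1
      - 'DataPoint_1', 'DataPoint_2', 'DataPoint_3' (PI Points)
      - 'CheckRange' (Formula)  -> level 2
          - 'Range' (Formula)  -> level 3
              - 'Maximum' (Analysis DR)  -> level 4  -> PI Point inputs
              - 'Minimum' (Analysis DR)  -> level 4  -> PI Point inputs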

 

There are 4 levels of nested dependencies here. That's a lot, especially since some of the data references are expensive, notably the 'OffendingMinimum', 'Minimum' and 'Maximum' analysis DRs.

 

OK, so what's the solution here? We need to avoid:

 

  1. Expensive data references
  2. Too much nesting

 

OK, the formulas aren't super expensive by themselves, but the 'Maximum' and 'Minimum' analysis data references they depend on may be.

 

The 'OffendingMinimum' analysis data reference is without a doubt expensive and needs to be converted to write to a PI Point. I don't want to use too many PI Points, so we could try changing just this one. However, since there is a maximum evaluation time of 508ms, which is way above 100ms, I'll also reconfigure the 'Minimum' and 'Maximum' analysis data references to output to PI Points. The formulas can stay as they are, since they aren't expensive.

 

The performance of the FindOffendingMinimum analysis, after changing the 'Minimum' and 'Maximum' analyses to output to PI Points, is now:

 

 

(Max and Avg are the same as there is only 1 evaluation for 't' to '*' now that the analysis is periodic daily).

 

So the average and max evaluation times are well below 100ms; we are doing very well here.

 

What about the OffendingMinimumAverage analysis? Now that the FindOffendingMinimum analysis is outputting to a PI Point, its performance is:

 

5.9ms instead of 96.8s! That's a difference of 4 orders of magnitude.

 

I will leave it up to you to figure out why the OffendingMinimumAverage analysis is now outperforming the FindOffendingMinimum analysis!

 

Key takeaways:

  1. Make sure you understand the Analysis Data Reference before using it! Read this KB.
  2. Data references are evaluated client side. If a data reference is expensive, it will slow down analysis evaluation considerably. Expensive Analysis data references or expensive Formula data references should be converted to analyses that output to PI Points!
  3. Be mindful when you start nesting input dependencies. A formula A+B may not appear expensive, but if input A is itself another formula that does A+B, and that formula's input A is yet another formula... and so on... it can get expensive for AF clients to retrieve all of the nested dependencies and then perform the calculations while moving back up the chain.



 

 

 

BIG TAKEAWAY:

  • If an analysis appears to be lightweight but isn't performing well, look at each of its inputs. If an input isn't a PI Point or a <none> data reference, ask yourself: is this input expensive? Move down the dependency chain until you only find PI Point or <none> data references. If there are a lot of levels in the dependency chain, consider breaking it up.

 

 

I made an assumption while writing this post that the PI Data Archive is performing well. If the PI Data Archive isn't performing well, it goes without saying that Asset Analytics will not perform well either.
