
Asset Analytics Best Practices - Part 4: Analyses in Warning or Error

Blog Post created by sraposo on May 29, 2019

If you're looking for Part 1: Use Variables, it's right here.

If you're looking for Part 2: Data Density and Data Pattern, it's right here.

If you're looking for Part 3: Input Attributes, it's right here.

 

Asset Analytics Best Practices Blog Posts:

In the upcoming months I will be publishing several blog posts on different Asset Analytics best practices.

 

The goal of these blog posts is to help prevent customers from falling into common pitfalls that lead to poor performance of the PI Analysis Service. This will be done by:

  1. Increasing the visibility of the PI Analysis Service Best Practices knowledge base article.
  2. Providing concrete examples that show why it's important to follow our best practices.

 

To show the performance of various setups, I will be using the logs of the PI Analysis Processor (pianalysisprocessor.exe) with the Performance Evaluation logger set to Trace. If you are not familiar with the logs, loggers and logging levels, more information can be found in our documentation here. Alternatively, there is also a full troubleshooting walkthrough video on YouTube here. The video shows a troubleshooting approach using the PI Analysis Processor logs; it does not, however, go over best practices.

 

Asset Analytics Best Practices - Part 4: Analyses in Warning or Error

 

For this post, I'll be using the Maximum latency performance counter to show why analyses in error or in warning should be disabled until they are fixed. If you are not currently historizing the performance counters for the PI Analysis Service, I would strongly recommend that you do so. They are an extremely useful tool for both monitoring the health of the service and diagnosing issues. You can use the PI Interface for Performance Monitoring to historize the performance counter data in the PI Data Archive.
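
As a quick spot check between historized samples, you can also read a counter directly. Below is a minimal Python sketch using pywin32's win32pdh module. Note that the exact counter path (object, instance and counter names) is an assumption on my part; confirm the real names in Performance Monitor (perfmon) on your analysis server before using it.

    import time
    import win32pdh  # pip install pywin32

    # ASSUMPTION: this counter path is illustrative only. Browse the
    # PI Analysis Service counters in perfmon to get the exact names.
    COUNTER_PATH = r"\PI Analysis Service Statistics(_Total)\Maximum Latency"

    query = win32pdh.OpenQuery()
    counter = win32pdh.AddCounter(query, COUNTER_PATH)
    win32pdh.CollectQueryData(query)
    time.sleep(1)  # most counters need two samples to produce a value
    win32pdh.CollectQueryData(query)
    _, value = win32pdh.GetFormattedCounterValue(counter, win32pdh.PDH_FMT_DOUBLE)
    print("Maximum latency: %.1f s" % value)
    win32pdh.CloseQuery(query)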

 

Let's first look at the setup I'm using. There are currently 9999 analyses running in my system: 

 

 

All of these analyses are based on a single analysis template. The scheduling is periodic (10s), and the configuration is:

 

 

Both of these inputs have PI Point data references. 

 

On 2000 elements I intentionally changed the name of the PI Data Archive for 'Input1' to the name of a PI Data Archive that isn't working well (PI Snapshot and PI Archive Subsystems aren't running).

 

On 1000 elements I changed the name of the PI Point for 'Input1' to a PI Point that doesn't exist. 

 

Consequently, there are 3000 analyses in warning (2000 + 1000) and 6999 running well.

 

The two warnings are:

 

 

 

These are runtime warnings, so the PI Analysis Service will still attempt to evaluate these analyses and will output Calc Failed:

 

 

Currently, the maximum latency performance counter shows an average maximum latency of about 6.6s, which is 1.6s over the designed minimum latency of 5s.

 

 

This performance isn't bad, but let's see what happens if we disable all analyses in warning. In the Management tab in PI System Explorer, I can filter based on Service Status: Warning.

 

 

Select all analyses and disable them. 

 

 

 

We can immediately see a gain in performance. The maximum latency performance counter now shows an average of about 6.1s:

 

 

The vertical red line is the approximate time I disabled all of the analyses in warning. 

 

The performance improvement here is about 0.5s (from roughly 6.6s down to 6.1s). These analyses in warning didn't have a huge impact on performance, but keep in mind that:

 

  1. These analyses are extremely simple with only 2 inputs. More complex analyses and/or analyses with more inputs would have a bigger impact. 
  2. The effects on performance will vary depending on the error or warning. For example, input attributes with a table lookup data reference to a linked table for which the external system is offline would have a pretty significant impact on performance. I didn't want to do another table lookup example, so I chose a different issue that has less of an impact, but an impact nonetheless. 
  3. This example was done on a 2018 SP2 system. The effects on performance would be much worse on some of the older versions, trust me!

 

Key Takeaway:

 

Analyses in warning or error usually output Calc Failed. This result isn't useful in most cases; therefore, these analyses serve no purpose in their current state and should be disabled until they, and/or the systems on which they rely, are repaired.

 

Suggested Solution:

 

In this example, there were two issues:

  1. A PI Data Archive wasn't working well. 
  2. Some analyses only needed one input, as the other input doesn't exist for the asset even though it exists on the template.

 

For (1), we could leave the analyses enabled if the PI Data Archive will be fixed in the near future; otherwise, we can disable those analyses until it is fixed. In my case, I just had to restart those subsystems, as I had killed them from Task Manager. If you need help identifying an issue with the PI Data Archive, please reach out to Tech Support or ask questions on PI Square!

 

For (2), given that some assets based on the same template don't have the 'Input1' data point, we can exclude the attribute on those element instances and add a BadVal() check in the analysis configuration. In theory, we don't need to exclude it for this to work, but then there would still be some exception handling to be done, which causes performance overhead. Inputs in error aren't good for performance! Since the data point doesn't exist on some instances of the template, it makes more sense to exclude it. The PI Analysis Service is aware of the excluded property as of version 2018.

 

Excluding the attribute can be done in bulk using PI Builder:

 

 

 

 

To retrieve the values, since these attributes have a PI Point data reference, follow the procedure described here.

Sort on AttributeValue; the Excel sheet looks like this:
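
(The original screenshot isn't reproduced here. With illustrative element names, the sheet is laid out roughly as below, with the excluded flag still FALSE before the edit; your column set may vary depending on what PI Builder retrieves.)

    Parent         Name     ObjectType   AttributeValue        AttributeIsExcluded
    Element0001    Input1   Attribute    <bad: point missing>  FALSE
    Element0002    Input1   Attribute    <bad: point missing>  FALSE
    ...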

 

 

Change AttributeIsExcluded to TRUE and publish. 

 

All that is left to do is to add the BadVal() logic at the template level:
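
The screenshot of the template-level expression isn't reproduced here. For illustration only (the actual expression in the original post may differ), the guarded logic could look like:

    If BadVal('Input1') Then 'Input2' Else 'Input1' + 'Input2'

With this check, elements where 'Input1' is excluded still produce a result from 'Input2' alone instead of repeatedly evaluating a bad input.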

 

 

This is the Attributes tab on one element in PSE:

 

 

After fixing the PI Data Archive and adding the exclusion logic, here is the maximum latency with all 9999 analyses running. 

 

 

Some additional notes:

I've often had pushback from customers on disabling analyses in warning or error. Usually, the main point of concern is that some analyses are already disabled on purpose: how can one tell, after the fact, which ones were disabled as part of cleaning up the analyses in warning or error? In smaller systems, we can often use the analysis template filter in the management UI to figure out which ones we are disabling and should therefore be fixed and re-enabled in the future. In most systems, this is not possible. As of 2018 SP2, it's now possible to query analysis runtime information via the AF SDK. Consequently, we could log to a file (.txt, .csv, ...) all analyses in error and/or warning prior to disabling them, then use another AF SDK script to parse the file and re-enable those analyses once they have been fixed. A sketch of that idea follows.
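
Here is a rough outline in Python via pythonnet. Treat it as a sketch under assumptions, not production code: the PISystem.AnalysisService property, the QueryRuntimeInformation query/field syntax, and AFAnalysis.FindAnalysis should all be verified against your AF SDK version and Nitin's post linked below.

    import csv
    import clr

    clr.AddReference("OSIsoft.AFSDK")
    from OSIsoft.AF import PISystems, AFStatus
    from OSIsoft.AF.Analysis import AFAnalysis
    from System import Guid

    pi_system = PISystems().DefaultPISystem
    pi_system.Connect()
    service = pi_system.AnalysisService

    # ASSUMPTION: the "status:..." filter and the "id" field follow the
    # query syntax described in Nitin's post; verify for your AF SDK version.
    ids = []
    for status in ("warning", "error"):
        ids.extend(service.QueryRuntimeInformation[Guid]("status:" + status, "id"))

    # Log the IDs *before* disabling anything, so we can tell our
    # bulk-disabled analyses apart from ones already disabled on purpose.
    with open("analyses_to_fix.csv", "w", newline="") as f:
        writer = csv.writer(f)
        for analysis_id in ids:
            writer.writerow([str(analysis_id)])

    # Disable them; SetStatus persists the change to the AF server.
    for analysis_id in ids:
        analysis = AFAnalysis.FindAnalysis(pi_system, analysis_id)
        analysis.SetStatus(AFStatus.Disabled)

A second script can later read the CSV back and call SetStatus(AFStatus.Enabled) on each logged analysis once the underlying issues are fixed.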

 

For more information on how to query analysis runtime statistics, please read Nitin's great post here.

 

If you have any questions please post a comment and I will answer when I can! 
