
The following is from the lab notes for the hands-on lab "Exploring PI AF Analytics for Advanced Analysis and Prediction" at PI T&D + Power Generation Users Group Meeting, 2018.  

Lab VM is available via OSIsoft Learning

The Lab manual is attached; the manual is intended for an instructor led interactive workshop in a classroom setting.


In this lab, we explore several data access methods in the PI System for extracting contextualized datasets for data science projects. PI AF plays an important role in shaping the data. In each case, a simple statistical model is developed, then evaluated, tested and operationalized using PI AF.


The lab includes:
Example 1 – Single Asset Predictive Model using Python and the PI Integrator for Business Analytics
Example 2 – Multiple Asset Predictive Models using Python and PI SQL Client

The following is from the lab notes for the hands-on lab "Operational Forecasting" at OSIsoft Users Conference 2017, San Francisco, CA.  Lab VM is available via OSIsoft Learning

The Lab manual is attached; the manual is intended for an instructor led interactive workshop in a classroom setting.


The lab's objective is to step through an end-to-end data science/machine learning task: collect data, publish historical data, develop a predictive model and deploy the model in real time for wind turbine operations.

The predictive model forecasts power generation for each turbine in our fleet, as shown below.


Operational Forecasting - Wind Farm

Figure shows a graph of Active Power vs. Time - actual power in purple and forecasted power in yellow.

The predictive model is based on forecasted wind speed and air temperature.


The tools used are: 

  • PI Integrator -  publish historical turbine operations data to a SQL endpoint
  • Power BI and its built-in support for R scripts  -  data munging, data diagnostics and exploring the features
  • Azure ML - develop and deploy the model (as web services)
  • Windows script (or, alternatively, .NET C# code via the AF SDK) - read/write forecast data to PI
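As a rough illustration of the modelling step, the sketch below fits a straight line of active power against one engineered feature: wind speed cubed divided by absolute air temperature, which is proportional to the kinetic power in the wind at a fixed pressure. This is not the Azure ML model built in the lab. The function names, the sample readings and the choice of feature are assumptions made for illustration, and a straight line like this is only plausible below a turbine's rated power.

```python
from statistics import mean


def wind_feature(wind_speed_ms, air_temp_c):
    """Feature proportional to air density * v^3 (ideal gas, fixed pressure)."""
    return wind_speed_ms ** 3 / (air_temp_c + 273.15)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical historical rows: (wind speed m/s, air temp degC, active power kW)
history = [
    (4.0, 10.0, 110.0),
    (6.0, 12.0, 380.0),
    (8.0, 15.0, 890.0),
    (10.0, 8.0, 1790.0),
]
xs = [wind_feature(v, t) for v, t, _ in history]
ys = [p for _, _, p in history]
intercept, slope = fit_line(xs, ys)


def forecast_power(wind_speed_ms, air_temp_c):
    """Forecast active power (kW) from forecasted wind speed and air temperature."""
    return intercept + slope * wind_feature(wind_speed_ms, air_temp_c)
```

Calling `forecast_power` with each forecasted wind speed and air temperature gives the values that would then be written to PI as future data.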


Wind turbine Power vs Windspeed, also correlation plot

Figure shows a graph of Active Power vs. Wind Speed from operations data. 

For additional details, please see the Lab  Manual.

We use PI Integrator to publish the data in a row-column format for the next steps.


Feed Dryer PI Integrator output


We then use Power BI for descriptive analytics on this large dataset, which covers several months of minute-resolution data.


Feed Dryers - Power BI  screen


Next, R is used for more data munging and to extract the golden temperature profile.


Feed Dryer Golden Temperature Profile

We then validate the model to confirm that it can flag bad runs using shape metrics.
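To make the idea concrete, here is a minimal Python sketch of a golden profile and a shape check. The lab does this step in R and the specific shape metrics it uses are not listed here, so the pointwise median and the RMSE threshold below are stand-ins. The temperature values and the threshold are hypothetical.

```python
from math import sqrt
from statistics import median


def golden_profile(good_runs):
    """Pointwise median across runs sampled on the same time grid."""
    return [median(values) for values in zip(*good_runs)]


def rmse(profile, run):
    """Root-mean-square deviation of a run from the profile."""
    return sqrt(sum((p - r) ** 2 for p, r in zip(profile, run)) / len(profile))


def shape_ok(profile, run, max_rmse):
    """True when the run stays close enough to the golden profile."""
    return rmse(profile, run) <= max_rmse


# Hypothetical regeneration temperature profiles from known good runs
good_runs = [
    [100, 180, 260, 300, 290, 200],
    [102, 178, 262, 298, 292, 204],
    [98, 182, 258, 302, 288, 196],
]
golden = golden_profile(good_runs)

# Hypothetical run that heats too slowly and never reaches the peak
bad_run = [100, 140, 190, 230, 240, 210]
```

With a threshold of 10 degrees, `shape_ok(golden, good_runs[1], 10)` passes while `shape_ok(golden, bad_run, 10)` fails.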


Feed Dryer Shape is not OK


After it is validated, we deploy it for real-time operations by writing to a PI future tag.


Feed Dryer Operationalize expected temperature profile


During operation, deviation from the expected temperature profile is continuously evaluated and it triggers a Notification to take corrective action. 

Feed Dryer Notification

 Feed Dryer PI Vi



Go to Part 1

Go to Part 2

This Lab was part of PI World 2019 in San Francisco. The Lab manual used during the instructor led interactive workshop is attached.  Lab VM is available via OSIsoft Learning 


In previous years, we have explored the use of advanced analytics and machine learning for:

  • Anomaly detection in an HVAC air-handler - more
  • RUL (remaining useful life) prediction based on engine operations and failure data - more
  • Golden-run identification for the temperature profile from a feed dryer (silica gel/molecular sieve) in an oil refinery - more


Additionally, as part of the above labs, we have used analytical methods such as PCA (principal components), SVM (support vector), shape similarity measures etc. And, in other similar labs, we have covered well-known algorithms for regression, classification etc. and reviewed the use of Azure Machine Learning - more - and open source platforms such as R and Python.


In this year’s lab, we explore the use of historical process data to predict quality and yield for a product (yeast) in batch manufacturing. We use multivariate PCA modeling to walk through the diagnostics for monitoring the 14-hour evolution of each batch, and to alert you when a batch may go “bad” as critical operating parameters violate “golden batch” criteria (high pH, low molasses etc.). We then use PLS (projection to latent structures) to predict product quality and yield at batch completion.
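Once a PCA model is trained, scoring a new observation is a weighted sum of standardized inputs, which is why it can be written as a plain expression for deployment. The sketch below shows that calculation for one principal component. The variable list, means, scales and loadings are hypothetical and are not taken from the lab's model.

```python
def pca_score(observation, means, scales, loadings):
    """Project one observation onto one principal component.

    Each input is centered and scaled with the training statistics,
    then multiplied by that variable's loading and summed.
    """
    return sum(
        weight * (value - center) / scale
        for value, center, scale, weight in zip(observation, means, scales, loadings)
    )


# Hypothetical training statistics for three variables: pH, molasses flow, ethanol
means = [4.8, 120.0, 0.5]
scales = [0.2, 15.0, 0.1]
loadings_pc1 = [0.6, -0.48, 0.64]

# Score for a reading one standard deviation high on pH only
score = pca_score([5.0, 120.0, 0.5], means, scales, loadings_pc1)
```

An observation sitting exactly at the training means scores zero, and scores far from zero indicate a batch drifting away from typical behavior.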


The lab illustrates the end-to-end tasks in a typical data science project – from data preparation, conditioning, cleansing etc. to model development using training data, testing/validation using unseen data, and finally, deployment for production use with real-time data.


The techniques explored in the lab are not limited to batch manufacturing; they can be applied to several industries and to numerous processes that are multivariate.


No coding or prior experience with open source R or Python is necessary but familiarity with the PI System is a prerequisite.


Who should attend? Power User and Intermediate

Duration: 3 hours


Problem statement

In this lab, we review Yeast manufacturing operations – specifically, the fermenter. A typical batch fermenter cultivation takes 13 to 14 hours. Raw material variability in molasses or operational issues related to other feeds such as air and ammonia can cause ethanol (a byproduct) to exceed limits or the pH in the fermenter tank to become too acidic resulting in “bad” batch runs.


We want to use historic operations data with known “good” runs as a basis for alerts when current production parameters deviate from “golden batch” conditions. We also want to predict quality parameters, referred to generically as QP1 and QP2, and the expected yield for each batch.


In the hands-on portion, you:

  • Review the AF model
  • Use PI Integrator to publish the process values and lab data for available batches - this is used for model development
  • Use R for model development - golden tunnel and control limits (Python script is also available)
  • Review model deployment
    • Model is deployed using PI asset analytics
    • Use PI Vision displays and PI Notifications to monitor a batch in real-time using the golden tunnel criteria
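The golden tunnel idea from the steps above can be sketched in a few lines of Python. The lab derives its tunnel and control limits from the multivariate model in R. The version below is a simpler single-variable stand-in that uses the mean plus or minus three standard deviations at each time step, and the pH readings are hypothetical.

```python
from statistics import mean, stdev


def golden_tunnel(good_batches, n_sigma=3.0):
    """Per-time-step (lower, center, upper) limits from known good batches."""
    tunnel = []
    for values in zip(*good_batches):
        center = mean(values)
        spread = n_sigma * stdev(values)
        tunnel.append((center - spread, center, center + spread))
    return tunnel


def violations(tunnel, batch):
    """Indices of the time steps where the batch leaves the tunnel."""
    return [
        index
        for index, (value, (lower, _, upper)) in enumerate(zip(batch, tunnel))
        if not lower <= value <= upper
    ]


# Hypothetical pH readings at four time steps for three good batches
good_batches = [
    [5.0, 4.8, 4.6, 4.4],
    [5.1, 4.9, 4.7, 4.5],
    [4.9, 4.7, 4.5, 4.3],
]
tunnel = golden_tunnel(good_batches)

# Hypothetical running batch whose pH spikes at the third time step
running_batch = [5.0, 4.9, 5.2, 4.4]
out_of_tunnel = violations(tunnel, running_batch)
```

In a deployment, the limits would be stored alongside the asset and a non-empty violation list would be the condition that raises a notification.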


Yeast AF Model


Yeast golden tunnel

Yeast PCA equation for Asset Analytics in AF

Yeast PI Vision golden tunnel

This Lab was part of PI World 2018 in San Francisco. The Lab manual used during the instructor led interactive workshop is attached.  Lab VM is available via OSIsoft Learning 


In a crude oil refinery, gasoline is produced in the stabilizer (distillation) column. Gasoline RVP (Reid vapor pressure) is one of the key measurements used to run and adjust the column operations. Refineries that do not have an on-line RVP analyzer have to use lab measurements, which are available only a few times (say, a couple of samples) in a 24-hour operation.


As such, column process values (pressure, temperature, flow etc.) and historical RVP lab measurements can be used via machine learning models to predict RVP more often (say, every 15 minutes or even more frequently) to guide the operator.


Stabilizer (distillation) column producing gasoline in an oil refinery

Figure: Stabilizer column


AF data model

Figure: Stabilizer column - AF data model


In the hands-on portion, you

  • Review the AF model
  • Use PI Integrator to prepare and publish historical data (to a SQL table) - this data is used for model development
  • Review the step-by-step machine learning model development process in Python/Jupyter
  • Deploy the model for real-time operations
    • Use PI Integrator to stream real-time stabilizer process data to Kafka. Then, using Python and a Kafka consumer, calculate the model-predicted RVP and write it back to PI via PI Web API
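The deployment step above can be sketched as a small message handler. The code below only covers the middle of the flow: it decodes one JSON record as it might arrive from the Kafka topic, applies a model, and builds the request that would write the prediction to PI. The linear model, the record field names, the base URL and the WebId are all hypothetical. The `streams/{webId}/value` path follows the PI Web API stream value endpoint as commonly documented and should be checked against your server. The Kafka consumer loop and the authenticated HTTP POST are left out.

```python
import json

# Hypothetical linear model: intercept plus one coefficient per process variable
MODEL = {
    "intercept": 2.0,
    "coefficients": {"top_temperature": 0.03, "column_pressure": 0.05},
}


def predict_rvp(process_values):
    """Apply the (hypothetical) linear model to one set of process values."""
    return MODEL["intercept"] + sum(
        coefficient * process_values[name]
        for name, coefficient in MODEL["coefficients"].items()
    )


def build_write_request(base_url, web_id, timestamp, value):
    """Return the URL and JSON body for writing one value to a PI point."""
    url = f"{base_url}/streams/{web_id}/value"
    body = json.dumps({"Timestamp": timestamp, "Value": value})
    return url, body


def handle_message(raw_message, base_url, web_id):
    """Turn one raw Kafka message (JSON bytes) into a write request."""
    record = json.loads(raw_message)
    rvp = round(predict_rvp(record), 2)
    return build_write_request(base_url, web_id, record["timestamp"], rvp)


# Hypothetical message, server address and WebId
message = (
    b'{"timestamp": "2018-04-01T00:15:00Z", '
    b'"top_temperature": 150.0, "column_pressure": 80.0}'
)
url, body = handle_message(message, "https://myserver/piwebapi", "HYPOTHETICAL_WEBID")
```

Keeping the handler free of network calls makes it easy to test against recorded messages before wiring it to the consumer.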


Stabilizer historical process data and lab RVP used for model development

Figure: Stabilizer column - historical process data and lab RVP measurements


RVP Jupyter Python kafka consumer

Figure: Python Jupyter notebook - shows Kafka consumer and WriteValuesToPI  snippet


The data flow sequence is as below: (to pause/play animation, save the GIF file to a local folder and open in Windows Media Player)



Gasoline RVP predicted values

Figure: Stabilizer column - historical lab RVP measurements overlaid with predicted RVP

Oil refinery process unit operation – Alkylation feed dryer (Exercise 1)

This exercise uses an oil refinery Alkylation feed dryer process to illustrate the layers of analytics - descriptive, diagnostic, predictive and prescriptive.

First, the descriptive and diagnostic portions are reviewed below.




The process consists of twin dryers – Dryer A and Dryer B - each with stacked beds of desiccant and molecular sieve to remove moisture from a hydrocarbon feed.  The dryers are cycled back and forth i.e. when one is removing moisture from the feed, the other is in a regeneration mode where the bed is heated to dry out the moisture from a previous run.


The modelling objective is to create a temperature profile representing proper regeneration of the dryer bed.  This profile is analyzed via AF Analytics and then a golden profile is extracted via R/MATLAB and subsequently operationalized again using AF Analytics, PI Notifications and PI Vision. 


The data used for this Exercise comes from an actual oil refinery and covers one year (2017) of data at six-minute intervals.


PI Vision displays below show the Dryers in Process (green) and Regeneration (red) states.




The descriptive analytics consists of calculations using sensor data for temperatures, flows, valve positions etc. to identify the dryer status i.e. Operations vs. Regeneration.




The process piping configuration (via valve open/close) and the measurement instruments generating the sensor data are such that you have to perform several calculations similar to those shown above to prepare the data for subsequent steps i.e. diagnostic, predictive and prescriptive.


Also, event frames are constructed to track the start and end of each regeneration cycle for Dryers A and B.
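In the lab, the event frames are generated by AF Analytics. The Python sketch below only illustrates the underlying logic of turning a status series into start and end times for each regeneration cycle. The status labels and the integer timestamps are hypothetical.

```python
def regeneration_cycles(samples):
    """Return (start, end) pairs for each completed regeneration cycle.

    samples: list of (timestamp, status) tuples in time order.
    A cycle that is still open at the end of the data is not returned.
    """
    cycles = []
    start = None
    for timestamp, status in samples:
        if status == "Regeneration" and start is None:
            start = timestamp  # cycle begins
        elif status != "Regeneration" and start is not None:
            cycles.append((start, timestamp))  # cycle ends
            start = None
    return cycles


# Hypothetical hourly status samples for one dryer
dryer_a = [
    (0, "Process"),
    (1, "Regeneration"),
    (2, "Regeneration"),
    (3, "Process"),
    (4, "Process"),
    (5, "Regeneration"),
]
cycles = regeneration_cycles(dryer_a)
```

Here the first cycle runs from hour 1 to hour 3, and the cycle that starts at hour 5 is left out because it has not ended yet.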



More calculations with the flow sensor data are done for Dryer processing age, defined as:

Lifetime volume of feed dried by a bed (bbls) / Molecular sieve load in dryer (lbs)


Since the feed flow rate varies, additional analysis is done to calculate the volume (bbls) of feed processed before each regeneration cycle.
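Because the flow rate is not constant, the volume has to be integrated over time rather than computed as rate times duration. A minimal sketch using the trapezoidal rule follows. The units (hours and bbl/h) and the sample values are assumptions, and the lab performs this calculation in AF Analytics rather than Python.

```python
def volume_processed(samples):
    """Integrate flow over time with the trapezoidal rule.

    samples: list of (hours since cycle start, flow in bbl/h) in time order.
    Returns the volume in bbl.
    """
    return sum(
        (t1 - t0) * (f0 + f1) / 2.0
        for (t0, f0), (t1, f1) in zip(samples, samples[1:])
    )


# Hypothetical flow readings between two regeneration cycles
flow_samples = [(0.0, 100.0), (1.0, 100.0), (3.0, 200.0)]
volume_bbl = volume_processed(flow_samples)
```

The first hour contributes 100 bbl and the next two hours, with flow ramping from 100 to 200 bbl/h, contribute 300 bbl.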






Event frames with the requisite data for additional diagnostics are exported using PI Integrator for Business Analytics.




Fit for Purpose - Layers of Analytics using the PI System


Continue reading

The following is from the lab notes for the hands-on lab "Fit for Purpose - Layers of Analytics using the PI System: AF, MATLAB, and Machine Learning" at PI World 2018, San Francisco, CA

Lab VM is available via OSIsoft Learning 

Lab Manual is attached - scroll to the end of this page. 


For an updated version of this Lab using open source tools such as R/Python see Operationalizing analytics - simple and advanced (data-science) models with the PI System  

Part 1 Introduction

Part 2 Alky feed dryer – process analytics - descriptive and diagnostic (Exercise 1)

Part 3 Alky feed dryer – process analytics - diagnostic/predictive/prescriptive (Exercise 1 continued)

Part 4 Motor/Pump – maintenance analytics – usage based, condition based and predictive (Exercise 2)


Layers of analytics can be viewed through many lenses.  Frequently, it refers to the levels of complexity and the kinds of computations required to transform “raw data” to “actionable information/insight.”  It is often categorized into:

  • descriptive analytics - what happened
  • diagnostic analytics - why did it happen
  • predictive analytics - what can/will happen
  • prescriptive analytics - what should I do, i.e. prescribing a course of action based on an understanding of historical data (what happened and why) and future events (what might happen)

The purpose of the analytics i.e. whether it is for descriptive or diagnostic or predictive or prescriptive will influence the “raw data” calculations and transforms.  The following graph shows “value vs. difficulty” as you traverse the layers.



Layers of analytics can also be viewed through a “scope of a business initiative” lens – for example, in asset maintenance and reliability, the layers are:

  • UbM – Usage-based Maintenance  -  AF
  • CbM – Condition-based Maintenance -  AF
  • PdM – Predictive Maintenance - AF plus third party libraries




Layers of analytics can also be categorized by where the analytics is done, such as:

  • Edge analytics
  • Server based analytics
  • Cloud-based analytics


Edge analytics are those done immediately on the collected data. They lessen network load by reducing the amount of data forwarded to a server, for example by applying a Fast Fourier Transform (FFT) to vibration time waveforms to extract frequency spectra. They also apply when an action must be taken immediately based on the collected data, without waiting for a round trip to a remote analytics server.
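The data reduction is easy to see in a small example: a window of 200 waveform samples collapses to one dominant frequency and its amplitude. The sketch below uses a direct discrete Fourier transform from the standard library to stay self-contained. A real edge device would use an FFT routine, which gives the same spectrum faster. The sample rate and the synthetic waveform are hypothetical.

```python
import cmath
import math


def amplitude_spectrum(samples, sample_rate_hz):
    """Single-sided amplitude spectrum via a direct DFT.

    Returns (frequency in Hz, amplitude) pairs, skipping the DC bin.
    This is O(n^2), which is acceptable only for short windows.
    """
    n = len(samples)
    spectrum = []
    for k in range(1, n // 2):
        coefficient = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)
        )
        spectrum.append((k * sample_rate_hz / n, 2.0 * abs(coefficient) / n))
    return spectrum


def dominant_frequency(samples, sample_rate_hz):
    """Return the (frequency, amplitude) pair with the largest amplitude."""
    return max(amplitude_spectrum(samples, sample_rate_hz), key=lambda pair: pair[1])


# Hypothetical waveform: a 50 Hz tone of amplitude 2.0 plus a weaker 120 Hz tone
rate_hz = 1000
waveform = [
    2.0 * math.sin(2 * math.pi * 50 * i / rate_hz)
    + 0.5 * math.sin(2 * math.pi * 120 * i / rate_hz)
    for i in range(200)
]
peak_hz, peak_amplitude = dominant_frequency(waveform, rate_hz)
```

Only `peak_hz` and `peak_amplitude` (or a handful of spectrum bins) would need to be forwarded to the server instead of the raw waveform.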


In the hands-on portion of this Lab:

  • Exercise 1 uses an oil refinery process unit operation (Alky feed dryer) to walk-through the layers i.e. descriptive, diagnostic, predictive and prescriptive
  • Exercise 2 uses a maintenance/reliability scenario (pump/motor assembly) to illustrate the layers i.e. UbM, CbM, and PdM


Items not included in the detailed hands-on portion will be covered as discussion topics during the Lab.


Continue reading:

Part 2 Alky feed dryer – process analytics - descriptive and diagnostic

On Dec 4th, 2019, we had a one-day "Data & Analytics to Support Knowledge Management in Life Sciences" event at the MIT Samberg Center in Cambridge, MA.  


The presentations were not recorded, but the links to the slides are below.

If you have questions, please ask in the Comments section below. 






