Advanced Analytics and Machine Learning Use Cases with Industrial Sensor Data - Yeast - MSPC - Multivariate Statistical Process Control

Blog Post created by gopal on Jan 24, 2020

This lab was part of PI World 2019 in San Francisco. The lab manual used during the instructor-led interactive workshop is attached. The lab VM is available via OSIsoft Learning.


In previous years, we have explored the use of advanced analytics and machine learning for:

  • Anomaly detection in an HVAC air-handler - more
  • RUL (remaining useful life) prediction based on engine operations and failure data - more
  • Golden-run identification for the temperature profile from a feed dryer (silica gel/molecular sieve) in an oil refinery - more


Additionally, as part of the above labs, we have used analytical methods such as PCA (principal component analysis), SVM (support vector machines), and shape similarity measures. In other similar labs, we have covered well-known algorithms for regression and classification, and reviewed the use of Azure Machine Learning - more - and open source platforms such as R and Python.


In this year’s lab, we explore the use of historical process data to predict quality and yield for a product (Yeast) in batch manufacturing. We’ll use multivariate PCA modeling to walk through the diagnostics for monitoring the 14-hour evolution of each batch, and to alert you when a batch may go “bad” as critical operating parameters violate “golden batch” criteria (high pH, low molasses, etc.). We then use PLS (projection to latent structures) to predict product quality and yield at batch completion.
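
To make the PCA monitoring idea concrete, here is a minimal sketch in Python using only NumPy. It is not the lab's R script. It assumes each row is one observation of five correlated process signals (the data is synthetic and the column meanings are hypothetical), and it sets control limits as empirical quantiles of the good data rather than the theoretical limits a full MSPC package would typically compute. It also treats observations independently instead of unfolding each batch over its 14-hour evolution.

```python
import numpy as np

def fit_pca_monitor(X_good, n_components=2, quantile=0.99):
    """Fit a PCA monitoring model on rows sampled from known-good batches."""
    mean = X_good.mean(axis=0)
    std = X_good.std(axis=0, ddof=1)
    Z = (X_good - mean) / std                      # autoscale each variable
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    model = {
        "mean": mean,
        "std": std,
        "P": Vt[:n_components].T,                  # loadings, variables x components
        "score_var": s[:n_components] ** 2 / (len(Z) - 1),
    }
    t2, spe = score(model, X_good)
    # Empirical limits from the good data (a simplifying assumption).
    model["t2_limit"] = np.quantile(t2, quantile)
    model["spe_limit"] = np.quantile(spe, quantile)
    return model

def score(model, X):
    """Return Hotelling T^2 and SPE (squared prediction error) for each row."""
    Z = (X - model["mean"]) / model["std"]
    T = Z @ model["P"]                             # scores in the PCA plane
    t2 = np.sum(T ** 2 / model["score_var"], axis=1)
    residual = Z - T @ model["P"].T                # part the model cannot explain
    spe = np.sum(residual ** 2, axis=1)
    return t2, spe

def alarms(model, X):
    """Flag rows whose T^2 or SPE exceeds its control limit."""
    t2, spe = score(model, X)
    return (t2 > model["t2_limit"]) | (spe > model["spe_limit"])

# Synthetic stand-in for good-batch data: 5 correlated signals driven by
# 2 hidden factors. This is illustrative data, not the lab's dataset.
rng = np.random.default_rng(0)
factors = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 5))
X_good = factors @ mixing + 0.1 * rng.normal(size=(500, 5))

model = fit_pca_monitor(X_good)
good_alarm_rate = alarms(model, X_good).mean()

# One observation where the first signal jumps while the others stay at their means.
x_bad = X_good.mean(axis=0).copy()
x_bad[0] += 10 * X_good[:, 0].std(ddof=1)
bad_alarm = alarms(model, x_bad[None, :])[0]
print(f"alarm rate on good data: {good_alarm_rate:.3f}, bad sample flagged: {bad_alarm}")
```

T² measures how far an observation sits from the center within the PCA plane, while SPE measures how far it falls off that plane, that is, how badly it breaks the usual correlation between signals. Monitoring both catches more kinds of upset than either one alone.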


The lab illustrates the end-to-end tasks in a typical data science project – from data preparation, conditioning, cleansing etc. to model development using training data, testing/validation using unseen data, and finally, deployment for production use with real-time data.


The techniques explored in the lab are not limited to batch manufacturing; they can be applied to several industries and to numerous processes that are multivariate.


No coding or prior experience with open source R or Python is necessary, but familiarity with the PI System is a prerequisite.


Who should attend? Power User and Intermediate

Duration: 3 hours


Problem statement

In this lab, we review Yeast manufacturing operations – specifically, the fermenter. A typical batch fermenter cultivation takes 13 to 14 hours. Raw material variability in molasses, or operational issues related to other feeds such as air and ammonia, can cause ethanol (a byproduct) to exceed limits or the pH in the fermenter tank to become too acidic, resulting in “bad” batch runs.


We want to use historic operations data with known “good” runs as a basis for alerts when current production parameters deviate from “golden batch” conditions. We also want to predict quality parameters, referred to generically as QP1 and QP2, and the expected yield for each batch.
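
As a rough illustration of the prediction step, the sketch below fits a small PLS model for each response (QP1, QP2, yield) on synthetic data and checks it on held-out batches. It is an assumption-laden sketch, not the lab's model: the six per-batch features are hypothetical, the numbers are simulated, and the single-response PLS1 algorithm (NIPALS form) is used once per response for simplicity.

```python
import numpy as np

def fit_pls1(X, y, n_components=2):
    """Fit single-response PLS (PLS1, NIPALS form) and return a predictor."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)          # weight: X direction most covariant with y
        t = Xk @ w                      # latent score for each batch
        tt = t @ t
        p = Xk.T @ t / tt               # X loading
        c = yk @ t / tt                 # y loading
        Xk = Xk - np.outer(t, p)        # deflate X and y before the next component
        yk = yk - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # regression vector in original X space
    return lambda X_new: (X_new - x_mean) @ coef + y_mean

# Synthetic batches: 6 hypothetical per-batch features and 3 responses,
# all driven by 2 hidden factors plus noise.
rng = np.random.default_rng(1)
latent = rng.normal(size=(120, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.05 * rng.normal(size=(120, 6))
Y = latent @ np.array([[1.0, -0.5, 0.3], [0.4, 0.8, -1.0]])
Y = Y + 0.05 * rng.normal(size=(120, 3))
names = ["QP1", "QP2", "yield"]

train, test = slice(0, 90), slice(90, 120)   # hold out the last 30 batches
r2 = {}
for j, name in enumerate(names):
    predict = fit_pls1(X[train], Y[train, j])
    y_true = Y[test, j]
    resid = y_true - predict(X[test])
    r2[name] = 1 - np.sum(resid ** 2) / np.sum((y_true - y_true.mean()) ** 2)
print({k: round(float(v), 3) for k, v in r2.items()})
```

Scoring on batches the model never saw is what makes the R² meaningful. A model that only fits its own training batches would not be trustworthy for predicting quality at batch completion.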


In the hands-on portion, you:

  • Review the AF model
  • Use PI Integrator to publish the process values and lab data for available batches - this is used for model development
  • Use R for model development - golden tunnel and control limits (a Python script is also available)
  • Review model deployment
    • Model is deployed using PI Asset Analytics
    • Use PI Vision displays and PI Notifications to monitor a batch in real time using the golden tunnel criteria
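
The golden tunnel and control limit steps above can be sketched in a few lines of Python. This is a simplified, hypothetical version: it builds the tunnel for a single signal as the mean of the good batches plus or minus three standard deviations at each hour, whereas the lab's tunnel may be defined on PCA-derived quantities. The pH trajectory and all numbers are invented for illustration.

```python
import numpy as np

def golden_tunnel(good_batches, n_sigma=3.0):
    """Per-time-step lower/upper limits from good batches (rows = batches)."""
    center = good_batches.mean(axis=0)
    spread = good_batches.std(axis=0, ddof=1)
    return center - n_sigma * spread, center + n_sigma * spread

def violations(batch, lower, upper):
    """Indices of time steps where the batch leaves the tunnel."""
    return np.flatnonzero((batch < lower) | (batch > upper))

rng = np.random.default_rng(2)
hours = np.arange(14)                         # one sample per hour of a 14-hour batch
nominal = 5.0 - 0.05 * hours                  # hypothetical pH trajectory
good = nominal + 0.05 * rng.normal(size=(20, 14))   # 20 simulated good batches
lower, upper = golden_tunnel(good)

current = nominal.copy()
current[10:] += 0.5                           # simulated upset from hour 10 onward
bad_hours = violations(current, lower, upper)
print("hours outside the tunnel:", bad_hours.tolist())
```

In a deployed setting, the same comparison would run against live values as the batch progresses, with a notification raised the first time a value leaves the tunnel.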


Yeast AF Model


Yeast golden tunnel

Yeast PCA equation for Asset Analytics in AF

Yeast PI Vision golden tunnel