All People > ernstamort > Holger Amort's Blog > 2016 > April

Although compression and exception settings are essential for data quality, the concept is difficult to understand and to apply. Set correctly, compression improves signal quality, especially for noisy signals. But finding the correct settings can be tedious, especially when it is done by manually tweaking tags.


There is a way to calculate the best fit for the compression, but this requires a lot of data and some assumptions about the true signal. A more hands-on approach is to define target sampling rates for tags. For example, a good target for slow-moving signals is about 10 sec/point, or 6 points/min:


pointDistance [sec.] = Abs(LastTime [sec] - FirstTime[sec])/NoPoints

writeSpeed [point/min.] = 60/pointDistance


Since both exception and compression reduce the data volume, most companies derive the exception deviation from the compression deviation using a ratio of 3 to 5. For example:


compression deviation = 3 * exception deviation


Since exception deviation degrades the signal quality of fast-moving signals, a larger compression-to-exception ratio (that is, a smaller exception deviation) is recommended for them. The cost function in C# to optimize the compression looks as follows:


    // Result of a single cost-function evaluation.
    public class OptimizationResult
    {
        public double PointDistance { get; }
        public double WriteSpeed { get; }
        public double Delta { get; }
        public int NoPoints { get; }
        public bool IsValid { get; }

        public OptimizationResult(double pointDistance, double writeSpeed, double delta, int noPoints, bool isValid)
        {
            PointDistance = pointDistance;
            WriteSpeed = writeSpeed;
            Delta = delta;
            NoPoints = noPoints;
            IsValid = isValid;
        }
    }


    // Requires: using System; using System.Collections.Generic; using System.Linq;
    public static OptimizationResult CompressionCostFunction(List<TimeValue> rawValues,
        double targetSpeed,
        double compression,
        double compressionExceptionRatio)
    {
        // The exception deviation is derived from the compression deviation.
        var exception = compression / compressionExceptionRatio;
        var exceptionAndCompression = new ExceptionAndCompression(exception, compression);
        // Run every raw value through the filter; rejected values come back as null.
        var compressedValues = rawValues
            .Select(timeValue => exceptionAndCompression.Calculate(timeValue))
            .Where(compValue => compValue != null)
            .ToList();

        // At least two points are needed to compute a time span.
        if (compressedValues.Count < 2) return new OptimizationResult(0, 0, 0, compressedValues.Count, false);

        // pointDistance = Abs(LastTime - FirstTime) / NoPoints
        var pointDistance = Math.Abs((compressedValues[0].TimeStamp - compressedValues[compressedValues.Count - 1].TimeStamp).TotalSeconds) /
                            compressedValues.Count;
        var writeSpeed = 60 / pointDistance;
        // Squared deviation from the target write speed.
        var delta = Math.Pow(targetSpeed - writeSpeed, 2);
        return new OptimizationResult(pointDistance, writeSpeed, delta, compressedValues.Count, true);
    }
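The cost function relies on a TimeValue type and an ExceptionAndCompression class that are not shown in the post. To make the idea of "return the point or null" concrete, here is a minimal sketch with a plain deadband filter. This is an assumption made for illustration only: DeadbandFilter, DeadbandDemo and the TimeValue shape are hypothetical, and the deadband test is much simpler than the actual PI exception and swinging-door compression algorithms.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for the TimeValue type used in the post.
public class TimeValue
{
    public DateTime TimeStamp { get; }
    public double Value { get; }

    public TimeValue(DateTime timeStamp, double value)
    {
        TimeStamp = timeStamp;
        Value = value;
    }
}

// Simplified deadband filter (assumption): NOT the PI exception/compression algorithm.
public class DeadbandFilter
{
    private readonly double _deviation;
    private TimeValue _lastKept;

    public DeadbandFilter(double deviation)
    {
        _deviation = deviation;
    }

    // Returns the point if it moved more than the deviation away from the
    // last kept point, otherwise null.
    public TimeValue Calculate(TimeValue current)
    {
        if (_lastKept != null && Math.Abs(current.Value - _lastKept.Value) <= _deviation)
            return null;
        _lastKept = current;
        return current;
    }
}

public static class DeadbandDemo
{
    // Counts how many points of a sine wave sampled once per second
    // (range 0..100, period 600 s) survive the filter.
    public static int KeptPoints(double deviation, int samples = 1000)
    {
        var start = new DateTime(2016, 4, 1, 0, 0, 0, DateTimeKind.Utc);
        var filter = new DeadbandFilter(deviation);
        return Enumerable.Range(0, samples)
            .Select(i => new TimeValue(start.AddSeconds(i), 50 + 50 * Math.Sin(2 * Math.PI * i / 600.0)))
            .Select(tv => filter.Calculate(tv))
            .Count(tv => tv != null);
    }
}
```

A larger deviation keeps fewer points, which is the relationship the optimization exploits: DeadbandDemo.KeptPoints(1) returns more points than DeadbandDemo.KeptPoints(5).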


Solving the optimization only requires a simple one-dimensional search method such as Golden Section Search:


    // Requires: using System; using System.Collections.Generic;
    public static double GoldenSectionSearch(List<TimeValue> rawValues,
        Func<List<TimeValue>, double, double, double, OptimizationResult> func,
        double a,
        double b,
        double targetSpeed,
        double exceptionCompressionRatio,
        double tau = 1e-7, int maxiter = 1000)
    {
        OptimizationResult cOptimizationResult, dOptimizationResult;
        var gr = (Math.Sqrt(5) - 1) / 2; // inverse golden ratio, ~0.618

        var c = b - gr * (b - a);
        var d = a + gr * (b - a);
        var n = 0;
        while (true)
        {
            n = n + 1;
            // Evaluate the cost function at both interior probe points.
            cOptimizationResult = func(rawValues, targetSpeed, c, exceptionCompressionRatio);
            dOptimizationResult = func(rawValues, targetSpeed, d, exceptionCompressionRatio);

            if (Math.Abs(c - d) < tau || n > maxiter) break;
            // The cost function is not strictly monotonic. When both probes keep the same
            // number of points, the cost is treated as a plateau and the write speed
            // relative to the target decides the direction instead.
            if ((cOptimizationResult.NoPoints == dOptimizationResult.NoPoints && cOptimizationResult.WriteSpeed < targetSpeed)
                || (cOptimizationResult.NoPoints != dOptimizationResult.NoPoints && cOptimizationResult.Delta < dOptimizationResult.Delta))
            {
                // Continue in the lower interval [a, d].
                b = d;
                d = c;
                c = b - gr * (b - a);
            }
            else
            {
                // Continue in the upper interval [c, b].
                a = c;
                c = d;
                d = a + gr * (b - a);
            }
        }
        return (b + a) / 2;
    }
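For comparison, the following is a minimal sketch of a plain golden section search on an ordinary unimodal function, without the plateau handling used above. The GoldenSection class and its Minimize method are names chosen for this illustration and are not part of the post. It shows the interval update in isolation: each step keeps one of the two probe points and recomputes only the other.

```csharp
using System;

public static class GoldenSection
{
    // Minimizes a unimodal function f on the interval [a, b].
    public static double Minimize(Func<double, double> f, double a, double b,
                                  double tol = 1e-7, int maxIter = 1000)
    {
        var gr = (Math.Sqrt(5) - 1) / 2; // inverse golden ratio, ~0.618
        var c = b - gr * (b - a);
        var d = a + gr * (b - a);

        for (var n = 0; n < maxIter && Math.Abs(c - d) > tol; n++)
        {
            if (f(c) < f(d))
            {
                // Minimum lies in [a, d]; the old c becomes the new d.
                b = d;
                d = c;
                c = b - gr * (b - a);
            }
            else
            {
                // Minimum lies in [c, b]; the old d becomes the new c.
                a = c;
                c = d;
                d = a + gr * (b - a);
            }
        }
        return (a + b) / 2;
    }
}
```

For example, GoldenSection.Minimize(x => (x - 2) * (x - 2), 0, 10) converges to a value close to 2.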


Here are the results of optimizing the default SINUSOID tag, configured with a scan rate of 1 sec:



     Target Write Speed = 9 points/min

     Exception Compression Ratio = 4

     Lower Bound = 0

     Upper Bound = 10 (10% of range)

     Window Size = 1000 (1000 data points for calculation)



     No Points = 147 (after exception and compression)

     Point Distance = 6.77 seconds

     Write Speed = 8.86 points/min.
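As a quick consistency check, the reported values can be plugged back into the formulas for write speed and the cost. The squared deviation is not listed in the results; it is computed here from the rounded numbers and is therefore only approximate.

```latex
\begin{align*}
\text{writeSpeed} &= \frac{60}{\text{pointDistance}} = \frac{60}{6.77} \approx 8.86 \ \text{points/min} \\
\text{delta} &= (\text{targetSpeed} - \text{writeSpeed})^2 = (9 - 8.86)^2 \approx 0.02
\end{align*}
```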


Write speed, or the average scan rate, is a concept that is much easier to understand than exception and compression. Therefore, some companies choose an interface scan rate and then remove exception and compression. A much better approach is to optimize the compression settings and down-sample the raw signal.


PI Batch to EF Migration

Posted by ernstamort Apr 11, 2016

Most PI users have already transitioned to the AF system or have at least built a demo system that uses AF. Some companies use their own enterprise structures or existing templates that mirror ISA-95 compliant equipment models (PI and MES: Equipment Model).


For Event Frames (EF), many are still holding off on switching from PI Batch to EF, and for good reasons. Migrating historical batches is important in order to preserve data for any type of future modeling. Historical batch data contain a rich library of past processing conditions that can be used to extract univariate (e.g. Golden Batch) or multivariate (PCA or PLS) process models. But in order to process historical data, the system design needs to be completed on both the AF side and the EF side.


There is also a valid point to be made about keeping both the PI Batch and EF systems running in parallel for a while. Especially in risk-averse industries, there is a need to fully qualify the EF system and its client applications before making the change. In addition, there might be custom applications that write directly to the PI Batch database and cannot be migrated to EF.


To understand the migration process, let’s have a look at the different ways batches have been created in the legacy batch system:


PIBaGen:                A tag-based batch generator that can be configured in PI-SMT

Batch Interfaces:     EMDVB and others process text or database information and create batches

PI-SDK:                   The software development kit is used by 3rd party vendors to create batch context


OSIsoft has upgraded most of its interfaces to write to the EF database. History recovery is also available to reprocess old data and create Event Frames.


An alternative way is to programmatically migrate historical batches and perform a real time synchronization. The goal shouldn’t be to perform a 1:1 migration, but rather a migration that fully utilizes the new system while preserving the existing structure.


The first challenge is that the structures of the PI Module database and the AF database almost certainly don't match. Due to its performance limitations, the PI Module database was often used only to create very limited structures that allow base operations such as batch creation, element-relative displays, RtReports and others. In contrast, the AF database is used to create full enterprise models with context-specific analysis and is therefore far more complex. The following shows a made-up example:


Module Database:

AF Database:

Synchronizing unit batches requires linking the PI Unit in the Module database to an AF element. This is accomplished by creating the following attribute on the PI Unit Template:

Module Database Guid: Unique Id of the PI Unit in the Module Database

The unit reference on the unit batch can then be converted into an element reference on an event frame.


The EF templates also require some additional attributes for the sync process. The following shows the relationship between ISA-88, PI Batch and EF (see also PI and MES: Batch Model):



| ISA 88         | PI Batch      | EF Batch       |
|                | PI Batch      |                |
| Unit Procedure | PI Unit Batch | Unit Procedure |
|                | PI Sub1 Batch |                |
|                | PI Sub2 Batch |                |
|                | PI Sub3 Batch | Phase State    |
|                | PI Sub4 Batch | Phase Step     |


On each EF Batch template, the following attributes should be added to link the historical batches to the event frames. In this example the attributes are added to the base template:


Id:                               The batch, unit batch, or sub batch id

IsHistorical:                This flag is set if the template is created from a historical batch

PI Batch Guid:           The unique identifier for batch, unit batch or any sub batch level

PI Batch Sync State: An enumeration to indicate if the sync process was successful

PI Batch Sync Time:  The time stamp of the last modification
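To make the attribute list easier to reason about, it can be mirrored in a small data class in the sync code. The following is only a sketch: the class name, the property names and especially the enumeration values are assumptions, since the post states only that the sync state indicates whether the sync was successful.

```csharp
using System;

// Hypothetical enumeration values (assumption, not taken from the post).
public enum BatchSyncState
{
    Pending,
    Succeeded,
    Failed
}

// Mirrors the sync attributes added to the EF base template.
public class BatchSyncAttributes
{
    // The batch, unit batch, or sub batch id.
    public string Id { get; set; }

    // True if the event frame was created from a historical batch.
    public bool IsHistorical { get; set; }

    // Unique identifier of the batch, unit batch, or sub batch.
    public Guid PIBatchGuid { get; set; }

    // Outcome of the last sync attempt.
    public BatchSyncState PIBatchSyncState { get; set; }

    // Time stamp of the last modification.
    public DateTime PIBatchSyncTime { get; set; }
}
```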


With both the AF and EF templates in place, the historical and real-time synchronization is now straightforward. The PI-SDK can be used to poll the batches, which are mapped to an internal data model. Once processed, event frames are either created or modified based on differences in the queues.
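The create-or-modify decision can be sketched independently of the PI-SDK and AF SDK. The example below assumes that each polled batch and each existing event frame has already been reduced to a pair of PI Batch Guid and time stamp; BatchSyncPlanner, SyncAction and this reduced data model are hypothetical and serve only to illustrate the comparison step.

```csharp
using System;
using System.Collections.Generic;

public enum SyncAction
{
    Create,
    Update,
    Skip
}

public static class BatchSyncPlanner
{
    // batches:     PI Batch Guid -> last modification time polled from PI Batch.
    // eventFrames: PI Batch Guid -> "PI Batch Sync Time" stored on the event frame.
    public static Dictionary<Guid, SyncAction> Plan(
        IReadOnlyDictionary<Guid, DateTime> batches,
        IReadOnlyDictionary<Guid, DateTime> eventFrames)
    {
        var plan = new Dictionary<Guid, SyncAction>();
        foreach (var batch in batches)
        {
            if (!eventFrames.TryGetValue(batch.Key, out var syncTime))
                plan[batch.Key] = SyncAction.Create;   // no event frame yet
            else if (batch.Value > syncTime)
                plan[batch.Key] = SyncAction.Update;   // batch changed after the last sync
            else
                plan[batch.Key] = SyncAction.Skip;     // already up to date
        }
        return plan;
    }
}
```

Keying both sides by the PI Batch Guid is what makes the comparison cheap, which is why that attribute is stored on every event frame.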


[Figure: Batch Sync.png]

The following shows the result of the migration of simulated data:


PI Batch Database:

EF Database:

While historians and manufacturing execution systems (MES) have been around for a long time, there are still a lot of questions on how to best integrate both in order to maximize the benefit of each.


In the past, historians have been used to store process data and to compare current with past process conditions. In the last decade this has changed, and historians have now grown into powerful real-time calculation engines that allow context-specific analysis of a large number of assets (> 1 million). This allows real-time analysis of large and complex systems, such as wind farms, data centers or turbines.


In MES systems the historian is mostly used as a data source similar to SQL, OPC, LIMS, SCADA or others. This shallow integration only uses the data storage capabilities of the historian without benefitting from its real-time calculations, data conditioning and abstraction. The following flow chart shows both MES and historians in the ISA-95 functional hierarchy.



In general, there are two types of information flows in a manufacturing enterprise:


  •    Transactional/relational data
  •    Real-time data


Transactional data are found in order processing, resource management, quality, labor, maintenance etc., while real-time data mostly originate from the plant floor. Real-time data bubble up from the production level (Level 0, 1 and 2) to the site and enterprise level, changing their characteristics along the way:


  •    Level 0, 1, 2: High frequency data, milliseconds to seconds, source specific, noisy
  •    Level 3: Medium frequency, seconds to minutes, abstract, aggregates
  •    Level 4: Low frequency, minutes to hours, days or weeks, abstract, aggregates
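The step from high-frequency Level 0 to 2 data to medium-frequency Level 3 aggregates can be illustrated with a short sketch that reduces raw samples to one-minute minimum, maximum and average values. The Aggregation class and the tuple shapes are assumptions made for this example; a historian would normally compute such aggregates itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Aggregation
{
    // Groups samples into one-minute buckets and returns min, max and average per bucket.
    public static List<(DateTime Minute, double Min, double Max, double Avg)> PerMinute(
        IEnumerable<(DateTime Time, double Value)> samples)
    {
        return samples
            // Truncate each time stamp to the start of its minute.
            .GroupBy(s => new DateTime(s.Time.Year, s.Time.Month, s.Time.Day,
                                       s.Time.Hour, s.Time.Minute, 0, s.Time.Kind))
            .OrderBy(g => g.Key)
            .Select(g => (Minute: g.Key,
                          Min: g.Min(s => s.Value),
                          Max: g.Max(s => s.Value),
                          Avg: g.Average(s => s.Value)))
            .ToList();
    }
}
```

Two minutes of one-second data (120 samples) are reduced to two rows, one per minute.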


It is important to note that the main data transformation occurs at the historian level, where data are compressed, aggregated (min, max, total, sum, …) and, most importantly, abstracted. The abstraction is performed by mapping, for example, a controller tag TIC01234.PV to the temperature property of a reactor (e.g. Reactor\Temperature). A data scientist is then able to build a reactor model or predictive maintenance calculation based on the abstraction layer, instead of searching through a vast amount of uncategorized data.
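In its simplest form, the abstraction described above is a lookup from an asset-relative property path to the source tag behind it. The sketch below uses the example names from the text; the AbstractionLayer class and its ResolveTag method are hypothetical, and in a real system this mapping would live in the asset model rather than in a hard-coded dictionary.

```csharp
using System;
using System.Collections.Generic;

public static class AbstractionLayer
{
    // Hypothetical mapping from asset property paths to controller tags.
    private static readonly Dictionary<string, string> AttributeToTag =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            [@"Reactor\Temperature"] = "TIC01234.PV",
        };

    // Returns the source tag for an asset property, or null if it is not mapped.
    public static string ResolveTag(string attributePath)
    {
        return AttributeToTag.TryGetValue(attributePath, out var tag) ? tag : null;
    }
}
```

Consumers then ask for Reactor\Temperature and never need to know the controller tag name.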


For similar reasons the MES system should not consume raw production data. Its primary functions are order management, production performance calculations, forecasting, quality and resource planning etc. Performing real-time transformations of process data on the MES system itself will often lead to a loss in both performance and accuracy.

Therefore, interfacing the historian with the MES should be performed on the abstraction layer as the following diagram shows:




Successfully connecting the historian to the MES system requires both to adhere to common standards such as S95 for the equipment model and/or S88 for the batch model. The interface will replicate the data structure between the systems and validate the structural integrity.

The benefit of the above architecture is a deep integration of historian and MES that maximizes the utility of both systems. It separates the manufacturing data flows and creates common interfaces to exchange data and structures.




Historians play a central role in the manufacturing data flow of real-time information. The main historian operations, such as data compression, denoising, aggregation and abstraction, can benefit enterprise data analytics as well as MES operations. This requires a deep integration of the historian by abstracting the data layer using common standards such as S95 and S88 and interfacing with the analogous MES data structures. The result is an architecture that utilizes both systems to the full extent of their capabilities while acknowledging the differences in their data properties and requirements.
