AF Analysis is a powerful tool that covers the majority of use cases. But there are also situations where more advanced solutions are required.

The following is a common request: Basic Linear Regression: Slope, Intercept, and R-squared

 

And although the math is not difficult and code snippets are available on the internet, I don't think it's a good idea to create modules just for that specific need. There will always be something missing.

     You start with linear regression,

     then you need something to clean up outliers and missing values,

     maybe you want to robustify your calculations,

     a non linear function might give you a better fit,

     one variable might not be enough to describe your data

     ...

Solutions for these problems are easy in MATLAB and R, since both already have plenty of libraries available. The following describes how to build an R custom data reference.

Prerequisites

 

You need Visual Studio and a 64-bit R installation: Microsoft R Open: The Enhanced R Distribution · MRAN

Set up a standard class library, similar to this post: Implementing the AF Data Pipe in a Custom Data Reference

In addition, you will need the following NuGet packages:
    https://www.nuget.org/packages/R.NET.Community/

    NuGet Gallery | Costura.Fody 1.3.3

 

There are alternatives for both NuGet packages, but I only tested this combination.

You also need a standard AF Client installation.

It's also useful to include a logger; I really like the following: Simple Log - CodeProject

 

In the project setup, set the bitness (platform target) of the library to x64.

 

To automate the testing I used the following: Developing the Wikipedia Data Reference - Part 2

which I changed to the x64 deployment.

 

Before you start, I would also run RSetReg.exe in the R home directory, which writes the R installation path to the Windows registry.

 

The Code

 

     We first need to set the config string:

        public override string ConfigString
        {
            get
            {
                return $"{AttributeName};" +
                       $"{WindowSizeInSeconds};" +
                       $"{NoOfSegments};" +
                       $"{RFunctionName}";
            }
            set
            {
                if (value != null)
                {
                    string[] configSplit = value.Split(';');
                    AttributeName = configSplit[0].Trim('\r', '\n');
                    WindowSizeInSeconds = Convert.ToDouble(configSplit[1]);
                    NoOfSegments = Convert.ToInt32(configSplit[2]);
                    RFunctionName = configSplit[3].Trim('\r', '\n');
                    SaveConfigChanges();
                }
            }
        }


 

The idea is to have an R function that works on a source attribute and is executed over a time window of a given size, sampled at a given number of points.
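The config string is just four fields separated by semicolons, so a malformed or locale-dependent string can easily break the setter. Below is a minimal sketch of a more defensive parser. The RConfig class and its TryParse method are hypothetical helpers that I made up for illustration, they are not part of the AF SDK, and the invariant culture is my assumption about how the numbers should be stored.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not part of the AF SDK): holds the four config fields.
public sealed class RConfig
{
    public string AttributeName { get; }
    public double WindowSizeInSeconds { get; }
    public int NoOfSegments { get; }
    public string RFunctionName { get; }

    public RConfig(string attributeName, double windowSizeInSeconds,
                   int noOfSegments, string rFunctionName)
    {
        AttributeName = attributeName;
        WindowSizeInSeconds = windowSizeInSeconds;
        NoOfSegments = noOfSegments;
        RFunctionName = rFunctionName;
    }

    // Returns false instead of throwing when the string is malformed.
    public static bool TryParse(string configString, out RConfig config)
    {
        config = null;
        if (string.IsNullOrWhiteSpace(configString)) return false;

        string[] parts = configString.Split(';');
        if (parts.Length != 4) return false;

        // Invariant culture keeps "600.5" readable on machines with a decimal comma.
        double window;
        if (!double.TryParse(parts[1].Trim(), NumberStyles.Float,
                             CultureInfo.InvariantCulture, out window)) return false;
        if (!(window > 0) || double.IsInfinity(window)) return false;

        int segments;
        if (!int.TryParse(parts[2].Trim(), NumberStyles.Integer,
                          CultureInfo.InvariantCulture, out segments)) return false;
        if (segments <= 0) return false;

        string attribute = parts[0].Trim();
        string function = parts[3].Trim();
        if (attribute.Length == 0 || function.Length == 0) return false;

        config = new RConfig(attribute, window, segments, function);
        return true;
    }

    // Writes the fields back in the same order and culture used by TryParse.
    public override string ToString()
    {
        return string.Format(CultureInfo.InvariantCulture, "{0};{1};{2};{3}",
            AttributeName, WindowSizeInSeconds, NoOfSegments, RFunctionName);
    }
}
```

With this sketch, a string such as "Temperature;600;10;regression" parses into a 600 second window with 10 segments, while a string with missing fields is rejected rather than raising an exception inside the property setter.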

In the property setter you can already set the attribute:

 

        private string _AttributeName;
        public string AttributeName
        {
            private set
            {
                if (_AttributeName != value)
                {
                    _AttributeName = value;
                    // get the referenced attribute (only resolved when the parent is an AFElement)
                    var element = Attribute.Element as AFElement;
                    if (element != null) SourceAttribute = element.Attributes[_AttributeName];
                    SaveConfigChanges();
                }
            }
            get { return _AttributeName; }
        }
        public AFAttribute SourceAttribute { private set; get; }
        public double WindowSizeInSeconds { private set; get; }
        public int NoOfSegments { private set; get; }
        public string RFunctionName { private set; get; }
        private REngine engine { get; set; }



 

Next we add the R engine in the constructor:

 

        // initialize REngine
        public CalculateInR()
        {
            // set up logger
            string pathAppData = Environment.GetFolderPath(Environment.SpecialFolder.CommonApplicationData);
            SimpleLog.SetLogDir(pathAppData + @"\CalculateInR", true);
            SimpleLog.SetLogFile(logDir: pathAppData + @"\CalculateInR",
                prefix: "CalculateInR_", writeText: false);
            SimpleLog.WriteText = true;
            try
            {
                // create R instance - R is single threaded!
                REngine.SetEnvironmentVariables();
                engine = REngine.GetInstance();
                // set working directory
                engine.Evaluate("setwd('C:/Source/ROSIsoft')");
                // source the function
                engine.Evaluate("source('Regression.R')");
                // might need to install and load libraries in R
            }
            catch (Exception ex)
            {
                SimpleLog.Error(ex.Message);
            }
        }

At minimum you need to set the working directory and source your R code. For more advanced calculations you might also need to install and load libraries.

 

Next we need to build the helper method to send the values to R and get the results back:

 

        private double ExecuteRFunction(AFValues values)
        {
            // bad values are passed to R as NaN
            var vector = engine.CreateNumericVector(
                values.Select(n => n.IsGood ? n.ValueAsDouble() : Double.NaN).ToArray());
            // make the symbols unique; R is single threaded and all attributes share one variable space
            var uniquex = "x" + Attribute.ID.ToString().Replace("-", "");
            var uniquer = "r" + Attribute.ID.ToString().Replace("-", "");
            // set symbol
            engine.SetSymbol(uniquex, vector);
            // build the R call; the invariant culture keeps the decimal point on any locale
            string executionString = string.Format(System.Globalization.CultureInfo.InvariantCulture,
                "{0}<-{1}({2},{3},{4})", uniquer, RFunctionName, uniquex, WindowSizeInSeconds, NoOfSegments);
            double result;
            try
            {
                result = engine.Evaluate(executionString).AsNumeric()[0];
            }
            catch (Exception ex)
            {
                SimpleLog.Error(ex.Message);
                result = Double.NaN;
            }
            return result;
        }
        private AFValues CreateVector(DateTime endTime)
        {
            var timeRange = new AFTimeRange(endTime - TimeSpan.FromSeconds(WindowSizeInSeconds), endTime);
            AFTimeSpan span = new AFTimeSpan(TimeSpan.FromSeconds(timeRange.Span.TotalSeconds / NoOfSegments));
            return SourceAttribute.Data.InterpolatedValues(timeRange, span, null, "", true);
        }



This pretty much follows the examples here: Basic types with R.NET  | R.NET -- user version

Since R is single threaded and all instances share the same variable space, I recommend making the R variable names unique. I measured the execution time from .NET and it took ~1 ms. This of course depends on the type of calculation you perform. There is also some overhead on the PI side when requesting interpolated values.
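To make the naming scheme concrete, here is a small sketch of how the unique R symbols and the call string could be built from an attribute ID. The RCallBuilder class and its method names are hypothetical and only serve as an illustration; the sketch assumes the ID is a GUID and that the numbers should be written with a decimal point regardless of the machine's locale.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration only.
public static class RCallBuilder
{
    // "N" formats the GUID as 32 hex digits without dashes,
    // so prefix + id starts with a letter and is a valid R name.
    public static string SymbolFor(string prefix, Guid attributeId)
    {
        return prefix + attributeId.ToString("N");
    }

    // Builds a string such as "r<id> <- regression(x<id>, 600.5, 10)".
    public static string BuildCall(string functionName, Guid attributeId,
                                   double windowSizeInSeconds, int noOfSegments)
    {
        string x = SymbolFor("x", attributeId);
        string r = SymbolFor("r", attributeId);
        // Invariant culture: a decimal comma would be read by R as an argument separator.
        return string.Format(CultureInfo.InvariantCulture, "{0} <- {1}({2}, {3}, {4})",
            r, functionName, x, windowSizeInSeconds, noOfSegments);
    }
}
```

Because every attribute gets its own pair of symbols, two attributes that call the same R function do not overwrite each other's input vector or result.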

 

Next we need to define the GetValue and GetValues methods:

        public override AFValue GetValue(object context, object timeContext, AFAttributeList inputAttributes,
            AFValues inputValues)
        {
            var currentContext = context as AFDataReferenceContext?;
            var endTime = ((AFTime?)timeContext)?.LocalTime ?? DateTime.Now;
            // get the function result from R
            var values = CreateVector(endTime);
            return new AFValue(null, ExecuteRFunction(values), endTime);
        }
        public override AFValues GetValues(object context, AFTimeRange timeRange, int numberOfValues,
            AFAttributeList inputAttributes, AFValues[] inputValues)
        {
            AFValues values = new AFValues();
            DateTime startTime = timeRange.StartTime.LocalTime;
            DateTime endTime = timeRange.EndTime.LocalTime;
            // step between two results so that numberOfValues points cover the time range
            double step = numberOfValues > 1 ? (endTime - startTime).TotalSeconds / (numberOfValues - 1) : 0;
            for (var index = 0; index < numberOfValues; index++)
            {
                // each result is calculated for, and stamped with, its own window end time
                var time = startTime + TimeSpan.FromSeconds(index * step);
                var tmpValues = CreateVector(time);
                values.Add(new AFValue(null, ExecuteRFunction(tmpValues), time));
            }
            return values;
        }
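The time stepping in GetValues is easy to get wrong, so here is a stand-alone sketch of the idea: spread a requested number of timestamps evenly over a time range, with both ends included. The WindowTimes class is a hypothetical helper that uses plain DateTime instead of the AF time types, and returning the start time for a single value is my own assumption.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; uses DateTime instead of AFTime.
public static class WindowTimes
{
    // Returns 'count' timestamps evenly spaced over [start, end], both ends included.
    // For count == 1 only the start time is returned; for count <= 0 the list is empty.
    public static List<DateTime> EvenlySpaced(DateTime start, DateTime end, int count)
    {
        var times = new List<DateTime>();
        double stepSeconds = count > 1 ? (end - start).TotalSeconds / (count - 1) : 0.0;
        for (int i = 0; i < count; i++)
        {
            times.Add(start + TimeSpan.FromSeconds(i * stepSeconds));
        }
        return times;
    }
}
```

For a one hour range and seven values this gives a timestamp every 10 minutes, and each of those timestamps would then serve as the end of its own regression window.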




 

Then we just populate the data methods using https://techsupport.osisoft.com/Downloads/File/5cbefb97-d253-46dd-b369-f36cda374e47

and create the data pipe using Daphne Ng's code.

 

So now we have a custom data reference that can execute an R function taking the following inputs: x, WindowSizeInSeconds and NoOfSegments.

In R we can develop the code for the linear regression, which is basically just a call to the lm function. lm is part of the stats package, which is included in the standard installation. Since for this example we are only interested in the slope, the R wrapper code looks as follows:

 

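regression <- function(x,WindowSizeInSeconds,NoOfSegments) {
  # time axis in seconds, one entry per interpolated value
  secs<-seq(0,WindowSizeInSeconds,length.out=length(x))
  # regress the values on time; the second coefficient is the slope per second
  unname(coef(lm(x~secs))[2])
}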
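As a cross-check for what a regression of the values on time should return, the slope can also be computed directly with the closed-form least-squares formula, slope = sum((t - mean(t)) * (x - mean(x))) / sum((t - mean(t))^2). The following is a minimal sketch of that calculation; the LeastSquares class is hypothetical, and skipping NaN samples is my assumption about how missing values should be treated (as far as I know lm drops incomplete rows by default as well).

```csharp
using System;

// Hypothetical helper for illustration only.
public static class LeastSquares
{
    // Slope of x against the time axis t = 0, dt, 2*dt, ... in units per second.
    // NaN samples are skipped; returns NaN if fewer than two good samples remain.
    public static double Slope(double[] x, double windowSizeInSeconds, int noOfSegments)
    {
        double dt = windowSizeInSeconds / noOfSegments;
        int n = 0;
        double sumT = 0.0, sumX = 0.0;
        for (int i = 0; i < x.Length; i++)
        {
            if (double.IsNaN(x[i])) continue;
            n++;
            sumT += i * dt;
            sumX += x[i];
        }
        if (n < 2) return double.NaN;

        double meanT = sumT / n;
        double meanX = sumX / n;
        double sxy = 0.0, stt = 0.0;
        for (int i = 0; i < x.Length; i++)
        {
            if (double.IsNaN(x[i])) continue;
            double d = i * dt - meanT;
            sxy += d * (x[i] - meanX);
            stt += d * d;
        }
        return sxy / stt;
    }
}
```

For a 50 second window with 5 segments, the six values 1, 3, 5, 7, 9, 11 are 10 seconds apart and rise by 2 per step, so the slope is 0.2 per second.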

 

After registration, we can use the CDR in AF Analysis, which provides all the plumbing to call the CDR based on point updates.

 

Result

 

Here is the result of a 10 min average and a linear regression with a 10 min window, applied to a fast-moving 1 h sinusoid: