The more I read about R, the more interested I get. It provides solutions for a wide range of problems. The R packages are well documented and come with examples to get you started, and I have found the creators and authors very approachable. There is a learning curve, especially if you want to write custom scripts, but by and large you get more out of it than you have to invest.
For PI users, the most interesting packages are the time series packages. I can recommend the following:
https://cran.r-project.org/web/packages/forecast/forecast.pdf by Prof Rob Hyndman
https://cran.r-project.org/web/packages/robfilter/robfilter.pdf from TU Dortmund
One of the challenges is that R time series packages are mostly designed for homogeneous time series (identical point-to-point distance), whereas PI, because of its compression algorithm, stores heterogeneous time series (varying point-to-point distance). So in almost all cases the data have to be resampled and down-sampled.
The following are common steps for down-sampling:
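To make the resampling step concrete, here is a minimal base R sketch that puts an irregularly spaced series onto a regular grid by linear interpolation. The timestamps and values are made up for illustration, and linear interpolation is only one possible choice (an assumption here, not something PI or any particular package prescribes).

```r
# Irregularly spaced samples: timestamps in seconds and the matching values
# (made-up numbers standing in for compressed PI data).
t_irregular <- c(0, 7, 15, 40, 42, 60)
v_irregular <- c(1.0, 1.7, 2.5, 5.0, 5.2, 7.0)

# Regular 10-second grid spanning the same interval.
t_regular <- seq(from = min(t_irregular), to = max(t_irregular), by = 10)

# Linear interpolation onto the grid.
# method = "constant" would hold the previous value instead (step signals).
regular <- approx(x = t_irregular, y = v_irregular,
                  xout = t_regular, method = "linear")

resampled <- data.frame(time = regular$x, value = regular$y)
print(resampled)
```

The result has one row per grid point, so it can be handed to packages that expect equally spaced observations.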
Data cleansing, outlier removal
Resampling at lower rate
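The two steps above can be sketched in base R as follows. The data are invented, the median/MAD rule with a cutoff of 5 is just one simple cleansing choice, and the 4-second averaging bins are arbitrary; treat all of these as assumptions rather than a recommended recipe.

```r
# Regular 1-second series with one obvious spike (made-up data).
t_sec <- 0:11
value <- c(10, 11, 10, 12, 11, 100, 10, 11, 12, 10, 11, 12)

# Step 1: cleansing. Keep points that lie within 5 MADs of the median
# (MAD = median absolute deviation; the cutoff of 5 is an arbitrary choice).
center  <- median(value)
spread  <- mad(value)
is_good <- abs(value - center) <= 5 * spread

clean_t     <- t_sec[is_good]
clean_value <- value[is_good]

# Step 2: resampling at a lower rate. Average the cleaned points in 4-second bins.
bin_start <- (clean_t %/% 4) * 4
bin_mean  <- tapply(clean_value, bin_start, mean)

downsampled <- data.frame(time  = as.numeric(names(bin_mean)),
                          value = as.vector(bin_mean))
print(downsampled)
```

Removing outliers before averaging matters: a single spike would otherwise drag the mean of its whole bin.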
This process can be very resource intensive, especially when applied in real time. One approach is to use a set of algorithms based on the exponential moving average. The following is an excellent read:
And here are some R code examples:
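As an additional illustration, here is a minimal sketch of an exponential moving average that works directly on irregularly spaced samples. It assumes one common formulation, in which the weight of the previous average decays as exp(-dt/tau) with the time gap dt between samples. The function name ema_irregular and the parameter tau are hypothetical, and this is not necessarily the variant used in the article above or in ROSIsoft.

```r
# Exponential moving average for irregularly spaced samples.
# t:   timestamps in seconds (increasing)
# x:   values
# tau: time constant in seconds (larger = smoother); an assumed parameter
ema_irregular <- function(t, x, tau) {
  stopifnot(length(t) == length(x), length(x) >= 1, tau > 0)
  out <- numeric(length(x))
  out[1] <- x[1]                        # initialise with the first sample
  for (i in seq_along(x)[-1]) {
    w <- exp(-(t[i] - t[i - 1]) / tau)  # old average counts less after a long gap
    out[i] <- w * out[i - 1] + (1 - w) * x[i]
  }
  out
}

# Made-up example: a step from 0 to 1 observed at uneven intervals.
t_obs <- c(0, 1, 2, 10)
x_obs <- c(0, 1, 1, 1)
smoothed <- ema_irregular(t_obs, x_obs, tau = 2)
print(smoothed)
```

Each update only needs the previous average and the previous timestamp, which is what makes this family of operators cheap enough for real-time use.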
The last package you need is an R data access package – I attached ROSIsoft to the post.
Quick primer on ROSIsoft
The library is built using the rClr package and a wrapper DLL. The wrapper DLL is necessary to do the plumbing between .NET data and basic R types. I simplified the AF data models to make them more compatible with R.
Installation is done manually. In RStudio, select Tools\Install Packages … and, when the dialog opens, change the “Install from:” option to “Package Archive File”.
After the installation the library is loaded with: library(ROSIsoft)
(If you are missing a library or package, the process is always the same)
To connect to the AF and PI servers, first run: AFSetup()
This will also install the rClr package, which is included in the ROSIsoft package.
All functions are documented in the help file, although I still need to spend some more time on it. To connect to the PI server use the following:
The connector object contains information about the PI and AF servers as well as their connection states. It is also the only object that needs to be instantiated; all other methods are static.
To get values just use the GetPIPointValues() function. It requires a retrieval type constant as string, which can be looked up with the GetRetrievalTypes() function.
Getting some recorded values for the sinusoid (sinusoid1H is a faster moving sinusoid for testing) is then straightforward:
values <- GetPIPointValues("sinusoid1H","T+8H","T+10h","recorded",10)
Plotting requires the xts package to convert the string datetime into an R time object.
which produces the following plot:
As I mentioned above, most R packages require homogeneous time series. Since I didn't find all the functions I needed in R, I added a couple of real-time operators and also exception\compression functions:
ApplyCompression: to apply different exception\compression settings to the time series
CalculateEMA: calculate real-time exponential moving average
CalculateMA: calculate real-time moving average
CalculateMSD: calculate real-time moving standard deviation
CalculateZScore: calculate real-time moving z-score; helpful for outlier detection and removal
CalculateOutlier: outlier removal based on z-score
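To show the idea behind z-score based outlier removal, here is a minimal base R sketch. It is not the ROSIsoft implementation: the function name moving_zscore, the trailing window of 5 points, and the threshold of 3 are all assumptions chosen for illustration.

```r
# Trailing moving z-score: compare each point with the mean and standard
# deviation of the `window` points that came before it.
moving_zscore <- function(x, window) {
  stopifnot(window >= 2)
  z <- rep(NA_real_, length(x))
  if (length(x) <= window) return(z)
  for (i in (window + 1):length(x)) {
    past <- x[(i - window):(i - 1)]
    s <- sd(past)
    if (s > 0) z[i] <- (x[i] - mean(past)) / s  # stays NA when the window is flat
  }
  z
}

# Made-up series with a single spike.
x <- c(10, 11, 10, 12, 11, 50, 10, 11, 12, 10)
z <- moving_zscore(x, window = 5)

# Flag points more than 3 standard deviations from the trailing mean.
is_outlier <- !is.na(z) & abs(z) > 3
cleaned <- x[!is_outlier]
print(cleaned)
```

Because only past points are used, the score can be computed as each new value arrives. One caveat of this simple version is that a spike stays in the window for the next few points and inflates their standard deviation.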
I will provide some data sets in upcoming posts that are a good starting point. Here is an example of using the ApplyCompression function on the same time series:
and then the plot:
New version as of 08/06/2017