"The Power of Data" - sounds familiar? It was the tag line for OSIsoft Users Conference 2012 in San Francisco, and not for no reason. After the awesome PI Systems collect data from different parts of an enterprise we need to extract useful information out of raw data. This is especially true given how easily the new and mighty PI System can handle millions of tags (data streams) and events in a blink of an eye. To really get to "the Power of Data" we have no choice but to use more advanced analytics to massage the raw data and seek insight out of this huge volume that comes to us very fast.


To this end, I would like to share some of the efforts currently under way here at vCampus, with the help of some of my OSIsoft colleagues as well as third-party partners, to enable more advanced analytics on PI Data. I would also love to hear your comments, feedback, and ideas along the way:


1) PI System and MATLAB: we have been working on this integration for about two years now. Not only do we have a white paper in the Library describing the integration, we have also presented some machine learning applications at vCampus Live! 2011 and elsewhere. MATLAB is a very powerful tool for general and specialized analytics across several disciplines, including machine learning, statistics, mathematical optimization, signal processing, and control systems, among others. It has good penetration in research communities and academia as well as in some industries. We have been working with MathWorks over the past year and will continue this joint effort.


2) PI System and R: R is emerging as the de facto language of Big Data analytics. Many organizations are actively using R for development or in production, or are in the process of adopting it. The open source nature of the platform has made it ubiquitous across many different applications. The language is specifically designed for handling data; it is therefore extremely powerful at building tables, joining and selecting data, and computing statistics, all with a very efficient and smart syntax. Another huge advantage of R is its powerful graphics, which make data much more readable and improve interpretability. We have been working with Revolution Analytics, who commercialize R in their product Revolution R Enterprise; it offers a much better development environment (like an IDE) as well as supported packages for parallel computing. The following infographic shows the trend in the number of books sold on different programming languages. The light green box at the bottom represents R, with a 127% annual increase, on par with some other major languages.


In my previous blog post I showed a way to use PI Data in R. I will follow up with other integration methods and more cool applications and graphics in the weeks to come.




3) PI System and Python: Python is a programming language that lets you work more quickly and integrate your systems more effectively. It is free to use (even for commercial purposes) and runs on multiple platforms. We have recently started looking into using PI Data with Python, and a group of our enthusiastic engineers at OSIsoft already has good experience in doing so. We hope to offer more concrete documents and results in the months to come.
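To give a flavor of the kind of analytics this would enable, here is a minimal sketch in plain Python. It assumes the values of a single tag have already been pulled out of the PI System as (timestamp, value) pairs; it does not show any PI interface. The sample data and the names `summarize` and `moving_average` are hypothetical and only there for illustration.

```python
from statistics import mean, stdev

# Hypothetical (timestamp, value) samples for one tag; not real PI data.
samples = [
    ("2012-05-01T00:00:00", 20.0),
    ("2012-05-01T00:01:00", 22.0),
    ("2012-05-01T00:02:00", 24.0),
    ("2012-05-01T00:03:00", 26.0),
]


def summarize(samples):
    """Return basic descriptive statistics for a list of (timestamp, value) pairs."""
    values = [value for _, value in samples]
    return {
        "count": len(values),
        "mean": mean(values),
        "stdev": stdev(values),  # sample standard deviation, needs >= 2 values
        "min": min(values),
        "max": max(values),
    }


def moving_average(samples, window):
    """Return the simple moving average of the values over a fixed-size window."""
    values = [value for _, value in samples]
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


if __name__ == "__main__":
    print(summarize(samples))
    print(moving_average(samples, window=2))
```

In practice you would probably reach for a dedicated data analysis library rather than hand-rolled helpers, but even the standard library is enough to turn raw tag values into summary statistics and a smoothed trend.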


Do you use any such analytical tools with your PI Data? Would you be interested in doing so? Do you face any particular challenges? What would be the most valuable result of a successful analytics package running on your PI Data?