I often wonder how many tags are ever retrieved from the archive of a PI Server containing, say, 10,000 tags. Most users rely on a handful of DL reports and PB schematics to monitor their processes, and these documents are based on a small fraction of the total number of tags. So perhaps 9 out of 10 tags never get trended or displayed in a report or schematic, and never contribute to operations. I do not mean to imply that archiving 10,000 tags is silly; actually, the more tags one archives the better, because one never knows when something will be needed.
But given data from a 10,000-tag PI Server, can one identify the 1,000 most important tags without knowing what they really represent? This is done for web search, where Google ranks pages based not on the semantic meaning of the information they contain but on the topological structure of the web (which pages link to which). Some other criterion would have to be devised to assign an importance score to each tag.
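To make the analogy concrete, here is a toy sketch of the PageRank idea itself: power iteration over a link matrix, where a page's score is fed by the scores of the pages linking to it. The three-page "web" and the damping value are just illustrative, not anything specific to PI.

```python
import numpy as np

def pagerank(adj, damping=0.85, iters=50):
    """Power-iteration PageRank: adj[i, j] = 1 if page i links to page j."""
    n = adj.shape[0]
    out = adj.sum(axis=1, keepdims=True)
    out[out == 0] = 1            # avoid divide-by-zero for pages with no out-links
    M = (adj / out).T            # column j = where page j's vote is split
    r = np.full(n, 1.0 / n)      # start with equal rank everywhere
    for _ in range(iters):
        r = (1 - damping) / n + damping * M @ r
    return r

# Tiny example web: page 0 -> 1, page 1 -> 2, page 2 -> 0 and 1.
adj = np.array([[0, 1, 0],
                [0, 0, 1],
                [1, 1, 0]], dtype=float)
print(pagerank(adj))  # page 1 collects the most link weight, so it ranks highest
```

The point is that the ranking falls out of pure structure; nothing in the computation knows what any page is about. A TagRank would need an analogous structure over tags, since tags do not literally "link" to each other.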
Similar to PageRank for ranking webpages, it would be really interesting to come up with a TagRank which quantitatively rates the importance of each tag on the server without knowing what it really represents. Why? The system could use this to say "this tag must be important, and I must notify the user when I notice an unusual pattern involving this tag." So you come in the morning, and the system suggests that you check out an interesting drop in the value of a tag. You know the tag represents a temperature, but the system flagged it simply because its behavior was not "in sync" with the rest of the system. The idea is that the system should not flag 100 tags a day, but just a few important ones (and give you links to tags with similar behavior).
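The "not in sync with the rest of the system" test can be sketched very simply. The following is a toy illustration on synthetic data, not a real TagRank: five hypothetical tags track a shared process signal, one of them drifts away during the last day, and each tag is scored by how far its recent deviation from the peer average sits outside its historical deviation spread. The window size, tag count, and injected drop are all made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic setup: 5 tags sampled hourly for a week, all following one
# underlying process signal plus noise; tag 3 drops away in the last 24 hours.
t = np.linspace(0, 14 * np.pi, 168)
base = np.sin(t)
data = np.array([base + 0.1 * rng.standard_normal(168) for _ in range(5)])
data[3, -24:] -= np.linspace(0, 2, 24)   # injected anomalous drop

def out_of_sync_scores(data, window=24):
    """Score each tag by how far its recent deviation from the peer
    average falls outside its historical deviation spread (a z-score)."""
    scores = []
    for i in range(data.shape[0]):
        peers = np.delete(data, i, axis=0).mean(axis=0)  # average of the other tags
        resid = data[i] - peers
        hist_mu, hist_sd = resid[:-window].mean(), resid[:-window].std()
        recent = resid[-window:].mean()
        scores.append(abs(recent - hist_mu) / hist_sd)
    return np.array(scores)

scores = out_of_sync_scores(data)
print(scores.argmax())  # tag 3 should stand out
```

A ranking like this says nothing about what each tag measures; it only notices that one tag stopped moving with its peers, which is exactly the kind of structural signal a TagRank would have to rely on.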
Such a system could automate the analysis, highlight the exceptions, and guide the user directly to the area that needs attention: an alert about the problem, rather than a report in which to find it. A potential application for StreamInsight.