Mamunur

Efficient way to detect and remove duplicate entries from PI Archives

Blog Post created by Mamunur on Nov 17, 2020


I was recently analyzing old archives for a project that used 5 to 8 years of historical data for about 10k tags in the PI Archives. I found a huge number of duplicate entries for most of the data. We expected a single occurrence of each entry for a given timestamp.

 

So we decided to remove these garbage entries and keep a single event for each set of duplicates. I searched for custom programs or code that could save us the trouble. I found some questions, discussions and suggestions that recommended either PI OLEDB or the PI AF SDK. The programs that were shared or suggested were, I believe, not up to the mark in terms of performance and effectiveness.

 

So I went on to write my own program in C# on .NET, leveraging the powerful AF SDK library. The steps below show how I met the requirement.

 

Step 1: First retrieve all the records for the tag with a call to RecordedValues.

 

AFValues recordedValues = myTag.RecordedValues(new AFTimeRange(startTime, endTime), AFBoundaryType.Inside, "", true);


Step 2: Use LINQ to fetch the duplicate entries

var duplicates = recordedValues.GroupBy(x => x).Where(g => g.Count() > 1).Select(z => new { data = z.Key, count = z.Count() });
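
To see the grouping on its own, here is a minimal sketch that runs without a PI server or the AF SDK. It assumes a hypothetical stand-in type, Event, in place of AFValue, and a hypothetical helper, FindDuplicates; a C# record is used so that two events with the same timestamp and value compare as equal, which is what the GroupBy key relies on.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for AFValue (not an AF SDK type).
// A record gets value-based Equals/GetHashCode, so identical events group together.
public record Event(DateTime TimeStamp, double Value);

public static class DuplicateFinder
{
    // Returns one (event, count) pair for every event stored more than once.
    public static List<(Event Data, int Count)> FindDuplicates(IEnumerable<Event> events) =>
        events.GroupBy(e => e)
              .Where(g => g.Count() > 1)
              .Select(g => (Data: g.Key, Count: g.Count()))
              .ToList();

    public static void Main()
    {
        var t0 = new DateTime(2020, 11, 17, 0, 0, 0, DateTimeKind.Utc);
        var events = new List<Event>
        {
            new Event(t0, 1.5), new Event(t0, 1.5), new Event(t0, 1.5), // stored three times
            new Event(t0.AddMinutes(1), 2.0),                           // stored once
            new Event(t0.AddMinutes(2), 3.0), new Event(t0.AddMinutes(2), 3.0),
        };

        // Prints one line per duplicated event: timestamp, value, occurrence count.
        foreach (var (data, count) in FindDuplicates(events))
            Console.WriteLine($"{data.TimeStamp:o},{data.Value},{count}");
    }
}
```

The event that appears only once is filtered out by the Where clause, so only the two duplicated events are reported, each with its occurrence count.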

 

Step 3: Call ReplaceValues for each duplicate. Note how a single event is kept in the archive: the list passed to ReplaceValues contains only one AFValue. We can also log the entries as evidence or backup before removing them, as shown.

 

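foreach (var item in duplicates)
{
    // Log tag name, timestamp, value and occurrence count before touching the archive.
    log.WriteLine("{0},{1},{2},{3}", myTag.Name, item.data.TimeStamp, item.data.Value, item.count);
    // Replace everything at this timestamp with a single copy of the event.
    myTag.ReplaceValues(new AFTimeRange(item.data.TimeStamp, item.data.TimeStamp), new List<AFValue> { item.data });
}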
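
The effect of Step 3 can also be tried without a PI server. The sketch below is only a model: it assumes that replacing values over a time range removes every event inside that range and then stores the supplied list. InMemoryArchive, ArchiveEvent and Deduplicate are hypothetical names for illustration, not AF SDK types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for AFValue (not an AF SDK type).
public record ArchiveEvent(DateTime TimeStamp, double Value);

// Hypothetical in-memory archive that only mimics the assumed replace semantics.
public class InMemoryArchive
{
    private readonly List<ArchiveEvent> _events;
    public InMemoryArchive(IEnumerable<ArchiveEvent> events) => _events = events.ToList();
    public IReadOnlyList<ArchiveEvent> Events => _events;

    // Assumption: drop everything in [start, end], then store the supplied events.
    public void ReplaceValues(DateTime start, DateTime end, IList<ArchiveEvent> newValues)
    {
        _events.RemoveAll(e => e.TimeStamp >= start && e.TimeStamp <= end);
        _events.AddRange(newValues);
        _events.Sort((a, b) => a.TimeStamp.CompareTo(b.TimeStamp));
    }
}

public static class DedupDemo
{
    // Collapses every duplicated event to a single copy; returns how many were collapsed.
    public static int Deduplicate(InMemoryArchive archive)
    {
        // Materialize first, so the archive is not modified while it is being enumerated.
        var duplicates = archive.Events.GroupBy(e => e)
                                       .Where(g => g.Count() > 1)
                                       .Select(g => g.Key)
                                       .ToList();
        foreach (var d in duplicates)
            archive.ReplaceValues(d.TimeStamp, d.TimeStamp, new List<ArchiveEvent> { d });
        return duplicates.Count;
    }

    public static void Main()
    {
        var t0 = new DateTime(2020, 11, 17, 0, 0, 0, DateTimeKind.Utc);
        var archive = new InMemoryArchive(new[]
        {
            new ArchiveEvent(t0, 1.5), new ArchiveEvent(t0, 1.5),
            new ArchiveEvent(t0.AddMinutes(1), 2.0),
        });
        int collapsed = Deduplicate(archive);
        Console.WriteLine($"{collapsed} duplicated event(s) collapsed, {archive.Events.Count} event(s) remain");
    }
}
```

One consequence of this model is worth keeping in mind: a zero-width replace clears everything at that timestamp, so an event with the same timestamp but a different value would be dropped along with the extra copies. If the archives may contain such events, it may be worth checking for them before running the cleanup.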


Note: I have found this method to be much more efficient than calling UpdateValue with AFUpdateOption.Remove, which has some constraints/dependencies that must be met before it successfully deletes the data.

 

 


Any suggestions, updates or tips you want to share are welcome! Happy coding.
