AnsweredAssumed Answered

PI UFL Interface - Parsing complex CSV files with strings

Question asked by pmc2018 on Sep 26, 2018
Latest reply on May 24, 2019 by PIOhMY

10/1/2018: I have updated the discussion title as well as the contents to be more clear:


Background information:

I have a CSV data file that has several columns with values that are populated with a string that is inside quotation marks. I know there is a way to parse this using the \"(*)\" sequence however that does not work in my situation. My CSV file contains columns where the tag names live in the first line and the corresponding values are below. The column lay out is generally like this: Time, Tag A, Error code A, Tag B, Error code B... and so forth. The time column is pretty straightforward. Just a timestamp. The Tag values (Tag A, Tag B, ... ) are typically numeric. The Error codes (Error code A, Error code B, ...) are typically string variables. The error codes occasionally contain commas within them, so typically when there is an error it is encased in double quotes to differentiate from a delimiting comma.


My issue:

Where I am hung up on parsing the file is the following: If there is a value for a tag at a particular time stamp, the corresponding error code column will contain double quotation marks. If there is no error, it will be double quotation marks without a space inside (""), and obviously if there is an error there will be text encased in double quotes ("This is an error, please investigate"). If there is not a value for a tag at a particular timestamp, then there will definitely not be an error code, and there will not be any double quotes so it will just be commas (,,). It is not predictable when there will be a string in double quotes and when there will not. Since the delimiter in this CSV file is a comma, and some of the strings have comma's in them, I need to capture the entire string in side double quotes.


This is a rough example of what I am dealing with:


Time, Tag A, Error code A

11/9/2017 08:01 AM,1329,"Elec",

11/9/2017 06:02 AM,1328,"BG, Met, Oxi",

11/9/2017 04:01 AM,1327,"BG, Elec, Met, pH",

11/9/2017 02:02 AM,,,

11/9/2017 12:03 AM,1325,"",


So in some columns a blank string looks like ,"", but in other columns a blank string shows as ,, and I do not think the PI UFL Interface can handle this. I just wanted to see if there were any recommended approaches.


The PI UFL Connector was suggested but I a having issues installing that.


I have attached an excerpt of the actual data file and INI file I am dealing with.





Message was edited by: Prianka Chapin