7 Replies Latest reply on Aug 8, 2018 5:56 PM by coelacanth

    PIWebAPIClient for Python: No function to return summary values as pandas DataFrame?

    coelacanth

      The PIWebApiClient library provides functions that return recorded and interpolated values as pandas DataFrame objects. But, I can't find one that does this for summary values. There is a get_summary function, but it returns a PIItemsSummaryValue object. Is there some particular reason why this hasn't been implemented? Does anyone know of a workaround or a way to convert this type of object to a pandas DataFrame?

        • Re: PIWebAPIClient for Python: No function to return summary values as pandas DataFrame?
          Marcos Vainer Loeff

          Hi Penn,

           

          Yes, you are right. All the methods that return a pandas DataFrame are in the DataApi class. Summary and summaries have not been implemented yet. My suggestion is to use the existing get_summary() method and write the logic that converts the PIItemsSummaryValue object into a pandas DataFrame, using the content of the DataApi class as a reference. If you are successful and want to add this feature to the library, please send me the code and I will add it to the project.

           

           

          Thanks!

            • Re: PIWebAPIClient for Python: No function to return summary values as pandas DataFrame?
              coelacanth

              Thanks for the quick response! I tried converting the PIItemsSummaryValue object to a JSON string (since that's what it looked like to me) and then reading that with pandas.read_json(), but that didn't work. (Maybe that object should have a method to directly export JSON?) I'll try what you suggested, and if I come up with a good solution, I'll send you my code.
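For anyone trying the JSON route: pandas.read_json() probably struggles here because each record nests its fields one level down. A minimal sketch of flattening such a payload is shown below. It assumes the JSON has an Items list whose entries carry a Type and a nested Value object; the sample payload is made up for illustration and was not captured from a server. It uses pandas.json_normalize (pandas 1.0 or later) rather than read_json.

```python
import json

import pandas as pd

# Hypothetical payload, shaped the way a summary response is assumed to look.
payload = """
{"Items": [
  {"Type": "Total",
   "Value": {"Timestamp": "2018-08-07T00:00:00Z", "Value": 120.5,
             "UnitsAbbreviation": "m3", "Good": true,
             "Questionable": false, "Substituted": false}},
  {"Type": "Average",
   "Value": {"Timestamp": "2018-08-07T00:00:00Z", "Value": 5.02,
             "UnitsAbbreviation": "m3", "Good": true,
             "Questionable": false, "Substituted": false}}
]}
"""

items = json.loads(payload)["Items"]

# json_normalize flattens the nested "Value" dict into dotted column names
# such as "Value.Timestamp" and "Value.Value".
df = pd.json_normalize(items)

# Strip the "Value." prefix so the columns read Timestamp, Value, and so on.
df.columns = [c.split(".", 1)[1] if c.startswith("Value.") else c for c in df.columns]
df["Timestamp"] = pd.to_datetime(df["Timestamp"])
```

The result has one row per summary item, with a Type column next to the flattened value fields.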

              • Re: PIWebAPIClient for Python: No function to return summary values as pandas DataFrame?
                coelacanth

                Based on the other functions I found in data_api.py, I wrote the following function for getting the summary values as a data frame. The problem is that get_summary seems to return a different type of object than what convert_to_df expects. Each item in res.items is a PISummaryValue object, so to pull out the various columns you would need to access them as item.value.timestamp, for example. But convert_to_df uses item.value, item.timestamp, etc. Do you know if get_summary is supposed to behave differently from the other functions like get_recorded? Or am I missing something obvious?

                 

                def get_summary_values(self, path, calculation_basis='TimeWeighted', end_time="*", filter_expression=None,
                        sample_interval=None, sample_type='ExpressionRecordedValues', selected_fields=None,
                        start_time="*-1d", summary_duration=None, summary_type='Total', time_type='Auto', time_zone=None):
                    if (path is None):
                        print("The variable path cannot be null.")
                        return

                    web_id = self.convert_path_to_web_id(path)
                    res = self.streamApi.get_summary(web_id, calculation_basis, end_time, filter_expression,
                            sample_interval, sample_type, selected_fields, start_time, summary_duration,
                            summary_type, time_type, time_zone)
                    df = self.convert_to_df(res.items, selected_fields)
                    return df

                  • Re: PIWebAPIClient for Python: No function to return summary values as pandas DataFrame?
                    Marcos Vainer Loeff

                    Hello Penn,

                     

                    If you take a look at the data_api.py and stream_api.py files, you will see that self.streamApi.get_recorded returns a PIExtendedTimedValues object. On the other hand, streamApi.get_summary returns a PIItemsSummaryValue object. This is why you can't call self.convert_to_df(res.items, selected_fields) directly. You need to create another method, such as convert_summary_to_df, whose input is a PIItemsSummaryValue.

                     

                    Please take a look at the object returned by the get_summary method and then think about what your dataframe should look like. The object has a different structure because you can request multiple summaryTypes in one call.
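To make the structural difference concrete, here is a small sketch. The two dataclasses are hypothetical stand-ins written for this illustration, not the library's real model classes; they only mimic the attribute layout described in this thread (fields directly on a recorded item, versus a type plus a nested value on a summary item).

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the library's model classes, for illustration only.
@dataclass
class TimedValue:
    """Mimics an item as returned for recorded values."""
    timestamp: str
    value: float


@dataclass
class SummaryValue:
    """Mimics one item of a summary result: a type plus a nested timed value."""
    type: str
    value: TimedValue


recorded_item = TimedValue("2018-08-07T00:00:00Z", 5.0)
summary_item = SummaryValue("Total", TimedValue("2018-08-07T00:00:00Z", 120.5))

# Recorded values: the fields sit directly on the item.
flat = (recorded_item.timestamp, recorded_item.value)

# Summary values: the same fields, one level down, plus the summary type.
nested = (summary_item.type, summary_item.value.timestamp, summary_item.value.value)

# Treating a summary item like a recorded item fails, which is the reason
# a converter written for recorded values cannot be reused as-is.
try:
    summary_item.timestamp
    flat_access_works = True
except AttributeError:
    flat_access_works = False
```

A converter for summaries therefore has to read item.type and then reach into item.value for everything else.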

                     

                    Hope this helps!

                      • Re: PIWebAPIClient for Python: No function to return summary values as pandas DataFrame?
                        coelacanth

                        Here's a function that does that. I added a column called 'Type' that indicates the summary type, because I assume that the record metadata (e.g., the Good/Questionable/Substituted fields) could differ for the same timestamp across summary types. This gives you one big data frame with all summary values in a single column, which you then slice depending on what you need. There are other ways of doing this, of course. It could return a dictionary of data frames keyed by summary type, or you could pivot the single large table on the summary type so that the types become columns, but pivoting assumes that the record metadata is always the same for each timestamp.

                         

                        def convert_summaries_to_df(self, items, selected_fields, summary_type):
                            """Convert a list of PISummaryValue objects to a pandas data frame."""
                            # Requires `import pandas as pd` at module level.
                            # summary_type is currently unused: each item carries its own type.
                            if items is None or len(items) == 0:
                                raise Exception('The returned data is Null or None')

                            addValues = False
                            addTimeStamp = False
                            addUnitAbbr = False
                            addGood = False
                            addQuestionable = False
                            addSubstituted = False
                            summaryCalc = []
                            value = []
                            timestamp = []
                            unitsAbbreviation = []
                            good = []
                            questionable = []
                            substituted = []

                            # Figure out which columns need to appear in the data frame.
                            if selected_fields is not None and selected_fields != "":
                                if "timestamp" in selected_fields:
                                    addTimeStamp = True
                                if "value" in selected_fields:
                                    addValues = True
                                if "questionable" in selected_fields:
                                    addQuestionable = True
                                if "unitabbr" in selected_fields:
                                    addUnitAbbr = True
                                if "good" in selected_fields:
                                    addGood = True
                                if "substituted" in selected_fields:
                                    addSubstituted = True
                            else:
                                # No field selection given: include every column.
                                addValues = True
                                addTimeStamp = True
                                addUnitAbbr = True
                                addGood = True
                                addQuestionable = True
                                addSubstituted = True

                            # Each item has a summary type and a nested timed value.
                            for item in items:
                                summaryCalc.append(item.type)
                                if addValues:
                                    value.append(item.value.value)
                                if addTimeStamp:
                                    timestamp.append(item.value.timestamp)
                                if addUnitAbbr:
                                    unitsAbbreviation.append(item.value.units_abbreviation)
                                if addGood:
                                    good.append(item.value.good)
                                if addQuestionable:
                                    questionable.append(item.value.questionable)
                                if addSubstituted:
                                    substituted.append(item.value.substituted)

                            # Assemble only the requested columns; 'Type' is always present.
                            data = {}
                            data['Type'] = summaryCalc
                            if addValues:
                                data['Value'] = value
                            if addTimeStamp:
                                data['Timestamp'] = timestamp
                            if addUnitAbbr:
                                data['UnitsAbbreviation'] = unitsAbbreviation
                            if addGood:
                                data['Good'] = good
                            if addQuestionable:
                                data['Questionable'] = questionable
                            if addSubstituted:
                                data['Substituted'] = substituted

                            df = pd.DataFrame(data)
                            return df
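The two alternatives mentioned above (a dictionary of data frames keyed by summary type, or a pivot so that each type becomes a column) can be sketched as follows. The sample frame is made up and only assumes the long format with 'Type', 'Value', 'Timestamp' and 'Good' columns that the function produces; by_type and wide are names chosen for this illustration.

```python
import pandas as pd

# Made-up sample in the long format: one row per (timestamp, summary type).
df = pd.DataFrame({
    "Type": ["Total", "Average", "Total", "Average"],
    "Value": [120.5, 5.02, 98.0, 4.08],
    "Timestamp": pd.to_datetime(
        ["2018-08-07", "2018-08-07", "2018-08-08", "2018-08-08"]
    ),
    "Good": [True, True, True, False],
})

# Option 1: a dictionary of data frames keyed by summary type.
# Each frame keeps its own metadata columns (Good, and so on).
by_type = {
    name: group.drop(columns="Type").reset_index(drop=True)
    for name, group in df.groupby("Type")
}

# Option 2: one column per summary type, indexed by timestamp.
# This keeps only Value, so the per-record metadata is dropped.
wide = df.pivot(index="Timestamp", columns="Type", values="Value")
```

The dictionary form preserves the per-type metadata, while the pivoted form is more convenient for plotting or arithmetic across summary types when the metadata is not needed.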