5 Replies Latest reply on Sep 29, 2017 7:24 PM by avanfosson

    PI Data Archive > Archive size calculation containing many PIPoints in string

    dbochert

      Hi,

       

      I'm trying to calculate the archive size with the latest "Hardware_System_Sizing_Recommendations" file.

       

      But I have to account for a significant number of string PIPoints holding values of 230 characters.

       

      So my question is: what is the impact of strings, and how can I calculate the ideal archive size with that in mind?

       

      Regards,

        • Re: PI Data Archive > Archive size calculation containing many PIPoints in string

          Hello David,

           

          The PI Archive Subsystem always stores events consisting of a value and timestamp information. The Hardware System Sizing Recommendations spreadsheet is a System Manager resource that makes some simplifying assumptions, e.g. that the timestamp information consumes 2 bytes. In practice, if the timestamp has no sub-second part, the PI Archive Subsystem stores a time offset in seconds relative to the previous event, which takes between 1 and 4 bytes. Sub-second timestamps are stored as full timestamps and take about 8 bytes on disk. The spreadsheet nevertheless assumes a flat 2 bytes for the timestamp information.

           

          The "Measurement Data Type" dropdown offers only a subset of the data types available for PI Points: int32, int64, float32 and float64. You'll notice that the "Estimated Event Size (on Disk)" is 6 bytes for the 32-bit data types and 10 bytes for the 64-bit data types, i.e. the value size plus 2 bytes for the timestamp. As a rough estimate, let's say a string of 230 characters takes 230 bytes on disk, plus 2 bytes for the timestamp information to stay consistent with the spreadsheet's assumption, for a total of 232 bytes.
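
The arithmetic above can be written down as a minimal Python sketch. It assumes the spreadsheet's flat 2 bytes per timestamp and roughly 1 byte per character for strings; the names `VALUE_BYTES` and `estimated_event_size` are made up for illustration and are not part of any PI tool.

```python
# Assumption: flat 2 bytes of timestamp information per event (spreadsheet value).
TIMESTAMP_BYTES = 2

# Value sizes in bytes. "string230" assumes roughly 1 byte per character.
VALUE_BYTES = {
    "int32": 4,
    "float32": 4,
    "int64": 8,
    "float64": 8,
    "string230": 230,
}


def estimated_event_size(point_type: str) -> int:
    """Return the rough on-disk size of one event in bytes."""
    return VALUE_BYTES[point_type] + TIMESTAMP_BYTES


for name in VALUE_BYTES:
    print(f"{name}: {estimated_event_size(name)} bytes per event")
```

The value for `string230` is the number that would go into the spreadsheet's event size cell.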

           

          When you unprotect the spreadsheet (Review -> Unprotect Sheet) you can overwrite the formula in cell B12 with 232.

           

          You'll have noticed that using strings is expensive, not only because of the space consumed on disk but also because the more space an event takes on disk, the more time is needed to read it. Whenever possible, you should use Digital State sets to translate between string information and its numerical representation from the Digital State table. A Digital State takes up to 2 bytes, so let's assume 4 bytes together with the timestamp information. Staying with the 230-character strings, each string event (232 bytes) takes 58 times the space of an event on a point of type Digital.
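
As a quick check of the factor of 58, here is a small Python sketch under the same assumptions (2 bytes per timestamp, 230 bytes per string value, 2 bytes per Digital State). The event rate in the second half is purely hypothetical and only shows how the ratio carries over to daily volume.

```python
# Assumptions taken from the estimate above, not measured values.
TIMESTAMP_BYTES = 2
STRING_VALUE_BYTES = 230   # ~1 byte per character
DIGITAL_VALUE_BYTES = 2    # upper bound for a Digital State

string_event = STRING_VALUE_BYTES + TIMESTAMP_BYTES    # 232 bytes
digital_event = DIGITAL_VALUE_BYTES + TIMESTAMP_BYTES  # 4 bytes
ratio = string_event / digital_event

# Hypothetical load: one tag archiving 1000 events per day.
events_per_day = 1000
string_bytes_per_day = string_event * events_per_day
digital_bytes_per_day = digital_event * events_per_day

print(f"string event:  {string_event} bytes")
print(f"digital event: {digital_event} bytes")
print(f"ratio:         {ratio:.0f}x")
```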

           

          I am well aware that using Digital States instead of strings is not always an option, but I recommend checking case by case whether Digitals could replace strings.

           

          Another thing to keep in mind: please make sure that the same value (the same string) isn't sent over and over with only an updated timestamp. There's usually no benefit in repeating values, because a value can be assumed valid until a new value updates the snapshot.
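
To illustrate the idea of not repeating unchanged strings, here is a minimal Python sketch that drops consecutive duplicates before events are sent. This is not a PI API and does not reproduce how the PI System itself filters data; `drop_repeats` and the event tuples are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical event shape: (timestamp in seconds, string value)
Event = Tuple[int, str]


def drop_repeats(events: Iterable[Event]) -> Iterator[Event]:
    """Yield only events whose value differs from the last value yielded."""
    last_value = object()  # sentinel that never equals a real value
    for timestamp, value in events:
        if value != last_value:
            yield timestamp, value
            last_value = value


raw = [(0, "RUNNING"), (60, "RUNNING"), (120, "STOPPED"),
       (180, "STOPPED"), (240, "RUNNING")]
kept = list(drop_repeats(raw))
print(kept)
```

With 230-character strings, every repeated event that is dropped saves roughly 232 bytes under the estimate above.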

           

          In the real world, you won't have a PI System in which all tags have the same point type, but you can use the spreadsheet to estimate the archive size for each point type and then sum the numbers.
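
The per-type estimate and the final sum could look roughly like the following Python sketch. The tag counts and event rates are invented for illustration, the 2-byte timestamp is the spreadsheet's assumption, and any archive overhead beyond the raw events is ignored.

```python
# Assumption: flat 2 bytes of timestamp information per event.
TIMESTAMP_BYTES = 2

# Hypothetical tag mix:
# (point type, value bytes, number of tags, events per tag per day)
tag_groups = [
    ("float32", 4, 10_000, 1440),
    ("digital", 2, 2_000, 100),
    ("string230", 230, 500, 100),
]


def bytes_per_day(value_bytes: int, tags: int, events_per_tag_per_day: int) -> int:
    """Rough daily archive volume in bytes for one group of tags."""
    return (value_bytes + TIMESTAMP_BYTES) * tags * events_per_tag_per_day


per_type = {name: bytes_per_day(vb, tags, rate)
            for name, vb, tags, rate in tag_groups}
total = sum(per_type.values())

for name, size in per_type.items():
    print(f"{name}: {size / 1e6:.1f} MB/day")
print(f"total: {total / 1e6:.1f} MB/day")
```

In this made-up mix, the 500 string tags account for more daily volume than the 2,000 digital tags by a wide margin, which is the effect described earlier in the thread.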

          • Re: PI Data Archive > Archive size calculation containing many PIPoints in string
            avanfosson

            The Hardware and PI System Sizing Recommendations Spreadsheet is being phased out. A new online Hardware Sizing Tool has replaced it. At this time the tool only gives sizing recommendations for PI Data Archive and PI AF Server.