
    Archive sizing on a large system

    AlistairFrith

      We have a system with around 1.5 million tags. For several reasons, most of them have compression and exception turned off, so the archives are receiving values every minute for most of these tags and are filling at a rate of 1 GB per hour!
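      As a rough sanity check on that rate (a back-of-the-envelope sketch; the ~11 bytes per archived event is a rule-of-thumb figure, not something we have measured):

          # Rough check of the archive fill rate.
          # Assumption: ~11 bytes per archived event (rule of thumb;
          # the actual cost varies with data type and archive overhead).
          TAGS = 1_500_000
          EVENTS_PER_TAG_PER_MINUTE = 1
          BYTES_PER_EVENT = 11  # assumed, not measured

          events_per_hour = TAGS * EVENTS_PER_TAG_PER_MINUTE * 60
          gb_per_hour = events_per_hour * BYTES_PER_EVENT / 1024**3
          print(f"{events_per_hour:,} events/hour -> {gb_per_hour:.2f} GB/hour")
          # 90,000,000 events/hour -> 0.92 GB/hour, consistent with
          # the ~1 GB/hour we are seeing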


      If we were able to use compression, this rate would drop massively (and performance would similarly benefit), but unfortunately, as I say, there are reasons why we can't. So is it sensible to increase the archive size from the current 4 GB up to, say, 96 GB, giving us roughly 4 days per archive? Our PI Server currently has 32 GB of memory. Would we also have to increase that to, say, 128 GB?
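      The 4-day figure is just that fill rate carried forward (a minimal sketch using the ~1 GB/hour we observe):

          # Archive lifetime at a given fill rate.
          FILL_RATE_GB_PER_HOUR = 1.0  # observed on our system

          for archive_gb in (4, 96):
              hours = archive_gb / FILL_RATE_GB_PER_HOUR
              print(f"{archive_gb} GB archive -> {hours:.0f} hours (~{hours / 24:.1f} days)")
          # 4 GB archive -> 4 hours (~0.2 days)
          # 96 GB archive -> 96 hours (~4.0 days)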


      Essentially, what are the practical constraints on archive sizing?


      Regards,


      --- Alistair.

        • Re: Archive sizing on a large system
          rschmitz

          Hi Alistair,


          Can I ask why you're looking to increase the archive size in the first place? Are you seeing some performance issues or is there another reason you want to avoid having the archive shift so frequently?


           Our first guideline for archive sizing is to never make your archive larger than 1/3 of the total RAM on the machine, for performance reasons (both for reprocessing, should that be needed, and for query performance and allowing the system to properly cache values in memory). The second guideline is that, in general, the archive size in MB should be (licensed point count x 3) / 1024. However, that recommendation assumes some compression, and as you say, that's not the case for your system.
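           To put numbers on both guidelines (a minimal sketch with your figures plugged in; the x3 factor is the rule of thumb above and assumes compression):

               def max_archive_size_gb(total_ram_gb):
                   # Guideline 1: archive no larger than 1/3 of total RAM.
                   return total_ram_gb / 3

               def recommended_archive_size_mb(licensed_points):
                   # Guideline 2: (licensed point count x 3) / 1024 MB.
                   # Assumes some compression, so it will understate an
                   # uncompressed system like yours.
                   return licensed_points * 3 / 1024

               print(f"RAM cap at 32 GB:  {max_archive_size_gb(32):.1f} GB")
               print(f"RAM cap at 128 GB: {max_archive_size_gb(128):.1f} GB")
               points = 1_500_000
               print(f"Point-count guideline: {recommended_archive_size_mb(points) / 1024:.1f} GB")
               # RAM cap at 32 GB:  10.7 GB
               # RAM cap at 128 GB: 42.7 GB
               # Point-count guideline: 4.3 GB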


           So taking a step back, you could still stay within the first guideline (archives no larger than 1/3 of your RAM): with 128 GB of RAM, that puts you at an archive size of ~42 GB, giving you not quite two days' worth of data per archive. Still, the question remains: why are you looking to make the archive files larger than they are right now?


          Cheers,

          Rob

            • Re: Archive sizing on a large system
              AlistairFrith

              Thanks for that quick response.


               Our current system has around 400K tags and will soon have closer to 1.5 million. We are also deploying a third-party client, and we will potentially have several hundred users accessing displays that each show a few tens of tag snapshots and a handful of trends spanning several days.


               We have set up a stress-test system with representative numbers of tags and are using scripting to emulate the user connections. We are finding performance issues once we get up to around 100 users, and they seem to be on the PI side more than the client system. We were thinking that part of this could be that the historical data being retrieved spans a large number of archives, and that having fewer, larger archives would be more efficient. It looks like the bottleneck may be in disk access.

                • Re: Archive sizing on a large system
                  rschmitz

                   So first off, allow me to amend my previous reply. I took a look at our sizing tool with your values subbed in, and it seems we recommend keeping the archive size on the smaller side even with that many points (I suspect for manageability of the archives). For your system, the tool came up with ~6 GB/archive as the recommended size.


                   That being said, in general the overhead of reading from one archive to the next (rather than reading all of the data out of a single archive) is pretty negligible. Moreover, that overhead should only be seen on the first query of the data; after that, the queried data should end up cached in memory for any other user who goes looking for the same information. Feel free to test with larger archives, but I don't think it will improve performance much, if at all, and it makes the archives very hard to manage when they're that large.


                   I would recommend setting up a data collector set when running those stress tests and taking a look at the statistics for disk throughput. If you are maxing out the disk read limit, you'd be better off switching from an HDD to an SSD. My other thought is that if disk throughput really is the issue, it could be the fact that you're writing upwards of 1 GB/hour to disk. With HDDs, writes affect reads (the actuator arm has to physically move to the correct location on the disk), and the amount of data you're writing could be hurting your users' ability to view that much data. Also, please take a look at KB00717 when thinking about disk throughput and the Data Archive.
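                   For context, that write rate is small as raw throughput (a quick conversion; the point being that on an HDD it's the seeks those writes force, not the bandwidth itself, that hurt concurrent reads):

                       # Convert the archive fill rate into sustained write throughput.
                       GB_PER_HOUR = 1.0  # fill rate reported above

                       mb_per_sec = GB_PER_HOUR * 1024 / 3600
                       print(f"{GB_PER_HOUR:.0f} GB/hour = {mb_per_sec:.2f} MB/s sustained writes")
                       # 1 GB/hour = 0.28 MB/s sustained writes: trivial as sequential
                       # bandwidth, but each write still costs a seek that competes
                       # with the read queries.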


                  Cheers,

                  Rob