8 Replies Latest reply on Nov 26, 2013 3:48 PM by Robert Raesemann

    PI HA Backups

    rickark

      Just wondering about the necessity of having backups of the Secondary Members of a PI HA Collective.  My PI System is located in 3rd party managed data centres, with the servers (virtual machines) in geographical locations distinctly separate from each other.  With regular Collective Re-initialisations to keep the archives as valid as possible, both servers will look fairly similar at all times.  The incremental backup of the Primary Server (which in my opinion is the most critical) is carried out every 24 hours, along with an incremental backup of all the PI VM images, including the backup server.  So as you can appreciate, the data is protected from Data Centre Failure and Physical Server Failure, which leaves the possibility of total data loss extremely low. I see a few possibilities if the backup files were required:

       

      Failure of the Primary HA Member.  Recovered and archives restored from the backup stored on the Backup Server which is in a different data centre.

       

      Failure of the Secondary Member.  Recovered, Reinitialised and Synchronised with the PI HA Collective Manager.

       

      Failure of the Backup Server.  Recovered with the backup regime implemented by the 3rd Party Data Centre Provider.  Any gaps are filled from the Primary Archives through the PI Server Backup Process.

       

      Now I realise that self-managed servers without the level of redundancy that I have would justify the added security of having a backup of the Secondary Members, but given what I have listed above, is it really necessary to back up the Secondary Server?

        • Re: PI HA Backups
          Robert Raesemann

          I'll be interested to see what others have to say on this topic.

           

          From my experience, it pays to be paranoid and to do everything that is possible and economically justifiable. If you can make backups of the secondary server without spending too much money, then I would do it. I once had a scenario where a server was wiped out by a lightning strike; the backup was being copied to a separate SAN, which was also damaged. When we went to restore the tape backup, we found that a technician had not followed up on entries in the error log that started to occur after the backup agent was updated, and so the system didn't make a complete backup to tape. We were only saved by the fact that I was also copying the backups to a workstation. Things like this don't happen very often, but when they do, it has always paid to be paranoid and have multiple levels of backup. I would only decide not to make the backup if it is going to be prohibitively expensive to do so.

            • Re: PI HA Backups
              rickark

              Yes, fair comments, and I appreciate your view.  Believe it or not, although disk space should not be cost prohibitive in this day and age, because of the agreement we have with this third party provider and the level of data backup agreed to, an extra 2TB across 4 VMs is about an extra $35K per year over five years!  This PI System is filling 1GB archives every 1.3 days, so yes, disk space is not only expensive in my world, it also gets consumed quite rapidly.
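
To put those figures side by side, here is a rough back-of-the-envelope sketch. It uses only the numbers quoted in the post and makes three assumptions that the post does not confirm: that "$35K per year over five years" means $35K in each of the five years, that "2TB" means 2000 GB, and that the archive fill rate applies to a single server.

```python
# Figures quoted in the post above; the interpretation of them is an assumption.
ARCHIVE_SIZE_GB = 1.0          # each archive file is 1 GB
DAYS_PER_ARCHIVE = 1.3         # one archive fills every 1.3 days
EXTRA_COST_PER_YEAR = 35_000   # USD, assumed to be charged in each contract year
CONTRACT_YEARS = 5
EXTRA_STORAGE_GB = 2_000       # "2TB", assumed to mean 2000 GB


def archive_growth_gb_per_year(size_gb, days_per_archive):
    """Archive volume produced per year by one server, in GB."""
    return size_gb * 365 / days_per_archive


growth = archive_growth_gb_per_year(ARCHIVE_SIZE_GB, DAYS_PER_ARCHIVE)
total_cost = EXTRA_COST_PER_YEAR * CONTRACT_YEARS
cost_per_gb_year = EXTRA_COST_PER_YEAR / EXTRA_STORAGE_GB

print(f"Archive growth per server: {growth:.0f} GB/year")
print(f"Extra cost over the contract: ${total_cost:,}")
print(f"Cost per extra GB per year: ${cost_per_gb_year:.2f}")
```

Under those assumptions, the extra storage works out to roughly $17.50 per GB per year, against about 281 GB of new archives per server each year.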

                • Re: PI HA Backups
                  Marcos Vainer Loeff

                  Hi Karl,

                   

                  In my opinion, backing up the secondary is important when there is a data loss on the primary PI Server for some reason but not on the secondary. In that case there is data that exists only on the secondary, because data is not yet replicated between the members of a PI Collective.  If you want to have a backup of this data in this situation, then you should also back up the secondary.

                   

                  It is uncommon to face the situation above, but it is possible, especially if you have some applications that are not sending data through the buffer. If you do face it, you will lose data only if the archives on the secondary PI Server get corrupted; otherwise you are able to recover the data from them.
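
The idea of "data that exists only on the secondary" can be illustrated with a small sketch. This is not a PI tool or API: it assumes you have already exported the events for one tag from each member as (timestamp, value) pairs by whatever means you use, and the function name and sample values below are made up for illustration.

```python
from datetime import datetime


def events_only_on_secondary(primary_events, secondary_events):
    """Return events found on the secondary but not on the primary.

    Both arguments are iterables of (timestamp, value) tuples for one tag.
    The result is sorted by timestamp.
    """
    missing = set(secondary_events) - set(primary_events)
    return sorted(missing)


# Made-up sample data: the primary is missing the 10:01 event.
primary = [
    (datetime(2013, 11, 26, 10, 0), 1.0),
    (datetime(2013, 11, 26, 10, 2), 3.0),
]
secondary = [
    (datetime(2013, 11, 26, 10, 0), 1.0),
    (datetime(2013, 11, 26, 10, 1), 2.0),
    (datetime(2013, 11, 26, 10, 2), 3.0),
]

gaps = events_only_on_secondary(primary, secondary)
for timestamp, value in gaps:
    print(f"Only on secondary: {timestamp.isoformat()} = {value}")
```

Any events reported this way are the ones that would be at risk if the secondary's archives were lost or overwritten.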

                   

                  In order to decide if it is worthwhile backing up the secondary PI Server, you need to consider the costs involved and the importance of the data for your company.

                   

                  Hope this helps you!

                    • Re: PI HA Backups
                      rickark

                      Thanks Marcos, I had not considered that possibility.  But one question I do have: if I have this data on the secondary only, which is not replicated from the Primary, and I carry out a Re-Initialise from the PI HA Collective Manager, won't I lose this data as the archives are copied across from the Primary?  I suppose this is another instance where a backup of the Secondary would be handy.

                        • Re: PI HA Backups

                          Hello Karl,

                           

                          When using PI Collective Manager to synchronize members of a PI Collective, you choose which archives are included. PI Collective Manager then takes a PI Backup, copies it over to the Secondary nodes and restores the Secondary nodes with the files previously backed up on the Primary. For the case described by Marcos, you would indeed overwrite the 'good' archives on a Secondary node with the 'bad' ones from the Primary. However, there are other options:

                           

                          - Manually take a backup of a Secondary node and restore the archives to the other members of the Collective. If in any doubt, please ask OSIsoft Technical Support for assistance, or at least discuss the steps involved in detail with a Technical Support engineer.
                          - Add the missing data to the Primary's archive(s) by reprocessing 'good' archives against 'bad' ones.

                           

                          Robert Raesemann

                          From my experience, it pays to be paranoid and to do everything that is possible and economically justifiable.

                           

                          Thank you for your comment Robert. Sharing your experience is pretty valuable. A data loss situation is the most severe. With a PI Collective as a redundant PI System, the backup strategy should be redundant as well. 

                            • Re: PI HA Backups
                              Robert Raesemann

                              The situation that I gave as an example was a combination of hardware failure compounded by human error. It's kind of like when you hear about the nuclear accidents. It is never just the hardware that failed. They usually cover that kind of thing pretty well. It is always hardware plus poor decisions or ignoring key signals. In our case, the backup system was tested thoroughly when it was first set up, but then a later upgrade to part of the system caused a problem that wasn't followed up on. A defense-in-depth approach saved the day.

                               

                              When considering the cost of the backups, it is key to consider the cost of losing the data. For many of the systems that I work on, the operations folks don't like it when data is missing, but they can live with it if it was an extreme issue. To them it is not worth the extra money to prevent missing data because it is inconvenient, but not a show stopper. The environmental folks, however, have a much different view. Missing data can not only result in large fines, but can also ruin your credibility with the regulatory agencies. Especially if it happens more than once. In those cases, an extra $35k/yr might not seem extravagant.

                               

                              As an engineer, I have found that the best approach is to put together the most comprehensive design that you can and break it down into 2-3 levels, starting with a bronze level of backup and progressing to gold or platinum. (The business folks love the bronze-gold thing. It is almost as cool to them as the red, yellow, green light thing.) Put a price tag on each of the levels, and then let the business folks decide what level they think is appropriate for that particular situation. That way everyone is doing the job that they are supposed to be good at. As an engineer you have presented the technical details to them and hopefully adequately outlined the risks associated with each. You are informing them of the different decisions that must be made in the design, and the consequences of each, and they are balancing risk and reward and making the business decision.

                               

                              One more thing. It is absolutely vital that, no matter what you do, you perform an initial disaster recovery drill and then test it again annually. I am emphatic about this, and yet I still have clients who just don't get around to it. Anyone can back up, but only a subset of those people will be able to successfully restore. No backup plan is a backup plan if it hasn't been tested and proven to work, and it must be proven periodically because things will change.
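
One small piece of such a drill can be automated. The sketch below is a generic file-level check, not a PI utility: it assumes you have restored a backup set into a scratch directory and want to confirm that every file in the backup set came back byte-for-byte identical. The function names and directory layout are hypothetical, and a passing check does not replace actually starting the restored server and querying data.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(backup_dir, restored_dir):
    """Return relative paths that are missing or differ in restored_dir."""
    backup_dir, restored_dir = Path(backup_dir), Path(restored_dir)
    problems = []
    for src in sorted(p for p in backup_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(backup_dir)
        restored = restored_dir / rel
        # A file is a problem if it was not restored or its content changed.
        if not restored.is_file() or sha256_of(src) != sha256_of(restored):
            problems.append(rel.as_posix())
    return problems
```

An empty list from `verify_restore` means every file in the backup set was found intact in the restored copy; anything else is a list of files to investigate before the next real emergency.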