8 Replies Latest reply on Jan 21, 2016 6:57 AM by pallavi.soni

    PI Interface self heal


      Hello community,


      I am working on a solution which requires some automation. Here is a brief description:


      Frequent issue: After a server reboot, the PI interface service sometimes does not start properly and ends up either in the "STOPPED" state or in a hung state (the service shows as "Running", but the interface's Health/Status tags show Bad values).


      Blueprint of the solution: We are trying to build an automatic mechanism/utility that will start or restart the PI interface service whenever the interface is down (the service has stopped or is in a hung state) after a system reboot. The idea is to read the interface Health tags from the PI server and check their current values; if a current value is not Good, the utility should restart the PI interface service to refresh the interface.


      We could write this utility in Visual Studio (.NET/C#), but we don't have VS on the interface machine, so that is not possible. We also don't have Excel for VBA coding.


      Hence, I am looking for a way to read the current value of PI tags (interface Health/Status tags) using the SDK, but I am not sure how to achieve this.


      It would be great if anyone can post any other solution for this automation.



      Pallavi Soni.

        • Re: PI Interface self heal

          Some interfaces, such as the PI OPC interface, have a delay setting that applies before the interface connects to the OPC server. This gives the OPC server time to start properly and lets things "settle" before the interface connects and Groups are formed. You do not mention which interface you use, but perhaps something like this (depending on what is available for your interface) could be incorporated into your solution design.


          Do the logs show why the interface does not start properly or goes into a hung state? This is not normal...and not a feature.


          Also, reading the values on the PI Data Archive server (snapshot) may not determine whether the interface is at fault, the buffering is "blocked" or there is a Network disconnect. Restarting the interface would do little if the issue is with the buffering or there is a Network disconnect.


          As an aside, in the interface service's Properties (right-click the service) > Recovery tab, you can set what to do in case of a failure (Restart the Service, for example). I am not sure whether this is recommended, as it does not address the cause of the issue.

            • Re: PI Interface self heal
              Roger Palmen

              Could not agree more with Stephen. Focus on getting the interface running properly instead of working around the issue. Interfaces should handle restarts without issues.

              • Re: PI Interface self heal
                Rhys Kirk

                The number of restarts on failure is a good starting point; it can be set to restart continually.

                In addition to checking the current value of an interface status, you should look at IO rates, automatically parse the event logs for messages from the service (or crash messages for the interface exe), check PI Buffer Subsystem statistics, etc. Even then, it might be better to alert a human than to restart the interface automatically - it depends on how much you trust your code.

                Depending on where you are running this, you could check Interface nodes remotely from a single machine, or you could run a PowerShell script in the Startup Policy that will check the service status and restart it if it crashes one time after server startup.
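
                The startup check described above could be sketched as follows. The post suggests PowerShell; this is the same logic in Python, assuming Python is available on the node. The function names are hypothetical, and only the standard Windows `sc query` and `sc start` commands are used. Note that this only catches a stopped service, not a hung one.

```python
import re
import subprocess


def parse_sc_state(sc_output):
    """Return the state name (e.g. 'RUNNING') from `sc query` output, or None."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None


def start_if_stopped(service_name, run=subprocess.run):
    """Start the service once if `sc query` reports it as STOPPED.

    `run` is injectable so the logic can be exercised without Windows.
    Returns True if a start was attempted, False otherwise.
    """
    result = run(["sc", "query", service_name], capture_output=True, text=True)
    if parse_sc_state(result.stdout) != "STOPPED":
        return False
    run(["sc", "start", service_name], capture_output=True, text=True)
    return True
```

                On a real node this would be called once from a startup script, for example `start_if_stopped("MyInterfaceService")`, where the service name is a placeholder.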


                It is not always possible to fix the issue, because sometimes the problem is within the interface executable itself, which means waiting for a fix. In the meantime you have to put a temporary workaround in place.

                • Re: PI Interface self heal

                  I agree with this approach. I'd also recommend checking your service dependencies; as a minimum these should include PI Buffer Subsystem and PI Network Manager (and PI Network Manager should have a dependency on the TCP/IP Protocol Driver). Also check whether the interface depends on other software, such as OPC Tunneller, whose services must already be running (the delayed startup should provide ample time for the initial connection). An interface crashing on startup sounds like it is unable to connect to something and is timing out; the logs should make clear exactly what is causing the issue.

                • Re: PI Interface self heal
                  Bryan Owen

                  The root cause of intermittent issues and transients could also be a platform issue. This is perhaps an extraneous suggestion in this case, but it might be worthwhile to consider a failover strategy. Much of the logic required for the workaround is already a standard part of interface failover.

                  • Re: PI Interface self heal

                    Thanks Stephen, Roger, Rhys and Keilan for the suggestions. They are really helpful.



                    I agree that troubleshooting and digging into the issue is required whenever an interface fails. But as a first step, we restart the interface to check whether it failed due to a connectivity issue.


                    Also, in my original post, I specifically mentioned "Sometimes Interface couldn't start after server reboot". I would like to reframe and correct that.


                    We face situations where one of the alert monitoring teams notifies us that an interface is down and that the status tags (Health tag, Device Status tag, I/O rate tag, etc.) for that interface instance are not reading values as expected. Most of the time the issue is resolved by manually restarting the interface.



                    Hence, we are looking for an automated process that runs on a schedule, checks the current values of the Health tags of all interface instances, and, if the values for an instance are "Not Good", restarts that particular interface instance's service.
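
                    A minimal sketch of that decision step, in Python, is shown here. Reading the tag values is deliberately left to a caller-supplied `read_value` function, since the thread does not settle on how the PI tags would be read; the tag names, service names, and the "Good" string are all placeholders.

```python
def instances_to_restart(health_tags_by_service, read_value, good_value="Good"):
    """Return the services that have at least one health tag that is not good.

    health_tags_by_service: dict mapping a service name to its health tag names.
    read_value: hypothetical callable returning the current value of a tag.
    """
    flagged = []
    for service, tags in health_tags_by_service.items():
        if any(read_value(tag) != good_value for tag in tags):
            flagged.append(service)
    return flagged


# Example with made-up names and a dictionary standing in for the PI server.
snapshot = {"opc1.health": "Good", "opc2.health": "Bad"}
config = {"PI-OPC1": ["opc1.health"], "PI-OPC2": ["opc2.health"]}
print(instances_to_restart(config, snapshot.get))
```

                    A scheduled job would run this check at the chosen frequency and restart only the services it returns.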



                    The idea is to reduce manual work by automating it. I have only outlined the idea so far and am not sure how well it will work.



                    The biggest challenge is that we generally don't have a coding environment, such as Visual Studio or Excel, on the interface machine.



                    Is it possible to create an application using .NET/C# and run it as a Windows service on the interface nodes?




                    Pallavi Soni

                      • Re: PI Interface self heal



                        First, I agree with the previous posts: configure the interface to start with a delay of a few seconds and, for PI-OPCInt, also set up a server reconnect delay. That has solved similar issues for us here in the past.


                        With that said, I always like to answer the question as posted, whether or not it's the first choice. If you have VS on another computer or laptop (even an Express edition might do), you can program and compile your own Windows service. The host machine doesn't need to have Visual Studio. It does need to have the Microsoft .NET Framework version you compiled against. For example, if your host server has .NET 4.0, then program and compile your service using that version of .NET, and choose the build (32-bit or 64-bit) that matches the server where the custom Windows service will run. As for the logic: create a timer, check the status tags (I normally create 2 or 3 status tags manually for the interface I am monitoring, to avoid false positives), and if the status is "Not Receiving Data", restart the interface.
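
                        The timer logic above might look roughly like this. The post describes a .NET Windows service; this sketch uses Python only to show the control flow. It assumes a restart should happen only when every monitored status tag reports "Not Receiving Data", which is one reading of the false-positive remark. `read_statuses` and `restart` are hypothetical callables supplied by the caller.

```python
import time

BAD_STATUS = "Not Receiving Data"


def all_status_tags_bad(values, bad_value=BAD_STATUS):
    """True only if there is at least one value and every value is bad."""
    values = list(values)
    return bool(values) and all(v == bad_value for v in values)


def monitor(read_statuses, restart, interval_s=60.0, cycles=None, sleep=time.sleep):
    """Poll the status tags on a fixed interval and restart on agreement.

    cycles=None loops forever; an integer limits the number of checks.
    Returns the number of restarts triggered.
    """
    restarts = 0
    done = 0
    while cycles is None or done < cycles:
        if all_status_tags_bad(read_statuses()):
            restart()
            restarts += 1
        done += 1
        if cycles is None or done < cycles:
            sleep(interval_s)  # wait before the next check
    return restarts
```

                        Requiring agreement between tags trades a slower reaction for fewer unnecessary restarts; a majority rule would be a reasonable alternative.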


                        Another solution (much simpler) is to create a .BAT script that reads PI tag values (same status tags I mentioned above) and if the value matches a certain value (1=Not Receiving Data), you do a Net Stop "PI interface service name" followed by a Net Start "PI interface service name".
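
                        The stop/start step of such a script can be sketched as below. The post suggests a .BAT file; the same `net stop` / `net start` calls are shown here from Python, assuming Python is available on the node. How the status value is obtained is not shown, and the service name is a placeholder.

```python
import subprocess

NOT_RECEIVING_DATA = 1  # status value from the post: 1 = Not Receiving Data


def restart_service(service_name, run=subprocess.run):
    """Run `net stop` followed by `net start` for the given service."""
    # net stop returns an error if the service is already stopped; ignore it.
    run(["net", "stop", service_name], check=False)
    # A failed start should be reported, so let it raise.
    run(["net", "start", service_name], check=True)


def check_and_restart(status_value, service_name, run=subprocess.run):
    """Restart the service only when the status tag reads 'Not Receiving Data'.

    Returns True if a restart was issued, False otherwise.
    """
    if status_value != NOT_RECEIVING_DATA:
        return False
    restart_service(service_name, run=run)
    return True
```

                        The script needs to run with sufficient privileges to stop and start services.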


                        I myself like the script idea. It's simple and can be added as a scheduled task to run, say, every hour and check the status of your tags.


                        Hope this helps.
