25 Replies Latest reply on Dec 12, 2011 3:38 PM by RJKSolutions

    PISDK + Buffering

      It is getting late here and I have a numb brain so excuse any obvious questions...


      With PISDK 1.4 communicating with PIBufSS, how does the communication handle large volumes of data - does it break the values up into buckets that get passed along?  At the moment, for the PIAPI you can define the size of such buckets of values in interfaces such as UFL.  If I had a single PI Point with 250,000 PI Values and sent them to PI, would PISDK send all 250,000 in one go (or divide them out into buckets)?  Until ListData supports UpdateValue(s) I'll save some other questions.  Lastly, are the PISDK calls translated to PIAPI calls by PISDK or by PIBufSS?

        • Re: PISDK + Buffering

          The PISDK hands off data to pibufss, which then controls how the data stream is sent to PI Servers.  The familiar parameters to tweak maximum rate of events are available for pibufss. (See ICU configuration page).


          The PIAPI is not involved.  However, pibufss can take input from the PIAPI and PISDK as separate data streams, and send the whole collection to the PI Server.

            • Re: PISDK + Buffering

              Ok thanks Charlie.


              What protocol does PIBufSS use to send the data?  I thought that was the PIAPI?


              Also, if I send snapshot data from a PISDK 1.4 client and then snapshot from a PISDK 1.3 client to same PI server tag, will the 1.3 client get told off?

                • Re: PISDK + Buffering

                  pibufss uses the PI3 Server's protocol, which is the shared protocol used in the PISDK.  The PIAPI uses PI2 Server protocol which is then translated on the PI3 Server into PI3 formats.  bufserv uses the PIAPI when sending data.


                  Yes, pibufss does snapshot locking so that only a node running pibufss can write a snapshot value to a given tag.  If both nodes are running pibufss, then control of the snapshot toggles and both should succeed.  I wouldn't recommend that because the 'correct' snapshot then becomes indeterminate.
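
                  As a toy model of the locking behaviour just described (illustrative Python only, not OSIsoft code): a non-buffered writer is rejected once a buffered source owns the snapshot, while a write from a second pibufss node simply toggles ownership to that node.

```python
class SnapshotLock:
    """Toy model of pibufss snapshot locking for one tag: only a node
    running pibufss may write the snapshot of a buffered point, and
    ownership toggles to whichever buffered node wrote last."""

    def __init__(self):
        self.owner = None  # buffered node currently holding the lock

    def try_write_snapshot(self, source, buffered):
        """Return True if the snapshot write is accepted."""
        if not buffered and self.owner is not None:
            return False          # non-buffered client gets told off
        if buffered:
            self.owner = source   # ownership toggles to the latest writer
        return True
```

                  With two buffered nodes, both writes succeed but ownership flips on every write, which is why the 'correct' snapshot becomes indeterminate.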


                  vCampus gurus: now would be a good time to find a link to a thread on snapshot locking!

                    • Re: PISDK + Buffering

                      Charles Henze

                      vCampus gurus: now would be a good time to find a link to a thread on snapshot locking!


                      Are you referring to this thread?

                        • Re: PISDK + Buffering

                          If I can I want to dig a bit deeper.


                          In the piapi scenario on a buffered node, if I write 100,000 values to a single PI Point via UFL, the data gets split up into chunks and passed to PIBufSS.  PIBufSS by default sends the data to the PI server in 5,000-event chunks every 100 ms (this is configurable).  I presume here that the piapi data is translated by PIBufSS on the node before sending to the PI Server?  I turned on PIBufSS tracing to see what PIBufSS sees from the piapi on a non-PISDK 1.4 node.  If I sent the exact same data using PISDK 1.4 (a collection of 100,000 PIValues for a PIPoint), would I see the exact same trace in PIBufSS compared to the PIAPI trace?


                          In the thread that Andreas posted, Roger Chow asked about the same thing I am looking at but never got an answer.  If I have a bunch of buffered tags with limited history and I want to backfill the data, I could use UFL (with the /lb parameter) - all tags have compression switched off.  It should work fine, but we are hitting snags (TechSupport ticket logged and a colleague is dealing with it).  I am left wondering, if PISDK 1.4 were used in place of UFL (piapi), whether the PISDK translation would behave differently - before testing it I wanted to understand the translation a bit better.  It seems that despite being archive values, some are being flagged as snapshot values and causing small gaps in the data.  Just trying to understand the mechanism a bit more while TechSupport do their stuff.



                            • Re: PISDK + Buffering

                              The PIAPI generates an array of data that is the same as the PI Server snapshot format, though coercion is required in some cases to the target PIPoint data type.  The PIAPI can write an array of data (as big as you can allocate – until memory allocation starts failing – pisn_putsnapshotsx) or can split the data into chunks (queue calls – pisn_putsnapshotqx).  With buffering, there is also a chunking done under the hood while inserting into the buffers (MEM1SIZE, MEM2SIZE).  pibufss retrieves the PIAPI data from the memory buffers in MEMnSIZE blocks and inserts that into queues.
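
                              As a rough illustration of that chunking, here is a sketch (plain Python, not OSIsoft code) that splits a flat event array into fixed-size blocks; the 5,000-event block size is only an example, standing in for whatever the MEMnSIZE settings work out to:

```python
def chunk_events(events, chunk_size=5_000):
    """Split a flat event list into fixed-size chunks, the way buffered
    PIAPI queue calls hand data off in block-sized pieces."""
    return [events[i:i + chunk_size]
            for i in range(0, len(events), chunk_size)]

# 100,000 events for one point become 20 chunks of 5,000 each
chunks = chunk_events(list(range(100_000)))
```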


                              The PISDK 1.4 passes data directly into pibufss, which then inserts the data into queues.  You can choose PIPoint.UpdateValue or PIPoint.UpdateValues to write one or more values.  If you pass 100,000 events through UpdateValues, that entire collection is passed to pibufss without chunking.


                              Once pibufss has the data, it is treated the same regardless of the source.  So it will dequeue to the PI Server with the same array sizes and send rates you are familiar with for PIAPI buffering.  I have not compared the trace output, but I don't expect any differences.
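
                              Using the default numbers quoted earlier in the thread (5,000 events per send, one send every 100 ms), the drain rate of the queue is easy to estimate; this back-of-the-envelope sketch assumes those defaults and ignores network and server latency:

```python
import math

def sends_required(total_events, events_per_send=5_000):
    """Number of dequeue/send cycles needed to drain the queue."""
    return math.ceil(total_events / events_per_send)

def min_drain_seconds(total_events, events_per_send=5_000, send_period=0.1):
    """Lower bound on drain time at one send per `send_period` seconds."""
    return sends_required(total_events, events_per_send) * send_period

# 100,000 queued events at the defaults: 20 sends, about 2 seconds
```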


                              Regarding the data backfilling through buffers, I'll need to get back to you on describing all the behaviors – I'm squeezing replies in between meetings for which I need to prepare.

                                • Re: PISDK + Buffering

                                  Thanks for your input so far Charlie, very much appreciated! (beer on me at the UC)

                                    • Re: PISDK + Buffering

                                      One more thought popped into my head (a dangerous phenomenon!)...for me PIBufSS is all about buffering; the fanning is a bonus.  If PISDK shares the same PI3 format with PIBufSS, why was fanning not added to the core of PISDK?  Then you would fall back to offloading the fanning to PIBufSS if there was trouble with the PISDK fanning attempt (member down, etc.).  You would have better feedback and control from an application during its execution, rather than assuming PIBufSS does its job - which I am sure it will.


                                      Was it an architectural decision that PIBuffSS does all the hard work? (To accommodate PIAPI etc)

                                        • Re: PISDK + Buffering

                                          Yes, it was an architectural decision to have pibufss handle writes to collectives.


                                          We originally proposed doing the fanning in the PISDK, but the risk of race conditions in writing to the various members was considered enough to consolidate all writes through pibufss.


                                          However, the issue of identity management was a very strong argument in favor of the PISDK doing the fanning and leaving out buffering.  It is not the user identity that is currently used by pibufss, but the service identity.  At this point, we get into a discussion of SSB...

                                            • Re: PISDK + Buffering

                                              SSB, now there is a whole other discussion that we could have and how it would make life easier.


                                              Does PISDK still perform data write security checks before the data is passed to PIBufSS?  Surely identity management, with the introduction of WIS, raises some concerns?


                                              When updating values via the PISDK UpdateValue method, does the method determine the mode (e.g. precomp_arc, precomp_snap) by first checking the snapshot, before PIBufSS gets the chunk of values?


                                              One more question...when PISDK/PIAPI passes a Point in to PIBufSS for the first time, what initialisation takes place on the point (snapshot data caching etc)?


                                              Sorry for all the questions, I tend to like to know what is going on under the hood as much as possible.

                                                • Re: PISDK + Buffering

                                                  I'll take questions one at a time – I may not get to them all immediately!


                                                  The PISDK must authenticate with the PI Server and then based on what is then authorized, it may retrieve PIPoints and such.  When it comes time to write data, the authorization check is done by the PI Server when data arrives for that user (authorization is not done on the client).  With pibufss as the intermediary, the PISDK must hand off data to pibufss which then must be authorized to write the data.  For the current pibufss release, the service user may be set to a domain user that the PI Server authenticates (could be the same as the user who commonly logs in to a node) or it may use one of the built-in machine accounts (for example, Local System).  In the latter case, the PI Server needs to create an Identity mapping for the machine.  If a node is not on a trusted domain with the PI Server, then the only option is PITrust authentication (for both PISDK and pibufss).


                                                  As you probably guessed by now, writing data from an application server or terminal server may lose the user who originated the data write.  This is the identity management that is an issue for this release.  We have a design to do the equivalent of delegation of the writer to pibufss, but that will require the most current server version and some more changes to pibufss.

                                                  • Re: PISDK + Buffering

                                                    PIData.UpdateValues has a mode that indicates how to treat duplicates.  These are direct translations of the underlying PIevent modes passed to the PI Server (in developer lingo, a PIevent in the PI Server is the same as a PISDK.PIValue, namely a timestamp, value and/or status). Thus, it is the caller that determines the mode used for writes.
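
                                                    A toy dispatch of those caller-chosen duplicate modes might look like this (illustrative Python; the constant names merely echo the idea of the PISDK's DataMergeConstants and are not taken from the real type library):

```python
from enum import Enum

class MergeMode(Enum):
    """Hypothetical duplicate-handling modes for an UpdateValues-style call."""
    INSERT_DUPLICATES = "insert"
    REPLACE_DUPLICATES = "replace"
    ERROR_ON_DUPLICATES = "error"

def apply_update(archive, timestamp, value, mode):
    """Write an event into a {timestamp: [values]} dict per merge mode."""
    if timestamp in archive:
        if mode is MergeMode.ERROR_ON_DUPLICATES:
            raise ValueError(f"duplicate event at {timestamp}")
        if mode is MergeMode.REPLACE_DUPLICATES:
            archive[timestamp] = [value]
            return
    archive.setdefault(timestamp, []).append(value)
```

                                                    The point being that the caller picks the mode; the server merely honors it when the event arrives.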


                                                    pibufss gets enough information to coerce inputs to the correct type if needed and do the compression algorithm on the client.  This is done the first time a new point is received and, to prevent the client cache from getting stale, any needed PointDb changes are picked up from the PI Server as they occur.


                                                    I'm not sure this answers your questions, but we're ready for more questions!

                                                      • Re: PISDK + Buffering

                                                        My next thought of the day...backfilling, event modes, OOO events and buffered ownership/snapshot locking.


                                                        I understand the PI Value -> Event Mode relationship now.  What I don't get completely is the logic that the snapshot system has for snapshot locking when 2 buffered sources are updating the same tag where 1 buffered source is only backfilling, the other providing real time values.


                                                        I believe ownership of a point will switch under certain event modes (e.g. if the backfilling buffered source uses "replace"), but it will not continue to backfill once it hits another mode (e.g. "precomp_arc"), because the snapshot subsystem thinks there is a failover situation between buffered sources.  A workaround is to use "Snapshot_DiscardOOOCompEvents"...


                                                        I wonder if you could explain this a little better than my jumbled approach above - a list of event modes and the effect of the mode for a buffered point with multiple buffering clients.

                                                          • Re: PISDK + Buffering

                                                            Backfilling, event modes, and snapshot locking, why so much complexity?

                                                            Let’s start with the basics of swinging-door compression.  The snapshot value of a point (its most current value) is only archived when a new value comes in after it.  If that new value is out of order, compression is bypassed and the value is archived.  With the Buffer Subsystem, compression takes place on the client node.  In other words, the fate of the snapshot value on the server lies in the hands of pibufss on the client.  As a result, the PI Server must prevent all applications other than pibufss from sending new snapshot values, since that could introduce data inconsistencies (e.g. missing or duplicate values).  That’s why the PI Server locks the snapshot values of buffered points (for pibufss only; bufserv doesn’t do compression).
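
                                                            Those two rules - the snapshot is archived only once a newer value supersedes it, and out-of-order values bypass compression - can be modelled with a toy point class (illustrative Python; the deviation test is a simple deadband stand-in, not real swinging-door compression):

```python
class BufferedPoint:
    """Toy model of client-side compression.  Two rules are illustrated:
    the snapshot is archived only when a newer value supersedes it (and
    it deviates enough from the last archived value), and out-of-order
    values bypass compression entirely."""

    def __init__(self, comp_dev):
        self.comp_dev = comp_dev
        self.snapshot = None   # (timestamp, value) of the current value
        self.archive = []      # events actually written to the archive

    def send(self, timestamp, value):
        if self.snapshot is not None and timestamp < self.snapshot[0]:
            self.archive.append((timestamp, value))   # OOO: straight to archive
            return
        if self.snapshot is not None:
            last = self.archive[-1][1] if self.archive else None
            if last is None or abs(self.snapshot[1] - last) > self.comp_dev:
                self.archive.append(self.snapshot)    # superseded snapshot kept
        self.snapshot = (timestamp, value)
```

                                                            Because the fate of the snapshot is decided by this client-side logic, any other writer changing the snapshot underneath it would corrupt the result - which is exactly why the server locks the snapshot of buffered points.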

                                                            Why exactly did we implement compression in pibufss? I’m glad you asked. The primary reason is to guarantee identical archive values between all servers in a PI Collective. Swinging-door compression guarantees that archive values would be similar within compression deviation, but for some customers this wasn’t sufficient. Outside of PI Collectives, we found that by performing compression on client nodes, we would save CPU on the PI Server, and more importantly, allow better concurrency in the Snapshot Subsystem. One last goal for the Buffer Subsystem was to allow PI Interfaces to send data using the newer, more secure PInet3 protocol, instead of PInet1. For instance the Buffer Subsystem that ships with PI SDK 1.4 will allow Windows authentication, which isn’t available for the PI API.

                                                            What about the parameter “Snapshot_DiscardOOOCompEvents”?  Again, I’m glad you mentioned it.  This parameter controls the behavior of the Snapshot Subsystem after a Collective member is re-initialized (from the Primary PI Server).  Since re-initialization includes both the snapshot table (piarcmem.dat) and one or more archive files, it is likely that all buffered data received by the Primary PI Server will be copied to the Secondary PI Server.  As a result, buffered data coming from pibufss when it reconnects to the Secondary PI Server (after the re-initialization is complete) will likely be duplicates.  If the same pibufss instance sends pre-compressed data out of order, the Snapshot Subsystem discards it in order to avoid the same archive inconsistencies mentioned above; the Snapshot Subsystem detects this situation based on information in the snapshot table.  The only reason to turn this parameter off would be if you manually perform re-initialization and don’t copy archive files.  As you can imagine, we highly recommend against this approach and suggest that you don’t touch this parameter.

                                                            Lastly, what about out-of-order events with regards to snapshot locks for buffered points? Well it really isn’t applicable. Just like the Snapshot Subsystem bypasses compression with out-of-order data, the Buffer Subsystem does the same and events are archived by the PI Server as if there was no pibufss in the picture. The snapshot lock I described above is only relevant for events that are newer than the snapshot value. Backfilling or archive editing is always allowed regardless of the source application or buffering mechanism that sends real-time values. Now I’m not saying that multiple applications editing the same data stream at the same time is a good idea. In most cases, I would recommend using separate points, but I understand it is not always possible.

                                                            I’m sorry for the length of my post. I hope it somewhat clarifies the picture.

                                                              • Re: PISDK + Buffering

                                                                Thanks for the details Denis, appreciate it.


                                                                I still have a niggling issue, maybe a problem with UFL behaviour but I'll try to avoid the UFL talk.


                                                                If I have a tag with no compression that is receiving snapshot data from a PIBufSS node - it's part of a 2-server collective.  The tag only has the last few months of data and an empty archive for 2 or 3 years ago.  If I put a value into the tag's archive for 3 years ago, it goes in fine.  If I then drop a file for UFL (on another node running PIBufSS) with 1 year's worth of data (in ascending chronological order) that includes the 1 value added earlier, then the tag's buffer ownership is switched even though all dates are less than the snapshot date.  Now the first event has the "replace" mode and the others have "precomp_arc" - does the "replace" mode of the first event somehow trick the PI server into thinking there is a failover of buffer sources?  As a side note, I can see that UFL seems to mark some events with "precomp_snap" despite them being archive values - would these event modes trigger the above?


                                                                Oh and when buffered ownership changes, do messages only get written to the primary server?


                                                                I'll start a SSB discussion once I get this off my plate  

                                                                  • Re: PISDK + Buffering

                                                                    For reference, there is a PLI now: PLI#23807OSI8 related to this discussion.


                                                                    It would appear that when another buffer source comes along with a historical event that PIBufSS marks with the "REPLACE" event mode (via UFL using the piapi), pisnapss changes the point ownership - subsequently the precompressed events for backfilling that follow logically get rejected.  Maybe it is rooted in the specific versions being used.


                                                                    Surely, if a second non-owning buffer source attempts to write historical data then it should be processed regardless of event mode...so is there some specific logic around event modes for point ownership?


                                                                    Thanks for all the info so far, helps to solidify my understanding of the behaviours we see (hopefully helps others reading this too).

                                                                      • Re: PISDK + Buffering

                                                                        Are there any detailed "pretty" diagrams to visualise each of the buffering possibilities discussed in this thread available from OSI?  I was about to start creating some for a project but would appreciate the "official" ones if they exist - ones that mention protocols used as well.

                                                                          • Re: PISDK + Buffering

                                                                            Question...PIBufSS caches the Points of a PI System, and the PI System sends point updates to each connected PIBufSS instance.  If you have a million-point system, then each instance of PIBufSS gets to know about each of those million points (right?), but how does the PI System handle sending those updates - especially when it initialises for the first time?  Also, let's say my million-point system is being fed by 100 instances of PIBufSS.  How does the PI Server handle that type of update distribution?  By creating a pipe to each of the 100 PIBufSS instances and sending everything down all the pipes at the same time?  Or does it somehow stagger the distribution by pipe, % of points, ...


                                                                            Trying to gauge the optimal use of PIBuffss and PItoPI to be highly scalable.  Obviously you need to know the above to assess the globally distributed impact on a network.  


                                                                            I am thinking about clients too that are making use of PIBufSS in PI SDK 1.4+ against large (>500k point) PI Servers, especially if you want to make use of PIBufSS from a custom PI SDK application.

                                                                              • Re: PISDK + Buffering

                                                                                The PI Buffer Subsystem caches snapshot and compression information for only the points it needs.  It does this on demand when a new PointID is discovered.


                                                                                Changes in point configuration are published to pibufss, similar to what is done for the local snapshot.
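
                                                                                The on-demand behaviour described above can be sketched as a lazy cache (illustrative Python; `fetch_attributes` is a hypothetical stand-in for the round trip to the PI Server):

```python
class PointCache:
    """Toy model of an on-demand point cache: attributes are fetched only
    the first time a PointID shows up in the data stream, then refreshed
    by published point-configuration changes."""

    def __init__(self, fetch_attributes):
        self._fetch = fetch_attributes  # hypothetical server-lookup callable
        self._cache = {}

    def attributes(self, point_id):
        if point_id not in self._cache:          # fetch on first use only
            self._cache[point_id] = self._fetch(point_id)
        return self._cache[point_id]

    def apply_published_change(self, point_id, new_attributes):
        if point_id in self._cache:              # points never seen are ignored
            self._cache[point_id] = new_attributes
```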

                                                                                  • Re: PISDK + Buffering

                                                                                    If a Server dies then comes back, does the publishing to each instance of PIBuffss happen at once?


                                                                                    PItoPI has this "issue" in higher-latency networks where placement of the interface can affect data throughput - is this eliminated by PIBufSS?  Or does PIBufSS behave in the same manner?

                                                                                      • Re: PISDK + Buffering



                                                                                        I have a correction.  pibufss brings its PIPoint attribute and digital state information up to date when it sends events to the server (the server's snapshot is kept up to date).  So there is a possible lag between when attributes are changed and when they take effect on the pibufss client.  This does have the advantage of avoiding unnecessary work until it is needed (kind of like my kids doing homework?)




                                                                                        PItoPI is based on the PIAPI.  Innate in the protocol used for the PIAPI are some size limitations.  As data quantity increases, so does number of packets.


                                                                                        pibufss is using an improved protocol that can scale upward to large chunks.  Additionally, the client-to-server handshaking has become much more efficient with the 3.4.380 code base, requiring fewer network acks (I guess product names with 2010?).  Finally, pibufss can be configured to use those larger packets if it appears necessary.

                                                                                          • Re: PISDK + Buffering

                                                                                            When you say PIBufSS caches only the PI Points it needs, is this defined by which PI Points have sent data through the buffer subsystem?  So if you send data for 10,000 points through PIBufSS to a Server with 1,000,000 points, does it only keep those 10,000 points up to date and ignore the other 990,000?  Does PIBufSS maintain its local cache of PI Points across reboots, disconnects, etc.?  I see PItoPI is getting support for disconnected startup.


                                                                                            Thanks for the other information, appreciate it.  I'll follow up the larger packets configuration with TechSupport.