11 Replies Latest reply on Apr 25, 2017 7:57 AM by gregor

    Unknown or invalid WebID format

    abrodskiy

      Hi all,

      I'm observing a weird behaviour from the PI Web API streamsets call and wonder whether you have come across this issue before and how to resolve it.

      When doing a standard GET call, all middleware encodes the strings, so special characters like & and = get encoded into %26, %3d, etc.

      So with the streamsets call, the encoded URL gets me the above error, even in a browser, whereas it works fine when I use the unencoded special characters in the browser.

       

      So this one throws an error: Unknown or invalid WebID format

      https://server/piwebapi/streamsets/value?webid={webid}%26webid%3d{webid}

       

      Help pages: PI Web API Help, StreamSet GetValuesAdHoc


      { "Errors": [ "Unknown or invalid WebID format: 'P0omyG2UkrbkmTCCEv8bf3egFwAAAASlpWUFRJTU9BUERBSDAxXEE0M0xBTDAwMjEuUFY&webid=P0omyG2UkrbkmTCCEv8bf3egIwAAAASlpWUFRJTU9BUERBSDAxXEo2MEZJMTEzMi'." ] }

      and this one just works fine

      https://server/piwebapi/streamsets/value?webid={webid}&webid={webid}

        • Re: Unknown or invalid WebID format
          Marcos Vainer Loeff

          Hi Alexandre,

           

          This is the expected behaviour. You should only encode the string inputs of the functions. When you generate the URL, you shouldn't encode the whole URL.

           

          Let me know if this helps you!

            • Re: Unknown or invalid WebID format
              abrodskiy

              Hi Marcos,

              Could you elaborate on what you mean by encoding the string inputs of the functions and not the whole URL? Encoding of the URL, unfortunately, is done automatically by the middleware (Biztalk in our case); it is not something I can control.

                • Re: Unknown or invalid WebID format
                  Marcos Vainer Loeff

                  Hi Alexandre,

                   

                  In your case, you would encode each WebId first (which actually won't change anything) and then build the url  ( https://server/piwebapi/streamsets/value?webid={webid}&webid={webid} with the encoded WebIds) as you have pointed out.
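
A minimal Python sketch of the approach described here: each WebID is percent-encoded on its own, and the `&` and `=` delimiters are added afterwards, unencoded. The WebID values are placeholders and `build_streamset_url` is a hypothetical helper, not part of PI Web API.

```python
from urllib.parse import quote


def build_streamset_url(base_url, web_ids):
    """Encode each WebID individually, then join with literal '&' and '='."""
    # safe="" makes quote() escape every reserved character inside a value.
    pairs = ["webid=" + quote(web_id, safe="") for web_id in web_ids]
    return base_url + "?" + "&".join(pairs)


# Placeholder values for illustration only.
url = build_streamset_url(
    "https://server/piwebapi/streamsets/value",
    ["WEBID_ONE", "WEBID_TWO"],
)
print(url)
```

Because only the values pass through `quote`, the delimiters between them can never be turned into `%26` or `%3d`.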

                   

                  If this is done automatically by the middleware, you should try doing the same request through Batch. Please refer to this link. If you search for PI Web API Batch, you will find a lot of useful articles to understand this concept.

                    • Re: Unknown or invalid WebID format
                      abrodskiy

                      Hi Marcos,

                      Thank you for the answer. Using batch is an option (although a bit complicated for the purpose), but I would first like to understand why streamsets don't work the way they are expected to (I guess it's a bug).

                      So let's take an example from the link you shared about the batch:

                      "4": { 
                           "Method": "GET",
                           "Resource": "https://localhost/piwebapi/points?path=%5C%5CMyPIServer%5Csinusoid",
                           "Headers": { "Cache-Control": "no-cache" } },
                      "6": {
                           "Method": "GET",
                           "Resource": "https://localhost/piwebapi/streamsets/value?webid={0}&webid={1}",
                           "Parameters": [ "$.4.Content.WebId", "$.5.Content.WebId" ], "ParentIds": [ "4", "5" ] },

                       

                      So the points call does support proper URI encoding. But if I try to do the same URI encoding for the streamsets call (e.g. put %26 instead of &), even in the batch, I get the same error as above for that step of the batch. Could you please ask the PI Web API developers to look into this?

                        • Re: Unknown or invalid WebID format
                          gregor

                          Hi Alexander,

                           

                          I found quite a few discussions when searching the internet for BizTalk and percent encoding. My understanding is that it is recommended to replace characters like & with their percent code (%26) because they are reserved characters according to RFC 3986. Reserved characters are dealt with in Section 2.2, quoted below:

                           

                          2.2.  Reserved Characters

                            URIs include components and subcomponents that are delimited by
                            characters in the "reserved" set.  These characters are called
                            "reserved" because they may (or may not) be defined as delimiters by
                            the generic syntax, by each scheme-specific syntax, or by the
                            implementation-specific syntax of a URI's dereferencing algorithm.
                            If data for a URI component would conflict with a reserved
                            character's purpose as a delimiter, then the conflicting data must be
                            percent-encoded before the URI is formed.

                           


                                reserved    = gen-delims / sub-delims

                                gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

                                sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
                                            / "*" / "+" / "," / ";" / "="

                            The purpose of reserved characters is to provide a set of delimiting
                            characters that are distinguishable from other data within a URI.
                            URIs that differ in the replacement of a reserved character with its
                            corresponding percent-encoded octet are not equivalent.  Percent-
                            encoding a reserved character, or decoding a percent-encoded octet
                            that corresponds to a reserved character, will change how the URI is
                            interpreted by most applications.  Thus, characters in the reserved
                            set are protected from normalization and are therefore safe to be
                            used by scheme-specific and producer-specific algorithms for
                            delimiting data subcomponents within a URI.

                            A subset of the reserved characters (gen-delims) is used as
                            delimiters of the generic URI components described in Section 3.  A
                            component's ABNF syntax rule will not use the reserved or gen-delims
                            rule names directly; instead, each syntax rule lists the characters
                            allowed within that component (i.e., not delimiting it), and any of
                            those characters that are also in the reserved set are "reserved" for
                            use as subcomponent delimiters within the component.  Only the most
                            common subcomponents are defined by this specification; other
                            subcomponents may be defined by a URI scheme's specification, or by
                            the implementation-specific syntax of a URI's dereferencing
                            algorithm, provided that such subcomponents are delimited by
                            characters in the reserved set allowed within that component.

                            URI producing applications should percent-encode data octets that
                            correspond to characters in the reserved set unless these characters
                            are specifically allowed by the URI scheme to represent data in that
                            component.  If a reserved character is found in a URI component and
                            no delimiting role is known for that character, then it must be
                            interpreted as representing the data octet corresponding to that
                            character's encoding in US-ASCII.

                           

                          So according to the standard, & is a sub-delim, and that's exactly what it is used for in a streamset query to PI Web API: it delimits WebIDs.

                          I would be curious why BizTalk does not consider & to be a delimiter here. Could this be a configuration issue?

                            • Re: Unknown or invalid WebID format
                              abrodskiy

                              Hi Gregor,

                              I think we are looking at the wrong end of the stick.

                              I am rather wondering why some calls in PI Web API (like the points call) support proper URI encoding per the standards, and others (like streamsets) don't?

                              It's rather common for URIs to get encoded when they pass through multiple chains of processing in middleware...

                              So usually RESTful web services are compliant with the encoding standards...

                                • Re: Unknown or invalid WebID format
                                  gregor

                                  Alexander Brodskiy wrote:

                                   

                                   

                                  I am rather wondering why some calls in PI Web API (like the points call) support proper URI encoding per the standards, and others (like streamsets) don't?

                                  Are you talking about replacing a space within a PI Point name with %20? If so, please note that a space within a point name is not a delimiter. RFC 3986 foresees replacing reserved characters because of their usage as delimiters. It does not foresee replacing delimiters but that's just my opinion.

                                   

                                  Before this ends with you beating me through the forums, let me reach out to the PI Web API development team for a statement.

                                    • Re: Unknown or invalid WebID format
                                      abrodskiy

                                      Thanks Gregor

                                      I was talking about the example I gave above from the batch, but I get your point, yes... Please let me know what developers have to say on this.

                                      "https://localhost/piwebapi/points?path=%5C%5CMyPIServer%5Csinusoid"

                                        • Re: Unknown or invalid WebID format
                                          schristian

                                          Hello! (I'm a PI Web API developer)

                                           

                                          PI Web API is actually using the correct decoding behavior here.  According to the URI specification, '&' represents the start of a new query parameter, whereas '%26' represents the ampersand character in the query parameter.  They have different behavior; if PI Web API decoded every '%26' to '&' before handling the URI, it wouldn't be able to tell the difference between the two!
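
The distinction can be illustrated with Python's standard query-string parser. This is only a sketch of the general rule (split on the literal delimiters first, decode afterwards); it is not PI Web API's own code, and `AAA` and `BBB` are placeholder WebIDs.

```python
from urllib.parse import parse_qs

# Literal '&' and '=' act as delimiters: two separate webid values.
delimited = parse_qs("webid=AAA&webid=BBB")

# '%26' and '%3d' are decoded only after splitting, so they stay data:
# one webid value that happens to contain '&' and '='.
encoded = parse_qs("webid=AAA%26webid%3dBBB")

print(delimited)
print(encoded)
```

The second query yields a single value, `AAA&webid=BBB`, which has the same shape as the string quoted in the "Unknown or invalid WebID format" error.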

                                           

                                          Likewise, the '%5C' in your Point call example represents the backslash character in the "path" query parameter.  However, backslashes are not used as delimiters in a URI, so it is safe to use them unencoded ('\') as well.  This is probably why the two calls seem different: '\' and '%5C' have the same behavior, so you can safely replace them, but '&' and '%26' do not!
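
The same parser shows why the backslash case looks more forgiving. Again this is a sketch using Python's standard library, not a statement about how PI Web API parses requests internally; the path is the one from the batch example.

```python
from urllib.parse import parse_qs, quote

path = r"\\MyPIServer\sinusoid"

# Percent-encoding the backslashes gives the form used in the batch example.
encoded_query = "path=" + quote(path, safe="")

# Both forms decode to the same single value, because '\' never splits a query.
from_encoded = parse_qs(encoded_query)
from_raw = parse_qs("path=" + path)

print(encoded_query)
print(from_encoded == from_raw)
```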

                                           

                                          As for the original issue, I noticed that in your first URI ("https://server/piwebapi/streamsets/value?webid={webid}%26webid%3d{webid}"), the first '=' is unencoded, but the second one gets turned into '%3d'.  I think that BizTalk might not encode every special character, but instead interprets everything after "webid=" as a query parameter string, and encodes any special characters after it.  Is there anything you are aware of that would cause this to happen?

                                           

                                          Hope that helps

                                            • Re: Unknown or invalid WebID format
                                              abrodskiy

                                              Hi Stephen,

                                              Thank you for taking the time to reply to this thread! I think I now understand the difference and the reasons, though I'm not sure what we are going to do about it.

                                              You rightly noticed that the first "=" is not encoded - that's because it's part of the original URI, while the rest is dynamically concatenated within the Biztalk logic (from multiple tags; the number of tags is not known to the logic upfront) and then passed over as a single parameter. So Biztalk treats the special characters in that parameter as data rather than as delimiters, and encodes all of them...
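
A sketch of the behaviour described here, reproduced in Python rather than BizTalk: the WebIDs are concatenated with their delimiters first, and the result is then encoded as one parameter value. This only imitates the observed output; how BizTalk does it internally is an assumption. `AAA` and `BBB` are placeholder WebIDs.

```python
from urllib.parse import quote

web_ids = ["AAA", "BBB"]

# The tags are concatenated first, delimiters included...
concatenated = "&webid=".join(web_ids)

# ...and the whole string is then encoded as if it were one parameter value.
url = "https://server/piwebapi/streamsets/value?webid=" + quote(concatenated, safe="")
print(url)
```

Python emits upper-case hex digits (`%3D`) where the URL in the original post has `%3d`; RFC 3986 treats the two as equivalent.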

                                              We'll see how we can tweak that logic without hardcoding the number of webids in the original URI (e.g. /value?webid={0}&webid={1}&webid={2} etc.).

                                              The logic needs to be able to orchestrate requests for any number of tags, from 1 to n (although we split the calls into multiple groups, if number of tags goes above 150 or so).
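
For comparison, this is how the "any number of tags, split into groups" requirement might look outside BizTalk. It is a Python sketch only: `streamset_urls` is a hypothetical helper, the IDs are generated placeholders, and the limit of 150 is taken from the figure mentioned above.

```python
from urllib.parse import urlencode


def streamset_urls(base_url, web_ids, max_per_call=150):
    """Yield one URL per group of at most max_per_call WebIDs."""
    for start in range(0, len(web_ids), max_per_call):
        group = web_ids[start:start + max_per_call]
        # A list of (key, value) pairs lets urlencode repeat the 'webid' key;
        # it encodes each value and joins the pairs with literal '&'.
        yield base_url + "?" + urlencode([("webid", w) for w in group])


# Placeholder IDs for illustration only.
ids = ["ID%d" % i for i in range(320)]
urls = list(streamset_urls("https://server/piwebapi/streamsets/value", ids))
print(len(urls))
```

The number of `webid` parameters is driven by the list length, so nothing in the URL template has to be hardcoded.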

                                                • Re: Unknown or invalid WebID format
                                                  gregor

                                                  Alexander Brodskiy wrote:

                                                   

                                                   

                                                  You rightly noticed that the first "=" is not encoded - that's because it's part of the original URI, while the rest is dynamically concatenated within the Biztalk logic (from multiple tags; the number of tags is not known to the logic upfront) and then passed over as a single parameter. So Biztalk treats the special characters in that parameter as data rather than as delimiters, and encodes all of them...

                                                   

                                                  Shouldn't BizTalk offer an option to specify the delimiting character between parameters?