
Why you should use the new AF SDK Search

Blog Post created by bshang Employee on Jun 23, 2016

Motivation

 

Do you want to speed up your AF SDK asset searches, potentially by more than an order of magnitude? If so, please read on.

 

PI AF SDK 2016 introduced new methods to perform searches against assets, such as AF Elements and Event Frames. The goal of this blog post is to demonstrate the better performance and simpler usage of the new search, and to convince you that you should be using it.

 

We will present a series of generic use cases and compare "new" and "traditional" methods for searching.

 

  1. Find and process a list of elements belonging to a template
  2. Find and process a list of elements with a certain attribute value
  3. Find and process a list of child elements given a parent
  4. How the new search obtains a consistent snapshot of a collection

 

We use the term "process" generally here to mean tasks such as modifying the properties of the element, querying its attributes' values, or signing up its attributes to a data pipe.

 

We use an example AF Database that has one root element called LeafElements and 100,000 child elements. Half of the child elements belong to the "Leaf_Rand" element template and the other half to the "Leaf_Sin" element template. Both element templates have attribute templates with a mix of static and PI Point data references. We adopted the AF database structure from our Large Asset Count Project. Please note that we have tried not to make the details of the AF database important; the aim is merely to provide a generic database with which to demonstrate the examples.

 

Note: This post is part of the blog series on new AF SDK 2016 features.

 

 

Examples

Our examples are run on an Azure VM with the following specs:

  • 7 GB RAM
  • AMD 2.10 GHz
  • SQL Server 2014 Express
  • AF Server and AF Client 2.8.0.7444

PI Data Archive, AF Server, SQL Server, and the AF client application are all on the same machine. We don't guarantee the performance numbers below for your environment and network. These results are just a small data point in a large design space, but we hope they give you a reference and encourage you to perform your own tests.

 

 

1. Find and process a list of elements belonging to a template

 

Traditional search

const int pageSize = 1000;
int startIndex = 0;
int totalCount;
AFNamedCollectionList<AFElement> myElements;

do
{
     myElements = AFElement.FindElementsByTemplate(database,
                    null,
                    database.ElementTemplates["Leaf_Sin"],
                    false,
                    AFSortField.Name,
                    AFSortOrder.Ascending,
                    startIndex,
                    pageSize,
                    out totalCount);
     ProcessElements(myElements);

     startIndex += pageSize;
} while (startIndex < totalCount);

 

New search

AFSearchToken templateToken = new AFSearchToken(AFSearchFilter.Template, AFSearchOperator.Equal, database.ElementTemplates["Leaf_Sin"].GetPath());
AFElementSearch elementSearch = new AFElementSearch(database, "FindSinusoidElements", new[] { templateToken });
elementSearch.CacheTimeout = TimeSpan.FromMinutes(10); // Opt in to server-side caching

IEnumerable<AFElement> elements = elementSearch.FindElements(0, false, 1000);
foreach (AFElement element in elements)
{
     ProcessElement(element);
}

 

Code comparison

You can already notice some differences between the traditional and new patterns of searching.

  • The newer pattern for searching presents a more uniform abstraction. The class AFElementSearch represents the search. The search query is described by a list of AFSearchToken objects. Each token corresponds to an AFSearchFilter denoting the condition or property we want to filter on. The search is executed via a FindElements call, and the caller receives an IEnumerable<AFElement> query result that can be looped through. LINQ aficionados should be excited about this.
  • The former pattern relies on various static Find* method overloads (FindElements, FindElementsByTemplate, FindElementsByAttribute) to specify the search criteria. These methods have long argument lists that can be cumbersome to build. They also require the developer to write "wrapper" code, such as a do-while loop, to page through the found elements and keep track of the loop state. The do-while pattern tends to force the developer to think in terms of pages or batches when a foreach loop over individual elements is often more natural. In many cases, code using the new search will be more declarative and easier to maintain.
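
To make the paging point concrete without depending on the AF SDK, here is a minimal, self-contained C# sketch. The Paging.Enumerate helper and the in-memory FetchPage method are hypothetical stand-ins, not AF SDK APIs. They only illustrate how the do-while bookkeeping of a paged call can be hidden behind an IEnumerable<T>, so that the caller writes a plain foreach or a LINQ query, which is roughly the experience the new search offers out of the box.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of the AF SDK): hides page bookkeeping behind IEnumerable<T>.
public static class Paging
{
    // fetchPage(startIndex, pageSize) returns one page plus the total number of matches,
    // mirroring the shape of the traditional paged Find* calls.
    public static IEnumerable<T> Enumerate<T>(
        Func<int, int, (IReadOnlyList<T> Page, int TotalCount)> fetchPage,
        int pageSize)
    {
        int startIndex = 0;
        int totalCount;
        do
        {
            var (page, total) = fetchPage(startIndex, pageSize);
            totalCount = total;
            foreach (T item in page)
            {
                yield return item; // the caller only ever sees individual items
            }
            startIndex += pageSize;
        } while (startIndex < totalCount);
    }
}

public static class PagingDemo
{
    // In-memory stand-in for a server-side collection of element names.
    public static readonly List<string> Names =
        Enumerable.Range(0, 2500).Select(i => "Leaf_" + i).ToList();

    // Stand-in for a paged server call: returns one page and the total match count.
    public static (IReadOnlyList<string> Page, int TotalCount) FetchPage(int startIndex, int pageSize)
    {
        List<string> page = Names.Skip(startIndex).Take(pageSize).ToList();
        return (page, Names.Count);
    }

    public static void Main()
    {
        // No page state in the calling code; LINQ operators compose directly.
        int count = Paging.Enumerate<string>(FetchPage, 1000).Count(n => n.EndsWith("7"));
        Console.WriteLine(count);
    }
}
```

Because the iterator is lazy, a page is only fetched when the foreach (or LINQ operator) actually reaches it, so a caller that stops early never requests the remaining pages.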

 

Performance comparison

I know what you are thinking. What about performance?

This search returns 50,000 elements and we restarted the SQL Server after each run. Our ProcessElement(s) methods simply write the AF Element name to the Console. For more expensive tasks, the timings below would be expected to be higher.

 

Traditional search: 3.2 minutes

New search without caching: 1.1 minutes

New search with caching: 0.15 minutes

 

Why is the new search faster?

  • The new search query does not require a sort (to be executed by SQL Server). Note that the traditional Find* methods require us to specify an AFSortField, but we do not need to specify one when creating an AFElementSearch. In the new search, we can opt in to sorting by using a token with AFSearchFilter.SortField.
  • The new search lets you opt in to caching the object identifiers of found elements on the server by setting the AFElementSearch.CacheTimeout property (Line 3 of the "New search" code above). This effectively takes a "snapshot" of the found collection, caches it, and provides a server-side iterable that the AF client can consume. Caching the identifiers lets SQL Server retrieve subsequent items faster by avoiding repetitive queries. The traditional search, which does not implement caching, incurs more overhead on the SQL Server side.
  • Should you opt in to caching when using the new search? The documentation for CacheTimeout says: "If you will only be getting items from the first page, then it is best to leave the cache disabled. If you will be paging through several pages of items, then it is best to enable the cache by setting this property to a value larger than you expect to be using the search."
  • Exercise for the interested reader: You can use SQL Server Profiler to look at the differences in implementation between the traditional search and new search (with and without caching).

 

Let's look at another example (if you are not yet convinced).

 

 

2. Find and process a list of elements with a certain attribute value

 

Traditional search

const int pageSize = 1000;
int startIndex = 0;
int totalCount;
AFNamedCollectionList<AFElement> myElements;

AFElementTemplate template = database.ElementTemplates["Leaf"];
AFAttributeValueQuery query = new AFAttributeValueQuery(template.AttributeTemplates["SubTree"], AFSearchOperator.Equal, "1");
do
{
     myElements = AFElement.FindElementsByAttribute(
                    null,
                    "*",
                    new AFAttributeValueQuery[] { query },
                    true,
                    AFSortField.Name,
                    AFSortOrder.Ascending,
                    startIndex,
                    pageSize,
                    out totalCount);
     ProcessElements(myElements);

     startIndex += pageSize;
} while (startIndex < totalCount);

 

New search

AFElementTemplate template = database.ElementTemplates["Leaf"];
AFSearchToken templateToken = new AFSearchToken(AFSearchFilter.Template, AFSearchOperator.Equal, template.GetPath());
AFSearchToken valueToken = new AFSearchToken(AFSearchFilter.Value, AFSearchOperator.Equal, "1", template.AttributeTemplates["SubTree"].GetPath());
AFElementSearch elementSearch = new AFElementSearch(database, "FindSubTreeElements", new[] { templateToken, valueToken });
elementSearch.CacheTimeout = TimeSpan.FromMinutes(10); // Opt in to server-side caching

IEnumerable<AFElement> elements = elementSearch.FindElements(0, false, 1000);
foreach (AFElement element in elements)
{
     ProcessElement(element);
}

 

Search tokens are ANDed together

Here, we have two tokens. As mentioned in the search query syntax documentation, please note that all the search tokens are ANDed together.

 

Performance comparison

This search returns 10,000 elements.

Traditional search: 0.20 minutes

New search without caching: 0.14 minutes

New search with caching: 0.09 minutes

 

Wow, this new search is amazing. Fewer lines of code and better performance. A developer's paradise.

 

 

3. Find and process a list of child elements given a parent

 

Traditional search

const int pageSize = 1000;
int startIndex = 0;
int totalCount;
AFNamedCollectionList<AFElement> myElements;

do
{
     myElements = AFElement.FindElements(database,
                    database.Elements["LeafElements"], 
                    "*",
                    AFSearchField.Name, false,
                    AFSortField.Name,
                    AFSortOrder.Ascending,
                    startIndex,
                    pageSize,
                    out totalCount);
     ProcessElements(myElements);

     startIndex += pageSize;
} while (startIndex < totalCount);

 

New search

AFSearchToken rootToken = new AFSearchToken(AFSearchFilter.Root, AFSearchOperator.Equal, database.Elements["LeafElements"].GetPath());
AFSearchToken descToken = new AFSearchToken(AFSearchFilter.AllDescendants, AFSearchOperator.Equal, "false");
AFElementSearch elementSearch = new AFElementSearch(database, "FindLeafElements", new[] { rootToken, descToken });
elementSearch.CacheTimeout = TimeSpan.FromMinutes(10); // Opt in to server-side caching

IEnumerable<AFElement> elements = elementSearch.FindElements(0, false, 1000);
foreach (AFElement element in elements)
{
     ProcessElement(element);
}

 

Performance comparison

This search returns 100,000 elements.

Traditional search: 9.6 minutes

New search without caching: 5.4 minutes

New search with caching: 0.32 minutes

 

 

4. How the new search obtains a consistent snapshot of a collection

 

We mentioned earlier that the new search can cache the search results. See, for example, the properties under AFElementSearch Class. If your application uses the traditional paging pattern and another client modifies the collection between page requests, you may miss items or see duplicates, because each page is a fresh query against the current state of the collection. If you use the new search and opt in to server-side caching, then upon the initial search call, the server takes a "snapshot" of the found items and caches their identifiers. The server uses this cache to provide items as the client iterates. Thus, the client sees a consistent snapshot of the collection as of the time of the query and is unaffected by any modifications to the query result set that occur while it iterates through the results.
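
The difference can be illustrated with a small, self-contained C# simulation. This is only a sketch of the general idea under simplifying assumptions: a plain List<int> of identifiers stands in for the server-side collection, and the SnapshotDemo class and its methods are hypothetical names, not AF SDK APIs. It does not model how the AF Server implements its cache.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotDemo
{
    // Index-based paging: every page is a fresh query against the live collection.
    public static List<int> PageByIndex(List<int> live, int pageSize, Action afterFirstPage)
    {
        var seen = new List<int>();
        int startIndex = 0;
        bool firstPage = true;
        while (startIndex < live.Count)
        {
            seen.AddRange(live.Skip(startIndex).Take(pageSize));
            startIndex += pageSize;
            if (firstPage)
            {
                afterFirstPage(); // another client modifies the collection here
                firstPage = false;
            }
        }
        return seen;
    }

    // Snapshot paging: identifiers are captured once, then handed out page by page.
    public static List<int> PageBySnapshot(List<int> live, int pageSize, Action afterFirstPage)
    {
        List<int> snapshot = live.ToList(); // copy of the identifiers at query time
        var seen = new List<int>();
        for (int start = 0; start < snapshot.Count; start += pageSize)
        {
            seen.AddRange(snapshot.Skip(start).Take(pageSize));
            if (start == 0)
            {
                afterFirstPage(); // the same concurrent modification
            }
        }
        return seen;
    }

    public static void Main()
    {
        // Ten items, pages of five; item 0 is deleted after the first page is read.
        List<int> liveA = Enumerable.Range(0, 10).ToList();
        List<int> paged = PageByIndex(liveA, 5, () => liveA.RemoveAt(0));

        List<int> liveB = Enumerable.Range(0, 10).ToList();
        List<int> snap = PageBySnapshot(liveB, 5, () => liveB.RemoveAt(0));

        Console.WriteLine("Index paging saw:    " + string.Join(",", paged));
        Console.WriteLine("Snapshot paging saw: " + string.Join(",", snap));
    }
}
```

In the index-based run, deleting item 0 shifts every remaining item down by one position, so item 5 slides into the first page's index range after that page has already been read and is never returned. The snapshot run returns all ten identifiers that matched at query time. What a real server returns for an identifier whose item has since been deleted is outside the scope of this sketch.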

 

 

AFElementSearchBuilder

 

I'm a fan of using the Builder Pattern to construct complex objects. Notice in the above, we have to first construct the filters before constructing the search object. This seems a little backwards. Intuitively, we'd like to be able to construct the search object and then add our filters.

 

We provide an AFElementSearchBuilder class to help with this. Example usage is below:

AFElementSearch elementSearch = AFElementSearchBuilder.Create()
     .SetDatabase(database)
     .SetName("FindLeafElements")
     .AddToken(new AFSearchToken(AFSearchFilter.Root, AFSearchOperator.Equal, elementPath))
     .AddToken(new AFSearchToken(AFSearchFilter.AllDescendants, AFSearchOperator.Equal, "false"))
     .Build();

List<AFElement> elementsList = elementSearch.FindElements(0, false, 1000).ToList();

 

You can follow a similar pattern to write your own builders for AFEventFrameSearch.
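
For readers who do not have the builder class at hand, the following is a minimal sketch of what such a builder could look like. It is an illustration under assumptions, not necessarily the implementation provided with this post: the field names and the null check are my own, and it relies only on the AFElementSearch constructor shape used in the examples above (database, name, token array). It requires a reference to the AF SDK assembly.

```csharp
using System;
using System.Collections.Generic;
using OSIsoft.AF;
using OSIsoft.AF.Search;

// Illustrative sketch only; the class provided with the original post may differ.
public sealed class AFElementSearchBuilder
{
    private readonly List<AFSearchToken> _tokens = new List<AFSearchToken>();
    private AFDatabase _database;
    private string _name;

    private AFElementSearchBuilder() { }

    // Entry point of the fluent chain.
    public static AFElementSearchBuilder Create()
    {
        return new AFElementSearchBuilder();
    }

    public AFElementSearchBuilder SetDatabase(AFDatabase database)
    {
        _database = database;
        return this;
    }

    public AFElementSearchBuilder SetName(string name)
    {
        _name = name;
        return this;
    }

    // Tokens accumulate in the order they are added; the search ANDs them together.
    public AFElementSearchBuilder AddToken(AFSearchToken token)
    {
        _tokens.Add(token);
        return this;
    }

    public AFElementSearch Build()
    {
        // Assumption: fail early in the builder rather than inside the SDK constructor.
        if (_database == null)
        {
            throw new InvalidOperationException("SetDatabase must be called before Build.");
        }

        // Same constructor shape as the "New search" examples: database, name, token array.
        return new AFElementSearch(_database, _name, _tokens.ToArray());
    }
}
```

Each Set/Add method returns the builder itself, which is what allows the chained call style shown in the usage example. A builder for AFEventFrameSearch could follow the same structure with the search type swapped.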

 

As mentioned in the search query syntax documentation, please note that all the search tokens are ANDed together.

 

 

Conclusion

 

I hope this post demonstrates that using the new search in AF SDK can be a valuable investment and that the transition does not require much code. It is worth adopting when some of the following are true:

  • Your AF database contains large collections (10,000+ elements or event frames).
  • Your applications perform processing on large collections of elements or event frames.
  • Your applications don't require the returned collection to be sorted. You can still opt in to sorting using AFSearchFilter.SortField.
  • You find that some of your current asset search implementations are slow.
  • You want assurance that the server provides consistent snapshots of your collections.

 

For more information, please consult the PI AF SDK Reference:

Search Query Syntax Overview

AFSearch Class

 

Thank you for reading and as always, questions and comments are welcome below. Please look forward to the next post in this series on asynchronous data access!
