Every day more and more customers get in contact with us asking how does PI could be used to leverage their GIS data and how their geospatial information could be used in PI. Our answer is the PI Integrator for Esri ArcGIS. If your operation involves any sort of georeferenced data, geofencing or any kind of geospatial data, I encourage you to give a look at what the PI Integrator for Esri ArcGIS is capable of. But this is PI Developers Club, a haven for DIY PI nerds and curious data-driven minds. So, is it possible to create a custom data reference that provides access to some GIS data and functionalities? Let's do it using an almost-real-life example.
The manager of a mining company has to monitor some trucks that operate at the northmost deposit of their open-pit mine. Due to recent rains, their geotechnical engineering team has mapped an unsafe area that should have no more than three trucks inside of it. They have also provided a shapefile with a polygon delimiting a control zone (you can download the shapefile at the end of this post). The manager wants to be notified whenever the number of trucks inside the control area is above a given limit.
Caveat lector, I'm not a mining engineer, so please excuse any inaccuracy or misrepresentation of the operations at a zinc mine. It's also important to state that the mine I'm using as an example has no relation to this blog post nor the data I'm using.
If you are familiar with GIS data, you know it's an endless world of file formats, coordinate systems, and geodetic models. Unless you have a full-featured GIS platform, it's very complicated to handle all possible combinations of data characteristics. So, for the sake of simplicity, this article uses Esri's Shapefile as its data repository and EPSG:4326 as our coordinate system.
1.3 A Note on CDRs
As the name implies, a CDR should be used to get data from an external data source that we don't provide support out-of-the-box. Simple calculations can be performed, but you should exercise caution as, depending on how intensive your mathematical operations are, you can decrease the performance of an analysis using this CDR. For our example, shapefiles, GeoJsons, and GeoPackages can be seen as a standalone data source files (as they contain geographic information in it) and the math behind it is pretty simple and it won't affect the server performance.
1.4 The AF Structure
Following the diagram on 1.1, our AF structure renders pretty simply: a Zinc Mine element with Trucks as children. The mine element has three attributes: (a) the number of trucks inside the control area (a roll-up analysis), (b) the maximum number of trucks allowed in the control area (a static attribute) and (c) the control area itself.
The control area is a static attribute with several supporting children attributes holding the files of the shapefile. Due to shapefile specification, together with the SHP file you also need the other three.
Finally, the truck element has two historical attributes for its position and the one using our CDR to tell if it's currently inside the control area or not (this is the one used by the roll-up analysis at the zinc mine element). Here I'm using both latitude and longitude as separated attributes, but if you have AF 2017 R2 or newer, I encourage you to have this data stored as a location attribute trait.
2. The GISToolBox Data Reference
The best way to present a new CDR by showing its config string:
Breaking it down: Shape is the attribute that holds the shapefile and its supporting files. It's actually just a string with a random name. What is important are the children underneath it that are file attributes and hold the shape files. Method is the method we want to execute. Latitude and Longitude are self-explanatory and they should also point to an attribute. If you don't provide a lat/long attribute, the CDR will use the location attribute trait defined for the element. There are also two other parameters that I will present later.
The code is available here and I encourage you to go through it and read the comments. If you want to learn how to create a custom data reference, please check the useful links section at the end of this post.
The CDR starts by overriding the GetInputs method. There we use the values passed by the config string, to get the proper attributes. You should pay close attention to the way the shapefile is organized, as there are some child attributes holding the files (these child attributes are AFFiles). Once this is done, the GetValue is called. It starts by downloading the shapefile from the AF server to a local temporary folder and creating a Shapefile object. Although Esri's specification is open, I'm using DotSpatial to incorporate the file handling and all spatial analysis we do. Once we have the shapefile, it goes through some verifications and we finally call the method that gets the data we want: GISGelper.EvaluateFunction(). For performance reasons, I'm also overriding the GetValues method. The reason is that we don't need to recreate the files for every iteration on the AFValues array sent by the SDK.
2.2 Available Methods
Taking into account what I mentioned on 1.3, we should not create sophisticated calculations so the CDR doesn't slow down the Analysis engine. To keep it simple and with good performance, I have implemented the following methods:
2.3 CRS Conversion
The GISToolbox considers that both lat/long and shapefiles are using the same CRS. If your coordinate uses a different base from your shapefile, you can use two other configuration parameters (FromEPSGCode and ToEPSGCode) to convert the coordinate to the same CRS used by the shapefile.
Let's say you have a shapefile using EPSG:4326, but your GPS data comes on EPSG:3857. For this case, you can use:
- It doesn't implement an AF Data Pipe, so it can't be used with event-triggered analysis (only periodic).
- It handles temporary files, the user running your AF services must have read/write permissions on the temporary folder.
- It only supports EPSG CRS
- It only supports shapefiles.
Let's go back to our manager who needs to monitor the trucks inside that specific control area.
3.2 Truck Simulation
In order to make our demo more realistic, I have created a small simulation. You can download the shapefile at the end of this post (Trucks_4326.zip). Here's a gif showing the vehicles' position
The trucks start outside of the control area and they slowly move towards it. Here's a table showing if a given truck is inside the polygon at a specific timestamp:
|TS||Truck 001||Truck 002||Truck 003||Truck 004||Total|
The simulation continues until the 14ᵗʰ iteration, but note how the limit is exceeded on the timestamp 5, so we should get a notification right after entering the 5ᵗʰ iteration.
The notification is dead simple: every 4 seconds I check the Active Trucks attribute against the maximum allowed. And as I mentioned before, the Active Trucks is a roll-up counting the IsInside attribute of each truck.
Shall we see it in action?
The simulation files are available at the end of this post. Feel free to download and explore it.
This proof of concept demonstrates how powerful a Custom Data Reference can be. Of course, it doesn't even come close to what the PI Integrator for Esri ArcGIS is capable of, but it shows that for simple tasks, we can mimic functionalities from bigger platforms and can be used as an alternative while a more robust platform is not available.
If you like this topic and think that AF should have some basic support to GIS, please chime in on the user voice entry I've created to collect ideas from you.