Creating and registering a custom dataset for detailed information

Published on: 2022-12-27

The custom object mechanism lets us extend the library of operators, datasets and visualization objects that can be used in map configurations. This mechanism has been part of the Carmenta Engine SDK for a long time, but recent versions have introduced some upgrades. For example, the custom operator interface now supports parallelization of custom operations for enhanced performance when used with a TileLayer.

In Carmenta Engine 5.15, custom datasets gained two new capabilities. One is support for the dynamic cache mode in an OrdinaryLayer containing a custom dataset. The other, which we’ll go through in this guide, is the ability to generate and use the DataSetInfo class for a custom dataset, in the same way as for a built-in dataset.

The DataSetInfo class answers two main use cases when working with Carmenta Engine:

Use case 1: getting detailed information from a dataset instance
Use case 2: dynamically instantiating a dataset from a given file or folder path

When working with custom datasets, it is up to you to implement the relevant interface methods to support the two use cases above. Let’s start with a little refresher on how to create and use a custom dataset.

Basic setup for a custom dataset

For more hands-on experience in creating a custom dataset, or custom objects in general, you can follow the Carmenta Engine self-study training course.

Creating a custom dataset

When creating a custom dataset for Carmenta Engine, you need to create a class that either implements the ICustomDataSet interface or extends the CustomDataSetAdapter class. The latter is usually the quicker way to get started, as it provides common functionality used in many custom datasets, but the ICustomDataSet interface can give more flexibility in some cases.

Both have two main methods to implement or override, one for the dataset initialization, and one for querying features from the dataset. The initialization method would typically take a file or folder path from the custom dataset properties and process these files to create a set of features out of them. But the custom dataset can also be implemented to create features differently, from class instances in application code, for example.

There are more considerations, for example when working with rasters, or when you want to support dynamic cache mode, but the core functionality of a custom dataset is declared in the ICustomDataSet interface.

Using a custom dataset

In most cases, the custom class is placed in a separate library, either a DLL for C# and C++ custom objects, or a JAR for Java. This stems from the flexibility provided by the custom object mechanism: a custom object can be used in a Carmenta Engine application of any API, regardless of which API the custom object library is using. For example, one might write a C++ custom object library, but use it in a C# or Java UI application, or even in a simple px-file viewed in Carmenta Explorer!

In that case, the custom dataset can simply be used by adding a CustomDataSetProxy to a ReadOperator, and configuring at least its three main properties:

api: Net/Cpp/Java, the API in which the custom object library is written
className: the name of the custom class, possibly with namespace
libraryName: the absolute or relative path and file name of the library

The CustomDataSetProxy can also be instantiated from code, rather than going through a configuration file. If the custom class lives in a separate library, the constructor can take these same three properties. If you would rather not work with separate libraries and their paths, you can use the constructor overload that takes an instance of the custom class directly.

CustomDataSetProxy proxy = new CustomDataSetProxy(
    new CsvDataSet(), 
    new AttributeSet(/*Fill with your specific user properties here*/)
);

Use case 1: getting detailed information from a custom dataset instance

When calling GetDataSetInfo on an existing instance of a built-in dataset, a DataSetInfo instance is returned, containing various information on the data such as its bounds, its path and file name, its estimated viewing scales, the available attributes on vector data, and more.

This use case is about letting a custom dataset provide that same detailed information when calling GetDataSetInfo on an existing CustomDataSetProxy instance. For example, here’s the difference between a custom dataset without DataSetInfo support and the same custom dataset with it.

For both use cases, the ICustomDataSetInfoProvider interface is the answer. The custom dataset class should implement this interface in addition to implementing ICustomDataSet or extending CustomDataSetAdapter.

This interface provides two methods: a GetInfo method with two overloads, and an Instantiate method.

For use case 1, the method that needs a full implementation is the parameterless GetInfo method.

DataSetInfo ICustomDataSetInfoProvider.GetInfo()
{
	DataSetInfo dataSetInfo = new DataSetInfo();
	dataSetInfo.Bounds = bounds;
	dataSetInfo.FileName = fileName;
	dataSetInfo.Path = path;
	dataSetInfo.Crs = Crs.Wgs84LongLat;
	dataSetInfo.CanCreateDataSet = true;
	dataSetInfo.DataFormatCode = "CustomCSV";
	dataSetInfo.DataFormatDescription = "Custom DataSet for reading points from CSV files";
	dataSetInfo.DataRepresentation = DataRepresentation.Vector;
	dataSetInfo.HasVisualization = false;
	dataSetInfo.EstimatedScale = 400000.0;
	dataSetInfo.EstimatedMaxScale = 10000000.0;
	dataSetInfo.FeatureAttributes.Add(new FeatureAttribute("NAME", AttributeType.String));
	return dataSetInfo;
}

If support for use case 2 is not required, the other methods can simply return null.

DataSetInfo ICustomDataSetInfoProvider.GetInfo(String filePath) { return null; }
ICustomDataSet ICustomDataSetInfoProvider.Instantiate(DataSetInfo dataSetInfo) { return null; }

And that’s it! With that simple code, calling GetDataSetInfo on an instance of CustomDataSetProxy will return detailed information. This is also useful when custom datasets are built into a separate library that is shared between projects: a project using the DLL might not have access to the implementation source code, so retrieving the DataSetInfo can provide much-needed information on data formats and more.

Use case 2: dynamically instantiating a custom dataset from a source file

This second use case is more advanced than the first one. The goal is to be able to dynamically create an instance of a custom dataset based on source files on disk.

This mechanism already exists for built-in datasets, and is useful in contexts where we want to dynamically add more geodata sources to a map application, for example. For a single GeoPackage file, this is what it could look like:

DataSetInfo info = DataSetInfo.FromFile("path/to/a/file.gpkg");
Debug.Assert(info.CanCreateDataSet);
DataSet myNewDataSet = info.CreateDataSet() as MapPackageDataSet;
Debug.Assert(myNewDataSet != null);

What if we want the same support for our custom dataset? There are two steps to take to support this second use case.

Fully implement all methods in ICustomDataSetInfoProvider.
Register the custom dataset implementation as a DataSetInfo provider for a certain file type.

This is what the implementation of the ICustomDataSetInfoProvider methods might look like to answer use case 2.

DataSetInfo ICustomDataSetInfoProvider.GetInfo(String filePath)
{
	// Check that the file exists and is readable (File and Path require "using System.IO;").
	// The using statement closes the stream again, so the file is not left open.
	try { using (File.OpenRead(filePath)) { } } catch { return null; }
	DataSetInfo dataSetInfo = new DataSetInfo();
	// Fill out dataSetInfo...
	dataSetInfo.Path = Path.GetDirectoryName(filePath);
	dataSetInfo.FileName = Path.GetFileName(filePath);
	return dataSetInfo;
}

ICustomDataSet ICustomDataSetInfoProvider.Instantiate(DataSetInfo dataSetInfo)
{
	// CsvDataSet constructor loads data from file at Path/FileName.
	// Path, FileName are set in GetInfo above.
	return new CsvDataSet(dataSetInfo.Path, dataSetInfo.FileName); 
}

The minimum work for GetInfo is to try to open the file, to see if it exists and is readable, but in a real example, we would try to open and read the contents of the file, and fill the contents of the DataSetInfo accordingly.
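As a rough illustration of what reading the file contents could involve, here is a minimal sketch that scans a CSV file and collects the values a GetInfo implementation might copy into the DataSetInfo, such as the bounds. It deliberately uses no Carmenta Engine types. The CsvSummary and CsvProbe names and the assumed "NAME,X,Y" row layout are hypothetical and not taken from the sample CsvDataSet.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical summary type, not part of Carmenta Engine. A GetInfo(filePath)
// implementation could copy these values into the DataSetInfo it returns.
public sealed class CsvSummary
{
    public int PointCount;
    public double MinX = double.PositiveInfinity, MinY = double.PositiveInfinity;
    public double MaxX = double.NegativeInfinity, MaxY = double.NegativeInfinity;
}

public static class CsvProbe
{
    // Assumes rows of the form "NAME,X,Y" (an assumption for this sketch).
    // Returns null when the file cannot be read or contains no valid row,
    // so the caller can reject the file by returning null as well.
    public static CsvSummary Summarize(string filePath)
    {
        string[] lines;
        try { lines = File.ReadAllLines(filePath); }
        catch (Exception) { return null; }

        var summary = new CsvSummary();
        foreach (string line in lines)
        {
            string[] fields = line.Split(',');
            if (fields.Length < 3) continue;

            // Invariant culture: "18.07" must parse the same on every machine.
            if (!double.TryParse(fields[1], NumberStyles.Float, CultureInfo.InvariantCulture, out double x) ||
                !double.TryParse(fields[2], NumberStyles.Float, CultureInfo.InvariantCulture, out double y))
                continue; // Skips header rows and malformed lines.

            summary.PointCount++;
            summary.MinX = Math.Min(summary.MinX, x);
            summary.MinY = Math.Min(summary.MinY, y);
            summary.MaxX = Math.Max(summary.MaxX, x);
            summary.MaxY = Math.Max(summary.MaxY, y);
        }
        return summary.PointCount > 0 ? summary : null;
    }
}
```

Returning null for unreadable or empty files mirrors the behavior of the GetInfo example, which returns null when the file cannot be opened.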

The important part of the implementation is to fill out the DataSetInfo Path and FileName properties based on the existing file path, because the Instantiate method will directly receive the same DataSetInfo created by GetInfo. The Instantiate method can then return a new instance of the custom dataset. Typically you would create a constructor for the dataset which takes in the path and filename retrieved from dataset info.
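The split into Path and FileName, and the later recombination inside the dataset constructor, relies only on standard .NET path handling. The sketch below shows that round trip in isolation. The PathRoundTrip helper is hypothetical and only illustrates one way a constructor might rebuild the full path.

```csharp
using System;
using System.IO;

public static class PathRoundTrip
{
    // Splits a file path into the two parts that GetInfo stores in
    // the DataSetInfo Path and FileName properties.
    public static (string Directory, string FileName) Split(string filePath)
    {
        // GetDirectoryName returns null for a root path and "" for a bare
        // file name; normalize null to "" so Join never receives null.
        string directory = Path.GetDirectoryName(filePath) ?? string.Empty;
        return (directory, Path.GetFileName(filePath));
    }

    // Rebuilds the full path, as a dataset constructor might do before
    // opening the file.
    public static string Join(string directory, string fileName)
    {
        return Path.Combine(directory ?? string.Empty, fileName);
    }
}
```

A bare file name such as "points.csv" yields an empty directory part, and Path.Combine then returns the file name unchanged, so relative paths without a folder still resolve against the current working directory.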

The second step is the registration of the custom dataset as a DataSetInfo provider for a certain file type. Registering allows Carmenta Engine to know that it should try this custom dataset implementation when it comes across a file with that extension. This is done in application start-up code, through a call to the static method CustomDataSetProxy.RegisterInfoProvider.

This method has two overloads, similar to the CustomDataSetProxy constructor overloads. One can be used if the custom dataset is defined in a separate library, in which case the three properties api, className, and libraryName must be provided as parameters to RegisterInfoProvider. The other can be used if the custom dataset is defined in the application’s codebase, in which case you must provide a dummy instance of this custom dataset in the RegisterInfoProvider parameters.

For both overloads, the first parameter represents the file extension for which you are registering this custom dataset. For example, registering a custom dataset reading CSV files could look like this:

CustomDataSetProxy.RegisterInfoProvider("csv", new CsvDataSet());

Or like this:

CustomDataSetProxy.RegisterInfoProvider("csv", CustomApi.Net, "./MyCustomObjects.dll", "MyNamespace.CsvDataSet");

Note that if the provided extension is already supported by a built-in Carmenta Engine dataset, the custom dataset takes priority when searching for a DataSetInfo provider matching a given file. This makes it easy to override the built-in Carmenta Engine dataset implementation with a custom dataset implementation that is more specific to your usage of the data. A wildcard expression can also be used as the extension, to match several file types.

Once the implementation and registration steps are both done, custom dataset instances can be dynamically created based on a given file or folder:

DataSetInfo info = DataSetInfo.FromFile("path/to/a/file.csv");
Debug.Assert(info.CanCreateDataSet);
DataSet myNewCustomDataSet = info.CreateDataSet();
Debug.Assert(myNewCustomDataSet != null);

Here’s a simple example application, based on the Carmenta Engine sample custom CsvDataSet, which can dynamically load any CSV file from the disk, and add it to the view: CsvViewer.zip

In this example, a simple file browser can be opened to choose any file with the extension “txt” or “csv”. A custom DataSet instance is created using the process described in use case 2. You can try it out with the CSV file “points.txt” included in the code folder.

The application code also provides a few example usages for the created instance, though the list is not exhaustive:

The dataset instance is added to an existing DataSetSet connected to the View. The point Features returned by the instance will simply be visualized with a symbol.
The Features will also be visualized with a text attribute if the DataSetInfo returned at least one Feature attribute.
The View area is updated to center on the loaded data.

As a bonus, since the DataSetInfo mechanism is exactly the same as for built-in datasets, this application also works with other vector files! Simply change the target extensions in the file browser to load other file types, such as a Shapefile.