This article describes the primary factors that influence the processing performance of MapTiler Engine and how they depend on one another. There are three main factors: Storage, CPU, and RAM. Let's look at them in more detail.
Storage
In most cases, storage is the number one bottleneck for MapTiler Engine performance. You need a storage solution that can handle the data throughput generated by MapTiler Engine. The more CPU cores are used for rendering and the faster they are, the higher the data throughput needed to keep them all busy. Here are our recommendations regarding the storage solution:
Use fast SSD drives (SATA or NVMe)
Do not use HDDs. They are too slow to keep up with modern CPUs.
RAID 0 Configurations
If you already have a fast SSD and still need higher performance, explore the RAID 0 configuration options.
Render the data locally.
Don't use network-shared drives for either input or output. Here is why:
a) Most network storage solutions are significantly slower than a local SSD drive. Even if you have a fast local network connection (10+ Gbps), the storage behind the network share might not be fast enough. Moreover, the overhead of network file protocols reduces the actual data throughput. Therefore, unless you have an enterprise-grade network share and you are sure that it can sustain very high data throughput, we recommend keeping network shares out of the processing pipeline.
b) A network connection can be unstable and introduce file reading errors. If network access fails, MapTiler Engine might be unable to read a certain file, and the rendering process is interrupted. The longer the rendering time (i.e. the bigger the project), the higher the chance of a network error occurring.
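One rough way to check whether a drive can keep up is to measure its sequential write throughput before rendering. The sketch below is not part of MapTiler Engine. It assumes a Linux-like system whose `dd` accepts `bs=1M` and `conv=fsync` (GNU coreutils does), and the function name `probe_write_speed` is made up for illustration. Writing zeros can flatter drives or filesystems that compress data, so treat the result as an upper bound.

```shell
# Hypothetical helper: write SIZE_MB megabytes of zeros into DIR, force them
# to disk, and print dd's summary line (bytes copied, elapsed time, rate).
probe_write_speed() {
  dir="$1"
  size_mb="${2:-1024}"
  probe="$dir/.write_probe.$$"
  # conv=fsync makes dd flush to the device before it reports, so the
  # figure is not just the speed of the page cache.
  LC_ALL=C dd if=/dev/zero of="$probe" bs=1M count="$size_mb" conv=fsync 2>&1 | tail -n 1
  # Remove the probe file so nothing is left behind in the target directory.
  rm -f "$probe"
}

# Example: probe_write_speed /path/to/output/dir 2048
```

Running the same probe against a local SSD and against a network share gives a quick, if crude, comparison of the two.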
CPU
MapTiler Engine is primarily a CPU-intensive application. However, since most modern machines have sufficiently powerful CPUs, storage usually remains the number one bottleneck. Two CPU properties matter most:
- The number of CPU cores - the more cores your CPU has, the bigger the parallel processing power.
- CPU clock speed - the faster the individual cores are, the faster data processing as a whole.
MapTiler Engine is a multi-threaded application, and it will utilize all resources it has at its disposal (taking into account the limits of your license). However, this holds only to a certain degree, as there are some caveats:
- Most modern CPUs have HyperThreading capabilities (every core can handle two threads at a time). In these cases, the operating system might report twice as many available cores as the CPU physically has. However, the processing speed of such CPUs isn't twice as high as that of CPUs without HyperThreading. While the CPU can distribute the load onto twice as many virtual cores (which might be beneficial in some cases), the processing speed remains unchanged.
- In rare cases, when your CPU is powerful enough and the subscription plan allows you to utilize enough of its capabilities, you might hit bottlenecks caused by other parts of your system (most often, the storage solution). In some cases, this leads to worse performance than if you used only some of your CPU cores (you can always adjust the number of CPU cores used for the rendering process in Account -> License key).
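To see whether the operating system is reporting logical rather than physical cores, you can compare the two counts. The sketch below assumes Linux with `lscpu` (util-linux) and `nproc` (coreutils) installed. `lscpu -p=CORE,SOCKET` prints one `core,socket` line per logical CPU, so counting the distinct pairs gives the number of physical cores. The helper name `count_physical_cores` is hypothetical.

```shell
# Hypothetical helper: count distinct core,socket pairs in the parseable
# output of `lscpu -p=CORE,SOCKET`, read from standard input.
count_physical_cores() {
  # Drop the '#' comment header, deduplicate the pairs, count what is left.
  grep -v '^#' | sort -u | wc -l | tr -d ' '
}

# Usage on Linux:
#   lscpu -p=CORE,SOCKET | count_physical_cores   # physical cores
#   nproc                                         # logical CPUs available
```

If the logical count is double the physical count, the physical count is a reasonable starting point when adjusting the number of cores used for rendering.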
RAM
RAM consumption varies a lot depending on the format of the data you feed into MapTiler Engine. However, the general rule is that the bigger the input files, or the more of them there are, the more RAM MapTiler Engine will consume. We recommend having at least 16 GB, preferably 32 GB. RAM clock speed might slightly affect performance, but not nearly as much as the storage solution or the CPU.
Additional notes
Parameters of the input data
Apart from the hardware itself, the way your input data is structured can greatly influence the resulting processing speed. Understandably, you can't always change these parameters, but here are the general rules:
- Resolution of the input files - the bigger the resolution, the more pixels to process, and the more time needed to process them all.
- Compression of the data inside the input files. Some formats allow for compression of the data stored inside them (GeoTiff, for example). If the data is compressed, the overhead needed to decode it negatively impacts the processing speed.
- Number of files in the dataset. If there are many files, the overhead of opening and closing them negatively impacts the processing speed.
- Overlap of the input files. The more the input files overlap, the longer it takes to render the data covering a given area.
- The desired precision (aka max zoom level). This heavily depends on the resolution of the input data. The higher the resolution of the input data, the higher the maximal native zoom level. This influences the number of tiles that will be generated. We don't recommend increasing the max zoom level by more than 1-2 zoom levels over the natively calculated value, as this doesn't improve the output quality and only significantly increases the rendering time.
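The cost of extra zoom levels can be illustrated with standard Web Mercator tiling arithmetic. With 256-pixel tiles, the ground resolution at the equator halves with each zoom level, and the number of tiles covering a given area quadruples. The sketch below is a back-of-the-envelope aid, not how MapTiler Engine calculates its native zoom level (the software may round or choose differently). It assumes the input resolution is given in metres per pixel, and the function names are hypothetical.

```shell
# Web Mercator ground resolution at zoom 0 with 256-pixel tiles, in metres
# per pixel at the equator (Earth circumference / 256).
ZOOM0_RES=156543.03392804097

# Fractional zoom level whose resolution matches RES_M (metres per pixel).
matching_zoom() {
  awk -v r="$1" -v z0="$ZOOM0_RES" 'BEGIN { printf "%.2f\n", log(z0 / r) / log(2) }'
}

# How many times more tiles the deepest level holds after N extra zoom levels.
overzoom_tile_factor() {
  awk -v n="$1" 'BEGIN { print 4 ^ n }'
}

# matching_zoom 0.5       -> 18.26
# overzoom_tile_factor 2  -> 16
```

For imagery at 0.5 m per pixel, the matching zoom is a little over 18. Going two levels beyond the native zoom multiplies the tile count at the deepest level by 16 without adding any detail.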
Structure of the data inside the input files
MapTiler Engine is most efficient when reading 256x256 blocks of data. For large raster datasets, a tile-based format (as opposed to a scanline-based one) will drastically speed up processing. A block size similar to the output tile size, e.g. 256x256 or 512x512, will significantly reduce the processing time.
Both the tiled layout and the block size can be set using the command line tools of the GDAL library, such as gdal_translate.
One of the formats that utilize all of the mentioned optimizations is the tiled GeoTIFF format. To convert your file using the gdal_translate utility, use this command in the terminal:
gdal_translate -of GTiff -co COMPRESS=DEFLATE -co TILED=YES \
-co BLOCKXSIZE=512 -co BLOCKYSIZE=512 \
input.tiff output.tiff
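To check how a file is laid out before or after conversion, you can inspect it with `gdalinfo`, which prints a line such as `Band 1 Block=512x512 Type=Byte, ...` for each band. The helper below is a small sketch with a hypothetical name. It only extracts the first block size from `gdalinfo` output read from standard input, and assumes GDAL is installed.

```shell
# Hypothetical helper: print the block size (e.g. 512x512) found on the
# first "Block=" line of gdalinfo output read from standard input.
first_block_size() {
  sed -n 's/.*Block=\([0-9]*x[0-9]*\).*/\1/p' | head -n 1
}

# Usage:
#   gdalinfo output.tiff | first_block_size
```

A tiled file reports a square block such as 256x256 or 512x512, while a scanline-based file typically reports a block as wide as the image and only one or a few rows high.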
Conclusion
Geodata comes in many different shapes and sizes, and the factors described above influence each other to varying degrees. It is therefore practically impossible to come up with a general formula for improving processing performance. If you've hit a bottleneck in your processing, you'll need to use the knowledge from this article and determine the root cause empirically. If you have a stable data processing pipeline set up, you'll most likely encounter a limited number of differently structured datasets. With time, you'll learn what the usual weak spots in your setup are and how to alleviate them so that you can use MapTiler Engine to its full potential.
Useful links
Additional Performance Tips
MapTiler Engine Technical Specification