This document shows how to select and prepare raster data for use in a data visualization created with the MapTiler SDK and weather module. Data from different time slices are prepared to create an animation in the browser. The document highlights the different steps required to get the best results when visualizing the data.
Introduction
Data visualization is a key use of maps, and one that is growing in popularity with the help of specialist software. While it is easy to produce simple spatial data visualizations, such as basic choropleths, with JavaScript mapping libraries, controlling the finer details or presenting data differently for people with different needs can be difficult.
I wanted to create an interactive data visualization in the form of a map showing how the global population has changed over time. To reveal the details, the user needs to be able to zoom and pan the map and move backward and forward in time. In this blog post, I will take you through the processes and thinking behind the decisions made while building it with the MapTiler SDK Weather module.
This tutorial documents how the data was acquired and the processing and design decisions made to get the data ready for visualizing with the MapTiler SDK.
Along with this tutorial, there is a news article about advanced data visualization and two more tutorials on how to process the data in MapTiler Engine and how to code the demo using the MapTiler SDK:
- Visualizing population density on JavaScript Maps
- Global Population Density Data Processing
- Visualize and animate population data in a browser
Data Preparation
Visualizing global population density over time requires aligning the datasets from different time points so that they share the same spatial extent and format.
Proper data preparation is key to any good data visualization. Sketches and planning will help you better understand what is possible and give you a better idea of the attributes and IDs you may need to display the data correctly or join it to other datasets.
Global Population Data
For this project, I used the Gridded Population of the World dataset from NASA’s Socioeconomic Data and Applications Center (SEDAC), namely the Population Density v4.11 dataset. I opted for the GeoTIFF format at the highest resolution (30 arc-seconds, which is approximately 1 km at the equator) and downloaded the data for 2000, 2005, 2010, 2015, and 2020.
Note: Downloading the data from SEDAC is free but requires you to be logged in.
More information about this data can be found in the very comprehensive sidecar document from SEDAC (p. 14 "2. Population Density, v4.11 (2000, 2005, 2010, 2015, 2020)").
Data Processing
I am using MapTiler’s Weather library for this data visualization, as it is designed to display raster maps that change over time. We want the maps to display on the web, so we’ll use the Web Mercator projection, which all JavaScript mapping libraries can handle. We’ll also want the tiles to be in PNG format so that they are compact enough to load quickly and keep the map interactive. The maps should zoom in to at least level 7 to show all the details.
First, let's look at the output size. Here, I made decisions to ensure the output is manageable and has enough detail.
The MapTiler Weather module uses square Web Mercator tiles. The total size of the data depends on the maximum zoom level and the individual tile size (512 px per side in the table below). Let's see how far we can go:
Zoom level | No. of tiles on each axis | Total size on each axis (px) | No. of tiles |
---|---|---|---|
0 | 1 | 512 | 1 |
1 | 2 | 1024 | 4 |
2 | 4 | 2048 | 16 |
3 | 8 | 4096 | 64 |
4 | 16 | 8192 | 256 |
5 | 32 | 16 384 | 1024 |
6 | 64 | 32 768 | 4096 |
7 | 128 | 65 536 | 16 384 |
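The rows of this table follow directly from the tile pyramid rule: each zoom level doubles the number of tiles on each axis. Here is a minimal Python sketch that reproduces them, assuming 512 px tiles as implied by the zoom 0 row (the function name `pyramid_row` is only illustrative):

```python
TILE_SIZE = 512  # pixels per tile side, as implied by the zoom 0 row of the table


def pyramid_row(zoom: int, tile_size: int = TILE_SIZE) -> tuple[int, int, int]:
    """Return (tiles per axis, pixels per axis, tiles at this zoom level)."""
    tiles_per_axis = 2 ** zoom
    return tiles_per_axis, tiles_per_axis * tile_size, tiles_per_axis ** 2


for z in range(8):
    per_axis, pixels, tiles = pyramid_row(z)
    print(f"z{z}: {per_axis} tiles per axis, {pixels} px per axis, {tiles} tiles")
```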
The original data is 43 200 x 21 600 px and could be reprojected to a 43 200 x 43 200 px image in Web Mercator without losing detail along the longitude axis. Since the pixel width of a tiled dataset is always the tile size multiplied by a power of 2, we need to choose wisely which maximum zoom level (z) to generate tiles for. In this situation, we have two possibilities:
- I downsample from 43 200 to 32 768, targeting a max zoom level of 6
- I upsample from 43 200 to 65 536, targeting a max zoom level of 7
To avoid losing precision, we'll upsample and generate tiles up to zoom level 7. (spoiler: each yearly tileset up to zoom level 7 will be ~100MB, for a total of 21 845 tiles!)
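The choice between zoom 6 and zoom 7, and the total tile count, can be checked with a few lines of Python. This is a rough sketch assuming 512 px tiles; the helper names are hypothetical and not part of any MapTiler tool:

```python
import math


def candidate_max_zooms(source_width_px: int, tile_size: int = 512) -> tuple[int, int]:
    """Return the zoom levels just below and just above the source resolution."""
    exact = math.log2(source_width_px / tile_size)
    return math.floor(exact), math.ceil(exact)


def total_tiles(max_zoom: int) -> int:
    """Count every tile from zoom 0 up to and including max_zoom."""
    return sum(4 ** z for z in range(max_zoom + 1))


down, up = candidate_max_zooms(43_200)
print(down, up)         # 6 7
print(total_tiles(up))  # 21845
```

The source width of 43 200 px sits between 32 768 (zoom 6) and 65 536 (zoom 7), which is why neither option matches the original resolution exactly.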
Next, we need to look at the scaling of the data. I’ve chosen to use PNG tiles, though you could also use WebP or JPEG. With 8 bits per channel, PNG pixel values are limited to integers from 0 to 255. Since the population density greatly exceeds 255, we must scale the density values. We can find the maximum by reading the highest pixel value in each GeoTIFF file using gdalinfo or any image analysis software. Be aware that the reported maximum will include any erroneous values present in the data.
Note: If your data contains an erroneous outlier, it could lead to a mistake at this point. Cross-check any maximum value with trusted research on the topic of your data! The data used in this example contained a peak value of over 800 000 people per km². Research on the topic reveals a sensible value to be nearer 80 000.
These are peak values found at the heart of the densest urban areas such as Macau, Paris, or Manhattan. When it comes to linear scaling down to a smaller integer range, there is no “one size fits all” strategy. We can include the maximum densities, but this will create larger "bins": each value from the range [0-255] will represent more people, hence losing granularity.
Example (steps are truncated to whole numbers):
- scaling 0-20 000 down to 0-255 will result in a precision step of 78 (20 000 / 255 ≈ 78.4)
- scaling 0-40 000 down to 0-255 will result in a precision step of 156 (40 000 / 255 ≈ 156.9)
- scaling 0-80 000 down to 0-255 will result in a precision step of 313 (80 000 / 255 ≈ 313.7)
In terms of data visualization, capturing the peak values is great, as they often serve as a reference point (at least visually) against which to compare the rest of the data. However, in less populated places it is just as important to capture fluctuations among small values. As we see above, if we choose a maximum value of 40 000, we can't distinguish between places whose density falls anywhere between 0 and 156 people per square kilometer, and this granularity is too coarse for this project. It especially matters for visualizing changes in rural areas.
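To make the loss of granularity concrete, here is a small Python sketch of linear scaling to the 0-255 range. The function names are illustrative, and rounding to the nearest integer is an assumption, since the exact quantization depends on the tool that writes the tiles:

```python
def linear_step(max_density: float, levels: int = 255) -> float:
    """Density covered by one integer step of the encoded value."""
    return max_density / levels


def encode_linear(density: float, max_density: float) -> int:
    """Scale a density to 0-255, clamping values outside [0, max_density]."""
    clamped = min(max(density, 0.0), max_density)
    return round(clamped / max_density * 255)  # rounding is an assumption


for cap in (20_000, 40_000, 80_000):
    print(f"max {cap}: step of {linear_step(cap):.1f} people per km²")

# With a 40 000 cap, two quite different rural densities collapse to the same value.
print(encode_linear(20, 40_000), encode_linear(60, 40_000))  # 0 0
```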
But scaling does not have to be linear! For capturing peak values and fine variations on the lower end, it’s better to encode the square root (sqrt) of the values. Let's see how the two compare once decoded:
Tile value (uint8) | Population density (linear) | Population density (sqrt) |
---|---|---|
0 | 0 | 0 |
1 | 156 | 1 |
2 | 313 | 4 |
3 | 470 | 9 |
4 | 627 | 16 |
... | ... | ... |
120 | 18 823 | 14 400 |
121 | 18 980 | 14 641 |
122 | 19 137 | 14 884 |
... | ... | ... |
251 | 39 372 | 63 001 |
252 | 39 529 | 63 504 |
253 | 39 686 | 64 009 |
254 | 39 843 | 64 516 |
255 | 40 000 | 65 025 |
While the linear encoding applies the same step on the whole range, the sqrt method applies a much finer step on small values and a larger one on the upper end. As a bonus, sqrt can also encode larger peak values so the cap is now 65 025!
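The two decodings in the table can be reproduced with a short Python sketch. The function names are hypothetical, and rounding the square root to the nearest integer when encoding is an assumption, as the exact quantization step is not spelled out here:

```python
import math


def encode_sqrt(density: float) -> int:
    """Store the rounded square root, clamped to the uint8 range 0-255."""
    return min(255, round(math.sqrt(max(density, 0.0))))


def decode_sqrt(value: int) -> int:
    """Recover the density by squaring the stored tile value."""
    return value * value


def decode_linear(value: int, max_density: float = 40_000) -> float:
    """Recover the density from a linearly scaled tile value."""
    return value * max_density / 255


for v in (1, 2, 120, 255):
    print(v, int(decode_linear(v)), decode_sqrt(v))
```

Squaring the tile value is all the client needs to do to get back to people per square kilometer, which keeps decoding cheap in the browser.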
Here's how they compare in a more graphical form (green is linear, red is sqrt, black is 255, the maximum value possible on the PNG). I have swapped the axes to make it easier to decode the values.
Having a fine granularity on the lower end also makes our visualization more interesting in less crowded areas. Here is the East Coast of the US seen with the TURBO color ramp capped at 40 000 (year 2000):
[Side-by-side images: Population density (linear, max: 40000) and Population density (sqrt, max: 65025)]
We can already spot patterns at a semi-global scale thanks to the sqrt encoding and its smaller steps at low density. Have a look at the Appalachian terrain and its population density:
[Side-by-side images: Appalachian Terrain and Appalachian Population density (sqrt)]
To compute the sqrt form from the original GeoTiff provided by SEDAC, we can use the following GDAL command:
gdal_calc.py -A gpw_v4_population_density_rev11_2000_30_sec.tif --outfile=density_2000_sqrt.tif --calc="numpy.where(A<0, 0, numpy.sqrt(A))" --hideNoData
In the above, we also replace all the no-data values (-3.402823e+38) with 0.
As a result, we get a GeoTIFF that is still float32 and still in the original projection, but with scaled values and no-data replaced by zeros (which makes sense in our case because no-data was placed only where population density is zero).
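For readers less familiar with NumPy, the `--calc` expression can be read as a per-pixel rule. The following pure-Python sketch mirrors it on a handful of made-up sample values (the `row` list is illustrative, not real data):

```python
import math

NODATA = -3.402823e+38  # no-data value quoted in the text above


def sqrt_or_zero(value: float) -> float:
    """Per-pixel equivalent of numpy.where(A < 0, 0, numpy.sqrt(A))."""
    return 0.0 if value < 0 else math.sqrt(value)


row = [NODATA, 0.0, 4.0, 156.25, 40_000.0]  # hypothetical pixel values
print([sqrt_or_zero(v) for v in row])  # [0.0, 0.0, 2.0, 12.5, 200.0]
```

Because every valid density is zero or positive, testing for negative values is enough to catch the no-data pixels.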
Global Population data conversion
The final step in data processing is to convert from the data’s original projection, plate carrée, to Web Mercator. The most convenient method to tile a GeoTIFF, regardless of its source projection, is to use MapTiler Engine.
Once you open MapTiler Engine, you need to follow a few steps to convert a GeoTIFF, export it as MBTiles, and host it on MapTiler Cloud. The steps are detailed in the Global Population Density Data Processing with MapTiler Engine tutorial.
At the end of the process, the new raster tilesets are available under the My Tiles section in your MapTiler Cloud space, where you can get some info, including their tileset IDs (this will be useful later!)
Building an interactive JavaScript Web Map Visualization
Now that our data is ready for visualization, we need to use technology that can handle zooming, panning, and timelapse animation. At MapTiler, we have developed the Weather Library module for our web mapping SDK. This JavaScript library and the data layers we provide with it let you show many different animated layer types (temperature, wind with particles, cloud coverage, etc.) in a super easy way.
This library could also be used for non-weather raster data visualization, animated and interpolated over time; this is exactly what I did with the population density! The complete source code for the interface seen in the demo at the top of this article can be found in the Visualize and animate the evolution of population data tutorial.