Appendix A – Data preparation

library(tmap)
library(sf)
worldvector = read_sf("data/worldvector.gpkg")

A.1 Data simplification

Geometries in spatial vector data consist of sets of coordinates (Section 2.2.1). Spatial vector objects grow larger as they include more features and finer detail, which also increases the time needed to render a map. Figure A.1 (a) shows a map of countries from the worldvector object.

tm_shape(worldvector) +
  tm_polygons()
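To get a rough sense of how detailed an object is before plotting it, we can count its vertices and check its size in memory. The following is a minimal sketch, not part of the original workflow: it assumes the nc dataset shipped with sf as a stand-in for worldvector, and count_vertices is a hypothetical helper name. Closed polygon rings repeat their first vertex, so the count is slightly higher than the number of unique points.

```r
library(sf)

# Stand-in data shipped with sf (an assumption; replace with worldvector)
nc = read_sf(system.file("shape/nc.shp", package = "sf"))

# Hypothetical helper: total number of vertices in polygon data
count_vertices = function(x) {
  geom = st_cast(st_geometry(x), "MULTIPOLYGON")
  nrow(st_coordinates(geom))
}

n_vertices = count_vertices(nc)
size_in_memory = format(object.size(nc), units = "Kb")

n_vertices
size_in_memory
```

Both numbers can be compared before and after simplification to judge how much detail was removed.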

This level of detail suits some maps, but too much detail can make a map harder to read. To create a simplified (smoother) version of vector data, we can use the ms_simplify function of the rmapshaper package. Its keep argument expects a numeric value from 0 to 1 – the proportion of vertices in the data to retain. In the example below, we set keep to 0.05, which keeps 5% of vertices (Figure A.1 (b)).

library(rmapshaper)
worldvector_s1 = ms_simplify(worldvector, keep = 0.05)
tm_shape(worldvector_s1) +
  tm_polygons()
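The effect of keep can be checked by comparing vertex counts before and after simplification. This is a sketch under the same assumptions as before: the nc dataset from sf stands in for worldvector, and count_vertices is a hypothetical helper. The observed share may differ somewhat from the requested proportion, for example because ring-closing points are counted here.

```r
library(sf)
library(rmapshaper)

# Stand-in data shipped with sf (an assumption; replace with worldvector)
nc = read_sf(system.file("shape/nc.shp", package = "sf"))

# Hypothetical helper: total number of vertices in polygon data
count_vertices = function(x) {
  geom = st_cast(st_geometry(x), "MULTIPOLYGON")
  nrow(st_coordinates(geom))
}

nc_s = ms_simplify(nc, keep = 0.05)

# Share of vertices that survived the simplification
share_kept = count_vertices(nc_s) / count_vertices(nc)
round(share_kept, 3)
```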

The process of simplification can also be more controlled. By default, the underlying algorithm (called the Visvalingam method, learn more at https://bost.ocks.org/mike/simplify/) removes small features, such as islands in our case. This could have far-reaching consequences: in the process of simplification, we could remove some countries! To prevent the deletion of small features, we also need to set keep_shapes to TRUE. However, when one country consists of many small polygons, only one of them is guaranteed to be retained. For example, look at New Zealand, which is now only represented by Te Waipounamu (the South Island). To keep all of the spatial geometries (even the smallest of islands), we should also specify explode to TRUE.

worldvector_s2 = ms_simplify(worldvector, keep = 0.05,
                             keep_shapes = TRUE, explode = TRUE)
tm_shape(worldvector_s2) +
  tm_polygons()
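One way to see what keep_shapes and explode do is to compare the number of rows in each result. The sketch below again assumes the nc dataset from sf as a stand-in for worldvector. Because explode converts multipart polygons to single parts, we would expect the last result to have at least as many rows as the input, possibly more.

```r
library(sf)
library(rmapshaper)

# Stand-in data shipped with sf (an assumption; replace with worldvector)
nc = read_sf(system.file("shape/nc.shp", package = "sf"))

nc_default = ms_simplify(nc, keep = 0.05)
nc_shapes = ms_simplify(nc, keep = 0.05, keep_shapes = TRUE)
nc_exploded = ms_simplify(nc, keep = 0.05,
                          keep_shapes = TRUE, explode = TRUE)

# Number of rows (features) in the input and in each simplified version
c(original = nrow(nc), default = nrow(nc_default),
  keep_shapes = nrow(nc_shapes), exploded = nrow(nc_exploded))
```

If one row per original feature is needed after exploding, the parts have to be grouped again by an identifying attribute.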

Figure A.1 (c) contains a simplified map, where each spatial geometry of the original map still exists, but in a less detailed form.
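To make the idea behind the Visvalingam method more concrete, the sketch below implements a plain, unweighted version for a single open line in base R. It is an illustration only and not the code that rmapshaper runs: according to the rmapshaper documentation, the default is a weighted variant of the algorithm, and the function names triangle_area and visvalingam_sketch are hypothetical. At each step, the interior vertex whose triangle with its two neighbors has the smallest area is removed.

```r
# Area of the triangle formed by three points, each given as c(x, y)
triangle_area = function(p1, p2, p3) {
  abs((p2[[1]] - p1[[1]]) * (p3[[2]] - p1[[2]]) -
      (p3[[1]] - p1[[1]]) * (p2[[2]] - p1[[2]])) / 2
}

# Drop the least important interior vertex until n_keep vertices remain;
# xy is a two-column matrix of coordinates, endpoints are always kept
visvalingam_sketch = function(xy, n_keep) {
  stopifnot(is.matrix(xy), ncol(xy) == 2, n_keep >= 2)
  while (nrow(xy) > n_keep) {
    interior = 2:(nrow(xy) - 1)
    areas = vapply(interior, function(i) {
      triangle_area(xy[i - 1, ], xy[i, ], xy[i + 1, ])
    }, numeric(1))
    xy = xy[-interior[which.min(areas)], , drop = FALSE]
  }
  xy
}

line = cbind(x = c(0, 1, 2, 3, 4), y = c(0, 0.1, 2, 0.2, 0))
simplified = visvalingam_sketch(line, n_keep = 3)
simplified
```

In this small example, the two vertices close to the baseline are removed first, while the peak at (2, 2) and both endpoints are retained.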

Figure A.1: A map of the world’s countries based on: (a) original data, (b) simplified data with 5% of vertices kept, (c) simplified data with 5% of vertices, all features, and all polygons kept.