Loading Geospatial Data into SingleStore


SingleStore can load geographic data (points, paths, and polygons) that is represented in Well-Known Text (WKT) format. This topic explains how to download map data that is available in the public domain, convert the map data into WKT format, create a SingleStore table to store the map data, and use the LOAD DATA statement to load the converted map data into the table.

The following example loads the geographic boundaries of different countries into SingleStore.

Info

As an alternative to reading the steps in the example below, you can also watch a video which covers the same steps.

  1. Navigate to the Natural Earth website and download the dataset for country boundaries. The files are downloaded in the shapefile (SHP) format, which is not supported by SingleStore. Therefore, you need to convert the files to the WKT format, as shown in the steps that follow.

  2. Navigate to the MyGeodata Converter tool and convert the SHP file to a WKT file. The tool generates a CSV file in the WKT format. The CSV file contains all the polygons, along with other data about each country. Note: SingleStore does not support MULTIPOLYGON. Therefore, if the CSV file contains a MULTIPOLYGON, split it into individual POLYGONs, as shown in the steps that follow.

  3. You can visualize a MULTIPOLYGON as separate POLYGONs by using the Wicket library website. After navigating to the website, copy the MULTIPOLYGON value from the CSV file and paste it into the Wicket box. Click Map It. In this example, the map displays the country Fiji as three polygons.

  4. Navigate to the CSV file and separate the row containing MULTIPOLYGON into three individual rows, with each row containing a POLYGON. Copy the other column data in the original MULTIPOLYGON row to each of the three POLYGON rows.
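The manual split in step 4 can also be scripted. The following Python sketch assumes that the WKT value is in the first CSV column, which is where the LOAD DATA statement later in this topic reads the boundary from. The function names and any file paths passed to them are hypothetical. It turns each MULTIPOLYGON row into one row per POLYGON and copies the remaining columns unchanged.

```python
import csv


def split_multipolygon(wkt):
    """Return the POLYGON strings contained in a MULTIPOLYGON WKT value.

    Any other value is returned unchanged as a single-item list.
    """
    text = wkt.strip()
    if not text.upper().startswith("MULTIPOLYGON"):
        return [text]
    body = text[len("MULTIPOLYGON"):].strip()
    polygons = []
    depth = 0
    start = None
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
            if depth == 2:  # opening parenthesis of one polygon
                start = i
        elif ch == ")":
            if depth == 2 and start is not None:
                # Closing parenthesis of that polygon, holes included.
                polygons.append("POLYGON" + body[start:i + 1])
                start = None
            depth -= 1
    # Leave values such as "MULTIPOLYGON EMPTY" untouched.
    return polygons or [text]


def expand_rows(rows, wkt_index=0):
    """Yield one output row per POLYGON, repeating the other columns."""
    for row in rows:
        for polygon in split_multipolygon(row[wkt_index]):
            new_row = list(row)
            new_row[wkt_index] = polygon
            yield new_row


def convert_file(src_path, dst_path, wkt_index=0):
    """Rewrite a CSV file so that it contains no MULTIPOLYGON rows."""
    with open(src_path, newline="", encoding="utf-8") as src, \
            open(dst_path, "w", newline="", encoding="utf-8") as dst:
        # The header row passes through unchanged because it is not WKT.
        writer = csv.writer(dst, lineterminator="\n")
        writer.writerows(expand_rows(csv.reader(src), wkt_index))
```

The writer wraps any field that contains a comma in double quotes and ends each line with a newline character, which matches the OPTIONALLY ENCLOSED BY '"' and LINES TERMINATED BY '\n' clauses used when loading the file.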

The CSV file now contains all the data in the WKT format, ready to load into SingleStore.

  5. Create a table Countries:
CREATE TABLE Countries (
  boundary GEOGRAPHY, name_short VARCHAR(3), name VARCHAR(50),
  name_long VARCHAR(50), abbrev VARCHAR(10), postal VARCHAR(4),
  iso_a2 VARCHAR(2), iso_a3 VARCHAR(3), name_formal VARCHAR(100),
  SHARD KEY(name)
);

In this table, all polygons will be loaded into the boundary column, which is of the data type GEOGRAPHY.
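Because the boundary column should receive only geometry types that SingleStore can store, it may be worth checking the CSV file before loading it. The following Python sketch counts the leading WKT keyword of each row and reports anything other than POINT, LINESTRING, or POLYGON. It assumes that the WKT value is in the first column and that the file has a header row. The function names and the supported-type set are illustrative assumptions based on the geometry types named in this topic.

```python
import csv
from collections import Counter

# Assumed set of WKT types that a GEOGRAPHY column accepts
# (points, paths, and polygons, as described in this topic).
SUPPORTED = {"POINT", "LINESTRING", "POLYGON"}


def geometry_type(wkt):
    """Return the upper-cased WKT keyword that precedes the first '('."""
    return wkt.split("(", 1)[0].strip().upper()


def count_geometry_types(rows, wkt_index=0):
    """Count how many rows carry each WKT geometry type."""
    return Counter(geometry_type(row[wkt_index]) for row in rows if row)


def unsupported_types(counts):
    """Keep only the geometry types that are outside SUPPORTED."""
    return {kind: n for kind, n in counts.items() if kind not in SUPPORTED}


def check_file(path, wkt_index=0):
    """Report unsupported geometry types found in a CSV file."""
    with open(path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        next(reader, None)  # skip the header row
        return unsupported_types(count_geometry_types(reader, wkt_index))
```

An empty result from check_file suggests that every row uses one of the assumed supported types. A non-empty result, such as a remaining MULTIPOLYGON, indicates rows that still need to be split.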

  6. Load the data from the CSV file to the Countries table:
LOAD DATA INFILE '/data/natural_earth_countries_110m-1.csv'
INTO TABLE Countries (boundary, @, @, @, @, @, @, @, @, @, @, name_short, @, name, name_long, abbrev, postal, name_formal, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, iso_a2, iso_a3, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @, @)
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES;
Info

In the LOAD DATA statement, the @ symbol is used to skip unwanted columns. The FIELDS TERMINATED BY ',' and OPTIONALLY ENCLOSED BY '"' clauses define the field delimiter and the quoting character, which ensures that the commas inside the quoted polygon data are not treated as field separators.
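Typing a long list of @ placeholders by hand is error-prone, so the column list can instead be generated from the CSV header. The following Python sketch is one possible approach, not part of SingleStore. The function name and the header names in the usage note are hypothetical, so replace them with the header of your own CSV file.

```python
def build_column_list(header, wanted):
    """Build the parenthesized column list for a LOAD DATA statement.

    header: CSV column names, in file order.
    wanted: mapping from a CSV column name to the table column it fills.
    Every CSV column that is not in `wanted` becomes '@' and is skipped.
    """
    missing = set(wanted) - set(header)
    if missing:
        raise ValueError("CSV header lacks: " + ", ".join(sorted(missing)))
    return "(" + ", ".join(wanted.get(name, "@") for name in header) + ")"
```

For example, with a hypothetical header of WKT, scalerank, NAME, ISO_A2 and a mapping of WKT to boundary, NAME to name, and ISO_A2 to iso_a2, the function returns (boundary, @, name, iso_a2).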

  7. To see the results of loading the data, select the data from the Countries table:
SELECT * FROM Countries ORDER BY name DESC;

You can now use this data for all Geospatial Functions.