Load Data and Run Queries into SingleStore
On this page
Welcome to SingleStore Helios! We put together this guide to walk you through using the Cloud Portal to create a database, load sample data quickly, and then run some queries against that data.
Specifically, this guide walks you through:
-
Creating a SingleStore Helios workspace.
-
Creating a database and a table.
-
Creating an S3 Pipeline and running that pipeline to load sample data into your database.
The data we are loading is approximately 20 million rows of sales data for a cosmetics store. -
Running queries against the data you loaded.
Create a SingleStore Helios Workspace
If you already have a workspace, feel free to skip to the next step and create the database.
Go to the Cloud Portal and follow these instructions to create a workspace.
Create a Database and Table
Use the SQL Editor to run the following commands.
Note
Free shared tier users cannot create new databases.
Create the database:
-- create a databaseCREATE DATABASE cosmeticshop;
Create the table:
-- create a tableUSE cosmeticshop;CREATE TABLE cosmeticshopfunnel(event_time TIMESTAMP,event_type CHAR(18),product_id CHAR(10),category_id CHAR(20),category_code TINYTEXT,brand TINYTEXT,price FLOAT,user_id TINYTEXT,user_session TINYTEXT);
Create and Run an S3 Pipeline
Again, run these commands in the SQL Editor.
Note
For this example, you must set data_
to a value of 7.
or lower.
First, create the pipeline:
-- create a pipeline to ingest the data in AWS S3CREATE or REPLACE PIPELINE cosmeticshoppipeAS LOAD DATA S3 's3://studiotutorials/eCommerce/*'CONFIG '{"region": "us-east-1"}'INTO TABLE `cosmeticshopfunnel`FIELDS TERMINATED BY ',' ENCLOSED BY '"';
Next, run the pipeline.
-- start the pipelineSTART PIPELINE cosmeticshoppipe;
You can check on the progress of the pipeline by checking to see how many rows are in the table.
-- see how many events have been ingestedSELECT COUNT(*) FROM cosmeticshopfunnel;
Once you rows are loaded, you can verify the data has been ingested correctly by retrieving some rows.
-- see the data that has been ingestedSELECT * FROM cosmeticshopfunnel LIMIT 100;
You should see 100 rows of cosmetics sales data.
Run Queries on the Data
The following queries show some possible ways to interact with the data.
-- create a holiday reference table to store all holiday datesCREATE REFERENCE TABLE holidays(holiday TINYTEXT,date_of_holiday DATE PRIMARY KEY);-- insert holiday datesINSERT INTO holidays VALUES("New Year's Day", "2020-1-1"),("Martin Luther King Jr. Day", "2020-02-20"),("Memorial Day", "2020-05-25"),("Independence Day", "2020-07-04"),("Labor Day", "2020-09-07"),("Veterans Day", "2019-11-11"),("Thanksgiving", "2019-11-28"),("Christmas Day", "2019-12-25");SELECT * FROM holidays;-- find out which holiday has the most activity, interesting results!SELECT holiday, count(holiday) FROM cosmeticshopfunnelJOIN (SELECT holiday, DATE_SUB(date_of_holiday, INTERVAL 3 DAY) AS beforedate, DATE_ADD(date_of_holiday, INTERVAL 3 DAY) AS afterdate FROM holidays)ON event_time > beforedate and event_time < afterdateGROUP BY holidayORDER BY count(holiday) desc;-- find out which is the top brand purchased during each of the holidaysSELECT holiday, brand, count(brand) FROM cosmeticshopfunnelJOIN (SELECT holiday, DATE_SUB(date_of_holiday, INTERVAL 3 DAY) AS beforedate, DATE_ADD(date_of_holiday, INTERVAL 3 DAY) AS afterdate FROM holidays)ON event_time > beforedate AND event_time < afterdateWHERE event_type = "purchase" AND brand != ""GROUP BY holidayORDER BY count(holiday) desc;-- find out if customers are sensitive to the average price of the brandsSELECT brand, AVG(price), COUNT(event_type) FROM cosmeticshopfunnelWHERE brand != ""GROUP BY brandORDER BY count(event_type) desc;-- find out which brands have been removed from cart the mostSELECT brand, COUNT(brand) COUNT FROM cosmeticshopfunnelWHERE event_type = "remove_from_cart" AND brand != ""GROUP BY brand ORDER BY COUNT desc;-- find out which brands have been purchased the mostSELECT brand, COUNT(brand) AS c FROM cosmeticshopfunnelWHERE event_type = "purchase"GROUP BY brandORDER BY c desc;-- find out which categories have been purchased the mostSELECT DISTINCT category_code, COUNT(category_code) AS c FROM cosmeticshopfunnelWHERE event_type = "purchase" GROUP BY category_code ORDER BY c desc;-- find out which brands have been purchased the mostSELECT brand, COUNT(brand) AS c FROM cosmeticshopfunnelWHERE event_type = "purchase" GROUP BY brand ORDER BY c desc;-- find out which product_id has been the most removed from cartSELECT product_id, COUNT(product_id) AS c FROM cosmeticshopfunnelWHERE event_type = "remove_from_cart" GROUP BY product_id ORDER BY c desc;
Last modified: October 28, 2024