Load and Analyze AdTech Data into SingleStore
On this page
In this tutorial, we will create a database to ingest millions of simulated Ad Campaign events from a Kafka workspace.
Create the Database and Tables
Create a database called adtech with two tables: events and campaigns.
CREATE DATABASE IF NOT EXISTS adtech;USE adtech;CREATE TABLE events (user_id int,event_name varchar(128),advertiser varchar(128),campaign int(11),gender varchar(128),income varchar(128),page_url varchar(128),region varchar(128),country varchar(128),SORT KEY adtmidx (user_id, event_name, advertiser, campaign),SHARD KEY user_id (user_id));CREATE REFERENCE TABLE campaigns (campaign_id smallint(6) NOT NULL DEFAULT '0',campaign_name varchar(255) CHARACTER SET utf8 COLLATE utf8_general_ci DEFAULT NULL,PRIMARY KEY (campaign_id));
The events table is a columnstore table containing information about the advertiser, campaign and demographic information about the user.
The campaigns table is a small reference rowstore table.
After creating the tables, we will insert the names of the 14 ad campaigns that are currently running into the campaigns table.
INSERT INTO `campaigns` VALUES (1,'demand great'),(2,'blackout'),(3,'flame broiled'),(4,'take it from a fish'),(5,'thank you'),(6,'designed by you'),(7,'virtual launch'),(8,'ultra light'),(9,'warmth'),(10,'run healthy'),(11,'virtual city'),(12,'online lifestyle'),(13,'dream burger'),(14,'super bowl tweet');
Create a Pipeline to Ingest Event Data
This query will create a pipeline to ingest event data from a Kafka workspace into the events table.BATCH_
has been set to 2500 milliseconds to slow down and simulate event data.
CREATE PIPELINE eventsAS LOAD DATA KAFKA 'public-kafka.memcompute.com:9092/ad_events'BATCH_INTERVAL 2500IGNORE INTO TABLE eventsFIELDS TERMINATED BY '\t' ENCLOSED BY '' ESCAPED BY '\\'LINES TERMINATED BY '\n' STARTING BY ''(user_id,event_name,advertiser,campaign,gender,income,page_url,region,country);
Configure the pipeline to start reading from the earliest (or oldest) available offset in the data source.
ALTER PIPELINE events SET OFFSETS EARLIEST;START PIPELINE events;
Download and Install a Dashboard Tool
SingleStore Helios works with all major dashboard tools because it is MySQL wire compatible.
Download and install Metabase on your preferred platform: https://www.
Set Up the Metabase Application
Launch the Metabase application that you installed and add a database with the following information by selecting Settings > Admin > Setup > Add a database:
-
Database Type: MySQL
-
Name: adtech
-
Host: From your Cloud Portal page, copy your endpoint name and paste it in here
-
Port: 3306
-
Database name: adtech
-
Database User: Enter your workspace user name (e.
g. , "admin") -
Password: Enter the password you used for the workspace
If the database connection is correctly established, you will see the message: "Your database has been added!"
Exit via Settings > Exit Admin.
Create Dashboard Questions in Metabase
To create a question in Metabase, click Ask a question > Native Query.
Copy the following SQL queries and paste each of them into Metabase’s SQL editor and run the query.
Total Number of Events
SELECT count(*) FROM events;
Events by Region
SELECTevents.country AS `events.country`,count(events.country) AS 'events.countofevents'FROM adtech.events AS eventsgroup by 1;
Events by Top 5 Advertisers
SELECTevents.advertiser AS `events.advertiser`,COUNT(*) AS `events.count`FROM adtech.events AS eventsWHERE(events.advertiser LIKE '%Subway%' OR events.advertiser LIKE '%McDonals%' OR events.advertiser LIKE '%Starbucks%' OR events.advertiser LIKE '%Dollar General%' OR events.advertiser LIKE '%YUM! Brands%' OR events.advertiser LIKE '%Dunkin Brands Group%')GROUP BY 1ORDER BY 2 DESC;
Ad Visitors by Gender and Income
SELECT * FROM (SELECT *, DENSE_RANK() OVER (ORDER BY z___min_rank) as z___pivot_row_rank, RANK() OVER (PARTITION BY z__pivot_col_rank ORDER BY z___min_rank) as z__pivot_col_ordering, CASE WHEN z___min_rank = z___rank THEN 1 ELSE 0 END AS z__is_highest_ranked_cell FROM (SELECT *, MIN(z___rank) OVER (PARTITION BY `events.income`) as z___min_rank FROM (SELECT *, RANK() OVER (ORDER BY CASE WHEN z__pivot_col_rank=1 THEN (CASE WHEN `events.count` IS NOT NULL THEN 0 ELSE 1 END) ELSE 2 END, CASE WHEN z__pivot_col_rank=1 THEN `events.count` ELSE NULL END DESC, `events.count` DESC, z__pivot_col_rank, `events.income`) AS z___rank FROM (SELECT *, DENSE_RANK() OVER (ORDER BY CASE WHEN `events.gender` IS NULL THEN 1 ELSE 0 END, `events.gender`) AS z__pivot_col_rank FROM (SELECTevents.gender AS `events.gender`,events.income AS `events.income`,COUNT(*) AS `events.count`FROM adtech.events AS eventsWHERE(events.income <> 'unknown' OR events.income IS NULL)GROUP BY 1,2) ww) bb WHERE z__pivot_col_rank <= 16384) aa) xx) zzWHERE (z__pivot_col_rank <= 50 OR z__is_highest_ranked_cell = 1) AND(z___pivot_row_rank <= 500 OR z__pivot_col_ordering = 1) ORDER BY z___pivot_row_rank
Visualize the Data
When the results are returned, you can select the way you want to visualize your data by clicking on the Visualization button.
After you are satisfied with how your data is represented, click Save.
Create a Dashboard in Metabase
Create your dashboard by clicking on the + icon and selecting New dashboard.
Give your dashboard a name.
Next, add questions that you created in the previous step by clicking on the + icon or Add a question.
Last modified: October 28, 2024