Part 2: Creating a SingleStoreDB Database and GCS Pipeline
Now that you have a GCS bucket that contains an object (file), you can use SingleStoreDB to create a new pipeline and ingest the messages.
Create a new database and a table that adheres to the schema contained in the books.
CREATE DATABASE books;
CREATE TABLE classic_books(title VARCHAR(255),author VARCHAR(255),date VARCHAR(255));
These statements create a new database named books
and a new table named classic_
, which has three columns: title
, author
, and date
.
Now that the destination database and table have been created, you can create a GCS pipeline.
-
The name of the bucket, such as: my-bucket-name
-
Your Google account’s access HMAC keys, such as: Access Key ID:
your_
Secret Access Key:access_ key_ id your_
secret_ access_ key
Using these identifiers and keys, execute the following statement, replacing the placeholder values with your own:
CREATE PIPELINE libraryAS LOAD DATA GCS 'my-bucket-name'CREDENTIALS '{"access_id": "your_access_key_id", "secret_key": "your_secret_access_key"}'INTO TABLE `classic_books`FIELDS TERMINATED BY ',';
You can see what files the pipeline wants to load by running the following:
SELECT * FROM information_schema.PIPELINES_FILES;
If everything is properly configured, you should see one row in the Unloaded
state, corresponding to books.
.CREATE PIPELINE
statement creates a new pipeline named library
, but the pipeline has not yet been started, and no data has been loaded.
START PIPELINE library FOREGROUND;
When this command returns success, all files from your bucket will be loaded.information_
again, you should see all files in the Loaded
state.classic_
table to make sure the data has actually loaded.
SELECT * FROM classic_books;
+------------------------+-----------------+-------+
| title | author | date |
+------------------------+-----------------+-------+
| The Catcher in the Rye | J.D. Salinger | 1945 |
| Pride and Prejudice | Jane Austen | 1813 |
| Of Mice and Men | John Steinbeck | 1937 |
| Frankenstein | Mary Shelley | 1818 |
+------------------------+-----------------+-------+
You can also have SingleStoreDB run your pipeline in the background.
DELETE FROM classic_books;ALTER PIPELINE library SET OFFSETS EARLIEST;
The first command deletes all rows from the target table.books.
so you can load it again.
To start a pipeline in the background, run
START PIPELINE library;
This statement starts the pipeline.SHOW PIPELINES
.
SHOW PIPELINES;
+----------------------+---------+
| Pipelines_in_books | State |
+----------------------+---------+
| library | Running |
+----------------------+---------+
At this point, the pipeline is running and the contents of the books.classic_
table.
Note
Foreground pipelines and background pipelines have different intended uses and behave differently.
Last modified: September 8, 2023