Load Data from MySQL

Note

This is a Preview feature.

SingleStore does not recommend using this feature in a production environment.

You can replicate your MySQL databases in SingleStore using Change Data Capture (CDC).

Syntax

CREATE LINK <link_name> AS MYSQL
CONFIG '{"database.hostname": "<hostname>",
"database.exclude.list": "mysql,performance_schema",
"database.port": 3306 [, "database.ssl.mode":"<ssl_mode>"]}'
CREDENTIALS '{"database.password": "<password>", "database.user": "<user>"}';
CREATE TABLE [IF NOT EXISTS] <table_name> AS INFER PIPELINE AS LOAD DATA
LINK <link_name> "<source_db>.<source_table>" FORMAT AVRO;
CREATE TABLES [IF NOT EXISTS] AS INFER PIPELINE AS LOAD DATA
LINK <link_name> "*" FORMAT AVRO;

Remarks

  • The CREATE TABLE [IF NOT EXISTS] <table_name> AS INFER PIPELINE statement,

    • Connects to the MySQL instance using the specified LINK <link_name> clause.

    • Infers the schema of the table and creates a table (named <table_name>) using the inferred schema. When the IF NOT EXISTS clause is specified and a table with the specified name already exists, a new table is not created and the existing table is used instead.

    • Creates a pipeline (named <table_name>) and stored procedure (named <table_name>_apply_changes) that maps the AVRO data structure to the SingleStore data structure. The IF NOT EXISTS clause is ignored for pipelines and stored procedures. If a pipeline or stored procedure with the same name already exists, the CREATE TABLE ... AS INFER PIPELINE statement returns an error.

  • The CREATE TABLES [IF NOT EXISTS] AS INFER PIPELINE statement creates a table in SingleStore for each table in the source MySQL database using the same set of operations as the CREATE TABLE [IF NOT EXISTS] <table_name> AS INFER PIPELINE statement (specified above). Additionally, it discovers the available databases and tables filtered by database.exclude.list, database.include.list, table.exclude.list, and table.include.list.

  • All the source MySQL tables must have primary keys.

  • If the replication process fails with an out of memory error in Java, increase the heap size using the pipeline_cdc_java_heap_size engine variable.

  • The MySQL instance must allow incoming traffic from the SingleStore cluster.

  • To view pipeline errors, run the following SQL statement:

    SELECT * FROM information_schema.pipelines_errors;
  • Before restarting the INFER PIPELINE operation, delete all the related artifacts.

    DROP TABLE <target_table_name>;
    DROP PIPELINE <pipeline_name>;
    DROP PROCEDURE <procedure_name>;

Required Configurations to Replicate MySQL Tables

Configure MySQL

  1. Enable binary logging for MySQL replication by setting the log-bin option to mysql-bin in your MySQL configuration. Refer to Enabling the binlog for more information.

  2. Configure the following additional options in MySQL, as specified:

    binlog_format=ROW
    binlog_row_image=FULL

    You may need to restart MySQL in order to incorporate the configuration changes. Consult the MySQL documentation for more information.

  3. Create a new database named singlestore. This database is required for internal usage.

    CREATE DATABASE IF NOT EXISTS singlestore;

    The singlestore database is meant to store the pipeline metadata information to confirm the status of the pipeline and other operations. SingleStore does not allow replicating this database and the tables that are included in it.

  4. Grant the following privileges:

    GRANT CREATE, INSERT, DELETE, SELECT ON singlestore.* TO <user>;
    GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO <user>;
    GRANT SELECT ON *.* TO <user>;
  5. For authentication, the <user> must use the mysql_native_password plugin. Refer to Native Pluggable Authentication for more information.

Configure SingleStore

Configure the following engine variables in SingleStore:

SET GLOBAL experimental_enable_mysql_migration = true;

Replicate MySQL Tables

  1. Create a link to the MySQL endpoint using the CREATE LINK command.

    CREATE LINK <link_name> AS MYSQL
    CONFIG '{"database.hostname": "<hostname>", "database.exclude.list": "mysql,performance_schema", "database.port": 3306, "database.ssl.mode":"<ssl_mode>"}'
    CREDENTIALS '{"database.password": "<password>", "database.user": "<user>"}';
  2. Run the CREATE {TABLE | TABLES} AS INFER PIPELINE SQL statement to replicate the tables.

  3. Start the pipeline(s):

    1. To start all the pipelines, run the START ALL PIPELINES SQL statement.

    2. To start a specific pipeline, run the START PIPELINE <pipeline_name> SQL statement. By default, the pipeline has the same name as the target table.

    Note

    Once the pipeline(s) (CDC operation) has started, do not run ALTER TABLE statements.

Example

Perform the following tasks after configuring MySQL:

  1. Create a new SingleStore Kai-enabled workspace and connect to it.

  2. Create a link to the MySQL endpoint.

    CREATE LINK pLink AS MYSQL
    CONFIG '{"database.hostname": "svchost", "database.exclude.list": "mysql,performance_schema", "database.port": 3306, "database.ssl.mode":"required"}'
    CREDENTIALS '{"database.password": "s2user", "database.user": "admin"}';
  3. Create tables, pipelines, and stored procedures in SingleStore based on the inference from the source tables.

    CREATE TABLES AS INFER PIPELINE AS LOAD DATA
    LINK pLink "*" FORMAT AVRO;
  4. Start the pipelines and begin the replication process.

    START ALL PIPELINES;

Last modified: April 26, 2024

Was this article helpful?