SingleStore Managed Service

Building a New Application

Welcome to SingleStore Managed Service! We put together a guide to help you build a new application using SingleStore. Use this guide to follow a top-down approach to:

  • Preparing your data for SingleStore

  • Interacting with SingleStore

  • Designing your SingleStore tables to be optimized for your applications

  • Importing data from your originating database

  • Optimizing your queries for your applications

  • Connecting your application to SingleStore

Tutorial Outline

This tutorial is divided into the following sections:

Getting Started

  • Benefits of leveraging S2MS for your application

  • Why SingleStore is a developer’s first choice for applications

  • What kinds of apps are developers building?

Preparing Your Data for SingleStore

SingleStore Database Administration

  • SingleStore Portal

  • Drivers

  • MySQL Workbench

  • MySQL Command Line

Designing Your Tables

  • Data Types

  • Designing a schema for an application

    • Sharding for application workloads

    • Sort keys for application workloads

    • Reference Tables

Ingesting Data

  • Using SingleStore Pipelines

  • Using INSERT Statements

Testing Your Queries

  • Run queries

  • Visual Explain

  • Concurrency test your queries

Connecting to Your Application Development Tools

  • Code samples to connect to SingleStore

Getting Started

We are excited for you to get started building your first application with SingleStore Managed Service (S2MS). Hundreds of developers have built apps using S2MS to power their new startup idea, pet project, or even school projects! There are many reasons people come to us, so read on as we discuss the simple steps to getting speed.

Some terminology used in this guide:

  • App/Application: Web- or mobile-based, customer or internal facing application.

  • Object storage: Cloud-based data repository.

There are a few key steps to building the database for your first app, and here's how SingleStore can help:

  • Design your schemas - Make a few simple, straightforward selections on how your data is distributed and sorted to set yourself up for millisecond response

  • Load your data - Five lines of SQL to ingest thousands of records per second

  • Test your queries - Simple Portal interface to run queries, with in-depth visual profiling tool to make them fast

  • Connecting to your app - We offer many connectivity options, no matter where you're building your first app

These steps are applicable to all of the applications that have been built on SingleStore.

Here are a few ideas to help you brainstorm on some concepts to embed within your app:

  • Geospatial apps for real-time tracking and fencing using points, paths, and polygons

  • Analytical apps with complex aggregations and real-time updates

  • Mobile apps with rapid queries and high concurrency

Let's get started by discussing some of the basic things to consider when bringing your data over from one of these databases.

Sample Architecture

Here's example of how SingleStore is used by many to give immersive, responsive experiences to their customers. This can be done in many ways, and you'll see below how one user leveraged SingleStore Managed Service not just as a fast query layer for an app, but as the data hub to send information downstream to other apps.

16083b57a37f35.png

Users just getting started may already have data being generated by their application. If not, maybe you just have some synthetic data. Regardless of what kind of data you have or where it's coming from, there is sure to be a way to ingest it within SingleStore and test out queries.

Preparing Your Data for SingleStore

SingleStore has simple, powerful methods of bringing data in from object storage and streaming data sources. If your data is already in object storage like S3, Azure Blob or GCS, you will be set for the data import section. Also, if your data is in more of a real-time store you will be able to ingest data easily.

That said, we realize if you're just getting started you may not have bulk or streaming data sources set up. If you'd like to write directly from your application, that's fine too! After we discuss schema design, we'll walk you through connecting directly from your application and writing data.

Database Administration

Now that we have identified our data source, let's talk a bit about how we'll interact with SingleStore.

Recommended: SingleStore Portal and Studio

When you signed up in Portal, you got access to our SQL Editor and Visual Explain tools, as well as to SingleStore Studio via a link within your cluster details. Portal and Studio are the best places to interact with your SingleStore data, build new data pipelines, and test out queries. Click here to get a tour of Studio and make sure you explore the features in Portal, as well as the Tutorials available through the Help button.

If you do not plan to use SingleStore Studio, make sure that you have the appropriate client drivers prior to using other database administration tools.

MySQL Workbench

If you're coming from MariaDB or MySQL, you may already be comfortable with Workbench. You can download MySQL Workbench here.

Notice

When defining your connection, you will need to go to the Advanced tab and insert defaultAuth=mysql_native_password in the "Others:" field to ensure proper authentication.

MySQL Command Line

Within your SingleStore Portal, your cluster details will include a MySQL Command section. This gives you an easy way to use the command line to connect to SingleStore. You can find the download for the MySQL shell here.

Notice

When defining your connection, you will need to enter this at the MySQL command line: --defaultAuth=mysql_native_password to ensure proper authentication.

Designing Your Tables

At this point, you should be using SingleStore SQL Editor or Studio, or some other MySQL client to work with your cloud database. Before checking out how fast SingleStore can bring in your data, let's make sure your tables are designed optimally for your application workload.

By default within SingleStore Managed Service, database tables are created using our Universal Storage format (i.e., disk-based columnstore). There are a few important things to consider when designing tables to house your data.

Data Types

If you're familiar with relational databases, you may not need too much guidance on data types.

In addition to data types traditionally supported in relational databases, SingleStore also supports JSON and geospatial data types. Read about Data Types in SingleStore.

Shard Key

This key determines how data is distributed across the database cluster, and is critical to ensure that your data isn't skewed. Skewed data can lead to longer query times. Data contained within unique values of your shard key will reside in individual partitions of the database.

We offer free training on sharding if you'd like to learn more!

So how do I pick a shard key best for my application workload?

  • If you have a primary key, make sure the shard key is a subset of it (because cardinality matters!).

  • If your application queries include frequent joins or filters on a specific set of columns, make sure the shard key is a subset of those.

  • Concurrency is very important with application workloads, so make sure your shard key allows your queries to be single partition, as explained below.

In this example, we use user_id as our shard key, which works nicely given its high cardinality as a part of this dataset. All records with the same user_id will be maintained together, which will improve query response time.

CREATE TABLE clicks (
        click_id BIGINT AUTO_INCREMENT,
        user_id INT,
        page_id INT,
        ts TIMESTAMP,
        SHARD KEY (user_id),
        KEY (click_id, user_id USING CLUSTERED
        COLUMNSTORE)
);
Columnstore Key

In addition to identifying your shard key, it's important to tell SingleStore how you would like to sort your data within each data segment. This helps SingleStore enable segment elimination, which ensures a minimal amount of data needs to be read for each query. This also helps SingleStore presort data for your queries

So how do I pick a columnstore key best for my application workload?

  • If you have common filter columns, make sure those are in the columnstore key.

  • If you're inserting in order by some column, it's best to put that column first in the columnstore key.

  • Lower cardinality columns should be first in the columnstore key.

In this example, we use price as our sort key, so items are sorted in order of that column when queried.

CREATE TABLE products (
     ProductId INT,
     Color VARCHAR(10),
     Price INT,
     Qty INT,
     KEY (`Price`) USING CLUSTERED COLUMNSTORE,
     SHARD KEY (`ProductId`)
);
Reference Tables

If you have small, infrequently changing table(s) that are required for joins, consider making them Reference tables.

  • Reference tables are replicated to each leaf in the cluster ensuring data does not need to go across the network between partitions to join data.

  • Reference table commands need to be run from the endpoint listed in the SingleStore Portal.

Data Ingest

Now for the fun part, ingesting data! This is where things may look a bit different to you compared to other databases, because SingleStore has this unique ingest capability called Pipelines that supports high-frequency, parallel ingest of data from sources like S3, Azure Blob, GCS, Kafka, etc. Skip-list indexes and concurrent versioning technologies allow these writes to not lock tables, allowing reads to continue unimpacted during ingest.

SingleStore Pipelines

To use a Pipeline to import data into SingleStore, write a CREATE PIPELINE statement using our SQL Editor or a MySQL client.

A few things to consider:

  • Make sure that your security settings in your blob storage will allow for access from SingleStore. For example, AWS S3 security settings can be found here.

  • Make sure your buckets are not public, but you should be able to obtain an access and secret key using the AWS doc here.

  • You can use wildcard notation when identifying your files from within the bucket.

Here's an example of a CREATE PIPELINE statement:

CREATE PIPELINE library
AS LOAD DATA S3 'my-bucket-name'
CONFIG '{"region": "us-west-1", "suffixes": ["csv"]}'
CREDENTIALS '{"aws_access_key_id": "your_access_key_id", "aws_secret_access_key": "your_secret_access_key"}'
INTO TABLE `classic_books`
FIELDS TERMINATED BY ',';

GCP Documentation:

Load Data from Google Cloud Storage (GCS) using a Pipeline

Bucket Security

Azure Documentation:

Load Data from Azure Blobs using a Pipeline

Bucket Security

Insert Statements

Perhaps instead of bulk loading data, you'd like to write directly from your application. That's great! Many of our users do this with the MySQL JDBC driver. You can find examples on writing your insert statements here.

Testing Your Queries

Here are some helpful hints for testing your queries.

Running Queries

Hopefully at this point you have your data in SingleStore. You can check this by running some basic SELECT statements within the Studio SQL Editor. Next, you may want to try out some queries that you were running with your last database. Generally, you'll want to run queries twice to get a true understanding of the runtime as the first run must create and cache the query plan. See why here.

Visual Explain

One great feature of SingleStore Portal is our Visual Explain functionality. If you encounter a situation in which your query is taking longer than expected, highlight the desired query in the SQL Editor and click the Visual Explain icon to the left of the Run button. The icon resembles a tree. After clicking, you may then choose between EXPLAIN and PROFILE.

the visual explanation or profiling of your query.

Once you identify a bottleneck, you should be able to make changes either to your schema or to your query itself in order to improve speed.

You can manually (non-visually) run EXPLAIN or PROFILE from any client; see the links above for details on the commands.

Benchmarking

At SingleStore, we've developed an easy-to-use tool for benchmarking called dbbench. You can check that out here.

Once you've installed the packages to your host machine, you can walk through this tutorial. All you will have to do is change the host from 127.0.0.1 to the endpoint listed for your cluster in Portal (it should look something like: svc-xxx-dml.aws-virginia-1.db.memsql.com). The port will remain 3306, as listed. You can then move onto testing based on your application workload.

Connecting to Your Application Development Tools

We've created tutorials on how to connect to SingleStore using a variety of different frameworks, which you can find in the list below.

The tutorials below show how to leverage both SingleStore Managed Service and SingleStore DB (our self-managed product). You can skip to the part of each that details how to make the connection and then get started. Again, you'll be using your endpoint here provided in Portal, along with the credentials you used when spinning up your cluster.

JavaScript / Node

SQL: https://github.com/singlestore-labs/start-with-singlestore-node

Stored Procedures: https://github.com/singlestore-labs/start-with-singlestore-node-stored-procedure

C#

SQL: https://github.com/singlestore-labs/start-with-singlestore-csharp

Stored procedures: https://github.com/singlestore-labs/start-with-singlestore-csharp-stored-procedure

Java

SQL: https://github.com/singlestore-labs/start-with-singlestore-java

Stored procedures: https://github.com/singlestore-labs/start-with-singlestore-java-stored-procedure

Go

SQL: https://github.com/singlestore-labs/start-with-singlestore-go

Stored procedures: https://github.com/singlestore-labs/start-with-singlestore-go-stored-procedure

Ruby

SQL: https://github.com/singlestore-labs/start-with-singlestore-ruby

Stored procedures: https://github.com/singlestore-labs/start-with-singlestore-ruby-stored-procedure