SingleStore Helios Sizing Guidelines

Overview

This document is designed to help users with capacity planning to size a SingleStore Helios deployment appropriately.

Size a Workload

Workspace Sizing Considerations

Description and How to Measure

Data Size

It is important to understand how much data will be stored in SingleStore Helios.

The most critical element for sizing is to understand how much of the data is the “working set,” which equates to how much data is required to meet the query latency requirements/SLA of an application. For optimal performance, it is imperative that the working set of data fits in the persistent cache.

For example: A Fintech application displays stock performance over the last 30 days and stores the historical data for a full year for other types of analysis. There is a specific SLA for how quickly the dashboard must be visualized when working with data from the last 30 days, but the SLA for the analysis over the year’s worth of data is more flexible.

  • The market performance data for the last 30 days is considered the working set of data

  • The full year’s worth of data is the total data set

Even if the back-end database stores the previous year’s worth of market performance data (which is periodically queried but does not have an SLA), this would not impact the functionality of the application, and thus would not factor into sizing, as this data is stored in the object storage and not in the local cache storage.

Data size can be used as a baseline/lower bound for sizing. To calculate data size:

  1. Multiply the working set size times the compression ratio. SingleStore typically sees 75% compression on average but it depends on the data set.

  2. Take the total size of the raw data and multiply it by the compression ratio

  3. Multiply that result by two (2) to account for High Availability (HA)

  4. Add 25% to that result for other metadata (logs, snapshots, etc.)

This can be expressed formulaically as:

RawDataSizeinTBs * (1 - CompressionRatio)*2 + (RawDataSizeinTBs * (1 - CompressionRatio)*.25)

Questions

  1. How large is the working set of data?

  2. How much data does the application need to perform?

  3. In lieu of an application, how many distinct data sets are there?

  4. How many tables? What are the core tables? How large are they?

  5. What types of data are primarily being stored? Strings? INTs? Floats? JSON? etc.

  6. Is a single tenant or multi-tenant structure being maintained? If multi-tenant, how many tenants are there? What is the approximate size of each?

  7. Is the data set growing? If so, by how much? How frequently?

  8. For those SLAs that are not as stringent for latency and concurrency, an important factor in determining cluster size largely depends on data size.

Data Ingest

Data ingest is about how much, in both rows and bytes, and how fast data is written into the database. This is typically inserts, but could also be updates and/or deletes.

Questions

  1. What is the data size? The number of rows is not as helpful as the overall size of the data.

  2. What is the average ingest rate in bytes/second? What is the peak?

  3. How frequently is this volume of data ingested into the database? Does it stream in continuously, in small batches (every hour, etc.), or infrequently (once a day, etc.)

  4. How is the data currently ingested? What is the data source and format? Is it performed via an ELT, an ETL, or other transformation, like a stored procedure? The goal is to understand how much work will be performed during ingestion. For example, is data being inserted straight into a table or is there significant work performed in a stored procedure to process and transform the data?

  5. How is the ingest rate growing? By what percentage over what time?

Query Shapes

The “core queries” of an application are required to calculate the appropriate cluster size. Core queries are the queries most commonly run in the application and which require a specific SLA in order to meet the desired/required user experience.

Some core queries are run many times, possibly with different parameters (literals) but otherwise the same structure.

It is important to understand the shapes of the core queries that are running as they can impact the performance of the database differently.

A query shape (type) is the form of the query plan that determines how much work a query needs to do. These are the different query shapes. Visual Explain can be used to profile these queries.

  • Selective queries: Queries that return a small number of rows.

    SELECT * FROM table WHERE id = 5

  • Aggregation queries

    SELECT AVG(col1), SUM(col2) FROM table GROUP BY (col3)

  • Joins

    SELECT Count(*) FROM lineitem JOIN orders ON o_orderkey = l_orderkey;

It's also important to identify queries that can run on one partition vs. queries that must run on every partition. Select vie queries typically hit one or a small number of partitions. Aggregations often touch all the partitions. Those queries that run on one partition will tend to have lower total CPU.

For example: Filtering on id if id is a primary key (which is a shard key and will have a unique index) can enable  SELECT * FROM table WHERE id = 5 to run on one leaf node. If id is not a shard key, then the query must touch every partition, and thus every leaf node.

Graphical profile plans as well as JSON profile plans will reveal the indexes being used.

Consider CREATE TABLE t(id int, id2 int, primary key(id)); and query profile SELECT * FROM t WHERE id2 = 1;

The following in the JSON profile means it is touching all shards and scanning:

"index":"KEY __UNORDERED () USING CLUSTERED COLUMNSTORE",
"storage":"columnar",
"table_type":"sharded_columnstore",
"columnstore_in_memory_scan_type":"TableScan",
"columnstore_in_memory_scan_index":"KEY __UNORDERED () USING CLUSTERED COLUMNSTORE",

For profile SELECT * FROM t WHERE id = 1; means it is seeking one shard and seeking:

"index":"KEY __UNORDERED () USING CLUSTERED COLUMNSTORE",
"storage":"columnar",
"table_type":"sharded_columnstore",
"columnstore_in_memory_scan_type":"IndexSeek",
"columnstore_in_memory_scan_index":"UNIQUE KEY PRIMARY (id) USING HASH",

Questions

  1. What are the core queries?

  2. What are the query shapes?

  3. What is the total CPU of each core query? All core queries?

  4. What is the response time of each core query? All core queries?

Latency

Typically, customers will provide one or more latency targets for their core queries, such as a 5-second response time, or a range of response times for each core query.

Questions

  1. What are the latency expectations for each core query?

  2. Is there a latency expectation for all core queries? If so, what is it?

Concurrency

Concurrency refers to the number of queries hitting the data at a given moment, including both typical and peak traffic. Ideally, this concurrency can be measured per query shape. Review the concurrency of each of the core queries.

General numbers can be used to record the concurrency: 1, 5, 10, 20, 50, 100, 500, 1000 etc.

Questions

  1. When a user logs into the application, how many queries are triggered?

  2. On average, how many of the core queries are running in the application concurrently?

  3. Do large spikes in traffic occur? If so, what is the minimum number of queries? The maximum?

  4. How is concurrency growing? By what percentage over what time?

Partitions

As part of capacity planning you need to determine how many partitions are required for optimal performance.

SingleStore is a distributed cluster of nodes that talk to each other. To ensure maximum performance you should have multiple cores to handle queries against the same database partitions on a node.

The performance also depends on the types of queries run, whether they implement distributed joins, and how many are run concurrently.

The general recommendation for most clusters is to have 4 CPU cores per database partition on each leaf. This is based on the fact that every query is a single-threaded operation.

For example, if you have a cluster of 4 leaf nodes with 16 cores on each leaf (64 CPU cores in total across all leaf nodes), you should have 4 partitions on each leaf, that is, 16 partitions in total across the cluster.

It's also important to choose the number of partitions that is divisible by the number of leaves you have. In the above example, a database with 16 partitions within a cluster of 4 leaves will divide those 16 partitions evenly across the 4 leaves (16 / 4 = 0).

However, if you set 18 partitions instead of 16 then you will not have evenly distributed data. 18 (partitions) / 4 (leaves) = 2, which means two leaves would have an extra partition each. The number of partitions modulo the number of leaves should always equal zero.

You should consider the workload when deciding on the number of partitions per leaf. For instance, if you have many parallelized queries that access the same information, it is advisable to have more cores per partition. But if the scale of your application is smaller and queries can be run sequentially or are not run often, you can consider a lesser number of cores per partition. SingleStore does not advise having a small number of just 1 or 2 cores per partition within any production environment.

By default, the partition count and core count match 1:1, and you need to manually change the number of partitions to maximize the performance. The number of partitions on a database is determined by either of the two ways:

  • Explicitly stating the number of partitions when creating the database. This can be done by adding PARTITIONS=X where X equals the number of partitions you want for that database.

  • By multiplying the number of leaves in the cluster with the value of the default_partitions_per_leaf variable.

Questions

  1. How frequently are queries with distributed joins run?

  2. How many queries are run concurrently?

  3. Is the number of partitions chosen divisible by the number of leaves in the cluster?

  4. How frequently are parallelized queries that access the same information run?

Comparative Performance for Capacity Planning

To estimate a workspace size, review the examples below and select the one that most closely resembles your requirements.

To determine the optimal workspace size, test the core queries with the selected workspace size and either increase or decrease the size until it fulfills your latency requirements.

Workspace Sizing Examples

Workspace Size

Small

Medium

Large

Use Case

Support ultra-fast, high-performance queries and massive concurrency while minimizing monthly costs

Support real-time analytics and user-defined metrics while minimizing the time to onboard new tenants

Maintain existing customers and prepare for significant customer growth by increasing platform availability and resiliency, and optimizing application responsiveness

Data Size *

500GB

7TB

1TB

Data Ingest

< 5,000 RPS

Current solution takes 40 - 60 minutes to fully load product data. Load times for SingleStore Helios do not increase the overall ETL duration by more than 10%

While not specifically tested on an S-8 cluster, load times tested on an S-6 cluster with a single tenant (~135s) and seven tenants concurrently (~340s) demonstrated that replacing the existing database with SingleStore Helios would significantly reduce ETL times.

< 5,000 inserts/second from the application

Query Latency

Sub-second

P50 query latency of < 1s

P95 query latency of < 5s

P99 query latency of < 10s

(About latency)

<100ms for 27 different queries

Query Shapes

Analytical queries/summary statistics on 4 fact tables of 2M records each, and 20 reference tables of approximately 5,000 records each

Medium complexity analytical queries

SingleStore Helios was evaluated as a means to power an analytics system and provide customized, user-defined metrics within the customer’s app, where “user-defined metrics” are metrics and signals that customers wanted to track/use within their individual environment.

Analytical (large table scans and aggregations)

Concurrency tested (QPS)

10 - 50

100

100

(achieved with 16:1 CPU:partition ratio)

Proposed Workspace Size

(About workspace sizes)

S-1

S-8

S-32

* Data sizes shown are compressed, where the compression rate can vary up to 80% based on the type of data.

Last modified: August 23, 2024

Was this article helpful?