The following components are fundamental to a SingleStore DB cluster. They are also key terms that are used throughout the documentation.
A node is a server that has an installation of an SingleStore DB instance. See Node Requirements for SingleStore DB Commands to learn about the commands supported by different types of nodes.
A leaf is a node that stores a subset of a cluster’s data. A cluster typically contains many leaves.
A partition contains a subset (a shard) of a database’s data. A leaf contains multiple partitions. When you run CREATE DATABASE, SingleStore DB splits the database into partitions, which are distributed evenly among available leaves. With
CREATE DATABASE, you can specify the number of partitions with the
PARTITIONS=X option. If you don’t specify the number of partitions explicitly, the default is used (the number of leaves times the value of the
default_partitions_per_leaf engine variable.
Types of Tables
A distributed table’s data is distributed across partitions. By contrast, a reference table’s data is replicated, in full, to all nodes in a cluster.
A temporary table is distributed, but is not persisted.
An aggregator is a node that routes queries to the leaves, aggregates intermediate the results, and sends the results back to the client. There are two types of aggregators: master and child. A cluster contains exactly one master aggregator, a specialized aggregator responsible for cluster monitoring and failover. A cluster may also contain zero or more child aggregators (depending on query volume).
Aggregator to Leaf Ratio
The minimal setup for a SingleStore DB cluster is just one aggregator (the master aggregator) and one leaf. You can add more aggregators, which will read metadata from the master aggregator, and can run DML commands on the leaves.
The number of deployed aggregator and leaf nodes determines the storage size and performance of a cluster. Typical deployments have a 5:1 ratio of leaf:aggregator nodes. In a well-designed cluster:
- Applications that require higher connection capabilities from application servers have a higher aggregator-to-leaf node ratio.
- Applications with larger storage requirements have a higher leaf-to-aggregator node ratio.
Interaction of Cluster Components
You run database queries against an aggregator. Most of the communication between aggregators and leaves for query execution is also implemented as SQL statements.
Parallel Execution of Queries
By default, SingleStore DB will create one partition per CPU core on the leaves for maximum parallelism. This number is configurable cluster-wide with the
default-partitions-per-leaf variable, or as an optional parameter to CREATE DATABASE.
In the context of query execution, a partition is the granular unit of query parallelism. In other words, every parallel query is run with a level of parallelism equal to the number of partitions. You can view the partitions in a database with SHOW PARTITIONS.