AWS EC2 Best Practices
This document summarizes findings from technical assistance provided by SingleStoreDB engineers to customers operating production SingleStoreDB environments on Amazon EC2.
SingleStoreDB Aggregator Node - What is a SingleStoreDB Aggregator?
SingleStoreDB Leaf Node - What is a SingleStoreDB Leaf?
SingleStoreDB will run on any Amazon Machine Image (AMI) with a kernel version of 3.
SingleStoreDB is a shared-nothing MPP cluster of servers.
The recommended decision making method is a two-step process:
Select the proper instance type for the SingleStoreDB server; this will be the cluster's building block.
Determine the required number of instances to scale out the cluster capacity horizontally to meet storage, response time, concurrency, and service availability requirements.
The basic principle when provisioning virtualized resources for a SingleStoreDB server is to allocate CPU, RAM, storage I/O and Networking in a balanced manner so no particular resource becomes a bottleneck, leaving other resources underutilized.
EC2 users should keep in mind that Amazon EC2 dedicates some resources of the host computer, such as CPU, memory, and instance storage, to a particular instance.
AWS Instance Types
SingleStore recommends provisioning a minimum of 8GB of RAM per physical core or virtual processor.
When selecting an instance type as a building block of a cluster in a production environment, users should only consider instances with 8GB or more RAM per vCPU.
Several available instance types meet the above guideline.
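The RAM-per-vCPU guideline above can be applied as a simple screening step. The sketch below checks candidate instance shapes against the 8 GB per vCPU minimum; the specs shown are illustrative placeholders, so always confirm against the current EC2 instance listings.

```python
# Screen candidate EC2 instance types against the >= 8 GB RAM per vCPU
# guideline. The specs below are illustrative placeholders, not live
# AWS data -- confirm against the current EC2 instance documentation.
CANDIDATES = {
    "r5.4xlarge": {"vcpus": 16, "ram_gb": 128},  # memory-optimized
    "m5.4xlarge": {"vcpus": 16, "ram_gb": 64},   # general purpose
    "c5.4xlarge": {"vcpus": 16, "ram_gb": 32},   # compute-optimized
}

def meets_guideline(spec, min_gb_per_vcpu=8):
    """Return True if the instance provides at least min_gb_per_vcpu."""
    return spec["ram_gb"] / spec["vcpus"] >= min_gb_per_vcpu

suitable = [name for name, spec in CANDIDATES.items() if meets_guideline(spec)]
print(suitable)  # only the memory-optimized shape passes: ['r5.4xlarge']
```

As the filter suggests, memory-optimized families tend to satisfy the guideline, while compute-optimized families usually do not.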
FAQ: Is it better to have a smaller number of bigger machines or a larger number of smaller machines?
Answer: Though there is no one-size-fits-all answer, a small number of medium-size instances may be a good starting point.
Note: If you select a multi-socket EC2 instance, you must enable NUMA and deploy one leaf node per socket.
SingleStore recommends 10 Gbps networking and deploying clusters in a single availability zone on a VPC for latency and bandwidth considerations.
You can set up replication in a secondary cluster for DR configuration in a different availability zone in the same region or a different region.
Nitro-based instance types support network bandwidth of up to 100 Gbps, depending on the instance type selected.
Select the instance type based on the memory requirements of the aggregator and leaf nodes.
EBS volume IOPS should also be supported by the EC2 instance type selected.
As a general rule, all EC2 instances of a cluster should be configured on a single subnet.
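The single-subnet rule can be verified mechanically. This standard-library sketch checks that every node address falls inside the cluster's subnet; the CIDR block and node IPs are hypothetical placeholders.

```python
# Sanity-check that every node of a cluster sits in the same subnet,
# using only the standard library. The CIDR and addresses below are
# hypothetical placeholders for a real deployment inventory.
import ipaddress

SUBNET = ipaddress.ip_network("10.0.1.0/24")   # the cluster's subnet
NODE_IPS = ["10.0.1.10", "10.0.1.11", "10.0.1.12", "10.0.1.20"]

strays = [ip for ip in NODE_IPS if ipaddress.ip_address(ip) not in SUBNET]
if strays:
    raise RuntimeError(f"nodes outside the cluster subnet: {strays}")
print("all nodes share one subnet")
```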
All AWS customers get a VPC allocated to the account.
For DR configurations or geographically distributed data services, customers can provision two or more clusters, each in a dedicated VPC, typically in separate regions, with VPC peering between VPCs (regions).
The following examples illustrate use cases leveraging VPC peering:
A SingleStoreDB environment includes a primary cluster in one region and a secondary cluster in a different geography (region) for DR.
Connectivity between the primary and secondary sites is provided by VPC peering.
A cluster is ingesting data subscribing to a Kafka feed.
A customer would typically set up the Kafka cluster in one VPC and the SingleStoreDB cluster in a different VPC, with VPC peering to connect the SingleStoreDB database to Kafka.
For VPC peering setup, scenarios, and configuration guidance see the VPC Peering Guide.
Partition placement groups help reduce the likelihood of correlated hardware failures for your application.
SingleStore recommends using Partition Placement Groups with SingleStoreDB Availability Groups to implement HA solutions.
Aligning AWS Availability Zones with SingleStoreDB Availability Groups
If you are considering using AWS Availability Zones and SingleStoreDB Availability Groups to add
AZ level robustness to operating environments, note the following:
SingleStoreDB operates most efficiently when all nodes of a cluster are within a single subnet.
When provisioning EBS volume capacity per application data retention requirements, SingleStoreDB EC2 administrators need to include a safety margin ("fudge factor") and ensure that no production environment is operated with less than 40% free storage space on the data volume.
For rowstore SingleStoreDB deployments, provision a storage system for each node with at least 3 times the capacity of main memory; for columnstore workloads, provision SSD-based storage volumes.
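The two sizing rules above (at least 40% free space on the data volume, and at least 3x main memory for rowstore) can be combined into a quick back-of-the-envelope calculation. The figures used here are illustrative only.

```python
# Back-of-the-envelope EBS sizing for a leaf node, applying two rules
# from the text: keep >= 40% of the data volume free, and for rowstore
# provision at least 3x main memory. All numbers are illustrative.
def min_volume_gb(expected_data_gb, ram_gb, rowstore=True, free_fraction=0.40):
    # The volume must hold the data while leaving free_fraction unused.
    size_for_free_space = expected_data_gb / (1 - free_fraction)
    # Rowstore rule: at least 3x the node's main memory.
    size_for_rowstore = 3 * ram_gb if rowstore else 0
    return max(size_for_free_space, size_for_rowstore)

# A node with 128 GB RAM expecting 200 GB of on-disk data:
print(round(min_volume_gb(200, 128)))  # -> 384 (the 3x RAM rule dominates)
```

For larger data sets the free-space rule takes over: the same 128 GB node expecting 600 GB of data would need roughly 1,000 GB of volume capacity.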
Please note that an under-configured EC2 storage is a common root cause of inconsistent SingleStoreDB EC2 cluster performance.
SingleStoreDB is a shared-nothing MPP system, i.e., each server owns and manages its own local storage. To ensure permanent SingleStoreDB server storage, users need to provision EBS volumes and attach them to SingleStoreDB servers (EC2 instances).
EBS is a disk subsystem that is shared among instances, which means that SingleStoreDB EC2 users may encounter somewhat unpredictable variations in I/O performance across SingleStoreDB servers, and even performance variability of the same EBS volume over time. This variability can be caused by contention for shared infrastructure (the "noisy neighbor" problem), by file replication for availability, by EBS rebalancing, and so on. The elastic nature of EBS means that the system is designed to monitor utilization of underlying hardware assets and automatically rebalance itself to avoid hotspots.
To maximize consistency and performance characteristics of EBS, SingleStore encourages users to follow the general AWS recommendation to attach multiple EBS volumes to an instance and stripe across the volumes.
Users can consider attaching 3-4 EBS volumes to each leaf server (instance) of the cluster and present this storage to the host as a software RAID0 device.
Studies show that there is an appreciable increase in RAID0 performance up to 4 EBS volumes in a stripe, with flattening after 6 volumes.
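To a first approximation, RAID0 aggregates the IOPS and throughput of its member volumes, which is why striping 3-4 volumes helps. The estimator below is a rough sketch; the per-volume figures are placeholders, not AWS performance commitments, and real stripes fall short of perfectly linear scaling.

```python
# Rough estimate of the aggregate capability of a RAID0 stripe over N
# identical EBS volumes. RAID0 scales roughly linearly with volume
# count (the text notes gains up to ~4 volumes, flattening after 6);
# the per-volume figures below are placeholders, not AWS commitments.
def raid0_estimate(n_volumes, per_vol_iops, per_vol_mbps):
    return {"iops": n_volumes * per_vol_iops,
            "throughput_mbps": n_volumes * per_vol_mbps}

# Four hypothetical volumes at 3000 IOPS / 250 MB/s each:
print(raid0_estimate(4, 3000, 250))  # {'iops': 12000, 'throughput_mbps': 1000}
```

On the host, such a stripe is typically assembled with a software RAID tool such as mdadm and then formatted and mounted as a single device.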
For more information, see the AWS documentation section Amazon EBS Volume Performance on Linux Instances.
SingleStoreDB EC2 customers with extremely demanding database performance requirements may consider provisioning enhanced EBS types such as io1, delivering very high IOPS rates.
The General Purpose SSD (gp2) option provides a good balance of performance and cost for most deployments.
You do not have to over-provision your EBS volumes based on future expected workloads.
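When weighing gp2 against io1, it helps to recall how gp2 baseline performance scales with volume size. The helper below encodes AWS's published gp2 model at the time of writing (3 IOPS per GiB, floored at 100 and capped at 16,000); verify the current EBS documentation before relying on these numbers.

```python
def gp2_baseline_iops(size_gib):
    """Baseline IOPS for a gp2 volume: 3 IOPS per GiB, with a floor of
    100 and a cap of 16,000 (per AWS's published gp2 performance model
    at the time of writing -- verify against current EBS docs)."""
    return min(max(3 * size_gib, 100), 16_000)

for size in (20, 500, 6000):
    print(size, "GiB ->", gp2_baseline_iops(size), "baseline IOPS")
```

The takeaway: small gp2 volumes deliver only the 100-IOPS floor, so workloads needing guaranteed high IOPS on small volumes are candidates for io1 instead.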
As a reminder, SingleStoreDB provides native out-of-the-box fault tolerance.
In SingleStoreDB database environments running on physical hardware, SingleStore recommends supplementing a cluster’s fault tolerance with storage level redundancy supported by hardware RAID controllers.
However, in EC2 environments storage-level redundancy provisions are not applicable because:
EBS volumes are not statistically independent (they may share the same physical network and storage infrastructure).
Studies and customer experience show that the performance of software RAID in a redundant configuration, in particular RAID5 over EBS volumes, is below acceptable levels.
For fault tolerance, SingleStoreDB EC2 users can rely on cluster level redundancy and under-the-cover mirroring of EBS volumes provided by AWS.
EC2 instance types that meet recommendations for a SingleStoreDB server typically come with preconfigured temporary block storage referred to as instance store or ephemeral store.
However due to instance storage’s ephemeral nature, proper care must be taken (configure HA, understand the limitations and potential risks) when deploying persistent data storage in a production environment.
The use of instance storage for SingleStoreDB data is typically limited to scenarios where the database can be reloaded entirely from persistent backups or custom save points: a development sandbox, one-time data mining or ad hoc analytics, or cases where data files loaded since the last save point are preserved and may be used to restore the latest content.
SingleStore recommends enabling EBS encryption.
SingleStore recommends backing up to an S3 bucket with cross-region replication enabled to protect against region failure and to meet disaster recovery requirements.
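Cross-region replication on the backup bucket is configured with an S3 replication configuration document. The sketch below builds one shaped like the payload that `put-bucket-replication` accepts; the bucket names and IAM role ARN are placeholders, and the rule shown replicates every object in the bucket.

```python
import json

# Sketch of an S3 cross-region replication configuration for a backup
# bucket, shaped like the payload `put-bucket-replication` accepts.
# Bucket names and the IAM role ARN below are placeholders.
replication_config = {
    "Role": "arn:aws:iam::123456789012:role/s3-replication-role",
    "Rules": [{
        "ID": "backup-crr",
        "Status": "Enabled",
        "Priority": 1,
        "Filter": {},                      # empty filter: replicate all objects
        "DeleteMarkerReplication": {"Status": "Disabled"},
        "Destination": {"Bucket": "arn:aws:s3:::singlestore-backups-replica"},
    }],
}
print(json.dumps(replication_config, indent=2))
```

The destination bucket lives in a different region, so a region-wide failure of the primary leaves the replicated backups intact.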
Application clients access a SingleStoreDB database cluster by connecting to aggregator nodes.
One common option is an application-side connection pool.
Sophisticated connection pool implementations offer load balancing, failover and failback, and even multi-pool failover and failback.
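The load-balancing and failover behavior described above can be reduced to a small amount of host-selection logic. This is a minimal sketch, not a real driver: it round-robins over aggregator hosts, skips hosts marked down, and lets a recovered host rejoin the rotation (failback); the host names are hypothetical.

```python
import itertools

# Minimal sketch of client-side load balancing across aggregator nodes:
# round-robin over the configured aggregators, skipping hosts marked
# down and letting recovered hosts rejoin (failback). A real connection
# pool adds health checks and connection reuse on top of this idea.
class AggregatorPool:
    def __init__(self, hosts):
        self.hosts = list(hosts)
        self.down = set()
        self._rr = itertools.cycle(self.hosts)

    def mark_down(self, host):
        self.down.add(host)

    def mark_up(self, host):          # failback: host rejoins rotation
        self.down.discard(host)

    def next_host(self):
        for _ in range(len(self.hosts)):
            host = next(self._rr)
            if host not in self.down:
                return host
        raise RuntimeError("no aggregators available")

pool = AggregatorPool(["agg1:3306", "agg2:3306", "agg3:3306"])
pool.mark_down("agg2:3306")
print([pool.next_host() for _ in range(4)])
# -> ['agg1:3306', 'agg3:3306', 'agg1:3306', 'agg3:3306']
```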
Another option is the AWS Network Load Balancer (NLB) service.
Expiring security certificates can be a security risk.
Last modified: October 31, 2023