System Requirements and Recommendations

The following are some requirements and recommendations you should follow when provisioning and setting up your hosts to optimize the performance of your cluster.

Cloud Deployment Recommendations

For cloud deployments, all instances should be deployed in a single geographic region.

Here are some recommendations to help optimize for performance:

  • Cross-AZ / Multi-AZ failover is not recommended. Please see the Recommended Configurations to Tolerate Failure of a Cloud AZ or Nearby Data Center.

  • Network throughput: look for guaranteed throughput rather than a burstable "up to" amount. For example, favor "10 Gbps" over "Up to 10 Gbps".

  • For memory-intensive workloads, consider the memory optimized SKUs, typically with a ratio of 8 GB of memory per vCPU.

  • Optimize for NUMA (Non-Uniform Memory Access). Queries across NUMA nodes can be expensive, so prefer VM types whose CPUs expose a single NUMA node.

  • For storage, use SSD disks at a minimum. If storage performance is still an issue with standard SSD, consider provisioned SSD with higher IOPS and throughput, though there is a cost trade-off.

  • Each leaf node should map to a separate SSD disk. Parallel I/O is important due to limitations in disk I/O.

  • Use a Network Load Balancer (NLB) for TCP (Layer 4) connectivity. Do not use the classic load balancer option in AWS.

Here are some platform-specific recommendations:

| Platform | Compute | Storage |
| --- | --- | --- |
| AWS | Memory Optimized: r5.4xlarge (16 vCPU and 128 GB of RAM) | EBS volumes with SSD: gp3. For higher throughput at a higher cost, use provisioned IOPS: io2 |
| Azure | Memory Optimized SKU: eds_v5, for example Standard_E16ds_v5 (16 vCPU and 128 GB of RAM) | Managed Disks: LRS only. Premium SSD, or Ultra SSD for more performance at a higher cost |
| GCP | General Purpose SKU with an 8:1 ratio: N2 series, for example n2-highmem-16 (16 vCPU and 128 GB of RAM). Minimum CPU platform for n2-highmem-16: Intel Ice Lake | SSD storage type: pd-ssd. pd-extreme is a more expensive option for higher-throughput provisioned storage |

The following are additional hardware recommendations for optimal performance:

| Component | Recommendation |
| --- | --- |
| CPU | For each host, an x86_64 CPU and a minimum of 4 cores (8 vCPUs). SingleStore is optimized for architectures supporting SSE4.2 and AVX2 instruction set extensions, but will run successfully on x86_64 systems without these extensions. Refer to Recommended CPU Settings (below) for more information. |
| Memory | A minimum of 8 GB of RAM available for each node. A minimum of 32 GB of RAM available for each leaf node. It is strongly recommended to run leaf nodes on hosts that have the same hardware and software specifications. |
| Storage | Provide a storage system for each node with at least 3x the capacity of main memory. SSD storage is recommended for columnstore workloads. Both ext4 and xfs filesystems are supported. Refer to Storage Requirements (below) for more information. |

Platform and Kernel

Our recommended platforms are:

  • Red Hat Enterprise Linux (RHEL) / AlmaLinux 7 or later

  • Debian 8 or 9 (version 9 is preferred)

CentOS 6, 7, and 8 are supported but are not recommended for new deployments. CentOS 8 has reached end of life and CentOS has discontinued numbered releases.

When provisioning your hosts, the minimum required Linux kernel version is 3.10. For SingleStore 8.1 and later, glibc 2.17 or later is also required.
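As a rough pre-flight check, the kernel and glibc minimums can be verified from a shell on each host. The following is a minimal sketch, not a SingleStore tool: `version_at_least` is a hypothetical helper, and the script assumes that `sort -V` (version sort) and `getconf GNU_LIBC_VERSION` are available, which is typical on glibc-based distributions.

```shell
#!/bin/sh
# version_at_least HAVE WANT: succeeds when HAVE >= WANT (hypothetical helper).
# Relies on the -V (version sort) option of sort.
version_at_least() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Kernel release without the distribution suffix, e.g. 3.10.0-1160.el7 -> 3.10.0
kernel="$(uname -r | cut -d- -f1)"
if version_at_least "$kernel" 3.10; then
  echo "kernel $kernel: OK"
else
  echo "kernel $kernel: older than 3.10"
fi

# glibc reports itself as "glibc 2.17"; keep only the version number.
# The check is skipped on hosts where getconf does not know GNU_LIBC_VERSION.
if libc="$(getconf GNU_LIBC_VERSION 2>/dev/null)"; then
  libc="${libc#glibc }"
  if version_at_least "$libc" 2.17; then
    echo "glibc $libc: OK"
  else
    echo "glibc $libc: older than 2.17"
  fi
fi
```

The script only reports; it does not change anything on the host.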

Toolbox Platform and Kernel Checkers

The following Toolbox checkers are run prior to deploying SingleStore.

Note that the associated sdb-report collect and sdb-report check Toolbox commands may also be run from the command line.

| Host Setting | Checker | Description | Host Configuration |
| --- | --- | --- | --- |
| cgroups | cgroupDisabled | Checks if control groups (cgroups) are disabled | Disabled in non-VM deployments. Run the kernel with the following arguments to disable cgroups: intel_pstate=disable cgroup_disable=memory |
| Defunct (Zombie) Processes | defunctProcesses | Checks if there are defunct (zombie) processes on each host | Kill all zombie processes on each host before deploying SingleStore |
| Filesystem Type | filesystemType | Checks if a host’s filesystem can support SingleStore | xfs and ext4 are supported filesystems |
| Kernel Version | kernelVersions | Checks for kernel version consistency | Kernel versions must be the same for all hosts |
| Linux Out-of-Memory Killer | OutOfMemory | Checks in dmesg for invocations of the Linux out-of-memory killer | The Linux out-of-memory killer has not been invoked on any host |
| Major Page Faults | majorPageFaults | Checks the number of major page faults per second on each host and determines if it’s acceptable | Major page faults per second on each host: < 10: Host recommended; 10 - 20: Use host with caution; > 20: Host not recommended |
| Orchestrator Processes | orchestratorProcesses | Checks if any orchestrator process is found on any host | Not a requirement, and no action is required. The sdb-report command simply notifies you if one is found. |
| Proc Filesystem (procfs) | procFs | Collects diagnostic files from /proc/fs, specifically: "info", "options", and "es_shrinker_info" | |
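The defunct process check can be approximated by hand before running the Toolbox checkers. The sketch below assumes a `ps` that accepts the `-eo` output format options (as procps does); `list_zombies` is a hypothetical helper that filters for processes whose state starts with Z. A zombie cannot be signaled directly; it disappears once its parent reaps it or exits, which is why the parent PID is included in the output.

```shell
#!/bin/sh
# list_zombies: reads "PID PPID STAT COMMAND" rows on stdin and prints the
# rows whose state column starts with Z (defunct). Hypothetical helper.
list_zombies() {
  awk '$3 ~ /^Z/ { print }'
}

# Feed it the process table; the trailing "=" signs suppress the header row.
ps -eo pid=,ppid=,stat=,comm= | list_zombies
```

No output means no zombie processes were found on the host.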

Configure File Descriptor and Maximum Process Limits

A SingleStore cluster uses a substantial number of client and server connections between aggregators and leaf nodes to run queries and cluster operations. SingleStore recommends setting the Linux file descriptor and maximum process limits to the values listed below to account for these connections. Failing to increase these limits can significantly degrade performance and even cause connection limit errors. The ulimit settings can be configured in the /etc/security/limits.conf file, or directly via shell commands.

Permanently increase the open files limit and the max user processes limit for the memsql user by editing the /etc/security/limits.conf file as the root user and adding the following lines:

memsql soft nofile 1024000
memsql hard nofile 1024000
memsql soft nproc 128000
memsql hard nproc 128000

Note

Each node must be restarted for the changed ulimit settings to take effect.

The file-max setting configures the maximum number of file handles (file descriptor limit) for the entire system. In contrast, ulimit settings are enforced only at the process level. Hence, the file-max value must be higher than the nofile limit. Increase the maximum number of file handles configured for the entire system in /proc/sys/fs/file-max. To make the change permanent, append or modify the fs.file-max line in the /etc/sysctl.conf file.
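The relationship between the two limits can be checked with a short script. This is a minimal sketch under the assumption that the per-process limit is the 1024000 value recommended above; `file_max_exceeds_nofile` is a hypothetical helper, not part of any SingleStore tool.

```shell
#!/bin/sh
# file_max_exceeds_nofile FILE_MAX NOFILE: succeeds when the system-wide
# limit is strictly higher than the per-process limit. Hypothetical helper.
file_max_exceeds_nofile() {
  [ "$1" -gt "$2" ]
}

nofile=1024000   # per-process open files limit recommended above
if [ -r /proc/sys/fs/file-max ]; then
  file_max="$(cat /proc/sys/fs/file-max)"
  if file_max_exceeds_nofile "$file_max" "$nofile"; then
    echo "fs.file-max=$file_max is above nofile=$nofile"
  else
    echo "fs.file-max=$file_max is too low; raise it above $nofile"
  fi
fi
```

If the value is too low, `sudo sysctl -w fs.file-max=<value>` applies a new value immediately, while the fs.file-max line in /etc/sysctl.conf keeps it across reboots. The value to use depends on your deployment.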

Configure the Linux nice Setting

Given how the Linux kernel calculates the maximum nice limit, SingleStore recommends that you modify the /etc/security/limits.conf file and set the maximum nice limit to -10 on each Linux host in the cluster. This will allow the SingleStore engine to run some threads at higher priority, such as the garbage collection threads.

To apply this new nice limit, restart each SingleStore node in the cluster.

Alternatively, you may set the default nice limit to -10 on each Linux host in the cluster prior to deploying SingleStore.
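For illustration, the corresponding /etc/security/limits.conf entry could look like the following. This is a sketch that assumes the nodes run as the memsql user, as in the file descriptor example above; the `-` type sets both the soft and the hard limit. Adjust the user name to match your deployment.

```text
memsql - nice -10
```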

Configure Linux ulimit Settings

Most Linux operating systems provide ways to control the usage of system resources such as threads, files, and network connections at an individual user or process level. The per-user limits on resources are called ulimits, and they prevent a single user from consuming too many system resources. For optimal performance, SingleStore recommends setting ulimits to higher values than the default Linux settings. The ulimit settings can be configured in the /etc/security/limits.conf file, in a file under the /etc/security/limits.d directory, or directly via shell commands.
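To see whether the limits currently in effect meet the recommended values, they can be read with the shell's `ulimit` builtin. The following is a minimal sketch: `limit_at_least` and `report_limit` are hypothetical helpers, the thresholds are the 1024000 and 128000 values from the limits.conf example above, and the script should be run as the user that starts the nodes. The `-u` option is not available in every shell, in which case that limit is reported as unavailable.

```shell
#!/bin/sh
# limit_at_least CURRENT REQUIRED: succeeds when CURRENT is "unlimited" or a
# number >= REQUIRED. Anything that is not a number fails. Hypothetical helper.
limit_at_least() {
  case "$1" in
    unlimited) return 0 ;;
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$1" -ge "$2" ]
}

# report_limit NAME FLAG REQUIRED: prints the current value of the limit
# selected by FLAG and whether it meets REQUIRED.
report_limit() {
  current="$(ulimit "$2" 2>/dev/null)" || current="unavailable"
  if limit_at_least "$current" "$3"; then
    echo "$1: $current (OK)"
  else
    echo "$1: $current (recommended: at least $3)"
  fi
}

report_limit "open files" -n 1024000
report_limit "max user processes" -u 128000
```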

Configure Linux vm Settings

SingleStore recommends letting first-party tools, such as sdb-admin and memsqlctl, configure your vm settings to minimize the likelihood of getting memory errors on your hosts. The default values used by the tools are the following:

  • vm.max_map_count set to 1000000000

  • vm.overcommit_memory set to 0. WARNING: vm.overcommit_memory should be set to 0. Using values other than 0 is recommended only for systems with swap areas larger than their physical memory. Please consult your distribution documentation.

  • vm.overcommit_ratio is ignored unless vm.overcommit_memory is set to 2, in which case set it to 99. See the warning above for vm.overcommit_memory values other than 0.

  • vm.min_free_kbytes set to either 1% of system RAM or 4 GB, whichever is smaller

  • vm.swappiness set between 1 and 10

If the SingleStore Tools cannot set the values for you, you will get an error message stating what the value should be and how to set it. You can set the values manually using the /sbin/sysctl command, as shown below.

sudo sysctl -w vm.max_map_count=1000000000
sudo sysctl -w vm.min_free_kbytes=<either 1% of system RAM or 4 GB, whichever is smaller>
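The value for vm.min_free_kbytes can be derived from the host's total memory. The sketch below assumes that "4 GB" means 4 x 1024 x 1024 kB and that 1% is computed with integer division; `recommended_min_free_kbytes` is a hypothetical helper. It prints the resulting sysctl command instead of running it.

```shell
#!/bin/sh
# recommended_min_free_kbytes MEM_TOTAL_KB: prints the smaller of 1% of RAM
# and 4 GB, expressed in kB. Hypothetical helper; integer division rounds down.
recommended_min_free_kbytes() {
  one_percent=$(( $1 / 100 ))
  four_gb_kb=$(( 4 * 1024 * 1024 ))
  if [ "$one_percent" -lt "$four_gb_kb" ]; then
    echo "$one_percent"
  else
    echo "$four_gb_kb"
  fi
}

# MemTotal in /proc/meminfo is reported in kB.
if [ -r /proc/meminfo ]; then
  mem_kb="$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)"
  echo "sudo sysctl -w vm.min_free_kbytes=$(recommended_min_free_kbytes "$mem_kb")"
fi
```

For a host with 16 GB of RAM (16777216 kB) the 1% branch applies; the 4 GB cap only takes effect on hosts with more than roughly 400 GB of RAM.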

Toolbox vm Checkers

The following Toolbox checkers are run prior to deploying SingleStore.

| Host Setting | Checker | Description | Host Configuration |
| --- | --- | --- | --- |
| Max Map Count | maxMapCount | Checks that vm.max_map_count is at least 1000000000 | The vm.max_map_count kernel setting should be at least 1000000000 |
| Minimum Free Kilobytes | minFreeKbytes | Checks if vm.min_free_kbytes is greater than the recommended minimum | The value of the vm.min_free_kbytes kernel setting is at least 1% of system RAM or 4 GB, whichever is smaller |
| Swappiness | vmSwappiness | Checks the value of vm.swappiness | The swappiness value (0 - 100) affects system performance as it controls when swapping is activated, and how swap space is used. When set to lower values, the kernel will use less swap space. When set to higher values, the kernel will use more swap space. Recommended: Swappiness should never be set to 0. |
| vmOvercommit | vmOvercommit | Checks if the vmOvercommit kernel setting is too low. By design, Linux kills processes that are consuming large amounts of memory when the amount of free memory is deemed to be too low. Overcommit settings that are set too low may cause frequent and unnecessary failures. Refer to Configuring System Memory Capacity for more information. | Providing virtual memory without guaranteeing physical storage for it: vm.overcommit_memory = 0: Host recommended; vm.overcommit_memory = 2 and vm.overcommit_ratio = 99: Use host with caution; vm.overcommit_memory = 1: Host not recommended |

Configure Swap Space

It is recommended that you create a swap partition (or swap file on a dedicated device) to serve as an emergency backing store for RAM. SingleStore makes extensive use of RAM (especially with rowstore tables), so it is important that the operating system does not immediately start killing processes if SingleStore runs out of memory. Because typical hosts running SingleStore have a large amount of RAM (> 32 GB/node), the swap space should not be small: provision at least 10% of physical RAM.

For more information on setting up and configuring swap space, please refer to your distribution’s documentation.

After enabling these settings, your hosts will be configured for optimal performance when running one or more SingleStore nodes.
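The 10% guideline for swap can be checked against /proc/meminfo. This is a minimal sketch; `swap_is_sufficient` is a hypothetical helper that compares the two totals using integer arithmetic.

```shell
#!/bin/sh
# swap_is_sufficient SWAP_KB MEM_KB: succeeds when swap is at least 10% of
# RAM, i.e. when SWAP_KB * 10 >= MEM_KB. Hypothetical helper.
swap_is_sufficient() {
  [ $(( $1 * 10 )) -ge "$2" ]
}

# MemTotal and SwapTotal in /proc/meminfo are both reported in kB.
if [ -r /proc/meminfo ]; then
  mem_kb="$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)"
  swap_kb="$(awk '/^SwapTotal:/ { print $2 }' /proc/meminfo)"
  if swap_is_sufficient "${swap_kb:-0}" "${mem_kb:-0}"; then
    echo "swap: ${swap_kb:-0} kB (at least 10% of RAM)"
  else
    echo "swap: ${swap_kb:-0} kB (less than 10% of ${mem_kb:-0} kB RAM)"
  fi
fi
```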

Toolbox Swap Space Checkers

The following Toolbox checkers are run prior to deploying SingleStore.

| Host Setting | Checker | Description | Host Configuration |
| --- | --- | --- | --- |
| Swap | swapEnabled | Checks if swapping is enabled. | Recommended: Enabled. The total swap memory on each host should be >= 10% of the total RAM (total physical memory) |
| Swap Usage | swapUsage | Checks if the swap space that is actively being used is less than 5% | < 5%: Host recommended; 5 - 10%: Use host with caution; > 10%: Host not recommended |

Configure Transparent Huge Pages

Linux organizes RAM into pages that are usually 4 KB in size. Using transparent huge pages (THP), Linux can instead use 2 MB pages or larger. As a background process, THP transparently reorganizes the memory used by a process inside the kernel, either by merging small pages into huge pages or by splitting huge pages into small pages. This can block the memory manager for up to a few seconds and prevent the process from accessing memory during that time.

As SingleStore uses a lot of memory, SingleStore recommends that you disable THP at boot time on all nodes (master aggregator, child aggregators, and leaf nodes) in the cluster. THP lag may result in inconsistent query run times or high system CPU (also known as red CPU).

To disable THP, add the following lines to the end of /etc/rc.local before the exit line (if present), and reboot the host:

echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
echo 0 > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag
echo no > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
echo 0 > /sys/kernel/mm/redhat_transparent_hugepage/khugepaged/defrag
echo no > /sys/kernel/mm/redhat_transparent_hugepage/khugepaged/defrag

On Red Hat distributions, THP will be under redhat_transparent_hugepage; on other distributions of Linux it will just be transparent_hugepage. You can check which your system uses by running ls /sys/kernel/mm/. Keep only the four relevant settings.

The khugepaged/defrag option will be 1 or 0 on newer Linux versions (e.g. CentOS 7+), but yes or no on older versions (e.g. CentOS 6). You can check which your system uses by running cat /sys/kernel/mm/*transparent_hugepage/khugepaged/defrag. Keep only the matching setting. For example, if you see 1 or 0, keep the line with echo 0; if you see yes or no, keep the line with echo no.

You should end up with at least three lines of settings, for enabled, defrag, and khugepaged/defrag. For example, the correct configuration for CentOS 7 is:

echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
echo 0 > /sys/kernel/mm/redhat_transparent_hugepage/khugepaged/defrag

Typically, you may include all eight settings and THP will still be disabled as expected.

Note

Refer to the documentation for your operating system for more information on how to disable THP.
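The selection rules described above (which directory exists, and whether khugepaged/defrag uses 0/1 or yes/no) can be sketched as a small script. Both `thp_dir` and `khugepaged_off_value` are hypothetical helpers, and the optional base directory argument exists only so the logic can be exercised outside /sys. The script prints the lines to place in /etc/rc.local rather than writing to /sys itself.

```shell
#!/bin/sh
# thp_dir [BASE]: prints the THP control directory that exists under BASE
# (default /sys/kernel/mm), trying the Red Hat name first. Hypothetical helper.
thp_dir() {
  base="${1:-/sys/kernel/mm}"
  for name in redhat_transparent_hugepage transparent_hugepage; do
    if [ -d "$base/$name" ]; then
      echo "$base/$name"
      return 0
    fi
  done
  return 1
}

# khugepaged_off_value DIR: prints "no" when DIR/khugepaged/defrag currently
# reports yes/no, and "0" otherwise. Hypothetical helper.
khugepaged_off_value() {
  case "$(cat "$1/khugepaged/defrag" 2>/dev/null)" in
    *yes*|*no*) echo no ;;
    *)          echo 0 ;;
  esac
}

# Print (rather than run) the three lines for this host.
if dir="$(thp_dir)"; then
  echo "echo never > $dir/enabled"
  echo "echo never > $dir/defrag"
  echo "echo $(khugepaged_off_value "$dir") > $dir/khugepaged/defrag"
fi
```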

Toolbox THP Checkers

The following Toolbox checkers are run prior to deploying SingleStore.

| Host Setting | Checker | Description | Host Configuration |
| --- | --- | --- | --- |
| Transparent Huge Pages (THP) | transparentHugepage | Checks if transparent huge pages are disabled on each host in /sys/kernel/mm/transparent_hugepage/enabled and /sys/kernel/mm/transparent_hugepage/defrag | Recommended: Disabled. The value of these two settings must be either [never] or [madvise]. |

Configure Network Time Protocol Service

Install and run ntpd to ensure that system time is in sync across all nodes in the cluster.

For Debian-based distributions (like Ubuntu):

sudo apt-get install ntp

For Red Hat / CentOS / AlmaLinux:

sudo yum install ntp

Configure Cluster-on-Die Mode

If you are installing SingleStore natively and have access to the BIOS, you should enable Cluster-on-Die in the system BIOS for hosts with Haswell-EP and later x86_64 CPUs. When enabled, this will result in multiple NUMA regions being exposed per processor. SingleStore can take advantage of NUMA nodes by binding specific SingleStore nodes to those NUMA nodes, which in turn will result in higher SingleStore performance.

Configure NUMA

If the CPU(s) on your host support Non-Uniform Memory Access (NUMA), SingleStore can take advantage of that and bind SingleStore nodes to NUMA nodes. Binding SingleStore nodes to NUMA nodes allows faster access to in-memory data since individual SingleStore nodes only access data that’s collocated with their corresponding CPU.

If you do not configure SingleStore this way, performance will be greatly degraded due to expensive cross-NUMA-node memory access. Configuring for NUMA should be done as part of the installation process; however, you can reconfigure your deployment later, if necessary.

Note

Linux and numactl cannot detect when the virtual environment hosts have NUMA. Therefore, the processor and memory binding must be done by the management system of the virtual environment. The recommended setup is to configure the VMs with SingleStore nodes, such that processors and memory are bound or reserved for that NUMA node.

This configuration is recommended as it allows for VM portability with the associated performance improvements afforded by optimizing your system for NUMA.

SingleStore Tools can do the NUMA binding for you; however, you must have numactl installed first. Perform the following steps on each host:

  1. Log into each host and install the numactl package. For example, for a Debian-based OS:

    sudo apt-get install numactl
  2. For Red Hat / CentOS / AlmaLinux, run the following:

    sudo yum install numactl
  3. Check the number of NUMA nodes on your hosts by running numactl --hardware. For example:

    numactl --hardware
    available: 2 nodes (0-1)

    The output shows that there are 2 NUMA nodes on this host, numbered 0 and 1.

For additional information, see Configuring SingleStore for NUMA.
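When scripting the check in step 3, the node count can be extracted from the "available:" line of the numactl output. A minimal sketch follows; `numa_node_count` is a hypothetical helper and assumes the output format shown in the example above.

```shell
#!/bin/sh
# numa_node_count: reads `numactl --hardware` output on stdin and prints the
# node count from the "available: N nodes (...)" line. Hypothetical helper.
numa_node_count() {
  awk '/^available:/ { print $2 }'
}

# Only query the hardware when numactl is installed.
if command -v numactl >/dev/null 2>&1; then
  numactl --hardware | numa_node_count
fi
```

The resulting count can then be compared with the number of leaf nodes planned for the host.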

Toolbox CPU Checkers

The following Toolbox checkers are run prior to deploying SingleStore.

| Host Setting | Checker | Description | Host Configuration |
| --- | --- | --- | --- |
| CPU Features | cpuFeatures | Reads the content of cpuinfo and checks that the flags field contains the sse4_2 or avx2 extensions. | Recommended: sse4_2 and avx2 are enabled. SingleStore is optimized for architectures supporting SSE4.2 and AVX2 instruction set extensions, but will run successfully on x86_64 systems without these extensions. Refer to AVX2 Instruction Set Verification for more information on how to verify if your system supports AVX2. |
| CPU Frequency | cpuFreqInfo | Collects information about CPU frequency configuration | The following folders should exist: /sys/devices/system/cpu/cpufreq and /sys/devices/system/cpu. CPU frequency scaling data is collected from /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor |
| CPU Hyperthreading | cpuHyperThreading | Checks that hyperthreading is enabled on each host via the lscpu command (available on each host) | Hyperthreading is enabled |
| CPU Threading Configuration | cpuThreadingInfo | Collects information about CPU threading configuration | lscpu is available on each host. Hyperthreading is enabled if the number of threads per core > 1 |
| CPU Idle and Utilization | cpuIdle | Checks CPU utilization and idle time, specifically whether the CPU is frequently more than 5% idle. If not, this typically indicates that your workload will not have room to grow and that more cores will likely be required | Percentage of time the CPU is idle: > 25%: Recommended; 5 - 25%: Use host with caution; < 5%: Host not recommended |
| CPU and Memory Bandwidth | cpuMemoryBandwidth | Checks that CPU and memory bandwidth is appropriate for safe performance on your hosts. | mlc and lscpu are available on each host. CPU-memory bandwidth must be at least 4 Gb/s per CPU; CPU-memory latency should be 500000 ns max. |
| CPU Model | cpuModel | Checks if all CPU models are the same on each host | |
| CPU Power Control | cpuFreqPolicy | Checks that power saving and turbo mode settings on all hosts are disabled | turbo and powerSave are disabled |
| NUMA Configuration | numaConfiguration | Checks if SingleStore with NUMA is configured via numactl for optimal performance as described in Configuring SingleStore for NUMA | Enabled on each leaf node host and configured via numactl for optimal SingleStore performance: the SingleStore aggregator node should not be NUMA-bound; total NUMA memory capacity should be less than total host memory capacity; the number of NUMA nodes should not be less than the number of SingleStore leaf nodes; cpunodebind = membind for all hosts |
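The CPU feature check can be reproduced by reading the flags field of /proc/cpuinfo directly. The following is a minimal sketch rather than the checker's actual implementation: `has_cpu_flag` is a hypothetical helper, and its optional second argument (an alternative cpuinfo file) is there only to make the function easy to try on sample data.

```shell
#!/bin/sh
# has_cpu_flag FLAG [CPUINFO]: succeeds when FLAG appears as a whole word on a
# "flags" line of CPUINFO (default /proc/cpuinfo). Hypothetical helper.
has_cpu_flag() {
  awk -v want="$1" '
    /^flags[ \t]*:/ { for (i = 1; i <= NF; i++) if ($i == want) found = 1 }
    END { exit found ? 0 : 1 }
  ' "${2:-/proc/cpuinfo}" 2>/dev/null
}

for flag in sse4_2 avx2; do
  if has_cpu_flag "$flag"; then
    echo "$flag: present"
  else
    echo "$flag: missing"
  fi
done
```

Matching whole words matters here: a plain substring search for avx would also match avx2 and avx512f.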

Toolbox Memory Checkers

The following Toolbox checkers are run prior to deploying SingleStore.

| Host Setting | Checker | Description | Host Configuration |
| --- | --- | --- | --- |
| Committed Memory | memoryCommitted | Checks the committed memory on each host and determines if it’s acceptable | Committed memory on each host: < 30%: Host recommended; 70-90%: Use host with caution; > 90%: Host not recommended |
| Maximum Memory Settings | maxMemorySettings | Checks the host’s maximum memory settings | Maximum memory settings are recommended to be a percentage of the host's total memory, with a ceiling of 90%. Recommended: maximum_table_memory < 91%; total maximum_memory < either (0.91 * total RAM) minus 1 MB, or total RAM minus 10 GB, whichever is larger. As Kubernetes does not support swap, maximum_memory for the Operator is set to 80% of physical memory, or physical memory minus 10 GB, whichever is larger |

POSIX-Compliant

To maintain data durability and resiliency, SingleStore’s data directory (as defined by the datadir engine variable, and which holds database snapshots, transaction logs, and columnstore segments) must reside on a POSIX-compliant filesystem.

SingleStore officially supports ext4 and XFS file systems. However, any POSIX-compliant file system can be used with SingleStore. We have conducted extensive testing with ext4 and XFS file systems and can provide support for all non-infrastructure related issues when running on a POSIX-compatible host.

Most Linux filesystems, including ext3, ext4, XFS, and ZFS, are POSIX-compliant when mounted on a POSIX-compliant host.

Storage Requirements

Storage guidelines include:

  • The storage location must be on a single contiguous volume, and should never be more than 60% utilized

  • 30 MB/s per physical core or vCPU

  • 300 input/output operations per second (IOPS) per physical core or vCPU. For example, a 16-core host needs 16 x 30 = 480 MB/s and 16 x 300 = 4800 IOPS, so the associated disk should be capable of roughly 500 MB/s and 5000 IOPS.

  • For rowstore, the amount of required disk space should be about 5x the amount of RAM. Rowstore storage capacity is limited by the amount of RAM on the host. Increasing RAM increases the amount of available data storage.

  • For columnstore, the amount of required disk space should be about the size of the raw data you are planning to load into the database. For concurrent loads on columnstore tables, SSD storage will improve performance significantly compared to HDD storage.

  • When using high availability (HA), the amount of disk space required will be 2x the size of the raw data.
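The per-core throughput and IOPS guidelines above amount to simple multiplication. A minimal sketch, where `disk_targets` is a hypothetical helper that applies the 30 MB/s and 300 IOPS per-core figures from the list:

```shell
#!/bin/sh
# disk_targets CORES: prints "<MB/s> <IOPS>" using the guideline of
# 30 MB/s and 300 IOPS per physical core or vCPU. Hypothetical helper.
disk_targets() {
  echo "$(( $1 * 30 )) $(( $1 * 300 ))"
}

disk_targets 16   # prints: 480 4800
```

For a 16-core host this gives 480 MB/s and 4800 IOPS, which the example in the list rounds up to 500 MB/s and 5000 IOPS.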

Toolbox Disk Checkers

The following Toolbox checkers are run prior to deploying SingleStore.

| Host Setting | Checker | Description | Host Configuration |
| --- | --- | --- | --- |
| Presence of an SSD | validateSsd | Verifies if a host is using an SSD for storage | Recommended: Each host is using an SSD for storage. SSD-based drives: Host recommended; Non-SSD-based drives: Use host with caution |
| Disk Bandwidth | diskBandwidth | Checks that disk bandwidth allows safe operation with a SingleStore cluster | Read, write, and sync-read speed >= 128.0 operations/second (ops/s) |
| Disk Latency | diskLatency with diskLatencyRead, diskLatencyWrite | Checks the read and write latency of the disk to determine overall disk performance | Read/write latency: < 10 ms: Host recommended; 10 - 25 ms: Use host with caution; > 25 ms: Host not recommended |
| Disk Storage in Use | diskUsage | Checks the amount of free disk space and determines if it’s approaching the capacity limit | Free disk space > 30%: Host recommended; disk is 70 - 80% full: Use host with caution; disk is more than 80% full: Host not recommended |

Configure Network Settings

Note: Perform the following steps on each host in the cluster.

  1. As root, display the current sysctl settings and review the values of rmem_max and wmem_max.

    sudo sysctl -a | grep mem_max
  2. Confirm that the receive buffer size (rmem_max) is at least 8 MB for all connection types. If not, add the following line to the /etc/sysctl.conf file.

    net.core.rmem_max = 8388608
  3. Confirm that the send buffer size (wmem_max) is at least 8 MB for all connection types. If not, add the following line to the /etc/sysctl.conf file.

    net.core.wmem_max = 8388608
  4. Confirm that the maximum number of connections that can be queued for a socket (net.core.somaxconn) is at least 1024.

    SingleStore will attempt to update this value to 1024 on a node's host when the node starts. After the cluster is up and running, run the following on each host to confirm that this value has been set.

    sudo sysctl -a | grep net.core.somaxconn

    If this value could not be set by SingleStore, add the following line to the host's /etc/sysctl.conf file. Note that values lower than 1024 could allow connection requests to overwhelm the host.

    net.core.somaxconn = 1024
  5. Load the updated values from /etc/sysctl.conf. Because the values are stored in this file, they also persist across reboots.

    sudo sysctl -p /etc/sysctl.conf
  6. At the next system boot, confirm that the above values have persisted.
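The confirmation in the last step can be scripted by reading the values back from /proc/sys, which mirrors the sysctl key names with dots replaced by slashes. This is a minimal sketch; `sysctl_at_least` is a hypothetical helper, and its optional third argument (an alternative root directory) exists only so the logic can be tried on sample files.

```shell
#!/bin/sh
# sysctl_at_least KEY MIN [PROC_SYS]: succeeds when the value stored for KEY
# under PROC_SYS (default /proc/sys) is readable and >= MIN. Hypothetical helper.
sysctl_at_least() {
  path="${3:-/proc/sys}/$(echo "$1" | tr . /)"
  [ -r "$path" ] && [ "$(cat "$path")" -ge "$2" ]
}

# Minimums taken from the steps above: 8 MB buffers and a backlog of 1024.
for setting in net.core.rmem_max:8388608 net.core.wmem_max:8388608 net.core.somaxconn:1024; do
  key="${setting%%:*}"
  min="${setting##*:}"
  if sysctl_at_least "$key" "$min"; then
    echo "$key: OK"
  else
    echo "$key: below $min or unreadable"
  fi
done
```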

Default Network Ports

Depending on the host and its function in deployment, some or all of the following port settings should be enabled on hosts in your cluster. These routing and firewall settings must be configured to:

  • Allow database clients (such as your application) to connect to the SingleStore aggregators

  • Allow all nodes in the cluster to talk to each other over the SingleStore protocol (3306)

  • Allow you to connect to management and monitoring tools

| Protocol | Port | Direction | Description |
| --- | --- | --- | --- |
| TCP | 3306 | Inbound and Outbound | Default port used by SingleStore. Required on all nodes for intra-cluster communication. Also required on aggregators for client connections. |
| TCP | 22 | Inbound and Outbound | For host access. Required between nodes in SingleStore tool deployment scenarios. Also useful for remote administration and troubleshooting on the main deployment host. |
| TCP | 443 | Outbound | To retrieve public repo key(s) for package verification. Required for nodes downloading SingleStore APT or YUM packages. |
| TCP | 8080 | Inbound and Outbound | Default port for Studio. (Only required for the host running Studio.) |

The service port values are configurable if the default values cannot be used in your deployment environment. For more information on how to change them, see the SingleStore configuration file, the sdb-toolbox-config register-host command, and Studio Installation Guide.

We also highly recommend configuring your firewall to prevent other hosts on the Internet from connecting to SingleStore.

Toolbox Network Checkers

The following Toolbox checkers are run prior to deploying SingleStore.

| Host Setting | Checker | Description | Host Configuration |
| --- | --- | --- | --- |
| Network Settings | networkBuffersMax | Checks that the network kernel settings wmem_max and rmem_max are not set too low | Recommended: Set each of these values to a minimum of 8 MB (>= 8 MB) |

Self-Managed Columnstore Performance Recommendations

SingleStore supports the ext4 and xfs filesystems. As many improvements have been made recently in Linux for NVMe devices, SingleStore recommends using a 3.0+ series kernel. For example, CentOS 7.2 uses the 3.10 kernel.

If you use NVMe drives, set the following parameters in Linux (make it permanent in /etc/rc.local):

# Apply the settings to every NVMe device (nvme0n1, nvme1n1, ...)
for queue in /sys/block/nvme*n1/queue; do
  [ -d "$queue" ] || continue   # skip when no NVMe device is present
  echo 0 > "$queue/add_random"
  echo 1 > "$queue/rq_affinity"
  echo none > "$queue/scheduler"
  echo 1023 > "$queue/nr_requests"
done

Last modified: February 28, 2024
