Cluster-in-a-Box CLI Online Deployment - Red Hat Distribution
Introduction
Installing SingleStoreDB on bare metal, on virtual machines, or in the cloud can be done through the use of popular configuration management tools or through SingleStore’s management tools.
In this guide, you will deploy a SingleStoreDB cluster onto physical or virtual machines and connect to the cluster using our monitoring, profiling, and debugging tool, SingleStoreDB Studio.
A four-node cluster is the minimal recommended cluster size for showcasing SingleStoreDB as a distributed database with high availability; however, you can use the procedures in this tutorial to scale out to additional nodes for increased performance over large data sets or to handle higher concurrency loads. To learn more about SingleStore’s design principles and topology concepts, see Distributed Architecture.
Note
There are no licensing costs for using up to four license units for the leaf nodes in your cluster. If you need a larger cluster with more/larger leaf nodes, please create an Enterprise License trial key.
Prerequisites
For this tutorial you will need:
One host (for a single-host cluster-in-a-box development deployment) or four physical or virtual machines (“hosts”) with the following:
Each SingleStoreDB node requires at least four (4) x86_64 CPU cores and eight (8) GB of RAM per host.
Eight (8) vCPUs and 32 GB of RAM are recommended for leaf nodes to align with license unit calculations.
Running a 64-bit version of RHEL/AlmaLinux 7 or later, or Debian 8 or later, with kernel 3.10 or later. For SingleStoreDB 8.1 or later, glibc 2.17 or later is also required.
Port 3306 open on all hosts for intra-cluster communication. Depending on the deployment method, this default can be changed either from the command line or via the cluster file.
Port 8080 open on the main deployment host for the cluster.
A non-root user with sudo privileges available on all hosts in the cluster that will be used to run SingleStoreDB services and own the corresponding runtime state.
SSH access to all hosts (installing and using ssh-agent is recommended for SSH keys with passwords). If using SSH keys, make sure the identity key used on the main deployment host can be used to log in to the other hosts. Refer to How to Setup Passwordless SSH Login for more information on using SSH without a password; a brief setup sketch follows this list.
A connection to the Internet to download required packages
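As a minimal sketch of key-based SSH setup, run the following on the main deployment host; the user name, host names, and key type here are placeholders, so adjust them to your environment.

# Generate a key pair; accept the default path and set a passphrase if desired
ssh-keygen -t ed25519
# Copy the public key to every other host in the cluster
ssh-copy-id <user>@<host-2>
ssh-copy-id <user>@<host-3>
ssh-copy-id <user>@<host-4>
# If the key has a passphrase, load it into ssh-agent so it can be used non-interactively
eval "$(ssh-agent -s)"
ssh-add ~/.ssh/id_ed25519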
If running this in a production environment, it is highly recommended that you follow our host configuration recommendations for optimal cluster performance.
Duplicate Hosts
As of SingleStoreDB Toolbox 1.4.4, a check for duplicate hosts is performed before SingleStoreDB is deployed, and will display a message similar to the following if more than one host has the same SSH host key:
✘ Host check failed.
host 172.26.212.166 has the same ssh host keys as 172.16.212.165, toolbox doesn't support registering the same host twice
Confirm that all specified hosts are indeed different and aren’t using identical SSH host keys. Identical host keys can be present if you have instantiated your host instances from images (AMIs, snapshots, etc.) that contain existing host keys. When a host is cloned, the host key (typically stored in /etc/ssh/ssh_host_<cipher>_key) will also be cloned.
As each cloned host will have the same host key, an SSH client cannot verify that it is connecting to the intended host. The script that deploys SingleStoreDB will interpret a duplicate host key as an attempt to deploy to the same host twice, and the deployment will fail.
The following steps demonstrate a potential remedy for the “duplicate hosts” message. Please note these steps may slightly differ depending on your Linux distribution and configuration.
sudo su -
ls -al /etc/ssh/
rm /etc/ssh/<your-ssh-host-keys>
ssh-keygen -f /etc/ssh/<ssh-host-rsa-key-filename> -N '' -t rsa
ssh-keygen -f /etc/ssh/<ssh-host-ecdsa-key-filename> -N '' -t ecdsa
ssh-keygen -f /etc/ssh/<ssh-host-ed25519-key-filename> -N '' -t ed25519
For more information about SSH host keys, including the equivalent steps for Ubuntu-based systems, refer to Avoid Duplicating SSH Host Keys.
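To confirm whether two hosts really do share a host key, you can compare fingerprints on each host; the ed25519 key used here is only one of the key types that may be present.

# Print this host's ed25519 host key fingerprint; identical output on two hosts indicates cloned keys
ssh-keygen -lf /etc/ssh/ssh_host_ed25519_key.pub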
As of SingleStoreDB Toolbox 1.5.3, sdb-deploy setup-cluster supports an --allow-duplicate-host-fingerprints option that can be used to ignore duplicate SSH host keys.
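For example, assuming a cluster file like the one shown later in this guide, the option can be appended to the deployment command:

# Proceed with deployment even though some hosts report identical SSH host key fingerprints
sdb-deploy setup-cluster --cluster-file </path/to/cluster-file> --allow-duplicate-host-fingerprints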
Network Configuration
Depending on the host and its function in deployment, some or all of the following port settings should be enabled on hosts in your cluster.
These routing and firewall settings must be configured to:
Allow database clients (e.g. your application) to connect to the SingleStoreDB aggregators
Allow all nodes in the cluster to talk to each other over the SingleStoreDB protocol (3306)
Allow you to connect to management and monitoring tools
Protocol | Default Port | Direction | Description |
---|---|---|---|
TCP | 22 | Inbound and Outbound | For host access. Required between nodes in SingleStoreDB tool deployment scenarios. Also useful for remote administration and troubleshooting on the main deployment host. |
TCP | 443 | Outbound | To get public repo key for package verification. Required for nodes downloading SingleStore APT or YUM packages. |
TCP | 3306 | Inbound and Outbound | Default port used by SingleStoreDB. Required on all nodes for intra-cluster communication. Also required on aggregators for client connections. |
TCP | 8080 | Inbound and Outbound | Default port for Studio. (Only required for the host running Studio.) |
The service port values are configurable if the default values cannot be used in your deployment environment. For more information on how to change them, see:
The cluster file template provided in this guide
The sdb-toolbox-config register-host command
We also highly recommend configuring your firewall to prevent other hosts on the Internet from connecting to SingleStoreDB.
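As an illustrative sketch only (assuming firewalld on a RHEL-compatible host; the zone name and the 10.0.0.0/24 subnet are placeholders), you might expose the required ports to your internal network and nothing else:

# Create a zone limited to the internal subnet and open only the ports SingleStoreDB needs
sudo firewall-cmd --permanent --new-zone=singlestore
sudo firewall-cmd --permanent --zone=singlestore --add-source=10.0.0.0/24
sudo firewall-cmd --permanent --zone=singlestore --add-port=22/tcp
sudo firewall-cmd --permanent --zone=singlestore --add-port=3306/tcp
sudo firewall-cmd --permanent --zone=singlestore --add-port=8080/tcp
sudo firewall-cmd --reload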
Install SingleStore Tools
Caution
Studio should be installed on a host that is open to local (internal) network traffic only, and not open to the Internet.
The first step in deploying your cluster is to download and install the SingleStore Tools on one of the hosts in your cluster. This host will be designated as the main deployment host for deploying SingleStoreDB across your other hosts and setting up your cluster.
These tools perform all major cluster operations including downloading the latest version of SingleStoreDB onto your hosts, assigning and configuring nodes in your cluster, and other management operations. For the purpose of this guide, the main deployment host is the same as the designated Master Aggregator of the SingleStoreDB cluster.
Note: If SingleStoreDB is installed as a sudo user via packages, systemd will automatically start the associated SingleStoreDB processes when a host is rebooted.
Online Installation - Red Hat Distribution
Run the following commands to install SingleStore Tools.
sudo yum-config-manager --add-repo https://release.memsql.com/production/rpm/x86_64/repodata/memsql.repo && \
sudo yum install -y singlestore-client singlestoredb-toolbox singlestoredb-studio
Troubleshooting
If SingleStore Tools cannot be installed using the above commands, verify the following and re-run the above commands to install SingleStore Tools.
Verify that the SingleStore repo information is listed under repolist.

sudo yum repolist

repo id    repo name    status
memsql     MemSQL       125

Verify that the which package is installed. This is used during the install process to identify the correct package type for your installation.

rpm -q which

If which is not installed, you must install it before proceeding.

sudo yum install -y which

If you cannot install which, you will need to specify the package as rpm during the deployment phase.
If you receive an error that the GPG check failed, which resembles the following:
Importing GPG key <gpg-key>:
 Userid     : "MemSQL Release Engineering <security@memsql.com>"
 Fingerprint: <fingerprint>
 From       : https://release.memsql.com/release-aug2018.gpg
Key import failed (code 2). Failing package is: singlestore-client-1.0.7-1.x86_64
 GPG Keys are configured as: https://release.memsql.com/release-aug2018.gpg
Public key for singlestoredb-studio-4.0.7-7ce5bef293.x86_64.rpm is not installed. Failing package is: singlestoredb-studio-4.0.7-1.x86_64
 GPG Keys are configured as: https://release.memsql.com/release-aug2018.gpg
Public key for singlestoredb-toolbox-1.13.13-9fa4ef2d34.x86_64.rpm is not installed. Failing package is: singlestoredb-toolbox-1.13.13-1.x86_64
 GPG Keys are configured as: https://release.memsql.com/release-aug2018.gpg
The downloaded packages were saved in cache until the next successful transaction.
You can remove cached packages by executing 'yum clean packages'.
Error: GPG check FAILED
View the current crypto policies.

update-crypto-policies --show

If SHA1 is not present, update the crypto policies to work with SingleStore Tools.

sudo update-crypto-policies --set DEFAULT:SHA1
Refer to Using system-wide cryptographic policies for more information.
Deploy SingleStoreDB
Prerequisites
Warning
Before deploying a SingleStoreDB cluster in a production environment, please review and follow the host configuration recommendations. Failing to follow these recommendations will result in sub-optimal cluster performance.
In addition, SingleStore recommends that each Master Aggregator and child aggregator reside on its own host when deploying SingleStoreDB in a production environment.
Notes on Users and Groups
The user that deploys SingleStoreDB via SingleStoreDB Toolbox must be able to SSH to each host in the cluster. When singlestoredb-server is installed via an RPM or Debian package during deployment, a memsql user and group are also created on each host in the cluster.

The memsql user does not have a shell, and attempting to log in or SSH as this user will fail. The user that deploys SingleStoreDB is added to the memsql group, whose members can perform most Toolbox operations without escalating to sudo. Any other user who needs to run SingleStoreDB Toolbox commands must be added to the memsql group on each host in the cluster and must also be able to SSH to each host.

Manually creating a memsql user and group is only recommended in a sudo-less environment when performing a tarball-based deployment of SingleStoreDB. In order to run SingleStoreDB Toolbox commands against a cluster, this manually-created memsql user must be configured so that it can SSH to each host in the cluster.
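For example, to let an additional administrator run Toolbox commands (the user name alice is hypothetical), add that user to the memsql group on each host:

# Add an existing user to the memsql group created by the package installation
sudo usermod -aG memsql alice
# The user must log out and back in (or run newgrp memsql) for the group change to take effect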
Minimal Deployment
SingleStoreDB has been designed to be deployed with at least two nodes:
A Master Aggregator node that runs SQL queries and aggregates the results, and
A single leaf node, which is responsible for storing and processing data
These two nodes can be deployed on a single host (via the cluster-in-a-box option), or on two hosts, with one SingleStoreDB node on each host.
While additional aggregators and nodes can be added and removed as required, a minimal deployment of SingleStoreDB always consists of at least these two nodes.
Cluster-in-a-Box CLI Online Deployment
Note
If you are deploying SingleStoreDB on a host or virtual machine (VM), and want to connect to the cluster from a separate host or the VM's host, change the bind address from 127.0.0.1 to 0.0.0.0.

To do so, include either:

The --bind-address option on the command line. Refer to sdb-deploy cluster-in-a-box for more information.
The bind_address field for the Master Aggregator in the cluster configuration file. Refer to sdb-deploy setup-cluster for more information.
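For the command-line path, the option might look like the following; the license and password are placeholders, as in the command shown under Option 1.

# Bind the cluster to all interfaces so clients outside the host or VM can connect
sdb-deploy cluster-in-a-box --license <license> --version 8.1 --password <secure-password> --bind-address 0.0.0.0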
Option 1: Command Line
You can deploy your SingleStoreDB cluster on a single host using the sdb-deploy cluster-in-a-box
command. This command will create two nodes: A Master Aggregator node that runs SQL queries and aggregates the results, and a single leaf node, which is responsible for storing and processing data. These two nodes form the most basic SingleStoreDB cluster.
The default port for the Master Aggregator can be changed by adding the --master-port option and specifying the desired port.
The default port for the leaf node can be changed by adding the --leaf-port option and specifying the desired port.
sdb-deploy cluster-in-a-box --license <license> --version 8.1 --password <secure-password>
Note: You can retrieve the license from the SingleStore Customer Portal.
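If the default ports are unavailable on your host, a variant of the same command with custom ports (the values 3308 and 3309 are only illustrative) might look like this:

# Run the Master Aggregator on 3308 and the leaf node on 3309 instead of the defaults
sdb-deploy cluster-in-a-box --license <license> --version 8.1 --password <secure-password> \
  --master-port 3308 --leaf-port 3309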
Warning
If your host does not have the which command available, you will need to specify the correct package format through the --force-package-format {rpm|deb} flag when running the cluster-in-a-box command.
Option 2: Cluster File
This example is equivalent to sdb-deploy cluster-in-a-box, where a single-host cluster is created with two nodes: a Master Aggregator and a single leaf node.
Set package_type to either rpm for Red Hat distributions or deb for Debian distributions to download and deploy the appropriate singlestoredb-server.
license: <license-from-portal.singlestore.com>
memsql_server_version: 7.3.10
package_type: <rpm|deb>
hosts:
  - hostname: 127.0.0.1
    localhost: true
    nodes:
      - register: false
        role: Master
        config:
          password: <secure-password>
          port: 3306
      - register: false
        role: Leaf
        config:
          password: <secure-password>
          port: 3307
Using this cluster configuration file, sdb-deploy setup-cluster:

Registers a single, local host to the cluster.
Installs singlestoredb-server v7.3.10 on this local host.
Creates a Master Aggregator node on port 3306 and sets the SingleStoreDB password to the one specified in the cluster file.
Creates a leaf node on port 3307 and sets the SingleStoreDB password to the one specified in the cluster file.

Run the following with the path to the cluster file as input.
sdb-deploy setup-cluster --cluster-file </path/to/cluster-file>
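Once the deployment completes, you can run a quick sanity check from the main deployment host (assuming SingleStoreDB Toolbox is on your PATH) to confirm both nodes are up:

# List the nodes managed by Toolbox; the Master Aggregator and the leaf should both report as Running
sdb-admin list-nodes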
Additional Deployment Options
Note
If this deployment method is not ideal for your target environment, you can choose one that fits your requirements from the Deployment Options.
Interact with your Cluster
Start Studio
On your main deployment host, run the following command to use Studio to monitor and interact with your cluster.
Enable the Studio service to start Studio at system boot (recommended).
sudo systemctl enable singlestoredb-studio.service

Created symlink /etc/systemd/system/multi-user.target.wants/memsql-studio.service → /lib/systemd/system/memsql-studio.service.
SSH into your main deployment host and run the following:
sudo systemctl start singlestoredb-studio
If your Linux distribution does not use systemd, you can run Studio directly instead.
sudo singlestoredb-studio &
The Studio Web server will now be running on port 8080, which can be accessed via Web browser at http://<main-deployment-host>:8080.
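Optionally, before opening a browser, you can confirm from the main deployment host that Studio is listening (assuming curl is installed):

# Expect an HTTP response from the Studio web server on port 8080
curl -I http://localhost:8080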
Add a New Cluster to Studio
With Studio running, go to http://<main_deployment_host>:8080 and click Add New Cluster to set up a cluster.

Important

Studio is only supported on Chrome and Firefox browsers at this time.

To run Studio on a different port, add port = <port_name> to /etc/singlestore/singlestoredb-studio.hcl and restart Studio.

Paste the main deployment host IP address or hostname into Hostname.
Set Port to 3306.
Specify root as the Username.
In the Password field, provide the Superuser password that was set during cluster deployment.
Click Create Cluster Profile and set Type as Development.
Fill in Cluster Name and Description to your preference.
After you have successfully logged in, you will see the dashboard for your cluster. To run a query against your cluster, navigate to the SQL Editor through the navigation in the left pane.
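If you prefer the command line to Studio's SQL Editor, the singlestore-client installed earlier can also run a quick test query; the host, user, and statement below are only an illustration.

# Connect to the Master Aggregator and run a trivial query to confirm the cluster responds
singlestore -h 127.0.0.1 -P 3306 -u root -p -e "SELECT 1;"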
Next Steps After Deployment
Now that you have installed SingleStoreDB and connected to Studio, check out the following resources to continue your learning:
How to Run Queries: Provides example schema and queries to begin exploring the potential of SingleStoreDB.
How to Load Data into SingleStoreDB: Describes the different options you have when ingesting data into a SingleStoreDB cluster.
Optimizing Table Data Structures: Learn the difference between rowstore and columnstore tables, when you should pick one over the other, how to pick a shard key, and so on.
Overview: Contains information about SingleStore Tools, including SingleStoreDB Toolbox and related commands.
SingleStoreDB Studio Overview: More information on how to use Studio.