# Online Deployment Using YAML File - Debian Distribution

## Introduction

Installing SingleStore on either bare metal or virtual machines can be done through the use of popular configuration management tools or through SingleStore’s management tools.

In this guide, you will deploy a SingleStore cluster onto physical or virtual machines and connect to the cluster using a SQL client.

A four-node cluster is the minimal recommended cluster size for showcasing SingleStore as a distributed database with high availability; however, you can use the procedures in this tutorial to scale out to additional nodes for increased performance over large data sets or to handle higher concurrency loads. To learn more about SingleStore’s design principles and topology concepts, see [Distributed Architecture](https://docs.singlestore.com/db/v9.1/introduction/distributed-architecture.md).

> **📝 Note**: For supported deployments on AWS, Azure, or GCP, please use [SingleStore Helios](https://www.singlestore.com/pricing/).

## Prerequisites

For this tutorial you will need:

* One (for single-host cluster-in-a-box for development) or four physical or virtual machines (“hosts”) with the following:

  * The number of vCPUs, the amount of RAM, and the size of the persistent cache will vary based on the license version used to deploy SingleStore. Refer to [What are license units and how do they apply to my cluster?](https://docs.singlestore.com/db/v9.1/introduction/faqs/general.md) for more information.
  * Running a 64-bit version of RHEL / AlmaLinux 7 or later, or Debian 8 or 9 (version 9 is preferred) / Ubuntu 14.04 and later, with kernel 3.10 or later

    For SingleStore 8.1 or later, `glibc` 2.17 or later is also required.
  * Port 3306 open on all hosts for intra-cluster communication. Depending on the deployment method, this default can be changed either from the command line or via the cluster file.
  * Port 8080 open on the main deployment host for the cluster
  * A non-root user with sudo privileges available on all hosts in the cluster that can be used to run SingleStore services and own the corresponding runtime state
* SSH access to all hosts

  * Installing and using `ssh-agent` is recommended for SSH keys with passwords. Refer to [ssh-agent and ssh-add](https://cylab.be/blog/230/ssh-agent-and-ssh-add) and [Use ssh-agent to Manage Private Keys](https://www.linode.com/docs/guides/using-ssh-agent/) for more information.
  * If your environment does not support the use of `ssh-agent`, make sure the identity key used on the main deployment host can be used to log in to each host in the cluster. Refer to [How to Setup Passwordless SSH Login](https://linuxize.com/post/how-to-setup-passwordless-ssh-login/) for more information.
* A connection to the Internet to download required packages

If running this in a production environment, it is highly recommended that you follow our [host configuration recommendations](https://docs.singlestore.com/db/v9.1/reference/configuration-reference/cluster-configuration/system-requirements-and-recommendations.md) for optimal cluster performance.
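
The kernel and `glibc` minimums listed above can be checked on each host before installing anything. The following is a minimal sketch, assuming `bash` and GNU coreutils (`sort -V`); `version_ge` is a hypothetical helper name, not part of SingleStore Toolbox.

```shell
#!/usr/bin/env bash
# Sketch: compare this host against the kernel (3.10) and glibc (2.17) minimums.

# version_ge A B: succeeds when version A >= version B (relies on GNU `sort -V`).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

kernel="$(uname -r | cut -d- -f1)"        # e.g. 5.10.0-28-amd64 -> 5.10.0
bits="$(getconf LONG_BIT 2>/dev/null)"    # 64 on a 64-bit userland
# `getconf GNU_LIBC_VERSION` prints e.g. "glibc 2.31"; empty on non-glibc systems.
glibc="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')"

[ "$bits" = "64" ] || echo "WARN: could not confirm a 64-bit system (LONG_BIT=$bits)"

if version_ge "$kernel" 3.10; then
  echo "OK: kernel $kernel"
else
  echo "FAIL: kernel $kernel is older than 3.10"
fi

if [ -z "$glibc" ]; then
  echo "WARN: could not determine the glibc version"
elif version_ge "$glibc" 2.17; then
  echo "OK: glibc $glibc"
else
  echo "FAIL: glibc $glibc is older than 2.17"
fi
```

The script only reports what it finds and does not cover the hardware sizing, which depends on your license.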

## Duplicate Hosts

As of **SingleStore Toolbox 1.4.4**, a check for duplicate hosts is performed before SingleStore is deployed. If more than one host has the same SSH host key, a message similar to the following is displayed:

```
✘ Host check failed.host 172.26.212.166 has the same ssh
host keys as 172.16.212.165, toolbox doesn't support
registering the same host twice

```

Confirm that all specified hosts are indeed different and aren’t using identical SSH host keys. Identical host keys can be present if you have instantiated your host instances from images (AMIs, snapshots, etc.) that contain existing host keys. When a host is cloned, the host key (typically stored in `/etc/ssh/ssh_host_<cipher>_key`) will also be cloned.

As each cloned host will have the same host key, an SSH client cannot verify that it is connecting to the intended host. The script that deploys SingleStore will interpret a duplicate host key as an attempt to deploy to the same host twice, and the deployment will fail.

The following steps demonstrate a potential remedy for the “duplicate hosts” message. Please note these steps may slightly differ depending on your Linux distribution and configuration.

```shell
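# Become root; the remaining commands modify files under /etc/ssh.
sudo -i
# List the existing host keys.
ls -al /etc/ssh/
# Remove the cloned host keys.
rm /etc/ssh/<your-ssh-host-keys>
# Regenerate the host keys with an empty passphrase, one per key type.
# Recent OpenSSH releases no longer accept rsa1 (and may reject dsa); skip any type that ssh-keygen rejects.
ssh-keygen -f /etc/ssh/<ssh-host-key-filename> -N '' -t rsa1
ssh-keygen -f /etc/ssh/<ssh-host-rsa-key-filename> -N '' -t rsa
ssh-keygen -f /etc/ssh/<ssh-host-dsa-key-filename> -N '' -t dsa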

```

For more information about SSH host keys, including the equivalent steps for Ubuntu-based systems, refer to [Avoid Duplicating SSH Host Keys](https://blog.digitalocean.com/avoid-duplicate-ssh-host-keys/).

As of **SingleStore Toolbox 1.5.3**, `sdb-deploy setup-cluster` supports an `--allow-duplicate-host-fingerprints` option that can be used to ignore duplicate SSH host keys.

## Network Configuration

Depending on the host and its function in deployment, some or all of the following port settings should be enabled on hosts in your cluster.

These routing and firewall settings must be configured to:

* Allow database clients (e.g. your application) to connect to the SingleStore aggregators
* Allow all nodes in the cluster to talk to each other over the SingleStore protocol (3306)
* Allow you to connect to management and monitoring tools

| Protocol | Default Port | Direction            | Description                                                                                                                                                            |
| -------- | ------------ | -------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| TCP      | 22           | Inbound and Outbound | For host access. Required between nodes in SingleStore tool deployment scenarios. Also useful for remote administration and troubleshooting on the main deployment host. |
| TCP      | 443          | Outbound             | To get public repo key for package verification. Required for nodes downloading SingleStore APT or YUM packages.                                                       |
| TCP      | 3306         | Inbound and Outbound | Default port used by SingleStore. Required on all nodes for intra-cluster communication. Also required on aggregators for client connections.                          |

The service port values are configurable if the default values cannot be used in your deployment environment. For more information on how to change them, see:

* The cluster file template provided in this guide
* The [SingleStore configuration file](https://docs.singlestore.com/db/v9.1/reference/configuration-reference/engine-variables/memsql-cnf.md)
* The [sdb-toolbox-config register-host](https://docs.singlestore.com/db/v9.1/reference/singlestore-tools-reference/sdb-toolbox-config-commands/register-host.md) command

We also highly recommend configuring your firewall to prevent other hosts on the Internet from connecting to SingleStore.
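
A quick reachability test can help confirm the firewall rules before and after deployment. This is a minimal sketch, assuming `bash` (for its `/dev/tcp` pseudo-device) and the coreutils `timeout` command; `check_port` is a hypothetical helper and the host list is a placeholder.

```shell
# check_port HOST PORT [TIMEOUT_SECONDS]: succeeds if a TCP connection can be opened.
check_port() {
  timeout "${3:-3}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Placeholder host list; replace with the hosts in your cluster.
for host in 127.0.0.1; do
  for port in 22 3306; do
    if check_port "$host" "$port"; then
      echo "$host:$port reachable"
    else
      echo "$host:$port NOT reachable"
    fi
  done
done
```

A port is reported as reachable only when something is listening on it, so port 3306 is expected to fail until SingleStore nodes are running. Before deployment, the check is mostly useful for port 22.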

## Install SingleStore Tools

The first step in deploying your cluster is to download and install the SingleStore Tools on one of the hosts in your cluster. This host will be designated as the main deployment host for deploying SingleStore across your other hosts and setting up your cluster.

These tools perform all major cluster operations including downloading the latest version of SingleStore onto your hosts, assigning and configuring nodes in your cluster, and other management operations. For the purpose of this guide, the main deployment host is the same as the designated Master Aggregator of the SingleStore cluster.

**Note**: If SingleStore is installed as a `sudo` user via packages, `systemd` will automatically start the associated SingleStore processes when a host is rebooted.

## Online Installation - Debian Distribution

1. Verify you have `apt-transport-https` installed.
   ```shell
   apt-cache policy apt-transport-https
   ```
   If `apt-transport-https` is not installed, you must install it before proceeding.
   ```shell
   sudo apt -y install apt-transport-https
   ```

2. SingleStore packages are signed to ensure integrity, so the GPG key needs to be added to this machine.
   ```shell
   wget -O - 'https://release.memsql.com/release-aug2018.gpg' 2>/dev/null | sudo apt-key add - && apt-key list
   ```
   **Without using `apt-key`**:
   ```shell
   wget -q -O - 'https://release.memsql.com/release-aug2018.gpg' | sudo tee /etc/apt/trusted.gpg.d/memsql.asc 1>/dev/null
   ```

3. Add the SingleStore repository to retrieve its packages.
   ```shell
   echo "deb [arch=amd64] https://release.memsql.com/production/debian memsql main" | sudo tee /etc/apt/sources.list.d/memsql.list
   ```

4. Run the following to install SingleStore Tools.
   ```shell
   sudo apt update && sudo apt -y install singlestoredb-toolbox singlestore-client
   ```
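
After the packages are installed, you may want to confirm that the commands used later in this guide are available. The sketch below assumes that the two packages place `sdb-deploy`, `sdb-toolbox-config`, and `singlestore` on the `PATH`; `missing_cmds` is a hypothetical helper.

```shell
# missing_cmds NAME...: prints each command name that is not found on PATH.
missing_cmds() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

missing="$(missing_cmds sdb-deploy sdb-toolbox-config singlestore)"
if [ -z "$missing" ]; then
  echo "SingleStore tools found on PATH"
else
  echo "Not found on PATH:" $missing
fi
```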

## Deploy SingleStore

## Prerequisites

> **⚠️ Warning**: Before deploying a SingleStore cluster in a production environment, please review and follow the [host configuration recommendations](https://docs.singlestore.com/db/v9.1/reference/configuration-reference/cluster-configuration/system-requirements-and-recommendations.md). Failing to follow these recommendations will result in sub-optimal cluster performance. In addition, SingleStore recommends that each Master Aggregator and child aggregator reside on its own host when deploying SingleStore in a production environment.

## Notes on Users and Groups

The user that deploys SingleStore via SingleStore Toolbox must be able to SSH to each host in the cluster. When `singlestoredb-server` is installed via an RPM or Debian package when deploying SingleStore, a `memsql` user and group are also created on each host in the cluster.

This `memsql` user does not have a shell, and attempting to log in or SSH as this user will fail. The user that deploys SingleStore is added to the `memsql` group, whose members can run most Toolbox commands without escalating to `sudo`. Users who need to run SingleStore Toolbox commands must be added to the `memsql` group on each host in the cluster and must also be able to SSH to each host.

Manually creating a `memsql` user and group is only recommended in a `sudo`-less environment when performing a tarball-based deployment of SingleStore. In order to run SingleStore Toolbox commands against a cluster, this manually-created `memsql` user must be configured so that it can SSH to each host in the cluster.
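
Once SingleStore has been deployed, group membership can be checked on each host with standard tools. This is a sketch only: `in_group` is a hypothetical helper, and the `usermod` command in the comment is one common way to add a user to a group on Debian-based systems.

```shell
# in_group USER GROUP: succeeds if USER is a member of GROUP.
in_group() {
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx -- "$2"
}

user="$(id -un)"
if in_group "$user" memsql; then
  echo "$user is in the memsql group"
else
  echo "$user is not in the memsql group"
  # One common way to add the user (repeat on every host, then log in again):
  #   sudo usermod -aG memsql "$user"
fi
```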

## Minimal Deployment

SingleStore has been designed to be deployed with at least two nodes:

* A Master Aggregator node that runs SQL queries and aggregates the results, and
* A single leaf node, which is responsible for storing and processing data

These two nodes can be deployed on a single host (via the `cluster-in-box` option), or on two hosts, with one SingleStore node on each host.

While additional aggregators and nodes can be added and removed as required, a minimal deployment of SingleStore always consists of at least these two nodes.
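
As an illustration of this two-node minimum, the following sketch writes a cluster file that places a Master Aggregator and one leaf node on the local host. It uses only fields that appear in the cluster file template in this guide and mirrors the first host of Example 1, but it is an untested starting point rather than a verified configuration. The environment variable names, the file location, and the `127.0.0.1` hostname are assumptions.

```shell
# Hypothetical variable names; the defaults are placeholders, not working values.
LICENSE="${SINGLESTORE_LICENSE:-<license-from-portal.singlestore.com>}"
ROOT_PASSWORD="${SINGLESTORE_ROOT_PASSWORD:-<secure-password>}"
CLUSTER_FILE="${CLUSTER_FILE:-${TMPDIR:-/tmp}/singlestore-cluster.yaml}"

# Two nodes on one host: a Master Aggregator on 3306 and a leaf on 3307.
cat > "$CLUSTER_FILE" <<EOF
license: ${LICENSE}
memsql_server_version: 9.1
package_type: deb
root_password: ${ROOT_PASSWORD}
hosts:
- hostname: 127.0.0.1
  localhost: true
  nodes:
  - register: false
    role: Master
    config:
      port: 3306
  - register: false
    role: Leaf
    config:
      port: 3307
EOF
chmod 600 "$CLUSTER_FILE"   # the file contains the root password

echo "Review $CLUSTER_FILE, then run: sdb-deploy setup-cluster --cluster-file $CLUSTER_FILE"
```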

## Online Deployment Using YAML File

As of SingleStore Toolbox 1.3.0, the `sdb-deploy setup-cluster` command accepts a YAML-based cluster configuration file (or simply “cluster file”), the format of which is validated before the command attempts to set up the specified cluster. Using a cluster file is the recommended method for creating new SingleStore clusters.

The command is designed to be consistent: re-running `sdb-deploy setup-cluster` with the same cluster file will always produce the same cluster. This method is also resilient: errors encountered at any stage of the cluster construction process can be corrected and `sdb-deploy setup-cluster` re-run to generate the desired cluster.

## Complete Cluster File Template

```yaml
license:                           <LICENSE | /path/to/LICENSE-file> [Required to bootstrap Master Aggregator]
high_availability:                 <true | false>
memsql_server_version:             <the version of memsql you want to install (6.7+)>
memsql_server_file_path:           <path to the downloaded memsql server file>
memsql_server_preinstalled_path:   <equivalent to using the '--preinstalled-path' option;
                                   the path to the unpacked singlestoredb-server file
                                   where the unpacked folder name must be of the form
                                   'singlestoredb-server-<version>*' or
                                   'memsql-server-<version>*'>
skip_install:                      <true | false> [ADVANCED]
skip_validate_env:                 <true | false> [ADVANCED]
allow_duplicate_host_fingerprints: <true | false> [ADVANCED]
assert_clean_state:                <true | false> [ADVANCED]
package_type:                      <rpm | deb | tar> [Required if multiple package managers are present]
root_password:                     <default password to be used for all nodes>
optimize:                          <true | false>
optimize_config:
  memory_percentage:               <percentage of memory you want memsql to use>
  no_numa:                         <true | false>
sync_variables:                    [ADVANCED]
  <variable's name>:               <variable's value>            
hosts:
- hostname:                        <host-name> [Required]
  localhost:                       <true | false> 
  skip_auto_config:                <true | false>
  memsqlctl_path:                  <path to memsqlctl> [ADVANCED]
  memsqlctl_config_path:           <path to memsqlctl config> [ADVANCED]
  tar_install_dir:                 <path to tar install dir> [ADVANCED]
  tar_install_state:               <path to tar install state> [ADVANCED]
  ssh:                             [Required for remote Hosts]
    host:                          <ssh host name>
    port:                          <ssh port>
    user:                          <ssh user>
    private_key:                   <path to your identity key>
  nodes:
  - register:                      <true | false>
    force_registration:            <true | false> [ADVANCED] 
    role:                          <Unknown | Master | Leaf | Aggregator> (case sensitive) [Required]
    aggregator_role:               <voting_member | follower (by default)> [Applicable only for role Aggregator]
    availability_group:            <availability group>
    no_start:                      <true | false>
    config: 
      auditlogsdir:                <path to auditlogs directory> [ADVANCED]
      baseinstalldir:              <path to base install directory> [ADVANCED]
      configpath:                  <path to configuration path> [ADVANCED] [Required if register is true]
      datadir:                     <path to data directory> [ADVANCED]
      disable_auto_restart:        <true | false>
      password:                    <password>
      plancachedir:                <path to plancache directory> [ADVANCED]
      port:                        <port number> [Required for node creation]
      tracelogsdir:                <path to tracelogs directory> [ADVANCED]
      bind_address:                <bind address> [ADVANCED]
      ssl_fips_mode:               <true | false > [ADVANCED]
    variables:
      <variable's name>:           <variable's value>
```

## Deploy a Cluster

You can deploy your own SingleStore cluster with your desired cluster configuration using the cluster file template above, and/or the example cluster files in the following sections.

After creating the cluster file, you can deploy the corresponding SingleStore cluster via the `sdb-deploy setup-cluster` command.

Run the following with the path to the cluster file as input.

```shell
sdb-deploy setup-cluster --cluster-file </path/to/cluster-file>

```

## Cluster File Notes

* **`high_availability`**: Used to enable [high availability](https://docs.singlestore.com/db/v9.1/user-and-cluster-administration/high-availability-and-disaster-recovery/managing-high-availability.md) on the cluster.

  * If set to `true`, each node may be assigned an availability group via the `availability_group` field.
  * Refer to [Availability Groups](https://docs.singlestore.com/db/v9.1/introduction/distributed-architecture/high-availability.md) for more information.
* **`license`**: Use your license from the [Cloud Portal](https://portal.singlestore.com). This can be the license itself, or the full path to a text file with the license in it.
* **`memsql_server_version`**: You may specify either a major release of SingleStore (such as `7.3`) or a specific release (such as `7.3.10`). When a major release is specified, the latest patch level of that release will be deployed.
* **Setting a Password**: There are two ways to set a password in the cluster file YAML:

  * A global root password: Including the `root_password` field with a password will ensure that each node uses the same root password. Recommended. See Example 1.
  * A node-specific root password: Including a `password` field in each node definition. This is only recommended if your security protocols require each node to have its own root password. See Example 2.
* **`register`**: Set the value of this field to `false` to create a new node. Set the value to `true` if the node is already present and you want to register it to SingleStore Toolbox. The `configpath` field and value are also required when `register` is set to `true`. *Do not set this value to `true` to create a new node*. For more information, refer to the `sdb-deploy setup-cluster` [reference page](https://docs.singlestore.com/db/v9.1/reference/singlestore-tools-reference/sdb-deploy-commands/setup-cluster.md).
* **Indicating a Host**: You may use either an IP address or a hostname when indicating a host in the cluster file.
* **Aggregator Hosts**: When deploying SingleStore, SingleStore recommends that you deploy each Aggregator to its own individual host. If the Master Aggregator goes down, the Child Aggregators can keep running queries, and coordinating and executing writes. In this scenario, the only operations that can’t be done are DDL commands and reference table management, which must be done on the Master Aggregator.
* **Optimize the Cluster**: SingleStore recommends that you include the `optimize` field in the cluster file and set it to `true`. Doing so checks your current cluster configuration against a set of best practices and either makes changes to maximize performance or provides recommendations for you. For hosts with NUMA support, this command will bind the leaf nodes to specific NUMA nodes.
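
The `register` rule above (a `configpath` is required whenever `register` is `true`) can be checked before running the deployment. The following is a line-based sketch, not a YAML parser: `check_register_configpath` is a hypothetical helper, and it assumes that each node entry begins with a `- register:` line, as in the template.

```shell
# check_register_configpath FILE: reports node entries that set `register: true`
# without a `configpath`. Returns 1 if any such entry is found.
check_register_configpath() {
  awk '
    function report() {
      if (reg == "true" && !cfg) {
        printf "line %d: register is true but configpath is missing\n", start
        bad = 1
      }
    }
    /^[ \t]*-[ \t]*register:/ { report(); start = NR; reg = $NF; cfg = 0; next }
    /^[ \t]*configpath:/      { cfg = 1 }
    END { report(); exit bad }
  ' "$1"
}

# Demonstration with an inline fragment that breaks the rule:
check_register_configpath /dev/stdin <<'EOF' || echo "cluster file needs attention"
hosts:
- hostname: 172.16.212.165
  localhost: true
  nodes:
  - register: true
    role: Master
    config:
      port: 3306
EOF
```

`sdb-deploy setup-cluster` validates the cluster file itself, so a check like this is only a convenience for catching the mistake earlier.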

## Cluster File Examples

SingleStore uses a combination of *aggregator* and *leaf* nodes that are typically configured in a specific ratio. For more information, refer to [Cluster Components](https://docs.singlestore.com/db/v9.1/introduction/distributed-architecture/cluster-components.md).

The examples below deploy two different types of SingleStore cluster:

* A multi-host, multi-node SingleStore cluster with two hosts, a single aggregator, and two leaf nodes
* A multi-host, multi-node SingleStore cluster with four hosts, two aggregators, and two leaf nodes

These cluster file examples can be used as a starting point for deploying a SingleStore cluster that fulfills your specific requirements.

* *Example 1: Two Hosts, Three Nodes*

  For this example, you will need two hosts and the ability to `ssh` into each host from the main deployment host.

  > **📝 Note**: Set `package_type` to either `rpm` for Red Hat distributions or `deb` for Debian distributions to download and deploy the appropriate `singlestoredb-server`.

  ```yaml
  license: <license-from-portal.singlestore.com>
  memsql_server_version: rc:9.1

  package_type: <rpm | deb>
  root_password: <secure-password>
  hosts:
  - hostname: 172.16.212.165
    localhost: true
    nodes:
    - register: false
      role: Master
      config:
        auditlogsdir: /data/memsql/Master/auditlogs/
        datadir: /data/memsql/Master/data
        plancachedir: /data/memsql/Master/plancache
        tracelogsdir: /data/memsql/Master/tracelogs
        port: 3306
    - register: false
      role: Leaf
      config:
        auditlogsdir: /data/memsql/Leaf1/auditlogs
        datadir: /data/memsql/Leaf1/data
        plancachedir: /data/memsql/Leaf1/plancache
        tracelogsdir: /data/memsql/Leaf1/tracelogs
        port: 3307
  - hostname: 172.16.212.166
    localhost: false
    ssh:
      host: 172.16.212.166
      private_key: /home/<user>/.ssh/id_rsa
    nodes:
    - register: false
      role: Leaf
      config:
        auditlogsdir: /data/memsql/Leaf2/auditlogs
        datadir: /data/memsql/Leaf2/data
        plancachedir: /data/memsql/Leaf2/plancache
        tracelogsdir: /data/memsql/Leaf2/tracelogs
        port: 3307
  ```

  Using this cluster file, `sdb-deploy setup-cluster`:

  1) Registers two hosts to the cluster.

  2) Installs the latest 9.1 version of `singlestoredb-server` on both hosts.

  3) Creates a Master Aggregator node on port `3306` on host `172.16.212.165` and sets the SingleStore password to the one specified by the value in the `root_password` field.

  4) Creates a leaf node on port `3307` on host `172.16.212.165` and sets the SingleStore password to the one specified by the value in the `root_password` field.

  5) Creates a leaf node on port `3307` on host `172.16.212.166` and sets the SingleStore password to the one specified by the value in the `root_password` field.

  6) Sets the paths for the audit logs, data, plancache, and trace logs on each host. **Notice that each field has its own path.**

     Alternatively, you can replace these individual fields and paths on each node definition with a single `baseinstalldir` field and path, such as `baseinstalldir: /data/memsql/Master`. Each node definition would then resemble:
     ```yaml
     nodes:
       - register: false
         role: Master
         config:
           baseinstalldir: /data/memsql/Master
           port: 3306

     ```

  To deploy this cluster, run the following with the path to the cluster file as input.
     ```shell
     sdb-deploy setup-cluster --cluster-file </path/to/cluster-file>

     ```

* *Example 2: Four Hosts, Four Nodes*

  For this example, you will need four hosts and the ability to `ssh` into each host from the main deployment host.

  > **📝 Note**: Set `package_type` to either `rpm` for Red Hat distributions or `deb` for Debian distributions to download and deploy the appropriate `singlestoredb-server`.

  ```yaml
  license: <license-from-portal.singlestore.com>
  high_availability: true
  memsql_server_version: rc:9.1

  package_type: <rpm | deb>
  hosts:
  - hostname: 172.16.212.165
    localhost: true
    nodes:
    - register: false
      role: Master
      config:
        password: <secure-password>
        port: 3306
  - hostname: 172.16.212.166
    localhost: false
    ssh:
      host: 172.16.212.166
      private_key: /home/<user>/.ssh/id_rsa
    nodes:
    - register: false
      role: Aggregator
      config:
        password: <secure-password>
        port: 3306
  - hostname: 172.16.212.167
    localhost: false
    ssh:
      host: 172.16.212.167
      private_key: /home/<user>/.ssh/id_rsa
    nodes:
    - register: false
      role: Leaf
      config:
        password: <secure-password>
        port: 3306
  - hostname: 172.16.212.168
    localhost: false
    ssh:
      host: 172.16.212.168
      private_key: /home/<user>/.ssh/id_rsa
    nodes:
    - register: false
      role: Leaf
      config:
        password: <secure-password>
        port: 3306
  ```

  Using this cluster file, `sdb-deploy setup-cluster`:

  1) Registers four hosts to the cluster.

  2) Enables [High Availability](https://docs.singlestore.com/db/v9.1/user-and-cluster-administration/high-availability-and-disaster-recovery/managing-high-availability.md).

  3) Installs the latest 9.1 version of `singlestoredb-server` on all four hosts.

  4) Creates a Master Aggregator node on port `3306` on host `172.16.212.165` and sets the SingleStore password to the one specified in the cluster file.

  5) Creates a Child Aggregator node on port `3306` on host `172.16.212.166` and sets the SingleStore password to the one specified in the cluster file.

  6) Creates a leaf node on port `3306` on host `172.16.212.167` and sets the SingleStore password to the one specified in the cluster file.

  7) Creates a leaf node on port `3306` on host `172.16.212.168` and sets the SingleStore password to the one specified in the cluster file.

  To deploy this cluster, run the following with the path to the cluster file as input.
     ```shell
     sdb-deploy setup-cluster --cluster-file </path/to/cluster-file>

     ```

## Additional Deployment Options

> **📝 Note**: If this deployment method is not ideal for your target environment, you can choose one that fits your requirements from the [Deployment Options](https://docs.singlestore.com/db/v9.1/deploy.md).

## Connect to Your Cluster

The `singlestore-client` package contains a lightweight client application that allows you to run SQL queries against your database from a terminal window.

After you have installed `singlestore-client`, use the `singlestore` application as you would use the `mysql` client to access your database.

For more connection options, help is available through `singlestore --help`.

```shell
singlestore -h <Master-or-Child-Aggregator-host-IP-address> -P <port> -u <user> -p<secure-password>

```

```output

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 12
Server version: 5.7.32 SingleStoreDB source distribution (compatible; MySQL Enterprise & MySQL Commercial)

Copyright (c) 2000, 2022, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

singlestore> 

```

Refer to [Connect to SingleStore](https://docs.singlestore.com/db/v9.1/connect-to-singlestore.md) for additional options for connecting to SingleStore.
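
For scripted checks, a single statement can be run without opening an interactive session. The sketch below assumes that the `singlestore` client accepts the same `-e` (execute) and bare `-p` (prompt for password) options as the `mysql` client. `run_sql` and the `SINGLESTORE_CLIENT` override are hypothetical names introduced for illustration.

```shell
# run_sql HOST PORT USER QUERY: runs one statement and exits.
# A bare `-p` makes the client prompt for the password instead of taking it
# from the command line. SINGLESTORE_CLIENT is a hypothetical override for dry runs.
run_sql() {
  "${SINGLESTORE_CLIENT:-singlestore}" -h "$1" -P "$2" -u "$3" -p -e "$4"
}

# Dry run: print the arguments that would be passed instead of connecting.
SINGLESTORE_CLIENT=echo run_sql 172.16.212.165 3306 root "SHOW LEAVES"
```

Against a running cluster, calling `run_sql` with `SHOW AGGREGATORS` or `SHOW LEAVES` on the Master Aggregator is one way to confirm that the nodes from the cluster file are present.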

## Next Steps After Deployment

Now that you have installed SingleStore, check out the following resources to learn more about SingleStore:

* [Optimizing Table Data Structures](https://docs.singlestore.com/db/v9.1/create-a-database/optimizing-table-data-structures.md): Learn the difference between rowstore and columnstore tables, when you should pick one over the other, how to pick a shard key, and so on.
* [How to Load Data into SingleStore](https://docs.singlestore.com/db/v9.1/load-data.md): Describes the different options you have when ingesting data into a SingleStore cluster.
* [How to Run Queries](https://docs.singlestore.com/db/v9.1/query-data/basic-query-examples.md): Provides example schema and queries to begin exploring the potential of SingleStore.
* [Configure Monitoring](https://docs.singlestore.com/db/v9.1/user-and-cluster-administration/cluster-health-and-performance/configure-monitoring.md): SingleStore’s native monitoring solution is designed to capture and reveal cluster events over time. By analyzing this event data, you can identify trends and, if necessary, take action to remediate issues.
* [Tools Reference](https://docs.singlestore.com/db/v9.1/reference/singlestore-tools-reference.md): Contains information about SingleStore Tools, including Toolbox and related commands.

***

Modified at: September 18, 2023

Source: [/db/v9.1/deploy/linux/yaml-online-deb/](https://docs.singlestore.com/db/v9.1/deploy/linux/yaml-online-deb/)

