How Failover is Triggered in HA

Case 1: The node is experiencing a grey failure: a live connection to it exists, but it is not answering heartbeats.

In this case, a heartbeat (ping) is sent every 150 ms, and failover is triggered after 200 consecutive heartbeats receive no response. Failover therefore takes 30 seconds (150 ms × 200). This case requires a thorough investigation because the cause of the missing responses is unknown: since the node is not returning an error, it could simply be slow or under heavy load.
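The Case 1 timing can be checked with a few lines of Python. This is only an illustrative sketch of the arithmetic described above, not the product's implementation: the 150 ms interval and the threshold of 200 come from the text, while the function names and the idea of resetting the counter on any answered heartbeat (inferred from the word "consecutive") are assumptions.

```python
HEARTBEAT_INTERVAL_MS = 150  # from the text: one heartbeat every 150 ms
MISSED_THRESHOLD = 200       # from the text: 200 consecutive unanswered heartbeats


def time_to_failover_ms(interval_ms=HEARTBEAT_INTERVAL_MS,
                        threshold=MISSED_THRESHOLD):
    """Time until failover when every heartbeat goes unanswered."""
    return interval_ms * threshold


def heartbeat_at_failover(responses, threshold=MISSED_THRESHOLD):
    """Hypothetical counter model.

    `responses` is an iterable of booleans (True = heartbeat answered).
    Returns the 1-based index of the heartbeat at which the consecutive-miss
    counter reaches the threshold, or None if it never does.
    """
    missed = 0
    for index, answered in enumerate(responses, start=1):
        # Assumption: a single answered heartbeat resets the counter.
        missed = 0 if answered else missed + 1
        if missed >= threshold:
            return index
    return None


if __name__ == "__main__":
    print(time_to_failover_ms() / 1000, "seconds")
```

Under this model, a node that answers even one heartbeat in every 200 never fails over, which is one reason a slow or overloaded node can stay in the cluster for a long time.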

Case 2: The node is refusing the heartbeat connection entirely.

In this case, connection attempts either return an error or time out. Instead of waiting out the full 30 seconds of Case 1, an extra penalty is added to the heartbeat counter with every failed connection attempt, so that failover happens in 3 seconds if the heartbeat fails to connect 3 consecutive times. Each connection attempt also has a 10-second timeout by default, and the heartbeat counter from Case 1 continues to rise every 150 ms while the connection attempt is in progress.
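The interaction between the regular 150 ms ticks and the per-failure penalty can be explored with a small model. The sketch below is hypothetical: the text does not state the size of the penalty, so `min_penalty_for` only derives what the penalty would have to be under the assumption that the Case 1 threshold of 200 also applies here and that the counter simply adds ticks and penalties together. It should not be read as the actual value used by the product.

```python
import math

HEARTBEAT_INTERVAL_MS = 150   # from the text
FAILOVER_THRESHOLD = 200      # from Case 1; assumed to apply to Case 2 as well
CONNECT_TIMEOUT_MS = 10_000   # from the text: default connection timeout


def ticks_during(duration_ms, interval_ms=HEARTBEAT_INTERVAL_MS):
    """Number of regular counter increments that fit into `duration_ms`."""
    return duration_ms // interval_ms


def min_penalty_for(attempts, elapsed_ms, threshold=FAILOVER_THRESHOLD):
    """Smallest per-attempt penalty (hypothetical) that lets the counter
    reach the threshold after `attempts` failures within `elapsed_ms`."""
    remaining = threshold - ticks_during(elapsed_ms)
    return max(0, math.ceil(remaining / attempts))


if __name__ == "__main__":
    # Ticks that accumulate while a single connection attempt times out.
    print(ticks_during(CONNECT_TIMEOUT_MS))
    # Penalty this simplified model would need for 3 failures in 3 seconds.
    print(min_penalty_for(attempts=3, elapsed_ms=3_000))
```

The first figure shows why a connection attempt that hangs until its timeout already contributes a substantial share of the counter by itself, even before any penalty is applied.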

See Cluster Management Commands for more details.

Last modified: May 5, 2022
