Distributed DML Query Execution

In this topic we will look at common DML query patterns and how they are executed through the distributed system.

You can use the EXPLAIN command to examine a query’s aggregator-level and leaf-level query plans.

Let’s assume the following schema:

CREATE TABLE a (
a1 int,
a2 int,
a3 int,
SHARD KEY (a1, a2),
KEY (a3)
);
CREATE TABLE b (
b1 int,
b2 int,
b3 int,
SHARD KEY (b1, b2)
);
CREATE REFERENCE TABLE r (
r1 int,
r2 int,
PRIMARY KEY (r1),
KEY (r2)
);

Index Matching

Matching on the shard key or the secondary index.

Matching the Shard Key

If you specify an equality on every column in the shard key, then the aggregator node will direct the query to exactly one partition. Most queries do not fall into this pattern; instead, the aggregator node must send queries to every partition in the cluster for intermediate results and then stitch them together.

These queries are sent to one partition:

SELECT * FROM a WHERE a1 = 4 AND a2 = 10;
SELECT a3, count(*) FROM a WHERE a1 = 4 AND a2 = 10 GROUP BY a3;

These queries are sent to all partitions:

SELECT * FROM a WHERE a1 = 4;
SELECT * FROM a WHERE a1 = 4 OR a2 = 10;
SELECT * FROM a WHERE a1 IN (4, 5) AND a2 IN (10);

Secondary Index Matching

If your query uses a secondary (non-shard) index, then the aggregator node must send the query to every partition in the cluster. Locally, each partition’s table will use its part of the secondary index to speed up the query. While the overall performance of the query is dictated by the seek and scan time of these indexes, the fact that the query must be sent everywhere in the cluster can increase the variance (and therefore overall latency) of the query.

This query matches the secondary index on the column a3:

SELECT * FROM a WHERE a3 = 5;

No Index Matching

Queries that do not match any index perform a full table scan on all partitions. From the perspective of the aggregator node, these queries are the same as queries that match a secondary index.

Last modified: June 22, 2022

Was this article helpful?

Verification instructions

Note: You must install cosign to verify the authenticity of the SingleStore file.

Use the following steps to verify the authenticity of singlestoredb-server, singlestoredb-toolbox, singlestoredb-studio, and singlestore-client SingleStore files that have been downloaded.

You may perform the following steps on any computer that can run cosign, such as the main deployment host of the cluster.

  1. (Optional) Run the following command to view the associated signature files.

    curl undefined
  2. Download the signature file from the SingleStore release server.

    • Option 1: Click the Download Signature button next to the SingleStore file.

    • Option 2: Copy and paste the following URL into the address bar of your browser and save the signature file.

    • Option 3: Run the following command to download the signature file.

      curl -O undefined
  3. After the signature file has been downloaded, run the following command to verify the authenticity of the SingleStore file.

    echo -n undefined |
    cosign verify-blob --certificate-oidc-issuer https://oidc.eks.us-east-1.amazonaws.com/id/CCDCDBA1379A5596AB5B2E46DCA385BC \
    --certificate-identity https://kubernetes.io/namespaces/freya-production/serviceaccounts/job-worker \
    --bundle undefined \
    --new-bundle-format -
    Verified OK