Parallel Read Support

You can enable parallel reads with the enableParallelRead option. A parallel read creates multiple Spark tasks, which can drastically improve performance in some cases.
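
As a rough illustration, a read with the option set might look like the sketch below. The data source name "singlestore", the table name passed in, and the value "true" are assumptions that this article does not state. Some connector versions may expect a mode string rather than a boolean, so check the option reference for your version.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object ParallelReadExample {
  // Hypothetical helper: requests a parallel read for the given table.
  // Assumptions: the "singlestore" format name and the "true" value.
  // `table` is a placeholder such as "mydb.mytable".
  def readWithParallelRead(spark: SparkSession, table: String): DataFrame =
    spark.read
      .format("singlestore")
      .option("enableParallelRead", "true")
      .load(table)
}
```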

Note: Parallel reads are not consistent

Parallel reads read directly from partitions on the leaf nodes, which bypasses the transaction layer. As a result, each individual read sees an independent version of the database's distributed state. If other queries (besides the read itself) run against the database at the same time, they may affect the data returned by the in-progress read. Take this into account when enabling parallel reads.

Note: Parallel reads transparently fall back to single stream reads

Parallel reads currently work only for query shapes that do no work on the aggregator and can therefore be pushed down entirely to the leaf nodes. To determine whether a particular query is being pushed down, ask the DataFrame how many partitions it has:

df.rdd.getNumPartitions

If this value is greater than 1, the read is running in parallel from the leaf nodes.
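
Because the fallback is transparent, it can help to log which mode a read actually used. The following is a minimal sketch of that check; the object and method names are hypothetical and are not part of the connector.

```scala
object ParallelReadCheck {
  // Applies the rule from the text: more than one partition means
  // the read is running in parallel from the leaf nodes.
  def describeRead(numPartitions: Int): String =
    if (numPartitions > 1) s"parallel read across $numPartitions partitions"
    else s"not parallel: $numPartitions partition(s)"
}
```

With a DataFrame named df, this could be called as `println(ParallelReadCheck.describeRead(df.rdd.getNumPartitions))`.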

Note: Parallel reads require consistent authentication and connectible leaf nodes

In order to use parallel reads, the username and password provided to the singlestore-spark-connector must be the same across all nodes in the cluster.

In addition, the hostnames and ports listed by SHOW LEAVES must be directly connectible from Spark.
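
One way to sanity-check the connectivity requirement is to attempt a plain TCP connection to each leaf address. The sketch below assumes you have copied the host and port values from SHOW LEAVES by hand; the hostnames shown are placeholders, and the helper names are hypothetical. A successful TCP connection does not confirm that authentication will succeed.

```scala
import java.net.{InetSocketAddress, Socket}
import scala.util.Try

object LeafConnectivity {
  // Returns true if a TCP connection to host:port succeeds within timeoutMs.
  def isReachable(host: String, port: Int, timeoutMs: Int = 2000): Boolean =
    Try {
      val socket = new Socket()
      try socket.connect(new InetSocketAddress(host, port), timeoutMs)
      finally socket.close()
    }.isSuccess

  def main(args: Array[String]): Unit = {
    // Placeholder values: substitute the hosts and ports listed by SHOW LEAVES.
    val leaves = Seq(("leaf-1.example.internal", 3306), ("leaf-2.example.internal", 3306))
    leaves.foreach { case (host, port) =>
      val status = if (isReachable(host, port)) "reachable" else "NOT reachable"
      println(s"$host:$port is $status")
    }
  }
}
```

Run this from a machine in the same network position as the Spark workers, since the tasks that read from the leaf nodes open their own connections.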

Last modified: February 23, 2024

Verification instructions

Note: You must install cosign to verify the authenticity of the SingleStore file.

Use the following steps to verify the authenticity of downloaded singlestoredb-server, singlestoredb-toolbox, singlestoredb-studio, and singlestore-client files.

You may perform the following steps on any computer that can run cosign, such as the main deployment host of the cluster.

  1. (Optional) Run the following command to view the associated signature files.

    curl undefined

  2. Download the signature file from the SingleStore release server.

    • Option 1: Click the Download Signature button next to the SingleStore file.

    • Option 2: Copy and paste the following URL into the address bar of your browser and save the signature file.

    • Option 3: Run the following command to download the signature file.

      curl -O undefined

  3. After the signature file has been downloaded, run the following command to verify the authenticity of the SingleStore file.

    echo -n undefined |
    cosign verify-blob --certificate-oidc-issuer https://oidc.eks.us-east-1.amazonaws.com/id/CCDCDBA1379A5596AB5B2E46DCA385BC \
    --certificate-identity https://kubernetes.io/namespaces/freya-production/serviceaccounts/job-worker \
    --bundle undefined \
    --new-bundle-format -

    Verified OK