# Configuration Settings

The SingleStore Spark Connector leverages Spark SQL’s Data Sources API.

The `singlestore-spark-connector` is configurable globally via Spark options and locally when constructing a DataFrame. The global and local options use the same names; however, the global options take the prefix `spark.datasource.singlestore.` (note the trailing dot). The connection to SingleStore relies on the following Spark configuration options:
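
The prefix rule above can be illustrated with a minimal sketch in plain Python. The helper name `to_global_options` is hypothetical and not part of the connector; only the prefix string and the option names come from this page.

```python
# Prefix quoted from the paragraph above.
GLOBAL_PREFIX = "spark.datasource.singlestore."


def to_global_options(local_options):
    """Return a copy of the options with the global prefix added to each name.

    Hypothetical helper for illustration; the connector does not ship it.
    """
    return {GLOBAL_PREFIX + name: value for name, value in local_options.items()}


# Local (per-DataFrame) option names, as listed in the tables on this page.
local_options = {
    "ddlEndpoint": "master-agg.foo.internal:3308",
    "user": "root",
}

# The same settings expressed as global Spark configuration keys.
global_options = to_global_options(local_options)
for name, value in global_options.items():
    print(f"{name}={value}")
```

In PySpark, the prefixed names would typically be supplied through `SparkSession.builder.config(name, value)` and the unprefixed names through `DataFrameReader.option(name, value)` or `DataFrameWriter.option(name, value)`.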

## Basic Options

| Option | Description | Default Value |
| --- | --- | --- |
| `ddlEndpoint` (required) | The hostname or IP address of the SingleStore Master Aggregator in the `host[:port]` format, where `port` is an optional parameter. Example: `master-agg.foo.internal:3308` or `master-agg.foo.internal`. | |
| `dmlEndPoints` | The hostnames or IP addresses of the SingleStore Aggregator nodes to run queries against, as a comma-separated list in the `host[:port],host[:port],...` format, where `:port` is an optional parameter. Example: `child-agg:3308,child-agg2`. | `ddlEndpoint` |
| `user` | SingleStore username. | `root` |
| `password` | SingleStore password. | |
| `query` | The query to run (mutually exclusive with the `dbtable` option). | |
| `dbtable` | The table to query (mutually exclusive with the `query` option). | |
| `database` | If set, all connections use this database by default. This option is empty by default. | |
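
The endpoint format and the `query`/`dbtable` rule in the table above can be checked before any options reach Spark. The following is a minimal sketch in plain Python; all function names are hypothetical, and the parser assumes hostnames or IPv4 addresses only (it does not handle IPv6 literals).

```python
def parse_endpoint(endpoint):
    """Split 'host[:port]' into (host, port); port is None when omitted."""
    host, sep, port = endpoint.strip().partition(":")
    if not host:
        raise ValueError(f"missing host in endpoint: {endpoint!r}")
    if sep and not port.isdigit():
        raise ValueError(f"invalid port in endpoint: {endpoint!r}")
    return host, (int(port) if sep else None)


def parse_dml_endpoints(value):
    """Parse the comma-separated 'host[:port],host[:port],...' list."""
    return [parse_endpoint(item) for item in value.split(",")]


def check_source(options):
    """Reject option sets that specify both `query` and `dbtable`."""
    if "query" in options and "dbtable" in options:
        raise ValueError("`query` and `dbtable` are mutually exclusive")
    return options


# Example values taken from the table above.
print(parse_endpoint("master-agg.foo.internal:3308"))
print(parse_dml_endpoints("child-agg:3308,child-agg2"))
```

Validating on the client side like this is a design choice, not something the connector requires; it only surfaces malformed values earlier than a failed connection would.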

## Read Options

| Option | Description | Default Value |
| --- | --- | --- |
| `disablePushdown` | Disables SQL Pushdown when running queries. | `false` |
| `enableParallelRead` | Enables reading data in parallel for some query shapes. It can have one of the following values: `disabled`, `automaticLite`, `automatic`, and `forced`. For more information, see [Parallel Read Support](https://docs.singlestore.com/db/v9.1/load-data/integrate-with-singlestore/load-data-from-spark/parallel-read-support.md). | `automaticLite` |
| `parallelRead.Features` | Specifies a comma-separated list of parallel read features that are tried in the order they are listed. SingleStore supports the following features: `ReadFromLeaves`, `ReadFromAggregators`, and `ReadFromAggregatorsMaterialized`. For example, `ReadFromAggregators,ReadFromAggregatorsMaterialized`. For more information, see [Parallel Read Support](https://docs.singlestore.com/db/v9.1/load-data/integrate-with-singlestore/load-data-from-spark/parallel-read-support.md). | `ReadFromAggregators` |
| `parallelRead.tableCreationTimeoutMS` | Specifies the amount of time (in ms) the reader waits for the result table creation when using the `ReadFromAggregators` feature. If set to `0`, the timeout is disabled. | `0` |
| `parallelRead.materializedTableCreationTimeoutMS` | Specifies the amount of time (in ms) the reader waits for the result table creation when using the `ReadFromAggregatorsMaterialized` feature. If set to `0`, the timeout is disabled. | `0` |
| `parallelRead.maxNumPartitions` | Specifies the maximum number of partitions in the resulting DataFrame. If set to `0`, the DataFrame can have an unlimited number of partitions. | `0` |
| `parallelRead.repartition` | Repartitions data before reading. | `false` |
| `parallelRead.repartition.columns` | Specifies a comma-separated list of columns that are used for repartitioning (when `parallelRead.repartition` is enabled). By default, an additional column with a `RAND()` value is used for repartitioning. | |
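
As a minimal sketch of how the parallel read options above fit together, the following plain-Python helper assembles them into a dictionary and rejects values that the table does not list. The function name is hypothetical, and treating the values as case-sensitive is an assumption made here for illustration.

```python
# Allowed values copied from the table above.
PARALLEL_READ_MODES = {"disabled", "automaticLite", "automatic", "forced"}
PARALLEL_READ_FEATURES = {
    "ReadFromLeaves",
    "ReadFromAggregators",
    "ReadFromAggregatorsMaterialized",
}


def parallel_read_options(mode="automaticLite",
                          features=("ReadFromAggregators",),
                          max_partitions=0):
    """Build read options; defaults mirror the documented default values."""
    if mode not in PARALLEL_READ_MODES:
        raise ValueError(f"unknown enableParallelRead value: {mode!r}")
    unknown = [f for f in features if f not in PARALLEL_READ_FEATURES]
    if unknown:
        raise ValueError(f"unknown parallel read features: {unknown}")
    if max_partitions < 0:
        raise ValueError("max_partitions must be 0 (unlimited) or positive")
    return {
        "enableParallelRead": mode,
        # Features are tried in the order they are listed.
        "parallelRead.Features": ",".join(features),
        "parallelRead.maxNumPartitions": str(max_partitions),
    }


read_options = parallel_read_options(
    mode="forced",
    features=("ReadFromAggregators", "ReadFromAggregatorsMaterialized"),
    max_partitions=32,  # illustrative value, not a recommendation
)
print(read_options)
```

A dictionary like this could be passed to PySpark with `DataFrameReader.options(**read_options)`.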

## Write Options

| Option | Description | Default Value |
| --- | --- | --- |
| `overwriteBehavior` | Specifies the behavior during `Overwrite`. It can have one of the following values: `dropAndCreate`, `truncate`, or `merge`. | `dropAndCreate` |
| `truncate` | **This option is deprecated; use `overwriteBehavior` instead.** Truncates an existing table during `Overwrite` instead of dropping it. | `false` |
| `loadDataCompression` | Compresses data on load. It can have one of the following three values: `GZip`, `LZ4`, or `Skip`. | `GZip` |
| `loadDataFormat` | Serializes data on load. It can have one of the following values: `Avro` or `CSV`. | `CSV` |
| `tableKey` | Specifies additional keys to add to tables created by the connector. See [Load Data from Spark Examples](https://docs.singlestore.com/db/v9.1/load-data/integrate-with-singlestore/load-data-from-spark/examples/#UUID-7fb177b0-bda7-4fe9-1076-dfcdba01cd91.md) for more information. | |
| `onDuplicateKeySQL` | If this option is specified and a new row with a duplicate `PRIMARY KEY` or `UNIQUE` index is inserted, SingleStore performs an `UPDATE` operation on the existing row. See [Load Data from Spark Examples](https://docs.singlestore.com/db/v9.1/load-data/integrate-with-singlestore/load-data-from-spark/examples/#UUID-e1b46ce2-97f2-b9f8-a204-ece6db02546e.md) for more information. | |
| `insertBatchSize` | Specifies the size of the batch for row insertion. | `10000` |
| `maxErrors` | The maximum number of errors in a single `LOAD DATA` request. When this limit is reached, the load fails. If this property is set to `0`, no error limit exists. | `0` |
| `createRowstoreTable` | If enabled, the connector creates a rowstore table. | `false` |
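
Because `truncate` is deprecated in favor of `overwriteBehavior`, existing option sets may need updating. The sketch below shows one possible migration in plain Python. The helper name is hypothetical, and it assumes that `truncate=true` is equivalent to `overwriteBehavior=truncate` and that an explicit `overwriteBehavior` should win when both are present; neither assumption is confirmed by this page.

```python
# Allowed values copied from the table above.
OVERWRITE_BEHAVIORS = {"dropAndCreate", "truncate", "merge"}


def migrate_truncate(options):
    """Return a copy of the write options without the deprecated `truncate`."""
    migrated = dict(options)
    # Remove the deprecated option; treat any value other than "true" as false.
    truncate = str(migrated.pop("truncate", "false")).lower() == "true"
    if truncate:
        # Assumption: keep an explicit overwriteBehavior if one is already set.
        migrated.setdefault("overwriteBehavior", "truncate")
    behavior = migrated.get("overwriteBehavior", "dropAndCreate")
    if behavior not in OVERWRITE_BEHAVIORS:
        raise ValueError(f"unknown overwriteBehavior value: {behavior!r}")
    return migrated


old_options = {"truncate": "true", "loadDataFormat": "Avro"}
new_options = migrate_truncate(old_options)
print(new_options)
```

The input dictionary is left unmodified, so the old and new option sets can be compared before switching over.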

## Connection Pool Options

| Option | Description | Default Value |
| --- | --- | --- |
| `driverConnectionPool.Enabled` | Enables the use of a connection pool on the driver. | `true` |
| `driverConnectionPool.MaxOpenConns` | The maximum number of active connections with the same options that can be allocated from the driver pool at the same time. A negative value indicates an unlimited number of active connections. | `-1` |
| `driverConnectionPool.MaxIdleConns` | The maximum number of connections with the same options that can remain idle in the driver pool without extra ones being released. A negative value indicates an unlimited number of idle connections. | `8` |
| `driverConnectionPool.MinEvictableIdleTimeMs` | The minimum amount of time (in ms) an object may sit idle in the driver pool before it is eligible for eviction by the idle object evictor (if any). | `30000` (30 sec) |
| `driverConnectionPool.TimeBetweenEvictionRunsMS` | The number of milliseconds to sleep between runs of the idle object evictor thread on the driver. If set to 0 or a negative number, no idle object evictor thread is run. | `1000` (1 sec) |
| `driverConnectionPool.MaxWaitMS` | The maximum number of milliseconds that the driver pool waits (when there are no available connections) for a connection to be returned before throwing an exception. If set to -1, the pool waits indefinitely. | `-1` |
| `driverConnectionPool.MaxConnLifetimeMS` | The maximum lifetime of the connection (in ms), after which the connection fails the next activation, passivation, or validation test. If set to 0 or a negative number, the connection has an infinite lifetime. | `-1` |
| `executorConnectionPool.Enabled` | Enables the use of a connection pool on executors. | `true` |
| `executorConnectionPool.MaxOpenConns` | The maximum number of active connections with the same options that can be allocated from the executor pool at the same time. A negative value indicates an unlimited number of active connections. | `-1` |
| `executorConnectionPool.MaxIdleConns` | The maximum number of connections with the same options that can remain idle in the executor pool without extra ones being released. A negative value indicates an unlimited number of idle connections. | `8` |
| `executorConnectionPool.MinEvictableIdleTimeMs` | The minimum amount of time (in ms) an object may sit idle in the executor pool before it is eligible for eviction by the idle object evictor (if any). | `2000` (2 sec) |
| `executorConnectionPool.TimeBetweenEvictionRunsMS` | The number of milliseconds to sleep between runs of the idle object evictor thread on the executor. If set to 0 or a negative number, no idle object evictor thread is run. | `1000` (1 sec) |
| `executorConnectionPool.MaxWaitMS` | The maximum number of milliseconds that the executor pool waits (when there are no available connections) for a connection to be returned before throwing an exception. If set to -1, the pool waits indefinitely. | `-1` |
| `executorConnectionPool.MaxConnLifetimeMS` | The maximum lifetime of the connection (in ms), after which the connection fails the next activation, passivation, or validation test. If set to 0 or a negative number, the connection has an infinite lifetime. | `-1` |
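
The driver and executor pools share the same setting names and differ only in their prefix. The following minimal plain-Python sketch builds option names for either side; the helper name is hypothetical, and the numeric values are illustrative rather than tuning advice.

```python
def pool_options(side, **settings):
    """Build '<side>ConnectionPool.<Setting>' options as strings.

    `side` is "driver" or "executor"; setting names must be spelled
    exactly as in the table above (for example, MaxOpenConns).
    """
    if side not in ("driver", "executor"):
        raise ValueError(f"side must be 'driver' or 'executor', got {side!r}")
    prefix = f"{side}ConnectionPool."

    def render(value):
        # Booleans are written in lowercase to match the documented `true`.
        return str(value).lower() if isinstance(value, bool) else str(value)

    return {prefix + name: render(value) for name, value in settings.items()}


executor_pool = pool_options(
    "executor",
    Enabled=True,
    MaxOpenConns=16,   # illustrative cap on active connections
    MaxWaitMS=5000,    # wait up to 5 seconds for a free connection
)
print(executor_pool)
```

Combined with the global prefix, a setting such as `executorConnectionPool.MaxOpenConns` would become `spark.datasource.singlestore.executorConnectionPool.MaxOpenConns`.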

***

Modified at: September 26, 2025

Source: [/db/v9.1/load-data/integrate-with-singlestore/load-data-from-spark/configuration-settings/](https://docs.singlestore.com/db/v9.1/load-data/integrate-with-singlestore/load-data-from-spark/configuration-settings/)
