Configuration Settings
The SingleStore Spark Connector leverages Spark SQL’s Data Sources API.
The singlestore-spark-connector is configurable globally via Spark options and locally when constructing a DataFrame. The global and local options use the same names; however, the global options have the prefix spark.datasource.singlestore.
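The naming rule above can be sketched in a few lines of Python. This is an illustrative helper, not part of the connector: the function name to_global_options is hypothetical, the trailing dot is assumed to belong to the prefix, and the sample option names (user, database) are assumptions that should be checked against the connector reference.

```python
# Prefix that the text gives for global options.
# Assumption: the trailing dot is part of the prefix.
GLOBAL_PREFIX = "spark.datasource.singlestore."


def to_global_options(local_options):
    """Map local option names to their global Spark configuration keys."""
    return {GLOBAL_PREFIX + name: value for name, value in local_options.items()}


# The option names below are assumptions used for illustration only.
local = {"user": "admin", "database": "demo"}
global_conf = to_global_options(local)
for key, value in sorted(global_conf.items()):
    print(f"{key}={value}")
```

With PySpark available, each prefixed pair could be passed to SparkSession.builder.config(key, value), while the unprefixed names would go to DataFrameReader.option(name, value) when constructing a single DataFrame.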
The connection to SingleStoreDB relies on the following Spark configuration options:
Basic Options
Option | Description | Default Value |
---|---|---|
| The hostname or IP address of the SingleStore workspace in the | |
| SingleStoreDB username. | |
| SingleStoreDB password. | |
| The query to run (mutually exclusive with | |
| The table to query (mutually exclusive with query). | |
| If set, all connections use this database by default. This option is empty by default. | |
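The mutual exclusivity noted for the query and table options can be checked before a read is attempted. The sketch below is a hypothetical pre-flight check, not connector behavior: the helper name validate_source_options is invented, and dbtable is an assumed name for the table option that should be confirmed against the connector reference.

```python
def validate_source_options(options):
    """Raise ValueError if both a query and a table are specified."""
    has_query = bool(options.get("query"))
    has_table = bool(options.get("dbtable"))  # assumed option name
    if has_query and has_table:
        raise ValueError("Specify either 'query' or 'dbtable', not both.")
    return options


# A query on its own is accepted.
validate_source_options({"user": "admin", "query": "SELECT 1"})

# Supplying both is rejected.
try:
    validate_source_options({"query": "SELECT 1", "dbtable": "t"})
except ValueError as err:
    print(err)
```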
Read Options
Option | Description | Default Value |
---|---|---|
| Disables SQL Pushdown when running queries. | |
| Enables reading data in parallel for some query shapes. It can have one of the following values: | |
| Specifies a comma-separated list of parallel read features that are tried in the order they are listed. SingleStoreDB supports the following features: | |
| Specifies the amount of time (in | |
| Specifies the amount of time (in | |
| Specifies the maximum number of partitions in the resulting DataFrame. If set to | |
| Repartitions data before reading. | |
| Specifies a comma-separated list of columns that are used for repartitioning (when | |
Write Options
Option | Description | Default Value |
---|---|---|
| Specifies the behavior during | |
| This option is deprecated, please use | |
| Compresses data on load. It can have one of the following three values: | |
| Serializes data on load. It can have one of the following values: | |
| Specifies additional keys to add to tables created by the connector. See Load Data from Spark Examples for more information. | |
| If this option is specified and a new row with duplicate | |
| Specifies the size of the batch for row insertion. | 10000 |
| The maximum number of errors in a single | |
| If enabled, the connector creates a rowstore table. | |
Connection Pool Options
Option | Description | Default Value |
---|---|---|
| Enables the use of a connection pool on the driver. | |
| The maximum number of active connections with the same options that can be allocated from the driver pool at the same time. A negative value indicates an unlimited number of active connections. | -1 |
| The maximum number of connections with the same options that can remain idle in the driver pool without extra ones being released. A negative value indicates an unlimited number of idle connections. | 8 |
| The minimum amount of time (in ms) an object may sit idle in the driver pool before it is eligible for eviction by the idle object evictor (if any). | 30000 (30 sec) |
| The number of milliseconds to sleep between runs of the idle object evictor thread on the driver. If set to 0 or a negative number, no idle object evictor thread is run. | 1000 (1 sec) |
| The maximum number of milliseconds that the driver pool waits (when there are no available connections) for a connection to be returned before throwing an exception. If set to -1, the pool waits indefinitely. | -1 |
| The maximum lifetime of a connection (in ms), after which the connection fails the next activation, passivation, or validation test. If set to 0 or a negative number, the connection has an infinite lifetime. | -1 |
| Enables the use of a connection pool on executors. | |
| The maximum number of active connections with the same options that can be allocated from the executor pool at the same time. A negative value indicates an unlimited number of active connections. | -1 |
| The maximum number of connections with the same options that can remain idle in the executor pool, without extra ones being released. A negative value indicates an unlimited number of idle connections. | 8 |
| The minimum amount of time (in ms) an object may sit idle in the executor pool before it is eligible for eviction by the idle object evictor (if any). | 2000 (2 sec) |
| The number of milliseconds to sleep between runs of the idle object evictor thread on the executor. If set to 0 or a negative number, no idle object evictor thread is run. | 1000 (1 sec) |
| The maximum number of milliseconds that the executor pool waits (when there are no available connections) for a connection to be returned before throwing an exception. If set to -1, the pool waits indefinitely. | -1 |
| The maximum lifetime of a connection (in ms), after which the connection fails the next activation, passivation, or validation test. If set to 0 or a negative number, the connection has an infinite lifetime. | -1 |
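Several pool settings above rely on sentinel values: a negative connection limit means unlimited, an eviction interval of 0 or less disables the evictor thread, and a maximum wait of -1 means the pool waits indefinitely. A minimal Python sketch of how such values might be interpreted follows. The function names are hypothetical, and this is not the connector's own code.

```python
def describe_connection_limit(limit):
    """Negative values mean no cap on active or idle connections."""
    return "unlimited" if limit < 0 else str(limit)


def evictor_runs(interval_ms):
    """The idle object evictor thread runs only for positive intervals."""
    return interval_ms > 0


def max_wait_seconds(max_wait_ms):
    """Return the wait in seconds, or None when the pool waits indefinitely (-1)."""
    return None if max_wait_ms == -1 else max_wait_ms / 1000


# Values taken from the defaults listed in the tables.
print(describe_connection_limit(-1))  # maximum active connections
print(describe_connection_limit(8))   # maximum idle connections
print(evictor_runs(1000))             # evictor interval of 1 sec
print(max_wait_seconds(-1))           # maximum wait
```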