Getting Started
You can download the latest version of the SingleStore Spark Connector from Maven Central or SparkPackages.com. The artifact name begins with `singlestore-spark-connector_`.
The following matrix shows currently supported versions of the connector and their compatibility with different Spark versions:
| Connector version | Supported Spark versions |
|---|---|
| 4. | Spark 3. |
| 4. | Spark 3. |
| 4. | Spark 3. |
| 4. | Spark 3. |
| 4. | Spark 3. |
| 4. | Spark 3. |
| 4. | Spark 3. |
| 4. | Spark 3. |
| 4. | Spark 3. |
| 4. | Spark 3. |
| 3. | Spark 3. |
| 3. | Spark 3. |
Note
SingleStore recommends using the latest version of the connector compatible with the corresponding Spark version.
The connector follows the x. naming convention, where x. represents the connector version and y. represents the corresponding Spark version. 3. , 3.
Release Highlights
Version 4.
- Changed the retry logic for reading from the result table to use exponential backoff.
- Used `ForkJoinPool` instead of `FixedThreadPool`.
- Added more logging.
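As a generic illustration of the exponential backoff mentioned above, the sketch below retries an operation and doubles the wait after each failure, up to a cap. It is not the connector's actual implementation; the function name `retry_with_backoff` and all parameters and defaults are assumptions chosen for this example.

```python
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1,
                       max_delay=5.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is exhausted.

    Hypothetical helper for illustration only. The delay before retry
    number n (0-based) is base_delay * 2**n, capped at max_delay.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            # Re-raise the last error once the attempt budget is spent.
            if attempt == max_attempts - 1:
                raise
            delay = min(base_delay * (2 ** attempt), max_delay)
            sleep(delay)
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting; a production version would typically also add random jitter and catch only errors known to be transient.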
Version 4.
- Fixed a bug that caused reading from the wrong result table when the task was restarted.
Version 4.
- Changed `LoadDataWriter` to send data in batches.
- Added the `numPartitions` parameter to specify the exact number of resulting partitions during parallel read.
Version 4.
- Added support for Spark 3.5 for connector version 4.1.5.
- Updated dependencies.
Version 4.
- Added support for Spark 3.4 for connector version 4.1.4.
- Added support for additional connection attributes.
- Fixed conflicts in result table names during parallel read.
Version 4.
- Improved error handling when using the `onDuplicateKeySQL` option.
Version 4.
- Fixed an issue where retrying parallel reads caused a `Table has reached its quota of 1 reader(s)` error.
Version 4.
- Added support for the `clientEndpoint` option.
- Added support for Spark 3.3 for connector version 4.1.1.
- Fixed an issue with error handling that caused deadlocks.
Version 4.
- Added support for JWT-based authentication.
- Added support for connection pooling.
- Added multi-partition support to the parallel read feature.
- Added support for more SQL expressions in pushdown.
Version 4.
- The connector uses the SingleStore JDBC driver instead of the MariaDB JDBC driver.
Version 3.
- Added support for parallel reads from aggregator nodes.
- Added support for repartitioning results by columns in parallel reads from aggregators.
Version 3.
- The connector uses the MariaDB JDBC driver and is rebranded from `memsql-spark-connector` to `singlestore-spark-connector`.
- Adapts the rebranding from `memsql` to `singlestore`. For example, the configuration prefix is changed from `spark.datasource.memsql.<config_name>` to `spark.datasource.singlestore.<config_name>`.
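The configuration prefix change described above can be sketched as a small key-rewriting helper, which may be handy when porting an existing set of Spark settings. This is only an illustration: the function `migrate_config_keys` is hypothetical and not part of the connector, and the option names in the usage example are placeholders.

```python
OLD_PREFIX = "spark.datasource.memsql."
NEW_PREFIX = "spark.datasource.singlestore."


def migrate_config_keys(conf):
    """Return a copy of conf with the old memsql prefix replaced.

    Hypothetical helper for illustration. Keys that do not start with
    the old prefix are copied unchanged; values are never modified.
    """
    migrated = {}
    for key, value in conf.items():
        if key.startswith(OLD_PREFIX):
            key = NEW_PREFIX + key[len(OLD_PREFIX):]
        migrated[key] = value
    return migrated


# Usage with placeholder option names and values (assumptions):
old_conf = {
    "spark.datasource.memsql.ddlEndpoint": "host:3306",
    "spark.datasource.memsql.user": "admin",
    "spark.sql.shuffle.partitions": "8",
}
new_conf = migrate_config_keys(old_conf)
```

Only the prefix is rewritten here; whether each individual option keeps the same name and meaning across versions should be checked against the connector's configuration reference.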
Last modified: August 29, 2024