SingleStore and Spark
How are SingleStore and Apache Spark related?
SingleStore and Apache Spark are both distributed, in-memory technologies.
What are the differences between SingleStore and Spark SQL?
- Spark SQL treats datasets (RDDs) as immutable: there is currently no concept of an INSERT, UPDATE, or DELETE. You could express these concepts as a transformation, but this operation returns a new RDD rather than updating the dataset in place. In contrast, SingleStore is an operational database with full transactional semantics.
- SingleStore supports updatable relational database indexes. The closest analogue in Spark is IndexRDD, which is currently under development and provides updatable key/value indexes.
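The first difference above can be illustrated with a plain-Python analogy. This is only a sketch of the "update as a transformation" idea and does not use the Spark or SingleStore APIs; the Row type, the with_status helper, and the sample data are all hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Tuple


class Row(NamedTuple):
    """Hypothetical record type; tuples are immutable, like an RDD's contents."""
    id: int
    status: str


def with_status(rows: Tuple[Row, ...], row_id: int, status: str) -> Tuple[Row, ...]:
    """Express an UPDATE as a transformation: build and return a new dataset."""
    return tuple(r._replace(status=status) if r.id == row_id else r for r in rows)


orders = (Row(1, "new"), Row(2, "new"))
shipped = with_status(orders, 2, "shipped")

print(orders[1].status)   # new      (the original dataset is unchanged)
print(shipped[1].status)  # shipped  (the change exists only in the new dataset)
```

In an operational database, the equivalent UPDATE statement would instead modify the stored row in place.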
You can connect SingleStore to Spark with the SingleStore Spark Connector.
SQL Push Down
What happens if SQL push down fails?
The SingleStore Spark Connector takes a best-effort approach to query push down.
How can I check to see if a query is pushed down?
Every DataFrame has a method called .explain(), which prints the final plan before execution.
If the plan shows MemSQLPhysicalRDD, then the DataFrame has been fully pushed down.
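As a rough sketch, the check can be automated by capturing the plan that a DataFrame's explain() method prints and looking for the MemSQLPhysicalRDD node. The helper names below (plan_text, mentions_pushdown) and the FakeDataFrame stand-in are hypothetical, and the code assumes explain() writes the plan through Python's standard output, as PySpark's DataFrame.explain() does. Treating any mention of the node as "pushed down" is a simplifying assumption, not the connector's own definition.

```python
import contextlib
import io

PUSHDOWN_MARKER = "MemSQLPhysicalRDD"  # node name taken from the text above


def plan_text(df) -> str:
    """Capture whatever df.explain() prints and return it as a string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        df.explain()
    return buffer.getvalue()


def mentions_pushdown(df, marker: str = PUSHDOWN_MARKER) -> bool:
    """Assumed heuristic: a plan that names the marker counts as pushed down."""
    return marker in plan_text(df)


class FakeDataFrame:
    """Hypothetical stand-in so the sketch runs without a Spark session."""

    def __init__(self, plan: str) -> None:
        self._plan = plan

    def explain(self) -> None:
        print(self._plan)


# Made-up plan strings for illustration only; real plans look different.
pushed = FakeDataFrame("MemSQLPhysicalRDD (made-up plan text)")
not_pushed = FakeDataFrame("Project (made-up plan text)")

print(mentions_pushdown(pushed))      # True
print(mentions_pushdown(not_pushed))  # False
```

With a real Spark session, the same helper could be called on an actual DataFrame in place of FakeDataFrame.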
What SQL push downs are not supported?
We are constantly improving push down, so the best approach is to try your query and then use .explain() to check whether it was pushed down.
Last modified: January 10, 2023