SingleStore DB

SQL Pushdown

The singlestore-spark-connector has extensive support for rewriting Spark SQL and DataFrame operation query plans into standalone SingleStore queries. This allows most of the computation to be pushed into the SingleStore distributed system without any manual intervention. The SQL rewrites are enabled automatically but can be disabled using the disablePushdown option. We also support partial pushdown for queries where some parts can be evaluated in SingleStore and the rest must be evaluated in Spark.
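For example, here is a minimal sketch of reading the same table with and without pushdown. The endpoint, credentials, and the mydb.reviews table are placeholders; consult the connector documentation for the full set of connection options.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical connection settings; replace with your own cluster details.
val spark = SparkSession.builder()
  .appName("singlestore-pushdown")
  .config("spark.datasource.singlestore.ddlEndpoint", "singlestore-host:3306")
  .config("spark.datasource.singlestore.user", "admin")
  .config("spark.datasource.singlestore.password", "secret")
  .getOrCreate()

// Default behavior: SQL rewrites are enabled, so as much of the plan as
// possible is compiled into a single SingleStore query.
val df = spark.read
  .format("singlestore")
  .load("mydb.reviews")

// The same read with pushdown disabled: SingleStore only scans the table,
// and the rest of the plan is evaluated by Spark.
val dfNoPushdown = spark.read
  .format("singlestore")
  .option("disablePushdown", "true")
  .load("mydb.reviews")
```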

SQL Pushdown is either enabled or disabled on the entire Spark Session. If you want to run multiple queries in parallel with different values of disablePushdown, make sure to run them on separate Spark Sessions.
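For instance, Spark's newSession() creates sibling sessions that share one SparkContext but keep separate runtime configuration, so each can carry its own pushdown setting. This is a sketch: the mydb.events table is a placeholder, and it assumes the connector picks up the spark.datasource.singlestore.* prefix from the session conf.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().getOrCreate()

// Two sibling sessions: same SparkContext, independent runtime confs.
val pushdownSession = spark.newSession()
val noPushdownSession = spark.newSession()
noPushdownSession.conf.set("spark.datasource.singlestore.disablePushdown", "true")

// These two reads can now run in parallel with different pushdown behavior.
val pushed = pushdownSession.read.format("singlestore").load("mydb.events")
val notPushed = noPushdownSession.read.format("singlestore").load("mydb.events")
```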

We currently support most of the primary Logical Plan nodes in Spark SQL including: Project, Filter, Aggregate, Window, Join, Limit, Sort.
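As an illustration, every step in the chain below maps to one of those plan nodes, so the whole query is a candidate for rewriting into a single SingleStore statement. The mydb.orders table and its columns are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().getOrCreate()

val topCustomers = spark.read
  .format("singlestore")
  .load("mydb.orders")
  .filter(col("status") === "shipped")           // Filter
  .groupBy(col("customer_id"))                   // Aggregate
  .agg(sum(col("total")).as("revenue"))
  .select(col("customer_id"), col("revenue"))    // Project
  .orderBy(desc("revenue"))                      // Sort
  .limit(10)                                     // Limit

// When the rewrite succeeds, the physical plan shows a single
// SingleStore relation instead of separate Spark operators.
topCustomers.explain()
```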

We also support most Spark SQL expressions. A full list of supported operators/functions can be found in the file ExpressionGen.scala.
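For example, scalar expressions such as string and date functions can also be translated into the generated SingleStore SQL, subject to the list in ExpressionGen.scala. The column names below are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().getOrCreate()

// upper() and datediff() are ordinary Spark SQL expressions; when the
// connector knows how to translate them, they run inside SingleStore.
val enriched = spark.read
  .format("singlestore")
  .load("mydb.users")
  .select(
    upper(col("name")).as("name_upper"),
    datediff(current_date(), col("created_at")).as("account_age_days")
  )
```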

The best place to look for examples of fully supported queries is in the tests. Check out this file as a starting point: SQLPushdownTest.scala.

SQL Pushdown Incompatibilities
  • ToUnixTimestamp and UnixTimestamp handle only timestamps earlier than 2038-01-19 03:14:08 when given a DateType or TimestampType as the first argument

  • FromUnixTime with the default format (yyyy-MM-dd HH:mm:ss) handles only values less than 2147483648 (2^31) seconds; see the boundary check after this list
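Both limits stem from the same 32-bit boundary: 2^31 seconds past the Unix epoch falls exactly at 2038-01-19 03:14:08 UTC, which this small check confirms.

```scala
import java.time.Instant

// 2^31 seconds after the Unix epoch: the classic 32-bit overflow point.
val boundary = Instant.ofEpochSecond(2147483648L)
println(boundary) // 2038-01-19T03:14:08Z
```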