# SQL Pushdown

The `singlestore-spark-connector` has extensive support for rewriting Spark SQL and DataFrame operation query plans into standalone SingleStore queries. This allows most of the computation to be pushed into the SingleStore distributed system without any manual intervention. The SQL rewrites are enabled automatically, but they can be disabled using the `disablePushdown` option. SingleStore also support partial pushdown where certain parts of a query can be evaluated in SingleStore and certain parts need to be evaluated in Spark.

SQL Pushdown is either enabled or disabled on the entire Spark Session. If you want to run multiple queries in parallel with different `disablePushdown` values, run them on separate Spark Sessions.

We currently support most of the primary Logical Plan nodes in Spark SQL including `Project`, `Filter`, `Aggregate`, `Window`, `Join`, `Limit`, and `Sort`.

We also support most Spark SQL expressions. A full list of supported operators/functions can be found in the [ExpressionGen.scala](https://github.com/memsql/singlestore-spark-connector/blob/master/src/main/scala/com/singlestore/spark/ExpressionGen.scala) file.

The best place to look for examples of fully supported queries is in the tests. Check out this file as a starting point: [SQLPushdownTest.scala](https://github.com/memsql/singlestore-spark-connector/blob/master/src/test/scala/com/singlestore/spark/SQLPushdownTest.scala).

## SQL Pushdown Incompatibilities

* `ToUnixTimestamp` and `UnixTimestamp` handle only time less than `2038-01-19 03:14:08`, if they get `DateType` or `TimestampType` as a first argument.
* `FromUnixTime` with default format (`yyyy-MM-dd HH:mm:ss`) handle only time less than `2147483648` (`2^31`).
* `DecimalType` is truncated on overflow (by default, Spark either throws an exception or returns `null`).
* `greatest` and `least` return `null` if at least one argument is `null` (in Spark these functions skip nulls).
* When a value can not be converted to a numeric or fractional type, SingleStore returns 0 (Spark returns `null`).
* `Atanh(x)`, for x ∈ (-∞, -1] ∪ \[1, ∞), returns null (Spark returns `NaN`).
* When a string is cast to a numeric type, SingleStore accepts its prefix if it is numeric (Spark returns `null` if the whole string is not numeric).
* When a numeric type is cast to a smaller one (in size), SingleStore truncates it. For example, 500 cast to the `Byte` will be 127.

  **Note**: Spark optimizer can optimize casts for literals and then the behaviour for literals matches custom Spark behaviour.
* When a fractional type is cast to an integral type, SingleStore rounds it to the nearest integer.
* `Log` returns `null` instead of `NaN`, `Infinity`, or `-Infinity`.
* `Round` rounds down if the number to be rounded is followed by 5 and it is `DOUBLE` or `FLOAT` (`DECIMAL` is rounded up).
* `Conv` works differently if the number contains non-alphanumeric characters.
* `ShiftLeft`, `ShiftRight`, and `ShiftRightUnsigned` convert the value to an `UNSIGNED BIGINT` and then produce the shift. In case of an overflow, it returns `0 (1 << 64 = 0 and 10>>20 = 0)`.
* `BitwiseGet` returns 0 when the bit position is negative or exceeds the bit upper limit.
* `Initcap` defines a letter as the beginning of a word even if it is enclosed in quotation marks, brackets, etc. For example "dear sir/madam (miss)" is converted to "Dear Sir/Madam (Miss)".
* `Skewness(x)`, in Spark 3.0, for `STD(x) = 0` returns `null` instead of `NaN`.

***

Modified at: July 27, 2022

Source: [/db/v9.1/load-data/integrate-with-singlestore/load-data-from-spark/sql-pushdown/](https://docs.singlestore.com/db/v9.1/load-data/integrate-with-singlestore/load-data-from-spark/sql-pushdown/)

(An index of the documentation is available at /llms.txt)
