Data Type Conversions
When saving a DataFrame to SingleStoreDB, each DataFrame column type is converted to the corresponding SingleStoreDB type (a write sketch follows the table):
| Spark Type | SingleStoreDB Type |
|---|---|
| LongType | BIGINT |
| IntegerType | INT |
| ShortType | SMALLINT |
| FloatType | FLOAT |
| DoubleType | DOUBLE |
| ByteType | TINYINT |
| StringType | TEXT |
| BinaryType | BLOB |
| DecimalType | DECIMAL |
| BooleanType | TINYINT |
| TimestampType | TIMESTAMP(6) |
| DateType | DATE |
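For example, a minimal write sketch; the endpoint, credentials, database, and table names below are placeholders, not values from this documentation:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical connection settings; replace with your own deployment's values.
val spark = SparkSession.builder()
  .appName("singlestore-write-example")
  .config("spark.datasource.singlestore.ddlEndpoint", "singlestore-host:3306")
  .config("spark.datasource.singlestore.user", "admin")
  .config("spark.datasource.singlestore.password", "password")
  .getOrCreate()

import spark.implicits._

// Per the table above, these LongType, StringType, and DoubleType columns
// are written as BIGINT, TEXT, and DOUBLE respectively.
val df = Seq((1L, "alice", 3.14), (2L, "bob", 2.72)).toDF("id", "name", "score")

df.write
  .format("singlestore")
  .mode("overwrite")
  .save("test_db.people") // "database.table"
```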
When reading a SingleStoreDB table as a Spark DataFrame, each SingleStoreDB column type is converted to the corresponding Spark type (a read sketch follows the table):
| SingleStoreDB Type | Spark Type |
|---|---|
| TINYINT | ShortType |
| SMALLINT | ShortType |
| INT | IntegerType |
| BIGINT | LongType |
| DOUBLE | DoubleType |
| FLOAT | FloatType |
| DECIMAL | DecimalType |
| TIMESTAMP | TimestampType |
| TIMESTAMP(6) | TimestampType |
| DATE | DateType |
| TEXT | StringType |
| JSON | StringType |
| TIME | TimestampType |
| BIT | BinaryType |
| BLOB | BinaryType |
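Continuing the sketch above, reading the table back applies the conversions in this second table, so the `BIGINT`, `TEXT`, and `DOUBLE` columns come back as `LongType`, `StringType`, and `DoubleType`:

```scala
// Reuses the SparkSession configured in the write sketch above.
val people = spark.read
  .format("singlestore")
  .load("test_db.people")

// The printed schema shows LongType, StringType, and DoubleType columns.
people.printSchema()
```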
Data Type Conversion Remarks
When using the `onDuplicateKeySQL` option, the connector returns an error when writing a null-terminated `StringType` value (i.e., a string containing `\0`).
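A sketch of the option, reusing the `df` and session from the write sketch above; the update clause here is illustrative:

```scala
// Upsert-style write: on a duplicate key, update score instead of failing.
// A StringType value containing "\0" would make this write error out.
df.write
  .format("singlestore")
  .option("onDuplicateKeySQL", "score = VALUES(score)")
  .mode("append")
  .save("test_db.people")
```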
`DECIMAL` in SingleStoreDB and `DecimalType` in Spark have different maximum scales and precisions. A read or write fails if a table or DataFrame uses an unsupported precision or scale: SingleStoreDB's maximum scale for the `DECIMAL` data type is 30, while Spark's maximum scale for `DecimalType` is 38. Similarly, SingleStoreDB's maximum precision is 65, while Spark's maximum precision is 38.
`TIMESTAMP` in SingleStoreDB supports values from 1000 to 2147483647999. SingleStoreDB treats a null value in a `TIMESTAMP` column as the current time.

The Avro format does not support writing `TIMESTAMP` and `DATE` types. As a result, the SingleStore Spark Connector currently does not support these types with Avro serialization.
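If your connector version exposes the `loadDataFormat` option (an assumption in this sketch; CSV is the usual default), this limitation means a DataFrame written with Avro serialization must not contain `TimestampType` or `DateType` columns:

```scala
// Avro-serialized load; only safe for DataFrames without TIMESTAMP/DATE columns.
df.write
  .format("singlestore")
  .option("loadDataFormat", "Avro")
  .mode("append")
  .save("test_db.people")
```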