Understanding How Datatype Can Affect Performance
On this page
Comparing Mismatched Datatypes
Using comparisons between mismatched data types may degrade query performance and may use unsafe type conversions which can yield undesirable query results.EXPLAIN and information_ records for the query.
These warnings do not necessarily indicate a problem, and you may have valid reasons for comparing different data types.
When you see these warnings, you should check whether the data types are expected to be different.
These warnings can be seen in EXPLAIN <query>, information_, information_, and the PlanWarnings column of show plancache in textual form, as well as in EXPLAIN JSON <query>, information_, and information_ in JSON form.
Example
An example of these warnings is the following:
CREATE TABLE t (id VARCHAR(50), PRIMARY KEY (id));EXPLAIN SELECT * FROM t WHERE id = 123;
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| WARNING: Comparisons between mismatched datatypes which may involve unsafe datatype conversions and/or degrade performance. Consider changing the datatypes, or adding explicit typecasts. See https://docs.singlestore.com/docs/mismatched-datatypes for more information. |
| |
| WARNING: Comparison between mismatched datatypes: (`t`.`id` = 123). Types 'varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL' vs 'bigint(20) NOT NULL'. |
| |
| Gather partitions:all alias:remote_0 |
| Project [t.id] |
| Filter [t.id = 123] |
| TableScan db.t, PRIMARY KEY (id) |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+In this example, the query involves the expression t., where t. is a string field.'123' is converted to the number 123.
This comparison may be problematic for multiple reasons:
-
The mismatched data types may indicate a mistake in how the query is written or how the table is defined.
The query may behave differently than intended and yield undesired results. For example, all of the strings '123','123.,0' '0123', and'123abc'compare equal to the integer123, so while the query may be intended to retrieve a single row with the specifiedid, this equality comparison may yield multiple rows whoseidvalues all compare equal to123. -
The mismatched data types negatively impact the performance of the query.
If the field and constant were either both strings or both integers, the query plan would be able to use the index to efficiently seek to lookup the matching id.But because there are many possible string ids that match the number123, which do not come in any particular order in terms of string lexicographic order, the query cannot seek into the index and instead must scan all the rows of the table.
Data Type Conversion
The global sync variable data_ controls the way certain data conversions are performed.'7. level will fail the conversion of 'abc' to an integer value, while the '6. level will silently convert it to a NULL or 0.
The following table lists the data type conversion checks and behavior changes that are introduced with each data_ value.
|
data_ |
Data Type Conversion Checks |
|---|---|
|
|
|
|
|
|
|
|
|
It’s recommended to set the data_ variable to the highest available value for new application development.data_ can change the behavior of expressions in computed columns.
If a computed column value changes due to a change in data_; columnstore sorting, indexes, and sharding can become logically corrupted.data_ or sql_ changes.
Note
sql_ is persisted to the CREATE TABLE statement.sql_ that the table is created with, which may be different from the current sql_.
Handling invalid source database DATE and DATETIME values during Flow migrations
When using SingleStore Flow to migrate data from supported source databases into SingleStore, the source data may contain non-standard or invalid DATE or DATETIME values such as 0001-01-01 07:07:24 or 0000-12-01.data_ is set to a stricter setting such as 8., loading this data can fail with an error such as:
Error(PGT54): Table core_data.`mage_order` stage 3 error java.sql.SQLException: (conn=142441) Leaf Error (node-83201ac7-c278-4902-8df7-ab7902e5adb3-leaf-ag2-0.svc-83201ac7-c278-4902-a1f2-600f50d9b580:3306): Invalid DATE/TIME in type conversion for column 'col_datetime'
To work around this issue without modifying the source data, the compatibility level can be temporarily set to an earlier version such as 6.:
SET GLOBAL data_conversion_compatibility_level = '6.5';
For example, the following executes successfully:
SELECT @@data_conversion_compatibility_level;DROP TABLE IF EXISTS date_tbl;CREATE TABLE date_tbl (id INT PRIMARY KEY NOT NULL,d DATETIME);INSERT date_tbl VALUES (1, '0001-01-01 07:07:24.000000');SELECT * FROM date_tbl;
Important
By default, SingleStore recommends using the highest available data_ to enforce strict validation.
Best Practices to Avoid Data Corruption
-
Review the computed column expressions when changing
data_.conversion_ compatibility_ level -
Perform application upgrade tests.
For example, if a computed column contains corrupted data and you have to switch to a higher data_, you may need to drop the computed column before changing the level.
The following examples demonstrate how the behavior of expressions may change if data_ is changed.
Example 1
SET GLOBAL data_conversion_compatibility_level = '6.0';
SELECT DATE('2019-04-20') + 2;
+------------------------+
| DATE('2019-04-20') + 2 |
+------------------------+
| 20190422 |
+------------------------+Example 2
SET GLOBAL data_conversion_compatibility_level = '7.0';
SELECT DATE('2019-04-20') + 2;
+------------------------+
| DATE('2019-04-20') + 2 |
+------------------------+
| 2019-04-22 |
+------------------------+The following example shows the error message generated if a zero value is attempted to be inserted into a DATE, DATETIME, or DATETIME(6) column if the data_ is set to 7. or higher.
SET GLOBAL data_conversion_compatibility_level = '7.0';
CREATE TABLE dates (dt DATETIME);INSERT INTO dates VALUES ('0000-00-00');
ERROR 2363 (HY000): Leaf Error: Invalid DATE/TIME in type conversionSee the table on the Time and Date page for examples of zero values for DATE, DATETIME, and DATETIME(6) data types.
Last modified: October 18, 2021