Understanding How Datatype Can Affect Performance

Comparing Mismatched Datatypes

Using comparisons between mismatched data types may degrade query performance and may use unsafe type conversions which can yield undesirable query results. SingleStore will display a warning for queries with potentially problematic comparisons between mismatched data types in the EXPLAIN and information_schema records for the query.

These warnings do not necessarily indicate a problem, and you may have valid reasons for comparing different data types. But these data type mismatches are flagged to help you identify potential problems that you may not be aware of.

When you see these warnings, you should check whether the data types are expected to be different. You may wish to consider changing the data types of the fields or values involved. You may also wish to consider adding explicit type conversion operations, such as the cast operators or functions like STR_TO_DATE which convert between different types.

These warnings can be seen in EXPLAIN <query>, information_schema.plancache.plan_warnings, information_schema.mv_queries.plan_warnings, and the PlanWarnings column of show plancache in textual form, as well as in EXPLAIN JSON <query>, information_schema.plancache.plan_info, and information_schema.mv_queries.plan_info in JSON form.

Example

An example of these warnings is the following:

CREATE TABLE t (id VARCHAR(50), PRIMARY KEY (id));
EXPLAIN SELECT * FROM t WHERE id = 123;
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                                                                                |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| WARNING: Comparisons between mismatched datatypes which may involve unsafe datatype conversions and/or degrade performance. Consider changing the datatypes, or adding explicit typecasts. See https://docs.singlestore.com/docs/mismatched-datatypes for more information. |
|                                                                                                                                                                                                                                                                        |
| WARNING: Comparison between mismatched datatypes: (`t`.`id` = 123). Types 'varchar(50) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL' vs 'bigint(20) NOT NULL'.                                                                                                  |
|                                                                                                                                                                                                                                                                        |
| Gather partitions:all alias:remote_0                                                                                                                                                                                                                                   |
| Project [t.id]                                                                                                                                                                                                                                                         |
| Filter [t.id = 123]                                                                                                                                                                                                                                                    |
| TableScan db.t, PRIMARY KEY (id)                                                                                                                                                                                                                                       |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

In this example, the query involves the expression t.i = 123, where t.i is a string field. When a numeric value is compared to a string, the string is converted to a numeric value, e.g. the string '123' is converted to the number 123.

This comparison may be problematic for multiple reasons:

  • The mismatched data types may indicate a mistake in how the query is written or how the table is defined. The query may behave differently than intended and yield undesired results. For example, all of the strings '123', '123.0', '0123', and '123abc' compare equal to the integer 123, so while the query may be intended to retrieve a single row with the specified id, this equality comparison may yield multiple rows whose id values all compare equal to 123.

  • The mismatched data types negatively impact the performance of the query. If the field and constant were either both strings or both integers, the query plan would be able to use the index to efficiently seek to lookup the matching id. But because there are many possible string ids that match the number 123, which do not come in any particular order in terms of string lexicographic order, the query cannot seek into the index and instead must scan all the rows of the table.

Data Type Conversion

The global sync variable data_conversion_compatibility_level controls the way certain data conversions are performed. This variable can have the following possible values: '6.0', '6.5', '7.0'. Higher values introduce stricter checking of values and will error for conversions that worked at lower levels. For example, the '7.0' level will fail the conversion of 'abc' to an integer value, while the '6.0' level will silently convert it to a NULL or 0.

The following table lists the data type conversion checks and behavior changes that are introduced with each data_conversion_compatibility_level value. Note that the data type conversion checks introduced in a compatibility level are preserved in subsequent levels.

data_conversion_compatibility_level

Data Type Conversion Checks

6.0

  • The lowest data type conversion compatibility level.

  • Default value for SingleStore engine versions lower than 8.0.

6.5

  • Error on integer overflows and underflows in INSERT statements. This check applies to all INTEGER data types, including BIGINT, INT, MEDIUMINT, SMALLINT, TINYINT, BIT, and BOOL.

  • Error on string inputs to VARCHAR columns that are too short to store the data. This check is run against INSERT statements.

  • Error on string inputs to LONGBLOB, MEDIUMBLOB, BLOB, TINYBLOB, LONGTEXT, MEDIUMTEXT, TEXT, and TINYTEXT columns that are too short to store the data. This check is run against INSERT statements.

  • Error on invalid string to int conversion in INSERT statements. Only strings with valid characters will be converted to integers.

7.0

  • Error on string inputs to CHAR columns that are too short to store the data. This check is run against INSERT statements.

  • Error on inputs to BINARY columns that are too short to store the data. This check is run against INSERT statements.

  • Error on decimal inputs that do not match the size specified in the DECIMAL column definition. This check is run against INSERT statements.

  • Error on invalid or out-of-range date inputs to DATE, DATETIME, DATETIME(6), and YEAR columns in INSERT statements.

  • Errors on zero value date inputs into DATE, DATETIME, and DATETIME(6) columns in INSERT statements.

  • Arithmetic operations involving DATE and INT values are performed by converting the INT data into intervals (days) instead of converting both data types to DOUBLE.

It’s recommended to set the data_conversion_compatibility_level variable to the highest available value for new application development. For existing applications, it’s also recommended to use the highest available level, but it’s recommended that you test your application before deploying this change. In particular, changing the value of data_conversion_compatibility_level can change the behavior of expressions in computed columns.

If a computed column value changes due to a change in data_conversion_compatibility_level; columnstore sorting, indexes, and sharding can become logically corrupted. SingleStore does not recompile an existing plan when data_conversion_compatibility_level or sql_mode changes.

Note

sql_mode is persisted to the CREATE TABLE statement. Therefore, the computed column of a table always uses the sql_mode that the table is created with, which may be different from the current sql_mode.

Best Practices to Avoid Data Corruption

  • Review the computed column expressions when changing data_conversion_compatibility_level.

  • Review the computed column expressions when changing data_conversion_compatibility_level.

  • Perform application upgrade tests.

For example, if a computed column contains corrupted data and you have to switch to a higher data_conversion_compatibility_level, you may need to drop the computed column before changing the level. Once the level is increased, add the computed column to the table. Alternatively, if a persisted computed column is used in a columnstore key, you may have to create a new table and select data into this new table. After copying the data, drop the old table and rename the new table.

The following examples demonstrate how the behavior of expressions may change if data_conversion_compatibility_level is changed.

Example 1

SET GLOBAL data_conversion_compatibility_level = '6.0';
SELECT DATE('2019-04-20') + 2;
+------------------------+
| DATE('2019-04-20') + 2 |
+------------------------+
|               20190422 |
+------------------------+

Example 2

SET GLOBAL data_conversion_compatibility_level = '7.0';
SELECT DATE('2019-04-20') + 2;
+------------------------+
| DATE('2019-04-20') + 2 |
+------------------------+
| 2019-04-22             |
+------------------------+

The following example shows the error message generated if a zero value is attempted to be inserted into a DATE, DATETIME, or DATETIME(6) column if the data_conversion_compatibility_level is set to 7.0 or higher.

SET GLOBAL data_conversion_compatibility_level = '7.0';
CREATE TABLE dates (dt DATETIME);
INSERT INTO dates VALUES ('0000-00-00');
ERROR 2363 (HY000): Leaf Error: Invalid DATE/TIME in type conversion

See the table on the Time and Date page for examples of zero values for DATE, DATETIME, and DATETIME(6) data types. This page also discusses how date (and time) values are implicitly converted from string literals (e.g., '2023-02-03') and how to use date and time functions to perform explicit conversions from string literals.

Last modified: October 18, 2021

Was this article helpful?