Choosing Rowstore Keys

You may define indexes, also called keys, on SingleStore rowstore tables. SingleStore uses these keys to efficiently find specific rows.

There are two storage types for rowstore indexes: a lockfree skiplist and a lockfree hash table. In both cases, we use lockfree data structures to optimize the performance of concurrent updates to the table.

  • By default, indexes are stored as skiplists, which have similar functional and performance characteristics as B-trees in other databases. A skiplist is a data structure optimized for ordered data that stores rows in collections of increasingly smaller ordered lists. Queries can quickly seek data by binary searching using the different sized lists and can quickly scan over ranges of data by iterating over the largest list. For multi-column indexes, query filters must match a prefix of the index column list to be able to take advantage of the index.

  • A hash table is a data structure optimized for fast lookups, which stores rows in a sparse array of buckets indexed by a hash function on the relevant columns. Queries can quickly find exact match data by examining only the bucket identified by the hash function, but cannot easily scan over a subset of the table. For multi-column indexes, query filters must match all of the index columns to be able to take advantage of the index. Due to this inflexibility, we discourage the use of hash indexes. They should only be used when there is a demonstrated need and measurable benefit on your particular dataset and workload.

Another consideration when choosing an index is the overhead of adding another index. Each added index uses extra memory for the additional data structures – on average about 40 bytes per row – and slightly slows inserts due to the additional data structures that need to be updated.

Each rowstore table may have at most one primary key and optionally many secondary keys. Scans on the primary key are generally somewhat faster than on secondary keys. For example, if the data was inserted in primary key order, the rows would be in memory order for the primary key and have better cache locality for the primary key than for a secondary key.

For more information, refer to the following two resources:

Check Your Understanding

Q: For the table, CREATE ROWSTORE TABLE t(a INT, b INT, KEY (a, b)), will the query SELECT SUM(a) FROM t WHERE b = 3 benefit from the index?

A: No, since the only column in the filter list, b, is not a prefix of the key (a, b), the query cannot benefit from the index. The query SELECT SUM(a) FROM t WHERE a = 3 would be able to benefit from the index since a is a prefix of the key (a, b).

Last modified: February 22, 2023

Was this article helpful?