Choosing a Table Storage Type

Flash, SSD, or Disk-based Columnstore/Universal Storage	In-Memory Rowstore
Analytical workloads, but can handle operational/transactional workloads with relatively frequent updates and deletes	Demanding operational/transactional workloads with high update and delete rates
Fast inserts over a small or large number of rows	Fast inserts and updates over a small or large number of rows
Fast aggregations and table scans via sub-segment access	Random seek performance
Compression	`SPARSE` compression option

Conceptual Differences Between Columnstore and Rowstore

Column-oriented stores or columnstores (columnstore is also referred to as Universal Storage) treat each column as a unit. Columnstore tables sequentially store data segments for each column on the same physical location on disk. This enables two important capabilities. One capability is being able to scan only the columns needed for the query with good cache locality during the scan. The other capability is that columnstores lend themselves well to compression. For example, repeating and similar values can easily be compressed together.

In addition, columnstore tables utilize a feature called sub-segment access with allows quick retrieval of a row once its position is known via hash indexing. Columnstore tables can have multiple multi-column hash indexes. Finally, row-level locking allows many concurrent updates of multiple rows in a columnstore to proceed, without requiring transactions to wait.

Columnstores are usually batch-loaded for optimal performance, although small-batch and streaming insert to columnstores can also perform well. The combination of efficient batch loading and processing of analytic queries make columnstore highly suitable for data warehousing scenarios.

Columnstores can handle operational and transactional workloads with relatively frequent updates and deletes, but rowstore can perform better for a demanding operational and transactional workload with high update and delete rates. Column group indexes can be used on columnstore tables to speed up full-row retrievals and updates, particularly for wide tables.

In general, columnstore tables are effective when your workload matches the following descriptions as much as possible:

A large number of rows are scanned sequentially (i.e. millions of rows or >5% of the table)
Aggregation happens over only a few columns (e.g. <10 columns)

A simplified example of how columnstore data is physically stored is shown here:

Columnstore

ProductId
1
2
3
4

Color
Red x 2
Black
White

x 2 indicates that the value is repeated twice.

Price
10
20 x 3

x 3 indicates that the value is repeated three times.

Row-oriented stores or rowstores are the most common type of data stores used by relational databases. As the name suggests, a rowstore treats each row as a unit and stores all fields for a given row together in the same physical location. This results in very fast performance when running queries seek specific rows. This makes rowstores great for transactional workloads, where the database frequently selects, inserts, updates, and deletes individual rows, often referencing either most or all columns.

With rowstore, data is stored in lock-free indexes for great performance at high concurrency. Rowstore tables can use multiple indexes, allowing them to flexibly support many types of queries.

Finally, when rowstore tables are created with SPARSE compression, the tables are allowed to store more data in the same amount of RAM. Nullable, structured columns can use sparse data compression. The column data types include BIGINT, INT, MEDIUMINT, SMALLINT, TINYINT, DATE, DATETIME, TIME, TIMESTAMP, and VARCHAR.

Logical table and rowstore representation

ProductId	Color	Price
1	Red	10
2	Red	20
3	Black	20
4	White	20

Related Topics

Training: SingleStore Storage

Check Your Understanding

Q: Do columnstore tables use memory?

A: Absolutely – SingleStore uses the operating system disk buffer cache to cache segment files in memory. Good performance in columnstore can only be achieved when there is enough memory to cache the working set. In addition, columnstore tables use a rowstore buffer table as a special segment to batch writes to the disk.

Q: Do rowstore tables ever write to disk?

A: Yes – data in a rowstore is also written to the transaction log on disk so that it can be recovered when SingleStore is restarted.

Q: Are rowstore tables always faster than columnstore tables?

A: No, columnstore tables are faster on some workloads – if the workload is batch inserts and sequential reads (e.g. an analytical workload with lots of scans) a columnstore can be significantly faster.

Q: What if I need small high-concurrency updates and my data doesn’t fit in memory?

A: You can also use columnstore tables while setting hash index(es) on columns you regularly filter on. Rowstore tables could also be effective if you create them with SPARSE compression.