Reducing Memory Use by Rowstore Tables

If rowstore tables are using too much memory, there are a few things you can do (example statements follow the list):

  • Make sure all secondary indexes are actually needed. They are expensive: each secondary index costs 40 bytes per row.

  • Make sure columns that are never NULL are declared NOT NULL. Some types (integer types, for example) use an extra 4 bytes per nullable column to store the column's nullability state.

  • Avoid the CHAR data type. A CHAR(N) column takes 4*N bytes of storage because its charset is utf8mb4. Use VARCHAR instead.

  • If the workload uses DECIMAL and doesn’t need fixed-point math, use DOUBLE instead. SingleStore does its best to use the least amount of memory possible for DECIMAL, but a fixed-point representation fundamentally needs more storage space than a floating-point representation.

  • SingleStore’s memory allocators can become fragmented over time (especially if a large table is shrunk dramatically by deleting rows at random). Use OPTIMIZE TABLE <rowstore table> to compact the table.

    Warning

    Using OPTIMIZE TABLE causes query plans to be rebuilt, so it should not be run frequently. Further, the command is resource intensive and, for larger tables, can take some time to run. Use caution when running OPTIMIZE TABLE.
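
For example, the column-level advice above can be combined into a single table definition. The following is a minimal sketch with hypothetical table and column names, assuming the CREATE ROWSTORE TABLE syntax:

    -- Lean rowstore definition: NOT NULL wherever possible, VARCHAR
    -- instead of CHAR, and DOUBLE instead of DECIMAL when fixed-point
    -- math is not required.
    CREATE ROWSTORE TABLE events (
        id BIGINT NOT NULL,           -- NOT NULL avoids per-column nullability overhead
        country VARCHAR(2) NOT NULL,  -- VARCHAR instead of CHAR
        score DOUBLE NOT NULL,        -- DOUBLE instead of DECIMAL(18,6)
        PRIMARY KEY (id)
    );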
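
Dropping an unneeded secondary index and compacting a fragmented table are likewise single statements. This sketch reuses the hypothetical events table and assumes an index named country_idx:

    -- Drop a secondary index that no query actually uses; each
    -- secondary index costs 40 bytes per row.
    DROP INDEX country_idx ON events;

    -- Compact the table after heavy, scattered deletes. Per the warning
    -- above, this rebuilds query plans and is resource intensive.
    OPTIMIZE TABLE events;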

If you want more details on how much memory each table is using, use the following (example queries follow the list):

  • SHOW TABLE STATUS has a BuffMgr Memory Use column. This includes memory use for all the components listed in the rowstore allocator section above, broken down per table. If run on an aggregator, it shows how much memory the table uses across the entire cluster. If run on a leaf, it shows how much memory the table is using in the partition database you are connected to when you run the command.

  • INFORMATION_SCHEMA.TABLE_STATISTICS lists how much memory every sharded table is using on each partition in the cluster. This is a great way to track down data skew.
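
For example, to check per-table memory use for a single table (the LIKE filter follows MySQL-style syntax, and the table name is hypothetical):

    -- Look at the BuffMgr Memory Use column in the output. Run on an
    -- aggregator for cluster-wide use, or on a leaf for the current
    -- partition database.
    SHOW TABLE STATUS LIKE 'events';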
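
And a query along these lines can sort a table's partitions by memory use to expose skew. The column list is an assumption based on recent versions of the TABLE_STATISTICS view; adjust it to the columns your version exposes:

    -- Per-partition memory use for one table, largest first; a wide
    -- spread across partitions points to data skew.
    SELECT DATABASE_NAME, TABLE_NAME, ORDINAL, ROWS, MEMORY_USE
    FROM INFORMATION_SCHEMA.TABLE_STATISTICS
    WHERE TABLE_NAME = 'events'
    ORDER BY MEMORY_USE DESC;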
