# Columnstore Sizing Estimations

There are two possible methods for estimating the total memory used by a columnstore table. The first method is brief, while the second is more precise. For more information about how columnstore works, see [Columnstore](https://docs.singlestore.com/cloud/create-a-database/columnstore.md), and the other topics nested under it.

## Simple Method

For the simple method, start with calculating the average row size for your table by adding together the size of each column's data type from [Data Types](https://docs.singlestore.com/cloud/reference/sql-reference/data-types.md).  Multiply the average row size by the expected number of rows in your table to get the estimated uncompressed size.&#x20;

Next, estimate the compression ratio for your dataset. Columnstore compression typically shrinks the data size by 3-7x. Based on these estimations, it is possible to get a very rough estimate of storage requirements for columnstore tables.

## Complex Method

In the same way as the simple method, start with calculating the average row size for your table by adding together the size of each column's data type from [Data Types](https://docs.singlestore.com/cloud/reference/sql-reference/data-types.md).  Multiply the average row size by the expected number of rows in your table to get the estimated uncompressed size.

Then, to get an estimate that is more accurate than the base calculation demonstrated above, a more precise compression ratio is needed. Use the steps below to accomplish this by loading a large amount of representative data, then measuring the actual compression ratios achieved (by table, and by column).

1. Load a representative sample of data into a columnstore table. More data leads to more accurate measurements.

   1. For information about creating a columnstore table, see [Creating a Columnstore Table](https://docs.singlestore.com/cloud/create-a-database/creating-a-columnstore-table.md).

   2. To see options and procedures for loading data, see [Load Data](https://docs.singlestore.com/cloud/load-data.md) and the topics nested below it.

2. Run  [OPTIMIZE TABLE](https://docs.singlestore.com/cloud/reference/sql-reference/data-definition-language-ddl/optimize-table.md) on the table in order to ensure the data is optimized and on disk. Use caution when running this command, as it can impact the performance of all other workloads running in the workspace. For more information, see the linked topic.

3. Measure the compression ratios using  SQL queries as outlined below.

   1. By querying [COLUMNAR\_SEGMENTS](https://docs.singlestore.com/cloud/reference/information-schema-reference/segment-column-and-merge-status/columnar-segments.md).

      1. Compression ratio per table (compressed / uncompressed):
         ```sql
         SELECT
         database_name,
         table_name, 
         (1-SUM(compressed_size)/SUM(uncompressed_size)) AS ratio 
         FROM information_schema.columnar_segments 
         GROUP BY database_name, table_name 
         ORDER BY ratio DESC;
         ```

      2. Compression ratio per column (compressed / uncompressed):
         ```sql
         SELECT
         database_name,
         table_name,
         column_name,
         (1-SUM(compressed_size)/SUM(uncompressed_size)) AS ratio
         FROM information_schema.columnar_segments
         GROUP BY database_name, table_name, column_name
         ORDER BY ratio DESC;
         ```

## Additional Methods to Obtain Columnstore Table Statistics

The following query can be used to obtain the total disk space used by columnstore secondary indexes:

```sql
SELECT database_name, table_name, SUM(size) AS total_index_size 
FROM information_schema.MV_COLUMNSTORE_FILES 
WHERE file_type = "INVERTED_INDEX" OR file_type = "CROSS_SEGMENT_HASH_INDEX" GROUP BY 1,2;
```

For more information about the column descriptions, see [MV\_COLUMNSTORE\_FILES](https://docs.singlestore.com/cloud/reference/information-schema-reference/segment-column-and-merge-status/mv-columnstore-files.md).

## Additional Sizing Estimations for Columnstore Table Overhead

Beyond the variable size estimates explained above, columnstore tables also require space for a constant metadata overhead of:

* \~100 bytes of memory per segment per column.
* \~100 KB for each segment that has at least one deleted row.
* 7KB per column per partition for incremental cardinality statistics.

***

Modified at: February 7, 2025

Source: [/cloud/create-a-database/columnstore/columnstore-sizing-estimations/](https://docs.singlestore.com/cloud/create-a-database/columnstore/columnstore-sizing-estimations/)

(An index of the documentation is available at /llms.txt)
