Important
The SingleStore 9.1 release candidate (RC) gives you the opportunity to preview, evaluate, and provide feedback on new and upcoming features prior to their general availability. In the interim, SingleStore 9.0 is recommended for production workloads, which can later be upgraded to SingleStore 9.1.
APPROX_ COUNT_ DISTINCT
On this page
Aggregate function.
APPROX_ is useful if an exact result is not required.SELECT COUNT(DISTINCT <column>), which calculates the exact number of distinct values in a column of table, APPROX_ can run much faster and consume significantly less memory.
Syntax
APPROX_COUNT_DISTINCT ( expr [, expr ...] )Arguments
expr
One or more expressions evaluating to a scalar value.
expr is typically the name of a column.APPROX_ returns an estimate of the number of distinct values in the column.
Examples
The following example populates a column with one-hundred thousand unique integers and runs SELECT COUNT(DISTINCT <column>) and APPROX_ on the column so you can compare the results.
DROP DATABASE IF EXISTS singlestore_docs_example;CREATE DATABASE singlestore_docs_example;USE singlestore_docs_example;DELIMITER //CREATE PROCEDURE populate_table() ASBEGINCREATE TABLE acd_t(a INT);FOR i IN 1..100000 LOOPINSERT INTO acd_t VALUES(i);END LOOP;END//DELIMITER ;CALL populate_table();
Get the exact count of distinct values in the column a:
SELECT COUNT(DISTINCT a) FROM acd_t;
+--------------------+
| COUNT(DISTINCT a) |
+--------------------+
| 100000 |
+--------------------+Get an estimate of the number of distinct values in the column a:
SELECT APPROX_COUNT_DISTINCT(a) FROM acd_t;
+--------------------------+
| APPROX_COUNT_DISTINCT(a) |
+--------------------------+
| 100498 |
+--------------------------+Using multiple expressions:
CREATE TABLE acd (a int, b int);INSERT INTO acd VALUES (1, 1), (1, 2), (1, 1), (2, 1), (2, 2), (2, 2);
SELECT APPROX_COUNT_DISTINCT(a, b) from acd;
+-----------------------------+
| APPROX_COUNT_DISTINCT(a, b) |
+-----------------------------+
| 4 |
+-----------------------------+
1 row in set (0.254 sec)A somewhat more practical example, using the following table and data:
CREATE TABLE acd_assets(asset_id int,asset_type varchar(50),asset_desc varchar(50),asset_value decimal(6,2));INSERT into acd_assets values('1049', 'laptop', 'mac_book_pro', '1999.00'),('49', 'cell_phone', 'iphone_12','879.00'),('1100', 'laptop', 'mac_book_pro','1999.00'),('2037', 'laptop', 'mac_book_air_M2','1199.00'),('58', 'cell_phone', 'iphone_12', '879.00'),('130', 'cell_phone', 'iphone_13', '699'),('210', 'laptop', 'mac_book_pro','2500.00'),('111', 'laptop', 'mac_book_pro','2100.00'),('099', 'laptop', 'mac_book_air_M1','999'),('140', 'cell_phone', 'iphone_13_pro','999.00');
SELECT approx_count_distinct(asset_id) AS approx_distinct_asset_idFROM acd_assets;
+-------------------------+
|approx_distinct_asset_id |
+-------------------------+
| 10 |
+-------------------------+The GROUP BY clause can be used with the APPROX_ function.GROUP BY returns the approximate total of each type of asset from the assets table.
SELECT asset_type, APPROX_COUNT_DISTINCT(asset_id) AS approx_distinct_asset_idFROM acd_assetsGROUP BY asset_type;
+------------+--------------------------+
|asset_type | approx_distinct_asset_id |
+------------+--------------------------+
| cell_phone | 4 |
| laptop | 6 |
+------------+--------------------------+Last modified: March 18, 2024