APPROX_COUNT_DISTINCT

Aggregate function. Returns an estimate of the number of distinct values in a collection.

APPROX_COUNT_DISTINCT is useful if an exact result is not required. Compared to SELECT COUNT(DISTINCT <column>), which calculates the exact number of distinct values in a column of table, APPROX_COUNT_DISTINCT can run much faster and consume significantly less memory.

Syntax

APPROX_COUNT_DISTINCT ( expr [, expr ...] )

Arguments

expr

One or more expressions evaluating to a scalar value.

expr is typically the name of a column. In this case, APPROX_COUNT_DISTINCT returns an estimate of the number of distinct values in the column.

Examples

The following example populates a column with one-hundred thousand unique integers and runs SELECT COUNT(DISTINCT <column>) and APPROX_COUNT_DISTINCT on the column so you can compare the results.

DROP DATABASE IF EXISTS singlestore_docs_example;
CREATE DATABASE singlestore_docs_example;
USE singlestore_docs_example;
DELIMITER //
CREATE PROCEDURE populate_table() AS
BEGIN
CREATE TABLE acd_t(a INT);
FOR i IN 1..100000 LOOP
INSERT INTO acd_t VALUES(i);
END LOOP;
END
//
DELIMITER ;
CALL populate_table();

Get the exact count of distinct values in the column a:

SELECT COUNT(DISTINCT a) FROM acd_t;
+--------------------+
| COUNT(DISTINCT a)  |
+--------------------+
|             100000 |
+--------------------+

Get an estimate of the number of distinct values in the column a:

SELECT APPROX_COUNT_DISTINCT(a) FROM acd_t;
+--------------------------+
| APPROX_COUNT_DISTINCT(a) |
+--------------------------+
|                   100498 |
+--------------------------+

Using multiple expressions:

CREATE TABLE acd (a int, b int);
INSERT INTO acd VALUES (1, 1), (1, 2), (1, 1), (2, 1), (2, 2), (2, 2);
SELECT APPROX_COUNT_DISTINCT(a, b) from acd;
+-----------------------------+
| APPROX_COUNT_DISTINCT(a, b) |
+-----------------------------+
|                           4 |
+-----------------------------+
1 row in set (0.254 sec)

A somewhat more practical example, using the following table and data:

CREATE TABLE acd_assets(
asset_id int,
asset_type varchar(50),
asset_desc varchar(50),
asset_value decimal(6,2)
);
INSERT into acd_assets values('1049', 'laptop', 'mac_book_pro', '1999.00'),
('49', 'cell_phone', 'iphone_12','879.00'),
('1100', 'laptop', 'mac_book_pro','1999.00'),
('2037', 'laptop', 'mac_book_air_M2','1199.00'),
('58', 'cell_phone', 'iphone_12', '879.00'),
('130', 'cell_phone', 'iphone_13', '699'),
('210', 'laptop', 'mac_book_pro','2500.00'),
('111', 'laptop', 'mac_book_pro','2100.00'),
('099', 'laptop', 'mac_book_air_M1','999'),
('140', 'cell_phone', 'iphone_13_pro','999.00');
SELECT approx_count_distinct(asset_id) AS approx_distinct_asset_id
FROM acd_assets;
+-------------------------+
|approx_distinct_asset_id |
+-------------------------+
| 10                      |
+-------------------------+

The GROUP BY clause can be used with the APPROX_COUNT_DISTINCT function. In this example GROUP BY returns the approximate total of each type of asset from the assets table.

SELECT asset_type, APPROX_COUNT_DISTINCT(asset_id) AS approx_distinct_asset_id
FROM acd_assets
GROUP BY asset_type;
+------------+--------------------------+
|asset_type  | approx_distinct_asset_id |
+------------+--------------------------+
| cell_phone | 4                        |
| laptop     | 6                        |
+------------+--------------------------+

Last modified: March 18, 2024

Was this article helpful?

Verification instructions

Note: You must install cosign to verify the authenticity of the SingleStore file.

Use the following steps to verify the authenticity of singlestoredb-server, singlestoredb-toolbox, singlestoredb-studio, and singlestore-client SingleStore files that have been downloaded.

You may perform the following steps on any computer that can run cosign, such as the main deployment host of the cluster.

  1. (Optional) Run the following command to view the associated signature files.

    curl undefined
  2. Download the signature file from the SingleStore release server.

    • Option 1: Click the Download Signature button next to the SingleStore file.

    • Option 2: Copy and paste the following URL into the address bar of your browser and save the signature file.

    • Option 3: Run the following command to download the signature file.

      curl -O undefined
  3. After the signature file has been downloaded, run the following command to verify the authenticity of the SingleStore file.

    echo -n undefined |
    cosign verify-blob --certificate-oidc-issuer https://oidc.eks.us-east-1.amazonaws.com/id/CCDCDBA1379A5596AB5B2E46DCA385BC \
    --certificate-identity https://kubernetes.io/namespaces/freya-production/serviceaccounts/job-worker \
    --bundle undefined \
    --new-bundle-format -
    Verified OK