APPROX_PERCENTILE

Syntax

Aggregate Function

SQL

APPROX_PERCENTILE (col | expr , percentage [, error_tolerance])

Window Function

SQL

APPROX_PERCENTILE (col | expr , percentage [, error_tolerance]) OVER
  ( [PARTITION BY (col | expr)] )

Arguments

col: A column of a numeric data type.
expr: An expression that evaluates to a numeric data type.
percentage: A numeric value between 0 and 1.
error_tolerance: A numeric value between 0 and 0.5 (not including 0).

Return Type

A DOUBLE data type.

Remarks

While calculating the percentile, APPROX_PERCENTILE ignores all the NULL values.
A percentage value of 0.5 will return the approximate median for the data set.
The default value for error_tolerance is 0.01. A higher error_tolerance value means a higher tolerance for error, hence lower accuracy.
APPROX_PERCENTILE can be used as a scalar aggregate function. It can also be used with a GROUP BY clause.

Note

If you require an exact value instead of an estimation, you should use the PERCENTILE_CONT or PERCENTILE_DISC functions.

Error Tolerance Behavior

The error threshold of APPROX_PERCENTILE depends directly on the size of data and the requested quantile. We expect the approximate estimation to differ from the exact estimation within a bounded range of [-0.01,0.01] for more than 100k rows. For example, for a quantile of 0.5 (50th percentile), you can expect the estimation value will be within the range of 0.49 or 0.51 deterministic/original quantile (+-0.01). Additionally, accuracy improves with larger datasets, and also as the percentage approaches the lower or upper bounds of the distribution (i.e., 1st and 99th percentile).

Examples

The following examples use the test_scores table:

SQL

CREATE TABLE test_scores(student_id INT, lastname TEXT, class TEXT, test1 decimal(5,2), test2 decimal(5,2));

INSERT INTO test_scores VALUES('304', 'Smith', 'sophomore','51.3', '64.1'),
('288', 'Hawkins', 'junior','76.9', '81.6'),
('614', 'Jones', 'senior','89.3', '75.9'),
('451', 'Brown', 'junior','68.4', '71'),
('643', 'Nova', 'junior','81', '84.4'),
('400', 'Miller', 'senior','90', '88'),
('690', 'Maxwell', 'sophomore','72', '74.5'),
('243', 'Anderson', 'junior','78.8', '86.3'),
('184', 'Williams', 'junior','82', '91'),
('111', 'Jackson', 'sophomore','72', '84.7');

Aggregate Function

To calculate the percentile rank using the approx_percentile function: two inputs are required: the column that will be used to determine the percentile value and the specific percentile rank to be calculated.

Example 1

The approx_percentile function is used to calculate the approximate percentile value of a specified column in a table.

SQL

SELECT student_id, class, test1, approx_percentile(test1, 0.1)
    OVER (PARTITION BY class)
    FROM test_scores
    ORDER BY class, student_id, test1;

+------------+-----------+-------+--------------------------------+
| student_id | lastname  | test1 | approx_percentile(test1, 0.1)  |
|            |           |       | OVER (PARTITION BY student_id) |
+------------+-----------+-------+--------------------------------+
|        184 | junior    | 82.00 |                      68.400000 |
|        243 | junior    | 78.80 |                      68.400000 |
|        288 | junior    | 76.90 |                      68.400000 |
|        451 | junior    | 68.40 |                      68.400000 |
|        643 | junior    | 81.00 |                      68.400000 |
|        400 | senior    | 90.00 |                      89.300000 |
|        614 | senior    | 89.30 |                      89.300000 |
|        111 | sophomore | 72.00 |                      51.300000 |
|        304 | sophomore | 51.30 |                      51.300000 |
|        690 | sophomore | 72.00 |                      51.300000 |
+------------+-----------+-------+--------------------------------+

Example 2

The SELECT clause includes two columns: class and the approximate median of test2 scores for each class. The APPROX_PERCENTILE function calculates the approximate median at the 50th percentile (0.5).

The GROUP BY clause groups the results by class. The query calculates the approximate median of the test2 scores for each unique class in the test_scores table.

SQL

SELECT class, APPROX_PERCENTILE(test2, 0.5) AS "approx_median"
    FROM test_scores GROUP BY class;

+-----------+---------------+
| class     | approx_median |
+-----------+---------------+
| senior    |     81.950000 |
| sophomore |     74.500000 |
| junior    |     84.400000 |
+-----------+---------------+

Window Function

The SELECT clause includes four columns: student_id, class, test1, and the 10th percentile of test1 within each class. The approx_percentile function calculates the approximate 10th percentile of the test1 column. The OVER clause is used to specify the window or subset of rows on which to perform the calculation. In this example, the window is defined by the PARTITION BY clause, which divides the data by class, so the 10th percentile is calculated for each distinct class in the test_scores table.

SQL

SELECT class, student_id, test1, APPROX_PERCENTILE(test1, 0.3)
    OVER (PARTITION BY class) AS "percentile" FROM test_scores;

+-----------+------------+-------+------------+
| class     | student_id | test1 | percentile |
+-----------+------------+-------+------------+
| senior    |        614 | 89.30 |  89.370000 |
| senior    |        400 | 90.00 |  89.370000 |
| sophomore |        690 | 72.00 |  59.580000 |
| sophomore |        304 | 51.30 |  59.580000 |
| sophomore |        111 | 72.00 |  59.580000 |
| junior    |        643 | 81.00 |  76.900000 |
| junior    |        243 | 78.80 |  76.900000 |
| junior    |        184 | 82.00 |  76.900000 |
| junior    |        288 | 76.90 |  76.900000 |
| junior    |        451 | 68.40 |  76.900000 |
+-----------+------------+-------+------------+

APPROX_PERCENTILE

On this page

Syntax

Aggregate Function

Window Function

Arguments

Return Type

Remarks

Error Tolerance Behavior

Examples

Aggregate Function

Example 1

Example 2

Window Function

Was this article helpful?

On this page

Was this article helpful?

APPROX_PERCENTILE

On this page

Syntax

Aggregate Function

Window Function

Arguments

Return Type

Remarks

Error Tolerance Behavior

Examples

Aggregate Function

Example 1

Example 2

Window Function

Related Topics

Was this article helpful?

On this page

Was this article helpful?