BM25
On this page
This function is used to employ BM25 partition-scoped scoring for VERSION 2
full-text search queries.
Partition level scoring improves accuracy across a larger set of data.MATCH
syntax are still scored on a segment level.
Refer to Working with Full-Text Search for more conceptual information on this function.
Syntax
The syntax for BM25 partition-scoped scoring, use the BM25()
function as follows:
BM25(<table_name>,'<column>:<term>')
The syntax for the BM25()
function for JSON columns is:
BM25(<table_name>,'<column>$<key.path>:<term>')
Examples
Obtaining a BM25 Score Example
CREATE TABLE scoring (a INT, b TEXT, c JSON, SORT KEY(a),FULLTEXT USING VERSION 2 scr_ft_index (b, c));
INSERT INTO scoring VALUES(1, 'one two three three four four four five five five five five','{"one": "two"}');INSERT INTO scoring VALUES(2, 'one two two three three three three four four four four four four four four','{"one": "three"}');OPTIMIZE TABLE scoring FLUSH;
SELECT a, ROUND(BM25(scoring,'b:one'), 3) AS score_one, ROUND(BM25(scoring,'b:two'), 3) AS score_two, ROUND(0.6*BM25(scoring,'b:three') + 0.4*BM25(scoring,'b:four'),3) AS score_boostedFROM scoring;
+-----+-----------+-----------+---------------+
| a | score_one | score_two | score_boosted |
+-----+-----------+-----------+---------------+
| 1 | 0.087 | 0.087 | 0.124 |
| 2 | 0.079 | 0.110 | 0.145 |
+-----+-----------+-----------+---------------+
To obtain BM25 scores over JSON documents at a keypath, use the following query:
SELECT a, ROUND(BM25(scoring, 'c$one:three'), 3) AS score_one FROM scoring;
+-----+-----------+
| a | score_one |
+-----+-----------+
| 1 | 0.000 |
| 2 | 0.131 |
+-----+-----------+
Related Topics
Last modified: June 3, 2024