Vector Range Search
On this page
Vector range search allows users to query for vectors which have a similarity score beyond a specified threshold.
As an example, consider a table, comments
, with the following definition:
CREATE TABLE comments(id INT,comment TEXT,comment_embedding VECTOR(4) not null,category VARCHAR(256));
The following query shows an example of a vector range search on the comments
table using a WHERE
clause to filter for embeddings with a score > 0.
.
The @query_
variable is cast to a VECTOR
to ensure that @query_
is a valid VECTOR
and to improve performance.
SET @query_vec = ('[0.44, 0.554, 0.34, 0.62]':>VECTOR(4));SELECT id, comment, category,comment_embedding <*> @query_vec AS scoreFROM commentsWHERE score > 0.7;
Vector range search queries can take advantage of a vector index if a vector index is built on the comment_
column of the comments
table.
Refer to Vector Indexing, Vector Type, Working with Vector Data, Hybrid Search - Re-ranking and Blending Searches , and Tuning Vector Indexes and Queries, for additional information on vector search in SingleStore.
Index Options, Hints, and Parameters
The index options, hints, and parameters for vector range search are largely the same as those for indexed vector search (Vector Indexing), with the following exceptions:
-
Use
RANGE_
instead ofSEARCH_ OPTIONS SEARCH_
to specify search options in queries.OPTIONS -
The search parameter
k
is not supported. -
The search parameter
nprobe
is not supported when using anIVF_
index.PQFS
Create Table and Indexes
The following table, index, and data are used in the examples.
The statement below creates a comments
table which represents a set of comments to be searched.TEXT
and embeddings of those comments as a VECTOR
with 4 elements.
CREATE TABLE comments(id INT,comment TEXT,comment_embedding VECTOR(4) not null,category VARCHAR(256));
Next populate the table.
INSERT INTO comments VALUES(1, "The cafeteria in building 35 has a great salad bar",'[0.45, 0.55, 0.495, 0.5]',"Food"),(2, "I love the taco bar in the B16 cafeteria.",'[0.01111, 0.01111, 0.1, 0.999]',"Food"),(3, "The B24 restaurant salad bar is quite good.",'[0.1, 0.8, 0.2, 0.555]',"Food");
Next, add a vector index of type IVF_
to the comment_
ALTER TABLE comments ADD VECTOR INDEX ivf_dp(comment_embedding)INDEX_OPTIONS'{"index_type":"IVF_PQFS"}';
Optionally optimize the table to enhance performance.
OPTIMIZE TABLE comments FULL;
Example 1 - Use Vector Range Search
The following example demonstrates the use of vector range search.
Suppose that the user likes salad and so the query is "restaurants with good salad.'[0.
.
The following query first creates a query vector (@query_
) using this vector.DOT_
(<*>
).ORDER BY
clause is included to ensure that the higher scores appear at the top of the results.
EUCLIDEAN_
(<->
) is also available for use with vector range search.
SET @query_vec = ('[0.44, 0.554, 0.34, 0.62]':>VECTOR(4));SELECT id, comment, category,comment_embedding <*> @query_vec AS scoreFROM commentsWHERE score > 0.7ORDER BY score DESC;
+------+----------------------------------------------------+----------+--------------------+
| id | comment | category | score |
+------+----------------------------------------------------+----------+--------------------+
| 1 | The cafeteria in building 35 has a great salad bar | Food | 0.9810000061988831 |
| 3 | The B24 restaurant salad bar is quite good. | Food | 0.8993000388145447 |
+------+----------------------------------------------------+----------+--------------------+
If a vector index has been added to the comment_
column in the table, as was done in this example, the vector index will be used in this query.
Example 2 - Identify the use of Vector Range Search
The EXPLAIN
or PROFILE
plan for the query can be used to tell if the vector index is being used in the query.
SET @query_vec = ('[0.44, 0.554, 0.34, 0.62]':>VECTOR(4));EXPLAINSELECT id, comment, category,comment_embedding <*> @query_vec AS scoreFROM commentsWHERE score > 0.7ORDER BY score DESC;
+----------------------------------------------------------------------------+
| EXPLAIN |
| |
+----------------------------------------------------------------------------+
| Gather partitions:all alias:remote_0 parallelism_level:segment |
| Project [comments.id, comments.comment, comments.category, |
| DOT_PRODUCT(comments.comment_embedding,@query_vec) AS score] |
| |
| ColumnStoreFilter [DOT_PRODUCT(comments.comment_embedding,@query_vec) > |
| (.7!:>double NULL) AND INTERNAL_VECTOR_RANGE_SEARCH(0, @query_vec, .7) |
| index] |
| |
| ColumnStoreScan dbname.comments, SORT KEY __UNORDERED () |
| table_type:sharded_columnstore |
+----------------------------------------------------------------------------+
The INTERNAL_
notation in the ColumnStoreFilter
row indicates that the plan uses a vector index for range search.
Example 3 - Use Index Hints
The syntax [USE {INDEX|KEY} ([<index_
can be used to specify a specific index to use when there are multiple vector indexes, or to disable the use of an index.
The query below specifies the use of the index named ivf_
.
SET @query_vec = ('[0.44, 0.554, 0.34, 0.62]' :>VECTOR(4));SELECT id, comment, category,comment_embedding <*> @query_vec AS scoreFROM commentsWHERE score > 0.7 USE INDEX (ivf_dp);
+------+----------------------------------------------------+----------+--------------------+
| id | comment | category | score |
+------+----------------------------------------------------+----------+--------------------+
| 1 | The cafeteria in building 35 has a great salad bar | Food | 0.9810000061988831 |
| 3 | The B24 restaurant salad bar is quite good. | Food | 0.8993000388145447 |
+------+----------------------------------------------------+----------+--------------------+
The use of empty parentheses ( )
in an index hint disables the use of vector indexes as is shown in the following query.
SET @query_vec = ('[0.44, 0.554, 0.34, 0.62]' :>VECTOR(4));SELECT id, comment, category,comment_embedding <*> @query_vec AS scoreFROM commentsWHERE score > 0.7 USE INDEX ();
+------+----------------------------------------------------+----------+--------------------+
| id | comment | category | score |
+------+----------------------------------------------------+----------+--------------------+
| 1 | The cafeteria in building 35 has a great salad bar | Food | 0.9810000061988831 |
| 3 | The B24 restaurant salad bar is quite good. | Food | 0.8993000388145447 |
+------+----------------------------------------------------+----------+--------------------+
As described previously, EXPLAIN
can be used to verify if a query does or does not use an index.
Example 4 - Use Index Building Parameters
The following query creates a second index, named ivf_index_
, metric_
, and nlist
index building parameters.
ALTER TABLE commentsADD VECTOR INDEX ivf_euclid(comment_embedding)INDEX_OPTIONS'{"index_type":"IVF_PQFS","metric_type":"EUCLIDEAN_DISTANCE","nlist":3}';
Example 5 - Use Index Search Parameters
Add a HNSW_
index to the comments
table.
ALTER TABLE comments ADD VECTOR INDEX hnsw(comment_embedding)INDEX_OPTIONS'{"index_type":"HNSW_FLAT"}';
The query below specifies that the hnsw
index with the search parameter ef
set to 120
should be used to evaluate the filter score > 0.
.USE_
clause and the RANGE_
clause must appear immediately after the filter (score > 0.
in this example) and USE_
must appear before RANGE_
.USE_
and RANGE_
clauses are associated with the filter score > 0.
.
SELECT id, comment, category,comment_embedding <*> @query_vec AS scoreFROM commentsWHERE (score > 0.7 USE INDEX (hnsw)RANGE_SEARCH_OPTIONS '{"ef" : 120 }');
+------+----------------------------------------------------+----------+--------------------+
| id | comment | category | score |
+------+----------------------------------------------------+----------+--------------------+
| 1 | The cafeteria in building 35 has a great salad bar | Food | 0.9810000061988831 |
| 3 | The B24 restaurant salad bar is quite good. | Food | 0.8993000388145447 |
+------+----------------------------------------------------+----------+--------------------+
Refer to Tuning Vector Indexes and Queries for more information on setting index building and search parameters.
Related Topics
Last modified: August 14, 2024