Tuning Vector Indexes and Queries

SingleStore supports indexed vector search which can be used to improve the performance of similarity searches over larger vector data sets. Similarity searches find a set of nearest neighbors to a query vector. Exact k-nearest neighbor search (kNN) finds the (exact) top k nearest neighbors. Indexed vector search finds an approximate top N nearest neighbors, and is known as Approximate Nearest Neighbor (ANN) search.

Vector indexes can significantly improve the speed of similarity search. However, the vector indexes perform ANN search and thus return approximate results and, in addition, require memory to store the indexes. There is a tradeoff between faster queries and, on the other side, approximate results and memory use.

To obtain the performance improvement from vector indexes, it is important to only use vector indexes in specific cases, pick the right index type, and set the index-building and search parameters to meet your application's needs.

For background on vector indexes, also refer to Vector Indexing, Vector Type, Working with Vector Data, and Hybrid Search - Re-ranking and Blending Searches.

Choose a Vector Index

Vector indexes can significantly improve the speed of query execution over larger vector data sets. This section addresses when to use an index and choosing a vector index to use.

When to Use a Vector Index

SingleStore recommends using vector indexes for searches over very large sets of vectors (e.g. billions of vectors) when there are no other filters in the query that can be used to reduce the number of vectors being searched. That is, if your data set is only millions of vectors, or if there is a WHERE clause (filter) in the query which will eliminate many of the rows in the table, you may have acceptable performance without using an index. As vector indexes use memory, if you can achieve acceptable performance without an index, SingleStore recommends not creating an index and using brute-force (scan) for your queries.

Types of Vector Indexes

SingleStore supports several types of vector indexes, but recommends using IVF_PQFS and HNSW_FLAT. Refer to Vector Indexing for more information on types of indexes and configuration parameters supported by SingleStore.

Example Table

The following example table represents product reviews from a web site. The table contains the text of the reviews as well as a vector embedding that captures the meaning of the review. Embeddings are a common use of vectors as is described in Working with Vector Data. The columns of the example table are:

  • id: an INT id

  • review: a TEXT field which stores the text of the comment

  • review_embedding: a VECTOR of 1024 dimensions which stores a vector embedding of the review

  • category: a VARCHAR which stores a category assigned to the review

Below is the SQL that creates the example reviews table.

CREATE TABLE reviews (
id INT OPTION 'Integer',
review TEXT OPTION 'SeekableString',
review_embedding VECTOR(1024) NOT NULL OPTION 'SeekableString',
category VARCHAR(256));

Note that the id column is qualified OPTION 'Integer', and the review and review_embedding columns are qualified as OPTION 'Seekable String'. The OPTION clauses specify how SingleStore should encode the data in those columns and will improve query performance.

The reviews table will be used to demonstrate creating and tuning indexes and to discuss advantages, disadvantages, and tradeoffs during index creation and tuning.

General Advantages and Disadvantages

There are several index types and a variety of configuration parameters. Vector indexes can be chosen and tuned to provide the accuracy, memory size, and search time that your application requires.

In general, IVF_PQFS has lower index build time and index size and HNSW_FLAT has lower search time and higher accuracy (recall). IVF_PQFS uses a fraction of the memory (~10x reduction or more) of HNSW and has a faster index build. The disadvantage of IVF_PQFS is potentially lower accuracy (recall) and longer search time; however, the recall and search time of IVF_PQFS can be improved by tuning the index building and search parameters as described in the section Tune IVF_PQFS Parameters below.

HNSW gives excellent performance (search time) and accuracy (recall); however, the size of an HNSW_FLAT index is the total size of the vectors. SingleStore keeps vector indexes in memory, so to use HNSW_FLAT, you will need to have enough memory to store the index.

As the size of an IVF_PQFS index is significantly smaller than an HNSW_FLAT index, tuning an IVF_PQFS index to give good performance and recall can be a good solution for many situations. SingleStore recommends IVF_PQFS unless your application requires high recall and very fast search speed; use HNSW_FLAT only if you cannot obtain good enough performance (search time) and accuracy (recall) by tuning IVF_PQFS.

Where to Start

This section is intended to provide a starting point for adding an index to your vector data. The section provides suggestions for initial parameter settings, selected performance pointers, and suggestions for testing index performance. The section assumes that the reviews table has been created and data has been loaded into that table. Creating the index after loading the data is more efficient than loading data after creating the index.

Create the Vector Index

SingleStore recommends starting with an IVF_PQFS index due to the reduced memory required to store IVF_PQFS indexes. When building the IVF_PQFS index, SingleStore recommends setting the index building parameter m to be the number of dimensions of the vector column divided by 4.

The following command creates such an index for the reviews table, where m is set to 256 (which is 1024/4) and nlist is set to 1024 and nprobe is set to 20. Refer to nlist and nprobe for more information on setting nlist and nprobe. The metric_type is set to DOT_PRODUCT, which is also the default.

The parameter m is an index building parameter that determines the number of subquantizers used in product quantization, a method for compressing vectors. The parameter nlist specifies the number of clusters created during index build. The parameter nprobe specifies the number of probes to be used at query time.

ALTER TABLE reviews
ADD VECTOR INDEX ivf_pqfs(review_embedding)
INDEX_OPTIONS'{"index_type":"IVF_PQFS",
"metric_type":"DOT_PRODUCT",
"m":256,
"nlist":1024,
"nprobe":20}';

After the index is loaded, run the OPTIMIZE command for best performance.

OPTIMIZE TABLE reviews FULL;

Perform a Similarity Search Query

This example searches for the top 5 reviews that are most similar to a particular phrase. The query below first sets up a @query_vec, which is a vector that represents the phrase to be searched on. In the example below, the @query_vec is created from a JSON string; more commonly the @query_vec would be obtained from an API or from querying a different table. Refer to Working with Vector Data for more information on vector embeddings.

The query below uses an ORDER BY … LIMIT query to look for the top 5 reviews that are most similar to @query_vec. DESC ordering is used in the ORDER BY clause as the metric used to create the vector index was DOT_PRODUCT.

SET @query_vec=('[0.002, 0.01, 0.001, ...]'):>VECTOR(1024);
SELECT id, review,
review_embedding <*> @query_vec AS score
FROM reviews
ORDER BY score USE INDEX (ivf_pqfs) SEARCH_OPTIONS '{"k":50}' DESC
LIMIT 5;

Key points to note about this query:

  1. The @query_vec variable is cast to a VECTOR to ensure that @query_vec is a valid VECTOR and to improve performance.

  2. The SEARCH_OPTIONS clause sets the search parameter k to 50. A setting of k equal to the the LIMIT value * 10 is a good starting point. In this example LIMIT is 5, so k is set to 5*10 = 50.

The USE INDEX clause in the query above specifies that the index named ivf_pqfs should be used in the query. USE INDEX is included for completeness, but is not necessary in this query. The index will be used regardless of whether the clause is included or not.

Verify a Vector Index is Being Used

It is important to verify that your query is using a vector index. A vector index can be used in the following conditions:

  • The metric in the query (dot product vs. euclidean distance) matches the metric used to create the index.

  • The order in the ORDER BY clause (DESC vs. ASC) in the search query matches the metric.

You can verify that a vector index is being used in a query with EXPLAIN as shown below.

SET @query_vec = ('[0.002, 0.01, 0.0001, ...]'):>VECTOR(1024);
EXPLAIN
SELECT id, review,
review_embedding <*> @query_vec AS score
FROM reviews
ORDER BY score USE INDEX (ivf_pqfs) SEARCH_OPTIONS '{"k":50}' DESC
LIMIT 5;
+----------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                    |
+----------------------------------------------------------------------------------------------------------------------------+
| Project [remote_0.id, remote_0.review, remote_0.score]                                                                     |
| TopSort limit:5 [remote_0.score DESC]                                                                                      |
| Gather partitions:all alias:remote_0 parallelism_level:segment                                                             |
| Project [reviews.id, reviews.review, DOT_PRODUCT(reviews.review_embedding,(@query_vec:>vector(1024, F32) NOT NULL)) AS score] |
| TopSort limit:5 [DOT_PRODUCT(reviews.review_embedding,(@query_vec:>vector(3, F32) NOT NULL)) DESC]                         |
| ColumnStoreFilter [INTERNAL_VECTOR_SEARCH(0, (@query_vec:>vector(1024, F32) NOT NULL), 5, '{\"k\" : 50 }') index]             |
| ColumnStoreScan dbname.reviews, SORT KEY __UNORDERED () table_type:sharded_columnstore                                     |
+----------------------------------------------------------------------------------------------------------------------------+

The ColumnStoreFilter line of the EXPLAIN output above shows INTERNAL_VECTOR_SEARCH(...) which indicates that the vector index is being used. If INTERNAL_VECTOR_SEARCH is not in the ColumnStoreFilter line, the vector index is not being used.

Test Index Performance

To understand if an index meets the needs of your application, you will want to test the query accuracy (recall) and search time.

Test Accuracy via Recall

As described previously, recall measures the accuracy of the results returned. Specifically, recall is the percentage of relevant results that are returned by the query. To check the recall of a query with a vector index, you need to run the query with and without the vector index and compare the results returned. The USE INDEX clause can be used for this purpose.

The query below contains an empty USE INDEX clause which will disable the use of the vector index. Note that the SEARCH_OPTIONS clause has also been removed.

SET @query_vec = ('[0.002, 0.01, 0.0001, ...]'):>VECTOR(1024);
SELECT id, review,
review_embedding <*> @query_vec AS score
FROM reviews
ORDER BY score USE INDEX() DESC
LIMIT 5;

To obtain the recall of the indexed query, perform the following steps:

  1. Run the query without the index using the empty USE INDEX() clause as shown above and note the results. This will give you the actual top N results to your query. (N=5 in the query above.)

  2. Run the query with the index and note the results. This gives you the top N results as determined by the indexed search. Remember that the indexed search returns approximate results.

  3. Determine how many of the top N results from the indexed query results are in the actual top N results. For example, if the query with the index returned 4 of the 5 actual results, the recall is 4 / 5 or 80%. Note that the index query will return 5 results. For determining recall, the question is how many of those 5 results are in the results from the query without the index.

Measure Search Time

When measuring search time, it is important to exclude the run time of the first query and to exclude the round-trip time between the client and server. The runtime of the first query includes query compile time and the time to load the vector index into memory and is not indicative of the performance of the query. The round-trip time includes communication between client and server and time for the server to package the result tuples and ship them to the client. Including round-trip time will make the search time artificially high.

Perform the following to measure search time. First run a warm-up query and ignore the results. Then run the query several times using PROFILE and take the average run time. PROFILE has limited overhead and is acceptable for initial performance testing. For more detailed testing see the How to Check Search Performance Section. The query below shows an example use of PROFILE.

SET @query_vec = ('[0.002, 0.01, 0.0001, ...]'):>VECTOR(1024);
PROFILE
SELECT id, review,
review_embedding <*> @query_vec AS score
FROM reviews
ORDER BY score USE INDEX (ivf_pqfs) SEARCH_OPTIONS '{"k":50}' DESC
LIMIT 5;

Tune IVF_PQFS Parameters

This section discusses tuning an IVF_PQFS index for good performance (search time) and accuracy (recall). IVF_PQFS has three index building parameters: m, nprobe, and nlist and one search parameter k. Each of these are discussed in turn. The ordering of the sections below indicates the recommended ordering of tuning. That is, when tuning, start with m, move on to k and nprobe, finally nlist.

m

The parameter m is an index building parameter that determines the number of subquantizers used in product quantization. Product quantization is used to compress high-dimensional vectors so they use less memory. The memory reduction in using product quantization is significant and is the reason that IVF_PQFS indexes use significantly less memory than HNSW_FLAT indexes.

The parameter m effectively trades off index size and accuracy of results. A higher m gives greater accuracy (recall) but the index size is larger. A lower value of m gives a smaller index, but the results will have lower accuracy. The parameter m must be set so that dimensions % m = 0 and m defaults to 32. Refer to Vector Indexing for details.

Consider values for m such as: dimensions/2, dimensions/4, dimensions/8. SingleStore recommends starting with dimensions/4.

k

The parameter k is an index search parameter which can be specified at query time. An example query using k is shown below.

The following query uses the reviews table defined above. As before, the @query_vec is a VECTOR representing a phrase the user is interested in and the query searches for reviews in the reviews table that are similar to @query_vec. Specifically, the query searches for the top five comments which are most similar to @query_vec.

SET @query_vec = ('[0.002, 0.01, 0.0001, ...]'):>VECTOR(1024);
SELECT id, review,
review_embedding <*> @query_vec AS score
FROM reviews
ORDER BY score USE INDEX (ivf_pqfs) SEARCH_OPTIONS '{"k":50}' DESC
LIMIT 5;

In this query, k determines the number of rows output by the vector index scan, that is k determines the number of rows that the vector index scan produces before the ORDER BY … LIMIT clause is applied. Larger values of k give more accurate results (better recall), but increase search times. The parameter k must be >= limit, where limit is from ORDER BY … LIMIT clause and the default for k is the LIMIT value.

SingleStore suggests starting with a k of 10 * LIMIT and adjusting to get the recall and search time needed for your application.

Note

A common cause of poor recall with IVF_PQFS is not setting a good value for k.

nprobe

nprobe is a search parameter which specifies the number of probes to be used at query time. During index build, vectors are clustered into nlist clusters and then the query search is done by searching nprobe of the nearest clusters. SingleStore recommends setting nprobe to 20; nprobe cannot be greater than nlist.

A query using a vector index will search an nprobe/nlist fraction of the data. Increasing nprobe will improve accuracy at the cost of increased query time.

nlist

nlist is an index-building parameter which specifies the number of inverted lists created when the index is built. That is, nlist controls the number of centroids generated by k-means clustering. SingleStore recommends starting with nlist = 1024; nlist is the last of the parameters you should try tuning.

If you fix nprobe and increase nlist, you get faster queries with worse accuracy; however, the results depend on data distribution and may not change monotonically. A general recommendation is to set nlist to be of the same order as the square root of the number of rows in a segment.

The following query finds the segment size in number of rows for the reviews table.

SELECT database_name, table_name,
AVG(rows_count) AS average_segment_size_in_rows
FROM information_schema.columnar_segments
WHERE database_name = 'dbname' AND table_name = 'reviews'
GROUP BY ALL;

Tune HNSW_FLAT Parameters

HNSW_FLAT indexes can be used when IFV_PQFS indexes cannot be tuned to meet application needs. HNSW_FLAT indexes use approximately the same amount of memory as the vectors themselves, be aware of the memory requirements of this type of index.

HNSW_FLAT has two index-building parameters, M and efConstruction, and one search parameter, ef.

For tuning HNSW_FLAT indexes, the primary parameter to tune is the search parameter, ef, which trades recall for performance. A larger ef gives better accuracy (higher recall), but slower search. That is, increasing ef increases exhaustiveness of graph search which gives better accuracy (recall) at the cost of slower search performance. SingleStore recommends starting with ef set to 120. If recall is too low, you may wish to increase ef.

SingleStore recommends starting with setting M to 12 and efConstruction to 120.

Approximate Vector Index Memory Use

Vector indexes are stored in memory in SingleStore and can be large. HSNW_FLAT indexes take the same amount of space as the vectors themselves. IVF_PQFS indexes use much less memory. Thus, understanding how much memory is taken by a vector index is important. The following example shows a method for approximating the amount of memory used by a vector index.

To approximate the amount of memory used by a vector index, calculate the amount of Malloc_active_memory used before and after the vector index is loaded into memory. Refer to SHOW STATUS EXTENDED for information on Malloc_active_memory.

Note

An improved method for determining the amount of memory used by a vector index will be provided in an upcoming release.

The following example shows how to perform (calculate) this approximation. The example uses the reviews table created above. To measure the memory usage of your index, replace the table and column names with your table and vector column names.

Important

To calculate the memory used by a vector index, the following steps must be done in order and the steps must be done on a system with minimal other activity. Malloc_active_memory is used by multiple operators in SingleStore so it is important that the system is dedicated to this memory calculation while it is running to avoid other uses of Malloc_active_memory that could impact this calculation.

1. Add a Vector Index

First, add a vector index on the table. This example uses a HNSW index of type HNSW_FLAT; create the type of index that you want to test.

ALTER TABLE reviews ADD VECTOR INDEX hnsw(review_embedding)
INDEX_OPTIONS '{"index_type":"HNSW_FLAT"}';

At this point the vector index has been built on disk, but has not been loaded into memory.

2. Calculate the Initial Value of Malloc_active_memory

Use the following query to get the value of Malloc_active_memory before the vector index is loaded into memory. This query sums Malloc_active_memory across all nodes in the cluster.

SELECT SUM (variable_value) INTO @m1
FROM information_schema.mv_global_status
WHERE variable_name = "Malloc_active_memory";

3. Load the Vector Index into Memory

To force the vector index to be loaded into memory, run a query that uses the vector index.

To do so, run the following two queries which will cause the vector index to be loaded into memory. The first query selects a vector from the reviews table and the second query uses that vector in a nearest neighbor search.

In the first query, it is necessary to select a vector from the reviews table to use as a query vector in the subsequent query (as opposed to using a random query vector) in order to force the vector index to be loaded into memory.

SET @query_vec =
(SELECT review_embedding
FROM reviews LIMIT 1);
SELECT @query_vec <*> review_embedding AS s
INTO @s2
FROM reviews
ORDER BY s
USE INDEX (hnsw)
DESC LIMIT 1;

4. Find the Approximate Memory Use

Re-run the sum query to get the value of Malloc_active_memory after the vector index has been loaded in memory. And take the difference between @m2 and @m1 to obtain the amount of memory used by the vector index.

SELECT SUM (variable_value) INTO @m2
FROM information_schema.mv_global_status
WHERE variable_name = "Malloc_active_memory";
SELECT @m2-@m1;

Indexed Vector Search with a Filter

Similarity search using a vector index can be combined with standard SQL filters. This type of query is referred to as filtered ANN search. When combining an indexed vector search with a filter, it is important to adjust the index search parameter, k , so that sufficient results are returned.

A query using filtered ANN search is shown below. This query searches for reviews that are similar to @query_vec using an ORDER BY … LIMIT query, and uses the filter WHERE category = "Food" to limit the results to those reviews in the category "Food.

SELECT id, review,
review_embedding <*> @query_vec AS score
FROM reviews
WHERE category = "Food"
ORDER BY score USE INDEX (hnsw) DESC
LIMIT 5;

This query uses the index named hnsw. When the query is run, the hnsw index is first scanned to find reviews that are similar to @query_vec, and then the category = "Food" filter is applied.

The k parameter, which controls the number of  candidates that the index scan will output, is not set in this query and defaults to the value of LIMIT, which is 5 in this example. Thus, 5 candidates are produced from the scan of the hnsw index. The filter (category = "Food") is then applied to these 5 candidates, which may yield less than 5 rows in the query result.

To ensure that there are 5 rows in the query result when the query contains a filter, the value of k must be increased. Increasing the value of k will increase the number of results returned from the index scan, which is also expected to increase the number of results returned from the query until the limit is reached.

When a filter is used with an ANN search, SingleStore recommends setting k to: (the value of k without a filter) / (selectivity of the filter). For example, assume that the filter category = "Food" has a selectivity of 25%, meaning that 25% of the rows in the reviews table are in the category "Food". To adjust for this filter, k would be adjusted to: (the initial value of k) / 25%, which is equal to (5/0.25), or 20.

The adjusted query is shown below, with k set to 20.

SELECT id, review,
review_embedding <*> @query_vec AS score
FROM reviews
ORDER BY score USE INDEX (hnsw) SEARCH_OPTIONS '{"k":20}' DESC
WHERE category = "Food"
LIMIT 5;

In this query, 20 candidates will be returned from the scan of the hnsw index, those 20 candidates will be filtered by category = "Food", and the top 5 of those candidates will be returned as the query result.

As demonstrated in the example above, increasing the value of k can lead to better results for ANN search queries with filters. If your filter is highly selective, consider running the query without an index.

While this example showed how to adjust the k parameter for HNSW_FLAT indexes in queries that contain a filter, queries with IVF_PQFS indexes and a filter need to also be adjusted. For both HNSW_FLAT and IVF_PQFS indexes, k should be adjusted to: (the value of k without filter) / (selectivity of the filter). A difference here is that:

  • For HNSW_FLAT indexes, SingleStore recommends letting k be the default when there is no filter.

  • For IVF_PQFS indexes, SingleStore recommends setting k to LIMIT*10 when there is no filter.

This is because IVF_PQFS compresses vectors while HNSW_FLAT does not. Therefore,  for IVF_PQFS, two adjustments are needed to k, one to adjust for the vector compression and one to adjust for selectivity.

Vector Type

Vector Indexing

Working with Vector Data

Hybrid Search - Re-ranking and Blending Searches

Last modified: August 14, 2024

Was this article helpful?