# Hybrid Search - Re-ranking and Blending Searches

Hybrid Search combines multiple search methods in a single query. Hybrid search can blend full-text search, which finds keyword matches, and vector search, which finds semantic matches. This allows search results to be (re-)ranked by a score that combines full-text and vector rankings.

Combining the syntactic (keyword-based) ranking from full-text search and the semantic ranking from vector search is often important. For example, if it is significant that a particular keyword or phrase appears in a text chunk, but you do not want to search on keyword or phrase alone and want to also include semantic (meaning-based) search.

Combining full-text and vector search allows you to combine the benefits of fine-grained control provided by full-text search with semantic vector search. Further, SingleStore has [Vector Indexing](https://docs.singlestore.com/db/v9.1/reference/sql-reference/vector-functions/vector-indexing.md) and [full-text indexing](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/working-with-full-text-search.md) which can be used to speed up a hybrid search.

## Reciprocal Rank Fusion

A common approach to blending results from two or more different rankings (e.g. keyword-based text search and semantic vector search) is reciprocal rank fusion (RRF). RRF scores based on ranking and does not directly rely on full-text search scores or vector similarity metric scores (e.g. dot product). RRF is thus not dependent on any particular score or magnitude range.

The RRF formula for `N` different ranked lists where `r` is the rank of an item in list `i`, is shown in the following.

![](https://images.contentstack.io/v3/assets/bltac01ee6daa3a1e14/bltd3784a50f66f592e/6a318b7e23176f000820733d/img_docs_reciprocal-rank-fusion-formula_inline-dark_smaller-1B9ylp.png)

In this formula, `k` is a smoothing factor which smooths out the decline in weights. Without a smoothing factor (`k=0`) RRF would give weight 1.0 to item ranked 1, weight 0.5 to item ranked 2 and so on. With a smoothing factor of `k=60`, the item ranked 1 is weighted 1/61 ≈ 0.0164 and the item ranked 2 is weighted 1/62 ≈ 0.0161, creating a much more gradual decline in weights.

You can further refine RRF by weighting each ranking differently. For example, you could weight full-text search results at 0.7 and vector search results at 0.3.

## Examples

It is possible, as shown in the following examples, to combine full-text and vector search rankings with RRF in a SQL query.

Example 1 shows an indexed hybrid search over data taken from Wikipedia pages. Example 2 provides an example of how hybrid search can combine full-text search, vector search, and a SQL filter.

In these examples, CTEs, the `RANK()` window function and general expressions are used to perform powerful hybrid search rankings. You can use user-defined functions (UDFs), such as SingleStore's [external functions](https://docs.singlestore.com/db/v9.1/reference/sql-reference/procedural-sql-reference/create-or-replace-external-function.md) or [Wasm UDFs](https://docs.singlestore.com/db/v9.1/reference/code-engine-powered-by-wasm/create-wasm-udfs.md), to create functions and queries that implement late interaction models, cross-encoders or other reranking techniques.

## Example 1 - Hybrid Search of Wikipedia Articles

SingleStore provides full-text search ([Working with Full-Text Search](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/working-with-full-text-search.md)) and [Vector Indexing](https://docs.singlestore.com/db/v9.1/reference/sql-reference/vector-functions/vector-indexing.md) which can be used to speed up a hybrid search. This example searches over a dataset of Wikipedia articles to show how these indexes can be used in hybrid searches.

SingleStore has created a dataset of [Wikipedia](https://en.wikipedia.org/wiki/Main_Page) articles. This dataset contains 160M vectors and associated paragraphs to simulate vector-based semantic search over all 6.7 million articles in Wikipedia. The dataset has real data for video games, and the rest of the data is mocked up. The full dataset, including how to load and generate the vectors, is described in this [blog on ANN search](https://www.singlestore.com/blog/singlestore-indexed-ann-vector-search/). The data set is available under the [Creative Commons Attribution-ShareAlike License 4.0](https://en.wikipedia.org/wiki/Wikipedia:Text_of_the_Creative_Commons_Attribution-ShareAlike_4.0_International_License).

The following example uses just the data for video games, which is approximately 41,000 vectors based on 1,800 articles from Wikipedia.

Use the following SQL commands to run this example.

Create a table to hold the vectors and paragraphs.

```sql
CREATE TABLE vecs(
id BIGINT(20),
     url TEXT DEFAULT NULL,
     paragraph TEXT DEFAULT NULL,
     v VECTOR(1536) NOT NULL,
     SHARD KEY(id), KEY(id) USING HASH
);
```

Load data into this table using the following pipeline.

```sql
-- since the bucket is open, you can leave the credentials clause as-is
CREATE OR REPLACE PIPELINE wiki_pipeline AS
load data S3
's3://singlestore-docs-example-datasets/wikipedia-video-game-data/video-game-embeddings.csv'
config '{"region":"us-east-1"}'
credentials '{"aws_access_key_id": "", "aws_secret_access_key": ""}'
SKIP DUPLICATE KEY ERRORS
INTO TABLE vecs
FORMAT csv
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\r\n';

START PIPELINE wiki_pipeline FOREGROUND;
```

Add a version 2 full-text index and a vector index on the table. Creating the indexes after data loading is more efficient in the index build process than creating indexes before data loading.

```sql
ALTER TABLE vecs ADD FULLTEXT USING VERSION 2 ft_para(paragraph);

ALTER TABLE vecs ADD VECTOR INDEX ivf_v(v)
    INDEX_OPTIONS '{"index_type":"IVF_FLAT"}';
```

Run the following command to optimize the index searches and to ensure the values above are indexed.

```sql
OPTIMIZE TABLE vecs FLUSH;
```

The following query searches for rows with information about Super Mario Kart, by using full-text search to search with paragraphs that match the string `super mario kart` and that have embedding vectors (`v`) that are similar to the embedding for the introduction to the Wikipedia Super Mario Kart page.

The variables are set prior to defining the query.

* `@ft_q` to hold the full-text match condition.
* `@v_mario_kart` to hold the vector embedding for the Mario Kart Wikipedia page.

At a high-level, the query is structured as follows:

* Two Common Table Expressions (CTEs) use indexed full-text and vector search to produce ordered lists of rows that best meet the full-text and vector search criteria.
* Two additional CTEs create ranked full-text and vector results.
* A combined score is calculated using the RRF formula, and the top 5 results are returned.

```sql
SET @ft_q = 'paragraph:"super mario kart"';
SET @v_mario_kart = (SELECT v FROM vecs
          WHERE URL = "https://en.wikipedia.org/wiki/Super_Mario_Kart" 
          ORDER BY id LIMIT 1);

WITH fts AS ( /* Find top full-text matches. */
  SELECT
    id,
    paragraph,
    MATCH(TABLE vecs) AGAINST(@ft_q) AS SCORE
  FROM vecs
    WHERE MATCH(TABLE vecs) AGAINST(@ft_q)
 ORDER BY SCORE desc
  LIMIT 200
),
vs AS ( /* Find top vector search matches. <*> is dot product. */
  SELECT
    id,
    paragraph,
    v <*> @v_mario_kart AS SCORE
  FROM vecs
  ORDER BY score DESC
  LIMIT 200
),
fts_ranked as (
  SELECT
    *,
    RANK() OVER (ORDER BY SCORE DESC) AS fts_rank
  FROM fts
),
vs_ranked as (
  SELECT
    *,
    RANK() OVER (ORDER BY SCORE DESC) as vs_rank
  FROM vs
)
SELECT
  *,
  0.7 * (1.0 / (NVL(fts_rank, 1000.0) + 60)) + 0.3 * (1.0 / (NVL(vs_rank, 1000.0) + 60)) as combined_score
FROM fts_ranked f FULL OUTER JOIN vs_ranked v ON f.id = v.id
ORDER BY combined_score DESC
LIMIT 5;

```

In more detail, this query uses the following structure:

* The first two CTEs in the query (`fts` and `vs`) create ordered lists of 200 rows that best match the full-text and vector search criteria.

  * The full-text search block (`fts`) uses inverted index search based on Java Lucene.
  * The vector search block (`vs`) uses approximate nearest neighbor (ANN) indexing.
* The next two CTEs use the `RANK` window function to create ranked lists (`fts_ranked` and `vs_ranked`) of these 200 rows.

  * The ranked lists are created in separate CTEs to allow the initial CTEs to use indexed search.
  * If the `RANK` calculation were pushed into the initial full-text and vector search CTEs (`fts` and `vs`), that would prevent those CTEs from using index search.
* The final query block uses a `FULL OUTER JOIN` to ensure that rows for items that appear in one ranked list but not the other are included in the results. Standard `INNER JOIN` would discard items that are not in both lists.
* The `SELECT` clause calculates the `combined_score` using RRF and returns the top 5 results with highest `combined_score`.

This query uses a smoothing factor of 60 and weights full-text results at 0.7 and vector search results at 0.3. `NULL` ranks — which can arise from the outer join — are given an artificially high rank of 1000 so they don't contribute meaningfully to the score.

Notice that values with high ranks rise to the top (as with the (2,1) rank pair), but a poor ranking on one sorted input doesn't prevent an item from participating in the results at a high level, e.g., the (3,11) pair ranks 3 overall, better than (5,8) which ranks at 4.

## Ranking Functions

RRF requires the use of a ranking function in the CTEs. Three ranking functions are available: `RANK()`, `DENSE_RANK()`, and `ROW_NUMBER()`. These ranking functions differ in how they handle ties.

Tied scores are common in full-text search, particularly for short or common query terms. When using RRF, it is important to consider how the ranking function manages ties.

> **📝 Note**: SingleStore recommends using `RANK()` for the ranking function when using RRF.

The following table describes the ranking functions and their usage.

| Function Name  | Functionality                                                                                                                                                         | Usage                                                      |
| -------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------- |
| `RANK()`       | Returns the rank of the current row as specified by the`ORDER BY`clause.Assigns the same rank to tied rows. Skips subsequent ranks to account for the number of ties. | When ties matter and gaps in ranks are acceptable.         |
| `DENSE_RANK()` | Returns the rank of the current row as specified by the`ORDER BY`clause.Assigns the same rank to tied rows. Does not skip any ranks.                                  | When ties matter and gaps in ranks are not acceptable.     |
| `ROW_NUMBER()` | Returns the rank of the current row as specified by the`ORDER BY`clause.The assignment of row numbers among ties is non-deterministic.                                | When a unique ranking number on each output row is needed. |

The following table shows an example of the results of the three functions.

| Score | `RANK()` | `DENSE_RANK()` | `ROW_NUMBER()` |
| ----- | -------- | -------------- | -------------- |
| 100   | 1        | 1              | 1              |
| 90    | 2        | 2              | 2              |
| 90    | 2        | 2              | 3              |
| 80    | 4        | 3              | 4              |

## Example 2 - Hybrid Search of Comments

This example demonstrates a hybrid search over a comments table. The example combines:

* A semantic search using vector search to find comments that relate to a user's request.
* Full-text search to prioritize comments with a specific keyword.
* A SQL filter that restricts to the "Food" category.

Consider a table of comments that contains:

* `id` – an `INT` id
* `comment` – a `TEXT` field which stores the text of the comment
* `comment_embedding` –  a `VECTOR`, which stores a vector embedding that captures the meaning of the comment as a vector, as described in [Working with Vector Data](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/working-with-vector-data.md).
* `category` – a `VARCHAR` which stores a category assigned to the comment

Create the `comments` table and insert three rows into that table.

```sql
CREATE TABLE comments(id INT,
   comment TEXT,
   comment_embedding VECTOR(4),
   category VARCHAR(256));

```

```sql
INSERT INTO comments VALUES
(1, "The cafeteria in building 35 has a great salad bar", 
    '[0.45, 0.55, 0.495, 0.5]',
    "Food"),
(2, "I love the taco bar in the B16 cafeteria.",
    '[0.01111, 0.01111, 0.1, 0.999]',
    "Food"),
(3, "The B24 restaurant salad bar is quite good.",
    '[0.1, 0.8, 0.2, 0.555]',
    "Food");
```

Add a [full-text](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/working-with-full-text-search.md) version 2 index to the table and run `OPTIMIZE TABLE` to ensure the values above are indexed.

```sql
ALTER TABLE comments ADD FULLTEXT USING VERSION 2 ft_comment(comment);

OPTIMIZE TABLE comments FLUSH;
```

The following SQL query searches for comments that match the word restaurant and are similar to a query vector (`'[0.44, 0.554, 0.34, 0.62]'`) which is intended to represent an embedding of the user's request. In addition, search is restricted to comments in the category `"Food"`.

The following variables are set prior to defining the query.

* `@query_comment` to hold the full-text match condition that the comment must include the word restaurant.
* `@query_vec` to hold the vector embedding from the user's request.

Similar to Example 1, the query is structured as follows:

* Two Common Table Expressions (CTEs) use indexed full-text and vector search to produce ordered lists of rows that meet the full-text and vector search criteria.

  * Both CTEs include only comments in the category `"Food"`.
* Two additional CTEs create ranked full-text and vector results.
* A combined score is calculated using the RRF formula, and the top 3 results are returned.

The `@query_vec` variable is cast to a `VECTOR` to ensure that `@query_vec` is a valid `VECTOR` and to improve performance.

```sql
SET @query_comment = 'comment:restaurant'; 
SET @query_vec = ('[0.44, 0.554, 0.34, 0.62]'):>VECTOR(4);

WITH fts AS ( /* Find top full-text matches. */
  SELECT
    id, 
    comment,
    MATCH(TABLE comments) AGAINST (@query_comment) AS ft_score
  FROM comments
  WHERE MATCH(TABLE comments) AGAINST(@query_comment)
  ORDER BY ft_score desc
  LIMIT 200
),
vs AS ( /* Find top vector search matches. <*> is dot product. */
  SELECT
    id,
    comment, 
    @query_vec <*> comment_embedding AS vec_score
  FROM comments
  WHERE category = "Food"
  ORDER BY vec_score desc
  LIMIT 200),
fts_ranked as (
  SELECT
    *,
    RANK() OVER (ORDER BY ft_score DESC) AS fts_rank
  FROM fts
),
vs_ranked as (
  SELECT
    *,
    RANK() OVER (ORDER BY vec_score DESC) as vs_rank
  FROM vs
)
SELECT
  *,
  0.7 * (1.0 / (NVL(fts_rank, 1000.0) + 60)) + 0.3 * (1.0 / (NVL(vs_rank, 1000.0) + 60)) AS combined_score
FROM fts_ranked f FULL OUTER JOIN vs_ranked v ON f.id = v.id
ORDER BY combined_score DESC
LIMIT 3;

```

```output

*** 1. row ***
            id: 3
       comment: The B24 restaurant salad bar is quite good.
      ft_score: 0.46706151962280273
      fts_rank: 1
            id: 3
       comment: The B24 restaurant salad bar is quite good.
     vec_score: 0.8993000388145447
       vs_rank: 2
combined_score: 0.016314
*** 2. row ***
            id: NULL
       comment: NULL
      ft_score: NULL
      fts_rank: NULL
            id: 1
       comment: The cafeteria in building 35 has a great salad bar
     vec_score: 0.9810000061988831
       vs_rank: 1
combined_score: 0.005578
*** 3. row ***
            id: NULL
       comment: NULL
      ft_score: NULL
      fts_rank: NULL
            id: 2
       comment: I love the taco bar in the B16 cafeteria.
     vec_score: 0.6644233465194702
       vs_rank: 3
combined_score: 0.005422
```

In the results, the row with `id = 3` has the word "restaurant" in it, so it gets a boost from the full-text score. As a result, it becomes the top-ranked item, ranked higher than the comment with `id = 1`. In contrast, if only the vector score was considered, the comment with `id = 1` would be ranked higher than the comment with `id = 3`.

## Related Topics

* Blog Post: [Why Your Vector Database Should Not Be a Vector Database](https://www.singlestore.com/blog/why-your-vector-database-should-not-be-a-vector-database/)
* Blog Post: [Announcing SingleStore Indexed ANN Vector Search](https://www.singlestore.com/blog/singlestore-indexed-ann-vector-search/)
* Blog Post: [Hybrid Search: Vector + Full-Text Search](https://www.singlestore.com/blog/hybrid-search-vector-full-text-search/)
* [Working with Vector Data](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/working-with-vector-data.md)
* [Working with Full-Text Search](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/working-with-full-text-search.md)
* [Tuning Vector Indexes and Queries](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/tuning-vector-indexes-and-queries.md)
* [Vector Type](https://docs.singlestore.com/db/v9.1/reference/sql-reference/data-types/vector-type.md)
* [Vector Indexing](https://docs.singlestore.com/db/v9.1/reference/sql-reference/vector-functions/vector-indexing.md)

***

Modified at: May 15, 2026

Source: [/db/v9.1/developer-resources/functional-extensions/hybrid-search-re-ranking-and-blending-searches/](https://docs.singlestore.com/db/v9.1/developer-resources/functional-extensions/hybrid-search-re-ranking-and-blending-searches/)

(An index of the documentation is available at /llms.txt)
