TEI Ranker

Compatible with Milvus 2.6.x

The TEI Ranker uses the Text Embedding Inference (TEI) service from Hugging Face to improve search relevance through semantic reranking. Instead of ordering results by vector similarity alone, it rescores the text of each candidate against the query text with a rerank model served by TEI.

Prerequisites

Before implementing TEI Ranker in Milvus, ensure you have:

A Milvus collection with a VARCHAR field containing the text to be reranked
A running TEI service with reranking capabilities. For detailed instructions on setting up a TEI service, refer to the official TEI documentation.
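
Before wiring the service into Milvus, it can help to confirm that the TEI endpoint actually answers rerank requests. The following is a minimal sketch, assuming the service exposes a `/rerank` route that accepts a JSON body with `query` and `texts` and returns a list of objects with `index` and `score`. The helper names (`build_rerank_payload`, `rank_texts`, `call_tei_rerank`) are hypothetical and not part of Milvus or TEI.

```python
import json
import urllib.request


def build_rerank_payload(query, texts, truncate=True):
    """Build the JSON body for a rerank request (assumed TEI request shape)."""
    return {"query": query, "texts": list(texts), "truncate": truncate}


def rank_texts(texts, scored):
    """Pair each text with its score and sort by descending score.

    `scored` is assumed to look like [{"index": 0, "score": 0.12}, ...].
    """
    ordered = sorted(scored, key=lambda item: item["score"], reverse=True)
    return [(texts[item["index"]], item["score"]) for item in ordered]


def call_tei_rerank(endpoint, query, texts, timeout=10):
    """POST to <endpoint>/rerank and return the parsed JSON response."""
    body = json.dumps(build_rerank_payload(query, texts)).encode("utf-8")
    request = urllib.request.Request(
        endpoint.rstrip("/") + "/rerank",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call against a locally running service (requires network access):
# docs = ["Solar capacity grew last year.", "A recipe for sourdough bread."]
# scores = call_tei_rerank("http://localhost:8080", "renewable energy developments", docs)
# for text, score in rank_texts(docs, scores):
#     print(f"{score:.4f}  {text}")
```

If the commented call returns one score per input text, the endpoint should be usable as the `endpoint` value of the ranker function.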

Create a TEI ranker function

To use TEI Ranker in your Milvus application, create a Function object that specifies how the reranking should operate. This function will be passed to Milvus search operations to enhance result ranking.

from pymilvus import MilvusClient, Function, FunctionType

# Connect to your Milvus server
client = MilvusClient(
    uri="http://localhost:19530"  # Replace with your Milvus server URI
)

# Configure TEI Ranker
tei_ranker = Function(
    name="tei_semantic_ranker",            # Unique identifier for your ranker
    input_field_names=["document"],        # VARCHAR field containing text to rerank
    function_type=FunctionType.RERANK,     # Must be RERANK for reranking functions
    params={
        "reranker": "model",                           # Enables model-based reranking
        "provider": "tei",                             # Specifies TEI as the service provider
        "queries": ["renewable energy developments"],  # Query text for relevance evaluation
        "endpoint": "http://localhost:8080",           # Your TEI service URL
        "max_client_batch_size": 32,                   # Optional: batch size for processing (default: 32)
        "truncate": True,                              # Optional: truncate inputs longer than the maximum supported size
        "truncation_direction": "Right",               # Optional: direction to truncate the inputs from
    }
)

TEI ranker-specific parameters

The following parameters are specific to the TEI ranker:

Parameter	Required?	Description	Value / Example
`reranker`	Yes	Must be set to `"model"` to enable model reranking.	`"model"`
`provider`	Yes	The model service provider to use for reranking.	`"tei"`
`queries`	Yes	List of query strings that the rerank model uses to calculate relevance scores. The number of query strings must exactly match the number of queries in your search operation (even when using query vectors instead of text); otherwise, an error is reported.	`["search query"]`
`endpoint`	Yes	Your TEI service URL.	`"http://localhost:8080"`
`max_client_batch_size`	No	Batch size used when calling the model service. Because the service may not process all data at once, the data is sent in multiple requests of at most this size.	`32` (default)
`truncate`	No	Whether to truncate inputs exceeding max sequence length. If `False`, long inputs raise errors.	`True` or `False`
`truncation_direction`	No	Direction to truncate from when input is too long: `"Right"` (default): Tokens are removed from the end of the sequence until the maximum supported size is matched. `"Left"`: Tokens are removed from the beginning of the sequence.	`"Right"` or `"Left"`

For general parameters shared across all model rankers (e.g., provider, queries), refer to Create a model ranker.
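
The effect of `max_client_batch_size` and `truncation_direction` can be pictured with a small sketch. This only illustrates the semantics described in the table and is not how Milvus or TEI implements them. The function names are hypothetical, and truncation is shown on a plain list of tokens rather than on real tokenizer output.

```python
def split_into_batches(texts, max_client_batch_size=32):
    """Split texts into consecutive batches of at most max_client_batch_size items."""
    if max_client_batch_size < 1:
        raise ValueError("max_client_batch_size must be at least 1")
    return [
        texts[start:start + max_client_batch_size]
        for start in range(0, len(texts), max_client_batch_size)
    ]


def truncate_tokens(tokens, max_length, direction="Right"):
    """Drop tokens from the end ("Right") or the beginning ("Left") to fit max_length."""
    if len(tokens) <= max_length:
        return list(tokens)
    if direction == "Right":
        return list(tokens[:max_length])
    if direction == "Left":
        return list(tokens[len(tokens) - max_length:])
    raise ValueError('direction must be "Right" or "Left"')
```

For example, 70 candidate texts with a batch size of 32 would go out as three requests of 32, 32, and 6 texts. With `"Right"` the start of a long text is kept, and with `"Left"` the end is kept.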

Apply to standard vector search

To apply TEI Ranker to a standard vector search:

# Execute search with TEI reranking
results = client.search(
    collection_name="your_collection",
    data=["renewable energy developments"],      # One search query, matching the single entry in the ranker's "queries"
    anns_field="dense_vector",                   # Vector field to search
    limit=5,                                     # Number of results to return
    output_fields=["document"],                  # Include text field for reranking
    ranker=tei_ranker,                           # Apply TEI reranking
    consistency_level="Bounded"
)
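
Because the number of entries in `queries` must match the number of search queries, one way to avoid a mismatch is to build the ranker parameters from the same list that is passed as `data`. The sketch below uses plain dictionaries only. `make_tei_params` and `check_query_count` are hypothetical helpers, not pymilvus functions.

```python
def make_tei_params(queries, endpoint):
    """Build the params dictionary for a TEI ranker from a list of query strings."""
    return {
        "reranker": "model",
        "provider": "tei",
        "queries": list(queries),
        "endpoint": endpoint,
    }


def check_query_count(ranker_params, search_data):
    """Raise ValueError if the ranker's queries and the search data differ in length."""
    expected = len(search_data)
    actual = len(ranker_params.get("queries", []))
    if actual != expected:
        raise ValueError(
            f"ranker has {actual} queries but the search has {expected}"
        )
    return True


search_queries = ["AI Research Progress", "What is AI"]
params = make_tei_params(search_queries, "http://localhost:8080")
check_query_count(params, search_queries)  # Passes: both lists have two entries
```

The resulting dictionary can then be passed as `params` when constructing the `Function` object, and `search_queries` as `data` in the search call.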

Apply to hybrid search

TEI Ranker can also be used with hybrid search to combine dense and sparse retrieval methods:

from pymilvus import AnnSearchRequest

# Configure dense vector search
dense_search = AnnSearchRequest(
    data=["renewable energy developments"],    # One query, matching the ranker's "queries"
    anns_field="dense_vector",
    param={},
    limit=5
)

# Configure sparse vector search
sparse_search = AnnSearchRequest(
    data=["renewable energy developments"],    # Same query for the sparse field
    anns_field="sparse_vector",
    param={},
    limit=5
)

# Execute hybrid search with TEI reranking
hybrid_results = client.hybrid_search(
    collection_name="your_collection",
    reqs=[dense_search, sparse_search],        # Multiple search requests
    ranker=tei_ranker,                         # Apply TEI reranking to combined results
    limit=5,                                   # Final number of results
    output_fields=["document"]
)
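
To inspect the reranked output, the results can be flattened into rows. This is a sketch that assumes the search returns one list of hits per query and that each hit is dict-like with `id`, `distance`, and `entity` keys. `summarize_hits` is a hypothetical helper, and the sample data below is made up purely to show the shape.

```python
def summarize_hits(results, text_field="document"):
    """Flatten per-query hit lists into (query_index, rank, id, score, text) rows."""
    rows = []
    for query_index, hits in enumerate(results):
        for rank, hit in enumerate(hits, start=1):
            entity = hit.get("entity", {})
            rows.append(
                (query_index, rank, hit.get("id"), hit.get("distance"), entity.get(text_field))
            )
    return rows


# Made-up sample with the assumed result shape (one query, two hits)
sample_results = [
    [
        {"id": 7, "distance": 0.93, "entity": {"document": "Solar capacity grew last year."}},
        {"id": 2, "distance": 0.41, "entity": {"document": "Wind turbines and grid storage."}},
    ]
]

for row in summarize_hits(sample_results):
    print(row)
```

With real data, `summarize_hits(hybrid_results)` would produce one row per returned hit, in the order the ranker placed them.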
