voyage-2 generates text embeddings by transforming raw text input into a fixed-length numerical vector that represents the semantic meaning of that text. From a developer’s point of view, the process is straightforward: you send a string (or a list of strings) to the voyage-2 embedding API, and the model returns a vector of floating-point numbers with a consistent dimensionality. Each vector encodes semantic relationships learned during the model’s training, so texts with similar meaning end up closer together in vector space, even if they use different words or phrasing.
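For example, with the official voyageai Python client (a minimal sketch: the sample strings are placeholders, and the client is assumed to read its API key from the VOYAGE_API_KEY environment variable):

```python
# pip install voyageai
import voyageai

# The client picks up the API key from the VOYAGE_API_KEY environment variable.
vo = voyageai.Client()

texts = [
    "How do I reset my password?",
    "Steps to recover access to an account",
]

# One call can embed a single string or a batch; each output is a
# fixed-length vector of floats.
result = vo.embed(texts, model="voyage-2", input_type="document")

for text, vector in zip(texts, result.embeddings):
    print(f"{text!r} -> {len(vector)} dimensions")
```

Every vector in `result.embeddings` has the same length, which is what makes the vectors directly comparable to one another.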
Under the hood, voyage-2 runs the input through a deep neural network trained specifically for embedding tasks. The model processes the text token by token, building internal representations that capture context and meaning across the entire sequence; these representations are then pooled and projected into a single dense vector. Developers don’t need to understand the internal layers to use the model, but one property matters in practice: the output vector is deterministic for a given input and model version. This consistency is critical, because embeddings generated today must remain comparable to embeddings generated tomorrow, or similarity search breaks down. That’s why developers typically re-embed all content whenever they change embedding models or configurations.
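Since same-dimension vectors can be compared directly, a quick way to see the "similar meaning, closer in space" behavior is to compute cosine similarity between a few embeddings. A small sketch with numpy, reusing the `vo` client from the previous example (the sentences are illustrative, and only the relative ordering of the scores is the point; exact values depend on the model version):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two vectors: dot product over norms."""
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Two paraphrases and one unrelated sentence.
vectors = vo.embed(
    [
        "The cat sat on the mat.",
        "A feline rested on the rug.",
        "Quarterly revenue grew by 12%.",
    ],
    model="voyage-2",
).embeddings

print(cosine_similarity(vectors[0], vectors[1]))  # paraphrases: higher score
print(cosine_similarity(vectors[0], vectors[2]))  # unrelated topic: lower score
```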
Once generated, embeddings are usually stored and queried rather than recomputed repeatedly. A common pattern is to generate embeddings in batch for documents and store them in a vector database such as Milvus or Zilliz Cloud. At query time, only the user’s query text needs to be embedded, and that single vector is compared against the stored vectors using similarity search. voyage-2 focuses entirely on producing high-quality, stable embeddings; the vector database handles indexing, distance calculations, and retrieval efficiency. This separation keeps systems simpler and easier to reason about.
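A minimal sketch of that pattern using pymilvus with Milvus Lite (the database file name, collection name, and sample documents are assumptions for illustration; voyage-2 vectors are 1024-dimensional, and the `vo` client from the first example is reused):

```python
# pip install pymilvus  (includes Milvus Lite for local development)
from pymilvus import MilvusClient

DIM = 1024  # voyage-2 output dimensionality

client = MilvusClient("voyage_demo.db")  # local, file-backed Milvus Lite instance
client.create_collection(collection_name="docs", dimension=DIM)

# Batch-embed documents once and store the vectors.
documents = [
    "Milvus is an open-source vector database.",
    "voyage-2 turns text into dense embeddings.",
]
doc_vectors = vo.embed(documents, model="voyage-2", input_type="document").embeddings
client.insert(
    collection_name="docs",
    data=[
        {"id": i, "vector": vec, "text": doc}
        for i, (vec, doc) in enumerate(zip(doc_vectors, documents))
    ],
)

# At query time, embed only the query string and search the stored vectors.
query_vector = vo.embed(
    ["What is Milvus?"], model="voyage-2", input_type="query"
).embeddings[0]
hits = client.search(
    collection_name="docs",
    data=[query_vector],
    limit=2,
    output_fields=["text"],
)
for hit in hits[0]:
    print(hit["distance"], hit["entity"]["text"])
```

Note the asymmetry in the sketch: documents are embedded once in batch with input_type="document", while each incoming query is embedded on the fly with input_type="query" before being handed to the vector database for search.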
For more information, see https://zilliz.com/ai-models/voyage-2