Definition
An embedding is a dense numerical vector representation of text (or other data), produced by a neural network trained to capture semantic meaning in a geometric space. Texts with similar meanings map to vectors that are geometrically close to each other, so semantic similarity can be computed with mathematical operations (cosine similarity, dot product) rather than exact string matching. Embeddings are the foundational technology behind semantic search, retrieval-augmented generation, and vector-based agent memory.
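The geometric intuition can be sketched in a few lines. The toy 4-dimensional vectors below are made up for illustration (real models emit hundreds to thousands of dimensions), but the cosine-similarity computation is the same one used in production vector search:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors: 1.0 = same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings: "cat" and "kitten" point in nearly the same
# direction; "invoice" points elsewhere.
cat = np.array([0.9, 0.1, 0.0, 0.2])
kitten = np.array([0.85, 0.15, 0.05, 0.25])
invoice = np.array([0.0, 0.9, 0.8, 0.1])

print(cosine_similarity(cat, kitten))   # close to 1.0: similar meaning
print(cosine_similarity(cat, invoice))  # much lower: unrelated meaning
```

Because cosine similarity ignores vector length, it compares direction only; with models that emit unit-normalized vectors, the dot product alone gives the same ranking.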
Engineering Context
Embeddings are the bridge between text and vector search. An embedding model (such as text-embedding-3-large, BGE-M3, or Cohere Embed) converts any text into a fixed-size vector, for example 1536 dimensions. Several engineering decisions affect embedding quality and cost: model selection (larger models generally produce higher-quality embeddings at higher cost), dimensionality (more dimensions capture more nuance but increase index size and query latency), and re-embedding when the source changes, since embeddings are tied to the model version and upgrading models requires re-indexing the corpus. For multilingual applications, multilingual embedding models (BGE-M3, Cohere Embed multilingual) are essential, as language-specific models perform poorly on cross-lingual queries.
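The re-indexing constraint above is easy to enforce by storing the producing model's name alongside each vector. This is a minimal sketch with hypothetical record and field names; the point is only that vectors from different model versions are not comparable, so the index must know which model produced each one:

```python
from dataclasses import dataclass

@dataclass
class IndexedDoc:
    """One stored vector plus the model that produced it (hypothetical schema)."""
    doc_id: str
    embedding_model: str
    vector: list[float]

# The model the query side currently embeds with.
ACTIVE_MODEL = "text-embedding-3-large"

def needs_reembedding(doc: IndexedDoc) -> bool:
    # A vector is stale if it was produced by any other model version:
    # comparing it against queries from ACTIVE_MODEL would be meaningless.
    return doc.embedding_model != ACTIVE_MODEL

index = [
    IndexedDoc("a", "text-embedding-3-large", [0.1, 0.2]),
    IndexedDoc("b", "text-embedding-ada-002", [0.3, 0.4]),
]
stale = [d.doc_id for d in index if needs_reembedding(d)]
print(stale)  # ['b'] — only the doc indexed under the old model
```

In practice this check drives a background re-embedding job after a model upgrade, rather than failing queries at read time.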