A clear explanation of vector databases — how they store and query embeddings — and the selection criteria that matter for technical leaders building enterprise AI systems.
A vector database is a data store purpose-built to store, index and query high-dimensional vectors — the numerical embeddings produced when AI models encode text, images or other content. Instead of retrieving rows that match exact conditions, a vector database retrieves items based on mathematical similarity to a query vector. This nearest-neighbour retrieval capability is the infrastructure that makes semantic search, retrieval-augmented generation and enterprise AI knowledge systems operationally viable at scale. For technical decision-makers designing AI architecture, understanding what vector databases do, how they differ and what matters for selection is foundational.
When an AI system encodes a document into an embedding, it produces a vector — typically a list of 768, 1024 or 1536 floating-point numbers. To find documents similar to a query, the system must compare the query vector against potentially millions of stored document vectors and return the most similar ones quickly.
This is computationally non-trivial. Exact nearest-neighbour search requires computing the distance between the query and every stored vector, so its cost grows linearly with corpus size and becomes prohibitively slow across millions of high-dimensional vectors. Vector databases solve this with approximate nearest-neighbour (ANN) indexing algorithms, notably HNSW (Hierarchical Navigable Small World graphs) and IVF (inverted file index), that trade a small amount of recall for very large gains in query speed. In practice, well-configured ANN indices typically return 95–99% of the true top results (that is, 95–99% recall) in milliseconds, even against very large corpora.
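To make the cost of exact search concrete, here is a minimal brute-force sketch using NumPy. The function name `exact_top_k`, the random stand-in corpus and the 768-dimension size are illustrative assumptions, not part of any particular vector database.

```python
import numpy as np

def exact_top_k(query: np.ndarray, vectors: np.ndarray, k: int = 5) -> np.ndarray:
    """Return indices of the k stored vectors most similar to the query (cosine)."""
    # Normalise to unit length so that a dot product equals cosine similarity.
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    # One score per stored vector: the work grows with (number of vectors x dimensions).
    scores = v @ q
    # Sort by descending score and keep the first k indices.
    return np.argsort(-scores)[:k]

rng = np.random.default_rng(0)
# Random stand-in for 10,000 document embeddings of 768 dimensions (assumption).
corpus = rng.normal(size=(10_000, 768)).astype(np.float32)
# A query that is a slightly perturbed copy of document 42.
query = corpus[42] + (0.01 * rng.normal(size=768)).astype(np.float32)

top = exact_top_k(query, corpus, k=5)
print(top)  # document 42 is ranked first
```

Every query touches every stored vector, which is the linear scan that ANN indices are designed to avoid.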
Vector databases also handle the operational requirements of production AI systems: metadata storage and filtering, access control, index updates as new documents are added, and integration with application layers through standard APIs.
The vector database is not the most visible component of an AI system — users interact with the model's outputs, not the retrieval infrastructure. But it is a consequential architectural choice. A poorly chosen or misconfigured vector database will limit retrieval quality, introduce latency bottlenecks under production load, create data sovereignty complications, or generate unexpected operational costs at scale.
For Australian enterprises in regulated sectors (financial services, healthcare, government), data residency is often a non-negotiable constraint. The Privacy Act 1988 and sector-specific obligations may require that personal data remains within Australian borders, or at minimum within defined cloud regions. Not all managed vector database services offer Australian region hosting, and organisations that default to whichever service is most readily available may discover a compliance gap only after the system has been deployed.
A vector database system operates across two phases:
Indexing: When documents are added to the system, their pre-computed embeddings are inserted into the database, which builds or updates an ANN index structure. HNSW builds a multi-layer graph in which each node is connected to its nearest neighbours within each layer it appears in, and the upper layers contain progressively fewer nodes. A query enters at the sparse top layer and descends layer by layer, refining its candidates until the nearest neighbours are identified. IVF-based indices cluster vectors into groups (Voronoi cells) and search only the clusters most likely to contain matches for a given query, dramatically reducing computation.
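The IVF idea can be sketched in a few lines of NumPy. This is a toy illustration under stated assumptions (plain k-means, squared Euclidean distance, random data). It is not how any specific product implements IVF, and the names `build_ivf`, `ivf_search` and `n_probe` are chosen here for illustration.

```python
import numpy as np

def sq_dists(a, b):
    """Squared Euclidean distance between every row of a and every row of b."""
    return (a**2).sum(1)[:, None] - 2 * a @ b.T + (b**2).sum(1)[None, :]

def build_ivf(vectors, n_clusters=16, n_iters=10, seed=0):
    """Toy IVF index: k-means centroids plus the list of vector ids in each cell."""
    rng = np.random.default_rng(seed)
    centroids = vectors[rng.choice(len(vectors), n_clusters, replace=False)]
    for _ in range(n_iters):
        assign = sq_dists(vectors, centroids).argmin(axis=1)
        for c in range(n_clusters):
            members = vectors[assign == c]
            if len(members):  # an empty cell keeps its previous centroid
                centroids[c] = members.mean(axis=0)
    assign = sq_dists(vectors, centroids).argmin(axis=1)
    lists = [np.flatnonzero(assign == c) for c in range(n_clusters)]
    return centroids, lists

def ivf_search(query, vectors, centroids, lists, k=5, n_probe=4):
    """Search only the n_probe cells whose centroids are closest to the query."""
    probe = sq_dists(query[None, :], centroids)[0].argsort()[:n_probe]
    candidates = np.concatenate([lists[c] for c in probe])
    d = sq_dists(query[None, :], vectors[candidates])[0]
    return candidates[d.argsort()[:k]]

rng = np.random.default_rng(1)
data = rng.normal(size=(2_000, 32))  # random stand-in for embeddings (assumption)
centroids, lists = build_ivf(data)
q = rng.normal(size=32)

exact = sq_dists(q[None, :], data)[0].argsort()[:5]           # full scan
approx = ivf_search(q, data, centroids, lists, k=5, n_probe=4)  # 4 of 16 cells
recall = len(set(exact.tolist()) & set(approx.tolist())) / 5
print(f"recall@5 with n_probe=4: {recall:.2f}")
```

Raising `n_probe` scans more cells, which increases recall at the cost of speed. Probing every cell reproduces the exact result. Recall at a given `n_probe` also depends on how clustered the data is.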
Querying: The query embedding is compared against the index using the configured similarity metric, most commonly cosine similarity (the cosine of the angle between two vectors, which ignores their magnitude) or dot product (which is sensitive to magnitude, and equals cosine similarity when the vectors are normalised to unit length). The top-k most similar vectors are returned, along with their associated metadata and original text.
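A small worked example shows how the two metrics can rank the same candidates differently. The two-dimensional vectors below are made up for illustration.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; unaffected by vector length."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def dot_product(a, b):
    """Unnormalised dot product; grows with vector length."""
    return float(a @ b)

query = np.array([1.0, 0.0])
short_aligned = np.array([2.0, 0.0])  # same direction as the query, small magnitude
long_offaxis = np.array([6.0, 8.0])   # different direction, large magnitude

print(cosine_similarity(query, short_aligned))  # 1.0
print(cosine_similarity(query, long_offaxis))   # 0.6
print(dot_product(query, short_aligned))        # 2.0
print(dot_product(query, long_offaxis))         # 6.0

# After normalising to unit length, dot product and cosine similarity agree.
unit = long_offaxis / np.linalg.norm(long_offaxis)
print(dot_product(query, unit))                 # 0.6
```

Cosine similarity prefers `short_aligned`, while the raw dot product prefers `long_offaxis` because of its length. Which behaviour is appropriate depends on how the embedding model was trained, so the metric should match the model's documentation.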
Many production implementations use hybrid retrieval: combining vector similarity search with keyword search (BM25 or TF-IDF), then fusing the two result sets into a single ranking. This is more robust than either approach alone, because some queries are better served by exact term matching (specific product codes, named entities, regulatory identifiers) while others benefit from semantic similarity.
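One common way to fuse the two result sets is reciprocal rank fusion (RRF), which scores each document by its rank in each list rather than by raw scores. The article does not name a fusion method, so RRF is an assumption here, as are the document ids and the conventional constant `k=60`.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids; each appearance adds 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda d: scores[d], reverse=True)

# Hypothetical result lists, best match first.
vector_hits = ["doc_a", "doc_b", "doc_c"]   # from vector similarity search
keyword_hits = ["doc_c", "doc_a", "doc_d"]  # from BM25 keyword search

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)  # ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```

Documents that appear in both lists rise to the top, and because only ranks are used, the incompatible score scales of BM25 and vector similarity never need to be reconciled.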
Metadata filtering restricts the candidate set before or during the ANN search, returning only vectors associated with documents of a certain type, date range, source or access permission. This is critical for enterprise use cases where retrieval must respect document-level access controls.
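The pre-filtering variant can be sketched as follows: drop every document the caller may not see, then rank only what remains. The `group` metadata field, the function name `filtered_search` and the tiny dataset are hypothetical, and a real vector database would apply the filter inside its index rather than with a Python loop.

```python
import numpy as np

def filtered_search(query, vectors, metadata, allowed_groups, k=3):
    """Pre-filter by access group, then rank the survivors by cosine similarity."""
    # Keep only the ids of documents the caller is permitted to see.
    allowed = np.array(
        [i for i, m in enumerate(metadata) if m["group"] in allowed_groups],
        dtype=int,
    )
    if allowed.size == 0:
        return []
    q = query / np.linalg.norm(query)
    v = vectors[allowed]
    v = v / np.linalg.norm(v, axis=1, keepdims=True)
    order = np.argsort(-(v @ q))[:k]
    # Map positions in the filtered subset back to original document ids.
    return allowed[order].tolist()

vectors = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2]])
metadata = [{"group": "finance"}, {"group": "hr"}, {"group": "finance"}, {"group": "hr"}]
query = np.array([1.0, 0.0])

result = filtered_search(query, vectors, metadata, allowed_groups={"hr"})
print(result)  # [1, 3]
```

Document 0 is the closest match overall, but it belongs to a group the caller cannot access, so it never enters the ranking.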
Vector database selection for enterprise deployments involves several dimensions:
| Dimension | Considerations |
|---|---|
| Scale | How many vectors at launch, and in 2 years? Some services degrade in performance or cost efficiency at very large indices. |
| Operational model | Managed cloud service (lower ops burden) vs self-hosted (more control, data sovereignty). |
| Data residency | Does the service offer an Australian or specified cloud region? Critical for regulated sectors. |
| Existing stack | Postgres-heavy stack? pgvector may reduce operational complexity. Greenfield? Evaluate purpose-built options. |
| Hybrid search support | Does the database natively support combined vector + keyword search, or does hybrid retrieval require external orchestration? |
| Access controls | Does the database support row-level or namespace-level access controls aligned to your permission model? |
Common purpose-built vector databases include Pinecone (managed cloud), Weaviate (managed or self-hosted), Qdrant (managed or self-hosted), Milvus (open source, self-hosted or managed) and Chroma (lightweight, suited to development and small-scale production). PostgreSQL with the pgvector extension is a practical choice for organisations whose existing infrastructure is Postgres-based and whose vector volumes are in the low tens of millions.
Organisations with data sovereignty requirements should confirm, in writing, the specific cloud regions in which their data is processed and stored before committing to any managed service. Edison AI's AI implementation practice routinely includes a data residency assessment as part of vector database selection for Australian enterprise clients.
Before selecting a vector database, document your requirements across three dimensions: estimated vector count at production scale, data residency obligations, and existing infrastructure preferences. Share these with the implementation team before any vendor evaluation begins. If your organisation operates under sector-specific data obligations, confirm cloud region availability with each candidate service vendor before shortlisting.
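When documenting the estimated vector count, a rough storage figure is a useful companion. The sketch below assumes 32-bit floats and a hypothetical corpus of 10 million vectors. It counts raw vector data only and excludes index structures, metadata and replicas, which vary by product.

```python
def raw_vector_bytes(n_vectors: int, dims: int, bytes_per_value: int = 4) -> int:
    """Raw storage for the vectors alone, assuming float32 values by default."""
    return n_vectors * dims * bytes_per_value

n_vectors = 10_000_000  # hypothetical production-scale estimate
for dims in (768, 1024, 1536):
    gb = raw_vector_bytes(n_vectors, dims) / 1e9
    print(f"{dims} dimensions: {gb:.2f} GB")
# 768 dimensions: 30.72 GB
# 1024 dimensions: 40.96 GB
# 1536 dimensions: 61.44 GB
```

Doubling the embedding dimension doubles the raw footprint, so the choice of embedding model and the vector count should be estimated together.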
Edison AI builds bespoke AI systems — including retrieval over your own documents — for Australian businesses.
A vector database is a data store purpose-built to store, index and query high-dimensional vectors — the numerical representations (embeddings) produced by AI models. Unlike relational databases that retrieve rows matching exact conditions, vector databases retrieve items based on mathematical similarity to a query vector.
A relational database retrieves records by matching structured conditions (WHERE column = value). A vector database retrieves records by approximate nearest-neighbour search: finding stored vectors that are mathematically closest to a query vector. Some databases (such as PostgreSQL with pgvector) support both modes, but dedicated vector databases are optimised specifically for high-dimensional similarity search at scale.
The right choice depends on scale, operational model and existing infrastructure. Managed cloud services (Pinecone, Weaviate Cloud, Qdrant Cloud) minimise operational overhead. PostgreSQL with pgvector suits organisations whose stack is already Postgres-based and whose vector volumes are moderate. Self-hosted options (Weaviate, Qdrant, Milvus) suit organisations with data sovereignty requirements.
Article: Vector Databases Explained for Technical Decision-Makers