
Vector Embeddings & Semantic Search

Build semantic search systems with embeddings. Covers embedding models, vector databases, similarity search, hybrid search, RAG pipelines, and embedding optimization.

Traditional keyword search fails when users search for “cheap flights” but the content says “affordable airfare.” Semantic search understands meaning, not just words. It converts text into high-dimensional vectors (embeddings) and finds similar vectors — connecting intent to content regardless of exact wording.


How Embeddings Work

"How to deploy a Docker container"
         │ Embedding Model
         ▼
[0.023, -0.156, 0.891, 0.034, ..., -0.445]   (1536 dimensions)

"Steps to run a containerized application"
         │ Same Model
         ▼
[0.019, -0.148, 0.887, 0.041, ..., -0.439]   (similar vector!)

Cosine similarity: 0.94 (very similar meaning)
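
The similarity score in the diagram can be computed in a few lines. The following is a minimal sketch in plain Python; the 4-dimensional vectors are made up for illustration (real embeddings have hundreds or thousands of dimensions), and the variable names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors, invented for this example (not output from any real model).
docker_deploy = [0.02, -0.15, 0.89, 0.03]
run_container = [0.02, -0.14, 0.88, 0.04]
bake_bread = [0.75, 0.40, -0.10, -0.52]

print(cosine_similarity(docker_deploy, run_container))  # close to 1.0
print(cosine_similarity(docker_deploy, bake_bread))     # negative: unrelated
```

A score near 1.0 means the vectors point in almost the same direction. Scores near 0 or below indicate unrelated content.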

Embedding Models

| Model | Dimensions | Speed | Quality | Cost |
|---|---|---|---|---|
| text-embedding-3-small (OpenAI) | 1536 | Fast | Good | $ |
| text-embedding-3-large (OpenAI) | 3072 | Medium | Best | $$ |
| multilingual-e5-large | 1024 | Fast | Good (multilingual) | Free (self-host) |
| BGE-large-en-v1.5 | 1024 | Fast | Good | Free (self-host) |
| Cohere embed-v3 | 1024 | Fast | Good | $ |
| Voyage-3 | 1024 | Fast | Excellent | $$ |
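
Dimension count affects more than quality: it also sets the raw storage cost of the index. The sketch below estimates that cost under stated assumptions: vectors are stored as float32 (4 bytes per dimension), index overhead and metadata are ignored, and the one-million-chunk corpus is a hypothetical figure.

```python
BYTES_PER_FLOAT32 = 4

# Dimensions as listed in the model table.
MODEL_DIMENSIONS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "multilingual-e5-large": 1024,
    "BGE-large-en-v1.5": 1024,
}

def raw_vector_bytes(num_vectors, dimensions):
    """Bytes needed for the vectors alone (no index structures, no metadata)."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32

NUM_CHUNKS = 1_000_000  # hypothetical corpus size

for model, dims in MODEL_DIMENSIONS.items():
    gib = raw_vector_bytes(NUM_CHUNKS, dims) / 1024**3
    print(f"{model}: {gib:.2f} GiB")
```

Doubling the dimensions doubles the raw footprint, so a larger model is worth it only if it measurably improves retrieval on your data.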

RAG Pipeline

User Query: "How do I handle database migrations in production?"
         │
         ▼
┌──────────────────┐
│ Embed Query      │  → [0.12, -0.34, 0.78, ...]
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ Vector Search    │  Top 5 most similar documents
│ (Pinecone/       │  from your knowledge base
│  Weaviate/Chroma)│
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│ LLM Generation   │  "Based on these docs, here's how
│ (GPT-4, Claude)  │   to handle database migrations..."
│                  │
│ Context:         │
│ [retrieved docs] │
└──────────────────┘
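
The three stages in the diagram can be sketched end to end without any external service. This is an illustration only: `toy_embed` is a hashed bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector database, and the knowledge-base sentences are invented. The final prompt is what would be sent to the LLM.

```python
import math
import re
import zlib

DIMS = 1024  # toy dimensionality, chosen only to keep hash collisions rare

def toy_embed(text):
    """Stand-in for a real embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * DIMS
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        vec[zlib.crc32(token.encode("utf-8")) % DIMS] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def retrieve(query, docs, top_k=5):
    """Rank docs by dot product with the query (equals cosine for unit vectors)."""
    q = toy_embed(query)
    scored = [(sum(a * b for a, b in zip(q, toy_embed(d))), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

def build_prompt(query, context_docs):
    """Assemble the retrieved documents and the question into one LLM prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

# Invented knowledge base for the example.
knowledge_base = [
    "Run database migrations in production behind a feature flag and take a backup first.",
    "Use cosine similarity to compare embedding vectors.",
    "Rotate API keys every ninety days.",
]

query = "How do I handle database migrations in production?"
top_docs = retrieve(query, knowledge_base, top_k=2)
prompt = build_prompt(query, top_docs)
print(prompt)
```

In a real pipeline, `toy_embed` would be replaced by calls to the chosen embedding model, and `retrieve` by a query against the vector database. The same model must embed both the documents and the query.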

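Hybrid Search

# Combine keyword (BM25) + semantic (vector) search
import os
from pinecone import Pinecone

# Assumed setup: the environment variable and index name are placeholders.
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.Index("knowledge-base")

# embed() and bm25_search() are your own helpers and are not shown here:
# embed() returns a query vector from the same model used at indexing time,
# and bm25_search() returns ranked results that expose an `.id` attribute.

# Semantic search results (the ranked matches live on `.matches`)
semantic_results = index.query(
    vector=embed("database migration best practices"),
    top_k=20,
    include_metadata=True,
).matches

# Keyword search results
keyword_results = bm25_search("database migration production")

# Reciprocal Rank Fusion (RRF) to combine
def reciprocal_rank_fusion(results_lists, k=60):
    """Score each doc as the sum of 1 / (k + rank) over every list it appears in."""
    scores = {}
    for results in results_lists:
        for rank, result in enumerate(results, start=1):  # ranks start at 1
            scores[result.id] = scores.get(result.id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first, as a list of (doc_id, score) tuples
    return sorted(scores.items(), key=lambda x: x[1], reverse=True)

final_results = reciprocal_rank_fusion([semantic_results, keyword_results])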
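
To see what the fusion step does, here is a small worked example that runs without Pinecone or a BM25 index. It is a sketch: `Hit` is a hypothetical stand-in for a search result object with an `.id` attribute, and the document IDs and rankings are invented.

```python
from collections import namedtuple

Hit = namedtuple("Hit", ["id"])  # hypothetical stand-in for a search result

def reciprocal_rank_fusion(results_lists, k=60):
    """Sum 1 / (k + rank) for each document across all ranked lists."""
    scores = {}
    for results in results_lists:
        for rank, result in enumerate(results, start=1):
            scores[result.id] = scores.get(result.id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Invented rankings: each list is ordered best-first.
semantic = [Hit("doc-a"), Hit("doc-b"), Hit("doc-c")]
keyword = [Hit("doc-b"), Hit("doc-d"), Hit("doc-a")]

fused = reciprocal_rank_fusion([semantic, keyword])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.5f}")
# Order: doc-b, doc-a, doc-d, doc-c
```

`doc-b` comes out on top because it ranks near the top of both lists (1/62 + 1/61), edging out `doc-a` (1/61 + 1/63). RRF uses only ranks, so BM25 scores and cosine similarities never have to be normalized onto a common scale.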

Anti-Patterns

| Anti-Pattern | Problem | Fix |
|---|---|---|
| Embedding entire documents | Context lost in averaging | Chunk documents (200-500 tokens per chunk) |
| No chunk overlap | Context split across chunk boundaries | 10-20% overlap between consecutive chunks |
| Wrong embedding model | Poor retrieval quality | Benchmark models on your data, use MTEB leaderboard |
| Vector search only | Misses exact keyword matches | Hybrid search (vector + BM25) |
| No reranking | Top results not always most relevant | Rerank top-20 with cross-encoder |
| Stale embeddings | Content updated but embeddings not refreshed | Re-embed on content change |
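
The first two fixes, chunking and overlap, can be sketched as a sliding window over tokens. This is a minimal sketch under stated assumptions: whitespace splitting is a rough stand-in for the embedding model's real tokenizer, and the defaults (300 tokens, 45 overlapping, which is 15%) are one choice inside the ranges given in the table.

```python
def chunk_tokens(tokens, chunk_size=300, overlap=45):
    """Split a token list into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the document
    return chunks

# Synthetic 700-token document, for illustration only.
text = " ".join(f"word{i}" for i in range(700))
chunks = chunk_tokens(text.split(), chunk_size=300, overlap=45)
print([len(c) for c in chunks])  # [300, 300, 190]
```

Each chunk repeats the last 45 tokens of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk.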

Checklist

  • Embedding model selected (benchmark on your data)
  • Chunking strategy: 200-500 tokens, 10-20% overlap
  • Vector database chosen (Pinecone, Weaviate, Chroma, pgvector)
  • Hybrid search: vector + keyword for best recall
  • Reranking: cross-encoder on top-K results
  • Metadata filtering: narrow search by category/date
  • Embedding refresh pipeline for updated content
  • Evaluation: retrieval quality metrics (recall@k, MRR)
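
The two retrieval metrics named in the last checklist item are short enough to compute by hand. The sketch below assumes each test query has a list of retrieved document IDs (best first) and a set of IDs labeled relevant. The IDs and labels are hypothetical.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant documents that appear in the top k results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    return len(set(retrieved_ids[:k]) & relevant) / len(relevant)

def reciprocal_rank(retrieved_ids, relevant_ids):
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    relevant = set(relevant_ids)
    for rank, doc_id in enumerate(retrieved_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (retrieved_ids, relevant_ids) pairs."""
    return sum(reciprocal_rank(ret, rel) for ret, rel in runs) / len(runs)

# Hypothetical labeled queries: (retrieved IDs best-first, relevant IDs).
runs = [
    (["d3", "d1", "d7"], {"d1"}),        # first relevant result at rank 2
    (["d2", "d9", "d4"], {"d2", "d4"}),  # first relevant result at rank 1
    (["d5", "d6", "d8"], {"d0"}),        # nothing relevant retrieved
]

print(mean_reciprocal_rank(runs))                          # (0.5 + 1.0 + 0.0) / 3 = 0.5
print(recall_at_k(["d2", "d9", "d4"], {"d2", "d4"}, k=2))  # 1 of 2 relevant = 0.5
```

Tracking both is useful: recall@k shows whether the relevant chunks reach the LLM at all, while MRR shows how close to the top the first relevant one lands.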

Source: This guide is derived from operational intelligence at Garnet Grid Consulting. For AI/ML consulting, visit garnetgrid.com.

Jakub Dimitri Rezayev
Founder & Chief Architect • Garnet Grid Consulting

Jakub holds an M.S. in Customer Intelligence & Analytics and a B.S. in Finance & Computer Science from Pace University. With deep expertise spanning D365 F&O, Azure, Power BI, and AI/ML systems, he architects enterprise solutions that bridge legacy systems and modern technology — and has led multi-million dollar ERP implementations for Fortune 500 supply chains.
