Skip to main content

Embeddings in EngramDB

EngramDB includes native support for generating vector embeddings from text. This feature makes it easy to create semantic memory nodes without needing to implement external embedding generation logic.

Getting Started with Embeddings

Basic Usage

The embedding functionality in EngramDB is centered around the EmbeddingService, which provides methods to convert text into vector embeddings:
use engramdb::{Database, EmbeddingService};

// Create the embedding service
let embedding_service = EmbeddingService::default();

// Create a database
let mut db = Database::in_memory();

// Create a memory from text
let text = "Artificial intelligence is transforming how we interact with technology";
let memory_id = db.create_memory_from_text(
    text,
    &embedding_service,
    Some("AI concepts"),
    None,
)?;

Python Usage

Embeddings are also available in the Python API:
from engramdb_py import Database, EmbeddingService

# Create the embedding service
embedding_service = EmbeddingService.default()

# Create a database
db = Database.in_memory()

# Create a memory from text
text = "Artificial intelligence is transforming how we interact with technology"
memory_id = db.create_memory_from_text(
    text=text,
    embedding_service=embedding_service,
    category="AI concepts",
    title="AI Introduction"
)

Embedding Services

EngramDB provides multiple ways to generate embeddings:

Default Service

The default embedding service will attempt to use model-based embeddings if available, and fall back to deterministic mock embeddings if not:
let embedding_service = EmbeddingService::default();

Model-Based Embeddings

You can explicitly request model-based embeddings, specifying the model to use:
// Use a specific model
let embedding_service = EmbeddingService::new_with_model(
    Some("intfloat/multilingual-e5-large-instruct")
);
EngramDB uses the E5 family of embedding models by default, which have excellent performance for semantic search tasks.

Mock Embeddings

For testing or when ML dependencies aren’t available, you can use mock embeddings:
// Create mock embeddings with 384 dimensions
let embedding_service = EmbeddingService::new_mock(384);

Multi-Vector Embeddings

EngramDB now supports multi-vector embeddings, inspired by models like ColBERT and ColPali:

What are Multi-Vector Embeddings?

Unlike traditional embeddings that represent an entire document as a single vector, multi-vector embeddings represent a document as a collection of vectors. This approach:
  • Captures different aspects of the document in separate vectors
  • Enables more nuanced similarity comparisons
  • Improves precision in semantic search

Using Multi-Vector Embeddings

use engramdb::embeddings::{MultiVectorProvider, MockMultiVectorProvider};

// Create a multi-vector provider (96-dimensional vectors, typically 20 per document)
let provider = MockMultiVectorProvider::new(96, 20);

// Generate multi-vector embeddings for a document
let document = "Artificial intelligence is transforming how we interact with technology";
let multi_vector = provider.generate_multi_vector_for_document(document, Some("AI"))?;

// Generate multi-vector embeddings for a query (typically fewer vectors than for documents)
let query = "How does AI impact technology?";
let query_vectors = provider.generate_multi_vector_for_query(query)?;

Similarity Metrics for Multi-Vectors

When using multi-vector embeddings, EngramDB supports specialized similarity functions:
// Max similarity: highest similarity between any pair of vectors
let max_sim = doc_embedding.max_similarity(&query_embedding);

// Average similarity: average similarity across all vector pairs
let avg_sim = doc_embedding.avg_similarity(&query_embedding);

// Late interaction: ColBERT-style scoring that finds the most similar vector for each query vector
let late_sim = doc_embedding.late_interaction_score(&query_embedding);
The late interaction score is particularly effective for retrieval tasks, as it allows different parts of the query to match different parts of the document.

Creating Memory Nodes from Text

Using the Database API

The simplest way to create memory nodes from text is to use the Database API:
// Create a memory node and save it to the database
let memory_id = db.create_memory_from_text(
    "This is the text content to embed",
    &embedding_service,
    Some("category"),
    Some(&attributes),
)?;

Using the MemoryNode API

You can also create a memory node directly from text:
// Create a memory node
let memory_node = MemoryNode::from_text(
    "This is the text content to embed",
    &embedding_service,
    Some("category"),
)?;

// Save it to the database
let memory_id = db.save(&memory_node)?;
Once you have memories created from text, you can perform semantic searches:
// Generate embeddings for a query
let query = "How does AI change technology?";
let query_embedding = embedding_service.generate_for_query(query)?;

// Search for similar memories
let results = db.search_similar(&query_embedding, 5, 0.7, None, None)?;

// Process results
for (memory_id, similarity) in results {
    let memory = db.load(memory_id)?;
    println!("Similarity: {}, Content: {}", 
        similarity, 
        memory.get_attribute("content")
            .and_then(|a| if let AttributeValue::String(s) = a { Some(s) } else { None })
            .unwrap_or("No content")
    );
}

Configuration and Options

Document vs. Query Embeddings

EngramDB distinguishes between document and query embeddings:
// Generate embeddings for storing a document
let doc_embeddings = embedding_service.generate_for_document(
    "Document text", 
    Some("category")
)?;

// Generate embeddings for a search query
let query_embeddings = embedding_service.generate_for_query(
    "Search query"
)?;
This distinction follows best practices for asymmetric semantic search.

Vector Similarity Metrics

EngramDB supports different similarity metrics:
  • Cosine Similarity: Measures the cosine of the angle between vectors (default)
  • Dot Product: Simple dot product of two vectors
  • Euclidean Distance: Measures the straight-line distance in the vector space
You can specify which metric to use in your vector index configuration:
let config = DatabaseConfig {
    // other configuration...
    vector_index_config: VectorIndexConfig {
        algorithm: VectorAlgorithm::HNSW,
        similarity_metric: SimilarityMetric::Cosine,
        hnsw: Some(HnswConfig::default()),
    },
};

Dimensions

You can check the dimensions of the embeddings:
let dimensions = embedding_service.dimensions();
println!("Embedding dimensions: {}", dimensions);
The default model produces 384-dimensional embeddings.

Supported Embedding Models

EngramDB supports multiple embedding models that you can choose from:

E5 Multilingual Large Instruct (Default)

// Rust
let service = EmbeddingService::with_model_type(EmbeddingModel::E5MultilingualLargeInstruct);
# Python
service = EmbeddingService.with_model_type(EmbeddingModelType.E5)
The intfloat/multilingual-e5-large-instruct model provides:
  • High-quality semantic embeddings (1024 dimensions)
  • Support for 100+ languages
  • Instruction-based embedding generation
  • Excellent performance on retrieval tasks

GTE Modern BERT Base

// Rust
let service = EmbeddingService::with_model_type(EmbeddingModel::GteModernBertBase);
# Python
service = EmbeddingService.with_model_type(EmbeddingModelType.GTE)
The Alibaba-NLP/gte-modernbert-base model provides:
  • Modern semantics updated with recent data (768 dimensions)
  • Strong performance on semantic search and similarity tasks
  • Efficient inference with a base-sized model

Jina Embeddings V3

// Rust
let service = EmbeddingService::with_model_type(EmbeddingModel::JinaEmbeddingsV3);
# Python
service = EmbeddingService.with_model_type(EmbeddingModelType.JINA)
The jinaai/jina-embeddings-v3 model provides:
  • State-of-the-art performance on various NLP tasks (768 dimensions)
  • Good cross-lingual capabilities
  • Well-optimized for retrieval tasks

Custom Models

You can also specify a custom model by providing the model name:
// Rust
let service = EmbeddingService::new_with_model(Some("sentence-transformers/all-MiniLM-L6-v2"));
# Python
service = EmbeddingService.with_model("sentence-transformers/all-MiniLM-L6-v2")
EngramDB will automatically determine the dimensions and other properties of the model.

Dependencies and Features

The embedding functionality is optional and protected by feature flags:
  • embeddings: Basic embedding support with mock providers
  • python: Python support for embeddings (requires PyO3)
To use model-based embeddings, you need the following Python packages:
  • torch
  • transformers
If these dependencies are not available, EngramDB will automatically fall back to mock embeddings.

Best Practices

  • Choose the right embedding type: Use single-vector embeddings for simple cases and multi-vector embeddings for more complex documents
  • Select appropriate dimensions: Larger dimensions (768-1536) capture more semantic nuance but use more memory
  • Consider your use case: For search, use smaller multi-vectors; for classification, single vectors often work well
  • Normalize your vectors: Especially important when using cosine similarity
  • Distinguish between document and query embeddings: This asymmetric approach often improves search results
  • Use domain-specific models when possible for better semantic understanding in your field

Examples

For complete examples of using embeddings with EngramDB, see: