Vector Index API Reference

The Vector Index is responsible for storing and searching vector embeddings in EngramDB. This document provides a detailed reference for the Vector Index API. EngramDB supports two primary vector index implementations:

Linear Index (VectorIndex): Simple brute-force search for small to medium datasets
HNSW Index (HnswIndex): Hierarchical Navigable Small World graph for fast approximate search in large datasets

Both implementations share a common interface through the VectorSearchIndex trait, allowing them to be used interchangeably.

VectorSearchIndex Trait

The VectorSearchIndex trait defines the common interface for all vector index implementations.

Common Methods

All vector indices implement these methods:

add(node): Adds a memory node to the index
remove(id): Removes a memory node from the index
update(node): Updates a memory node in the index
search(query, limit, threshold): Searches for similar vectors
len(): Returns the number of vectors in the index
is_empty(): Checks if the index is empty
get(id): Gets a reference to a vector by its ID

VectorIndex

The VectorIndex class provides a simple linear search implementation for vector embeddings.

Creating a Vector Index

`VectorIndex::new()`

Creates a new empty vector index. Returns:

A new VectorIndex instance

Example:

let mut vector_index = VectorIndex::new();

Managing Vectors

`add(node)`

Adds a memory node to the index. Parameters:

node: &MemoryNode - The memory node to add

Returns:

Result<()> - Success or an error

Example:

let memory = MemoryNode::new(vec![0.1, 0.2, 0.3, 0.4]);
vector_index.add(&memory)?;

`remove(id)`

Removes a memory node from the index. Parameters:

id: Uuid - The UUID of the memory node to remove

Returns:

Result<()> - Success or an error

Example:

vector_index.remove(memory_id)?;

`update(node)`

Updates a memory node in the index. Parameters:

node: &MemoryNode - The memory node to update

Returns:

Result<()> - Success or an error

Example:

memory.set_embeddings(vec![0.2, 0.3, 0.4, 0.5]);
vector_index.update(&memory)?;

Searching

`search(query, limit, threshold)`

Performs a similarity search to find the most similar vectors. Parameters:

query: &[f32] - The query vector
limit: usize - Maximum number of results to return
threshold: f32 - Minimum similarity threshold (0.0 to 1.0)

Returns:

Result<Vec<(Uuid, f32)>> - A vector of (UUID, similarity) pairs, sorted by descending similarity

Example:

let query_vector = vec![0.15, 0.25, 0.35, 0.45];
let results = vector_index.search(&query_vector, 5, 0.0)?;

for (memory_id, similarity) in results {
    println!("Memory ID: {}, Similarity: {:.4}", memory_id, similarity);
}

Utility Methods

`len()`

Returns the number of vectors in the index. Returns:

usize - The number of vectors

Example:

let count = vector_index.len();
println!("Index contains {} vectors", count);

`is_empty()`

Checks if the index is empty. Returns:

bool - True if the index is empty, false otherwise

Example:

if vector_index.is_empty() {
    println!("Index is empty");
}

`get(id)`

Gets a reference to a vector by its ID. Parameters:

id: Uuid - The UUID of the memory node

Returns:

Option<&Vec<f32>> - A reference to the vector if found, otherwise None

Example:

if let Some(vector) = vector_index.get(memory_id) {
    println!("Vector: {:?}", vector);
}

Similarity Functions

EngramDB provides several similarity functions for comparing vectors.

`cosine_similarity(a, b)`

Computes the cosine similarity between two vectors. Parameters:

a: &[f32] - First vector
b: &[f32] - Second vector

Returns:

Option<f32> - The cosine similarity value between the two vectors (between -1 and 1), or None if either vector is empty

Example:

let vec1 = vec![0.1, 0.2, 0.3];
let vec2 = vec![0.2, 0.3, 0.4];

if let Some(similarity) = cosine_similarity(&vec1, &vec2) {
    println!("Cosine similarity: {:.4}", similarity);
}

`dot_product(a, b)`

Computes the dot product between two vectors. Parameters:

a: &[f32] - First vector
b: &[f32] - Second vector

Returns:

f32 - The dot product value, or 0.0 if the vectors have different lengths

Example:

let vec1 = vec![0.1, 0.2, 0.3];
let vec2 = vec![0.2, 0.3, 0.4];

let dot = dot_product(&vec1, &vec2);
println!("Dot product: {:.4}", dot);

`euclidean_distance(a, b)`

Computes the Euclidean distance between two vectors. Parameters:

a: &[f32] - First vector
b: &[f32] - Second vector

Returns:

Option<f32> - The Euclidean distance between the two vectors, or None if the vectors have different lengths

Example:

let vec1 = vec![0.1, 0.2, 0.3];
let vec2 = vec![0.2, 0.3, 0.4];

if let Some(distance) = euclidean_distance(&vec1, &vec2) {
    println!("Euclidean distance: {:.4}", distance);
}

HnswIndex

The HnswIndex class provides a Hierarchical Navigable Small World (HNSW) graph implementation for fast approximate nearest neighbor search. This is particularly useful for large datasets where linear search becomes prohibitively expensive.

Creating an HNSW Index

`HnswIndex::new()`

Creates a new HNSW index with default parameters. Returns:

A new HnswIndex instance

Example:

let mut hnsw_index = HnswIndex::new();

`HnswIndex::with_config(config)`

Creates a new HNSW index with custom parameters. Parameters:

config: HnswConfig - Configuration parameters for the HNSW algorithm

Returns:

A new HnswIndex instance with the specified configuration

Example:

let config = HnswConfig {
    m: 16,                  // Maximum connections per node per layer
    ef_construction: 100,   // Size of dynamic candidate list during construction
    ef: 10,                 // Size of dynamic candidate list during search
    level_multiplier: 1.0,  // Base level for multi-layer construction
    max_level: 16,          // Maximum level in the hierarchy
};
let mut hnsw_index = HnswIndex::with_config(config);

HNSW Configuration

The HnswConfig struct allows you to configure the HNSW algorithm:

m: Maximum number of connections per node per layer (default: 16)
ef_construction: Size of the dynamic candidate list during index construction (default: 100)
ef: Size of the dynamic candidate list during search (default: 10)
level_multiplier: Base level for the multi-layer construction (default: 1.0/ln(2))
max_level: Maximum level in the hierarchy (default: 16)

Increasing m and ef_construction improves search quality at the cost of increased memory usage and indexing time. Increasing ef improves search quality at the cost of search speed.

Using Vector Indices with Database

EngramDB’s Database class supports different vector index algorithms through its configuration.

Creating a Database with Linear Search (Default)

// Create an in-memory database with the default linear search index
let db = Database::in_memory();

Creating a Database with HNSW Index

// Create an in-memory database with the HNSW index
let db = Database::in_memory_with_hnsw();

// Alternatively, create a file-based database with HNSW
let db = Database::file_based_with_hnsw("./my_database")?;

Custom Vector Index Configuration

// Create a database with custom HNSW parameters
let config = DatabaseConfig {
    storage_type: StorageType::Memory,
    storage_path: None,
    cache_size: 100,
    vector_index_config: VectorIndexConfig {
        algorithm: VectorAlgorithm::HNSW,
        hnsw: Some(HnswConfig {
            m: 32,                  // More connections per node
            ef_construction: 200,   // More neighbors during construction
            ef: 50,                 // More candidates during search
            level_multiplier: 1.0,
            max_level: 16,
        }),
    },
};
let db = Database::new(config)?;

Performance Comparison

The HNSW index provides significant performance improvements over linear search, especially for larger datasets:

Dataset Size	Linear Search (ms)	HNSW Search (ms)	Speedup
1,000	5	1	5×
10,000	50	2	25×
100,000	500	5	100×
1,000,000	5,000	10	500×

These are approximate values and will vary depending on vector dimensions, hardware, and specific HNSW parameters.

Example: Vector Search Workflow

Here’s a complete example of how to use vector search:

// Create a database with HNSW index for faster search
let mut db = Database::in_memory_with_hnsw();

// Create and add memory nodes
let memory1 = MemoryNode::new(vec![1.0, 0.0, 0.0]); // X-axis
let memory2 = MemoryNode::new(vec![0.7, 0.7, 0.0]); // 45 degrees in XY plane
let memory3 = MemoryNode::new(vec![0.0, 1.0, 0.0]); // Y-axis
let memory4 = MemoryNode::new(vec![0.0, 0.0, 1.0]); // Z-axis

let id1 = db.save(&memory1)?;
let id2 = db.save(&memory2)?;
let id3 = db.save(&memory3)?;
let id4 = db.save(&memory4)?;

// Search for something closer to the X-axis
let query = vec![0.9, 0.1, 0.0];
let results = db.search_similar(&query, 2, 0.0)?;

// Process results
for (id, similarity) in results {
    let memory = db.load(id)?;
    println!("Memory ID: {}, Similarity: {:.4}", id, similarity);
}

Getting Started

API Reference

Examples

Advanced Topics

Contributing

Vector Index API Reference

Vector Index API Reference

VectorSearchIndex Trait

Common Methods

VectorIndex

Creating a Vector Index

`VectorIndex::new()`

Managing Vectors

`add(node)`

`remove(id)`

`update(node)`

Searching

`search(query, limit, threshold)`

Utility Methods

`len()`

`is_empty()`

`get(id)`

Similarity Functions

`cosine_similarity(a, b)`

`dot_product(a, b)`

`euclidean_distance(a, b)`

HnswIndex

Creating an HNSW Index

`HnswIndex::new()`

`HnswIndex::with_config(config)`

HNSW Configuration

Using Vector Indices with Database

Creating a Database with Linear Search (Default)

Creating a Database with HNSW Index

Custom Vector Index Configuration

Performance Comparison

Example: Vector Search Workflow

Getting Started

API Reference

Examples

Advanced Topics

Contributing

Documentation Index

​Vector Index API Reference

​VectorSearchIndex Trait

​Common Methods

​VectorIndex

​Creating a Vector Index

​VectorIndex::new()

​Managing Vectors

​add(node)

​remove(id)

​update(node)

​Searching

​search(query, limit, threshold)

​Utility Methods

​len()

​is_empty()

​get(id)

​Similarity Functions

​cosine_similarity(a, b)

​dot_product(a, b)

​euclidean_distance(a, b)

​HnswIndex

​Creating an HNSW Index

​HnswIndex::new()

​HnswIndex::with_config(config)

​HNSW Configuration

​Using Vector Indices with Database

​Creating a Database with Linear Search (Default)

​Creating a Database with HNSW Index

​Custom Vector Index Configuration

​Performance Comparison

​Example: Vector Search Workflow

Vector Index API Reference

VectorSearchIndex Trait

Common Methods

VectorIndex

Creating a Vector Index

`VectorIndex::new()`

Managing Vectors

`add(node)`

`remove(id)`

`update(node)`

Searching

`search(query, limit, threshold)`

Utility Methods

`len()`

`is_empty()`

`get(id)`

Similarity Functions

`cosine_similarity(a, b)`

`dot_product(a, b)`

`euclidean_distance(a, b)`

HnswIndex

Creating an HNSW Index

`HnswIndex::new()`

`HnswIndex::with_config(config)`

HNSW Configuration

Using Vector Indices with Database

Creating a Database with Linear Search (Default)

Creating a Database with HNSW Index

Custom Vector Index Configuration

Performance Comparison

Example: Vector Search Workflow