@sqlrooms/ai-rag
Retrieval Augmented Generation (RAG) slice for SQLRooms. Query vector embeddings stored in DuckDB for semantic search and AI-powered applications.
This package is designed to work with sqlrooms-rag, a Python package that prepares embedding/FTS-index databases for RAG search.
Refer to the ai-rag example for a complete working example.
Features
- 🔍 Hybrid Search - Combines vector similarity with full-text search (BM25) using Reciprocal Rank Fusion
- 🚀 Semantic Search - Query embeddings using vector similarity (cosine similarity)
- 🗄️ Multiple Databases - Attach and search across multiple embedding databases
- 🎯 Per-Database Embedding Providers - Each database can use a different embedding model
- ✅ Metadata Validation - Automatic validation of embedding dimensions and models
- 📊 DuckDB-Powered - Fast, in-process vector search with SQL and FTS
- 🔄 Flexible - Works with OpenAI, HuggingFace, Transformers.js, or custom embeddings
Installation
npm install @sqlrooms/ai-rag @sqlrooms/duckdb @sqlrooms/room-storeQuick Start
import {createDuckDbSlice} from '@sqlrooms/duckdb';
import {createRagSlice, createAiEmbeddingProvider} from '@sqlrooms/ai-rag';
import {createRoomStore} from '@sqlrooms/room-store';
import {openai} from '@ai-sdk/openai';
// 1. Create an embedding provider (matches your database preparation)
const embeddingProvider = createAiEmbeddingProvider(
openai,
'text-embedding-3-small',
1536,
);
// 2. Configure your embedding databases
const embeddingsDatabases = [
{
databaseFilePathOrUrl: '/path/to/docs.duckdb',
databaseName: 'docs',
embeddingProvider,
embeddingDimensions: 1536,
},
];
// 3. Create the store with RAG capabilities
const store = createRoomStore({
slices: [
createDuckDbSlice({databasePath: ':memory:'}),
createRagSlice({embeddingsDatabases}),
],
});
// 4. Initialize and query
await store.getState().rag.initialize();
const results = await store
.getState()
.rag.queryByText('How do I create a table?', {topK: 5});
console.log(results);API Reference
createRagSlice(options)
Creates a RAG slice for your store.
Options
embeddingsDatabases- Array of embedding database configurations
Returns
A state creator function for Zustand.
EmbeddingDatabase
Configuration for an embedding database:
type EmbeddingDatabase = {
/** Path or URL to the DuckDB embedding database file */
databaseFilePathOrUrl: string;
/** Name to use when attaching the database */
databaseName: string;
/**
* Embedding provider for this database.
* MUST match the model used during database preparation.
*/
embeddingProvider: EmbeddingProvider;
/**
* Expected embedding dimensions (for validation).
* Optional but recommended.
*/
embeddingDimensions?: number;
};EmbeddingProvider
Function that converts text to embeddings:
type EmbeddingProvider = (text: string) => Promise<number[]>;Important: The embedding provider MUST match the model used when preparing the database. Check your database metadata to ensure compatibility.
Store Methods
rag.initialize()
Initialize RAG by attaching all embedding databases and validating metadata.
await store.getState().rag.initialize();rag.queryByText(text, options)
Query embeddings using text. By default, performs hybrid search combining vector similarity with full-text search (BM25) using Reciprocal Rank Fusion (RRF).
const results = await store.getState().rag.queryByText('search query', {
topK: 5, // Number of results to return (default: 5)
database: 'docs', // Database to search (default: first database)
hybrid: true, // Enable hybrid search (default: true)
// hybrid: false, // Disable hybrid search (vector-only)
// hybrid: 60, // Custom RRF k value (default: 60)
});Hybrid Search combines:
- Vector similarity: Semantic understanding of the query
- Full-text search (BM25): Keyword matching and ranking
- Reciprocal Rank Fusion: Smart combination of both result sets
This approach typically provides better results than vector-only search, especially for queries with specific keywords or technical terms.
Returns:
type EmbeddingResult = {
score: number; // Cosine similarity (0-1, higher is better)
text: string; // The matched text chunk
nodeId: string; // Unique identifier for the chunk
metadata?: Record<string, unknown>; // Optional metadata (file path, etc.)
};rag.queryEmbeddings(embedding, options)
Query embeddings using a pre-computed embedding vector.
const embedding = await embeddingProvider('search query');
const results = await store.getState().rag.queryEmbeddings(embedding, {
topK: 5,
database: 'docs',
});rag.getMetadata(databaseName)
Get metadata for a specific database:
const metadata = await store.getState().rag.getMetadata('docs');
console.log(metadata);
// {
// provider: 'openai',
// model: 'text-embedding-3-small',
// dimensions: 1536,
// chunkingStrategy: 'markdown-aware'
// }rag.listDatabases()
List all attached embedding databases:
const databases = store.getState().rag.listDatabases();
console.log(databases); // ['docs', 'tutorials', 'api']Multiple Databases
You can attach multiple embedding databases, each with its own embedding model:
import {openai} from '@ai-sdk/openai';
import {google} from '@ai-sdk/google';
import {createAiEmbeddingProvider} from '@sqlrooms/ai-rag';
const embeddingsDatabases = [
{
databaseFilePathOrUrl: '/data/duckdb_docs.duckdb',
databaseName: 'duckdb_docs',
// OpenAI text-embedding-3-small (1536d)
embeddingProvider: createAiEmbeddingProvider(
openai,
'text-embedding-3-small',
1536,
),
embeddingDimensions: 1536,
},
{
databaseFilePathOrUrl: '/data/react_docs.duckdb',
databaseName: 'react_docs',
// OpenAI text-embedding-3-small with reduced dimensions (512d)
embeddingProvider: createAiEmbeddingProvider(
openai,
'text-embedding-3-small',
512,
),
embeddingDimensions: 512,
},
{
databaseFilePathOrUrl: '/data/python_docs.duckdb',
databaseName: 'python_docs',
// Google text-embedding-004 (768d)
embeddingProvider: createAiEmbeddingProvider(
google,
'text-embedding-004',
768,
),
embeddingDimensions: 768,
},
];Query a specific database:
// Query DuckDB docs
const duckdbResults = await store
.getState()
.rag.queryByText('How to create a table?', {
database: 'duckdb_docs',
});
// Query React docs
const reactResults = await store
.getState()
.rag.queryByText('How to use hooks?', {
database: 'react_docs',
});Hybrid Search
Hybrid search combines vector similarity (semantic understanding) with full-text search (keyword matching) to provide more accurate and comprehensive results.
How It Works
- Vector Search: Uses embedding similarity to find semantically related content
- Full-Text Search (BM25): Uses DuckDB's FTS extension for keyword-based ranking
- Reciprocal Rank Fusion (RRF): Intelligently combines both result sets
Benefits
- Better Recall: Finds results even when exact keywords aren't in the query
- Improved Precision: Keyword matching helps rank exact matches higher
- Balanced Results: RRF prevents either method from dominating unfairly
Configuration
// Default: Hybrid search enabled with k=60
const results = await store.getState().rag.queryByText('query', {
hybrid: true, // Enable hybrid search (default)
});
// Pure vector search only
const vectorOnly = await store.getState().rag.queryByText('query', {
hybrid: false, // Disable hybrid search
});
// Custom RRF k value (lower = more weight to top-ranked results)
const customRRF = await store.getState().rag.queryByText('query', {
hybrid: 60, // Custom k value for Reciprocal Rank Fusion
});When to Use What
- Hybrid (default): Best for most use cases, especially technical documentation
- Vector-only: When you want pure semantic matching without keyword bias
- Lower k value (e.g., 30): Give more weight to top-ranked results
- Higher k value (e.g., 100): More balanced combination, less bias to top results
Requirements
Hybrid search requires that the embedding database was prepared with FTS indexing enabled (which is the default in the Python prepare-embeddings command). If FTS is not available, the system automatically falls back to vector-only search.
Embedding Providers
The createAiEmbeddingProvider() function works with any provider from the Vercel AI SDK.
OpenAI
import {openai} from '@ai-sdk/openai';
import {createAiEmbeddingProvider} from '@sqlrooms/ai-rag';
const embeddingProvider = createAiEmbeddingProvider(
openai,
'text-embedding-3-small',
1536,
);Google
import {google} from '@ai-sdk/google';
import {createAiEmbeddingProvider} from '@sqlrooms/ai-rag';
const embeddingProvider = createAiEmbeddingProvider(
google,
'text-embedding-004',
768,
);Custom Provider
import {createAiEmbeddingProvider} from '@sqlrooms/ai-rag';
// Any provider that implements the AiProvider interface
const embeddingProvider = createAiEmbeddingProvider(
myCustomProvider,
'my-model-id',
512,
);Multiple Providers Example
You can use different providers for different databases:
import {openai} from '@ai-sdk/openai';
import {google} from '@ai-sdk/google';
import {createAiEmbeddingProvider} from '@sqlrooms/ai-rag';
const embeddingsDatabases = [
{
databaseName: 'docs_openai',
databaseFilePathOrUrl: './embeddings/docs_openai.duckdb',
embeddingProvider: createAiEmbeddingProvider(
openai,
'text-embedding-3-small',
1536,
),
embeddingDimensions: 1536,
},
{
databaseName: 'docs_google',
databaseFilePathOrUrl: './embeddings/docs_google.duckdb',
embeddingProvider: createAiEmbeddingProvider(
google,
'text-embedding-004',
768,
),
embeddingDimensions: 768,
},
];Preparing Databases
Use the Python sqlrooms_rag package to prepare embedding databases:
# Install the package
pip install sqlrooms-rag
# Prepare embeddings with OpenAI
python -m sqlrooms_rag.cli prepare-embeddings \
docs/ \
-o embeddings.duckdb \
--provider openai \
--model text-embedding-3-small \
--embed-dim 1536
# Prepare embeddings with HuggingFace (local, free)
python -m sqlrooms_rag.cli prepare-embeddings \
docs/ \
-o embeddings.duckdb \
--provider huggingface \
--model BAAI/bge-small-en-v1.5See the Python package documentation for more details.
Database Schema
The embedding databases created by sqlrooms_rag have the following structure:
-- Main documents table with embeddings
CREATE TABLE documents (
node_id VARCHAR PRIMARY KEY,
text TEXT,
metadata_ JSON,
embedding FLOAT[], -- Vector embedding
doc_id VARCHAR -- Link to source document
);
-- Original source documents (full, unchunked)
CREATE TABLE source_documents (
doc_id VARCHAR PRIMARY KEY,
file_path VARCHAR,
file_name VARCHAR,
text TEXT,
metadata_ JSON,
created_at TIMESTAMP
);
-- Metadata about the embedding process
CREATE TABLE embedding_metadata (
key VARCHAR PRIMARY KEY,
value VARCHAR,
created_at TIMESTAMP
);Error Handling
try {
const results = await store.getState().rag.queryByText('search query', {
database: 'nonexistent',
});
} catch (error) {
// Error: Database "nonexistent" not found. Available: docs, tutorials
}
try {
const wrongDimEmbedding = new Array(384).fill(0);
await store.getState().rag.queryEmbeddings(wrongDimEmbedding, {
database: 'docs', // Expects 1536 dimensions
});
} catch (error) {
// Error: Dimension mismatch: query has 384 dimensions,
// but database "docs" expects 1536 dimensions
}Best Practices
Match Embedding Models: Always use the same embedding model and dimensions when querying as when preparing the database.
Check Metadata: Use
getMetadata()to verify the model and dimensions before querying.Dimension Validation: Provide
embeddingDimensionsin your database configuration for automatic validation.Database Naming: Use descriptive database names (e.g.,
duckdb_docs,react_docs) to easily identify them.Error Handling: Always wrap queries in try-catch blocks to handle dimension mismatches and missing databases.
Performance: For large databases, consider using reduced dimensions (e.g., 512 instead of 1536) for faster queries and lower costs.
Examples
See the examples/ai directory for complete examples:
src/embeddings.ts- OpenAI embedding provider implementationssrc/rag-example.ts- Comprehensive usage examplessrc/store.ts- Store configuration with RAG
License
MIT
Interfaces
Type Aliases
- EmbeddingResult
- EmbeddingProvider
- EmbeddingDatabase
- DatabaseMetadata
- QueryOptions
- RagSliceState
- AiProviderFactory
- RagToolParameters
- RagToolLlmResult
- RagToolAdditionalData
- RagToolContext
