Quickstart
Get from zero to working ingest and retrieve in under 10 minutes.
This guide walks you through the complete setup from a fresh project to your first successful retrieval. By the end, you'll have content stored in Postgres and be able to query it with semantic search.
Step 1: Initialize UnRAG
Run the CLI with your preferences. For this quickstart, we'll use Drizzle as the store adapter:
```bash
bunx unrag init --yes --store drizzle --dir lib/unrag --alias @unrag
```

This creates your configuration file, installs the vendored module code, and adds the necessary dependencies to your package.json. Run your package manager to install them:
```bash
bun install # or: npm install / pnpm install / yarn
```

Step 2: Set environment variables
UnRAG needs two environment variables to function. Add them to your .env file (or however you manage secrets):
```bash
# Your Postgres connection string
DATABASE_URL="postgresql://user:password@localhost:5432/mydb"

# API key for embedding (required by the AI SDK)
AI_GATEWAY_API_KEY="your-openai-api-key"
```

The AI_GATEWAY_API_KEY is used by the default embedding provider, which calls OpenAI's text-embedding-3-small model. If you want to use a different model, you can also set AI_GATEWAY_MODEL:

```bash
AI_GATEWAY_MODEL="openai/text-embedding-3-large"
```

Step 3: Create the database schema
Connect to your Postgres database and run the schema creation SQL. You can find the complete schema in lib/unrag/unrag.md, but here's the quick version:
```sql
-- Enable pgvector
create extension if not exists vector;

-- Create tables
create table documents (
  id uuid primary key,
  source_id text not null,
  content text not null,
  metadata jsonb,
  created_at timestamp default now()
);

create table chunks (
  id uuid primary key,
  document_id uuid not null references documents(id) on delete cascade,
  source_id text not null,
  idx integer not null,
  content text not null,
  token_count integer not null,
  metadata jsonb,
  created_at timestamp default now()
);

create table embeddings (
  chunk_id uuid primary key references chunks(id) on delete cascade,
  embedding vector,
  embedding_dimension integer,
  created_at timestamp default now()
);

-- Add indexes for common queries
create index if not exists chunks_source_id_idx on chunks(source_id);
create index if not exists documents_source_id_idx on documents(source_id);
```
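If you keep the SQL above in a file, say schema.sql (the name is up to you), one way to apply it is with psql:

```bash
psql "$DATABASE_URL" -f schema.sql
```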
Step 4: Ingest your first content

Create a simple script to test ingestion. This can be a standalone file or part of your application:
```ts
import { createUnragEngine } from "@unrag/config";

async function ingestDemo() {
  const engine = createUnragEngine();

  // Ingest a piece of content
  const result = await engine.ingest({
    sourceId: "docs:quickstart",
    content: `
      UnRAG is a RAG installer for TypeScript projects. It installs a small,
      composable module into your codebase as vendored source files. Instead
      of depending on an external SDK, you own the code and can modify it
      to fit your needs.

      The two core operations are ingest and retrieve. Ingestion takes content,
      splits it into chunks, generates embeddings for each chunk, and stores
      everything in Postgres with pgvector. Retrieval takes a query, embeds it,
      and finds the most similar chunks using vector similarity search.
    `,
    metadata: {
      section: "getting-started",
      language: "en",
    },
  });

  console.log("Ingested document:", result.documentId);
  console.log("Created chunks:", result.chunkCount);
  console.log("Embedding model:", result.embeddingModel);
  console.log("Timings:", result.durations);
}

ingestDemo().catch(console.error);
```

Run this script, and you should see output showing the document was chunked and stored. The durations object tells you how long each phase took (chunking, embedding, and storage), which is helpful for understanding performance characteristics.
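Assuming you saved the script as ingest-demo.ts (the filename is up to you), run it with your runtime of choice:

```bash
bun ingest-demo.ts
# or, with Node: npx tsx ingest-demo.ts
```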
Step 5: Retrieve content
Now query the content you just ingested:
```ts
import { createUnragEngine } from "@unrag/config";

async function retrieveDemo() {
  const engine = createUnragEngine();

  const result = await engine.retrieve({
    query: "What are the core operations in UnRAG?",
    topK: 5,
  });

  console.log("Found", result.chunks.length, "chunks");
  console.log("Embedding model:", result.embeddingModel);
  console.log("Timings:", result.durations);

  for (const chunk of result.chunks) {
    console.log("\n---");
    console.log("Score:", chunk.score);
    console.log("Source:", chunk.sourceId);
    console.log("Content:", chunk.content.substring(0, 200) + "...");
  }
}

retrieveDemo().catch(console.error);
```

The score field represents the distance between the query embedding and each chunk's embedding. Lower scores mean higher similarity (the chunks are "closer" to your query in the embedding space).
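Since lower scores mean closer matches, a common follow-up is to discard chunks beyond a distance cutoff before using them. Here's a minimal sketch that continues from the result above; the 0.5 threshold is an arbitrary placeholder, not an UnRAG default, so tune it against your own data:

```ts
// Keep only chunks whose embedding distance to the query is under a cutoff.
// 0.5 is a made-up starting point; tune it empirically for your corpus.
const MAX_DISTANCE = 0.5;
const relevant = result.chunks.filter((chunk) => chunk.score < MAX_DISTANCE);

console.log(`Kept ${relevant.length} of ${result.chunks.length} chunks`);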
Understanding the results
When you run retrieval, you get back an object containing:
- chunks: An array of the most relevant chunks, each with its content, metadata, source information, and similarity score
- embeddingModel: The model used to embed the query (useful for debugging if you have multiple models)
- durations: Timing breakdown showing how long embedding and retrieval took
The chunks are sorted by score ascending, so the first chunk is the most similar to your query. You can use these chunks to build context for an LLM prompt, populate a search results page, or power any other retrieval-based feature.
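For example, one simple way to fold retrieved chunks into an LLM prompt (this is plain string assembly, not an UnRAG API; the numbering and framing are arbitrary choices):

```ts
// Assemble retrieved chunks into a context block, then wrap it in a prompt.
// Continues from the `result` returned by engine.retrieve() above.
const context = result.chunks
  .map((chunk, i) => `[${i + 1}] (${chunk.sourceId})\n${chunk.content}`)
  .join("\n\n");

const prompt = `Answer the question using only the context below.

Context:
${context}

Question: What are the core operations in UnRAG?`;
```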
Where to customize
Open unrag.config.ts to adjust settings. The most common customizations are listed below; a rough config sketch follows the list.

- Chunking parameters: Adjust `chunkSize` and `chunkOverlap` to change how documents are split. Smaller chunks give more precise retrieval but cost more to embed.
- Default topK: Change how many chunks are returned by default when you don't specify `topK` in your retrieve call.
- Embedding model: Switch to a different model by changing the `model` field. Larger models like `text-embedding-3-large` may give better results at higher cost.
- Database connection: If you already have a database pool or client configured elsewhere in your app, you can modify `createUnragEngine()` to use that instead of creating a new one.
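As a rough illustration of where those knobs live, a hypothetical unrag.config.ts might look like this. The exact field names and shape depend on what the installer generated for you, so treat it as a sketch rather than the real file:

```ts
// Hypothetical shape -- check your generated unrag.config.ts for the real one.
export default {
  chunking: {
    chunkSize: 512,   // max tokens per chunk; smaller = more precise, more embed calls
    chunkOverlap: 64, // tokens shared between adjacent chunks
  },
  retrieval: {
    topK: 5,          // default number of chunks returned by retrieve()
  },
  embedding: {
    model: "openai/text-embedding-3-small", // or "openai/text-embedding-3-large"
  },
};
```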