Changelog
What's new in Unrag: features, fixes, and breaking changes as they ship.
0.3.3
Patch Changes
- Advanced chunking methods: semantic, hierarchical, markdown, code, and agentic chunkers
- Shared utilities for LLM-based and text-based chunking operations
- Fixed upgrade CLI crashes caused by a path resolution conflict on Windows
0.3.2
Patch Changes
- Improved upgrade command UX: Added verbose mode (`-v`/`--verbose`) and detailed conflict information during upgrades
- Slimmer initial scaffold: Reduced initial project size by consolidating embedding providers into a single registry-based system
- Improved maintainability of scaffolded code with cleaner file structure
- Enhanced upgrade snapshot diffing with better conflict detection
- Updated CLI registry logic for improved component resolution
0.3.1
Patch Changes
- Fixed a bug in the OneDrive connector that caused indefinite polling after successful ingestion
0.3.0
Minor Changes
- New OneDrive connector with Microsoft Graph API integration
- New Dropbox connector with full OAuth2 authentication flow
- Added new capabilities in Google Drive connector to sync entire folders and file paths
Breaking Changes
- `syncPages` and `syncFiles` methods are replaced by `streamPages` and `streamFiles`
- `onProgress` callbacks are removed in favor of onEvent via runConnectorStream
- The `engine` parameter is no longer passed to connector functions; instead, pass the stream to `engine.runConnectorStream()`
0.2.12
Patch Changes
- Added `upgrade` command with three-way merge support for seamless code upgrades
- Added `--version` flag for version checking in CLI
- Slimmed down generated config file for minimal installations
0.2.11
Patch Changes
- Debug Panel with TUI for Real-time RAG Pipeline Monitoring: A powerful new terminal-based debugging interface for monitoring and troubleshooting your RAG pipelines in real-time.
  - Interactive TUI Dashboard: Live metrics, connection status, and pipeline health at a glance
  - Event Tracing: Real-time event streaming with detailed inspection for ingest, retrieve, rerank, and delete operations
  - Query Runner: Execute and test retrieval queries directly from the debug panel
  - Doctor Panel: Built-in diagnostics to validate your Unrag configuration
  - Eval Panel: Run and monitor evaluation suites interactively
  - Ingest Panel: Trigger and observe document ingestion workflows
  - Docs Panel: Quick-access documentation viewer
- Universal TypeScript Import Alias Support: Internal refactoring to support path aliases across the codebase for cleaner, more maintainable imports.
  - Introduced tsconfig.json path aliases (@/...) for internal modules
  - Refactored all registry modules to use the new alias convention
- Consolidated vector store adapters: drizzle-postgres-pgvector → drizzle, prisma-postgres-pgvector → prisma, raw-sql-postgres-pgvector → raw-sql
0.2.10
Patch Changes
- Fixed numerous type issues in vendored code causing build failures
- Fixed reranker response property naming bug causing reranking to fail
0.2.9
Minor Changes
- Added Evaluation harness battery for retrieval quality measurement (experimental). This new eval module is installable via `unrag add battery eval`. It provides systematic, reproducible metrics for measuring and tracking retrieval quality:
  - Dataset format (`EvalDatasetV1`) for defining documents, queries, and ground truth relevance labels
  - Core metrics: Precision@K, Recall@K, MRR, MAP, and NDCG
  - `runEval()` function to execute evaluations with optional reranking pass
  - JSON and Markdown report generation with `writeEvalReport()` and `writeEvalSummaryMd()`
  - Diff comparison between runs via `diffEvalReports()` for tracking regressions
  - Configurable thresholds with `--fail-below` for CI integration
  - Comprehensive documentation at `/docs/eval`
- Batteries now included in preset installs. The `unrag init --preset` flow now supports installing batteries (like `eval` and `reranker`) alongside extractors and connectors.
- Resolved an issue where image URLs passed to multi-modal embedding were not being fetched through the configured fetch policy, causing failures when embedding remote images. Image URLs are now properly resolved to bytes via getAssetBytes() before embedding.
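For readers unfamiliar with the core metrics named above, here is a minimal sketch of how Precision@K, Recall@K, reciprocal rank (the per-query term of MRR), and NDCG@K are commonly computed for a single query with binary relevance labels. The function names and sample IDs are illustrative assumptions; this is not the eval battery's implementation or its `EvalDatasetV1` format.

```typescript
// Standard retrieval metrics for one query, assuming binary relevance.
// `ranked` holds retrieved document IDs in rank order; `relevant` is the ground-truth set.
// All names here are hypothetical, not part of the Unrag eval module.

function precisionAtK(ranked: string[], relevant: Set<string>, k: number): number {
  if (k <= 0) return 0;
  const hits = ranked.slice(0, k).filter((id) => relevant.has(id)).length;
  return hits / k;
}

function recallAtK(ranked: string[], relevant: Set<string>, k: number): number {
  if (relevant.size === 0) return 0;
  const hits = ranked.slice(0, k).filter((id) => relevant.has(id)).length;
  return hits / relevant.size;
}

// Reciprocal rank of the first relevant result (0 if none). MRR is its mean over queries.
function reciprocalRank(ranked: string[], relevant: Set<string>): number {
  const index = ranked.findIndex((id) => relevant.has(id));
  return index === -1 ? 0 : 1 / (index + 1);
}

// NDCG@K with binary gains: DCG divided by the DCG of an ideal ordering.
function ndcgAtK(ranked: string[], relevant: Set<string>, k: number): number {
  const dcg = ranked
    .slice(0, k)
    .reduce((sum, id, i) => sum + (relevant.has(id) ? 1 / Math.log2(i + 2) : 0), 0);
  const idealHits = Math.min(relevant.size, k);
  let idealDcg = 0;
  for (let i = 0; i < idealHits; i++) idealDcg += 1 / Math.log2(i + 2);
  return idealDcg === 0 ? 0 : dcg / idealDcg;
}

const ranked = ["doc-3", "doc-1", "doc-7", "doc-2"];
const relevant = new Set(["doc-1", "doc-2"]);

console.log(precisionAtK(ranked, relevant, 2)); // 0.5
console.log(recallAtK(ranked, relevant, 2)); // 0.5
console.log(reciprocalRank(ranked, relevant)); // 0.5
```

A threshold check such as the one `--fail-below` provides would typically compare averages of these per-query values against a configured minimum.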
0.2.8
Minor Changes
- Reranker battery with Cohere and custom reranker support: Added a new reranking system to improve search result relevance. Includes:
  - Built-in Cohere reranker integration (`rerank/cohere`)
  - Custom reranker support for bring-your-own-reranker scenarios (`rerank/custom`)
  - New `rerank()` function in the context engine for post-retrieval result optimization
  - Configurable via `unrag.config.ts` under `engine.reranker`
  - Comprehensive documentation at `/docs/batteries/reranker`
- `unrag doctor` command for installation validation and troubleshooting: New diagnostic CLI command that validates your Unrag installation and identifies common issues:
  - Scans `unrag.config.ts` for configuration problems
  - Validates database connectivity and schema
  - Checks environment variables and dependencies
  - Provides actionable fix suggestions
  - Supports `--fix` flag for auto-remediation of common issues
  - Interactive `unrag doctor --setup` mode for guided configuration
Patch Changes
- Fixed deep merge file not present after installation: Resolved an issue where the `deep-merge.ts` utility was not being copied to the target project during `unrag init`, causing runtime errors.
- Robust logging for `pdf:text-layer` extractor: Improved error handling and logging in the PDF text layer extractor to surface extraction issues more clearly and provide better debugging information.
- New supported runtimes documentation: Added documentation page covering supported Node.js versions and runtime environments.
- Updated connector documentation: Enhanced API documentation for Google Drive and Notion connectors with additional examples and configuration options.
0.2.7
Minor Changes
- The ingest pipeline now supports batch text embeddings via `embedMany()` when the provider supports it, with configurable concurrency limits. This can significantly reduce embedding API calls and improve throughput.
- Added new `embeddingProcessing` config options. Configure `concurrency` (default: 4) and `batchSize` (default: 32) for embedding operations. Set via `unrag.config.ts` under `defaults.embedding` or `engine.embeddingProcessing`.
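To illustrate what batching with a concurrency limit means in practice, the sketch below splits texts into batches and runs a bounded number of batch calls at once. It assumes a hypothetical `EmbedBatch` function standing in for a provider's batch call; `toBatches` and `embedAll` are illustrative names, and this is not the ingest pipeline's actual code. Only the defaults (4 and 32) come from the entry above.

```typescript
// Hypothetical stand-in for a provider's batch embedding call.
type EmbedBatch = (texts: string[]) => Promise<number[][]>;

// Split items into consecutive batches of at most `size` elements.
function toBatches<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new Error("batchSize must be at least 1");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) batches.push(items.slice(i, i + size));
  return batches;
}

// Embed all texts, running at most `concurrency` batch calls at a time.
// Results come back in the same order as the input texts.
async function embedAll(
  texts: string[],
  embedBatch: EmbedBatch,
  { concurrency = 4, batchSize = 32 } = {},
): Promise<number[][]> {
  const batches = toBatches(texts, batchSize);
  const results: number[][][] = new Array(batches.length);
  let next = 0;

  // Each worker repeatedly claims the next unprocessed batch index.
  async function worker(): Promise<void> {
    while (next < batches.length) {
      const index = next++;
      results[index] = await embedBatch(batches[index]);
    }
  }

  const workerCount = Math.max(1, Math.min(concurrency, batches.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));

  // Flatten the per-batch results back into one list of vectors.
  return ([] as number[][]).concat(...results);
}
```

With the default `batchSize` of 32, embedding 70 chunks takes 3 batch calls instead of 70 single calls, which is where the reduction in API calls comes from.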
Patch Changes
- Eliminate `any` type assertions across the codebase. Replaced all `any` type assertions with properly-typed interfaces for external SDKs and runtime-loaded modules:
  - Added typed interfaces for all 12 embedding providers (OpenAI, Azure, Bedrock, Cohere, Google, Mistral, Ollama, OpenRouter, Together, Vertex, Voyage, AI Gateway)
  - Added structural types for Google Drive API (`DriveFile`, `DriveClient`, `AuthClient`)
  - Added typed interfaces for extractors (pdfjs, audio transcription, video processing)
  - Improved Drizzle store types with proper `QueryRow` interface
- New `AssetMetadataFields` interface and `hasAssetMetadata()` type guard. Standardized metadata shape for asset-derived chunks with compile-time type safety.
- Refactored `mergeDeep` utility. Extracted to dedicated `deep-merge.ts` module with proper generic type signatures and `isRecord()` type guard.
- Typed `requireOptional()` helper. The shared optional dependency loader now requires explicit type parameters instead of defaulting to `any`.
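As an illustration of the pattern the `mergeDeep` entry describes, here is one way a generic deep merge with an `isRecord()` type guard can be written without `any`. The signatures and merge semantics (arrays replaced, `undefined` skipped) are assumptions for this sketch, and the sample config values are made up; the vendored `deep-merge.ts` may differ.

```typescript
// Type guard: true for object values that are neither null nor arrays.
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Recursively merge `override` into `base` without mutating either input.
// Nested records are merged key by key; any other value (including arrays) is replaced.
function mergeDeep<T extends Record<string, unknown>>(
  base: T,
  override: Record<string, unknown>,
): T {
  const result: Record<string, unknown> = { ...base };
  for (const key of Object.keys(override)) {
    const value = override[key];
    if (value === undefined) continue; // treat undefined as "not provided"
    const current = result[key];
    result[key] = isRecord(current) && isRecord(value) ? mergeDeep(current, value) : value;
  }
  return result as T;
}

// Hypothetical defaults, for illustration only.
const defaults = { chunking: { chunkSize: 200, chunkOverlap: 40 }, retrieval: { topK: 8 } };
const merged = mergeDeep(defaults, { chunking: { chunkSize: 512 } });

console.log(merged.chunking); // { chunkSize: 512, chunkOverlap: 40 }
```

The type guard lets the compiler narrow `unknown` values before recursing, so no assertion to `any` is needed inside the loop.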
0.2.6
Patch Changes
- Add `--preset` flag to `init` command for preset-based installation from URL or preset ID
- Add `--provider` flag to `init` command to select embedding provider during initialization
- Add `--overwrite` flag to `init` command with `skip` or `force` options for controlling file overwrite behavior
- Support 12 embedding providers with automatic peer dependency installation: OpenAI, Google AI, Azure OpenAI, Vertex AI, AWS Bedrock, Cohere, Mistral, Together.ai, Voyage, OpenRouter, Ollama, and AI Gateway
- Read available extractors and connectors from registry manifest instead of hardcoded lists
- Generate `unrag.config.ts` with preset-configurable defaults for chunkSize, chunkOverlap, topK, embedding model, type, and timeout
Fixes and improvements
- Fix drizzle pgvector store adapter to handle both array and object return types from `db.execute()`
- Upgrade AI SDK dependency from ^5.0.113 to ^6.0.3
- Refactor `add` command to use registry manifest for available extractors and connectors
- Add new CLI library modules for manifest reading and preset fetching
- Include embedding provider SDK as a dependency when provider is selected during init
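The `db.execute()` fix above concerns drivers that return query results in different shapes. A minimal sketch of one way to normalize them follows, assuming the result is either an array of rows or an object carrying a `rows` array. `toRows` is a hypothetical helper name, not the adapter's actual code.

```typescript
// Normalize a raw query result into a plain array of rows.
// Assumption: the result is either an array of rows or an object with a `rows` array.
function toRows<Row>(result: unknown): Row[] {
  if (Array.isArray(result)) return result as Row[];
  if (typeof result === "object" && result !== null && "rows" in result) {
    const rows = (result as { rows: unknown }).rows;
    if (Array.isArray(rows)) return rows as Row[];
  }
  return []; // unknown shape: treat as no rows
}

type IdRow = { id: string };

console.log(toRows<IdRow>([{ id: "a" }]).length); // 1
console.log(toRows<IdRow>({ rows: [{ id: "a" }, { id: "b" }] }).length); // 2
```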
0.2.5
Minor Changes
- Added Google Drive connector. Sync specific Google Drive files into Unrag by file ID, with stable source IDs for reliable updates and deletes.
- New support for safer, more controllable sync. Configurable per-file size limit (defaults to 15MB) and an option to delete previously ingested content when a file is removed or access is revoked.
- `unrag add google-drive` is now supported (and `unrag add` lists available connectors).
0.2.4
Patch Changes
- New `--rich-media` flag enables multimodal embeddings and prompts for extractor selection
- New `--extractors <list>` flag to specify extractors directly (e.g., `pdf-text-layer,image-ocr`)
- Interactive grouped extractor picker (PDF, Image, Audio, Video, Files)
- Default preset (`pdf-text-layer`, `file-text`) for non-interactive mode with `--yes --rich-media`
0.2.3
Minor Changes
- Multi-modal ingestion: The engine now processes images, PDFs, audio, video, and documents alongside text within a unified embedding space. Text queries can retrieve content from any modality.
- Extractor modules system: Added 12 extractors installable via `unrag add extractor:<name>`:
  - PDF: `pdf-llm` (LLM-based extraction), `pdf-ocr` (OCR fallback), `pdf-text-layer` (native text layer)
  - Image: `image-caption-llm` (LLM captioning), `image-ocr` (text extraction)
  - Audio: `audio-transcribe` (speech-to-text transcription)
  - Video: `video-frames` (frame extraction + analysis), `video-transcribe` (audio track transcription)
  - Files: `file-docx`, `file-pptx`, `file-xlsx`, `file-text` (document parsing)
- Asset processing pipeline: New `assetProcessing` config for routing assets to extractors with support for fallback chains, size limits, and per-kind strategies.
- `getAssetFromChunk()` helper: Resolve asset URLs from retrieved chunks to display original media alongside extracted text.
- Redesigned config API: More ergonomic configuration structure with optional storage of content in document and embedding tables.
Patch Changes
- Added default multi-modal model configuration so extraction works out of the box.
- Extractor-produced metadata (e.g., page numbers, confidence scores) now preserved as first-class metadata keys on chunks.
- Added ingestion warnings for skipped assets with clear reasons (unsupported kind, size limits, missing extractor).
- Added `repository` field to CLI's `package.json` for better npm discoverability.
0.2.2
Patch Changes
- Expanded help text from a minimal usage hint to a richer “mini manpage” including commands, global flags, `init` options, examples, and quick links to docs + repo.
- Updated unknown-command handling to include the help text so users can recover quickly.
- `unrag add notion` now prints a full documentation URL (instead of a relative `/docs/...` path) by constructing it from a shared base URL constant via `docsUrl(...)`.
- Added/used a central constants module (e.g. `cli/lib/constants.ts`) to hold the public base URL + repo URL and a small `docsUrl()` helper for consistent link formatting across commands.
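A helper like the `docsUrl()` mentioned above could look roughly like the sketch below. The base URL is a placeholder (`https://example.com`) and the constant name `DOCS_BASE_URL` is hypothetical, since the real values are not stated in this changelog.

```typescript
// Placeholder base URL; the real public base URL is an assumption left unfilled here.
const DOCS_BASE_URL = "https://example.com";

// Join the base URL and a site-relative path, tolerating missing or extra slashes.
function docsUrl(path: string): string {
  const base = DOCS_BASE_URL.replace(/\/+$/, "");
  const suffix = path.replace(/^\/+/, "");
  return `${base}/${suffix}`;
}

console.log(docsUrl("/docs/connectors/notion")); // https://example.com/docs/connectors/notion
```

Keeping the base URL in one constant means every command prints links in the same format, and a domain change touches a single line.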
0.2.1
Patch Changes
- Added `unrag add notion` to install the Notion connector into an existing Unrag installation.
- Shipped a vendored Notion connector (pages-only v1) that can ingest specific Notion pages by ID/URL and optionally delete on not-found.
- Added docs at `/docs/connectors/notion`.
0.2.0
Minor Changes
- `VectorStore` now includes a required `delete({ sourceId } | { sourceIdPrefix })` method.
- Ingestion is idempotent by default: built-in Postgres adapters treat `upsert()` as replace-by-`sourceId` (delete-by-exact-`sourceId` inside the transaction, then insert the new document/chunks/embeddings).
- `ContextEngine` exposes `delete(...)` to delete a single logical document or wipe a namespace prefix.
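The replace-by-`sourceId` and prefix-delete semantics described above can be sketched with a small in-memory store. `MemoryStore`, `Chunk`, and `DeleteInput` are hypothetical names for illustration; this is not the `VectorStore` interface or a Postgres adapter, and it omits embeddings and transactions entirely.

```typescript
type Chunk = { sourceId: string; content: string };
type DeleteInput = { sourceId: string } | { sourceIdPrefix: string };

// In-memory stand-in for a vector store; illustrative only.
class MemoryStore {
  private chunks: Chunk[] = [];

  // Replace-by-sourceId: remove existing chunks for the exact sourceId, then insert.
  upsert(sourceId: string, contents: string[]): void {
    this.delete({ sourceId });
    for (const content of contents) this.chunks.push({ sourceId, content });
  }

  // Delete one logical document (exact match) or a whole namespace (prefix match).
  // Returns the number of chunks removed.
  delete(input: DeleteInput): number {
    const before = this.chunks.length;
    this.chunks = this.chunks.filter((chunk) =>
      "sourceId" in input
        ? chunk.sourceId !== input.sourceId
        : !chunk.sourceId.startsWith(input.sourceIdPrefix),
    );
    return before - this.chunks.length;
  }

  count(): number {
    return this.chunks.length;
  }
}

const store = new MemoryStore();
store.upsert("tenant-a:doc-1", ["v1 part 1", "v1 part 2"]);
store.upsert("tenant-a:doc-1", ["v2"]); // re-ingesting replaces, so no duplicates
store.upsert("tenant-a:doc-2", ["other"]);

console.log(store.count()); // 2
console.log(store.delete({ sourceIdPrefix: "tenant-a:" })); // 2
```

Because `upsert()` deletes before inserting, ingesting the same `sourceId` twice leaves only the latest chunks, which is what makes repeated ingestion idempotent.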
0.1.1
Patch Changes
- Fixed `scope.sourceId` filtering in the Postgres store adapters (Drizzle, Prisma, Raw SQL) to treat `scope.sourceId` as a prefix (SQL `LIKE '${scope.sourceId}%'`) instead of an exact match, enabling namespaced/tenant-scoped retrieval consistent with the docs.
- Updated the Prisma store adapter to use Prisma’s runtime SQL helpers (sqltag/empty) instead of generated-client Prisma.sql
- Fixed TypeScript errors when Prisma Client isn’t generated yet while keeping the hard @prisma/client imports.
0.1.0
Minor Changes
- Initial version of unrag, complete with ingestion and retrieval primitives along with adapters for Drizzle, Prisma & SQL