Chroma

Collection Metadata Corruption After Unclean Shutdown

critical
reliability
Updated Mar 2, 2026

Chroma's SQLite metadata store is vulnerable to corruption if the process terminates abnormally (crash, SIGKILL, power loss) during a write transaction. Corruption manifests as collection initialization failures, missing collections, inconsistent vector counts, or foreign key constraint violations. Corruption is more likely in embedded deployments that do not have WAL mode enabled.

How to detect:

Collection operations fail after an unclean process termination (OOM kill, crash, forced shutdown). Typical errors include 'database disk image is malformed', 'no such table', and foreign key violations; collections may also appear empty despite holding data before the shutdown. Running SQLite's `PRAGMA integrity_check` against the database reports errors instead of `ok`.
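The detection signals above can be checked programmatically at startup. The following is a minimal sketch using Python's standard `sqlite3` module; the function name `integrity_problems` and the idea of passing the database path as an argument are assumptions for illustration, not part of Chroma's API. It should be run while no Chroma process is writing to the file.

```python
import sqlite3
from pathlib import Path


def integrity_problems(db_path: str) -> list[str]:
    """Return a list of problems found in the SQLite file; an empty list means healthy.

    Hypothetical helper: db_path is whatever file your Chroma deployment uses
    for its metadata store.
    """
    # A missing file would otherwise be silently created as an empty database.
    if not Path(db_path).is_file():
        return [f"database file not found: {db_path}"]

    conn = sqlite3.connect(db_path)
    try:
        # integrity_check returns a single row containing 'ok' when healthy,
        # otherwise one row per detected problem.
        rows = conn.execute("PRAGMA integrity_check;").fetchall()
        # foreign_key_check returns one row per violating row:
        # (table, rowid, referenced table, foreign key index).
        fk_rows = conn.execute("PRAGMA foreign_key_check;").fetchall()
    except sqlite3.DatabaseError as exc:
        # Severe damage (for example 'database disk image is malformed')
        # surfaces as an exception rather than as result rows.
        return [f"integrity check could not run: {exc}"]
    finally:
        conn.close()

    problems = [row[0] for row in rows if row[0] != "ok"]
    problems += [
        f"foreign key violation in table {row[0]} (rowid {row[1]})"
        for row in fk_rows
    ]
    return problems
```

A startup script could call `integrity_problems(path)` and refuse to start, or raise an alert, whenever the returned list is non-empty.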

Recommended action:

1. Detect corruption: Run the SQLite integrity check on startup: `sqlite3 chroma.db 'PRAGMA integrity_check;'`. Check logs for error patterns that indicate database issues.
2. Prevent: Enable SQLite WAL (Write-Ahead Logging) mode, which provides better crash resilience: `PRAGMA journal_mode=WAL;`. Ensure graceful shutdown procedures (stop the process with SIGTERM, which it can handle, rather than SIGKILL, which it cannot). Implement regular backups.
3. Recover: If corruption is detected, restore from the latest backup. If no backup exists, attempt SQLite recovery: `sqlite3 chroma.db '.recover' | sqlite3 recovered.db`, then validate the recovered data.
4. Re-initialize: If recovery fails, delete the corrupted database and rebuild collections from source data.
5. Monitoring: Track unclean shutdowns (OOM kills, crashes), integrity check results on startup, and backup recency. Alert on any corruption detection.
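The prevention step (WAL mode plus regular backups) can be sketched with Python's standard `sqlite3` module. The helper names `enable_wal` and `backup_database` are hypothetical, and the sketch assumes you run it against the metadata file while Chroma is stopped; whether Chroma itself later changes the journal mode is not covered here and should be verified for your version.

```python
import sqlite3


def enable_wal(db_path: str) -> str:
    """Switch the database to WAL journaling and return the resulting mode.

    SQLite reports the mode actually in effect, so the caller should check
    that the return value is 'wal'.
    """
    conn = sqlite3.connect(db_path)
    try:
        mode = conn.execute("PRAGMA journal_mode=WAL;").fetchone()[0]
    finally:
        conn.close()
    return mode


def backup_database(src_path: str, dest_path: str) -> None:
    """Copy the database with SQLite's online backup API.

    Unlike a plain file copy, this produces a consistent snapshot even if
    another connection has the source database open.
    """
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```

WAL mode is stored in the database file, so `enable_wal` only needs to succeed once per file. A scheduled job could call `backup_database` and then run an integrity check on the copy before counting it as a valid backup.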