Langfuse, ClickHouse

Dashboard query timeout on large observations table prevents trace latency panel loading

warning
performance
Updated Oct 30, 2025 (via Exa)
How to detect:

The 'Trace latency percentiles' dashboard panel times out and fails to load when the observations table contains more than 100 million rows, even with time-range filters applied. The panel stays stuck in a loading state and eventually displays the error 'Backend Service Overloaded - Database resource limit exceeded'.

Recommended action:

Increase the ClickHouse memory allocation to 16 GiB or more for heavy analytical workloads. Make sure queries filter on the raw start_time column rather than wrapping it in a function such as toDate(), so that ClickHouse can use its indexes efficiently. Consider adding secondary (data-skipping) indexes or materialized views on project_id, trace_id, or latency fields. Raise ClickHouse's max_execution_time and max_memory_usage settings if the hardware can handle the load. For self-hosted deployments, consider switching from LEFT JOIN to INNER JOIN in percentile queries when traces without observations can be excluded.
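
As a rough illustration of two of the recommendations above (filtering on raw start_time, and raising max_execution_time and max_memory_usage per query), the following Python sketch builds a percentile query string. It is not the query Langfuse actually runs: the helper names build_latency_query and day_window, the end_time column, and the latency expression dateDiff('millisecond', start_time, end_time) are assumptions made for the example, and the {name:Type} placeholders rely on ClickHouse's server-side query parameters.

```python
from datetime import datetime, timedelta, timezone

GIB = 1024 ** 3


def build_latency_query(max_execution_seconds: int = 60,
                        max_memory_bytes: int = 16 * GIB) -> str:
    """Return a percentile query that filters on raw start_time.

    Hypothetical helper: the table layout (observations with project_id,
    start_time, end_time) and the latency expression are assumptions.
    """
    return (
        "SELECT quantiles(0.5, 0.9, 0.99)("
        "dateDiff('millisecond', start_time, end_time)) AS latency_ms\n"
        "FROM observations\n"
        "WHERE project_id = {project_id:String}\n"
        # Compare the raw column against bounds instead of
        # toDate(start_time) = ..., so index pruning stays possible.
        "  AND start_time >= {start:DateTime64(3)}\n"
        "  AND start_time < {end:DateTime64(3)}\n"
        # Per-query limits; raise them only if the hardware can take it.
        f"SETTINGS max_execution_time = {int(max_execution_seconds)}, "
        f"max_memory_usage = {int(max_memory_bytes)}"
    )


def day_window(day: datetime) -> dict:
    """Half-open [start, end) bounds covering the calendar day of `day`."""
    start = day.replace(hour=0, minute=0, second=0, microsecond=0)
    end = start + timedelta(days=1)
    fmt = "%Y-%m-%d %H:%M:%S"
    return {"start": start.strftime(fmt), "end": end.strftime(fmt)}


if __name__ == "__main__":
    print(build_latency_query(max_execution_seconds=120))
    print(day_window(datetime(2025, 10, 30, 15, 4, tzinfo=timezone.utc)))
```

The returned string and the bounds from day_window would be passed, together with a project_id value, to whichever ClickHouse client the deployment uses. The half-open window expresses the same filter as toDate(start_time) = '2025-10-30' without applying a function to the column.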