Langfuse, ClickHouse

LEFT JOIN in trace percentile queries causes performance degradation at scale

info
performance
Updated Oct 30, 2025 (via Exa)
How to detect:

The dashboard percentile query uses a LEFT JOIN between the traces and observations tables. This forces ClickHouse to scan every trace in the time range, even though only traces with matching observations contribute to the latency calculation. At 100M+ observations, the query times out.
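One way to confirm the pattern on a self-hosted instance is to look for slow or failed queries in ClickHouse's `system.query_log` that contain both a LEFT JOIN and a quantile call. The sketch below only builds the SQL text. It assumes query logging is enabled, and the helper name, the 10-second threshold, and the 24-hour lookback are illustrative choices, not values from Langfuse.

```python
def slow_left_join_query(min_duration_ms: int = 10_000, lookback_hours: int = 24) -> str:
    """Build a ClickHouse query listing slow LEFT JOIN percentile queries.

    The threshold and lookback defaults are hypothetical; tune them to your workload.
    """
    if min_duration_ms < 0 or lookback_hours <= 0:
        raise ValueError("min_duration_ms must be >= 0 and lookback_hours must be > 0")
    return f"""
SELECT
    event_time,
    type,
    query_duration_ms,
    read_rows,
    substring(query, 1, 200) AS query_head
FROM system.query_log
WHERE event_time >= now() - INTERVAL {int(lookback_hours)} HOUR
  -- finished queries plus queries that failed mid-execution (e.g. timeouts)
  AND type IN ('QueryFinish', 'ExceptionWhileProcessing')
  AND query_duration_ms >= {int(min_duration_ms)}
  AND positionCaseInsensitive(query, 'LEFT JOIN') > 0
  AND positionCaseInsensitive(query, 'quantile') > 0
ORDER BY query_duration_ms DESC
LIMIT 20
""".strip()


if __name__ == "__main__":
    print(slow_left_join_query())
```

A high `read_rows` value on the matching entries is consistent with the full traces scan described above, though it does not prove the join is the only cause.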

Recommended action:

For self-hosted deployments that allow query customization, switch to an INNER JOIN when calculating trace latency percentiles: traces without matching observations contribute no meaningful latency values. Filter traces in a subquery, join observations with an INNER JOIN on trace_id, and apply the project_id and time-range filters to both sides of the join.
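The restructured query might look roughly like the following. This is a sketch, not the query Langfuse ships. The table and column names (`traces.id`, `traces.timestamp`, `observations.start_time`, `observations.end_time`), the parameter names, and the definition of trace latency as the span from the earliest observation start to the latest observation end are all assumptions to check against your schema. The helper `build_percentile_query` is hypothetical and only assembles the SQL text, leaving `{name:Type}` placeholders for ClickHouse query parameters.

```python
# Hypothetical schema: traces(id, project_id, timestamp) and
# observations(trace_id, project_id, start_time, end_time).
_QUERY_TEMPLATE = """
SELECT quantiles(__LEVELS__)(latency_ms) AS latency_percentiles
FROM (
    SELECT
        -- assumed latency definition: earliest start to latest end per trace
        date_diff('millisecond', min(o.start_time), max(o.end_time)) AS latency_ms
    FROM (
        -- traces are filtered in a subquery before the join
        SELECT id
        FROM traces
        WHERE project_id = {project_id:String}
          AND timestamp >= {from_ts:DateTime64(3)}
          AND timestamp < {to_ts:DateTime64(3)}
    ) AS t
    INNER JOIN (
        -- the same project and time filters are applied to observations
        SELECT trace_id, start_time, end_time
        FROM observations
        WHERE project_id = {project_id:String}
          AND start_time >= {from_ts:DateTime64(3)}
          AND start_time < {to_ts:DateTime64(3)}
    ) AS o ON o.trace_id = t.id
    GROUP BY t.id
)
""".strip()


def build_percentile_query(levels=(0.5, 0.9, 0.95, 0.99)) -> str:
    """Return the INNER JOIN percentile query for the given quantile levels."""
    if not levels or any(not 0 < p < 1 for p in levels):
        raise ValueError("percentile levels must lie strictly between 0 and 1")
    return _QUERY_TEMPLATE.replace("__LEVELS__", ", ".join(str(p) for p in levels))


if __name__ == "__main__":
    print(build_percentile_query())
```

Filtering observations by the same time window as traces can drop observations that start just outside it, so a deployment may want to widen the observations window slightly. The sketch also ignores row deduplication, which matters if the tables use a replacing engine.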