Dashboard query timeout on large observations table prevents trace latency panel loading
The 'Trace latency percentiles' dashboard panel times out and fails to load when the observations table contains over 100 million rows, even with time-range filters applied. The panel remains stuck in a loading state and eventually displays a 'Backend Service Overloaded - Database resource limit exceeded' error.
Increase the ClickHouse memory allocation to 16 GiB or more for heavy analytical workloads. Make sure queries filter on the raw start_time column rather than wrapping it in a function such as toDate(), so that ClickHouse can use its indexes to skip data instead of scanning the whole table. Consider adding secondary (data-skipping) indexes or materialized views on project_id, trace_id, or the latency fields. If the hardware can handle the load, raise ClickHouse's max_execution_time and max_memory_usage settings. For self-hosted deployments, consider switching from LEFT JOIN to INNER JOIN in the percentile queries when traces without observations can be excluded.
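The filtering and limit advice above can be sketched as a small query builder. This is only an illustration under stated assumptions, not the query the dashboard actually runs: the function name build_latency_query and the latency_ms column are hypothetical, and the limit values are placeholders to be tuned to the hardware. It compares the raw start_time column against a half-open time range (no toDate() wrapper) and attaches max_execution_time (seconds) and max_memory_usage (bytes) through a SETTINGS clause. The {name:Type} placeholders are ClickHouse query parameters, to be bound by whichever client is in use.

```python
from datetime import datetime, timezone


def build_latency_query(project_id, start, end,
                        max_seconds=60, max_memory_bytes=8 * 1024**3):
    """Return (sql, params) for a latency-percentile query.

    Assumptions: the table is named `observations`, and `latency_ms`
    is a hypothetical column holding the latency in milliseconds.
    """
    if start >= end:
        raise ValueError("start must be before end")

    sql = (
        # quantiles() computes several percentiles in a single pass.
        "SELECT quantiles(0.5, 0.9, 0.99)(latency_ms) AS latency_percentiles\n"
        "FROM observations\n"
        "WHERE project_id = {project_id:String}\n"
        # Compare the raw column, not toDate(start_time), so the
        # time range can be used to skip data.
        "  AND start_time >= {from_ts:DateTime64(3)}\n"
        "  AND start_time < {to_ts:DateTime64(3)}\n"
        # Per-query limits; values here are placeholders.
        f"SETTINGS max_execution_time = {int(max_seconds)}, "
        f"max_memory_usage = {int(max_memory_bytes)}"
    )

    def fmt(ts):
        # Millisecond precision, e.g. 2024-01-01 00:00:00.000
        return ts.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]

    params = {
        "project_id": project_id,
        "from_ts": fmt(start),
        "to_ts": fmt(end),
    }
    return sql, params


sql, params = build_latency_query(
    "example-project",  # hypothetical project id
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 2, tzinfo=timezone.utc),
)
print(sql)
print(params)
```

Setting the limits per query rather than server-wide keeps the higher ceilings confined to the heavy dashboard panel; whether that is preferable to changing the server profile depends on the deployment.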