ClickHouse

Query Failures from High Cardinality GROUP BY

warning
Resource Contention
Updated Jan 21, 2026

Queries that GROUP BY high-cardinality columns (millions of unique values, such as session_id) can fail or time out because the aggregation hash table holds one entry per distinct key and can grow past the query memory limit. The risk is highest when external aggregation (spilling to disk) is not configured.

How to detect:

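In system.query_log, look for failed queries that contain a GROUP BY clause and raised a memory-related exception (for example MEMORY_LIMIT_EXCEEDED, error code 241). Track queries whose aggregation states hold a very large number of unique keys. Monitor queries whose result row count is very large (result_rows > 1M), since for a GROUP BY query without LIMIT or HAVING the result row count roughly tracks the number of distinct keys.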
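The checks above can be sketched as two queries against system.query_log. This is a minimal sketch, assuming query logging is enabled. The one-day window, the LIMIT, and the text match on 'GROUP BY' are illustrative choices, not part of the original guidance. The text match is a heuristic and will miss queries that format the clause unusually.

```sql
-- Failed GROUP BY queries from the last day that hit the memory limit.
-- Assumption: the 1-day window and LIMIT 50 are arbitrary; adjust as needed.
SELECT
    event_time,
    query_id,
    user,
    exception_code,
    formatReadableSize(memory_usage) AS peak_memory,
    query_duration_ms,
    substring(query, 1, 200) AS query_preview
FROM system.query_log
WHERE event_date >= today() - 1
  AND type = 'ExceptionWhileProcessing'
  AND exception_code = 241                              -- MEMORY_LIMIT_EXCEEDED
  AND positionCaseInsensitive(query, 'GROUP BY') > 0    -- heuristic text match
ORDER BY event_time DESC
LIMIT 50;

-- Successful GROUP BY queries that returned more than 1M rows,
-- ordered by peak memory so the riskiest candidates come first.
SELECT
    query_id,
    result_rows,
    formatReadableSize(memory_usage) AS peak_memory,
    query_duration_ms,
    substring(query, 1, 200) AS query_preview
FROM system.query_log
WHERE event_date >= today() - 1
  AND type = 'QueryFinish'
  AND result_rows > 1000000
  AND positionCaseInsensitive(query, 'GROUP BY') > 0
ORDER BY memory_usage DESC
LIMIT 50;
```

On a cluster, system.query_log is local to each server, so these queries would need to run on every node or through a cluster-wide wrapper.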

Recommended action:

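Use approximate aggregation functions (uniqHLL12, quantileTDigest) when exact results aren't required; the ClickHouse documentation generally prefers uniq or uniqCombined over uniqHLL12 for approximate distinct counts. Filter data as early as possible in the query with WHERE/PREWHERE so fewer rows reach the aggregation. Aggregate at a coarser level (e.g., by user_id instead of session_id). Set max_bytes_before_external_group_by to a non-zero value (0 disables spilling) so that large aggregations spill to disk instead of failing. Consider pre-aggregating with materialized views.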
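The recommendations above might look like the following in practice. This is a sketch under stated assumptions: the events table, its columns, the events_by_user and events_by_user_mv names, and every numeric value are hypothetical and chosen only for illustration. The settings and functions themselves are standard ClickHouse.

```sql
-- Hypothetical source table (assumption):
--   events(event_date Date, user_id UInt64, session_id String, duration_ms UInt32)

-- 1. Filter early, group at a coarser level, and use approximate aggregates.
SELECT
    user_id,
    uniqHLL12(session_id)              AS approx_sessions,
    quantileTDigest(0.95)(duration_ms) AS approx_p95_duration_ms
FROM events
WHERE event_date >= today() - 7
GROUP BY user_id;

-- 2. If grouping by session_id is unavoidable, let the aggregation spill to disk.
--    The byte values are illustrative; a common guideline is to set the spill
--    threshold to about half of max_memory_usage.
SELECT
    session_id,
    count() AS events_in_session
FROM events
WHERE event_date >= today() - 7
GROUP BY session_id
SETTINGS
    max_memory_usage = 10000000000,
    max_bytes_before_external_group_by = 5000000000;

-- 3. Pre-aggregate with a materialized view so reads scan far fewer rows.
CREATE TABLE events_by_user
(
    event_date Date,
    user_id    UInt64,
    sessions   AggregateFunction(uniq, String)   -- partial state for uniq(session_id)
)
ENGINE = AggregatingMergeTree()
ORDER BY (event_date, user_id);

CREATE MATERIALIZED VIEW events_by_user_mv TO events_by_user AS
SELECT
    event_date,
    user_id,
    uniqState(session_id) AS sessions
FROM events
GROUP BY event_date, user_id;

-- Reading from the pre-aggregated table: merge the stored partial states.
SELECT
    user_id,
    uniqMerge(sessions) AS approx_sessions
FROM events_by_user
WHERE event_date >= today() - 7
GROUP BY user_id;
```

The materialized view only processes rows inserted after it is created, so existing data would need a separate backfill. External aggregation trades memory for disk I/O, so queries that spill complete but typically run slower.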