Choose a use case

Technology	Scenario	Category	Severity	Grade
Apache Kafka	Broker Disk Space Approaching Capacity: My Kafka broker disk usage just hit 92% and climbing. What should I do to prevent the broker from crashing? Should I delete old segments or increase retention settings?	Incident Response	critical	—
Apache Kafka	Broker Failure and Excessive Producer Instantiation: Our Kafka brokers just crashed with OutOfMemoryError. I see millions of producer connections being tracked. Did something in our application code go wrong?	Incident Response	critical	—
Apache Kafka	Consumer Group Rebalancing Storm During Deployment: We're doing a rolling deployment of our Kafka consumers and the entire consumer group has been stuck rebalancing for 30 minutes. Processing is completely stopped. What's happening and how do we fix it?	Incident Response	critical	—
Apache Kafka	Consumer Lag Spike During Peak Traffic: My Kafka consumer lag just spiked from 100 messages to 50,000 messages in the last 10 minutes. Help me figure out what's wrong and whether I need to scale up my consumer group.	Capacity Planning	critical	—
Apache Kafka	Cost Optimization with Tiered Storage Strategy: My AWS MSK cluster storage costs are high due to 90-day retention requirements. Should I enable tiered storage to move older data to S3? What's the cost-performance trade-off?	Cost Optimization	info	—
Apache Kafka	Heap Memory and Garbage Collection Tuning: My Kafka brokers are experiencing GC pauses of 200-500ms every few minutes, causing request timeouts. Current heap is set to 8GB with default GC settings. How should I tune the JVM for better performance?	Proactive Health	warning	—
Apache Kafka	Migration from Self-Hosted to Managed Kafka: We're planning to migrate our self-hosted Kafka cluster to AWS MSK. What's the best approach to migrate without downtime? Should we use MirrorMaker 2 or Cluster Linking?	Migration	info	—
Apache Kafka	Network Saturation Between Brokers and Clients: My Kafka broker network usage is at 85% of capacity and I'm seeing slower producer acknowledgments. Is my network saturated? Should I enable compression or scale up my broker instance types?	Incident Response	warning	—
Apache Kafka	P99 Latency Degradation Investigation: My Kafka cluster's P99 produce latency just jumped from 50ms to 300ms, but P50 is still around 20ms. What's causing the tail latency spike and how do I troubleshoot it?	Incident Response	warning	—
Apache Kafka	Partition Count Optimization for Throughput: I have a Kafka topic with 6 partitions but need to scale to handle 5x more throughput. Should I increase partition count to 30 or 60? What's the impact on broker performance and recovery time?	Proactive Health	info	—
Apache Kafka	Producer Throughput Optimization with Batching: My Kafka producers are only achieving 10,000 messages/sec but we need 50,000 messages/sec. I see low batch sizes in the metrics. Should I tune batch.size, linger.ms, or enable compression to improve throughput?	Proactive Health	info	—
Apache Kafka	Right-Sizing Kafka Cluster on AWS MSK: My AWS MSK cluster is running on m5.large brokers with 60% CPU and 40% disk usage. Based on my current throughput and partition count, should I scale up to larger instances, add more brokers, or am I over-provisioned?	Capacity Planning	info	—
Apache Kafka	Under-Replicated Partitions Appearing: I'm seeing under-replicated partitions in my Kafka cluster monitoring. ISR count is dropping on several partitions. What's causing this and how urgent is it to fix?	Incident Response	critical	—
Apache Kafka	ZooKeeper to KRaft Migration Planning: We're still running Kafka with ZooKeeper and need to migrate to KRaft before upgrading to Kafka 4.0. What's involved in this migration and can we do it without downtime?	Migration	warning	—