TimeLimitExceeded exception during error handling causes worker thread death
critical
When an actor with time_limit exceeds its limit while blocked in a syscall and then raises an exception, a race condition can cause TimeLimitExceeded to fire while broker.emit_after is still handling that exception. Because TimeLimitExceeded inherits from BaseException (not Exception), it escapes `except Exception` handlers and kills the worker thread. The worker process keeps running with fewer live threads and can appear to hang. Confirmed unfixed in versions 1.11.0, 1.14.2, and 1.17.0.
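The escape path can be reproduced in plain Python, without dramatiq. The following is a minimal sketch, not dramatiq's actual code: FakeTimeLimitExceeded, handle_failure, worker_loop, and run_demo are hypothetical names, and the exception is raised synchronously inside the handler to stand in for the asynchronous injection that happens in the real race.

```python
import threading


class FakeTimeLimitExceeded(BaseException):
    """Hypothetical stand-in for an exception that derives from BaseException."""


def handle_failure():
    # Assumption: simulates the time-limit exception arriving while the
    # error-handling code for the actor's own exception is still running.
    raise FakeTimeLimitExceeded()


def worker_loop(results):
    try:
        try:
            raise ValueError("actor failed")
        except Exception:
            # Error handling starts here and is interrupted.
            handle_failure()
    except Exception:
        # Not reached: `except Exception` does not match BaseException subclasses.
        results.append("caught")
    # Not reached either: the exception propagates out of the thread target.
    results.append("still alive")


def run_demo():
    results = []
    thread = threading.Thread(target=worker_loop, args=(results,))
    thread.start()
    thread.join()  # the default threading.excepthook prints a traceback to stderr
    return results, thread.is_alive()
```

Calling run_demo() shows that neither handler ran and the thread has exited, which is the same shape of failure as a worker thread dying silently.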
Use gevent instead of threading for time limits (recommended by the maintainer): gevent timeouts are more precise and do not rely on asynchronous exceptions. Alternatively, implement a watchdog that detects dead worker threads and restarts them. Monitor the dramatiq.worker.threads metric; a decrease indicates that a thread has died. Without gevent, avoid setting time_limit on actors that perform long blocking syscalls.
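The watchdog idea can be sketched with generic threads. This is an illustration under stated assumptions, not dramatiq's API: ThreadWatchdog and its methods are hypothetical, and it manages plain threading.Thread objects that it created itself rather than dramatiq's internal worker threads.

```python
import threading


class ThreadWatchdog:
    """Hypothetical helper: keeps a fixed number of threads running `target`."""

    def __init__(self, target, count):
        self._target = target
        self._threads = [self._spawn() for _ in range(count)]

    def _spawn(self):
        thread = threading.Thread(target=self._target, daemon=True)
        thread.start()
        return thread

    def alive_count(self):
        # Number of managed threads that are still running.
        return sum(1 for thread in self._threads if thread.is_alive())

    def check(self):
        # Replace every dead thread with a fresh one; return how many were replaced.
        restarted = 0
        for index, thread in enumerate(self._threads):
            if not thread.is_alive():
                self._threads[index] = self._spawn()
                restarted += 1
        return restarted
```

A supervisor would call check() periodically (for example from a timer loop) and log or alert whenever it returns a non-zero count, since each restart means a thread died unexpectedly.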