Trino, Amazon S3, MinIO

S3 list operations time out in Hive metastore with large buckets

warning
performance
Updated Jul 3, 2020
How to detect:

The Hive metastore fails to list S3 objects when the response takes longer than the socket timeout. With 100k+ files in a bucket, a single listing can take over 1 minute, and heavy load on the S3 cluster makes this worse. The operation fails with 'Failed to list directory' accompanied by a 'Read timed out' error.

Recommended action:

Increase the Hive connector's S3 socket timeout in the catalog properties. Set hive.s3.socket-timeout to a value between 1 and 5 minutes (e.g., hive.s3.socket-timeout=3m) to accommodate large directory listings and S3 cluster latency.
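
As a minimal sketch, the setting could look like this in the catalog properties file. The path etc/catalog/hive.properties is an assumption (the file is named after your catalog), and 3m is only the example value from the text above, not a tuned recommendation.

```properties
# Assumed location: etc/catalog/hive.properties (named after your catalog).
# Leave the existing entries in the file as they are and add or update this line.

# Allow up to 3 minutes for an S3 response before the read times out.
# Pick a value in the 1m-5m range based on bucket size and S3 cluster load.
hive.s3.socket-timeout=3m
```

Catalog properties are typically read at startup, so expect to restart the Trino nodes before the new timeout applies.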