I'll help you measure and reclaim that bloated space in your PostgreSQL table. This is a common issue caused by PostgreSQL's MVCC (Multi-Version Concurrency Control) system leaving behind dead tuples.
## Measuring Bloat
**1. Check how the table's size splits between heap and indexes:**
```sql
SELECT
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(rel)) AS total_size,
    pg_size_pretty(pg_relation_size(rel)) AS table_size,
    -- pg_indexes_size counts indexes only; total minus heap would also include TOAST
    pg_size_pretty(pg_indexes_size(rel)) AS indexes_size,
    round(100.0 * pg_relation_size(rel) / NULLIF(pg_total_relation_size(rel), 0)) AS table_pct
FROM (
    -- %I quotes identifiers, so mixed-case table names resolve correctly
    SELECT schemaname, tablename,
           format('%I.%I', schemaname, tablename)::regclass AS rel
    FROM pg_tables
) t
WHERE tablename = 'your_table_name'
  AND schemaname = 'public';
```
**2. Dead-tuple ratio from the statistics counters:**
```sql
-- A rough signal, not a byte-accurate bloat figure: it counts dead rows
-- that have not been vacuumed yet, based on the statistics counters.
SELECT
    current_database() AS db,
    n.nspname AS schemaname,
    c.relname AS tablename,
    round(100.0 * pg_stat_get_dead_tuples(c.oid)
          / NULLIF(pg_stat_get_live_tuples(c.oid) + pg_stat_get_dead_tuples(c.oid), 0), 2) AS dead_pct,
    pg_size_pretty(pg_relation_size(c.oid)) AS table_size,
    pg_stat_get_live_tuples(c.oid) AS live_tuples,
    pg_stat_get_dead_tuples(c.oid) AS dead_tuples
FROM pg_class c
-- Join through pg_namespace so a same-named table in another schema cannot match
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = 'your_table_name'
  AND n.nspname = 'public'
  AND c.relkind = 'r';
```
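The queries above rely on statistics counters, so they only approximate how much of the table is wasted. If the `pgstattuple` contrib extension is available on your server (an assumption, it is not installed by default), a sketch like the following reads the table itself and reports live, dead, and free space directly:

```sql
-- Assumes the pgstattuple contrib module is installed on the server.
CREATE EXTENSION IF NOT EXISTS pgstattuple;

-- Exact figures. This scans the whole table, so expect it to take a while
-- and generate read I/O on a 50GB relation.
SELECT
    pg_size_pretty(table_len)  AS table_size,
    tuple_percent              AS live_pct,
    dead_tuple_percent         AS dead_pct,
    pg_size_pretty(free_space) AS free_space,
    free_percent               AS free_pct
FROM pgstattuple('public.your_table_name'::regclass);

-- Cheaper approximation (PostgreSQL 9.5+): skips pages that the
-- visibility map marks as all-visible.
SELECT
    pg_size_pretty(table_len)         AS table_size,
    approx_tuple_percent              AS live_pct,
    dead_tuple_percent                AS dead_pct,
    pg_size_pretty(approx_free_space) AS free_space,
    approx_free_percent               AS free_pct
FROM pgstattuple_approx('public.your_table_name'::regclass);
```

A high `free_pct` combined with a low `dead_pct` usually means vacuum has already cleaned up the dead rows and the space is simply sitting unused inside the file, which is the case a table rewrite addresses.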
## Reclaiming Space Safely
### Option 1: VACUUM FULL (Requires exclusive lock)
```sql
-- This locks the table and rewrites it
VACUUM FULL VERBOSE your_table_name;
```
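**Pros:** Built-in, no extension required; rewrites the table and its indexes into compact new files
**Cons:** Holds an `ACCESS EXCLUSIVE` lock for the whole run (reads and writes both block), and needs free disk space for a full new copy of the table until the old one is dropped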
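One practical risk with the exclusive lock is that `VACUUM FULL` waits behind any long-running transaction on the table, and every query that arrives afterwards queues behind it. A minimal sketch of one way to limit that, assuming a 5-second wait is acceptable (the value is illustrative):

```sql
-- Give up after 5 seconds instead of waiting indefinitely for the lock
-- (and stalling every query that queues up behind this session).
SET lock_timeout = '5s';

VACUUM FULL VERBOSE your_table_name;

-- Restore the session default afterwards.
RESET lock_timeout;
```

If the command fails with a lock timeout error, nothing has been changed and it can simply be retried at a quieter moment.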
### Option 2: pg_repack (Recommended for production)
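```bash
# Install the pg_repack package on the server first, then create the extension
psql -d your_database -c 'CREATE EXTENSION pg_repack;'
# Run from the command line (takes only brief exclusive locks)
pg_repack -t your_table_name -d your_database
```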
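**Pros:** Works online with only brief exclusive locks, and rebuilds indexes too
**Cons:** Requires installing the extension, a primary key or `NOT NULL` unique index on the table, and free disk space of roughly twice the table plus its indexes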
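pg_repack needs a primary key or a unique index over `NOT NULL` columns to identify rows while it copies them. The query below is a rough pre-flight check, not pg_repack's own logic: it lists indexes on the table that look suitable. It assumes the table lives in the `public` schema, and it may miss a usable index that has nullable `INCLUDE` columns.

```sql
-- Indexes that could serve as pg_repack's row identifier.
-- An empty result means the table needs a key before it can be repacked.
SELECT
    i.indexrelid::regclass AS index_name,
    i.indisprimary         AS is_primary
FROM pg_index i
WHERE i.indrelid = 'public.your_table_name'::regclass
  AND i.indisvalid
  AND (i.indisprimary
       OR (i.indisunique
           AND i.indpred IS NULL       -- not a partial index
           AND i.indexprs IS NULL      -- no expression columns
           AND NOT EXISTS (            -- every indexed column is NOT NULL
               SELECT 1
               FROM pg_attribute a
               WHERE a.attrelid = i.indrelid
                 AND a.attnum = ANY (i.indkey)
                 AND NOT a.attnotnull)));
```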
### Option 3: CLUSTER (Reorders table)
```sql
-- Rewrite table ordered by an index
CLUSTER your_table_name USING your_index_name;
```
**Pros:** Also optimizes data order for better query performance
**Cons:** Requires exclusive lock like VACUUM FULL
### Option 4: Regular VACUUM + autovacuum tuning
```sql
-- Non-blocking: frees dead tuples for reuse inside the table, but returns only trailing empty pages to the OS
VACUUM VERBOSE your_table_name;
```
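Plain `VACUUM` will not hand the 40GB back to the operating system. It marks that space as reusable, so new rows fill it and the table stops growing; combined with tuned autovacuum, it keeps bloat from building up again.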
## Best Practice Workflow
```sql
-- 1. Check current bloat and dead tuples
SELECT n_dead_tup, n_live_tup, last_vacuum, last_autovacuum
FROM pg_stat_user_tables
WHERE relname = 'your_table_name';
-- 2. If high bloat, run regular vacuum first
VACUUM ANALYZE your_table_name;
-- 3. For immediate space reclaim:
-- During maintenance window:
VACUUM FULL VERBOSE your_table_name;
-- OR with pg_repack during production:
-- pg_repack -t your_table_name -d your_database
```
## Prevent Future Bloat
```sql
-- Tune autovacuum for this table
ALTER TABLE your_table_name SET (
autovacuum_vacuum_scale_factor = 0.05, -- More aggressive
autovacuum_vacuum_threshold = 50
);
-- Check autovacuum settings
SELECT relname, reloptions
FROM pg_class
WHERE relname = 'your_table_name';
```
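To see what these settings mean in practice: autovacuum processes a table once its dead tuples exceed `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples`. A rough sketch of that arithmetic, assuming the values 50 and 0.05 from the `ALTER TABLE` above and a table in the `public` schema:

```sql
SELECT
    c.relname,
    c.reltuples::bigint AS estimated_rows,
    s.n_dead_tup        AS dead_tuples_now,
    -- threshold + scale_factor * reltuples; reltuples can be -1 on a table
    -- that has never been analyzed, so clamp it at zero
    round(50 + 0.05 * GREATEST(c.reltuples, 0)) AS dead_tuples_needed
FROM pg_class c
JOIN pg_stat_user_tables s ON s.relid = c.oid
WHERE c.oid = 'public.your_table_name'::regclass;
```

For a table with about 100 million rows, this works out to roughly 5 million dead tuples before autovacuum starts, compared with about 20 million under the default scale factor of 0.2.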
**Which approach should you use?**
- **Development/off-hours:** VACUUM FULL (simplest)
- **Production with downtime window:** VACUUM FULL or CLUSTER
- **Production 24/7:** pg_repack (requires installation but near-zero downtime)
Would you like me to help you check the current bloat level or set up pg_repack?
I'll help you measure and reclaim that PostgreSQL table bloat. Here's a comprehensive approach:
## Measuring Bloat
First, list the largest tables and see how each one's size splits between heap and indexes. Size alone is not bloat, but it shows where to look:
```sql
-- 20 largest user tables: heap vs. index size
SELECT
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(rel)) AS total_size,
    pg_size_pretty(pg_relation_size(rel)) AS table_size,
    -- pg_indexes_size counts indexes only; total minus heap would also include TOAST
    pg_size_pretty(pg_indexes_size(rel)) AS indexes_size,
    -- NULLIF avoids a division-by-zero error on empty tables
    round(100.0 * pg_relation_size(rel) / NULLIF(pg_total_relation_size(rel), 0)) AS table_pct
FROM (
    SELECT schemaname, tablename,
           format('%I.%I', schemaname, tablename)::regclass AS rel
    FROM pg_tables
    WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
) t
ORDER BY pg_total_relation_size(rel) DESC
LIMIT 20;
```
For more detailed bloat estimation:
```sql
-- Dead-tuple counts and ratio per table (a proxy for bloat, not an exact figure)
SELECT
schemaname,
relname,
n_live_tup,
n_dead_tup,
-- 100.0 forces numeric division; integer division would truncate the decimals
round(100.0 * n_dead_tup / NULLIF(n_live_tup + n_dead_tup, 0), 2) AS dead_pct,
pg_size_pretty(pg_relation_size(relid)) AS size,
last_vacuum,
last_autovacuum
FROM pg_stat_user_tables
WHERE n_dead_tup > 0
ORDER BY n_dead_tup DESC;
```
## Safe Methods to Reclaim Space
### Option 1: VACUUM FULL (requires table lock)
**Pros:** Built-in, thorough
**Cons:** Exclusive lock blocks reads/writes, requires 2x disk space temporarily
```sql
VACUUM FULL VERBOSE your_table_name;
```
Or refresh the planner statistics in the same command (the analyze step runs after the rewrite):
```sql
VACUUM FULL ANALYZE your_table_name;
```
### Option 2: pg_repack (recommended for production)
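**Pros:** Works online; takes an `ACCESS EXCLUSIVE` lock only briefly at the start and end of the rebuild
**Cons:** Requires installing the extension; the table needs a primary key or a unique index on `NOT NULL` columns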
```bash
# Install extension (Debian/Ubuntu)
sudo apt-get install postgresql-<version>-repack
# Or via source
git clone https://github.com/reorg/pg_repack.git
```
```sql
-- In PostgreSQL
CREATE EXTENSION pg_repack;
```
```bash
# Repack specific table
pg_repack -t your_table_name -d your_database
# Repack entire database
pg_repack -d your_database
```
### Option 3: CLUSTER (reorders by index)
**Pros:** Also optimizes table order
**Cons:** Requires exclusive lock
```sql
CLUSTER your_table_name USING your_primary_key_index;
```
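On PostgreSQL 12 and later, both `CLUSTER` and `VACUUM FULL` report their progress in the `pg_stat_progress_cluster` view, which helps when estimating how long the lock will be held. A minimal sketch, meant to be run from a second session while the rewrite is in progress:

```sql
-- One row per backend currently running CLUSTER or VACUUM FULL.
SELECT
    p.pid,
    p.relid::regclass AS table_name,
    p.command,            -- 'CLUSTER' or 'VACUUM FULL'
    p.phase,
    p.heap_blks_scanned,
    p.heap_blks_total,
    round(100.0 * p.heap_blks_scanned / NULLIF(p.heap_blks_total, 0), 1) AS pct_scanned
FROM pg_stat_progress_cluster p;
```

The block counters advance only during the sequential heap scan phase, so `pct_scanned` is a partial indicator: the index rebuild that follows is not reflected in it.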
## Recommended Workflow
1. **Check current bloat** (with one of the queries above saved as `bloat_check.sql`):
```bash
psql -d your_database -f bloat_check.sql
```
2. **If bloat > 20% and table is large:**
- **Production:** Use `pg_repack` (minimal downtime)
- **Dev/Maintenance window:** Use `VACUUM FULL`
3. **Prevent future bloat:**
```sql
-- Check autovacuum settings
SELECT
relname,
reloptions
FROM pg_class
WHERE relname = 'your_table_name';
-- Tune autovacuum for high-churn tables
ALTER TABLE your_table_name SET (
autovacuum_vacuum_scale_factor = 0.05, -- Vacuum at 5% dead tuples (default 20%)
autovacuum_analyze_scale_factor = 0.05
);
```
4. **Monitor space after reclaim:**
```sql
SELECT pg_size_pretty(pg_total_relation_size('your_table_name'));
```
## Quick Decision Guide
- **< 1GB table:** VACUUM FULL during off-hours
- **1-100GB table with downtime:** VACUUM FULL
- **1-100GB table in production:** pg_repack
- **> 100GB table:** pg_repack or scheduled maintenance window
Given your 50GB→10GB situation, **pg_repack is likely your best option** if this is a production system. It rebuilds the table online and blocks other sessions only briefly at the start and end of the run.
Would you like help setting up pg_repack, or do you have a maintenance window where VACUUM FULL would work?