Data Type Selection and Schema Design Problems
Proactive Health (warning)
Identifying and fixing problematic data type choices in PostgreSQL schemas, including timestamp without time zone, char(n) padding issues, and the money type's locale problems.
Prompt: “I'm reviewing our PostgreSQL schema and I see we're using timestamp without time zone in many places, char(20) for some text fields, and the money type for currency. Are these bad choices and what problems might they cause in production?”
Agent Playbook
When an agent encounters this scenario, Schema provides these diagnostic steps automatically.
When reviewing a PostgreSQL schema for problematic data types, start with the critical data corruption risks from timestamp without time zone and money, then address the space-padding and arbitrary-limit issues from char(n) and varchar(n), and finally audit for less common but equally problematic types like timetz and serial.
1. Audit all timestamp columns for timezone handling
Query information_schema.columns for timestamp without time zone usage: SELECT table_name, column_name FROM information_schema.columns WHERE data_type = 'timestamp without time zone'. This is your highest priority because `timestamp-no-timezone-corruption` causes silent data corruption when application servers are in different timezones or timezone settings change. Any column storing a point in time (created_at, updated_at, scheduled_for) should be timestamptz, not timestamp. If you have distributed systems or multiple environments with different timezone configs, this is actively corrupting your data right now.
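The audit query and a conversion can be sketched as follows; the events table and created_at column are hypothetical, and the USING clause assumes the naive timestamps were written as UTC (adjust the zone if your application wrote local time):

```sql
-- Find every column declared timestamp without time zone,
-- skipping system catalogs
SELECT table_schema, table_name, column_name
FROM information_schema.columns
WHERE data_type = 'timestamp without time zone'
  AND table_schema NOT IN ('pg_catalog', 'information_schema');

-- Convert a column to timestamptz, interpreting existing values as UTC
-- (table and column names here are hypothetical)
ALTER TABLE events
  ALTER COLUMN created_at TYPE timestamptz
  USING created_at AT TIME ZONE 'UTC';
```

Note that the ALTER rewrites the table, so schedule it for a maintenance window on large tables.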
2. Identify money type columns and check locale settings
Search for money type usage with: SELECT table_name, column_name FROM information_schema.columns WHERE data_type = 'money'. The `money-type-locale-corruption` insight tells us that money precision depends on lc_monetary locale setting, which can differ between servers or during restore operations, silently changing values. Check your cluster's locale with SHOW lc_monetary; if you have multiple databases or plan to replicate/migrate data, this is a ticking time bomb. Plan migration to numeric(19,4) with separate currency columns.
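A sketch of the locale check and the migration, using a hypothetical payments table; money casts cleanly to numeric, and the explicit currency column replaces the implicit locale dependency:

```sql
-- Show the locale that currently governs money formatting on this server
SHOW lc_monetary;

-- Migrate a hypothetical payments.amount column off the money type;
-- numeric(19,4) preserves four decimal places without locale dependence
ALTER TABLE payments
  ALTER COLUMN amount TYPE numeric(19,4)
  USING amount::numeric;

-- Track the currency explicitly with an ISO 4217 code
ALTER TABLE payments
  ADD COLUMN currency text NOT NULL DEFAULT 'USD'
  CHECK (char_length(currency) = 3);
```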
3. Find char(n) columns and measure storage waste
Query for char(n) usage: SELECT table_name, column_name, character_maximum_length FROM information_schema.columns WHERE data_type = 'character'. The `char-n-space-padding-issues` insight explains that char(n) pads with trailing spaces, wasting storage and causing confusing comparison behavior. PostgreSQL stores char(n) and text identically internally, so there's zero performance benefit. For a table with millions of rows and char(20) fields storing 5-character values, you're wasting 15 bytes per row. Calculate actual storage waste and prioritize high-volume tables for migration to text.
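The waste estimate and migration might look like this for a hypothetical products.code column declared char(20); the 20 in the query is that assumed declared length:

```sql
-- Average padding waste per row: declared length minus trimmed length
-- (products/code and the declared length 20 are hypothetical)
SELECT 20 - avg(char_length(trim(trailing ' ' from code))) AS avg_wasted_bytes
FROM products;

-- Migrate to text; the explicit trim discards the padding that
-- char(n) added on insert
ALTER TABLE products
  ALTER COLUMN code TYPE text
  USING trim(trailing ' ' from code);
```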
4. Review varchar(n) columns for arbitrary limits
Find all varchar columns with: SELECT table_name, column_name, character_maximum_length FROM information_schema.columns WHERE data_type = 'character varying' ORDER BY character_maximum_length. Look for the classic varchar(255) or other arbitrary limits — the `varchar-n-arbitrary-limits` insight notes these provide no performance advantage over text in PostgreSQL but create constraints that require table rewrites to change. If you see varchar(255) on email addresses or names, that's a code smell. Replace with text and CHECK constraints for actual business logic validation.
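A minimal sketch for a hypothetical users.email column; the 320-character ceiling is a commonly cited upper bound for email addresses and stands in for whatever your actual business rule is:

```sql
-- Drop the arbitrary varchar(255) limit; varchar -> text needs no rewrite
ALTER TABLE users
  ALTER COLUMN email TYPE text;

-- Encode the real business rule as a named CHECK constraint,
-- which can later be dropped or replaced without a table rewrite
ALTER TABLE users
  ADD CONSTRAINT users_email_length CHECK (char_length(email) <= 320);
```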
5. Check for timetz usage and meaningless offsets
Search for time with time zone: SELECT table_name, column_name FROM information_schema.columns WHERE data_type = 'time with time zone'. The `timetz-meaningless-offset` insight explains that timetz stores time-of-day with timezone offset, but that's meaningless because DST rules depend on the date — '3:00 PM PDT' is ambiguous without knowing which day. PostgreSQL docs say this type exists only for SQL compliance. If you need points in time, use timestamptz; for time-of-day values like 'store opens at 9:00 AM', use plain time and handle timezone conversion in your app.
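The two recommended patterns can be sketched as follows; both table definitions are hypothetical:

```sql
-- Points in time: always timestamptz
CREATE TABLE appointments (
  id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  starts_at timestamptz NOT NULL
);

-- Recurring time-of-day values: plain time plus the location's IANA zone
-- name, so the correct offset can be resolved per date in the application
CREATE TABLE store_hours (
  store_id bigint NOT NULL,
  opens_at time NOT NULL,
  closes_at time NOT NULL,
  tz text NOT NULL DEFAULT 'America/Los_Angeles'
);
```

Storing the zone name rather than an offset is what makes DST transitions resolvable: 'America/Los_Angeles' at 9:00 AM means UTC-7 in summer and UTC-8 in winter, which no fixed offset can express.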
6. Audit serial columns for modern identity alternatives
Find serial/bigserial columns: SELECT table_name, column_name FROM information_schema.columns WHERE column_default LIKE 'nextval%'. While not causing data corruption, the `serial-lacks-ownership-enforcement` insight shows that serial pseudo-types don't truly own their sequences and allow manual inserts that can cause sequence conflicts. If you're on PostgreSQL 10+, new tables should use GENERATED ALWAYS AS IDENTITY instead — it ties the sequence tightly to the column and prevents the foot-gun of manual ID insertions. This is a lower priority migration but worth planning for your next schema version.
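The identity-column replacement looks like this (the orders table is hypothetical):

```sql
-- Modern replacement for bigserial on PostgreSQL 10+
CREATE TABLE orders (
  id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  placed_at timestamptz NOT NULL DEFAULT now()
);

-- GENERATED ALWAYS rejects manual ids unless OVERRIDING SYSTEM VALUE
-- is given, closing the sequence-conflict foot-gun:
-- INSERT INTO orders (id, placed_at) VALUES (42, now());  -- ERROR
```

Unlike serial, the sequence behind an identity column is owned by the column: it is dropped with the column and cannot be manipulated independently without explicit privileges.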
Related Insights
timetz stores meaningless time-only values with timezone offset (warning)
Storing UTC in timestamp columns breaks timezone-aware functions (warning)
timestamp without time zone causes silent data corruption across timezones (diagnostic step, critical)
char(n) pads with spaces causing comparison and storage issues (diagnostic step, warning)
money type precision depends on locale causing data corruption (diagnostic step, critical)
serial pseudo-type lacks sequence ownership and permissions enforcement (supporting evidence, warning)
varchar(n) creates arbitrary length limits requiring table rewrites (supporting evidence, info)
Monitoring Interfaces
PostgreSQL Native