Skip to main content

ADR-0002: Data Modeling and Database Architecture

Status

Accepted - 2025-10-24


Context

TVL Platform requires a robust data model for a vacation rental management system with:

  • Multi-tenant architecture (100s-1000s of organizations)
  • Complex relationships (properties → units → bookings → payments)
  • High consistency requirements (no double-bookings, accurate financial records)
  • Performance targets (< 200ms for calendar queries, < 100ms for availability checks)
  • Scalability to 10,000+ properties, 100,000+ bookings/year

Decision

PostgreSQL via Supabase with the following architectural patterns:

1. Primary Keys: UUIDs

CREATE TABLE properties (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
...
);

Rationale:

  • Distributed ID generation (no coordination needed)
  • No information leakage (unlike sequential IDs)
  • URL-safe and globally unique
  • Compatible with distributed systems

Trade-off: 16 bytes vs. 8 bytes for BIGINT, but worth it for security and flexibility.

2. Multi-Tenancy: organization_id Column

CREATE TABLE properties (
id UUID PRIMARY KEY,
organization_id UUID NOT NULL REFERENCES organizations(id),
...
);

CREATE INDEX idx_properties_org ON properties(organization_id);

Rationale:

  • Shared database (simpler than database-per-tenant)
  • Row-Level Security (RLS) enforces isolation
  • Easier backups and migrations
  • Cost-effective for SMB market

Trade-off: Must ensure RLS policies on ALL tenant tables; but Supabase makes this straightforward.

3. Soft Deletes

CREATE TABLE properties (
...
deleted_at TIMESTAMPTZ NULL
);

CREATE INDEX idx_properties_not_deleted ON properties(organization_id) WHERE deleted_at IS NULL;

Rationale:

  • Recoverability (undo delete within 30 days)
  • Audit trail preservation
  • Foreign key integrity (no orphaned records)

Trade-off: Queries must filter WHERE deleted_at IS NULL; mitigated with partial indexes.

4. Timestamps: created_at, updated_at

CREATE TABLE properties (
...
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE TRIGGER update_properties_updated_at
BEFORE UPDATE ON properties
FOR EACH ROW
EXECUTE FUNCTION update_updated_at_column();

Rationale:

  • Audit trail
  • Debugging and troubleshooting
  • Ordering and pagination

5. Currency: Integers (Cents)

CREATE TABLE bookings (
...
total_amount_cents INTEGER NOT NULL CHECK (total_amount_cents >= 0)
);

Rationale:

  • Avoid floating-point precision errors
  • Exact arithmetic (important for money)
  • Industry standard (Stripe uses cents)

Trade-off: Must convert to/from dollars in application layer; worth it for accuracy.

6. Enum Types: Check Constraints vs. Postgres ENUMs

Decision: Use VARCHAR with CHECK constraints (not Postgres ENUMs)

CREATE TABLE properties (
...
property_type VARCHAR(50) NOT NULL CHECK (
property_type IN ('house', 'apartment', 'condo', 'villa', 'cabin', 'townhouse', 'cottage', 'other')
)
);

Rationale:

  • Postgres ENUMs are hard to modify (requires migration)
  • Check constraints allow easy value additions
  • Simpler for application code (no custom types)

Trade-off: Less type safety; mitigated with application-level validation.

7. JSONB for Flexible Metadata

CREATE TABLE properties (
...
metadata JSONB DEFAULT '{}'::jsonb
);

CREATE INDEX idx_properties_metadata_gin ON properties USING GIN(metadata);

Rationale:

  • Future-proof (add fields without migrations)
  • Useful for custom fields, connector-specific data
  • Fast queries with GIN index

Trade-off: Less schema enforcement; use sparingly and document expected structure.

8. Composite Indexes for Tenant Queries

-- Most queries filter by organization_id first
CREATE INDEX idx_bookings_org_created ON bookings(organization_id, created_at DESC);
CREATE INDEX idx_properties_org_name ON properties(organization_id, name);

Rationale:

  • Postgres uses leftmost prefix of composite index
  • Query plans optimized for tenant-scoped queries
  • Reduces full table scans

Alternatives Considered

Alternative 1: MongoDB (NoSQL)

Rejected

Pros:

  • Schema flexibility
  • Horizontal scaling easier
  • JSON-native

Cons:

  • No ACID guarantees across documents (risky for bookings/payments)
  • No JOINs (complex queries require multiple round-trips)
  • RLS not built-in (must implement in application)
  • Team familiarity with SQL > NoSQL

Decision: ACID guarantees are critical for vacation rental bookings and financial data.


Alternative 2: Separate Database Per Tenant

Rejected

Pros:

  • Perfect isolation (no chance of data leaks)
  • Easier to scale individual tenants
  • Simpler RLS (not needed)

Cons:

  • Expensive (N databases = N × cost)
  • Operationally complex (backups, migrations × N)
  • Cross-tenant analytics difficult
  • Not needed at MVP scale

Decision: Shared database with RLS is sufficient and cost-effective.


Alternative 3: Sequential IDs (BIGINT)

Rejected

Pros:

  • Smaller size (8 bytes vs. 16 bytes)
  • Faster index lookups (slightly)
  • Familiar to developers

Cons:

  • Information leakage (competitor can guess booking volume)
  • Requires coordination (sequences)
  • No distributed generation

Decision: UUIDs' benefits outweigh slight performance cost.


Alternative 4: Hard Deletes

Rejected

Pros:

  • Cleaner data model (no deleted_at checks)
  • Smaller table sizes

Cons:

  • No recoverability
  • Audit trail gaps
  • Foreign key issues

Decision: Soft deletes are industry best practice for SaaS.


Consequences

Positive

  1. Data Integrity

    • ACID transactions prevent double-bookings
    • Foreign key constraints ensure referential integrity
    • Check constraints validate data at database level
  2. Performance

    • Composite indexes optimized for tenant queries
    • Partial indexes for soft deletes reduce index size
    • JSONB GIN indexes enable fast flexible queries
  3. Security

    • RLS enforces tenant isolation at database level
    • UUIDs prevent ID enumeration attacks
    • Postgres audit logging for compliance
  4. Maintainability

    • Soft deletes simplify recoverability
    • Timestamps enable debugging and auditing
    • Standard patterns familiar to developers

Negative

  1. Storage Overhead

    • UUIDs: +8 bytes per row
    • Soft deletes: +8 bytes per row (timestamptz)
    • Total: ~16 bytes overhead per row
    • Impact: 10M rows = 160MB extra storage (negligible)
  2. Query Complexity

    • Must filter deleted_at IS NULL on every query
    • Mitigation: Database views or ORM defaults
  3. JSONB Trade-offs

    • Less schema enforcement
    • Can become dumping ground for unstructured data
    • Mitigation: Document expected JSONB structure; use sparingly
  4. RLS Performance

    • RLS adds ~10-50ms per query
    • Mitigation: Ensure indexes include organization_id

Implementation Plan

Phase 1: Core Schema (Week 1-2)

  1. Create foundation tables (organizations, users, roles)
  2. Enable RLS on all tables
  3. Add composite indexes for tenant queries
  4. Test RLS isolation

Phase 2: Domain Tables (Week 3-5)

  1. Properties and units
  2. Availability and calendars
  3. Pricing rules
  4. Bookings and quotes

Phase 3: Financial Tables (Week 6-7)

  1. Payments and refunds
  2. Transactions (double-entry ledger)
  3. Payouts

Phase 4: Integration Tables (Week 8-9)

  1. Channels and listings
  2. Connectors and webhooks
  3. Sync logs

Phase 5: Optimization (Week 10)

  1. Analyze slow queries (pg_stat_statements)
  2. Add missing indexes
  3. Tune connection pool settings
  4. Load testing and optimization

Validation Checklist

  • All tenant-scoped tables have organization_id
  • All tenant-scoped tables have RLS policies
  • All tables have primary key (UUID)
  • All tables have created_at, updated_at
  • All tables with deletes have deleted_at
  • All foreign keys have ON DELETE CASCADE or SET NULL
  • All money fields use INTEGER (cents)
  • All enum fields use CHECK constraints
  • Composite indexes include organization_id first
  • Partial indexes for deleted_at IS NULL
  • GIN indexes on JSONB columns
  • Migration scripts are idempotent

References

Sources

  • docs/04-data/*.md
  • docs/01-architecture/logical-architecture.md
  • meta/research-log.md