Skip to main content

Channels & Distribution - Domain Specification

First Introduced: MVP.0 Status: Specification Complete Last Updated: 2025-10-25


Overview

Channels & Distribution govern how listings, rates, and availability are exported to and synchronized with external marketplaces and white-label sites. This domain transforms internal Units, Rates, and Calendars into external channel-specific formats and maintains bidirectional sync with OTAs (Online Travel Agencies), property management systems (PMS), and partner platforms.

The architecture supports multiple distribution strategies — from simple one-way iCal exports to sophisticated API-based bidirectional synchronization — while maintaining TVL as the canonical source of truth for supply data.


Responsibilities

This domain IS responsible for:

  • Managing channel connection configurations and credentials
  • Mapping internal Units/Spaces to external channel listings
  • Transforming TVL data into channel-specific payloads (field mapping)
  • Executing outbound sync operations with rate limiting and retry logic
  • Processing inbound booking and availability updates (MVP.1+)
  • Tracking sync status, audit trails, and operational health per channel
  • Handling idempotent operations and preventing duplicate syncs

This domain is NOT responsible for:

  • Unit/Space content management (→ Supply domain)
  • Pricing and rate calculations (→ Pricing domain)
  • Availability and calendar management (→ Availability domain)
  • Payment processing for bookings (→ Payments domain)
  • User authentication and authorization (→ Authorization domain)
  • Multi-org access delegation (→ Delegation domain)

Relationships

Depends On:

Depended On By:

Related Domains:


Core Concepts

Entity: Channel Target (channel_targets)

Purpose: Represents a configured connection to an external distribution channel (Hostaway, Airbnb, VRBO, internal brand site).

Key Attributes:

  • id (UUID, primary key)
  • org_id (UUID, foreign key → organizations.id)
  • account_id (UUID, foreign key → accounts.id)
  • name (VARCHAR, required) - Display name (e.g., "Hostaway Miami", "Airbnb Premium")
  • channel_type (ENUM) - hostaway | airbnb | vrbo | booking_com | internal_brand
  • sync_mode (ENUM) - manual | ical | api
  • is_active (BOOLEAN, default true)
  • base_url (TEXT) - API endpoint for channel
  • credentials_ref (VARCHAR) - Reference to secret in Secrets Manager (not plain text)
  • rate_limit_config (JSONB) - Per-target rate limits: {"max_requests": 12, "window_seconds": 10}
  • last_health_check_at (TIMESTAMP)
  • health_status (ENUM) - healthy | degraded | offline
  • settings (JSONB) - Channel-specific configuration
  • created_at, updated_at, deleted_at (timestamps)

Relationships:

  • ChannelTarget → Org (*, many-to-one)
  • ChannelTarget → Account (*, many-to-one)
  • ChannelTarget → ChannelListing (1:*)
  • ChannelTarget → OutboundAudit (1:*)
  • ChannelTarget → InboundAudit (1:*) - MVP.1+

Lifecycle:

  • Created: When user configures a new channel connection
  • Updated: When credentials rotated or settings changed
  • Deactivated: Via is_active=false (soft delete via deleted_at)
  • Health Checked: Periodic validation of API connectivity

Business Rules:

  • Each ChannelTarget must have unique (org_id, account_id, channel_type, name)
  • Credentials stored only as references, never plain text in database
  • API tokens rotated per channel provider policy (alerts at T-60/T-30 days)
  • Rate limit config must be populated for all sync_mode=api targets
  • Inactive targets do not receive sync jobs

Entity: Channel Listing (channel_listings)

Purpose: Maps an internal Unit to its external representation on a specific channel. Tracks sync state and idempotency.

Key Attributes:

  • id (UUID, primary key)
  • org_id (UUID, foreign key → organizations.id)
  • account_id (UUID, foreign key → accounts.id)
  • unit_id (UUID, foreign key → units.id)
  • space_id (UUID, nullable, foreign key → spaces.id) - For Space-level distribution (single-unit properties)
  • target_id (UUID, foreign key → channel_targets.id)
  • external_listing_id (VARCHAR) - Channel's listing identifier (e.g., Hostaway listing ID)
  • status (ENUM) - active | paused | error | archived
  • sync_status (ENUM) - ok | pending | error | rate_limited
  • last_synced_at (TIMESTAMP)
  • last_sync_version (INTEGER) - Unit version at last successful sync
  • last_payload_hash (VARCHAR(64)) - SHA-256 of last synced payload (for change detection)
  • retry_count (INTEGER, default 0)
  • retry_after_at (TIMESTAMP, nullable) - Honor 429 Retry-After header
  • error_message (TEXT, nullable)
  • created_at, updated_at (timestamps)

Relationships:

  • ChannelListing → Unit (*, many-to-one)
  • ChannelListing → Space (*, many-to-one, optional)
  • ChannelListing → ChannelTarget (*, many-to-one)
  • ChannelListing → OutboundAudit (1:*)

Lifecycle:

  • Created: When user links a Unit to a ChannelTarget
  • Updated: After each sync attempt (success or failure)
  • Paused: User temporarily disables distribution
  • Archived: Unit no longer distributed to channel

Business Rules:

  • Unique constraint on (unit_id, target_id) - one listing per Unit per Target
  • Exactly one of space_id or unit_id must be populated (enforced via CHECK constraint)
  • last_payload_hash computed from normalized, sorted JSON to detect changes
  • MVP.0: Unit-level distribution for Hostaway; Space-level for iCal exports
  • Sync only triggered when last_payload_hash differs from current Unit state
  • retry_count reset to 0 on successful sync
  • retry_after_at honored before rescheduling failed jobs

Entity: Outbound Audit (outbound_audit)

Purpose: Immutable log of every outbound sync operation for debugging, compliance, and metrics.

Key Attributes:

  • id (UUID, primary key)
  • org_id (UUID, foreign key → organizations.id)
  • account_id (UUID, foreign key → accounts.id)
  • listing_id (UUID, foreign key → channel_listings.id)
  • target_id (UUID, foreign key → channel_targets.id)
  • unit_id (UUID, foreign key → units.id)
  • unit_version (INTEGER) - Version of Unit at sync time
  • operation (ENUM) - create | update | delete
  • http_method (VARCHAR) - POST | PUT | PATCH | DELETE
  • http_status (INTEGER) - 200, 201, 429, 500, etc.
  • started_at (TIMESTAMP, required)
  • completed_at (TIMESTAMP, nullable)
  • duration_ms (INTEGER) - Latency metric
  • request_payload_hash (VARCHAR(64)) - SHA-256 of request body (no PII)
  • request_excerpt (TEXT) - First 500 chars for operator triage
  • response_excerpt (TEXT) - Response summary (no PII)
  • idempotency_key (VARCHAR(128)) - sha256(unit_version|target_site_id)
  • retry_attempt (INTEGER, default 0)
  • error_message (TEXT, nullable)
  • trace_id (UUID) - OpenTelemetry correlation ID
  • created_by (UUID, foreign key → users.id, nullable) - Manual sync trigger
  • created_at (TIMESTAMP)

Relationships:

  • OutboundAudit → ChannelListing (*, many-to-one)
  • OutboundAudit → ChannelTarget (*, many-to-one)
  • OutboundAudit → Unit (*, many-to-one)

Lifecycle:

  • Created: Before each sync attempt
  • Updated: After sync completion (success or failure)
  • Retained: 90 days default; exported to data warehouse before purge

Business Rules:

  • Append-only; never updated after completed_at set
  • Unique constraint on (listing_id, idempotency_key) prevents duplicate processing
  • request_payload_hash prevents storing raw PII in audit logs
  • request_excerpt limited to non-sensitive fields for debugging
  • All sync operations must create audit record (100% coverage)
  • Trace IDs enable correlation with application logs and metrics

Entity: Inbound Audit (inbound_audit) - MVP.1+

Purpose: Logs inbound webhook events and API pulls from channels (bookings, availability updates, cancellations).

Key Attributes:

  • id (UUID, primary key)
  • org_id (UUID, foreign key → organizations.id)
  • account_id (UUID, foreign key → accounts.id)
  • target_id (UUID, foreign key → channel_targets.id)
  • listing_id (UUID, nullable, foreign key → channel_listings.id)
  • event_type (ENUM) - booking_created | booking_cancelled | availability_updated | message_received
  • external_event_id (VARCHAR) - Channel's event identifier
  • payload_hash (VARCHAR(64)) - SHA-256 for deduplication
  • received_at (TIMESTAMP, required)
  • processed_at (TIMESTAMP, nullable)
  • processing_status (ENUM) - pending | success | failed | duplicate
  • error_message (TEXT, nullable)
  • trace_id (UUID)
  • created_at (TIMESTAMP)

Relationships:

  • InboundAudit → ChannelTarget (*, many-to-one)
  • InboundAudit → ChannelListing (*, many-to-one, optional)

Lifecycle:

  • Created: When webhook received or pull operation completes
  • Processed: After event applied to internal state (Booking, Availability)
  • Retained: 90 days default

Business Rules:

  • Unique constraint on (target_id, external_event_id) prevents duplicate processing
  • payload_hash provides secondary deduplication for identical events
  • Processing must be idempotent (safe to replay)
  • Failed events retried with exponential backoff

Workflows

Workflow: Channel Setup (MVP.0)

Goal: Connect a new Hostaway site for distribution.

Steps:

  1. Operator navigates to Channels section (Owner or ChannelPublisher role required)
  2. Select "Add Channel Target" and choose channel_type=hostaway
  3. Enter configuration:
    • Name (e.g., "Hostaway Miami Villas")
    • Base URL (Hostaway API endpoint)
    • API token (stored in Secrets Manager, only reference saved)
    • Rate limit config: {"max_requests": 12, "window_seconds": 10}
  4. System validates credentials:
    • Test API call to Hostaway /v1/listings endpoint
    • Record health check result
  5. Create ChannelTarget record with is_active=true, health_status=healthy
  6. Emit event: channel.target.created

Postconditions:

  • ChannelTarget ready for Unit linking
  • Credentials securely stored and validated
  • Rate limiter configured per target

Goal: Map an internal Unit to a Hostaway listing for automated sync.

Steps:

  1. Operator selects Unit in Unit management UI
  2. Click "Distribute to Channel" and select ChannelTarget
  3. System checks:
    • Unit has required fields (name, capacity, address)
    • Unit has at least one primary image
    • No existing ChannelListing for (unit_id, target_id)
  4. Create ChannelListing record:
    • status=active, sync_status=pending
    • last_payload_hash=NULL (triggers initial sync)
  5. Enqueue sync job: {unit_id, target_id, operation: 'create'}
  6. Return success with listing ID

Postconditions:

  • ChannelListing exists with pending sync status
  • Sync job queued for execution
  • User sees real-time sync status in UI

Workflow: Outbound Sync Execution (MVP.0)

Goal: Publish Unit data to Hostaway with idempotency, rate limiting, and retry logic.

Steps:

  1. Sync job dequeued from BullMQ per-target limiter
  2. Acquire rate limit token:
    • Check target's rate limit: ≤12 requests / 10s
    • If rate exceeded, requeue with retry_after_at delay
  3. Load Unit data with version, compute payload hash
  4. Check idempotency:
    • Compare current_payload_hash with last_payload_hash
    • If identical, skip sync and mark sync_status=ok
  5. Transform payload:
    • Apply field mappings (TVL → Hostaway schema)
    • Validate required fields
    • Generate idempotency_key = sha256(unit_version|target_site_id)
  6. Create OutboundAudit record with started_at, trace_id
  7. Execute HTTP request:
    • POST /v1/listings (create) or PUT /v1/listings/{id} (update)
    • Include X-Idempotency-Key header
    • Timeout: 30s
  8. Handle response:
    • Success (200/201):
      • Update ChannelListing: sync_status=ok, last_synced_at, last_payload_hash, retry_count=0
      • Complete OutboundAudit: http_status, completed_at, duration_ms
    • Rate Limited (429):
      • Parse Retry-After header
      • Update ChannelListing: sync_status=rate_limited, retry_after_at
      • Complete OutboundAudit with error
      • Requeue with backoff
    • Client Error (4xx):
      • Update ChannelListing: sync_status=error, error_message, increment retry_count
      • Complete OutboundAudit
      • Alert if validation error
    • Server Error (5xx):
      • Apply full-jitter exponential backoff
      • Requeue if retry_count < 5, else mark failed
  9. Emit metrics:
    • tvl_sync_jobs_total{action=create|update, state=success|failed, target=hostaway}
    • tvl_sync_latency_ms{target=hostaway} (duration)
    • tvl_hostaway_http_requests_total{status=200|429|500}
    • tvl_rate_limited_total{target=hostaway} (on 429)

Postconditions:

  • Outbound sync recorded in audit log
  • ChannelListing state reflects sync result
  • Metrics updated for observability dashboard
  • Failed syncs retried automatically with backoff

Workflow: Manual Retry from Snapshot (MVP.0)

Goal: Operator replays a previous Unit version to recover from sync errors.

Steps:

  1. Operator views ChannelListing with sync_status=error
  2. Click "View Sync History" to see OutboundAudit records
  3. Select "Replay from Snapshot" and choose Unit version
  4. System loads UnitSnapshot for selected version
  5. Compute payload from snapshot and generate new idempotency key
  6. Enqueue high-priority sync job with snapshot data
  7. Execute sync workflow (same as standard sync)
  8. Display real-time status in UI

Postconditions:

  • Sync reattempted with historical Unit state
  • Operator can compare diff between versions
  • Audit trail records manual replay action

Workflow: Health Dashboard Monitoring (MVP.0)

Goal: Operator monitors per-target sync health and troubleshoots issues.

Steps:

  1. Operator opens Channels Health Dashboard
  2. View per-target metrics:
    • Queue depth (pending jobs per target)
    • 429 count (rate limit hits, rolling 10min window)
    • Last success timestamp and latency (p50, p95)
    • Error rate (failed syncs / total syncs)
  3. Drill down to target:
    • View recent OutboundAudit entries (paginated)
    • See error messages and retry attempts
    • Check retry_after_at for rate-limited jobs
  4. Take action:
    • Pause target (is_active=false) to stop syncs
    • Manually retry failed listings
    • Adjust rate limit config if negotiated with channel
  5. Set up alerts:
    • Alert if 429 ratio > 2% (rolling 10min)
    • Alert if p95 latency > 10s
    • Alert if error rate > 1% per day

Postconditions:

  • Real-time visibility into channel health
  • Operators can proactively address issues
  • Alerts prevent service degradation

Workflow: Inbound Webhook Processing (MVP.1+)

Goal: Process booking notifications from Hostaway.

Steps:

  1. Webhook received at /api/v1/webhooks/hostaway
  2. Validate webhook signature using channel credentials
  3. Parse event payload: event_type=booking.created, extract external_event_id
  4. Create InboundAudit record:
    • target_id, event_type, external_event_id, received_at
    • Compute payload_hash for deduplication
  5. Check duplicate:
    • Query InboundAudit for (target_id, external_event_id) or matching payload_hash
    • If exists, mark processing_status=duplicate and exit
  6. Map external listing to internal Unit:
    • Lookup ChannelListing by external_listing_id
    • Validate Unit exists and is active
  7. Process event:
    • booking.created: Create internal Booking entity (→ Bookings domain)
    • booking.cancelled: Cancel internal Booking
    • availability.updated: Update Availability calendar
  8. Update InboundAudit:
    • processed_at, processing_status=success
  9. Emit event: channel.booking.imported

Postconditions:

  • Inbound booking reflected in TVL system
  • Audit trail for compliance
  • Deduplication prevents double-bookings

Business Rules

  1. Org Isolation: All channel operations MUST filter by org_id and account_id
  2. One Listing Per Target: Unique constraint on (unit_id, target_id) prevents duplicate mappings
  3. Idempotent Sync: DB constraint on (listing_id, idempotency_key) ensures exactly-once processing
  4. Rate Limit Compliance: Per-target limiters honor channel-specific rate limits; 429 responses trigger backoff
  5. Retry Logic: Max 5 retry attempts with full-jitter exponential backoff; cap at 60 seconds
  6. Credentials Security: API tokens stored only in Secrets Manager; database stores references only
  7. Payload Hashing: Sync triggered only when current_payload_hash != last_payload_hash (change detection)
  8. Audit Coverage: 100% of outbound sync operations logged in OutboundAudit
  9. Soft Deletes: ChannelTargets and ChannelListings never hard deleted (compliance/audit)
  10. Health Checks: Periodic validation of channel connectivity; inactive targets do not receive jobs
  11. Field Mapping Validation: All channel payloads validated pre-sync using channel-specific schemas
  12. Timezone Handling: All timestamps stored in UTC; channel-specific fields (check-in/out) mapped to listing timezone

Implementation Notes

MVP.0 Scope: Hostaway One-Way Sync

Included:

  • ChannelTarget, ChannelListing, OutboundAudit entities
  • Google SSO authentication with role-based access (Owner, ChannelPublisher, ContentManager, Viewer)
  • Hostaway API integration (create/update listings)
  • Unit-level distribution (one ChannelListing per Unit)
  • Unit Snapshots for versioning and diff preview
  • Idempotent sync with idempotency_key and last_payload_hash
  • Per-target rate limiting (BullMQ limiters)
  • Full-jitter exponential backoff on 429/5xx errors
  • Manual retry from snapshot
  • Health dashboard (queue depth, 429 count, error rate, latency)
  • Observability: OpenTelemetry metrics/traces, structured logs
  • Field mapping: TVL → Hostaway schema transformation

Deferred to MVP.1:

  • InboundAudit entity and webhook processing
  • Two-way sync (bookings, availability updates from Hostaway)
  • Booking awareness (preventing double-bookings)
  • PATCH operations (MVP.0 uses PUT for full payloads)

Deferred to MVP.2:

  • Airbnb API integration
  • VRBO API integration
  • Multi-channel distribution (one Unit → multiple channels beyond Hostaway)
  • Channel-specific pricing/availability rules

Database Indexes

Critical for performance:

CREATE INDEX idx_channel_targets_org_account ON channel_targets(org_id, account_id);
CREATE UNIQUE INDEX idx_channel_targets_unique ON channel_targets(org_id, account_id, channel_type, name) WHERE deleted_at IS NULL;

CREATE INDEX idx_channel_listings_unit_target ON channel_listings(unit_id, target_id);
CREATE UNIQUE INDEX idx_channel_listings_unique ON channel_listings(unit_id, target_id);
CREATE INDEX idx_channel_listings_sync_status ON channel_listings(sync_status) WHERE sync_status IN ('pending', 'error', 'rate_limited');
CREATE INDEX idx_channel_listings_retry ON channel_listings(retry_after_at) WHERE retry_after_at IS NOT NULL;

CREATE INDEX idx_outbound_audit_listing ON outbound_audit(listing_id, started_at DESC);
CREATE INDEX idx_outbound_audit_target_status ON outbound_audit(target_id, http_status, started_at DESC);
CREATE UNIQUE INDEX idx_outbound_audit_idempotency ON outbound_audit(listing_id, idempotency_key);
CREATE INDEX idx_outbound_audit_trace ON outbound_audit(trace_id);

CREATE INDEX idx_inbound_audit_target ON inbound_audit(target_id, received_at DESC); -- MVP.1+
CREATE UNIQUE INDEX idx_inbound_audit_event ON inbound_audit(target_id, external_event_id); -- MVP.1+

Constraints

Enforce data integrity:

-- ChannelTargets
ALTER TABLE channel_targets ADD CONSTRAINT channel_targets_rate_limit_valid CHECK (
(sync_mode != 'api') OR (rate_limit_config IS NOT NULL)
);

-- ChannelListings
ALTER TABLE channel_listings ADD CONSTRAINT channel_listings_resource_check CHECK (
(unit_id IS NOT NULL AND space_id IS NULL) OR (space_id IS NOT NULL AND unit_id IS NULL)
);
ALTER TABLE channel_listings ADD CONSTRAINT channel_listings_retry_count_valid CHECK (retry_count >= 0 AND retry_count <= 5);

-- OutboundAudit
ALTER TABLE outbound_audit ADD CONSTRAINT outbound_audit_duration_valid CHECK (
(completed_at IS NULL) OR (duration_ms >= 0)
);

Rate Limiting Strategy

Per-Target BullMQ Limiter:

  • Each ChannelTarget has isolated rate limiter
  • Config stored in rate_limit_config JSONB: {"max_requests": 12, "window_seconds": 10}
  • Hostaway default: 12 requests / 10 seconds
  • Limiter checks token availability before dequeue
  • On 429 response:
    • Parse Retry-After header (seconds or HTTP-date)
    • Set retry_after_at = now() + retry_after_duration
    • Requeue job with delay
    • Increment tvl_rate_limited_total{target} metric
  • Alert if 429 ratio > 2% (rolling 10min) for sustained rate limit issues

Field Mapping Architecture

Transformation Pipeline:

TVL Unit Model

[Field Mapper] ← channel_field_mappings.yaml

Channel Payload (validated)

[HTTP Client] → External API

Key Transformations (Hostaway):

  • unit.namename (direct)
  • unit.capacity_adultsaccommodates (direct)
  • space.property_typepropertyTypeName (enum mapping: villa → "Villa")
  • unit.bathroomsbathrooms (decimal preserved)
  • media_assets[is_primary=true].urlpicture (required)
  • media_assets[].urlphotos[] (array, max 50 images)

See channel-field-mappings.md for complete mapping reference.


Secrets Management

Credentials Storage:

  • Hostaway API tokens stored in cloud Secrets Manager (AWS Secrets Manager / GCP Secret Manager)
  • Database stores only reference: credentials_ref = "projects/tvl/secrets/hostaway-miami-token"
  • Application retrieves secrets at runtime with caching (TTL 1 hour)
  • Token rotation policy:
    • Alert at T-60 days and T-30 days before expiration
    • Handle 403 responses as rotation hint
    • Zero-downtime rotation via versioned secrets

Future Enhancements

MVP.1: Two-Way Sync (Hostaway → TVL)

Scope:

  • InboundAudit entity
  • Webhook endpoint for Hostaway events
  • Booking import: booking.created, booking.cancelled
  • Availability import: availability.updated
  • Booking awareness: prevent double-bookings via reservation lock

Timeline: 2-3 weeks after MVP.0


MVP.2: Multi-Channel Distribution

Scope:

  • Airbnb API integration
  • VRBO API integration
  • Channel-specific field mappings (bathroom rounding for Airbnb, split bathrooms for VRBO)
  • Per-channel pricing/availability rules
  • Unified health dashboard across channels

Timeline: 4-6 weeks after MVP.1


V1.0: Advanced Distribution

Features:

  • White-label marketplace support (internal brand sites)
  • Delta-based exports (send only changes since last sync)
  • Dynamic pricing feeds (integration with revenue management systems)
  • Multi-currency support per channel
  • Automated error recovery with ML-based retry strategies

Timeline: Post-MVP phase


Physical Schema

See 001_initial_schema.sql for complete CREATE TABLE statements.

Summary:

  • 4 tables (MVP.0): channel_targets, channel_listings, outbound_audit, (inbound_audit in MVP.1)
  • 15+ indexes for query performance
  • 10+ constraints for data integrity
  • JSONB fields for flexible configuration
  • Foreign keys enforce referential integrity


Acceptance Criteria

MVP.0 Completion Requires:

  1. Channel Setup:

    • ✅ Create ChannelTarget with Hostaway API credentials
    • ✅ Validate credentials via test API call
    • ✅ Store credentials in Secrets Manager (reference only in DB)
  2. Unit Linking:

    • ✅ Link Unit to ChannelTarget creating ChannelListing
    • ✅ Enforce unique constraint (unit_id, target_id)
    • ✅ Validate Unit has required fields before linking
  3. Outbound Sync:

    • ✅ Compute payload_hash to detect changes
    • ✅ Generate idempotency_key for duplicate prevention
    • ✅ Apply per-target rate limiting (≤12 req / 10s)
    • ✅ Transform TVL Unit → Hostaway payload via field mappings
    • ✅ Execute HTTP POST/PUT with retry logic
    • ✅ Handle 429 with retry_after_at and backoff
    • ✅ Record all operations in OutboundAudit
    • ✅ Emit metrics: tvl_sync_jobs_total, tvl_sync_latency_ms, tvl_rate_limited_total
  4. Idempotency:

    • ✅ DB constraint prevents duplicate (listing_id, idempotency_key)
    • ✅ Identical payloads skip sync (hash comparison)
  5. Retry & Error Handling:

    • ✅ Max 5 retry attempts with full-jitter backoff
    • ✅ Rate-limited jobs honor retry_after_at
    • ✅ Failed syncs display error messages in UI
    • ✅ Manual retry from snapshot succeeds
  6. Health Dashboard:

    • ✅ Display per-target queue depth, 429 count, error rate, latency
    • ✅ Drill-down to OutboundAudit entries
    • ✅ Real-time status updates (≤20s latency)
    • ✅ Alerts fire for 429 ratio > 2% or p95 latency > 10s
  7. Observability:

    • ✅ OpenTelemetry spans correlate sync operations
    • ✅ Structured logs include trace_id and key job fields
    • ✅ Prometheus metrics exported for dashboard
  8. Security:

    • ✅ Session cookies are Secure, HttpOnly, SameSite=Lax
    • ✅ RBAC enforced: only Owner/ChannelPublisher can manage channels
    • ✅ ABAC filters by org_id on all queries
    • ✅ Audit logs include created_by for manual actions
  9. Data Integrity:

    • ✅ All foreign keys enforced
    • ✅ Soft deletes via deleted_at (no hard deletes)
    • ✅ Outbound audit records immutable after completed_at
    • ✅ 90-day retention policy enforced with scheduled purge

Success Metrics

MetricTargetMeasurement
95% of syncs complete within 5 minutestvl_sync_latency_ms p95 < 300000
< 1% sync failures per day(failed_syncs / total_syncs) < 0.01
100% outbound audit coverageoutbound_audit.count == sync_jobs.count
429 ratio < 2% (rolling 10min)(rate_limited / total_requests) < 0.02
Mean time to recover via snapshot replay < 10minManual test: replay → success < 600s
Health dashboard latency < 20sUI poll interval 15s, backend query < 5s