Reference: the 5-stage Data Maturity Curve — from stranded data to data as product
Created 2026-05-22
The 5-stage curve from "data exists in the owner's head" to "data is the product." Use as the spine of the article.
| Stage | What it looks like | Business profile | What unlocks moving up |
|---|---|---|---|
| 1. Stranded | Data in owner's head, paper files, text threads, unsorted Gmail. Nothing systematic. | Owner-operator, <$500K, ≤5 customers/week, no employees | Forcing function: first hire, or owner can't remember a recurring customer's last service date |
| 2. Captured but unstructured | QuickBooks holds financials. Google Sheet tracks customers. Calendly handles scheduling. Comms in Gmail. Nothing queryable across silos. | Most $500K-$3M service businesses — the modal stage | Pain of doing month-end manually; a competitor wins a customer the business had but didn't follow up on |
| 3. Queryable in one place | Real CRM or vertical SaaS is system of record. Data exportable to CSV and joinable. Reports still manual but possible. | $1M-$10M; 5-40 employees; one dominant operational pattern | Owner starts asking "what's our gross margin by customer segment?" and can't answer in under a day |
| 4. Surfaced in daily decisions | Dashboards exist. At-risk customer alerts fire. Pricing references history. Forecasting uses real seasonal curves. Team operates from the data. | $5M+; multi-location, multi-product, or multi-segment | Investment in an analyst, fractional data person, or tightly-scoped internal tool |
| 5. Data as product | Dataset is sold, licensed, or used as wedge into adjacent markets (Carfax, Zillow). Or it powers structurally cheaper unit economics. | Rare for SMBs. Achievable for industry-leading regional businesses, roll-ups, or unique data positions (e.g., only fuel distributor with 5 years of weather-correlated tank-fill data in a region) | Strategic decision: is the data a defensible asset or just operational byproduct? |
The honest 2026 read: the majority of service businesses sit at Stage 2, and most don't need to move past Stage 3. The piece argues for recognizing when you should move up, not that everyone must.
The Carfax 38-year arc (1984-2022) is the canonical Stage 1 → Stage 5 trajectory: from 10,000 records faxed in 1986 to 35B+ records across 151,000+ sources, and part of S&P Global Mobility as of 2022. But the structural decisions that get you there (canonical entity IDs, normalized schema, time-series accumulation, proper provenance) improve operational value at every stage along the way.
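To make those structural decisions concrete, here is a minimal Python sketch, not a prescribed implementation. It assumes a business that owns one canonical customer ID, maps each silo's own key onto it, and appends dated events that remember which system they came from. Every name in it (`CustomerLedger`, `Event`, `cust_0001`, the silo keys) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Event:
    """One append-only fact about a customer, carrying its provenance."""
    customer_id: str       # canonical ID owned by the business (hypothetical format)
    kind: str              # e.g. "service_visit" or "invoice"
    occurred_on: date      # time-series axis: when it happened
    source_system: str     # provenance: which silo reported it
    source_record_id: str  # provenance: that silo's own record key


class CustomerLedger:
    """Illustrative in-memory store; a real one would sit in a database."""

    def __init__(self) -> None:
        # Crosswalk: (source_system, source_key) -> canonical customer_id
        self._crosswalk: dict[tuple[str, str], str] = {}
        self._events: list[Event] = []

    def link(self, source_system: str, source_key: str, customer_id: str) -> None:
        """Declare that a silo's customer key refers to a canonical customer."""
        self._crosswalk[(source_system, source_key)] = customer_id

    def record(self, source_system: str, source_key: str, kind: str,
               occurred_on: date, source_record_id: str) -> Event:
        """Append an event; raises KeyError if the silo key was never linked."""
        customer_id = self._crosswalk[(source_system, source_key)]
        event = Event(customer_id, kind, occurred_on, source_system, source_record_id)
        self._events.append(event)  # accumulate, never overwrite
        return event

    def last_event_date(self, customer_id: str, kind: str) -> Optional[date]:
        """Most recent date of a given event kind across all silos, or None."""
        dates = [e.occurred_on for e in self._events
                 if e.customer_id == customer_id and e.kind == kind]
        return max(dates) if dates else None


# Usage: the same customer appears under different keys in three silos.
ledger = CustomerLedger()
ledger.link("sheet", "row-17", "cust_0001")
ledger.link("calendly", "invitee-88", "cust_0001")
ledger.link("quickbooks", "QB-1042", "cust_0001")

ledger.record("sheet", "row-17", "service_visit", date(2025, 8, 14), "row-17/visit-1")
ledger.record("calendly", "invitee-88", "service_visit", date(2025, 11, 3), "booking-5521")
ledger.record("quickbooks", "QB-1042", "invoice", date(2025, 11, 5), "INV-889")

print(ledger.last_event_date("cust_0001", "service_visit"))  # prints 2025-11-03
```

Even at Stage 2 this structure answers the question that forces a business out of Stage 1 (a recurring customer's last service date), and the same append-only events are what later stages would query, alert on, and eventually package.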
Referenced by (4)
- reference Illustration: what "modeling the actual business" means for Boucher & Jones fuel distribution (in-development, NOT delivered) relates-to
- reference CFIB / Sage / Payworks 2025: only 10% of Canadian SMBs have digital tools fully integrated; $1.60 ROI per dollar invested relates-to
- rule RULE: Own a single customer_id primary key that joins across your vertical SaaS + QuickBooks + email + ads. depends-on
- reference Research brief: The Dataset is the Product — when a service business should own its data (piece 12 of 15) relates-to