Research brief: The Dataset is the Product — when a service business should own its data (piece 12 of 15)
Status: Research material, not a finished article. Compiled May 2026. B&J CRM is in development; frame it as illustration only, with no outcomes claims.
Thesis
Most small service businesses sit at Data Maturity Stage 1 or 2 — data exists, sometimes captured, almost never queryable. The economic question isn't "should we own our data?" (almost always yes above ~$1-2M revenue with repeat customers) but "at what point does owning structure pay back the cost of building it?"
The defensible 2026 answer: when you have (a) repeat customers with multi-event history, (b) pricing or scheduling decisions that recur, and (c) a vertical SaaS that captures your transactions but not your model of the business. Below that bar, a tidy QuickBooks + spreadsheet + single-source-of-truth CRM is sufficient. Above it, the dataset is the product.
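The three-part bar above can be written down as a simple check. This is a minimal sketch, not a scoring model: the class name, field names, and the returned labels are all hypothetical, and the only logic taken from the brief is that conditions (a), (b), and (c) must hold together.

```python
from dataclasses import dataclass


@dataclass
class BusinessProfile:
    # Hypothetical field names mirroring conditions (a), (b), (c) in the thesis.
    repeat_customers_with_history: bool    # (a) repeat customers, multi-event history
    recurring_pricing_or_scheduling: bool  # (b) pricing or scheduling decisions recur
    saas_misses_business_model: bool       # (c) SaaS captures transactions, not the model


def clears_ownership_bar(profile: BusinessProfile) -> bool:
    """Return True only when all three conditions hold at once."""
    return (
        profile.repeat_customers_with_history
        and profile.recurring_pricing_or_scheduling
        and profile.saas_misses_business_model
    )


def recommended_stack(profile: BusinessProfile) -> str:
    """Map the check to the two outcomes the thesis describes (labels are illustrative)."""
    if clears_ownership_bar(profile):
        return "own data layer"
    return "QuickBooks + spreadsheet + single-source-of-truth CRM"


# Example: a business with repeat customers and recurring decisions,
# but whose vertical SaaS already fits its business model.
example = BusinessProfile(True, True, False)
print(recommended_stack(example))
```

The check is deliberately strict: a business that meets only one or two of the conditions stays on the simpler stack, which matches the "below that bar" wording of the thesis.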
The dominant working pattern in 2026
- 91% of companies with 10+ employees use a CRM; roughly 50% of businesses with fewer than 10 employees do (industry consensus)
- Industry-cited CRM project failure rates range from 47% (Forrester) through 50% (Gartner) and 55% (Johnny Grow) to 70% (multiple aggregators); poor user adoption is the leading cause
- The "captured but unusable" gap: ServiceNow reports "91% of data in CRM systems is incomplete, 18% is duplicated, and 70% becomes outdated each year" (single-source caveat)
- McKinsey: knowledge workers spend "nearly 20%" of the workweek searching for information
- MuleSoft 2025: organizations use 897 applications on average; only 29% are integrated
Vertical SaaS as the system of record
ServiceTitan IPO'd in December 2024 with a first-day market cap of $8.9B, gross retention >95%, and NDR >110%, which validates the vertical SaaS playbook. But every major vertical SaaS (ServiceTitan, Jobber, Housecall Pro, Clio, Karbon, Tekmetric) captures the transaction well and the customer-specific business model poorly. That gap is where Stage 4 and Stage 5 businesses justify their own data layer.
Honest caveats
- The "91% of CRM data is incomplete" figure traces to ServiceNow without primary methodology — illustrative, not load-bearing.
- ServiceTitan exit-friction figures ($24k-$39k contract buyouts) are aggregated from BBB/Reddit complaints by Projul — two steps from source. Don't cite specific dollar figures without verifying an original BBB complaint.
- The Lovett Services claim of 25% revenue growth and 5% net profit margin is vendor-published (ServiceTitan) and not independently audited.
- Tunguz's 0.97 vs 0.66 sales efficiency for vertical vs horizontal SaaS is from 2015 (n=54 public SaaS companies). Directional, not a 2026 benchmark.
- The "<5% SME compliance" figure for Quebec Law 25 comes from Axeptio, a consent-management vendor. Treat it as directional.
- The B&J CRM has not shipped. Frame as conceptual illustration of "modeling the actual business," not as a delivered case study.
Related
- reference CRM project failure rate: 47% (Forrester) to 50% (Gartner) to 55% (Johnny Grow 2025) to 70% (industry aggregators)
- reference MuleSoft 2025 Connectivity Benchmark: organizations use 897 applications on average; only 29% are integrated
- reference ServiceTitan IPO Dec 12, 2024: $8.9B closing cap, $685M revenue, $62B GTV, >95% gross retention, >110% NDR
- reference ServiceTitan: "Open Data Pledge" promises CSV export — but practitioner reports cite $24k-$39k exit contract buyouts (flag for verification)
- reference Vertical SaaS data portability comparison (ServiceTitan / Jobber / Housecall Pro / Clio / Karbon / Tekmetric)
- reference Tunguz: vertical SaaS median sales efficiency 0.97 vs horizontal 0.66 (n=54, 2015 — flag age)
- reference Canadian privacy 2026: PIPEDA still governs (findings only, no fines); Bill C-27 died on the Order Paper Jan 6, 2025
- reference OPC vs Brinks Home (PIPEDA Findings #2024-002, Mar 28 2024): inadequate safeguards left customer data accessible for 10 weeks
- reference Quebec Law 25 (fully in force Sept 22, 2024): data portability + fines up to C$25M / 4% of worldwide turnover
- reference CFIB / Sage / Payworks 2025: only 10% of Canadian SMBs have digital tools fully integrated; $1.60 ROI per dollar invested
- reference Clio Legal Trends: AI adoption among legal professionals jumped from 19% (2023) to 79% (2024), and reached 93% among mid-sized firms (2025); 64% of mid-sized firms offer flat fees
- reference Minimum viable data stack for a $1M-$10M Canadian service business (2026, C$100-C$500/month)
- reference The 5-stage Data Maturity Curve: from stranded data to data as product
- reference Operational data map by industry: what gets generated, stranded, and unlockable for 5 service verticals
- reference Illustration: what "modeling the actual business" means for Boucher & Jones fuel distribution (in-development, NOT delivered)
- rule Own a single customer_id primary key that joins across your vertical SaaS + QuickBooks + email + ads.
- rule Don't bend the client's business model to a generic CRM. Either find vertical SaaS that fits, or add a custom data layer on top.
- rule Treat a deliberate data layer as a privacy-compliance accelerator, not a privacy risk. The scattered alternative is harder to comply with.
- reference Research brief: Owning your stack — why agency-managed platforms cost more than they save (piece 4 of 15)
- reference Research brief: Built to Last — why most SMB sites rebuild every 3-4 years (piece 5 of 15)
- reference Research brief: Public data as a private moat — building proprietary intelligence from government open data (piece 11 of 15)
- reference Research brief: What makes a marketing site do something (piece on brochure vs platform)
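
The customer_id rule listed above (one owned key joining vertical SaaS, QuickBooks, email, and ads) can be illustrated with a small crosswalk table. This is a sketch under stated assumptions: the source names, local IDs, event names, and the `build_histories` helper are all made up for illustration, and no vendor's export format or API is implied.

```python
from collections import defaultdict

# Hypothetical crosswalk: (source system, that system's own ID) -> owned customer_id.
CROSSWALK = {
    ("vertical_saas", "VS-1001"): "cust_001",
    ("quickbooks", "QB-77"): "cust_001",
    ("email", "pat@example.com"): "cust_001",
    ("ads", "lead-9f3"): "cust_002",
}

# Made-up sample records, one per source system, plus one with no crosswalk entry.
EVENTS = [
    {"source": "vertical_saas", "local_id": "VS-1001", "event": "job_completed"},
    {"source": "quickbooks", "local_id": "QB-77", "event": "invoice_paid"},
    {"source": "email", "local_id": "pat@example.com", "event": "newsletter_opened"},
    {"source": "ads", "local_id": "lead-9f3", "event": "ad_click"},
    {"source": "quickbooks", "local_id": "QB-99", "event": "invoice_paid"},
]


def build_histories(events, crosswalk):
    """Group events under the owned customer_id; collect records that cannot be joined."""
    histories = defaultdict(list)
    unmatched = []
    for record in events:
        customer_id = crosswalk.get((record["source"], record["local_id"]))
        if customer_id is None:
            unmatched.append(record)  # stranded: no entry in the crosswalk
        else:
            histories[customer_id].append((record["source"], record["event"]))
    return dict(histories), unmatched


histories, unmatched = build_histories(EVENTS, CROSSWALK)
for customer_id, history in histories.items():
    print(customer_id, history)
print("unmatched:", unmatched)
```

Keeping the unmatched records in their own list, rather than dropping them, makes the size of the stranded portion visible, which is the gap the rule is meant to close.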