{"id":583,"slug":"research-brief-dataset-is-the-product","title":"Research brief: The Dataset is the Product — when a service business should own its data (piece 12 of 15)","kind":"reference","scope":"business","status":"current","audiences":["kevin","claude-code","smb-owner","candid-team"],"topics":["agency-methodology","vertical-saas","data-infrastructure","crm-systems"],"reference_body":"**Status:** Research material — not finished article. Compiled May 2026. **B&J CRM is in-development — frame as illustration only; no outcomes claims.**\n\n## Thesis\n\nMost small service businesses sit at **Data Maturity Stage 1 or 2** — data exists, sometimes captured, almost never queryable. The economic question isn't \"should we own our data?\" (almost always yes above ~$1-2M revenue with repeat customers) but **\"at what point does owning structure pay back the cost of building it?\"**\n\nThe defensible 2026 answer: when you have (a) repeat customers with multi-event history, (b) pricing or scheduling decisions that recur, and (c) a vertical SaaS that captures your transactions but not your *model of the business*. Below that bar, a tidy QuickBooks + spreadsheet + single-source-of-truth CRM is sufficient. 
Above it, the dataset is the product.\n\n## The dominant working pattern in 2026\n\n- 91% of companies with 10+ employees use a CRM; ~50% of <10-employee businesses do (industry consensus)\n- Industry-cited CRM project failure rate: 47% (Forrester) to 55% (Johnny Grow) to 70% (multiple aggregators); poor user adoption is the leading cause\n- The \"captured but unusable\" gap: ServiceNow reports *\"91% of data in CRM systems is incomplete, 18% is duplicated, and 70% becomes outdated each year\"* (single-source caveat)\n- McKinsey: knowledge workers spend \"nearly 20%\" of the workweek searching for information\n- MuleSoft 2025: organizations use 897 applications on average; only 29% are integrated\n\n## Vertical SaaS as the system of record\n\nServiceTitan IPO'd in December 2024 with a first-day market cap of $8.9B, gross retention >95%, and NDR >110%, which validates the vertical SaaS playbook. But every major vertical SaaS (ServiceTitan, Jobber, Housecall Pro, Clio, Karbon, Tekmetric) captures the *transaction* well and the *customer-specific business model* poorly. That gap is where Stage 4 / Stage 5 businesses justify their own data layer.\n\n## Honest caveats\n\n- The \"91% of CRM data is incomplete\" figure traces to ServiceNow without primary methodology — illustrative, not load-bearing.\n- ServiceTitan exit-friction figures ($24k-$39k contract buyouts) are aggregated from BBB/Reddit complaints by Projul — two steps removed from the source. Don't cite specific dollar figures without verifying an original BBB complaint.\n- The Lovett Services 25% revenue growth / 5% net profit margin claim is **vendor-published (ServiceTitan)** — not independently audited.\n- Tunguz's 0.97 vs 0.66 sales efficiency for vertical vs horizontal SaaS is from 2015 (n=54 public SaaS companies). Directional, not a 2026 benchmark.\n- The Axeptio \"<5% SME compliance\" figure for Quebec Law 25 is from a consent-management vendor — directional.\n- The B&J CRM **has not shipped**. 
Frame as conceptual illustration of \"modeling the actual business,\" not as a delivered case study.","rationale_body":"The strongest argument for owning your data layer comes from naming when NOT to (most sub-$2M owner-operator businesses). This brief pairs the case for ownership with the honest counter-case for \"stay on vertical SaaS until you outgrow it.\"","metadata":null,"links":{"outgoing":[{"slug":"crm-project-failure-rate-47-to-70-pct","title":"CRM project failure rate: 47% (Forrester) to 50% (Gartner) to 55% (Johnny Grow 2025) to 70% (industry aggregators)","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"mulesoft-2025-897-apps-29pct-integrated","title":"MuleSoft 2025 Connectivity Benchmark: organizations use 897 applications on average; only 29% are integrated","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"servicetitan-ipo-december-2024-8-9b","title":"ServiceTitan IPO Dec 12, 2024: $8.9B closing cap, $685M revenue, $62B GTV, >95% gross retention, >110% NDR","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"servicetitan-open-data-pledge-vs-exit-friction","title":"ServiceTitan: \"Open Data Pledge\" promises CSV export — but practitioner reports cite $24k-$39k exit contract buyouts (flag for verification)","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"vertical-saas-export-portability-comparison-2026","title":"Reference: vertical SaaS data portability comparison (ServiceTitan / Jobber / Housecall Pro / Clio / Karbon / Tekmetric)","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"tunguz-vertical-saas-sales-efficiency-097-vs-066","title":"Tunguz: vertical SaaS median sales efficiency 0.97 vs horizontal 0.66 (n=54, 2015 — flag age)","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"pipeda-bill-c-27-died-january-2025","title":"Canadian privacy 2026: PIPEDA still governs; Bill C-27 died on the Order Paper Jan 6, 2025 — 
no fines, only findings","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"brinks-home-opc-finding-2024","title":"OPC vs Brinks Home (PIPEDA Findings #2024-002, Mar 28 2024): inadequate safeguards left customer data accessible for 10 weeks","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"quebec-law-25-sept-2024-data-portability-c25m-fines","title":"Quebec Law 25 (fully in force Sept 22, 2024): data portability + fines up to C$25M / 4% of worldwide turnover","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"cfib-2025-smb-10pct-fully-integrated","title":"CFIB / Sage / Payworks 2025: only 10% of Canadian SMBs have digital tools fully integrated; $1.60 ROI per dollar invested","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"clio-legal-trends-2024-2025-ai-adoption-79-to-93","title":"Clio Legal Trends: AI adoption among legal professionals jumped 19% (2023) → 79% (2024) → 93% mid-sized firms (2025); 64% mid-sized offer flat-fee","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"modern-data-stack-on-budget-2026","title":"Reference: minimum viable data stack for a $1M-$10M Canadian service business (2026, C$100-C$500/month)","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"data-maturity-curve-5-stage","title":"Reference: the 5-stage Data Maturity Curve — from stranded data to data as product","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"industry-data-map-5-verticals","title":"Reference: operational data map by industry — what gets generated, stranded, and unlockable for 5 service verticals","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"bj-fuel-distributor-data-model-illustration","title":"Illustration: what \"modeling the actual business\" means for Boucher & Jones fuel distribution (in-development, NOT 
delivered)","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"rule-own-customer-id-as-primary-key","title":"RULE: Own a single customer_id primary key that joins across your vertical SaaS + QuickBooks + email + ads.","kind":"rule","scope":"business","link_type":"relates-to"},{"slug":"rule-dont-bend-business-to-generic-crm","title":"RULE: Don't bend the client's business model to a generic CRM. Either find vertical SaaS that fits, or add a custom data layer on top.","kind":"rule","scope":"business","link_type":"relates-to"},{"slug":"rule-structured-data-is-privacy-compliance-accelerator","title":"RULE: Treat a deliberate data layer as a privacy-compliance accelerator, not a privacy risk. The scattered alternative is harder to comply with.","kind":"rule","scope":"business","link_type":"relates-to"},{"slug":"research-brief-owning-your-stack","title":"Research brief: Owning your stack — why agency-managed platforms cost more than they save (piece 4 of 15)","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"research-brief-built-to-last","title":"Research brief: Built to Last — why most SMB sites rebuild every 3-4 years (piece 5 of 15)","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"research-brief-public-data-private-moat","title":"Research brief: Public data as a private moat — building proprietary intelligence from government open data (piece 11 of 15)","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"research-brief-marketing-sites-that-do-something","title":"Research brief: What makes a marketing site do something (piece on brochure vs platform)","kind":"reference","scope":"business","link_type":"relates-to"}],"incoming":[{"slug":"foundation-roadmap-15-pieces-closure","title":"CANDID REFERENCE: how the 15-brief foundation roadmap connects — the throughline from strategic frame to editorial 
layer","kind":"reference","scope":"business","link_type":"depends-on"}]},"created_at":"2026-05-22T20:37:13.053Z","updated_at":"2026-05-22T20:37:13.053Z"}