Three data categories for SMB-facing analytics: public/government open data, live third-party feeds, and operational (first-party) data

reference · Scope: business · Status: current

open-data data-infrastructure data-moats

Created 2026-06-20

Summary

Claim: Data inputs for SMB-facing analytics fall into three categories:

1. Public / government open data: statistical agencies (Census, StatCan, Eurostat), central-bank indicators (FRED, BoC Valet), weather, GIS, transit (GTFS), and business, property, and permit registries.
2. Live third-party feeds and APIs: commercial market data, mapping/places, embedded-analytics SaaS.
3. Operational (first-party) data: the business's own transaction logs, CRM, inventory, scheduling, and product-usage logs.
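
As a rough illustration of the taxonomy, the three categories can be modelled as a small enum with a catalogue of tagged sources. This is only a sketch: the names `DataCategory`, `DataSource`, `SOURCES`, and `by_category` are hypothetical, and the catalogue simply reuses examples from the list above.

```python
from dataclasses import dataclass
from enum import Enum


class DataCategory(Enum):
    PUBLIC_OPEN = 1        # public / government open data
    THIRD_PARTY_FEED = 2   # live third-party feeds and APIs
    OPERATIONAL = 3        # operational (first-party) data


@dataclass(frozen=True)
class DataSource:
    name: str
    category: DataCategory


# Illustrative catalogue (assumption): entries reuse the examples named in the list.
SOURCES = [
    DataSource("FRED", DataCategory.PUBLIC_OPEN),
    DataSource("BoC Valet", DataCategory.PUBLIC_OPEN),
    DataSource("GTFS transit feed", DataCategory.PUBLIC_OPEN),
    DataSource("commercial market data", DataCategory.THIRD_PARTY_FEED),
    DataSource("mapping/places API", DataCategory.THIRD_PARTY_FEED),
    DataSource("transaction logs", DataCategory.OPERATIONAL),
    DataSource("CRM", DataCategory.OPERATIONAL),
]


def by_category(sources):
    """Group source names under each of the three categories."""
    grouped = {category: [] for category in DataCategory}
    for source in sources:
        grouped[source.category].append(source.name)
    return grouped


if __name__ == "__main__":
    for category, names in by_category(SOURCES).items():
        print(category.name, names)
```

Tagging each input this way makes it easy to see at a glance how much of an analytics product rests on inputs that any competitor can also obtain.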

Source: Industry framework — synthesised from FRED docs (https://fred.stlouisfed.org/docs/api/fred/), BoC Valet docs (https://www.bankofcanada.ca/valet/docs), gtfs.org, and the build-vs-buy-data literature (https://medium.com/@audaciatech/data-products-build-vs-buy ; https://www.audacia.co.uk).

Confidence: Industry-consensus.

Why this matters for Candid: The three buckets carry very different cost and defensibility profiles. Categories 1 and 2 are non-exclusive: anyone can use them. Category 3 is the only one with native defensibility. See: Andreessen Horowitz, "The Empty Promise of Data Moats" (Casado & Lauten, 2019), which argues that most "data network effects" are really scale effects that diminish; Synthesis, which holds that data is a defensible asset only when it is proprietary, hard to replicate, tightly coupled to a feedback loop, and continuously refreshed (otherwise it is an operational byproduct any competitor can buy or collect); and R2, "Build only on data you already own" (transaction history, CRM, scheduling, no-show patterns), because that is the only category with native defensibility.
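
The four-condition test in the synthesis (proprietary, hard to replicate, coupled to a feedback loop, continuously refreshed) can be sketched as a simple all-of check. The `DataAsset` type, the `is_defensible` function, and the flag values assigned to the two example assets are illustrative assumptions, not assessments taken from the cited sources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataAsset:
    name: str
    proprietary: bool
    hard_to_replicate: bool
    feeds_feedback_loop: bool
    continuously_refreshed: bool


def is_defensible(asset: DataAsset) -> bool:
    """True only when all four conditions hold; any single miss fails the test."""
    return all((
        asset.proprietary,
        asset.hard_to_replicate,
        asset.feeds_feedback_loop,
        asset.continuously_refreshed,
    ))


# Hypothetical flag values, chosen only to illustrate the check.
no_show_patterns = DataAsset("no-show patterns", True, True, True, True)
fred_series = DataAsset("FRED series", False, False, False, True)

if __name__ == "__main__":
    for asset in (no_show_patterns, fred_series):
        print(asset.name, is_defensible(asset))
```

The check is deliberately strict: a continuously refreshed public series still fails because it is neither proprietary nor hard to replicate.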

Related entries

Referenced by (1)

reference Research brief: live data and data-driven tools for SMBs — when it's an edge, when it's overkill (June 2026) · relates-to