{"id":5,"slug":"research-brief-kb-backed-website-methodology","title":"Research brief: The knowledge-base-backed website (piece 3 of 15)","kind":"reference","scope":"business","status":"current","audiences":["kevin","claude-code","candid-team"],"topics":["content-architecture","knowledge-base","structured-content","e-e-a-t","agency-methodology"],"reference_body":"**Status:** Research material, not a finished article. For the Candid Creative KB. Compiled May 22, 2026.\n\n## TL;DR\n\n- KB-backed websites separate **research** from **publication** by treating sources, claims, and definitions as typed, citable nodes in a structured content graph. The dominant 2026 pattern is Markdown on disk with frontmatter, Zod or JSON Schema validation, and a static-site build (Astro Content Collections, Quartz, Docusaurus, MDX-on-Next.js).\n- The strongest institutional exemplars are not personal \"digital gardens\" but **Stripe docs, Our World in Data, the Stanford Encyclopedia of Philosophy, Cochrane Library, OpenAlex, Semantic Scholar, Bellingcat's OSINT toolkit, CourtListener, and Anthropic's Transformer Circuits Thread**, all production sites with provenance, dating, and versioning baked into the methodology.\n- The empirical AI-citation case is real but narrow: Aggarwal et al. (KDD '24) showed 30-40% visibility lifts from citations/statistics/quotations and a 115.1% lift for rank-5 pages, but `/llms.txt` itself sees near-zero AI-bot traffic when measured; the durable value is structure on the page, not the index file.\n\n## The 21 strongest claims to anchor future writing (filed as atomic entries)\n\nSee linked entries below. 
Through-lines: provenance as a discipline (not a feature); the prose-vs-data dichotomy is false; \"digital garden\" is a B2B liability, so reframe it as \"research-backed knowledge base\"; the methodology is itself the article: Candid's KB IS the demonstration.\n\n## Caveats (the strongest gaps to acknowledge)\n\n- **No rigorous study** compares the lifetime traffic, lead quality, or revenue of KB-backed vs CMS-backed marketing sites while controlling for industry. The compounding-value claim rests on analogy (Stripe, OWID, Gwern) and theory, not RCTs.\n- The eMarketer 2025 figure (8% / 8.6% overlap with Google top 10) comes via secondary sources; the underlying Ahrefs methodology has not been independently audited.\n- OtterlyAI's 90-day llms.txt measurement is single-source.\n- The Princeton GEO study tested 2023-era engines; whether its results hold for 2026 engines is unverified.\n- \"Digital garden\" usage warning: the term collides with the unrelated walled-gardens / platforms critique. Disambiguate if both terms appear in the same piece.\n\n## Recommendations to the writer of piece 3\n\n1. Lead with institutional examples (Stripe, OWID, SEP), not personal gardens. Personal gardens read as side projects to skeptical B2B buyers.\n2. Use \"research-backed knowledge base\" or \"documentation-driven website\", never \"digital garden\", in client-facing copy.\n3. Cite the Princeton GEO paper with the ACM DOI as the empirical anchor.\n4. Acknowledge the llms.txt skepticism early; it inoculates against trend-chasing.\n5. Use Diátaxis as the information architecture (IA) of the article itself.\n6. Build the article as the first node in the agency KB and cross-link it to all related entries.","rationale_body":"Why this brief exists: Candid Creative's outward-facing site IS the demonstration of the agency methodology. 
This brief justifies the structural choices (KB substrate + curated long-form on top) by citing the production examples (Stripe, OWID, SEP) and the empirical evidence (GEO paper).","metadata":null,"links":{"outgoing":[{"slug":"diataxis-framework-procida","title":"Diátaxis (Daniele Procida): four documentation types — tutorials, how-to, reference, explanation","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"matuschak-evergreen-notes","title":"Andy Matuschak: evergreen notes — atomic, concept-oriented, densely linked, accreting over time","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"stripe-docs-as-product","title":"Stripe docs as a first-class product — Markdoc framework, documentation in performance reviews","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"wikipedia-verifiability-policy","title":"Wikipedia verifiability policy: all challenged material must carry an inline citation to a reliable published source","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"wikipedia-top-cited-domain-ai-mode-2026","title":"Wikipedia is the #1 cited domain in Google AI Mode (11.22%); YouTube #2 at 9.51%","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"llms-txt-low-traffic-skepticism","title":"OtterlyAI: /llms.txt sees 0.1% of AI-bot traffic — performs worse than average content page","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"stanford-encyclopedia-permanent-archives","title":"Stanford Encyclopedia of Philosophy: every entry has permanent dated archived editions","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"our-world-in-data-open-provenance","title":"Our World in Data: CC-BY licensing, per-indicator JSON/CSV endpoints, full GitHub provenance","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"openalex-semantic-scholar-citation-graphs","title":"OpenAlex (271M works) and Semantic 
Scholar (214M+ works) — open scholarly citation graphs at scale","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"astro-content-collections-zod","title":"Astro Content Collections (stable since 2.0, modernized in Content Layer API late 2024): Zod-validated frontmatter, build-time TypeScript types","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"princeton-geo-paper-aggarwal-2024","title":"Princeton GEO paper (Aggarwal et al., KDD '24) — the foundational generative engine optimization study","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"rule-cite-with-named-source-and-url","title":"RULE: Every non-trivial claim carries a named source with author/institution + date + URL. Confidence flag honest.","kind":"rule","scope":"business","link_type":"relates-to"},{"slug":"rule-publish-and-last-updated-dates-mandatory","title":"RULE: Every page ships with a publish date and a last-updated date. Refresh quarterly minimum.","kind":"rule","scope":"business","link_type":"relates-to"},{"slug":"research-brief-structured-content-as-competitive-advantage","title":"Research brief: Structured content as a competitive advantage (piece 2 of 15)","kind":"reference","scope":"business","link_type":"relates-to"}],"incoming":[{"slug":"research-brief-owning-your-stack","title":"Research brief: Owning your stack — why agency-managed platforms cost more than they save (piece 4 of 15)","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"decay-vs-compound-matrix","title":"Reference framework: which website dimensions decay vs compound over 10 years (12-dimension matrix)","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"research-brief-built-to-last","title":"Research brief: Built to Last — why most SMB sites rebuild every 3-4 years (piece 5 of 15)","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"research-brief-research-before-pages","title":"Research brief: 
Research Before Pages — methodology for KB-backed websites (piece 14 of 15)","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"foundation-roadmap-15-pieces-closure","title":"CANDID REFERENCE: how the 15-brief foundation roadmap connects — the throughline from strategic frame to editorial layer","kind":"reference","scope":"business","link_type":"depends-on"},{"slug":"research-brief-confidence-sources-dated-claims","title":"Research brief: Confidence Levels, Sources, and Dated Claims — why every statement on a credible site should be verifiable (piece 15 of 15)","kind":"reference","scope":"business","link_type":"relates-to"}]},"created_at":"2026-05-22T18:57:39.467Z","updated_at":"2026-05-22T18:57:39.467Z"}