Research brief: the searchable, structured catalogue as a working tool — when records-not-prose pays off (June 2026)
Status: Synthesised June 2026. Sister brief to Research brief: the falling cost floor of "real" web functionality for SMBs (June 2026) and to Research brief: the website as a working surface of the business — four capabilities, AI-citation decoupling, freshness as a real signal (June 2026).
TL;DR — the through-line
A searchable, structured catalogue — where each item is a distinct record with queryable attributes — functions as a working tool: visitors filter and query it to self-serve answers (Nielsen Norman Group — faceted search refines a large content set; controls + results displayed simultaneously, ScienceDirect — faceted search definition: progressive refinement by independent facets/attributes), and each record becomes its own individually-indexable, search- and AI-eligible page (Google Search Central — structured data labels each individual element so users can search by ingredient, calorie count, cook time, Google Search Central — product rich results require "a distinct URL" per product (or per variant); confirms one findable page per record, schema.org — ItemList / ListItem / Product / Offer types exist precisely to mark up individual records and lists as machine-readable structured data).
Independent research robustly establishes that customers prefer to self-serve before contacting a business (HBR (Dixon et al., 2017) — 81% of all customers attempt self-service before reaching out to a live representative, Forrester (2016) — web/mobile self-service overtook the phone; FAQ-page use rose from 67% (2012) to 81% (2015) of US online adults) — but this evidence is about self-service BROADLY, not searchable catalogues SPECIFICALLY. The strongest indirect argument that structured records beat un-queryable documents is Gartner's finding that self-service most often fails on findability (Gartner / Eric Keller — single most common self-service failure mode is inability to find relevant content; appears in >43% of cases, Gartner (Aug 2024) — only 14% of customer service and support issues are fully resolved in self-service; only 36% for "very simple" issues).
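As an illustration of "each item is a distinct record with queryable attributes", the following is a minimal sketch of progressive refinement by independent facets. The `Record` fields, the sample catalogue, and the helper names `apply_facets` and `facet_counts` are all hypothetical, chosen only for this example; a real catalogue would normally sit in a database or search index rather than an in-memory list.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One catalogue item; the field names are illustrative assumptions."""
    slug: str
    category: str
    material: str
    price: float


def apply_facets(records, **selected):
    """Keep only records whose attributes equal every selected facet value."""
    return [
        r for r in records
        if all(getattr(r, name) == value for name, value in selected.items())
    ]


def facet_counts(records, facet):
    """Count how many of the given records carry each value of one facet."""
    return Counter(getattr(r, facet) for r in records)


# Hypothetical sample data.
CATALOGUE = [
    Record("oak-desk", "desk", "oak", 420.0),
    Record("pine-desk", "desk", "pine", 260.0),
    Record("oak-shelf", "shelf", "oak", 180.0),
    Record("steel-shelf", "shelf", "steel", 150.0),
]

# Progressive refinement: narrow by one facet, then by another.
oak_items = apply_facets(CATALOGUE, material="oak")
oak_desks = apply_facets(oak_items, category="desk")

# Counts a UI could show next to the remaining filter controls.
remaining = facet_counts(oak_items, "category")
```

The same attributes locked inside prose or a PDF could not be filtered this way, which is the contrast the brief draws between records and un-queryable documents.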
The build-versus-skip line
Build when any of three conditions holds. First, records are numerous: roughly, a visitor would otherwise have to scan through a few hundred items (HawkSearch (vendor — concedes against interest) — for catalogs of "just a few dozen products," basic search and navigation are adequate, Prefixbox (vendor — faceted-search software, concedes against interest) — skip faceted search if catalog <200 products; implementation cost outweighs UX value, Luigi's Box (vendor) — independently corroborates the skip cutoff: "smaller catalogs with only a few hundred products may not require this level of complexity"). Second, the data updates frequently. Third, items differ along several independent attributes that people genuinely filter on. See R1 — Build a searchable, structured catalogue when records are numerous, change often, or carry several independent queryable attributes.
Skip when the inventory is small (commonly cited as under ~200 items / a few dozen pages), stable, and shallow. See R2 — Skip faceted search when the inventory is small (~under 200 items), stable, and shallow; a static list is fine.
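The build-versus-skip line above can be written down as a rough decision rule. This is a sketch only: the ~200-item cutoff is the vendor-sourced figure cited in this brief, while the `Inventory` fields, the function name, and the `min_attributes=3` reading of "several" are assumptions made for illustration.

```python
from dataclasses import dataclass

# Vendor-sourced skip cutoff cited in the brief; treat as a rough guide.
SKIP_THRESHOLD_ITEMS = 200


@dataclass
class Inventory:
    """Hypothetical summary of a site's inventory."""
    item_count: int
    changes_often: bool
    independent_attributes: int  # attributes visitors actually filter on


def should_build_catalogue(inv, min_attributes=3):
    """Return True if any R1 trigger holds, False if R2 (skip) applies.

    min_attributes is an assumed reading of "several"; the brief gives
    no number for it.
    """
    numerous = inv.item_count >= SKIP_THRESHOLD_ITEMS
    multi_attribute = inv.independent_attributes >= min_attributes
    return numerous or inv.changes_often or multi_attribute


small_static = Inventory(item_count=40, changes_often=False, independent_attributes=1)
small_volatile = Inventory(item_count=40, changes_often=True, independent_attributes=1)
large_static = Inventory(item_count=500, changes_often=False, independent_attributes=1)
```

Here `should_build_catalogue(small_static)` is false (small, stable, and shallow), while the other two inventories each trip one trigger. The three conditions are joined with "or", matching R1; R2 is then simply the case where none of them holds.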
What the brief recommends
- One indexable page per record (R3 — One indexable page per record (distinct URL per item / variant) — per Google's merchant-listing guidance), following Google's requirement of "a distinct URL" for each product or variant (Google Search Central — product rich results require "a distinct URL" per product (or per variant); confirms one findable page per record).
- Do not generalise the Nestlé 82% CTR figure (R4 — Do NOT generalise the Nestlé 82% CTR figure; it is one company's self-measurement, never present as typical). It appears in Google's own documentation but is attributed to one named company, so it stays RING-FENCED (Nestlé 82% higher CTR on rich-results pages — Google's own documentation, attributed to one named company; RING-FENCED, never present as typical).
- Self-service most often fails on findability, so deliver it via structured, queryable records (R5 — Self-service most often fails on findability; deliver self-service via structured, queryable records, not un-queryable documents), anchored on Gartner's finding that this failure mode appears in >43% of cases.
- Quarantine vendor CTR multipliers (R6 — Quarantine vendor CTR / AI-citation multipliers; the only clean primary figure (Nestlé 82%) is single-company and ring-fenced). The SEO-agency "35% CTR" / "2.5x AI" / "40% more AI Overview" claims come from sources with an incentive to inflate and have no independent corroboration.
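To make R3 concrete, the sketch below gives each record its own URL and emits schema.org `Product` / `Offer` markup for it, plus an `ItemList` for the listing page. The base URL, record fields, and function names are hypothetical; the property names (`name`, `url`, `offers`, `price`, `priceCurrency`, `itemListElement`, `position`) are standard schema.org vocabulary. As the Google Search Central reference in this brief notes, correct markup produces eligibility for rich results, not a guarantee.

```python
import json

BASE_URL = "https://example.com/catalogue"  # hypothetical site


def record_url(slug):
    """One distinct, stable URL per record (R3)."""
    return f"{BASE_URL}/{slug}"


def product_jsonld(record):
    """Build schema.org Product markup for a single record's page."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": record["name"],
        "url": record_url(record["slug"]),
        "offers": {
            "@type": "Offer",
            "price": f'{record["price"]:.2f}',
            "priceCurrency": record["currency"],
        },
    }


def itemlist_jsonld(records):
    """Build schema.org ItemList markup pointing at each record's page."""
    return {
        "@context": "https://schema.org",
        "@type": "ItemList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "url": record_url(r["slug"])}
            for i, r in enumerate(records, start=1)
        ],
    }


# Hypothetical sample records.
RECORDS = [
    {"slug": "oak-desk", "name": "Oak desk", "price": 420, "currency": "EUR"},
    {"slug": "pine-desk", "name": "Pine desk", "price": 260, "currency": "EUR"},
]

pages = [product_jsonld(r) for r in RECORDS]
listing = itemlist_jsonld(RECORDS)

# Serialised form, as it would be embedded in a
# <script type="application/ld+json"> element on each record's page.
first_page_markup = json.dumps(pages[0], indent=2)
```

Generating the markup from the same records that drive the filters keeps the page content and the structured data in step, which matters if the underlying data changes often.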
Source-incentive meta-finding
The independent self-service research (HBR, Forrester, Gartner) measures self-service only in aggregate; no study isolates "searchable catalogue" as a discrete intervention. The strongest primary documentation for the mechanism comes from Google Search Central and schema.org. The only clean magnitude figure independent of vendors is Baymard's 67-90% vs 17-33% abandonment finding (Baymard Institute (2015) — sites with mediocre product list usability saw 67-90% abandonment vs 17-33% for sites with even a slightly optimised toolset), which is ~11 years old and e-commerce-specific.
For the full source-quality reckoning, see Caveats for the searchable-catalogue brief: no study isolates catalogues as a variable; vendor-sourced skip thresholds; Baymard age; AI-eligibility under-sourced.
Related
- reference Research brief: the falling cost floor of "real" web functionality for SMBs (June 2026)
- reference ScienceDirect — faceted search definition: progressive refinement by independent facets/attributes
- reference FlowHunt (vendor glossary) — faceted search plain-language: filter by multiple attributes simultaneously vs single-attribute filter
- reference Nielsen Norman Group — faceted search refines a large content set; controls + results displayed simultaneously
- reference Structured vs unstructured data — Integrate.io / MongoDB definition: predefined schema, every record consistent, sorting/filtering/querying straightforward
- reference USPTO Patent 8,700,594 — structured data is "searchable by data type"; unstructured data (bitmaps, audio, text docs) is not
- reference Mechanism summary — structured catalogues expose attributes as data; prose/PDF/images lock them in a format no filter can reach
- reference HBR (Dixon et al., 2017) — 81% of all customers attempt self-service before reaching out to a live representative
- reference Forrester (2016) — web/mobile self-service overtook the phone; FAQ-page use rose from 67% (2012) to 81% (2015) of US online adults
- reference Forrester 2015 Customer Lifecycle Survey — 53% of customers likely to abandon online purchases if they can't find quick answers; 73% say valuing their time is most important
- reference Gartner (Aug 2024) — only 14% of customer service and support issues are fully resolved in self-service; only 36% for "very simple" issues
- reference Gartner / Eric Keller — single most common self-service failure mode is inability to find relevant content; appears in >43% of cases
- reference Gartner (Sept 2019, 8,398 customers) — company website preferred by 12% / search engine by 4% for issue resolution; phone leads at 44%
- reference Gartner (April 2021) — 62% of millennials and 75% of Gen Z would use noncompany guidance (Google, YouTube) to self-resolve, even when they can contact support
- reference schema.org — ItemList / ListItem / Product / Offer types exist precisely to mark up individual records and lists as machine-readable structured data
- reference schema.org — founded jointly by Bing, Google, and Yahoo! in 2011; Yandex joined Nov 2011
- reference Google Search Central — structured data labels each individual element so users can search by ingredient, calorie count, cook time
- reference Google Search Central — structured data produces eligibility, not a guarantee; does not guarantee rich results even with correct markup
- reference Google Search Central — product rich results require "a distinct URL" per product (or per variant); confirms one findable page per record
- reference Google Search Central — "We won't show a rich result for time-sensitive content that is no longer relevant"; data freshness is implicitly rewarded
- reference Bing — supports JSON-LD, schema.org, Microdata, Microformats, Open Graph, RDFa via the Markup Validator (August 2018)
- reference Nestlé 82% higher CTR on rich-results pages — Google's own documentation, attributed to one named company; RING-FENCED, never present as typical
- reference AI-citation eligibility via structured records — DIRECTIONAL; vendor-blog driven, Google's own position is "not a direct ranking factor"
- reference Baymard Institute (2015) — sites with mediocre product list usability saw 67-90% abandonment vs 17-33% for sites with even a slightly optimised toolset
- reference Nielsen Norman Group — faceted navigation adds interaction cost and vocabulary/metadata maintenance "consumes significant financial and human resources"
- reference Algolia (vendor) — faceting helps when catalogs have multiple specifications and broad-based filtering is insufficient
- reference HawkSearch (vendor — concedes against interest) — for catalogs of "just a few dozen products," basic search and navigation are adequate
- reference Prefixbox (vendor — faceted-search software, concedes against interest) — skip faceted search if catalog <200 products; implementation cost outweighs UX value
- reference Luigi's Box (vendor) — independently corroborates the skip cutoff: "smaller catalogs with only a few hundred products may not require this level of complexity"
- reference Caveats for the searchable-catalogue brief: no study isolates catalogues as a variable; vendor-sourced skip thresholds; Baymard age; AI-eligibility under-sourced
- research-notes QUARANTINE — "67% of customers prefer self-service over speaking with a live agent" (attributed to Zendesk)
- research-notes QUARANTINE — "Self-service interaction costs $0.10 vs $6-$12 for live agent" (attributed to Forrester)
- research-notes QUARANTINE — "Pages with structured data earn 35% higher CTR" / "2.5x higher chance of appearing in AI answers" / "40% more AI Overview appearances"
- research-notes QUARANTINE — "Microsoft reported a 22% reduction in support tickets" / "Spotify forum resolves 30% of issues"
- research-notes QUARANTINE — "60% of websites do not use facets and filters" / "only 10% of e-commerce sites use faceted sorting"
- rule R1 — Build a searchable, structured catalogue when records are numerous, change often, or carry several independent queryable attributes
- rule R2 — Skip faceted search when the inventory is small (~under 200 items), stable, and shallow; a static list is fine
- rule R3 — One indexable page per record (distinct URL per item / variant) — per Google's merchant-listing guidance
- rule R4 — Do NOT generalise the Nestlé 82% CTR figure; it is one company's self-measurement, never present as typical
- rule R5 — Self-service most often fails on findability; deliver self-service via structured, queryable records, not un-queryable documents
- rule R6 — Quarantine vendor CTR / AI-citation multipliers; the only clean primary figure (Nestlé 82%) is single-company and ring-fenced
- reference Research brief: the website as a working surface of the business — four capabilities, AI-citation decoupling, freshness as a real signal (June 2026)
Referenced by (11)
- reference HBR (Dixon et al., 2017) — 81% of all customers attempt self-service before reaching out to a live representative relates-to
- reference Gartner / Eric Keller — single most common self-service failure mode is inability to find relevant content; appears in >43% of cases relates-to
- reference Baymard Institute (2015) — sites with mediocre product list usability saw 67-90% abandonment vs 17-33% for sites with even a slightly optimised toolset relates-to
- reference USPTO Patent 8,700,594 — structured data is "searchable by data type"; unstructured data (bitmaps, audio, text docs) is not relates-to
- reference Google Search Central — structured data labels each individual element so users can search by ingredient, calorie count, cook time relates-to
- reference Google Search Central — product rich results require "a distinct URL" per product (or per variant); confirms one findable page per record relates-to
- reference schema.org — ItemList / ListItem / Product / Offer types exist precisely to mark up individual records and lists as machine-readable structured data relates-to
- reference Research brief: the website as a working surface of the business — four capabilities, AI-citation decoupling, freshness as a real signal (June 2026) relates-to
- reference Capability 1 — structured, queryable data: content stored as records with fields, types and relationships so it can be filtered, sorted, searched, and assembled on demand relates-to
- reference Research brief: the falling cost floor of "real" web functionality for SMBs (June 2026) relates-to
- reference Enterprise-tier example: typo-tolerant instant search over a product/document catalog — Algolia or Elasticsearch instead of a dedicated Lucene engineer relates-to