{"id":689,"slug":"content-extraction-decision-tree","title":"Content extraction decision tree — WP REST API default, WXR XML fallback, direct DB only for hidden postmeta","kind":"reference","scope":"business","status":"current","audiences":["claude-code","candid-team"],"topics":["migration-mechanics","content-extraction"],"reference_body":"**Decision tree for extracting WordPress content during migration:**\n\n- **WP REST API** (`/wp-json/wp/v2/posts`, `/pages`, `/media`) — **default choice** for any WP 4.7+ site. Reliable, supports custom post types if `show_in_rest => true`. Sanity's official migration course documents this end-to-end.\n- **WXR XML export** (Tools → Export) — when REST API is firewalled; one-shot full archive; WordPress.com source sites via OAuth. WXR includes references to attachments but **not the binary media files**.\n- **Direct DB query** (WP-CLI `wp db export` or MySQL dump) — only when you need raw `postmeta` rows not exposed via REST. Typically ACF data or custom post-meta fields registered without REST exposure.\n\n**Page-builder content handling:**\n\n| Source | Storage format | Extraction reality |\n|---|---|---|\n| Gutenberg | HTML + `<!-- wp:blockname -->` comments | Cleanly parseable. `@wordpress/block-serialization-default-parser` converts to AST → Portable Text / MDX. |\n| Classic editor | Plain HTML | Trivial. Use `turndown` to convert to markdown. |\n| ACF fields | Serialized PHP in `postmeta`; exposed via REST with ACF-to-REST or WPGraphQL-for-ACF | Manageable if planned; catastrophic if discovered mid-migration. **Always audit ACF first.** |\n| Elementor | JSON blob in `postmeta._elementor_data` | Essentially un-portable. **Rebuild pages from screenshots.** No clean Elementor→markdown/Portable Text path exists. |\n| Divi | Shortcodes in `post_content` | Rebuild, don't convert. Severe lock-in. See [[divi-4-shortcode-lockin-et-pb]] (existing). |\n| Bricks | JSON in `postmeta._bricks_page_content_2` | Rebuild. 
|\n\n**Image migration (where projects overrun):**\n\n1. Export `/wp-content/uploads/` via SFTP. Core WP-CLI ships no `wp media export` command; `wp media` offers `import` and `regenerate` but no export.\n2. Upload to the new host (Cloudflare Images, Sanity CDN, Vercel Blob, or the new repo's `public/`).\n3. Rewrite every `<img src>` and every markdown `![alt](path)` in migrated content. Match with a regex on the old domain plus `/wp-content/uploads/`.\n4. Regenerate responsive sizes via the new stack's image pipeline.\n5. Budget time for this: Outsourcify (7,000-article migration) flagged it as a major engineering challenge.\n\n**Internal link rewriting:** run a one-shot script that scans all body content for absolute `https://oldsite.com/...` URLs or relative `/2019/...` paths and rewrites them.\n\n**Comments:** for SMB sites with fewer than 500 lifetime comments, sunset gracefully: export to JSON, archive, and replace with Giscus (GitHub Discussions). For active commenter communities, migrate to Disqus or keep WordPress headless.","rationale_body":null,"metadata":null,"links":{"outgoing":[{"slug":"divi-4-shortcode-lockin-et-pb","title":"Divi 4 stored content as proprietary [et_pb_*] shortcodes — orphan text on theme deactivation (Divi 5 fixes this)","kind":"reference","scope":"business","link_type":"depends-on"},{"slug":"elementor-no-deactivate-with-content-issue-5667","title":"Elementor: no built-in \"deactivate but retain content\" option — open feature request since 2018","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"page-builder-migration-by-use-case","title":"Reference: alternative-stack recommendations by use case and budget (Candid 6-tier framework)","kind":"reference","scope":"business","link_type":"relates-to"}],"incoming":[{"slug":"research-brief-wp-migration-playbook","title":"Research brief: The Candid Creative WordPress Migration Playbook (piece 19)","kind":"reference","scope":"business","link_type":"depends-on"},{"slug":"feature-parity-replacements-wp-to-modern","title":"Feature-parity replacements for common WordPress plugins (forms, SEO, search, 
comments, commerce, membership, newsletter, analytics)","kind":"reference","scope":"business","link_type":"relates-to"},{"slug":"migration-objection-handling-map","title":"Migration objection-handling map — sourced answers to every common client fear about migrating off WordPress","kind":"reference","scope":"business","link_type":"depends-on"}]},"created_at":"2026-05-22T21:24:18.430Z","updated_at":"2026-05-22T21:24:18.430Z"}