AI-assisted dev productivity 2026: published evidence ranges from 21-28% speedup (GovTech) to 19% slowdown (METR RCT) — honest ceiling is 1.3-1.7× for fluent users

reference·Scope: business·Status: current

agency-methodology ai-dev-productivity

Created 2026-05-22

Published research range — three primary sources:

GovTech Singapore GitHub Copilot study (arXiv:2409.17434): "coding/tasks speed increased by 21-28%".
Longitudinal study (arXiv:2509.19708, 100+ developers): "approximately 30-40% of code shipped to production through this tool accounts for overall 28% increase in code shipment volume".
METR 2025 RCT (arXiv:2507.09089, n=16 experienced OSS developers): "allowing AI actually increases completion time by 19%—AI tooling slowed developers down," despite developers forecasting a 24% reduction.
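To compare these figures with the 1.3-1.7× ceiling in the title, it can help to put them on one scale. The sketch below converts each headline percentage into a throughput multiplier (1.0 = no change). It assumes "speed increased by X%" means X% more output per unit time and that a completion-time change of X% scales time by (1 + X/100). The function names are illustrative and the conversion is a reading of the quoted numbers, not something the studies themselves report.

```python
# Convert headline percentages into throughput multipliers (1.0 = no change).
# Assumption: a "speed" gain of X% means X% more output per unit time.
# Assumption: a completion-time change of X% scales time by (1 + X/100),
# so throughput scales by the reciprocal.

def multiplier_from_speed_gain(pct: float) -> float:
    """Throughput multiplier for a reported speed increase of pct percent."""
    return 1.0 + pct / 100.0


def multiplier_from_time_change(pct: float) -> float:
    """Throughput multiplier for a completion-time change of pct percent.

    Positive pct means tasks took longer; negative pct means they took less time.
    """
    return 1.0 / (1.0 + pct / 100.0)


govtech_low = multiplier_from_speed_gain(21)      # GovTech lower bound
govtech_high = multiplier_from_speed_gain(28)     # GovTech upper bound
metr_measured = multiplier_from_time_change(19)   # METR: 19% longer completion time
metr_forecast = multiplier_from_time_change(-24)  # METR: developers forecast 24% less time

print(f"GovTech:       {govtech_low:.2f}x to {govtech_high:.2f}x")
print(f"METR measured: {metr_measured:.2f}x")
print(f"METR forecast: {metr_forecast:.2f}x")
```

Under these assumptions the METR result works out to roughly 0.84× throughput, against a forecast of roughly 1.32×.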

Confidence: Moderate. The field is moving fast and results are workload-dependent.

Translation:

Fluent users with skill-library discipline → meaningful (20-30%) lifts.
Naive power-users on unfamiliar codebases → can be slower than baseline.
Honest ceiling for a 2-person Candid Creative shop: once the team is fluent, ship 1.3-1.7× the marketing-site work a 2024-equivalent shop could.
The cap is set by client communication, discovery, and design, not coding.
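One way to see why non-coding work sets the cap is an Amdahl's-law style estimate: only the coding share of a project gets faster, so the overall multiplier is bounded by the share that does not. The sketch below is an illustrative model, not the method behind the 1.3-1.7× figure. The function name, the coding shares, and the coding speedups are all hypothetical inputs chosen for illustration.

```python
# Amdahl's-law style estimate of whole-project speedup.
# Assumption: project time splits into a coding share (accelerated by AI)
# and a non-coding share (client communication, discovery, design) that is not.
# All inputs below are hypothetical; none come from the cited studies.

def overall_speedup(coding_share: float, coding_speedup: float) -> float:
    """Whole-project multiplier when only the coding share gets faster."""
    if not 0.0 <= coding_share <= 1.0:
        raise ValueError("coding_share must be between 0 and 1")
    if coding_speedup <= 0.0:
        raise ValueError("coding_speedup must be positive")
    remaining_time = (1.0 - coding_share) + coding_share / coding_speedup
    return 1.0 / remaining_time


# Hypothetical scenarios: (coding share of project time, coding speedup).
scenarios = [(0.5, 2.0), (0.6, 3.0), (0.5, 1_000_000.0)]

for share, speedup in scenarios:
    result = overall_speedup(share, speedup)
    print(f"coding share {share:.0%}, coding {speedup:g}x -> overall {result:.2f}x")
```

In this model, even an effectively unlimited coding speedup on a project that is half coding tops out near 2×, which is the sense in which the cap comes from everything that is not coding.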

Where AI benefits most (high confidence):

Greenfield Astro/Tailwind scaffolding.
Boilerplate refactors (renaming, type-narrowing, API client updates).
Framework migrations (Tailwind v3→v4, Astro 4→5→6, Next.js 14→15→16).
Writing tests for existing code.
One-off scripts and CMS schemas.

Where it benefits least (moderate confidence):

Performance optimization (requires human judgment about what to measure).
Accessibility decisions.
Information architecture and content strategy.
Anything requiring institutional/client context the model doesn't have.

Push back on hype: "Vibe coding" entire production marketing sites without review produces fragile code. The shops that win in 2026 use AI to accelerate the boring work and keep humans on the judgment calls.

rule RULE: Every Candid developer gets paid seats for Claude Code and Cursor (or Copilot). At ~$30-40/mo/person, the cost is negligible against the productivity lift.
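The cost side of this rule can be sanity-checked with a break-even calculation: how many minutes per month a developer must save for the tooling to pay for itself. This is a minimal sketch. The hourly rate of $100 is a hypothetical placeholder, not a Candid figure, and the function name is illustrative.

```python
# Break-even check for per-seat AI tooling.
# Assumption: saved developer time is valued at a flat hourly rate.
# HYPOTHETICAL_HOURLY_RATE is a placeholder, not a figure from this note.

HYPOTHETICAL_HOURLY_RATE = 100.0  # USD per hour, illustrative only


def break_even_minutes(tool_cost_per_month: float, hourly_rate: float) -> float:
    """Minutes of saved time per month needed to cover the tool cost."""
    if hourly_rate <= 0.0:
        raise ValueError("hourly_rate must be positive")
    return 60.0 * tool_cost_per_month / hourly_rate


# The ~$30-40/mo/person range quoted in the rule.
for monthly_cost in (30.0, 40.0):
    minutes = break_even_minutes(monthly_cost, HYPOTHETICAL_HOURLY_RATE)
    print(f"${monthly_cost:.0f}/mo breaks even at {minutes:.0f} saved minutes per month")
```

At the placeholder rate, the break-even point is well under half an hour of saved time per month; substituting the shop's real rate gives the actual threshold.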

Referenced by (1)

reference Research brief: Candid Creative 2026 Build-Standards — web stack decision framework for SMB marketing sites & lightweight apps (piece 16) · relates-to
