Earlier we said “tests exist, CI doesn’t.” On closer look, the test story is worse — and better.
The precise breakdown
Of 7,810 analyzed Opus 4.7 repos:
| Test bucket | Repos | Share |
|---|---|---|
| No tests at all | 4,581 | 59% |
| Minimal tests (test:source < 10%) | 1,031 | 13% |
| Some tests (10-50%) | 1,325 | 17% |
| Well-tested (>50%) | 823 | 11% |
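The bucketing above is a simple ratio check over file counts. A sketch of the classification logic, assuming a repo is summarized by its test-file and source-file counts (`bucket` is a hypothetical helper, not from the analysis pipeline; thresholds mirror the table):

```python
def bucket(test_files: int, source_files: int) -> str:
    """Assign a repo to one of the four buckets from the table.

    Ratio is test files : source files, using the table's
    10% and 50% cutoffs.
    """
    if test_files == 0:
        return "no tests"
    ratio = test_files / source_files
    if ratio < 0.10:
        return "minimal"
    if ratio <= 0.50:
        return "some"
    return "well-tested"

# 3 test files against 40 source files is a 7.5% ratio
print(bucket(3, 40))
```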
Three out of five Opus 4.7 repos contain zero test files. Not “tests that don’t run” — just no test files at all.
When tests exist, they’re substantial
The 11% that do have strong test coverage write real tests. Their test:source ratios are often above 75%, suggesting deliberate TDD-style practice. When Opus 4.7 is prompted to write tests, it writes good ones; the issue is that it doesn't do so by default.
Why the gap exists
Three hypotheses based on commit patterns and corpus structure:
1. Initial prompts don’t ask for tests
The “build me X” prompt pattern produces working features. Tests aren’t in the feature description. If the user doesn’t explicitly say “write tests for this,” Opus 4.7 moves on once the feature works.
2. Test runners are installed but not used
Some 800 repos pull in Vitest as a framework (per package.json), but only 394 pair it with substantial *.test.ts files. That means roughly half of the repos that install Vitest never write a test for it. The package is scaffolded as part of the template; writing tests is a separate act that doesn’t happen.
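Measuring this gap is a matter of comparing a declared dependency against actual test files on disk. A sketch, assuming each repo is a local checkout (the function name and path layout are illustrative, not the analysis pipeline's actual code):

```python
import json
from pathlib import Path

def vitest_declared_but_untested(repo: Path) -> bool:
    """True if the repo lists Vitest in package.json
    but contains no *.test.ts / *.test.tsx files."""
    pkg = repo / "package.json"
    if not pkg.exists():
        return False
    data = json.loads(pkg.read_text())
    deps = {**data.get("dependencies", {}), **data.get("devDependencies", {})}
    if "vitest" not in deps:
        return False
    has_tests = any(repo.rglob("*.test.ts")) or any(repo.rglob("*.test.tsx"))
    return not has_tests
```

Run over a corpus of checkouts, summing the `True` results gives the "runner installed, no tests written" count directly.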
3. Agent loops don’t circle back to tests
Even in multi-turn agent sessions, the iteration typically goes: write → run → fix errors → declare done. Tests are a parallel investment with no immediate visual payoff. The agent doesn’t add them unless asked.
What’s different about the 11%?
Looking at the 823 well-tested repos, common traits:
- They’re longer-lived (higher commit counts, not one-shot generations)
- They tend to be libraries or SDKs where tests are the primary validation mechanism
- They’re more likely to have CLAUDE.md specifying a test command
- They often include CI configs (GitHub Actions, etc.)
In other words: a test-writing Opus 4.7 repo is usually one where the human collaborator baked testing into the workflow upfront.
The CI sibling gap
Separately: of the 3,179 repos that do have any tests, only a fraction wire them into CI. Most install Vitest/pytest and configure the test script, but there’s no .github/workflows/ci.yml running it.
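Detecting the "tests but no CI" case is a file-presence check. A sketch, assuming GitHub Actions is the only CI system of interest (helper name and globs are illustrative):

```python
from pathlib import Path

def has_tests_but_no_ci(repo: Path) -> bool:
    """True if test files exist but no GitHub Actions workflow is present."""
    # Cover both the Vitest/JS convention and the pytest convention
    test_files = list(repo.rglob("*.test.*")) + list(repo.rglob("test_*.py"))
    workflows = repo / ".github" / "workflows"
    has_ci = workflows.is_dir() and (
        any(workflows.glob("*.yml")) or any(workflows.glob("*.yaml"))
    )
    return bool(test_files) and not has_ci
```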
So the actual story is a funnel:

```
Has tests                 40%   (3,179 repos)
  ↓
Has tests + CI           ~15%   (estimate based on workflow file presence)
  ↓
Has tests + CI passing   ≪15%
```
The training signal
If you’re curating training data for test generation, don’t pull positive examples from the test-bearing 40%. Pull from the well-tested 11%. That 823-repo subset writes tests consistently, meaningfully, and with the stack Opus 4.7 defaults to (Vitest + React Testing Library, pytest + fixtures).
The fix
Three automated interventions would close most of this gap:
- Post-generation step: if `package.json` has a `test` script and `*.test.*` files exist, add a minimal `.github/workflows/ci.yml` that runs on push
- Scaffold step: for any “build me X” prompt, include 3-5 smoke tests with the initial generation
- CLAUDE.md reinforcement: when CLAUDE.md exists, instruct the agent to add tests whenever a new function is added
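The post-generation step amounts to emitting one small workflow file. A minimal sketch of what that generated `.github/workflows/ci.yml` could look like (the Node version and npm commands are assumptions for a typical Vitest repo, not corpus findings):

```yaml
name: CI
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npm test
```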
Each of these is trivial to implement and would visibly shift the distribution.