Opus 4.7 Writes Short Functions — Median 9 Lines

We parsed 1.3 million functions and methods across the Claude Opus 4.7 corpus. Here’s the size distribution.

Symbol-size distribution (source code only)

Symbol kind    Count     Median lines   p90 lines
function       861,660   9              64
method         440,868   5              35
class          102,534   21             205
interface      117,495   6              16
struct         41,264    5              25
type_alias     90,331    0              0

What this tells us

The median Opus 4.7 function is 9 lines long. Methods (functions attached to classes) are even shorter at 5 lines.

This isn’t an accident. It lines up with prevailing advice in Claude-oriented agent prompts:

“Keep functions small. Prefer composition over long procedural code.”

The p90 of 64 lines means 90% of functions are 64 lines or shorter. That’s a tight discipline. The long tail does exist (individual functions over 1,000 lines show up), but those are rare outliers.
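To make the median and p90 columns concrete, here is a minimal sketch of how both can be computed from a list of per-function line counts. The nearest-rank percentile definition and the sample counts are assumptions for illustration; the post does not say which percentile method was used.

```python
import math
import statistics


def nearest_rank_percentile(values, pct):
    """Return the smallest value v such that at least pct% of values are <= v."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    # Nearest-rank method: take the ceil(pct% * n)-th smallest value.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical line counts for ten functions (not taken from the corpus).
line_counts = [3, 4, 5, 7, 9, 9, 12, 20, 64, 1000]

median_lines = statistics.median(line_counts)
p90_lines = nearest_rank_percentile(line_counts, 90)
```

Note how the single 1,000-line outlier leaves both statistics untouched, which is why median and p90 describe this kind of distribution better than a mean would.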

Compare to interfaces (TypeScript): median 6 lines, p90 only 16. Opus 4.7 treats interfaces as small type contracts, not giant inheritance hierarchies.

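Classes are the fat-tail exception: median 21 lines, p90 205. Roughly one class in ten reaches 205 lines or more, nearly ten times the median, so the tail is heavier than a handful of god-objects would produce.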

Why it matters for fine-tuning

If you’re distilling this corpus into a training dataset, the length distribution is gold:
- Hard negative mining: Generated functions over 128 lines (twice the p90 of 64) can be penalized as out-of-distribution
- Positive examples: Small, focused functions with early returns are the dominant positive pattern
- Curriculum: Train on the median-length band first, let longer functions come from composition
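The length cutoff and the curriculum idea from the list above can be sketched as follows. The function names and the physical-line counting rule are hypothetical; only the constants 9 and 128 come from this post.

```python
# Thresholds taken from the figures in this post.
MEDIAN_LINES = 9   # median function length
OOD_LINES = 128    # cutoff suggested for out-of-distribution penalties


def count_lines(source: str) -> int:
    """Count physical lines; the corpus's exact counting rule is an assumption."""
    return len(source.splitlines())


def length_bucket(source: str) -> str:
    """Label a generated function by length relative to the cutoff."""
    if count_lines(source) > OOD_LINES:
        return "out_of_distribution"
    return "in_distribution"


def curriculum_order(samples: list[str]) -> list[str]:
    """Order samples so those closest to the median length come first."""
    return sorted(samples, key=lambda s: abs(count_lines(s) - MEDIAN_LINES))


short_fn = "\n".join(["x = 1"] * 9)     # 9 lines, at the median
long_fn = "\n".join(["x = 1"] * 200)    # 200 lines, past the cutoff
ordered = curriculum_order([long_fn, short_fn])
```

A real pipeline would likely combine this with other signals; length alone says nothing about whether a short function is correct.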

Documentation coverage

Even better, we know how often Opus 4.7 documents these functions:

  • 1,316,011 total functions/methods
  • 241,265 have a docstring or JSDoc
  • 18.3% coverage

That’s below Python’s community norm (~35%) but higher than generic JS/TS corpora (~8%). Python functions in the corpus are much better documented than TypeScript functions — a known parser artifact plus a real style difference.
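For the Python side, docstring coverage can be measured with the standard library's ast module. This is a sketch of one way to do it, not necessarily the parser used for the numbers above; the sample source is made up.

```python
import ast


def docstring_coverage(source: str) -> tuple[int, int]:
    """Return (documented, total) for functions and methods in Python source."""
    tree = ast.parse(source)
    funcs = [
        node for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    documented = sum(1 for f in funcs if ast.get_docstring(f) is not None)
    return documented, len(funcs)


# Hypothetical sample: one documented function, one bare function, one bare method.
sample = '''
def documented():
    """Say hello."""
    return "hello"

def bare():
    return 1

class Greeter:
    def greet(self):
        return "hi"
'''

documented, total = docstring_coverage(sample)
sample_coverage = documented / total

# The corpus-wide figure quoted above, recomputed from its two counts.
corpus_coverage_pct = round(241_265 / 1_316_011 * 100, 1)
```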

Async adoption

Language-by-language async keyword appearance:

Language     Functions w/ async   Total     %
Python       32,975               296,376   11.1%
JavaScript   29                   120,898   ~0%
TypeScript   262                  541,617   ~0%

(The JS/TS numbers are parser artifacts — async in TS/JS isn’t captured in the signature field we index on. The Python number is real.)
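For Python, async usage can be counted from the syntax tree instead of a signature string, since async def produces its own node type. The sketch below uses the standard ast module; it is an alternative to the signature-field approach described above, and the sample source is hypothetical.

```python
import ast


def async_share(source: str) -> tuple[int, int]:
    """Return (async_functions, all_functions) for Python source."""
    async_count = 0
    total = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.AsyncFunctionDef):
            async_count += 1
            total += 1
        elif isinstance(node, ast.FunctionDef):
            total += 1
    return async_count, total


# Hypothetical sample: one coroutine and two regular functions.
sample = '''
async def fetch(url):
    return url

def parse(text):
    return text.split()

def main():
    return parse("a b")
'''

async_count, total = async_share(sample)

# The Python row of the table, recomputed from its two counts.
python_async_pct = round(32_975 / 296_376 * 100, 1)
```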


Takeaway: if you’re scoring AI-generated code quality against an Opus 4.7-matching distribution, treat short functions, early returns, and a high interface-to-class ratio (117,495 interfaces to 102,534 classes here, about 1.15 to 1) as positive signals.