The Golden Set: Highest-Quality Opus 4.7 Repos by Archetype

We run quality scoring on every Opus 4.7 repo that has a completed analysis. The top picks per archetype give the clearest picture of what Opus 4.7 does at its best.

Top overall (by composite quality score)

Repo                                 Score  Language     Type
Lambda-Biolab/dpette-usb-driver      86.7   Python       driver
ethanarnold/screenase                85.3   Python       CLI
chris2ao/unifi-mcp                   84.5   Python       MCP server
mrcoggsworth/LogPose                 84.0   Python       containerized
astroicers/CyPulse                   82.2   Python       CLI
dynamik-dev/bully                    81.9   Python
knorq-ai/xlsx-mcp-server             81.2   JSON+Python  MCP server
luthen-seas/nostr-mail-ts            81.1   TypeScript
DIG-Network/dig-epoch                80.9   Markdown
jaman/ex_v_ex                        80.7   Elixir

The top 10 skew heavily toward Python and toward the CLI / MCP-server archetypes. TypeScript does fine at scale but rarely reaches the 80+ tier. The elephant in the room: MCP servers cluster disproportionately at the top. Two of the top ten are literally MCP servers, and the pattern repeats deeper in the list.

Top MCP servers specifically

When we filter for repos named *-mcp* or organized like MCP servers:

  • chris2ao/unifi-mcp — UniFi controller MCP bridge, quality 84.5
  • knorq-ai/xlsx-mcp-server — Excel/XLSX MCP exposer, quality 81.2
  • 90+ others with mcp-* naming
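
The filter itself is nothing more exotic than a glob match over repo names. A small sketch of the idea, where the in-memory repos list and its fields are placeholders standing in for the real corpus:

```python
from fnmatch import fnmatch

# Placeholder records; the real corpus lives elsewhere and is much larger.
repos = [
    {"name": "chris2ao/unifi-mcp", "quality": 84.5},
    {"name": "knorq-ai/xlsx-mcp-server", "quality": 81.2},
    {"name": "ethanarnold/screenase", "quality": 85.3},
]

# Match both the *-mcp* style and the mcp-* prefix style mentioned above.
mcp_like = [
    r for r in repos
    if fnmatch(r["name"].split("/")[-1], "*-mcp*")
    or fnmatch(r["name"].split("/")[-1], "mcp-*")
]

for repo in sorted(mcp_like, key=lambda r: r["quality"], reverse=True):
    print(f'{repo["name"]}: {repo["quality"]}')
```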

The MCP-server archetype wins on quality because:
1. Scoped contract: MCP’s JSON-RPC-over-stdio protocol is a rigid interface that forces clean boundaries (see the sketch after this list)
2. Small scope: most MCP servers come in under 2K lines
3. Template bias: Opus 4.7 seems to have a strong internal template for “how an MCP server looks” and reproduces it consistently
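
To make the first point concrete, here is a minimal sketch (not taken from any repo in the corpus) of the loop such a server runs: newline-delimited JSON-RPC 2.0 requests read from stdin, responses written to stdout. The echo tool and its schema are hypothetical, and a real server also handles the initialize handshake and notifications, omitted here.

```python
import json
import sys

# Hypothetical tool exposed by this sketch; real servers register many.
TOOLS = [{
    "name": "echo",
    "description": "Echo the input text back",
    "inputSchema": {"type": "object", "properties": {"text": {"type": "string"}}},
}]

def handle(request: dict) -> dict:
    """Dispatch a single JSON-RPC 2.0 request to a result payload."""
    method = request.get("method")
    if method == "tools/list":
        return {"tools": TOOLS}
    if method == "tools/call":
        args = request.get("params", {}).get("arguments", {})
        return {"content": [{"type": "text", "text": args.get("text", "")}]}
    raise ValueError(f"unsupported method: {method}")

def main() -> None:
    # One JSON-RPC message per line on stdin; one response per line on stdout.
    for line in sys.stdin:
        line = line.strip()
        if not line:
            continue
        request = json.loads(line)
        try:
            response = {"jsonrpc": "2.0", "id": request.get("id"),
                        "result": handle(request)}
        except ValueError as exc:
            response = {"jsonrpc": "2.0", "id": request.get("id"),
                        "error": {"code": -32601, "message": str(exc)}}
        sys.stdout.write(json.dumps(response) + "\n")
        sys.stdout.flush()

if __name__ == "__main__":
    main()
```

The rigidity is the point: one transport, one envelope, a small set of methods. That is a large part of why the generated servers look so uniform.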

If you’re picking training exemplars from this corpus, MCP servers are the tightest, most uniform subset.

Top CLI tools

Command-line utilities also cluster at the top:

  • ethanarnold/screenase — Python, 85.3
  • astroicers/CyPulse — Python, 82.2
  • A few dozen more in the 70s

Like MCP servers, CLIs benefit from small scope plus a hard interface (argparse in, stdout out). Opus 4.7 handles argparse idiomatically and avoids the common pitfalls (missing help text, wrong exit codes).
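
To illustrate the shape, a minimal sketch; the pulsecheck command, its flags, and its behaviour are hypothetical, not drawn from screenase or CyPulse. Help text falls out of argparse for free, and bad input is reported on stderr with a non-zero exit code.

```python
import argparse
import sys

def build_parser() -> argparse.ArgumentParser:
    # argparse generates --help output from these descriptions automatically.
    parser = argparse.ArgumentParser(
        prog="pulsecheck",
        description="Report whether a host responds within a timeout.")
    parser.add_argument("host", help="hostname or IP address to probe")
    parser.add_argument("--timeout", type=float, default=2.0,
                        help="seconds to wait before giving up (default: 2.0)")
    return parser

def main(argv: list[str] | None = None) -> int:
    args = build_parser().parse_args(argv)
    if args.timeout <= 0:
        # Diagnostics go to stderr; stdout stays reserved for results.
        print("timeout must be positive", file=sys.stderr)
        return 2
    # Placeholder for the real probe; an actual tool would do I/O here.
    print(f"{args.host}: ok (timeout={args.timeout}s)")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

Returning an int from main and passing it to sys.exit keeps the exit-code behaviour easy to test.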

Top web apps

Web apps in the corpus are much more numerous (928 repos!) but score lower on average:

  • Quality concentrates in the 60–72 band
  • The top web app barely cracks 75
  • Many are scaffolds rather than finished products

This isn’t Opus 4.7’s fault so much as a property of the archetype: web apps are big and multi-layered, and the grader can always find something missing. A quality score of 70 on a large web app arguably reflects more capability than an 82 on a 500-line CLI.

Top monorepos (by size)

Monorepos are our largest artifacts:

  • elizaOS/eliza — 1.8M LOC, TypeScript, the well-known agent framework
  • BOLDPreciousMetals-Master/bold-ops-dashboard — 5.1M LOC, Python ops dashboard
  • EndUser123/why — 2.7M LOC, Python
  • Halildeu/platform-ssot — 2.5M LOC
  • brianonbased-dev/HoloScript — 1.8M LOC, TypeScript

These aren’t the highest quality by score; they’re the biggest. They’re useful as exemplars of Opus 4.7 operating at scale.

The practical uses of a golden set

If you’re building:

  • Training data: start with overall_top_50 as positive examples
  • Reference docs: the MCP servers above are clean illustrations of idiomatic MCP code
  • Demo projects: the CLIs are easy to show running
  • Stack comparisons: elizaOS/eliza is a canonical TS agent framework
  • Anti-pattern hunts: compare to the bottom 10 (not listed here yet, but coming)

Auto-refresh

The golden set is regenerated every 30 minutes. Check the live snapshot JSON (admin-only) for the current picks.


Scores come from the quality_latest materialized view in our Postgres instance. The view rolls up structure_score, code_quality_score, documentation_score, testing_score, practices_score, security_score, and dependency_score into a single overall number.
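
If you want the underlying numbers, here is a hedged sketch of reading that view with psycopg 3. The component column names come from the sentence above; the DSN, the repo_full_name column, and the overall_score column are assumptions about the schema.

```python
import psycopg  # psycopg 3

# Connection string and the repo_full_name / overall_score columns are
# assumptions about the schema; the component score columns are from the text.
DSN = "postgresql://localhost/opus_corpus"

QUERY = """
    SELECT repo_full_name,
           structure_score, code_quality_score, documentation_score,
           testing_score, practices_score, security_score, dependency_score,
           overall_score
      FROM quality_latest
     ORDER BY overall_score DESC
     LIMIT 10
"""

with psycopg.connect(DSN) as conn:
    with conn.cursor() as cur:
        cur.execute(QUERY)
        for row in cur.fetchall():
            repo, *components, overall = row
            print(f"{repo}: {overall:.1f} (components: {components})")
```

A materialized view only changes when it is explicitly refreshed, so the 30-minute regeneration mentioned above presumably pairs with a REFRESH MATERIALIZED VIEW quality_latest on the same schedule.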