Public scan: anyone with this URL can view this analysis.
119 of your 142 findings came from Repobility's proprietary detections. ✓ Repobility tags below mark them.

Scan timing: clone 2.68s · analysis 9.78s · 8.7 MB · GitHub API rate-limit (preflight)

skrub-data/skrub

https://github.com/skrub-data/skrub · scanned 2026-06-05 14:27 UTC (5 days, 5 hours ago) · 10 languages

387 raw signals (139 security + 248 graph) · 80th percentile · Python · medium (20-100K LoC) · System graph score 92 (lower by 13)


Complete repo analysis

Last scanned 5 days, 5 hours ago · v2 · 161 actionable findings from 2 signal sources. 102 repeated signals grouped for readability. Security checks, system graph analysis, and verified AI-agent feedback are merged into one review queue.

Score breakdown · 2026-05-18-v5
Component            Sub-score  Weight  Contribution
structure_score           60.0    0.15          9.00
security_score            95.2    0.25         23.80
testing_score             97.0    0.20         19.40
documentation_score       81.0    0.15         12.15
practices_score           70.0    0.15         10.50
code_quality              43.1    0.10          4.31
Overall                           1.00          79.2
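
The table implies that each contribution is the sub-score multiplied by its weight, and that the overall score is the sum of the contributions. The short Python sketch below recomputes the figures under that assumption; the function and variable names are illustrative and are not part of any Repobility API.

```python
# Sub-scores and weights copied from the score breakdown table above.
COMPONENTS = {
    "structure_score": (60.0, 0.15),
    "security_score": (95.2, 0.25),
    "testing_score": (97.0, 0.20),
    "documentation_score": (81.0, 0.15),
    "practices_score": (70.0, 0.15),
    "code_quality": (43.1, 0.10),
}


def weighted_overall(components):
    """Return (per-component contributions, overall) as sub_score * weight."""
    contributions = {
        name: score * weight for name, (score, weight) in components.items()
    }
    return contributions, sum(contributions.values())


contributions, overall = weighted_overall(COMPONENTS)
total_weight = sum(weight for _, weight in COMPONENTS.values())

print(f"weights sum to {total_weight:.2f}")  # weights sum to 1.00
print(f"overall = {overall:.2f}")            # overall = 79.16, shown as 79.2
```

With a weight of only 0.10, code_quality (43.1) costs far less than its sub-score suggests, while structure_score (60.0) and practices_score (70.0) at 0.15 each are where improvements move the overall number most.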

Bug-class explainers. Each card groups findings of the same shape — these are the patterns most likely to ship to prod and reappear in future scans unless you systematically fix the cause, not just the instance.

Duplicates & near-duplicates 5 findings
What it is: Same function copy-pasted into multiple modules with minor variations.
Why it matters: Each copy drifts independently — bug fixes apply to one, miss the others.
How AI causes it: AI completes the same pattern in each file rather than refactoring to a shared helper.
Fix approach: Extract the duplicated logic into the most general module both call sites already import. Add tests at the helper level.
5 matching findings on this repo
  • low Near-duplicate function bodies in 3 places repo-level
  • low Near-duplicate function bodies in 2 places repo-level
  • low Near-duplicate function bodies in 6 places
  • low Near-duplicate function bodies in 5 places
  • low Near-duplicate function bodies in 4 places
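
The fix approach for this card can be sketched in Python as follows. The helper and both call sites are hypothetical and are not taken from the skrub codebase; the report does not say which functions are duplicated, so this only illustrates the shape of the refactor.

```python
# Hypothetical shared helper: the name and behaviour are illustrative only.
def normalize_column_name(name: str) -> str:
    """Lower-case a column name and collapse whitespace runs to underscores."""
    return "_".join(name.strip().lower().split())


# Former copy 1 (hypothetical): now a thin caller of the shared helper.
def report_header(columns):
    return [normalize_column_name(c) for c in columns]


# Former copy 2 (hypothetical): same logic, no longer re-implemented inline.
def join_key_matches(left: str, right: str) -> bool:
    return normalize_column_name(left) == normalize_column_name(right)


print(report_header(["First  Name", " Age "]))
print(join_key_matches("User ID", "user  id"))
```

Tests then target `normalize_column_name` directly, so a bug fix in the helper reaches every call site at once.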
Commented-out code 36 findings
What it is: Lines of source that were intentionally disabled but never deleted.
Why it matters: Git already remembers history — commented code rots, becomes wrong, and adds noise to diffs.
How AI causes it: AI sometimes comments out broken code instead of fixing it. Reviewers approve out of inertia.
Fix approach: Delete. Trust `git log`. If you really need to remember, save it in a notes file under `docs/`.
12 matching findings on this repo
  • info Commented-code block (5 lines) in skrub/_utils.py:146
  • info Commented-code block (6 lines) in skrub/_similarity_encoder.py:385
  • info Commented-code block (6 lines) in skrub/_dispatch.py:199
  • info Commented-code block (10 lines) in skrub/_reporting/_utils.py:57
  • info Commented-code block (6 lines) in skrub/_reporting/_sample_table.py:387
  • info Commented-code block (9 lines) in skrub/_reporting/tests/test_utils.py:19
  • info Commented-code block (5 lines) in skrub/_reporting/_data/templates/report.js:662
  • info Commented-code block (5 lines) in skrub/tests/test_similarity_encoder.py:21
  • info Commented-code block (13 lines) in skrub/tests/test_docstrings.py:155
  • info Commented-code block (5 lines) in skrub/tests/test_to_datetime.py:169
  • info Commented-code block (7 lines) in skrub/tests/test_data_ops_stack_description.p…
  • info Commented-code block (6 lines) in skrub/tests/test_fuzzy_join.py:155
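
To locate blocks like these locally before deleting them, a rough heuristic is to look for runs of consecutive comment lines that still parse as Python once the `#` is removed. The sketch below is an assumption about how such a check could work, not Repobility's actual detection; the five-line threshold simply mirrors the smallest block size listed above.

```python
import ast
import textwrap


def _parses_as_code(lines):
    """True if the uncommented lines parse and look like real statements."""
    try:
        tree = ast.parse(textwrap.dedent("\n".join(lines)))
    except (SyntaxError, ValueError):
        return False
    # Ignore runs made only of bare names or constants (e.g. a lone "TODO").
    return any(
        not isinstance(node, ast.Expr) or isinstance(node.value, ast.Call)
        for node in tree.body
    )


def find_commented_code(source: str, min_lines: int = 5):
    """Return (start_line, length) for comment runs that parse as Python."""
    results = []
    run_start, run = None, []
    # The trailing empty string acts as a sentinel that flushes a final run.
    for lineno, line in enumerate(source.splitlines() + [""], start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            if run_start is None:
                run_start = lineno
            body = stripped[1:]
            # Drop at most one space after "#", keeping deeper indentation.
            run.append(body[1:] if body.startswith(" ") else body)
            continue
        if run_start is not None and len(run) >= min_lines and _parses_as_code(run):
            results.append((run_start, len(run)))
        run_start, run = None, []
    return results
```

Prose comments usually fail to parse and are skipped, but the heuristic can still misfire on comments that happen to be valid expressions, so treat hits as candidates for review rather than automatic deletions.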
Config drift 2 findings
What it is: Settings duplicated across env files, Docker compose, K8s, and code defaults, all with slightly different values.
Why it matters: Production behaviour depends on whichever copy your loader reads first, which leads to subtle bugs in staging that don't reproduce in dev.
How AI causes it: AI writes new config from memory rather than reading the existing source.
Fix approach: Pick one source of truth (env vars + a settings module). Have every other place import from there. Lint for duplicates in CI.
2 matching findings on this repo
  • low Possibly dead Python function: reset_skrub_config doc/conf.py:456
  • low File has no detected symbols: skrub/_reporting/js_tests/cypress.config.js
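
The "one source of truth" approach for this card can be sketched as a single settings module that reads environment variables once and is imported everywhere else. The class, field names, `APP_*` variable names, and defaults below are all hypothetical; nothing here is taken from skrub or from the two findings listed.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical application settings, defined in exactly one place."""
    database_url: str
    request_timeout_s: float
    debug: bool


def load_settings(env=None) -> Settings:
    """Build Settings from environment variables (names are illustrative)."""
    env = os.environ if env is None else env
    return Settings(
        database_url=env.get("APP_DATABASE_URL", "sqlite:///local.db"),
        request_timeout_s=float(env.get("APP_REQUEST_TIMEOUT_S", "10")),
        debug=env.get("APP_DEBUG", "0").lower() in {"1", "true", "yes"},
    )


# Other modules import this object instead of re-declaring defaults.
settings = load_settings()
```

Because defaults live only in `load_settings`, a Docker compose file or Kubernetes manifest only needs to set the variables it overrides, and a CI lint can flag any other module that reads `APP_*` directly.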
API access

This page is publicly accessible at: https://repobility.com/scan/f8fbe2ac-1fee-44fb-921a-41af7da12550/

To check status programmatically (no auth required):

curl -s https://repobility.com/api/v1/public/scan/f8fbe2ac-1fee-44fb-921a-41af7da12550/
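
The same status check can be made from Python using only the standard library. This is a minimal sketch: the URL pattern is copied from the curl command above, but the assumption that the endpoint returns a JSON object is mine, since the response format is not documented on this page.

```python
import json
import urllib.request

# Base path copied from the curl example above.
API_BASE = "https://repobility.com/api/v1/public/scan/"


def status_url(scan_token: str) -> str:
    """Build the public status URL for a given scan token."""
    return f"{API_BASE}{scan_token}/"


def fetch_status(scan_token: str, timeout: float = 10.0) -> dict:
    """Fetch scan status; assumes (unverified) that the body is JSON."""
    with urllib.request.urlopen(status_url(scan_token), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# status = fetch_status("f8fbe2ac-1fee-44fb-921a-41af7da12550")
```

If you poll this from a script, keep the interval generous: the scan timing line on this page already mentions a GitHub API rate limit.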

Important — please don't re-submit the same URL repeatedly. The submission endpoint is idempotent: re-submitting the same git URL returns this same scan_token, not a new one. To re-scan this repo, sign up free and use the dashboard.