File as GitHub Issue

Repo: scikit-learn/scikit-learn

Push this scan report to scikit-learn/scikit-learn

Click the green button below to open GitHub’s new-issue form, pre-filled with the report title, summary table, top findings, and an embedded score-card image. No authentication needed — you review on GitHub before submitting. Repobility is credited as the scanner.

Embedded score card image

This image will render at the top of the issue body. Hosted on Repobility, refreshes automatically after re-scans.

Repobility score card

Issue title

Unsafe Deserialization

Curate findings to include

Pick exactly which findings appear in the issue body. By default the top 5 are included. Uncheck noise, check what matters.

Top 5 (default)
| Severity | Rule | Title | File:line |
|---|---|---|---|
| MED | SEC012 | [SEC012] ZipSlip — Archive Path Traversal: Archive extraction without path validation all… | examples/applications/plot_out_of_core_…:175 |
| MED | SEC012 | [SEC012] ZipSlip — Archive Path Traversal: Archive extraction without path validation all… | sklearn/utils/fixes.py:348 |
| MED | SEC007 | [SEC007] Unsafe Deserialization: Unsafe deserialization can execute arbitrary code. | sklearn/utils/estimator_checks.py:2713 |
| MED | SEC007 | [SEC007] Unsafe Deserialization: Unsafe deserialization can execute arbitrary code. | asv_benchmarks/benchmarks/common.py:176 |
| MED | SEC007 | [SEC007] Unsafe Deserialization: Unsafe deserialization can execute arbitrary code. | benchmarks/bench_plot_randomized_svd.py:137 |
| MED | AIC003 | Duplicated implementation block across source files | sklearn/externals/array_api_compat/nump…:56 |
| MED | AIC003 | Duplicated implementation block across source files | sklearn/externals/array_api_compat/nump…:52 |
| MED | AIC003 | Duplicated implementation block across source files | sklearn/externals/array_api_compat/dask…:48 |
| MED | AIC003 | Duplicated implementation block across source files | sklearn/externals/array_api_compat/dask…:106 |
| MED | AIC003 | Duplicated implementation block across source files | sklearn/ensemble/_voting.py:279 |
| MED | AIC003 | Duplicated implementation block across source files | sklearn/ensemble/_stacking.py:501 |
| MED | AIC003 | Duplicated implementation block across source files | sklearn/decomposition/_fastica.py:567 |
| MED | AIC003 | Duplicated implementation block across source files | sklearn/datasets/_species_distributions…:111 |
| MED | AIC003 | Duplicated implementation block across source files | sklearn/datasets/_rcv1.py:111 |
| MED | AIC003 | Duplicated implementation block across source files | sklearn/covariance/_shrunk_covariance.py:121 |
| MED | AIC003 | Duplicated implementation block across source files | sklearn/covariance/_shrunk_covariance.py:120 |
| MED | AIC003 | Duplicated implementation block across source files | sklearn/covariance/_robust_covariance.py:541 |
| MED | AIC004 | Suspicious implementation file appears unreferenced | maint_tools/sort_whats_new.py:1 |
| LOW | AIC002 | Source file name looks like an AI patch artifact | maint_tools/sort_whats_new.py:1 |
| LOW | CORE_NO_LICENSE | No LICENSE file | |
20 findings available (after auto-suppression of test files and won't-fix findings)

Issue body (markdown)

## Code-quality scan: `scikit-learn/scikit-learn`

**Score: 74/100 (B+)**  ·  26 findings  ·  scanned 2026-05-15 09:54 UTC  ·  445,841 LOC

| Severity | Count |
|---|---|
| CRITICAL | 0 |
| HIGH | 0 |
| MEDIUM | 18 |
| LOW | 2 |

📊 [Full filterable report](https://repobility.com/scan/a5f73a3d-9c26-4983-8ec3-040adfc69698/)  ·  ![scorecard](https://repobility.com/scan/a5f73a3d-9c26-4983-8ec3-040adfc69698/report.png?v=1778838846-s2)

### Top findings

1. **MEDIUM** `SEC012` — ZipSlip — Archive Path Traversal
   `examples/applications/plot_out_of_core_classification.py:175` · A01:2021 Broken Access Control (path traversal)
2. **MEDIUM** `SEC012` — ZipSlip — Archive Path Traversal
   `sklearn/utils/fixes.py:348` · A01:2021 Broken Access Control (path traversal)
3. **MEDIUM** `SEC007` — Unsafe Deserialization
   `sklearn/utils/estimator_checks.py:2713` · A08:2021 Software & Data Integrity Failures
4. **MEDIUM** `SEC007` — Unsafe Deserialization
   `asv_benchmarks/benchmarks/common.py:176` · A08:2021 Software & Data Integrity Failures
5. **MEDIUM** `SEC007` — Unsafe Deserialization
   `benchmarks/bench_plot_randomized_svd.py:137` · A08:2021 Software & Data Integrity Failures
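
For readers unfamiliar with the `SEC012` rule, the following is a minimal sketch of the kind of path validation it refers to. It is not taken from scikit-learn and says nothing about whether the flagged lines are actually exploitable; the function name `safe_extract_tar` is hypothetical, and the check covers member names only (it does not inspect symlink or hardlink targets).

```python
import os
import tarfile


def safe_extract_tar(archive_path, dest_dir):
    """Extract a tar archive, refusing members whose path escapes dest_dir.

    Hypothetical helper for illustration only.
    """
    dest_root = os.path.realpath(dest_dir)
    with tarfile.open(archive_path) as tar:
        for member in tar.getmembers():
            # Resolve where this member would land on disk.
            target = os.path.realpath(os.path.join(dest_root, member.name))
            # Reject absolute paths and "../" sequences that leave dest_root.
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"unsafe path in archive: {member.name!r}")
        # Every member name was validated above, so extraction stays in dest_root.
        tar.extractall(dest_root)
```

On Python 3.12 and later, passing `filter="data"` to `tarfile.TarFile.extractall` applies comparable checks (including link targets) inside the standard library, which is usually preferable to a hand-written loop.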

---

_Filed automatically. Close this issue if not useful — we won't refile. Full report: https://repobility.com/scan/a5f73a3d-9c26-4983-8ec3-040adfc69698/_
Already filed
'scikit-learn' is on the known-megaproject org list. These projects use auto-triage bots and established security disclosure channels. Unsolicited automated issues from Repobility would be perceived as AI-generated spam. For security findings, follow the project's SECURITY.md policy. For non-security findings, open a focused PR or community discussion instead.
Megaproject: high spam risk
Could not determine 'scikit-learn/scikit-learn' star count (GitHub API rate-limited or unreachable). When in doubt about repo size, prefer opening a focused PR or a discussion rather than an issue.

The button opens GitHub’s new-issue page in a new tab. You will see the title + body pre-filled: review, edit if you want, then click GitHub’s "Submit new issue" button. Repobility never posts anything on your behalf.

For real security findings on big repos: use the project's SECURITY.md or private advisory flow instead of a public issue.