File as GitHub Issue repo: skrub-data/skrub

Push this scan report to skrub-data/skrub

Click the green button below to open GitHub’s new-issue form, pre-filled with the report title, summary table, top findings, and an embedded score-card image. No authentication needed — you review on GitHub before submitting. Repobility is credited as the scanner.

Embedded score card image

This image will render at the top of the issue body. Hosted on Repobility, refreshes automatically after re-scans.


Issue title

Missing import: `queue` used but not imported

Curate findings to include

Pick exactly which findings appear in the issue body. By default the top 5 are included. Uncheck noise, check what matters.

Top 5 (default)
Severity Rule Title File:line
CRIT MINED030 [MINED030] Python Pickle Loads: pickle.loads() can execute arbitrary code via __reduce__. doc/tutorials/1110_data_ops_intro.py:173
CRIT MINED018 [MINED018] Unsafe Deserialization Pickle: pickle.loads / yaml.load (without Loader=SafeLo… doc/tutorials/1110_data_ops_intro.py:173
CRIT SEC081 [SEC081] Python: pickle.loads / marshal.loads on untrusted data: pickle.load(s) and marsh… doc/tutorials/1110_data_ops_intro.py:173
CRIT MINED107 Missing import: `html` used but not imported skrub/_reporting/_table_report.py:444
CRIT MINED107 Missing import: `queue` used but not imported skrub/_reporting/_serve.py:66
CRIT MINED107 Missing import: `string` used but not imported skrub/_string_distances.py:78
CRIT MINED107 Missing import: `string` used but not imported skrub/_fast_hash.py:81
CRIT MINED116 Workflow uses `secrets.CODECOV_TOKEN` on a `pull_request` trigger .github/workflows/testing.yml:42
HIGH SEC128 [SEC128] Async function without await — fire-and-forget Promise (AI mistake): Async call … skrub/_join_utils.py:191
HIGH SEC128 [SEC128] Async function without await — fire-and-forget Promise (AI mistake): Async call … skrub/_dispatch.py:238
HIGH MINED012 [MINED012] Curl Pipe Bash: curl ... | sh / bash — runs unverified network code. build_tools/circle/build_doc.sh:101
HIGH MINED108 `self.all_outputs_` used but never assigned in __init__ skrub/_apply_to_sub_frame.py:220
HIGH MINED108 `self._transformed_output_names` used but never assigned in __init__ skrub/_apply_to_sub_frame.py:219
HIGH MINED108 `self.created_outputs_` used but never assigned in __init__ skrub/_apply_to_sub_frame.py:219
HIGH MINED108 `self._columns` used but never assigned in __init__ skrub/_apply_to_sub_frame.py:218
HIGH MINED108 `self.used_inputs_` used but never assigned in __init__ skrub/_apply_to_sub_frame.py:218
HIGH MINED108 `self._columns` used but never assigned in __init__ skrub/_apply_to_sub_frame.py:196
HIGH MINED108 `self._columns` used but never assigned in __init__ skrub/_apply_to_sub_frame.py:189
HIGH MINED108 `self.all_inputs_` used but never assigned in __init__ skrub/_apply_to_sub_frame.py:188
HIGH MINED108 `self.fit_transform` used but never assigned in __init__ skrub/_apply_to_sub_frame.py:165
HIGH MINED108 `self.feature_names_out_` used but never assigned in __init__ skrub/_check_input.py:172
HIGH MINED108 `self.module_name_` used but never assigned in __init__ skrub/_check_input.py:135
HIGH MINED108 `self.feature_names_in_` used but never assigned in __init__ skrub/_check_input.py:144
HIGH MINED108 `self.feature_names_out_` used but never assigned in __init__ skrub/_check_input.py:151
HIGH MINED108 `self.feature_names_out_` used but never assigned in __init__ skrub/_check_input.py:150
HIGH MINED108 `self.feature_names_in_` used but never assigned in __init__ skrub/_check_input.py:140
HIGH MINED108 `self.module_name_` used but never assigned in __init__ skrub/_check_input.py:133
HIGH MINED108 `self._handle_array` used but never assigned in __init__ skrub/_check_input.py:130
HIGH MINED108 `self.feature_names_out_` used but never assigned in __init__ skrub/_check_input.py:124
HIGH MINED108 `self.feature_names_out_` used but never assigned in __init__ skrub/_check_input.py:123
HIGH MINED108 `self._handle_array` used but never assigned in __init__ skrub/_check_input.py:114
HIGH MINED108 `self.feature_names_out_` used but never assigned in __init__ skrub/_check_input.py:122
HIGH MINED108 `self.n_features_in_` used but never assigned in __init__ skrub/_check_input.py:121
HIGH MINED108 `self.feature_names_in_` used but never assigned in __init__ skrub/_check_input.py:120
HIGH MINED108 `self.module_name_` used but never assigned in __init__ skrub/_check_input.py:116
HIGH MINED108 `self.fit_transform` used but never assigned in __init__ skrub/_check_input.py:109
HIGH MINED115 Action `prefix-dev/setup-pixi` pinned to mutable ref `@v0.9.6` .github/workflows/check_stub_files_diff…:18
HIGH MINED115 Action `actions/checkout` pinned to mutable ref `@v6` .github/workflows/check_stub_files_diff…:17
HIGH MINED115 Action `peter-evans/create-pull-request` pinned to mutable ref `@v8` .github/workflows/update_pixi_lock_file…:38
HIGH MINED115 Action `prefix-dev/setup-pixi` pinned to mutable ref `@v0.9.6` .github/workflows/update_pixi_lock_file…:25
HIGH MINED115 Action `actions/checkout` pinned to mutable ref `@v6` .github/workflows/update_pixi_lock_file…:24
HIGH MINED115 Action `prefix-dev/setup-pixi` pinned to mutable ref `@v0.9.6` .github/workflows/run-code-format-check…:18
HIGH MINED115 Action `actions/checkout` pinned to mutable ref `@v6` .github/workflows/run-code-format-check…:17
HIGH MINED115 Action `larsoner/circleci-artifacts-redirector-action` pinned to mutable ref `@master` .github/workflows/main.yml:18
HIGH MINED115 Action `cypress-io/github-action` pinned to mutable ref `@v7` .github/workflows/test-javascript.yml:29
HIGH MINED115 Action `prefix-dev/setup-pixi` pinned to mutable ref `@v0.9.6` .github/workflows/test-javascript.yml:16
HIGH MINED115 Action `actions/checkout` pinned to mutable ref `@v6` .github/workflows/test-javascript.yml:15
HIGH MINED115 Action `actions/checkout` pinned to mutable ref `@v6` .github/workflows/changelog.yml:21
HIGH MINED115 Action `actions/first-interaction` pinned to mutable ref `@v3` .github/workflows/welcome_action.yaml:20
HIGH MINED115 Action `prefix-dev/setup-pixi` pinned to mutable ref `@v0.9.6` .github/workflows/testing.yml:64
HIGH MINED115 Action `actions/checkout` pinned to mutable ref `@v6` .github/workflows/testing.yml:63
HIGH MINED115 Action `actions/checkout` pinned to mutable ref `@v6` .github/workflows/testing.yml:50
HIGH MINED115 Action `codecov/codecov-action` pinned to mutable ref `@v6.0.1` .github/workflows/testing.yml:40
HIGH MINED115 Action `prefix-dev/setup-pixi` pinned to mutable ref `@v0.9.6` .github/workflows/testing.yml:28
HIGH MINED115 Action `actions/checkout` pinned to mutable ref `@v6` .github/workflows/testing.yml:27
HIGH MINED131 pre-commit hook `https://github.com/astral-sh/ruff-pre-commit` pinned to mutable rev `v0.… .pre-commit-config.yaml:15
HIGH MINED131 pre-commit hook `https://github.com/pre-commit/pygrep-hooks` pinned to mutable rev `v1.10… .pre-commit-config.yaml:8
HIGH MINED131 pre-commit hook `https://github.com/pre-commit/pre-commit-hooks` pinned to mutable rev `v… .pre-commit-config.yaml:2
HIGH GHSA-3xgq-45jj-v275 cross-spawn: GHSA-3xgq-45jj-v275 skrub/_reporting/js_tests/package-lock.…
MED SEC007 [SEC007] Unsafe Deserialization: Unsafe deserialization can execute arbitrary code. skrub/_data_ops/_utils.py:56
MED SEC007 [SEC007] Unsafe Deserialization: Unsafe deserialization can execute arbitrary code. doc/tutorials/1110_data_ops_intro.py:173
MED MINED111 Bare except continues silently doc/sphinxext/github_link.py:59
MED MINED111 Bare except continues silently doc/sphinxext/github_link.py:67
MED MINED111 Bare except continues silently doc/sphinxext/github_link.py:54
MED MINED111 Bare except continues silently skrub/datasets/_utils.py:351
MED MINED111 Bare except continues silently skrub/_data_ops/_data_ops.py:250
MED MINED111 Bare except continues silently skrub/_data_ops/_data_ops.py:2017
MED MINED111 Bare except continues silently skrub/_data_ops/_data_ops.py:745
MED MINED111 Bare except continues silently skrub/_data_ops/_data_ops.py:410
MED MINED111 Bare except continues silently skrub/_data_ops/_data_ops.py:1386
MED MINED111 Bare except continues silently skrub/_data_ops/_inspection.py:70
MED MINED111 Bare except continues silently skrub/_data_ops/_inspection.py:156
MED MINED111 Bare except continues silently skrub/_data_ops/_utils.py:85
MED MINED111 Bare except continues silently skrub/_data_ops/_utils.py:161
MED MINED111 Bare except continues silently skrub/_data_ops/_optuna.py:272
MED MINED111 Bare except continues silently skrub/selectors/_selectors.py:573
MED MINED111 Bare except continues silently skrub/_interpolation_joiner.py:442
MED MINED111 Bare except continues silently skrub/_interpolation_joiner.py:426
MED MINED111 Bare except continues silently skrub/_single_column_transformer.py:345
MED MINED109 Mutable default argument in `repr_args` (dict) skrub/_utils.py:194
MED MINED124 requirements.txt: `pydata-sphinx-theme` has no version pin .binder/requirements.txt:7
MED MINED124 requirements.txt: `statsmodels` has no version pin .binder/requirements.txt:6
MED MINED124 requirements.txt: `seaborn` has no version pin .binder/requirements.txt:5
MED MINED124 requirements.txt: `matplotlib` has no version pin .binder/requirements.txt:4
MED MINED124 requirements.txt: `sphinxext-opengraph` has no version pin .binder/requirements.txt:3
MED MINED124 requirements.txt: `dirty-cat` has no version pin .binder/requirements.txt:2
MED MINED124 requirements.txt: `sphinx-gallery` has no version pin .binder/requirements.txt:1
MED GHSA-q8mj-m7cp-5q26 qs: GHSA-q8mj-m7cp-5q26 skrub/_reporting/js_tests/package-lock.…
MED AGT015 Remote install command pipes network code directly to a shell build_tools/circle/build_doc.sh:101
LOW COMP001 [COMP001] High cognitive complexity: Function `make_node` has cognitive complexity 11 (So… doc/sphinxext/sphinx_issues.py:91
LOW COMP001 [COMP001] High cognitive complexity: Function `_linkcode_resolve` has cognitive complexit… doc/sphinxext/github_link.py:24
LOW COMP001 [COMP001] High cognitive complexity: Function `extract_code` has cognitive complexity 9 (… doc/generate_data_ops_example_for_index…:86
LOW DEPCUR-NPM npm package `cypress` is minor version(s) behind (15.15.0 -> 15.16.0) skrub/_reporting/js_tests/package.json
LOW AIC003 Duplicated implementation block across source files skrub/selectors/_selectors.py:30
LOW AIC003 Duplicated implementation block across source files skrub/_text_encoder.py:329
LOW AIC003 Duplicated implementation block across source files skrub/_string_encoder.py:225
LOW AIC003 Duplicated implementation block across source files skrub/_string_encoder.py:223
LOW AIC003 Duplicated implementation block across source files skrub/_single_column_transformer.py:37
LOW AIC003 Duplicated implementation block across source files skrub/_minhash_encoder.py:254
LOW AIC003 Duplicated implementation block across source files skrub/_datetime_encoder.py:404
LOW AIC003 Duplicated implementation block across source files skrub/_apply_to_sub_frame.py:137
LOW AIC003 Duplicated implementation block across source files skrub/_apply_to_each_col.py:55
INFO MINED044 [MINED044] Js Console Log Prod: console.log left in code. Should be replaced with logger … skrub/_reporting/_data/templates/data_o…:32
INFO MINED064 [MINED064] Python Input Call: input() blocks for stdin. Inappropriate in services. skrub/_check_input.py:101
INFO MINED055 [MINED055] Npm Install No Lockfile: Production image runs npm install (resolves new versi… examples/data_ops/1131_optuna_choices.py:22
INFO MINED050 [MINED050] Stub Only Function: Function declared but body is just pass, return None, rais… skrub/_clean_categories.py:150
INFO MINED050 [MINED050] Stub Only Function: Function declared but body is just pass, return None, rais… skrub/_check_input.py:50
INFO MINED050 [MINED050] Stub Only Function: Function declared but body is just pass, return None, rais… doc/sphinxext/autoshortsummary.py:29
INFO MINED043 [MINED043] Http Not Https: Hardcoded http:// (not localhost) for endpoints that handle cr… doc/sphinxext/github_link.py:31
INFO MINED043 [MINED043] Http Not Https: Hardcoded http:// (not localhost) for endpoints that handle cr… doc/demo_periodic_features.py:14
110 findings available (after auto-suppression of test files + won't-fix)
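For readers unfamiliar with the MINED109 rule ("Mutable default argument in `repr_args` (dict)", `skrub/_utils.py:194`), the sketch below shows the general pitfall the rule targets. The flagged skrub code is not reproduced in this report, so `record_bad` and `record_good` are hypothetical names used only for illustration. Whether the flagged function ever mutates its default is not something this scan establishes.

```python
def record_bad(item, seen={}):
    # The default dict is created once, when the function is defined,
    # so every call that omits `seen` shares the same object.
    seen[item] = True
    return seen


def record_good(item, seen=None):
    # Using None as a sentinel gives each call its own fresh dict.
    if seen is None:
        seen = {}
    seen[item] = True
    return seen


first = record_bad("a")
second = record_bad("b")    # same dict object as `first`, now holding both keys

good_first = record_good("a")
good_second = record_good("b")  # independent dict holding only "b"
```

If the default is only read and never mutated, the finding is usually harmless and can be curated out of the issue.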

Issue body (markdown)

## Code-quality scan: `skrub-data/skrub`

**Score: 85/100 (B+)**  ·  139 findings  ·  scanned 2026-06-05 14:27 UTC  ·  58,205 LOC

| Severity | Count |
|---|---|
| CRITICAL | 8 |
| HIGH | 51 |
| MEDIUM | 30 |
| LOW | 13 |

📊 [Full filterable report](https://repobility.com/scan/f8fbe2ac-1fee-44fb-921a-41af7da12550/)  ·  ![scorecard](https://repobility.com/scan/f8fbe2ac-1fee-44fb-921a-41af7da12550/report.png?v=1780669620-s2)

### Top findings

1. **CRITICAL** `MINED030` — Python Pickle Loads
   `doc/tutorials/1110_data_ops_intro.py:173` · CWE-502 · ✓ Repobility
2. **CRITICAL** `MINED018` — Unsafe Deserialization Pickle
   `doc/tutorials/1110_data_ops_intro.py:173` · CWE-502 · ✓ Repobility
3. **CRITICAL** `SEC081` — Python: pickle.loads / marshal.loads on untrusted data
   `doc/tutorials/1110_data_ops_intro.py:173` · A05:2021 Security Misconfiguration
4. **CRITICAL** `MINED107` — Missing import: `html` used but not imported
   `skrub/_reporting/_table_report.py:444` · ✓ Repobility
5. **CRITICAL** `MINED107` — Missing import: `queue` used but not imported
   `skrub/_reporting/_serve.py:66` · ✓ Repobility
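
As background for the three pickle findings above (CWE-502), here is a minimal sketch of why `pickle.loads` is flagged: an object's `__reduce__` method can name any importable callable, and `pickle` invokes it during loading. The `Payload` class is hypothetical and deliberately harmless. Whether the data loaded at `doc/tutorials/1110_data_ops_intro.py:173` is actually untrusted is not established by the scan.

```python
import pickle


class Payload:
    """Hypothetical class whose __reduce__ controls what runs on load."""

    def __reduce__(self):
        # pickle stores "call sorted([3, 1, 2])" instead of the object itself.
        # A hostile payload could name a dangerous callable here instead.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Payload())

# Loading runs the callable chosen by whoever produced the bytes;
# the result is the callable's return value, not a Payload instance.
result = pickle.loads(blob)
```

When the pickled bytes are produced and consumed by the same trusted process, as is common in tutorials, this pattern is usually a false positive.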

---

**Security note**: this issue is public. If any flagged finding is a real, exploitable vulnerability, please redirect to your `SECURITY.md` policy or open a [private security advisory](https://docs.github.com/en/code-security/security-advisories/guidance-on-reporting-and-writing-information-about-vulnerabilities/privately-reporting-a-security-vulnerability) instead. We're happy to close this and re-submit privately.

---

_Filed automatically. Close this issue if not useful — we won't refile. Full report: https://repobility.com/scan/f8fbe2ac-1fee-44fb-921a-41af7da12550/_
Megaproject: high spam risk
Could not determine 'skrub-data/skrub' star count (GitHub API rate-limited or unreachable). When in doubt about repo size, prefer opening a focused PR or a discussion rather than an issue.
Already filed
70/142 findings (49%) on this scan are already flagged as test-file, won't-fix, or suppressed. The scan is too noisy to file as a single issue. Curate down to specific actionable findings, or address the source of the false positives first.

The button opens GitHub’s new-issue page in a new tab. You will see the title + body pre-filled; review, edit if you want, then click GitHub’s "Submit new issue" button. Repobility never posts anything on your behalf.

For real security findings on big repos: use the project's SECURITY.md or private advisory flow instead of a public issue.