File as GitHub Issue
Repo: D4Vinci/Scrapling

Push this scan report to D4Vinci/Scrapling

Click the green button below to open GitHub’s new-issue form, pre-filled with the report title, summary table, top findings, and an embedded score-card image. No authentication needed — you review on GitHub before submitting. Repobility is credited as the scanner.

Embedded score card image

This image will render at the top of the issue body. Hosted on Repobility, refreshes automatically after re-scans.

Issue title

Missing import: `string` used but not imported

Curate findings to include

Pick exactly which findings appear in the issue body. By default the top 5 are included. Uncheck noise, check what matters.

Top 5 (default)
Severity Rule Title File:line
CRIT MINED030 [MINED030] Python Pickle Loads: pickle.loads() can execute arbitrary code via __reduce__. scrapling/spiders/checkpoint.py:74
CRIT MINED018 [MINED018] Unsafe Deserialization Pickle: pickle.loads / yaml.load (without Loader=SafeLo… scrapling/spiders/checkpoint.py:74
CRIT SEC081 [SEC081] Python: pickle.loads / marshal.loads on untrusted data: pickle.load(s) and marsh… scrapling/spiders/checkpoint.py:74
CRIT MINED013 [MINED013] Password In Url: https://user:password@host — leaks creds via logs, referrer, … scrapling/engines/toolbelt/proxy_rotati…:60
CRIT MINED107 Missing import: `string` used but not imported scrapling/core/utils/_utils.py:119
CRIT MINED116 Workflow uses `secrets.CONTAINER_TOKEN` on a `pull_request` trigger .github/workflows/docker-build.yml:47
CRIT MINED116 Workflow uses `secrets.DOCKER_PASSWORD` on a `pull_request` trigger .github/workflows/docker-build.yml:40
CRIT MINED116 Workflow uses `secrets.DOCKER_USERNAME` on a `pull_request` trigger .github/workflows/docker-build.yml:39
CRIT MINED125 GHA script injection via github.event.pull_request.title in run-step .github/workflows/release-and-publish.y…:27
CRIT GHSA-3f63-hfp8-52jq pillow: GHSA-3f63-hfp8-52jq docs/requirements.txt
HIGH MINED004 [MINED004] Weak Crypto: MD5/SHA1/DES/RC4 used for security context (not just checksums). scrapling/spiders/request.py:122
HIGH SEC128 [SEC128] Async function without await — fire-and-forget Promise (AI mistake): Async call … scrapling/spiders/request.py:163
HIGH SEC128 [SEC128] Async function without await — fire-and-forget Promise (AI mistake): Async call … scrapling/engines/_browsers/_validators…:209
HIGH SEC029 [SEC029] Server-Side Request Forgery (SSRF) — outbound HTTP from user input: Outbound HTT… scrapling/spiders/links.py:249
HIGH SEC029 [SEC029] Server-Side Request Forgery (SSRF) — outbound HTTP from user input: Outbound HTT… scrapling/engines/_browsers/_validators…:42
HIGH SEC029 [SEC029] Server-Side Request Forgery (SSRF) — outbound HTTP from user input: Outbound HTT… scrapling/core/storage.py:24
HIGH SEC078 [SEC078] Python: requests without timeout: requests.get/post without a timeout will hang … benchmarks.py:138
HIGH MINED108 `self._is_text_node` used but never assigned in __init__ scrapling/parser.py:443
HIGH MINED108 `self.iterancestors` used but never assigned in __init__ scrapling/parser.py:437
HIGH MINED108 `self.iterancestors` used but never assigned in __init__ scrapling/parser.py:429
HIGH MINED108 `self.__element_convertor` used but never assigned in __init__ scrapling/parser.py:422
HIGH MINED108 `self._is_text_node` used but never assigned in __init__ scrapling/parser.py:419
HIGH MINED108 `self.parent` used but never assigned in __init__ scrapling/parser.py:414
HIGH MINED108 `self.parent` used but never assigned in __init__ scrapling/parser.py:413
HIGH MINED108 `self.__element_convertor` used but never assigned in __init__ scrapling/parser.py:405
HIGH MINED108 `self._is_text_node` used but never assigned in __init__ scrapling/parser.py:402
HIGH MINED108 `self.__elements_convertor` used but never assigned in __init__ scrapling/parser.py:397
HIGH MINED108 `self._is_text_node` used but never assigned in __init__ scrapling/parser.py:394
HIGH MINED108 `self.__element_convertor` used but never assigned in __init__ scrapling/parser.py:389
HIGH MINED108 `self._is_text_node` used but never assigned in __init__ scrapling/parser.py:381
HIGH MINED108 `self._is_text_node` used but never assigned in __init__ scrapling/parser.py:363
HIGH MINED108 `self._is_text_node` used but never assigned in __init__ scrapling/parser.py:357
HIGH MINED108 `self._is_text_node` used but never assigned in __init__ scrapling/parser.py:347
HIGH MINED108 `self._is_text_node` used but never assigned in __init__ scrapling/parser.py:338
HIGH MINED108 `self._is_text_node` used but never assigned in __init__ scrapling/parser.py:298
HIGH MINED108 `self._is_text_node` used but never assigned in __init__ scrapling/parser.py:271
HIGH MINED108 `self._is_text_node` used but never assigned in __init__ scrapling/parser.py:262
HIGH MINED108 `self.__elements_convertor` used but never assigned in __init__ scrapling/parser.py:248
HIGH MINED108 `self.attrib` used but never assigned in __init__ scrapling/parser.py:191
HIGH MINED108 `self._is_text_node` used but never assigned in __init__ scrapling/parser.py:189
HIGH MINED108 `self.attrib` used but never assigned in __init__ scrapling/parser.py:186
HIGH MINED108 `self._is_text_node` used but never assigned in __init__ scrapling/parser.py:184
HIGH MINED106 Phantom test coverage: test_autoscraper benchmarks.py:116
HIGH MINED106 Phantom test coverage: test_scrapling_text benchmarks.py:111
HIGH MINED106 Phantom test coverage: test_selectolax benchmarks.py:94
HIGH MINED106 Phantom test coverage: test_mechanicalsoup benchmarks.py:87
HIGH MINED106 Phantom test coverage: test_parsel benchmarks.py:82
HIGH MINED106 Phantom test coverage: test_scrapling benchmarks.py:74
HIGH MINED106 Phantom test coverage: test_pyquery benchmarks.py:69
HIGH MINED106 Phantom test coverage: test_bs4_html5lib benchmarks.py:64
HIGH MINED106 Phantom test coverage: test_bs4_lxml benchmarks.py:59
HIGH MINED106 Phantom test coverage: test_lxml benchmarks.py:47
HIGH COMP001 [COMP001] High cognitive complexity: Function `_general_selection` has cognitive complexi… scrapling/core/mixins.py:15
HIGH MINED115 Action `actions/checkout` pinned to mutable ref `@v6` .github/workflows/docker-build.yml:28
HIGH MINED115 Action `actions/upload-artifact` pinned to mutable ref `@v6` .github/workflows/code-quality.yml:186
HIGH MINED115 Action `actions/setup-python` pinned to mutable ref `@v6` .github/workflows/code-quality.yml:51
HIGH MINED115 Action `actions/checkout` pinned to mutable ref `@v6` .github/workflows/code-quality.yml:46
HIGH MINED115 Action `actions/cache` pinned to mutable ref `@v5` .github/workflows/tests.yml:110
HIGH MINED115 Action `actions/cache` pinned to mutable ref `@v5` .github/workflows/tests.yml:87
HIGH MINED115 Action `actions/setup-python` pinned to mutable ref `@v6` .github/workflows/tests.yml:65
HIGH MINED115 Action `actions/checkout` pinned to mutable ref `@v6` .github/workflows/tests.yml:62
HIGH MINED115 Action `pypa/gh-action-pypi-publish` pinned to mutable ref `@release/v1` .github/workflows/release-and-publish.y…:74
HIGH MINED115 Action `actions/setup-python` pinned to mutable ref `@v6` .github/workflows/release-and-publish.y…:60
HIGH MINED115 Action `softprops/action-gh-release` pinned to mutable ref `@v2` .github/workflows/release-and-publish.y…:49
HIGH MINED115 Action `actions/github-script` pinned to mutable ref `@v8` .github/workflows/release-and-publish.y…:30
HIGH MINED115 Action `actions/checkout` pinned to mutable ref `@v6` .github/workflows/release-and-publish.y…:21
HIGH MINED131 pre-commit hook `https://github.com/netromdk/vermin` pinned to mutable rev `v1.7.0` .pre-commit-config.yaml:16
HIGH MINED131 pre-commit hook `https://github.com/PyCQA/bandit` pinned to mutable rev `1.9.0` .pre-commit-config.yaml:2
HIGH MINED118 Dockerfile FROM `python:3.12-slim-trixie` not pinned by digest Dockerfile:1
HIGH PYSEC-2023-117 pygments: PYSEC-2023-117 tests/requirements.txt
HIGH GHSA-2g68-c3qc-8985 werkzeug: GHSA-2g68-c3qc-8985 tests/requirements.txt
HIGH GHSA-44wm-f244-xhp3 pillow: GHSA-44wm-f244-xhp3 docs/requirements.txt
HIGH PYSEC-2026-165 pillow: PYSEC-2026-165 docs/requirements.txt
HIGH PYSEC-2023-227 pillow: PYSEC-2023-227 docs/requirements.txt
HIGH PYSEC-2023-175 pillow: PYSEC-2023-175 docs/requirements.txt
MED SEC007 [SEC007] Unsafe Deserialization: Unsafe deserialization can execute arbitrary code. scrapling/spiders/checkpoint.py:74
MED MINED111 Bare except continues silently scrapling/spiders/engine.py:210
MED MINED111 Bare except continues silently scrapling/core/ai.py:307
MED MINED111 Bare except continues silently scrapling/core/shell.py:363
MED MINED111 Bare except continues silently cleanup.py:29
MED MINED111 Bare except continues silently cleanup.py:37
MED MINED124 requirements.txt: `pytest-xdist` has no version pin tests/requirements.txt:8
MED MINED124 requirements.txt: `pytest-asyncio` has no version pin tests/requirements.txt:6
MED MINED124 requirements.txt: `werkzeug<3.0.0` has no version pin tests/requirements.txt:4
MED MINED124 requirements.txt: `pytest-cov` has no version pin tests/requirements.txt:2
MED MINED124 requirements.txt: `pngquant` has no version pin docs/requirements.txt:8
MED GHSA-q34m-jh98-gwm2 werkzeug: GHSA-q34m-jh98-gwm2 tests/requirements.txt
MED GHSA-hgf8-39gv-g3f2 werkzeug: GHSA-hgf8-39gv-g3f2 tests/requirements.txt
MED GHSA-f9vj-2wh5-fj8j werkzeug: GHSA-f9vj-2wh5-fj8j tests/requirements.txt
MED GHSA-87hc-h4r5-73f7 werkzeug: GHSA-87hc-h4r5-73f7 tests/requirements.txt
MED GHSA-29vq-49wr-vm6x werkzeug: GHSA-29vq-49wr-vm6x tests/requirements.txt
MED GHSA-6w46-j5rx-g56g pytest: GHSA-6w46-j5rx-g56g tests/requirements.txt
MED GHSA-r73j-pqj5-w3x7 pillow: GHSA-r73j-pqj5-w3x7 docs/requirements.txt
MED DKR009 Dockerfile separates apt update from install Dockerfile:24
MED DKR001 Docker final stage has no non-root USER Dockerfile:1
MED DKR014 Dockerfile copies broad context with incomplete .dockerignore Dockerfile:21
MED AGT012 Agent control bridge may listen on a network interface without visible auth scrapling/cli.py:153
LOW COMP001 [COMP001] High cognitive complexity: Function `__str__` has cognitive complexity 10 (Sona… scrapling/core/translator.py:36
LOW COMP001 [COMP001] High cognitive complexity: Function `clean` has cognitive complexity 13 (SonarS… cleanup.py:6
LOW DEPCUR-PY Python package `playwright` is minor version(s) behind (1.59.0 -> 1.60.0) tests/requirements.txt:3
LOW GHSA-5239-wwwm-4pmq pygments: GHSA-5239-wwwm-4pmq tests/requirements.txt
LOW GHSA-68rp-wp8r-4726 flask: GHSA-68rp-wp8r-4726 tests/requirements.txt
LOW AIC003 Duplicated implementation block across source files scrapling/fetchers/stealth_chrome.py:22
LOW AIC003 Duplicated implementation block across source files scrapling/fetchers/stealth_chrome.py:20
LOW AIC003 Duplicated implementation block across source files scrapling/engines/_browsers/_stealth.py:19
LOW DKR008 .dockerignore misses sensitive defaults .dockerignore
INFO MINED043 [MINED043] Http Not Https: Hardcoded http:// (not localhost) for endpoints that handle cr… scrapling/engines/toolbelt/proxy_rotati…:60
INFO MINED062 [MINED062] Python Dataclass No Fields: @dataclass over an empty class — unfinished model. scrapling/spiders/checkpoint.py:15
INFO MINED062 [MINED062] Python Dataclass No Fields: @dataclass over an empty class — unfinished model. scrapling/engines/_browsers/_validators…:158
INFO MINED062 [MINED062] Python Dataclass No Fields: @dataclass over an empty class — unfinished model. scrapling/engines/_browsers/_page.py:13
INFO MINED050 [MINED050] Stub Only Function: Function declared but body is just pass, return None, rais… scrapling/engines/toolbelt/custom.py:179
INFO MINED050 [MINED050] Stub Only Function: Function declared but body is just pass, return None, rais… scrapling/core/translator.py:74
INFO MINED050 [MINED050] Stub Only Function: Function declared but body is just pass, return None, rais… scrapling/core/storage.py:49
INFO MINED067 [MINED067] Python Requests No Timeout: requests.get/post/etc. without timeout= can hang f… benchmarks.py:138
114 findings available (after auto-suppression of test files and won't-fix items)

Issue body (markdown)

## Code-quality scan: `D4Vinci/Scrapling`

**Score: 76/100 (B+)**  ·  131 findings  ·  scanned 2026-06-04 04:11 UTC  ·  25,099 LOC

| Severity | Count |
|---|---|
| CRITICAL | 10 |
| HIGH | 65 |
| MEDIUM | 22 |
| LOW | 9 |

📊 [Full filterable report](https://repobility.com/scan/a5c84e5c-c138-49af-9b50-4766d4aaf498/)  ·  ![scorecard](https://repobility.com/scan/a5c84e5c-c138-49af-9b50-4766d4aaf498/report.png?v=1780546265-s2)

### Top findings

1. **CRITICAL** `MINED030` — Python Pickle Loads
   `scrapling/spiders/checkpoint.py:74` · CWE-502 · ✓ Repobility
2. **CRITICAL** `MINED018` — Unsafe Deserialization Pickle
   `scrapling/spiders/checkpoint.py:74` · CWE-502 · ✓ Repobility
3. **CRITICAL** `SEC081` — Python: pickle.loads / marshal.loads on untrusted data
   `scrapling/spiders/checkpoint.py:74` · A05:2021 Security Misconfiguration
4. **CRITICAL** `MINED013` — Password In Url
   `scrapling/engines/toolbelt/proxy_rotation.py:60` · CWE-200 · ✓ Repobility
5. **CRITICAL** `MINED107` — Missing import: `string` used but not imported
   `scrapling/core/utils/_utils.py:119` · ✓ Repobility
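
For findings 1 to 3, one common mitigation, if checkpoint files could ever come from an untrusted location, is to restrict which globals the unpickler may resolve, following the "Restricting Globals" pattern in the Python `pickle` documentation. The sketch below is illustrative only and is not taken from Scrapling's code: `SAFE_GLOBALS`, `RestrictedUnpickler`, and `restricted_loads` are hypothetical names, and the allowlist contents are an assumption about what a checkpoint might contain.

```python
import io
import pickle

# Hypothetical allowlist of (module, name) pairs a checkpoint may reference.
# Assumption: checkpoints hold only plain containers and primitives.
SAFE_GLOBALS = {
    ("builtins", "set"),
    ("builtins", "frozenset"),
    ("collections", "OrderedDict"),
}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global outside SAFE_GLOBALS."""

    def find_class(self, module, name):
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is not allowed")


def restricted_loads(data: bytes):
    """Drop-in stand-in for pickle.loads() that applies the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

Plain dicts, lists, sets, strings, and numbers load as usual, while a payload whose `__reduce__` points at something like `builtins.eval` raises `pickle.UnpicklingError` before anything is called. If checkpoints are only ever written and read locally by the same process, this finding may not be exploitable in practice.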

---

**Security note**: this issue is public. If any flagged finding is a real, exploitable vulnerability, please redirect to your `SECURITY.md` policy or open a [private security advisory](https://docs.github.com/en/code-security/security-advisories/guidance-on-reporting-and-writing-information-about-vulnerabilities/privately-reporting-a-security-vulnerability) instead. We're happy to close this and re-submit privately.

---

_Filed automatically. Close this issue if not useful — we won't refile. Full report: https://repobility.com/scan/a5c84e5c-c138-49af-9b50-4766d4aaf498/_
Megaproject: high spam risk
Could not determine 'D4Vinci/Scrapling' star count (GitHub API rate-limited or unreachable). When in doubt about repo size, prefer opening a focused PR or a discussion rather than an issue.
Already filed
56/136 findings (41%) on this scan are already flagged as test-file, won't-fix, or suppressed. The scan is too noisy to file as a single issue. Curate down to specific actionable findings, or address the source of the false positives first.

The button opens GitHub's new-issue page in a new tab. You will see the title and body pre-filled; review, edit if you want, then click GitHub's "Submit new issue" button. Repobility never posts anything on your behalf.

For real security findings on big repos: use the project's SECURITY.md or private advisory flow instead of a public issue.