145 of your 223 findings came from Repobility's proprietary detections. ✓ Repobility tags below mark them.

Scan timing: clone 2.94s · analysis 51.7s · 4.6 MB · GitHub API rate-limit (preflight)

scrapy/scrapy

https://github.com/scrapy/scrapy · scanned 2026-06-05 09:27 UTC (5 days, 17 hours ago) · 10 languages

438 raw signals (208 security + 230 graph) · 64th percentile · Python · medium (20-100K LoC) · System graph score 74 (lower by 2)


Complete repo analysis

Last scanned 5 days, 17 hours ago · v2 · 182 actionable findings from 2 signal sources. 140 repeated signals grouped for readability. Security checks, system graph analysis, and verified AI-agent feedback are merged into one review queue.

Score breakdown · 2026-05-18-v5
Component            Sub-score  Weight  Contribution
structure_score           60.0    0.15          9.00
security_score            46.0    0.25         11.50
testing_score            100.0    0.20         20.00
documentation_score       94.0    0.15         14.10
practices_score           70.0    0.15         10.50
code_quality              69.3    0.10          6.93
Overall                           1.00         72.0
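The contributions in the breakdown are consistent with a plain weighted sum of the sub-scores. The report does not state the formula, so the following is only a sketch under that assumption; the names `COMPONENTS` and `overall_score` are illustrative and not part of the scanner.

```python
# Sub-score and weight per component, copied from the score breakdown.
COMPONENTS = {
    "structure_score": (60.0, 0.15),
    "security_score": (46.0, 0.25),
    "testing_score": (100.0, 0.20),
    "documentation_score": (94.0, 0.15),
    "practices_score": (70.0, 0.15),
    "code_quality": (69.3, 0.10),
}


def overall_score(components):
    """Assumed formula: sum of sub-score times weight."""
    return sum(score * weight for score, weight in components.values())


total = overall_score(COMPONENTS)
print(f"overall = {total:.2f}")
```

Under this assumption the weights sum to 1.00 and the total rounds to the reported 72.0, so the lowest-scoring, highest-weighted component (security_score) is where an improvement would move the overall grade most.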
Active filters: excluding tests
Scan summary: Quality grade B (72/100). Dimensions: security 46, maintainability 60. 208 findings (42 security). 76,750 lines analyzed.

Showing 119 of 182 actionable findings. 322 raw detector signals were grouped into reader-sized issues.

critical Security checks quality Quality conf 1.00 ✓ Repobility [MINED030] Python Pickle Loads: pickle.loads() can execute arbitrary code via __reduce__.
Review and fix per the pattern semantics. See CWE-502 for context.
scrapy/squeues.py:152
critical Security checks quality Quality conf 1.00 ✓ Repobility [MINED030] Python Pickle Loads: pickle.loads() can execute arbitrary code via __reduce__.
Review and fix per the pattern semantics. See CWE-502 for context.
scrapy/extensions/spiderstate.py:44
critical Security checks quality Quality conf 1.00 [SEC081] Python: pickle.loads / marshal.loads on untrusted data: pickle.load(s) and marshal.load(s) execute arbitrary code on untrusted input. Ported from dlint DUO103 / DUO120 (BSD-3).
Use json, msgpack, or protobuf for untrusted data. If pickle is required, sign the payload with HMAC.
scrapy/extensions/spiderstate.py:44
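The remediation above mentions signing a pickle payload with HMAC when pickle cannot be replaced. A minimal sketch of that idea follows, using only the standard library. The helper names `dumps_signed` and `loads_signed` and the key handling are hypothetical and are not Scrapy code.

```python
import hashlib
import hmac
import pickle

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def dumps_signed(obj, key: bytes) -> bytes:
    """Pickle obj and prepend an HMAC-SHA256 tag over the payload."""
    payload = pickle.dumps(obj)
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def loads_signed(blob: bytes, key: bytes):
    """Verify the tag first; unpickle only if it matches."""
    tag, payload = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch, refusing to unpickle")
    return pickle.loads(payload)


key = b"example-key-load-from-config-instead"
blob = dumps_signed({"pages_seen": 42}, key)
print(loads_signed(blob, key))
```

Signing only shows that the blob was written by someone holding the key. It does not make pickle safe for data from other sources, and whether the flagged call sites ever read untrusted data has to be judged from the code itself.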
critical Security checks security secrets conf 0.95 3 occurrences Identified a Private Key, which may compromise cryptographic security and sensitive data encryption.
Gitleaks detected a committed secret or credential pattern.
3 files, 3 locations
tests/keys/example-com.key.pem:1
tests/keys/localhost.ip.key:1
tests/keys/mitmproxy-ca.pem:1
high Security checks quality Quality conf 1.00 ✓ Repobility 3 occurrences Missing import: `queue` used but not imported
The file uses `queue.something(...)` but never imports `queue`. This raises NameError at runtime the first time the line executes.
3 files, 3 locations
scrapy/pqueues.py:404
scrapy/utils/asyncio.py:114
tests/test_settings/__init__.py:371
low Security checks quality Quality conf 1.00 ✓ Repobility [MINED006] Overcatch Baseexception: except BaseException: ... — prevents Ctrl+C and SystemExit from working.
Review and fix per the pattern semantics. See CWE-705 for context.
scrapy/utils/console.py:137
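The difference between catching `Exception` and `BaseException` can be illustrated with a small sketch. The function names are hypothetical and the code is not taken from scrapy/utils/console.py, where the broad catch may well be deliberate.

```python
def call_guarded(fn):
    """Run fn and turn ordinary errors into a short message."""
    try:
        return fn()
    except Exception as exc:
        # KeyboardInterrupt and SystemExit derive from BaseException,
        # not Exception, so they are not caught here and still propagate.
        return f"handled {type(exc).__name__}"


def boom():
    raise ValueError("bad input")


def interrupt():
    raise KeyboardInterrupt


print(call_guarded(boom))
```

With `except BaseException:` in place of `except Exception`, the call `call_guarded(interrupt)` would return a message instead of letting Ctrl+C stop the program.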
high Security checks quality Quality conf 1.00 ✓ Repobility [MINED036] Python Os System Call: os.system() invokes shell with no escaping.
Review and fix per the pattern semantics. See CWE-78 for context.
scrapy/commands/genspider.py:123
high Security checks quality Quality conf 1.00 ✓ Repobility [MINED036] Python Os System Call: os.system() invokes shell with no escaping.
Review and fix per the pattern semantics. See CWE-78 for context.
scrapy/commands/edit.py:48
low Security checks security Injection conf 0.80 [SEC005] Command Injection Risk: Unsafe shell execution or eval of user input.
Use subprocess with shell=False and a list of args. Never eval user input.
scrapy/commands/edit.py:48
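The recommendation to use `subprocess` with `shell=False` and a list of arguments can be sketched as follows. This assumes the flagged calls launch an editor on a file path, which the report does not confirm; `build_argv` and `open_in_editor` are hypothetical helpers.

```python
import shlex
import subprocess


def build_argv(editor: str, path: str) -> list[str]:
    """Split the editor command into words and append the path as ONE argument."""
    return [*shlex.split(editor), path]


def open_in_editor(editor: str, path: str) -> int:
    """Run the editor without a shell, so metacharacters in path stay inert."""
    return subprocess.run(build_argv(editor, path), check=False).returncode


# A path containing shell metacharacters stays a single argv entry.
print(build_argv("vim -n", "spider; rm -rf x.py"))
```

Because no shell parses the command line, a file name such as `spider; rm -rf x.py` reaches the editor as a literal name instead of being executed as a second command.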
high Security checks software Resource exhaustion conf 1.00 3 occurrences [SEC035] Unbounded Resource Allocation — DoS risk: Allocating resources (buffers, recursion stack, large ranges) based on user input without an upper bound. Attackers send `size=10000000` to exhaust memory, or trigger expensive computation. CWE-770/400. Examples: CVE-2023-44487 (HTTP/2 Rapid Reset), countless YAML/XML billion-laughs variants.
Cap user-controlled sizes before allocation: `size = min(int(request.args.get('n', 100)), MAX_SIZE)`. Set framework-level limits. Flask: `app.config['MAX_CONTENT_LENGTH'] = 10 * 1024 * 1024`. FastAPI: use middleware to enforce request size. Django: `DATA_UPLOAD_MAX_MEMORY_SIZE` in settings.py …
3 files, 3 locations
scrapy/core/http2/agent.py:156
scrapy/pipelines/images.py:248
scrapy/utils/request.py:134
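The "cap before allocation" advice can be written as a small clamp helper. This is a generic sketch; `MAX_SIZE`, `bounded_size`, and the default of 100 are illustrative values rather than settings from the scanned code.

```python
MAX_SIZE = 10_000  # illustrative upper bound


def bounded_size(raw, default: int = 100, max_size: int = MAX_SIZE) -> int:
    """Parse a caller-supplied size and clamp it to [0, max_size]."""
    try:
        requested = int(raw)
    except (TypeError, ValueError):
        return default  # missing or malformed input falls back to the default
    return max(0, min(requested, max_size))


buffer = bytearray(bounded_size("10000000"))  # allocates MAX_SIZE bytes, not 10 MB
print(len(buffer))
```

The clamp runs before anything is allocated, so an oversized or negative request costs only an integer comparison.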
high Security checks quality Quality conf 1.00 ✓ Repobility 25 occurrences `self.encoding` used but never assigned in __init__
Method `_configure` of class `BaseItemExporter` reads `self.encoding`, but no assignment to it exists in __init__ (and no class-level fallback). This raises AttributeError the first time the method runs against an instance.
3 files, 25 locations
scrapy/exporters.py:49, 50, 53, 54, 83, 85, 87, 89, +10 more (18 hits)
scrapy/mail.py:159, 162, 163, 210 (4 hits)
scrapy/statscollectors.py:97, 109, 110 (3 hits)
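The failure this finding describes, and one case where a check limited to `__init__` would over-report, can be sketched with three hypothetical classes. None of them is Scrapy code, and whether the 25 reported locations are real defects needs a manual look.

```python
class LeakyExporter:
    """encoding is never assigned anywhere: reading it fails at runtime."""

    def header(self) -> str:
        return f"encoding={self.encoding}"


class FallbackExporter:
    """A class-level default makes the read safe even without __init__."""

    encoding = "utf-8"

    def header(self) -> str:
        return f"encoding={self.encoding}"


class ConfiguredExporter:
    """__init__ delegates to a helper that assigns the attribute.

    A check that only looks for assignments inside __init__ itself would
    still flag this class, although it works at runtime.
    """

    def __init__(self, encoding: str = "utf-8") -> None:
        self._configure(encoding)

    def _configure(self, encoding: str) -> None:
        self.encoding = encoding

    def header(self) -> str:
        return f"encoding={self.encoding}"


print(FallbackExporter().header(), ConfiguredExporter("latin-1").header())
```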
high Security checks software dependencies conf 0.88 cryptography: PYSEC-2026-36
cryptography is a package designed to expose cryptographic primitives and recipes to Python developers. From 45.0.0 to before 46.0.7, if a non-contiguous buffer was passed to APIs which accepted Python buffers (e.g. Hash.update()), this could lead to buffer overflows. This vulnerability is fixed in…
docs/requirements.txt
medium Security checks cicd CI/CD security conf 0.90 ✓ Repobility 15 occurrences GitHub Action is tag-pinned rather than SHA-pinned
Action `pre-commit/action` is pinned to `@v3.0.1`, a mutable tag or branch. Pin external actions to a reviewed full commit SHA when the workflow is security-sensitive.
5 files, 15 locations
.github/workflows/tests-macos.yml:44, 48 (4 hits)
.github/workflows/tests-ubuntu.yml:107, 111 (4 hits)
.github/workflows/tests-windows.yml:71, 75 (4 hits)
.github/workflows/publish.yml:29 (2 hits)
.github/workflows/checks.yml:58
CI/CD security · Supply chain · GitHub Actions
low Security checks cicd CI/CD security conf 0.90 ✓ Repobility 17 occurrences GitHub Action is tag-pinned rather than SHA-pinned
Action `actions/github-script` is pinned to `@v6`, a mutable tag or branch. Pin external actions to a reviewed full commit SHA when the workflow is security-sensitive.
6 files, 17 locations
.github/workflows/checks.yml:41, 44, 57 (4 hits)
.github/workflows/tests-macos.yml:30, 33 (3 hits)
.github/workflows/tests-ubuntu.yml:87, 90 (3 hits)
.github/workflows/tests-windows.yml:57, 60 (3 hits)
.github/workflows/auto-close-llm-pr.yml:14 (2 hits)
.github/workflows/publish.yml:21, 22 (2 hits)
CI/CD security · Supply chain · GitHub Actions
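Whether a workflow `uses:` reference is SHA-pinned can be checked mechanically: a full commit SHA is 40 hexadecimal characters, while tags and branches are not. The helper below is a rough sketch of such a check, not the scanner's implementation, and `is_sha_pinned` is a hypothetical name.

```python
import re

FULL_SHA = re.compile(r"[0-9a-f]{40}")


def is_sha_pinned(uses: str) -> bool:
    """Return True if the ref after '@' is a full 40-character commit SHA."""
    _, _, ref = uses.partition("@")
    return FULL_SHA.fullmatch(ref) is not None


for ref in ("pre-commit/action@v3.0.1", "actions/github-script@v6"):
    print(ref, is_sha_pinned(ref))
```

Both references named in the findings above are reported as not pinned by this check, since `v3.0.1` and `v6` are tags.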
high Security checks software dependencies conf 0.88 lxml: PYSEC-2026-87
lxml is a library for processing XML and HTML in the Python language. Prior to 6.1.0, using either of the two parsers in the default configuration (with resolve_entities=True) allows untrusted XML input to read local files. Setting the resolve_entities option explicitly to resolve_entities='interna…
docs/requirements.txt
high Security checks software dependencies conf 0.90 ✓ Repobility 5 occurrences pre-commit hook `https://github.com/astral-sh/ruff-pre-commit` pinned to mutable rev `v0.15.2`
`.pre-commit-config.yaml` references `https://github.com/astral-sh/ruff-pre-commit` at `rev: v0.15.2`. If the rev is a branch or version tag, the repo owner can push new code there and `pre-commit install --install-hooks` will fetch it on every developer's machine.
.pre-commit-config.yaml:8, 14, 20, 25, 29 (5 hits)
high Security checks software dependencies conf 0.90 ✓ Repobility 3 occurrences requirements.txt installs from `sphinx-llms-txt @ git+https://github.com/zytedata/...` (git/URL)
Pip requirement points to a VCS URL or direct download. Bypasses PyPI's integrity check + scanning. If the host or branch tip changes, the next `pip install` pulls a different package — no diff visible to reviewers.
docs/requirements.txt:144, 146, 156 (3 hits)
high Security checks software dependencies conf 0.88 scrapy: PYSEC-2017-83
Scrapy 1.4 allows remote attackers to cause a denial of service (memory consumption) via large files because arbitrarily many files are read into memory, which is especially problematic if the files are then individually written in a separate thread to a slow storage resource, as demonstrated by in…
docs/requirements.txt
high Security checks software dependencies conf 0.88 twisted: PYSEC-2026-160
Twisted is an event-based framework for internet applications, supporting Python 3.6+. Prior to 26.4.0rc2, the twisted.names module is vulnerable to a Denial of Service (DoS) attack via resource exhaustion during DNS name decompression. A remote, unauthenticated attacker can exploit this by sending…
docs/requirements.txt
high Security checks software dependencies conf 0.88 urllib3: PYSEC-2026-141
urllib3 is an HTTP client library for Python. From 1.23 to before 2.7.0, cross-origin redirects followed from the low-level API via ProxyManager.connection_from_url().urlopen(..., assert_same_host=False) still forward these sensitive headers. This vulnerability is fixed in 2.7.0.
docs/requirements.txt
high Security checks software dependencies conf 0.88 urllib3: PYSEC-2026-142
urllib3 is an HTTP client library for Python. From 2.6.0 to before 2.7.0, urllib3 could decompress the whole response instead of the requested portion (1) during the second HTTPResponse.read(amt=N) call when the response was decompressed using the official Brotli library or (2) when HTTPResponse.dr…
docs/requirements.txt
high System graph security security conf 1.00 Insecure pattern 'eval_used' in scrapy/shell.py:168
Found a known-risky pattern (eval_used). Review and replace if possible.
scrapy/shell.py:168 Eval used
high System graph security security conf 1.00 Insecure pattern 'eval_used' in scrapy/utils/engine.py:35
Found a known-risky pattern (eval_used). Review and replace if possible.
scrapy/utils/engine.py:35 Eval used
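Where an `eval` call only ever needs to parse literal values, `ast.literal_eval` from the standard library is a narrower substitute. The sketch below assumes such a literal-only use case, which the report does not establish for the two flagged files; in an interactive shell, evaluating arbitrary expressions may be the intended behavior. `parse_literal` is a hypothetical helper.

```python
import ast


def parse_literal(text: str):
    """Parse a Python literal (numbers, strings, lists, dicts, ...) without executing code."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        raise ValueError(f"not a plain literal: {text!r}") from exc


print(parse_literal("[1, 2, {'depth': 3}]"))
```

Function calls, attribute access, and names are rejected, so an input like `__import__('os').getcwd()` raises `ValueError` instead of running.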
low Security checks security Injection conf 0.50 [SEC005] Command Injection Risk: Unsafe shell execution or eval of user input.
Use subprocess with shell=False and a list of args. Never eval user input.
scrapy/commands/genspider.py:123
low Security checks security Deserialization conf 1.00 [SEC007] Unsafe Deserialization: Unsafe deserialization can execute arbitrary code.
Use yaml.safe_load() instead of yaml.load(). Avoid pickle for untrusted data.
scrapy/extensions/spiderstate.py:44
medium Security checks security Crypto conf 1.00 [SEC014] SSL Verification Disabled: SSL certificate verification is disabled, allowing man-in-the-middle attacks.
Enable SSL verification. Use verify=True (default) for requests. Pin certificates if needed.
scrapy/utils/ssl.py:51
medium Security checks security Crypto conf 1.00 [SEC107] Weak TLS version requested (TLSv1.0, TLSv1.1, SSLv3, SSLv2): TLS 1.0 and 1.1 were deprecated by the IETF in 2021 (RFC 8996). Most browsers no longer support them. Code that requests these protocols is exposed to downgrade attacks.
Use TLSv1.2 minimum, TLSv1.3 preferred. Java: `SSLContext.getInstance("TLSv1.2")`. Python: `ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)` with `minimum_version = ssl.TLSVersion.TLSv1_2`. Go: `MinVersion: tls.VersionTLS12`.
scrapy/utils/ssl.py:26
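For the two TLS findings above, a client context with certificate verification enabled and a TLS 1.2 floor can be built with the standard `ssl` module. This is a generic sketch, not a patch for scrapy/utils/ssl.py, which may handle old protocol constants for compatibility or logging reasons.

```python
import ssl

# create_default_context() enables certificate and hostname verification.
context = ssl.create_default_context()

# Refuse anything older than TLS 1.2; newer versions are still negotiated.
context.minimum_version = ssl.TLSVersion.TLSv1_2

print(context.verify_mode, context.check_hostname, context.minimum_version)
```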
low Security checks quality Error handling conf 0.55 ✓ Repobility 20 occurrences Broad exception handler needs review
This handler catches Exception/BaseException. It is actionable when it swallows errors without logging, re-raising, or returning a structured error. Handlers that intentionally convert exceptions into typed error results should not be treated as high risk.
11 files, 20 locations
scrapy/contracts/__init__.py:48, 78, 131, 187 (4 hits)
scrapy/utils/defer.py:154, 360, 379, 440 (4 hits)
scrapy/core/spidermw.py:96, 110, 226 (3 hits)
scrapy/core/scraper.py:258, 290 (2 hits)
scrapy/core/downloader/__init__.py:259
scrapy/core/downloader/middleware.py:83
scrapy/extensions/httpcache.py:418
scrapy/pipelines/files.py:416
Error handling · quality
medium Security checks software dependencies conf 0.88 idna: GHSA-65pc-fj4g-8rjx
Internationalized Domain Names in Applications (IDNA): Specially crafted inputs to idna.encode() can bypass CVE-2024-3651 fix
docs/requirements.txt
medium Security checks software dependencies conf 0.90 Python package `service-identity` is 2 major version(s) behind (24.2.0 -> 26.1.0)
`service-identity==24.2.0` is 2 major version(s) behind the latest stable release on PyPI (26.1.0). Pinned-but-stale Python dependencies drift away from upstream security and bugfix releases. This is the version-currency signal Dependabot raises.
docs/requirements.txt:125
medium Security checks software dependencies conf 0.90 Python package `twisted` is 1 major version(s) behind (25.5.0 -> 26.4.0)
`twisted==25.5.0` is 1 major version(s) behind the latest stable release on PyPI (26.4.0). Pinned-but-stale Python dependencies drift away from upstream security and bugfix releases. This is the version-currency signal Dependabot raises.
docs/requirements.txt:178
medium System graph cicd CI/CD security conf 1.00 GitHub Actions workflow grants broad write permissions
CI tokens with write permissions increase blast radius when an action, dependency, or PR workflow is compromised. Prefer job-level least-privilege permissions.
.github/workflows/publish.yml · CI/CD security · Supply chain · GitHub Actions
medium System graph security security conf 1.00 Insecure pattern 'weak_hash' in scrapy/pipelines/files.py:249
Found a known-risky pattern (weak_hash). Review and replace if possible.
scrapy/pipelines/files.py:249 Weak hash
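The finding does not say which hash is used or for what purpose. As a hedged sketch of the two usual resolutions: use SHA-256 where collision resistance matters, or mark a legacy digest as non-security with `usedforsecurity=False` (Python 3.9+) where it only serves as a checksum or cache key. Both function names are hypothetical.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Collision-resistant digest for integrity-sensitive uses."""
    return hashlib.sha256(data).hexdigest()


def legacy_checksum(data: bytes) -> str:
    """MD5 kept for compatibility, explicitly flagged as not security-relevant."""
    return hashlib.md5(data, usedforsecurity=False).hexdigest()


print(content_digest(b"example"), legacy_checksum(b"example"))
```

Switching the algorithm changes every stored digest, so a migration is only worthwhile if the digest actually guards against tampering.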
medium System graph quality Integrity conf 1.00 Network/subprocess call without timeout or try/except — scrapy/commands/bench.py:40
`subprocess.Popen(...)` here lacks both a `timeout=` arg and an enclosing try/except. This is exactly the class of bug that took down our git-clone earlier (HTTP/2 stream cancel surfaced as a fatal). Add a `timeout=` and wrap in try/except, or use a wrapper that retries.
runtime safety · Robustness
medium System graph quality Integrity conf 1.00 Network/subprocess call without timeout or try/except — scrapy/commands/parse.py:163
`requests.get(...)` here lacks both a `timeout=` arg and an enclosing try/except. This is exactly the class of bug that took down our git-clone earlier (HTTP/2 stream cancel surfaced as a fatal). Add a `timeout=` and wrap in try/except, or use a wrapper that retries.
runtime safety · Robustness
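The suggested fix, a `timeout=` plus an enclosing try/except, looks roughly like this for a subprocess call. It is a generic sketch with a hypothetical helper name, and a benchmark command that is meant to run until interrupted may not want a hard timeout at all.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s: float = 5.0):
    """Run a command; return its exit code, or None on timeout or launch failure."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return None  # subprocess.run kills the child before raising
    except OSError:
        return None  # executable missing or not runnable
    return done.returncode


print(run_with_timeout([sys.executable, "-c", "print('ok')"]))
```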
medium System graph security Coverage conf 1.00 No auth library detected
The scanner did not find any standard auth library (JWT, OAuth, NextAuth, Auth0, etc.). Either auth lives in custom code, in a separate service, or is missing.
auth
low Security checks quality Quality conf 0.60 15 occurrences Duplicated implementation block across source files
Duplicate implementation blocks are maintenance debt. Keep them visible, but they are not a high-severity defect unless the duplicated logic is security-sensitive or drifting.
12 files, 14 locations
tests/AsyncCrawlerProcess/asyncio_enabled_reactor_same_loop.py:10, 11 (2 hits)
tests/CrawlerProcess/asyncio_enabled_reactor.py:4, 38 (2 hits)
scrapy/http/response/text.py:166
tests/AsyncCrawlerProcess/asyncio_custom_loop_custom_settings_same.py:1
tests/AsyncCrawlerProcess/asyncio_enabled_reactor.py:25
tests/AsyncCrawlerProcess/asyncio_enabled_reactor_different_loop.py:9
tests/AsyncCrawlerRunner/custom_loop_same.py:2
tests/AsyncCrawlerRunner/multi_seq.py:2
duplication · quality
low Security checks software dependencies conf 0.88 pygments: GHSA-5239-wwwm-4pmq
Pygments has Regular Expression Denial of Service (ReDoS) due to Inefficient Regex for GUID Matching
docs/requirements.txt
low Security checks software dependencies conf 0.90 Python package `certifi` is minor version(s) behind (2026.2.25 -> 2026.5.20)
`certifi==2026.2.25` is minor version(s) behind the latest stable release on PyPI (2026.5.20). Pinned-but-stale Python dependencies drift away from upstream security and bugfix releases. This is the version-currency signal Dependabot raises.
docs/requirements.txt:15
low Security checks software dependencies conf 0.90 Python package `docutils` is minor version(s) behind (0.22.4 -> 0.23)
`docutils==0.22.4` is minor version(s) behind the latest stable release on PyPI (0.23). Pinned-but-stale Python dependencies drift away from upstream security and bugfix releases. This is the version-currency signal Dependabot raises.
docs/requirements.txt:34
low Security checks software dependencies conf 0.90 Python package `filelock` is minor version(s) behind (3.25.2 -> 3.29.1)
`filelock==3.25.2` is minor version(s) behind the latest stable release on PyPI (3.29.1). Pinned-but-stale Python dependencies drift away from upstream security and bugfix releases. This is the version-currency signal Dependabot raises.
docs/requirements.txt:39
low Security checks software dependencies conf 0.90 Python package `idna` is minor version(s) behind (3.11 -> 3.18)
`idna==3.11` is minor version(s) behind the latest stable release on PyPI (3.18). Pinned-but-stale Python dependencies drift away from upstream security and bugfix releases. This is the version-currency signal Dependabot raises.
docs/requirements.txt:49
low Security checks software dependencies conf 0.90 Python package `packaging` is minor version(s) behind (26.0 -> 26.2)
`packaging==26.0` is minor version(s) behind the latest stable release on PyPI (26.2). Pinned-but-stale Python dependencies drift away from upstream security and bugfix releases. This is the version-currency signal Dependabot raises.
docs/requirements.txt:76
low Security checks software dependencies conf 0.90 Python package `pydantic` is minor version(s) behind (2.12.5 -> 2.13.4)
`pydantic==2.12.5` is minor version(s) behind the latest stable release on PyPI (2.13.4). Pinned-but-stale Python dependencies drift away from upstream security and bugfix releases. This is the version-currency signal Dependabot raises.
docs/requirements.txt:98
low Security checks software dependencies conf 0.90 Python package `pygments` is minor version(s) behind (2.19.2 -> 2.20.0)
`pygments==2.19.2` is minor version(s) behind the latest stable release on PyPI (2.20.0). Pinned-but-stale Python dependencies drift away from upstream security and bugfix releases. This is the version-currency signal Dependabot raises.
docs/requirements.txt:106
low Security checks software dependencies conf 0.90 Python package `pyopenssl` is minor version(s) behind (26.0.0 -> 26.2.0)
`pyopenssl==26.0.0` is minor version(s) behind the latest stable release on PyPI (26.2.0). Pinned-but-stale Python dependencies drift away from upstream security and bugfix releases. This is the version-currency signal Dependabot raises.
docs/requirements.txt:108
low Security checks software dependencies conf 0.90 Python package `requests` is minor version(s) behind (2.33.0 -> 2.34.2)
`requests==2.33.0` is minor version(s) behind the latest stable release on PyPI (2.34.2). Pinned-but-stale Python dependencies drift away from upstream security and bugfix releases. This is the version-currency signal Dependabot raises.
docs/requirements.txt:112
low Security checks software dependencies conf 0.90 Python package `scrapy` is minor version(s) behind (2.14.2 -> 2.16.0)
`scrapy==2.14.2` is minor version(s) behind the latest stable release on PyPI (2.16.0). Pinned-but-stale Python dependencies drift away from upstream security and bugfix releases. This is the version-currency signal Dependabot raises.
docs/requirements.txt:121
low Security checks software dependencies conf 0.90 Python package `snowballstemmer` is minor version(s) behind (3.0.1 -> 3.1.1)
`snowballstemmer==3.0.1` is minor version(s) behind the latest stable release on PyPI (3.1.1). Pinned-but-stale Python dependencies drift away from upstream security and bugfix releases. This is the version-currency signal Dependabot raises.
docs/requirements.txt:127
low Security checks software dependencies conf 0.90 Python package `urllib3` is minor version(s) behind (2.6.3 -> 2.7.0)
`urllib3==2.6.3` is minor version(s) behind the latest stable release on PyPI (2.7.0). Pinned-but-stale Python dependencies drift away from upstream security and bugfix releases. This is the version-currency signal Dependabot raises.
docs/requirements.txt:188
low Security checks software dependencies conf 0.90 Python package `zope-interface` is minor version(s) behind (8.2 -> 8.5)
`zope-interface==8.2` is minor version(s) behind the latest stable release on PyPI (8.5). Pinned-but-stale Python dependencies drift away from upstream security and bugfix releases. This is the version-currency signal Dependabot raises.
docs/requirements.txt:194
low System graph software Dead code candidate conf 1.00 File has no detected symbols: docs/conf.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: scrapy/__main__.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: scrapy/core/downloader/handlers/http.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: scrapy/signals.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: scrapy/utils/_deps_compat.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: tests/AsyncCrawlerProcess/reactorless_reactor.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: tests/CrawlerProcess/reactorless.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: tests/CrawlerRunner/reactorless.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: tests/test_cmdline/settings.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: tests/test_cmdline_crawl_with_pipeline/test_spider/settings.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: tests/test_settings/default_settings.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: tests/test_utils_misc/test_walk_modules/mod/mod0.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: tests/test_utils_misc/test_walk_modules/mod1.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph quality Integrity conf 1.00 13 occurrences Near-duplicate function bodies in 2 places
Functions with the same first-5-line body hash: extras/qpsclient.py:parse, scrapy/core/downloader/handlers/base.py:close This is *the* AI-coder failure mode (4× more duplication in vibe-coded repos — see https://jw.hn/ai-code-hygiene). Consolidate or document why they're separate.
13 occurrences
repo-level (13 hits)
duplicates · duplication
low System graph quality Integrity conf 1.00 4 occurrences Near-duplicate function bodies in 3 places
Functions with the same first-5-line body hash: scrapy/exporters.py:serialize_field, scrapy/exporters.py:serialize_field, scrapy/exporters.py:serialize_field This is *the* AI-coder failure mode (4× more duplication in vibe-coded repos — see https://jw.hn/ai-code-hygiene). Consolidate or document w…
4 occurrences
repo-level (4 hits)
duplicates · duplication
low System graph quality Integrity conf 1.00 2 occurrences Near-duplicate function bodies in 4 places
Functions with the same first-5-line body hash: scrapy/exporters.py:finish_exporting, scrapy/exporters.py:finish_exporting, scrapy/exporters.py:finish_exporting, scrapy/exporters.py:finish_exporting This is *the* AI-coder failure mode (4× more duplication in vibe-coded repos — see https://jw.hn/ai…
2 occurrences
repo-level (2 hits)
duplicates · duplication
low System graph quality Integrity conf 1.00 Near-duplicate function bodies in 9 places
Functions with the same first-5-line body hash: scrapy/exporters.py:export_item, scrapy/exporters.py:export_item, scrapy/exporters.py:export_item, scrapy/exporters.py:export_item This is *the* AI-coder failure mode (4× more duplication in vibe-coded repos — see https://jw.hn/ai-code-hygiene). Cons…
duplicates · duplication
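The findings above group functions by a "first-5-line body hash". The scanner's exact method is not documented here, so the following is only one plausible reading, sketched with the standard `ast` module; `body_fingerprint` and `find_near_duplicates` are hypothetical names.

```python
import ast
import hashlib
from collections import defaultdict


def body_fingerprint(source: str, func, n_lines: int = 5) -> str:
    """Hash the first n non-blank, whitespace-stripped lines of a function body."""
    lines = source.splitlines()
    start = func.body[0].lineno - 1  # first statement of the body
    body = [ln.strip() for ln in lines[start:func.end_lineno] if ln.strip()]
    return hashlib.sha256("\n".join(body[:n_lines]).encode()).hexdigest()


def find_near_duplicates(source: str):
    """Group function names whose body fingerprints collide."""
    groups = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            groups[body_fingerprint(source, node)].append(node.name)
    return [names for names in groups.values() if len(names) > 1]


SAMPLE = '''
def a(x):
    y = x + 1
    return y

def b(x):
    y = x + 1
    return y

def c(x):
    return x
'''
print(find_near_duplicates(SAMPLE))
```

Under this reading, overridden methods with short or trivial bodies produce identical fingerprints, which would explain why several methods of the same name in scrapy/exporters.py appear in one group. That is worth checking before consolidating anything.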
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `AlsoDeprecated` in tests/test_utils_deprecate.py:232
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `test_copy` in tests/test_http_headers.py:107
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `test_copy` in tests/test_http_request.py:185
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `test_copy` in tests/test_http_response.py:65
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `test_copy` in tests/test_http_response_text.py:627
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `test_copy` in tests/test_item.py:250
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `test_copy` in tests/test_settings/__init__.py:359
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `test_copy` in tests/test_utils_datatypes.py:195
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `test_spider_callback_deferred_deprecated` in tests/test_crawl.py:840
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `test_verify_certs_deprecated` in tests/test_downloader_handler_httpx.py:81
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `test_verify_certs_deprecated` in tests/test_downloader_handlers_http_base.py:835
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `TestDownloadDeprecated` in tests/test_downloadermiddleware.py:291
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `TestHttpAuthMiddlewareLegacy` in tests/test_downloadermiddleware_httpauth.py:26
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph software Dead code conf 1.00 Possibly dead Python function: attribute
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
scrapy/utils/deprecate.py:16
low System graph software Dead code conf 1.00 Possibly dead Python function: collect_scrapy_settings_refs
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
docs/_ext/scrapydocs.py:43
low System graph software Dead code conf 1.00 Possibly dead Python function: commit_role
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
docs/_ext/scrapydocs.py:148
low System graph software Dead code conf 1.00 Possibly dead Python function: defer_fail
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
scrapy/utils/defer.py:47
low System graph software Dead code conf 1.00 Possibly dead Python function: defer_succeed
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
scrapy/utils/defer.py:68
low System graph software Dead code conf 1.00 Possibly dead Python function: depart_settingslist_node_markdown
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
docs/_ext/scrapydocs.py:128
low System graph software Dead code conf 1.00 Possibly dead Python function: emit
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
scrapy/utils/log.py:239
low System graph software Dead code conf 1.00 Possibly dead Python function: getChild
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
extras/qps-bench-server.py:21
low System graph software Dead code conf 1.00 Possibly dead Python function: getChild
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
scrapy/utils/benchserver.py:12
low System graph software Dead code conf 1.00 Possibly dead Python function: handle
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
scrapy/signalmanager.py:111
low System graph software Dead code conf 1.00 Possibly dead Python function: handle_exception
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
scrapy/commands/parse.py:132
low System graph software Dead code conf 1.00 Possibly dead Python function: inspect_response
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
scrapy/shell.py:310
low System graph software Dead code conf 1.00 Possibly dead Python function: is_setting_index
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
docs/_ext/scrapydocs.py:28
low System graph software Dead code conf 1.00 Possibly dead Python function: issue_role
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
docs/_ext/scrapydocs.py:140
low System graph software Dead code conf 1.00 Possibly dead Python function: logerror
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
scrapy/utils/signal.py:104
low System graph software Dead code conf 1.00 Possibly dead Python function: maybe_skip_member
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
docs/_ext/scrapyfixautodoc.py:12
low System graph software Dead code conf 1.00 Possibly dead Python function: md5sum
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
scrapy/utils/misc.py:138
low System graph software Dead code conf 1.00 Possibly dead Python function: process_chain
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
scrapy/utils/defer.py:298
low System graph software Dead code conf 1.00 Possibly dead Python function: process_parallel
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
scrapy/utils/defer.py:317
low System graph software Dead code conf 1.00 Possibly dead Python function: replace_settingslist_nodes
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
docs/_ext/scrapydocs.py:99
low System graph software Dead code conf 1.00 Possibly dead Python function: rev_role
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
docs/_ext/scrapydocs.py:156
low System graph software Dead code conf 1.00 Possibly dead Python function: scraped_data
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
scrapy/commands/parse.py:276
low System graph software Dead code conf 1.00 Possibly dead Python function: set_crawler
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
scrapy/commands/__init__.py:40
low System graph software Dead code conf 1.00 Possibly dead Python function: setup
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
docs/_ext/scrapydocs.py:164
low System graph software Dead code conf 1.00 Possibly dead Python function: setup
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
docs/_ext/scrapyfixautodoc.py:19
low System graph software Dead code conf 1.00 Possibly dead Python function: source_role
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
docs/_ext/scrapydocs.py:132
low System graph software Dead code conf 1.00 Possibly dead Python function: visit_settingslist_node_markdown
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
docs/_ext/scrapydocs.py:115
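The "no callers detected by AST scan" rule in the findings above can be illustrated with a name-based sketch. It assumes the simplest possible approach: collect every function definition, collect every name or attribute that is referenced anywhere, and report the definitions that are never referenced. The helper `uncalled_functions` and its input shape are hypothetical, and the real detector may work differently.

```python
from __future__ import annotations

import ast


def uncalled_functions(sources: dict[str, str]) -> list[tuple[str, str, int]]:
    """Return (path, name, line) for functions never referenced by name.

    `sources` maps a file path to its source text (an assumed input shape).
    """
    defined = []        # every function definition found
    referenced = set()  # every bare name or attribute name that is used
    for path, text in sources.items():
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                defined.append((path, node.name, node.lineno))
            elif isinstance(node, ast.Name):
                referenced.add(node.id)
            elif isinstance(node, ast.Attribute):
                referenced.add(node.attr)
    return sorted(d for d in defined if d[1] not in referenced)
```

A scan like this cannot see calls made by convention from outside the repository. A Sphinx extension's `setup(app)`, docutils `visit_*` and `depart_*` methods, and a `logging.Handler.emit` override are all invoked by their frameworks, which is consistent with the "framework handler" caveat attached to each finding.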
low System graph quality Integrity conf 1.00 Stub function `acquire` (body is just `pass`/`return`) — scrapy/http/cookies.py:131
Likely an AI scaffold that was never filled in. Remove or implement.
Empty handler · Dead code
low System graph quality Integrity conf 1.00 Stub function `close_spider` (body is just `pass`/`return`) — scrapy/extensions/httpcache.py:332
Likely an AI scaffold that was never filled in. Remove or implement.
Empty handler · Dead code
low System graph quality Integrity conf 1.00 Stub function `depart_settingslist_node_markdown` (body is just `pass`/`return`) — docs/_ext/scrapydocs.py:128
Likely an AI scaffold that was never filled in. Remove or implement.
Empty handler · Dead code
low System graph quality Integrity conf 1.00 Stub function `extract_cookies` (body is just `pass`/`return`) — scrapy/utils/_download_handlers.py:44
Likely an AI scaffold that was never filled in. Remove or implement.
Empty handler · Dead code
low System graph quality Integrity conf 1.00 Stub function `open_spider` (body is just `pass`/`return`) — scrapy/statscollectors.py:86
Likely an AI scaffold that was never filled in. Remove or implement.
Empty handler · Dead code
low System graph quality Integrity conf 1.00 Stub function `open` (body is just `pass`/`return`) — scrapy/dupefilters.py:38
Likely an AI scaffold that was never filled in. Remove or implement.
Empty handler · Dead code
low System graph quality Integrity conf 1.00 Stub function `parse` (body is just `pass`/`return`) — extras/qpsclient.py:54
Likely an AI scaffold that was never filled in. Remove or implement.
Empty handler · Dead code
low System graph quality Integrity conf 1.00 Stub function `pauseProducing` (body is just `pass`/`return`) — scrapy/core/downloader/handlers/http11.py:610
Likely an AI scaffold that was never filled in. Remove or implement.
Empty handler · Dead code
low System graph quality Integrity conf 1.00 Stub function `referrer` (body is just `pass`/`return`) — scrapy/spidermiddlewares/referer.py:120
Likely an AI scaffold that was never filled in. Remove or implement.
Empty handler · Dead code
low System graph quality Integrity conf 1.00 Stub function `store` (body is just `pass`/`return`) — scrapy/extensions/feedexport.py:163
Likely an AI scaffold that was never filled in. Remove or implement.
Empty handler · Dead code
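The stub rule used in the findings above (body is just `pass`/`return`) could be checked roughly as follows. This is a sketch under stated assumptions: `is_stub` and `find_stubs` are hypothetical names, a leading docstring is ignored, and only a bare `return` counts (not `return None` or any other value).

```python
from __future__ import annotations

import ast


def is_stub(func: ast.FunctionDef | ast.AsyncFunctionDef) -> bool:
    """True if the body holds only `pass` or bare `return`, ignoring a docstring."""
    body = list(func.body)
    first = body[0] if body else None
    if (
        isinstance(first, ast.Expr)
        and isinstance(first.value, ast.Constant)
        and isinstance(first.value.value, str)
    ):
        body = body[1:]  # drop the docstring before inspecting statements
    for stmt in body:
        if isinstance(stmt, ast.Pass):
            continue
        if isinstance(stmt, ast.Return) and stmt.value is None:
            continue
        return False
    return True


def find_stubs(source: str) -> list[tuple[str, int]]:
    """Return (name, line) for every stub function in `source`, in line order."""
    hits = [
        (node.name, node.lineno)
        for node in ast.walk(ast.parse(source))
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and is_stub(node)
    ]
    return sorted(hits, key=lambda hit: hit[1])
```

An empty body is not always unfinished work. Base classes and no-op implementations often define hooks that do nothing on purpose, so each hit should be read in context before anything is removed.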
low System graph quality Complexity conf 1.00 Very large file: tests/test_downloader_handlers_http_base.py (1356 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
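The size threshold can be reproduced locally with a few lines of Python. This is a minimal sketch: `large_files` is a hypothetical helper, the 800-line limit is taken from the finding text, and counting physical lines (blank lines and comments included) is an assumption about how the report measures file length.

```python
from __future__ import annotations

from pathlib import Path


def large_files(root: str, limit: int = 800) -> list[tuple[str, int]]:
    """Return (relative_path, line_count) for .py files longer than `limit` lines."""
    hits = []
    for path in Path(root).rglob("*.py"):
        # Count physical lines; undecodable bytes are replaced, not fatal.
        with path.open(encoding="utf-8", errors="replace") as handle:
            count = sum(1 for _ in handle)
        if count > limit:
            hits.append((path.relative_to(root).as_posix(), count))
    return sorted(hits)
```

Calling `large_files(".")` from a checkout root lists every Python file over the limit, which makes it easy to compare against the file flagged here.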
API access

This page is publicly accessible at: https://repobility.com/scan/9db5f11e-57f3-477a-9bb0-dfae01c72bf5/

To check status programmatically (no auth required):

curl -s https://repobility.com/api/v1/public/scan/9db5f11e-57f3-477a-9bb0-dfae01c72bf5/

Important — please don't re-submit the same URL repeatedly. The submission endpoint is idempotent: re-submitting the same git URL returns this same scan_token, not a new one. To re-scan this repo, sign up free and use the dashboard.