Public scan: anyone with this URL can view this analysis.
119 of your 194 findings came from Repobility's proprietary detections. ✓ Repobility tags below mark them.

Scan timing: clone 16.12s · analysis 9.45s · 21.6 MB · GitHub API rate-limit (preflight)

opendatalab/MinerU

https://github.com/opendatalab/MinerU · scanned 2026-06-05 08:44 UTC (5 days, 19 hours ago) · 10 languages

442 raw signals (186 security + 256 graph) · 42nd percentile · Python · medium (20-100K LoC) · System graph score 79 (lower by 27)


Complete repo analysis

Last scanned 5 days, 19 hours ago · v2 · 159 actionable findings from 2 signal sources. 155 repeated signals grouped for readability. Security checks, system graph analysis, and verified AI-agent feedback are merged into one review queue.

Score breakdown (2026-05-18-v5)
Component            Sub-score  Weight  Contribution
structure_score      60.0       0.15     9.00
security_score       55.4       0.25    13.85
testing_score        25.0       0.20     5.00
documentation_score  69.0       0.15    10.35
practices_score      70.0       0.15    10.50
code_quality         30.9       0.10     3.09
Overall                         1.00    51.8
Scan summary: Quality grade C- (52/100). Dimensions: security 55, maintainability 60. 186 findings (36 security). 68,498 lines analyzed.

Showing 123 of 159 actionable findings. 314 raw detector signals were grouped into reader-sized issues.

high Security checks quality Quality conf 1.00 ✓ Repobility 3 occurrences Missing import: `html` used but not imported
The file uses `html.something(...)` but never imports `html`. This raises NameError at runtime the first time the line executes.
3 files, 3 locations
mineru/backend/office/mkcontent/output_builders.py:607
mineru/cli/client.py:767
mineru/model/xlsx/xlsx_converter.py:1259
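The fix for this class of finding is usually a one-line import. The sketch below is illustrative only: `render_cell` is a hypothetical helper, not code from the flagged files, and it assumes the flagged lines call something like `html.escape`.

```python
import html  # without this import, the call below raises NameError when it first runs


def render_cell(text: str) -> str:
    """Escape text before embedding it in an HTML table cell."""
    return f"<td>{html.escape(text)}</td>"


print(render_cell("a < b & c"))  # <td>a &lt; b &amp; c</td>
```

Because Python resolves names at call time, a missing import like this passes module import and only fails when the affected line executes, which is why it can survive until a rarely used code path is hit.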
high Security checks quality Quality conf 1.00 ✓ Repobility [MINED036] Python Os System Call: os.system() invokes shell with no escaping.
Review the call and fix it according to the pattern's semantics. See CWE-78 for context.
mineru/model/vlm/lmdeploy_server.py:90
low Security checks security Injection conf 0.80 [SEC005] Command Injection Risk: Unsafe shell execution or eval of user input.
Use subprocess with shell=False and a list of args. Never eval user input.
mineru/model/vlm/lmdeploy_server.py:90
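A minimal sketch of the recommended replacement, assuming the flagged call builds a command line from variable parts. `run_command` is a hypothetical helper; the actual command at lmdeploy_server.py:90 is not shown in this report.

```python
import subprocess
import sys


def run_command(args: list[str]) -> str:
    """Run a command without a shell; each list item is passed as one literal argument."""
    result = subprocess.run(
        args, capture_output=True, text=True, check=True, shell=False
    )
    return result.stdout


# A value containing shell metacharacters stays a single argument and is never interpreted.
untrusted = "model; echo injected"
out = run_command([sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted])
print(out.strip())  # model; echo injected
```

With `os.system`, the same string would be handed to the shell and the text after `;` would run as a second command. Passing a list with `shell=False` removes that interpretation step.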
high Security checks quality Quality conf 1.00 ✓ Repobility 25 occurrences `self.stop` used but never assigned in __init__
Method `start` of class `LocalAPIServer` reads `self.stop`, but no assignment to it exists in __init__ (and no class-level fallback). This raises AttributeError the first time the method runs against an instance.
4 files, 25 locations
mineru/cli/router.py:408, 447, 449, 478, 479, 597, 602, 607, +2 more (10 hits)
mineru/cli/client.py:131, 157, 181, 187, 192, 209, 210, 223 (8 hits)
mineru/utils/span_pre_proc.py:194, 208, 217, 220 (4 hits)
mineru/cli/api_client.py:544, 570, 574 (3 hits)
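The pattern behind this finding can be sketched as follows. Both classes are hypothetical and are not the repository's `LocalAPIServer`. If the flagged name is actually a method, or is always assigned before it is read, the finding does not apply.

```python
class Server:
    """Assigns every attribute that other methods read inside __init__."""

    def __init__(self) -> None:
        self.stop_requested = False

    def start(self) -> str:
        return "stopping" if self.stop_requested else "running"


class BrokenServer:
    """Reads an attribute that no code path has assigned yet."""

    def start(self) -> str:
        return "stopping" if self.stop_requested else "running"


print(Server().start())  # running
try:
    BrokenServer().start()
except AttributeError as exc:
    print(f"AttributeError: {exc}")
```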
high Security checks cicd CI/CD security conf 0.90 4 occurrences Compose service joins the host IPC namespace
Sharing host namespaces reduces isolation and can expose host processes, networking, or IPC resources.
lines 1, 29, 59, 94
docker/compose.yaml:1, 29, 59, 94 (4 hits)
CI/CD security · containers
high Security checks software dependencies conf 0.90 ✓ Repobility 11 occurrences Dockerfile FROM `crpi-vofi3w62lkohhxsp.cn-shanghai.personal.cr.aliyuncs.com/opendatalab-mineru/corex:4.4.0_torch2.7.1_vllm0.11.2_py3.10` not pinned by digest
`FROM crpi-vofi3w62lkohhxsp.cn-shanghai.personal.cr.aliyuncs.com/opendatalab-mineru/corex:4.4.0_torch2.7.1_vllm0.11.2_py3.10` resolves the tag at build time. The registry CAN re-push a different image for the same tag, so every build is potentially different. Production images should pin to `image@…
11 files, 11 locations
docker/china/Dockerfile:5
docker/china/corex.Dockerfile:2
docker/china/dcu.Dockerfile:2
docker/china/gcu.Dockerfile:2
docker/china/kxpu.Dockerfile:2
docker/china/maca.Dockerfile:3
docker/china/mlu.Dockerfile:3
docker/china/musa.Dockerfile:2
high Security checks quality Quality conf 0.80 ✓ Repobility 4 occurrences FastAPI POST (unknown path) has no auth
Handler `parse_pdf` is registered with router/app.post(...) but no Depends/Security parameter is declared and no auth marker appears in the function body.
2 files, 4 locations
mineru/cli/fast_api.py:1233, 1281 (2 hits)
mineru/cli/router.py:1468, 1529 (2 hits)
medium Security checks cicd CI/CD security conf 0.90 ✓ Repobility 4 occurrences GitHub Action is tag-pinned rather than SHA-pinned
Action `contributor-assistant/github-action` pinned to mutable ref `@v2.6.1` uses a mutable tag or branch. Pin external actions to a reviewed full commit SHA when the workflow is security-sensitive.
2 files, 4 locations
.github/workflows/cla.yml:21 (2 hits)
.github/workflows/cli.yml:29 (2 hits)
CI/CD security · Supply chain · GitHub Actions
low Security checks cicd CI/CD security conf 0.90 ✓ Repobility 20 occurrences GitHub Action is tag-pinned rather than SHA-pinned
Action `actions/checkout` pinned to mutable ref `@v6` uses a mutable tag or branch. Pin external actions to a reviewed full commit SHA when the workflow is security-sensitive.
3 files, 20 locations
.github/workflows/python-package.yml:19, 25, 67, 78, 98, 113, 124, 127 (16 hits)
.github/workflows/cli.yml:23 (2 hits)
.github/workflows/mkdocs.yml:14 (2 hits)
CI/CD security · Supply chain · GitHub Actions
high Security checks cicd CI/CD security conf 0.90 ✓ Repobility GitHub Action is tag-pinned rather than SHA-pinned
Action `mhausenblas/mkdocs-deploy-gh-pages` pinned to mutable ref `@master` uses a mutable tag or branch. Pin external actions to a reviewed full commit SHA when the workflow is security-sensitive.
.github/workflows/mkdocs.yml:18 · CI/CD security · Supply chain · GitHub Actions
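The three findings above share one rule: a `uses:` reference is treated as pinned only when its ref is a full 40-character commit SHA. A rough local check could look like the sketch below. The regular expressions and `unpinned_actions` are assumptions for illustration, not Repobility's detector, and `example/pinned-action` with its SHA is made up.

```python
import re

# Matches `uses: owner/repo@ref` (optionally with a sub-path) on a workflow line.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([\w.\-]+/[\w.\-/]+)@(\S+)", re.MULTILINE)
SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[tuple[str, str]]:
    """Return (action, ref) pairs whose ref is not a full commit SHA."""
    return [
        (action, ref)
        for action, ref in USES_RE.findall(workflow_text)
        if not SHA_RE.match(ref)
    ]


sample = """
steps:
  - uses: actions/checkout@v6
  - uses: mhausenblas/mkdocs-deploy-gh-pages@master
  - uses: example/pinned-action@0123456789abcdef0123456789abcdef01234567
"""
print(unpinned_actions(sample))
# [('actions/checkout', 'v6'), ('mhausenblas/mkdocs-deploy-gh-pages', 'master')]
```

The check is purely textual. It does not verify that a SHA exists in the action's repository or that the pinned commit has been reviewed.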
high System graph security auth conf 1.00 FastAPI POST `file_parse` without auth dependency — mineru/cli/router.py:1519
`@router.post` decorator with no `Depends(get_current_user)` or auth-shaped dependency in its signature. Mutating endpoints should require authentication unless explicitly public.
mineru/cli/router.py:1519 securityAuth fastapi unauth mutation
high System graph security auth conf 1.00 FastAPI POST `parse_pdf` without auth dependency — mineru/cli/fast_api.py:1224
`@router.post` decorator with no `Depends(get_current_user)` or auth-shaped dependency in its signature. Mutating endpoints should require authentication unless explicitly public.
mineru/cli/fast_api.py:1224 securityAuth fastapi unauth mutation
high System graph security auth conf 1.00 FastAPI POST `submit_parse_task` without auth dependency — mineru/cli/fast_api.py:1272
`@router.post` decorator with no `Depends(get_current_user)` or auth-shaped dependency in its signature. Mutating endpoints should require authentication unless explicitly public.
mineru/cli/fast_api.py:1272 securityAuth fastapi unauth mutation
high System graph security auth conf 1.00 FastAPI POST `submit_parse_task` without auth dependency — mineru/cli/router.py:1459
`@router.post` decorator with no `Depends(get_current_user)` or auth-shaped dependency in its signature. Mutating endpoints should require authentication unless explicitly public.
mineru/cli/router.py:1459 securityAuth fastapi unauth mutation
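If these endpoints are meant to be protected, the core of an auth dependency is a credential check like the one sketched here. `is_authorized` is a hypothetical helper, and the bearer-token scheme is an assumption, since the report does not say which scheme the project would use. In FastAPI such a check would typically be wired into the handler through a `Depends(...)` parameter; that wiring is left out so the sketch has no third-party dependencies.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Validate an `Authorization: Bearer <token>` header value in constant time."""
    if not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(token.encode(), expected_token.encode())


print(is_authorized("Bearer s3cret", "s3cret"))  # True
print(is_authorized("Bearer wrong", "s3cret"))   # False
print(is_authorized(None, "s3cret"))             # False
```

Endpoints that are intentionally public (for example a local-only parsing service) may not need this, in which case the finding can be marked as accepted rather than fixed.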
high System graph cicd CI/CD security conf 1.00 GitHub Action tracks a moving branch
mhausenblas/mkdocs-deploy-gh-pages@master can move without a code change in this repo. Pin third-party actions to a reviewed 40-character commit SHA.
.github/workflows/mkdocs.yml:18 · CI/CD security · Supply chain · GitHub Actions
high System graph security security conf 1.00 Insecure pattern 'eval_used' in mineru/model/layout/pp_doclayoutv2.py:911
Found a known-risky pattern (eval_used). Review and replace if possible.
mineru/model/layout/pp_doclayoutv2.py:911 Eval used
high System graph security security conf 1.00 Insecure pattern 'eval_used' in mineru/model/mfr/pp_formulanet_plus_m/predict_formula.py:57
Found a known-risky pattern (eval_used). Review and replace if possible.
mineru/model/mfr/pp_formulanet_plus_m/predict_formula.py:57 Eval used
high System graph security security conf 1.00 Insecure pattern 'eval_used' in mineru/model/mfr/unimernet/Unimernet.py:41
Found a known-risky pattern (eval_used). Review and replace if possible.
mineru/model/mfr/unimernet/Unimernet.py:41 Eval used
high System graph security security conf 1.00 Insecure pattern 'eval_used' in mineru/model/table/rec/slanet_plus/table_structure_utils.py:280
Found a known-risky pattern (eval_used). Review and replace if possible.
mineru/model/table/rec/slanet_plus/table_structure_utils.py:280 Eval used
high System graph security security conf 1.00 Insecure pattern 'eval_used' in mineru/model/utils/pytorchocr/base_ocr_v20.py:10
Found a known-risky pattern (eval_used). Review and replace if possible.
mineru/model/utils/pytorchocr/base_ocr_v20.py:10 Eval used
high System graph security security conf 1.00 Insecure pattern 'eval_used' in mineru/model/utils/pytorchocr/modeling/backbones/__init__.py:65
Found a known-risky pattern (eval_used). Review and replace if possible.
mineru/model/utils/pytorchocr/modeling/backbones/__init__.py:65 Eval used
high System graph security security conf 1.00 Insecure pattern 'eval_used' in mineru/model/utils/pytorchocr/modeling/backbones/rec_pphgnetv2.py:1078
Found a known-risky pattern (eval_used). Review and replace if possible.
mineru/model/utils/pytorchocr/modeling/backbones/rec_pphgnetv2.py:1078 Eval used
high System graph security security conf 1.00 Insecure pattern 'eval_used' in mineru/model/utils/pytorchocr/modeling/backbones/rec_svtrnet.py:220
Found a known-risky pattern (eval_used). Review and replace if possible.
mineru/model/utils/pytorchocr/modeling/backbones/rec_svtrnet.py:220 Eval used
high System graph security security conf 1.00 Insecure pattern 'eval_used' in mineru/model/utils/pytorchocr/modeling/heads/__init__.py:44
Found a known-risky pattern (eval_used). Review and replace if possible.
mineru/model/utils/pytorchocr/modeling/heads/__init__.py:44 Eval used
high System graph security security conf 1.00 Insecure pattern 'eval_used' in mineru/model/utils/pytorchocr/modeling/necks/__init__.py:28
Found a known-risky pattern (eval_used). Review and replace if possible.
mineru/model/utils/pytorchocr/modeling/necks/__init__.py:28 Eval used
high System graph security security conf 1.00 Insecure pattern 'eval_used' in mineru/model/utils/pytorchocr/postprocess/__init__.py:33
Found a known-risky pattern (eval_used). Review and replace if possible.
mineru/model/utils/pytorchocr/postprocess/__init__.py:33 Eval used
high System graph security security conf 1.00 Insecure pattern 'eval_used' in mineru/model/utils/tools/infer/predict_cls.py:36
Found a known-risky pattern (eval_used). Review and replace if possible.
mineru/model/utils/tools/infer/predict_cls.py:36 Eval used
high System graph security security conf 1.00 Insecure pattern 'eval_used' in mineru/model/utils/tools/infer/predict_det.py:121
Found a known-risky pattern (eval_used). Review and replace if possible.
mineru/model/utils/tools/infer/predict_det.py:121 Eval used
high System graph security security conf 1.00 Insecure pattern 'eval_used' in mineru/model/utils/tools/infer/predict_rec.py:96
Found a known-risky pattern (eval_used). Review and replace if possible.
mineru/model/utils/tools/infer/predict_rec.py:96 Eval used
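The report does not show the flagged lines, so the right replacement depends on what each `eval` hit actually does. One common case is `eval` turning a configured name into a class; an explicit registry is a frequent substitute for that. In the sketch below, `ResNet`, `MobileNet`, `BACKBONES`, and `build_backbone` are hypothetical names, not the repository's code.

```python
class ResNet:
    def __init__(self, layers: int = 18) -> None:
        self.layers = layers


class MobileNet:
    def __init__(self, scale: float = 1.0) -> None:
        self.scale = scale


# Explicit allow-list: only classes registered here can be constructed by name.
BACKBONES = {"ResNet": ResNet, "MobileNet": MobileNet}


def build_backbone(config: dict):
    """Build a backbone from a config dict without calling eval() on its name."""
    params = dict(config)  # copy so the caller's dict is not mutated
    name = params.pop("name")
    try:
        cls = BACKBONES[name]
    except KeyError:
        raise ValueError(f"unknown backbone: {name!r}") from None
    return cls(**params)


print(build_backbone({"name": "ResNet", "layers": 50}).layers)  # 50
```

A name that is not in the registry, including an expression such as `__import__('os')`, raises `ValueError` instead of being executed.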
medium Security checks security auth conf 0.92 [AUC001] No Repobility access matrix policy found: The repository uses web/API frameworks but does not define .repobility/access.yml or equivalent authorization documentation.
medium Security checks security auth conf 0.72 [AUC012] FastAPI interactive docs may be exposed by framework defaults: FastAPI exposes /docs, /redoc, and /openapi.json by default. Public production APIs should explicitly disable those defaults, protect them behind admin authentication, or publish a reviewed OpenAPI spec with declared security requirements.
low Security checks quality Error handling conf 1.00 [ERR001] Silent Exception Swallowing: Silently swallowing all exceptions hides bugs. Even in cleanup code, log at DEBUG level.
Log the error: `except Exception: logger.debug('cleanup failed', exc_info=True)`. Or handle specific exception types.
mineru/utils/pdf_text_tool.py:54
low Security checks quality Error handling conf 1.00 [ERR001] Silent Exception Swallowing: Silently swallowing all exceptions hides bugs. Even in cleanup code, log at DEBUG level.
Log the error: `except Exception: logger.debug('cleanup failed', exc_info=True)`. Or handle specific exception types.
mineru/utils/config_reader.py:104
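The suggested fix, narrowing the exception type and logging at DEBUG level, can be sketched as follows. `remove_quietly` is a hypothetical cleanup helper; the flagged handlers in `pdf_text_tool.py` and `config_reader.py` are not shown in this report and may guard different operations.

```python
import logging
import os

logger = logging.getLogger(__name__)


def remove_quietly(path: str) -> bool:
    """Delete a file; record a failure at DEBUG level instead of hiding it."""
    try:
        os.remove(path)
        return True
    except OSError:
        # exc_info=True attaches the traceback, so the cause is recoverable from logs.
        logger.debug("cleanup failed for %s", path, exc_info=True)
        return False


print(remove_quietly("/nonexistent/definitely/missing.tmp"))  # False
```

Catching `OSError` rather than `Exception` keeps programming errors such as `TypeError` visible while still tolerating expected filesystem failures.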
high Security checks quality Quality conf 0.72 Agent control bridge may listen on a network interface without visible auth
Agent, MCP, sidecar, and command bridge servers often start as local helpers. Binding them to 0.0.0.0 or a default all-interface listener without an authorization guard can expose tool execution or session data to the LAN.
mineru/cli/fast_api.py:13
low Security checks quality Error handling conf 0.55 ✓ Repobility 24 occurrences Broad exception handler needs review
This handler catches Exception/BaseException. It is actionable when it swallows errors without logging, re-raising, or returning a structured error. Handlers that intentionally convert exceptions into typed error results should not be treated as high risk.
12 files, 24 locations
mineru/cli/router.py:163, 618, 683, 758 (4 hits)
mineru/utils/config_reader.py:88, 92, 96, 100 (4 hits)
mineru/cli/fast_api.py:109, 121, 611 (3 hits)
mineru/cli/api_client.py:584, 711 (2 hits)
mineru/cli/visualization.py:52, 76 (2 hits)
mineru/model/xlsx/xlsx_converter.py:197, 205 (2 hits)
mineru/utils/language.py:32, 38 (2 hits)
mineru/cli/client.py:754
Error handling · quality
medium Security checks cicd CI/CD security conf 0.94 4 occurrences Compose service `mineru-openai-server` image uses the latest tag
The latest tag is mutable and can change without a code review, producing different images from the same source.
lines 1, 29, 59, 94
docker/compose.yaml:1, 29, 59, 94 (4 hits)
CI/CD security · containers
medium Security checks cicd CI/CD security conf 0.90 Docker build context has no .dockerignore
Without .dockerignore, build context can include source history, local env files, dependencies, and generated artifacts.
.dockerignore · CI/CD security · containers
high Security checks cicd CI/CD security conf 0.82 Docker final stage has no non-root USER
Docker images run as root unless the image or Dockerfile switches to a non-root user.
docker/global/Dockerfile:5 · CI/CD security · containers
high Security checks cicd CI/CD security conf 0.82 Docker final stage has no non-root USER
Docker images run as root unless the image or Dockerfile switches to a non-root user.
docker/china/Dockerfile:5 · CI/CD security · containers
medium Security checks quality Quality conf 1.00 ✓ Repobility 14 occurrences Mutable default argument in `parse_request_form` (list)
`def parse_request_form(... = []/{}/set())` — Python's default value is constructed ONCE at function definition time and shared across all calls. Mutating it in one call mutates it for every future call too.
8 files, 14 locations
mineru/model/utils/pytorchocr/modeling/backbones/rec_svtrnet.py:102, 131, 200, 268, 355, 405 (6 hits)
mineru/model/utils/pytorchocr/modeling/backbones/rec_donut_swin.py:21, 1098 (2 hits)
mineru/cli/api_request.py:54
mineru/model/mfr/unimernet/unimernet_hf/unimer_swin/configuration_unimer_swin.py:91
mineru/model/utils/pytorchocr/modeling/backbones/rec_hgnet.py:112
mineru/model/utils/pytorchocr/modeling/backbones/rec_lcnetv3.py:355
mineru/model/utils/pytorchocr/modeling/backbones/rec_pphgnetv2.py:1208
mineru/model/utils/pytorchocr/modeling/necks/rnn.py:91
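The shared-default behaviour is easiest to see side by side. `append_shared` and `append_fresh` are hypothetical functions written for this illustration. Whether the flagged functions actually mutate their defaults is not shown in this report; a default that is only read is harmless.

```python
from typing import Optional


def append_shared(item: str, bucket: list = []) -> list:
    """Buggy: the default list is created once and reused by every call."""
    bucket.append(item)
    return bucket


def append_fresh(item: str, bucket: Optional[list] = None) -> list:
    """Fixed: use None as a sentinel and build a new list per call."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first = append_shared("a")
second = append_shared("b")
print(second)           # ['a', 'b']
print(first is second)  # True

print(append_fresh("a"), append_fresh("b"))  # ['a'] ['b']
```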
medium Security checks quality Quality conf 0.78 Public web service has no security.txt
security.txt gives researchers and customers a safe disclosure channel. Public web apps and APIs should publish it under /.well-known/security.txt.
.well-known/security.txt
medium Security checks software dependencies conf 0.90 ✓ Repobility 4 occurrences requirements.txt: `mkdocs` has no version pin
Unpinned pip requirement means every fresh install may resolve a different version. Newer releases can introduce malicious code (typosquats, account compromises). Reproducible installs need exact pins.
lines 1, 2, 3, 4
docs/requirements.txt:1, 2, 3, 4 (4 hits)
medium System graph hardware Security conf 1.00 Dockerfile runs as root: docker/china/Dockerfile
No non-root USER set. Containers running as root expand the blast radius of any vulnerability inside the image.
Container
medium System graph hardware Security conf 1.00 Dockerfile runs as root: docker/global/Dockerfile
No non-root USER set. Containers running as root expand the blast radius of any vulnerability inside the image.
Container
medium System graph cicd CI/CD security conf 1.00 GitHub Actions workflow grants broad write permissions
CI tokens with write permissions increase blast radius when an action, dependency, or PR workflow is compromised. Prefer job-level least-privilege permissions.
.github/workflows/cla.yml · CI/CD security · Supply chain · GitHub Actions
medium System graph quality Integrity conf 1.00 Network/subprocess call without timeout or try/except — mineru/cli/models_download.py:19
`requests.get(...)` here lacks both a `timeout=` arg and an enclosing try/except. This is exactly the class of bug that took down our git-clone earlier (HTTP/2 stream cancel surfaced as a fatal). Add a `timeout=` and wrap in try/except, or use a wrapper that retries.
runtime safety · Robustness
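A sketch of the "timeout plus try/except plus retry" wrapper the finding recommends. It uses the standard library's `urllib.request.urlopen` instead of `requests` so that it runs without third-party packages; with `requests`, the equivalent change is passing `timeout=` to `requests.get`. `fetch_with_retry` and the stand-in `fake_opener` are hypothetical names.

```python
import io
import time
import urllib.error
import urllib.request
from typing import Any, Callable


def fetch_with_retry(
    url: str,
    timeout: float = 10.0,
    attempts: int = 3,
    opener: Callable[..., Any] = urllib.request.urlopen,
) -> bytes:
    """Fetch a URL with a per-attempt timeout and a bounded number of retries."""
    last_error = None
    for attempt in range(attempts):
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except (urllib.error.URLError, TimeoutError) as exc:
            last_error = exc
            time.sleep(0.1 * attempt)  # short linear backoff between attempts
    raise RuntimeError(f"giving up on {url} after {attempts} attempts") from last_error


# Offline demonstration: a stand-in opener that fails once, then succeeds.
calls = []


def fake_opener(url: str, timeout: float):
    calls.append(timeout)
    if len(calls) == 1:
        raise urllib.error.URLError("temporary failure")
    return io.BytesIO(b"ok")


result = fetch_with_retry("http://example.invalid", timeout=1.0, opener=fake_opener)
print(result, calls)  # b'ok' [1.0, 1.0]
```

Injecting the opener keeps the retry logic testable without network access. In production code the default `urlopen` (or a `requests` session) would be used instead.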
medium System graph security Coverage conf 1.00 No auth library detected
The scanner did not find any standard auth library (JWT, OAuth, NextAuth, Auth0, etc.). Either auth lives in custom code, in a separate service, or is missing.
auth
medium System graph quality Tests conf 1.00 Very low test-to-source ratio
3 test file(s) for 197 source file(s) (ratio 0.02). Consider adding integration or unit tests for critical paths.
Coverage
low Security checks security auth conf 0.76 [AUC005] No authorization-focused tests detected: No test files with common authorization, ownership, 403, admin, or super_admin assertions were found.
low Security checks software Race condition conf 1.00 [SEC124] TOCTOU file access (os.access then open): Check-then-use file pattern (access/exists then open) lets an attacker swap the file between check and use (symlink attack). `mktemp` is deprecated for the same reason.
Use `os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)` for atomic create-only. Use `tempfile.NamedTemporaryFile()` (not `mktemp`). For locking, use `fcntl.flock`.
mineru/data/data_reader_writer/filebase.py:59
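The atomic create-only pattern named in the recommendation can be sketched like this. `create_exclusive` is a hypothetical helper, and it assumes the flagged code checks for a file and then creates it, which this report does not confirm.

```python
import os
import tempfile


def create_exclusive(path: str, data: bytes) -> bool:
    """Create `path` atomically; return False if it already exists."""
    try:
        # O_EXCL makes the existence check and the creation a single operation.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    except FileExistsError:
        return False
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return True


with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "output.bin")
    first = create_exclusive(target, b"one")
    second = create_exclusive(target, b"two")
    with open(target, "rb") as handle:
        content = handle.read()

print(first, second, content)  # True False b'one'
```

Because there is no gap between the check and the open, another process cannot swap in a file or symlink at that path in between.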
high Security checks cicd CI/CD security conf 0.56 4 occurrences Compose service does not declare a runtime user
If the image does not define USER internally, this service may run as root.
lines 1, 29, 59, 94
docker/compose.yaml:1, 29, 59, 94 (4 hits)
CI/CD security · containers
high Security checks cicd CI/CD security conf 0.62 4 occurrences Compose service lacks no-new-privileges hardening
no-new-privileges prevents processes from gaining additional privileges through setuid binaries or file capabilities.
lines 1, 29, 59, 94
docker/compose.yaml:1, 29, 59, 94 (4 hits)
CI/CD security · containers
low Security checks cicd CI/CD security conf 0.72 Dockerfile installs recommended OS packages
Installing recommended packages often pulls in unnecessary runtime surface area.
docker/global/Dockerfile:9 · CI/CD security · containers
low Security checks cicd CI/CD security conf 0.72 Dockerfile installs recommended OS packages
Installing recommended packages often pulls in unnecessary runtime surface area.
docker/china/Dockerfile:9 · CI/CD security · containers
high Security checks cicd CI/CD security conf 0.72 Dockerfile keeps pip download cache
Pip's package cache increases image size and can preserve unnecessary artifacts.
docker/global/Dockerfile:20 · CI/CD security · containers
high Security checks cicd CI/CD security conf 0.72 Dockerfile keeps pip download cache
Pip's package cache increases image size and can preserve unnecessary artifacts.
docker/china/Dockerfile:20 · CI/CD security · containers
low Security checks quality Quality conf 0.64 Duplicate top-level symbol appears in a patch-style file
A generated replacement file defining the same public function or class name as another module can mean the new logic is not actually wired into the running code.
mineru/utils/span_block_fix.py:1
low Security checks quality Quality conf 0.60 17 occurrences Duplicated implementation block across source files
Duplicate implementation blocks are maintenance debt. Keep them visible, but they are not a high-severity defect unless the duplicated logic is security-sensitive or drifting.
12 files, 15 locations
mineru/backend/vlm/vlm_middle_json_mkcontent.py:23, 366 (2 hits)
mineru/model/utils/pytorchocr/modeling/necks/rnn.py:117, 119 (2 hits)
mineru/utils/visual_magic_model_utils.py:31, 59 (2 hits)
mineru/backend/office/xlsx_analyze.py:7
mineru/backend/pipeline/model_json_to_middle_json.py:200
mineru/backend/pipeline/pipeline_analyze.py:28
mineru/backend/vlm/model_output_to_middle_json.py:35
mineru/backend/vlm/vlm_analyze.py:349
duplication · quality
high Security checks quality Quality conf 0.62 Source file name looks like an AI patch artifact
Files named as final, fixed, copy, new, or backup are often temporary patch artifacts. They may be legitimate, but they deserve review before becoming production surface area.
mineru/utils/span_block_fix.py:1
low System graph hardware Coverage conf 1.00 Containers defined but no K8s/orchestration manifest found
Repo has Dockerfiles/compose but no Kubernetes/Nomad manifests. If the target deployment is K8s, the manifests may live in a separate ops repo.
Deployment
low System graph hardware Supply chain conf 1.00 Docker base image is tag-pinned but not digest-pinned: docker.m.daocloud.io/vllm/vllm-openai:v0.21.0
Container tags can be retagged upstream. Pin production base images to a reviewed digest (`image@sha256:...`) when reproducibility and supply-chain integrity matter.
docker/china/Dockerfile:5 · containers · Pinned dependencies
low System graph hardware Supply chain conf 1.00 Docker base image is tag-pinned but not digest-pinned: vllm/vllm-openai:v0.21.0
Container tags can be retagged upstream. Pin production base images to a reviewed digest (`image@sha256:...`) when reproducibility and supply-chain integrity matter.
docker/global/Dockerfile:5 · containers · Pinned dependencies
low System graph software Dead code candidate conf 1.00 File has no detected symbols: mineru/backend/office/office_middle_json_mkcontent.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: mineru/cli/api_protocol.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: mineru/model/docx/tools/math/latex_dict.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: mineru/model/mfr/unimernet/unimernet_hf/unimer_mbart/tokenization_unimer_mbart.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: mineru/version.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph quality Integrity conf 1.00 16 occurrences Near-duplicate function bodies in 2 places
Functions with the same first-5-line body hash: mineru/cli/api_client.py:cleanup_process_tree_descendants_by_pid, mineru/cli/api_client.py:cleanup_process_tree_descendants. This is *the* AI-coder failure mode (4× more duplication in vibe-coded repos; see https://jw.hn/ai-code-hygiene). Consolidate…
16 occurrences
repo-level (16 hits)
duplicates · duplication
low System graph quality Integrity conf 1.00 3 occurrences Near-duplicate function bodies in 3 places
Functions with the same first-5-line body hash: mineru/cli/api_client.py:stop_managed_process, mineru/cli/api_client.py:stop, mineru/cli/api_client.py:stop. This is *the* AI-coder failure mode (4× more duplication in vibe-coded repos; see https://jw.hn/ai-code-hygiene). Consolidate or document why…
3 occurrences
repo-level (3 hits)
duplicates · duplication
low System graph quality Integrity conf 1.00 Near-duplicate function bodies in 4 places
Functions with the same first-5-line body hash: mineru/cli/router.py:get_int_env, mineru/cli/router.py:get, mineru/cli/fast_api.py:get_int_env, mineru/cli/fast_api.py:get. This is *the* AI-coder failure mode (4× more duplication in vibe-coded repos; see https://jw.hn/ai-code-hygiene). Consolidate …
duplicates · duplication
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `_content_list_v2` in mineru/cli/client_side_output.py:53
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `_content_list_v2` in mineru/cli/fast_api.py:564
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `_flatten_list_items_v2` in mineru/backend/office/mkcontent/output_builders.py:104
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `content_list_v2` in mineru/cli/common.py:322
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `doclayout_v2` in mineru/backend/pipeline/batch_analyze.py:369
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `hgnet_v2` in mineru/model/layout/pp_doclayoutv2.py:122
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `img_copy` in mineru/model/table/rec/unet_table/utils.py:463
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `make_blocks_to_content_list_v2` in mineru/backend/office/office_middle_json_mkcontent.py:13
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `make_blocks_to_content_list_v2` in mineru/backend/vlm/vlm_middle_json_mkcontent.py:526
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `merge_para_with_text_v2` in mineru/backend/pipeline/pipeline_middle_json_mkcontent.py:550
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `model_copy` in mineru/utils/office_rich_text.py:106
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `pp_doclayout_v2` in mineru/backend/pipeline/model_init.py:277
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `pp_doclayout_v2` in mineru/cli/models_download.py:69
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `pp_doclayout_v2` in mineru/utils/enum_class.py:42
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `ppocr_keys_v1` in mineru/model/utils/tools/infer/pytorchocr_utility.py:92
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `reference_row_copy` in mineru/utils/table_merge.py:780
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `vllm_use_v1` in mineru/backend/vlm/utils.py:38
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph software Dead code conf 1.00 Possibly dead Python function: build_result_dict
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/fast_api.py:435
low System graph software Dead code conf 1.00 Possibly dead Python function: cleanup_process_tree_descendants
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/api_client.py:210
low System graph software Dead code conf 1.00 Possibly dead Python function: convert_to_markdown_stream
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/gradio_app.py:1734
low System graph software Dead code conf 1.00 Possibly dead Python function: create_result_zip
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/fast_api.py:490
low System graph software Dead code conf 1.00 Possibly dead Python function: detect_cid_font_signal_pypdf
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/utils/pdf_classify.py:632
low System graph software Dead code conf 1.00 Possibly dead Python function: enqueue_status
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/gradio_app.py:1132
low System graph software Dead code conf 1.00 Possibly dead Python function: get_s3_config_dict
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/utils/config_reader.py:51
low System graph software Dead code conf 1.00 Possibly dead Python function: get_u72xx_text_signal_pdfium
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/utils/pdf_classify.py:505
low System graph software Dead code conf 1.00 Possibly dead Python function: handle_task_status
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/gradio_app.py:1041
low System graph software Dead code conf 1.00 Possibly dead Python function: load_parse_inputs
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/fast_api.py:806
low System graph software Dead code conf 1.00 Possibly dead Python function: mark_failed
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/router.py:939
low System graph software Dead code conf 1.00 Possibly dead Python function: parse_request_form
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/api_request.py:54
low System graph software Dead code conf 1.00 Possibly dead Python function: reduct_overlap
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/utils/magic_model_utils.py:11
low System graph software Dead code conf 1.00 Possibly dead Python function: regenerate_client_side_outputs
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/client_side_output.py:44
low System graph software Dead code conf 1.00 Possibly dead Python function: replace_html_src
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/gradio_app.py:714
low System graph software Dead code conf 1.00 Possibly dead Python function: replace_md
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/gradio_app.py:699
low System graph software Dead code conf 1.00 Possibly dead Python function: replace_pattern
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/utils/visual_magic_model_utils.py:91
low System graph software Dead code conf 1.00 Possibly dead Python function: reset_primary_ui
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/gradio_app.py:1925
low System graph software Dead code conf 1.00 Possibly dead Python function: run_output_task
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/common.py:378
low System graph software Dead code conf 1.00 Possibly dead Python function: str_md5
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/utils/hash_utils.py:12
low System graph software Dead code conf 1.00 Possibly dead Python function: submit_parse_task_sync
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/api_client.py:861
low System graph software Dead code conf 1.00 Possibly dead Python function: submit_payload_to_upstream_sync
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/router.py:1119
low System graph software Dead code conf 1.00 Possibly dead Python function: update_doc_show
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/gradio_app.py:1429
low System graph software Dead code conf 1.00 Possibly dead Python function: update_file_options_html_for_ui
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/gradio_app.py:1943
low System graph software Dead code conf 1.00 Possibly dead Python function: update_interface
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
mineru/cli/gradio_app.py:1703
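The "no callers detected by AST scan" check behind the dead-code findings above can be approximated in a few lines. The sketch below is an assumption about how such a scan might work, not the engine's actual implementation, and `unreferenced_functions` is a hypothetical name. It works on a single source string, so it misses cross-module callers, and it reports whether each candidate carries a decorator, since decorated functions are often framework handlers registered without a direct call.

```python
import ast


def unreferenced_functions(source: str) -> list[tuple[str, bool]]:
    """Return (function_name, has_decorator) for functions that are
    defined in `source` but never referenced by name in it."""
    tree = ast.parse(source)
    defined: dict[str, bool] = {}
    referenced: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Remember whether the definition is decorated (possible handler).
            defined.setdefault(node.name, bool(node.decorator_list))
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            referenced.add(node.id)  # plain use: helper(), fn=helper
        elif isinstance(node, ast.Attribute):
            referenced.add(node.attr)  # attribute use: module.helper
    return sorted(
        (name, decorated)
        for name, decorated in defined.items()
        if name not in referenced
    )
```

Candidates with `has_decorator` set to `True` deserve extra caution: a route or callback registered through a decorator has no caller in the repository by design. Functions reached through string lookups or exported for external callers are likewise invisible to this kind of scan.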
low System graph quality Integrity conf 1.00 Stub function `process_unknow` (body is just `pass`/`return`) — mineru/model/docx/tools/math/omml.py:153
Likely an AI scaffold that was never filled in. Remove or implement.
Empty handler · Dead code
low System graph quality Complexity conf 1.00 Very large file: mineru/cli/gradio_app.py (2004 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: mineru/cli/router.py (1673 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: mineru/model/docx/docx_converter.py (3585 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: mineru/model/layout/pp_doclayoutv2.py (1548 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: mineru/model/mfr/unimernet/unimernet_hf/unimer_mbart/modeling_unimer_mbart.py (2361 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: mineru/model/pptx/pptx_converter.py (2250 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: mineru/model/utils/pytorchocr/modeling/backbones/rec_pphgnetv2.py (1624 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: mineru/model/utils/pytorchocr/modeling/heads/rec_ppformulanet_head.py (1384 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: mineru/model/utils/pytorchocr/modeling/heads/rec_unimernet_head.py (2632 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: mineru/model/xlsx/xlsx_converter.py (1612 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
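The large-file findings above can be reproduced with a simple line count. The sketch below assumes the 800-line threshold quoted in the report and counts raw lines, blank lines and comments included; whether the engine counts lines the same way is not stated, and `large_files` is a hypothetical helper.

```python
from pathlib import Path


def large_files(root: str, threshold: int = 800, pattern: str = "*.py") -> list[tuple[str, int]]:
    """Return (relative_path, line_count) for files under `root` with
    more than `threshold` lines, largest first."""
    base = Path(root)
    results = []
    for path in base.rglob(pattern):
        if not path.is_file():
            continue
        # Read in binary mode so odd encodings cannot break the count.
        with path.open("rb") as handle:
            count = sum(1 for _ in handle)
        if count > threshold:
            results.append((path.relative_to(base).as_posix(), count))
    # Largest first; ties broken by path for a stable order.
    return sorted(results, key=lambda item: (-item[1], item[0]))
```

Running `large_files("mineru")` from a checkout of the repository should list candidates comparable to the ones above, although exact counts can differ if the files changed since the scan.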
For AI agents + API integrations
API access

This page is publicly accessible at: https://repobility.com/scan/307eb78d-dd1b-4d20-a65e-ea4268f41aeb/

To check status programmatically (no auth required):

curl -s https://repobility.com/api/v1/public/scan/307eb78d-dd1b-4d20-a65e-ea4268f41aeb/
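The same status check can be made from Python with the standard library only. This is a sketch under assumptions: the helper names `status_url` and `fetch_status` are hypothetical, and the response is assumed to be JSON because the path sits under `/api/v1/`. The page does not document the response fields, so none are accessed here.

```python
import json
from urllib.request import urlopen

# Base path taken from the curl example; everything after it is the scan token.
BASE_URL = "https://repobility.com/api/v1/public/scan"


def status_url(scan_token: str) -> str:
    """Build the public status URL for a scan token."""
    return f"{BASE_URL}/{scan_token}/"


def fetch_status(scan_token: str, timeout: float = 10.0):
    """GET the status endpoint and decode the body as JSON (assumed format)."""
    with urlopen(status_url(scan_token), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (performs a network request, so it is left commented out):
# print(fetch_status("307eb78d-dd1b-4d20-a65e-ea4268f41aeb"))
```

Polling this read-only endpoint is a separate action from re-submitting the repository URL through the submission endpoint.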

Important — please don't re-submit the same URL repeatedly. The submission endpoint is idempotent: re-submitting the same git URL returns this same scan_token, not a new one. To re-scan this repo, sign up free and use the dashboard.