File as GitHub Issue repo: LiveBench/LiveBench

Push this scan report to `LiveBench/LiveBench`

Click the green button below to open GitHub’s new-issue form, pre-filled with the report title, summary table, top findings, and an embedded score-card image. No authentication needed — you review on GitHub before submitting. Repobility is credited as the scanner.

Embedded score card image

This image will render at the top of the issue body. It is hosted on Repobility and refreshes automatically after re-scans.

Issue title

tensorflow: GHSA-gf97-q72m-7579

Curate findings to include

Pick exactly which findings appear in the issue body. By default the top 5 are included. Uncheck noise, check what matters.
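
The default selection described above can be sketched roughly as follows. This is a minimal illustration, not Repobility's actual logic: the `Finding` record, the `SEVERITY_RANK` ordering, and the `default_selection` helper are all hypothetical names, and the sample rows are copied from the table on this page (truncated paths are left as shown).

```python
from dataclasses import dataclass

# Assumed ordering for illustration: a lower rank means more severe.
SEVERITY_RANK = {"CRITICAL": 0, "HIGH": 1, "MED": 2, "LOW": 3, "INFO": 4}


@dataclass(frozen=True)
class Finding:
    """Hypothetical record mirroring the table columns."""
    severity: str
    rule: str
    title: str
    location: str


def default_selection(findings, limit=5):
    """Return up to `limit` findings, most severe first.

    sorted() is stable, so findings with equal severity keep their input order.
    Unknown severities sort last.
    """
    ranked = sorted(
        findings,
        key=lambda f: SEVERITY_RANK.get(f.severity, len(SEVERITY_RANK)),
    )
    return ranked[:limit]


findings = [
    Finding("MED", "MINED111", "Bare except continues silently",
            "livebench/common.py:48"),
    Finding("HIGH", "GHSA-rcf8-g8jv-vg6p", "tensorflow: GHSA-rcf8-g8jv-vg6p",
            "livebench/code_runner/requirements_eval…"),
    Finding("INFO", "MINED064", "Python Input Call",
            "livebench/lcb_runner/evaluation/compute…:247"),
    Finding("HIGH", "PYSEC-2023-114", "scipy: PYSEC-2023-114",
            "livebench/code_runner/requirements_eval…"),
]

selected = default_selection(findings, limit=2)
```

Unchecking a finding in the form would then correspond to filtering it out of `findings` before the selection is rendered into the issue body.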

Top 5 (default)

Severity	Rule	Title	File:line
HIGH	`GHSA-rcf8-g8jv-vg6p`	tensorflow: GHSA-rcf8-g8jv-vg6p	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-qjqc-vqcf-5qvj`	tensorflow: GHSA-qjqc-vqcf-5qvj	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-j5w9-hmfh-4cr6`	tensorflow: GHSA-j5w9-hmfh-4cr6	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-gjh7-xx4r-x345`	tensorflow: GHSA-gjh7-xx4r-x345	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-gf97-q72m-7579`	tensorflow: GHSA-gf97-q72m-7579	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-f637-vh3r-vfh2`	tensorflow: GHSA-f637-vh3r-vfh2	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-f49c-87jh-g47q`	tensorflow: GHSA-f49c-87jh-g47q	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-94mm-g2mv-8p7r`	tensorflow: GHSA-94mm-g2mv-8p7r	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-93vr-9q9m-pj8p`	tensorflow: GHSA-93vr-9q9m-pj8p	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-7x4v-9gxg-9hwj`	tensorflow: GHSA-7x4v-9gxg-9hwj	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-7jvm-xxmr-v5cw`	tensorflow: GHSA-7jvm-xxmr-v5cw	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-6wfh-89q8-44jq`	tensorflow: GHSA-6wfh-89q8-44jq	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-6hg6-5c2q-7rcr`	tensorflow: GHSA-6hg6-5c2q-7rcr	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-68v3-g9cm-rmm6`	tensorflow: GHSA-68v3-g9cm-rmm6	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-64jg-wjww-7c5w`	tensorflow: GHSA-64jg-wjww-7c5w	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-647v-r7qq-24fh`	tensorflow: GHSA-647v-r7qq-24fh	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-5w96-866f-6rm8`	tensorflow: GHSA-5w96-866f-6rm8	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-558h-mq8x-7q9g`	tensorflow: GHSA-558h-mq8x-7q9g	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-49rq-hwc3-x77w`	tensorflow: GHSA-49rq-hwc3-x77w	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2023-114`	scipy: PYSEC-2023-114	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2023-102`	scipy: PYSEC-2023-102	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2024-110`	scikit-learn: PYSEC-2024-110	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-j225-cvw7-qrx7`	pycryptodome: GHSA-j225-cvw7-qrx7	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-jm6w-m3j8-898g`	nltk: GHSA-jm6w-m3j8-898g	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-469j-vmhf-r6v7`	nltk: GHSA-469j-vmhf-r6v7	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-99`	nltk: PYSEC-2026-99	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-98`	nltk: PYSEC-2026-98	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-97`	nltk: PYSEC-2026-97	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2024-167`	nltk: PYSEC-2024-167	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-87`	lxml: PYSEC-2026-87	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-hjqc-jx6g-rwp9`	keras: GHSA-hjqc-jx6g-rwp9	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-4f3f-g24h-fr8m`	keras: GHSA-4f3f-g24h-fr8m	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-36fq-jgmw-4r9c`	keras: GHSA-36fq-jgmw-4r9c	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2025-121`	keras: PYSEC-2025-121	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-62`	geopandas: PYSEC-2026-62	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-x4qr-2fvf-3mr5`	cryptography: GHSA-x4qr-2fvf-3mr5	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-r6ph-v2qm-q3c2`	cryptography: GHSA-r6ph-v2qm-q3c2	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-3ww4-gg4f-jr7f`	cryptography: GHSA-3ww4-gg4f-jr7f	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-35`	cryptography: PYSEC-2026-35	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2024-225`	cryptography: PYSEC-2024-225	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2023-254`	cryptography: PYSEC-2023-254	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2023-11`	cryptography: PYSEC-2023-11	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-2g68-c3qc-8985`	werkzeug: GHSA-2g68-c3qc-8985	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-whj4-6x5x-4v2j`	pillow: GHSA-whj4-6x5x-4v2j	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-pwv6-vv43-88gr`	pillow: GHSA-pwv6-vv43-88gr	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-cfh3-3jmp-rvhc`	pillow: GHSA-cfh3-3jmp-rvhc	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-165`	pillow: PYSEC-2026-165	`livebench/code_runner/requirements_eval…`
HIGH	`GHSA-8p8v-wh79-9r56`	django: GHSA-8p8v-wh79-9r56	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-53`	django: PYSEC-2026-53	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-52`	django: PYSEC-2026-52	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-51`	django: PYSEC-2026-51	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-49`	django: PYSEC-2026-49	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-48`	django: PYSEC-2026-48	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-47`	django: PYSEC-2026-47	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-46`	django: PYSEC-2026-46	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-45`	django: PYSEC-2026-45	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-44`	django: PYSEC-2026-44	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-43`	django: PYSEC-2026-43	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2026-42`	django: PYSEC-2026-42	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2025-47`	django: PYSEC-2025-47	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2025-37`	django: PYSEC-2025-37	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2025-13`	django: PYSEC-2025-13	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2025-109`	django: PYSEC-2025-109	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2025-107`	django: PYSEC-2025-107	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2025-106`	django: PYSEC-2025-106	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2025-105`	django: PYSEC-2025-105	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2025-104`	django: PYSEC-2025-104	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2025-1`	django: PYSEC-2025-1	`livebench/code_runner/requirements_eval…`
HIGH	`PYSEC-2024-69`	django: PYSEC-2024-69	`livebench/code_runner/requirements_eval…`
MED	`SEC136`	[SEC136] AI-typical over-broad exception handler swallowing all errors: Catch-all excepti…	`livebench/scripts/inspect_agentic_traj.…:139`
MED	`SEC136`	[SEC136] AI-typical over-broad exception handler swallowing all errors: Catch-all excepti…	`livebench/scripts/check_grading_flakine…:43`
MED	`ERR001`	[ERR001] Silent Exception Swallowing: Silently swallowing all exceptions hides bugs. Even…	`livebench/process_results/math/integral…:122`
MED	`ERR001`	[ERR001] Silent Exception Swallowing: Silently swallowing all exceptions hides bugs. Even…	`livebench/process_results/data_analysis…:15`
MED	`SEC123`	[SEC123] Production stack trace / debug output exposed: Debug mode left on in production …	`livebench/scripts/edit_questions.py:138`
MED	`SEC123`	[SEC123] Production stack trace / debug output exposed: Debug mode left on in production …	`livebench/scripts/check_grading_flakine…:111`
MED	`SEC123`	[SEC123] Production stack trace / debug output exposed: Debug mode left on in production …	`livebench/lcb_runner/evaluation/compute…:29`
MED	`SEC045`	[SEC045] eval()/exec() on stored or user-supplied data: eval() and exec() on data — even …	`livebench/code_runner/eval/__init__.py:158`
MED	`SEC127`	[SEC127] AI agent stub — TODO: implement / pass placeholder body: Function body left as T…	`livebench/agentic_code_runner/eval/harn…:24`
MED	`SEC127`	[SEC127] AI agent stub — TODO: implement / pass placeholder body: Function body left as T…	`livebench/agentic_code_runner/eval/harn…:52`
MED	`MINED109`	Mutable default argument in `compute_metrics_from_results` (list)	`livebench/lcb_runner/evaluation/pass_k_…:26`
MED	`MINED109`	Mutable default argument in `codegen_metrics` (list)	`livebench/lcb_runner/evaluation/compute…:157`
MED	`MINED111`	Bare except continues silently	`livebench/process_results/instruction_f…:53`
MED	`MINED109`	Mutable default argument in `from_reports` (list)	`livebench/agentic_code_runner/eval/harn…:303`
MED	`MINED111`	Bare except continues silently	`livebench/agentic_code_runner/minisweag…:233`
MED	`MINED111`	Bare except continues silently	`livebench/code_runner/eval/utils.py:236`
MED	`MINED111`	Bare except continues silently	`livebench/code_runner/eval/__init__.py:182`
MED	`MINED111`	Bare except continues silently	`livebench/code_runner/eval/__init__.py:346`
MED	`MINED111`	Bare except continues silently	`livebench/scripts/syntax_error_finder.py:196`
MED	`MINED111`	Bare except continues silently	`livebench/scripts/check_grading_flakine…:56`
MED	`MINED111`	Bare except continues silently	`livebench/scripts/check_grading_flakine…:45`
MED	`MINED111`	Bare except continues silently	`livebench/scripts/compare_score_tables.…:30`
MED	`MINED111`	Bare except continues silently	`livebench/scripts/replay_agent_trajecto…:373`
MED	`MINED111`	Bare except continues silently	`livebench/scripts/replay_agent_trajecto…:79`
MED	`MINED111`	Bare except continues silently	`livebench/scripts/inspect_model_answers…:365`
MED	`MINED111`	Bare except continues silently	`livebench/scripts/inspect_agentic_traj.…:141`
MED	`MINED111`	Bare except continues silently	`livebench/scripts/inspect_agentic_traj.…:181`
MED	`MINED111`	Bare except continues silently	`livebench/scripts/inspect_agentic_traj.…:144`
MED	`MINED111`	Bare except continues silently	`livebench/scripts/answer_csv_to_jsonl.py:42`
MED	`MINED111`	Bare except continues silently	`livebench/scripts/spend_report.py:22`
MED	`MINED111`	Bare except continues silently	`livebench/scripts/calc_token_offset.py:60`
MED	`MINED111`	Bare except continues silently	`livebench/scripts/edit_questions.py:184`
MED	`MINED111`	Bare except continues silently	`livebench/scripts/edit_questions.py:144`
MED	`MINED111`	Bare except continues silently	`livebench/model/completions.py:524`
MED	`MINED111`	Bare except continues silently	`livebench/model/completions.py:231`
MED	`MINED111`	Bare except continues silently	`livebench/common.py:48`
MED	`MINED111`	Bare except continues silently	`livebench/gen_ground_truth_judgment.py:206`
MED	`MINED111`	Bare except continues silently	`livebench/run_livebench.py:169`
MED	`COMP001`	[COMP001] High cognitive complexity: Function `parse_log` has cognitive complexity 18 (So…	`livebench/agentic_code_runner/eval/harn…:242`
MED	`COMP001`	[COMP001] High cognitive complexity: Function `check` has cognitive complexity 21 (SonarS…	`livebench/agentic_code_runner/eval/harn…:90`
MED	`MINED124`	requirements.txt: `immutabledict` has no version pin	`livebench/if_runner/instruction_followi…:4`
MED	`MINED124`	requirements.txt: `nltk` has no version pin	`livebench/if_runner/instruction_followi…:3`
MED	`MINED124`	requirements.txt: `langdetect` has no version pin	`livebench/if_runner/instruction_followi…:2`
MED	`MINED124`	requirements.txt: `absl-py` has no version pin	`livebench/if_runner/instruction_followi…:1`
MED	`GHSA-fxgc-95xx-grvq`	tensorflow: GHSA-fxgc-95xx-grvq	`livebench/code_runner/requirements_eval…`
MED	`GHSA-fqm2-gh8w-gr68`	tensorflow: GHSA-fqm2-gh8w-gr68	`livebench/code_runner/requirements_eval…`
MED	`GHSA-6w46-j5rx-g56g`	pytest: GHSA-6w46-j5rx-g56g	`livebench/code_runner/requirements_eval…`
MED	`GHSA-fpfv-jqm9-f5jm`	numpy: GHSA-fpfv-jqm9-f5jm	`livebench/code_runner/requirements_eval…`
MED	`GHSA-rf74-v2fm-23pw`	nltk: GHSA-rf74-v2fm-23pw	`livebench/code_runner/requirements_eval…`
MED	`GHSA-gfwx-w7gr-fvh7`	nltk: GHSA-gfwx-w7gr-fvh7	`livebench/code_runner/requirements_eval…`
MED	`GHSA-mq84-hjqx-cwf2`	keras: GHSA-mq84-hjqx-cwf2	`livebench/code_runner/requirements_eval…`
MED	`GHSA-h4gh-qq45-vh27`	cryptography: GHSA-h4gh-qq45-vh27	`livebench/code_runner/requirements_eval…`
MED	`GHSA-9v9h-cgj8-h64p`	cryptography: GHSA-9v9h-cgj8-h64p	`livebench/code_runner/requirements_eval…`
MED	`GHSA-39hc-v87j-747x`	cryptography: GHSA-39hc-v87j-747x	`livebench/code_runner/requirements_eval…`
MED	`GHSA-q34m-jh98-gwm2`	werkzeug: GHSA-q34m-jh98-gwm2	`livebench/code_runner/requirements_eval…`
MED	`GHSA-hgf8-39gv-g3f2`	werkzeug: GHSA-hgf8-39gv-g3f2	`livebench/code_runner/requirements_eval…`
MED	`GHSA-f9vj-2wh5-fj8j`	werkzeug: GHSA-f9vj-2wh5-fj8j	`livebench/code_runner/requirements_eval…`
MED	`GHSA-87hc-h4r5-73f7`	werkzeug: GHSA-87hc-h4r5-73f7	`livebench/code_runner/requirements_eval…`
MED	`GHSA-29vq-49wr-vm6x`	werkzeug: GHSA-29vq-49wr-vm6x	`livebench/code_runner/requirements_eval…`
MED	`GHSA-gc5v-m9x4-r6x2`	requests: GHSA-gc5v-m9x4-r6x2	`livebench/code_runner/requirements_eval…`
MED	`GHSA-9wx4-h78v-vm56`	requests: GHSA-9wx4-h78v-vm56	`livebench/code_runner/requirements_eval…`
MED	`GHSA-9hjg-9r4m-mvj7`	requests: GHSA-9hjg-9r4m-mvj7	`livebench/code_runner/requirements_eval…`
MED	`GHSA-r73j-pqj5-w3x7`	pillow: GHSA-r73j-pqj5-w3x7	`livebench/code_runner/requirements_eval…`
MED	`GHSA-vm8q-m57g-pff3`	django: GHSA-vm8q-m57g-pff3	`livebench/code_runner/requirements_eval…`
MED	`GHSA-rrqc-c2jx-6jgv`	django: GHSA-rrqc-c2jx-6jgv	`livebench/code_runner/requirements_eval…`
MED	`AGT015`	Remote install command pipes network code directly to a shell	`livebench/agentic_code_runner/eval/harn…:58`
MED	`AGT015`	Remote install command pipes network code directly to a shell	`livebench/agentic_code_runner/eval/harn…:52`
MED	`AGT015`	Remote install command pipes network code directly to a shell	`livebench/agentic_code_runner/eval/harn…:60`
MED	`AGT015`	Remote install command pipes network code directly to a shell	`livebench/agentic_code_runner/eval/harn…:98`
MED	`SEC005`	[SEC005] Command Injection Risk: Unsafe shell execution or eval of user input.	`livebench/code_runner/eval/utils.py:201`
MED	`SEC005`	[SEC005] Command Injection Risk: Unsafe shell execution or eval of user input.	`livebench/agentic_code_runner/minisweag…:23`
MED	`SEC005`	[SEC005] Command Injection Risk: Unsafe shell execution or eval of user input.	`livebench/agentic_code_runner/minisweag…:106`
MED	`CORE_NO_CI`	No CI/CD configuration found	—
LOW	`COMP001`	[COMP001] High cognitive complexity: Function `parse_log` has cognitive complexity 14 (So…	`livebench/agentic_code_runner/eval/harn…:241`
LOW	`GHSA-68rp-wp8r-4726`	flask: GHSA-68rp-wp8r-4726	`livebench/code_runner/requirements_eval…`
LOW	`GHSA-v8gr-m533-ghj9`	cryptography: GHSA-v8gr-m533-ghj9	`livebench/code_runner/requirements_eval…`
LOW	`GHSA-jm77-qphf-c4w8`	cryptography: GHSA-jm77-qphf-c4w8	`livebench/code_runner/requirements_eval…`
LOW	`GHSA-5cpq-8wj7-hf2v`	cryptography: GHSA-5cpq-8wj7-hf2v	`livebench/code_runner/requirements_eval…`
LOW	`GHSA-q95w-c7qg-hrff`	django: GHSA-q95w-c7qg-hrff	`livebench/code_runner/requirements_eval…`
LOW	`GHSA-mjgh-79qc-68w3`	django: GHSA-mjgh-79qc-68w3	`livebench/code_runner/requirements_eval…`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:49`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:7`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:175`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:98`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:25`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:18`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:7`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:211`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:175`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:18`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:1`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:342`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:25`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:18`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:7`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:88`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:7`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:174`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:125`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:7`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:87`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:7`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:72`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:18`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:7`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:18`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:1`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:7`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:18`
LOW	`AIC003`	Duplicated implementation block across source files	`livebench/agentic_code_runner/eval/harn…:83`
INFO	`MINED063`	[MINED063] Toctou Os Path Exists: if os.path.exists(p): open(p) — file can be replaced/de…	`livebench/scripts/spend_report.py:52`
INFO	`MINED063`	[MINED063] Toctou Os Path Exists: if os.path.exists(p): open(p) — file can be replaced/de…	`livebench/scripts/edit_questions.py:121`
INFO	`MINED063`	[MINED063] Toctou Os Path Exists: if os.path.exists(p): open(p) — file can be replaced/de…	`livebench/scripts/check_question_varian…:169`
INFO	`MINED049`	[MINED049] Print Pii: Logging password/token/email/ssn directly to stdout.	`livebench/scripts/spend_report.py:108`
INFO	`MINED049`	[MINED049] Print Pii: Logging password/token/email/ssn directly to stdout.	`livebench/scripts/rerun_failed_question…:73`
INFO	`MINED049`	[MINED049] Print Pii: Logging password/token/email/ssn directly to stdout.	`livebench/scripts/calc_token_offset.py:38`
INFO	`MINED069`	[MINED069] Debug True Prod: Django/Flask DEBUG=True or app.debug=True in non-test files.	`livebench/scripts/edit_questions.py:138`
INFO	`MINED069`	[MINED069] Debug True Prod: Django/Flask DEBUG=True or app.debug=True in non-test files.	`livebench/scripts/check_grading_flakine…:111`
INFO	`MINED069`	[MINED069] Debug True Prod: Django/Flask DEBUG=True or app.debug=True in non-test files.	`livebench/lcb_runner/evaluation/compute…:29`
INFO	`MINED064`	[MINED064] Python Input Call: input() blocks for stdin. Inappropriate in services.	`livebench/lcb_runner/evaluation/compute…:247`
INFO	`MINED072`	[MINED072] Python Pass Only Class: class Foo: pass — stub waiting to be filled in.	`livebench/code_runner/eval/utils.py:252`
INFO	`MINED077`	[MINED077] Python Open No Context: fp = open(path) outside with-block leaks file handles.	`livebench/code_runner/eval/__init__.py:265`
INFO	`MINED055`	[MINED055] Npm Install No Lockfile: Production image runs npm install (resolves new versi…	`livebench/agentic_code_runner/eval/harn…:133`
INFO	`MINED055`	[MINED055] Npm Install No Lockfile: Production image runs npm install (resolves new versi…	`livebench/agentic_code_runner/eval/harn…:127`
INFO	`MINED055`	[MINED055] Npm Install No Lockfile: Production image runs npm install (resolves new versi…	`livebench/agentic_code_runner/eval/harn…:130`
INFO	`MINED043`	[MINED043] Http Not Https: Hardcoded http:// (not localhost) for endpoints that handle cr…	`livebench/agentic_code_runner/eval/harn…:134`
INFO	`MINED043`	[MINED043] Http Not Https: Hardcoded http:// (not localhost) for endpoints that handle cr…	`livebench/agentic_code_runner/eval/harn…:212`
INFO	`MINED043`	[MINED043] Http Not Https: Hardcoded http:// (not localhost) for endpoints that handle cr…	`livebench/agentic_code_runner/eval/harn…:120`
INFO	`MINED050`	[MINED050] Stub Only Function: Function declared but body is just pass, return None, rais…	`livebench/agentic_code_runner/minisweag…:60`
INFO	`MINED050`	[MINED050] Stub Only Function: Function declared but body is just pass, return None, rais…	`livebench/agentic_code_runner/eval/harn…:25`
INFO	`MINED050`	[MINED050] Stub Only Function: Function declared but body is just pass, return None, rais…	`livebench/agentic_code_runner/eval/harn…:53`
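
Several of the Python rules in the table (MINED109, MINED111, MINED063) flag patterns that are easier to judge next to an example. The sketch below is generic and is not taken from the LiveBench source; every function name in it is hypothetical. It pairs each flagged shape with one common way to rewrite it.

```python
import json
import logging

logger = logging.getLogger(__name__)


# MINED109: a mutable default is created once and shared across calls.
def collect_shared(item, bucket=[]):  # flagged pattern
    bucket.append(item)
    return bucket


def collect_fresh(item, bucket=None):  # common fix: default to None
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


# MINED111: a bare `except:` that silently continues hides failures.
def parse_all_silent(lines):  # flagged pattern
    parsed = []
    for line in lines:
        try:
            parsed.append(json.loads(line))
        except:  # noqa: E722 (kept bare on purpose to show the pattern)
            continue
    return parsed


def parse_all_logged(lines):  # narrower handler that records each skip
    parsed = []
    for line in lines:
        try:
            parsed.append(json.loads(line))
        except json.JSONDecodeError as exc:
            logger.warning("skipping malformed line: %s", exc)
    return parsed


# MINED063: open the file directly rather than checking os.path.exists first,
# so there is no gap between the check and the use.
def read_if_present(path):
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except FileNotFoundError:
        return None
```

Whether a given finding is worth fixing depends on context: a bare `except` in a one-off script is a smaller risk than the same pattern in grading code, which is one reason to curate the list before filing.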

Reset to top 5  ·  200 findings available (after auto-suppression of test files + won't-fix)

Issue body (markdown)

## Code-quality scan: `LiveBench/LiveBench`

**Score: 64/100 (D)**  ·  295 findings  ·  scanned 2026-06-05 21:06 UTC  ·  61,565 LOC

| Severity | Count |
|---|---|
| CRITICAL | 11 |
| HIGH | 153 |
| MEDIUM | 73 |
| LOW | 37 |

📊 [Full filterable report](https://repobility.com/scan/285d8c54-1310-4654-8c87-9d14ef632d84/)  ·  ![scorecard](https://repobility.com/scan/285d8c54-1310-4654-8c87-9d14ef632d84/report.png?v=1780693564-s2)

### Top findings

1. **HIGH** `GHSA-rcf8-g8jv-vg6p` — tensorflow: GHSA-rcf8-g8jv-vg6p
   `livebench/code_runner/requirements_eval.txt`
2. **HIGH** `GHSA-qjqc-vqcf-5qvj` — tensorflow: GHSA-qjqc-vqcf-5qvj
   `livebench/code_runner/requirements_eval.txt`
3. **HIGH** `GHSA-j5w9-hmfh-4cr6` — tensorflow: GHSA-j5w9-hmfh-4cr6
   `livebench/code_runner/requirements_eval.txt`
4. **HIGH** `GHSA-gjh7-xx4r-x345` — tensorflow: GHSA-gjh7-xx4r-x345
   `livebench/code_runner/requirements_eval.txt`
5. **HIGH** `GHSA-gf97-q72m-7579` — tensorflow: GHSA-gf97-q72m-7579
   `livebench/code_runner/requirements_eval.txt`

---

_Filed automatically. Close this issue if not useful — we won't refile. Full report: https://repobility.com/scan/285d8c54-1310-4654-8c87-9d14ef632d84/_

Megaproject: high spam risk

Could not determine 'LiveBench/LiveBench' star count (GitHub API rate-limited or unreachable). When in doubt about repo size, prefer opening a focused PR or a discussion rather than an issue.

Already filed

116/306 findings (38%) on this scan are already flagged as test-file, won't-fix, or suppressed. The scan is too noisy to file as a single issue. Curate down to specific actionable findings, or address the source of the false positives first.

Open in GitHub to file this issue  ·  Filing guard overridden  ·  Cancel

The button opens GitHub’s new-issue page in a new tab. You will see the title + body pre-filled; review, edit if you want, then click GitHub’s "Submit new issue" button. Repobility never posts anything on your behalf.
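
A pre-filled form like this can be produced with GitHub's documented `title` and `body` query parameters on the `/issues/new` page. Whether Repobility builds its link exactly this way is an assumption; the sketch below only shows the general technique, and `new_issue_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode


def new_issue_url(repo: str, title: str, body: str) -> str:
    """Build a link to GitHub's new-issue form with title and body pre-filled.

    Nothing is posted: the link only opens the form, and the user still has
    to submit it on GitHub.
    """
    query = urlencode({"title": title, "body": body})
    return f"https://github.com/{repo}/issues/new?{query}"


url = new_issue_url(
    "LiveBench/LiveBench",
    "tensorflow: GHSA-gf97-q72m-7579",
    "## Code-quality scan: `LiveBench/LiveBench`\n",
)
```

Because the whole body travels in the URL, very long issue bodies may run into browser or server URL length limits, which is one practical reason to keep the curated findings list short.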

For real security findings on big repos: use the project's SECURITY.md or private advisory flow instead of a public issue.
