HIGH
GHSA-rcf8-g8jv-vg6p
tensorflow: GHSA-rcf8-g8jv-vg6p
livebench/code_runner/requirements_eval…
HIGH
GHSA-qjqc-vqcf-5qvj
tensorflow: GHSA-qjqc-vqcf-5qvj
livebench/code_runner/requirements_eval…
HIGH
GHSA-j5w9-hmfh-4cr6
tensorflow: GHSA-j5w9-hmfh-4cr6
livebench/code_runner/requirements_eval…
HIGH
GHSA-gjh7-xx4r-x345
tensorflow: GHSA-gjh7-xx4r-x345
livebench/code_runner/requirements_eval…
HIGH
GHSA-gf97-q72m-7579
tensorflow: GHSA-gf97-q72m-7579
livebench/code_runner/requirements_eval…
HIGH
GHSA-f637-vh3r-vfh2
tensorflow: GHSA-f637-vh3r-vfh2
livebench/code_runner/requirements_eval…
HIGH
GHSA-f49c-87jh-g47q
tensorflow: GHSA-f49c-87jh-g47q
livebench/code_runner/requirements_eval…
HIGH
GHSA-94mm-g2mv-8p7r
tensorflow: GHSA-94mm-g2mv-8p7r
livebench/code_runner/requirements_eval…
HIGH
GHSA-93vr-9q9m-pj8p
tensorflow: GHSA-93vr-9q9m-pj8p
livebench/code_runner/requirements_eval…
HIGH
GHSA-7x4v-9gxg-9hwj
tensorflow: GHSA-7x4v-9gxg-9hwj
livebench/code_runner/requirements_eval…
HIGH
GHSA-7jvm-xxmr-v5cw
tensorflow: GHSA-7jvm-xxmr-v5cw
livebench/code_runner/requirements_eval…
HIGH
GHSA-6wfh-89q8-44jq
tensorflow: GHSA-6wfh-89q8-44jq
livebench/code_runner/requirements_eval…
HIGH
GHSA-6hg6-5c2q-7rcr
tensorflow: GHSA-6hg6-5c2q-7rcr
livebench/code_runner/requirements_eval…
HIGH
GHSA-68v3-g9cm-rmm6
tensorflow: GHSA-68v3-g9cm-rmm6
livebench/code_runner/requirements_eval…
HIGH
GHSA-64jg-wjww-7c5w
tensorflow: GHSA-64jg-wjww-7c5w
livebench/code_runner/requirements_eval…
HIGH
GHSA-647v-r7qq-24fh
tensorflow: GHSA-647v-r7qq-24fh
livebench/code_runner/requirements_eval…
HIGH
GHSA-5w96-866f-6rm8
tensorflow: GHSA-5w96-866f-6rm8
livebench/code_runner/requirements_eval…
HIGH
GHSA-558h-mq8x-7q9g
tensorflow: GHSA-558h-mq8x-7q9g
livebench/code_runner/requirements_eval…
HIGH
GHSA-49rq-hwc3-x77w
tensorflow: GHSA-49rq-hwc3-x77w
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2023-114
scipy: PYSEC-2023-114
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2023-102
scipy: PYSEC-2023-102
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2024-110
scikit-learn: PYSEC-2024-110
livebench/code_runner/requirements_eval…
HIGH
GHSA-j225-cvw7-qrx7
pycryptodome: GHSA-j225-cvw7-qrx7
livebench/code_runner/requirements_eval…
HIGH
GHSA-jm6w-m3j8-898g
nltk: GHSA-jm6w-m3j8-898g
livebench/code_runner/requirements_eval…
HIGH
GHSA-469j-vmhf-r6v7
nltk: GHSA-469j-vmhf-r6v7
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-99
nltk: PYSEC-2026-99
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-98
nltk: PYSEC-2026-98
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-97
nltk: PYSEC-2026-97
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2024-167
nltk: PYSEC-2024-167
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-87
lxml: PYSEC-2026-87
livebench/code_runner/requirements_eval…
HIGH
GHSA-hjqc-jx6g-rwp9
keras: GHSA-hjqc-jx6g-rwp9
livebench/code_runner/requirements_eval…
HIGH
GHSA-4f3f-g24h-fr8m
keras: GHSA-4f3f-g24h-fr8m
livebench/code_runner/requirements_eval…
HIGH
GHSA-36fq-jgmw-4r9c
keras: GHSA-36fq-jgmw-4r9c
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2025-121
keras: PYSEC-2025-121
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-62
geopandas: PYSEC-2026-62
livebench/code_runner/requirements_eval…
HIGH
GHSA-x4qr-2fvf-3mr5
cryptography: GHSA-x4qr-2fvf-3mr5
livebench/code_runner/requirements_eval…
HIGH
GHSA-r6ph-v2qm-q3c2
cryptography: GHSA-r6ph-v2qm-q3c2
livebench/code_runner/requirements_eval…
HIGH
GHSA-3ww4-gg4f-jr7f
cryptography: GHSA-3ww4-gg4f-jr7f
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-35
cryptography: PYSEC-2026-35
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2024-225
cryptography: PYSEC-2024-225
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2023-254
cryptography: PYSEC-2023-254
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2023-11
cryptography: PYSEC-2023-11
livebench/code_runner/requirements_eval…
HIGH
GHSA-2g68-c3qc-8985
werkzeug: GHSA-2g68-c3qc-8985
livebench/code_runner/requirements_eval…
HIGH
GHSA-whj4-6x5x-4v2j
pillow: GHSA-whj4-6x5x-4v2j
livebench/code_runner/requirements_eval…
HIGH
GHSA-pwv6-vv43-88gr
pillow: GHSA-pwv6-vv43-88gr
livebench/code_runner/requirements_eval…
HIGH
GHSA-cfh3-3jmp-rvhc
pillow: GHSA-cfh3-3jmp-rvhc
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-165
pillow: PYSEC-2026-165
livebench/code_runner/requirements_eval…
HIGH
GHSA-8p8v-wh79-9r56
django: GHSA-8p8v-wh79-9r56
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-53
django: PYSEC-2026-53
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-52
django: PYSEC-2026-52
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-51
django: PYSEC-2026-51
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-49
django: PYSEC-2026-49
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-48
django: PYSEC-2026-48
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-47
django: PYSEC-2026-47
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-46
django: PYSEC-2026-46
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-45
django: PYSEC-2026-45
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-44
django: PYSEC-2026-44
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-43
django: PYSEC-2026-43
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2026-42
django: PYSEC-2026-42
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2025-47
django: PYSEC-2025-47
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2025-37
django: PYSEC-2025-37
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2025-13
django: PYSEC-2025-13
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2025-109
django: PYSEC-2025-109
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2025-107
django: PYSEC-2025-107
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2025-106
django: PYSEC-2025-106
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2025-105
django: PYSEC-2025-105
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2025-104
django: PYSEC-2025-104
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2025-1
django: PYSEC-2025-1
livebench/code_runner/requirements_eval…
HIGH
PYSEC-2024-69
django: PYSEC-2024-69
livebench/code_runner/requirements_eval…
MED
SEC136
[SEC136] AI-typical over-broad exception handler swallowing all errors: Catch-all excepti…
livebench/scripts/inspect_agentic_traj.…:139
MED
SEC136
[SEC136] AI-typical over-broad exception handler swallowing all errors: Catch-all excepti…
livebench/scripts/check_grading_flakine…:43
MED
ERR001
[ERR001] Silent Exception Swallowing: Silently swallowing all exceptions hides bugs. Even…
livebench/process_results/math/integral…:122
MED
ERR001
[ERR001] Silent Exception Swallowing: Silently swallowing all exceptions hides bugs. Even…
livebench/process_results/data_analysis…:15
MED
SEC123
[SEC123] Production stack trace / debug output exposed: Debug mode left on in production …
livebench/scripts/edit_questions.py:138
MED
SEC123
[SEC123] Production stack trace / debug output exposed: Debug mode left on in production …
livebench/scripts/check_grading_flakine…:111
MED
SEC123
[SEC123] Production stack trace / debug output exposed: Debug mode left on in production …
livebench/lcb_runner/evaluation/compute…:29
MED
SEC045
[SEC045] eval()/exec() on stored or user-supplied data: eval() and exec() on data — even …
livebench/code_runner/eval/__init__.py:158
MED
SEC127
[SEC127] AI agent stub — TODO: implement / pass placeholder body: Function body left as T…
livebench/agentic_code_runner/eval/harn…:24
MED
SEC127
[SEC127] AI agent stub — TODO: implement / pass placeholder body: Function body left as T…
livebench/agentic_code_runner/eval/harn…:52
MED
MINED109
Mutable default argument in `compute_metrics_from_results` (list)
livebench/lcb_runner/evaluation/pass_k_…:26
MED
MINED109
Mutable default argument in `codegen_metrics` (list)
livebench/lcb_runner/evaluation/compute…:157
MED
MINED111
Bare except continues silently
livebench/process_results/instruction_f…:53
MED
MINED109
Mutable default argument in `from_reports` (list)
livebench/agentic_code_runner/eval/harn…:303
MED
MINED111
Bare except continues silently
livebench/agentic_code_runner/minisweag…:233
MED
MINED111
Bare except continues silently
livebench/code_runner/eval/utils.py:236
MED
MINED111
Bare except continues silently
livebench/code_runner/eval/__init__.py:182
MED
MINED111
Bare except continues silently
livebench/code_runner/eval/__init__.py:346
MED
MINED111
Bare except continues silently
livebench/scripts/syntax_error_finder.py:196
MED
MINED111
Bare except continues silently
livebench/scripts/check_grading_flakine…:56
MED
MINED111
Bare except continues silently
livebench/scripts/check_grading_flakine…:45
MED
MINED111
Bare except continues silently
livebench/scripts/compare_score_tables.…:30
MED
MINED111
Bare except continues silently
livebench/scripts/replay_agent_trajecto…:373
MED
MINED111
Bare except continues silently
livebench/scripts/replay_agent_trajecto…:79
MED
MINED111
Bare except continues silently
livebench/scripts/inspect_model_answers…:365
MED
MINED111
Bare except continues silently
livebench/scripts/inspect_agentic_traj.…:141
MED
MINED111
Bare except continues silently
livebench/scripts/inspect_agentic_traj.…:181
MED
MINED111
Bare except continues silently
livebench/scripts/inspect_agentic_traj.…:144
MED
MINED111
Bare except continues silently
livebench/scripts/answer_csv_to_jsonl.py:42
MED
MINED111
Bare except continues silently
livebench/scripts/spend_report.py:22
MED
MINED111
Bare except continues silently
livebench/scripts/calc_token_offset.py:60
MED
MINED111
Bare except continues silently
livebench/scripts/edit_questions.py:184
MED
MINED111
Bare except continues silently
livebench/scripts/edit_questions.py:144
MED
MINED111
Bare except continues silently
livebench/model/completions.py:524
MED
MINED111
Bare except continues silently
livebench/model/completions.py:231
MED
MINED111
Bare except continues silently
livebench/common.py:48
MED
MINED111
Bare except continues silently
livebench/gen_ground_truth_judgment.py:206
MED
MINED111
Bare except continues silently
livebench/run_livebench.py:169
MED
COMP001
[COMP001] High cognitive complexity: Function `parse_log` has cognitive complexity 18 (So…
livebench/agentic_code_runner/eval/harn…:242
MED
COMP001
[COMP001] High cognitive complexity: Function `check` has cognitive complexity 21 (SonarS…
livebench/agentic_code_runner/eval/harn…:90
MED
MINED124
requirements.txt: `immutabledict` has no version pin
livebench/if_runner/instruction_followi…:4
MED
MINED124
requirements.txt: `nltk` has no version pin
livebench/if_runner/instruction_followi…:3
MED
MINED124
requirements.txt: `langdetect` has no version pin
livebench/if_runner/instruction_followi…:2
MED
MINED124
requirements.txt: `absl-py` has no version pin
livebench/if_runner/instruction_followi…:1
MED
GHSA-fxgc-95xx-grvq
tensorflow: GHSA-fxgc-95xx-grvq
livebench/code_runner/requirements_eval…
MED
GHSA-fqm2-gh8w-gr68
tensorflow: GHSA-fqm2-gh8w-gr68
livebench/code_runner/requirements_eval…
MED
GHSA-6w46-j5rx-g56g
pytest: GHSA-6w46-j5rx-g56g
livebench/code_runner/requirements_eval…
MED
GHSA-fpfv-jqm9-f5jm
numpy: GHSA-fpfv-jqm9-f5jm
livebench/code_runner/requirements_eval…
MED
GHSA-rf74-v2fm-23pw
nltk: GHSA-rf74-v2fm-23pw
livebench/code_runner/requirements_eval…
MED
GHSA-gfwx-w7gr-fvh7
nltk: GHSA-gfwx-w7gr-fvh7
livebench/code_runner/requirements_eval…
MED
GHSA-mq84-hjqx-cwf2
keras: GHSA-mq84-hjqx-cwf2
livebench/code_runner/requirements_eval…
MED
GHSA-h4gh-qq45-vh27
cryptography: GHSA-h4gh-qq45-vh27
livebench/code_runner/requirements_eval…
MED
GHSA-9v9h-cgj8-h64p
cryptography: GHSA-9v9h-cgj8-h64p
livebench/code_runner/requirements_eval…
MED
GHSA-39hc-v87j-747x
cryptography: GHSA-39hc-v87j-747x
livebench/code_runner/requirements_eval…
MED
GHSA-q34m-jh98-gwm2
werkzeug: GHSA-q34m-jh98-gwm2
livebench/code_runner/requirements_eval…
MED
GHSA-hgf8-39gv-g3f2
werkzeug: GHSA-hgf8-39gv-g3f2
livebench/code_runner/requirements_eval…
MED
GHSA-f9vj-2wh5-fj8j
werkzeug: GHSA-f9vj-2wh5-fj8j
livebench/code_runner/requirements_eval…
MED
GHSA-87hc-h4r5-73f7
werkzeug: GHSA-87hc-h4r5-73f7
livebench/code_runner/requirements_eval…
MED
GHSA-29vq-49wr-vm6x
werkzeug: GHSA-29vq-49wr-vm6x
livebench/code_runner/requirements_eval…
MED
GHSA-gc5v-m9x4-r6x2
requests: GHSA-gc5v-m9x4-r6x2
livebench/code_runner/requirements_eval…
MED
GHSA-9wx4-h78v-vm56
requests: GHSA-9wx4-h78v-vm56
livebench/code_runner/requirements_eval…
MED
GHSA-9hjg-9r4m-mvj7
requests: GHSA-9hjg-9r4m-mvj7
livebench/code_runner/requirements_eval…
MED
GHSA-r73j-pqj5-w3x7
pillow: GHSA-r73j-pqj5-w3x7
livebench/code_runner/requirements_eval…
MED
GHSA-vm8q-m57g-pff3
django: GHSA-vm8q-m57g-pff3
livebench/code_runner/requirements_eval…
MED
GHSA-rrqc-c2jx-6jgv
django: GHSA-rrqc-c2jx-6jgv
livebench/code_runner/requirements_eval…
MED
AGT015
Remote install command pipes network code directly to a shell
livebench/agentic_code_runner/eval/harn…:58
MED
AGT015
Remote install command pipes network code directly to a shell
livebench/agentic_code_runner/eval/harn…:52
MED
AGT015
Remote install command pipes network code directly to a shell
livebench/agentic_code_runner/eval/harn…:60
MED
AGT015
Remote install command pipes network code directly to a shell
livebench/agentic_code_runner/eval/harn…:98
MED
SEC005
[SEC005] Command Injection Risk: Unsafe shell execution or eval of user input.
livebench/code_runner/eval/utils.py:201
MED
SEC005
[SEC005] Command Injection Risk: Unsafe shell execution or eval of user input.
livebench/agentic_code_runner/minisweag…:23
MED
SEC005
[SEC005] Command Injection Risk: Unsafe shell execution or eval of user input.
livebench/agentic_code_runner/minisweag…:106
MED
CORE_NO_CI
No CI/CD configuration found
—
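The most frequent MED finding above is an exception handler that swallows errors (MINED111, ERR001, SEC136). As a rough illustration of the usual remedy, and not code taken from the scanned repository, the sketch below contrasts a bare `except` with a handler that catches only the expected exception types and logs the failure. The function names `parse_score_silent` and `parse_score` and the logger name are hypothetical.

```python
import logging

# Hypothetical logger name, chosen only for this illustration.
logger = logging.getLogger("livebench.example")


def parse_score_silent(raw):
    """Anti-pattern: the bare except hides every error, including real bugs."""
    try:
        return float(raw)
    except:  # noqa: E722 - kept only to illustrate the finding
        pass
    return None


def parse_score(raw):
    """Catch only the errors float() is expected to raise, and log them."""
    try:
        return float(raw)
    except (TypeError, ValueError) as exc:
        logger.warning("could not parse score %r: %s", raw, exc)
        return None
```

Both functions return `None` for unparseable input, but only the second leaves a trace in the logs and lets unexpected exceptions (for example `KeyboardInterrupt`) propagate.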
LOW
COMP001
[COMP001] High cognitive complexity: Function `parse_log` has cognitive complexity 14 (So…
livebench/agentic_code_runner/eval/harn…:241
LOW
GHSA-68rp-wp8r-4726
flask: GHSA-68rp-wp8r-4726
livebench/code_runner/requirements_eval…
LOW
GHSA-v8gr-m533-ghj9
cryptography: GHSA-v8gr-m533-ghj9
livebench/code_runner/requirements_eval…
LOW
GHSA-jm77-qphf-c4w8
cryptography: GHSA-jm77-qphf-c4w8
livebench/code_runner/requirements_eval…
LOW
GHSA-5cpq-8wj7-hf2v
cryptography: GHSA-5cpq-8wj7-hf2v
livebench/code_runner/requirements_eval…
LOW
GHSA-q95w-c7qg-hrff
django: GHSA-q95w-c7qg-hrff
livebench/code_runner/requirements_eval…
LOW
GHSA-mjgh-79qc-68w3
django: GHSA-mjgh-79qc-68w3
livebench/code_runner/requirements_eval…
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:49
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:7
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:175
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:98
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:25
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:18
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:7
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:211
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:175
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:18
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:1
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:342
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:25
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:18
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:7
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:88
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:7
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:174
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:125
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:7
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:87
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:7
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:72
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:18
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:7
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:18
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:1
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:7
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:18
LOW
AIC003
Duplicated implementation block across source files
livebench/agentic_code_runner/eval/harn…:83
INFO
MINED063
[MINED063] Toctou Os Path Exists: if os.path.exists(p): open(p) — file can be replaced/de…
livebench/scripts/spend_report.py:52
INFO
MINED063
[MINED063] Toctou Os Path Exists: if os.path.exists(p): open(p) — file can be replaced/de…
livebench/scripts/edit_questions.py:121
INFO
MINED063
[MINED063] Toctou Os Path Exists: if os.path.exists(p): open(p) — file can be replaced/de…
livebench/scripts/check_question_varian…:169
INFO
MINED049
[MINED049] Print Pii: Logging password/token/email/ssn directly to stdout.
livebench/scripts/spend_report.py:108
INFO
MINED049
[MINED049] Print Pii: Logging password/token/email/ssn directly to stdout.
livebench/scripts/rerun_failed_question…:73
INFO
MINED049
[MINED049] Print Pii: Logging password/token/email/ssn directly to stdout.
livebench/scripts/calc_token_offset.py:38
INFO
MINED069
[MINED069] Debug True Prod: Django/Flask DEBUG=True or app.debug=True in non-test files.
livebench/scripts/edit_questions.py:138
INFO
MINED069
[MINED069] Debug True Prod: Django/Flask DEBUG=True or app.debug=True in non-test files.
livebench/scripts/check_grading_flakine…:111
INFO
MINED069
[MINED069] Debug True Prod: Django/Flask DEBUG=True or app.debug=True in non-test files.
livebench/lcb_runner/evaluation/compute…:29
INFO
MINED064
[MINED064] Python Input Call: input() blocks for stdin. Inappropriate in services.
livebench/lcb_runner/evaluation/compute…:247
INFO
MINED072
[MINED072] Python Pass Only Class: class Foo: pass — stub waiting to be filled in.
livebench/code_runner/eval/utils.py:252
INFO
MINED077
[MINED077] Python Open No Context: fp = open(path) outside with-block leaks file handles.
livebench/code_runner/eval/__init__.py:265
INFO
MINED055
[MINED055] Npm Install No Lockfile: Production image runs npm install (resolves new versi…
livebench/agentic_code_runner/eval/harn…:133
INFO
MINED055
[MINED055] Npm Install No Lockfile: Production image runs npm install (resolves new versi…
livebench/agentic_code_runner/eval/harn…:127
INFO
MINED055
[MINED055] Npm Install No Lockfile: Production image runs npm install (resolves new versi…
livebench/agentic_code_runner/eval/harn…:130
INFO
MINED043
[MINED043] Http Not Https: Hardcoded http:// (not localhost) for endpoints that handle cr…
livebench/agentic_code_runner/eval/harn…:134
INFO
MINED043
[MINED043] Http Not Https: Hardcoded http:// (not localhost) for endpoints that handle cr…
livebench/agentic_code_runner/eval/harn…:212
INFO
MINED043
[MINED043] Http Not Https: Hardcoded http:// (not localhost) for endpoints that handle cr…
livebench/agentic_code_runner/eval/harn…:120
INFO
MINED050
[MINED050] Stub Only Function: Function declared but body is just pass, return None, rais…
livebench/agentic_code_runner/minisweag…:60
INFO
MINED050
[MINED050] Stub Only Function: Function declared but body is just pass, return None, rais…
livebench/agentic_code_runner/eval/harn…:25
INFO
MINED050
[MINED050] Stub Only Function: Function declared but body is just pass, return None, rais…
livebench/agentic_code_runner/eval/harn…:53
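The MINED063 and MINED077 entries above describe two file-handling patterns: checking `os.path.exists` before calling `open`, and calling `open` outside a `with` block. A minimal sketch of the common alternative follows, assuming a small UTF-8 text file. `read_text_or_default` is a hypothetical helper, not a function from the scanned code.

```python
def read_text_or_default(path, default=""):
    """Open first and handle the failure, instead of checking existence.

    There is no gap between a check and the open in which the file could
    be replaced or deleted, and the with-block closes the handle even if
    read() raises.
    """
    try:
        with open(path, encoding="utf-8") as fp:
            return fp.read()
    except FileNotFoundError:
        return default
```

Whether returning a default is appropriate depends on the caller. In some scripts, letting `FileNotFoundError` propagate may be the better choice.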