Public scan: anyone with this URL can view this analysis.
73 of your 97 findings came from Repobility's proprietary detections. ✓ Repobility tags below mark them.

Scan timing: clone 4.79s · analysis 19.54s · 101.0 MB · GitHub API rate-limit (preflight)

facebookincubator/Glean

https://github.com/facebookincubator/Glean · scanned 2026-06-05 16:45 UTC (4 days, 23 hours ago) · 10 languages

205 raw signals (93 security + 112 graph) · 11/13 scanners ran · 100th percentile · Cpp · medium (20-100K LoC) · System graph score 79 (higher by 3)

UNIFIED Repobility · multi-layer engine · AI coders

Complete repo analysis

Last scanned 4 days, 23 hours ago · v2 · 76 actionable findings from 2 signal sources. 48 repeated signals grouped for readability. Security checks, system graph analysis, and verified AI-agent feedback are merged into one review queue.

Score breakdown (2026-05-18-v5)
Component             Sub-score   Weight   Contribution
structure_score            40.0     0.15           6.00
security_score            100.0     0.25          25.00
testing_score              85.0     0.20          17.00
documentation_score        97.0     0.15          14.55
practices_score            77.0     0.15          11.55
code_quality               80.0     0.10           8.00
Overall                             1.00           82.1
security_score may be inflated — optional security scanners were skipped on this fast scan
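The Overall figure in the breakdown is consistent with a plain weighted sum of the sub-scores. The sketch below recomputes it from the table values; the names `COMPONENTS` and `overall_score` are illustrative, and the report does not say whether the engine applies any rounding or adjustment beyond this.

```python
# Sub-scores and weights copied from the score breakdown table.
COMPONENTS = {
    "structure_score": (40.0, 0.15),
    "security_score": (100.0, 0.25),
    "testing_score": (85.0, 0.20),
    "documentation_score": (97.0, 0.15),
    "practices_score": (77.0, 0.15),
    "code_quality": (80.0, 0.10),
}


def overall_score(components):
    """Return the weighted sum of (sub_score, weight) pairs."""
    return sum(score * weight for score, weight in components.values())


total_weight = sum(weight for _, weight in COMPONENTS.values())
print(f"total weight: {total_weight:.2f}")
print(f"overall: {overall_score(COMPONENTS):.2f}")
```

Because structure_score carries a weight of 0.15, its low value of 40 costs 9 points against a perfect contribution of 15, which is the largest single deduction in the table.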
Active filters: excluding tests
Scan summary: Quality grade A- (82/100). Dimensions: security 100, maintainability 40. 93 findings (24 security). 83,909 lines analyzed.

Showing 56 of 76 actionable findings. 124 raw detector signals were grouped into reader-sized issues. Click TP / FP to vote on a finding's accuracy — votes adjust the confidence weighting and improve detection across the platform.

critical Security checks quality Quality conf 1.00 ✓ Repobility [MINED015] Ruby Eval Call: eval() executes arbitrary code. Code injection.
Review and fix per the pattern semantics. See CWE-95 for context.
glean/shell/Glean/Shell/Types.hs:15
critical Security checks quality Quality conf 1.00 ✓ Repobility [MINED015] Ruby Eval Call: eval() executes arbitrary code. Code injection.
Review and fix per the pattern semantics. See CWE-95 for context.
glean/shell/Glean/Shell/Index.hs:42
critical Security checks quality Quality conf 1.00 ✓ Repobility [MINED024] Js Eval Usage: eval() executes arbitrary code. Code injection risk.
Review and fix per the pattern semantics. See CWE-95 for context.
glean/shell/Glean/Shell/Types.hs:15
critical Security checks quality Quality conf 1.00 ✓ Repobility [MINED024] Js Eval Usage: eval() executes arbitrary code. Code injection risk.
Review and fix per the pattern semantics. See CWE-95 for context.
glean/shell/Glean/Shell/Index.hs:42
critical Security checks quality Quality conf 1.00 ✓ Repobility [MINED025] Php Eval: eval() executes arbitrary PHP. Code injection.
Review and fix per the pattern semantics. See CWE-95 for context.
glean/shell/Glean/Shell/Types.hs:15
critical Security checks quality Quality conf 1.00 ✓ Repobility [MINED025] Php Eval: eval() executes arbitrary PHP. Code injection.
Review and fix per the pattern semantics. See CWE-95 for context.
glean/shell/Glean/Shell/Index.hs:42
high Security checks software dependencies conf 0.90 ✓ Repobility 2 occurrences [MINED118] Dockerfile FROM `ghcr.io/facebookincubator/hsthrift/ci-base:latest` not pinned by digest: `FROM ghcr.io/facebookincubator/hsthrift/ci-base:latest` resolves the tag at build time. The registry CAN re-push a different image for the same tag, so every build is potentially different. Production images should pin to `image@sha256:...` for reproducibility + supply-chain integrity.
Replace with: `FROM ghcr.io/facebookincubator/hsthrift/ci-base:latest@sha256:<digest>`. Get the digest from `docker manifest inspect`. Re-pin via a scheduled bot (Renovate, Dependabot).
Dockerfile:1, 42 (2 hits)
high Security checks software dependencies conf 0.90 ✓ Repobility [MINED119] Dockerfile `ADD https://api.github.com/repos/facebookincubator/hsthrift/compare/main...HEAD`: Dockerfile `ADD <url>` downloads a remote artifact into the image with no integrity check. If the host or DNS is compromised between layers — or if the URL serves a different file later — malicious content gets baked into the image.
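Download the file in CI with a known checksum, vendor it into the repo, and COPY it during the build. Alternatively, download and verify in a single build step, for example `RUN curl -sSLo <file> <URL> && echo '<expected>  <file>' | sha256sum -c -`, so that the build fails when the digest does not match.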
Dockerfile:9
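The "download in CI with a known checksum" step can be sketched as a small verification helper. This is a minimal sketch, not the project's tooling: `sha256_of_file` and `verify_artifact` are hypothetical names, and the expected digest is assumed to come from a reviewed, version-controlled source.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large artifacts are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, expected_sha256):
    """Raise ValueError unless the file's SHA-256 matches the expected hex digest."""
    actual = sha256_of_file(path)
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(f"checksum mismatch for {path}: got {actual}")
    return path
```

A CI job would call `verify_artifact` on the downloaded file before the `COPY` step, so a changed upstream artifact stops the build instead of being baked into the image.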
high Security checks software dependencies conf 0.90 ✓ Repobility [MINED126] Workflow container/services image `ubuntu:24.04` unpinned: `container/services image: ubuntu:24.04` without `@sha256:...` pulls a mutable tag at workflow-run time. Treat workflow container references with the same supply-chain discipline as Dockerfile FROM lines.
Replace with `ubuntu:24.04@sha256:<digest>`. Re-pin via Dependabot Docker scope.
.github/workflows/ci.yml:19
low Security checks cicd CI/CD security conf 0.90 ✓ Repobility 26 occurrences GitHub Action is tag-pinned rather than SHA-pinned
[MINED115] Action `actions/checkout` pinned to mutable ref `@v4`: `uses: actions/checkout@v4` resolves at workflow-run time. Tags and branches can be re-pushed by the action owner; that made the tj-actions/changed-files compromise (2025) instantly affect ~23K repos. Pin to a 40-char commit SHA + lo…
4 files, 26 locations
.github/workflows/ci.yml:23, 47, 69, 79, 214, 216, 221 (14 hits)
.github/workflows/ci-aarch64.yml:20, 122, 124, 129 (8 hits)
.github/workflows/gh_pages.yml:18 (2 hits)
.github/workflows/glean-docker.yml:22 (2 hits)
CI/CD security · Supply chain · GitHub Actions
medium Security checks cicd CI/CD security conf 0.90 ✓ Repobility 5 occurrences GitHub Action is tag-pinned rather than SHA-pinned
[MINED115] Action `JamesIves/github-pages-deploy-action` pinned to mutable ref `@releases/v3`: `uses: JamesIves/github-pages-deploy-action@releases/v3` resolves at workflow-run time. Tags and branches can be re-pushed by the action owner; that made the tj-actions/changed-files compromise (2025) ins…
2 files, 5 locations
.github/workflows/glean-docker.yml:26, 29, 36 (3 hits)
.github/workflows/gh_pages.yml:32 (2 hits)
CI/CD security · Supply chain · GitHub Actions
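A check of this kind can be approximated with a line-based scan for `uses:` references whose ref is not a full 40-character commit SHA. The sketch below is an assumption about how such a rule might work, not Repobility's implementation; `find_mutable_refs` is a hypothetical name.

```python
import re

# Matches "uses: owner/repo@ref", optionally preceded by a YAML list dash.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*([^\s@#]+)@([^\s#]+)")
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def find_mutable_refs(workflow_text):
    """Return (line_number, action, ref) for every ref that is not a 40-char SHA."""
    findings = []
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_RE.match(line)
        if match and not FULL_SHA_RE.match(match.group(2)):
            findings.append((lineno, match.group(1), match.group(2)))
    return findings
```

The regular expression deliberately ignores local actions such as `./path`, which carry no `@ref`. It does not handle quoted values, so a real linter would parse the YAML instead.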
medium Security checks security path traversal conf 1.00 [SEC012] ZipSlip — Archive Path Traversal: Archive extraction without path validation allows writing files outside the target directory.
Validate extracted paths with os.path.realpath() and ensure they stay within the target directory.
glean/lang/java-alpha/index_and_extract.py:40
medium Security checks security path traversal conf 1.00 [SEC012] ZipSlip — Archive Path Traversal: Archive extraction without path validation allows writing files outside the target directory.
Validate extracted paths with os.path.realpath() and ensure they stay within the target directory.
glean/lang/java-alpha/debug.py:39
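The `os.path.realpath()` validation recommended for both ZipSlip findings can be sketched as follows. The report does not show how the flagged scripts extract archives, so this assumes a zip or jar file handled with the standard `zipfile` module; `safe_extract` is a hypothetical helper name.

```python
import os
import zipfile


def safe_extract(archive_path, target_dir):
    """Extract an archive only if every member resolves inside target_dir."""
    target_root = os.path.realpath(target_dir)
    with zipfile.ZipFile(archive_path) as archive:
        # Validate all members first so nothing is written if any entry escapes.
        for member in archive.infolist():
            destination = os.path.realpath(
                os.path.join(target_root, member.filename)
            )
            if os.path.commonpath([target_root, destination]) != target_root:
                raise ValueError(f"blocked path traversal: {member.filename}")
        archive.extractall(target_root)
```

Python's `zipfile` already strips `..` components and leading slashes during extraction, so the explicit check mainly makes a hostile archive fail loudly. If the scripts shell out to an external tool instead, the same validation would need to run before that call.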
medium Security checks quality Quality conf 1.00 [SEC123] Production stack trace / debug output exposed: Debug mode left on in production exposes stack traces, environment variables, framework internals — sometimes triggers RCE (Django debug page with arbitrary template eval).
Set DEBUG=False / APP_DEBUG=false in production. Provide a generic 500 handler that logs to backend but returns a sanitized page to clients.
glean/db/Glean/Database/Env.hs:162
high Security checks cicd CI/CD security conf 0.82 Docker final stage has no non-root USER
Add a non-root USER in the final runtime stage after files and permissions are prepared.
Dockerfile:43 · CI/CD security · containers
medium Security checks cicd CI/CD security conf 0.84 Dockerfile ADD downloads remote content
Use curl/wget with a pinned URL, verify checksum or signature, and prefer COPY for local files.
Dockerfile:9 · CI/CD security · containers
medium Security checks cicd CI/CD security conf 0.94 Dockerfile base image uses the latest tag
Pin to a maintained version tag or digest and update it deliberately through dependency automation.
Dockerfile:1 · CI/CD security · containers
high Security checks software dependencies conf 0.70 Remote install command pipes network code directly to a shell
Publish a package-manager install path or add checksum/signature verification before execution. For docs, show the inspect-then-run flow and pin the downloaded artifact version.
.github/workflows/ci.yml:96
medium System graph hardware Supply chain conf 1.00 Docker base image uses a mutable or implicit tag: ghcr.io/facebookincubator/hsthrift/ci-base:latest
Container tags can be retagged upstream. Pin production base images to a reviewed digest (`image@sha256:...`) when reproducibility and supply-chain integrity matter.
Dockerfile:1 · containers · Pinned dependencies
medium System graph hardware Supply chain conf 1.00 Dockerfile ADD downloads remote content without checksum
Remote build inputs can change or be replaced upstream. Use Dockerfile ADD --checksum or download with an explicit digest/signature verification step.
Dockerfile:9 · containers · Checksum
medium System graph hardware Security conf 1.00 Dockerfile runs as root: Dockerfile
No non-root USER set. Containers running as root expand the blast radius of any vulnerability inside the image.
Container
medium System graph security security conf 1.00 Insecure pattern 'weak_hash' in glean/lang/java-alpha/sync_to_xplat.py:83
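Found a known-risky pattern (weak_hash), which usually means a hash such as MD5 or SHA-1. If the digest backs an integrity or security decision, replace it with SHA-256 or stronger; if it is only a non-security fingerprint, record that so the finding can be triaged as a false positive.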
glean/lang/java-alpha/sync_to_xplat.py:83 · Weak hash
medium System graph quality Integrity conf 1.00 Network/subprocess call without timeout or try/except — glean/lang/kotlin/indexer/scripts/build.py:18
`subprocess.check_output(...)` here lacks both a `timeout=` arg and an enclosing try/except. This is exactly the class of bug that took down our git-clone earlier (HTTP/2 stream cancel surfaced as a fatal). Add a `timeout=` and wrap in try/except, or use a wrapper that retries.
runtime safety · Robustness
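The suggested fix, a `timeout=` argument plus an enclosing try/except, might look like the wrapper below. It is a sketch under assumptions: `run_checked` is a hypothetical name, the 60-second default is arbitrary, and the actual command in `build.py` is not shown in the report.

```python
import subprocess


def run_checked(cmd, timeout_seconds=60):
    """Run a command, returning its output or raising RuntimeError with context."""
    try:
        return subprocess.check_output(
            cmd,
            timeout=timeout_seconds,
            stderr=subprocess.STDOUT,  # fold stderr into the captured output
            text=True,
        )
    except subprocess.TimeoutExpired as exc:
        raise RuntimeError(f"{cmd!r} timed out after {timeout_seconds}s") from exc
    except subprocess.CalledProcessError as exc:
        raise RuntimeError(
            f"{cmd!r} exited with {exc.returncode}: {exc.output}"
        ) from exc
    except OSError as exc:
        raise RuntimeError(f"could not start {cmd!r}: {exc}") from exc
```

Mapping the three failure modes to one exception type keeps call sites simple; a build script that needs retries could catch `RuntimeError` and loop.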
medium System graph security Coverage conf 1.00 No auth library detected
The scanner did not find any standard auth library (JWT, OAuth, NextAuth, Auth0, etc.). Either auth lives in custom code, in a separate service, or is missing.
auth
low Security checks cicd CI/CD security conf 0.72 .dockerignore misses sensitive defaults
Add missing patterns such as .env, .git, private keys, certificates, dependency folders, and local databases.
.dockerignore · CI/CD security · containers
low Security checks cicd CI/CD security conf 0.72 2 occurrences Dockerfile installs recommended OS packages
Add `--no-install-recommends` and explicitly list only packages the image needs.
Dockerfile:5, 50 (2 hits)
CI/CD security · containers
low Security checks quality Quality conf 0.60 8 occurrences Duplicated implementation block across source files
Duplicate implementation blocks are maintenance debt. Keep them visible, but they are not a high-severity defect unless the duplicated logic is security-sensitive or drifting.
8 files, 8 locations
glean/client/swift/GlassSwiftRemoteClient.cpp:11
glean/lang/java-alpha/index_and_extract.py:12
glean/lang/java-alpha/indexer/java/com/facebook/glean/descriptors/MethodDescriptor.java:37
glean/rocksdb/container-impl.h:25
glean/rocksdb/database-impl.cpp:4
glean/rocksdb/database-impl.h:13
glean/rts/lookup.h:65
glean/rts/query.h:19
duplication · quality
low System graph hardware Coverage conf 1.00 Containers defined but no K8s/orchestration manifest found
Repo has Dockerfiles/compose but no Kubernetes/Nomad manifests. If the target deployment is K8s, the manifests may live in a separate ops repo.
Deployment
low System graph hardware Supply chain conf 1.00 Docker base image is tag-pinned but not digest-pinned: ubuntu:20.04
Container tags can be retagged upstream. Pin production base images to a reviewed digest (`image@sha256:...`) when reproducibility and supply-chain integrity matter.
Dockerfile:42 · containers · Pinned dependencies
low System graph software Dead code candidate conf 1.00 File has no detected symbols: glean/client/swift/msdk/swift_glass_client.dotslash.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: glean/lang/codemarkup/tests/python-pyrefly/cases/xrefs/big_lib/relative.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: glean/lang/codemarkup/tests/python/cases/xrefs/big_lib/relative.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: glean/lang/kotlin/indexer/scripts/indexer_spec.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: glean/lang/python-pyrefly/tests/regression/without_dynamic_import/core/declarations/bar/baz.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: glean/lang/python-pyrefly/tests/regression/without_dynamic_import/core/declarations/quux/empty.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: glean/lang/python-pyrefly/tests/regression/without_dynamic_import/core/xrefs/big_lib/relative.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: glean/lang/python-pyrefly/tests/regression/without_dynamic_import/import_statements/by_name/mod.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: glean/lang/python-pyrefly/tests/regression/without_dynamic_import/import_statements/spec/current_module.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: glean/lang/python-pyrefly/tests/regression/without_dynamic_import/import_statements/spec/star.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: glean/lang/scip/indexer/glean-encode-scip2.dotslash.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: glean/tools/derive/test/cases/python/big_lib/relative.py
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: glean/website/babel.config.js
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: glean/website/docusaurus.config.js
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: glean/website/sidebars.js
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph quality Integrity conf 1.00 3 occurrences Near-duplicate function bodies in 2 places
Functions with the same first-5-line body hash: glean/lang/java-alpha/index_and_extract.py:extract_jar_to_tmp_dir, glean/lang/java-alpha/debug.py:extract_jar_to_tmp_dir. This is *the* AI-coder failure mode (4× more duplication in vibe-coded repos; see https://jw.hn/ai-code-hygiene). Consolidate or…
repo-level (3 hits)
duplicates · duplication
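A "first-5-line body hash" can be illustrated by fingerprinting the opening lines of each function body and grouping functions that share a fingerprint. The report does not describe the detector's normalisation or hash function, so whitespace stripping and SHA-256 are assumptions here, and both function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def body_fingerprint(body_source, line_count=5):
    """Hash the first non-blank lines of a function body, ignoring indentation."""
    lines = [line.strip() for line in body_source.splitlines() if line.strip()]
    head = "\n".join(lines[:line_count])
    return hashlib.sha256(head.encode("utf-8")).hexdigest()


def group_near_duplicates(function_bodies):
    """Map {qualified_name: body_source} to lists of names sharing a fingerprint."""
    groups = defaultdict(list)
    for name, body in function_bodies.items():
        groups[body_fingerprint(body)].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]
```

Hashing only the first few lines is cheap but coarse: functions that start identically and then diverge are still grouped, which is why such hits are best treated as candidates for review.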
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `helper_v1` in glean/lang/codemarkup/tests/python-pyrefly/branches_cases/branch_v1/lib.py:15
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `helper_v1` in glean/lang/codemarkup/tests/python/branches_cases/branch_v1/lib.py:15
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `helper_v1` in glean/lang/python-pyrefly/tests/regression/without_dynamic_import/core/branches/branch_v1/lib.py:15
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `helper_v2` in glean/lang/codemarkup/tests/python-pyrefly/branches_cases/branch_v2/lib.py:15
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `helper_v2` in glean/lang/codemarkup/tests/python/branches_cases/branch_v2/lib.py:15
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `helper_v2` in glean/lang/python-pyrefly/tests/regression/without_dynamic_import/core/branches/branch_v2/lib.py:15
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph quality Integrity conf 1.00 Stub function `f` (body is just `pass`/`return`) — glean/tools/derive/test/cases/python/main.py:13
Likely an AI scaffold that was never filled in. Remove or implement.
Empty handler · Dead code
low System graph quality Integrity conf 1.00 Stub function `fn` (body is just `pass`/`return`) — glean/tools/derive/test/cases/python/all.py:13
Likely an AI scaffold that was never filled in. Remove or implement.
Empty handler · Dead code
low System graph quality Integrity conf 1.00 Stub function `method` (body is just `pass`/`return`) — glean/tools/derive/test/cases/python/lib.py:14
Likely an AI scaffold that was never filled in. Remove or implement.
Empty handler · Dead code
low System graph quality Integrity conf 1.00 Stub function `notindexedwithdigests` (body is just `pass`/`return`) — glean/tools/derive/test/cases/python/nonindex.py:8
Likely an AI scaffold that was never filled in. Remove or implement.
Empty handler · Dead code
low System graph quality Complexity conf 1.00 Very large file: glean/lang/scip/indexer/scip_to_glean/cli/src/main.rs (1633 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
API access

This page is publicly accessible at: https://repobility.com/scan/9f7b24e9-5493-41b2-9242-dd59f1f83a99/

To check status programmatically (no auth required):

curl -s https://repobility.com/api/v1/public/scan/9f7b24e9-5493-41b2-9242-dd59f1f83a99/
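The same status check can be made from a script. The sketch below uses only the endpoint shown in the curl command; it assumes the response body is JSON, which the page does not state, and `scan_status_url` and `fetch_scan_status` are hypothetical helper names.

```python
import json
import urllib.request

API_BASE = "https://repobility.com/api/v1/public/scan"


def scan_status_url(scan_token):
    """Build the public status URL for a scan token."""
    return f"{API_BASE}/{scan_token}/"


def fetch_scan_status(scan_token, timeout_seconds=10):
    """GET the scan status and parse it as JSON (assumed response format)."""
    url = scan_status_url(scan_token)
    with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
        return json.load(response)
```

Calling `fetch_scan_status("9f7b24e9-5493-41b2-9242-dd59f1f83a99")` would issue the same request as the curl command; the fields in the returned object are not documented on this page, so inspect the result before relying on any key.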

Important — please don't re-submit the same URL repeatedly. The submission endpoint is idempotent: re-submitting the same git URL returns this same scan_token, not a new one. To re-scan this repo, sign up free and use the dashboard.