
Scan timing: clone 3.57s · analysis 7.89s · 29.3 MB · GitHub API rate-limit (preflight)

kreuzberg-dev/kreuzcrawl

https://github.com/kreuzberg-dev/kreuzcrawl · scanned 2026-05-31 01:25 UTC (1 week, 6 days ago) · 10 languages

383 raw signals (150 security + 233 graph) · 11/13 scanners ran · 97th percentile · Rust · large (100-500K LoC) · System graph score 85 (higher by 5)


Complete repo analysis

Last scanned 1 week, 6 days ago · v2 · last Δ +3.1 (diff) · 106 actionable findings from 2 signal sources. 174 repeated signals grouped for readability. Security checks, system graph analysis, and verified AI-agent feedback are merged into one review queue.

Score breakdown (2026-05-18-v5)
Component            Sub-score  Weight  Contribution
structure_score           60.0    0.15          9.00
security_score           100.0    0.25         25.00
testing_score            100.0    0.20         20.00
documentation_score      100.0    0.15         15.00
practices_score          100.0    0.15         15.00
code_quality              65.0    0.10          6.50
Overall                           1.00         90.5
security_score may be inflated — optional security scanners were skipped on this fast scan
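The Overall figure in the score breakdown is consistent with a weighted sum of the sub-scores. A minimal sketch of that arithmetic follows, assuming each sub-score is multiplied by its weight and the results are added. The report does not state the formula, so treat this as a reading of the table, not the engine's implementation; `COMPONENTS` and `overall_score` are illustrative names.

```python
# Sub-scores and weights copied from the score breakdown table.
COMPONENTS = {
    "structure_score": (60.0, 0.15),
    "security_score": (100.0, 0.25),
    "testing_score": (100.0, 0.20),
    "documentation_score": (100.0, 0.15),
    "practices_score": (100.0, 0.15),
    "code_quality": (65.0, 0.10),
}


def overall_score(components):
    """Weighted sum of (sub_score, weight) pairs (assumed formula)."""
    return sum(score * weight for score, weight in components.values())


def total_weight(components):
    """Sum of the weights; the table lists this as 1.00."""
    return sum(weight for _, weight in components.values())


print(round(overall_score(COMPONENTS), 2))  # 90.5
```

Under this reading, the structure and code-quality sub-scores account for the whole gap to 100: 6.0 and 3.5 points respectively.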
Active filters: excluding tests
Scan summary: quality grade A (90/100). Dimensions: security 100, maintainability 60. 150 findings (50 security). 230,665 lines analyzed.

Showing 55 of 106 actionable findings. 280 raw detector signals were grouped into reader-sized issues.

low Security checks quality Quality conf 1.00 ✓ Repobility [MINED013] Password In Url: https://user:password@host — leaks creds via logs, referrer, error messages.
Review and fix per the pattern semantics. See CWE-200 / A07:2021 for context.
crates/kreuzcrawl/src/native_browser.rs:48
low Security checks quality Quality conf 1.00 ✓ Repobility [MINED013] Password In Url: https://user:password@host — leaks creds via logs, referrer, error messages.
Review and fix per the pattern semantics. See CWE-200 / A07:2021 for context.
crates/kreuzcrawl/src/interact/native.rs:85
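One common mitigation for the two MINED013 findings above is to strip the userinfo part of a URL before it reaches logs or error messages. The sketch below uses only Python's standard `urllib.parse`; `redact_url_credentials` is a hypothetical helper, not something taken from the scanned repository, and whether the flagged lines hold real credentials or documentation examples is not known from this report.

```python
from urllib.parse import urlsplit, urlunsplit


def redact_url_credentials(url: str) -> str:
    """Return the URL with any user:password@ userinfo removed."""
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return url  # nothing to redact
    host = parts.hostname or ""  # note: hostname is lowercased by urlsplit
    if ":" in host:
        host = f"[{host}]"  # restore brackets around an IPv6 literal
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


print(redact_url_credentials("https://user:password@host/path?q=1"))
# https://host/path?q=1
```

Redacting at the logging boundary keeps the original URL usable for the request itself while keeping the secret out of log output.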
low Security checks cicd CI/CD security conf 0.35 ✓ Repobility Workflow references repository secrets in a pull_request workflow
Fork pull_request runs do not receive normal repository secrets on GitHub Actions. Review this as a reliability/intent signal, not as direct fork-secret exfiltration. Raise severity only for pull_request_target or another trusted-context path that runs untrusted PR code with secrets.
.github/workflows/coverage.yaml:81 · CI/CD security · workflow secrets · GitHub Actions
high Security checks security auth conf 0.70 [AUC003] Object-level route lacks visible authorization: A route with an object id-like parameter does not show nearby authentication or authorization evidence. This is a BOLA/IDOR review target. Endpoint: ANY /v1/batch/scrape/{id}.
Add ownership, tenant, relationship, or policy checks before reading or mutating the target object.
crates/kreuzcrawl/src/api/router.rs:59
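The ownership check recommended for AUC003 can be illustrated independently of any web framework. The sketch below is language-neutral in spirit (the scanned router is Rust) and every name in it (`BatchJob`, `load_job_for_caller`, `Forbidden`, `NotFound`) is hypothetical; it only shows the order of operations: look the object up, then compare its owner with the authenticated caller before returning it.

```python
from dataclasses import dataclass


class NotFound(Exception):
    """The requested object does not exist."""


class Forbidden(Exception):
    """The caller is authenticated but does not own the object."""


@dataclass(frozen=True)
class BatchJob:
    id: str
    owner_id: str


def load_job_for_caller(jobs: dict, job_id: str, caller_id: str) -> BatchJob:
    """Return the job only if it exists and belongs to the caller."""
    job = jobs.get(job_id)
    if job is None:
        raise NotFound(job_id)
    if job.owner_id != caller_id:
        # Object-level check: a valid id is not enough, ownership must match.
        raise Forbidden(job_id)
    return job
```

Some APIs return the not-found error in both cases so that callers cannot probe which ids exist; that is a design choice the sketch leaves open.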
low Security checks quality Quality conf 1.00 ✓ Repobility [MINED012] Curl Pipe Bash: curl ... | sh / bash — runs unverified network code.
Review and fix per the pattern semantics. See CWE-494 / A08:2021 for context.
scripts/ci/wasm/install-wasm-pack.sh:14
high Security checks software dependencies conf 0.90 ✓ Repobility 9 occurrences [MINED118] Dockerfile FROM `rust:1.91-bookworm` not pinned by digest: `FROM rust:1.91-bookworm` resolves the tag at build time. The registry can re-push a different image for the same tag, so every build is potentially different. Production images should pin to `image@sha256:...` for reproducibility and supply-chain integrity.
Replace with: `FROM rust:1.91-bookworm@sha256:<digest>`. Get the digest from `docker manifest inspect`. Re-pin via a scheduled bot (Renovate, Dependabot).
6 files, 9 locations
docker/Dockerfile:4, 42 (2 hits)
docker/Dockerfile.alpine:12, 52 (2 hits)
docker/Dockerfile.cli:4, 30 (2 hits)
docker/Dockerfile.musl-build:10
docker/Dockerfile.musl-ffi:10
docker/Dockerfile.musl-nif:10
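A check like MINED118 can be approximated locally with a few lines of Python. The sketch below scans Dockerfile text for `FROM` lines whose image reference lacks an `@sha256:` digest, skipping `scratch` and references to earlier build stages. It is an illustration of the rule as described above, not the scanner's actual logic, and `unpinned_base_images` is a hypothetical name.

```python
import re

# Matches "FROM [--platform=...] <image> [AS <stage>]" and captures the image.
FROM_RE = re.compile(r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)", re.IGNORECASE)
ALIAS_RE = re.compile(r"\sAS\s+(\S+)", re.IGNORECASE)


def unpinned_base_images(dockerfile_text: str):
    """Return (line_number, image) pairs for FROM lines without a digest."""
    findings = []
    stage_names = set()
    for lineno, line in enumerate(dockerfile_text.splitlines(), start=1):
        match = FROM_RE.match(line)
        if not match:
            continue
        image = match.group(1)
        is_stage = image in stage_names  # e.g. "FROM builder"
        if image.lower() != "scratch" and not is_stage and "@sha256:" not in image:
            findings.append((lineno, image))
        alias = ALIAS_RE.search(line)
        if alias:
            stage_names.add(alias.group(1))
    return findings
```

Running it over each file under `docker/` would give a quick list to compare against the locations reported above.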
high Security checks software dependencies conf 0.90 ✓ Repobility [MINED122] package.json dep `@kreuzberg/kreuzcrawl-wasm` pulled from URL/Git: `devDependencies.@kreuzberg/kreuzcrawl-wasm` = `file:../../crates/kreuzcrawl-wasm/pkg/nodejs` bypasses the npm registry. No integrity hash, no version locking, no registry-side scanning. If the URL or git host is compromised, every `npm install` pulls the new payload.
Publish the dependency to npm (or your private registry) and reference it by `^x.y.z`. If that's not possible, lock by commit SHA: `git+https://...#<full-sha>` AND verify the SHA in CI.
e2e/wasm/package.json:1
high Security checks software dependencies conf 0.90 ✓ Repobility [MINED126] Workflow container/services image `kreuzcrawl-test:latest` unpinned: `container/services image: kreuzcrawl-test:latest` without `@sha256:...` pulls a mutable tag at workflow-run time. Treat workflow container references with the same supply-chain discipline as Dockerfile FROM lines.
Replace with `kreuzcrawl-test:latest@sha256:<digest>`. Re-pin via Dependabot Docker scope.
.github/workflows/publish-docker.yaml:166
high Security checks software dependencies conf 0.90 ✓ Repobility 2 occurrences [MINED131] pre-commit hook `https://github.com/Goldziher/gitfluff` pinned to mutable rev `v0.8.0`: `.pre-commit-config.yaml` references `https://github.com/Goldziher/gitfluff` at `rev: v0.8.0`. If the rev is a branch or version tag, the repo owner can push new code there and `pre-commit install --install-hooks` will fetch it on every developer's machine.
Pin to a commit SHA: `rev: <40-char-sha>`. Bump it with `pre-commit autoupdate --freeze`, which writes commit hashes instead of tag names, and land the change through a reviewed PR.
.pre-commit-config.yaml:8, 16 (2 hits)
high Security checks cicd CI/CD security conf 0.92 4 occurrences Dockerfile pipes a remote script into a shell
Download the artifact, verify its checksum or signature, pin the version, and then execute it.
4 files, 4 locations
docker/Dockerfile.alpine:23
docker/Dockerfile.musl-build:21
docker/Dockerfile.musl-ffi:21
docker/Dockerfile.musl-nif:21
CI/CD security · containers
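The "download, verify, then execute" advice above comes down to comparing a digest of the downloaded bytes with a value pinned in the repository before anything runs. A minimal sketch under that assumption, using only `hashlib` and `hmac` from the standard library; `verify_sha256` and `run_if_verified` are hypothetical helpers and the `runner` callback stands in for whatever would execute the installer.

```python
import hashlib
import hmac


def verify_sha256(payload: bytes, expected_hex: str) -> bool:
    """True if the SHA-256 of payload equals the pinned hex digest."""
    actual = hashlib.sha256(payload).hexdigest()
    return hmac.compare_digest(actual, expected_hex.lower())


def run_if_verified(payload: bytes, expected_hex: str, runner):
    """Call runner(payload) only after the checksum matches."""
    if not verify_sha256(payload, expected_hex):
        raise ValueError("checksum mismatch: refusing to execute download")
    return runner(payload)
```

The pinned digest has to come from a trusted source, such as a value committed alongside the Dockerfile; fetching it from the same server as the script would defeat the check.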
medium Security checks security auth conf 0.92 [AUC001] No Repobility access matrix policy found: The repository uses web/API frameworks but does not define .repobility/access.yml or equivalent authorization documentation.
Add .repobility/access.yml mapping routes to anonymous, authenticated, owner, admin, and super_admin. Keep business-specific rules in the repo so CI can enforce them.
high Security checks security auth conf 0.74 [AUC002] Low visible authorization coverage in route inventory: Only 33.3% of discovered routes show nearby authentication, authorization, middleware, or public-route evidence.
Review the access matrix and add explicit framework auth declarations or policy-file exceptions for intentionally public routes.
high Security checks security auth conf 0.68 [AUC009] Sensitive function route lacks elevated authorization evidence: A route appears to perform a sensitive function such as export, invite, role, token, billing, or destructive action without elevated policy evidence. Endpoint: ANY /v1/batch/scrape.
Require an explicit admin, maintainer, super_admin, or scoped service role in code and .repobility/access.yml.
crates/kreuzcrawl/src/api/router.rs:58
high Security checks security auth conf 0.68 [AUC009] Sensitive function route lacks elevated authorization evidence: A route appears to perform a sensitive function such as export, invite, role, token, billing, or destructive action without elevated policy evidence. Endpoint: ANY /v1/batch/scrape/{id}.
Require an explicit admin, maintainer, super_admin, or scoped service role in code and .repobility/access.yml.
crates/kreuzcrawl/src/api/router.rs:59
high Security checks security auth conf 0.68 [AUC009] Sensitive function route lacks elevated authorization evidence: A route appears to perform a sensitive function such as export, invite, role, token, billing, or destructive action without elevated policy evidence. Endpoint: ANY /v1/crawl.
Require an explicit admin, maintainer, super_admin, or scoped service role in code and .repobility/access.yml.
crates/kreuzcrawl/src/api/router.rs:52
high Security checks security auth conf 0.68 [AUC009] Sensitive function route lacks elevated authorization evidence: A route appears to perform a sensitive function such as export, invite, role, token, billing, or destructive action without elevated policy evidence. Endpoint: ANY /v1/download.
Require an explicit admin, maintainer, super_admin, or scoped service role in code and .repobility/access.yml.
crates/kreuzcrawl/src/api/router.rs:60
high Security checks security auth conf 0.68 [AUC009] Sensitive function route lacks elevated authorization evidence: A route appears to perform a sensitive function such as export, invite, role, token, billing, or destructive action without elevated policy evidence. Endpoint: ANY /v1/map.
Require an explicit admin, maintainer, super_admin, or scoped service role in code and .repobility/access.yml.
crates/kreuzcrawl/src/api/router.rs:57
high Security checks security auth conf 0.68 [AUC009] Sensitive function route lacks elevated authorization evidence: A route appears to perform a sensitive function such as export, invite, role, token, billing, or destructive action without elevated policy evidence. Endpoint: ANY /v1/scrape.
Require an explicit admin, maintainer, super_admin, or scoped service role in code and .repobility/access.yml.
crates/kreuzcrawl/src/api/router.rs:51
high Security checks quality Quality conf 0.72 Agent control bridge may listen on a network interface without visible auth
Bind local agent bridges to 127.0.0.1 by default. If remote access is required, require a bearer token or mTLS, enforce origin/CSRF checks for browser clients, and document the threat model.
fixtures/stealth/stealth_ua_rotation_config.json:16
low Security checks quality Error handling conf 0.55 ✓ Repobility Broad exception handler needs review
This handler catches Exception/BaseException. It is actionable when it swallows errors without logging, re-raising, or returning a structured error. Handlers that intentionally convert exceptions into typed error results should not be treated as high risk.
scripts/ci/ruby/vendor-kreuzcrawl-core.py:456 · Error handling · quality
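The distinction drawn above, between a handler that swallows everything and one that converts a specific failure into a structured result, can be sketched as follows. This is not the code at line 456 of the flagged script, which the report does not show; `LoadResult` and `load_manifest` are hypothetical names chosen for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class LoadResult:
    data: Optional[dict]
    error: Optional[str]


def load_manifest(text: str) -> LoadResult:
    """Parse JSON text, turning only the expected failure into a typed error."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        # Narrow handler: report the parse error instead of hiding it.
        return LoadResult(None, f"invalid JSON: {exc.msg} at line {exc.lineno}")
    if not isinstance(data, dict):
        return LoadResult(None, "manifest must be a JSON object")
    return LoadResult(data, None)
```

Catching `json.JSONDecodeError` instead of `Exception` lets unexpected errors, such as a `KeyboardInterrupt` or a programming mistake, surface normally.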
high Security checks software dependencies conf 0.70 Remote install command pipes network code directly to a shell
Publish a package-manager install path or add checksum/signature verification before execution. For docs, show the inspect-then-run flow and pin the downloaded artifact version.
scripts/ci/wasm/install-wasm-pack.sh:14
medium System graph quality Integrity conf 1.00 `fetch()` without try/.catch or AbortSignal — crates/kreuzcrawl-browser/js/bootstrap.js:3850
Bare `fetch(...)` will throw an unhandled rejection on network failure. Wrap in try/catch, attach a `.catch(...)`, or pass an AbortSignal with a timeout.
runtime safety · Robustness
medium System graph quality Integrity conf 1.00 `fetch()` without try/.catch or AbortSignal — e2e/node/globalSetup.ts:15
Bare `fetch(...)` will throw an unhandled rejection on network failure. Wrap in try/catch, attach a `.catch(...)`, or pass an AbortSignal with a timeout.
runtime safety · Robustness
medium System graph quality Integrity conf 1.00 `fetch()` without try/.catch or AbortSignal — test_apps/node/globalSetup.ts:15
Bare `fetch(...)` will throw an unhandled rejection on network failure. Wrap in try/catch, attach a `.catch(...)`, or pass an AbortSignal with a timeout.
runtime safety · Robustness
medium System graph cicd CI/CD security conf 1.00 92 occurrences GitHub Action is tag-pinned rather than SHA-pinned
kreuzberg-dev/actions/free-disk-space-linux@v1 can move without a code change in this repo. Pin third-party actions to a reviewed 40-character commit SHA.
11 files, 92 locations
.github/workflows/ci-e2e.yaml:70, 76, 79, 84, 179, 225, 229, 236, +9 more (32 hits)
.github/workflows/publish.yaml:104, 111, 115, 137, 152, 167, 182, 197, +10 more (18 hits)
.github/workflows/ci-lint.yaml:26, 31, 38, 52, 55, 62, 75, 78, +1 more (9 hits)
.github/workflows/ci-rust.yaml:57, 63, 69, 72, 99, 105, 111, 114, +1 more (9 hits)
.github/workflows/publish-docker.yaml:64, 111, 149, 152, 164, 176, 185, 194 (8 hits)
.github/workflows/coverage.yaml:50, 56, 62, 65, 68, 74 (6 hits)
.github/workflows/ci-docker.yaml:30, 33, 36, 54 (4 hits)
.github/workflows/ci-docs.yaml:55, 80 (2 hits)
CI/CD security · Supply chain · GitHub Actions
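To triage the tag-pinned action references locally, a small script can list every `uses:` line whose ref is not a 40-character commit SHA. The sketch below is a line-based approximation that avoids a YAML parser; it is not how the scanner works, and `tag_pinned_actions` is a hypothetical name. Local actions (`./...`) and `docker://` references are skipped.

```python
import re

# Captures the reference after "uses:", with or without a leading "- ".
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^\s'\"#]+)")
SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def tag_pinned_actions(workflow_text: str):
    """Return (line_number, reference) pairs not pinned to a full commit SHA."""
    findings = []
    for lineno, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_RE.match(line)
        if not match:
            continue
        ref = match.group(1)
        if ref.startswith("./") or ref.startswith("docker://"):
            continue  # local or container references are out of scope here
        _, _, version = ref.partition("@")
        if not SHA_RE.match(version):
            findings.append((lineno, ref))
    return findings
```

Applied to each file under `.github/workflows/`, the output can be compared with the locations listed in the finding above.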
medium System graph cicd CI/CD security conf 1.00 4 occurrences GitHub Actions workflow grants broad write permissions
CI tokens with write permissions increase blast radius when an action, dependency, or PR workflow is compromised. Prefer job-level least-privilege permissions.
4 files, 4 locations
.github/workflows/ci-docs.yaml
.github/workflows/publish-docker.yaml
.github/workflows/publish-pubdev.yaml
.github/workflows/publish.yaml
CI/CD security · Supply chain · GitHub Actions
low Security checks cicd CI/CD security conf 0.72 .dockerignore misses sensitive defaults
Add missing patterns such as .env, .git, private keys, certificates, dependency folders, and local databases.
.dockerignore · CI/CD security · containers
low Security checks quality Quality conf 0.60 30 occurrences Duplicated implementation block across source files
Duplicate implementation blocks are maintenance debt. Keep them visible, but they are not a high-severity defect unless the duplicated logic is security-sensitive or drifting.
12 files, 14 locations
crates/kreuzcrawl/src/interact/chromiumoxide.rs:108, 372 (2 hits)
packages/elixir/lib/kreuzcrawl/batch_scrape_result.ex:4, 11 (2 hits)
crates/kreuzcrawl/src/tower/service.rs:114
crates/kreuzcrawl/src/tower/tracing_layer.rs:29
packages/csharp/Kreuzcrawl/CrawlEvent.cs:25
packages/csharp/Kreuzcrawl/PageAction.cs:41
packages/csharp/Kreuzcrawl/ScrapeResult.cs:12
packages/elixir/lib/kreuzcrawl/batch_crawl_result.ex:11
duplication · quality
low System graph hardware Coverage conf 1.00 Containers defined but no K8s/orchestration manifest found
Repo has Dockerfiles/compose but no Kubernetes/Nomad manifests. If the target deployment is K8s, the manifests may live in a separate ops repo.
Deployment
low System graph hardware Supply chain conf 1.00 Docker base image is tag-pinned but not digest-pinned: debian:bookworm-slim
Container tags can be retagged upstream. Pin production base images to a reviewed digest (`image@sha256:...`) when reproducibility and supply-chain integrity matter.
docker/Dockerfile:42 · containers · Pinned dependencies
low System graph hardware Supply chain conf 1.00 Docker base image is tag-pinned but not digest-pinned: rust:1.91-bookworm
Container tags can be retagged upstream. Pin production base images to a reviewed digest (`image@sha256:...`) when reproducibility and supply-chain integrity matter.
docker/Dockerfile:4 · containers · Pinned dependencies
low System graph software Dead code candidate conf 1.00 File has no detected symbols: e2e/node/vitest.config.ts
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: e2e/wasm/vitest.config.ts
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: fixtures/responses/assets/script.js
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: test_apps/node/vitest.config.ts
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph software Dead code candidate conf 1.00 File has no detected symbols: test_apps/wasm/vitest.config.ts
Source file with no class/function declarations — possible config, dead code, or scratch file.
low System graph cicd CI/CD security conf 1.00 26 occurrences GitHub Action is tag-pinned rather than SHA-pinned
actions/upload-artifact@v7 can move without a code change in this repo. Pin third-party actions to a reviewed 40-character commit SHA.
7 files, 26 locations
.github/workflows/ci-e2e.yaml:65, 101, 174, 184, 240, 254, 261 (12 hits)
.github/workflows/ci-docs.yaml:65, 100, 106, 112 (4 hits)
.github/workflows/ci-lint.yaml:41, 46, 65 (3 hits)
.github/workflows/coverage.yaml:45, 85 (2 hits)
.github/workflows/publish-docker.yaml:60, 74 (2 hits)
.github/workflows/publish.yaml:96, 130 (2 hits)
.github/workflows/publish-pubdev.yaml:32
CI/CD security · Supply chain · GitHub Actions
low System graph security security conf 1.00 Insecure pattern 'document_write' in crates/kreuzcrawl-browser/src/js/runtime.rs:1488
Found a known-risky pattern (document_write). Review and replace if possible.
crates/kreuzcrawl-browser/src/js/runtime.rs:1488 Document write
low System graph quality Integrity conf 1.00 Old/deprecated-named symbol `crates_to_copy` in scripts/ci/ruby/vendor-kreuzcrawl-core.py:300
Names with suffixes like `_old`, `_v1`, `_deprecated` usually indicate replaced-but-not-removed code (typical AI-coder leftover). Confirm and delete, or rename if it's the active version.
old marker · Dead code
low System graph software Dead code conf 1.00 Possibly dead Python function: replace_with_fields
No callers detected by AST scan in this repo. Could be exported for external callers or a framework handler.
scripts/ci/ruby/vendor-kreuzcrawl-core.py:126
low System graph frontend Frontend quality conf 1.00 Stray `console.log` in TS/JS — fixtures/responses/assets/script.js:2
Replace with the toast helper, an error boundary, or remove. `console.warn` / `console.error` are acceptable. Why: Hygiene — easy to leak debug output. Rule id: fq.console-leak
Fq console leak
low System graph quality Complexity conf 1.00 Very large file: crates/kreuzcrawl-browser/js/bootstrap.js (5631 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: crates/kreuzcrawl-browser/src/js/runtime.rs (1868 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: crates/kreuzcrawl-ffi/src/lib.rs (9382 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: crates/kreuzcrawl-node/src/lib.rs (3751 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: crates/kreuzcrawl-php/src/lib.rs (5213 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: crates/kreuzcrawl-py/src/lib.rs (4707 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: crates/kreuzcrawl-wasm/src/lib.rs (6548 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: crates/kreuzcrawl/src/engine/mod.rs (1253 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: packages/dart/rust/src/frb_generated.rs (5795 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: packages/dart/rust/src/lib.rs (2349 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: packages/elixir/native/kreuzcrawl_nif/src/lib.rs (3278 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: packages/go/binding.go (1952 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: packages/ruby/ext/kreuzcrawl_rb/src/lib.rs (7533 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
low System graph quality Complexity conf 1.00 Very large file: packages/swift/rust/src/lib.rs (5053 lines)
Files with >800 lines often hide complexity hotspots and discourage tests.
API access

This page is publicly accessible at: https://repobility.com/scan/9547b571-0e8d-4259-aeef-b8d8016b44e9/

To check status programmatically (no auth required):

curl -s https://repobility.com/api/v1/public/scan/9547b571-0e8d-4259-aeef-b8d8016b44e9/
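The same status check can be done from a script. The sketch below wraps the URL shown in the curl command using Python's standard library. It assumes the endpoint returns a JSON body, which this page does not document, so the parsed result is returned as-is without relying on any field names; `scan_status_url` and `fetch_scan_status` are illustrative helpers.

```python
import json
import urllib.request

# Base path taken from the curl command above.
API_BASE = "https://repobility.com/api/v1/public/scan"


def scan_status_url(scan_token: str) -> str:
    """Build the public status URL for a scan token."""
    return f"{API_BASE}/{scan_token}/"


def fetch_scan_status(scan_token: str, timeout: float = 10.0):
    """GET the status endpoint and parse the body as JSON (assumed format)."""
    with urllib.request.urlopen(scan_status_url(scan_token), timeout=timeout) as resp:
        return json.load(resp)
```

This only reads the status of an existing scan; it does not submit the repository again.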

Important — please don't re-submit the same URL repeatedly. The submission endpoint is idempotent: re-submitting the same git URL returns this same scan_token, not a new one. To re-scan this repo, sign up free and use the dashboard.