Public scan — anyone with this URL can view this analysis. Sign up to track your own repos privately, run scheduled re-scans, and get AI fix prompts via your dashboard.

lightseekorg/tokenspeed

https://github.com/lightseekorg/tokenspeed · scanned 2026-05-16 22:31 UTC (19 hours, 52 minutes ago) · 10 languages

746 findings (51 legacy + 695 scanner) 60th percentile · Python · large (100-500K LoC) Scanner says 70 (higher by 3)

UNIFIED Repobility · multi-layer engine · AI coders

Complete repo analysis

Last scanned 19 hours, 52 minutes ago · v1 · 746 findings from 2 sources. Findings combine the legacy security pipeline AND the multi-layer engine (atlas, wiring, flows, ranked) AND verified AI agent contributions.

JSON
Severity distribution — click a segment to filter
Active filters: source: legacy × excluding tests × Reset all
Scan summary Repository scanned at 69.6/100 with 88.9% coverage. It contains 6761 nodes across 12 cross-layer flows, written primarily in mixed languages. Engine surfaced 695 findings — concentrated in quality (586), software (45), cicd (43). Risk profile is high: 0 critical, 7 high, 13 medium. Recommended next step: open the quality layer findings first — that's where the highest-impact wins live.

Showing 51 of 746 findings. Click TP / FP to vote on a finding's accuracy — votes adjust the confidence weighting and improve detection across the platform.

high Legacy software ssrf conf 1.00 [SEC029] Server-Side Request Forgery (SSRF) — outbound HTTP from user input: Outbound HTTP request to a user-controlled URL without allowlist validation. Attackers can probe internal services (169.254.169.254 metadata, internal Kubernetes endpoints, file:// URIs), exfiltrate data, or pivot through your network. SSRF is OWASP A10:2021 and a frequent foothold in cloud breaches.
Validate the URL against an allowlist BEFORE fetching: ALLOWED = {'images.example.com', 'cdn.example.com'} host = urlparse(url).hostname if host not in ALLOWED: abort(400) Or use a server-side proxy (Imgproxy / serve-files-only-from-S3) that isolates outbound network access from the request h…
python/tokenspeed/runtime/pd/mini_lb.py:60 ssrflegacy
high Legacy software ssrf conf 1.00 [SEC029] Server-Side Request Forgery (SSRF) — outbound HTTP from user input: Outbound HTTP request to a user-controlled URL without allowlist validation. Attackers can probe internal services (169.254.169.254 metadata, internal Kubernetes endpoints, file:// URIs), exfiltrate data, or pivot through your network. SSRF is OWASP A10:2021 and a frequent foothold in cloud breaches.
Validate the URL against an allowlist BEFORE fetching: ALLOWED = {'images.example.com', 'cdn.example.com'} host = urlparse(url).hostname if host not in ALLOWED: abort(400) Or use a server-side proxy (Imgproxy / serve-files-only-from-S3) that isolates outbound network access from the request h…
python/tokenspeed/runtime/cache/storage/mooncake_store/mooncake_store.py:288 ssrflegacy
high Legacy software ssrf conf 1.00 [SEC029] Server-Side Request Forgery (SSRF) — outbound HTTP from user input: Outbound HTTP request to a user-controlled URL without allowlist validation. Attackers can probe internal services (169.254.169.254 metadata, internal Kubernetes endpoints, file:// URIs), exfiltrate data, or pivot through your network. SSRF is OWASP A10:2021 and a frequent foothold in cloud breaches.
Validate the URL against an allowlist BEFORE fetching: ALLOWED = {'images.example.com', 'cdn.example.com'} host = urlparse(url).hostname if host not in ALLOWED: abort(400) Or use a server-side proxy (Imgproxy / serve-files-only-from-S3) that isolates outbound network access from the request h…
python/tokenspeed/bench.py:285 ssrflegacy
high Legacy cicd docker conf 0.95 Docker final stage runs as root
The final runtime stage explicitly uses root. A compromised app process would have root inside the container.
docker/Dockerfile:4 dockerlegacy
high Legacy quality error_handling conf 1.00 [ERR001] Silent Exception Swallowing: Silently swallowing all exceptions hides bugs. Even in cleanup code, log at DEBUG level.
Log the error: `except Exception: logger.debug('cleanup failed', exc_info=True)`. Or handle specific exception types.
python/tokenspeed/runtime/models/deepseek_v4.py:2808 error_handlinglegacy
high Legacy quality error_handling conf 1.00 [ERR001] Silent Exception Swallowing: Silently swallowing all exceptions hides bugs. Even in cleanup code, log at DEBUG level.
Log the error: `except Exception: logger.debug('cleanup failed', exc_info=True)`. Or handle specific exception types.
python/tokenspeed/runtime/layers/deepseek_v4_mhc.py:62 error_handlinglegacy
high Legacy quality error_handling conf 1.00 [ERR001] Silent Exception Swallowing: Silently swallowing all exceptions hides bugs. Even in cleanup code, log at DEBUG level.
Log the error: `except Exception: logger.debug('cleanup failed', exc_info=True)`. Or handle specific exception types.
python/tokenspeed/_logging.py:109 error_handlinglegacy
medium Legacy security deserialization conf 1.00 [SEC007] Unsafe Deserialization: Unsafe deserialization can execute arbitrary code.
Use yaml.safe_load() instead of yaml.load(). Avoid pickle for untrusted data.
python/tokenspeed/runtime/utils/common.py:418 deserializationlegacy
medium Legacy security deserialization conf 1.00 [SEC007] Unsafe Deserialization: Unsafe deserialization can execute arbitrary code.
Use yaml.safe_load() instead of yaml.load(). Avoid pickle for untrusted data.
python/tokenspeed/runtime/distributed/utils.py:132 deserializationlegacy
medium Legacy security deserialization conf 1.00 [SEC011] Unsafe PyTorch Model Loading: torch.load() uses pickle internally and can execute arbitrary code from untrusted model files.
Use torch.load(..., weights_only=True) or use safetensors format.
python/tokenspeed/runtime/model_loader/weight_utils.py:333 deserializationlegacy
medium Legacy security crypto conf 1.00 [SEC015] Insecure Randomness for Security: Weak PRNG used in security-sensitive context. Output is predictable.
Use secrets module (Python) or crypto.getRandomValues() (JS) for security-sensitive randomness.
python/tokenspeed/bench.py:742 cryptolegacy
medium Legacy quality quality conf 0.72 Agent control bridge may listen on a network interface without visible auth
Agent, MCP, sidecar, and command bridge servers often start as local helpers. Binding them to 0.0.0.0 or a default all-interface listener without an authorization guard can expose tool execution or session data to the LAN.
python/tokenspeed/cli/serve_smg.py:13 qualitylegacy
high Legacy cicd docker conf 0.76 Dockerfile copies broad context with incomplete .dockerignore
COPY . or ADD . is safer when .dockerignore excludes secrets, git history, keys, and generated artifacts.
docker/Dockerfile:11 dockerlegacy
low Legacy cicd docker conf 0.72 .dockerignore misses sensitive defaults
.dockerignore exists but does not cover common secret or VCS patterns.
.dockerignore dockerlegacy
low Legacy cicd docker conf 0.72 Dockerfile keeps pip download cache
Pip's package cache increases image size and can preserve unnecessary artifacts.
docker/Dockerfile:16 dockerlegacy
low Legacy quality quality conf 0.64 Duplicate top-level symbol appears in a patch-style file
A generated replacement file defining the same public function or class name as another module can mean the new logic is not actually wired into the running code.
python/tokenspeed/runtime/models/deepseek_v3.py:1 qualitylegacy
low Legacy quality quality conf 0.64 Duplicate top-level symbol appears in a patch-style file
A generated replacement file defining the same public function or class name as another module can mean the new logic is not actually wired into the running code.
python/tokenspeed/runtime/models/deepseek_v4.py:1 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/models/qwen2.py:355 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/models/minimax_m2.py:660 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/models/minimax_m2.py:565 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/models/minimax_m2.py:530 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/models/llama_eagle3.py:118 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/models/llama_eagle3.py:47 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/moe/backends/w8a8_fp8/triton.py:44 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/moe/backends/unquantized/triton.py:28 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/moe/backends/unquantized/flashinfer_trtllm.py:47 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/moe/backends/nvfp4/flashinfer_cutlass.py:33 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/moe/backends/fp8/triton.py:23 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/dense/w8a8_fp8.py:86 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/attention/linear/wy_fast.py:26 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/attention/linear/wy_fast.py:25 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/attention/linear/solve_tril.py:17 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/attention/linear/cumsum.py:31 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/attention/kv_cache/mla.py:31 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/attention/configs/mla.py:19 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/attention/backends/trtllm_mla.py:182 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/attention/backends/trtllm_mla.py:65 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/attention/backends/trtllm.py:379 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/attention/backends/triton.py:711 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/attention/backends/triton.py:650 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/layers/attention/backends/tokenspeed_mla.py:204 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/engine/scheduler_control_client.py:59 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/engine/request.py:212 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/distributed/comm_backend/trtllm_allreduce.py:114 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/distributed/comm_backend/triton_allreduce.py:53 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/configs/qwen3_config.py:118 qualitylegacy
low Legacy quality quality conf 0.86 Duplicated implementation block across source files
Duplicated blocks are a common artifact when generated code is pasted or recreated instead of reused. They increase maintenance cost because every future bug fix must be found in multiple locations.
python/tokenspeed/runtime/configs/qwen3_config.py:46 qualitylegacy
info Legacy quality quality conf 0.62 Source file name looks like an AI patch artifact
Files named as final, fixed, copy, new, or backup are often temporary patch artifacts. They may be legitimate, but they deserve review before becoming production surface area.
tokenspeed-kernel/python/tokenspeed_kernel/ops/attention/triton/deepseek_v4.py:1 qualitylegacy
info Legacy quality quality conf 0.62 Source file name looks like an AI patch artifact
Files named as final, fixed, copy, new, or backup are often temporary patch artifacts. They may be legitimate, but they deserve review before becoming production surface area.
python/tokenspeed/runtime/models/deepseek_v3.py:1 qualitylegacy
info Legacy quality quality conf 0.62 Source file name looks like an AI patch artifact
Files named as final, fixed, copy, new, or backup are often temporary patch artifacts. They may be legitimate, but they deserve review before becoming production surface area.
python/tokenspeed/runtime/layers/attention/kv_cache/deepseek_v4.py:1 qualitylegacy
info Legacy quality quality conf 0.62 Source file name looks like an AI patch artifact
Files named as final, fixed, copy, new, or backup are often temporary patch artifacts. They may be legitimate, but they deserve review before becoming production surface area.
python/tokenspeed/runtime/layers/attention/backends/deepseek_v4.py:1 qualitylegacy
{# ── 2026-05-17 Round 14: AI-agent bridge footer ────────────────────── Discoverability: the /agents/voting/ guide + MCP manifest exist but aren't linked from anywhere users actually land. Small, opt-in footer. #}
For AI agents: Voting guide (TP/FP) MCP manifest Stdio wrapper SARIF Integrate Findings queue Vote TP/FP on findings to calibrate the engine.
For AI agents + API integrations
Email me when this repo regresses
Free. We re-scan periodically; new criticals → your inbox. No signup required for the scan itself.
API access

This page is publicly accessible at: https://repobility.com/scan/36c59998-1773-42d3-8cb0-456402726f35/

To check status programmatically (no auth required):

curl -s https://repobility.com/api/v1/public/scan/36c59998-1773-42d3-8cb0-456402726f35/

Important — please don't re-submit the same URL repeatedly. The submission endpoint is idempotent: re-submitting the same git URL returns this same scan_token, not a new one. To re-scan this repo, sign up free and use the dashboard.