What is this?
VulnDB surfaces the open-source components with the most recent known-vulnerability activity across three major package ecosystems: Maven (Java), PyPI (Python), and npm (JavaScript). The goal is to help engineering teams make informed adoption decisions: not to discourage use of open source, but to highlight components that warrant closer scrutiny or an upgrade before use.
Data is refreshed automatically every Monday at 06:00 UTC and deployed to this site within minutes of the pipeline completing.
Ecosystems
| Dashboard | Ecosystem | Package format | Example |
|---|---|---|---|
| Maven | Java / JVM | groupId:artifactId | org.apache.logging.log4j:log4j-core |
| PyPI | Python | package name | requests, django |
| npm | JavaScript / Node.js | package name | lodash, express |
Data Source
All vulnerability data is sourced from the Open Source Vulnerabilities (OSV) database, maintained by Google. OSV aggregates vulnerability data from multiple authoritative sources including:
- NVD: NIST National Vulnerability Database (the U.S. government repository that enriches CVE records with CVSS scores)
- GitHub Advisory Database: security advisories published by GitHub
- Package maintainers: direct disclosures via the OSV schema
The pipeline downloads a complete bulk export per ecosystem: a single ZIP file containing every known vulnerability record for that ecosystem. No API key is required.
| Ecosystem | OSV bulk URL |
|---|---|
| Maven | osv-vulnerabilities.storage.googleapis.com/Maven/all.zip |
| PyPI | osv-vulnerabilities.storage.googleapis.com/PyPI/all.zip |
| npm | osv-vulnerabilities.storage.googleapis.com/npm/all.zip |
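The download and unpack step can be sketched with the standard library alone. This is a minimal illustration, not the pipeline's actual code: the function names are hypothetical, the `https://` scheme is assumed (the table above lists the hosts without one), and the archive is assumed to hold one `.json` file per record.

```python
import io
import json
import urllib.request
import zipfile

# URL pattern taken from the table above; the https:// scheme is an assumption.
OSV_BULK_URL = "https://osv-vulnerabilities.storage.googleapis.com/{ecosystem}/all.zip"


def download_bulk_zip(ecosystem: str) -> bytes:
    """Fetch the bulk export for one ecosystem (e.g. "PyPI") into memory."""
    url = OSV_BULK_URL.format(ecosystem=ecosystem)
    with urllib.request.urlopen(url, timeout=60) as response:
        return response.read()


def iter_records(zip_bytes: bytes):
    """Yield each JSON record stored in the archive as a dict."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            # Skip anything that is not a JSON record.
            if name.endswith(".json"):
                yield json.loads(archive.read(name))
```

Calling `iter_records(download_bulk_zip("npm"))` would stream the npm records without writing the archive to disk.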
How the Pipeline Works
The same six-stage process runs for each ecosystem in sequence:
- Download: The OSV bulk ZIP for the ecosystem is downloaded into memory.
- Parse: Each JSON record is parsed to extract the CVE ID, CVSS score, severity, published date, affected versions, and fixed version.
- Score: CVSS scores are extracted from three sources in priority order: database_specific.cvss_score, the CVSS vector string (parsed using the cvss library), or the severity label as a fallback.
- Aggregate: Records are grouped by package identity. For Maven this is groupId:artifactId; for PyPI and npm it is the flat package name.
- Generate: Two CSVs and a metadata JSON are written to /tmp/{ecosystem}/.
- Upload: All outputs are uploaded to S3 under an ecosystem-specific prefix (maven/, pypi/, npm/). A single CloudFront cache invalidation is issued after all ecosystems complete.
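The priority order in the Score stage can be sketched as a chain of fallbacks. This is an illustration under stated assumptions: `extract_score` and `LABEL_FALLBACK` are hypothetical names, the numeric values assigned to severity labels are assumed (the lower bound of each CVSS band), and vector parsing is passed in as a callable rather than calling the cvss library directly.

```python
from typing import Callable, Optional

# Assumption: a label-only record is assigned the lower bound of its CVSS band.
LABEL_FALLBACK = {"CRITICAL": 9.0, "HIGH": 7.0, "MEDIUM": 4.0, "MODERATE": 4.0, "LOW": 0.1}


def extract_score(
    record: dict,
    parse_vector: Optional[Callable[[str], float]] = None,
) -> Optional[float]:
    """Return a CVSS score using the three sources in priority order."""
    db = record.get("database_specific") or {}

    # 1. Explicit numeric score, if present and parseable.
    raw = db.get("cvss_score")
    if raw is not None:
        try:
            return float(raw)
        except (TypeError, ValueError):
            pass

    # 2. CVSS vector string, parsed by the supplied callable
    #    (a stand-in for a wrapper around the cvss library).
    if parse_vector is not None:
        for entry in record.get("severity") or []:
            vector = entry.get("score")
            if vector:
                try:
                    return parse_vector(vector)
                except Exception:
                    continue

    # 3. Severity label as a last resort; None if nothing usable was found.
    label = str(db.get("severity", "")).upper()
    return LABEL_FALLBACK.get(label)
```

Injecting the parser keeps the fallback logic testable without the third-party dependency.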
The pipeline runs as an AWS Lambda function (container image, 3 GB RAM, 15-minute timeout) triggered by an EventBridge schedule. The Lambda image is built and deployed automatically via GitHub Actions whenever the pipeline code changes.
Risk Score
Raw CVE count is a poor adoption signal because it heavily favours old, widely used components that have simply accumulated history. A component with 300 CVEs published over 15 years is very different from one with 20 CVEs published in the last 12 months.
The Risk Score is designed to capture current risk: each CVE is weighted by severity as shown below, counting only CVEs published in the last 24 months.
| Severity | CVSS Range | Weight | Rationale |
|---|---|---|---|
| CRITICAL | 9.0 - 10.0 | 4 | Remote code execution, authentication bypass; immediate action required |
| HIGH | 7.0 - 8.9 | 2 | Significant impact, exploitable with moderate effort |
| MEDIUM | 4.0 - 6.9 | 1 | Limited impact or requires specific conditions to exploit |
| LOW | 0.1 - 3.9 | 0 | Minimal practical impact |
A component scoring 0 has had no medium-or-above CVEs in the last two years, regardless of its historical total. A high score indicates active, severe vulnerability activity worth investigating before adoption.
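One reading of the table above is a weighted sum over recent CVEs. The sketch below assumes exactly that: the summation, the 730-day approximation of 24 months, and the function names are all assumptions for illustration, not the pipeline's confirmed formula.

```python
from datetime import datetime, timedelta

# Weights from the severity table above.
SEVERITY_WEIGHTS = {"CRITICAL": 4, "HIGH": 2, "MEDIUM": 1, "LOW": 0}


def severity_from_cvss(score: float) -> str:
    """Map a CVSS score to the severity bands in the table above."""
    if score >= 9.0:
        return "CRITICAL"
    if score >= 7.0:
        return "HIGH"
    if score >= 4.0:
        return "MEDIUM"
    # Anything below 4.0 carries weight 0 either way.
    return "LOW"


def risk_score(cves, now: datetime, window_days: int = 730) -> int:
    """Sum severity weights for CVEs published inside the window.

    `cves` is an iterable of (published_datetime, cvss_score) pairs.
    """
    cutoff = now - timedelta(days=window_days)
    return sum(
        SEVERITY_WEIGHTS[severity_from_cvss(score)]
        for published, score in cves
        if published >= cutoff
    )
```

Under this reading, one recent CRITICAL CVE outweighs three MEDIUM ones, and CVEs older than the window contribute nothing, whatever their severity.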
Trend
The Trend column compares the CVE count in the most recent 12 months against the prior 12 months:
- Increasing: more CVEs in the last 12 months than the previous 12. The component is getting noisier.
- Decreasing: fewer recent CVEs. May indicate improving security posture or reduced scrutiny.
- Stable: no change between periods.
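The comparison behind the three labels can be sketched as two window counts. The function name and the use of 365-day windows are assumptions for illustration.

```python
from datetime import datetime, timedelta


def trend(published_dates, now: datetime) -> str:
    """Compare CVE counts in the last 12 months against the prior 12 months."""
    recent_start = now - timedelta(days=365)
    prior_start = now - timedelta(days=730)

    recent = sum(1 for d in published_dates if recent_start <= d <= now)
    prior = sum(1 for d in published_dates if prior_start <= d < recent_start)

    if recent > prior:
        return "Increasing"
    if recent < prior:
        return "Decreasing"
    return "Stable"
```

Note that a package with no CVEs in either period also comes out as Stable under this sketch.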
Min Safe Version & Unfixed CVEs
OSV records include a fixed version for each CVE where one has been published by the maintainer. The Min Safe Version shown in the summary is the highest fixed version referenced across all CVEs for that package; upgrading to this version or above addresses the maximum number of known vulnerabilities.
Unfixed CVEs counts CVEs for which OSV records no fixed version. This could mean:
- The vulnerability has not yet been patched by the maintainer
- The fix exists but has not been recorded in OSV (OSV data lags behind in some cases)
- The component is abandoned and no fix is planned
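Both figures can be derived from the `fixed` events in each record's `affected[].ranges[].events` list, which is where the OSV schema stores fix versions. The sketch below makes several assumptions: the records are already filtered to a single package, GIT ranges are skipped because their `fixed` values are commit hashes, and `version_key` is a deliberately naive numeric comparison that ignores qualifiers such as `-rc1`.

```python
import re


def version_key(version: str) -> tuple:
    """Naive sort key built from the numeric components only (an assumption)."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def fixed_versions(record: dict) -> list:
    """Collect every 'fixed' version listed in one OSV record."""
    fixed = []
    for affected in record.get("affected", []):
        for rng in affected.get("ranges", []):
            if rng.get("type") == "GIT":
                continue  # commit hashes, not comparable versions
            for event in rng.get("events", []):
                if "fixed" in event:
                    fixed.append(event["fixed"])
    return fixed


def summarize_fixes(records) -> tuple:
    """Return (min_safe_version, unfixed_count) for one package's records."""
    all_fixed = []
    unfixed = 0
    for record in records:
        fixed = fixed_versions(record)
        if fixed:
            all_fixed.extend(fixed)
        else:
            unfixed += 1
    min_safe = max(all_fixed, key=version_key) if all_fixed else None
    return min_safe, unfixed
```

Comparing numeric tuples rather than raw strings matters here: as strings, "2.9.1" would sort above "2.17.1".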
Affected Versions
The Affected Versions count in the summary is the number of distinct version strings that appear in at least one CVE's affected list for that package. It is deduplicated across all CVEs, so a version affected by five different CVEs is counted once.
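The deduplication amounts to a set union per package. A minimal sketch, assuming the OSV record layout in which each `affected` entry carries a `package` object (`ecosystem`, `name`) and an optional `versions` list; the function name is hypothetical.

```python
from collections import defaultdict


def affected_version_counts(records, ecosystem: str) -> dict:
    """Count distinct affected version strings per package name."""
    versions = defaultdict(set)
    for record in records:
        for affected in record.get("affected", []):
            package = affected.get("package") or {}
            name = package.get("name")
            # Ignore entries for other ecosystems or without a package name.
            if name is None or package.get("ecosystem") != ecosystem:
                continue
            # A set keeps each version once, however many CVEs list it.
            versions[name].update(affected.get("versions", []))
    return {name: len(found) for name, found in versions.items()}
```

For Maven, the `name` field already holds the groupId:artifactId pair, so the same grouping key works for all three ecosystems.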
Limitations
- OSV completeness: OSV does not capture every vulnerability. Some disclosures appear only in NVD or vendor advisories before being ingested into OSV. The pipeline reflects OSV's current state at the time of the weekly run.
- CVSS accuracy: CVSS scores are taken from OSV records, which may use NVD scores, maintainer-supplied scores, or be inferred from severity labels. Scores can change as NVD re-analyses CVEs.
- Transitive dependencies: Widely used utility packages (e.g. commons-codec in Maven, urllib3 in PyPI, debug in npm) may appear with high CVE counts despite being low-risk in practice, because they are almost always pulled in as transitive dependencies rather than chosen directly.
- Ecosystem scope: Only Maven, PyPI, and npm are currently covered. Other ecosystems supported by OSV (Go, Rust, RubyGems, etc.) are not included but use the same data source and could be added.
- Fixed version lag: Fixed version data may lag behind actual patch releases by days or weeks depending on how quickly maintainers update OSV records.
Infrastructure
| Component | Technology | Purpose |
|---|---|---|
| Pipeline | Python 3.12 · pandas · boto3 | Download, parse, score, and generate outputs for each ecosystem |
| Compute | AWS Lambda (container, 3 GB) | Runs the pipeline on demand and on schedule |
| Container Registry | Amazon ECR | Stores the pipeline Docker image |
| Schedule | Amazon EventBridge Scheduler | Triggers Lambda every Monday 06:00 UTC |
| Storage | Amazon S3 | Hosts CSV outputs, metadata, and this website (under maven/, pypi/, npm/ prefixes) |
| CDN | Amazon CloudFront | HTTPS delivery and edge caching |
| CI/CD | GitHub Actions | Builds and deploys Lambda image on code push; deploys HTML on content change |