About — VulnDB

What is this?

VulnDB surfaces open-source components that are currently the most active in terms of known security vulnerabilities, across three major package ecosystems: Maven (Java), PyPI (Python), and npm (JavaScript). The goal is to help engineering teams make informed adoption decisions — not to discourage use of open source, but to highlight components that warrant closer scrutiny or an upgrade before use.

Data is refreshed automatically every Monday at 06:00 UTC and deployed to this site within minutes of the pipeline completing.

Ecosystems

Dashboard	Ecosystem	Package format	Example
☕ Maven	Java / JVM	`groupId:artifactId`	`org.apache.logging.log4j:log4j-core`
🐍 PyPI	Python	package name	`requests`, `django`
📦 npm	JavaScript / Node.js	package name	`lodash`, `express`

Data Source

All vulnerability data is sourced from the Open Source Vulnerabilities (OSV) database, maintained by Google. OSV aggregates vulnerability data from multiple authoritative sources including:

NVD — NIST National Vulnerability Database (enriches CVE records with CVSS scores and other analysis)
GitHub Advisory Database — security advisories published by GitHub
Package maintainers — direct disclosures via the OSV schema

The pipeline downloads a complete bulk export per ecosystem: a single ZIP file containing one JSON record for every vulnerability OSV knows about in that ecosystem. No API key is required.

Ecosystem	OSV bulk URL
Maven	`osv-vulnerabilities.storage.googleapis.com/Maven/all.zip`
PyPI	`osv-vulnerabilities.storage.googleapis.com/PyPI/all.zip`
npm	`osv-vulnerabilities.storage.googleapis.com/npm/all.zip`
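
As a rough illustration of the download step, the sketch below fetches one ecosystem's export and parses it entirely in memory using only the Python standard library. The `https://` scheme, the function names, and the timeout are assumptions made for this example; this is not the pipeline's actual code.

```python
import io
import json
import urllib.request
import zipfile

# Assumption: the bucket host listed in the table above is served over HTTPS.
OSV_BUCKET = "https://osv-vulnerabilities.storage.googleapis.com"


def bulk_url(ecosystem: str) -> str:
    """Build the bulk-export URL for an ecosystem name such as 'PyPI'."""
    return f"{OSV_BUCKET}/{ecosystem}/all.zip"


def read_records(zip_bytes: bytes) -> list[dict]:
    """Parse every .json member of an in-memory ZIP into a dict."""
    records = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if name.endswith(".json"):
                records.append(json.loads(archive.read(name)))
    return records


def download_records(ecosystem: str, timeout: float = 60.0) -> list[dict]:
    """Download one ecosystem's export and parse it without touching disk."""
    with urllib.request.urlopen(bulk_url(ecosystem), timeout=timeout) as resp:
        return read_records(resp.read())
```

Calling `download_records("npm")` would return a list of OSV records as dictionaries. Keeping the ZIP parsing in its own function makes it possible to exercise that part without network access.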

How the Pipeline Works

The same six-stage process runs for each ecosystem in sequence:

Download — The OSV bulk ZIP for the ecosystem is downloaded into memory.
Parse — Each JSON record is parsed to extract the CVE ID, CVSS score, severity, published date, affected versions, and fixed version.
Score — CVSS scores are extracted from three sources in priority order: database_specific.cvss_score, the CVSS vector string (parsed using the cvss library), or the severity label as a fallback.
Aggregate — Records are grouped by package identity. For Maven this is groupId:artifactId; for PyPI and npm it is the flat package name.
Generate — Two CSVs and a metadata JSON are written to /tmp/{ecosystem}/.
Upload — All outputs are uploaded to S3 under an ecosystem-specific prefix (maven/, pypi/, npm/). A single CloudFront cache invalidation is issued after all ecosystems complete.
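
The priority order described in the Score stage could be sketched roughly as follows. The label-to-score table and the `parse_vector` hook are hypothetical and exist only for illustration. The field locations (`severity[].score` for the vector, `database_specific.severity` for the label) follow common OSV records but are assumptions about what the pipeline reads.

```python
from typing import Callable, Optional

# Hypothetical representative scores for the label fallback; the values the
# real pipeline uses are not documented here.
LABEL_SCORES = {
    "CRITICAL": 9.5,
    "HIGH": 8.0,
    "MEDIUM": 5.5,
    "MODERATE": 5.5,  # GitHub advisories use MODERATE rather than MEDIUM
    "LOW": 2.0,
}


def extract_score(
    record: dict,
    parse_vector: Optional[Callable[[str], float]] = None,
) -> Optional[float]:
    """Return a CVSS score using the three-step priority order, or None."""
    db = record.get("database_specific") or {}

    # 1. An explicit numeric score wins if it is present and parseable.
    raw = db.get("cvss_score")
    if raw is not None:
        try:
            return float(raw)
        except (TypeError, ValueError):
            pass

    # 2. Otherwise try a CVSS vector string via the supplied parser.
    if parse_vector is not None:
        for entry in record.get("severity") or []:
            vector = entry.get("score")
            if not vector:
                continue
            try:
                return float(parse_vector(vector))
            except Exception:
                continue  # malformed vector: try the next entry

    # 3. Fall back to the severity label.
    label = str(db.get("severity", "")).upper()
    return LABEL_SCORES.get(label)
```

The vector parser is passed in as a callable so the sketch has no third-party dependency. A wrapper around the cvss library, for instance one returning `CVSS3(vector).scores()[0]`, is one plausible choice for `parse_vector`.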

The pipeline runs as an AWS Lambda function (container image, 3 GB RAM, 15-minute timeout) triggered by an EventBridge schedule. The Lambda image is built and deployed automatically via GitHub Actions whenever the pipeline code changes.

Risk Score

Raw CVE count is a poor adoption signal because it heavily favours old, widely-used components that have simply accumulated history. A component with 300 CVEs published over 15 years is very different from one with 20 CVEs published in the last 12 months.

The Risk Score is designed to capture current risk:

Risk Score = (Critical × 4) + (High × 2) + (Medium × 1)
…counting only CVEs published in the last 24 months

Severity	CVSS Range	Weight	Rationale
CRITICAL	9.0 – 10.0	4	Remote code execution, authentication bypass — immediate action required
HIGH	7.0 – 8.9	2	Significant impact, exploitable with moderate effort
MEDIUM	4.0 – 6.9	1	Limited impact or requires specific conditions to exploit
LOW	0.1 – 3.9	0	Minimal practical impact

A component scoring 0 has had no medium-or-above CVEs in the last two years — regardless of its historical total. A high score indicates active, severe vulnerability activity worth investigating before adoption.
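
To make the formula concrete, here is a minimal sketch of the calculation for one component. The function names and the (published date, CVSS score) input shape are invented for this example, and the 24-month window is approximated as 730 days, which may differ from the pipeline's own cutoff.

```python
from datetime import datetime, timedelta, timezone

# Weights from the severity table above.
WEIGHTS = {"CRITICAL": 4, "HIGH": 2, "MEDIUM": 1, "LOW": 0}


def severity_from_cvss(score: float) -> str:
    """Map a CVSS base score onto the severity bands in the table above."""
    if score >= 9.0:
        return "CRITICAL"
    if score >= 7.0:
        return "HIGH"
    if score >= 4.0:
        return "MEDIUM"
    return "LOW"  # scores below 4.0 carry zero weight


def risk_score(cves, now: datetime, window_days: int = 730) -> int:
    """Sum severity weights over (published, cvss_score) pairs in the window.

    Assumption: 24 months is approximated as 730 days.
    """
    cutoff = now - timedelta(days=window_days)
    return sum(
        WEIGHTS[severity_from_cvss(score)]
        for published, score in cves
        if published >= cutoff
    )


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
example = [
    (datetime(2024, 6, 1, tzinfo=timezone.utc), 9.8),   # critical: 4
    (datetime(2024, 1, 1, tzinfo=timezone.utc), 7.5),   # high: 2
    (datetime(2023, 6, 1, tzinfo=timezone.utc), 5.0),   # medium: 1
    (datetime(2024, 3, 1, tzinfo=timezone.utc), 3.0),   # low: 0
    (datetime(2020, 1, 1, tzinfo=timezone.utc), 10.0),  # outside the window
]
print(risk_score(example, now))  # prints 7
```

The 2020 critical CVE contributes nothing because it falls outside the window, which is the behaviour the Risk Score is meant to have.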

Trend

The Trend column compares the CVE count in the most recent 12 months against the prior 12 months:

↑ Increasing — more CVEs in the last 12 months than the previous 12. The component is getting noisier.
↓ Decreasing — fewer CVEs in the last 12 months than the previous 12. This may indicate an improving security posture or simply reduced scrutiny.
→ Stable — the same number of CVEs in both periods.
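
The comparison behind the three arrows can be expressed as a small function. This is a sketch under stated assumptions: each 12-month period is approximated as 365 days, and the function name and input shape are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def trend(published_dates, now: datetime, period_days: int = 365) -> str:
    """Compare the CVE count in the latest period against the one before it.

    Assumption: a 12-month period is approximated as 365 days.
    """
    recent_start = now - timedelta(days=period_days)
    prior_start = now - timedelta(days=2 * period_days)

    recent = sum(1 for d in published_dates if recent_start <= d <= now)
    prior = sum(1 for d in published_dates if prior_start <= d < recent_start)

    if recent > prior:
        return "↑"
    if recent < prior:
        return "↓"
    return "→"
```

Note that a component with no CVEs in either period is reported as stable under this logic, the same as one with an identical non-zero count in both.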

Min Safe Version & Unfixed CVEs

OSV records include a fixed version for each CVE where one has been published by the maintainer. The Min Safe Version shown in the summary is the highest fixed version referenced across all CVEs for that package. Upgrading to this version or above addresses every CVE that has a recorded fix; it does not address those counted under Unfixed CVEs.

Unfixed CVEs counts CVEs for which OSV records no fixed version. This could mean:

The vulnerability has not yet been patched by the maintainer
The fix exists but has not been recorded in OSV (OSV data lags behind in some cases)
The component is abandoned and no fix is planned

Important limitation: Fixed version data in OSV is only as complete as what maintainers and reporters have submitted. A blank "Fixed In" field does not definitively mean no fix exists — always check the upstream project and NVD directly for the latest status.
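
A simplified sketch of how both summary values could be derived from a package's OSV records is shown below. It reads `fixed` events from `affected[].ranges[].events[]`, as defined by the OSV schema, and skips `GIT` ranges because their values are commit hashes. The `version_key` helper is a deliberately naive assumption: it compares numeric components only and ignores pre-release tags and Maven qualifiers, so a real implementation would need ecosystem-aware version ordering.

```python
import re
from typing import Optional


def fixed_versions(record: dict) -> list[str]:
    """Collect every 'fixed' version in a record's version-based ranges."""
    found = []
    for affected in record.get("affected", []):
        for rng in affected.get("ranges", []):
            if rng.get("type") not in ("ECOSYSTEM", "SEMVER"):
                continue  # GIT ranges hold commit hashes, not versions
            for event in rng.get("events", []):
                if "fixed" in event:
                    found.append(event["fixed"])
    return found


def version_key(version: str) -> tuple[int, ...]:
    """Naive sort key (assumption): the numeric components of the string."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def summarise_fixes(records: list[dict]) -> tuple[Optional[str], int]:
    """Return (highest fixed version or None, count of records with no fix)."""
    all_fixed = []
    unfixed = 0
    for record in records:
        fixes = fixed_versions(record)
        if fixes:
            all_fixed.extend(fixes)
        else:
            unfixed += 1
    min_safe = max(all_fixed, key=version_key) if all_fixed else None
    return min_safe, unfixed
```

Comparing numeric tuples rather than raw strings matters here: as plain strings, `2.9.0` sorts above `2.17.1`, which would produce the wrong Min Safe Version.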

Affected Versions

The Affected Versions count in the summary is the number of distinct version strings that appear in at least one CVE's affected list for that package. It is deduplicated across all CVEs — so a version affected by five different CVEs is counted once.
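
The deduplication amounts to a set union per package. The sketch below assumes versions are read from the `affected[].versions` list of each OSV record and that packages are keyed by `affected[].package.name`. The function name is hypothetical.

```python
from collections import defaultdict


def affected_version_counts(records: list[dict]) -> dict[str, int]:
    """Count distinct affected version strings per package across records."""
    versions_by_package: dict[str, set] = defaultdict(set)
    for record in records:
        for affected in record.get("affected", []):
            name = affected.get("package", {}).get("name")
            if name:
                # A set keeps each version once, however many CVEs list it.
                versions_by_package[name].update(affected.get("versions", []))
    return {name: len(versions) for name, versions in versions_by_package.items()}


records = [
    {"id": "CVE-A", "affected": [{"package": {"name": "lodash"},
                                  "versions": ["4.17.0", "4.17.1"]}]},
    {"id": "CVE-B", "affected": [{"package": {"name": "lodash"},
                                  "versions": ["4.17.1", "4.17.2"]}]},
]
print(affected_version_counts(records))  # prints {'lodash': 3}
```

Version `4.17.1` appears in both records but is counted once, giving three distinct versions rather than four.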

Limitations

OSV completeness — OSV does not capture every vulnerability. Some disclosures appear only in NVD or vendor advisories before being ingested into OSV. The pipeline reflects OSV's current state at the time of the weekly run.
CVSS accuracy — CVSS scores are taken from OSV records, which may carry NVD scores or maintainer-supplied scores; where neither is present, the score is inferred from the severity label. Scores can change as NVD re-analyses CVEs.
Transitive dependencies — Widely-used utility packages (e.g. commons-codec in Maven, urllib3 in PyPI, debug in npm) may appear with high CVE counts despite being low-risk in practice, because they are almost always pulled in as transitive dependencies rather than chosen directly.
Ecosystem scope — Only Maven, PyPI, and npm are currently covered. Other ecosystems supported by OSV (Go, Rust, RubyGems, etc.) are not included but use the same data source and could be added.
Fixed version lag — Fixed version data may lag behind actual patch releases by days or weeks depending on how quickly maintainers update OSV records.

Infrastructure

Component	Technology	Purpose
Pipeline	Python 3.12 · pandas · boto3	Download, parse, score, and generate outputs for each ecosystem
Compute	AWS Lambda (container, 3 GB)	Runs the pipeline on demand and on schedule
Container Registry	Amazon ECR	Stores the pipeline Docker image
Schedule	Amazon EventBridge Scheduler	Triggers Lambda every Monday 06:00 UTC
Storage	Amazon S3	Hosts CSV outputs, metadata, and this website (under `maven/`, `pypi/`, `npm/` prefixes)
CDN	Amazon CloudFront	HTTPS delivery and edge caching
CI/CD	GitHub Actions	Builds and deploys Lambda image on code push; deploys HTML on content change