# How We Swept 25 Repos for Security Issues in an Afternoon (and What We Actually Found)
A practitioner's account of running a full Dependabot and CodeQL sweep across an entire GitHub organization, fixing real vulnerabilities, and hardening a production server the same day.
We had one of those days where you start with “let’s quickly check the security alerts” and end up eight hours later having fixed injection vulnerabilities, hardened a server, rewritten URL parsers, and accidentally broken a website twice.
This is that story.
## The trigger
GitHub had been sending email notifications for a while. Dependabot alerts here, CodeQL warnings there. Nothing that felt urgent individually. But when I finally sat down and looked at the full picture across all our repositories, the number was embarrassing: over 30 open Dependabot vulnerability alerts and 50+ code scanning findings spread across 25 repos.
So we decided to do a proper sweep. All at once.
## The sweep
The first step was getting a real overview. One API call per repo to count open vulnerability alerts, another to list open code scanning alerts. Within a few minutes we had a prioritized picture:
- Eight repos with multiple vulnerability alerts, led by one with a critical `handlebars` CVE that had been sitting open for weeks
- Several repos with CodeQL findings ranging from `command-line-injection` to `polynomial-redos` to `incomplete-url-substring-sanitization`
- A long tail of repos each with a single `picomatch` or `brace-expansion` advisory
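The overview step can be sketched against the GitHub REST API. This is a minimal sketch, not our actual script: the org name and token are placeholders, and it only counts the first page of results (up to 100 alerts per repo).

```javascript
// Count open Dependabot and code-scanning alerts for one repo.
// OWNER and the token are placeholder values, not our real ones.
const OWNER = 'your-org';

async function countOpenAlerts(repo, token) {
  const headers = {
    Authorization: `Bearer ${token}`,
    Accept: 'application/vnd.github+json',
  };
  const base = `https://api.github.com/repos/${OWNER}/${repo}`;
  const [dep, code] = await Promise.all([
    fetch(`${base}/dependabot/alerts?state=open&per_page=100`, { headers }),
    fetch(`${base}/code-scanning/alerts?state=open&per_page=100`, { headers }),
  ]);
  // First page only; paginate if a repo can exceed 100 open alerts.
  return {
    repo,
    dependabot: (await dep.json()).length,
    codeScanning: (await code.json()).length,
  };
}

// Sort repos by total open alerts, worst first.
function prioritize(counts) {
  return [...counts].sort(
    (a, b) => b.dependabot + b.codeScanning - (a.dependabot + a.codeScanning),
  );
}
```

Running `countOpenAlerts` across the repo list and feeding the results through `prioritize` gives the phase-1/phase-2 split almost for free.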
We split the work into phases. Phase 1 tackled the high-alert repos in parallel. Phase 2 did a bulk sweep of the one-alert repos. Phase 3 was the actual code fixes.
## What the automated tooling caught
Dependabot is good at the obvious stuff. Most of the vulnerability alerts were transitive dependencies: a package deep in the tree pulling in an old version of `picomatch` with a regex injection issue, or `handlebars` with a prototype pollution CVE. The fix is usually a version bump, and if the package maintainer has kept up, Dependabot has already opened a PR.
We merged everything that was a safe patch or minor bump. We held back anything involving a major version change, specifically TypeScript 5 to 6, which showed up across five repos. That migration needs intentional testing, not a blind merge.
CodeQL caught more interesting things.
## The real findings
A few stood out.
**URL substring checks.** Several provider files were doing URL validation like this:
```javascript
if (url.includes('chatgpt.com')) {
  // handle this provider
}
```

This looks fine until you think about it for a second. A URL like `https://attacker.com/redirect?to=chatgpt.com` would pass this check. The fix is a line longer but unambiguous:
```javascript
try {
  const parsed = new URL(url);
  if (parsed.hostname === 'chatgpt.com' ||
      parsed.hostname.endsWith('.chatgpt.com')) {
    // handle this provider
  }
} catch {
  return false;
}
```

Seven instances of this pattern, all fixed.
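The fix generalizes to a small helper. A sketch; `isAllowedHost` is a name chosen for illustration, not something from our codebase:

```javascript
// Hostname comparison instead of substring matching. Returns false for
// anything that isn't a parseable absolute URL.
function isAllowedHost(url, host) {
  try {
    const { hostname } = new URL(url);
    return hostname === host || hostname.endsWith('.' + host);
  } catch {
    return false;
  }
}

isAllowedHost('https://chatgpt.com/c/123', 'chatgpt.com');            // true
isAllowedHost('https://chat.chatgpt.com/', 'chatgpt.com');            // true
isAllowedHost('https://attacker.com/?to=chatgpt.com', 'chatgpt.com'); // false
isAllowedHost('https://evilchatgpt.com/', 'chatgpt.com');             // false
```

The `.endsWith('.' + host)` form matters: checking `endsWith(host)` alone would still accept `evilchatgpt.com`.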
**Backslash escaping order.** This is a classic. When you're sanitizing strings for use in shell commands or markdown, you need to escape backslashes before you escape anything else. If you do it the other way around, a value like `a\"b` gets corrupted on the way through. The correct order: backslash first, then everything else. Found in command-building code across two different projects.
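A minimal sketch of the bug, with hypothetical function names chosen for illustration:

```javascript
// Correct: escape backslashes FIRST, then the quote character.
function escapeForDoubleQuotes(value) {
  return value
    .replace(/\\/g, '\\\\')  // 1. backslash first
    .replace(/"/g, '\\"');   // 2. then the quote
}

// Wrong: the second pass re-escapes the backslashes the first pass
// just introduced, so the quote ends up effectively unescaped.
function escapeWrongOrder(value) {
  return value
    .replace(/"/g, '\\"')
    .replace(/\\/g, '\\\\');
}
```

With input `a\"b`, the correct version produces `a\\\"b`; the wrong order produces `a\\\\"b`, where the quote is preceded by an even number of backslashes and therefore breaks out of the quoted string.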
**ReDoS vulnerabilities.** Regular expressions with nested quantifiers on user-controlled input can be made to backtrack exponentially. The fixes ranged from replacing the regex entirely with `indexOf`/`lastIndexOf` calls to rewriting the quantifier to remove the ambiguity:
```javascript
// Safe: indexOf instead of regex, no backtracking
const start = input.indexOf('{');
const end = input.lastIndexOf('}');
const match = start !== -1 && end > start
  ? input.slice(start, end + 1)
  : null;
```

Not every ReDoS is exploitable in practice, but if the fix is a one-liner, there's no reason not to.
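The quantifier-rewrite approach looks like this. A generic sketch, not the exact regex from our codebase:

```javascript
// Vulnerable shape: a nested quantifier like /^(\w+\s?)*$/ can split the
// same input many ways, so a near-miss string ("fix the parser!") triggers
// exponential backtracking while the engine tries every split.
// Unambiguous rewrite: each word can only be consumed one way.
const WORDS = /^\w+(?: \w+)*$/;

WORDS.test('fix the parser');   // matches
WORDS.test('fix the parser!');  // rejected, no blow-up
```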
**User-controlled timer duration.** One finding was `js/resource-exhaustion`: a timeout value read from user input was being passed directly to `setTimeout` without any bounds checking. A crafted request with a very large timeout would hold a process handle open indefinitely.
```javascript
// Before: no bounds
setTimeout(callback, userProvidedTimeout);

// After: clamped to 30 seconds max
const safeDuration = Math.min(Math.max(0, userProvidedTimeout), 30_000);
setTimeout(callback, safeDuration);
```

Two lines. Done.
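One refinement worth noting: if the value arrives as a string from a request body, `Math.min`/`Math.max` pass `NaN` straight through the clamp. A slightly more defensive sketch, with a hypothetical helper name:

```javascript
// Clamp an untrusted timeout value; non-numeric input falls back to 0.
function clampTimeout(raw, maxMs = 30_000) {
  const n = Number(raw);
  if (!Number.isFinite(n)) return 0; // NaN and Infinity both rejected
  return Math.min(Math.max(0, n), maxMs);
}
```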
## The server hardening
The same day, we also ran a hardening pass on the production server. The checklist covered SSH configuration, Apache security headers, mail server hardening, database security settings, and a few other areas.
The `allow_url_fopen = Off` PHP setting, which stops `file_get_contents` and the other stream functions from opening remote URLs, is good security practice but breaks any PHP code that fetches external APIs through those functions. We had a site that pulled blog posts and repository data from external APIs on each page load. After hardening, those sections went blank.
The fix was to decouple the API calls from the web request entirely. A cron job now refreshes the cached data every ten minutes. The PHP code only reads from the local cache file. No more outbound requests from the web process, and no more blank sections. This is actually a better architecture regardless of the hardening requirement.
The other gotcha was Content Security Policy and inline styles. A strict CSP that uses nonces for scripts and stylesheets will block any `style="..."` attributes on HTML elements, because those can't receive nonces. We had a handful scattered across the codebase. The fix for static values is straightforward: move them to CSS classes. The dynamic color case needed a small JavaScript snippet to apply the colors at runtime via `element.style.background = element.dataset.color`, which runs with the page's script nonce and is perfectly CSP-compliant.
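The runtime-color approach can be sketched like this; the `data-color` attribute name is an assumption for illustration:

```javascript
// Apply colors from data attributes instead of inline style="" attributes,
// which a strict CSP blocks. Markup is assumed to look like:
//   <span data-color="#e0a030">...</span>
function applyDataColors(elements) {
  for (const el of elements) {
    // Setting style from script is fine: the script itself carries the nonce.
    el.style.background = el.dataset.color;
  }
}

// In the browser, run once on page load:
if (typeof document !== 'undefined') {
  applyDataColors(document.querySelectorAll('[data-color]'));
}
```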
## What didn't get fixed today
A few TypeScript 5 to 6 major version bumps are still waiting. That's not something to merge without reading the changelog and testing. The Dependabot PRs are open, they're just on hold.
There are also a few vulnerabilities where Dependabot hasn't opened a PR yet. These are likely deeply transitive, so the fix requires tracing back to the top-level package that's pulling them in.
## The takeaway
Running a sweep like this across many repos at once reveals patterns you'd miss looking at one repo at a time. The URL substring issue appeared in multiple codebases independently. The backslash escaping order mistake appeared in two different projects. These are the kinds of things that happen when developers write similar code in parallel without a shared review.
The automated tooling catches a lot, but it needs human judgment to sort signal from noise. False positives are real: CodeQL flagged a key-masking function as “clear-text logging” because it technically accessed `apiKeys` before printing only the masked version. That's not a bug, that's correct code. Dismissing false positives is part of the job.
The actual exploitable issues took maybe two hours to fix across all repos. The process of finding them, triaging them, and deciding what to prioritize took most of the day. That ratio feels about right.