Static Analysis Beyond Linting: CodeQL, Semgrep, SonarQube, and Snyk Code
Linters enforce style. Deeper static analysis finds bugs. The difference matters. ESLint will tell you about unused variables and inconsistent formatting. CodeQL will tell you that user input reaches a SQL query without sanitization. These are fundamentally different classes of tools, and many teams only use the first kind.
Modern static analysis tools can detect SQL injection, cross-site scripting, insecure deserialization, race conditions, null pointer dereferences, and hundreds of other bug patterns -- all without running the code. This guide covers four tools that go well beyond linting: CodeQL (GitHub's semantic analysis engine), Semgrep (lightweight pattern matching), SonarQube (code quality platform), and Snyk Code (AI-powered security scanning).
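To make the distinction concrete, here is a minimal TypeScript sketch. The function names and the `users` table are hypothetical, not taken from any particular codebase. Both functions are tidy enough to pass a typical style linter, but only the first lets user input change the structure of the SQL statement, which is the kind of flow a data flow analyzer is built to report.

```typescript
// Hypothetical example: both functions are lint-clean, but only one is safe.

interface ParameterizedQuery {
  text: string;
  values: string[];
}

// Vulnerable: user input is concatenated into the SQL text itself.
function buildUnsafeQuery(userId: string): string {
  return "SELECT * FROM users WHERE id = '" + userId + "'";
}

// Safer: the SQL text is constant and the input travels separately as a parameter.
function buildSafeQuery(userId: string): ParameterizedQuery {
  return { text: "SELECT * FROM users WHERE id = $1", values: [userId] };
}

const maliciousInput = "1' OR '1'='1";

// The attacker-controlled quote characters become part of the statement.
console.log(buildUnsafeQuery(maliciousInput));

// The statement text is unaffected by the input.
console.log(buildSafeQuery(maliciousInput).text);
```

A style linter sees two well-formed functions. A tool that tracks where `userId` ends up can tell them apart.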
Quick Comparison
| Feature | CodeQL | Semgrep | SonarQube | Snyk Code |
|---|---|---|---|---|
| Analysis type | Semantic (builds AST + data flow graph) | Pattern matching (AST) | Multi-technique | AI + pattern matching |
| Languages | 10+ (Java, JS/TS, Python, Go, C/C++, C#, Ruby, Swift) | 30+ | 30+ | 15+ |
| Custom rules | QL (custom query language) | YAML patterns | Java plugin API | Limited |
| CI integration | GitHub Actions (native) | CLI, any CI | CLI, any CI | CLI, any CI |
| Pricing | Free for public repos, GitHub Advanced Security for private | Free (OSS rules), Team/Enterprise | Community (free), Developer+, Enterprise | Free tier, Team, Enterprise |
| Speed | Slow (full semantic analysis) | Fast (seconds to minutes) | Medium | Fast |
| Best for | Deep vulnerability analysis | Custom rule enforcement | Code quality dashboards | Quick security scanning |
Semgrep: Fast, Flexible Pattern Matching
Semgrep is the easiest tool to start with. It matches code patterns using a syntax that looks like the code itself -- no need to learn a query language. You can go from zero to scanning your codebase in under five minutes.
Setup and First Scan
# Install
pip install semgrep
# or
brew install semgrep
# Run with the default security ruleset
semgrep scan --config auto .
# Run specific rulesets
semgrep scan --config p/owasp-top-ten .
semgrep scan --config p/typescript .
semgrep scan --config p/docker .
Writing Custom Rules
Semgrep rules are YAML files. The pattern syntax mirrors the language being scanned.
# .semgrep/custom-rules.yml
rules:
  - id: no-raw-sql-queries
    patterns:
      - pattern: |
          db.query($SQL, ...)
      - pattern-not: |
          db.query($SQL, $PARAMS)
      - metavariable-regex:
          metavariable: $SQL
          regex: .*\+.*
    message: "Raw SQL with string concatenation detected. Use parameterized queries."
    languages: [typescript, javascript]
    severity: ERROR
    metadata:
      cwe: ["CWE-89: SQL Injection"]
      owasp: ["A03:2021 - Injection"]
  - id: no-hardcoded-secrets
    patterns:
      - pattern: |
          const $KEY = "..."
      - metavariable-regex:
          metavariable: $KEY
          regex: .*(password|secret|token|api_key|apikey).*
    message: "Potential hardcoded secret in variable '$KEY'. Use environment variables."
    languages: [typescript, javascript]
    severity: WARNING
  - id: missing-error-handling-fetch
    # $...ARGS captures the fetch() arguments so the autofix can reuse them;
    # a bare `...` in `fix` would be emitted literally
    pattern: |
      const $RESP = await fetch($...ARGS);
      const $DATA = await $RESP.json();
    fix: |
      const $RESP = await fetch($...ARGS);
      if (!$RESP.ok) throw new Error(`HTTP ${$RESP.status}`);
      const $DATA = await $RESP.json();
    message: "fetch() response not checked for errors before parsing JSON."
    languages: [typescript, javascript]
    severity: WARNING
# Run your custom rules
semgrep scan --config .semgrep/custom-rules.yml .
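One way to sanity-check the first two rules is a small fixture file. The sketch below is hypothetical: `db` is a stand-in stub rather than a real database client, and the variable names are invented for illustration. The `ruleid:` and `ok:` comments follow Semgrep's rule-testing annotation convention and mark which lines are expected to match.

```typescript
// Hypothetical fixture; `db` is a stub, not a real database client.
const db = {
  executed: [] as string[],
  query(sql: string, params?: unknown[]): number {
    this.executed.push(sql);
    return params ? params.length : 0;
  },
};

const requestedId = "42";

// ruleid: no-raw-sql-queries
db.query("SELECT * FROM users WHERE id = " + requestedId);

// ok: no-raw-sql-queries
db.query("SELECT * FROM users WHERE id = $1", [requestedId]);

// ruleid: no-hardcoded-secrets
const db_password = "hunter2";

// ok: no-hardcoded-secrets
const pageTitle = "Dashboard";

console.log(db.executed.length, db_password.length, pageTitle.length);
```

Note that the secret rule's regex is case-sensitive as written, so a name like `apiToken` would not match. That is the kind of gap a fixture tends to expose. Semgrep's rule-testing documentation describes how to lay out fixtures so the annotations are checked automatically.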
CI Integration
# .github/workflows/semgrep.yml
name: Semgrep
on:
  pull_request: {}
  push:
    branches: [main]
jobs:
  semgrep:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # needed by upload-sarif to publish results
    container:
      image: semgrep/semgrep
    steps:
      - uses: actions/checkout@v4
      - run: semgrep scan --config auto --config .semgrep/ --error --sarif --output results.sarif .
      - uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: results.sarif
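The `results.sarif` file is plain JSON, so it can also be summarized outside GitHub's UI. The following is a minimal sketch, assuming only the small subset of the SARIF 2.1.0 structure shown in the interfaces (`runs[].results[].level`). The sample data is invented, and treating a missing `level` as `"warning"` is a simplification of the spec's defaulting rules.

```typescript
// Minimal subset of the SARIF 2.1.0 shape; real files carry many more fields.
interface SarifResult {
  ruleId?: string;
  level?: string; // e.g. "error", "warning", "note"; optional in SARIF
}

interface SarifLog {
  runs: { results?: SarifResult[] }[];
}

// Count findings per severity level across all runs in the log.
function countByLevel(sarif: SarifLog): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const run of sarif.runs) {
    for (const result of run.results ?? []) {
      // Simplification: treat a missing level as "warning".
      const level = result.level ?? "warning";
      counts[level] = (counts[level] ?? 0) + 1;
    }
  }
  return counts;
}

// Invented sample standing in for the parsed contents of results.sarif.
const sampleSarif: SarifLog = JSON.parse(`{
  "runs": [{ "results": [
    { "ruleId": "no-raw-sql-queries", "level": "error" },
    { "ruleId": "no-hardcoded-secrets", "level": "warning" },
    { "ruleId": "missing-error-handling-fetch" }
  ]}]
}`);

console.log(countByLevel(sampleSarif));
```

A summary like this can feed a PR comment or a custom failure threshold when `--error` alone is too coarse.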
CodeQL: Deep Semantic Analysis
CodeQL treats code as data. It builds a database from your source code (AST, control flow graph, data flow graph) and then lets you query it with a purpose-built language called QL. This allows CodeQL to trace data flow through your application -- for example, tracking user input from an HTTP request parameter through multiple function calls until it reaches a SQL query.
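The core idea of tracing input to a sink can be sketched as reachability over a graph. The toy model below is not how CodeQL is implemented: the node labels and edges are invented, and real engines also model sanitizers, call contexts, and much more. It only illustrates the "source flows to sink" question.

```typescript
// Toy model: nodes are labels; an edge [a, b] means "data flows from a to b".
type FlowEdge = [string, string];

// Breadth-first search from a source; returns the first path that reaches
// any sink, or null if no sink is reachable.
function findTaintPath(edges: FlowEdge[], source: string, sinks: Set<string>): string[] | null {
  const successors = new Map<string, string[]>();
  for (const [from, to] of edges) {
    const list = successors.get(from) ?? [];
    list.push(to);
    successors.set(from, list);
  }

  const queue: string[][] = [[source]];
  const visited = new Set<string>([source]);
  while (queue.length > 0) {
    const path = queue.shift() as string[];
    const node = path[path.length - 1] as string;
    if (sinks.has(node)) return path;
    for (const next of successors.get(node) ?? []) {
      if (!visited.has(next)) {
        visited.add(next);
        queue.push(path.concat(next));
      }
    }
  }
  return null;
}

// Invented flow graph: request input passes through two steps before the query.
const flowEdges: FlowEdge[] = [
  ["req.query.id", "userId"],
  ["userId", "buildFilter(userId)"],
  ["buildFilter(userId)", "sql"],
  ["sql", "db.query(sql)"],
  ["config.pageSize", "limit"],
];
const sqlSinks = new Set(["db.query(sql)"]);

console.log(findTaintPath(flowEdges, "req.query.id", sqlSinks));    // path through all four hops
console.log(findTaintPath(flowEdges, "config.pageSize", sqlSinks)); // null: never reaches the sink
```

The returned path is what a path-problem query reports: not just that the sink is tainted, but each step the data took to get there.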
GitHub Actions Setup
CodeQL is natively integrated with GitHub. For public repositories, it is free and can be enabled with a single workflow file.
# .github/workflows/codeql.yml
name: CodeQL
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 6 * * 1' # Weekly scan
jobs:
  analyze:
    runs-on: ubuntu-latest
    permissions:
      security-events: write
      # The next two are only needed for private repositories
      actions: read
      contents: read
    strategy:
      matrix:
        language: ['javascript-typescript', 'python']
    steps:
      - uses: actions/checkout@v4
      - uses: github/codeql-action/init@v3
        with:
          languages: ${{ matrix.language }}
          queries: +security-extended,security-and-quality
      - uses: github/codeql-action/autobuild@v3
      - uses: github/codeql-action/analyze@v3
Writing Custom CodeQL Queries
CodeQL's power comes from data flow analysis. Here is a query that finds SQL injection vulnerabilities in a TypeScript application:
/**
 * @name SQL injection from user input
 * @description User input flows to a SQL query without sanitization
 * @kind path-problem
 * @problem.severity error
 * @security-severity 9.8
 * @precision high
 * @id custom/sql-injection
 * @tags security
 */

// Note: this uses the class-based TaintTracking::Configuration API, which newer
// CodeQL releases deprecate in favor of the module-based DataFlow::ConfigSig API.
import javascript
import DataFlow::PathGraph

class SqlInjectionConfig extends TaintTracking::Configuration {
  SqlInjectionConfig() { this = "SqlInjectionConfig" }

  // Sources: values read from an Express request
  override predicate isSource(DataFlow::Node source) {
    exists(Express::RequestInputAccess input |
      source = input
    )
  }

  // Sinks: the first argument of any call to a function named "query"
  override predicate isSink(DataFlow::Node sink) {
    exists(DataFlow::CallNode call |
      call.getCalleeName() = "query" and
      sink = call.getArgument(0)
    )
  }
}

from SqlInjectionConfig config, DataFlow::PathNode source, DataFlow::PathNode sink
where config.hasFlowPath(source, sink)
select sink.getNode(), source, sink,
  "This SQL query depends on $@.", source.getNode(), "user-provided input"
Running CodeQL Locally
# Install the CodeQL CLI as a GitHub CLI extension
gh extension install github/gh-codeql
# The extension is invoked as `gh codeql`; a bare `codeql` command
# requires a separately installed CLI on your PATH
# Create a database
gh codeql database create my-db --language=javascript-typescript --source-root=.
# Run queries
gh codeql database analyze my-db codeql/javascript-queries --format=sarif-latest --output=results.sarif
# Run a specific custom query
gh codeql query run my-query.ql --database=my-db
SonarQube: The Code Quality Platform
SonarQube is less about individual scans and more about continuous code quality tracking. It maintains a history of your codebase's health, tracks technical debt, enforces quality gates on pull requests, and provides dashboards that managers love.
Setup with Docker
# docker-compose.yml
services:
  sonarqube:
    image: sonarqube:community
    ports:
      - "9000:9000"
    environment:
      - SONAR_JDBC_URL=jdbc:postgresql://db:5432/sonar
      - SONAR_JDBC_USERNAME=sonar
      - SONAR_JDBC_PASSWORD=sonar
    volumes:
      - sonarqube_data:/opt/sonarqube/data
      - sonarqube_extensions:/opt/sonarqube/extensions
    depends_on:
      - db
  db:
    image: postgres:16
    environment:
      - POSTGRES_USER=sonar
      - POSTGRES_PASSWORD=sonar
      - POSTGRES_DB=sonar
    volumes:
      - postgresql_data:/var/lib/postgresql/data
volumes:
  sonarqube_data:
  sonarqube_extensions:
  postgresql_data:
Scanning Your Project
# Install the scanner
npm install -g sonarqube-scanner
# Create sonar-project.properties
cat > sonar-project.properties << 'EOF'
sonar.projectKey=my-project
sonar.projectName=My Project
sonar.sources=src
sonar.tests=tests
sonar.javascript.lcov.reportPaths=coverage/lcov.info
sonar.exclusions=**/node_modules/**,**/dist/**
sonar.host.url=http://localhost:9000
sonar.token=sqp_xxxxx
EOF
# Run the scan
sonar-scanner
Quality Gates
Quality gates are the most useful SonarQube feature. They block merges when code does not meet your standards.
Example quality gate conditions (the built-in "Sonar way" gate is similar, though its exact conditions vary by SonarQube version):
- New code coverage < 80% → FAIL
- New duplicated lines > 3% → FAIL
- New bugs > 0 (critical/blocker) → FAIL
- New vulnerabilities > 0 → FAIL
- New security hotspots reviewed → Required
You can create custom quality gates that match your team's standards. The key insight is that the built-in gate evaluates only new code -- it does not force you to fix every legacy issue before you can merge.
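The gate logic itself is simple threshold checking over new-code metrics. Here is a minimal sketch of that idea. The metric names, types, and thresholds are hypothetical and do not mirror SonarQube's actual metric keys or gate engine.

```typescript
// Hypothetical metric names; SonarQube's real metric keys and gate engine differ.
interface NewCodeMetrics {
  coveragePercent: number;
  duplicatedLinesPercent: number;
  newVulnerabilities: number;
}

interface GateCondition {
  metric: keyof NewCodeMetrics;
  failsWhen: "lessThan" | "greaterThan";
  threshold: number;
}

// Returns the conditions that fail; an empty array means the gate passes.
function failedConditions(metrics: NewCodeMetrics, gate: GateCondition[]): GateCondition[] {
  return gate.filter((condition) => {
    const value = metrics[condition.metric];
    return condition.failsWhen === "lessThan"
      ? value < condition.threshold
      : value > condition.threshold;
  });
}

const exampleGate: GateCondition[] = [
  { metric: "coveragePercent", failsWhen: "lessThan", threshold: 80 },
  { metric: "duplicatedLinesPercent", failsWhen: "greaterThan", threshold: 3 },
  { metric: "newVulnerabilities", failsWhen: "greaterThan", threshold: 0 },
];

// Metrics describe only the code changed in the pull request, not the legacy codebase.
const pullRequestMetrics: NewCodeMetrics = {
  coveragePercent: 72.5,
  duplicatedLinesPercent: 1.2,
  newVulnerabilities: 0,
};

const gateFailures = failedConditions(pullRequestMetrics, exampleGate);
console.log(
  gateFailures.length === 0
    ? "PASS"
    : "FAIL: " + gateFailures.map((failure) => failure.metric).join(", ")
);
```

Because the input is restricted to new code, a legacy module with poor coverage never enters the calculation. Only the changed lines do.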
Snyk Code: AI-Powered Analysis
Snyk Code uses machine learning trained on millions of open-source commits and vulnerability fixes. It is fast (scans complete in seconds), explains findings in plain English, and provides fix suggestions.
Setup
# Install Snyk CLI
npm install -g snyk
# Authenticate
snyk auth
# Scan for code vulnerabilities
snyk code test
# Scan dependencies (different from code scanning)
snyk test
# Monitor continuously
snyk monitor
IDE Integration
Snyk's real strength is IDE integration. The VS Code extension shows findings inline as you type, with explanations and fix suggestions. This is faster feedback than waiting for CI.
// .vscode/settings.json
{
  "snyk.features.codeSecurity": true,
  "snyk.features.codeQuality": true,
  "snyk.features.openSourceSecurity": true,
  "snyk.severity": {
    "critical": true,
    "high": true,
    "medium": true,
    "low": false
  }
}
Building a Defense-in-Depth Pipeline
No single tool catches everything. The practical approach is to layer them:
- IDE (real-time): Snyk Code or Semgrep LSP -- immediate feedback while typing
- Pre-commit: Semgrep with your custom rules -- fast, catches obvious issues
- CI (pull request): Semgrep (full ruleset) + CodeQL -- comprehensive analysis before merge
- Nightly: SonarQube -- track trends, enforce quality gates on the full codebase
# .pre-commit-config.yaml
repos:
- repo: https://github.com/semgrep/semgrep
rev: v1.60.0
hooks:
- id: semgrep
args: ['--config', 'auto', '--config', '.semgrep/', '--error']
Handling False Positives
Every static analysis tool produces false positives. Have a clear process:
// Semgrep: inline suppression, placed on the line before (or at the end of) the flagged line
// nosemgrep: no-raw-sql-queries
const result = db.query("SELECT * FROM " + tableName);
// CodeQL: exclude specific queries or paths via query-filters / paths-ignore in the CodeQL config file
// SonarQube: mark as "Won't Fix" in the UI with a comment explaining why
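Part of a clear process is knowing what has been suppressed. The sketch below is a hypothetical audit helper, not a Semgrep feature. It scans source text for `nosemgrep` comments and records which rule IDs each one names, so that blanket suppressions (no rule ID) can be reviewed separately.

```typescript
interface Suppression {
  line: number;      // 1-based line number of the comment
  ruleIds: string[]; // empty when the comment names no rule (suppresses everything)
}

// Find `nosemgrep` comments and extract the comma-separated rule IDs, if any.
function findSuppressions(sourceText: string): Suppression[] {
  const pattern = /nosemgrep(?::\s*([\w.-]+(?:\s*,\s*[\w.-]+)*))?/;
  const found: Suppression[] = [];
  sourceText.split("\n").forEach((text, index) => {
    const match = pattern.exec(text);
    if (match) {
      const ruleIds = (match[1] ?? "")
        .split(",")
        .map((id) => id.trim())
        .filter((id) => id.length > 0);
      found.push({ line: index + 1, ruleIds });
    }
  });
  return found;
}

// Invented sample source: one targeted suppression, one blanket suppression.
const sampleSource = [
  "// nosemgrep: no-raw-sql-queries",
  "const result = db.query(sql);",
  "const other = run(); // nosemgrep",
].join("\n");

const suppressions = findSuppressions(sampleSource);
const blanket = suppressions.filter((s) => s.ruleIds.length === 0);
console.log(suppressions.length, "suppressions,", blanket.length, "without a rule ID");
```

Running something like this in CI and failing on blanket suppressions is one way to keep `nosemgrep` from silently hiding unrelated findings.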
Keep a .semgrepignore file for paths that do not need scanning (generated code, vendor directories, test fixtures):
# .semgrepignore
vendor/
generated/
**/testdata/
*.min.js
Which Tool to Start With
Start with Semgrep if you want fast results with minimal setup. You can be scanning your codebase in five minutes and writing custom rules in an afternoon. It is the highest return on investment for most teams.
Add CodeQL if you are on GitHub and want deep vulnerability analysis. The default queries catch real bugs, and it is free for public repositories.
Add SonarQube if your organization needs code quality dashboards, trend tracking, and quality gates. It is a management tool as much as a developer tool.
Add Snyk Code if you want real-time IDE feedback and your team is willing to pay for it.
The tools are complementary, not competing. Semgrep catches pattern-based issues fast. CodeQL catches data flow issues deep. SonarQube tracks quality over time. Snyk Code surfaces findings in the editor as you type. Running all of them is not overkill -- it is defense in depth.