projectdiscovery httpx: HTTP Enumeration at Scale
How projectdiscovery httpx works, where it fits between subfinder and nuclei, how it compares to httprobe and EyeWitness, and a reproducible recon recipe.
You have a few thousand subdomains from a passive recon sweep. Now you need to know which ones are alive, what they’re serving, and what metadata you can pull before touching individual endpoints. That’s the exact problem projectdiscovery/httpx solves — and it does it better than stitching together curl, httprobe, and a custom parser.
What httpx Actually Does
httpx is an open-source HTTP toolkit maintained by the ProjectDiscovery team (github.com/projectdiscovery/httpx). It accepts hosts or URLs on stdin or from a file, fires HTTP/HTTPS requests using the retryablehttp library, and returns structured output per host. The retryablehttp layer matters in practice: automatic retries with configurable backoff cut false negatives when you’re scanning across networks with uneven latency or soft rate limits.
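To make the retry idea concrete, here is a minimal Python sketch of retry with capped exponential backoff. It is an illustration of the general technique, not retryablehttp's actual implementation; the function names (`backoff_delays`, `probe`) and the default delays are assumptions chosen for the example.

```python
import random
import time
import urllib.error
import urllib.request


def backoff_delays(retries, base=0.5, cap=8.0):
    """Yield one capped exponential delay (in seconds) per retry attempt."""
    for attempt in range(retries):
        yield min(cap, base * (2 ** attempt))


def probe(url, retries=3, timeout=10, sleep=time.sleep):
    """Return the HTTP status code for url, or None if every attempt fails.

    Hypothetical helper: names and defaults are illustrative only.
    """
    delays = list(backoff_delays(retries))
    for attempt in range(retries + 1):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.status
        except urllib.error.HTTPError as err:
            # The server answered (4xx/5xx), so the host is alive.
            return err.code
        except (urllib.error.URLError, OSError):
            if attempt == retries:
                return None
            # Small jitter so parallel workers do not retry in lockstep.
            sleep(delays[attempt] + random.uniform(0, 0.1))


# Delay schedule for four retries with the default base and cap.
print(list(backoff_delays(4)))
```

A host that drops the first connection but answers the second is reported as alive instead of dead, which is the false-negative reduction described above.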
Single statically linked Go binary, cross-platform, installable in one line. That’s the operational baseline.
Probe categories from the official documentation:
- HTTP status codes, response length, content type
- HTML title extraction
- Web server and technology fingerprinting via response headers
- TLS certificate metadata (subject, SANs, expiry)
- Redirect chain following or suppression
- CDN and WAF detection
- Screenshot capture via headless Chromium
- Custom header injection and arbitrary method support
- JSON and CSV output for pipeline consumption
That breadth in a single binary is what separates httpx from narrower tools that only answer the live/dead question.
Where httpx Sits in the Pipeline
External recon has at least three phases: asset discovery, service identification, and vulnerability research. httpx owns the middle phase — the bridge between raw hostnames and actionable endpoint data.
[Passive recon]           [Service ID]       [Vuln research]
amass / subfinder   -->   httpx        -->   nuclei / manual
dnsx / shuffledns         (this tool)        burp / ffuf
ProjectDiscovery designs its tools to chain via stdin/stdout, and httpx is the canonical middle layer. Feed it subfinder or dnsx output; its default output is a plain list of live URLs that feeds directly into nuclei for template scanning or katana for crawling. With -json, anything that accepts newline-delimited JSON works outside the ProjectDiscovery stack: custom Python scripts, Elasticsearch ingest pipelines, spreadsheet triage.
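As a rough illustration of the custom-script path, the sketch below reads NDJSON records one line at a time and skips anything that is not a JSON object. The helper name `iter_records` and the sample lines are hypothetical; the field names `url`, `status_code`, and `title` are the ones used in the jq filters later in this article.

```python
import json


def iter_records(lines):
    """Yield one dict per valid NDJSON line, skipping blanks and non-JSON noise."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(record, dict):
            yield record


# Hypothetical sample lines standing in for `httpx -json` output.
sample = [
    '{"url": "https://example.com", "status_code": 200, "title": "Example Domain"}',
    '',
    'not json',
    '{"url": "https://dev.example.com", "status_code": 403}',
]

for rec in iter_records(sample):
    print(rec["url"], rec.get("status_code"), rec.get("title", "-"))
```

In a real pipeline you would pass `sys.stdin` or an open file handle instead of the sample list, since both are iterables of lines.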
httpx vs. the Alternatives
httprobe
httprobe (tomnomnom) checks whether a host is serving HTTP or HTTPS. That’s it. Fast and minimal. If a live/dead verdict is the only output you need and you want the smallest possible dependency, httprobe is defensible. The moment you need status codes, titles, TLS data, or technology tags, httpx replaces it without requiring anything else alongside it.
curl / wget
curl is irreplaceable for single-target investigation and precise request crafting. It can retry (--retry) and run transfers in parallel (--parallel), but it isn’t built for bulk enumeration: it produces no structured multi-field output per host and does no fingerprinting. These are different tools for different contexts: httpx for bulk enumeration, curl for manual follow-up.
EyeWitness
EyeWitness captures screenshots and generates an HTML report for visual triage. httpx can also screenshot, but its primary value is structured programmatic output. For large-scope work where visual review is part of triage, running both and correlating on URL is reasonable — they’re not competing for the same role.
Aquatone
Same category as EyeWitness. httpx handles the metadata extraction; Aquatone handles the visual layer if you still want it. A common pattern is using httpx to filter the live set first, then passing that shorter list to Aquatone.
Shodan / Censys
Passive scan data from Shodan or Censys reflects a historical snapshot and may not match your target’s current configuration. httpx gives you real-time first-party results against your exact scope. Use Shodan/Censys for breadth and historical context; use httpx for current ground truth. They answer different questions.
A Reproducible Recon Recipe
The following assumes Debian/Ubuntu with Go 1.21+. Adjust for macOS or Windows. All flags reflect behavior in the official httpx README.
Step 1 — Install
go install -v github.com/projectdiscovery/httpx/cmd/httpx@latest
httpx -version
Or via the ProjectDiscovery tool manager:
pdtm -install httpx
Step 2 — Prepare a Host List
subfinder -d example.com -silent -o hosts.txt
Or a manual hosts.txt for testing:
example.com
blog.example.com
dev.example.com
staging.example.com
Step 3 — Basic Liveness Sweep
httpx -l hosts.txt -silent -status-code -title -o results.txt
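Probes each host over HTTPS (443) first and falls back to HTTP (80) only if HTTPS is unreachable, prints status codes and HTML titles, and writes to results.txt. Add -no-fallback to probe and report both schemes. -silent suppresses the banner.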
Step 4 — Enriched Fingerprinting
httpx -l hosts.txt \
-status-code \
-title \
-tech-detect \
-web-server \
-content-length \
-follow-redirects \
-threads 50 \
-timeout 10 \
-json \
-o enriched.jsonl
| Flag | What it does |
|---|---|
| -tech-detect | Fingerprints technologies using Wappalyzer-style signatures |
| -web-server | Extracts the Server response header |
| -content-length | Records response body size |
| -follow-redirects | Follows 3xx chains, reports final URL |
| -threads 50 | Parallelism; lower this if you’re hitting rate limits |
| -timeout 10 | Per-request timeout in seconds |
| -json | NDJSON output, one object per line |
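Once enriched.jsonl exists, a quick tally of what the scope is running helps decide where to look first. The following is a minimal sketch, assuming each record carries a `webserver` string and a `tech` list of technology names; verify both field names against a line of your own output, because they are assumptions here. The `summarize` helper and the sample records are hypothetical.

```python
import json
from collections import Counter


def summarize(lines):
    """Count web servers and detected technologies across NDJSON records."""
    servers, techs = Counter(), Counter()
    for line in lines:
        if not line.strip():
            continue
        rec = json.loads(line)
        servers[rec.get("webserver") or "unknown"] += 1
        # `tech` is assumed to be a list of names; it may be absent entirely.
        techs.update(rec.get("tech") or [])
    return servers, techs


# Hypothetical records standing in for lines of enriched.jsonl.
sample = [
    '{"url": "https://example.com", "webserver": "nginx", "tech": ["Nginx", "PHP"]}',
    '{"url": "https://blog.example.com", "webserver": "nginx", "tech": ["Nginx", "WordPress"]}',
    '{"url": "https://dev.example.com"}',
]

servers, techs = summarize(sample)
print(servers.most_common())
print(techs.most_common(3))
```

To run it against real data, pass `open("enriched.jsonl")` in place of `sample`.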
Step 5 — TLS Certificate Extraction
httpx -l hosts.txt \
-tls-probe \
-tls-grab \
-json \
-o tls_data.jsonl
-tls-grab records certificate data for each host: subject, issuer, SANs, validity period. -tls-probe then sends probes to the domains named in those certificates. Cross-referencing SANs across your target list surfaces in-scope hosts that passive DNS missed, and this step catches more than people expect.
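The SAN cross-referencing can be sketched in a few lines of Python. This assumes the certificate names sit under `tls.subject_an` in each record, which you should confirm against a line of your own tls_data.jsonl; the function name `new_hosts_from_sans`, the scope suffix, and the sample data are hypothetical.

```python
import json


def new_hosts_from_sans(tls_lines, known_hosts, scope_suffix):
    """Return in-scope SAN hostnames that are not already in the host list."""
    known = {h.strip().lower() for h in known_hosts if h.strip()}
    scope = scope_suffix.lower()
    found = set()
    for line in tls_lines:
        if not line.strip():
            continue
        rec = json.loads(line)
        tls = rec.get("tls") or {}
        # Assumed field: list of subject alternative names on the certificate.
        for name in tls.get("subject_an") or []:
            name = name.lower()
            if name.startswith("*."):
                name = name[2:]  # treat a wildcard as its parent domain
            if name == scope or name.endswith("." + scope):
                found.add(name)
    return sorted(found - known)


# Hypothetical record standing in for a line of tls_data.jsonl.
sample_tls = [
    '{"url": "https://example.com", "tls": {"subject_an": '
    '["example.com", "*.example.com", "api.example.com", "cdn.othercorp.net"]}}',
]
known = ["example.com", "blog.example.com"]

print(new_hosts_from_sans(sample_tls, known, "example.com"))
```

The suffix check keeps names from unrelated organizations that share a certificate out of the result, so whatever is returned can be appended to hosts.txt and re-probed without leaving scope.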
Step 6 — Filter and Feed Downstream
# Hosts returning 200, with titles
jq -r 'select(.status_code == 200) | "\(.url) | \(.title)"' enriched.jsonl
# Hosts running nginx (the // "" guard skips records with no Server header instead of erroring)
jq -r 'select((.webserver // "") | test("nginx"; "i")) | .url' enriched.jsonl
# Pipe live URLs directly into nuclei (if cves/ does not resolve, recent nuclei-templates releases keep these under http/cves/)
jq -r 'select(.status_code == 200) | .url' enriched.jsonl | nuclei -t cves/ -o nuclei_findings.txt
For a few hundred hosts on a standard connection, going from raw host list to nuclei-ready input typically takes under ten minutes.
Production Notes
Rate limiting. httpx defaults are aggressive (50 threads and up to 150 requests per second). On engagements with defined rate limits, lower -threads and set -rate-limit (maximum requests per second) explicitly. Uncontrolled scanning against production infrastructure triggers alerts and can breach scope agreements.
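When a scope agreement fixes a request rate, it helps to estimate how long the sweep will take before starting it. The arithmetic below is a rough lower bound only: it ignores retries, redirects, and slow hosts, and the helper name and the numbers in the example are hypothetical.

```python
import math


def min_scan_seconds(hosts, requests_per_host, rate_limit):
    """Lower bound on scan duration when capped at rate_limit requests per second."""
    if rate_limit <= 0:
        raise ValueError("rate_limit must be positive")
    return math.ceil(hosts * requests_per_host / rate_limit)


# Hypothetical engagement: 3000 hosts, one request each, capped at 10 requests/second.
print(min_scan_seconds(3000, 1, 10))
```

If the estimate does not fit the testing window, split the host list across sessions instead of raising the rate.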
NDJSON vs. JSON array. -json produces NDJSON — one object per line, not a JSON array. If downstream tooling expects an array: jq -s '.' enriched.jsonl > enriched_array.json.
Screenshots. -screenshot requires a Chromium-compatible browser on the host. On headless servers, confirm Chromium is installed and locatable. If it isn’t, the flag produces nothing and doesn’t tell you why — one of the few rough edges in an otherwise solid tool.
PDCP integration. ProjectDiscovery’s cloud platform can ingest httpx output for collaborative triage and asset inventory. For red teams working collaboratively or for continuous monitoring setups, it’s worth a look as an alternative to managing flat files across machines.
The Honest Tradeoffs
httpx is the right tool for bulk HTTP enumeration in a structured recon pipeline. Its structured output, retry logic, and native handoff to the rest of the ProjectDiscovery stack — particularly nuclei — make it faster to operationalize than any equivalent combination of single-purpose tools.
Where it doesn’t help: deep manual request crafting (use curl), historical exposure across the internet (use Shodan or Censys), or human-readable visual reports as the primary deliverable (use EyeWitness). Knowing those boundaries keeps the workflow clean.
If you’re already using httpx for liveness checks and nothing else, the enriched probe (Step 4) and TLS extraction (Step 5) are fast additions that surface signal — particularly the SAN cross-referencing — without pulling in new dependencies.