Search Techniques
This guide covers URL discovery with WayHack's built-in search command: basic and advanced search options, the supported data sources, practical reconnaissance workflows, and tips for keeping large result sets manageable.
Search Fundamentals
Basic Search Patterns
Domain-based Search:
wayhack search --domain example.com
Subdomain Discovery:
wayhack search --domain example.com --include-subdomains
Path-specific Search:
wayhack search --domain example.com --path "/admin"
File Extension Targeting:
wayhack search --domain example.com --extensions "pdf,doc,xls"
Output Format Options:
# JSON output
wayhack search --domain example.com --output json
# CSV output
wayhack search --domain example.com --output csv
# Text output (default)
wayhack search --domain example.com --output text
Result Limiting:
# Limit results to 500
wayhack search --domain example.com --limit 500
# Get maximum results
wayhack search --domain example.com --limit 5000
Advanced Search Options
Source Selection:
# Use specific data sources
wayhack search --domain example.com --sources wayback,crtsh
# Use the default sources (wayback, crtsh, commoncrawl)
wayhack search --domain example.com --sources wayback,crtsh,commoncrawl
# Single source for faster results
wayhack search --domain example.com --sources wayback
Combined Filters:
# Combine multiple filters
wayhack search --domain example.com --path "/api" --extensions "json"
# Subdomain discovery with specific extensions
wayhack search --domain example.com --include-subdomains --extensions "pdf,doc"
# Path and subdomain combination
wayhack search --domain example.com --include-subdomains --path "/admin"
Multiple Search Strategy:
# Separate searches for different purposes
wayhack search --domain example.com --path "/api" --output json
wayhack search --domain example.com --path "/admin" --output json
wayhack search --domain example.com --extensions "pdf,doc,xls" --output json
# View all results
wayhack view --latest --count 3
Data Sources
Available Sources
WayHack's search command supports multiple data sources for comprehensive URL discovery:
- wayback: Wayback Machine archives for historical web data
- urlscan: URLScan.io for live web scanning and analysis
- otx: AlienVault OTX for threat intelligence data
- commoncrawl: Common Crawl web archive data
- shodan: Shodan for internet-connected device discovery
- profundis: Profundis.io for deep web crawling
- virustotal: VirusTotal for domain and URL intelligence
- securitytrails: SecurityTrails for historical DNS data
- censys: Censys for internet asset discovery
- intelx: IntelX.io for threat intelligence gathering
- leakix: LeakIX.net for data leak discovery
- fofa: Fofa for cyber asset discovery
- crtsh: Certificate Transparency logs
- netlas: Netlas.io for internet asset intelligence
- builtwith: BuiltWith for technology stack analysis
- zoomeye: ZoomEye for cyberspace search
- hunter: Hunter.how for attack surface discovery
- github: GitHub code and repository search
- gitlab: GitLab code and repository search
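Because --sources takes a comma-separated list, a mistyped identifier is easy to miss. The sketch below checks a list against the identifiers documented above before a search is run. The names validate_sources and KNOWN_SOURCES are hypothetical and introduced only for illustration; they are not part of WayHack. The sketch also assumes that the identifiers listed above are exactly the values --sources accepts.

```shell
# Hypothetical helper (not part of WayHack): validate a comma-separated
# source list against the identifiers documented in this guide.
KNOWN_SOURCES="wayback urlscan otx commoncrawl shodan profundis virustotal securitytrails censys intelx leakix fofa crtsh netlas builtwith zoomeye hunter github gitlab"

validate_sources() {
    # $1: comma-separated list such as "wayback,crtsh"
    for src in $(printf '%s' "$1" | tr ',' ' '); do
        case " $KNOWN_SOURCES " in
            *" $src "*) ;;   # known identifier, keep checking
            *) echo "unknown source: $src" >&2; return 1 ;;
        esac
    done
    return 0
}
```

For example, `validate_sources "wayback,crtsh" && wayhack search --domain example.com --sources wayback,crtsh` runs the search only when every name is recognized.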
Wayback Machine
Basic Usage:
# Wayback Machine only
wayhack search --domain example.com --sources wayback
# Include subdomains for broader coverage
wayhack search --domain example.com --sources wayback --include-subdomains
Best Practices:
# Combine with path filtering for targeted discovery
wayhack search --domain example.com --sources wayback --path "/api"
# Focus on specific file types
wayhack search --domain example.com --sources wayback --extensions "js,json,xml"
Certificate Transparency (crt.sh)
Subdomain Discovery:
# Basic certificate transparency search
wayhack search --domain example.com --sources crtsh
# Include subdomains for comprehensive enumeration
wayhack search --domain example.com --sources crtsh --include-subdomains
Targeted Discovery:
# Combine with other sources for validation
wayhack search --domain example.com --sources crtsh,wayback
# Focus on specific paths in discovered subdomains
wayhack search --domain example.com --sources crtsh --include-subdomains --path "/admin"
Common Crawl
Large-scale Discovery:
# Common Crawl data mining
wayhack search --domain example.com --sources commoncrawl
# Limit results for faster processing
wayhack search --domain example.com --sources commoncrawl --limit 2000
Practical Search Workflows
Comprehensive Domain Reconnaissance
Step 1: Initial Discovery:
# Start with the three default sources for broad coverage
wayhack search --domain example.com --sources wayback,crtsh,commoncrawl
Step 2: Subdomain Enumeration:
# Focus on subdomain discovery
wayhack search --domain example.com --sources crtsh --include-subdomains
Step 3: Targeted Path Discovery:
# Look for admin interfaces
wayhack search --domain example.com --path "/admin" --include-subdomains
# API endpoint discovery
wayhack search --domain example.com --path "/api" --include-subdomains
# Common sensitive paths
wayhack search --domain example.com --path "/backup" --include-subdomains
Document and File Discovery
Sensitive File Types:
# Configuration files
wayhack search --domain example.com --extensions "xml,json,yml,yaml"
# Office documents and spreadsheets
wayhack search --domain example.com --extensions "pdf,doc,docx,xls,xlsx"
# Database and backup files
wayhack search --domain example.com --extensions "sql,db,bak,backup"
Development Files:
# Source code files
wayhack search --domain example.com --extensions "js,php,py,rb,java"
# Environment and config files
wayhack search --domain example.com --extensions "env,config,ini,conf"
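After a broad search, it can help to see which file types actually appear before choosing --extensions values. The following is a minimal sketch, assuming a text results file contains one URL per line (as the grep examples in this guide imply). The function name count_by_extension is hypothetical and not part of WayHack.

```shell
# Hypothetical helper: print "count<TAB>extension" for URLs read from stdin,
# most frequent extension first. Assumes one URL per line.
count_by_extension() {
    # 1) drop query strings and fragments
    # 2) drop the scheme and host so "example.com" is not read as ".com"
    # 3) keep only the last path segment
    sed -e 's/[?#].*$//' -e 's|^[a-zA-Z]*://[^/]*||' -e 's|.*/||' |
        awk -F. 'NF > 1 && $NF != "" { n[tolower($NF)]++ }
                 END { for (e in n) print n[e] "\t" e }' |
        sort -rn
}
```

A possible invocation is `count_by_extension < ~/.wayhack-outputs/scan_ID/results.txt`, where scan_ID stands for a real scan directory. URLs without a file extension are skipped.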
Advanced Discovery Techniques
Multi-Source Strategy
Comprehensive Discovery:
# The three default sources for broad coverage
wayhack search --domain example.com --sources wayback,crtsh,commoncrawl
# Archive sources for historical data
wayhack search --domain example.com --sources wayback,commoncrawl
# Certificate transparency for subdomain discovery
wayhack search --domain example.com --sources crtsh --include-subdomains
Source Prioritization Strategy:
# Fast discovery with crt.sh
wayhack search --domain example.com --sources crtsh --output json
# Comprehensive follow-up with Wayback
wayhack search --domain example.com --sources wayback --output json
# Large dataset mining with Common Crawl
wayhack search --domain example.com --sources commoncrawl --limit 3000 --output json
# View all results
wayhack view --latest --count 3
Automated Discovery Workflows
Batch Subdomain Discovery:
#!/bin/bash
# Automated subdomain enumeration
domain="example.com"
# Initial subdomain discovery
echo "Starting subdomain discovery for $domain"
wayhack search --domain "$domain" --sources crtsh --include-subdomains --output json
# Get the latest scan ID for processing
latest_scan=$(wayhack view --latest --tool search | head -1 | awk '{print $1}')
echo "Latest scan ID: $latest_scan"
# View results
wayhack view "$latest_scan"
Path Enumeration Workflow:
#!/bin/bash
# Systematic path discovery
domain="example.com"
common_paths=("/admin" "/api" "/dashboard" "/login" "/upload" "/backup")
echo "Starting path enumeration for $domain"
for path in "${common_paths[@]}"; do
    echo "Searching for path: $path"
    wayhack search --domain "$domain" --path "$path" --include-subdomains --output json
    sleep 2 # Rate limiting
done
echo "Path enumeration complete. View results with: wayhack view --latest --count ${#common_paths[@]}"
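When the list of paths grows, keeping it in a file is easier than editing the script. The sketch below is one way to do that, using only the search flags shown in this guide. The function name enumerate_paths and the WAYHACK_BIN and DELAY variables are hypothetical knobs introduced for this example. Setting WAYHACK_BIN=echo gives a dry run that prints the arguments instead of running the search.

```shell
# Hypothetical helper: run one search per path listed in a wordlist file.
# WAYHACK_BIN (default: wayhack) and DELAY (default: 2 seconds) are
# assumptions made for this sketch, not WayHack settings.
enumerate_paths() {
    domain=$1
    wordlist=$2
    # "|| [ -n "$p" ]" keeps a final line that lacks a trailing newline
    while IFS= read -r p || [ -n "$p" ]; do
        # Skip blank lines and "#" comments in the wordlist
        case $p in
            ''|'#'*) continue ;;
        esac
        # stdin is redirected so the command cannot consume the wordlist
        "${WAYHACK_BIN:-wayhack}" search --domain "$domain" --path "$p" \
            --include-subdomains --output json < /dev/null
        sleep "${DELAY:-2}"   # rate limiting between searches
    done < "$wordlist"
}
```

For example, `WAYHACK_BIN=echo DELAY=0 enumerate_paths example.com paths.txt` previews the searches for a wordlist named paths.txt (a placeholder filename) without sending any requests.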
Result Analysis
Viewing Search Results:
# View latest search
wayhack view --latest
# View specific search by ID
wayhack view scan_1234567890
# View multiple recent searches
wayhack view --latest --count 5
# View detailed scan information
wayhack view --detailed
Processing Results:
# Search results are automatically saved in:
# ~/.wayhack-outputs/scan_ID/results.txt (or .json, .csv)
# Example: Extract unique hosts (scheme and hostname) from results
grep -ohE 'https?://[^/]+' ~/.wayhack-outputs/scan_*/results.txt | sort -u
# Example: Filter for specific file types, with or without a query string
grep -hE '\.(pdf|doc|xls)(\?|$)' ~/.wayhack-outputs/scan_*/results.txt
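A per-host count is a quick way to see which subdomains account for most of the discovered URLs. This is a minimal sketch, assuming the text results contain one URL per line. The function name count_by_host is hypothetical and not part of WayHack.

```shell
# Hypothetical helper: print "count<TAB>host" for URLs read from stdin,
# busiest host first. Ports, paths, and query strings are ignored.
count_by_host() {
    sed -n 's|^https\{0,1\}://\([^/:?#]*\).*|\1|p' |
        sort | uniq -c | sort -rn |
        awk '{ print $1 "\t" $2 }'   # normalize uniq's padded output
}
```

It can be fed from the saved results, for example `cat ~/.wayhack-outputs/scan_*/results.txt | count_by_host`. Lines that do not start with http:// or https:// are dropped.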
Best Practices and Tips
Search Optimization
Start Small, Scale Up:
# Begin with fast sources
wayhack search --domain example.com --sources crtsh
# Expand to comprehensive search
wayhack search --domain example.com --sources wayback,crtsh,commoncrawl
# Use limits for large domains
wayhack search --domain example.com --limit 2000
Targeted Discovery:
# Focus on specific areas of interest
wayhack search --domain example.com --path "/api" --extensions "json,xml"
# Combine filters for precision
wayhack search --domain example.com --include-subdomains --extensions "pdf,doc" --limit 500
Managing Large Result Sets
Use Appropriate Limits:
# Small test run
wayhack search --domain example.com --limit 100
# Medium discovery
wayhack search --domain example.com --limit 1000
# Comprehensive search
wayhack search --domain example.com --limit 5000
Output Format Selection:
# JSON for programmatic processing
wayhack search --domain example.com --output json
# CSV for spreadsheet analysis
wayhack search --domain example.com --output csv
# Text for simple viewing
wayhack search --domain example.com --output text
Workflow Integration
Sequential Searches:
# Progressive discovery approach
wayhack search --domain example.com --sources crtsh --include-subdomains
wayhack search --domain example.com --sources wayback --path "/admin"
wayhack search --domain example.com --sources commoncrawl --extensions "pdf,doc"
# Review all results
wayhack view --latest --count 3
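Sequential searches often return overlapping URLs, so merging the saved results into one deduplicated list makes review easier. The sketch below assumes the output layout described under Processing Results (~/.wayhack-outputs/scan_ID/results.txt) and text output. The function name merge_results is hypothetical and not part of WayHack. It merges every saved scan under the directory, not only the most recent three.

```shell
# Hypothetical helper: merge all text results under an output directory
# into a single sorted, deduplicated URL list on stdout.
merge_results() {
    # $1: output root; defaults to the location this guide describes
    root=${1:-"$HOME/.wayhack-outputs"}
    # The glob is deliberately outside the quotes so scan_* expands
    cat "$root"/scan_*/results.txt 2>/dev/null | sort -u
}
```

For example, `merge_results > all-urls.txt` writes the combined list to all-urls.txt (a placeholder filename). Searches saved as JSON or CSV are not included.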
Command Reference
Search Command Syntax
wayhack search [flags]
Available Flags
Flag | Short | Description | Default |
---|---|---|---|
--domain | -d | Target domain to search (required) | - |
--sources | -s | Comma-separated list of data sources | wayback,crtsh,commoncrawl |
--include-subdomains | -i | Include subdomains in search | |
--extensions | -e | Comma-separated list of file extensions | - |
--path | -p | Specific path to search for | - |
--output | -o | Output format (text, json, csv) | text |
--limit | -l | Maximum number of results | |
Quick Reference Examples
# Basic domain search
wayhack search -d example.com
# Subdomain discovery
wayhack search -d example.com -i
# Specific file types
wayhack search -d example.com -e pdf,doc,xls
# Admin panel discovery
wayhack search -d example.com -p "/admin" -i
# JSON output with limit
wayhack search -d example.com -o json -l 500
# Multiple sources
wayhack search -d example.com -s wayback,crtsh
Troubleshooting
Common Issues
API Connection Problems:
# Check API configuration
wayhack check
# Verify API key setup
wayhack setup
Large Result Sets:
# Use limits to manage large datasets
wayhack search --domain example.com --limit 1000
# Use specific sources for faster results
wayhack search --domain example.com --sources crtsh
No Results Found:
# Try different sources
wayhack search --domain example.com --sources wayback
wayhack search --domain example.com --sources commoncrawl
# Include subdomains for broader coverage
wayhack search --domain example.com --include-subdomains
Conclusion
The WayHack search command provides a powerful interface for URL discovery and OSINT reconnaissance. By combining multiple data sources, flexible filtering options, and automated result management, it streamlines the process of gathering intelligence about target domains.
Key benefits:
- Multiple data sources: Wayback Machine, Certificate Transparency, and Common Crawl
- Flexible filtering: Domain, subdomain, path, and extension filters
- Multiple output formats: Text, JSON, and CSV support
- Automatic result management: All searches are saved and can be reviewed later
- Integration ready: Results integrate seamlessly with other WayHack tools
For more information on viewing and managing search results, see the CLI Tool Mastery guide.