OSINT Research Laboratory
Overview
A fully self-hosted OSINT research environment running inside a dedicated Proxmox LXC container, isolated from the rest of the homelab network. Eleven specialist tools covering username intelligence, email footprinting, subdomain discovery, web crawling, vulnerability scanning, and secrets detection are unified under a single Python/FastAPI dashboard with real-time WebSocket output streaming and persistent SQLite scan history.
The project demonstrates infrastructure isolation, process management, Python backend development, and the practical integration of open-source security tooling into a coherent operational platform.
Architecture
The environment runs inside CT200, a dedicated Ubuntu 22.04 LXC container on the Proxmox homelab, deliberately network-isolated from production services. OSINT work involves querying unknown third-party sites, running scanners, and handling potentially hostile data. Keeping it contained limits how far a compromised or misbehaving tool can reach into the rest of the stack.
Backend: Python 3.12 / FastAPI with uvicorn, serving on port 8000. Each tool invocation spawns an async subprocess with output streamed line-by-line to the browser over a persistent WebSocket connection. Process groups are used for clean cancellation. Scan history is written to SQLite on completion.
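The streaming loop described above might look roughly like the sketch below. It is an illustration under assumptions, not the dashboard's actual code: `stream_tool` and `send` are hypothetical names, and in a FastAPI WebSocket handler `send` would typically be the connection's `send_text` method.

```python
import asyncio
import sys
from typing import Awaitable, Callable


async def stream_tool(cmd: list[str], send: Callable[[str], Awaitable[None]]) -> int:
    """Run cmd and forward each output line to send as soon as it arrives."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,  # merge stderr into the same stream
    )
    assert proc.stdout is not None
    async for raw in proc.stdout:  # the StreamReader yields one line per iteration
        await send(raw.decode(errors="replace").rstrip("\r\n"))
    return await proc.wait()


async def demo() -> tuple[int, list[str]]:
    """Stand-in for a WebSocket: collect the streamed lines in a list."""
    lines: list[str] = []

    async def collect(line: str) -> None:
        lines.append(line)

    code = await stream_tool(
        [sys.executable, "-c", "print('one'); print('two')"], collect
    )
    return code, lines
```

Passing the sender in as a callable keeps the subprocess logic independent of the transport, so the same function can be exercised without a browser attached.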
Frontend: A single-file dark-themed HTML/CSS/JS dashboard. Tool selection dynamically renders the appropriate input form from the tool configuration registry. Output lines are classified and colour-coded in real time — green for hits, red for errors, blue for informational output.
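The line classification can be sketched as a small rule table. The dashboard does this in browser JavaScript; the version below is written in Python only to stay consistent with the backend. The marker patterns (`[+]`, `[!]`, the word "error") are assumptions for illustration, since each tool has its own output conventions.

```python
import re

# Hypothetical rules, checked in order; the first match wins.
_RULES = [
    ("error", re.compile(r"^\s*\[!\]|\berror\b|traceback", re.IGNORECASE)),
    ("hit", re.compile(r"^\s*\[\+\]")),
]

# Colour mapping as described in the text above.
COLOURS = {"hit": "green", "error": "red", "info": "blue"}


def classify(line: str) -> str:
    """Return 'hit', 'error', or 'info' for one line of tool output."""
    for label, pattern in _RULES:
        if pattern.search(line):
            return label
    return "info"  # anything unrecognised is treated as informational
```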
Tool isolation: Each Python tool runs in its own virtual environment under /home/osint/venvs/ to prevent dependency conflicts. Go-based tools are installed as static binaries to /usr/local/bin/.
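One way to resolve the right executable per tool is sketched below. The two base directories come from the text above; the registry contents, the convention that each venv directory is named after its tool, and the function name `resolve_executable` are assumptions made for the example.

```python
from pathlib import PurePosixPath

VENV_ROOT = PurePosixPath("/home/osint/venvs")  # per-tool virtual environments
GO_BIN_DIR = PurePosixPath("/usr/local/bin")    # static Go binaries

# Hypothetical registry: tool name -> (kind, executable name).
TOOLS = {
    "sherlock": ("python", "sherlock"),
    "holehe": ("python", "holehe"),
    "subfinder": ("go", "subfinder"),
}


def resolve_executable(tool: str) -> PurePosixPath:
    """Map a registered tool name to the path of the binary to spawn."""
    kind, exe = TOOLS[tool]
    if kind == "python":
        # A venv's console script runs with that venv's interpreter and
        # site-packages, so no "activate" step is needed before spawning it.
        return VENV_ROOT / tool / "bin" / exe
    return GO_BIN_DIR / exe
```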
Embedded services: SpiderFoot runs as a dedicated systemd service on port 5001. ttyd provides a full shell over WebSocket on port 7681. Both are accessible via the dashboard’s Terminal and SpiderFoot tabs.
Tools integrated
| Tool | Category | Purpose |
|---|---|---|
| Maigret | Username | Deep profile enrichment across 2,500+ sites |
| Sherlock | Username | Fast username sweep across 300+ platforms |
| Holehe | Email | Account footprinting via password reset flow |
| Socialscan | Username / Email | Precise availability checking on core platforms |
| theHarvester | Domain | Email, subdomain, and host enumeration |
| Subfinder | Domain | Passive subdomain discovery via 50+ sources |
| Photon | Domain | Web crawler — URLs, emails, embedded secrets |
| Nuclei | Scanner | Template-based vulnerability and misconfiguration checks |
| Trufflehog | Secrets | Git repository credential and secret scanning |
| SpiderFoot | Platform | 200+ module automated OSINT correlation engine |
| ttyd | Terminal | Full browser-based shell access to the container |
Technical challenges
WebSocket streaming vs SSE: An earlier Node.js implementation used Server-Sent Events for output streaming. Tools that use \r carriage returns for progress bars — including Maigret and Holehe — caused the SSE parser to break silently. Switching to WebSockets and handling \r-delimited output in the backend resolved this entirely.
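Handling \r-delimited output on the backend could be done with an incremental splitter along these lines. This is a sketch of the general technique, not the project's code; `LineSplitter` is a hypothetical name, and dropping empty segments is a choice made for the example.

```python
import re

_BREAK = re.compile(r"\r\n|\r|\n")


class LineSplitter:
    """Incrementally split streamed text on LF, CRLF, or a bare CR."""

    def __init__(self) -> None:
        self._buf = ""

    def feed(self, chunk: str) -> list[str]:
        """Add a chunk and return every complete line seen so far."""
        self._buf += chunk
        # Hold back a trailing CR: the next chunk may begin with the LF of a CRLF.
        hold = ""
        if self._buf.endswith("\r"):
            self._buf, hold = self._buf[:-1], "\r"
        parts = _BREAK.split(self._buf)
        self._buf = parts.pop() + hold  # the last piece is still incomplete
        return [p for p in parts if p]  # empty segments are dropped

    def flush(self) -> list[str]:
        """Return whatever is left when the process exits."""
        rest, self._buf = self._buf.rstrip("\r"), ""
        return [rest] if rest else []
```

Each progress-bar update then reaches the browser as an ordinary line, and the frontend can decide whether to append it or overwrite the previous one.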
Process group cancellation: Calling terminate() on a subprocess signals only the process that was spawned directly. Tools like Subfinder spawn child processes, and those keep running after the parent exits. The fix was to pass start_new_session=True on subprocess creation, which places the tool in its own process group, and to send SIGTERM to the entire group via os.killpg().
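A minimal POSIX-only sketch of that cancellation pattern follows. The function names and the SIGKILL escalation after a grace period are assumptions added for illustration; the source only mentions start_new_session=True and SIGTERM via os.killpg().

```python
import asyncio
import os
import signal
import sys


async def start_in_own_group(cmd: list[str]) -> asyncio.subprocess.Process:
    # start_new_session=True makes the child call setsid(), so it becomes the
    # leader of a new process group that its own children will inherit.
    return await asyncio.create_subprocess_exec(*cmd, start_new_session=True)


async def cancel_group(proc: asyncio.subprocess.Process, grace: float = 3.0) -> int:
    """SIGTERM the whole group; escalate to SIGKILL after grace seconds."""
    try:
        pgid = os.getpgid(proc.pid)
        os.killpg(pgid, signal.SIGTERM)
    except ProcessLookupError:
        return await proc.wait()  # already exited and reaped
    try:
        return await asyncio.wait_for(proc.wait(), timeout=grace)
    except asyncio.TimeoutError:
        os.killpg(pgid, signal.SIGKILL)
        return await proc.wait()


async def demo() -> int:
    proc = await start_in_own_group(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    await asyncio.sleep(0.3)
    return await cancel_group(proc)
```

A process killed by a signal reports a negative return code, so a cancelled scan can be told apart from one that finished on its own.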
SSL in isolated venvs: Photon’s virtual environment carried an older urllib3 version with a stale certifi CA bundle that could not be fixed via environment variable. The solution was to patch verify=False directly into the tool’s source zap.py for the two outbound request calls, and suppress the resulting warning at the top of the main entry point.
SpiderFoot and ttyd iframe blocking: The initial design embedded SpiderFoot and ttyd as iframes within the dashboard. Neither worked: SpiderFoot sends an X-Frame-Options: sameorigin header, and ttyd ran into cross-origin WebSocket restrictions. Both were switched to new-tab launch buttons, which also gives SpiderFoot’s graph visualisation more working space.
Stack
- Proxmox VE 8.3 — LXC container host
- Ubuntu 22.04 LXC — isolated container
- Python 3.12 / FastAPI / uvicorn — dashboard backend
- WebSockets — real-time output streaming
- SQLite — scan history persistence
- systemd — service management for dashboard, SpiderFoot, ttyd
- Python venvs — per-tool dependency isolation
- Go binaries — Subfinder, dnsx, httpx, Nuclei, Trufflehog
Source
The tool reference guide and deployment scripts are available in the homelab repository.