
From Digital Entropy to an Impenetrable Fortress: Anatomy of an Extreme VPS Hardening - Part I

SYSADMIN / HARDENING / CLOUD_SECURITY · Technical Read: 18 min

We've all been there. You spin up a $5 VPS, get root access, and the excitement kicks in. Docker here, FTP there, a GUI because "the terminal gets old", twenty containers running things you barely remember deploying. It feels productive. It isn't. What it actually is, is a ticking clock. This post documents the full process of turning that kind of server — exposed, noisy, chaotic — into infrastructure that defends itself while you sleep.

SYSTEM STATUS: POST-HARDENING / CLASSIFIED

Architect: 0n3Z3r0 | Roles: Senior SysAdmin & Ethical Hacker.
Goal: Move from total attack surface exposure to infrastructure under Least Privilege and Continuous Monitoring.
Series: Part I of III — Diagnosis, Permissions, Traffic Control and Storage.

1. The Diagnosis: What a "Typical" VPS Actually Looks Like

Before touching anything, you need to understand what you're dealing with. And the best way to do that is to stop assuming the server is fine and start looking at the evidence. The forensic analysis of a typical, unmanaged VPS tells a very uncomfortable story.

A. The "Swiss Cheese" Attack Surface

Running ss -tulpen is the first step toward confronting reality. Most people are surprised by what they find. The exposure is usually total:

# The command that shows you everything listening on your server
ss -tulpen

# What you typically find on an unmanaged VPS:
tcp LISTEN 0.0.0.0:21 → FTP. Wide open. To the entire internet.
tcp LISTEN 0.0.0.0:3389 → RDP. Same story.
tcp LISTEN 0.0.0.0:3000 → Some Docker app, no auth, no proxy.
tcp LISTEN 0.0.0.0:8080 → Another one. You might not even remember what this is.

FTP and RDP open to 0.0.0.0 means open to the entire internet. Services like avahi-daemon and cups running on a cloud server make zero sense — your VPS doesn't need to discover printers or mDNS devices. And those Docker containers mapped to public ports? Each one is a personal invitation to every vulnerability scanner on the internet.
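Since the raw `ss` output gets long, a quick filter helps with triage. Here is a minimal sketch that flags everything bound to all interfaces; it runs against a captured sample (the lines below are hypothetical) so it is safe to test anywhere, but in practice you would pipe live `ss -tulpen` output straight into the awk filter:

```shell
# Hypothetical captured output; on a real box: ss -tulpen | awk '...'
sample='tcp LISTEN 0 128 0.0.0.0:21 0.0.0.0:* users:(("vsftpd",pid=901,fd=3))
tcp LISTEN 0 128 127.0.0.1:6379 0.0.0.0:* users:(("redis-server",pid=712,fd=6))
tcp LISTEN 0 128 0.0.0.0:8080 0.0.0.0:* users:(("docker-proxy",pid=1201,fd=4))'

# Field 5 is the local address. Anything on 0.0.0.0 is a public door.
printf '%s\n' "$sample" | awk '$5 ~ /^0\.0\.0\.0:/ { print "PUBLIC:", $5, $7 }'
```

Note how the Redis instance bound to 127.0.0.1 correctly stays out of the report: loopback-only services are not part of the public attack surface.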

B. The GUI Mistake

Installing a graphical interface on a server is one of those decisions that seems harmless until it isn't. It doesn't just waste RAM. It pulls in hundreds of vulnerable dependencies — X11, audio stacks, display managers — that have no business being on a headless server. An attacker can exploit the window manager to escalate privileges in an environment that should never have had one in the first place.

C. The Container Graveyard

docker ps -a on a neglected server usually looks like a cemetery. Containers with status Exited (1) from months ago. Bare ubuntu and alpine images with no orchestration. This is Shadow IT: services the admin forgot exist, still consuming disk space and potentially exposing stale data. That /var directory sitting at 22GB? That's mostly this.

| What we found | Why it's a problem |
| --- | --- |
| FTP (21) + RDP (3389) open to 0.0.0.0 | Direct brute-force targets. Shodan indexes these within minutes of exposure. |
| avahi-daemon + cups running | Services with no purpose on a cloud server. Unnecessary attack surface, zero benefit. |
| Docker containers on public ports | Every open port is a vector. No proxy, no auth, no protection. |
| GUI installed on a headless server | Hundreds of extra dependencies. Privilege escalation via the display manager is a real attack path. |

Non-technical explanation

Imagine leaving your house with every window open, the front door unlocked, and a note on the door that says "owner's inside". That's an unmanaged VPS. You're not being targeted personally — automated bots are scanning every IP on the internet around the clock. If you look like an easy target, you become one. The goal of this entire series is to make your server look like a closed, dark building with no visible entry points.

2. The Fix: Segregation, Visibility and Defense

Hardening isn't about installing a magic tool. It's about making deliberate decisions and not crossing the lines you draw. Three pillars. No exceptions.

A. User Hierarchy and Least Privilege

Most people run everything as root. It's comfortable right up until it isn't. The moment something gets compromised, the attacker has the keys to everything. The fix is simple: decide what each identity is allowed to do, and never let it do more than that.

- root: Kernel maintenance and critical emergencies only. Nothing else.
- labadmin: Day-to-day work: deployments, lab management, general administration.
- pentest: Isolated user for offensive tools. Contained by design.
- services: Runs Docker daemons. No interactive shell. No sudo. A cage.

If a container gets compromised and the attacker lands in services, they're stuck. No shell, no lateral movement, no path to escalation. That friction is the point.
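The "no shell" property is easy to verify. A minimal sketch of the check, using a hypothetical /etc/passwd entry so it runs anywhere (the useradd line in the comment is the kind of command that would create such an account, and needs root):

```shell
# Hypothetical passwd entry for the services user. To create one:
#   sudo useradd --system --shell /usr/sbin/nologin --home /opt/services services
entry='services:x:1003:1003::/opt/services:/usr/sbin/nologin'

# Field 7 is the login shell. The entire cage hinges on this field.
login_shell=$(printf '%s' "$entry" | cut -d: -f7)
case "$login_shell" in
  */nologin|*/false) echo "services: no interactive shell ($login_shell)" ;;
  *)                 echo "services: WARNING, interactive shell ($login_shell)" ;;
esac
```

Running the same check over every line of the real /etc/passwd is a cheap weekly audit: any service account that suddenly gains /bin/bash is a red flag.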

B. Filesystem Standardization (/opt)

A professional server doesn't have scripts scattered across /home and /root. Everything lives under /opt, organized and with strict permissions. This isn't cosmetic; it's containment. Keeping offensive tooling in dedicated directories like /opt/recon and /opt/exploit, outside the standard PATH, means an automated exploit landing on your server won't find it. It would have to know exactly where to look.

/opt/
 ├── services/   # Production apps (Docker Compose)
 ├── lab/        # Test environments and ethical hacking
 ├── tools/      # Maintenance scripts and binaries
 └── data/       # Persistent volumes and centralized logs

If the server goes down tomorrow, this structure tells you exactly what to save and where it is. That's not just security — that's operational sanity.
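The layout above takes four commands to bootstrap. A sketch, using a temp dir as the prefix so it is safe to run anywhere; on the real server PREFIX would be /opt, and each directory would additionally get a chown to its owning user (labadmin, services, and so on):

```shell
# Recreate the /opt layout under a scratch prefix.
PREFIX=$(mktemp -d)
for d in services lab tools data; do
  mkdir -p "$PREFIX/$d"
  chmod 750 "$PREFIX/$d"   # owner + group only; everyone else sees nothing
done
ls "$PREFIX"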

| User / Directory | Why it matters |
| --- | --- |
| root | Direct SSH access disabled (PermitRootLogin no). To reach root, an attacker first has to compromise another account. One more step. One more barrier. |
| labadmin | If the lab environment gets hit, the blast radius stops there. Production and system services stay untouched. |
| pentest | Offensive tools live here, isolated. A mistake or a compromised tool doesn't affect the rest of the system. |
| services | Docker containers run as this user. No shell, no sudo. If Docker is exploited, the attacker is stuck in a box with nothing useful. |
| /opt/recon + /opt/exploit | Out of the system PATH. Automated exploits won't find them. You'd have to know the exact path to use them. |

Non-technical explanation

Think about how a bank operates. The teller can handle your deposit but can't open the vault. The vault manager can open the vault but can't approve loans. The IT admin can access the servers but can't move money. Nobody has the keys to everything. If one person is compromised, the damage is contained. That's Least Privilege — not distrust, just damage control built into the architecture.

3. Network Hardening: Firewall Policy and the Tailscale Layer

Here's the thing about firewalls that most tutorials get wrong: the goal isn't to close every port you can find. The goal is to have a default policy of deny everything, and then consciously open only what you can justify. Everything else stays closed, forever, by default.

# This is the entire UFW policy after hardening
ufw status verbose

# The default policy — the real kill switch:
Default: deny (incoming) → Everything uninvited: blocked
Default: allow (outgoing) → Server can reach out fine

# The only three things allowed in from the internet:
22/tcp ALLOW IN → SSH
80/tcp ALLOW IN → HTTP (redirect to HTTPS only)
443/tcp ALLOW IN → HTTPS

Three ports. That's it. Every bot scanner hits a wall. Every automated exploit trying random ports gets nothing. Three vectors to monitor instead of thirty.

And here's the part I find most elegant: Tailscale. With tailscaled.service running, Portainer, databases, internal services — none of them need a public port. They're only accessible through the private VPN mesh. To any scanner on the internet, they simply don't exist. Shodan can't index what it can't reach.

| Defense layer | What it actually solves |
| --- | --- |
| UFW deny (default) | Kills the noise at the source. Bot scanners, brute force on random ports, junk traffic: all blocked before it reaches anything. |
| Ports 22, 80, 443 only | Minimal public attack surface. Three auditable vectors. Much easier to monitor and alert on than a wide-open server. |
| Tailscale VPN mesh | Sensitive services are invisible from the public internet. No extra ports needed. If you're not on the VPN, you can't see them at all. |
| Nginx Proxy Manager | Single gateway for all web-facing services. Ports 80 and 443 point here, and Nginx routes traffic to containers internally. Nothing else is public. |
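For reference, this is the kind of server block Nginx Proxy Manager generates under the hood. The hostname, certificate paths and upstream port are illustrative; the point is that the container listens only on loopback and the proxy is the single public entry:

```nginx
# Illustrative reverse-proxy block: the only public listener is Nginx.
server {
    listen 443 ssl;
    server_name app.example.com;   # hypothetical hostname

    ssl_certificate     /etc/letsencrypt/live/app.example.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/app.example.com/privkey.pem;

    location / {
        # The container is bound to 127.0.0.1:3000 -- reachable only
        # through this proxy, invisible to the public internet.
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```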

Non-technical explanation

Picture an office building in the middle of a city. UFW is the front door security: only three entrances are open (22, 80, 443), everything else is bricked up. Tailscale is a private tunnel that goes directly from your home to your office on the 10th floor, bypassing the lobby entirely. To anyone walking past on the street, that office doesn't exist. That's exactly the goal.

4. Automation: The Difference Between Sysadmin and Professional

The gap between someone who manages a server and someone who engineers infrastructure comes down to one thing: automation. You shouldn't be logging in to check if everything is fine. The server should be telling you.

A. SSH Audit Reports (Cron + Scripts)

Reading through /var/log/auth.log manually is not security operations — it's self-inflicted pain. The solution is a script that runs every night at 23:00 and drops a clean summary into a log file. No intervention required. If something happened during the day, you'll know about it in the morning.

# The crontab after hardening
crontab -l

# The line that does the heavy lifting:
0 23 * * * /usr/local/bin/ssh-summary.sh >> /var/log/ssh-summary.log 2>&1

# What the script captures every day:
→ Failed SSH login attempts (source IPs, usernames tried)
→ IPs currently blocked by Fail2Ban
→ Successful logins (who, when, from where)

Over time, /var/log/ssh-summary.log becomes an intelligence feed. You start noticing patterns: recurring IP ranges, consistent scanning hours, country clusters. That's situational awareness you couldn't get from staring at raw logs.
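A minimal sketch of what such a summary script aggregates. It works on a hypothetical auth.log excerpt here so it runs anywhere; the real script would point LOG at /var/log/auth.log, and the grep patterns assume the default Debian/Ubuntu sshd log format:

```shell
# Hypothetical auth.log excerpt standing in for /var/log/auth.log.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
May  4 02:11:09 vps sshd[812]: Failed password for root from 203.0.113.7 port 40022 ssh2
May  4 02:11:12 vps sshd[812]: Failed password for admin from 203.0.113.7 port 40023 ssh2
May  4 03:40:51 vps sshd[845]: Failed password for root from 198.51.100.9 port 33810 ssh2
May  4 14:32:40 vps sshd[901]: Accepted publickey for labadmin from 198.51.100.3 port 50114 ssh2
EOF

echo "=== SSH summary, $(date +%F) ==="
echo "Failed attempts:  $(grep -c 'Failed password' "$LOG")"
echo "Accepted logins:  $(grep -c 'Accepted ' "$LOG")"
echo "Top offenders:"
# The source IP sits three fields before the end of each failure line.
grep 'Failed password' "$LOG" | awk '{print $(NF-3)}' | sort | uniq -c | sort -rn
```

The "top offenders" pipeline is where the intelligence comes from: one IP hammering two usernames in three seconds looks very different from four unrelated one-off attempts.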

While you're in the crontab, also schedule rkhunter for weekly rootkit scans and unattended security patches. If a critical CVE drops on a Tuesday night, you don't want to find out about it two weeks later.

| Component | What it does for you |
| --- | --- |
| ssh-summary.sh | Nightly report at 23:00. Groups failed attempts, blocked IPs and successful logins into one readable summary. |
| /var/log/ssh-summary.log | Your intelligence history. Patterns emerge over weeks. Recurring attackers, consistent scan windows, coordinated campaigns. |
| rkhunter (weekly) | Scans for rootkits and binary modifications. If something tampered with your system binaries, you'll know. |
| unattended-upgrades | Security patches applied automatically. Critical CVEs don't wait for your maintenance window. |

Non-technical explanation

Your server gets hit by hundreds of bots every single day. Going through the raw logs is like watching hours of security camera footage looking for a suspicious face. Running ssh-summary.sh is like hiring someone to do that for you and leave a note on your desk every morning: "847 failed attempts, 23 new IPs blocked, only you logged in at 14:32 from your usual location." That's intelligence. Everything else is noise.

5. Storage Audit: Docker and Snap Will Eat Your Disk

Nobody warns you about this when you start stacking containers and Snap packages on a server. It doesn't happen all at once. It's gradual. And then one day you run df -h and the disk is at 87% and you genuinely don't know where it went.

# The autopsy
du -sh /* 2>/dev/null | sort -rh | head -10

# What the evidence shows:
22G /var → Docker volumes + unrotated logs
12G /snap → Snap packages with multiple retained versions
4.2G /usr → Base system — this is normal

Those 22GB in /var are mostly Docker orphans: images from containers you deleted months ago, build cache layers, volumes from services you stopped running. A single docker system prune -a --volumes can reclaim a significant chunk of that.
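Pruning reclaims the space once; capping container logs keeps /var from growing back. Assuming the default json-file log driver, a /etc/docker/daemon.json like the one below (followed by a daemon restart) limits every container to three rotated 10 MB log files; the size and count values here are illustrative, pick what fits your disk:

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
```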

Snap is a different problem. By default it keeps up to three revisions of every package installed, old versions included. With a handful of security tools installed via Snap, that adds up fast. One line fixes it permanently.

| The problem | The fix |
| --- | --- |
| /var at 22 GB | docker system prune -a --volumes to reclaim space. Then automate periodic cleanup via cron and enable logrotate for log management. |
| /snap at 12 GB | snap set system refresh.retain=2 tells Snap to stop hoarding old versions. Set it once, forget about it. |
| Manual patching | unattended-upgrades active and configured for security packages. CVEs don't wait for your calendar reminder. |

Non-technical explanation

Docker and Snap are like roommates who never take out the trash. Every week they bring new stuff in but nothing old ever leaves. Eventually the apartment collapses under the weight of it, even though everyone's technically behaving. docker system prune is the big cleanup day. snap set refresh.retain=2 is the house rule that says you can only keep two sets of clothes. And unattended-upgrades is the maintenance crew that fixes the leaky pipes at 3am so you don't have to.

> SYSTEM_READY > NODE_ONLINE
