devops · deployment · linux

Zero-Downtime Deploys on a Single VPS with Atomic Symlinks

Ryan C

You don't need Kubernetes to get zero-downtime deploys. You don't even need Docker. The Linux primitives you already have — symlinks, systemd, and rsync — can give you atomic deploys with instant rollback. Here's how.

I run eight production apps on a single VPS. No Kubernetes. No Docker Swarm. No container orchestration of any kind. And I can deploy any of them with zero downtime and roll back in under a second.

The secret is embarrassingly simple: symlinks and systemd.

The Problem

The naive approach to deploying a Node.js app is: stop the process, replace the files, start the process. That's a 5-30 second gap where your app is completely down. For a marketing site, maybe that's fine. For a client portal where people are mid-checkout or uploading files, it's not.

I looked at blue-green deployments, rolling updates, and container orchestration. All of them added complexity that didn't match my scale. I needed something that works on a single box with zero additional infrastructure.

The Solution

Every project has this directory structure:

/home/myapp/
├── app -> releases/20260123-143022/    # symlink to current release
├── releases/
│   ├── 20260123-143022/                # current
│   ├── 20260122-091545/                # previous (rollback target)
│   ├── 20260120-164230/
│   ├── 20260118-112000/
│   └── 20260115-090000/                # oldest (will be cleaned up)
└── shared/                              # persistent data across releases

A deploy does three things:

  1. Rsync the new build into a fresh timestamped release directory
  2. Swap the symlink atomically: ln -sfn releases/20260123-143022 app
  3. Restart the systemd service: systemctl restart hostkit-myapp
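The three steps above can be sketched as a small script. This is a self-contained demo, not the actual tooling: it stages everything under a temporary directory, uses `cp -a` where a real deploy would use `rsync -a`, and leaves the `systemctl restart` commented out since it needs root. Paths and the build contents are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sandbox standing in for /home/myapp (layout from the article)
base=$(mktemp -d)
mkdir -p "$base/releases" "$base/shared"

# Fake build output that a real deploy would rsync over
build=$(mktemp -d)
echo 'console.log("hello")' > "$build/server.js"

# 1. Fresh timestamped release directory
release="$base/releases/$(date +%Y%m%d-%H%M%S)"
mkdir "$release"

# 2. Copy the build in (production: rsync -a "$build"/ "$release"/)
cp -a "$build"/. "$release"/

# 3. Symlink swap: app now points at the new release
ln -sfn "$release" "$base/app"

# 4. Restart the service (needs root; disabled in this demo)
# systemctl restart hostkit-myapp

readlink "$base/app"
```

Running it prints the path of the release the `app` symlink now resolves to.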

The symlink swap is effectively atomic: at no moment does app point to nothing — it points to either the old release or the new one. (Strictly speaking, GNU ln -sfn unlinks and recreates the link; for a guaranteed-atomic swap, create a temporary symlink and mv -T it over app, since rename(2) is atomic.) The restart takes 1-2 seconds while Node boots, and because systemd keeps the listening socket open across the restart, incoming requests queue in the kernel backlog rather than fail.
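For the socket-activation behavior, the service needs a matching `.socket` unit that systemd pairs with it by name. A sketch of what the pair might look like — the unit names, port, and paths are assumptions, and the Node app has to be written to accept the inherited socket (systemd passes it as fd 3, advertised via `LISTEN_FDS`):

```ini
# /etc/systemd/system/hostkit-myapp.socket (hypothetical)
[Socket]
ListenStream=127.0.0.1:3000

[Install]
WantedBy=sockets.target

# /etc/systemd/system/hostkit-myapp.service (hypothetical)
[Service]
# Always runs whatever release the app symlink currently points at
WorkingDirectory=/home/myapp/app
ExecStart=/usr/bin/node /home/myapp/app/server.js
User=myapp
Restart=always
```

Because systemd owns the listening socket, it survives the service restart, and connections accepted during the gap wait in the backlog until the new process picks them up.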

Instant Rollback

Rollback is the same operation in reverse — repoint the symlink to the previous release directory and restart. No rebuilding, no re-downloading, no database migrations to reverse. The old code is sitting right there on disk.

# Point back to previous release
ln -sfn releases/20260122-091545 app
systemctl restart hostkit-myapp

The symlink swap itself is sub-second. The only downtime is the Node.js restart, which is typically under two seconds.
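Because release directories are timestamped, "previous release" can be computed rather than typed by hand. A sketch of that — directory layout as in the article, but run against a sandbox so it's self-contained; the restart is omitted since it needs root:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sandbox mimicking /home/myapp with two releases, app on the newest
base=$(mktemp -d)
mkdir -p "$base/releases/20260122-091545" "$base/releases/20260123-143022"
ln -sfn "$base/releases/20260123-143022" "$base/app"

# Second-newest release = rollback target (lexicographic sort matches
# chronological order for YYYYMMDD-HHMMSS names)
prev=$(ls -1d "$base"/releases/*/ | sort -r | sed -n '2p')
prev=${prev%/}

ln -sfn "$prev" "$base/app"
# systemctl restart hostkit-myapp   # needs root; omitted in demo

readlink "$base/app"
```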

I keep 5 releases on disk and garbage-collect the oldest on each deploy. That gives me a comfortable rollback window without eating disk space.
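The cleanup step is one pipeline: list releases newest-first, skip the first five, delete the rest. A runnable sketch (sandboxed with made-up timestamps; in production the `rm -rf` would run inside `/home/myapp/releases`, so it deserves care):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Sandbox with seven fake releases
base=$(mktemp -d)
for ts in 20260115-090000 20260116-120000 20260118-112000 \
          20260120-164230 20260121-080000 20260122-091545 20260123-143022; do
  mkdir -p "$base/releases/$ts"
done

# Keep the 5 newest, remove everything older
cd "$base/releases"
ls -1d */ | sort -r | tail -n +6 | xargs -r rm -rf

ls -1 | wc -l
```

After the run, only the five newest release directories remain.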

Health Checks as a Deploy Gate

The deploy isn't "done" when the files are copied. It's done when the health check passes. After the symlink swap and restart, I poll /api/health every 5 seconds for up to 2 minutes. If health never returns 200, the deploy is marked as failed.

// Every app has this — it's a platform requirement
// app/api/health/route.ts
export async function GET() {
  return Response.json({ status: 'ok' })
}
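The polling side can be a small shell loop: run the check, and give up once the timeout elapses. Sketched here as a generic helper so it can be exercised without a running server — in production the check command would be something like `curl -fsS http://127.0.0.1:3000/api/health` (port and URL are assumptions):

```shell
#!/usr/bin/env bash

# poll_health CHECK_CMD [TIMEOUT_SECS] [INTERVAL_SECS]
# Returns 0 as soon as CHECK_CMD succeeds, 1 if the timeout elapses first.
poll_health() {
  local cmd=$1 timeout=${2:-120} interval=${3:-5} waited=0
  until eval "$cmd"; do
    if (( waited >= timeout )); then
      return 1
    fi
    sleep "$interval"
    (( waited += interval ))
  done
  return 0
}

# Production usage (hypothetical port/path), matching the 5s / 2-min policy:
# poll_health 'curl -fsS http://127.0.0.1:3000/api/health >/dev/null' 120 5
```

Wiring the deploy's exit status to this function is what turns the health check into a gate rather than a log line.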

This catches the most common deploy failures: missing environment variables, broken imports, database connection issues. If the app can't start, I know immediately instead of finding out from a client email.

Why Not Containers?

I get asked this a lot. Containers solve real problems — dependency isolation, reproducible builds, orchestration at scale. But they also add a layer of indirection that I don't need at my scale.

With this setup:

  • I can tail -f a log file directly. No docker logs abstraction.
  • I can inspect the running process with htop. No container layer.
  • I can read and modify config files on disk. No image rebuilds.
  • Memory overhead is just the Node.js processes, not Docker daemons and overlay filesystems.

At eight apps, the operational simplicity of "it's just Linux processes" outweighs the packaging benefits of containers. If I were running fifty apps across multiple servers, I'd reconsider. But I'm not, and premature infrastructure complexity is just as real as premature code complexity.

The Full Deploy Sequence

1. Pre-flight checks (project exists, not rate-limited)
2. Create timestamped release directory
3. Rsync standalone build → release directory
4. Atomic symlink swap (app → new release)
5. Restart systemd service
6. Health check polling (every 5s, 2-min timeout)
7. Clean up old releases (keep 5)

Seven steps. No container registry. No orchestrator. No YAML files describing desired state. Just filesystem operations and a process manager that every Linux distro ships with.

Sometimes the best infrastructure is the infrastructure you already have.

Want us to build something like this for you?

We ship production software in days, not months. Tell us what you need — our AI receptionist is standing by.
