How We Eliminated DNS Downtime with Systemd Socket Activation

2025-07-25

The Problem That Kept Me Up at Night

Picture this: you’re running a DNS service that handles over 100 queries per second. Every time you need to deploy an update or restart the service, you’re essentially playing Russian roulette with your users’ internet connectivity.

I learned this the hard way when we had a critical security patch that needed to go out. The traditional restart dance looked something like this:

systemctl restart dns-service
# 1. Service stops accepting new connections
# 2. Existing connections get dropped  
# 3. Brief moment where port 53 is unbound
# 4. New process starts and binds to port
# 5. Users get DNS timeouts during this gap

That “brief moment” was killing us. Even a 2-second restart meant 200+ dropped DNS queries. DNS timeouts make websites feel broken, and broken websites make angry customers.

The Lightbulb Moment: Let Systemd Handle the Sockets

Then I discovered systemd socket activation, and honestly, it felt like magic. The core idea is simple but brilliant: what if the operating system held onto the sockets for us during restarts?

Let me show you the difference visually:

Traditional Restart (with packet loss):

                                                            
     0s               2s              4s            6s      
      │               │               │              │      
      ▼               ▼               ▼              ▼      
┌──────────┐     ┌─────────┐     ┌─────────┐    ┌─────────┐ 
│          │     │         │     │         │    │         │ 
│  Client  ├────►│ App v1  ├─X──►│ SIGTERM ├───►│   New   │ 
│  Request │     │         │     │         │    │   App   │ 
│          │     │         │     │         │    │         │ 
└──────────┘     └─────────┘     └─────────┘    └─────────┘ 
                     │ 
                     ▼ 
                  ⚡ SIGTERM    
                  🔌 Socket closed

                  📱 Restart begins
                               ❌ Incoming requests LOST!

Socket Activation (zero packet loss):


      0s               2s              4s            6s      
       │               │               │              │      
       ▼               ▼               ▼              ▼      
 ┌──────────┐     ┌─────────┐     ┌─────────┐    ┌─────────┐ 
 │          │     │         │     │         │    │         │ 
 │  Client  ├────►│ systemd ├────►│ systemd ├───►│   New   │ 
 │  Request │     │ socket  │     │ socket  │    │   App   │ 
 │          │     │         │     │         │    │         │ 
 └──────────┘     └─────────┘     └────┬────┘    └─────────┘ 
                       ▲               │              ▲      
                       │               ▼              │      
                  ┌────┴────┐     ┌─────────┐    ┌────┴────┐ 
                  │         │     │         │    │         │ 
                  │ App v1  │     │ Kernel  │    │ Queue   │ 
                  │ exiting │     │ Queue   │    │ -> App  │ 
                  │         │     │         │    │         │ 
                  └─────────┘     └─────────┘    └─────────┘ 
                                                             
                  ⚡ App dies    📦 Requests    ✅ Zero loss!
                                   queued       

Key Differences:

  • Traditional: Socket dies with the application → incoming requests are lost during restart
  • Socket Activation: systemd owns the socket → requests are buffered in kernel queue during restart

What’s happening at each step:

  • 0s-2s: Normal operation - client requests flow through systemd socket to your app
  • 2s-4s: systemd sends SIGTERM to old app and starts new one. Critical bit: systemd socket stays alive, kernel queues incoming requests
  • 4s-6s: New app is ready and tells systemd “I’m listening.” Systemd hands over the socket, queued requests flow through
  • Result: Client never sees a connection refused or timeout - it’s completely transparent

The beauty is that systemd keeps the socket alive even when your app restarts. The kernel queues incoming connections during the brief restart window, so literally zero packets get dropped.

Setting It Up: The Systemd Side

The first step was creating the systemd socket unit. This is where you tell systemd which ports to bind to:

quietnet.socket:

[Unit]
Description=QuietNet DNS Proxy Sockets
Before=quietnet.service

[Socket]
# DoH (HTTPS)
ListenStream=443
# DoT (DNS over TLS)
ListenStream=853
Service=quietnet.service

[Install]
WantedBy=sockets.target

And then a companion service unit that knows about the socket:

quietnet.service:

[Unit]
Description=QuietNet DNS Service
Requires=quietnet.socket
After=quietnet.socket

[Service]
Type=notify
ExecStart=/usr/local/bin/quietnet
TimeoutStopSec=30

The key bits are Requires=quietnet.socket and After=quietnet.socket - together they tell systemd “hey, make sure those sockets are ready before you start my app.”

The Go Code Changes

Now for the fun part - getting our Go application to use the sockets that systemd prepared for us. This required some rewiring, but it was surprisingly straightforward.

Before: Traditional Socket Binding

// Old way - we create and bind the sockets ourselves
package main

import (
    "log"
    "net/http"
)

func main() {
    server := &http.Server{Addr: ":443"}
    log.Fatal(server.ListenAndServeTLS("cert.pem", "key.pem"))
}

After: Socket Activation

package main

import (
    "log"
    "net/http"

    "github.com/coreos/go-systemd/v22/activation"
)

func main() {
    // Get the sockets that systemd prepared for us
    listeners, err := activation.Listeners()
    if err != nil {
        log.Fatal(err)
    }
    
    if len(listeners) > 0 {
        // Use systemd socket
        server := &http.Server{}
        log.Fatal(server.ServeTLS(listeners[0], "cert.pem", "key.pem"))
    } else {
        // Fallback to traditional binding
        server := &http.Server{Addr: ":443"}
        log.Fatal(server.ListenAndServeTLS("cert.pem", "key.pem"))
    }
}

The magic happens in activation.Listeners() - this function talks to systemd and says “give me any sockets you’ve prepared for this process.” Systemd hands them over as ready-to-use net.Listener objects.

The Results Blew My Mind

After deploying this change, the difference was night and day:

Before socket activation:

  • Restart time: 2-5 seconds of downtime
  • Dropped requests: 100-500 per restart
  • My stress level: 📈

After socket activation:

  • Restart time: Literally 0 seconds of downtime
  • Dropped requests: 0
  • My stress level: 📉

I was honestly skeptical at first, worried that packets would still be dropped in the gap between the old process letting go of the socket and the new one starting to accept. But after weeks of monitoring production restarts, the metrics don’t lie - systemd plus kernel queueing really does eliminate the restart window completely.

The only thing users see is slightly higher latency on requests that arrive mid-restart - they sit in the kernel queue until the new server is fully booted up.

The Gotchas I Wish Someone Had Told Me

Of course, it wasn’t all smooth sailing. Here are the things that tripped me up:

Signal Handling Is Critical

Your app must handle SIGTERM properly for graceful shutdowns. If you ignore signals, systemd will eventually SIGKILL your process, and you’ll lose the zero-downtime benefit:

// Don't forget this!
c := make(chan os.Signal, 1)
signal.Notify(c, os.Interrupt, syscall.SIGTERM)

go func() {
    <-c
    log.Println("Shutting down gracefully...")
    server.Shutdown(context.Background())
}()

Different Server Types, Different APIs

HTTP servers and DNS servers handle socket activation differently. HTTP has ServeTLS(listener) but DNS servers need their Listener field set before calling ActivateAndServe(). Read the docs for your specific server type.

Don’t Forget the Ready Notification

When you use Type=notify in your systemd service (not strictly required for socket activation, but it’s how systemd learns your app is actually up), your app must tell systemd when it’s ready to serve requests. If you forget this, systemctl restart will sit in “activating” state until the start timeout expires and then fail:

import "github.com/coreos/go-systemd/v22/daemon"

func main() {
    // Start your servers
    go startDoTServer()
    go startDoHServer()
    
    // ✅ CRITICAL: Tell systemd we're ready!
    daemon.SdNotify(false, daemon.SdNotifyReady)
    
    // Block and wait for signals
    select {}
}

I learned this one the hard way when my restart commands started hanging - systemd was patiently waiting for a notification that never came!

Environment Variables

systemd services run in a minimal environment. Make sure to explicitly set any environment variables your app needs in the service unit file.
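
For example, a drop-in like this (the file names and variables here are hypothetical - substitute your own) keeps the configuration next to the unit instead of relying on a login shell's environment. The leading - on EnvironmentFile makes the file optional:

```ini
# /etc/systemd/system/quietnet.service.d/env.conf
[Service]
Environment=QUIETNET_CONFIG=/etc/quietnet/config.yaml
EnvironmentFile=-/etc/quietnet/env
```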

Testing Socket Activation Locally

Want to try this out without setting up a full production environment? You can test socket activation using systemd user mode (there’s also a systemd-socket-activate helper that fakes the activation environment for one-off runs):

# Create socket unit
mkdir -p ~/.config/systemd/user
cat > ~/.config/systemd/user/myapp.socket << EOF
[Socket]
ListenStream=8080

[Install]
WantedBy=sockets.target
EOF

# The socket needs a matching service unit to activate
# (point ExecStart at your own binary)
cat > ~/.config/systemd/user/myapp.service << EOF
[Service]
ExecStart=%h/bin/myapp
EOF

# Enable and start
systemctl --user enable myapp.socket
systemctl --user start myapp.socket

# Test that it works
curl localhost:8080  # This triggers your app to start

When Should You Use Socket Activation?

Great candidates:

  • High-traffic services where any downtime hurts
  • Critical infrastructure (DNS, load balancers, API gateways)
  • Services you deploy frequently
  • Anything handling long-lived connections

Skip it for:

  • Batch jobs or cron tasks
  • Local development tools
  • Services with very slow startup times (they’ll block the socket)

Final Thoughts

Socket activation feels like a Linux superpower once you understand it. The fact that the kernel can seamlessly queue connections during process restarts is genuinely magical - it’s one of those “the more you know about how computers work, the more impressed you get” moments.

Our DNS service went from being a source of deployment anxiety to something we can restart multiple times per day without thinking twice. That peace of mind is worth the initial learning curve.

If you’re running any kind of network service on Linux, especially something critical, I can’t recommend socket activation enough. Your users (and your stress levels) will thank you.

Ready for ad-free browsing without DNS downtime?

QuietNet is the ad-blocking DNS service that never drops your connection. Built with systemd socket activation from day one, QuietNet blocks ads and trackers while ensuring zero interruptions - even during our updates and maintenance.

✅ Zero packet loss during our deployments
✅ Advanced ad and tracker blocking
✅ DoH and DoT privacy protection
✅ 99.99% uptime guarantee

Stop settling for ad-blockers that go offline during updates. Switch to QuietNet today →

The only ad-blocking DNS that never lets ads sneak through during downtime.

Resources