SSL Certificate Auto-Renewal for Trojan-Go: Why It Silently Breaks and the One-Line Hook That Fixes It Forever

A field manual for anyone running Trojan-Go (or any 443-hijacking proxy) behind Let’s Encrypt.

Three months after deploying the “Ultimate Stealth Proxy” setup, my www.ruianding.com suddenly went dark. Browsers yelled NET::ERR_CERT_DATE_INVALID, and Shadowrocket refused to dial out. My first reaction: “Impossible. I configured certbot to auto-renew.” My second reaction, after 15 minutes of debugging: “Oh. That’s why.”

This post explains the exact failure mode, how to diagnose it in under 60 seconds, and the 6-line hook that guarantees it never happens again.


I. The Paradox: Certbot Renewed, But the Cert Is Still Expired

The first clue was this contradiction:

# Certbot says the cert is fine
$ sudo certbot certificates
  Expiry Date: 2026-07-04 01:27:30+00:00 (VALID: 58 days)
  Certificate Path: /etc/letsencrypt/live/ruianding.com/fullchain.pem

# But the live server is serving an expired one
$ echo | openssl s_client -servername www.ruianding.com -connect www.ruianding.com:443 2>/dev/null \
    | openssl x509 -noout -dates -subject
notBefore=Feb  3 13:35:43 2026 GMT
notAfter=May  4 13:35:42 2026 GMT       # <-- EXPIRED
subject=CN = ruianding.com              # <-- old cert, no www SAN

Same domain. Two different certificates. One on disk, one on the wire.
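
If you would rather not eyeball dates, comparing SHA-256 fingerprints makes the disk-versus-wire mismatch unambiguous. The following is only a minimal sketch: the function names (disk_fingerprint, wire_fingerprint, drift_status) are made up for this post, and it assumes openssl is installed and the host answers on 443.

```shell
# Fingerprint of the leaf certificate in a PEM file
# (openssl x509 reads the first cert in fullchain.pem, which is the leaf).
disk_fingerprint() {
    openssl x509 -in "$1" -noout -fingerprint -sha256
}

# Fingerprint of the certificate actually served for a hostname on port 443.
wire_fingerprint() {
    echo | openssl s_client -servername "$1" -connect "$1:443" 2>/dev/null \
        | openssl x509 -noout -fingerprint -sha256 2>/dev/null
}

# Compare two fingerprint strings and print a one-word verdict.
drift_status() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "UNKNOWN"      # one side could not be read
    elif [ "$1" = "$2" ]; then
        echo "IN_SYNC"      # the proxy serves the cert that is on disk
    else
        echo "STALE"        # the proxy still serves an older cert
    fi
}

# Typical use (as root, so the live directory is readable):
#   drift_status "$(disk_fingerprint /etc/letsencrypt/live/yourdomain.com/fullchain.pem)" \
#                "$(wire_fingerprint www.yourdomain.com)"
```

A STALE verdict is the paradox above in one word: the file on disk changed, but the listener never noticed.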


II. The Root Cause: Trojan-Go Holds the Cert in Memory

Unlike Apache or Nginx, which re-read cert files from disk on a graceful reload (SIGUSR1 for Apache, SIGHUP for Nginx) without dropping connections, Trojan-Go reads fullchain.pem and privkey.pem exactly once, at process startup, and keeps them in memory until the process exits.

So the real timeline was:

Date   | Event                                                         | What the client saw
Feb 3  | Trojan-Go started, loaded cert v1 (expires May 4)             | cert v1
Apr 5  | certbot.timer fired, cert v2 written to disk (expires Jul 4)  | cert v1 (still!)
May 4  | Cert v1 expires in Trojan-Go’s memory                         | broken
May 6  | Me, panicking                                                 | broken

The renewal succeeded. The deploy-hook to restart the consumer didn’t exist. Certbot has no idea that a 443-squatting process is quietly holding the old cert hostage.
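
The dates in that timeline fall out of two numbers: the 90-day certificate lifetime, and certbot renewing once fewer than 30 days remain. The 30-day threshold is certbot's default as far as I know, so treat it as an assumption if you have tuned renew_before_expiry. A small sketch with GNU date; add_days is a helper I made up for illustration.

```shell
# Add N days to a YYYY-MM-DD date, working in UTC to sidestep DST.
# Requires GNU date (for -d and the @epoch form).
add_days() {
    start=$(date -u -d "$1" +%s)
    date -u -d "@$((start + $2 * 86400))" +%F
}

add_days 2026-02-03 90   # cert v1 expiry           -> 2026-05-04
add_days 2026-02-03 60   # v1 becomes renewable     -> 2026-04-04
add_days 2026-04-05 90   # cert v2 expiry           -> 2026-07-04
```

The renewal window opening on Apr 4 is consistent with the Apr 5 timer run in the table, and it leaves a full month in which the new cert sits on disk unused.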

The culprit — or rather, the missing piece — was visible the whole time in /etc/letsencrypt/renewal/ruianding.com.conf:

[renewalparams]
authenticator = webroot
webroot_path = /data/ruianding.com,
server = https://acme-v02.api.letsencrypt.org/directory
key_type = ecdsa
# ← no renew_hook, no post_hook, no deploy_hook. Nothing.
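
The "is there any hook at all?" question can be scripted so it covers both places a hook can live: the per-cert renewal config and the shared renewal-hooks/deploy/ directory. This is a rough sketch with made-up function names; it relies on certbot storing a --deploy-hook under the key renew_hook, which is what I have seen in renewal configs.

```shell
# Succeeds if the renewal config defines any hook key.
# A --deploy-hook shows up in the file as renew_hook.
conf_has_hook() {
    grep -Eq '^[[:space:]]*(pre_hook|post_hook|renew_hook)[[:space:]]*=' "$1"
}

# Succeeds if the directory contains at least one executable regular file.
dir_has_deploy_hook() {
    for f in "$1"/*; do
        [ -f "$f" ] && [ -x "$f" ] && return 0
    done
    return 1
}

# Typical use (as root):
#   conf_has_hook /etc/letsencrypt/renewal/yourdomain.com.conf \
#       || dir_has_deploy_hook /etc/letsencrypt/renewal-hooks/deploy \
#       || echo "no hook anywhere: nothing will restart the proxy"
```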

III. 60-Second Diagnostic Playbook

Run these five commands whenever HTTPS dies on a stealth-proxy box. The answer is almost always in the diff between commands 1 and 2.

# 1. What does certbot THINK is deployed?
sudo certbot certificates

# 2. What is ACTUALLY being served on port 443?
echo | openssl s_client -servername www.yourdomain.com -connect www.yourdomain.com:443 2>/dev/null \
    | openssl x509 -noout -dates -subject -ext subjectAltName

# 3. Who owns port 443? (Sanity check: not apache, not nginx)
sudo ss -tulpn | grep :443

# 4. Does the renewal config have a hook?
sudo grep -i hook /etc/letsencrypt/renewal/yourdomain.com.conf

# 5. When was the cert file last written vs. when did the proxy start?
sudo ls -lL /etc/letsencrypt/live/yourdomain.com/fullchain.pem
systemctl show trojan-go -p ActiveEnterTimestamp

If cert file mtime is newer than Trojan-Go start time → you’ve hit this exact bug.
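
Command 5 asks you to compare two timestamps by eye. Here is a hedged sketch that does the comparison for you. The helper names are hypothetical, and it assumes GNU stat, a systemd recent enough to support --value, and that date can parse the timestamp systemctl prints.

```shell
# Succeeds (exit 0) when the cert file was written after the service started,
# meaning the running process cannot have loaded it.
# Args: cert mtime and service start time, both as Unix epoch seconds.
cert_newer_than_service() {
    [ "$1" -gt "$2" ]
}

# mtime of the file a symlink points at (live/ entries are symlinks).
cert_mtime() {
    stat -L -c %Y "$1"
}

# Start time of a systemd unit as epoch seconds.
service_start() {
    date -d "$(systemctl show "$1" -p ActiveEnterTimestamp --value)" +%s
}

# Typical use (as root):
#   if cert_newer_than_service \
#        "$(cert_mtime /etc/letsencrypt/live/yourdomain.com/fullchain.pem)" \
#        "$(service_start trojan-go)"; then
#       echo "stale: trojan-go started before the cert was written"
#   fi
```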


IV. The Fix (Two Minutes, Three Steps)

Step 1 — Restore service immediately

sudo systemctl restart trojan-go

# Verify the wire now matches disk
echo | openssl s_client -servername www.yourdomain.com -connect www.yourdomain.com:443 2>/dev/null \
    | openssl x509 -noout -dates -subject -ext subjectAltName
# Expect: notAfter matches the Expiry Date from `certbot certificates`, SAN includes both apex + www
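
To turn that notAfter line into a number you can alert on, something like the sketch below works. days_remaining is a name I invented, GNU date is assumed, and the optional second argument only exists so the arithmetic can be checked against a fixed "now".

```shell
# Convert an "openssl x509 -noout -enddate" line into whole days left.
# Args: the "notAfter=..." line, and optionally "now" as epoch seconds.
# Integer division truncates toward zero, so a negative result means expired,
# but 0 can mean "expired less than a day ago" as well as "expires today".
days_remaining() {
    expiry=$(date -u -d "${1#notAfter=}" +%s)
    now=${2:-$(date -u +%s)}
    echo $(( (expiry - now) / 86400 ))
}

# Typical use:
#   days_remaining "$(echo | openssl s_client -servername www.yourdomain.com \
#       -connect www.yourdomain.com:443 2>/dev/null | openssl x509 -noout -enddate)"
```

Fed the renewed cert from section I (notAfter Jul 4 01:27:30 2026 GMT) at midday on May 6, this gives 58, the same figure certbot reported.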

Step 2 — Install a permanent deploy-hook

Certbot scans /etc/letsencrypt/renewal-hooks/deploy/ and runs every executable there only after a successful renewal (never on dry-runs, never on no-op checks). Inside the script, $RENEWED_LINEAGE is set to the live directory of the cert that was just renewed, which lets you scope actions per-domain on multi-cert boxes.

sudo tee /etc/letsencrypt/renewal-hooks/deploy/reload-services.sh > /dev/null <<'EOF'
#!/bin/bash
# Fires after every successful Let's Encrypt renewal.
# $RENEWED_LINEAGE looks like /etc/letsencrypt/live/yourdomain.com
if [[ "$RENEWED_LINEAGE" == *"/yourdomain.com" ]]; then
    systemctl reload  apache2    || true   # reload is enough for apache
    systemctl restart trojan-go  || true   # MUST be restart; SIGHUP won't re-read certs
fi
EOF
sudo chmod +x /etc/letsencrypt/renewal-hooks/deploy/reload-services.sh

Step 3 — Prove the hook works without waiting 60 days

Simulate what certbot will do:

# Execute the hook manually with the same env var certbot injects
sudo RENEWED_LINEAGE=/etc/letsencrypt/live/yourdomain.com \
    /etc/letsencrypt/renewal-hooks/deploy/reload-services.sh

# Confirm trojan-go came back up seconds ago
systemctl status trojan-go --no-pager | head -5
# Look for: Active: active (running) since ... Xs ago

# Confirm the full renewal pipeline still works end-to-end
sudo certbot renew --dry-run
# Look for: Congratulations, all simulated renewals succeeded

Done. Next time certbot.timer ticks over and a real renewal happens, trojan-go gets restarted automatically, picks up the new cert in memory, and you never notice.


V. Why --post-hook in the Original Guide Isn’t Enough

The original stealth-proxy post ended with:

sudo certbot renew --post-hook "systemctl restart trojan-go apache2"

This works, but it has two foot-guns I want to flag:

  1. --post-hook given on the command line applies to that invocation. Certbot writes it into /etc/letsencrypt/renewal/yourdomain.com.conf only if that run actually renews the certificate. If nothing was due, the next certbot.timer run will not know about it, and the hook you set this way vanishes the moment the terminal closes.
  2. post-hook fires after every renewal attempt, whether or not the attempt succeeded. A failed renewal still restarts your service, which is mildly annoying if you care about uptime counters, and the restart gives you nothing because the cert on disk has not changed.

The renewal-hooks/deploy/ directory approach is superior because:

  • It’s persistent (lives on disk, survives reboots, survives certbot package upgrades).
  • It only fires on actual successful renewals (deploy-hook semantics).
  • It’s per-domain aware via $RENEWED_LINEAGE, so one hook can handle multiple certs correctly.
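
To make the per-domain point concrete, here is a sketch of a single hook that dispatches on the lineage. The second domain and the haproxy service are placeholders I invented; only the $RENEWED_LINEAGE variable comes from certbot.

```shell
#!/bin/bash
# Map a renewed lineage to the services that hold that cert in memory.
# Domain and service names are placeholders; adjust to your box.
services_for_lineage() {
    case "$1" in
        */yourdomain.com) echo "trojan-go" ;;
        */example.org)    echo "haproxy" ;;
        *)                : ;;            # unknown lineage: restart nothing
    esac
}

# Restart every service mapped to the lineage certbot just renewed.
restart_for_lineage() {
    for svc in $(services_for_lineage "$1"); do
        systemctl restart "$svc" || true
    done
}

# In the real hook file the last line would be:
#   restart_for_lineage "$RENEWED_LINEAGE"
```

Keeping the mapping in one function means a renewal of one cert never bounces a proxy that serves a different one.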

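If you want the hook saved into the renewal config instead, do it once with a forced renewal (be aware that --force-renewal reissues every certificate on the box unless you scope it with --cert-name, and each reissue counts against Let’s Encrypt rate limits):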

sudo certbot renew --force-renewal \
    --deploy-hook "systemctl restart trojan-go && systemctl reload apache2"

…but honestly, the renewal-hooks/deploy/ directory is cleaner.


VI. Bonus Land-Mine: Webroot Breaks When Trojan-Go Squats on 443

Since Trojan-Go owns 443, HTTP-01 validation must go over port 80, via Apache’s *:80 VirtualHost, using the webroot authenticator. This creates a silent dependency: your Apache port-80 vhost’s DocumentRoot must match webroot_path in /etc/letsencrypt/renewal/yourdomain.com.conf, or the challenge file certbot writes under .well-known/acme-challenge/ will 404 when Let’s Encrypt tries to fetch it.

Check it once, sleep well forever:

grep -E 'DocumentRoot|<VirtualHost' /etc/apache2/sites-available/yourdomain.com.conf
grep webroot_path /etc/letsencrypt/renewal/yourdomain.com.conf

If they disagree, either fix the DocumentRoot or add an explicit alias inside the *:80 vhost:

Alias /.well-known/acme-challenge/ /data/yourdomain.com/.well-known/acme-challenge/
<Directory "/data/yourdomain.com/.well-known/acme-challenge/">
    Require all granted
</Directory>

And always verify end-to-end with:

sudo certbot renew --dry-run

A --dry-run pass means both the ACME challenge path and your webroot config are healthy — the two things that most often rot silently between renewals.
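
The DocumentRoot versus webroot_path comparison from earlier in this section can also be scripted. This is a sketch under assumptions: the function names are mine, it takes the first DocumentRoot in the file (so the *:80 vhost has to come first or be the only one), and it strips the trailing comma certbot writes after webroot_path. It does not know about Alias directives.

```shell
# Print the first DocumentRoot in an Apache vhost file,
# without quotes or a trailing slash.
vhost_docroot() {
    sed -n 's/^[[:space:]]*DocumentRoot[[:space:]]\{1,\}//p' "$1" \
        | head -n 1 | tr -d '"' | sed 's|/*[[:space:]]*$||'
}

# Print webroot_path from a certbot renewal config,
# minus the trailing comma and any trailing slash.
renewal_webroot() {
    sed -n 's/^[[:space:]]*webroot_path[[:space:]]*=[[:space:]]*//p' "$1" \
        | head -n 1 | sed 's|[,/[:space:]]*$||'
}

# Succeeds when both files point at the same directory.
webroot_matches() {
    [ -n "$(vhost_docroot "$1")" ] \
        && [ "$(vhost_docroot "$1")" = "$(renewal_webroot "$2")" ]
}

# Typical use (as root):
#   webroot_matches /etc/apache2/sites-available/yourdomain.com.conf \
#                   /etc/letsencrypt/renewal/yourdomain.com.conf \
#       || echo "DocumentRoot and webroot_path disagree"
```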


VII. TL;DR Checklist

Run once on every stealth-proxy box you own:

# 1. Install persistent deploy hook
sudo tee /etc/letsencrypt/renewal-hooks/deploy/reload-services.sh > /dev/null <<'EOF'
#!/bin/bash
if [[ "$RENEWED_LINEAGE" == *"/yourdomain.com" ]]; then
    systemctl reload  apache2    || true
    systemctl restart trojan-go  || true
fi
EOF
sudo chmod +x /etc/letsencrypt/renewal-hooks/deploy/reload-services.sh

# 2. Prove the hook works
sudo RENEWED_LINEAGE=/etc/letsencrypt/live/yourdomain.com \
    /etc/letsencrypt/renewal-hooks/deploy/reload-services.sh
systemctl status trojan-go --no-pager | head -5

# 3. Prove the renewal pipeline works
sudo certbot renew --dry-run

# 4. Prove the wire serves what certbot has on disk
echo | openssl s_client -servername www.yourdomain.com -connect www.yourdomain.com:443 2>/dev/null \
    | openssl x509 -noout -dates -subject -ext subjectAltName

Four commands. Two minutes. Zero 3-AM “why is my site down” incidents for the next 90 days — and every 90 days after that.

The moral: “auto-renewal” only renews the file. Someone still has to tell the long-running process to re-read it. On a normal LAMP box, that someone is certbot’s own Apache plugin, which reloads the web server after a renewal. On a stealth-proxy box where you’ve hand-rolled the 443 listener, that someone is you, exactly once, via the renewal-hooks/deploy/ directory.