Some paths on the site don’t belong to this app. They go to another host – a second backend behind the same domain – and nginx forwards them with a proxy_pass, POST and headers and all. It works, and for months I don’t think about it.
Then the other host goes down for a few minutes. The requests to it fail while it’s down – there’s nothing to forward to. That part I expect.
What I don’t expect: the host comes back, and the requests keep failing. The backend is up. I can curl it directly from the same machine. But nginx, sitting right there, keeps handing back errors for those paths as if nothing changed.
The fix is stupid and total. nginx -s reload, and the dead path is alive again instantly – no config change, the same config that was failing a second ago. Reloading is the only thing that brings it back.
My guess is that nginx resolved the host’s name once, when it loaded the config, and kept the answer for the life of the process. While the backend was down it went on dialing the address it had resolved at startup, and when the host came back at a different address, nginx had no reason to look again. A reload is the only thing that makes it look again.
I haven’t proven it. The shape of the fix is a resolver directive plus the hostname held in a variable that proxy_pass uses, so nginx looks the name up at request time and re-resolves when the DNS TTL expires instead of resolving once at startup. For now I have a reload and a note to myself.
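Here is a minimal sketch of that shape, as I understand it and have not yet tested it on this site. The resolver address, the hostname, and the location path are all placeholders I made up for illustration, not values from my real config.

```nginx
# Sketch only: 10.0.0.2, other-backend.internal and /other/ are
# hypothetical placeholders, not taken from the real setup.
server {
    listen 80;

    # DNS server nginx queries at request time. valid=30s overrides the
    # record's own TTL; leave it off to honor the TTL as served.
    resolver 10.0.0.2 valid=30s;

    location /other/ {
        # Holding the hostname in a variable makes nginx resolve it at
        # request time through the resolver above, instead of once when
        # the config is loaded.
        set $other_backend "other-backend.internal";

        # No URI part after the variable, so the original request URI
        # is passed to the backend unchanged.
        proxy_pass http://$other_backend;
    }
}
```

One thing to watch if I go this way: once proxy_pass contains a variable, any URI written after the host replaces the request URI as is, so a config that relied on nginx rewriting the path prefix would need checking before the switch.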
What I keep turning over is that nothing recovered on its own. The backend recovered. The proxy in front of it did not. A backend coming back is not the same as the thing in front of it noticing.
