While this is fascinating, and Willy is brilliant as always, I always wondered why HAProxy couldn't just, you know, reload the config.
Surely you don't need to fork: Just parse the new config, create the necessary internal data structures, and let traffic flow into the new ruleset while keeping all the sockets (except for those that are superfluous, and of course let in-flight requests finish). Is it because HAProxy's internals weren't designed to do that and that it would be too big of a rewrite?
I always found Varnish's design very cool: It compiles the configuration (which is a DSL called VCL) to C and loads it as a dynamically loaded library. I don't know how it does hot reloads, but I believe it does do them seamlessly.
The post kinda touches on this, and makes it clear that config changes aren't the only situation where this would come in handy -- software updates are a big reason too.
"Service upgrades are even more important because, while some products are designed from the ground up to support config updates without having to be reloaded, they cannot be upgraded at all without a stop/start sequence. Consequently, admins leave bogus versions of those software components running in production because it is never the right time to do it."
And if you have restarts without downtime, there is no need for configuration reloads anymore. Why solve the same problem in two ways? It would just increase the chance of bugs.
All in all, HAProxy is a brilliant piece of software and Willy is running it in an exemplary way. Kudos!
I mean, "apachectl graceful" has existed for 20+ years. Sending a graceful-restart signal (USR1) to Apache will cause it to re-read its config, close and re-open log files, and signal all children to exit after their current work is finished. If they have no work, they die immediately. New children are created under a master process that has the new configuration.
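For the record, the mechanism above boils down to one signal to the master process. A minimal sketch (the pid-file path is an assumption and varies by distro):

```shell
# Trigger Apache's graceful restart by signalling the master directly.
graceful() {
    pidfile=${1:-/var/run/httpd.pid}   # assumed path; Debian uses /var/run/apache2/apache2.pid
    if [ -r "$pidfile" ]; then
        kill -USR1 "$(cat "$pidfile")"   # equivalent to: apachectl graceful
    else
        echo "no running httpd (no pid file at $pidfile)"
        return 1
    fi
}
```

The master re-reads the config and replaces children as they drain, so listening sockets stay open throughout.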
I know that HAProxy is not an apples to apples comparison with something like apache, but I don't see how something similar would be a huge burden to add.
A multi-protocol (way more than http) load balancer with health checking just has more state than a web server. Check bug history for apache, and you'll see that modules that deal with state, like mod_security, have had issues with graceful reload in the past.
It's not that surprising to me, especially given that haproxy is 17 years old. Expectations of a load balancer were pretty light when it was invented, so the internals weren't built for hot reload.
If you can restart without downtime you don't need a hot config reload. Suddenly you can treat config as immutable and discard an entire category of bugs as well.
> Is it because HAProxy's internals weren't designed to do that and that it would be too big of a rewrite?
Bingo. The internals will eventually be rewritten to support things like this (and I think that's what Willy was hinting at towards the end of the article) in time. But it's a big project, and it's a complicated piece of software, and there are a lot of conflicting demands on dev time.
That said: it's a welcoming community, and if you want to help... do it!
>>> I always wondered why HAProxy couldn't just, you know, reload the config.
It reloads just fine, same as all software.
Or if you prefer the pessimistic version: HAProxy, nginx, Apache and Varnish all suck at reloading configurations.
The difference is that HAProxy 1) tests it and 2) has a TCP mode.
To quote the issue: they manage to hit 1 error per 40,000 connections... but only when pinning to specific CPUs (typical in high-performance environments to achieve 100% usage of all cores) while doing 10 reloads per second and handling 80k new connections per second.
Do you think any of Apache/nginx/Varnish would do better than that in those circumstances? If you do, you are not being very realistic ;)
I love HAProxy so much. In our architecture, it started with simple frontend load balancing, but it ended up mediating almost every inter-server communication, which gave us a great amount of flexibility in swapping machines in and out, by giving each service load balanced virtual ip addresses. Thanks for your great work, Willy.
I wonder, how do nginx and HAProxy handle long-lived/persistent connections? The connection itself can't be terminated, since there is an actual client with an established connection to a backend underneath. Will the reload fail? (Something like "connection termination timeout; can't reload; try later".) For web workers we probably won't see this; most of the time the connection is terminated once the request is done.
Both NGINX and HAProxy will hang around for as long as the connection is open (up to the timeout). It's actually quite an issue when you're rapidly reloading either proxy (you can run out of memory reasonably easily), but most services that have long lived TCP connections also handle resets reasonably well so you can typically just kill the old proxies and it'll be ok.
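On the HAProxy side there is a global directive for bounding exactly this pile-up of draining processes; a minimal fragment (the 30s value is just an example):

```
global
    # Force old, draining processes to exit at most 30s after a reload,
    # killing whatever connections are still open at that point.
    hard-stop-after 30s
```

This trades a hard reset of long-idle connections for a cap on how many old processes (and how much memory) can accumulate under rapid reloads.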
I've used HAProxy on and off throughout a lot of my career. I'm currently using it at my company as a way for services to talk to each other without specifically knowing who is where. I wouldn't call it "microservices" but probably similar: each server has HAProxy on it, and Ansible creates the HAProxy config/hosts file so that, say, a worker server can grab http://lb-api:6666/some/resource. lb-api is a host that routes to 127.0.0.1 and HAProxy runs on port 6666 locally, parses the "lb-api" host, and routes the request to one of the servers in the "api" group. Any time we change any servers, we just run our haproxy playbook and everything just flows.
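A hedged sketch of the per-host sidecar described above (the "lb-api" name and port 6666 come from the comment; the backend addresses are made up, and in practice Ansible would template the server lines):

```
listen lb-api
    bind 127.0.0.1:6666
    mode http
    balance roundrobin
    # One line per machine in the "api" group, generated by the playbook.
    server api1 10.0.0.11:8080 check
    server api2 10.0.0.12:8080 check
```

Callers only ever see lb-api:6666; swapping machines is just regenerating this section and reloading.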
As always, HAProxy is one of the few pieces of our infrastructure that "just works" day after day.
I highly recommend this tool. Yelp has used it in production for years to manage a fairly large PaaS (hundreds of services, thousands of containers, constant churn); it's proven quite flexible and resilient.
Synapse is available on github [0], and we've open sourced our automation used to create a highly available service router using Synapse as well [1][2].
Really appreciate not only the work put into Synapse, but also the release and maintenance of it. It's been very helpful for projects where I've used it.
This is great. We use haproxy at my work and I like it, it does its job, but quirks like DNS resolution only at startup, having to reload on config changes, and the lack of seamless reloading stop me from loving it.
It still requires explicit action. However, the old way had a little dance between the old process and the new process: the new process tells the old process to start shutting down, the old process stops listening for new connections, then the new process starts listening for new connections. That left a gap where connections got rejected.
The new technique is for the old process to use a Unix socket to seamlessly transfer ownership of the listening sockets to the new process. At no point are the listening sockets closed, so no connections are rejected.
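The underlying mechanism is ordinary SCM_RIGHTS file-descriptor passing. A minimal sketch (not HAProxy's actual code) in one process standing in for both sides, using a socketpair where HAProxy uses its stats socket:

```python
import socket

# "Old process": owns a listening TCP socket on an arbitrary free port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen()

# A connected Unix socket pair stands in for the stats socket here.
old_end, new_end = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# Old process sends the fd; the kernel duplicates it for the receiver.
socket.send_fds(old_end, [b"take it"], [listener.fileno()])

# "New process": receives a fresh fd referring to the SAME open socket,
# so the listen queue is never closed and no connection is rejected.
msg, fds, _flags, _addr = socket.recv_fds(new_end, 1024, 1)
inherited = socket.socket(fileno=fds[0])
print(inherited.getsockname() == listener.getsockname())
```

Because both fds reference one kernel socket object, connections queued while the handover happens are simply accepted by whichever process reads them next.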
It's still a (potentially) new haproxy binary starting up and parsing the (potentially) changed haproxy config because the user requested a graceful restart.
The new process listens for connections before the old process stops listening. The problem is that the old process can still have new connections queued up; they are lost when its sockets are closed.
I, too, am wondering about that. The only alternative to explicit reloading I can see is reloading automatically on every file change, which would mean everything breaks if I save before everything is ready. I am perplexed.
It certainly does not automatically reload on configuration file change.
This simply means you can have hitless reloads - change your configuration, reload HAProxy, and you will drop zero incoming connections during the reload time. Other methods previously existed to do this without having to first drain traffic, but they were both unwieldy and still tended to have a performance impact.
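For reference, the seamless mechanism comes down to one config directive plus one flag on the reload command; the paths below are conventional, adjust to your setup:

```
global
    # Let a newly started process fetch the listening fds over this socket.
    stats socket /run/haproxy/admin.sock mode 600 level admin expose-fd listeners

# Reload: the new process grabs the fds via -x, then tells the old
# workers (-sf) to finish their in-flight connections and exit.
#   haproxy -f /etc/haproxy/haproxy.cfg -x /run/haproxy/admin.sock \
#           -sf $(cat /run/haproxy.pid)
```

Without "expose-fd listeners" and "-x", "-sf" alone gives the old close/re-bind dance that could drop queued connections.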
I'm so excited about this. We just finished rolling out a new seamless strategy involving pairing NGINX with HAProxy which I am almost done with the blog post for, but I envision this making our solution even simpler in the future when it hits stable branches.
All of your suggestions are more complicated than what you're suggesting replacing. People chain simple commands together because pipelines are a language, and that matches how they think about the problem. They're solving it with simple commands and pipes; you're trying to solve it with regexes and as few commands as possible. All ways are valid, but specific commands tend to be easier to remember on the fly than doing it all with sed and regexes. I use sed when I want to edit streams, not when I want to filter them. tr is a simpler replace than a sed regex.
It's not silly at all, it's simple; there's many ways to accomplish something and knowing a shorter more precise way to do something doesn't make the longer simple ways silly. The author didn't know about pgrep, so he used what he did know about, ps and grep, nothing remotely silly about that; it's pragmatic.
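A toy illustration of the tr-vs-sed point: both do the same substitution, one with a purpose-built tool, one with a regex.

```shell
# Replace spaces with underscores, two ways.
echo "error: disk full" | tr ' ' '_'       # character-for-character translate
echo "error: disk full" | sed 's/ /_/g'    # same result via a regex substitution
```

Both print `error:_disk_full`; which one you reach for is mostly about which vocabulary you think in.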
As a comparison, nginx -- which is a quite different bit of software but is sometimes used in a similar fashion to haproxy -- can be gracefully restarted several times per second with no issues.
What do you mean by "with no issues"? Do you want to say that it doesn't drop connections? If that is the case, I would be curious to know more - but cursory search doesn't support this claim. [0]
If you mean something else then HAproxy also supports "graceful restarts several times per second with no issues". But the article is not talking about that.
Actually, the link I posted dealt with reloading config, not restarting (which is something completely different - you can't upgrade the binary that way). But broken clients are everywhere, so you can't discount them. And persistent connections too, while we're at it.
I did however find instructions how to properly restart Nginx without dropping connections: https://www.digitalocean.com/community/tutorials/how-to-upgr...
Apparently it can be done (and could even be automated), but the procedure looks very generic to me (not nginx-specific). Is this what you did?
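The nginx side of that procedure is its documented signal sequence for a binary upgrade. A dry-run sketch (the run() wrapper only echoes the commands; the pid-file path is an assumption):

```shell
# Echo the nginx binary-upgrade signal dance instead of executing it.
run() { echo "+ $*"; }
OLD_PID=$(cat /run/nginx.pid 2>/dev/null || echo 12345)  # placeholder pid if no nginx

run kill -USR2 "$OLD_PID"   # old master starts a second master from the new binary
run kill -WINCH "$OLD_PID"  # old master asks its workers to finish and exit
run kill -QUIT "$OLD_PID"   # old master exits; the new one keeps the sockets
```

It is generic in spirit (master/worker handoff), but the USR2/WINCH/QUIT semantics are nginx-specific.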
"which is a quite different bit of software but is sometimes used in a similar fashion to haproxy"
You put your finger on it right there - it's quite different. For example, can nginx tunnel an RDP session? Of course not - nginx is a web server (proxy), etc. They have overlapping use cases, but there are quite a few bits outside the intersection of their capabilities. Even now I am mentally writing a haproxy.conf to serve web content. The static bit is easy, but I'll stick to using nginx or Apache for what they are good at.
Obviously I wouldn't dream of letting IIS accept external inbound connections unless mediated via HAProxy ...
Nginx supports L4 (TCP/UDP) proxying just fine, just like HAProxy [1]. Nginx's HTTP(S) proxy capabilities are extensive. Not as extensive as HAProxy, but definitely close. (I don't know anything about RDP, I guess it's a binary protocol that HAProxy doesn't know anything about?)