
This seems too expensive, and I'd like to try to help, so I'm going to give you an idea of my hosting bill, explain why it's low, and then suggest something for you:

I pay around $3k per month. I own my own servers, lease a full rack, and serve roughly 1 billion page impressions per month. My bandwidth consumption is measured in Mbps rather than total data transferred because I'm billed at the 95th percentile. I average around 130 megabits per second of transfer constantly, peaking at 150 Mbps, which works out to roughly 40 terabytes of data per month. 95th percentile billing and owning your own servers are the key here.
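For anyone unfamiliar with 95th-percentile billing, the numbers above are easy to sanity-check with quick arithmetic. This sketch is illustrative only (the sample values are made up, not actual billing data): the billing model samples port throughput every few minutes, sorts the samples, and discards the top 5% before picking the billable rate, and a sustained Mbps figure converts directly to monthly terabytes.

```python
# Sketch of 95th-percentile billing and the Mbps -> TB/month
# conversion. Sample values are illustrative assumptions.

def percentile_95(samples_mbps):
    """Billable rate: the highest sample left after dropping the top 5%."""
    ordered = sorted(samples_mbps)
    return ordered[int(len(ordered) * 0.95) - 1]

def monthly_terabytes(avg_mbps, days=30):
    """Sustained megabits/sec -> terabytes transferred per month."""
    bits = avg_mbps * 1e6 * days * 24 * 3600
    return bits / 8 / 1e12

# A constant 130 Mbps works out to roughly the 40 TB/month quoted:
print(round(monthly_terabytes(130), 1))  # -> 42.1
```

The point of the percentile model is that brief spikes above your average (like the 150 Mbps peaks) get discarded, so you pay for your sustained rate, not your worst moment.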

To give you an idea, for one month of your hosting bill you can buy 1, possibly 2 servers from Dell and put them in a half rack that will cost you around $800/month including power, secure access, bandwidth, etc. Those servers will last around 5 years with a possible drive replacement or two during that time for a few bucks.

But I think you have another problem that's making things worse. With 15,000 active users you should be able to support them on one or two small Linode servers using round-robin DNS. That's a relatively small user base, and the request rate can't be much more than 10 per second. So I'm guessing something about your basic app architecture is off. It could be that you're not using Nginx to reverse proxy to Apache, so you think you need more Apache children/processes, and therefore more memory, than you actually do. Or you could have a DB that doesn't have indexes in the right places, leaving you IO-bound.
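That ~10 requests-per-second guess can be sketched with a back-of-envelope calculation. The views-per-day and peak-hour multiplier below are hypothetical assumptions, not data from the post:

```python
# Rough request-rate estimate for a 15,000-user site. The
# views/day and peak_factor values are assumed for illustration.

def peak_requests_per_sec(users, views_per_user_per_day, peak_factor=3):
    """Average page requests/sec, scaled by a peak-hour multiplier."""
    average = users * views_per_user_per_day / 86400  # seconds per day
    return average * peak_factor

# 15k users at 10 views/day averages under 2 req/s; even a 3x
# peak stays comfortably in single digits:
print(round(peak_requests_per_sec(15000, 10), 1))  # -> 5.2
```

Even doubling both assumptions keeps the peak in the low tens of requests per second, which is well within the reach of a single small VPS for most apps.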

I would suggest first looking at your app and seeing where the bottlenecks are in performance. Fix that first, then look at hosting.

Questions:

-How many servers do you currently have, and what's their rough config?

-How many app requests per second do you get at peak?

-What's your peak bandwidth throughput in Mbps?

-On your servers, is lack of CPU or lack of disk throughput the bottleneck?

-Have you had problems running out of memory that caused you to buy more servers? Which app ran out of memory?

-Give us an idea of your server config. e.g. nginx -> Apache -> MySQL & Redis. Do the servers talk to each other and if so what do they do?



This is a generous offering of expertise, but supposing you were to help him out and he were to take on implementation costs, wouldn't "Here's what I'd do to get you to a billion impressions a month of customer demand for your product" be a higher-priority task than saving a few hundred bucks?

The most common failure mode in scaling for startups is to have no scaling problem at all.


Costs that are that out of whack with expectations would usually indicate some very low hanging fruit. If you can save $800/month by adding a couple of database indexes and tweaking your config file, then yeah, it's worth it.

That said, from what Maciej has written in the past, he sounds competent in these things (i.e. has his databases set up for remote replication and failover), so it would seem that the culprit is more likely over-engineering than under.


Well, if we scaled this expense level straight up to a billion impressions, Pinboard would be bankrupt instantly.


I think his questions aren't meant to signify an offer of expertise but rather to say that this base level of knowledge and data should be available whether you're on a shared hosting account or run your own Tier IV center. He's not suggesting a rearchitecture; he's just suggesting knowing your environment. Decisions are wishy-washy without evidence.


Just for another data point, a while back one of my customers' sites got really popular overnight, and peaked at around 25Mbit/s sustained traffic, with Apache handling around 800+ requests/second, on a Linode 768. No RAM issues, the server wasn't stumbling at all, and the site was still loading nice and quick.

No nginx magic in that configuration, either. Just straight Apache, with a few tweaks.


Could you share your Apache config, please? I find this incredible. I'm just very curious how you managed this and hoping to learn something. The reason I ask is because:

Apache has two stable modes of operation: threaded, or one process per connection. Both of them require one thread or process per connection busy being served, which requires memory. Apache consumes around 20 MB per thread or process if PHP or mod_perl is loaded; roughly 5 MB if neither is.

If you have keepalive enabled, you're doing 800 req/s, and your keepalive timeout is, say, 30 seconds (the default in Apache 2.2 is 5), then you're going to need roughly 800 × 30 = 24,000 Apache children to keep up. At 5 MB each, that's 120 gigs of memory in your server.

A common hack is to disable keepalive, which causes clients to disconnect quicker, but they still tend to stay connected for 1 to 2 seconds per request, especially folks geographically far away, thanks to latency. So at 800 r/s you're going to need 1,600 Apache children to keep up, or 8 gigabytes of memory in your server.
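The arithmetic behind both scenarios can be written out explicitly. The 5 MB-per-child figure and decimal gigabytes are the assumptions used in this comment, not measured values:

```python
# Apache child-count and memory math for the two keepalive
# scenarios above. 5 MB/child and decimal GB are assumptions.

def children_needed(req_per_sec, secs_per_connection):
    """One child stays pinned for each connection's whole lifetime."""
    return req_per_sec * secs_per_connection

def memory_gb(children, mb_per_child=5):
    return children * mb_per_child / 1000  # decimal GB

# Keepalive on, 30 s timeout:
kept = children_needed(800, 30)
print(kept, memory_gb(kept))    # -> 24000 120.0

# Keepalive off, clients lingering ~2 s each:
short = children_needed(800, 2)
print(short, memory_gb(short))  # -> 1600 8.0
```

The key variable is connection lifetime: cutting it from 30 seconds to 2 shrinks the memory requirement 15x, and a front-end proxy that frees Apache children in milliseconds shrinks it further still.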

Now you could have used an experimental MPM and gotten these results, but I haven't heard of that being used much. So I'm really curious how you managed to get 800 r/s from Apache on a server with 768 Megs of memory.

For the uninitiated: most people put Nginx or Lighttpd in front of Apache; these can serve 10,000+ concurrent connections with a single thread. The front end talks to Apache and connects/disconnects very fast, which keeps the threads or processes free; often 10 Apache processes are enough for thousands of concurrent connections with keepalive enabled on the front end.


> Apache has two stable modes of operation, threaded or one process per child.

This is obsolete information: if you are using remotely recent versions of Apache 2.2 (as in, anything from the last year and a half at least: older might even still be fine), you should really be using mpm_event (unless your architecture is reliant on some horribly broken Apache module).

(Also, after having spent years running nginx for this specific purpose, I have to say that it is actually a horrible choice: it isn't smart enough to juggle HTTP/1.1 connections to backend servers, so it burns through ephemeral ports and is fundamentally incapable of warming up its TCP windows... to make that configuration scale in a real "tens of millions of users environment" you end up having to get pretty nit-picky with your kernel-level TCP configuration.)

(Really, though, you should just get a real CDN and drop nginx: a CDN will also reduce the latency of your application from far away locations by holding open connections half-way around the world to your backend, allowing you to drop a whole round-trip from a request that often only requires two round-trips total. I now use CDN->ELB->Apache, and I couldn't be happier with the result: the things nginx was attempting to do are much better handled by these other services.)
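The warm-connection argument above can be sketched with some arithmetic. The round-trip times here (150 ms to the origin, 20 ms to a nearby edge) are made-up illustrative values, not measurements: a cold request pays the TCP handshake round-trip against the distant origin, while a CDN edge pays it locally and reuses an already-open connection to the backend.

```python
# Illustrative latency comparison; the RTT values are assumptions.

def direct_ms(rtt_origin_ms):
    # TCP handshake (1 RTT) + HTTP request/response (1 RTT)
    return 2 * rtt_origin_ms

def via_cdn_ms(rtt_edge_ms, rtt_origin_ms):
    # Handshake against the nearby edge node, then one hop over the
    # CDN's already-warm connection to the origin.
    return 2 * rtt_edge_ms + rtt_origin_ms

# 150 ms to the origin, 20 ms to the nearest edge node:
print(direct_ms(150), via_cdn_ms(20, 150))  # -> 300 190
```

This is why a CDN can cut latency even for fully dynamic, uncacheable responses: the savings come from where the handshake happens, not from caching.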


The event mpm is experimental: http://httpd.apache.org/docs/2.2/mod/event.html

Recent benchmarks show apache is a dog with high concurrency, even with the event mpm.

http://nbonvin.wordpress.com/2011/03/14/apache-vs-nginx-vs-v...

Not sure about your nginx criticism. It's the darling of many high traffic production sites. Here's my weekly netstat off a low-end dell front end box on a gigabit link:

http://i.imgur.com/8fTg0.png

(The bottom right number is the interesting one)

My nginx config is pretty stock - standard reverse proxy to apache.

CDNs are God's gift to latency, and I use several, but now and then you need an actual web server to do actual work.


Another (unrelated) point: I just looked at the nbonvin benchmark, and it has some serious issues.

First, he misconfigured Apache for this environment: by setting StartServers lower than the expected minimum process concurrency, he goads Apache into constantly spawning and shutting down backends (I had similarly spiky performance until I realized this a while back).

However, the setting that simply damns this benchmark is that he has MaxClients set to 150... nginx's equivalent setting (worker_connections) is set to 1024. In essence, he needlessly hobbled Apache, and if Apache does well at all in comparison to nginx, it is probably because the benchmark is flawed. Seriously: this setup is so bad that Apache was returning 503 errors (this is mentioned in the conclusion area) because its configuration told it to stop responding to these incoming requests (and yet, he didn't bother considering that worth examining: he just tossed the detail out there as if it were Apache's fault).

And, given that Apache was less than half as bad (and, depending on whether he counted 503 responses in his numbers, possibly "almost as good"), we then go ahead and question the benchmark itself, only to find that the guy is using ApacheBench... ApacheBench doesn't actually claim to be very good at highly concurrent testing; specifically, it isn't good at swamping remote servers, as it isn't itself very efficient. As one random backed-up example, from the BUGS section of the ab man page:

"""The rather heavy use of strstr(3) shows up top in profile, which might indicate a performance problem; i.e., you would measure the ab performance rather than the server's."""

Really, this website's results are based on a kind of "toy" benchmark, and should not really be trusted: the guy (as mentioned in his comments on his post), was trying to go for "the default setup" on these systems, but the default setup is not tuned for performance (this is especially the case with Apache, where distributions expect people to install it on almost everything, including your calculator). I mean, even the nginx settings should be tweaked: in a high-concurrency environment, 64/10096 (where I will admit I haven't spent much time tuning the ratio) would be a much better choice than 1/1024 (the Ubuntu default).

(Reading more of the comments on the benchmark, you can see that other people commented on some of these problems and more, and even wrote entire blog posts responses ;P.)


I am not disputing that the documentation calls the module "experimental", but if you follow the developers' discussion of this module, you will find that it is still called that due only to a combination of 1) the major version of Apache not having been bumped in a very long time, 2) a number of Apache modules in the wild that are poorly coded (not that I've ever managed to actually find one), and, most importantly, 3) the fact that it does not work on all platforms (Linux is great). The things previously considered "the reason" mpm_event is marked experimental are all now obsolete; for a specific example, SSL now works 100% correctly with mpm_event.

Also, nginx being a "darling of many high traffic production sites" does not mean it actually works well for this purpose: if you do a Google search for "nginx" and "ephemeral", you get lots of evidence to the contrary. You can also prove my statements from first principles of TCP if you really don't believe me; this need not be based on silly anecdotes. You would simply expect nginx to have issues with ephemeral ports due to the way it is designed and implemented (a reverse proxy making a new outgoing connection for each incoming one), and if it didn't, you would be surprised and probably want to publish a paper on it. ;P

""Compared to putting tornado processes behind nginx, this approach is simpler (since fdserver is much simpler than configuring nginx/haproxy) and avoids issues with ephemeral port limits that can be a problem in high-traffic proxied services.""" -- http://tornadogists.org/1073945/

"""This makes load testing complicated since the nginx machine quickly runs out of ephemeral ports.""" -- http://mailman.nginx.org/pipermail/nginx/2008-February/00352...

My service has tens of millions of users distributed worldwide, making many billions of requests per month to my hostnames. My setup is mostly coded for mod_python (generally considered to be an "older module", especially considering it is no longer even maintained by the upstream developers). I make complex usage of requests making recurrent subrequests through different languages. A good amount of my traffic is SSL.

Of course, most requests are cached at the CDN, so they don't have to go through to my backends, but I still handle way more than a billion requests all the way through to my dynamic webapp every month. These are all handled, eventually, by two boxes running Apache, and I only need two boxes because I want to handle one of them randomly failing (I can easily handle the load on one box: each box can handle, and actually has under previous concepts for my architecture, 3200 concurrent clients).

As for mpm_event in this environment? It works, is stable, is why I could handle 3200 concurrent clients, and you should not be avoiding it because you feel it is "experimental" (yes, even with mod_python). I did run across one or two Linux kernel builds that had regressions that affected Apache+mpm_event (horrible concurrent performance), but you are better off noticing that and steering away from them than avoiding mpm_event.

That said, I want to make it clear that I am not arguing against reverse proxies: I am only making the point that your CDN /is/ a reverse proxy, so there's little point in additionally adding nginx to the setup unless you can't handle enough concurrent connections from the master CDN nodes around the world, in which case what you really want is "just" a load balancer, and you really still want one that is smart enough to use HTTP/1.1 to connect to its backends, and that simply isn't nginx. (Humorously, DNS round-robin, if you think of it as a load balancer, actually works great for this HTTP/1.1 problem, but there are other reasons to avoid it, of course. ;P)

(Now, this said, I heard a few days ago that the just released nginx 1.1 branch now supports persistent backend connections, but I haven't been able to find it in the release notes.)

(Also, your comment that "now and then you need an actual web server to do actual work" implies to me, though this might be totally incorrect, that you didn't yet notice that a CDN actually provides insanely high latency benefits even if all of your content is dynamic and all of it has to go through to the backend. If you did not know this, you should read my commentary here: http://news.ycombinator.com/item?id=2823268 .)


Well, you were right in your math -- I made my previous comment before going back and checking the actual numbers, which was really stupid on my part. From the "Congratulations!" email I sent the client:

> ...sustained 100+ requests per second for the last couple of hours; peaks of over 40Mbits of traffic; thousands of simultaneous connections.

So, 1/8th of what you were calculating. Sorry about that.

I use mpm-worker and have keepalives turned off altogether. I also use suexec and fcgid (not fastcgi, unfortunately). I was using mem_cache at the time, but that's off now because it breaks the newest version of Wordpress.

Also, I should have mentioned that this was a static site, so PHP wasn't a factor.

If anybody's still interested in the server config, I'll be happy to share it.


Thanks this is helpful.


Maciej has said repeatedly (including on HN) that disk is the bottleneck keeping him off Linode and VPS in general.

Not all web apps use resources in the same way; you're quite likely making an apples and oranges comparison here (though you don't say what your own web app actually does). It makes sense that Pinboard hits the disk a lot because there's not a ton of shared data -- each user tends to have his own trove of data. Yes, certain URLs will be bookmarked a lot but each bookmark can have different access restrictions, summary data, tags and so on. And no one bookmark is going to make up a big percentage of hits -- you're talking about a ton of bookmarks, each accessed rarely. It sounds a lot more like webmail than, say, a blog in terms of access patterns.

You have questions, and it's nice that you (eventually) asked them and admitted all that you DON'T know about Pinboard. But you should probably ask them before you dole out unsolicited advice, not after.


I would be thrilled to get unsolicited advice like this about anything I built, even if the advice was wrong. You made a good point, but if you had just worded it a little differently, it wouldn't come across like you were trying to take someone down a peg for writing something thoughtful on HN.


I don't know anything about your setup and architecture, but do remember: Pinboard was one of the sites swept up in the raid that got Instapaper and a few others.

Instapaper stayed up because the server taken was only a slave. Pinboard had some of its main hardware taken, and while things did slow down, it too stayed up. Half, maybe more, of those hosting fees could be servers that do nothing at all but sit around and wait for a raid.

Would you still be able to build out a hypothetical system as you mention and be able to handle one data center effectively disappearing? Or does that mean taking your information and essentially doubling it, to have a second backup source?


Take note. This is actually a useful response that people benefit from reading. As opposed to the several one liners already posted that all say variants of "your hosting is expensive". Thanks for taking the time to write this response up.


Don't you think web crawling (for users with archival accounts, where the full text + associated images/resources for each bookmark is stored and indexed on Pinboard's servers) is probably using up more of those resources than actually serving up pages? I don't think 10 pages per second or whatever is really the relevant metric for this app.


jambo is not Maciej Cegłowski, (username "idlewords") who runs pinboard. So asking him for technical details won't be very helpful.

Incidentally, Maciej is banned from HN,[1] due to not being nice enough to someone.

1: https://twitter.com/#!/pinboard/status/111332316458135553


Maciej is not currently banned from HN (he was, briefly, after an impolitic comment about a post, but Paul Graham unbanned him within moments of me questioning the ban).

Maciej seems pretty annoyed by what happened, and I don't blame him. But there you go. Shit happens!

He's basically one of the friendliest people I've met on the Internet, though, so if you want to help him out you can just ping him on Twitter.


You use the word "friendliest" to describe a guy who three comments ago called someone a "douchebag" because his post was too long?

He's got a history of similar behavior. He really is a bit of a troll. He certainly doesn't follow the HN guideline of "Don't say things you wouldn't say in a face to face conversation."

My guess is PG only unbanned him so he wouldn't face the criticism of seeming to censor one of his critics.


I read Maciej's comments compulsively, all of them, and your summary does not square with my experience. At all.

I thought his comment about Sebastian's post was impolitic, but I chose that word carefully.

I don't care to psychoanalyze 'pg or how he chooses to run the site. He's a busy guy and is more than entitled to make moderation decisions that I disagree with.

I don't understand why you felt the need to write your comment at all. What good did it serve?


How do you know he wouldn't say that in person?


Experience. Every Internet Tough Guy I've met is normal/civil in person. Besides, I left out the "Be civil" part of that guideline. Calling someone a "douchebag" for writing a long-winded post wouldn't fit most people's definition of civility.


He didn't call Sebastian a douchebag for writing a long post.

He called Sebastian prolix for writing a long post.

He called Sebastian a douchebag for writing that long post.

If we're going to hellban people for writing individual uncivil posts, that's a problem. But I don't think that's what's going to happen.

You should stop calling people "Internet Tough Guys". You have literally no idea who you're talking about. Comments like yours are Part Of The Problem. Maybe you should re-evaluate the notion of starting whole threads on how much you dislike one particular person.


He didn't 'start whole threads' on how much he dislikes a particular person. He merely pointed out that calling someone a douchebag doesn't fit the guidelines for this site.

I like reading idlewords and think he's a good/interesting writer and am glad he wasn't permanently banned. But your reply here defending his words seems a little emotional and knee-jerk don't you think?


Comments like his "douchebag" comment are Part Of The Problem and that's why he was banned. Your description of his comment as merely "impolitic" is only slightly more ridiculous than you referring to him as the "friendliest" guy you know. I probably shouldn't have pointed it out, but I did.

Regardless, I don't have any vendetta against him or care strongly about him being on the site or not. I realize he's a smart guy who just has a bit of an Uncov streak in him. No big deal.


For what it's worth I was a proud recipient of a 50+ score reply that started with "Fuck you!". Maciej's "douchebag" is clearly nothing in comparison and it hardly warranted the permaban.


You're missing the point of the guideline. Even if that particular individual would say it in person, it's still not considered acceptable behavior.


The comment in question, if anybody is interested:

http://news.ycombinator.com/item?id=2946828


Thanks. I didn't know idlewords was banned & was assuming he'd be by to answer.


I can second this. I hosted CentSports (a recently acquired side project; ~1MM users, ~500 GB dataset) in a full cabinet for under $1,100/mo. At its peak we did about 80 million pageviews per month on a heavily IO-constrained, heavily active OLTP workload (...not nearly as impressive as mmaunder's!)

By the time the doors closed, we had about 16 RUs filled, IIRC. The beefiest DB box had 64 GB of RAM in it, which, as it turns out, is a $1,200 one-time upgrade for colo vs. an extra $1,500/mo on your dedicated (quick price checks on Newegg and SoftLayer; I'm sure both numbers can come down).

Yes, making the leap into colo can seem like a big up-front cost, but over time those costs work in your favor. And yes, there is a support cost involved (either remote hands or you getting up at 3am and replacing that disk yourself). But if your workload and dataset are such that you don't require a new box spun up in less than 24 hours, the savings can be quite great.
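The RAM example above makes the amortization easy to sketch, using the rough prices quoted in this comment ($1,200 one-time for the colo upgrade vs. an extra $1,500/month on dedicated):

```python
# Break-even for buying hardware vs renting it, using the rough
# prices quoted above. Real quotes will vary, of course.

def breakeven_months(one_time_cost, monthly_premium):
    """Months until the one-time purchase beats the recurring fee."""
    return one_time_cost / monthly_premium

# The 64 GB colo RAM upgrade pays for itself in under a month:
print(breakeven_months(1200, 1500))  # -> 0.8
```

Most colo-vs-rented comparisons look like this: the hardware premium amortizes away in months, and everything after that is savings, minus whatever you value your 3am disk swaps at.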


While I pay nowhere near this much, and we're not nearly as popular, I still like his solutions. Why? I like to sleep. I'd rather pay 2x as much for a reliable host and a solution that works than be up at 3am fixing stuff.



