It would be more correct to say that V8 has a 1 GB memory limit. But node.js actually gives you these nice things called Buffers, which, according to the node.js manual, are "similar to an array of integers but corresponds to a raw memory allocation outside the V8 heap" (see http://nodejs.org/api/buffers.html).
Buffers are used all over the place in node, which should mitigate the degree to which this is actually a problem. Even if this bug report stays a wontfix, it won't affect my decision to use node because of Buffer support.
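As a rough demonstration: you can watch Buffer memory land outside the V8 heap by comparing `process.memoryUsage()` before and after an allocation. Note that `Buffer.alloc` and the `external` field are modern Node APIs that postdate this thread; this is an illustrative sketch, not how things looked at the time.

```javascript
// Sketch: Buffer memory is allocated outside the V8 heap, so it doesn't
// count against the old-space limit. Buffer.alloc and the `external`
// field of process.memoryUsage() are later Node APIs, used here to
// illustrate the point.
const before = process.memoryUsage();
const buf = Buffer.alloc(256 * 1024 * 1024); // 256 MB, off-heap
buf.fill(1);
const after = process.memoryUsage();
// `external` grows by roughly 256 MB while heapUsed stays roughly flat
console.log('external delta MB:',
  Math.round((after.external - before.external) / (1024 * 1024)));
```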
Sure, Node is a nice platform; I didn't imply otherwise.
For a certain category of apps it helps to know the limits beforehand, though (I've described my use case in another comment). I think this particular limit deserves to be more widely known.
It's probably worth mentioning that GC cycles start to take upwards of one second at around 500 MB of RAM usage with lots of small objects.
This limit doesn't stop you using NodeJS, but it's definitely something newcomers should be made aware of before they start writing database servers or making heavy use of a naive in-memory cache through an object literal.
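For the in-memory-cache case, the usual mitigation is to bound the cache rather than let an object literal grow without limit. A minimal sketch using a `Map` (a later JS feature; the function and its name are mine, purely for illustration):

```javascript
// Hedged sketch: a size-capped cache as an alternative to an unbounded
// object-literal cache that grows into the heap limit. A Map preserves
// insertion order, so the first key is always the oldest entry.
function makeBoundedCache(maxEntries) {
  const map = new Map();
  return {
    get: (k) => map.get(k),
    set(k, v) {
      if (map.has(k)) map.delete(k); // refresh insertion position
      map.set(k, v);
      if (map.size > maxEntries) {
        map.delete(map.keys().next().value); // evict the oldest entry
      }
    },
    size: () => map.size,
  };
}
```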
Hopefully when the new GC gets rolled out we'll really be able to let loose with the RAM usage.
Funny how all VMs go through this. Remember when JVM GC pauses killed websites, and how over the years that was fixed by better, non-pausing GC strategies?
That's very easy to explain. A low-pause GC requires a concurrent garbage collector. Writing a correct concurrent garbage collector is VERY tricky, so you usually start with something much simpler. Concurrent GCs typically also add overhead to the mutator (the user program), so it tends to be a trade-off of high throughput vs. short maximum pause times. As an example Java's G1 collector ("Garbage First") took a team of experts about 5 years to get correct.
OTOH, writing a standard (single-threaded) Cheney-style copying GC can be done in a few days. Actually, the more time-consuming part for me was getting all the pointer information to the GC.
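For the curious, the core of a Cheney-style semispace collector really is small. A toy sketch simulated over plain arrays (the heap layout and names are illustrative, nothing like V8's actual implementation):

```javascript
// Toy Cheney-style semispace copying collector. Each "object" is
// { forwarded: null | toSpaceIndex, fields: [refs into the heap] }.
function cheneyCollect(fromSpace, roots) {
  const toSpace = [];

  // Copy one object into to-space, leaving a forwarding pointer behind
  // so shared references are copied only once.
  function copy(ref) {
    const obj = fromSpace[ref];
    if (obj.forwarded !== null) return obj.forwarded; // already moved
    const newRef = toSpace.length;
    toSpace.push({ forwarded: null, fields: obj.fields.slice() });
    obj.forwarded = newRef;
    return newRef;
  }

  // Copy the roots, then scan to-space in a single pass. Cheney's trick:
  // to-space itself is the work queue, tracked by the scan pointer.
  const newRoots = roots.map(copy);
  for (let scan = 0; scan < toSpace.length; scan++) {
    toSpace[scan].fields = toSpace[scan].fields.map(copy);
  }
  return { heap: toSpace, roots: newRoots };
}
```

Unreachable objects are simply never copied, which is why allocation-heavy programs with short-lived garbage do well under this scheme.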
There's a general point to be made here: this is the sort of thing that happens when you have "closed development open source". Admittedly, there seem to be very few people in the world capable of delivering a world-class GC.
"... In my case, I've been forced to suddenly stop my work ... with node.js because it cannot handle more than ~140K websockets concurrent connections ..."
"erik.corry, Nov 2, 2010
The limit for 64 bit V8 is around 1.9Gbytes now. Start V8 with the
--max-old-space-size=1900 flag."
I don't understand why he has to stop the project just because ONE node.js process cannot handle more than 140K concurrent connections. If the goal is to handle millions of concurrent connections, spreading the connection load out to multiple servers is a must.
I'm doing something similar with node.js now to handle millions of concurrent connections, using multiple node.js servers. Not only does it help even out the load, it also has a nice failover property: one node going down doesn't mean the whole app is down, since clients of the downed node migrate their connections to the remaining nodes.
If you have 140M active users, handling a fleet of 1000 servers is not an easy operation, considering you need to manage inter-cluster communication (e.g. when the sender's and receiver's sockets land on different servers).
I've built a clustered Erlang/OTP websocket server that can handle 3M connections per node (given you have enough RAM). That way you handle all your users with "only" 47 servers.
There is a big operational and scaling difference between a cluster of 1000 servers and a cluster of 47.
I'm a bit curious about this. If you have 140M concurrent users (since this was the original complaint) and you are NOT prepared or capable of servicing/monitoring/maintaining 1000 servers, that seems like a fatal flaw in your server management and analytics processes.
Certainly 47 servers vs. 1000 is far nicer. But at 140M concurrent users levels (there must only be a few handfuls of sites with these types of concerns), not having a team prepared to oversee 1000 servers seems like folly.
You're spot on. I think a lot of people can't wrap their head around node: It's all about scaling horizontally (lots of processes, many servers) rather than vertically (one process with lots of threads on one big server). Once you realise that, it makes a lot of sense.
Exactly. I pick Node.js not because one process can handle lots of connections, but because one process can handle lots of connections with low resource utilization. It needs fewer servers to get the job done. It's more of a cost-based decision.
To add my own perspective: I was running some analytics batch job on Node and hit this limit, had to add multi-stage processing to accommodate it. Using --max-old-space-size=1900 did help a bit.
Welcome to Web 2.0, where people that think they're programmers try to interact with people that are programmers. The results are often ... depressing.
> Welcome to Web 2.0, where people who think they're programmers try to interact with people who are programmers. The results are often ... depressing.
I was confused by your initial framing of the sentence, though my not being a native speaker would have played a part.
Yeah, though not really "correct", it's very common to use "that" as a relative pronoun for a person instead of "who". Some pedants may correct you, but it's rare for native English speakers to actually be confused by it.
That's news to me. Language follows speakers, especially influential speakers, and the list of speakers using "that" for human antecedents includes Shakespeare and Mark Twain, so I guess it's all right now, even if it was an error at some point (grammar isn't static; it has been changing ever since it came into being).
Now I am wondering why I thought "that" isn't to be used for human antecedents. Was it an error earlier, or did it change recently? If it was an error earlier, Shakespeare using it doesn't fit.
You probably thought that because it's uncommon in good writing and you're taught not to do it. It's probably against style guidelines for major newspapers, for example, and it sounds inelegant to educated native writers.
I guess the point of posting it here is to gather folks with pitchforks and torches demanding immediate attention to the issue. For a few months the V8 team has been working hard on improving the GC (http://code.google.com/p/v8/source/browse/branches/experimen...), which was the major limiting factor here. I don't have an ETA for when it's going to be merged, but the new GC branch is in pretty good shape.
This all boils down to using the right tool for the right task and knowing how to organize your code so you don't run into limits in the platform components that you use.
In this case, the obvious solutions are to either use 'buffers' (think of them as extended memory from the old days) or to use multiple instances on the same machine or spread over several machines.
If you write your code in such a way that you end up handling all your users in a single process, then sooner or later you will run into some limitation.
This has long been known and, reading the OP's comments, it looks like it will soon no longer be a problem. It hasn't been a serious issue as long as you knew about it going in.
No doubt. This comment really struck me as off-tone:
"Sorry, Google, but your open-source project is important to much more of the world than just >your browser< now; and while this may not be an issue for the zones of impact >you< care about (said browser.), it’s a >huge< issue for much of the area where V8 is important in general in the modern (post-Node) world."
You know Google is biting their tongue at what they really want to say to complainers like this. You really want it working for Node? Roll up your sleeves and get to refactoring. Hell, they even explained how to fix it.
The moral of the story is to never be nice to anyone. Nobody complains about the memory limits of IE's javascript engine as they pertain to server-side applications.
My line of thought is that V8 is used in Node.js and in the browser, but 1 GB is unlikely to be a problem for the browser any time soon, so it's really only a problem for Node.js. Am I missing something?
The 1 GB of today was the 128 MB of yesterday. I'm guessing that in a year or two, games implemented on WebGL will be a step up from Farmville and require more than 1 GB of memory to run well. IE9 still has a chance to catch up if it ends up becoming the go-to browser for gaming.
And technically the memory issue has already been resolved if you read the comments in the bug posting. The next GC release won't have any of these drawbacks. So really the problem is isolated to any node projects currently in production.
If there's anything I've learned in my software career, it's that problems can be fixed. It sounds like this problem can be, or already has been, fixed. NodeJS is a wonderful platform to work with, nonetheless.
Why? It's been a real bummer for me to find that out (the hard way).
Is merely knowing about the limit hurting you, or have you actually faced this constraint in a project? As mentioned elsewhere, node has Buffers, which are allocated outside the V8 heap, so user data will largely be unaffected by the memory limit if you are using them.
> To add my own perspective: I was running some analytics batch job on Node and hit this limit, had to add multi-stage processing to accommodate it. Using --max-old-space-size=1900 did help a bit.
So yeah, it did affect me, although not on a typical web project. It wouldn't be a problem at all if I knew about this limit from the start. Thus this is a warning for others.
(BTW another limit I was told about is that objects can't have more than a million keys. Thankfully I did not hit that one.)
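The multi-stage approach mentioned above amounts to keeping only one bounded chunk of objects live at a time, so the heap never holds the whole dataset. A minimal sketch (the function and parameter names are mine, not from any library):

```javascript
// Hedged sketch of chunked "multi-stage" processing: reduce each bounded
// chunk to a small summary, then merge the summaries. Only one chunk's
// objects are ever live on the V8 heap at a time.
function processInChunks(records, chunkSize, reduceChunk, mergeResults) {
  const partials = [];
  for (let i = 0; i < records.length; i += chunkSize) {
    const chunk = records.slice(i, i + chunkSize);
    partials.push(reduceChunk(chunk)); // small summary survives; chunk is freed
  }
  return mergeResults(partials);
}
```

In a real batch job the chunks would typically come from a file or database cursor rather than an in-memory array, but the shape of the computation is the same.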