Node.js Has 1 Gb Memory Limit (code.google.com)
85 points by andreyvit on Sept 16, 2011 | hide | past | favorite | 43 comments


It would be more correct to say that V8 has a 1 GB memory limit. But node.js actually gives you these nice things called Buffers which, according to the node.js manual, are "similar to an array of integers but corresponds to a raw memory allocation outside the V8 heap" (see http://nodejs.org/api/buffers.html).

Buffers are used all over the place in node, which should mitigate the degree to which this is actually a problem. Even if this bug report stays a wontfix, it won't affect my decision to use node because of Buffer support.


Sure, Node is a nice platform; I didn't mean to imply otherwise.

For a certain category of apps it helps to know the limits beforehand, though (I've described my use case in another comment). I think this particular limit deserves to be more widely known.


It's probably worth mentioning that GC cycles start to take upwards of one second at around 500 MB of RAM usage with lots of small objects.

This limit doesn't stop you from using NodeJS, but it's definitely something newcomers should be made aware of before they start writing database servers or making heavy use of a naive in-memory cache through an object literal.

Hopefully when the new GC gets rolled out we'll really be able to let loose with the RAM usage.


Funny how all VMs go through this. Remember when JVM GC pauses killed websites, and how over the years this was fixed by better, non-pausing GC strategies?


That's very easy to explain. A low-pause GC requires a concurrent garbage collector. Writing a correct concurrent garbage collector is VERY tricky, so you usually start with something much simpler. Concurrent GCs typically also add overhead to the mutator (the user program), so it tends to be a trade-off of high throughput vs. short maximum pause times. As an example, Java's G1 collector ("Garbage First") took a team of experts about 5 years to get correct.

OTOH, writing a standard (single-threaded) Cheney-style copying GC can be done in a few days. Actually, the more time-consuming part for me was getting all the pointer information to the GC.


There's a general point to be made here: this is the sort of thing that happens when you have "closed development open source". Admittedly, there seem to be very few people in the world capable of delivering a world-class GC.


Relevant snippets:

"diego.ca...@gmail.com, Sep 19, 2010

... In my case, I've been forced to suddenly stop my work ... with node.js because it cannot handle more than ~140K websockets concurrent connections ..."

"erik.corry, Nov 2, 2010

The limit for 64 bit V8 is around 1.9Gbytes now. Start V8 with the --max-old-space-size=1900 flag."


I don't understand why he has to stop the project just because ONE node.js process cannot handle more than 140K concurrent connections. If the goal is to handle millions of concurrent connections, spreading the connection load out to multiple servers is a must.

I'm doing something similar with node.js now to handle millions of concurrent connections, using multiple node.js servers. Not only does it help even out the load, it also gives you a nice failover property: one node going down doesn't take the whole app down, since clients of the failed node migrate their connections to the remaining nodes.


If you have 140M active users, handling a fleet of 1000 servers is not an easy operation, considering you need to manage inter-cluster communications (e.g. when sender and receiver sockets fall on different servers).

I've built an Erlang/OTP clustered websockets server, which can handle 3M connections per node (given you have enough RAM). There you handle all your users with "only" 47 servers.

There is a big operational and scaling difference between a cluster of 1000 servers and a cluster of 47 servers.


I'm a bit curious about this. If you have 140M concurrent users (since this was the original complaint) and you are NOT prepared or capable of servicing/monitoring/maintaining 1000 servers, that seems like a fatal flaw in your server management and analytics processes.

Certainly 47 servers vs. 1000 is far nicer. But at 140M concurrent users levels (there must only be a few handfuls of sites with these types of concerns), not having a team prepared to oversee 1000 servers seems like folly.


The problem is not only operational, but also a technical one:

Nodes in the cluster need to communicate with each other and with other systems, like databases, message queues, monitoring servers, etc.

You can aggregate data per node, so the fewer servers you have in the front-end cluster, the less load on the back-end servers.

There is also a financial problem: many organizations can afford 50 servers, but not many can afford 1000 servers.


There's no reason why you can't have many node processes per server. This also raises the fault tolerance per server.


Facebook chat has only 38 gateway servers. I guess that's only possible because not every active Facebook user uses Chat.


Perhaps they should consider adding more; it's permanently screwed up in some way: messages not being sent, or the other person not receiving them.


You're spot on. I think a lot of people can't wrap their head around node: It's all about scaling horizontally (lots of processes, many servers) rather than vertically (one process with lots of threads on one big server). Once you realise that, it makes a lot of sense.


Exactly. I pick Node.js not because one process can handle lots of connections, but because one process can handle lots of connections with low resource utilization. It needs fewer servers to get the job done. It's more of a cost-based decision.


To add my own perspective: I was running some analytics batch job on Node and hit this limit, had to add multi-stage processing to accommodate it. Using --max-old-space-size=1900 did help a bit.

In case you're wondering, here's a test case: https://gist.github.com/1148761

I was told by one of the V8 developers that the new GC is pretty usable now, so it's worth giving it a shot if you need more than 2 GB.


What rude and demanding bug reports.


Welcome to Web 2.0, where people that think they're programmers try to interact with people that are programmers. The results are often ... depressing.


It's like they don't know what horizontal scaling is or something.


I don't think the bad bug reports have anything to do with whether or not the petitioner is a programmer.


A minor re-formatting.

> Welcome to Web 2.0, where people who think they're programmers try to interact with people who are programmers. The results are often ... depressing.

I was confused by your initial framing of the sentence, though my not being a native speaker would have played a part.


Yeah, though not really "correct", it's very common to use "that" as a relative pronoun for a person instead of "who". Some pedants may correct you, but it's rare for native English speakers to actually be confused by it.


Hope this helps: "[T]his distinction applies only to which and who. The alternative that is found with both human and non-human antecedents." http://en.wikipedia.org/wiki/English_relative_clauses#Human_...


That's news to me. Language follows speakers, especially influential speakers, and the list of speakers using "that" for human antecedents includes Shakespeare and Mark Twain, so I guess it's all right now, even if it was an error at some point (grammar isn't static; it has been changing ever since it came into being).

Now I am wondering why I thought "that" isn't to be used for human antecedents. Was it an error earlier, or did it change recently? If it was an error earlier, Shakespeare using it doesn't fit.


You probably thought that because it's uncommon in good writing and you're taught not to do it. It's probably against style guidelines for major newspapers, for example, and it sounds inelegant to educated native writers.


I guess the point of posting it here is to gather folks with pitchforks and torches demanding immediate attention to the issue. For a few months the V8 team has been working hard on improving the GC (http://code.google.com/p/v8/source/browse/branches/experimen...), which was the major limiting factor here. I don't have an ETA for when it's going to be merged, but the new GC branch is in pretty good shape.


This all boils down to using the right tool for the right task and knowing how to organize your code so you don't run into limits in the platform components that you use.

In this case, the obvious solutions are to either use Buffers (think of them as extended memory from the old days) or to use multiple instances, on the same machine or spread over several machines.

If you write your code in such a way that you end up handling all your users in a single process, then you will sooner or later run into some limitation.


This has long been known, and judging by the comments on the OP it looks like it will soon no longer be a problem. It hasn't been a serious issue as long as you know about it going in.


They're running up against a V8 limit.


Yep, but the reason we care about it is exactly because it affects Node.


The reason you care about it is because it affects Node. I'm more interested in V8 than Node to be honest.


No doubt. This comment really struck me as off-tone:

"Sorry, Google, but your open-source project is important to much more of the world than just >your browser< now; and while this may not be an issue for the zones of impact >you< care about (said browser.), it’s a >huge< issue for much of the area where V8 is important in general in the modern (post-Node) world."

You know Google is biting their tongue at what they really want to say to complainers like this. You really want it working for Node? Roll up your sleeves and get to refactoring. Hell, they even explained how to fix it.


The moral of the story is to never be nice to anyone. Nobody complains about the memory limits of IE's javascript engine as they pertain to server-side applications.


My line of thought is that V8 is used in Node.js and in the browser, but 1 GB is unlikely to be a problem for the browser any time soon, so it's really only a problem for Node.js. Am I missing something?


The 1 GB of today is the 128 MB of yesterday. I'm guessing that in a year or two, games implemented on WebGL will be a step up from FarmVille and will require more than 1 GB of memory to run well. IE9 still has a chance to catch up if they end up becoming the go-to browser for gaming.

And technically the memory issue has already been resolved if you read the comments in the bug posting. The next GC release won't have any of these drawbacks. So really the problem is isolated to any node projects currently in production.


If there's anything I've learned in my software career, it's that problems can be fixed. It sounds like this problem can be, or already has been, fixed. NodeJS is a wonderful platform to work with, nonetheless.


nice FUD headline


Why? It's been a real bummer for me to find that out (the hard way).


> Why? It's been a real bummer for me to find that out (the hard way).

Is knowing about the limit hurting you, or have you really faced this constraint in a project? As mentioned elsewhere, node has Buffers, which are allocated outside of the V8 heap, and your user data will largely be unaffected by the memory limit if you are using Buffers.


I've tried to explain it in another comment:

> To add my own perspective: I was running some analytics batch job on Node and hit this limit, had to add multi-stage processing to accommodate it. Using --max-old-space-size=1900 did help a bit.

So yeah, it did affect me, although not on a typical web project. It wouldn't be a problem at all if I knew about this limit from the start. Thus this is a warning for others.

(BTW another limit I was told about is that objects can't have more than a million keys. Thankfully I did not hit that one.)


Well, if your data set is big enough to hit that limit then you'll probably need to horizontally scale it out anyway...

Or rewrite in C


"C Has 16 Exabyte Memory Limit"



