On Linux, somewhat infamously, malloc never fails. It will always return a pointer to some fresh part of the address space. It is able to do this because, in turn, sbrk/anonymous mmap never fails - it always allocates some fresh address space. It is able to do this because Linux does not allocate physical memory (or swap) when it assigns address space, but when that address space is actually used. It will happily allocate more address space than it has memory for - a practice known as 'overcommit'. So, on Linux, you can indeed not worry about malloc failing. There are a few caveats to this.
Firstly, malloc actually can fail, not because it runs out of memory, but because it runs out of address space. If you already have 2^64 bytes mapped in your address space (2^48 on most practical machines, i believe), then there is no value malloc could return that would satisfy you.
Secondly, this behaviour is configurable. An administrator could configure a Linux system not to do this, and instead to only allocate address space that can be backed with memory. And actually, some things i have read suggest that overcommit is not unlimited to begin with; the kernel will only allocate address space equal to some multiple of the memory it has.
Thirdly, failure is conserved. While malloc can't fail, something else can. Linux's behaviour is essentially fractional reserve banking with address space, and that means that the allocator will sometimes write cheques the page tables can't cash. If it does, if it allocates more address space than it can supply, and if every process attempts to use all the address space that it has been allocated, we have the equivalent of a run on the bank, and there is going to be a failure. The way the failure manifests is through the action of the out-of-memory killer, which picks one process on the system, kills it, and so reclaims the memory allocated to it for distribution to surviving processes:
http://linux-mm.org/OOM_Killer
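For anyone curious how the kernel ranks candidates: it exposes the score it uses when choosing a victim in /proc/<pid>/oom_score, and a per-process adjustment in /proc/<pid>/oom_score_adj (range -1000 to 1000, where -1000 exempts the process). Below is a minimal C sketch for inspecting both for the current process. It assumes procfs is mounted at /proc, and read_int_file is a made-up helper name, not a library function.

```c
#include <stdio.h>

/* Hypothetical helper: read a single integer from a procfs-style file.
 * Returns 0 on success and stores the value in *out, -1 on any failure. */
int read_int_file(const char *path, long *out)
{
    FILE *f = fopen(path, "r");
    if (!f)
        return -1;
    int ok = (fscanf(f, "%ld", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

#ifdef DEMO_MAIN
int main(void)
{
    long score, adj;
    if (read_int_file("/proc/self/oom_score", &score) == 0)
        printf("oom_score:     %ld\n", score);
    if (read_int_file("/proc/self/oom_score_adj", &adj) == 0)
        printf("oom_score_adj: %ld\n", adj);
    return 0;
}
#endif
```

Build with -DDEMO_MAIN to get a standalone program. A higher oom_score means a more likely victim. Lowering oom_score_adj for a process normally needs privileges (CAP_SYS_RESOURCE), while raising it does not.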
The OOM killer is a widely-feared bogeyman amongst Linux sysadmins. It sometimes manages to choose exactly the wrong thing as a victim. At one time, and perhaps still, it had a particular grudge against PostgreSQL:
http://thoughts.davisjeff.com/2009/11/29/linux-oom-killer/
And in the last month or so, on systems where i work, i have seen a situation where a Puppet run on an application server provoked the OOM killer into killing the application, and another where a screwed up attempt to create a swap file on an infrastructure server provoked it into killing the SSH daemon and BIND.
I don't know what other operating systems do. Apparently all modern unixes overcommit address space in much the same way as Linux. However, i can't believe that FreeBSD handles this as crassly as Linux does.
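To see the lazy allocation described at the top of this post, here is a minimal C sketch that maps a large anonymous region, never touches it, and checks how much the process's peak resident set size grew. The function names are made up for illustration, and it assumes Linux, where getrusage reports ru_maxrss in KiB.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Peak resident set size of this process in KiB (the unit Linux uses
 * for ru_maxrss), or -1 on error. */
static long peak_rss_kib(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_maxrss;
}

/* Map `len` bytes of anonymous memory, never touch them, and report how
 * much the peak RSS grew in KiB. Returns -1 if the mapping or the
 * measurement failed. */
long rss_growth_after_reserve_kib(size_t len)
{
    long before = peak_rss_kib();
    if (before < 0)
        return -1;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    long after = peak_rss_kib();
    munmap(p, len);
    if (after < 0)
        return -1;
    return after - before;
}

#ifdef DEMO_MAIN
int main(void)
{
    size_t len = (size_t)256 << 20; /* 256 MiB of address space */
    printf("peak RSS grew by %ld KiB\n", rss_growth_after_reserve_kib(len));
    return 0;
}
#endif
```

Build with -DDEMO_MAIN to run it standalone. The growth should be tiny compared with the 256 MiB mapped, since the kernel only assigns pages when they are first touched. Writing to the region before the second measurement would make the number jump.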
> I don't know what other operating systems do. Apparently all modern unixes overcommit address space in much the same way as Linux. However, i can't believe that FreeBSD handles this as crassly as Linux does.
Solaris generally does not overcommit, unless you use MAP_NORESERVE with mmap.
In general, the Linux kernel's default OOM behaviour is undesirable for the vast majority of enterprise use cases. That's why Red Hat and many other vendors used to disable it by default (I don't know who still does).
Why is it bad? Simple; imagine your giant database is running with a large address space mapped. Another program decides to allocate a large amount of memory. The Linux kernel sees the tasty database target, kills it, and gives the smaller program its memory. Congratulations, your database just went poof.
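If you want to check how far a given box has promised more than it could deliver, /proc/meminfo reports Committed_AS (address space currently committed) and CommitLimit (the ceiling the kernel enforces only in strict mode, vm.overcommit_memory=2). Here is a rough C sketch of pulling those fields out. meminfo_field_kib is a made-up helper name, and the parsing assumes the usual "Key:   value kB" layout.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: find a "<key>: <value> kB" line in meminfo-style
 * text and return the value, or -1 if the key is absent or malformed. */
long meminfo_field_kib(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;
    while (line && *line) {
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long v;
            if (sscanf(line + klen + 1, "%ld", &v) == 1)
                return v;
            return -1;
        }
        const char *nl = strchr(line, '\n');
        line = nl ? nl + 1 : NULL; /* advance to the next line, if any */
    }
    return -1;
}

#ifdef DEMO_MAIN
int main(void)
{
    static char buf[16384];
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f) {
        perror("/proc/meminfo");
        return 1;
    }
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    printf("CommitLimit:  %ld kB\n", meminfo_field_kib(buf, "CommitLimit"));
    printf("Committed_AS: %ld kB\n", meminfo_field_kib(buf, "Committed_AS"));
    return 0;
}
#endif
```

Build with -DDEMO_MAIN to run it. On a system in the default heuristic mode, Committed_AS can exceed CommitLimit without anything being refused. That gap is the overcommit.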
There's an article that discusses the advantages/disadvantages with respect to Solaris here:
> On Linux, somewhat infamously, malloc never fails. It will always return a pointer to some fresh part of the address space. It is able to do this because, in turn, sbrk/anonymous mmap never fails - it always allocates some fresh address space. It is able to do this because Linux does not allocate physical memory (or swap) when it assigns address space, but when that address space is actually used. It will happily allocate more address space than it has memory for - a practice known as 'overcommit'. So, on Linux, you can indeed not worry about malloc failing
True. However, you can disable this behavior if you like by running 'sysctl vm.overcommit_memory=2'; see proc(5).
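A program can also check which policy is in force by reading the same setting from /proc/sys/vm/overcommit_memory. Here is a minimal C sketch, with made-up function names. The three modes are the ones documented in proc(5): 0 is the default heuristic, 1 always overcommits, 2 does strict accounting.

```c
#include <stdio.h>

/* Human-readable label for a vm.overcommit_memory value. */
const char *overcommit_mode_name(int mode)
{
    switch (mode) {
    case 0:  return "heuristic overcommit (default)";
    case 1:  return "always overcommit";
    case 2:  return "strict accounting against CommitLimit";
    default: return "unknown";
    }
}

/* Read the current mode from procfs. Returns 0, 1 or 2, or -1 if the
 * file is missing, unreadable, or holds an unexpected value. */
int read_overcommit_mode(void)
{
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    if (!f)
        return -1;
    int mode = -1;
    if (fscanf(f, "%d", &mode) != 1 || mode < 0 || mode > 2)
        mode = -1;
    fclose(f);
    return mode;
}

#ifdef DEMO_MAIN
int main(void)
{
    int mode = read_overcommit_mode();
    printf("vm.overcommit_memory = %d (%s)\n", mode, overcommit_mode_name(mode));
    return 0;
}
#endif
```

Build with -DDEMO_MAIN to run it. Only in mode 2 should a program expect malloc to return NULL when the system as a whole is out of memory.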
> On Linux, somewhat infamously, malloc never fails.
Pretty close to true, but I think that is a bit of a simplification. I seem to recall, for instance, that on 32-bit Linux it's not hard to get malloc to return NULL: ask for some absurd size, like maybe a few allocations of a gigabyte or two, something that fits in a size_t but that a 32-bit address space could not possibly accommodate alongside all the other things in it (stacks, your binary, libraries, kernel-only addresses in the page table, etc.).
http://www.scvalex.net/posts/6/ http://www.drdobbs.com/embedded-systems/malloc-madness/23160...
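The same effect is easy to provoke on 64-bit too, if the request is larger than any address space could hold. Here is a minimal C sketch, with malloc_fails as a made-up helper name. The pointer is volatile so the compiler cannot optimise the allocation away and assume it succeeded.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if malloc(size) returned NULL, 0 if it succeeded.
 * A successful allocation is freed again before returning. */
int malloc_fails(size_t size)
{
    void *volatile p = malloc(size); /* volatile: keep the call real */
    if (p == NULL)
        return 1;
    free(p);
    return 0;
}

#ifdef DEMO_MAIN
int main(void)
{
    /* Nearly the whole range of size_t: fits the type, but no process
     * address space has room for it. */
    size_t absurd = SIZE_MAX - 4096;
    printf("malloc(16) failed:      %d\n", malloc_fails(16));
    printf("malloc(absurd) failed:  %d\n", malloc_fails(absurd));
    return 0;
}
#endif
```

Build with -DDEMO_MAIN to run it. This is the address-space failure, not the out-of-memory one: it happens whatever the overcommit setting, so code that ignores a NULL return is still wrong on Linux.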