Ask HN: Erlang and network connections?
6 points by Tichy on May 18, 2008 | 7 comments
I was interested in the latest use of Erlang for Comet by Facebook. However, thinking further about it, I wonder how the HTTP connections are typically implemented (on Linux, say)? Erlang solves the problem of having lots of threads listening, but I am guessing it does not solve the problem of having lots of connections? Is there one thread listening to a network adapter, distributing the packets to the "virtual" connections? Or does the OS spawn a thread for every connection it creates (presumably in some arcane C implementation)?

Maybe for the Comet problem it would make more sense to optimize at that point, i.e. only spawn threads on incoming data? Erlang might still come in handy, depending on the number of concurrent messages, but perhaps not be the killer app it looks to be?



Most operating systems have some highly efficient way of handling many open network connections, for example epoll on Linux, kqueue on the BSDs, and event ports on Solaris. These are much more efficient than select() with large numbers of open sockets, and the Erlang VM will use the appropriate one for the host OS when run with the +K true option.
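To make the "appropriate one for the host OS" idea concrete, here is a minimal Python sketch. It is not Erlang and not how the Erlang VM is implemented. Python's standard `selectors.DefaultSelector` makes a similar choice (epoll, kqueue, or a poll/select fallback, depending on the platform); the helper name `wait_for_readable` is made up for illustration.

```python
import selectors
import socket


def wait_for_readable(sockets, timeout=1.0):
    """Return the sockets that have data waiting (hypothetical helper)."""
    # DefaultSelector picks the most efficient mechanism the platform offers:
    # epoll on Linux, kqueue on BSD/macOS, otherwise poll or select.
    sel = selectors.DefaultSelector()
    try:
        for s in sockets:
            sel.register(s, selectors.EVENT_READ)
        # One call watches every registered socket at once.
        return [key.fileobj for key, _events in sel.select(timeout)]
    finally:
        sel.close()
```

With two connected socket pairs where only one has received data, the function returns just that socket, without a thread per connection.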

The SMP Erlang VM (as of Erlang/OTP R11B, R12B highly recommended) runs, by default, one process scheduler per CPU. So on a dual-core machine you can have two sockets being read from/written to simultaneously. Erlang allows you to program to a very simple model of one Erlang process per socket, with the VM using select/epoll/kqueue/etc to determine which processes have incoming data and are thus runnable.
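As a rough analogy for the one-process-per-socket model, here is a sketch in Python's asyncio rather than Erlang. Each connection gets its own coroutine written as straight-line code, and the event loop uses the OS poller to decide which coroutine is runnable. The names `handle_connection` and `demo` are illustrative, and unlike the SMP Erlang VM this runs on a single thread.

```python
import asyncio


async def handle_connection(reader, writer):
    # One lightweight task per socket. The read looks blocking, but the
    # event loop only resumes this coroutine when the poller reports data.
    while data := await reader.readline():
        writer.write(data.upper())
        await writer.drain()
    writer.close()


async def demo(n_clients=3):
    # Port 0 asks the OS for any free port (an assumption for the demo).
    server = await asyncio.start_server(handle_connection, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]

    async def client(i):
        reader, writer = await asyncio.open_connection("127.0.0.1", port)
        writer.write(f"hello {i}\n".encode())
        await writer.drain()
        reply = await reader.readline()
        writer.close()
        await writer.wait_closed()
        return reply

    async with server:
        # All clients are in flight at once; gather keeps argument order.
        return await asyncio.gather(*(client(i) for i in range(n_clients)))
```

Calling `asyncio.run(demo(3))` drives three concurrent connections through the same loop, each handled by its own coroutine.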

Native processes use a lot of resources on most operating systems, so a process-per-socket model doesn't scale to large numbers of connections. Many web servers therefore use a pool of processes/threads and queue up incoming requests that exceed the pool size. Others use relatively few processes and call epoll/kqueue/etc. directly.
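The pooled approach can be sketched like this in Python, assuming a fixed-size thread pool and a sleep standing in for blocking I/O. The function `run_requests` and its bookkeeping are hypothetical, there only to show that requests beyond the pool size wait in a queue instead of getting their own thread.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_requests(n_requests, pool_size):
    """Handle n_requests with at most pool_size worker threads."""
    lock = threading.Lock()
    active = 0
    peak = 0

    def handle(i):
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)  # stand-in for blocking socket I/O
        with lock:
            active -= 1
        return i * 2

    # Work submitted beyond pool_size sits in the executor's internal queue.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        results = list(pool.map(handle, range(n_requests)))
    return results, peak
```

The returned `peak` never exceeds `pool_size`, which is the point: the thread count stays bounded no matter how many requests arrive.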


Very interesting, thanks - also for the other replies by davidw and wmf.


In the normal case, Erlang is only one OS process. It handles everything via select. The only limits are OS limits on open sockets and things like that.


I'm pretty sure it uses kernel events (epoll, kqueue, /dev/poll) if available.


What is a select? And if Erlang is only one OS process, how does it benefit from multicore CPUs?

But my real question is, how does the OS (Linux) handle the open connections (a.k.a. open sockets, I guess)?


Here are some in-depth resources about asynchronous network IO:

http://www.unpbook.com/ (I'm glad to see that someone took over from Stevens so that these books don't languish.)

http://www.kegel.com/c10k.html


The select syscall:

http://linux.die.net/man/2/select

That's how most high-throughput networking works.
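For anyone who hasn't used it, a small sketch of what a select loop looks like, using Python's thin wrapper over the syscall. The helper `drain` is made up for illustration: a single thread hands select a list of sockets and gets back only the ones that can be read without blocking.

```python
import select
import socket


def drain(socks, expected, timeout=1.0):
    """Read one message from each socket as it becomes readable."""
    received = {}
    while len(received) < expected:
        # Blocks until at least one socket is readable or the timeout hits.
        readable, _writable, _errored = select.select(socks, [], [], timeout)
        if not readable:
            break  # timed out with nothing ready
        for s in readable:
            received[s] = s.recv(1024)
    return received
```

select rescans the whole list on every call, which is why epoll and kqueue do better once there are thousands of sockets.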



