"Among these, Pony is the only language that concurrently garbage collects actor...

rdtsc · on May 3, 2015

> If so, why can't Erlang GC multiple actors at a time? It does full SMP for actor execution.

It does. The article is incorrect. Erlang has a fully concurrent garbage collector among actors. One actor's GC running on one CPU scheduler will not interfere with execution of actors running in other CPU schedulers.

0cachecoherency · on May 3, 2015

There's a difference between GC'ing the memory reachable from an actor and GC'ing the actors themselves. Erlang requires a "poison pill" message to kill actors.

The research paper is here:

http://ponylang.org/papers/opsla237-clebsch.pdf

istvan__ · on May 3, 2015

Also, looking at the white paper I see really weird numbers.

This section in particular: Benchmarks and preliminary comparisons with Erlang, Scala, Akka, and libcppa (Table 1 and Table 2).

It seems that you have the exact same numbers for Erlang and Scala in table2, this is very hard to believe. Either you accidentally put down the same number twice, or you measured the performance of your benchmarking tool; otherwise it is extremely unlikely that two entirely different systems would give exactly the same measurement. It's a similar story in table1. Maybe I am missing something terribly obvious, but this looks off to me.

pedrocr · on May 3, 2015

>It seems that you have the exact same numbers for Erlang and Scala in table2, this is very hard to believe.

The numbers are taken from another source and it seems they didn't have the exact numbers, hence the ~9s figure which is then used to calculate 333,333. They seem to have gotten the numbers by looking up the graphs on this page:

http://libcppa.blogspot.co.uk/search/label/benchmark

Makes me wonder how precise you could be at getting the numbers from just the graphs. Pixel precision at 800x600 is probably not too bad.

wpietri · on May 3, 2015

I am not an expert in benchmarking, so maybe I'm missing something. But how is that not crazy?

If I were looking to compare two things, I would run all the benchmarks on a single machine under my control. I might look at previously published reports to make sure I was getting comparable numbers. But there is no way I would publish a comparison that I merely hoped was apples to apples. I've just had too many benchmarks depend on subtle issues, ones that I had presumed were irrelevant.

decklebench · on May 4, 2015

The benchmarks reported on the web page (http://ponylang.org) and also reported at http://ponylang.org/papers/fast-cheap.pdf are more recent; for these, the code for the different languages was run on one machine.

istvan__ · on May 3, 2015

I am checking the source code of this benchmark and trying to reproduce the numbers.

istvan__ · on May 3, 2015

There is, but what is relevant about the design of Erlang's concurrent GC is that the latency of your actor operations is not impacted by it. This is why Erlang is extremely suitable for HTTP routers and request dispatch: you can maintain a tight SLA on the p99.99 latency, as opposed to something like the JVM, where the GC locks up all execution, or at least this used to be the case.

0cachecoherency · on May 3, 2015

If you're interested in the object GC portion, there's this:

http://ponylang.org/papers/OGC.pdf

The Pony object garbage collector is fully concurrent: the reachable memory for any actor is GC'd totally independently. At the same time, Pony allows pointers to be shared across actors safely, with no data races, for performance (i.e. without copying).

There's a paper on the type system that allows this:

http://ponylang.org/papers/fast-cheap.pdf

istvan__ · on May 3, 2015

What I am saying is that the Erlang GC is good enough from a practical point of view. I am not sure what value you are trying to add with the "fully concurrent" GC.

jlouis · on May 3, 2015

It's not the same thing. The post is talking about automatically figuring out that nobody knows the Pid of a process and hence that process can be reclaimed. This is a transitive notion: If a group of processes can't be "reached" from another group and the latter group is the "important" one, then you can just kill off the first group of processes. This is why it is GC-like behaviour. Like in a GC, processes can "leak" if you forget to throw the Pid away.

Erlang's method is to form linked webs of processes and then the death of a process "poisons the web" and kills off all processes in the web. By trapping exits, you can put in stopgaps for this behaviour, which is what supervisors do, among other things.

Process handling in Erlang is more akin to "manual memory management" or ARC/RAII style memory management here.
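
A rough way to picture the transitive notion, as a Python sketch rather than anything taken from the Pony paper or the Erlang runtime: treat "knows the Pid of" as edges in a graph, take the "important" processes as roots, and anything the roots cannot reach is a candidate for reclaiming. The `knows` mapping and the `reclaimable` function are made-up names for illustration only.

```python
from collections import deque

def reclaimable(knows, roots):
    """Return the processes that no root can reach by following held Pids.

    knows: dict mapping each process name to the set of processes whose
           Pid it holds (hypothetical representation, for illustration).
    roots: the "important" processes that must stay alive.
    """
    reachable = set(roots)
    queue = deque(roots)
    while queue:
        proc = queue.popleft()
        for other in knows.get(proc, ()):
            if other not in reachable:
                reachable.add(other)
                queue.append(other)
    # Anything never visited is unreachable, including cycles among
    # unreachable processes that still hold each other's Pids.
    return set(knows) - reachable


knows = {
    "supervisor": {"worker_a"},
    "worker_a": set(),
    "orphan_1": {"orphan_2"},   # orphan_1 and orphan_2 only know each other
    "orphan_2": {"orphan_1"},
}
print(sorted(reclaimable(knows, {"supervisor"})))
```

Here the two orphans keep each other's Pids alive but nothing important can reach them, so both show up as reclaimable.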

toast0 · on May 4, 2015

> The post is talking about automatically figuring out that nobody knows the Pid of a process and hence that process can be reclaimed.

This seems pretty impossible in distributed Erlang. Perhaps the Pid was sent to another node (which may be alive but not currently dist connected), or was serialized and may be deserialized later.

istvan__ · on May 3, 2015

Thanks, I'm just trying to understand the practical advantage of this different approach.

barrelrider · on May 4, 2015

It's like jlouis said: with Erlang you have to kill your processes off when you've finished with them, and if you don't, they leak. In Pony that's done for you automatically.

istvan__ · on May 4, 2015

Thanks. I think you end up killing your Erlang processes most of the time because this is the model you follow when programming in Erlang. Using HTTP as an example: while in other systems it is a really bad idea to have a 1 req -> 1 process (or thread) mapping, in Erlang it is encouraged. When the request is answered and the response is sent back, the process dies. I think it is a fairly simple model. I guess I need to look into Pony more to understand why this matters there.

rdtsc · on May 3, 2015

Can you explain some more, or point to the exact part of the research paper? When you say "GC'ing actors themselves vs GC'ing memory reachable from actor", what exactly do you mean? Are you talking about the process dictionary and mailbox vs the state of the actor passed through its loop function?

For argument's sake, here is what a simple Erlang actor looks like:

    loop(State) ->
      receive
        Msg ->
          %% handle/2 is a placeholder for whatever the actor does
          NewState = handle(Msg, State),
          loop(NewState)
      end.

Maybe you can spawn it with:

    Pid = spawn(fun() -> loop(InitState) end).

Matthias247 · on May 3, 2015

I guess the claim was that such an Erlang actor would run forever if nobody sends it a message to stop it, whereas Pony would detect that the actor is no longer reachable and stop it automatically.

SnowyOwl · on May 3, 2015

Indeed, Pony can automatically detect actors with empty stacks and message queues, also when these are part of a cycle.

rdtsc · on May 4, 2015

Erlang has something like that as well:

http://erldocs.com/17.3/erts/erlang.html#hibernate/3

Hibernate discards the call stack, shrinks the process's memory as much as possible, and puts the process into a wait state. Then if a message is sent to it, it "wakes" up.

panic · on May 3, 2015

Erlang can "leak" processes if the process doesn't exit once it's no longer needed. Pony (I assume) will clean up a process automatically once there are no more references to it.

decklebench · on May 4, 2015

Exactly: it will clean up an actor once there are no messages to it, it has no work to do, and there are no references to it (or any references come from actors which have no work themselves, i.e. are in a cycle).
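
To make that condition concrete, here is a small Python model of it. This is only a sketch of the rule as stated in this comment, not the protocol from the paper: the `actors` and `refs` structures and the `collectable` function are invented for illustration, and a real runtime also has to cope with references travelling in messages that are still in flight, which this ignores.

```python
def collectable(actors, refs):
    """Return the actors that satisfy the condition described above.

    actors: dict mapping actor name -> True if the actor is idle
            (empty message queue and no work to do), else False.
    refs:   dict mapping actor name -> set of actors it holds references to.
    Both structures are hypothetical, chosen only for this illustration.
    """
    # Start by assuming every idle actor can go, then repeatedly rule out
    # any candidate that is referenced by an actor outside the candidate set.
    candidates = {name for name, idle in actors.items() if idle}
    changed = True
    while changed:
        changed = False
        for holder, targets in refs.items():
            if holder in candidates:
                continue  # references from other candidates do not count
            kept = targets & candidates
            if kept:
                candidates -= kept
                changed = True
    return candidates


actors = {"a": True, "b": True, "c": True, "d": False, "e": True}
refs = {
    "a": {"b"}, "b": {"a"},   # idle cycle: only reference each other
    "c": set(),               # idle and unreferenced
    "d": {"e"},               # d still has work and refers to e
    "e": set(),
}
print(sorted(collectable(actors, refs)))
```

In this toy graph `a`, `b` and `c` come out as collectable, while `e` is kept because `d`, which still has work, refers to it.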

amelius · on May 4, 2015

I'm wondering about something. In big systems you'll see actors on different platforms communicating with each other. For example, a JavaScript actor would communicate with a server actor. How would garbage collection work in that case? Would cycles be detected across machine boundaries?

digitalzombie · on May 4, 2015

Perhaps you're thinking of some different concept of actor?

I don't recall JavaScript supporting, or having any library that would enable, the actor model of concurrency.

The general idea of the actor model is that each actor is a process with a queue (mailbox), and actors talk to each other via messages. JavaScript does not have such a thing, nor does a server actor?