Assuming fairly cache-optimal code and a 1.5GHz clock with no superscalar gains, that's over 25 million operations per record.
Most PC video games, for example, target a refresh rate somewhere between 30Hz and 60Hz. At 60Hz you get about 17ms for an entire frame, in which the game does an incredible amount of work: drawing a whole scene, updating the world, running physics, and so on.
I think your expectations are too low. Modern hardware is very, very fast.
This whole discussion is silly; there isn't nearly enough information in the post to infer anything. The author is just making the point that it isn't a toy application.
This seems a somewhat unfair comparison. Video games use dedicated hardware to achieve their drawing throughput, and frequently for their sound and physics throughput as well. 4.32ms/rec in a high-level, garbage-collected language built around a data structure like the cons cell is an impressive result in its own right.
As has been said, though, this is all speculation.
If almost all allocations are specific to the processing of a record and become garbage once that record is done, GC is almost free. GC cost is proportional to retained allocations, not allocations made; and if old-generation memory is never modified to point to new allocations, it doesn't even need to be scanned (this can be detected by write barriers, either injected into JIT code or via page faults, so it can be quite fine-grained). That's why GC is asymptotically faster than manual allocation for this pattern, and ideally suited to record-processing and server request/response kinds of applications. If it were using manually paired allocate/free memory management, that would actually be more impressive.
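The allocation pattern described above can be sketched in Java (a reasonable stand-in, since Clojure runs on the JVM). All class, field, and method names here are illustrative, not from the post:

```java
import java.util.ArrayList;
import java.util.List;

public class RecordPipeline {
    // Long-lived state: allocated once and never mutated to point at new
    // objects, so a generational collector never needs to rescan it
    // (no write barrier ever fires for it).
    static final List<String> DICTIONARY = List.of("a", "b", "c");

    static int process(String record) {
        // Everything allocated here becomes garbage as soon as this record
        // is done -- the cheap case for a generational GC: dead young
        // objects cost nothing to collect, only survivors are copied.
        List<String> tokens = new ArrayList<>();
        for (String part : record.split(",")) {
            tokens.add(part.trim());
        }
        int hits = 0;
        for (String t : tokens) {
            if (DICTIONARY.contains(t)) hits++;
        }
        return hits;
    }

    public static void main(String[] args) {
        int total = 0;
        for (int i = 0; i < 1000; i++) {
            total += process("a, b, c");
        }
        System.out.println(total); // 3000
    }
}
```

Because `tokens` and the split strings never escape `process`, each minor collection reclaims them for free; only `DICTIONARY` survives, and it is never rescanned.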
And it's not 4.32ms/rec of work on one core; it's 4.32ms/rec across 4 cores, so estimating around 17ms of CPU time per record (like the graphics frame) is closer.
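The arithmetic behind that budget, using the figures quoted in this thread and the 1.5GHz single-issue assumption from the earlier comment:

```java
public class Budget {
    public static void main(String[] args) {
        double msPerRecord = 4.32;  // wall-clock per record, from the post
        int cores = 4;
        // Total CPU time spent per record: ~17.28 ms
        double cpuMs = msPerRecord * cores;
        // At a 1.5 GHz clock, one op per cycle: ~25.9 million ops per record
        double cyclesPerRecord = cpuMs * 1e-3 * 1.5e9;
        System.out.println(Math.round(cyclesPerRecord / 1e6)); // prints 26 (millions)
    }
}
```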
I don't agree with your suggestion that the comparison is unfair because graphics is usually accelerated (and physics very rarely is). Take a look at Pixomatic, which Mike Abrash worked on: a DirectX 7-level API, done entirely in software, and efficient enough that the game can still do all its other work in its own time. Games still have to pump an awful lot of data through to the hardware; the hardware isn't going to do the high-level scene-graph calculations, culling, and occlusion itself. It expects a list of fairly basic primitives, and takes care of transforming them into the view frustum, with depth converted into a Z-buffer value. The game still needs to make sure it doesn't hand the hardware too much geometry that doesn't actually intersect the view frustum.
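The CPU-side culling described above can be sketched as a bounding-sphere-vs-frustum test. The plane representation and all names are illustrative, not from any particular engine:

```java
public class FrustumCull {
    // A plane ax + by + cz + d = 0, with the normal pointing into the frustum.
    record Plane(double a, double b, double c, double d) {
        double signedDistance(double x, double y, double z) {
            return a * x + b * y + c * z + d;
        }
    }

    // A bounding sphere is culled if it lies entirely behind any frustum plane.
    static boolean sphereVisible(Plane[] frustum,
                                 double x, double y, double z, double r) {
        for (Plane p : frustum) {
            if (p.signedDistance(x, y, z) < -r) {
                return false; // fully outside this plane: don't submit to the GPU
            }
        }
        return true; // inside or intersecting: submit it
    }

    public static void main(String[] args) {
        // Toy frustum: just the near plane z >= 1 (normal +z, d = -1).
        Plane[] frustum = { new Plane(0, 0, 1, -1) };
        System.out.println(sphereVisible(frustum, 0, 0, 5, 1));  // true
        System.out.println(sphereVisible(frustum, 0, 0, -5, 1)); // false
    }
}
```

A real game would test against all six frustum planes (and usually a hierarchy of bounding volumes), but the principle is the same: cheap CPU-side rejection so the hardware only sees primitives that might be visible.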
We're not talking about the implementation details here; we're talking about the perception of dynamic garbage-collected languages, which this summary helps to shift from an unfairly negative light to a more balanced one. I'm well aware of why a generational GC could be the faster choice for this kind of record processing (then again, we simply do not know from the description given how much data is shared between records).
As for the comparison being unfair: the person who initially made that comparison was you, sir. Bringing real-time visual simulation into the equation is unfair for a variety of reasons, including the fact that far more optimization research has been done in that field. I'm not sure exactly what you want from a simple high-level example of "Yes, Clojure can be used to do real work," but I don't think anyone here is going to give it to you if you set goals as outlandish as matching the state of the art of a heavily funded branch of computer science and mathematics devoted to real-time rendering.