Assuming fairly cache-optimal code and a 1.5GHz clock with no superscalar gains, that's over 25 million operations per record.
Most PC video games, for example, target a refresh rate somewhere between 30Hz and 60Hz. At 60Hz you get about 17ms for an entire frame, in which the game does an incredible amount of work: drawing a whole scene, updating the world, running physics, and so on.
I think your expectations are too low. Modern hardware is very, very fast.
This whole discussion is silly; there isn't nearly enough information in the post to infer anything. The author is just making the point that it isn't a toy application.
This seems a somewhat unfair comparison. Video games use dedicated hardware to achieve their drawing throughput, and frequently for their sound and physics throughput as well. 4.32ms/rec in a high-level, garbage-collected language built around a data structure like the cons cell is an impressive result in its own right.
As has been said, though, this is all speculation.
If almost all allocations are specific to the processing of a record and become garbage once that record is done, GC is almost free. GC cost is proportional to retained allocations, not allocations made; and if old-generation memory is never modified to point to new allocations, it doesn't even need to be scanned (this can be detected by write barriers, either injected into JIT code or via page faults, so it can be quite fine-grained). That's why GC is asymptotically faster than manual allocation for this pattern, and ideally suited to record-processing and server request/response kinds of applications. If it were using manually paired allocate/free memory management, that would actually be more impressive.
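The allocation pattern described above can be sketched in Java (a reasonable stand-in, since Clojure runs on the JVM). All class, field, and method names here are illustrative, not from the post:

```java
import java.util.ArrayList;
import java.util.List;

public class RecordPipeline {
    // Long-lived state: allocated once and never mutated to point at new
    // objects, so a generational collector never needs to rescan it
    // (no write barrier ever fires for it).
    static final List<String> DICTIONARY = List.of("a", "b", "c");

    static int process(String record) {
        // Everything allocated here becomes garbage as soon as this record
        // is done -- the cheap case for a generational GC: dead young
        // objects cost nothing to collect, only survivors are copied.
        List<String> tokens = new ArrayList<>();
        for (String part : record.split(",")) {
            tokens.add(part.trim());
        }
        int hits = 0;
        for (String t : tokens) {
            if (DICTIONARY.contains(t)) hits++;
        }
        return hits;
    }

    public static void main(String[] args) {
        int total = 0;
        for (int i = 0; i < 1000; i++) {
            total += process("a, b, c");
        }
        System.out.println(total); // 3000
    }
}
```

Because `tokens` and the split strings never escape `process`, each minor collection reclaims them for free; only `DICTIONARY` survives, and it is never rescanned.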
And it's not 4.32ms/rec of work on one core; it's 4.32ms/rec across 4 cores, so estimating around 17ms of CPU time per record (like the graphics frame) is closer.
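The arithmetic behind that budget, using the figures quoted in this thread and the 1.5GHz single-issue assumption from the earlier comment:

```java
public class Budget {
    public static void main(String[] args) {
        double msPerRecord = 4.32;  // wall-clock per record, from the post
        int cores = 4;
        // Total CPU time spent per record: ~17.28 ms
        double cpuMs = msPerRecord * cores;
        // At a 1.5 GHz clock, one op per cycle: ~25.9 million ops per record
        double cyclesPerRecord = cpuMs * 1e-3 * 1.5e9;
        System.out.println(Math.round(cyclesPerRecord / 1e6)); // prints 26 (millions)
    }
}
```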
I don't agree with your suggestion that the comparison is unfair because graphics is usually accelerated (and physics very rarely is). Take a look at Pixomatic, which Mike Abrash worked on: a DirectX 7-level API, done entirely in software, and efficient enough that the game can still do all its other work in its own time. Games still have to pump an awful lot of data through to the hardware; the hardware isn't going to do the high-level scene-graph calculations, culling, and occlusion itself. It expects a list of fairly basic primitives, and takes care of transforming them into the view frustum, with depth converted into a Z-buffer value. The game still needs to make sure it doesn't hand the hardware too much geometry that doesn't actually intersect the view frustum.
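The CPU-side culling described above can be sketched as a bounding-sphere-vs-frustum test. The plane representation and all names are illustrative, not from any particular engine:

```java
public class FrustumCull {
    // A plane ax + by + cz + d = 0, with the normal pointing into the frustum.
    record Plane(double a, double b, double c, double d) {
        double signedDistance(double x, double y, double z) {
            return a * x + b * y + c * z + d;
        }
    }

    // A bounding sphere is culled if it lies entirely behind any frustum plane.
    static boolean sphereVisible(Plane[] frustum,
                                 double x, double y, double z, double r) {
        for (Plane p : frustum) {
            if (p.signedDistance(x, y, z) < -r) {
                return false; // fully outside this plane: don't submit to the GPU
            }
        }
        return true; // inside or intersecting: submit it
    }

    public static void main(String[] args) {
        // Toy frustum: just the near plane z >= 1 (normal +z, d = -1).
        Plane[] frustum = { new Plane(0, 0, 1, -1) };
        System.out.println(sphereVisible(frustum, 0, 0, 5, 1));  // true
        System.out.println(sphereVisible(frustum, 0, 0, -5, 1)); // false
    }
}
```

A real game would test against all six frustum planes (and usually a hierarchy of bounding volumes), but the principle is the same: cheap CPU-side rejection so the hardware only sees primitives that might be visible.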
We're not talking about the implementation details here; we're talking about the perception of dynamic garbage-collected languages, which this summary helps to shift from an unfairly negative light to a more balanced one. I'm well aware of why a generational GC could be the faster choice for this kind of record processing (then again, we simply do not know from the description given how much data is shared between records).
As for the comparison being unfair: the person who initially made that comparison was you, sir. Bringing real-time visual simulation into the equation is unfair for a variety of reasons, including the fact that far more optimization research has been done in that field. I'm not sure exactly what you want from a simple high-level example of "Yes, Clojure can be used to do real work," but I don't think anyone here is going to give it to you if you set goals as outlandish as matching the state of the art of a heavily funded branch of computer science and mathematics devoted to real-time rendering.