How many of those take advantage of GPU floating point power? This is no longer a nice-to-have but a critical feature, since, as the new Mac Pro shows, there's going to be orders of magnitude more GPU than CPU power available in newer workstations.
I've seen some, like Furry Ball, that support only CUDA, but very few are vendor-agnostic with OpenCL. I hope AMD tries to fix this.
For doing production VFX shots, unless there's fur/hair or volumetrics involved, you're pretty much IO constrained in terms of pulling in textures and paging them in memory.
Often scenes can have > 300GB of textures total. Out-of-core rendering is not something that GPU renderers can do very well yet (redshift can do it a bit), so the 12GB limit on GPU memory is a huge problem. Arnold and PRMan both have deferred shading which means for GI they can re-order texture requests for similar texture mipmap levels to reduce the swapping of textures needed. I'm not aware of any GPU renderers that can do that as of yet.
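Not Arnold's or PRMan's actual code, just a toy sketch of the reordering idea (all names here are hypothetical): if you defer shading and sort the queued texture lookups by (texture, mip level), requests that hit the same tile set become adjacent, so each set is paged in once per batch instead of once per request.

```python
def reorder_shading_requests(requests):
    """Sort deferred shading requests so lookups that hit the same
    texture and mip level are adjacent, minimising texture paging.
    Each request is a (texture_id, mip_level, payload) tuple."""
    return sorted(requests, key=lambda r: (r[0], r[1]))

def count_page_ins(requests):
    """Count how often the renderer must swap in a new (texture, mip)
    tile set, assuming only one is resident at a time (worst case)."""
    swaps, current = 0, None
    for tex, mip, _ in requests:
        if (tex, mip) != current:
            swaps += 1
            current = (tex, mip)
    return swaps

# Interleaved requests thrash the cache; reordered ones don't.
reqs = [("wood", 3, "hit A"), ("rock", 1, "hit B"),
        ("wood", 3, "hit C"), ("rock", 1, "hit D")]
print(count_page_ins(reqs))                            # 4 swaps
print(count_page_ins(reorder_shading_requests(reqs)))  # 2 swaps
```

The real systems batch per tile and mip level with an LRU texture cache, but the win comes from the same locality effect.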
On top of this, geometry assets are always overbuilt these days, with millions of polys/subds, and often multiple displacement layers on top of this. This geometry requires even more memory.
And the biggest studios are rendering deep output, requiring even more memory for the output files as more of the render samples need to be kept in memory for the render duration.
And while GPUs are faster, they're not THAT much faster for raytracing - a top-end dual-Xeon setup can almost match a K6000 (around 85%) at pure ray/triangle intersection speed.
Where GPUs can win is the fact it's a lot easier to cram multiple GPUs into a single workstation than get a 4 socket CPU system.
In a traditional render farm that's not an issue. As the new supercomputing centres have shown, there's a huge push to add massive amounts of GPU power to these because it's more cost effective.
I'm not saying GPU rendering is easy, it's way harder to do correctly because of memory constraints, but the performance of GPUs over CPUs (compute per watt, compute total) is a gap that's only getting wider. It's easy to add a few thousand more shaders, just add another card. Not easy to strap on another 12-core chip, you need to engineer from the ground up for that.
A GPU also has access to all system memory via DMA, so that shouldn't be an issue. It's just going to be hard to coordinate that data transfer in a performant way.
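A toy back-of-envelope model of why that coordination matters (the bandwidth figures are assumed round numbers, not benchmarks): streaming chunks over PCIe while the GPU computes only hides the transfer cost when the compute time per chunk exceeds the transfer time per chunk.

```python
# Assumed round-number bandwidths, not benchmarks:
PCIE_GBPS = 12.0   # roughly PCIe 3.0 x16 effective

def serial_time(data_gb, compute_s):
    """Naive approach: copy everything over PCIe, then compute."""
    return data_gb / PCIE_GBPS + compute_s

def overlapped_time(data_gb, compute_s, chunks=32):
    """Split the job into chunks and overlap DMA transfers with
    compute: steady state is bounded by the slower of the two stages."""
    xfer = data_gb / PCIE_GBPS / chunks
    comp = compute_s / chunks
    # the first chunk must land on the GPU before any compute starts
    return xfer + chunks * max(xfer, comp)

# Streaming 48 GB of out-of-core scene data with 10 s of shading work:
print(f"serial: {serial_time(48, 10):.2f}s, "
      f"overlapped: {overlapped_time(48, 10):.2f}s")
```

When transfer per chunk dominates instead, overlapping degenerates to PCIe-bandwidth-bound, which is the out-of-core worst case.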
In a traditional render farm that is an issue, as the vast majority of render farms at VFX companies are CPU-only. Render farms aren't used just for 3D rendering; they're also used for comp rendering, fluid sims (often using up to 96 GB of RAM), physics sims, etc.
Very little of this is GPU-based currently.
Where GPUs are starting to be used is on the artist workstation to speed up iteration/preview time. But not on renderfarms.
GPU memory access across the PCIe bus is ridiculously slow for smaller jobs. It's often the case that the slower CPU can finish the work in less than the cumulative time of copying the data to the GPU, having the GPU do the work, and copying it back again. For longer jobs on smaller data, GPUs make sense.
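A rough break-even sketch (the speedup and bandwidth are assumed round numbers, not measurements of any particular card): the GPU only wins once the compute saved exceeds the round-trip transfer cost.

```python
def cpu_time(work_s):
    """Baseline: the CPU just does the work, no transfers needed."""
    return work_s

def gpu_time(work_s, data_gb, speedup=6.0, pcie_gbps=12.0):
    """Total wall time on the GPU: copy the data in, compute
    (speedup x faster than the CPU), copy the results back.
    speedup and pcie_gbps are illustrative assumptions."""
    transfer = 2 * data_gb / pcie_gbps  # host->device and device->host
    return transfer + work_s / speedup

# Short job, lots of data: the CPU finishes before the copies do.
print(cpu_time(1.0) < gpu_time(1.0, 8.0))    # True
# Long job on the same data: the transfer cost amortises, GPU wins.
print(cpu_time(60.0) > gpu_time(60.0, 8.0))  # True
```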
GPUs also have much higher power draw than CPUs and produce much more heat, which is a huge issue for renderfarms with close to 100% utilisation at crunch time.
I've used Octane, which is a GPU-only renderer. Unfortunately, it is VRAM-per-GPU bound. In other words, you cannot render any scene larger than the usable VRAM per GPU. High end video cards with lots of VRAM also tend to have more than one GPU, so the VRAM-per-GPU figure is actually (total VRAM)/(total GPUs), with slightly less than that actually available for the scene.
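The arithmetic is worth spelling out, since the headline VRAM number on a dual-GPU card is misleading (the 0.5 GB overhead below is an assumed figure; it varies by card and driver):

```python
def usable_scene_vram(total_vram_gb, num_gpus, overhead_gb=0.5):
    """VRAM actually available to the scene on a multi-GPU card:
    the advertised total is split across the GPUs, and the driver
    and framebuffer eat a slice of each GPU's share."""
    return total_vram_gb / num_gpus - overhead_gb

# A hypothetical 12 GB dual-GPU card leaves ~5.5 GB for the scene:
print(usable_scene_vram(12, 2))  # 5.5
```

So a "12 GB" card renders scenes less than half that size, which is why the per-GPU figure, not the box spec, is the hard limit.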
I've also experimented a bit with Lux which has a hybrid CPU/GPU mode. However I've found it isn't necessarily any faster than CPU only on my system (which has a lot of CPU cores) and it isn't as stable.
AFAIK, there are no video cards currently available with more than 6GB per GPU, since something like an Nvidia Titan Z with 12GB has to share that between 2 GPUs.
It's conceivable that as GPU rendering becomes more commonplace we'll start to see manufacturers loading more and more RAM onto high-end cards, possibly at the expense of compute units if power consumption is a problem. After all, a render farm with many separate cards is just as fast as one card with more compute units, but VRAM per GPU is currently a hard limit that will affect anyone rendering very large, complex scenes.
Xeon Phi may well cement this further. Knights Landing, due in 2015, is going to feature 72 Atom cores (288 threads) with AVX, socketed in a standard Xeon motherboard.
In situations where your problem domain doesn't fit comfortably within the memory restrictions of a GPU, or porting a legacy code base is difficult, this could be a very interesting option.
It is a non-starter for the successful commercial renderers because the majority of their market already has render farms with standard CPUs. Also studios generally use a mix of tools and if only one of them is optimized for Xeon Phi's that isn't enough of a motivation to spend the money on Xeon Phi's.
It is generally a no go in the mainstream rendering market, although cloud-based renderers can use specialized solutions.
You may be able to accelerate some steps on a GPU, but rendering usually involves "a lot of data" - if you add up all the resources for a given scene, it can easily be on the order of gigabytes.
This is not like bitcoin mining where you have to do a lot of math in a small batch of data.