Why not? The Thrust folks have done a lot of good work implementing a highly optimized radix sort on the GPU.
That said, there is interesting academic work on GPU sorting that achieves even higher performance than Thrust in many scenarios, and we are looking at the feasibility of incorporating a framework we have found particularly promising.
Can Thrust's sort operate on datasets larger than GPU and CPU memory, or does it require manually combining smaller sort operations into larger sorted sequences, akin to merge sort?
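To make the manual approach in the question concrete, here is a minimal host-side sketch of "sort chunks, then merge". It is an illustration under stated assumptions, not a description of what Thrust does internally: `std::sort` stands in for whatever per-chunk GPU sort is used (for example `thrust::sort` on a chunk copied to the device), and `sort_chunks` and `merge_runs` are hypothetical helper names. In a real out-of-core setting the sorted runs would be written to disk and streamed back rather than held in memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical helper: split the input into chunks of at most chunk_size
// elements and sort each chunk independently into a sorted "run".
// Assumption: std::sort is a stand-in for a per-chunk GPU sort.
std::vector<std::vector<int>> sort_chunks(const std::vector<int>& data,
                                          std::size_t chunk_size) {
    assert(chunk_size > 0);
    std::vector<std::vector<int>> runs;
    for (std::size_t begin = 0; begin < data.size(); begin += chunk_size) {
        const std::size_t end = std::min(begin + chunk_size, data.size());
        std::vector<int> run(data.begin() + begin, data.begin() + end);
        std::sort(run.begin(), run.end());
        runs.push_back(std::move(run));
    }
    return runs;
}

// Hypothetical helper: k-way merge of the sorted runs using a min-heap
// keyed on (value, run index), as in the merge phase of an external sort.
std::vector<int> merge_runs(const std::vector<std::vector<int>>& runs) {
    using Entry = std::pair<int, std::size_t>;  // (value, index of its run)
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> heap;
    std::vector<std::size_t> next(runs.size(), 0);  // read cursor per run

    std::size_t total = 0;
    for (std::size_t r = 0; r < runs.size(); ++r) {
        total += runs[r].size();
        if (!runs[r].empty()) heap.push(Entry(runs[r][0], r));
    }

    std::vector<int> out;
    out.reserve(total);
    while (!heap.empty()) {
        const Entry smallest = heap.top();
        heap.pop();
        out.push_back(smallest.first);
        const std::size_t r = smallest.second;
        // Advance that run's cursor and push its next element, if any.
        if (++next[r] < runs[r].size()) heap.push(Entry(runs[r][next[r]], r));
    }
    return out;
}
```

Calling `merge_runs(sort_chunks(data, chunk_size))` yields the fully sorted sequence; the chunk size would be picked so that one chunk fits in whichever memory (GPU or host) performs the per-chunk sort.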