And by "codes", you mean specific legacy software artifacts written in FORTRAN (note that I'm not even spelling it as Fortran)? Of course that's a problem.
No, that means specific problem domains that are not easily partitioned, and where latency or affinity are the primary performance constraints.
There are still some Fortran libraries in large-scale use for this sort of thing. They are still in use because they are very good, and replacing them would be very expensive for little gain.
Write in any language you want; that's irrelevant to the nature of the computations being done here.
By codes they mean -- at the minimum -- pretty much anything that requires frequent communication between any or all nodes as a necessary part of computation. (For example, simulations across a large 3D space, where the changing states of particles on node A directly impact the states of particles on adjacent nodes.)
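To make the "adjacent nodes" point concrete, here is a toy sketch of that coupling. It is not real MPI: the "nodes" are just list chunks in one process, the physics is a 1D diffusion stencil standing in for a 3D particle simulation, and the names `step_single` and `step_decomposed` are made up for illustration. The point is only that every chunk needs boundary values from its neighbours on every single step.

```python
# Toy 1D stencil split across hypothetical "nodes"; no real MPI involved.

def step_single(u, alpha=0.25):
    """One explicit diffusion step on the whole grid (end values held fixed)."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
    return new


def step_decomposed(chunks, alpha=0.25):
    """Same step, but each chunk sees only its own cells plus one halo cell
    from each neighbour. Returns (new_chunks, messages_needed)."""
    messages = 0
    new_chunks = []
    for k, chunk in enumerate(chunks):
        # "Receive" the neighbouring boundary cells (the halo).
        left = chunks[k - 1][-1] if k > 0 else None
        right = chunks[k + 1][0] if k < len(chunks) - 1 else None
        messages += (left is not None) + (right is not None)

        padded = list(chunk)
        if left is not None:
            padded.insert(0, left)
        if right is not None:
            padded.append(right)

        stepped = step_single(padded, alpha)

        # Drop the halo cells again; they belong to the neighbours.
        lo = 1 if left is not None else 0
        hi = len(stepped) - (1 if right is not None else 0)
        new_chunks.append(stepped[lo:hi])
    return new_chunks, messages


if __name__ == "__main__":
    grid = [0.0] * 16
    grid[8] = 100.0
    parts = [grid[i:i + 4] for i in range(0, 16, 4)]
    parts, sent = step_decomposed(parts)
    print(sent, "halo messages for one step across", len(parts), "chunks")
```

With four chunks that is six halo messages per time step, and no chunk can start step n+1 until its neighbours have finished step n. On a real machine each of those messages pays network latency, which is why the interconnect matters so much for this class of problem.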
You can't write your code in any language you want on a supercomputer.
Also, there is a wide range of literature about communication patterns for supercomputer apps; my argument is that often, to solve the problem that matters, you may not actually need to run the simulation you think you do. It's more that people are just used to running that way.
For example, with MD (molecular dynamics), you can run 1 sim parallelized over 100 machines using tightly coupled communication (which doesn't necessarily mean the forces and positions of every particle have to be shared between node decompositions), or run 100 sims over 100 machines, with no communication except for input and output files. The latter can often answer the same question far more cheaply.
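A rough sketch of the "100 sims, no communication" style, with heavy caveats: `run_replica` is a hypothetical stand-in (a 1D random walk, not actual MD), and the seeds and step counts are arbitrary. What it shows is the shape of the computation: each replica depends only on its own input, and the only combining step happens at the very end.

```python
import random
import statistics


def run_replica(seed, steps=1000):
    """Hypothetical stand-in for one short, independent simulation:
    a 1D random walk. Returns the final squared displacement."""
    rng = random.Random(seed)  # private RNG, so replicas share no state
    x = 0
    for _ in range(steps):
        x += rng.choice((-1, 1))
    return x * x


def run_ensemble(n_replicas, steps=1000):
    """Run n independent replicas and average their results.

    The loop body only needs a seed in and a number out, so it could be
    handed to a process pool or a batch scheduler's job array unchanged.
    """
    results = [run_replica(seed, steps) for seed in range(n_replicas)]
    return statistics.mean(results)


if __name__ == "__main__":
    print("mean squared displacement:", run_ensemble(100))
```

Whether an ensemble like this answers your question depends on the science: it works when you want statistics over many short trajectories, and it does not help when you need one long trajectory of a system too big for one machine.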
No, the people who run the clusters won't let you run any language just because it has an MPI binding. They invest a lot in ensuring peak performance, and right now, only C++ and FORTRAN can achieve that. Very few, if any, major supercomputer centers support Java codes.
Oh, you're talking about a policy limitation, not a technological one. (And if you're talking about the DOE or NSF/Teragrid/XSEDE clusters, then you're probably right. Haven't touched those in years -- and even when I did, I wasn't doing anything crazy.)
To be frank, if I were running a computer that was designed for peak performance, I probably wouldn't use Java. There are some very significant performance issues with garbage collection that prevent you from making peak use of the machine.
Supercomputers aren't built so that people can squander the resource (desktop PCs, closet clusters, and phones fulfill that role).
It is a technical limitation. Oftentimes the platform is so specialized that only a tiny handful of compilers are ported to it. Say, just gcc, g++, and gfortran, and xlc, xlC, and xlf. And just one version at that. Java would require porting the JVM to the cut-down, weird Linux on the compute nodes. Some $$$ machines don't even support dynamic linking! The number of these machines is so small that extensive compiler and tool support just isn't happening unless you want to add millions to the cost.
The JVM probably calls fork() and system(), no? Not allowed. Dynamic thread creation? Not allowed. And 50% of your flops go away unless your program uses the BG/Q-specific "double hummer" floating point instructions. These are primitive machines in terms of development environment, and they typically require significant rewriting to get even "standard" system software working.