This is not remotely true for a wide range of codes that matter to the supercomp...

jakub_h · on July 30, 2015

And by "codes", you mean specific legacy software artifacts written in FORTRAN (note that I'm not even spelling it as Fortran)? Of course that's a problem.

ska · on July 30, 2015

No, that means specific problem domains that are not easily partitioned, and where latency or affinity are the primary performance constraints.

There are still some Fortran libraries in large-scale use for this sort of thing. They are still in use because they are very good, and replacing them would be very expensive for little gain.

apawloski · on July 30, 2015

Write in any language you want; that's irrelevant to the nature of the computations being done here.

By codes they mean -- at a minimum -- pretty much anything that requires frequent communication between any or all nodes as a necessary part of the computation. (For example, simulations across a large 3D space, where the changing states of particles on node A directly impact the states of particles on adjacent nodes.)
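To make the adjacent-node dependency concrete, here is a minimal sketch, not a real parallel code. It assumes a toy 1D periodic diffusion step standing in for the 3D particle simulation, and it models the "nodes" as plain Python lists in one process, with no MPI involved. The function names and the `alpha` parameter are hypothetical, chosen only for illustration.

```python
# Toy 1D diffusion on a periodic grid, split across hypothetical "nodes".
# The nodes are plain lists in one process (an assumption for illustration);
# in a real code the halo values below would travel over the interconnect.

def step_single(u, alpha=0.25):
    """One explicit diffusion step with the whole grid held in one place."""
    n = len(u)
    return [u[i] + alpha * (u[(i - 1) % n] - 2 * u[i] + u[(i + 1) % n])
            for i in range(n)]

def split(u, p):
    """Split the grid into p equal chunks (assumes len(u) is divisible by p)."""
    size = len(u) // p
    return [u[r * size:(r + 1) * size] for r in range(p)]

def step_decomposed(chunks, alpha=0.25):
    """Same step, but each node only owns its chunk plus two halo cells."""
    p = len(chunks)
    new_chunks = []
    for rank, local in enumerate(chunks):
        # Halo exchange: one boundary cell from each neighbor, every step.
        left_halo = chunks[(rank - 1) % p][-1]
        right_halo = chunks[(rank + 1) % p][0]
        padded = [left_halo] + local + [right_halo]
        new_chunks.append([
            padded[i] + alpha * (padded[i - 1] - 2 * padded[i] + padded[i + 1])
            for i in range(1, len(padded) - 1)
        ])
    return new_chunks
```

Calling `step_decomposed(split(u, 4))` repeatedly should reproduce `step_single(u)`, but only because every node reads its neighbors' boundary cells on every step. That per-step exchange is why latency, rather than raw throughput, tends to dominate this kind of workload.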

dekhn · on July 30, 2015

You can't write your code in any language you want on a supercomputer.

Also, there is a wide range of literature about communication patterns for supercomputer apps; my argument is that often, to solve the problem that matters, you may not actually need to run the simulation you think you do. It's more that people are just used to running that way.

For example, with MD, you can run one simulation parallelized over 100 machines using tightly coupled communication (which doesn't necessarily mean the forces and positions of every particle have to be shared between node decompositions), or run 100 simulations over 100 machines, with no communication except for input and output files. The latter can often answer the same question far more cheaply.
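A rough sketch of the second approach, under loud assumptions: a 1D random walk stands in for an MD run, and the replicas run in a plain loop rather than on separate machines. `run_replica` and `ensemble_estimate` are hypothetical names, not part of any MD package.

```python
import random
import statistics

def run_replica(seed, n_steps=1000):
    """Hypothetical stand-in for one small, independent simulation.

    Performs a 1D random walk and returns the squared final displacement.
    The only input is the seed; the only output is one number.
    """
    rng = random.Random(seed)
    x = 0
    for _ in range(n_steps):
        x += rng.choice((-1, 1))
    return x * x

def ensemble_estimate(seeds, n_steps=1000):
    """Average an observable over independent replicas.

    No replica reads another replica's state, so each call to run_replica
    could be a separate job on a separate machine, with results gathered
    from output files afterwards.
    """
    results = [run_replica(seed, n_steps) for seed in seeds]
    return statistics.mean(results), results
```

For instance, `ensemble_estimate(range(100))` averages 100 replicas that never communicate. Whether such an ensemble answers the same question as one large coupled run depends on the problem, for example on whether the events of interest fit within the timescale of a single short run.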

apawloski · on July 30, 2015

I'm somewhat confused -- I thought we were arguing similar points?

I don't want to drag this out, but where do you see the language constraint? You need an MPI binding, sure, but what else?

dekhn · on July 30, 2015

No, the people who run the clusters won't let you run any language just because it has an MPI binding. They invest a lot in ensuring peak performance, and right now, only C++ and FORTRAN can achieve that. Very few, if any, major supercomputer centers support Java codes.

apawloski · on July 30, 2015

Oh, you're talking about a policy limitation, not a technological one. (And if you're talking about the DOE or NSF/Teragrid/XSEDE clusters, then you're probably right. Haven't touched those in years -- and even when I did, I wasn't doing anything crazy.)

dekhn · on July 30, 2015

To be frank, if I were running a computer that was designed for peak performance, I probably wouldn't use Java. There are some very significant performance issues with garbage collection that prevent you from making peak use of the machine.

Supercomputers aren't built so that people can squander the resource (desktop PCs, closet clusters, and phones fulfill that role).

wnissen · on July 30, 2015

It is a technical limitation. Oftentimes the platform is so specialized that only a tiny handful of compilers are ported to it. Say, just gcc, g++, and gfortran, and xlc, xlC, and xlf. And just one version at that. Java would require porting the JVM to the cut-down, weird Linux on the compute nodes. Some $$$ machines don't even support dynamic linking! The number of these machines is so small that extensive compiler and tool support just isn't happening unless you want to add millions to the cost.

dekhn · on July 31, 2015

There is no problem running a JVM on cut-down Linux nodes. A JVM is just a process.

Anyway, the issue with JVMs is that they don't have predictable performance, not that the compilers can't be ported.

wnissen · on July 31, 2015

The JVM probably calls fork() and system(), no? Not allowed. Dynamic thread creation? Not allowed. And 50% of your flops go away unless your program uses the BG/Q-specific "double hummer" floating point instructions. These are primitive machines in terms of development environment, and they typically require significant rewriting to get even "standard" system software working.

dekhn · on July 30, 2015

Where to begin?

First, there are many existing, valuable codes written in FORTRAN. They work, and it's not worth the investment to replace them with something else.

Second, many of the codes are in C++, not FORTRAN. Not clear that's any less of a problem.