For certain research problems, different and often obscure tool sets are required. That, to me, is a great argument for recreating everything your environment needs in a VM - unlike chemists, who can't bundle up their lab into a little transportable box, computer scientists can.
In making the argument for recomputation to aid its adoption, it's important not to put too many constraints on the researcher, e.g. "Your results should be reproducible and written in C because it should still be around ten years from now."
Even an obsolete language in a working VM sandbox can be edited and inspected to make changes, verify correctness, etc. If you want to extend the results of that research you may have a porting project to update the code to a more current language, but that's leaps and bounds ahead of how things are now: read and re-read the paper until it makes sense and implement the algorithms from scratch.
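To make the "bundle your lab into a box" idea concrete, here is a minimal sketch of freezing an experiment's full environment as a self-contained disk image with QEMU. The image and ISO names (`experiment.qcow2`, `install.iso`) are hypothetical placeholders, not anything from the article:

```shell
# Create a disk image to hold the experiment's entire OS + toolchain.
qemu-img create -f qcow2 experiment.qcow2 20G

# Install the OS and the (possibly obscure) tool set into the image once.
qemu-system-x86_64 -m 2048 -hda experiment.qcow2 -cdrom install.iso -boot d

# Years later, anyone can re-run the frozen environment as-is;
# -snapshot discards writes so the archived image stays pristine.
qemu-system-x86_64 -m 2048 -hda experiment.qcow2 -snapshot
```

The point is that the unit of archival is the whole machine state, not the source code, so the researcher's tool choices need not be constrained.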
Agree with your hypervisor/architecture point but I suppose there has to be some broad architectural choice made.
(The author of this article was my PhD supervisor. I've always found him to be an inspiring, forward-thinking researcher, so as much as I love the recomputation idea on merit, I'm also a little biased.)
>unlike chemists who can't bundle up their lab into a little transportable box, computer scientists can
This is often a good thing for scientific progress in chemistry, though. Independent reproduction in independent labs gives a lot more confidence in results than shipping the exact same lab around would. A lot of interesting results come up due to issues in replication, which you wouldn't find if you just literally reran the identical experiment on the identical equipment.
That said, being able to look at and experiment with someone's original apparatus is better than nothing.
I agree completely. From my experience, because of things like the nasty omitted details mentioned in the parent comment, techniques are rarely reimplemented unless they claim pretty substantial improvements.
I think having a starting point of reproducibility would lead to more reimplementations.
It will be slower, but I don't think unusable. Actually, comparing to the Game Boy (and other consoles) is not really a fair comparison, as those are systems where getting all the processors exactly in sync is important and difficult. Emulating a PC is much easier, as no one expects to get exactly equal speeds out of processors - every PC model is slightly different anyway.
As I say, I (and many other people) emulated x86 Windows on PPC for years. I ran Visual Studio and it dragged a little, but it was fairly usable. Certainly it won't be as fast, but then again, by the time x86 has died and people have moved on, hopefully systems will have become fast enough to make up the difference!
In the gaming world it always seems hard to emulate the most recent generation and quite easy to emulate the ones before it (e.g. Xbox consoles not emulating their immediate predecessors).
It's not my area of expertise but I expect to see something similar here. If you need to emulate hardware you might take a 10x hit, which might be unacceptable for very large scale computations. But in the longer run that might not be a problem as more resources become available.
>Agree with your hypervisor/architecture point but I suppose there has to be some broad architectural choice made.
There's a risk that the VM platform decision could date and, ultimately, render useless an experiment sandbox.
However, two points. Firstly, any attempt to wholly encapsulate an experiment will be susceptible to problems stemming from the technology choices made. By choosing the abstraction at the VM level, you're likely to get more years out of your choice than by mandating specific programming languages (which I contend would fail to gain adoption, as researchers wouldn't abandon their obscure toolkit if they felt it was the right one for their research).
Secondly, even a VM for an obsolete architecture is better than what we have today - the original paper and perhaps an email address for the lead author.
Pulling the long-term-portability lever one notch further, I suggest Bochs. It's slow because it doesn't use QEMU-style dynamic recompilation, but on the other hand, that simplicity makes the job of future archaeologists easier.
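For illustration, Bochs drives everything from a single plain-text `bochsrc` file, which is itself part of the archaeological appeal: the whole machine definition is human-readable. A minimal sketch, assuming a flat disk image named `experiment.img` (a hypothetical filename):

```
# bochsrc - minimal Bochs machine definition for an archived experiment
megs: 256                                              # guest RAM in MB
ata0-master: type=disk, path="experiment.img", mode=flat
boot: disk                                             # boot from the archived image
log: bochsout.txt                                      # emulator log for debugging
```

Because Bochs interprets every instruction rather than translating it, the emulator's own source stays comparatively simple - a property that matters more than speed once the goal is decades-later inspection.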