Manuel Blum: Advice to a beginning graduate student

jdp23 · on Dec 24, 2010

A classic from 2001.

I was a grad student at Berkeley in the 80s in Theoretical Computer Science before heading off in a different direction, and got as far as taking my oral prelims from Manuel Blum and Richard Karp. No pressure! Rather embarassingly I forgot the Blum speedup theorem until I was prompted for it. They were both very nice about it though.

wittgenstein · on Dec 25, 2010

"For example, Finite Automata can add but not multiply."

Can someone please explain why this is true? If it can add, how can it not multiply if multiplication is repeated addition?

cscheid · on Dec 25, 2010

Roughly because the number of times it needs to sum to get a multiply is not constant; it depends on the input. Think about writing the automaton -- how would you know when to stop? You're only given one automaton to write, and you don't know ahead of time how big either of the input numbers are.

djcapelis · on Dec 25, 2010

You can make the FSM repeat the addition, but you can't give it the memory it needs to know when to stop.

If you need more explanation just try implementing it, you should see where the problem is fairly quickly. :)

sesqu · on Dec 26, 2010

So how do you get it to add? Specifically, what representation of arbitrary numbers are we using with our finite automaton? Infinite read-only tape plus infinite write-only tape for output, in base 10? If so, it should be simple to do multiplication by representing the numbers in factorized form and applying our adder.

forensic · on Dec 24, 2010

most of it only applies to a few classical hard sciences - math, physics, chemistry, engineering

a lot of it doesnt even apply to CS in my opinion

p_nathan · on Dec 24, 2010

really? It seems very applicable to CS research.

djcapelis · on Dec 24, 2010

I suppose that depends on what you think of when you say CS research. CS research is very different depending on where you do you work, even the way authors are ordered on papers changes depending on how close to mathematics you work. Parts of this advice seem a better match for the theory side of the field than the systems side, for instance. In the systems side we often find it helpful to tell people to stop trying to overthink things, shut up, and code. (The systems will show you how they work when you're working with them, if you don't understand something: run an experiment.) In the theory side (where the author did his work) it is much more common to tell the students to think more instead.

_delirium · on Dec 24, 2010

I agree with that, and also think that it varies in the other direction, as far as emphasis on writing/coding/etc.

In programming languages and parts of AI, it's common for the thesis to mainly be an "analysis" of something, or a long-form "proposal" of a new approach, rather than a "write-up" of a "result". I get the sense that in Blum's corner of CS, and probably also systems, it's the other way around: you get some results, and then you write them up (though what counts as getting results, and how to go about it, differs quite a bit, as you point out). Whereas in the areas I work in, in some sense the thesis is sometimes the result. Sure, there are a bunch of technical results in it too, but they're a supporting cast of characters, included because they bolster the main analysis/argument/proposal.

I suppose the model I'm thinking of is somewhat more humanities-style thesis organization, which makes some sense, since AI has always been a bit coy about whether it's a subfield of CS, of cog sci, of philosophy, of engineering, or of something else. Interestingly, though, it's an even more common thesis approach in engineering. People think of engineering as less humanities-ish, but it's much more open to the idea that deep analysis of a problem/domain/approach is itself a research result.

p_nathan · on Dec 24, 2010

There's sort of the "Computer Science" vs. "Software Engineering" difference going on there.

djcapelis · on Dec 25, 2010

No, this is an overblown but common thing for people to say. Both sides do science, just some of us do science on abstract systems where in mind analysis is appropriate and others do science in less abstract systems where it is appropriate to experiment earlier.

Engineering on the other hand, is defined by an exemplary focus on good reliable design a system. Engineers often run tests as part of their work, but don't typically run experiments. (The difference, by the way, is that one can fail a test, whereas with a good experiment you succeed in showing something no matter your results.)

Please refrain from buying into the notion that science must be done only in worlds of abstraction and science done in less abstract systems must be labelled engineering. The use of the scientific method is what distinguishes science from engineering. My research requires me to do both at times.

Daniel_Newby · on Dec 25, 2010

Engineers often do pure experiments. It was EEs that came up with ROC curves, and good data sheets for electronic devices commonly show histograms and statistics for operating parameters. Hard drive optimization is largely about measuring the data density versus signal-to-noise ratio curve, whatever shape it happens to be, then picking a suitable error correcting code. Chemical process optimization is balls-out experimentation, especially when compounds insist on crystallizing in inconvenient forms. Civil engineers measure degradation to know when bridges need attention.

I think the difference is that engineers must often be satisfied with boring things that must be useful, while scientists aspire to interesting things that may be useless.

djcapelis · on Dec 25, 2010

Since we have nowhere to go but more pedantic, I argue that engineers are not infrequently called upon to do science, just as many scientists are often called upon to do engineering.

Whether you call yourself an engineer or a scientist I would agree, has a lot to do with whether your goal at the end of the day is to produce something that works or something that fascinates. As someone who has the luxury to have a job doing the latter, I don't stop seeing myself as a scientist even when I happen to be doing some engineering work on one of my experiments.

HilbertSpace · on Dec 25, 2010

I see Blum's advice as very misleading and a disservice to all concerned.

Here is my view (worked for me):

Note that the main goal is the Ph.D. A second goal is, if you wish, a start on capabilities in research.

Okay, for both, especially the first, there is a condition that is essentially both necessary and sufficient: PUBLISH.

Right: F'get about the 'dissertation' for now. Instead, just PUBLISH.

Why? At many of the best research universities, the official criterion for a Ph.D. dissertation is "an original contribution to knowledge worthy of publication". So, publish (did I mention that before?) and remove all doubt.

Note: Typically just publishing is easier than slogging through all the nonsense of dissertation proposal, research on a problem from a dissertation advisor, getting the dissertation approved by a committee, etc.

Now, what to publish?

Pick a problem and solve it. The usual criteria for publication are "new, correct, and significant".

Pick your own problem. Try hard to avoid a problem given by a prof.

How to solve it? First remember that in essentially all fields, the most admired research presents a mathematical solution. So, get one.

How to do that? Study appropriate mathematics. For the 'computing', mostly f'get about CS because for computing problems, compared with the math, it's next to junk.

How to pick a problem? To help meet the 'significant' part, pick a problem with some obvious practical importance. Typically find this problem off campus. E.g., maybe do something in server farm monitoring, performance, cost, response to load uncertainty, etc.

Broad solution direction: Find 'optimal' means for attacking the ubiquitous uncertainty.

How to say that your solution is significant? Get a solution that is provably 'optimal' in some sense. In particular, don't try to plow ground in some theoretical direction starting where all the profs broke their plows. Especially, f'get about P versus NP.

To get a math solution, start by studying, guess what, right, math.

Summary: Pick a practical problem that is clearly of importance. Use math to get 'new, correct, significant' solution. Publish the solution.

If one such paper is enough, fine. Else publish about three papers, likely on the same problem, that is, as one 'research stream', stack up the papers, put on a title page, put a staple in the UL corner, and submit it. Your profs have to admit that your work has been judged 'new, correct, and significant' and definitely is 'worthy of publication'.

For a better start on research, do the same thing but just do it a little deeper, especially in the math.

What math? Linear algebra -- through Halmos. Modern algebra -- pick something, maybe old Birkhoff and Mclane, Herstein, or Lang or something newer. Advanced calculus -- through Rudin. Measure theory and functional analysis -- the real half of Rudin's R&CA and Royden. Optimization -- linear programming, linear programming on networks (often get integer programming for free -- looks NP-complete but it's not), nonlinear programming (especially the Kuhn-Tucker conditions). Probability based on measure theory -- Breiman, Chung, Neveu, Loeve, Chow and Teicher. The above is a bit minimal; more could be from important up to really valuable.

Generally you are trying to find tools in math not yet well exploited for problems in computing. So, if find something relatively new in math, then almost certainly it has not yet been well exploited.

Under no circumstances take seriously anything in math done by any CS prof; in particular, totally flush 'machine learning, artificial intelligence, and data mining' -- good problems but total puke for work. Stay with math from names such as I listed: You'll be flying in luxury at Mach 3 at 40,000 feet while the CS profs are still slogging in mud and covered in puke.

Study some mathematical statistics -- hold your nose at how badly they do the math.

Good way to get tight bounds: Use the martingale inequality, fairly easy to apply and one of the strongest in math. See examples of Talagrand and Rhee. Yes, they got a lot of nice papers getting some tight bounds on, right, even NP-complete problems.

Generally the key is the math, and in the math, especially if go to seminars by the best people, can find 'themes' and 'directions' that can result in research 'paradigms' and lots of papers.

Then get on with making money!

JonnieCache · on Dec 24, 2010

He may have won the turing prize but he's apparently forgotten what linebreaks are for in written english.

sliverstorm · on Dec 25, 2010

Didn't he just write an entire article discussing how a PhD must know everything about a domain so small as to be nearly nothing? Practically illustrates what he was talking about- just because he won the turing prize doesn't mean he has to know anything about easy reading & page layout.