Not true. Maximum likelihood picks the hypothesis/theory that best "explains" the data, which is exactly why it runs into issues with overfitting, etc. A Bayesian starts with a prior over a set of hypotheses (the P(T) term), and then uses the data to update their confidence in the various hypotheses. Assuming you have a sane prior, you end up with a simple theory with a record of making decent predictions (i.e. the kind of thing you look for in science, cf. Occam's razor, etc).
I don't see any conflict with the scientific method (at least in a fundamental sense). Scientists aren't oracles - hypotheses need to come from somewhere. Some call it intuition - I would call it an implicit application of Bayesian reasoning, where the data is experience, and the prior is governed by genetic constraints of the brain. From this you obtain a set of intuitive hypotheses to be (further) tested (i.e. those with large posterior P(T|O)).
Testing a set of hypotheses then just involves collecting observations that differentiate them (i.e. where they make conflicting predictions). This can still be considered an application of Bayes' rule, but usually one tries to collect enough data that it's quite obvious which is consistently making the best predictions (in other words has posterior close to 1), in which case it becomes a theory.
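To make the updating concrete, here's a minimal sketch of Bayes' rule applied sequentially to two hypothetical hypotheses about a coin's bias (the hypotheses, prior, and observation sequence are all made up for illustration). Each observation that differentiates the hypotheses shifts the posterior P(T|O) toward whichever one is consistently making the better predictions:

```python
# Two hypothetical hypotheses about a coin: T1 says fair, T2 says biased.
hypotheses = {"T1": 0.5, "T2": 0.8}   # each maps to its predicted P(heads)
posterior = {"T1": 0.5, "T2": 0.5}    # start from the prior P(T)

# Observations that differentiate them (1 = heads, 0 = tails).
observations = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]

for o in observations:
    # Likelihood P(o|T) of this single observation under each hypothesis.
    likelihood = {t: (p if o == 1 else 1 - p) for t, p in hypotheses.items()}
    # Bayes' rule: P(T|O) is proportional to P(O|T) * P(T); then normalize.
    unnorm = {t: likelihood[t] * posterior[t] for t in hypotheses}
    z = sum(unnorm.values())
    posterior = {t: v / z for t, v in unnorm.items()}

print(posterior)  # T2's posterior climbs as the data keeps favoring it
```

With enough such data the winning hypothesis's posterior approaches 1, which is the point at which (in the sense above) it graduates to a theory.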
I never disagreed with the equation (of course it's correct). My point is that the prior always comes first, even in UII. You're not simply picking the hypothesis that best explains the data (assuming by "best explains" you mean has the greatest likelihood P(O|T)), otherwise you just end up with the hypothesis containing a lookup table of all previous data. You need to take into account your confidence in the hypothesis before the data arrived (e.g. based on the complexity/size of the programs expressing that hypothesis, for UII).
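The lookup-table point can be sketched numerically. In the toy example below (the two hypotheses, their description lengths, and likelihoods are all hypothetical numbers chosen for illustration), a Solomonoff-style prior of 2^-length lets a simple-but-imperfect rule beat a lookup table that explains the data perfectly:

```python
# Hypothetical hypotheses: description length in bits, and likelihood P(O|T)
# over 20 observations. The lookup table memorizes every observation, so its
# likelihood is 1, but its description grows with the data.
hypotheses = {
    "simple_rule":  {"length": 10,  "likelihood": 0.9 ** 20},  # fits imperfectly
    "lookup_table": {"length": 200, "likelihood": 1.0},        # fits by memorizing
}

def posterior_weight(h):
    # Unnormalized P(T|O) proportional to P(O|T) * 2^-length(T),
    # i.e. likelihood weighted by a complexity prior.
    return h["likelihood"] * 2.0 ** -h["length"]

weights = {name: posterior_weight(h) for name, h in hypotheses.items()}
best = max(weights, key=weights.get)
print(best)  # maximum likelihood alone would pick the lookup table
```

Dropping the 2^-length factor (i.e. pure maximum likelihood) flips the winner to the lookup table, which is exactly the overfitting failure being described.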
Ah. Your use of "not true" made it look like an outright dismissal of his whole statement. As for the order of when to pick the prior, I think what is more important is that the data not influence your choice of prior. If you were some oracular machine, you could see the data, generate hypotheses, and still assign them priors independently of the data, without falling into the problem you state.
And then there is the problem of how you form sensible hypotheses without at least knowing the shape of the data first. The form of these hypotheses is itself a restriction on the possible space. I think that is what the GGP was getting at.