On Fri, 2 Jun 2006, Stephen Montgomery-Smith wrote:
Here are two good books on inference:
Edwards, A. W. F. (1972). Likelihood. London: Cambridge University Press.
Royall, R. (1997). Statistical Evidence: A Likelihood Paradigm. London:
Chapman & Hall.
Here is a review by geneticist/statisticians:
http://taxa.epi.umn.edu/~mbmiller/journals/ajhg/199807_Vieland_Hodge_Likelihood.pdf
(the first 7 pages of that PDF)
I read Edwards very thoroughly and thought he had great ideas. I
haven't read Royall, but I think his ideas are much like those of
Edwards.
I will try to read those books sometime. (I am also occupied with other
projects, so it won't be too soon.)
But the opening lines of the PDF review tell me that the "best"
statistical method is clearly up in the air. You happen to like
Edwards' approach, but someone else might prefer something different.
Sort of. It really depends on the situation. I'm not always sure what
drives arguments about "Bayesian" and "frequentist" perspectives on
inference, but I think a lot of it is due to the fact that it is a
difficult topic and a statistician can get by professionally without ever
really coming to grips with the core philosophical problems.
By the way, I see the "likelihood method" as an unweighted Bayesian
method, i.e., assume that both options are equally likely in the prior
distribution.
Sort of, but not quite. What Edwards points out is that the likelihood
contains all the information about the parameters that can be found in the
data. In a Bayesian analysis, one uses the likelihood along with a
"prior" which is a sort of weighting scheme based on, well, based on
whatever the hell you want it to be based on -- and that's the problem
with Bayesian analysis, but that doesn't mean it isn't a good thing.
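
To make the relationship between the likelihood and the prior concrete, here is a minimal sketch in Python. The coin-flip data and the two hypotheses (p = 0.7 versus p = 0.5) are made up for illustration and do not come from Edwards or Royall. It shows that the likelihood ratio is computed from the data alone, and that a Bayesian analysis multiplies it by prior odds; with equal prior odds the two numbers coincide, but nothing forces the prior to be equal.

```python
from math import comb


def binomial_likelihood(p, successes, trials):
    """Probability of `successes` in `trials` if the true success rate is p."""
    return comb(trials, successes) * p**successes * (1 - p) ** (trials - successes)


def posterior_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio


# Hypothetical data: 7 successes in 10 trials.
# Hypothetical hypotheses: H1 says p = 0.7, H2 says p = 0.5.
lr = binomial_likelihood(0.7, 7, 10) / binomial_likelihood(0.5, 7, 10)

print(f"likelihood ratio H1:H2      = {lr:.3f}")
print(f"posterior odds, prior 1:1   = {posterior_odds(1.0, lr):.3f}")
print(f"posterior odds, prior 1:4   = {posterior_odds(0.25, lr):.3f}")
```

The likelihood ratio stays the same whatever prior is chosen; only the posterior odds move.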
I do admit that I am not an expert in statistics, and my guess is that
you and Jon know way more than I do. On the other hand I do think I
know a great deal about probability. I recently saw an account of how
mitochondrial DNA could be used as evidence that all the different types
of ape (including the human being) must have a non-trivial tree of
ancestry. Not being that familiar with statistics, I thought about why
the test he chose (a chi-squared test) was appropriate, and I could see
that it had many underlying assumptions, not all of which were
reasonable - (in his case that evolutionary pressures might not cause a
change in the DNA in one place to speed up changes in DNA in other
positions). He computed an absurdly small p value, which meant that he
could reject his null hypothesis. But it made me question the value of
these tests in being able to produce absurdly small p values, because
the assumptions he made, which were reasonable, nevertheless could be
violated with a probability which, while small, was way bigger than the
absurdly small p value he obtained. Thus he might still have a small p
value, but perhaps more like 1% than .000000000000001% which was the
kind of value he got.
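
The arithmetic behind that worry can be sketched roughly as follows. This is only an illustration under stated assumptions, not a standard statistical procedure: it treats "the model assumptions fail" as an event with a made-up probability (1% here) and supposes that, when they fail, the reported p value tells us nothing (worst case 1.0). The function name and both of those numbers are hypothetical.

```python
def robust_p_bound(nominal_p, prob_assumptions_fail, p_if_fail=1.0):
    """Rough upper bound on the 'real' p value when the model may be wrong.

    With probability (1 - prob_assumptions_fail) the nominal p value applies;
    otherwise we fall back on p_if_fail (worst case: 1.0).
    """
    return (1 - prob_assumptions_fail) * nominal_p + prob_assumptions_fail * p_if_fail


# .000000000000001% written as a probability is 1e-17.
nominal_p = 1e-17
# Hypothetical: a 1% chance that some key assumption is violated.
bound = robust_p_bound(nominal_p, prob_assumptions_fail=0.01)

print(f"nominal p value: {nominal_p:.1e}")
print(f"rough bound    : {bound:.4f}")
```

Under these assumptions the bound is dominated by the chance that the model is wrong, not by the tiny nominal p value.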
It seems like you are understanding how the game is played. It's all
about the assumptions. If you have a good model and reasonable
assumptions, a statistical test can be extremely persuasive.
An awful lot of work in genetics has been done to show that violations of
certain assumptions are not going to ruin a statistical test. Most of
statistics is about approximation and extracting meaning and direction
from data -- random data that includes all sorts of errors. We don't know
that we are doing things the best way because we don't know the answers
yet. There is a lot of guess work. But through all of this guessing and
testing and approximating we find ways to advance knowledge and make
progress. We've made huge strides in science and part of the reason is
that statistical analysis provides an excellent way of telling when we
have the wrong idea. By discovering which ideas are bad, we move forward.
And 1%, while small, is not a kind of certainty that you want to have
when you are trying to say "evolution is right and creationism is
wrong."
But, of course, he wasn't testing that. I'm sure he was assuming that
"evolution is right." My guess is that he didn't even mention
creationism, probably because he wouldn't have any reason at this point,
with all of his knowledge, even to consider the possibility that he should
invoke a supernatural cause for anything he was studying.
Now this "randomization" process you described is an attempt to push the
experiment into the highly controlled scenario in which we have been so
successful in computing probabilities of getting good poker hands. But
I question our ability to perform this randomization with any certainty
when we are dealing with data like exit polls or mitochondrial DNA.
Think about your term "any certainty" and what that means. That is a
really key issue.
And so if you can only guarantee your exit polls with a probability of
about 1%, then for one election in 100 to be wrong is not surprising.
That is imprecise terminology. If the "margin of error" is 1% (at the
usual 95% confidence level), that means that about 95% of true
proportions should fall within 1% of the predicted
value. Thus, we would expect about 5 "wrong" (by more than 1%) out of
100, but none should be wrong by very much. With the Ohio data, the
concern is that some of the predictions are wrong by a lot and in some
unusual ways.
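
Here is a small sketch of the numbers behind that statement. It assumes simple random sampling and the usual normal approximation at the 95% level (z = 1.96); real exit polls use more complicated sampling designs, so their actual margins of error will differ. The function names are just for illustration.

```python
from math import sqrt

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def margin_of_error(p, n, z=Z_95):
    """Half-width of the approximate 95% interval for a proportion p, sample size n."""
    return z * sqrt(p * (1 - p) / n)


def sample_size_for_margin(margin, p=0.5, z=Z_95):
    """Sample size needed for a given margin of error (p = 0.5 is the worst case)."""
    return (z / margin) ** 2 * p * (1 - p)


# Respondents needed for a 1% margin of error in the worst case (p = 0.5).
n_needed = sample_size_for_margin(0.01)

# Out of 100 independent polls, how many should miss by more than the margin?
expected_misses = 100 * (1 - 0.95)

print(f"sample size for a 1% margin : {n_needed:.0f}")
print(f"margin of error at that size: {margin_of_error(0.5, n_needed):.4f}")
print(f"expected misses out of 100  : {expected_misses:.0f}")
```

Under these assumptions a 1% margin needs roughly 9,600 respondents, and about 5 polls in 100 would miss by more than 1%, though rarely by much more.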
(And the assumption that the elections are independent - now that surely
is unreasonable.)
That depends on what things are assumed to be independent.
Mike