MLUG: Re: [MLUG - DISCUSSION] [POLITICS] Was the 2004 Election Stolen?
Mike Miller wrote:

But more than that, I have spent some time thinking in general about statistics. Some of the large data problems that people are dealing with (e.g. the microarrays that Jon told me about recently) are revealing to me just how ad hoc statistical methods are. As best I can see, the only statistical method with any pretension to solid foundations is the Bayesian method, but my impression is that for large data problems it tends to give way worse results than the other, more ad hoc methods.

I don't see it that way. Here are two good books on inference:

Edwards, A. W. F. (1972). Likelihood. London: Cambridge University Press.
Royall, R. (1997). Statistical Evidence: A Likelihood Paradigm. London: Chapman & Hall.


Here is a review by geneticist/statisticians:

http://taxa.epi.umn.edu/~mbmiller/journals/ajhg/199807_Vieland_Hodge_Likelihood.pdf

(the first 7 pages of that PDF)

I read Edwards very thoroughly and thought he had great ideas. I haven't read Royall, but I think his ideas are much like those of Edwards.

I will try to read those books sometime. (I am also occupied with other projects, so it won't be too soon.)


But the opening lines of the PDF review tell me that the "best" statistical method is clearly up in the air. You happen to like Edwards' approach, but someone else might prefer something different.

By the way, I see the "likelihood method" as an unweighted Bayesian method, i.e. one that assumes both options are equally likely in the prior distribution.
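
To make that reading concrete, here is a minimal Python sketch. It assumes two simple hypotheses about a success rate and a binomial data model; the function names and the example numbers are illustrative and are not taken from Edwards or Royall. With equal prior weights the posterior odds reduce to the plain likelihood ratio.

```python
from math import comb


def binom_likelihood(p, k, n):
    """Probability of k successes in n trials when the success rate is p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def likelihood_ratio(p1, p2, k, n):
    """Likelihood of the data under rate p1 relative to rate p2."""
    return binom_likelihood(p1, k, n) / binom_likelihood(p2, k, n)


def posterior_odds(p1, p2, k, n, prior1=0.5, prior2=0.5):
    """Posterior odds of H1 (rate p1) against H2 (rate p2) by Bayes' rule."""
    return (prior1 * binom_likelihood(p1, k, n)) / (
        prior2 * binom_likelihood(p2, k, n)
    )


if __name__ == "__main__":
    # Hypothetical data: 55 successes in 100 trials, comparing p=0.5 with p=0.6.
    print(likelihood_ratio(0.5, 0.6, 55, 100))
    print(posterior_odds(0.5, 0.6, 55, 100))            # equal priors
    print(posterior_odds(0.5, 0.6, 55, 100, 0.9, 0.1))  # unequal priors
```

The first two printed values agree; the third differs by the prior odds (a factor of 9 here), which is the sense in which the likelihood method behaves like a Bayesian analysis with a flat prior over the two options.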

I do admit that I am not an expert in statistics, and my guess is that you and Jon know way more than I do. On the other hand, I do think I know a great deal about probability. I recently saw an account of how mitochondrial DNA could be used as evidence that all the different types of ape (including human beings) must have a non-trivial tree of ancestry. Not being that familiar with statistics, I thought about why the test the author chose (a chi-squared test) was appropriate, and I could see that it had many underlying assumptions, not all of which were reasonable (in his case, the assumption that evolutionary pressures do not cause a change in the DNA at one position to speed up changes at other positions). He computed an absurdly small p-value, which meant that he could reject his null hypothesis. But it made me question the value of these tests in being able to produce absurdly small p-values, because the assumptions he made, though reasonable, could nevertheless be violated with a probability which, while small, was way bigger than the absurdly small p-value he obtained. Thus he might still have a small p-value, but perhaps more like 1% than .000000000000001%, which was the kind of value he got. And 1%, while small, is not the kind of certainty that you want to have when you are trying to say "evolution is right and creationism is wrong."
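
The arithmetic behind that objection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `eps` is a hypothetical probability that the test's modelling assumptions fail, and the worst case is taken in which a failed assumption always leads to a rejection. By the law of total probability, the chance of wrongly rejecting is then at most `p_value * (1 - eps) + eps`.

```python
def worst_case_error(p_value, eps):
    """Upper bound on the chance of a false rejection.

    p_value: nominal p-value, valid only if the assumptions hold.
    eps: assumed probability that the assumptions are violated
         (hypothetical; worst case is that a violation always rejects).
    """
    return p_value * (1 - eps) + eps


if __name__ == "__main__":
    # Illustrative numbers: a tiny nominal p-value, 1% doubt about the model.
    print(worst_case_error(1e-17, 0.01))
```

However small the nominal p-value is, the bound cannot drop below `eps`, so the 1% doubt about the model dominates the result.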

Now this "randomization" process you described is an attempt to push the experiment into the highly controlled scenario in which we have been so successful in computing probabilities of getting good poker hands. But I question our ability to perform this randomization with any certainty when we are dealing with data like exit polls or mitochondrial DNA.

And so if you can only guarantee your exit polls up to an error probability of about 1%, then for one election in 100 to be wrong is not surprising.

(And the assumption that the elections are independent - now that surely is unreasonable.)
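
For scale, here is a small Python sketch of the "one election in 100" figure. It assumes each poll is wrong independently with the same probability, which is exactly the assumption questioned above, so the numbers are a reference point only. The function name and inputs are illustrative.

```python
def prob_at_least_one_wrong(q, n):
    """Chance that at least one of n polls is wrong.

    q: assumed per-poll error probability.
    n: number of polls, assumed independent (a strong assumption).
    """
    return 1 - (1 - q) ** n


if __name__ == "__main__":
    q, n = 0.01, 100
    print(prob_at_least_one_wrong(q, n))  # chance of one or more wrong polls
    print(q * n)                          # expected number of wrong polls
```

With a 1% error rate and 100 independent polls, the chance of at least one wrong result is about 0.63 and the expected number of wrong results is 1. If the errors are correlated, these figures no longer apply.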

Stephen

_______________________________________________
discussion mailing list
http://mlug.missouri.edu/mailman/listinfo/discussion