MLUG: Re: [MLUG - DISCUSSION] these are the days of miracles and wonders: a radically new sequencing machine appears

On 8/1/05, Stephen Montgomery-Smith <EMAIL:PROTECTED> wrote:
> Mike Miller wrote:
>
> > In both gene chip and microarray studies, many SNPs are analyzed, so the
> > dimensionality is very high.  Imagine that you have 100,000 observations
> > on all of 100 cases (with some disease) and 100 controls (without the
> > disease and matched with cases on ethnicity, etc.).  If you use a
> > p-value cutoff of .05, and there are no real effects in the data, you
> > will have 5,000 false positives.  So if you have 6,000 positives, you
> > have about 17% true and 83% false.  And so it goes...
> 
> But surely they have some analysis that tells you that 5000 positives is
> not enough, and they can tell you just how many positives they do need
> before it is significant?

Yes and no.  Yes, you can work out the distribution of the number of
false positives you would expect from your experiment if there were no
real effects, and you can compare that with the number of positives
you actually get.  But from this information alone you can't know
which "positive" results are real and which are false positives.  In
other words, if I expect 5000 false positives under the null
hypothesis and get 6000 positives, it's very likely that there are
real effects in there.  But if I pick any one positive result at
random, it is more likely to be a false positive than a true one.  One
response to this is to use a much more stringent statistical
criterion, but that could easily lead you to disregard real
differences that happen to be smaller in magnitude.
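
To put rough numbers on this, here is a minimal Python sketch using
Mike's figures.  It assumes the 100,000 tests are independent, which
real SNP data generally are not, so treat the output as a back-of-the-
envelope estimate.  The function names are made up for illustration,
and the Bonferroni cutoff is just one common example of a "much more
stringent" criterion, not something anyone in this thread proposed.

```python
import math


def false_positive_summary(n_tests, alpha, n_positives):
    """Compare an observed count of positives with what the null predicts.

    Assumes independent tests, so the number of false positives under
    the null is Binomial(n_tests, alpha).
    """
    expected_fp = n_tests * alpha
    sd_fp = math.sqrt(n_tests * alpha * (1 - alpha))
    # How many standard deviations the observed count sits above the
    # null expectation.
    z = (n_positives - expected_fp) / sd_fp
    # One-sided tail probability from the normal approximation to the
    # binomial.
    tail = 0.5 * math.erfc(z / math.sqrt(2))
    # Rough share of the positives that are false, taking the expected
    # false-positive count at face value.
    false_fraction = min(1.0, expected_fp / n_positives)
    return {
        "expected_false_positives": expected_fp,
        "sd_false_positives": sd_fp,
        "z_score": z,
        "tail_probability": tail,
        "estimated_false_fraction": false_fraction,
    }


def bonferroni_cutoff(alpha, n_tests):
    """Per-test cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


summary = false_positive_summary(n_tests=100_000, alpha=0.05, n_positives=6_000)
for name, value in summary.items():
    print(f"{name}: {value:.4g}")
print(f"bonferroni_cutoff: {bonferroni_cutoff(0.05, 100_000):.1e}")
```

Under these assumptions the 6000 positives sit many standard
deviations above the 5000 expected by chance, so the excess as a whole
is hard to explain away, yet roughly five out of six individual
positives would still be false.  The Bonferroni cutoff shows the other
side of the trade: a per-test threshold that small will throw out most
modest real effects in a study of 100 cases and 100 controls.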
 
> Actually, I have done a lot of research in the pure math side of this
> kind of high dimensional probability, and indeed some of the other
> faculty at MU are renowned world experts in this area.  If you have any
> problems in this area, I would certainly enjoy trying to tackle them.

OK, then.  There are literally tons of microarray data out there on
the web that should be interesting to look at; any result you get with
those would likely be important. :-)

jking
