Re: [R] Multiple testing corrections on very large vector

2011-03-09 Thread Jorge Ivan Velez
Hi terdon, Very happy to help and know it worked. Honestly, I do not know exactly where the differences are, but it is not hard to check the sources and compare both algorithms. When doing this, you can see that p.adjust() uses vectorization whereas mt.rawp2adjp() does not. Perhaps that is th

Re: [R] Multiple testing corrections on very large vector

2011-03-08 Thread terdon
Hi Jorge, first of all THANKS! I just ran your suggestion and got blown away: system.time(res <- p.adjust(pv, method = 'fdr')) user system elapsed 55.052 3.100 62.560 I had tried the same thing using mt.rawp2adjp as per my original post, sent it to a cluster here on Friday afternoo

Re: [R] Multiple testing corrections on very large vector

2011-03-08 Thread Jorge Ivan Velez
Hi terdon, You are absolutely right. I apologize for any inconvenience my lack of coffee might have caused :-) I simulated some p-values with the length of your vector and ran the p.adjust() function on them. Here is what I got: system.time(res <- p.adjust(pv, method = 'fdr')) user system el
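
For readers who want to try the timing described above, here is a minimal sketch. The message does not say how the p-values were simulated, so runif() is an assumption, and the vector is deliberately shorter than the ~4e7 values in the thread so that it runs quickly on a laptop.

```r
# Sketch only: simulated p-values stand in for the real data.
set.seed(42)                         # arbitrary seed, for reproducibility only
n <- 1e6                             # assumption: smaller than the ~4e7 in the thread
pv <- runif(n)                       # assumption: uniform p-values, as under a global null

# Time the adjustment of the whole vector in a single call
timing <- system.time(res <- p.adjust(pv, method = "fdr"))
print(timing)                        # user, system, and elapsed seconds

# In p.adjust(), "fdr" is an alias for "BH", so both calls give the same result
same_as_bh <- identical(res, p.adjust(pv, method = "BH"))
print(same_as_bh)
```

Timings depend on the machine and on the vector length, so the numbers printed here will not match those quoted in the thread.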

Re: [R] Multiple testing corrections on very large vector

2011-03-08 Thread terdon
Hi Jorge, and thanks for your answer, it looks promising. However, I have a question. First of all, I am a lowly biologist so please excuse any horrible errors of understanding I may make. So, the BH correction depends on, among other things, sorting the vector of P-values from the smallest to th
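
The concern about sorting can be illustrated with a small sketch. The helper name bh_adjust is hypothetical (it is not part of base R or multtest), and the simulated p-values are an assumption. The sketch shows that the BH adjustment uses each p-value's rank within the whole vector, so adjusting subsets separately generally gives different results.

```r
# Minimal sketch of the Benjamini-Hochberg (BH) adjustment.
# bh_adjust is a hypothetical helper name used for illustration only.
bh_adjust <- function(p) {
  n <- length(p)
  o <- order(p, decreasing = TRUE)   # indices from largest to smallest p-value
  ro <- order(o)                     # permutation that restores the input order
  ranks <- n:1                       # the largest p-value has rank n, the smallest rank 1
  # Scale each sorted p-value by n / rank, enforce monotonicity, cap at 1
  pmin(1, cummin(n / ranks * p[o]))[ro]
}

set.seed(1)                          # arbitrary seed, for reproducibility only
pv <- runif(1000)                    # assumption: simulated uniform p-values

# Adjusting the whole vector agrees with p.adjust()
whole <- bh_adjust(pv)
matches_p_adjust <- isTRUE(all.equal(whole, p.adjust(pv, method = "BH")))

# Adjusting two halves separately ranks each p-value within its own half,
# so the concatenated result generally differs from the whole-vector one.
halves <- c(bh_adjust(pv[1:500]), bh_adjust(pv[501:1000]))
halves_match_whole <- isTRUE(all.equal(whole, halves))

print(c(matches_p_adjust = matches_p_adjust,
        halves_match_whole = halves_match_whole))
```

Because the ranks come from the full sorted vector, the adjustment needs all p-values at once, which is why a single vectorized call on the complete vector is the safer route when memory allows.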

Re: [R] Multiple testing corrections on very large vector

2011-03-08 Thread Jorge Ivan Velez
Hi terdon, Take a look at ?p.adjust and its argument n. For example, you could adjust the first B values of pv by using p.adjust(pv[1:B], method = 'BH', n = B). Then, you can continue processing other subsets of pv and concatenate the result. Here, a for() loop might be useful. HTH, Jorge On Tue, Mar 8,

[R] Multiple testing corrections on very large vector

2011-03-08 Thread terdon
Hello all, I am calculating probabilities of association for terms in the GeneOntology database. I have ~4e7 probabilities and I would like to correct for multiple testing. At the moment I have something like the following: pv is a vector containing my 4e7 probabilities. To run the multipl