HI, Henrique, Thanks for the great help!
I compared the output from your codes: > te<-rev(100 * cumsum(matt$reads > 1) / length(matt$reads) ) > te [1] 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 [19] 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 [37] 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 [55] 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 [73] 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 [91] 10 9 8 7 6 5 4 3 2 1 the output from my code, > result [1] 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 [19] 82 81 80 79 79 77 77 77 74 73 72 71 70 70 68 67 67 65 [37] 64 64 62 62 60 59 58 57 56 56 54 53 52 51 51 49 48 47 [55] 46 45 45 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 [73] 28 27 27 27 24 24 22 21 20 19 19 19 19 15 14 14 12 11 [91] 10 9 8 7 7 5 4 3 2 1 There is no tie in your output. Look at the data set: There are ties in the data set. Your codes work fast, but I think the results is not accurate. Thanks so much for the great help! > matt[c(1:35), ] id reads 1 Contig79:1 4 2 Contig79:2 8 ; ; 22 Contig79:22 64 23 Contig79:23 64 24 Contig79:24 68 25 Contig79:25 68 26 Contig79:26 68 I also attached the testing file with this email. Thanks! On Thu, Nov 4, 2010 at 9:12 AM, Henrique Dallazuanna <www...@gmail.com>wrote: > Try this: > > rev(100 * cumsum(matt$reads > 1) / length(matt$reads) ) > > On Thu, Nov 4, 2010 at 1:46 PM, Changbin Du <changb...@gmail.com> wrote: > >> HI, Dear R community, >> >> I have one data set like this, What I want to do is to calculate the >> cumulative coverage. The following codes works for small data set (#rows = >> 100), but when feed the whole data set, it still running after 24 hours. >> Can someone give some suggestions for long vector? >> >> id reads >> Contig79:1 4 >> Contig79:2 8 >> Contig79:3 13 >> Contig79:4 14 >> Contig79:5 17 >> Contig79:6 20 >> Contig79:7 25 >> Contig79:8 27 >> Contig79:9 32 >> Contig79:10 33 >> Contig79:11 34 >> >> >> matt<-read.table("/house/groupdirs/genetic_analysis/mjblow/ILLUMINA_ONLY_MICROBIAL_GENOME_ASSEMBLY/4083340/STANDARD_LIBRARY/GWZW.994.5.1129.trim_69.fastq.19621832.sub.sorted.bam.clone.depth", >> sep="\t", skip=0, header=F,fill=T) # >> dim(matt) >> [1] 3384766 2 >> >> matt_plot<-function(matt, outputfile) { >> names(matt)<-c("id","reads") >> >> cover<-matt$reads >> >> >> #calculate the cumulative coverage. >> + cover_per<-function (data) { >> + output<-numeric(0) >> + for (i in data) { >> + x<-(100*sum(ifelse(data >= i, 1, 0))/length(data)) >> + output<-c(output, x) >> + } >> + return(output) >> + } >> >> >> result<-cover_per(cover) >> >> >> Thanks so much! >> >> >> -- >> Sincerely, >> Changbin >> -- >> >> [[alternative HTML version deleted]] >> >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > > > -- > Henrique Dallazuanna > Curitiba-Paraná-Brasil > 25° 25' 40" S 49° 16' 22" O > -- Sincerely, Changbin -- Changbin Du DOE Joint Genome Institute Bldg 400 Rm 457 2800 Mitchell Dr Walnut Creet, CA 94598 Phone: 925-927-2856
Contig79:1 4 Contig79:2 8 Contig79:3 13 Contig79:4 14 Contig79:5 17 Contig79:6 20 Contig79:7 25 Contig79:8 27 Contig79:9 32 Contig79:10 33 Contig79:11 34 Contig79:12 36 Contig79:13 39 Contig79:14 40 Contig79:15 44 Contig79:16 49 Contig79:17 55 Contig79:18 56 Contig79:19 59 Contig79:20 60 Contig79:21 62 Contig79:22 64 Contig79:23 64 Contig79:24 68 Contig79:25 68 Contig79:26 68 Contig79:27 70 Contig79:28 73 Contig79:29 76 Contig79:30 77 Contig79:31 78 Contig79:32 78 Contig79:33 79 Contig79:34 80 Contig79:35 80 Contig79:36 84 Contig79:37 87 Contig79:38 87 Contig79:39 88 Contig79:40 88 Contig79:41 89 Contig79:42 93 Contig79:43 94 Contig79:44 98 Contig79:45 99 Contig79:46 99 Contig79:47 102 Contig79:48 103 Contig79:49 108 Contig79:50 112 Contig79:51 112 Contig79:52 113 Contig79:53 116 Contig79:54 118 Contig79:55 120 Contig79:56 124 Contig79:57 124 Contig79:58 126 Contig79:59 128 Contig79:60 130 Contig79:61 133 Contig79:62 134 Contig79:63 136 Contig79:64 139 Contig79:65 144 Contig79:66 145 Contig79:67 146 Contig79:68 148 Contig79:69 149 Contig79:70 151 Contig79:71 156 Contig79:72 157 Contig79:73 158 Contig79:74 159 Contig79:75 159 Contig79:76 159 Contig79:77 160 Contig79:78 160 Contig79:79 161 Contig79:80 163 Contig79:81 164 Contig79:82 165 Contig79:83 165 Contig79:84 165 Contig79:85 165 Contig79:86 166 Contig79:87 170 Contig79:88 170 Contig79:89 172 Contig79:90 174 Contig79:91 178 Contig79:92 180 Contig79:93 181 Contig79:94 184 Contig79:95 184 Contig79:96 187 Contig79:97 190 Contig79:98 192 Contig79:99 194 Contig79:100 199
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.