Hi,

Try this:

directory<- "/home/arunksa111/dados"
#modified the function
GetFileList <- function(directory,number){
 setwd(directory)
 filelist1<-dir()
 lista<-dir(directory,pattern = paste("MSMS_",number,"PepInfo.txt",sep=""), full.names = TRUE, recursive = TRUE)
 output<- list(filelist1,lista)
 return(output)
}
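A side note on the function above, offered as a sketch rather than a replacement: setwd() changes the working directory for the rest of the session, and the unescaped "." in the pattern matches any character. The variant below avoids both. The name get_file_list, the use of list.dirs() to collect folder names (so only subdirectories are returned, not every entry), and the toy directory under tempdir() are assumptions made for illustration, not part of the original code.

```r
# Hypothetical variant of GetFileList: no setwd(), escaped and anchored pattern.
get_file_list <- function(directory, number) {
  # Names of the immediate subfolders only (e.g. "a1", "c1")
  folders <- list.dirs(directory, full.names = FALSE, recursive = FALSE)
  # "\\." matches a literal dot; "$" anchors the match at the end of the name
  pattern <- paste0("MSMS_", number, "PepInfo\\.txt$")
  files <- list.files(directory, pattern = pattern,
                      full.names = TRUE, recursive = TRUE)
  list(folders = folders, files = files)
}

# Toy directory tree (made up) so the sketch runs without the real data
root <- file.path(tempdir(), "dados_demo")
for (d in c("a1", "c1")) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
  writeLines("Seq\tz\tFDR", file.path(root, d, "MSMS_23PepInfo.txt"))
}

res <- get_file_list(root, 23)
print(res$folders)
print(res$files)
```

Returning a named list also means the caller can write res$folders and res$files instead of [[1]] and [[2]], and the directory only has to be scanned once.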
file.list.names<-GetFileList(directory,23)[[1]]
lista<-GetFileList(directory,23)[[2]]
FacGroup<-c(0,1,0,2,2,0,3)

ReadDir<-function(FacGroup){
 list.new<-lista[FacGroup!=0]
 read.list<-lapply(list.new, function(x) read.table(x,header=TRUE, sep = "\t"))
 names(read.list)<-file.list.names[FacGroup!=0]
 return (read.list)
}
ListFacGroup<-ReadDir(FacGroup)

z.boxplot<- function(lst){
 new.list<- lapply(lst,function(x) x[x$FDR<0.01,])
 pdf("VeraBP.pdf")
 lapply(names(new.list),function(x) lapply(new.list[x],function(y) boxplot(FDR~z,data=y,xlab="Charge",ylab="FDR",main=x)))
 dev.off()
}
z.boxplot(ListFacGroup)

A.K.

________________________________
From: Vera Costa <veracosta...@gmail.com>
To: arun <smartpink...@yahoo.com>
Sent: Friday, March 15, 2013 2:08 PM
Subject: Re: new question

Sorry, could you give me a little more help?
Using the same data, I need a boxplot by groups. Here are the functions I'm using. The last one (z.boxplot) is what I need help with; the others are OK.
Thank you one more time.

GetFileList <- function(directory,number){
 setwd(directory)
 filelist1<-dir()[file.info(dir())$isdir]
 direct<-dir(directory,pattern = paste("MSMS_",number,"PepInfo.txt",sep=""), full.names = FALSE, recursive = TRUE)
 direct<-lapply(direct,function(x) paste(directory,"/",x,sep=""))
 lista<-unlist(direct)
 output<- list(filelist1,lista)
 return(output)
}

ReadDir<-function(FacGroup){
 list.new<-lista[FacGroup!=0]
 read.list<-lapply(list.new, function(x) read.table(x,header=TRUE, sep = "\t"))
 names(read.list)<-file.list.names[FacGroup!=0]
 return (read.list)
}

directory<-"C:/Users/Vera Costa/Desktop/dados.lixo"
file.list.names<-GetFileList(directory,23) [[1]]
lista<-GetFileList(directory,23) [[2]]
FacGroup<-c(0,1,0,2,2,0,3)
ListFacGroup<-ReadDir(FacGroup)
#zPValues(ListFacGroup,FacGroup)

z.boxplot <- function(lista) {
 #I need eliminate all data with FDR<0.01
 new.list<-lista[FDR<0.01]
 #boxplots split by groups
 boxplot(FDR ~ z, data = dct1, xlab = "Charge", ylab = "FDR",main=(paste("t",i)))
}
z.boxplot(ListFacGroup)

2013/3/13 Vera Costa <veracosta...@gmail.com>

No problem!
>Sorry for my questions.
>
>2013/3/13 arun <smartpink...@yahoo.com>
>
>As I mentioned earlier, I don't find it useful to do anova on that kind of
>data. Previously, I tried with chisq.test also. It gave warnings() and then
>you responded that it is not correct. I would suggest that you dput an example
>dataset of the specific columns that you want to compare (possibly by row)
>and post it to the R-help list. If you get any reply, then you can implement it
>on your whole list of files. Sorry, today I am busy.
>>
>>________________________________
>>From: Vera Costa <veracosta...@gmail.com>
>>To: arun <smartpink...@yahoo.com>
>>Sent: Wednesday, March 13, 2013 9:43 AM
>>Subject: Re: new question
>>
>>Ok. Thank you.
>>Could you help me to apply this?
>>
>>2013/3/13 arun <smartpink...@yahoo.com>
>>
>>You are comparing one datapoint to another. It doesn't make sense. For
>>anova, you need replications to calculate df. Maybe you could try
>>chisq.test.
>>>
>>>________________________________
>>>From: Vera Costa <veracosta...@gmail.com>
>>>To: arun <smartpink...@yahoo.com>
>>>Sent: Wednesday, March 13, 2013 8:56 AM
>>>Subject: Re: new question
>>>
>>>I agree with you.
>>>
>>>I wrote these tests because I need to compare with some test. I agree it is not
>>>very correct, but what is bioconductor? I need to eliminate some data (rows)
>>>that are not very significant, based on some statistics. What about your idea? How can
>>>I do this?
>>>
>>>2013/3/13 arun <smartpink...@yahoo.com>
>>>
>>>Ok.
>>>>
>>>>"
>>>>I need a t test (it's in this function). But I need a chisq.test corrected
>>>>and a Anova with data in attach.
>>>>"
>>>>What do you mean by this?
>>>>
>>>>Though I calculated the t test based on comparing a single value against
>>>>another for each row, I don't think it makes sense statistically. Here,
>>>>you are estimating the mean by just one value, which then is the mean value,
>>>>and comparing it with another value. It doesn't make much sense. I think
>>>>in bioconductor there are some packages which do this kind of comparison (I
>>>>don't remember the names). Also, I am not sure what kind of inference you
>>>>want from the chisquare test. Also, from the anova test (?using just 2 datapoints)
>>>>(if the comparison is rowwise).
>>>>
>>>>________________________________
>>>>From: Vera Costa <veracosta...@gmail.com>
>>>>To: arun <smartpink...@yahoo.com>
>>>>Sent: Tuesday, March 12, 2013 6:04 PM
>>>>Subject: Re: new question
>>>>
>>>>Ok. It isn't the last code...
>>>>You sent me this code
>>>>
>>>>directory<- "/home/arunksa111/data.new"
>>>>#first function
>>>>filelist<-function(directory,number,list1){
>>>>setwd(directory)
>>>>filelist1<-dir(directory)
>>>>
>>>>direct<-dir(directory,pattern = paste("MSMS_",number,"PepInfo.txt",sep=""),
>>>>full.names = FALSE, recursive = TRUE)
>>>>
>>>>list1<-lapply(direct, function(x) read.table(x,header=TRUE, sep =
>>>>"\t",stringsAsFactors=FALSE))
>>>>names(list1)<-filelist1
>>>>list2<- list(filelist1,list1)
>>>>return(list2)
>>>>}
>>>>foldernames1<-filelist(directory,23,list1)[[1]]
>>>>foldernames1
>>>>#[1] "a1" "c1" "c2" "c3" "t1" "t2"
>>>>lista<-filelist(directory,23,list1)[[2]] #lista output
>>>>
>>>>FacGroup<- c("c1","c3","t2")
>>>>
>>>>#Second function
>>>>f<-function(listRes,Toselect){
>>>>res2<-split(listRes,gsub("[0-9]","",names(listRes)))
>>>>res3<-lapply(seq_along(res2),function(i) lapply(res2[[i]],function(x)
>>>>x[x[["FDR"]]<0.01,c("Seq","Mod","z","spec")]))
>>>>res4<-lapply(res3,function(x) x[names(x)[names(x)%in%Toselect]])
>>>>res4New<- lapply(res4,function(x) lapply(names(x), function(i)
>>>>do.call(rbind,lapply(x[i],function(x) cbind(folder_name=i,x))) ))
>>>>library(plyr)
>>>>library(data.table)
>>>>res5<-lapply(res4New,function(x) lapply(x,function(x1){ x1<-
>>>>data.table(x1);x1[,spec:=paste(spec,collapse=","),by=c("Seq","Mod","z")]}))
>>>>res6<- lapply(res5,function(x) lapply(x,function(x1)
>>>>{x1$counts<-sapply(x1$spec, function(x2) length(gsub("\\s", "",
>>>>unlist(strsplit(x2, ",")))));x3<-as.data.frame(x1);names(x3)[6]<-
>>>>as.character(unique(x3$folder_name));x3[,-c(1,5)]}))
>>>>
>>>>res7<-lapply(res6,function(x) Reduce(function(...)
>>>>merge(...,by=c("Seq","Mod","z"),all=TRUE),x))
>>>> res8<-res7[lapply(res7,length)!=0]
>>>> res9<- Reduce(function(...) merge(...,by=c("Seq","Mod","z"),all=TRUE),res8)
>>>>res9[is.na(res9)] <- 0
>>>>return(res9)
>>>>}
>>>>
>>>>f(lista,FacGroup)
>>>> head(f(lista,FacGroup))
>>>>#                     Seq        Mod z c1 c3 t2
>>>>#1 aAAAAAAAAAAAAAATATAGPR 1-n_acPro/ 2  0  0  1
>>>>#2  aAAAAAAAAAAASSPVGVGQR 1-n_acPro/ 2  0  0  1
>>>>#3       aAAAAAAAAAGAAGGR 1-n_acPro/ 2  0  0  1
>>>>#4  aAAAAAAAGAAGGRGSGPGRR 1-n_acPro/ 2  1  0  0
>>>>#5            AAAAAAALQAK            2  0  1  1
>>>>#6         aAAAAAGAGPEMVR 1-n_acPro/ 2  0  0  2
>>>>
>>>>resCounts<- f(lista,FacGroup)
>>>>t.test.p.value <- function(...) {
>>>> obj<-try(t.test(...), silent=TRUE)
>>>> if (is(obj, "try-error")) return(NA) else return(obj$p.value)
>>>> }
>>>>
>>>>#3rd function for p-value
>>>>fpv<- function(Countdata){
>>>>resNew<-do.call(cbind,lapply(split(names(Countdata)[4:ncol(Countdata)],gsub("[0-9]","",names(Countdata)[4:ncol(Countdata)])),
>>>> function(i) {x<-if(ncol(Countdata[i])>1) rowSums(Countdata[i]) else
>>>>Countdata[i]; colnames(x)<-NULL;x}))
>>>>indx<-combn(names(resNew),2)
>>>>resPval<-do.call(cbind,lapply(seq_len(ncol(indx)),function(i)
>>>>{x<-as.data.frame(apply(resNew[,indx[,i]],1,t.test.p.value));
>>>>colnames(x)<-paste("Pvalue",paste(indx[,i],collapse=""),sep="_");x}))
>>>>resF<-cbind(resCounts,resPval)
>>>>resF
>>>>}
>>>>
>>>>fpv(resCounts)
>>>>
>>>>I need a t test (it's in this function). But I need a chisq.test corrected
>>>>and a Anova with data in attach.
>>>>Sorry!
>>>>
>>>>On 12 Mar 2013 at 20:08, "arun" <smartpink...@yahoo.com> wrote:
>>>>
>>>>Where is the reference "t-test above"?
>>>>>
>>>>>Which dataset do you want to do this on?
>>>>>
>>>>>________________________________
>>>>>From: Vera Costa <veracosta...@gmail.com>
>>>>>To: arun <smartpink...@yahoo.com>
>>>>>Sent: Tuesday, March 12, 2013 1:50 PM
>>>>>Subject: Re: new question
>>>>>
>>>>>Hi.
>>>>>
>>>>>Could I ask for a little help?
>>>>>
>>>>>Could you help me to do a chisq.test (corrected), and an Anova, like the
>>>>>t-test above? After that I need to remove all data with p values < 0.05.
>>>>>
>>>>>Sorry and thank you again
>>>>>
>>>>>2013/3/7 arun <smartpink...@yahoo.com>
>>>>>
>>>>>Hi,
>>>>>>
>>>>>>directory<- "/home/arunksa111/dados" #renamed directory to dados
>>>>>>
>>>>>>filelist<-function(directory,number,list1){
>>>>>>setwd(directory)
>>>>>>filelist1<-dir(directory)
>>>>>>direct<-dir(directory,pattern =
>>>>>>paste("MSMS_",number,"PepInfo.txt",sep=""), full.names = FALSE, recursive
>>>>>>= TRUE)
>>>>>>list1<-lapply(direct, function(x) read.table(x,header=TRUE, sep =
>>>>>>"\t",stringsAsFactors=FALSE))
>>>>>>names(list1)<-filelist1
>>>>>>list2<- list(filelist1,list1)
>>>>>>return(list2)
>>>>>>}
>>>>>>foldernames1<-filelist(directory,23,list1)[[1]]
>>>>>>foldernames1
>>>>>>#[1] "a1" "a2" "c1" "c2" "c3" "t1" "t2"
>>>>>>
>>>>>>lista<-filelist(directory,23,list1)[[2]] #lista output
>>>>>>
>>>>>>#If you look at the
>>>>>> lapply(lista,function(x) sapply(x,class)) #some spec were integer, and
>>>>>>some were character
>>>>>>#do this
>>>>>> listaNew<-lapply(lista,function(x) within(x,{spec<- as.character(spec)}))
>>>>>>
>>>>>>FacGroup<- c("c1","c3","t2")
>>>>>>#Second function
>>>>>>#f<- function(....)
>>>>>>
>>>>>>head(f(listaNew,FacGroup))
>>>>>>
>>>>>>#                     Seq        Mod z c1 c3 t2
>>>>>>#1 aAAAAAAAAAAAAAATATAGPR 1-n_acPro/ 2  0  0  1
>>>>>>#2  aAAAAAAAAAAASSPVGVGQR 1-n_acPro/ 2  0  0  1
>>>>>>#3       aAAAAAAAAAGAAGGR 1-n_acPro/ 2  0  0  1
>>>>>>#4  aAAAAAAAGAAGGRGSGPGRR 1-n_acPro/ 2  1  0  0
>>>>>>#5            AAAAAAALQAK            2  0  1  1
>>>>>>#6         aAAAAAGAGPEMVR 1-n_acPro/ 2  0  0  2
>>>>>>
>>>>>>A.K.
>>>>>>
>>>>>>________________________________
>>>>>>From: Vera Costa <veracosta...@gmail.com>
>>>>>>To: arun <smartpink...@yahoo.com>
>>>>>>Sent: Thursday, March 7, 2013 7:12 AM
>>>>>>Subject: Re: new question
>>>>>>
>>>>>>Hi.
>>>>>>
>>>>>>Sorry to ask again about this, but when I run this code I get this
>>>>>>error:
>>>>>>
>>>>>>Error in `[.data.table`(x1, , `:=`(spec, paste(spec, collapse = ",")), :
>>>>>> Type of RHS ('character') must match LHS ('integer'). To check and
>>>>>>coerce would impact performance too much for the fastest cases. Either
>>>>>>change the type of the target column, or coerce the RHS of := yourself
>>>>>>(e.g. by using 1L instead of 1)
>>>>>>
>>>>>>Could you help me with this? How can I eliminate it?
>>>>>>
>>>>>>Thank you
>>>>>>
>>>>>>2013/2/28 arun <smartpink...@yahoo.com>
>>>>>>
>>>>>>Hi,
>>>>>>>directory<- "/home/arunksa111/data.new"
>>>>>>>#first function
>>>>>>>filelist<-function(directory,number,list1){
>>>>>>>setwd(directory)
>>>>>>>filelist1<-dir(directory)
>>>>>>>
>>>>>>>direct<-dir(directory,pattern =
>>>>>>>paste("MSMS_",number,"PepInfo.txt",sep=""), full.names = FALSE,
>>>>>>>recursive = TRUE)
>>>>>>>list1<-lapply(direct, function(x) read.table(x,header=TRUE, sep =
>>>>>>>"\t",stringsAsFactors=FALSE))
>>>>>>>names(list1)<-filelist1
>>>>>>>list2<- list(filelist1,list1)
>>>>>>>return(list2)
>>>>>>>}
>>>>>>>foldernames1<-filelist(directory,23,list1)[[1]]
>>>>>>>foldernames1
>>>>>>>#[1] "a1" "c1" "c2" "c3" "t1" "t2"
>>>>>>>lista<-filelist(directory,23,list1)[[2]] #lista output
>>>>>>>
>>>>>>>FacGroup<- c("c1","c3","t2")
>>>>>>>
>>>>>>>#Second function
>>>>>>>f<-function(listRes,Toselect){
>>>>>>>res2<-split(listRes,gsub("[0-9]","",names(listRes)))
>>>>>>>res3<-lapply(seq_along(res2),function(i) lapply(res2[[i]],function(x)
>>>>>>>x[x[["FDR"]]<0.01,c("Seq","Mod","z","spec")]))
>>>>>>>res4<-lapply(res3,function(x) x[names(x)[names(x)%in%Toselect]])
>>>>>>>res4New<- lapply(res4,function(x) lapply(names(x), function(i)
>>>>>>>do.call(rbind,lapply(x[i],function(x) cbind(folder_name=i,x))) ))
>>>>>>>library(plyr)
>>>>>>>library(data.table)
>>>>>>>res5<-lapply(res4New,function(x) lapply(x,function(x1){ x1<-
>>>>>>>data.table(x1);x1[,spec:=paste(spec,collapse=","),by=c("Seq","Mod","z")]}))
>>>>>>>res6<- lapply(res5,function(x) lapply(x,function(x1)
>>>>>>>{x1$counts<-sapply(x1$spec, function(x2) length(gsub("\\s", "",
>>>>>>>unlist(strsplit(x2, ",")))));x3<-as.data.frame(x1);names(x3)[6]<-
>>>>>>>as.character(unique(x3$folder_name));x3[,-c(1,5)]}))
>>>>>>>
>>>>>>>res7<-lapply(res6,function(x) Reduce(function(...)
>>>>>>>merge(...,by=c("Seq","Mod","z"),all=TRUE),x))
>>>>>>> res8<-res7[lapply(res7,length)!=0]
>>>>>>> res9<- Reduce(function(...)
>>>>>>>merge(...,by=c("Seq","Mod","z"),all=TRUE),res8)
>>>>>>>res9[is.na(res9)] <- 0
>>>>>>>return(res9)
>>>>>>>}
>>>>>>>
>>>>>>>f(lista,FacGroup)
>>>>>>> head(f(lista,FacGroup))
>>>>>>>#                     Seq        Mod z c1 c3 t2
>>>>>>>#1 aAAAAAAAAAAAAAATATAGPR 1-n_acPro/ 2  0  0  1
>>>>>>>#2  aAAAAAAAAAAASSPVGVGQR 1-n_acPro/ 2  0  0  1
>>>>>>>#3       aAAAAAAAAAGAAGGR 1-n_acPro/ 2  0  0  1
>>>>>>>#4  aAAAAAAAGAAGGRGSGPGRR 1-n_acPro/ 2  1  0  0
>>>>>>>#5            AAAAAAALQAK            2  0  1  1
>>>>>>>#6         aAAAAAGAGPEMVR 1-n_acPro/ 2  0  0  2
>>>>>>>
>>>>>>>resCounts<- f(lista,FacGroup)
>>>>>>>t.test.p.value <- function(...) {
>>>>>>> obj<-try(t.test(...), silent=TRUE)
>>>>>>> if (is(obj, "try-error")) return(NA) else return(obj$p.value)
>>>>>>> }
>>>>>>>
>>>>>>>#3rd function for p-value
>>>>>>>fpv<- function(Countdata){
>>>>>>>resNew<-do.call(cbind,lapply(split(names(Countdata)[4:ncol(Countdata)],gsub("[0-9]","",names(Countdata)[4:ncol(Countdata)])),
>>>>>>> function(i) {x<-if(ncol(Countdata[i])>1) rowSums(Countdata[i]) else
>>>>>>>Countdata[i]; colnames(x)<-NULL;x}))
>>>>>>>indx<-combn(names(resNew),2)
>>>>>>>resPval<-do.call(cbind,lapply(seq_len(ncol(indx)),function(i)
>>>>>>>{x<-as.data.frame(apply(resNew[,indx[,i]],1,t.test.p.value));
>>>>>>>colnames(x)<-paste("Pvalue",paste(indx[,i],collapse=""),sep="_");x}))
>>>>>>>resF<-cbind(resCounts,resPval)
>>>>>>>resF
>>>>>>>}
>>>>>>>
>>>>>>>fpv(resCounts)
>>>>>>>
>>>>>>>A.K.
>>>>>>>
>>>>>>>________________________________
>>>>>>>From: Vera Costa <veracosta...@gmail.com>
>>>>>>>To: arun <smartpink...@yahoo.com>
>>>>>>>Sent: Thursday, February 28, 2013 11:30 AM
>>>>>>>Subject: new question
>>>>>>>
>>>>>>>Sorry about my question, but I need one more small thing... I need to split
>>>>>>>my function into one that reads the data and one that processes the data.
>>>>>>>
>>>>>>>First I need to know the "names" of the files and read the data, and
>>>>>>>afterwards a new function with my analysis.
>>>>>>>
>>>>>>>So, I did this
>>>>>>>
>>>>>>>directory<-"C:/Users/Vera Costa/Desktop/data.new"
>>>>>>>filelist<-function(directory,number){
>>>>>>>setwd(directory)
>>>>>>>filelist<-dir(directory)
>>>>>>>return(filelist)
>>>>>>>direct<-dir(directory,pattern =
>>>>>>>paste("MSMS_",number,"PepInfo.txt",sep=""), full.names = FALSE,
>>>>>>>recursive = TRUE)
>>>>>>>lista<-lapply(direct, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>names(lista)<-filelist
>>>>>>>return(lista)
>>>>>>>}
>>>>>>>filelist(directory,23)
>>>>>>>
>>>>>>>###"a1" "a2" "c1" "c2" "c3" "t1" "t2"
>>>>>>>
>>>>>>>and after
>>>>>>>
>>>>>>>f<-function(filelist,FacGroup){
>>>>>>>
>>>>>>>res2<-split(lista,names(lista))
>>>>>>> res3<- lapply(res2,function(x)
>>>>>>>{names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>>>res3
>>>>>>>#Freq FDR<0.01
>>>>>>> res4<-lapply(seq_along(res3),function(i) lapply(res3[[i]],function(x)
>>>>>>>x[x[["FDR"]]<0.01,c("Seq","Mod","z","spec")]))
>>>>>>> names(res4)<- names(res2)
>>>>>>> res4
>>>>>>> res4New<-lapply(res4,function(x) lapply(names(x),function(i)
>>>>>>>do.call(rbind,lapply(x[i],function(x) cbind(folder_name=i,x))) ))
>>>>>>> res5<- lapply(res4New,function(x) if(length(x)>1) tail(x,-1) else NULL)
>>>>>>> library(plyr)
>>>>>>> library(data.table)
>>>>>>> res6<- lapply(res5,function(x) lapply(x,function(x1)
>>>>>>>{x1<-data.table(x1); x1[,spec:=past
>>>>>>>
>>>>>>>How can I "ask" for lista in the second function? Could you help me?
VeraBP.pdf
Description: Adobe PDF document
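
For readers without the data files, the filter-then-plot step that writes a PDF like the one attached can be tried on made-up data. This is only a sketch: the data frames, the group names c1 and t2, the helper make_df, and the output file name are invented for illustration; only the columns z and FDR and the 0.01 cutoff come from the thread.

```r
# Made-up stand-in for ListFacGroup: a named list of data frames
set.seed(1)
make_df <- function(n) {
  data.frame(z = sample(2:4, n, replace = TRUE),  # charge state
             FDR = runif(n, 0, 0.02))             # false discovery rate
}
lst <- list(c1 = make_df(50), t2 = make_df(50))

# Keep only the rows with FDR < 0.01 in every data frame
kept <- lapply(lst, function(x) x[x$FDR < 0.01, ])

# One boxplot page per list element, titled with the element name
out <- file.path(tempdir(), "demo_boxplots.pdf")
pdf(out)
for (nm in names(kept)) {
  boxplot(FDR ~ z, data = kept[[nm]],
          xlab = "Charge", ylab = "FDR", main = nm)
}
invisible(dev.off())
```

A plain for loop is used here instead of nested lapply() calls because the plots are wanted only for their side effect, so nothing needs to be collected or returned.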