Re: [R] new question

arun Fri, 15 Mar 2013 20:11:33 -0700

Hi,
Try this:

directory <- "/home/arunksa111/dados"
#modified the function
GetFileList <- function(directory, number) {
  setwd(directory)
  filelist1 <- dir()
  lista <- dir(directory, pattern = paste("MSMS_", number, "PepInfo.txt", sep = ""),
               full.names = TRUE, recursive = TRUE)
  output <- list(filelist1, lista)
  return(output)
}
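
A small self-contained check of the file-listing step, as a sketch only. The temporary folder layout (a1, c1), the file contents, and the helper name list_pepinfo are assumptions made up for illustration; they are not part of the code above. The sketch passes the path straight to dir() so the working directory is left alone.

```r
# Hypothetical layout built in a temp directory purely for illustration.
root <- file.path(tempdir(), "dados_demo")
for (d in c("a1", "c1")) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
  write.table(data.frame(z = c(2, 3), FDR = c(0.001, 0.2)),
              file.path(root, d, "MSMS_23PepInfo.txt"),
              sep = "\t", row.names = FALSE, quote = FALSE)
}

# list_pepinfo() is an assumed name; it returns the folder names and the
# matching file paths without calling setwd().
list_pepinfo <- function(directory, number) {
  folders <- basename(list.dirs(directory, recursive = FALSE))
  files <- dir(directory,
               pattern = paste0("MSMS_", number, "PepInfo\\.txt$"),
               full.names = TRUE, recursive = TRUE)
  list(folders = folders, files = files)
}

res <- list_pepinfo(root, 23)
```

Escaping the dot and anchoring the pattern with $ is a design choice here, so that only files ending in exactly "PepInfo.txt" are matched.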


file.list.names <- GetFileList(directory, 23)[[1]]
lista <- GetFileList(directory, 23)[[2]]
FacGroup <- c(0, 1, 0, 2, 2, 0, 3)


ReadDir <- function(FacGroup) {
  list.new <- lista[FacGroup != 0]
  read.list <- lapply(list.new, function(x) read.table(x, header = TRUE, sep = "\t"))
  names(read.list) <- file.list.names[FacGroup != 0]
  return(read.list)
}
ListFacGroup <- ReadDir(FacGroup)
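
To make the FacGroup selection concrete, here is a sketch of what the logical mask does. The folder names are the ones printed earlier in the thread and are assumed to line up with FacGroup position by position; the split by group is an extra illustration, not something ReadDir does.

```r
# Folder names as listed earlier in the thread (assumed order).
folders  <- c("a1", "a2", "c1", "c2", "c3", "t1", "t2")
FacGroup <- c(0, 1, 0, 2, 2, 0, 3)

# The mask only works if there is one group code per folder.
stopifnot(length(folders) == length(FacGroup))

keep     <- FacGroup != 0           # 0 means "leave this folder out"
selected <- folders[keep]           # folders that will be read
groups   <- split(selected, FacGroup[keep])  # folders collected per group code
```

With these values, selected holds "a2", "c2", "c3" and "t2", and group 2 contains both "c2" and "c3".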

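z.boxplot <- function(lst) {
  # keep only the rows with FDR < 0.01
  new.list <- lapply(lst, function(x) x[x$FDR < 0.01, ])
  pdf("VeraBP.pdf")
  # one boxplot of FDR by charge (z) per folder, with the folder name as title
  invisible(lapply(names(new.list), function(x)
    boxplot(FDR ~ z, data = new.list[[x]], xlab = "Charge", ylab = "FDR", main = x)))
  dev.off()
}
z.boxplot(ListFacGroup)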
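
If you want to try the plotting step without the real files, the following is a minimal sketch on simulated data. The data frames, the helper name plot_by_charge, and the output file name are all made up for illustration; only the columns z and FDR are taken from the code above.

```r
set.seed(1)

# Simulated stand-in for one PepInfo table: charge z and an FDR value.
make_df <- function(n) {
  data.frame(z = sample(2:4, n, replace = TRUE), FDR = runif(n, 0, 0.05))
}
demo <- list(a2 = make_df(200), c2 = make_df(200))
out  <- file.path(tempdir(), "demo_boxplots.pdf")

# plot_by_charge() is an assumed name: filter each table to FDR < 0.01,
# then draw one boxplot page per list element.
plot_by_charge <- function(lst, file) {
  filtered <- lapply(lst, function(x) x[x$FDR < 0.01, , drop = FALSE])
  pdf(file)
  on.exit(dev.off())  # close the device even if a plot fails
  for (nm in names(filtered)) {
    boxplot(FDR ~ z, data = filtered[[nm]],
            xlab = "Charge", ylab = "FDR", main = nm)
  }
  invisible(filtered)
}

filtered <- plot_by_charge(demo, out)
```

Using on.exit() is a design choice so that the PDF device does not stay open after an error.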

A.K.

________________________________
From: Vera Costa <veracosta...@gmail.com>
To: arun <smartpink...@yahoo.com> 
Sent: Friday, March 15, 2013 2:08 PM
Subject: Re: new question


Sorry, could you give me a little more help?

Using the same data, I need a boxplot by groups.

I have written the functions I'm using here. The last one (z.boxplot) is what I 
need; the others are OK. Thank you once more.

GetFileList <- function(directory,number){
 setwd(directory)
 filelist1<-dir()[file.info(dir())$isdir]
 direct<-dir(directory,pattern = paste("MSMS_",number,"PepInfo.txt",sep=""),
             full.names = FALSE, recursive = TRUE)
 direct<-lapply(direct,function(x) paste(directory,"/",x,sep=""))
 lista<-unlist(direct)
 output<- list(filelist1,lista)
 return(output)
}

ReadDir<-function(FacGroup){
 list.new<-lista[FacGroup!=0]
 read.list<-lapply(list.new, function(x) read.table(x,header=TRUE, sep = "\t"))
 names(read.list)<-file.list.names[FacGroup!=0]
 return (read.list)
} 

directory<-"C:/Users/Vera Costa/Desktop/dados.lixo"
 file.list.names<-GetFileList(directory,23) [[1]]
 lista<-GetFileList(directory,23) [[2]]
FacGroup<-c(0,1,0,2,2,0,3)
ListFacGroup<-ReadDir(FacGroup)
#zPValues(ListFacGroup,FacGroup) 

z.boxplot <- function(lista) {
#I need eliminate all data with FDR<0.01
new.list<-lista[FDR<0.01]
#boxplots split by groups 
boxplot(FDR ~ z, data = dct1,  xlab = "Charge", ylab = 
"FDR",main=(paste("t",i)))
 }
z.boxplot(ListFacGroup)




2013/3/13 Vera Costa <veracosta...@gmail.com>

No problem!
>Sorry for my questions.
>
>
>
>2013/3/13 arun <smartpink...@yahoo.com>
>
>As I mentioned earlier, I don't find it useful to do an ANOVA on that kind of 
>data.  Previously, I also tried chisq.test.  It gave warnings, and then 
>you responded that it is not correct.  I would suggest that you dput() an example 
>dataset of the specific columns that you want to compare (possibly by row) 
>and post it to the R-help list.  If you get a reply, you can then apply it 
>to your whole list of files.  Sorry, I am busy today.
>>
>>
>>
>>
>>
>>
>>
>>________________________________
>>From: Vera Costa <veracosta...@gmail.com>
>>To: arun <smartpink...@yahoo.com>
>>Sent: Wednesday, March 13, 2013 9:43 AM
>>
>>Subject: Re: new question
>>
>>
>>Ok. Thank you.
>>Could you help me to apply this?
>>
>>
>>
>>2013/3/13 arun <smartpink...@yahoo.com>
>>
>>You are comparing one data point to another.  It doesn't make sense.  For 
>>an ANOVA, you need replication to calculate the degrees of freedom.  Maybe you 
>>could try chisq.test.
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>________________________________
>>>From: Vera Costa <veracosta...@gmail.com>
>>>To: arun <smartpink...@yahoo.com>
>>>Sent: Wednesday, March 13, 2013 8:56 AM
>>>
>>>Subject: Re: new question
>>>
>>>
>>>I agree with you.
>>>
>>>I wrote these tests because I need to compare against some test. I agree it is 
>>>not very correct, but what is Bioconductor? I need to eliminate some data (rows) 
>>>that are not very significant based on some statistic. What is your idea? How 
>>>can I do this?
>>>
>>>
>>>
>>>2013/3/13 arun <smartpink...@yahoo.com>
>>>
>>>Ok.
>>>>
>>>>"
>>>>
>>>>I need a t test (it's in this function). But I need a corrected chisq.test 
>>>>and an ANOVA with the attached data.
>>>>"
>>>>What do you mean by this?
>>>>
>>>>Although I calculated the t test by comparing a single value against 
>>>>another for each row, I don't think it makes sense statistically.  Here, 
>>>>you are estimating the mean from just one value, which is then itself the 
>>>>mean, and comparing it with another value.  It doesn't make much sense.  I 
>>>>think there are some Bioconductor packages that do this kind of comparison (I 
>>>>don't remember the names).  Also, I am not sure what kind of inference you 
>>>>want from a chi-square test, or from an ANOVA (using just 2 data points?) 
>>>>if the comparison is row-wise.
>>>>
>>>>
>>>>
>>>>
>>>>________________________________
>>>>From: Vera Costa <veracosta...@gmail.com>
>>>>To: arun <smartpink...@yahoo.com>
>>>>Sent: Tuesday, March 12, 2013 6:04 PM
>>>>
>>>>Subject: Re: new question
>>>>
>>>>
>>>>OK. It isn't the latest code...
>>>>You sent me this code:
>>>>
>>>>directory<- "/home/arunksa111/data.new"
>>>>#first function
>>>>filelist<-function(directory,number,list1){
>>>>setwd(directory)
>>>>filelist1<-dir(directory)
>>>>
>>>>direct<-dir(directory,pattern = paste("MSMS_",number,"PepInfo.txt",sep=""), 
>>>>full.names = FALSE, recursive = TRUE)
>>>>
>>>>list1<-lapply(direct, function(x) read.table(x,header=TRUE, sep = 
>>>>"\t",stringsAsFactors=FALSE))
>>>>names(list1)<-filelist1
>>>>list2<- list(filelist1,list1)
>>>>return(list2)
>>>>}
>>>>foldernames1<-filelist(directory,23,list1)[[1]]
>>>>foldernames1
>>>>#[1] "a1" "c1" "c2" "c3" "t1" "t2"
>>>>lista<-filelist(directory,23,list1)[[2]] #lista output
>>>>
>>>>FacGroup<- c("c1","c3","t2")
>>>>
>>>>#Second function
>>>>f<-function(listRes,Toselect){
>>>>res2<-split(listRes,gsub("[0-9]","",names(listRes)))
>>>>res3<-lapply(seq_along(res2),function(i) lapply(res2[[i]],function(x) 
>>>>x[x[["FDR"]]<0.01,c("Seq","Mod","z","spec")]))
>>>>res4<-lapply(res3,function(x) x[names(x)[names(x)%in%Toselect]])
>>>>res4New<- lapply(res4,function(x) lapply(names(x), function(i) 
>>>>do.call(rbind,lapply(x[i],function(x) cbind(folder_name=i,x))) ))
>>>>library(plyr)
>>>>library(data.table)
>>>>res5<-lapply(res4New,function(x) lapply(x,function(x1){ x1<- 
>>>>data.table(x1);x1[,spec:=paste(spec,collapse=","),by=c("Seq","Mod","z")]}))
>>>>res6<- lapply(res5,function(x) lapply(x,function(x1) 
>>>>{x1$counts<-sapply(x1$spec, function(x2) length(gsub("\\s", "", 
>>>>unlist(strsplit(x2, ",")))));x3<-as.data.frame(x1);names(x3)[6]<- 
>>>>as.character(unique(x3$folder_name));x3[,-c(1,5)]}))
>>>>
>>>>res7<-lapply(res6,function(x) Reduce(function(...) 
>>>>merge(...,by=c("Seq","Mod","z"),all=TRUE),x))
>>>> res8<-res7[lapply(res7,length)!=0]
>>>> res9<- Reduce(function(...) merge(...,by=c("Seq","Mod","z"),all=TRUE),res8)
>>>>res9[is.na(res9)] <- 0
>>>>return(res9)
>>>>}
>>>>
>>>>f(lista,FacGroup)
>>>> head(f(lista,FacGroup))
>>>> #                    Seq        Mod z c1 c3 t2
>>>>#1 aAAAAAAAAAAAAAATATAGPR 1-n_acPro/ 2  0  0  1
>>>>#2  aAAAAAAAAAAASSPVGVGQR 1-n_acPro/ 2  0  0  1
>>>>#3       aAAAAAAAAAGAAGGR 1-n_acPro/ 2  0  0  1
>>>>#4  aAAAAAAAGAAGGRGSGPGRR 1-n_acPro/ 2  1  0  0
>>>>#5            AAAAAAALQAK            2  0  1  1
>>>>#6         aAAAAAGAGPEMVR 1-n_acPro/ 2  0  0  2
>>>>
>>>>resCounts<- f(lista,FacGroup)
>>>>t.test.p.value <- function(...) {
>>>>    obj<-try(t.test(...), silent=TRUE)
>>>>    if (is(obj, "try-error")) return(NA) else return(obj$p.value)
>>>> }
>>>>
>>>>#3rd function for p-value
>>>>fpv<- function(Countdata){
>>>>resNew<-do.call(cbind,lapply(split(names(Countdata)[4:ncol(Countdata)],gsub("[0-9]","",names(Countdata)[4:ncol(Countdata)])),
>>>> function(i) {x<-if(ncol(Countdata[i])>1) rowSums(Countdata[i]) else 
>>>>Countdata[i]; colnames(x)<-NULL;x}))
>>>>indx<-combn(names(resNew),2)
>>>>resPval<-do.call(cbind,lapply(seq_len(ncol(indx)),function(i) 
>>>>{x<-as.data.frame(apply(resNew[,indx[,i]],1,t.test.p.value)); 
>>>>colnames(x)<-paste("Pvalue",paste(indx[,i],collapse=""),sep="_");x}))
>>>>resF<-cbind(Countdata,resPval) #use the function argument, not the global resCounts
>>>>resF
>>>>}
>>>>
>>>>fpv(resCounts)
>>>>
>>>>
>>>>I need a t test (it's in this function). But I need a corrected chisq.test 
>>>>and an ANOVA with the attached data.
>>>>Sorry!
>>>>
>>>>
>>>>On 12 Mar 2013 at 20:08, "arun" <smartpink...@yahoo.com> wrote:
>>>>
>>>>Where is the "t-test above" that you refer to?
>>>>>
>>>>>Which dataset do you want to do this on?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>________________________________
>>>>>From: Vera Costa <veracosta...@gmail.com>
>>>>>To: arun <smartpink...@yahoo.com>
>>>>>Sent: Tuesday, March 12, 2013 1:50 PM
>>>>>Subject: Re: new question
>>>>>
>>>>>
>>>>>Hi.
>>>>>
>>>>>Could I ask for a little help?
>>>>>
>>>>>Could you help me to do a chisq.test (corrected) and an ANOVA, like the 
>>>>>t-test above? After that I need to remove all data with p-values < 0.05.
>>>>>
>>>>>Sorry and thank you again
>>>>>
>>>>>
>>>>>
>>>>>2013/3/7 arun <smartpink...@yahoo.com>
>>>>>
>>>>>Hi,
>>>>>>
>>>>>>
>>>>>>directory<- "/home/arunksa111/dados" #renamed directory to dados
>>>>>>
>>>>>>filelist<-function(directory,number,list1){
>>>>>>setwd(directory)
>>>>>>filelist1<-dir(directory)
>>>>>>direct<-dir(directory,pattern = 
>>>>>>paste("MSMS_",number,"PepInfo.txt",sep=""), full.names = FALSE, recursive 
>>>>>>= TRUE)
>>>>>>list1<-lapply(direct, function(x) read.table(x,header=TRUE, sep = 
>>>>>>"\t",stringsAsFactors=FALSE))
>>>>>>names(list1)<-filelist1
>>>>>>list2<- list(filelist1,list1)
>>>>>>return(list2)
>>>>>>}
>>>>>>foldernames1<-filelist(directory,23,list1)[[1]]
>>>>>>foldernames1
>>>>>>#[1] "a1" "a2" "c1" "c2" "c3" "t1" "t2"
>>>>>>
>>>>>>lista<-filelist(directory,23,list1)[[2]] #lista output 
>>>>>>
>>>>>>#If you look at the column classes, some spec columns are integer and 
>>>>>>#some are character
>>>>>> lapply(lista,function(x) sapply(x,class))
>>>>>>#so convert spec to character in every data frame
>>>>>> listaNew<-lapply(lista,function(x) within(x,{spec<- as.character(spec)}))
>>>>>>
>>>>>>
>>>>>>FacGroup<- c("c1","c3","t2")
>>>>>>#Second function
>>>>>>#f<- function(....)
>>>>>>
>>>>>>head(f(listaNew,FacGroup))
>>>>>>
>>>>>>#                     Seq        Mod z c1 c3 t2
>>>>>>#1 aAAAAAAAAAAAAAATATAGPR 1-n_acPro/ 2  0  0  1
>>>>>>#2  aAAAAAAAAAAASSPVGVGQR 1-n_acPro/ 2  0  0  1
>>>>>>#3       aAAAAAAAAAGAAGGR 1-n_acPro/ 2  0  0  1
>>>>>>#4  aAAAAAAAGAAGGRGSGPGRR 1-n_acPro/ 2  1  0  0
>>>>>>#5            AAAAAAALQAK            2  0  1  1
>>>>>>#6         aAAAAAGAGPEMVR 1-n_acPro/ 2  0  0  2
>>>>>>
>>>>>>
>>>>>>
>>>>>>A.K.
>>>>>>
>>>>>>
>>>>>>
>>>>>>________________________________
>>>>>>From: Vera Costa <veracosta...@gmail.com>
>>>>>>To: arun <smartpink...@yahoo.com>
>>>>>>
>>>>>>Sent: Thursday, March 7, 2013 7:12 AM
>>>>>>Subject: Re: new question
>>>>>>
>>>>>>
>>>>>>
>>>>>>Hi.
>>>>>>
>>>>>>Sorry to ask about this again, but when I run this code I get this 
>>>>>>error:
>>>>>>
>>>>>>Error in `[.data.table`(x1, , `:=`(spec, paste(spec, collapse = ",")),  :
>>>>>>  Type of RHS ('character') must match LHS ('integer'). To check and 
>>>>>>coerce would impact performance too much for the fastest cases. Either 
>>>>>>change the type of the target column, or coerce the RHS of := yourself 
>>>>>>(e.g. by using 1L instead of 1)
>>>>>>
>>>>>>Could you help me with this? How can I eliminate it?
>>>>>>
>>>>>>Thank you
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>2013/2/28 arun <smartpink...@yahoo.com>
>>>>>>
>>>>>>Hi,
>>>>>>>directory<- "/home/arunksa111/data.new"
>>>>>>>#first function
>>>>>>>filelist<-function(directory,number,list1){
>>>>>>>setwd(directory)
>>>>>>>filelist1<-dir(directory)
>>>>>>>
>>>>>>>direct<-dir(directory,pattern = 
>>>>>>>paste("MSMS_",number,"PepInfo.txt",sep=""), full.names = FALSE, 
>>>>>>>recursive = TRUE)
>>>>>>>list1<-lapply(direct, function(x) read.table(x,header=TRUE, sep = 
>>>>>>>"\t",stringsAsFactors=FALSE))
>>>>>>>names(list1)<-filelist1
>>>>>>>list2<- list(filelist1,list1)
>>>>>>>return(list2)
>>>>>>>}
>>>>>>>foldernames1<-filelist(directory,23,list1)[[1]]
>>>>>>>foldernames1
>>>>>>>#[1] "a1" "c1" "c2" "c3" "t1" "t2"
>>>>>>>lista<-filelist(directory,23,list1)[[2]] #lista output
>>>>>>>
>>>>>>>FacGroup<- c("c1","c3","t2")
>>>>>>>
>>>>>>>#Second function
>>>>>>>f<-function(listRes,Toselect){
>>>>>>>res2<-split(listRes,gsub("[0-9]","",names(listRes)))
>>>>>>>res3<-lapply(seq_along(res2),function(i) lapply(res2[[i]],function(x) 
>>>>>>>x[x[["FDR"]]<0.01,c("Seq","Mod","z","spec")]))
>>>>>>>res4<-lapply(res3,function(x) x[names(x)[names(x)%in%Toselect]])
>>>>>>>res4New<- lapply(res4,function(x) lapply(names(x), function(i) 
>>>>>>>do.call(rbind,lapply(x[i],function(x) cbind(folder_name=i,x))) ))
>>>>>>>library(plyr)
>>>>>>>library(data.table)
>>>>>>>res5<-lapply(res4New,function(x) lapply(x,function(x1){ x1<- 
>>>>>>>data.table(x1);x1[,spec:=paste(spec,collapse=","),by=c("Seq","Mod","z")]}))
>>>>>>>res6<- lapply(res5,function(x) lapply(x,function(x1) 
>>>>>>>{x1$counts<-sapply(x1$spec, function(x2) length(gsub("\\s", "", 
>>>>>>>unlist(strsplit(x2, ",")))));x3<-as.data.frame(x1);names(x3)[6]<- 
>>>>>>>as.character(unique(x3$folder_name));x3[,-c(1,5)]}))
>>>>>>> 
>>>>>>>res7<-lapply(res6,function(x) Reduce(function(...) 
>>>>>>>merge(...,by=c("Seq","Mod","z"),all=TRUE),x))
>>>>>>> res8<-res7[lapply(res7,length)!=0]
>>>>>>> res9<- Reduce(function(...) 
>>>>>>>merge(...,by=c("Seq","Mod","z"),all=TRUE),res8)
>>>>>>>res9[is.na(res9)] <- 0
>>>>>>>return(res9)
>>>>>>>}
>>>>>>>
>>>>>>>f(lista,FacGroup)
>>>>>>> head(f(lista,FacGroup))
>>>>>>> #                    Seq        Mod z c1 c3 t2
>>>>>>>#1 aAAAAAAAAAAAAAATATAGPR 1-n_acPro/ 2  0  0  1
>>>>>>>#2  aAAAAAAAAAAASSPVGVGQR 1-n_acPro/ 2  0  0  1
>>>>>>>#3       aAAAAAAAAAGAAGGR 1-n_acPro/ 2  0  0  1
>>>>>>>#4  aAAAAAAAGAAGGRGSGPGRR 1-n_acPro/ 2  1  0  0
>>>>>>>#5            AAAAAAALQAK            2  0  1  1
>>>>>>>#6         aAAAAAGAGPEMVR 1-n_acPro/ 2  0  0  2
>>>>>>>
>>>>>>>resCounts<- f(lista,FacGroup)
>>>>>>>t.test.p.value <- function(...) {
>>>>>>>    obj<-try(t.test(...), silent=TRUE)
>>>>>>>    if (is(obj, "try-error")) return(NA) else return(obj$p.value)
>>>>>>> }
>>>>>>>
>>>>>>>#3rd function for p-value
>>>>>>>fpv<- function(Countdata){
>>>>>>>resNew<-do.call(cbind,lapply(split(names(Countdata)[4:ncol(Countdata)],gsub("[0-9]","",names(Countdata)[4:ncol(Countdata)])),
>>>>>>> function(i) {x<-if(ncol(Countdata[i])>1) rowSums(Countdata[i]) else 
>>>>>>>Countdata[i]; colnames(x)<-NULL;x}))
>>>>>>>indx<-combn(names(resNew),2)
>>>>>>>resPval<-do.call(cbind,lapply(seq_len(ncol(indx)),function(i) 
>>>>>>>{x<-as.data.frame(apply(resNew[,indx[,i]],1,t.test.p.value)); 
>>>>>>>colnames(x)<-paste("Pvalue",paste(indx[,i],collapse=""),sep="_");x}))
>>>>>>>resF<-cbind(Countdata,resPval) #use the function argument, not the global resCounts
>>>>>>>resF
>>>>>>>}
>>>>>>>
>>>>>>>fpv(resCounts)
>>>>>>>
>>>>>>>
>>>>>>>A.K.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>________________________________
>>>>>>>From: Vera Costa <veracosta...@gmail.com>
>>>>>>>To: arun <smartpink...@yahoo.com>
>>>>>>>Sent: Thursday, February 28, 2013 11:30 AM
>>>>>>>Subject: new question
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>Sorry about my question, but I need one more small thing... I need to split 
>>>>>>>my function into one that reads the data and one that processes it.
>>>>>>>
>>>>>>>First I need to get the "names" of the files and read the data, and 
>>>>>>>then run my analysis in a new function.
>>>>>>>
>>>>>>>So, I did this:
>>>>>>>
>>>>>>>directory<-"C:/Users/Vera Costa/Desktop/data.new" 
>>>>>>>filelist<-function(directory,number){
>>>>>>>setwd(directory)
>>>>>>>filelist<-dir(directory)
>>>>>>>return(filelist)
>>>>>>>direct<-dir(directory,pattern = 
>>>>>>>paste("MSMS_",number,"PepInfo.txt",sep=""), full.names = FALSE, 
>>>>>>>recursive = TRUE)
>>>>>>>lista<-lapply(direct, function(x) read.table(x,header=TRUE, sep = "\t"))
>>>>>>>names(lista)<-filelist
>>>>>>>return(lista)
>>>>>>>}
>>>>>>>filelist(directory,23)
>>>>>>>
>>>>>>>
>>>>>>>###"a1" "a2" "c1" "c2" "c3" "t1" "t2"
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>and then
>>>>>>>
>>>>>>>f<-function(filelist,FacGroup){
>>>>>>>
>>>>>>>res2<-split(lista,names(lista))
>>>>>>> res3<- lapply(res2,function(x) 
>>>>>>>{names(x)<-paste(gsub(".*_","",names(x)),1:length(x),sep="");x})
>>>>>>>res3
>>>>>>>#Freq FDR<0.01
>>>>>>> res4<-lapply(seq_along(res3),function(i) lapply(res3[[i]],function(x) 
>>>>>>>x[x[["FDR"]]<0.01,c("Seq","Mod","z","spec")]))
>>>>>>> names(res4)<- names(res2)
>>>>>>> res4
>>>>>>>  res4New<-lapply(res4,function(x) lapply(names(x),function(i) 
>>>>>>>do.call(rbind,lapply(x[i],function(x) cbind(folder_name=i,x))) ))
>>>>>>> res5<- lapply(res4New,function(x) if(length(x)>1) tail(x,-1) else NULL)
>>>>>>> library(plyr)
>>>>>>> library(data.table)
>>>>>>> res6<- lapply(res5,function(x) lapply(x,function(x1) 
>>>>>>>{x1<-data.table(x1); x1[,spec:=past
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>How can I access "lista" in the second function? Could you help me?

VeraBP.pdf
Description: Adobe PDF document

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
