Thank you, David. Using xtabs simplifies the code considerably. Many thanks ;)
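For anyone finding this thread later, here is a minimal sketch of the xtabs route that ends in a data frame, which is what my original question asked for. It reuses the N, M and C objects from the thread. The names X and DF are only illustrative, and using as.data.frame.matrix for the final conversion is my own assumption, not something suggested in the replies.

```r
# Coordinates and intensities as posted in the thread
N <- data.frame(N = c("n1", "n2", "n3", "n4"))
M <- data.frame(M = c("m1", "m2", "m3", "m4", "m5"))
C <- data.frame(n = c("n1", "n2", "n3"),
                m = c("m1", "m1", "m3"),
                I = c(100, 300, 400))

# Give both factors the full set of levels so empty rows/columns are kept
C$m <- factor(C$m, levels = as.character(M$M))
C$n <- factor(C$n, levels = as.character(N$N))

# Cross-tabulate: rows are m, columns are n, cells hold the summed I
# (pairs that never occur in C come back as 0)
X <- xtabs(I ~ m + n, data = C)

# Convert the two-way table to a plain data frame, keeping row/column names
DF <- as.data.frame.matrix(X)
DF
```

Empty cells come back as 0 with this form. If real zeros have to be told apart from missing pairs, the sparse=TRUE variant in David's message below is probably the better fit.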
On Tue, Jun 6, 2017 at 7:44 AM, David Winsemius <dwinsem...@comcast.net> wrote:

>
> > On Jun 6, 2017, at 4:01 AM, Jim Lemon <drjimle...@gmail.com> wrote:
> >
> > Hi Bogdan,
> > Kinda messy, but:
> >
> > N <- data.frame(N=c("n1","n2","n3","n4"))
> > M <- data.frame(M=c("m1","m2","m3","m4","m5"))
> > C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"), I=c(100,300,400))
> > MN<-as.data.frame(matrix(NA,nrow=length(N[,1]),ncol=length(M[,1])))
> > names(MN)<-M[,1]
> > rownames(MN)<-N[,1]
> > C[,1]<-as.character(C[,1])
> > C[,2]<-as.character(C[,2])
> > for(row in 1:dim(C)[1]) MN[C[row,1],C[row,2]]<-C[row,3]
>
> `xtabs` offers another route:
>
> C$m <- factor(C$m, levels=M$M)
> C$n <- factor(C$n, levels=N$N)
>
> Option 1: Zeroes in the empty positions:
>
> > (X <- xtabs(I ~ m+n , C, addNA=TRUE))
>     n
> m     n1  n2  n3  n4
>   m1 100 300   0   0
>   m2   0   0   0   0
>   m3   0   0 400   0
>   m4   0   0   0   0
>   m5   0   0   0   0
>
> Option 2: Sparse matrix
>
> > (X <- xtabs(I ~ m+n , C, sparse=TRUE))
> 5 x 4 sparse Matrix of class "dgCMatrix"
>     n
> m     n1  n2  n3 n4
>   m1 100 300   .  .
>   m2   .   .   .  .
>   m3   .   . 400  .
>   m4   .   .   .  .
>   m5   .   .   .  .
>
> I wasn't sure if the sparse results of xtabs would make a distinction
> between 0 and NA, but happily it does:
>
> > C <- data.frame(n=c("n1","n2","n3", "n3", "n4"), m=c("m1","m1","m3", "m4", "m5"), I=c(100,300,400, NA, 0))
> > C
>    n  m   I
> 1 n1 m1 100
> 2 n2 m1 300
> 3 n3 m3 400
> 4 n3 m4  NA
> 5 n4 m5   0
> > (X <- xtabs(I ~ m+n , C, sparse=TRUE))
> 4 x 4 sparse Matrix of class "dgCMatrix"
>     n
> m     n1  n2  n3 n4
>   m1 100 300   .  .
>   m3   .   . 400  .
>   m4   .   .   .  .
>   m5   .   .   .  0
>
> (In the example I forgot to repeat the lines that augmented the factor
> levels, so m2 is not seen.)
>
> --
> David
>
> >
> > Jim
> >
> > On Tue, Jun 6, 2017 at 3:51 PM, Bogdan Tanasa <tan...@gmail.com> wrote:
> >> Dear Bert,
> >>
> >> thank you for your response. Here is the piece of R code, given the 3
> >> data frames below:
> >>
> >> N <- data.frame(N=c("n1","n2","n3","n4"))
> >>
> >> M <- data.frame(M=c("m1","m2","m3","m4","m5"))
> >>
> >> C <- data.frame(n=c("n1","n2","n3"), m=c("m1","m1","m3"), I=c(100,300,400))
> >>
> >> How shall I integrate N, M, and C in such a way that at the end we have
> >> a data frame with:
> >>
> >> - list N as the column names
> >> - list M as the row names
> >> - the values in the cells of N * M corresponding to the numerical
> >>   values in the data frame C.
> >>
> >> More precisely, the result shall be:
> >>
> >>      n1   n2   n3   n4
> >> m1   100  300  -    -
> >> m2   -    -    -    -
> >> m3   -    -    400  -
> >> m4   -    -    -    -
> >> m5   -    -    -    -
> >>
> >> thank you !
> >>
> >>
> >> On Mon, Jun 5, 2017 at 6:57 PM, Bert Gunter <bgunter.4...@gmail.com> wrote:
> >>
> >>> Reproducible example, please. -- In particular, what exactly does C
> >>> look like?
> >>>
> >>> (You should know this by now).
> >>>
> >>> -- Bert
> >>> Bert Gunter
> >>>
> >>> "The trouble with having an open mind is that people keep coming along
> >>> and sticking things into it."
> >>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>>
> >>>
> >>> On Mon, Jun 5, 2017 at 6:45 PM, Bogdan Tanasa <tan...@gmail.com> wrote:
> >>>> Dear all,
> >>>>
> >>>> please could you advise on the R code I could use in order to do the
> >>>> following operation :
> >>>>
> >>>> 1 -- I have 2 lists of "genome coordinates": each list is composed of
> >>>> numbers that represent genome coordinates;
> >>>>
> >>>> let's say list N :
> >>>>
> >>>> n1
> >>>> n2
> >>>> n3
> >>>> n4
> >>>>
> >>>> and a list M:
> >>>>
> >>>> m1
> >>>> m2
> >>>> m3
> >>>> m4
> >>>> m5
> >>>>
> >>>> 2 -- and a data frame C, where for some pairs of coordinates (n,m) from
> >>>> the lists above, we have a numerical intensity;
> >>>>
> >>>> for example :
> >>>>
> >>>> n1; m1; 100
> >>>> n1; m2; 300
> >>>>
> >>>> The question would be : what is the most efficient R code I could use in
> >>>> order to integrate the list N, the list M, and the data frame C, in order
> >>>> to obtain a DATA FRAME with
> >>>>
> >>>> -- list N as the column names
> >>>> -- list M as the row names
> >>>> -- the values in the cells of N * M corresponding to the numerical
> >>>>    values in the data frame C.
> >>>>
> >>>> A little example would be :
> >>>>
> >>>>      n1   n2   n3   n4
> >>>> m1   100  -    -    -
> >>>> m2   300  -    -    -
> >>>> m3   -    -    -    -
> >>>> m4   -    -    -    -
> >>>> m5   -    -    -    -
> >>>>
> >>>> I wrote a script in perl, although I would like to do this in R.
> >>>> Many thanks ;)
> >>>> -- bogdan
> >>>>
> >>>> [[alternative HTML version deleted]]
> >>>>
> >>>> ______________________________________________
> >>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >>>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA