On Sun, Oct 10, 2010 at 11:40 AM, Alison Waller <alison.wal...@embl.de> wrote:
> Hi all,
>
> I have a large table mapping thousands of COGs(groups of genes) to pathways.
> # Ex
> COG0001 patha pathb pathc
> COG0002 pathd pathe
> COG0003 pathe pathf pathg pathh
> ##
>
> I would like to combine this information into a big list such as below
> COG2PATHWAY<-list(COG0001=c("patha","pathb","pathc"),COG0002=c("pathd","pathe"),COG0003=c("pathf","pathg","pathh"))
>
> I am stuck and have tried various methods involving (probably mangled)
> versions of lappy and loops.
>
> Any suggestions on the most efficient way to do this would be great.
>

Try this:

Lines <- "COG0001 patha pathb pathc
COG0002 pathd pathe
COG0003 pathe pathf pathg pathh"

# fill = TRUE pads short rows; na.strings = "" turns the padding into NA
DF <- read.table(textConnection(Lines), header = FALSE, fill = TRUE,
   as.is = TRUE, na.strings = "")

library(reshape2)
# melt to long form (V1, variable, value) and drop the padded NA rows
m <- na.omit(melt(DF, 1))
result <- unstack(m, value ~ V1)

giving

> result
$COG0001
[1] "patha" "pathb" "pathc"

$COG0002
[1] "pathd" "pathe"

$COG0003
[1] "pathe" "pathf" "pathg" "pathh"

or, as a pathway-by-COG table built from the melted data frame m:

> acast(m, value ~ V1)
      COG0001 COG0002 COG0003
patha patha   <NA>    <NA>
pathb pathb   <NA>    <NA>
pathc pathc   <NA>    <NA>
pathd <NA>    pathd   <NA>
pathe <NA>    pathe   pathe
pathf <NA>    <NA>    pathf
pathg <NA>    <NA>    pathg
pathh <NA>    <NA>    pathh
Levels: patha pathb pathc pathd pathe pathf pathg pathh

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
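
P.S. For anyone who would rather not depend on reshape2, here is a rough base-R sketch of the same idea. It is not from the original exchange: it assumes each row is whitespace-separated with the COG id in the first field, and the names rows and fields are just illustrative. It splits each row and uses the first field as the list name.

```r
Lines <- "COG0001 patha pathb pathc
COG0002 pathd pathe
COG0003 pathe pathf pathg pathh"

# One string per row of the table (assumes newline-separated rows)
rows <- trimws(strsplit(Lines, "\n", fixed = TRUE)[[1]])

# Split each row on runs of whitespace: first field is the COG id,
# the remaining fields are its pathways
fields <- strsplit(rows, "[[:space:]]+")

COG2PATHWAY <- setNames(
  lapply(fields, function(f) f[-1]),              # pathways only
  vapply(fields, function(f) f[1], character(1))  # COG ids as names
)

str(COG2PATHWAY)
```

For a real file, rows could come from readLines() on the file path instead of the inline string. Because nothing is padded to a rectangle, there are no NA values to strip afterwards.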