To get just the list you wanted, Gabor's solution is more elegant, but here's another using the apply family. First, your data:
dat <- scan(file="/g/bork8/waller/test_COGtoPath.txt", what="character", sep="\n")

I expect dat to be a character vector in which each element is one line of tab-separated values; judging by your other code, that is what you get.

sapply(dat, function(x) {
    # split the line on tabs: first field is the COG, the rest are pathways
    tmp <- unlist(strsplit(x, '\t', fixed=TRUE))
    # keep everything except the first field (character(0) if there is none)
    out <- list(tmp[seq_along(tmp)[-1]])
    names(out) <- tmp[1]
    out
}, USE.NAMES=FALSE)

The one difference between the two is that if you have a COG with no pathways (which may not be realistic, or matter much), this solution keeps the COG name in the list with a value of character(0), whereas Gabor's omits the COG completely. Again, probably not a big deal.

Cheers,

Jeff.

On Sun, Oct 10, 2010 at 11:40 AM, Alison Waller <alison.wal...@embl.de> wrote:
> Hi all,
>
> I have a large table mapping thousands of COGs (groups of genes) to pathways.
> # Ex
> COG0001 patha pathb pathc
> COG0002 pathd pathe
> COG0003 pathe pathf pathg pathh
> ##
>
> I would like to combine this information into a big list such as below
> COG2PATHWAY<-list(COG0001=c("patha","pathb","pathc"),COG0002=c("pathd","pathe"),COG0003=c("pathf","pathg","pathh"))
>
> I am stuck and have tried various methods involving (probably mangled)
> versions of lapply and loops.
>
> Any suggestions on the most efficient way to do this would be great.
>
> Thanks,
>
> Alison
>
> Here is my latest attempt.
>
> #####
>
> line_num<-length(scan(file="/g/bork8/waller/test_COGtoPath.txt",what="character",sep="\n"))
> COG2Path<-vector("list",line_num)
> COG2Path<-lapply(1:(line_num-1),function(x)
> scan(file="/g/bork8/waller/test_COGtopath.txt",skip=x,nlines=1,quiet=T,what='character',sep="\t"))
>
> #####
>
> I am getting an error
>
> #####
>
>>COG2Path<-lapply(1:(line_num-1),function(x)
>> scan(file="/g/bork8/waller/test_COGtopath.txt",skip=x,nlines=1,quiet=T,what='character',sep="\t"))
> Error in file(file, "r") : cannot open the connection
> In addition: Warning message:
> In file(file, "r") :
>
> But if I do scan alone I don't get an error
>
> # then I suppose it looks like the easiest way to name the list variables
> is using unix to cut the first column out and then read that in.
> names(COG2Path)<-scan(file="/g/bork8/waller/test_col_names.txt",sep="\t",what="character")
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
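
For anyone who wants to try the split-and-name idea from the reply without the original file, here is a minimal sketch. It is a variant, not the sapply() code from the reply: it uses lapply() and vapply() so the result is always a list. The three example lines are taken from the question; COG0004 is a hypothetical entry with no pathways, added only to show the character(0) case, and the name fields is likewise made up for illustration.

```r
# Example lines from the question, with tabs between fields.
# COG0004 is a made-up entry (assumption) that has no pathways.
dat <- c("COG0001\tpatha\tpathb\tpathc",
         "COG0002\tpathd\tpathe",
         "COG0003\tpathe\tpathf\tpathg\tpathh",
         "COG0004")

# Split every line on literal tabs; the result is a list of character vectors.
fields <- strsplit(dat, "\t", fixed = TRUE)

# Everything after the first field is the pathway vector for that COG.
COG2PATHWAY <- lapply(fields, function(x) x[-1])

# The first field of each line becomes the list name.
names(COG2PATHWAY) <- vapply(fields, function(x) x[1], character(1))

str(COG2PATHWAY)
```

If COGs without pathways should be left out instead of kept as character(0), something like COG2PATHWAY[lengths(COG2PATHWAY) > 0] should drop them afterwards.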