Hi John,
You seem to transform a (relatively) simple problem into a complicated one.
First, you can get all the tree file names in one command. For example (I moved
to a directory with one subdirectory of trees estimated by ML and another of
trees estimated by NJ; the output is slightly rearranged):
> f <- grep("\\.tre", list.files(recursive = TRUE), value = TRUE)
> f
[1] "ML/trees_55species.tre" "ML/treesNADH2_45species.tre"
[3] "ML/TR.ML.Dloop_55sp.tre" "ML/TR.ML.NADH2_45sp.tre"
[5] "nj/trees_55species.tre" "nj/treesNADH2_45species.tre"
[7] "nj/TR.NJ.Dloop_55sp.tre" "nj/TR.NJ.NADH2_45sp.tre"
Because R resolves relative file paths against the current working directory,
these names can be passed to read.tree() as they are; there is no need to
navigate with setwd().
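As a minimal sketch of that point, the following builds a throwaway directory and lists its tree files without changing the working directory. The layout and the file names (a.tre, b.tre, notes.txt) are made up for illustration; only list.files() and file.path() are the actual mechanism.

```r
## Hypothetical layout: a temporary root with two subdirectories.
root <- file.path(tempdir(), "trees_demo")
dir.create(file.path(root, "ML"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "nj"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(root, "ML", "a.tre"),
            file.path(root, "nj", "b.tre"),
            file.path(root, "nj", "notes.txt"))

## Paths come back relative to 'root'; "\\.tre$" anchors the match at
## the end of the name, so "notes.txt" is skipped.
f <- list.files(root, pattern = "\\.tre$", recursive = TRUE)

## Prepend 'root' when reading, instead of calling setwd().
full <- file.path(root, f)
```

Anchoring the pattern with "$" is a small tightening over a plain "\\.tre" match, which would also pick up names such as "x.tre.bak".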
Second, I think you should use a list instead of a data frame because (I
presume) you may have files with different numbers of trees (if this is not the
case, you can transform the list into a data frame later).
You may have commands like this, e.g., if you want to get the mean branch length
of each tree:
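library(ape)
ntree <- length(f)
L <- vector("list", ntree)
for (i in seq_len(ntree)) {
    tr <- read.tree(f[i])
    ## a file with a single tree: one mean
    if (inherits(tr, "phylo")) L[[i]] <- mean(tr$edge.length)
    ## a file with several trees: one mean per tree
    if (inherits(tr, "multiPhylo"))
        L[[i]] <- sapply(tr, function(x) mean(x$edge.length))
}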
Finally, you may name your list with the file names:
names(L) <- f
This has the advantage that you can select some of the results, e.g., the trees
that were estimated by NJ:
> grep("nj/", names(L))
[1] 5 6 7 8
or those from D-loop:
> grep("Dloop", names(L))
[1] 3 7
You get the number of trees in each file with sapply(L, length).
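The indices returned by grep() can be used directly to subset the list. A small sketch with a stand-in for L (the numbers are invented purely to show the indexing; they are not results from real trees):

```r
## Stand-in for the list built above; the values are invented.
L <- list("ML/TR.ML.Dloop_55sp.tre" = c(0.12, 0.15),
          "ML/TR.ML.NADH2_45sp.tre" = 0.08,
          "nj/TR.NJ.Dloop_55sp.tre" = c(0.11, 0.14, 0.13),
          "nj/TR.NJ.NADH2_45sp.tre" = 0.09)

## Keep only the NJ results, then only the D-loop results.
L.nj <- L[grep("^nj/", names(L))]
L.dloop <- L[grep("Dloop", names(L))]

## Number of trees per file; lengths(L) is equivalent to sapply(L, length).
ntrees <- lengths(L)
```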
There can be many variations around this scheme. For instance, if you want to
extract the branch lengths as in your example, the two lines above with "mean"
would become:
if (inherits(tr, "phylo")) L[[i]] <- tr$edge.length
if (inherits(tr, "multiPhylo")) L[[i]] <- lapply(tr, function(x) x$edge.length)
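If a data frame is wanted afterwards, one option is a long table with one row per branch, which works even when the trees have different numbers of branches. This is only a sketch: it assumes each file held a single tree (so each element of L is a plain numeric vector), and the file names and branch lengths below are invented.

```r
## Stand-in for L when every file holds one tree; names and values are invented.
L <- list("ML/tree1.tre" = c(0.10, 0.20, 0.05),
          "nj/tree1.tre" = c(0.12, 0.18, 0.07, 0.02))

## One row per branch, labelled by the file it came from.
branches <- data.frame(file = rep(names(L), lengths(L)),
                       edge.length = unlist(L, use.names = FALSE),
                       stringsAsFactors = FALSE)

## Per-file summary, here the mean branch length.
means <- tapply(branches$edge.length, branches$file, mean)
```

A call such as boxplot(edge.length ~ file, data = branches, horizontal = TRUE) would then draw one box per file.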
Best,
Emmanuel
-----Original Message-----
From: John Denton <[email protected]>
Sender: [email protected]
Date: Tue, 24 Apr 2012 20:30:26
To: R Sig Phylo Listserv<[email protected]>
Subject: [R-sig-phylo] problems with assign(), paste(),
and data.frame() for folders containing trees
Hi folks,
I am trying to recurse through several numbered subfolders in a directory. Each
folder has many trees that I want to display summary values for. I have been
expanding data frames using code with the structure name <- rbind(name,
newvals) to produce a data frame with n rows equal to the number of files in
one of the folders, and n columns equal to the number of values in the file.
I can loop over the values within a single subdirectory fine with, for example,
library(ape)
trees <- list.files(pattern="*.tre")
iters=length(trees)
branchdata.5 <- data.frame()
iterations <- as.character(c(1:length(trees)))
for (i in 1:iters) {
tree <- read.tree(trees[i])
iteration.edges.5 <- as.numeric(tree$edge.length)
branchdata.5 <- rbind(branchdata.5, iteration.edges.5)
}
The problem comes when I want to iterate through the numbered subdirectories
while also iterating through the files in a given directory. I want to
recursively assign these data frames as well, with something like
f <- list.dirs(path = "/.../.../etc", full.names = FALSE, recursive = FALSE)
for (j in 1:length(f)) {
setwd(paste("/.../.../.",j,sep=""))
assign( paste("branchdata.5",j,sep=""), data.frame() )
iterations <- as.character(c(1:length(trees)))
for (i in 1:iters) {
tree <- read.tree(trees[i])
assign(paste("iteration.edges.5",j,sep=""), as.numeric(tree$edge.length) )
paste("branchdata.5",j,sep="") <- rbind(paste("branchdata.5",j,sep=""),
paste("iteration.edges.5",j,sep=""))
}
names(iterations) <- NULL
boxplot(t(paste("branchdata.5",j,sep="")) , horizontal=TRUE , names=iterations
, ylim=c(0,2), xlab="Branch Lengths" , ylab="Iterations" , main = "")
}
The problem seems to be in the rbind() when using values with assign() and
paste(). I would love some help on this!
John S. S. Denton
Ph.D. Candidate
Department of Ichthyology and Richard Gilder Graduate School
American Museum of Natural History
www.johnssdenton.com
_______________________________________________
R-sig-phylo mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo