Hi John,

You seem to transform a (relatively) simple problem into a complicated one. 
First, you can get all the tree file names in one command, such as (I moved to 
a directory with one subdir with trees estimated by ML and another one by NJ; 
this is slightly arranged):

> f <- grep("\\.tre", list.files(recursive = TRUE), value = TRUE)
> f
[1] "ML/trees_55species.tre"      "ML/treesNADH2_45species.tre"
[3] "ML/TR.ML.Dloop_55sp.tre"     "ML/TR.ML.NADH2_45sp.tre"    
[5] "nj/trees_55species.tre"      "nj/treesNADH2_45species.tre"
[7] "nj/TR.NJ.Dloop_55sp.tre"     "nj/TR.NJ.NADH2_45sp.tre"    

Because R resolves relative file paths against the current working directory, 
there is no need to navigate with setwd().
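
As a minimal sketch of this point (the temporary directory, the file names and the 
Newick strings below are invented for illustration), list.files() can return paths 
relative to a top directory, or full paths if full.names = TRUE, so the files can 
be opened without changing the working directory:

```r
# Hypothetical layout: a temporary directory with two subdirectories.
top <- tempfile("trees_")
dir.create(file.path(top, "ML"), recursive = TRUE)
dir.create(file.path(top, "nj"))
writeLines("((a:1,b:1):1,c:2);", file.path(top, "ML", "example.tre"))
writeLines("((a:2,b:2):1,c:3);", file.path(top, "nj", "example.tre"))

# Paths relative to 'top' (what you get after moving to that directory).
rel <- list.files(top, pattern = "\\.tre$", recursive = TRUE)

# Full paths: usable from any working directory, no setwd() needed.
full <- list.files(top, pattern = "\\.tre$", recursive = TRUE,
                   full.names = TRUE)

rel
all(file.exists(full))
```

Either vector can be passed element by element to read.tree().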

Second, I think you should use a list instead of a data frame because (I 
presume) you may have files with different numbers of trees (if this is not the 
case, you can transform the list into a data frame later).
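
A possible way to do that conversion, assuming every element of the list is a 
numeric vector of the same length (the names and values below are made up):

```r
# Hypothetical results: one numeric vector per file, all of the same length.
L <- list("ML/a.tre" = c(0.10, 0.20, 0.30),
          "nj/a.tre" = c(0.15, 0.25, 0.35))

# The conversion only makes sense if all elements have the same length.
stopifnot(length(unique(lengths(L))) == 1)

# Stack the vectors as rows: one row per file, one column per value.
DF <- as.data.frame(do.call(rbind, L))
DF
```

The row names of DF are taken from the names of the list.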

You may have commands like this, e.g., if you want to get the mean branch length 
of each tree:

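library(ape)
ntree <- length(f)
L <- vector("list", ntree)
for (i in seq_len(ntree)) {
    tr <- read.tree(f[i])
    # read.tree() returns a "phylo" object for a single tree
    # and a "multiPhylo" object for several trees
    if (inherits(tr, "phylo")) L[[i]] <- mean(tr$edge.length)
    if (inherits(tr, "multiPhylo"))
        L[[i]] <- sapply(tr, function(x) mean(x$edge.length))
}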

Finally, you may name your list with the file names:

names(L) <- f

This has the advantage that you can select some of the results, e.g., the trees 
that were estimated by NJ:

> grep("nj/", names(L))
[1] 5 6 7 8

or those from D-loop:

> grep("Dloop", names(L))
[1] 3 7

You get the number of trees in each file with sapply(L, length).
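
To extract the results themselves rather than their positions, the indices can 
be used to subset the list. A small sketch, with invented values standing in 
for the real means:

```r
# Hypothetical named results (the numbers are invented).
L <- list("ML/TR.ML.Dloop_55sp.tre" = c(0.12, 0.10),
          "nj/TR.NJ.Dloop_55sp.tre" = c(0.11, 0.09),
          "nj/TR.NJ.NADH2_45sp.tre" = 0.20)

# Results for the trees estimated by NJ.
nj <- L[grep("^nj/", names(L))]

# Results for the D-loop trees, whatever the method.
dloop <- L[grepl("Dloop", names(L))]

# Number of values (here, trees) stored for each file.
sapply(L, length)
```

Subsetting with single brackets keeps the names, so nj and dloop are still 
lists indexed by file name.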

There can be many variations around this scheme. For instance, if you want to 
extract the branch lengths as in your example, the two lines above with "mean" 
would become:

    if (inherits(tr, "phylo")) L[[i]] <- tr$edge.length
    if (inherits(tr, "multiPhylo")) L[[i]] <- lapply(tr, function(x) x$edge.length)
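
If you then want one set of results per subdirectory, as in your numbered 
folders, the names of the list can be used to group it instead of building 
variable names with assign() and paste(). This is only a sketch: the folder 
names "1" and "2" and the branch lengths below are invented, and each file is 
assumed to contain a single tree.

```r
# Hypothetical branch lengths, one vector per file, named by relative path.
L <- list("1/a.tre" = c(0.10, 0.20, 0.40),
          "1/b.tre" = c(0.15, 0.25, 0.35),
          "2/a.tre" = c(0.30, 0.50, 0.70))

# Group the elements by the directory part of their names.
byDir <- split(L, dirname(names(L)))

names(byDir)
names(byDir[["1"]])
```

Each element of byDir is itself a list, so something like 
boxplot(byDir[["1"]], horizontal = TRUE) should draw one box per file of 
that folder.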

Best,

Emmanuel
-----Original Message-----
From: John Denton <[email protected]>
Sender: [email protected]
Date: Tue, 24 Apr 2012 20:30:26 
To: R Sig Phylo Listserv<[email protected]>
Subject: [R-sig-phylo] problems with assign(), paste(),
 and data.frame() for folders containing trees

Hi folks,

I am trying to recurse through several numbered subfolders in a directory. Each 
folder has many trees that I want to display summary values for. I have been 
expanding data frames using code with the structure name <- rbind(name, 
newvals) to produce a data frame with as many rows as there are files in one of 
the folders and as many columns as there are values in each file.

I can loop over the values within a single subdirectory fine with, for example,

library(ape)

trees <- list.files(pattern="*.tre")
iters=length(trees)

branchdata.5 <- data.frame()

iterations <- as.character(c(1:length(trees)))

for (i in 1:iters) {

tree <- read.tree(trees[i])
iteration.edges.5 <- as.numeric(tree$edge.length)

branchdata.5 <- rbind(branchdata.5, iteration.edges.5)

}

The problem comes when I want to iterate through the numbered subdirectories 
while also iterating through the files in a given directory. I want to 
recursively assign these data frames as well, with something like

f <- list.dirs(path = "/.../.../etc", full.names = FALSE, recursive = FALSE)

for (j in 1:length(f)) {

setwd(paste("/.../.../.",j,sep=""))

assign( paste("branchdata.5",j,sep=""), data.frame() )

iterations <- as.character(c(1:length(trees)))

for (i in 1:iters) {

tree <- read.tree(trees[i])
assign(paste("iteration.edges.5",j,sep=""), as.numeric(tree$edge.length) )

paste("branchdata.5",j,sep="") <- rbind(paste("branchdata.5",j,sep=""), 
paste("iteration.edges.5",j,sep=""))

}

names(iterations) <- NULL
boxplot(t(paste("branchdata.5",j,sep="")) , horizontal=TRUE , names=iterations 
, ylim=c(0,2), xlab="Branch Lengths" , ylab="Iterations" , main = "")

}

The problem seems to be in the rbind() when using values with assign() and 
paste(). I would love some help on this! 


John S. S. Denton
Ph.D. Candidate
Department of Ichthyology and Richard Gilder Graduate School
American Museum of Natural History
www.johnssdenton.com
_______________________________________________
R-sig-phylo mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
