Dear Anna,

My first answer would be to include the individuals in the phylogeny from the beginning. But how closely related do you expect the traits of individuals of the same species to be? If you can't give a sound answer to this question, my approach would not be meaningful.
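As a rough illustration of what including the individuals amounts to (a minimal sketch, not a recommended analysis): under Brownian motion, the covariance between two tips is the branch length they share from the root. Giving every individual its own pendant branch below its species is therefore the same as repeating that species' rows and columns in the covariance matrix and adding the pendant length to the diagonal. Everything in the block is made up for illustration: the three-species toy matrix `V_sp`, the helper `expand_to_individuals`, the vector `ind_species`, and the two pendant lengths. With a real tree, `V_sp` could come from `ape::vcv()`.

```r
# Species-level covariance under Brownian motion for the toy tree
# ((sp1:1,sp2:1):1,sp3:2); entries are shared root-to-tip path lengths.
# ASSUMPTION: toy values; with a real tree this could come from ape::vcv(phy).
sp <- c("sp1", "sp2", "sp3")
V_sp <- matrix(c(2, 1, 0,
                 1, 2, 0,
                 0, 0, 2),
               nrow = 3, byrow = TRUE, dimnames = list(sp, sp))

# Hypothetical data layout: the species of each measured individual.
ind_species <- c("sp1", "sp1", "sp2", "sp2", "sp2", "sp3", "sp3")

# Hypothetical helper: one row and one column per individual.
# Between-species entries are copied from the species matrix; individuals of
# the same species share the whole path to their species node, and each gets
# an extra pendant branch of length `within_len` (an assumed value).
expand_to_individuals <- function(V_sp, ind_species, within_len) {
  V <- V_sp[ind_species, ind_species]
  diag(V) <- diag(V) + within_len
  ids <- make.unique(ind_species)
  dimnames(V) <- list(ids, ids)
  V
}

V_short <- expand_to_individuals(V_sp, ind_species, within_len = 0.01)
V_long  <- expand_to_individuals(V_sp, ind_species, within_len = 1)

# Implied correlation between the two sp1 individuals under each choice.
print(cov2cor(V_short)[1, 2])
print(cov2cor(V_long)[1, 2])
```

The only thing that changes between the two calls is the assumed pendant length, yet the implied correlation between two individuals of the same species moves from nearly 1 to about two thirds. That is why the length needs a justification that does not come from the traits themselves.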
What PGLS needs is a correlation structure matrix for the phylogeny. Roughly, you can see it as a matrix of pairwise phylogenetic distances between species (if you can assume that the traits evolved by Brownian motion). If you have more than one individual per species, you can add them to the matrix by adding a row and a column for each extra individual. The distance to individuals in other species can be taken to be the same as the distance between the two species. But, if I understood your scenario correctly, there may be a problem with how you determine the distance between two individuals of the same species, since you lack genetic information for the individuals. Not knowing how closely related the individuals are, compared with the species, may result in an arbitrary placement of the individuals' pendant branches. Very short branches mean very high correlation; long branches mean low correlation. Setting the branch lengths for the individuals from their traits would probably introduce a risky circularity if you then use those branch lengths to estimate the fit of the traits to an evolutionary model.

Anyhow, this is just an early-morning answer, so I suggest you wait for other, more expert answers.

Best,
Giulio Valentino Dalla Riva

PhD candidate @ Biomathematics Research Centre
University of Canterbury
Christchurch, NZ
Phone: +64 3642987 ext 4869

> On 8/08/2014, at 6:35 am, "Anna Bastian" <anna.bast...@uct.ac.za> wrote:
>
> Hello,
>
> I want to do a PGLS. The problem is that I have more taxa in my data set than
> in my phylogeny.
>
> The questions are: Are peak frequency and skull length correlated? Are bite
> force and skull length correlated? Are nasal capsule volume and skull length
> correlated?
>
> I have ten species and each species has data for at least 2 individuals.
> The phylogeny on the other hand has only the ten species.
>
> Is there a way to use the same taxa at the tips of the phylogeny for more
> than one row (=individual) in my data?
>
> I did the PGLS with the averaged data for a species so that I had exactly the
> same number of entries (ten species in phylogeny and ten species in data).
> The following regressions lacked statistical power, possibly due to the low
> number of samples (ten species versus 61 individuals of ten species).
> I extracted the residuals for each of the ten species and plotted e.g. skull
> length versus peak frequency.
>
> Could I use the residual for a species and apply it to each individual of
> that species?
> Meaning, removing/subtracting the same amount of variation which is explained
> by phylogeny from the value of each individual belonging to that species.
>
> Wouldn't this way give me the same result as using a tree with polytomies
> representing the individuals within each species clade?
>
> I am absolutely unsure if any of this is statistically correct and would
> appreciate if somebody with a more profound knowledge of this procedure could
> provide advice.
> Thank you very much for your time and help
>
> Anna
>
> Here is our script used for the ten species:
>> setwd("xy")
>> regdata<-read.table("reg.txt",sep="\t",header=TRUE)
>> phyl<-read.nexus("xy.nex")
>> cdata<-comparative.data(phy=phyl,data=regdata,names.col="Species",vcv=TRUE,na.omit=FALSE,warn.dropped=TRUE)
>> brcor<-corBrownian(phy=phyl)
>> library('nlme')
>> BFvsSKL.pgls<-pgls(LogBF~logSKL,data=cdata,lambda='ML')
>> summary(BFvsSKL.pgls)
>
>
> ...and the output:
> Call:
> pgls(formula = LogBF ~ logSKL, data = cdata, lambda = "ML")
>
> Residuals:
>      Min       1Q   Median       3Q      Max
> -0.10204  0.01829  0.02849  0.04665  0.05957
>
> Branch length transformations:
>
> kappa  [Fix]  : 1.000
> lambda [ ML]  : 1.000
>    lower bound : 0.000, p = 0.047565
>    upper bound : 1.000, p = 1
>    95.0% CI   : (0.041, NA)
> delta  [Fix]  : 1.000
>
> Coefficients:
>             Estimate Std. Error t value  Pr(>|t|)
> (Intercept) -4.05669    0.58857 -6.8924 0.0001255 ***
> logSKL       3.37458    0.44153  7.6430 6.057e-05 ***
> ---
> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 0.05415 on 8 degrees of freedom
> Multiple R-squared: 0.8795, Adjusted R-squared: 0.8645
> F-statistic: 58.42 on 1 and 8 DF, p-value: 6.057e-05
>> names(BFvsSKL.pgls)
>  [1] "model" "formula" "call" "RMS" "NMS" "NSSQ" "RSSQ" "aic" "aicc"
> [10] "n" "k" "sterr" "fitted" "residuals" "phyres" "x" "data" "varNames"
> [19] "y" "param" "mlVals" "namey" "bounds" "Vt" "dname" "param.CI"
>> BFvsSKL.pgls[[14]]
>                  [,1]
> Species1  -0.22451460
> Species2  -0.04249437
> Species3  -0.04078567
> Species4  -0.14226713
> Species5  -0.11653145
> Species6   0.17663610
> Species7   0.10586335
> Species8   0.07830198
> Species9   0.03776658
> Species10  0.14624467
>> BFvsSKL.pgls[[26]]->BoundsBF
>> summary(BoundsBF)
>        Length Class  Mode
> kappa  0      -none- NULL
> lambda 5      -none- list
> delta  0      -none- NULL
>> plot(LogBF~logSKL, data=regdata)
>> abline(BFvsSKL.pgls)
>> plot(BFvsSKL.pgls)
>
> _______________________________________________
> R-sig-phylo mailing list - R-sig-phylo@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> Searchable archive at http://www.mail-archive.com/r-sig-phylo@r-project.org/