Re: [R-sig-eco] question for the R community : Plot RDA biplot without axis ?
Hi :D

I finally had the chance to try your solutions. It was fast to modify the code of the function and it worked! I really appreciate your comments and your help. You are fast and clear! I modified my code and I will never put white axes ever again ;)

Thank you very much.

2013/2/26 Jari Oksanen:

> Sarah,
>
> I added argument 'axis.bp' to text.cca and points.cca functions. The upgraded
> functions can be found in http://vegan.r-forge.r-project.org/ (rev2452), and
> will probably be included in the next minor release of vegan (2.0-7)
> scheduled for March, 2013.
>
> You can also get the single files from R-Forge, or you can install devel
> version of vegan with
>
> install.packages("vegan", repos="http://r-forge.r-project.org")
>
> It will take a day at minimum to get the version packaged in R-Forge.
>
> Cheers, Jari Oksanen
>
> On 26 Feb 2013, at 3:49, Sarah Loboda wrote:
>
>> Hi,
>> Here's the reproducible example that I made with dune data. When you do the
>> 4 graphs, you can see that because of the text() function, there is an
>> axis on the right and values appear in the plots on the right side. I
>> understand that it is because of my text() function, but is there a way to
>> delete that axis in the text function? If not, is there another way to plot
>> my data on 4 panels without axes?
>>
>> I don't know what you mean by "body of vegan text.cca". You mean in the
>> vegan tutorial?
>> I used col.axis because ann=FALSE as an argument in plot function does not
>> work and col.axis seems fine...
>>
>> Thank you very much for your time. I really appreciate your help :D
>>
>> library(vegan)
>> library(MASS)
>>
>> ### data
>> data(dune)
>> data(dune.env)
>>
>> ### Constrained ordination
>> dune.hel<-decostand(dune, "hellinger")
>> dune.cca <- cca(dune ~ A1 + Manure, data=dune.env)
>>
>> ### Plot with 4 panels
>> par(mfrow=c(2,2))
>> par(mar=c(0.3,0.3,0.3,0.3))
>>
>> ### plot 1
>> plot(dune.cca, type = "n", scaling = 2, col.axis="white")
>> with(dune.env, points(dune.cca, display = "sites", scaling = 2, cex=1.3, col=2))
>> ### When I add the next line, it adds env. variables as arrows but also
>> ### adds an axis on the right
>> text(dune.cca, display="bp", col=1, cex=1.1)
>>
>> ### plot 2
>> plot(dune.cca, type = "n", scaling = 2, col.axis="white", col="grey")
>> with(dune.env, points(dune.cca, display = "sites", scaling = 2, cex=1.3, col=2))
>> text(dune.cca, display="bp", col=1, cex=1.1)
>>
>> ### plot 3
>> plot(dune.cca, type = "n", scaling = 2, col.axis="white")
>> with(dune.env, points(dune.cca, display = "sites", scaling = 2, cex=1.3, col=2))
>> text(dune.cca, display="bp", col=1, cex=1.1)
>>
>> ### plot 4
>> plot(dune.cca, type = "n", scaling = 2, col.axis="white")
>> with(dune.env, points(dune.cca, display = "sites", scaling = 2, cex=1.3, col=2))
>> text(dune.cca, display="bp", col=1, cex=1.1)
>>
>> 2013/2/25 Gavin Simpson
>>
>>> On Mon, 2013-02-25 at 13:18 -0500, Sarah Loboda wrote:
>>>> Hi,
>>>>
>>>> I have trouble obtaining the ordination graph I want. I want to have 4
>>>> RDA biplots on the same page and I don't want to have (or I want to modify)
>>>> the axis numbers. I want the marks on the axis without numbers to maximize
>>>> the space for each RDA plot.
>>>
>>> A problem is the call to text() (which calls text.cca()). It doesn't
>>> pass on arguments to the underlying axis() calls and hence you can't do
>>> what you are trying to do with that function directly.
>>>
>>> Not sure why you want the axis to be white - that draws an axis so it
>>> will obscure anything drawn before it with white paint.
>>>
>>> The only solution at the moment will be to modify the vegan:::text.cca()
>>> function to change the two calls to axis() at the end of the function
>>> definition. I suspect you could just copy the body of vegan:::text.cca
>>> and put it into your own function, but I haven't tried it. If that fails
>>> due to namespace issues, then use assignInNamespace() to assign your
>>> function to the text.cca function in vegan's namespace.
>>>
>>> See the relevant help pages on how to do this. I'm about to leave the
>>> office so I can't help further now, but if you have trouble email back
>>> to the list and I'll see about cooking up an example...
>>>
>>> All the best
>>>
>>> Gavin
>>>
>>>> This seems like a simple task but I tried different approaches and
>>>> couldn't figure out how to change my axis. This is my code:
>>>>
>>>> par(mfrow=c(2,2))
>>>> par(mar=c(0.2,0.2,0.2,0.2))
>>>> ### first RDA biplot
>>>> with(arctic, levels(site))
>>>> shapevec<- c(19,19,19,19,19,19,19,19,19,19,19,19,6,6,6,6,6,6,6,6,6,6,6,6)
>>>> plot(spiders.rda.a, type = "n", scaling = 2, las=1, tcl=0.2, col.axis="white")
>>>> with(spiders.env.a, points(spiders.rda.a, display = "sites", scaling = 2, pch = shapevec, cex=1.3))
>>>> text(spiders.rda.a, display="bp", cex=1.1, col.axis="white", ann=FALSE)
>>>> it is when I run this line that my y ax
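For readers following this thread, here is a minimal sketch of how the four-panel plot might look once the 'axis.bp' argument Jari describes is available. It assumes an installed vegan version whose text.cca() accepts 'axis.bp', and it assumes that plot.cca() passes xaxt/yaxt through to the underlying base plot; neither has been checked against every vegan release. The loop simply repeats the same panel four times, as in Sarah's example.

```r
library(vegan)
data(dune)
data(dune.env)

# Same constrained ordination as in the thread
dune.cca <- cca(dune ~ A1 + Manure, data = dune.env)

# Four tight panels; remember old settings so they can be restored
op <- par(mfrow = c(2, 2), mar = c(0.3, 0.3, 0.3, 0.3))
for (i in 1:4) {
  # Assumption: xaxt/yaxt reach the base plot call, so no default axes
  # are drawn (nothing is painted white over the plot)
  plot(dune.cca, type = "n", scaling = 2, xaxt = "n", yaxt = "n")
  # Tick marks only, pointing inwards, without numbers
  axis(1, labels = FALSE, tcl = 0.2)
  axis(2, labels = FALSE, tcl = 0.2)
  points(dune.cca, display = "sites", scaling = 2, cex = 1.3, col = 2)
  # Assumption: this vegan version has 'axis.bp'; FALSE suppresses the
  # extra biplot-arrow axes that text.cca() otherwise adds (top/right)
  text(dune.cca, display = "bp", scaling = 2, col = 1, cex = 1.1,
       axis.bp = FALSE)
}
par(op)
```

If the installed vegan predates the change, the call to text() will not recognise 'axis.bp', and Gavin's suggestion of editing a copy of text.cca() remains the fallback.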
Re: [R-sig-eco] Should one remove highly correlated variables before doing PCA??
Two additional issues might be considered:

1. Correlated variables are still correlated after PCA or after tossing one of the variables, so teasing apart separate effects of the two variables is not resolved (nor can it necessarily be resolved with the particular dataset at hand).

2. The purpose for using PCA should be clear and determined to meet your objectives. Just because you can do a PCA doesn't mean you should. For example, if PCA is performed to obtain "uncorrelated" variables for a regression, then consider that the component explaining the most variation will not necessarily be a wonderful predictor. The component explaining the least amount of variation might be the best predictor. Performing PCA for a regression has always puzzled me: why would one think that doing something in complete isolation from the dependent variable would make for better predictors? (Orthogonal and more numerically stable estimators of the coefficients, yes, but not necessarily coefficients of interest.)

Jim

-----Original Message-----
From: r-sig-ecology-boun...@r-project.org [mailto:r-sig-ecology-boun...@r-project.org] On Behalf Of Chris Howden
Sent: Tuesday, March 05, 2013 10:45 PM
To: 张勇; r-sig-ecology@r-project.org
Subject: Re: [R-sig-eco] Should one remove highly correlated variables before doing PCA??

Hi Yong,

PCA is a way to deal with highly correlated variables, so there is no need to remove them.

If N variables are highly correlated then they will all load out on the SAME Principal Component (Eigenvector), not different ones. This is how you identify them as being highly correlated. If you were to do further analysis U can then either:

1) Use the PCA, and interpret it according to what variables load out on it
2) Choose one of the highly correlated variables (identified as those that all load onto the same component) and analyse only it.

Most people if using PCA would use option 1)

A bit more detail.

Many methods have a hard time dealing with multicollinearity, which is when there are a number of variables that are highly correlated (I suggest U Google it). Before analysis this is usually dealt with in one of 2 ways:

1) Use PCA to get a set of orthogonal, i.e. not correlated, variables and analyse them
2) Use correlation coefficients to determine which variables are highly correlated and use only 1 in the analysis. A cut off for highly correlated is often 0.8.

Variance Inflation Factors are also used. Personally I don't like them since they don't tell me which variables each one is correlated with. They are also clumsy to use. U can't simply remove all variables with high VIF or you will likely remove some useful variables, e.g. if 4 variables all have a high VIF U don't know if it's because all 4 are correlated or if there are 2 sets of highly correlated variables. So which do U remove??? If U must use them it's IMPERATIVE that U only remove 1 at a time and then rerun to get new VIFs, remove 1, get new VIFs, remove 1, etc. This prevents U removing too many variables.

Chris Howden B.Sc. (Hons) GStat.
Founding Partner
Evidence Based Strategic Development, IP Commercialisation and Innovation, Data Analysis, Modelling and Training
(mobile) 0410 689 945
(fax) +612 4782 9023
ch...@trickysolutions.com.au

-----Original Message-----
From: r-sig-ecology-boun...@r-project.org [mailto:r-sig-ecology-boun...@r-project.org] On Behalf Of 张勇
Sent: Wednesday, 6 March 2013 4:33 PM
To: r-sig-ecology@r-project.org
Subject: [R-sig-eco] Should one remove highly correlated variables before doing PCA??

Hi list,

Maybe this is not an "R" question, however, it has bothered me for a long time. Some people think that a set of correlated variables might "load" onto several principal components (eigenvectors), so including many variables from such a set will differentially weight several eigenvectors--and thereby change the directions of all eigenvectors, too. So, according to these considerations, we
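As an aside to the thread above, the following base-R sketch illustrates two of the points Chris makes: that nearly collinear variables load together on one principal component, and that variables should be dropped by VIF one at a time, recomputing after each removal. Everything here is illustrative: the data are simulated, the helper names vif() and drop_by_vif() are hypothetical, and the VIF cut-off of 10 is a common rule of thumb rather than anything stated in the thread.

```r
set.seed(1)
n <- 200
z <- rnorm(n)                        # shared latent driver (simulated)
X <- data.frame(
  x1 = z + rnorm(n, sd = 0.1),       # x1, x2, x3 are nearly collinear
  x2 = z + rnorm(n, sd = 0.1),
  x3 = z + rnorm(n, sd = 0.1),
  x4 = rnorm(n)                      # unrelated to the other three
)

# Pairwise correlations; the thread mentions 0.8 as a typical cut-off
round(cor(X), 2)

# PCA on standardised variables: x1-x3 load together on PC1
pca <- prcomp(X, scale. = TRUE)
round(pca$rotation, 2)

# Hypothetical helper: VIF_j = 1 / (1 - R^2), where R^2 comes from
# regressing variable j on all the other variables
vif <- function(d) {
  sapply(names(d), function(v) {
    f  <- reformulate(setdiff(names(d), v), response = v)
    r2 <- summary(lm(f, data = d))$r.squared
    1 / (1 - r2)
  })
}

# Hypothetical helper: drop only the single worst variable, recompute the
# VIFs, and repeat until every VIF is below the cut-off (assumed: 10)
drop_by_vif <- function(d, cutoff = 10) {
  repeat {
    if (ncol(d) < 2) return(d)
    v <- vif(d)
    if (max(v) < cutoff) return(d)
    d <- d[, names(v)[-which.max(v)], drop = FALSE]
  }
}

kept <- drop_by_vif(X)
names(kept)
```

Removing every variable with a high initial VIF in one pass would discard x1, x2 and x3 together; the one-at-a-time loop keeps one representative of the correlated group alongside x4.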
[R-sig-eco] Course: Beginner's Guide to MCMC, GLM and GAM with R
There are a few places left on the following course:

Beginner's Guide to MCMC, GLM and GAM with R

When: 10 - 13 June 2013
Where: SAMS, Oban, Scotland
Further information: http://www.highstat.com/statscourse.htm
Flyer: http://www.highstat.com/Courses/Flyer2013June_SAMS.pdf

Kind regards,

Alain Zuur

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
Re: [R-sig-eco] Should one remove highly correlated variables before doing PCA??
With reference to Jim's point 2: one can use Partial Least Squares, which finds orthogonal components (analogous to PCs) that best explain a set of responses.

Chris Howden
Founding Partner
Tricky Solutions
Tricky Solutions 4 Tricky Problems
Evidence Based Strategic Development, IP Commercialisation and Innovation, Data Analysis, Modelling and Training
(mobile) 0410 689 945
(fax / office)
ch...@trickysolutions.com.au

On 07/03/2013, at 1:11, "Baldwin, Jim -FS" wrote:

> 2. The purpose for using PCA should be clear and determined to meet your
> objectives. Just because you can do a PCA doesn't mean you should. For
> example, if PCA is performed to obtain "uncorrelated" variables for a
> regression, then consider that the component explaining the most variation
> will not necessarily be a wonderful predictor. The component explaining the
> least amount of variation might be the best predictor. Performing PCA for a
> regression has always puzzled me because why would one think that doing
> something in complete isolation of the dependent variable would make for
> better predictors. (Orthogonal and more numerically stable estimators of the
> coefficients, yes, but not necessarily coefficients of interest.)
>
> Jim
Re: [R-sig-eco] Projecting model to landscape
Li,

You can use the "predict" function in the 'raster' package. There are also examples with randomForest and other techniques in this vignette that comes with the 'dismo' package:

http://cran.r-project.org/web/packages/dismo/vignettes/sdm.pdf

Robert

> Li Wen-2 wrote
> > Dear list member
> >
> > I have fitted a Random Forest model for species distribution, and want to
> > use it to project the model to a defined landscape (i.e. forecasting for
> > all grids in an area). The landscape has all the environment covariates
> > (rasters) and cover a large region (over 1000*1000 grids). I wonder if
> > there is package in R to do this. Thank you for your help in advance.
> >
> > Cheers
> >
> > Li
Re: [R-sig-eco] Projecting model to landscape
Hi, Robert and all others

Thanks for the suggestions and directions pointed out by the community. It seems there are two packages for the work:

1. Using the predict function from package "raster". The function can project a fitted model of any class that has a 'predict' method (e.g. glm, gam, or randomForest), or one for which a similar method is supplied as the fun argument, to a raster object. Examples can be found in package "dismo"; and

2. Using package "Biomod", which can build, evaluate and forecast up to 8 models (e.g. Maxent, MARS, RF, etc.). However, Biomod can only handle presence/absence data (i.e. probability or abundance data need to be transformed to 0/1).

Thanks again for the help.

Li
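To make the first option in this thread concrete, here is a small sketch of projecting a fitted random forest onto covariate rasters with predict() from the 'raster' package. The rasters, the layer names "temp" and "rain", and the presence/absence data are all simulated stand-ins, not Li's data; the type = "prob" and index arguments follow the pattern shown in the raster documentation for randomForest classifiers.

```r
library(raster)
library(randomForest)

set.seed(42)

# Hypothetical covariate layers on a small 50 x 50 grid
template <- raster(nrows = 50, ncols = 50, xmn = 0, xmx = 50, ymn = 0, ymx = 50)
temp <- setValues(template, runif(ncell(template), 5, 25))
rain <- setValues(template, runif(ncell(template), 200, 1500))
covs <- stack(temp, rain)
names(covs) <- c("temp", "rain")     # layer names must match model variables

# Hypothetical training data: covariate values at 300 randomly chosen cells,
# with simulated presence/absence that depends on temperature
cells <- sample(ncell(covs), 300)
train <- as.data.frame(extract(covs, cells))
train$present <- factor(rbinom(300, 1, plogis((train$temp - 15) / 3)))

rf <- randomForest(present ~ temp + rain, data = train, ntree = 200)

# Project the model onto every cell; index = 2 keeps the probability of
# the second class level ("1", i.e. presence)
suit <- predict(covs, rf, type = "prob", index = 2)
plot(suit)
```

For a landscape as large as the one Li describes (over 1000*1000 cells), predict() also accepts a filename argument so the result is written to disk rather than held in memory.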
[R-sig-eco] quantifying directed dependence of environmental factors
Hello,

I'm posting to this list because I believe it's the best place to go. My question is R related only inasmuch as all the work I've done so far has been with R and I expect any answers I get from here will lead me to more R work.

I'm consulting with an ecologist and an engineer on a project related to a reservoir nearby. They've collected data on diatoms in the reservoir via core samples; they have sections of data over the past 100 years. They are looking at the community structure plus other environmental factors over the same time period.

We've done a ton of work already and there's no point trying to hash all of that out here. Short story: we did an NMDS, it fits OK (stress 0.17), there are obvious clusters in the ordination which correspond to a-priori clusters from ecological considerations (and which match an independent cluster analysis), we're really quite pleased overall. We checked for relationships with =envfit=, most environmental variables are *highly* significant, yet there are a couple which aren't significant at all.

Here comes my question: The ecologist pointed out to me that our environmental variables don't have equal status (ecologically speaking); some variables lead to others. For instance, there are so-called ultimate factors (population, percentage farmland) which contribute to intermediate factors (suspended solids, total phosphorus) which in turn contribute to direct factors (AREA, pH,...) which then in turn contribute to diatom structure. We have measured data on all the above and several more.

The model we are fitting with =envfit= is symmetric in those n environmental variables, but the ecology of the situation isn't symmetric, it's a directed top-down kind of relationship. He asked me, "How can we quantify that? How can we demonstrate that? Can we quantify/demonstrate that?"

I don't know. There are ecologists on this list: what am I looking for, here? What methods do ecologists use to answer this (or related) question(s)? Feel free to direct me to papers, literature, textbooks, whatever. I'm trying to help answer this question and (this not being my subject specialty) I'm at a bit of a loss. If there are relevant R packages/vignettes/manuals you can point me to, that'd be cool too.

Thanks for reading all the way down to here.

Jay

P.S. If it hadn't been for the archives of this list containing lengthy and poignant answers to *several* questions I've had already then I couldn't even have made it this far. Thank you!

--
G. Jay Kerns, Ph.D.
Youngstown State University
http://people.ysu.edu/~gkerns/
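For context, the NMDS-plus-envfit workflow Jay describes might look roughly like the sketch below. It uses vegan's built-in dune data as a stand-in, since the diatom data are not available here, and the choice of A1 and Management as environmental variables is purely illustrative. It shows only the symmetric fit the message refers to; it does not address the directed, top-down structure being asked about.

```r
library(vegan)
data(dune)
data(dune.env)

set.seed(1)
# NMDS on Bray-Curtis dissimilarities (stand-in data, not the diatom cores)
ord <- metaMDS(dune, distance = "bray", k = 2, trace = FALSE)
ord$stress                      # compare with the 0.17 quoted in the message

# envfit treats every environmental variable the same way: each one is
# fitted to the ordination separately and tested by permutation
fit <- envfit(ord ~ A1 + Management, data = dune.env, permutations = 999)
fit

plot(ord, display = "sites")
plot(fit, p.max = 0.05)         # draw only the variables with p <= 0.05
```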
[R-sig-eco] nMDS plot with points of different size
Dear list members,

I want to plot an nMDS diagram with the points' area proportional to the abundance of a particular species. I could imagine just plotting with type = "n" and then using points() with different cex, but maybe some special functions/packages exist for that which you can point me to?

Thank you,

Stas

Junior Res Asst
Hydrobiology Lab
Institute of Limnology
Russian Academy of Sciences
Sevastyanova 9
196105 Russia, St Petersburg
http://www.limno.org.ru
Phone: +7 (812) 387-80-60
Fax: +7 (812) 388-73-27
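The approach Stas outlines (an empty plot followed by points() with varying cex) can be sketched as follows. This is only an illustration using vegan's dune data in place of his own; the species is picked automatically as the most abundant column, and the maximum symbol size of 3 is an arbitrary choice. Because cex scales the symbol's linear size, it is set proportional to the square root of abundance so that the area is proportional to abundance.

```r
library(vegan)
data(dune)

set.seed(1)
ord <- metaMDS(dune, distance = "bray", k = 2, trace = FALSE)
site_xy <- scores(ord, display = "sites")

# Illustrative choice: the species with the largest total abundance
sp    <- names(which.max(colSums(dune)))
abund <- dune[, sp]

# Area ~ abundance, so linear size (cex) ~ sqrt(abundance); max cex = 3
cex_pt <- 3 * sqrt(abund / max(abund))

plot(site_xy, type = "n", asp = 1, xlab = "NMDS1", ylab = "NMDS2", main = sp)
present <- abund > 0
points(site_xy[present, , drop = FALSE], cex = cex_pt[present],
       pch = 21, bg = "grey70")
# Sites where the species is absent are marked with small crosses
points(site_xy[!present, , drop = FALSE], pch = 3, cex = 0.6)
```

The base graphics function symbols(), with its circles argument, is an alternative if the radii need to be expressed in axis units rather than as a multiple of the default symbol size.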