Re: [R-sig-eco] question for the R community : Plot RDA biplot without axis ?

2013-03-06 Thread Sarah Loboda
Hi :D
I finally had the chance to try your solutions. It was fast to modify
the code of the function and it worked!
I really appreciate your comments and your help. You are fast and clear !
I modified my code and I will never put white axes ever again ;)
Thank you very much.
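
For readers landing on this thread, here is a minimal sketch of the resulting four-panel plot. It assumes vegan 2.0-7 or later, where text.cca has the 'axis.bp' argument described in Jari's reply quoted below. The use of xaxt/yaxt and the axis() calls to get tick marks without numbers is my own assumption about what was wanted, not something confirmed in the thread.

```r
library(vegan)

data(dune)
data(dune.env)
dune.cca <- cca(dune ~ A1 + Manure, data = dune.env)

# Four tightly packed panels; keep the old settings so they can be restored
op <- par(mfrow = c(2, 2), mar = c(0.3, 0.3, 0.3, 0.3))
for (i in 1:4) {
  # Suppress the default axes instead of painting them white
  plot(dune.cca, type = "n", scaling = 2, xaxt = "n", yaxt = "n")
  # Tick marks without numbers, drawn inwards
  axis(1, labels = FALSE, tcl = 0.2)
  axis(2, labels = FALSE, tcl = 0.2)
  points(dune.cca, display = "sites", scaling = 2, cex = 1.3, col = 2)
  # axis.bp = FALSE stops text.cca from drawing the extra biplot-arrow axes
  text(dune.cca, display = "bp", scaling = 2, col = 1, cex = 1.1,
       axis.bp = FALSE)
}
par(op)
```

The same panel is drawn four times here only to mirror the original example; in practice each panel would use a different ordination object.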

2013/2/26 Jari Oksanen :
> Sarah,
>
> I added argument 'axis.bp' to text.cca and points.cca functions. The upgraded 
> functions can be found at http://vegan.r-forge.r-project.org/ (rev2452), and 
> will probably be included in the next minor release of vegan (2.0-7) 
> scheduled for March, 2013.
>
> You can also get the single files from R-Forge, or you can install devel 
> version of vegan with
>
> install.packages("vegan", repos="http://r-forge.r-project.org")
>
> It will take a day at minimum to get the version packaged in R-Forge.
>
> Cheers, Jari Oksanen
>
> On 26 Feb 2013, at 3:49, Sarah Loboda wrote:
>
>> Hi,
>> Here's the reproducible example that I made with dune data. When you do the
>> 4 graphs, you can see that because of the text () function, there is an
>> axis on the right and values appear in the plots on the right side. I
>> understand that it is because of my text () function, but is there a way to
>> delete that axis in the text function? If not, is there another way to plot
>> my data on 4 panels without axis?
>>
>> I don't know what you mean by "body of vegan text.cca". You mean in the
>> vegan tutorial ?
>> I used col.axis because ann=FALSE as an argument in plot function does not
>> work and col.axis seems fine...
>>
>> Thank you very much for your time. I really appreciate your help :D
>>
>> library(vegan)
>> library(MASS)
>>
>> ### data
>> data(dune)
>> data(dune.env)
>>
>> ### Constrained ordination
>> dune.hel<-decostand(dune, "hellinger")
>> dune.cca <- cca(dune ~ A1 + Manure, data=dune.env)
>>
>> ### Plot with 4 panels
>> par(mfrow=c(2,2))
>> par(mar=c(0.3,0.3,0.3,0.3))
>>
>>
>> ### plot 1
>> plot(dune.cca, type = "n", scaling = 2, col.axis="white")
>> with(dune.env, points(dune.cca, display = "sites", scaling = 2, cex=1.3,
>> col=2))
>> ### When I add the next line, it adds env. variables as arrows but also
>> ### adds an axis on the right
>> text(dune.cca, display="bp", col=1, cex=1.1)
>>
>> ###plot 2
>> plot(dune.cca, type = "n", scaling = 2, col.axis="white", col="grey")
>> with(dune.env, points(dune.cca, display = "sites", scaling = 2, cex=1.3,
>> col=2))
>> text(dune.cca, display="bp", col=1, cex=1.1)
>>
>> ###plot 3
>> plot(dune.cca, type = "n", scaling = 2, col.axis="white")
>> with(dune.env, points(dune.cca, display = "sites", scaling = 2, cex=1.3,
>> col=2))
>> text(dune.cca, display="bp", col=1, cex=1.1)
>>
>> ###plot 4
>> plot(dune.cca, type = "n", scaling = 2, col.axis="white")
>> with(dune.env, points(dune.cca, display = "sites", scaling = 2, cex=1.3,
>> col=2))
>> text(dune.cca, display="bp", col=1, cex=1.1)
>>
>> 2013/2/25 Gavin Simpson 
>>
>>> On Mon, 2013-02-25 at 13:18 -0500, Sarah Loboda wrote:
>>>> Hi,
>>>> I have trouble to obtain the ordination graph I want. I want to have 4 RDA
>>>> biplot on the same page and I don't want to have (or I want to modify) the
>>>> axis numbers. I want the marks on the axis without numbers to maximize the
>>>> space for each RDA plot.
>>>
>>> A problem is the call to text() ( which calls text.cca() ). It doesn't
>>> pass on arguments to the underlying axis() calls and hence you can't do
>>> what you are trying to do with that function directly.
>>>
>>> Not sure why you want the axis to be white - that draws an axis so it
>>> will obscure anything drawn before it with white paint.
>>>
>>> The only solution at the moment will be to modify the vegan:::text.cca()
>>> function to change the two calls to axis() at the end of the function
>>> definition. I suspect you could just copy the body of vegan:::text.cca
>>> and put it into your own function, but I haven't tried it. If that fails
>>> due to namespace issues, then use assignInNamespace() to assign your
>>> function to the text.cca function in vegan's namespace.
>>>
>>> See the relevant help pages on how to do this. I'm about to leave the
>>> office so I can't help further now, but if you have trouble email back
>>> to the list and I'll see about cooking up an example...
>>>
>>> All the best
>>>
>>> Gavin
>>>
>>>> This seems like a simple task but I tried different approaches and couldn't
>>>> figure out how to change my axis. This is my code :
>>>>
>>>> par(mfrow=c(2,2))
>>>> par(mar=c(0.2,0.2,0.2,0.2))
>>>>
>>>> ### first RDA biplot
>>>> with(arctic, levels(site))
>>>> shapevec<- c(19,19,19,19,19,19,19,19,19,19,19,19,6,6,6,6,6,6,6,6,6,6,6,6)
>>>> plot(spiders.rda.a, type = "n", scaling = 2, las=1, tcl=0.2,
>>>> col.axis="white")
>>>> with(spiders.env.a, points(spiders.rda.a, display = "sites",
>>>> scaling = 2, pch = shapevec, cex=1.3))
>>>> text(spiders.rda.a, display="bp", cex=1.1, col.axis="white", ann=FALSE)
>>>>  it is when I run this line that my y ax

Re: [R-sig-eco] Should one remove highly correlated variables before doing PCA??

2013-03-06 Thread Baldwin, Jim -FS
Two additional issues might be considered:

1.  Correlated variables are still correlated after PCA or after tossing one of 
the variables so teasing apart separate effects of the two variables is not 
resolved (nor can it necessarily be resolved with the particular dataset at 
hand).

2.  The purpose for using PCA should be clear and determined to meet your 
objectives.  Just because you can do a PCA doesn't mean you should.  For 
example, if PCA is performed to obtain "uncorrelated" variables for a 
regression, then consider that the component explaining the most variation will 
not necessarily be a wonderful predictor.  The component explaining the least 
amount of variation might be the best predictor.  Performing PCA for a 
regression has always puzzled me because why would one think that doing 
something in complete isolation of the dependent variable would make for better 
predictors.  (Orthogonal and more numerically stable estimators of the 
coefficients, yes, but not necessarily coefficients of interest.)

Jim
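
Jim's second point can be illustrated with a small simulation. This is only a sketch on made-up data (all variable names and noise levels are hypothetical): two predictors are strongly correlated, but the response depends on their small difference, so the low-variance component ends up being the better predictor.

```r
set.seed(1)
n  <- 200
z  <- rnorm(n)                          # shared signal that makes x1 and x2 correlated
x1 <- z + rnorm(n, sd = 0.1)
x2 <- z + rnorm(n, sd = 0.1)
y  <- (x1 - x2) + rnorm(n, sd = 0.05)   # response driven by the small difference

pca <- prcomp(cbind(x1, x2), scale. = TRUE)

# Share of predictor variance carried by each component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# R-squared of y regressed on each component separately
r2 <- sapply(1:2, function(k) summary(lm(y ~ pca$x[, k]))$r.squared)

print(round(rbind(var_explained, r2), 3))
```

In this setup the first component carries almost all of the predictor variance yet explains very little of y, while the second does the opposite.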

-Original Message-
From: r-sig-ecology-boun...@r-project.org 
[mailto:r-sig-ecology-boun...@r-project.org] On Behalf Of Chris Howden
Sent: Tuesday, March 05, 2013 10:45 PM
To: 张勇; r-sig-ecology@r-project.org
Subject: Re: [R-sig-eco] Should one remove highly correlated variables before 
doing PCA??

Hi Yong,

PCA is a way to deal with highly correlated variables, so there is no need to 
remove them.

If N variables are highly correlated then they will all load out on the SAME 
Principal Component (Eigenvector), not different ones. This is how you identify 
them as being highly correlated. If you were to do further analysis you can then 
either:

1) Use the PCA, and interpret it according to what variables load out on it
2) Choose one of the highly correlated variables, identified as those that 
all load onto the same component, and analyse only it.

Most people if using PCA would use option 1)

A bit more detail.

Many methods have a hard time dealing with multicollinearity, which is when 
there are a number of variables that are highly correlated (I suggest you Google 
it). Before analysis this is usually dealt with in one of 2 ways:
1) Use PCA to get a set of orthogonal, i.e. uncorrelated, variables and 
analyse them
2) Use correlation coefficients to determine which variables are highly 
correlated and use only 1 in the analysis. A cut-off for highly correlated is 
often 0.8.

Variance Inflation Factors are also used. Personally I don't like them since 
they don't tell me which variables a given variable is correlated with. They are 
also clumsy to use. You can't simply remove all variables with a high VIF or you 
will likely remove some useful variables, e.g. if 4 variables all have a high 
VIF you don't know if it's because all 4 are correlated or if there are 2 sets 
of highly correlated variables. So which do you remove? If you must use them 
it's IMPERATIVE that you only remove 1 at a time and then rerun to get new VIFs, 
remove 1, get new VIFs, remove 1, etc. This prevents you from removing too many 
variables.
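
The one-at-a-time VIF procedure described above can be sketched in base R as follows. The helper names (vif, drop_by_vif), the simulated data and the threshold of 10 are assumptions made for illustration; the message itself does not give a cut-off. VIFs are computed here as the diagonal of the inverse correlation matrix.

```r
# VIF of each column: diagonal of the inverse correlation matrix
vif <- function(X) {
  v <- diag(solve(cor(X)))
  names(v) <- colnames(X)
  v
}

# Remove the single worst variable, recompute, and repeat
drop_by_vif <- function(X, threshold = 10) {
  X <- as.data.frame(X)
  while (ncol(X) > 1) {
    v <- vif(X)
    if (max(v) <= threshold) break
    X <- X[, -which.max(v), drop = FALSE]
  }
  X
}

# Two independent pairs of highly correlated variables (hypothetical data)
set.seed(42)
n <- 100
a <- rnorm(n)
b <- rnorm(n)
X <- data.frame(a1 = a + rnorm(n, sd = 0.1), a2 = a + rnorm(n, sd = 0.1),
                b1 = b + rnorm(n, sd = 0.1), b2 = b + rnorm(n, sd = 0.1))

print(round(vif(X), 1))   # all four VIFs start out high
kept <- drop_by_vif(X)
print(names(kept))        # one variable from each pair survives
```

Dropping all four high-VIF variables at once would discard both underlying signals, which is the situation the message warns about.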


Chris Howden B.Sc. (Hons) GStat.
Founding Partner
Evidence Based Strategic Development, IP Commercialisation and Innovation, Data 
Analysis, Modelling and Training
(mobile) 0410 689 945
(fax) +612 4782 9023
ch...@trickysolutions.com.au




Disclaimer: The information in this email and any attachments to it are 
confidential and may contain legally privileged information. If you are not the 
named or intended recipient, please delete this communication and contact us 
immediately. Please note you are not authorised to copy, use or disclose this 
communication or any attachments without our consent. Although this email has 
been checked by anti-virus software, there is a risk that email messages may be 
corrupted or infected by viruses or other interferences. No responsibility is 
accepted for such interference. Unless expressly stated, the views of the 
writer are not those of the company.
Tricky Solutions always does our best to provide accurate forecasts and 
analyses based on the data supplied, however it is possible that some important 
predictors were not included in the data sent to us. Information provided by us 
should not be solely relied upon when making decisions and clients should use 
their own judgement.

-Original Message-
From: r-sig-ecology-boun...@r-project.org
[mailto:r-sig-ecology-boun...@r-project.org] On Behalf Of 张勇
Sent: Wednesday, 6 March 2013 4:33 PM
To: r-sig-ecology@r-project.org
Subject: [R-sig-eco] Should one remove highly correlated variables before doing 
PCA??

Hi list,

Maybe this is not a "R" question, however, it has bothered me for a long time.

Some people think if a set of correlated variables might "load" onto several 
principal components (eigenvectors), so including many variables from such a set 
will differentially weight several eigenvectors--and thereby change the 
directions of all eigenvectors, too.  So, according to these considerations, we 

[R-sig-eco] Course: Beginner's Guide to MCMC, GLM and GAM with R

2013-03-06 Thread Highland Statistics Ltd


There are a few places left on the following course:   Beginner's Guide 
to MCMC, GLM and GAM with R



When:  10 - 13 June 2013
Where: SAMS, Oban, Scotland


Further information: http://www.highstat.com/statscourse.htm
Flyer: http://www.highstat.com/Courses/Flyer2013June_SAMS.pdf

Kind regards,

Alain Zuur

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Should one remove highly correlated variables before doing PCA??

2013-03-06 Thread Chris Howden
With reference to Jim's point 2: one can use Partial Least Squares,
which finds orthogonal components that best explain a set of responses.
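
As a rough illustration of the difference, the sketch below computes only the first PLS component by hand for a single response and compares it with the first principal component. The data are simulated and all names are hypothetical. In practice one would more likely use a dedicated implementation such as plsr() in the 'pls' package, which this sketch does not use.

```r
set.seed(1)
n  <- 1000
z  <- rnorm(n)
x1 <- z + rnorm(n, sd = 0.5)
x2 <- z + rnorm(n, sd = 0.5)
y  <- (x1 - x2) + rnorm(n, sd = 0.3)

X  <- scale(cbind(x1, x2))   # centred, unit-variance predictors
yc <- y - mean(y)

# First PCA component: direction of maximum variance in X, ignoring y
w_pca <- prcomp(X)$rotation[, 1]
t_pca <- X %*% w_pca

# First PLS component: unit-length direction of maximum covariance with y
w_pls <- crossprod(X, yc)
w_pls <- w_pls / sqrt(sum(w_pls^2))
t_pls <- X %*% w_pls

r2_pca <- as.numeric(cor(t_pca, yc))^2
r2_pls <- as.numeric(cor(t_pls, yc))^2
print(c(PCA = r2_pca, PLS = r2_pls))
```

Because the PLS direction is chosen using the response, its first component cannot explain less of y than the first principal component computed from the same scaled predictors.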

Chris Howden
Founding Partner
Tricky Solutions
Tricky Solutions 4 Tricky Problems
Evidence Based Strategic Development, IP Commercialisation and
Innovation, Data Analysis, Modelling and Training

(mobile) 0410 689 945
(fax / office)
ch...@trickysolutions.com.au



Re: [R-sig-eco] Projecting model to landscape

2013-03-06 Thread Robert J. Hijmans
Li,

You can use the "predict" function in the 'raster' package. There are also
examples with randomForest and other techniques in this vignette that comes
with the 'dismo' package:
http://cran.r-project.org/web/packages/dismo/vignettes/sdm.pdf

Robert
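
A minimal sketch of that workflow, assuming the 'raster' and 'randomForest' packages are installed. The two covariate layers, their names (temp, rain) and the response are synthetic stand-ins, not Li's data.

```r
library(raster)
library(randomForest)

set.seed(1)

# Two synthetic covariate layers standing in for the real environmental rasters
temp <- raster(nrows = 20, ncols = 20, xmn = 0, xmx = 1, ymn = 0, ymx = 1)
values(temp) <- runif(ncell(temp))
rain <- raster(temp)                 # same geometry, no values yet
values(rain) <- runif(ncell(rain))
env <- stack(temp, rain)
names(env) <- c("temp", "rain")

# Hypothetical training data: covariates at 200 sampled cells plus a response
cells <- as.data.frame(getValues(env))
train <- cells[sample(nrow(cells), 200), ]
train$abundance <- 2 * train$temp + train$rain + rnorm(nrow(train), sd = 0.1)

rf <- randomForest(abundance ~ temp + rain, data = train)

# Layer names in 'env' must match the predictor names used to fit the model
pred <- predict(env, rf)
plot(pred)
```

For a region of 1000 x 1000 cells or more, passing a filename argument to predict() so that the result is written to disk may be worth trying.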



> Li Wen-2 wrote
> > Dear list member
> >
> >
> >
> > I have fitted a Random Forest model for species distribution, and want to
> > use it to project the model to a defined landscape (i.e. forecasting for
> > all
> > grids in an area) . The landscape has all the environment covariates
> > (rasters) and cover a large region (over 1000*1000 grids). I wonder if
> > there
> > is package in R to do this. Thank you for your help in advance.
> >
> >
> >
> > Cheers
> >
> > Li
> >
>



Re: [R-sig-eco] Projecting model to landscape

2013-03-06 Thread Li Wen
Hi, Robert and all others



Thanks for the suggestions and directions pointed out from the community. It 
seems there are two packages for the work:

1. Using the predict function from package "raster". The function can project 
a fitted model of any class that has a 'predict' method (e.g. glm, gam, or 
randomForest), or a similar method supplied as the fun argument, to a raster 
object. Examples can be found in package "dismo"; and

2. Using package "Biomod", which can build, evaluate and forecast up to 8 
models (e.g. Maxent, MARS, RF, etc.). However, Biomod can only handle 
presence/absence data (i.e. probability or abundance data need to be 
transformed to 0/1).



Thanks again for the help.



Li



--
This email is intended for the addressee(s) named and may contain confidential 
and/or privileged information. 
If you are not the intended recipient, please notify the sender and then delete 
it immediately.
Any views expressed in this email are those of the individual sender except 
where the sender expressly and with authority states them to be the views of 
the Office of Environment and Heritage, NSW Department of Premier and Cabinet.

PLEASE CONSIDER THE ENVIRONMENT BEFORE PRINTING THIS EMAIL



[R-sig-eco] quantifying directed dependence of environmental factors

2013-03-06 Thread Jay Kerns
Hello,

I'm posting to this list because I believe it's the best place to
go.  My question is R related only inasmuch as all the work I've
done so far has been with R and I expect any answers I get from
here will lead me to more R work.

I'm consulting with an ecologist and an engineer on a project
related to a reservoir nearby.  They've collected data on diatoms
in the reservoir via core samples; they have sections of data over
the past 100yrs.  They are looking at the community structure
plus other environmental factors over the same time period.

We've done a ton of work already and there's no point trying to
hash all of that out here.  Short story: we did an NMDS, it fits
OK (stress 0.17), there are obvious clusters in the ordination
which correspond to a-priori clusters from ecological
considerations (and which match an independent cluster analysis),
we're really quite pleased overall.  We checked for relationships
with =envfit=, most environmental variables are *highly*
significant, yet there are a couple which aren't significant at
all.  Here comes my question:

The ecologist pointed out to me that our environmental variables
don't have equal status (ecologically speaking); some variables
lead to others.  For instance, there are so-called ultimate
factors (population, percentage farmland) which contribute to
intermediate factors (suspended solids, total phosphorus) which
in turn contribute to direct factors (AREA, pH,...) which then in
turn contribute to diatom structure.

We have measured data on all the above and several more.  The
model we are fitting with =envfit= is symmetric in those n
environmental variables, but the ecology of the situation isn't
symmetric, it's a directed top-down kind of relationship.  He
asked me, "How can we quantify that?  How can we demonstrate
that?  Can we quantify/demonstrate that?"  I don't know.
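
For readers unfamiliar with the workflow described above, a minimal sketch of an NMDS followed by envfit is given here. It uses vegan's bundled varespec and varechem example data, not the diatom data from this project.

```r
library(vegan)

data(varespec)   # community data (example data shipped with vegan)
data(varechem)   # environmental variables for the same sites

set.seed(1)
ord <- metaMDS(varespec, trace = FALSE)
print(ord$stress)

# Each environmental variable is fitted to the ordination on its own,
# with a permutation test of its fit
fit <- envfit(ord, varechem, permutations = 999)
print(fit)

plot(ord, display = "sites")
plot(fit, p.max = 0.05)   # draw only the variables significant at 0.05
```

Every variable is treated in the same way here, which is the symmetry the message refers to: nothing in this fit encodes that one factor acts through another.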

There are ecologists on this list: what am I looking for, here?
What methods do ecologists use to answer this (or related)
question(s)?  Feel free to direct me to papers, literature,
textbooks, whatever.  I'm trying to help answer this question
and (this not being my subject specialty) I'm at a bit of a loss.

If there are relevant R packages/vignettes/manuals you can point
me to, that'd be cool too.

Thanks for reading all the way down to here.

Jay

P.S. If it hadn't been for the archives of this list containing
lengthy and poignant answers to *several* questions I've had
already then I couldn't even have made it this far.  Thank you!



-- 
G. Jay Kerns, Ph.D.
Youngstown State University
http://people.ysu.edu/~gkerns/



[R-sig-eco] nMDS plot with points of different size

2013-03-06 Thread Stas Malavin
Dear list members,

I want to plot an nMDS diagram with each point's area proportional to the
abundance of a particular species. I could imagine just plotting with type =
"n" and then using points() with different cex, but maybe some special
functions/packages exist for that which you can point me to?

Thank you,
Stas
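
The type = "n" plus points() approach from the question can be sketched as follows, using vegan's dune example data. The choice of species, the maximum symbol size of 3 and the marker for absences are arbitrary assumptions.

```r
library(vegan)

data(dune)
set.seed(1)
ord <- metaMDS(dune, trace = FALSE)
xy  <- scores(ord, display = "sites")

# Use the most abundant species in the data set as the example
sp    <- names(which.max(colSums(dune)))
abund <- dune[[sp]]

# Symbol area grows with cex^2, so take the square root to make
# the area (not the diameter) proportional to abundance
cex <- 3 * sqrt(abund / max(abund))

plot(ord, display = "sites", type = "n")
points(xy, cex = cex, pch = 21, bg = "grey80")
# Mark sites where the species is absent, since cex = 0 draws nothing
points(xy[abund == 0, , drop = FALSE], pch = 3, cex = 0.6)
title(main = sp)
```

Base R's symbols() function is an alternative for drawing circles of a given size, and vegan's ordisurf() has a bubble argument that may also be worth a look.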



Junior Res Asst
Hydrobiology Lab
Institute of Limnology
Russian Academy of Sciences

Sevastyanova 9
196105 Russia, St Petersburg
http://www.limno.org.ru
Phone: +7 (812) 387-80-60
Fax: +7 (812) 388-73-27
