[R-sig-eco] Mantel test with multiple imputation, P values

2013-08-18 Thread Jacob Cram
Dear List,
  I have an environmental data set with several missing values that I
am trying to relate to a community structure data set using a mantel
test.   One solution to the missing data problem seems to be multiple
imputation; I am using the Amelia package. This generates several (five in
this example) imputed data sets.  I can run mantel on each of these and
come up with five similar but not identical solutions.  I figure I can
average the mantel rho values.  However, I am not sure what to do about the
P values. From looking around online, it looks like I shouldn't take the
average of p values.  I found this reference <
http://missingdata.lshtm.ac.uk/index.php?option=com_content&view=article&id=164:combining-p-values-from-multiple-imputations&catid=57:multiple-imputation&Itemid=98>
that seems to have promising suggestions, but I can't seem to figure out
how I'd implement any of these in R.  I was hoping somebody might have a
suggestion for how I could combine my p values.  One option, I think would
be to take the highest (worst) p value (in the example below, this would be
p = 0.012).  However for large numbers of imputations, I am believe that
this method might be to conservative.  Another option might be to take the
p value corresponding to the median rho score (in the example below this
would be p =0.008).   Thoughts?
-Jacob


##Example Code Below
require(Amelia)
require(vegan)
require(ecodist)

##Species data matrix with environmental data that are missing some values.
data(varespec)
data(varechem)
varechem.missings <- varechem[,c("N", "P", "K")]
varechem.missings[c(1,5, 7, 15, 20),1] <- NA
varechem.missings[c(1,2, 9, 21), 2] <- NA

#I multiply impute the missing values with the Amelia package
imps <- 5
#imps <- 25
varechem.amelia <- amelia(varechem.missings, m = imps)


#for each imputation of the environmental data I run a mantel test and save
#the results to mresults
mresults <- NULL

for(i in 1:imps){
varespec.dist <- vegdist(varespec)
varechem.am.dist <- dist(varechem.amelia$imputations[[i]])
mresults <- rbind(mresults,
(ecodist::mantel(varespec.dist~varechem.am.dist)))
}

mresults

##mantelr pval1 pval2 pval3 llim.2.5% ulim.97.5%
## [1,] 0.2137656 0.008 0.993 0.008 0.1015176  0.3389979
## [2,] 0.2162388 0.011 0.990 0.011 0.1207528  0.3346554
## [3,] 0.2149556 0.012 0.989 0.016 0.1319943  0.3279028
## [4,] 0.2101820 0.009 0.992 0.012 0.1217293  0.3288272
## [5,] 0.2135279 0.006 0.995 0.006 0.1130386  0.3359864

#based on these results what would be a reasonable p value to report for
the environmental parameters relating to the community structure?

##end example

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Mantel test with multiple imputation, P values

2013-08-18 Thread Krzysztof Sakrejda
Hi Jacob, comments below.

On Aug 18, 2013 2:31 PM, "Jacob Cram"  wrote:
>
> Dear List,
>   I have an environmental data set with several missing values that I
> am trying to relate to a community structure data set using a mantel
> test.   One solution to the missing data problem seems to be multiple
> imputation; I am using the Amelia package. This generates several (five in
> this example) imputed data sets.  I can run mantel on each of these and
> come up with five similar but not identical solutions.  I figure I can
> average the mantel rho values.  However, I am not sure what to do about
the
> P values. From looking around online, it looks like I shouldn't take the
> average of p values.  I found this reference <
>
http://missingdata.lshtm.ac.uk/index.php?option=com_content&view=article&id=164:combining-p-values-from-multiple-imputations&catid=57:multiple-imputation&Itemid=98
>
> that seems to have promising suggestions, but I can't seem to figure out
> how I'd implement any of these in R.

So following that link and reading the Li et al. reference it looks as
though the procedure is well described at the top of page 71. You get your
parameter estimate from the usual procedure. The test statistic, written as
"D", is the distance between the null value and the estimated value with
some scaling specified in eq. 1.17. They use the F distribution and k and m
(the number of imputations) degrees of freedom. I don't think you need to
reinvent some inferior ecologists-only procedure for this.

Krzysztof

I was hoping somebody might have a
> suggestion for how I could combine my p values.  One option, I think would
> be to take the highest (worst) p value (in the example below, this would
be
> p = 0.012).  However for large numbers of imputations, I am believe that
> this method might be to conservative.  Another option might be to take the
> p value corresponding to the median rho score (in the example below this
> would be p =0.008).   Thoughts?
> -Jacob
>
>
> ##Example Code Below
> require(Amelia)
> require(vegan)
> require(ecodist)
>
> ##Species data matrix with environmental data that are missing some
values.
> data(varespec)
> data(varechem)
> varechem.missings <- varechem[,c("N", "P", "K")]
> varechem.missings[c(1,5, 7, 15, 20),1] <- NA
> varechem.missings[c(1,2, 9, 21), 2] <- NA
>
> #I multiply impute the missing values with the Amelia package
> imps <- 5
> #imps <- 25
> varechem.amelia <- amelia(varechem.missings, m = imps)
>
>
> #for each imputation of the environmental data I run a mantel test and
save
> #the results to mresults
> mresults <- NULL
>
> for(i in 1:imps){
> varespec.dist <- vegdist(varespec)
> varechem.am.dist <- dist(varechem.amelia$imputations[[i]])
> mresults <- rbind(mresults,
> (ecodist::mantel(varespec.dist~varechem.am.dist)))
> }
>
> mresults
>
> ##mantelr pval1 pval2 pval3 llim.2.5% ulim.97.5%
> ## [1,] 0.2137656 0.008 0.993 0.008 0.1015176  0.3389979
> ## [2,] 0.2162388 0.011 0.990 0.011 0.1207528  0.3346554
> ## [3,] 0.2149556 0.012 0.989 0.016 0.1319943  0.3279028
> ## [4,] 0.2101820 0.009 0.992 0.012 0.1217293  0.3288272
> ## [5,] 0.2135279 0.006 0.995 0.006 0.1130386  0.3359864
>
> #based on these results what would be a reasonable p value to report for
> the environmental parameters relating to the community structure?
>
> ##end example
>
> [[alternative HTML version deleted]]
>
> ___
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Mantel test with multiple imputation, P values

2013-08-18 Thread Jacob Cram
Krzystof,
   Thank you for your reply.  It is promising that you think that Li's
method should work for imputed mantel results.  That said, I am a bit over
my head with the math in the Li et al reference.  Could you (or anyone on
this list) provide an R-code example of how D would be calculated from the
output of several mantel tests?

Also, and forgive my ignorance if this statement is coming from the wrong
direction, It is my understanding that mantel P values are generally
calculated by actually permuting the rows of the similarity matrix and
re-running the mantel test a given number of times (usually 1000).
Accordingly, as far as I can tell there is no explicitly generated F value
corresponding to a mantel p value.  It seems like Li's method assumes I am
generating P from an F table.  Would it be appropriate to back calculate F
from P, k and m?

Thanks again,
Jacob



On Sun, Aug 18, 2013 at 2:19 PM, Krzysztof Sakrejda <
krzysztof.sakre...@gmail.com> wrote:

> Hi Jacob, comments below.
>
> On Aug 18, 2013 2:31 PM, "Jacob Cram"  wrote:
> >
> > Dear List,
> >   I have an environmental data set with several missing values that I
> > am trying to relate to a community structure data set using a mantel
> > test.   One solution to the missing data problem seems to be multiple
> > imputation; I am using the Amelia package. This generates several (five
> in
> > this example) imputed data sets.  I can run mantel on each of these and
> > come up with five similar but not identical solutions.  I figure I can
> > average the mantel rho values.  However, I am not sure what to do about
> the
> > P values. From looking around online, it looks like I shouldn't take the
> > average of p values.  I found this reference <
> >
> http://missingdata.lshtm.ac.uk/index.php?option=com_content&view=article&id=164:combining-p-values-from-multiple-imputations&catid=57:multiple-imputation&Itemid=98
> >
> > that seems to have promising suggestions, but I can't seem to figure out
> > how I'd implement any of these in R.
>
> So following that link and reading the Li et al. reference it looks as
> though the procedure is well described at the top of page 71. You get your
> parameter estimate from the usual procedure. The test statistic, written as
> "D", is the distance between the null value and the estimated value with
> some scaling specified in eq. 1.17. They use the F distribution and k and m
> (the number of imputations) degrees of freedom. I don't think you need to
> reinvent some inferior ecologists-only procedure for this.
>
> Krzysztof
>
> I was hoping somebody might have a
> > suggestion for how I could combine my p values.  One option, I think
> would
> > be to take the highest (worst) p value (in the example below, this would
> be
> > p = 0.012).  However for large numbers of imputations, I am believe that
> > this method might be to conservative.  Another option might be to take
> the
> > p value corresponding to the median rho score (in the example below this
> > would be p =0.008).   Thoughts?
> > -Jacob
> >
> >
> > ##Example Code Below
> > require(Amelia)
> > require(vegan)
> > require(ecodist)
> >
> > ##Species data matrix with environmental data that are missing some
> values.
> > data(varespec)
> > data(varechem)
> > varechem.missings <- varechem[,c("N", "P", "K")]
> > varechem.missings[c(1,5, 7, 15, 20),1] <- NA
> > varechem.missings[c(1,2, 9, 21), 2] <- NA
> >
> > #I multiply impute the missing values with the Amelia package
> > imps <- 5
> > #imps <- 25
> > varechem.amelia <- amelia(varechem.missings, m = imps)
> >
> >
> > #for each imputation of the environmental data I run a mantel test and
> save
> > #the results to mresults
> > mresults <- NULL
> >
> > for(i in 1:imps){
> > varespec.dist <- vegdist(varespec)
> > varechem.am.dist <- dist(varechem.amelia$imputations[[i]])
> > mresults <- rbind(mresults,
> > (ecodist::mantel(varespec.dist~varechem.am.dist)))
> > }
> >
> > mresults
> >
> > ##mantelr pval1 pval2 pval3 llim.2.5% ulim.97.5%
> > ## [1,] 0.2137656 0.008 0.993 0.008 0.1015176  0.3389979
> > ## [2,] 0.2162388 0.011 0.990 0.011 0.1207528  0.3346554
> > ## [3,] 0.2149556 0.012 0.989 0.016 0.1319943  0.3279028
> > ## [4,] 0.2101820 0.009 0.992 0.012 0.1217293  0.3288272
> > ## [5,] 0.2135279 0.006 0.995 0.006 0.1130386  0.3359864
> >
> > #based on these results what would be a reasonable p value to report for
> > the environmental parameters relating to the community structure?
> >
> > ##end example
> >
> > [[alternative HTML version deleted]]
> >
> > ___
> > R-sig-ecology mailing list
> > R-sig-ecology@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


[R-sig-eco] point color in NMDS (vegan)

2013-08-18 Thread Elaine Kuo
Dear List,



This is Elaine.

I am using metaMDS in package vegan to plot NMDS.

However, I want to draw the points using different colors according to
island sites.

For example:

island site 1: island B, C, D => blue

island site 2: island A, E, F => green

island site 3: island G, H, J => red



Please kindly advise how to modify the following code to classify the sites
by colors.

Thank you



Elaine



library(MASS)

  library(vegan)

  island.NMDS  <-  metaMDS(island,k=2, distfun  =  betadiver,  distance  =
"sim",trymax=100,zerodist="add")

  plot(island.NMDS,  type  =  "n")



  # species as symbols

  points(island.NMDS, display = 'species', pch = '+', cex = 0.6)



  # sites as text

  text(island.NMDS, display = 'sites')

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


Re: [R-sig-eco] Mantel test with multiple imputation, P values

2013-08-18 Thread Krzysztof Sakrejda
I think D is the easy part, it should be the mean Mantel test statistic
from all the imputed data sets.  The rest I'm not sure yet, but if you get
a good answer from someone else, I'd appreciate it if you shared it.
Krzysztof
On Aug 18, 2013 4:44 PM, "Jacob Cram"  wrote:

> Krzystof,
>Thank you for your reply.  It is promising that you think that Li's
> method should work for imputed mantel results.  That said, I am a bit over
> my head with the math in the Li et al reference.  Could you (or anyone on
> this list) provide an R-code example of how D would be calculated from the
> output of several mantel tests?
>
> Also, and forgive my ignorance if this statement is coming from the wrong
> direction, It is my understanding that mantel P values are generally
> calculated by actually permuting the rows of the similarity matrix and
> re-running the mantel test a given number of times (usually 1000).
> Accordingly, as far as I can tell there is no explicitly generated F value
> corresponding to a mantel p value.  It seems like Li's method assumes I am
> generating P from an F table.  Would it be appropriate to back calculate F
> from P, k and m?
>
> Thanks again,
> Jacob
>
>
>
> On Sun, Aug 18, 2013 at 2:19 PM, Krzysztof Sakrejda <
> krzysztof.sakre...@gmail.com> wrote:
>
>> Hi Jacob, comments below.
>>
>> On Aug 18, 2013 2:31 PM, "Jacob Cram"  wrote:
>> >
>> > Dear List,
>> >   I have an environmental data set with several missing values that
>> I
>> > am trying to relate to a community structure data set using a mantel
>> > test.   One solution to the missing data problem seems to be multiple
>> > imputation; I am using the Amelia package. This generates several (five
>> in
>> > this example) imputed data sets.  I can run mantel on each of these and
>> > come up with five similar but not identical solutions.  I figure I can
>> > average the mantel rho values.  However, I am not sure what to do about
>> the
>> > P values. From looking around online, it looks like I shouldn't take the
>> > average of p values.  I found this reference <
>> >
>> http://missingdata.lshtm.ac.uk/index.php?option=com_content&view=article&id=164:combining-p-values-from-multiple-imputations&catid=57:multiple-imputation&Itemid=98
>> >
>> > that seems to have promising suggestions, but I can't seem to figure out
>> > how I'd implement any of these in R.
>>
>> So following that link and reading the Li et al. reference it looks as
>> though the procedure is well described at the top of page 71. You get your
>> parameter estimate from the usual procedure. The test statistic, written as
>> "D", is the distance between the null value and the estimated value with
>> some scaling specified in eq. 1.17. They use the F distribution and k and m
>> (the number of imputations) degrees of freedom. I don't think you need to
>> reinvent some inferior ecologists-only procedure for this.
>>
>> Krzysztof
>>
>> I was hoping somebody might have a
>> > suggestion for how I could combine my p values.  One option, I think
>> would
>> > be to take the highest (worst) p value (in the example below, this
>> would be
>> > p = 0.012).  However for large numbers of imputations, I am believe that
>> > this method might be to conservative.  Another option might be to take
>> the
>> > p value corresponding to the median rho score (in the example below this
>> > would be p =0.008).   Thoughts?
>> > -Jacob
>> >
>> >
>> > ##Example Code Below
>> > require(Amelia)
>> > require(vegan)
>> > require(ecodist)
>> >
>> > ##Species data matrix with environmental data that are missing some
>> values.
>> > data(varespec)
>> > data(varechem)
>> > varechem.missings <- varechem[,c("N", "P", "K")]
>> > varechem.missings[c(1,5, 7, 15, 20),1] <- NA
>> > varechem.missings[c(1,2, 9, 21), 2] <- NA
>> >
>> > #I multiply impute the missing values with the Amelia package
>> > imps <- 5
>> > #imps <- 25
>> > varechem.amelia <- amelia(varechem.missings, m = imps)
>> >
>> >
>> > #for each imputation of the environmental data I run a mantel test and
>> save
>> > #the results to mresults
>> > mresults <- NULL
>> >
>> > for(i in 1:imps){
>> > varespec.dist <- vegdist(varespec)
>> > varechem.am.dist <- dist(varechem.amelia$imputations[[i]])
>> > mresults <- rbind(mresults,
>> > (ecodist::mantel(varespec.dist~varechem.am.dist)))
>> > }
>> >
>> > mresults
>> >
>> > ##mantelr pval1 pval2 pval3 llim.2.5% ulim.97.5%
>> > ## [1,] 0.2137656 0.008 0.993 0.008 0.1015176  0.3389979
>> > ## [2,] 0.2162388 0.011 0.990 0.011 0.1207528  0.3346554
>> > ## [3,] 0.2149556 0.012 0.989 0.016 0.1319943  0.3279028
>> > ## [4,] 0.2101820 0.009 0.992 0.012 0.1217293  0.3288272
>> > ## [5,] 0.2135279 0.006 0.995 0.006 0.1130386  0.3359864
>> >
>> > #based on these results what would be a reasonable p value to report for
>> > the environmental parameters relating to the community structure?
>> >
>> > ##end example
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > ___

Re: [R-sig-eco] point color in NMDS (vegan)

2013-08-18 Thread Baldwin, Jim -FS
I assume that the dataset "island" is yours so I've made that equivalent to an 
example dataset in vegan named "dune".  Below is a rather brute force way to 
assign colors.

library(MASS)
library(vegan)

data(dune)
island = dune
island.NMDS  <-  metaMDS(island,k=2, distfun  =  betadiver,
  distance  = "sim", trymax=100, zerodist="add")

plot(island.NMDS, type = "n")

# species as symbols
col = rep("black",dim(island)[2])
c = colnames(island)
col[c=="Belper" | c=="Elepal" | c=="Viclat" | c=="Brarut"] = "blue"
col[c=="Airpra" | c=="Rumace" | c=="Potpal" | c=="Tripra"] = "red"
col[c=="Sagpro" | c=="Agrsto" | c=="Hyprad" | c=="Poapra"] = "green"
points(island.NMDS, display = 'species', pch = '+', cex = 0.6, col=col)

# sites as text
col = rep("black",dim(island)[1])
r = rownames(island)
col[r=="2" | r=="17" | r=="18"] = "blue"
col[r=="3" | r=="10" | r=="20"] = "red"
text(island.NMDS, display = 'sites',col=col)

Jim Baldwin
USDA Forest Service


-Original Message-
From: r-sig-ecology-boun...@r-project.org 
[mailto:r-sig-ecology-boun...@r-project.org] On Behalf Of Elaine Kuo
Sent: Sunday, August 18, 2013 4:02 PM
To: 
Subject: [R-sig-eco] point color in NMDS (vegan)

Dear List,



This is Elaine.

I am using metaMDS in package vegan to plot NMDS.

However, I want to draw the points using different colors according to island 
sites.

For example:

island site 1: island B, C, D => blue

island site 2: island A, E, F => green

island site 3: island G, H, J => red



Please kindly advise how to modify the following code to classify the sites by 
colors.

Thank you



Elaine



library(MASS)

  library(vegan)

  island.NMDS  <-  metaMDS(island,k=2, distfun  =  betadiver,  distance  =
"sim",trymax=100,zerodist="add")

  plot(island.NMDS,  type  =  "n")



  # species as symbols

  points(island.NMDS, display = 'species', pch = '+', cex = 0.6)



  # sites as text

  text(island.NMDS, display = 'sites')

[[alternative HTML version deleted]]

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology





This electronic message contains information generated by the USDA solely for 
the intended recipients. Any unauthorized interception of this message or the 
use or disclosure of the information it contains may violate the law and 
subject the violator to civil or criminal penalties. If you believe you have 
received this message in error, please notify the sender and delete the email 
immediately.

___
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology