- Reduce the model to a reasonable size, with far fewer variables than
observations.
- Code factors as factors rather than as numerics.
- Don't use variables that are perfectly correlated with others, and don't
include any duplicates.
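
The last two points could be checked along the following lines. This is only a rough sketch in base R: the data frame `dat` and its columns (`a`, `b`, `a_copy`, `a_scaled`) are made up for illustration and merely stand in for the real data, and the correlation cutoff is an arbitrary choice.

```r
# Hypothetical toy data (all names here are illustrative, not from the thread)
set.seed(1)
dat <- data.frame(
  Cluster = rep(1:3, each = 10),  # group labels stored as plain numbers
  a = rnorm(30),
  b = rnorm(30)
)
dat$a_copy   <- dat$a             # exact duplicate of a
dat$a_scaled <- 2 * dat$a + 1     # perfectly correlated with a

# Code the grouping variable as a factor rather than as a numeric
dat$Cluster <- factor(dat$Cluster)

# Drop exact duplicate columns among the predictors
preds <- dat[, setdiff(names(dat), "Cluster")]
preds <- preds[, !duplicated(as.list(preds)), drop = FALSE]

# Drop the later member of each (near-)perfectly correlated pair
cm <- abs(cor(preds))
cm[lower.tri(cm, diag = TRUE)] <- 0
redundant <- apply(cm, 2, function(col) any(col > 1 - 1e-8))
preds <- preds[, !redundant, drop = FALSE]

names(preds)  # the predictors that survive both filters
```

Pairwise correlation only catches two-variable redundancy; a variable that is a linear combination of several others would need a rank check on the whole predictor matrix.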
Best,
Uwe Ligges
On 17.05.2011 15:46, Songer, Katherine B - DNR wrote:
Uwe,
Thank you very much for looking at this. I'm attaching the data, in case you
have any wisdom on why variables 10, 38, and 42 would appear constant.
Meanwhile, I'll remove colinear variables and read up a little more...
Thanks,
Katie
-----Original Message-----
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de]
Sent: Tuesday, May 17, 2011 04:25 AM
To: Songer, Katherine B - DNR
Cc: r-help@r-project.org
Subject: Re: [R] Linear Discriminant Analysis error: "Variables appear constant"
On 16.05.2011 22:07, Songer, Katherine B - DNR wrote:
Hi R experts,
I'm attempting to run Linear Discriminant Analysis using the lda function in
the MASS package. I've got around 50 predictor variables and one response
variable. My response variable has 5 numeric categories that represent
different clusters of fish abundance data (clusters were developed using
Bray-Curtis and NMDS), and my predictor variables are environmental variables
that might influence the fish data. These data all came from 68 sampling
locations.
I'm getting an error message:
DALogFish<-lda(Cluster~DrainArea+Flow+StrmWidth+Gradient+NatComm+FishIBIUsed+
QHEI+QHEIsub+QHEImwh_h+QHEIcov+QHEIchan+QHEIrip+QHEIpool+QHEIrif+QHEIgrads+
QHEIgradv+QHEImwh+QHEIcovtype+QHEIwwh+QHab+QHabBuff+QHabEros+QhabPool+
QHabWDRatio+QHabRif+QHabFines+QHabCov+QHabRating+QHabSize+TP+TKN+NH3+NH3Min+
NO3NO2N+BOD+TSS+TSSMax+TDS+SSC+SSCMax+Chloride+Sulfate+Ecoli+ChlA+DOper+
DOperMin+DOperMin1_5+DOmgL+DOmgLMean+DOmgLMax+Cond+pH+pHMax+Trans+Temp+
TempMin+Temp4+Crop100+Crop500+CropSub+Dev100+Dev500+DevSub+For100+For500+
ForSub+Pas100+Pas500+PasSub+Wat100+Wat500+WatSub+Wet100+Wet500+WetSub+
Undev100+Undev500+UndevTotal+Undev100NoPas+Undev500NoPas+UndevTotNoPas,
data=AllData1, na.action="na.omit", CV=TRUE)
Error in lda.default(x, grouping, ...) :
variables 10 38 42 appear to be constant within groups
When I look at the variables listed, they don't appear "constant within the
groups" to me.
We do not know, since we do not have the data.
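
Without the data, one way to see what the message is complaining about is to compute the within-group spread by hand. The sketch below uses invented toy data (`g`, `x`, `v1`, `v2` are all hypothetical) and approximates the kind of check lda performs: each group's mean is subtracted, and a column is flagged when the standard deviation of what remains falls below a tolerance. The value 1e-4 is assumed here to match the default of lda's `tol` argument.

```r
# Hypothetical toy data: v2 varies overall but is constant inside each group
g <- factor(rep(c("A", "B", "C"), each = 4))
x <- cbind(
  v1 = c(1.2, 0.7, 1.9, 1.1, 2.3, 2.8, 2.1, 3.0, 0.2, 0.9, 0.4, 0.6),
  v2 = rep(c(5, 7, 9), each = 4)
)

# Subtract each group's mean from its members, column by column
centered <- x - apply(x, 2, function(col) ave(col, g))

# Pooled within-group standard deviation of every column
within_sd <- sqrt(diag(var(centered)))

# Columns below the tolerance are the ones reported as "constant within groups"
tol <- 1e-4  # assumed to match lda's default tolerance
flagged <- which(within_sd < tol)
flagged
```

Two things may explain why a variable looks non-constant by eye yet is still flagged: rows with missing values are dropped before the check, so the remaining rows can happen to share one value within each group, and the reported numbers index the columns of the predictor matrix built from the formula, which need not match the column positions in the original data frame.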
I'm new to LDA and am wondering what this error means... Are my data
somehow not in the right format? Should I remove colinear variables?
(All variables have been normalized.)
Yes, colinear variables should be removed. Note also that you have roughly as
many variables in the model as observations (or even more).
This won't work either. I think you should read a textbook on the mechanisms
behind LDA.
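
To make the variables-versus-observations point concrete, here is a small sketch on simulated data. The data frame `dat` and its dimensions are invented, chosen only to resemble the 68 sites and roughly 50 predictors described above. It counts the rows that survive na.omit and compares the rank of the centered predictor matrix with the number of columns.

```r
# Hypothetical data: 68 rows and 50 numeric predictors, some values missing
set.seed(42)
n <- 68
p <- 50
dat <- as.data.frame(matrix(rnorm(n * p), nrow = n))
dat[1:10, 1]  <- NA               # missing values in one column
dat[11:20, 2] <- NA               # and in another
dat[[p]] <- dat[[3]] + dat[[4]]   # last column is a linear combination

# Rows actually available once incomplete cases are dropped
complete <- na.omit(dat)
n_used <- nrow(complete)
p_used <- ncol(complete)

# Rank of the centered predictor matrix; anything below p_used means
# some predictors are linear combinations of others in the usable rows
rank_x <- qr(scale(as.matrix(complete), scale = FALSE))$rank

c(rows = n_used, predictors = p_used, rank = rank_x)
```

When the rank is smaller than the number of predictors, the within-group covariance matrix is singular, which is the situation the advice to shrink the model is meant to avoid.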
Uwe Ligges
Thanks very much!
Katie
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.