When I import a simple dataset, fit an LDA model, and then try to use the
model to predict on out-of-sample data, I get predictions for the training
set, not the out-of-sample set.  Others have posted this question, but I do
not see any answers to their posts.

Here is some sample data:

Date    Names   v1      v2      v3      c1
1/31/2009       Name1   0.714472361     0.902552278     0.783353694     a
1/31/2009       Name2   0.512158919     0.770451596     0.111853346     a
1/31/2009       Name3   0.470693282     0.129200065     0.800973877     a
1/31/2009       Name4   0.24236898      0.472219638     0.486599763     b
1/31/2009       Name5   0.785619735     0.628511593     0.106868172     b
1/31/2009       Name6   0.718718387     0.697257275     0.690326648     b
1/31/2009       Name7   0.327331186     0.01715109      0.861421706     c
1/31/2009       Name8   0.632011743     0.599040196     0.320741634     c
1/31/2009       Name9   0.302804404     0.475166304     0.907143632     c
1/31/2009       Name10  0.545284813     0.967196462     0.945163717     a
1/31/2009       Name11  0.563720418     0.024862018     0.970685281     a
1/31/2009       Name12  0.357614427     0.417490445     0.415162276     a
1/31/2009       Name13  0.154971203     0.425227967     0.856866993     b
1/31/2009       Name14  0.935080173     0.488659307     0.194967973     a
1/31/2009       Name15  0.363069339     0.334206603     0.639795596     b
1/31/2009       Name16  0.862889297     0.821752532     0.549552875     a

Attached is the code:

library(MASS)  # provides lda()

myDat <- read.csv(file="f:\\Systematiq\\data\\TestData.csv",
                  header=TRUE, sep=",")

length(myDat[,1])

train <- myDat[1:10,]
outOfSample <- myDat[11:16,]
outOfSample <- (cbind(outOfSample$v1,outOfSample$v2,outOfSample$v3))
outOfSample <-data.frame(outOfSample)

length(train[,1])
length(outOfSample[,1])

fit <- lda(train$c1~train$v1+train$v2+train$v3)

forecast <- predict(fit,outOfSample)$class

length(forecast)  ##### I am expecting this to be the same as
                  ##### length(outOfSample[,1]), which is 6

Output:

length(forecast)  ##### I am expecting this to be the same as
                  ##### length(outOfSample[,1]), which is 6
[1] 10
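
A possible explanation, offered as a sketch rather than a confirmed diagnosis: predict() matches the new data to the model by the variable names used in the formula. Terms written as train$v1 always resolve to the training frame, so the new data would be ignored and 10 predictions returned. The cbind() step also replaces the column names v1, v2, v3 with generated ones. The version below assumes the 16 sample rows above, entered inline instead of read from the CSV file, and uses bare column names with the data argument:

```r
library(MASS)  # lda() lives in MASS, which ships with R

# The 16 sample rows from the post, entered inline (assumption: this stands
# in for the CSV file, so the sketch is self-contained).
myDat <- data.frame(
  v1 = c(0.714472361, 0.512158919, 0.470693282, 0.24236898,
         0.785619735, 0.718718387, 0.327331186, 0.632011743,
         0.302804404, 0.545284813, 0.563720418, 0.357614427,
         0.154971203, 0.935080173, 0.363069339, 0.862889297),
  v2 = c(0.902552278, 0.770451596, 0.129200065, 0.472219638,
         0.628511593, 0.697257275, 0.01715109,  0.599040196,
         0.475166304, 0.967196462, 0.024862018, 0.417490445,
         0.425227967, 0.488659307, 0.334206603, 0.821752532),
  v3 = c(0.783353694, 0.111853346, 0.800973877, 0.486599763,
         0.106868172, 0.690326648, 0.861421706, 0.320741634,
         0.907143632, 0.945163717, 0.970685281, 0.415162276,
         0.856866993, 0.194967973, 0.639795596, 0.549552875),
  c1 = factor(c("a", "a", "a", "b", "b", "b", "c", "c",
                "c", "a", "a", "a", "b", "a", "b", "a"))
)

train <- myDat[1:10, ]
# Subset by column name so the predictors keep the names v1, v2, v3.
outOfSample <- myDat[11:16, c("v1", "v2", "v3")]

# Bare variable names plus data=, so predict() can find the same
# variables in newdata.
fit <- lda(c1 ~ v1 + v2 + v3, data = train)

forecast <- predict(fit, newdata = outOfSample)$class
length(forecast)  # 6, one prediction per out-of-sample row
```

With this form the number of predictions follows the number of rows in newdata, not the training set.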

-- 
View this message in context: 
http://www.nabble.com/LDA-Precdict---Seems-to-be-predicting-on-the-Training-Data-tp25976178p25976178.html
Sent from the R help mailing list archive at Nabble.com.
