This is not an explanation, but it does give you a solution.

Instead of calling lda with a formula, pass the predictor variables and the classification factor as separate arguments. Based on your example and data:



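library(MASS)  # provides lda() and predict.lda()

train <- myDat[1:10,]
outOfSample <- myDat[11:16,]

# Keep only the predictor columns v1, v2, v3 (columns 3 to 5)
train2 <- train[,3:5]
outOfSample <- outOfSample[,3:5]

# Matrix interface: predictors first, then the grouping factor
fit <- lda(train2,train$c1)

forecast <- predict(fit,outOfSample)$class

length(forecast)

[1] 6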



It seems that the problem arises when predict.lda is applied to an lda fit that was created from a formula.
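
A likely reason, offered as a sketch rather than a confirmed diagnosis: the formula in the original code names its terms train$v1, train$v2 and train$v3, so predict looks for variables with exactly those names. outOfSample has no such columns, so the values are taken from the train object in the workspace and the prediction comes back with 10 rows. Assuming that is the cause, the formula interface should also work when the formula uses bare column names together with a data argument, and newdata has columns with the same names. The data below are the 16 rows from the original post, read from inline text instead of the CSV file:

```r
library(MASS)  # provides lda() and its predict() method

# The 16 rows from the original post, read from inline text
myDat <- read.table(header = TRUE, text = "
Date Names v1 v2 v3 c1
1/31/2009 Name1 0.714472361 0.902552278 0.783353694 a
1/31/2009 Name2 0.512158919 0.770451596 0.111853346 a
1/31/2009 Name3 0.470693282 0.129200065 0.800973877 a
1/31/2009 Name4 0.24236898 0.472219638 0.486599763 b
1/31/2009 Name5 0.785619735 0.628511593 0.106868172 b
1/31/2009 Name6 0.718718387 0.697257275 0.690326648 b
1/31/2009 Name7 0.327331186 0.01715109 0.861421706 c
1/31/2009 Name8 0.632011743 0.599040196 0.320741634 c
1/31/2009 Name9 0.302804404 0.475166304 0.907143632 c
1/31/2009 Name10 0.545284813 0.967196462 0.945163717 a
1/31/2009 Name11 0.563720418 0.024862018 0.970685281 a
1/31/2009 Name12 0.357614427 0.417490445 0.415162276 a
1/31/2009 Name13 0.154971203 0.425227967 0.856866993 b
1/31/2009 Name14 0.935080173 0.488659307 0.194967973 a
1/31/2009 Name15 0.363069339 0.334206603 0.639795596 b
1/31/2009 Name16 0.862889297 0.821752532 0.549552875 a
")
myDat$c1 <- factor(myDat$c1)  # make the class label an explicit factor

train <- myDat[1:10, ]
outOfSample <- myDat[11:16, ]  # keeps the column names v1, v2, v3

# Bare column names plus data=, instead of train$v1 etc.
fit <- lda(c1 ~ v1 + v2 + v3, data = train)

# newdata is matched to the formula by column name
forecast <- predict(fit, newdata = outOfSample)$class

length(forecast)  # one prediction per row of outOfSample
```

The only changes from the original code are the formula (no train$ prefix) and keeping outOfSample as a data frame whose columns are still called v1, v2 and v3.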



Hope this helps,

Gabriela.



______________________________
Lic. María Gabriela Cendoya
Magíster en Biometría
Profesor Adjunto
Cátedra de Estadística y Diseño
Facultad de Ciencias Agrarias
Universidad Nacional de Mar del Plata
______________________________

----- Original Message ----- From: "BostonR" <dp...@capitaliq.com>
To: <r-help@r-project.org>
Sent: Tuesday, October 20, 2009 11:31 AM
Subject: [R] LDA Precdict - Seems to be predicting on the Training Data



When I import a simple dataset, run LDA, and then try to use the model to
forecast out-of-sample data, I get a forecast for the training set, not the
out-of-sample set.  Others have posted this question, but I do not see the
answers to their posts.

Here is some sample data:

Date Names v1 v2 v3 c1
1/31/2009 Name1 0.714472361 0.902552278 0.783353694 a
1/31/2009 Name2 0.512158919 0.770451596 0.111853346 a
1/31/2009 Name3 0.470693282 0.129200065 0.800973877 a
1/31/2009 Name4 0.24236898 0.472219638 0.486599763 b
1/31/2009 Name5 0.785619735 0.628511593 0.106868172 b
1/31/2009 Name6 0.718718387 0.697257275 0.690326648 b
1/31/2009 Name7 0.327331186 0.01715109 0.861421706 c
1/31/2009 Name8 0.632011743 0.599040196 0.320741634 c
1/31/2009 Name9 0.302804404 0.475166304 0.907143632 c
1/31/2009 Name10 0.545284813 0.967196462 0.945163717 a
1/31/2009 Name11 0.563720418 0.024862018 0.970685281 a
1/31/2009 Name12 0.357614427 0.417490445 0.415162276 a
1/31/2009 Name13 0.154971203 0.425227967 0.856866993 b
1/31/2009 Name14 0.935080173 0.488659307 0.194967973 a
1/31/2009 Name15 0.363069339 0.334206603 0.639795596 b
1/31/2009 Name16 0.862889297 0.821752532 0.549552875 a

Attached is the code:

myDat <-read.csv(file="f:\\Systematiq\\data\\TestData.csv",
header=TRUE,sep=",")
myData <- data.frame(myDat)

length(myDat[,1])

train <- myDat[1:10,]
outOfSample <- myDat[11:16,]
outOfSample <- (cbind(outOfSample$v1,outOfSample$v2,outOfSample$v3))
outOfSample <-data.frame(outOfSample)

length(train[,1])
length(outOfSample[,1])

fit <- lda(train$c1~train$v1+train$v2+train$v3)

forecast <- predict(fit,outOfSample)$class

length(forecast) ##### I am expecting this to be the same as length(outOfSample[,1]), which is 6

Output:

length(forecast) ##### I am expecting this to be the same as length(outOfSample[,1]), which is 6
[1] 10






--
View this message in context: http://www.nabble.com/LDA-Precdict---Seems-to-be-predicting-on-the-Training-Data-tp25976178p25976178.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

