[R] logic question

matsumotoN Tue, 07 Jul 2009 05:30:08 -0700

Setting:
200 input variables, 1 binary target variable.
Run a principal component analysis (PCA) on the data,
then
use the output of the PCA (the generated components)
as input to a neural network, but first partition the PCA output
into training and testing sets so that the neural network can be trained
on the first partition and tested on the second.
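
To make the setting concrete, here is a minimal sketch of the workflow in R, with the target kept out of the PCA. The data are simulated, and the names (X, y, k, train_idx), the number of components kept, the 70/30 split and the use of the nnet package are all assumptions for illustration, not part of the original setup.

```r
# Simulated stand-in for the real data: 200 inputs, 1 binary target.
set.seed(1)
n <- 500
p <- 200
X <- matrix(rnorm(n * p), n, p)
y <- factor(rbinom(n, 1, plogis(X[, 1] - X[, 2])))

# PCA on the 200 input variables only; the target is left out.
pca <- prcomp(X, center = TRUE, scale. = TRUE)
k <- 10                                  # components kept (arbitrary choice)
scores <- as.data.frame(pca$x[, 1:k])    # columns PC1 ... PC10
scores$y <- y

# Partition the PCA output into training and testing sets.
train_idx <- sample(n, size = 0.7 * n)
train <- scores[train_idx, ]
test  <- scores[-train_idx, ]

# Train on the first partition, test on the second.
# nnet is assumed to be installed; the block is skipped otherwise.
if (requireNamespace("nnet", quietly = TRUE)) {
  fit  <- nnet::nnet(y ~ ., data = train, size = 5, maxit = 200, trace = FALSE)
  pred <- predict(fit, newdata = test, type = "class")
  acc  <- mean(pred == test$y)           # test-set accuracy
}
```

Note that in this sketch the PCA is still fitted on all records before the split, as described above.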


I was told that it is not logically sound to include the target variable as
an input to the principal component analysis.

Normally that sounds correct: you never want to include the target variable
as an input variable in your model.
However, I argued that it is OK here because I am only using the target
variable to build the principal components that the model takes as input. Each
record then has a value for each of the principal components. I then take the
training partition only to build the neural network, and test the neural
network on the testing partition.
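
The variant in dispute can be sketched as follows, again on simulated data with hypothetical names (Z, pca_with_y, flipped). It only shows how the component scores are computed when the target is one of the PCA columns; it is an illustration under these assumptions, not a full analysis.

```r
# Simulated stand-in for the real data: 200 inputs, 1 binary target.
set.seed(1)
n <- 500
p <- 200
X <- matrix(rnorm(n * p), n, p)
colnames(X) <- paste0("x", seq_len(p))
y <- rbinom(n, 1, plogis(X[, 1] - X[, 2]))

# Variant in question: the target is one of the columns given to the PCA.
Z <- cbind(X, target = y)
pca_with_y <- prcomp(Z, center = TRUE, scale. = TRUE)

# Weight of the target column in each of the first five components.
target_loadings <- pca_with_y$rotation["target", 1:5]

# Flip the target of one record and recompute its scores: every component
# with a nonzero target loading changes, whichever partition the record
# is later assigned to.
flipped <- Z
flipped[1, "target"] <- 1 - flipped[1, "target"]
score_orig    <- predict(pca_with_y, newdata = Z[1, , drop = FALSE])[, 1:5]
score_flipped <- predict(pca_with_y, newdata = flipped[1, , drop = FALSE])[, 1:5]
score_shift   <- score_flipped - score_orig
```

So each record's component values, including those of records that end up in the testing partition, are computed in part from that record's own target value.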

Is this wrong?

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
