Hi,

More questions in my ongoing quest to convert from RapidMiner to R.

One thing has become VERY CLEAR:  None of the issues I'm asking about 
here are addressed in RapidMiner.  How it handles missing values, 
scaling, etc. is hidden within the "black box".  Using R is forcing me 
to take a much deeper look at my data and how my experiments are 
constructed.  (That's a very "Good Thing".)

So, on to the question...

I'm scaling data based on groups.  I have it working well in a nice 
loop.  (This WORKS, but if someone has a faster/cleaner way, I'd be 
curious.)

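# group-wide normalization
groups <- unique(rawdata$group)
# column indices of the variables to scale (names containing 'norm_')
group_names <- grep('norm_', names(rawdata))
for (group in groups) {
     for (name in group_names) {
         # c() drops the matrix shape and attributes that scale() returns
         rawdata[rawdata$code == group, name] <-
             c(scale(rawdata[rawdata$code == group, name]))
     }
}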
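
For comparison, here is a rough sketch of a loop-free version using base R's ave(), which applies a function within each group and returns the results in the original row order. The toy data frame d and its values are made up for illustration, and the grouping column is assumed to be code, as in the loop above.

```r
# Hypothetical toy data standing in for rawdata
d <- data.frame(code   = c("a", "a", "a", "b", "b", "b"),
                norm_x = c(1, 2, 3, 10, 20, 30),
                norm_y = c(5, 5, 8, 1, 4, 7))

# Names of the columns to scale (those containing 'norm_')
norm_cols <- grep("norm_", names(d), value = TRUE)

# ave() applies FUN within each level of d$code and keeps row order;
# c() drops the matrix shape that scale() returns
d[norm_cols] <- lapply(d[norm_cols], function(x) {
    ave(x, d$code, FUN = function(v) c(scale(v)))
})
```

Like the loop, this still returns NaN for any group whose values are all identical, because scale() is unchanged.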


My problem is that if the particular list of data I'm scaling is all 0, 
then scale returns NaN for all of them, subsequently breaking my SVM 
training.
 > foo <- c(0,0,0,0,0)
 > scale(foo)
      [,1]
[1,]  NaN
[2,]  NaN
[3,]  NaN
[4,]  NaN
[5,]  NaN
attr(,"scaled:center")
[1] 0
attr(,"scaled:scale")
[1] 0


I would have expected scale to just return 0 for all the values.
Is there some trick to fixing this?
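
For illustration, here is a minimal sketch of the behaviour I was expecting. safe_scale is a made-up helper name, not something in base R, and treating a zero (or undefined) standard deviation as "just center the values" is my own assumption about what the right result should be.

```r
# safe_scale is a hypothetical helper, not part of base R
safe_scale <- function(x) {
    s <- sd(x, na.rm = TRUE)
    centered <- x - mean(x, na.rm = TRUE)
    # A constant vector has sd 0 (and sd is NA with fewer than two values);
    # return the centered values instead of dividing by zero
    if (is.na(s) || s == 0) return(centered)
    centered / s
}

safe_scale(c(0, 0, 0, 0, 0))   # constant input: all zeros, no NaN
safe_scale(c(1, 2, 3))         # non-constant input: centered and divided by sd
```

In the loop above, c(scale(...)) could then be swapped for safe_scale(...).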

Thanks!

-Noah


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
