Hi, More questions in my ongoing quest to convert from RapidMiner to R.
One thing has become VERY CLEAR: none of the issues I'm asking about here are addressed in RapidMiner. How it handles missing values, scaling, etc. is hidden within the "black box". Using R is forcing me to take a much deeper look at my data and how my experiments are constructed. (That's a very "Good Thing".)

So, on to the question...

I'm scaling data based on groups. I have it working well in a nice loop. (This WORKS, but if someone has a faster/cleaner way, I'd be curious.)

    # group-wide normalization
    groups <- unique(rawdata$group)
    group_names <- grep('norm_', names(rawdata))
    for (group in groups) {
      for (name in group_names) {
        rawdata[rawdata$code == group, name] <- c(scale(rawdata[rawdata$code == group, name]))
      }
    }

My problem is that if the particular list of data I'm scaling is all 0, then scale() returns NaN for all of them, subsequently breaking my SVM training.

    > foo <- c(0,0,0,0,0)
    > scale(foo)
         [,1]
    [1,]  NaN
    [2,]  NaN
    [3,]  NaN
    [4,]  NaN
    [5,]  NaN
    attr(,"scaled:center")
    [1] 0
    attr(,"scaled:scale")
    [1] 0

I would have expected scale() to just return 0 for all the values.

Is there some trick to fixing this?

Thanks!

-Noah
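
For reference, one possible workaround is sketched below. It rests on an assumption the post does not confirm: that a group whose values are all identical should come out as all 0 after scaling. The helper name safe_scale and the toy data frame are hypothetical, made up for illustration. The sketch also uses ave() from base R to apply the scaling per group in place of the nested loops, which may or may not be faster on real data.

```r
# Hypothetical helper (not part of base R): z-score a numeric vector,
# but skip the division when the standard deviation is 0 or undefined,
# so a constant vector comes back as its centered values (all 0).
safe_scale <- function(x) {
  centered <- x - mean(x, na.rm = TRUE)
  s <- sd(x, na.rm = TRUE)
  if (is.na(s) || s == 0) {
    return(centered)  # avoids the 0/0 division that produces NaN
  }
  centered / s
}

# Toy data standing in for rawdata; these columns are illustrative only.
rawdata <- data.frame(
  code   = c("a", "a", "a", "b", "b", "b"),
  norm_x = c(1, 2, 3, 0, 0, 0),
  norm_y = c(10, 20, 60, 5, 7, 9)
)

# Select the columns to normalize by name prefix.
norm_cols <- grep("^norm_", names(rawdata), value = TRUE)

# ave() applies safe_scale within each level of rawdata$code and returns
# the results in the original row order.
for (col in norm_cols) {
  rawdata[[col]] <- ave(rawdata[[col]], rawdata$code, FUN = safe_scale)
}
```

With this sketch, safe_scale(c(0,0,0,0,0)) gives five zeros instead of NaN, and non-constant groups are centered and divided by their standard deviation the same way scale() does by default. The check s == 0 is an exact comparison; a small tolerance may be a better choice for data that is only nearly constant.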