Hi,

Please read the posting guide. You are not likely to get an extensive answer to your question from this list. Your question is a "please solve/explain my statistical problem for me" question. There are two problems with that: first, "statistical", and second, "please solve for me."
First, the R-help list is mostly concerned with problems in implementing analyses in R, not with the choice of statistical approach per se (with few exceptions). Second, "please solve for me" questions are generally frowned upon unless you show a specific point at which you are stuck and have to make a choice. That is, the list members want to see that you have done your "homework" to the extent one can expect you to. Asking the list to provide an introduction to data reduction methods without having any background knowledge is, frankly, a waste of your time and the list members' time. There are books on the topic, which you can buy or borrow, and certainly many online sources to give you a basic background. Or you can start here: http://en.wikipedia.org/wiki/Dimension_reduction.

If you want your statistical questions answered and problems solved without reading up on the matter yourself, your question is more suitable for a local statistician at your institution or a paid service than for this list.

Best,
Daniel

-------------------------
cuncta stricte discussurus
-------------------------

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of rubystallion
Sent: Wednesday, January 13, 2010 11:57 AM
To: r-help@r-project.org
Subject: [R] Method for reduction of independent variables

Hello

I am currently investigating software code metrics for a variety of software projects of a company to determine the worst parts of software products according to specified quality characteristics. As the gathering of metrics correlates with effort, I would like to find a subset of the metrics preserving significant predictive power for the "problem value" while using the fewest code metrics. I have the results of 25 metrics for 6 software projects for a combined 9355 "individuals", i.e. software parts with metrics.
However, as many metrics only measure metric values above a predefined limit, 58% of the responses for independent variables are 0.

Which method can I use to determine a reduced set of independent variables with significant predictive power? As I do not have a statistics background, I would also appreciate a simple explanation of the chosen method and sensible choices for parameters, so that I will be able to infer the reduced set of software metrics to keep.

Thank you in advance!
Johannes

--
View this message in context: http://n4.nabble.com/Method-for-reduction-of-independent-variables-tp1013171p1013171.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.