This is really a statistics problem, so I wonder which R packages can be 
employed best to solve and visualize it.

I run a lot of simulations to approach the truth.  The truth is a result of 
very complex computations, and is a real number.  The closer it is to 0, the 
truthier it is.

Each simulation has a set of features, some of which are not available for all 
simulations.  Some of the features are numeric (week), some are boolean 
(utility), and others are factors.

Each simulation has a final value, stored in the dm column of the data frame.  
The simulation names are the row names of the data frame, and the feature 
names are the column names.  Here's the data frame:

http://dl.dropbox.com/u/9300701/Data/sf.dm.pos.r

After downloading the file, you can read it into R with

sf <- read.table("sf.dm.pos.r")
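
In case the link is unreachable, here is a rough sketch of a toy data frame 
with the same general layout.  Everything in it is made up: the values are 
random, and the factor column "method" and the "sim1", "sim2", ... row names 
are hypothetical stand-ins.  Only dm, week and utility are names from my 
actual data.

```r
# Toy stand-in for sf: random values, only the layout is meant to match.
set.seed(1)
n <- 20
toy <- data.frame(
  week    = sample(1:10, n, replace = TRUE),            # numeric feature
  utility = sample(c(TRUE, FALSE), n, replace = TRUE),  # boolean feature
  method  = factor(sample(c("a", "b", "c"), n,
                          replace = TRUE)),             # hypothetical factor
  dm      = rnorm(n),                                   # final value; closer to 0 is better
  row.names = paste0("sim", seq_len(n))                 # hypothetical simulation names
)

# Some features are not available for every simulation.
toy$week[c(3, 7)] <- NA

str(toy)
```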

Truth-seeking questions:

-- What kinds of GLM and other models can we run to determine which features 
contribute most to the truth, i.e. to making dm closer to 0?

-- What kind of clustering can highlight the features that contribute most?

-- What kind of visualizations can be used to make it clear which features 
affect the truth the most, and in which combinations?  What kind of color 
visualizations are there to make the truth even clearer?
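
To make the first question concrete, the naive baseline I can think of is to 
regress the distance from 0, abs(dm), on the features with lm() and read off 
the signs of the coefficients.  Below is a minimal sketch on made-up data; the 
factor "method" and the way dm is generated are assumptions for illustration 
only, not my real simulations.  I am not claiming this is the right model, 
which is exactly why I am asking.

```r
# Made-up data in which utility = TRUE tends to give dm closer to 0.
set.seed(2)
n <- 200
toy <- data.frame(
  week    = sample(1:10, n, replace = TRUE),
  utility = sample(c(TRUE, FALSE), n, replace = TRUE),
  method  = factor(sample(c("a", "b", "c"), n, replace = TRUE))  # hypothetical factor
)
toy$dm <- rnorm(n, sd = ifelse(toy$utility, 0.5, 2))

# "Closer to 0 is better", so the response is the distance from 0.
# A negative coefficient suggests the feature goes with dm nearer to 0.
fit <- lm(abs(dm) ~ week + utility + method, data = toy)
summary(fit)
```

Rows with missing features are dropped by lm() by default (na.action = 
na.omit), which may throw away a lot of my simulations, so advice on handling 
that is welcome too.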

Cheers,
Alexy


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
