Michal Figurski wrote:
Hello all,

I am trying to optimize my logistic regression model using the bootstrap. I previously used SAS for this kind of task, but I am now switching to R.

My data frame consists of 5 columns and has 109 rows. Each row is a single record composed of the following values: Subject_name, numeric1, numeric2, numeric3 and outcome (yes or no). All three numerics are used to predict outcome using LR.

In SAS I wrote a macro that split the dataset, ran LR on one half of the data, and made predictions on the other half. It collected the equation coefficients from each bootstrap iteration. I then took the medians of these coefficients across all iterations and used them as the optimal model - it really worked well!

Why not use maximum likelihood estimation, i.e., the coefficients from the original fit? How does the bootstrap improve on that?


Now I want to do the same in R. I tried the 'validate' and 'calibrate' functions from package "Design", and I also experimented with the function 'sm.binomial.bootstrap' from package "sm". I also tried the function 'boot' from package "boot", though without success - in my case it randomly selected _columns_ from my data frame, while I wanted it to select _rows_.

validate and calibrate in Design resample the rows.

Resampling is mainly used to get a nearly unbiased estimate of the model performance, i.e., to correct for overfitting.
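
To make the overfitting correction concrete, here is a minimal base-R sketch of a bootstrap optimism estimate for the C statistic (ROC area). It is an illustration only, not the code inside validate: the data are simulated stand-ins with the column names from the post, and c_index, B and the seed are names and values assumed for the example.

```r
set.seed(2)

# Simulated stand-in for the 109-row data set (hypothetical values)
n <- 109
d <- data.frame(numeric1 = rnorm(n), numeric2 = rnorm(n), numeric3 = rnorm(n))
d$outcome <- rbinom(n, 1, plogis(0.8 * d$numeric1 - 0.5 * d$numeric2))

# C statistic (area under the ROC curve) via the rank-sum formula
c_index <- function(p, y) {
  r  <- rank(p)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

form <- outcome ~ numeric1 + numeric2 + numeric3

# Apparent performance: model fitted and evaluated on the same data
fit      <- glm(form, family = binomial, data = d)
apparent <- c_index(predict(fit, type = "response"), d$outcome)

B <- 200
optimism <- replicate(B, {
  idx    <- sample(nrow(d), replace = TRUE)  # resample rows with replacement
  boot_d <- d[idx, ]
  bfit   <- glm(form, family = binomial, data = boot_d)
  # Performance of the bootstrap model on its own sample ...
  c_boot <- c_index(predict(bfit, type = "response"), boot_d$outcome)
  # ... and on the original data
  c_orig <- c_index(predict(bfit, newdata = d, type = "response"), d$outcome)
  c_boot - c_orig
})

# Optimism-corrected estimate of the C statistic
corrected <- apparent - mean(optimism)
c(apparent = apparent, optimism = mean(optimism), corrected = corrected)
```

The coefficients reported are still those of the original fit; the resampling only adjusts the estimate of how well that fit is likely to perform on new data. With Design, the packaged route would be validate() applied to a model fitted with lrm(..., x=TRUE, y=TRUE).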

Frank Harrell


Though the main point here is the optimized LR equation. I would appreciate any help on how to extract the LR equation coefficients from any of these bootstrap functions, in the same form as given by 'glm' or 'lrm'.
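
For reference, a hedged sketch of how 'boot' can be made to resample rows and return coefficients in the same form as 'glm'. The statistic function receives the data and a vector of row indices, so it has to subset with data[idx, ]; writing data[idx] on a data frame picks columns instead, which may be what happened above. The data here are simulated, and coef_fun and R = 200 are assumptions made for the example.

```r
library(boot)  # recommended package, shipped with R

set.seed(1)

# Simulated stand-in for the 109-row data frame (hypothetical values)
n <- 109
d <- data.frame(numeric1 = rnorm(n), numeric2 = rnorm(n), numeric3 = rnorm(n))
d$outcome <- rbinom(n, 1, plogis(0.5 * d$numeric1 - 0.3 * d$numeric2))

# Statistic for boot(): refit the model on the resampled ROWS
coef_fun <- function(data, idx) {
  fit <- glm(outcome ~ numeric1 + numeric2 + numeric3,
             family = binomial, data = data[idx, ])  # note the comma
  coef(fit)
}

b <- boot(d, coef_fun, R = 200)

b$t0                          # coefficients from the original (ML) fit
colnames(b$t) <- names(b$t0)  # one row per bootstrap replicate
boot_medians <- apply(b$t, 2, median)
boot_medians
```

This only shows the mechanics of getting the per-replicate coefficients out of b$t. It does not by itself show that their medians improve on the maximum likelihood estimates in b$t0.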

Many thanks in advance!



--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
