Michal Figurski wrote:
Hello all,
I am trying to optimize my logistic regression model using the bootstrap.
I previously used SAS for this kind of task, but I am now
switching to R.
My data frame consists of 5 columns and has 109 rows. Each row is a
single record composed of the following values: Subject_name, numeric1,
numeric2, numeric3 and outcome (yes or no). All three numerics are used
to predict outcome using LR.
In SAS I wrote a macro that split the dataset, ran LR on one half of
the data and made predictions on the other half. It then collected the
equation coefficients from each bootstrap iteration. Afterwards I took
the medians of these coefficients over all iterations and used them as
the optimal model. It worked really well!
Why not use maximum likelihood estimation, i.e., the coefficients from
the original fit? How does the bootstrap improve on that?
Now I want to do the same in R. I tried to use the 'validate' or
'calibrate' functions from package "Design", and I also experimented
with function 'sm.binomial.bootstrap' from package "sm". I also tried
the function 'boot' from package "boot", though without success: in my
case it randomly selected _columns_ from my data frame, while I wanted
it to select _rows_.
validate and calibrate in Design do resample the rows.
Resampling is mainly used to get a nearly unbiased estimate of the model
performance, i.e., to correct for overfitting.
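To make the overfitting correction concrete, here is a minimal base-R sketch
of the optimism bootstrap. It is not the code inside validate; the simulated
data frame `dat`, the helper `brier`, the choice of the Brier score as the
performance measure, and B = 200 are all assumptions made for illustration.

```r
set.seed(2)

# Hypothetical stand-in for the real data: 109 rows, three numeric
# predictors and a 0/1 outcome.
n <- 109
dat <- data.frame(numeric1 = rnorm(n), numeric2 = rnorm(n), numeric3 = rnorm(n))
dat$outcome <- rbinom(n, 1, plogis(0.8 * dat$numeric1 - 0.5 * dat$numeric2))

form <- outcome ~ numeric1 + numeric2 + numeric3

# Brier score: mean squared difference between predicted probability
# and observed outcome (lower is better).
brier <- function(fit, newdata) {
  p <- predict(fit, newdata = newdata, type = "response")
  mean((p - newdata$outcome)^2)
}

# Apparent performance: model fitted and evaluated on the same data.
full_fit <- glm(form, data = dat, family = binomial)
apparent <- brier(full_fit, dat)

# Optimism: for each bootstrap sample of ROWS, refit the model and compare
# its error on the original data with its error on the bootstrap sample.
B <- 200
optimism <- replicate(B, {
  idx <- sample(nrow(dat), replace = TRUE)
  boot_dat <- dat[idx, ]
  boot_fit <- glm(form, data = boot_dat, family = binomial)
  brier(boot_fit, dat) - brier(boot_fit, boot_dat)
})

# Optimism-corrected Brier score for the ORIGINAL fit.
corrected <- apparent + mean(optimism)
c(apparent = apparent, corrected = corrected)
```

Note that the coefficients reported are still those of full_fit; the
resampling only adjusts the estimate of how well that fit will perform.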
Frank Harrell
Though the main point here is the optimized LR equation. I would
appreciate any help on how to extract the LR equation coefficients from
any of these bootstrap functions, in the same form as given by 'glm' or
'lrm'.
Many thanks in advance!
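On extracting coefficients: a minimal sketch with 'boot' follows, assuming
the column problem came from indexing the data frame as d[idx] (which picks
columns) rather than d[idx, ] (which picks rows). The simulated `dat`, the
helper name `coef_fun` and R = 200 are made up for illustration.

```r
library(boot)  # recommended package, shipped with standard R installations

set.seed(1)

# Hypothetical stand-in for the real data: 109 rows, three numeric
# predictors and a 0/1 outcome.
n <- 109
dat <- data.frame(numeric1 = rnorm(n), numeric2 = rnorm(n), numeric3 = rnorm(n))
dat$outcome <- rbinom(n, 1, plogis(0.5 * dat$numeric1 - 0.3 * dat$numeric2))

# Statistic for boot(): refit the model on the resampled ROWS and
# return the coefficient vector. Note d[idx, ], not d[idx].
coef_fun <- function(d, idx) {
  fit <- glm(outcome ~ numeric1 + numeric2 + numeric3,
             data = d[idx, ], family = binomial)
  coef(fit)
}

b <- boot(dat, coef_fun, R = 200)

# Coefficients of the original (maximum likelihood) fit, named as in glm.
b$t0

# One row per bootstrap replicate, one column per coefficient.
boot_coefs <- b$t
colnames(boot_coefs) <- names(b$t0)
head(boot_coefs)

# Bootstrap standard errors of the coefficients.
apply(boot_coefs, 2, sd)
```

Here b$t0 is the same vector that coef() gives for a glm fit to the full
data, and boot_coefs holds the per-replicate coefficients if you want to
summarize them yourself.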
--
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University