> install.packages('fortunes')
> library(fortunes)
> fortune(28)

> -----Original Message-----
> From: Marc Schwartz [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, July 22, 2008 1:29 PM
> To: Michal Figurski
> Cc: Doran, Harold; r-help@r-project.org; Frank E Harrell Jr; Bert Gunter
> Subject: Re: [R] Coefficients of Logistic Regression from bootstrap - how to get them?
>
> Michal,
>
> With all due respect, you have openly acknowledged that you don't know
> enough about the subject at hand.
>
> If that is the case, on what basis are you in a position to challenge
> the collective wisdom of those professionals who have voluntarily
> offered *expert* level statistical advice to you?
>
> You have erected a wall around your thinking.
>
> You may choose to use R or any other software application to
> "Git-R-Done". But that does not make it correct.
>
> There are other methods to consider that could be used during the
> model building process itself, rather than on a post-hoc basis and I
> would specifically refer you to Frank's book, Regression Modeling
> Strategies:
>
> http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/RmS
>
> Marc Schwartz
>
> on 07/22/2008 09:43 AM Michal Figurski wrote:
> > Hmm...
> >
> > It sounds like ideology to me. I was asking for technical help. I know
> > what I want to do, just don't know how to do it in R. I'll go back to
> > SAS then. Thank you.
> >
> > --
> > Michal J. Figurski
> >
> > Doran, Harold wrote:
> >> I think the answer has been given to you. If you want to continue to
> >> ignore that advice and use bootstrap for point estimates rather than
> >> the properties of those estimates (which is what bootstrap is for)
> >> then you are on your own.
> >>> -----Original Message-----
> >>> From: [EMAIL PROTECTED]
> >>> [mailto:[EMAIL PROTECTED] On Behalf Of Michal Figurski
> >>> Sent: Tuesday, July 22, 2008 9:52 AM
> >>> To: r-help@r-project.org
> >>> Subject: Re: [R] Coefficients of Logistic Regression from bootstrap
> >>> - how to get them?
> >>>
> >>> Dear all,
> >>>
> >>> I don't want to argue with anybody about words or about what
> >>> bootstrap is suitable for - I know too little for that.
> >>>
> >>> All I need is help to get the *equation coefficients* optimized by
> >>> bootstrap - either by one of the functions or by simple median.
> >>>
> >>> Please help,
> >>>
> >>> --
> >>> Michal J. Figurski
> >>> HUP, Pathology & Laboratory Medicine Xenobiotics Toxicokinetics
> >>> Research Laboratory 3400 Spruce St. 7 Maloney Philadelphia, PA 19104
> >>> tel. (215) 662-3413
> >>>
> >>> Frank E Harrell Jr wrote:
> >>>> Michal Figurski wrote:
> >>>>> Frank,
> >>>>>
> >>>>> "How does bootstrap improve on that?"
> >>>>>
> >>>>> I don't know, but I have an idea. Since the data in my set are just a
> >>>>> small sample of a big population, then if I use my whole dataset to
> >>>>> obtain max likelihood estimates, these estimates may be best for this
> >>>>> dataset, but far from ideal for the whole population.
> >>>> The bootstrap, being a resampling procedure from your sample, has the
> >>>> same issues about the population as MLEs.
> >>>>
> >>>>> I used bootstrap to virtually increase the size of my dataset, it
> >>>>> should result in estimates more close to that from the population -
> >>>>> isn't it the purpose of bootstrap?
> >>>> No
> >>>>
> >>>>> When I use such median coefficients on another dataset (another
> >>>>> sample from population), the predictions are better, than using max
> >>>>> likelihood estimates. I have already tested that and it worked!
> >>>> Then your testing procedure is probably not valid.
> >>>>
> >>>>> I am not a statistician and I don't feel what "overfitting" is, but
> >>>>> it may be just another word for the same idea.
> >>>>>
> >>>>> Nevertheless, I would still like to know how can I get the
> >>>>> coefficients for the model that gives the "nearly unbiased estimates".
> >>>>> I greatly appreciate your help.
> >>>> More info in my book Regression Modeling Strategies.
> >>>>
> >>>> Frank
> >>>>
> >>>>> --
> >>>>> Michal J. Figurski
> >>>>> HUP, Pathology & Laboratory Medicine Xenobiotics Toxicokinetics
> >>>>> Research Laboratory 3400 Spruce St. 7 Maloney Philadelphia, PA
> >>>>> 19104 tel. (215) 662-3413
> >>>>>
> >>>>> Frank E Harrell Jr wrote:
> >>>>>> Michal Figurski wrote:
> >>>>>>> Hello all,
> >>>>>>>
> >>>>>>> I am trying to optimize my logistic regression model by using
> >>>>>>> bootstrap. I was previously using SAS for this kind of tasks, but I
> >>>>>>> am now switching to R.
> >>>>>>>
> >>>>>>> My data frame consists of 5 columns and has 109 rows. Each row is a
> >>>>>>> single record composed of the following values: Subject_name,
> >>>>>>> numeric1, numeric2, numeric3 and outcome (yes or no). All three
> >>>>>>> numerics are used to predict outcome using LR.
> >>>>>>>
> >>>>>>> In SAS I have written a macro, that was splitting the dataset,
> >>>>>>> running LR on one half of data and making predictions on second
> >>>>>>> half. Then it was collecting the equation coefficients from each
> >>>>>>> iteration of bootstrap. Later I was just taking medians of these
> >>>>>>> coefficients from all iterations, and used them as an optimal model
> >>>>>>> - it really worked well!
> >>>>>> Why not use maximum likelihood estimation, i.e., the coefficients
> >>>>>> from the original fit. How does the bootstrap improve on that?
> >>>>>>
> >>>>>>> Now I want to do the same in R. I tried to use the 'validate' or
> >>>>>>> 'calibrate' functions from package "Design", and I also
> >>>>>>> experimented with function 'sm.binomial.bootstrap' from package
> >>>>>>> "sm". I tried also the function 'boot' from package "boot", though
> >>>>>>> without success - in my case it randomly selected _columns_ from my
> >>>>>>> data frame, while I wanted it to select _rows_.
> >>>>>> validate and calibrate in Design do resampling on the rows
> >>>>>>
> >>>>>> Resampling is mainly used to get a nearly unbiased estimate of the
> >>>>>> model performance, i.e., to correct for overfitting.
> >>>>>>
> >>>>>> Frank Harrell
> >>>>>>
> >>>>>>> Though the main point here is the optimized LR equation. I would
> >>>>>>> appreciate any help on how to extract the LR equation coefficients
> >>>>>>> from any of these bootstrap functions, in the same form as given by
> >>>>>>> 'glm' or 'lrm'.
> >>>>>>>
> >>>>>>> Many thanks in advance!
> >>>>>>>
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
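
For readers following the technical question, here is a minimal base-R sketch of resampling rows and collecting the coefficients from each refit. It is only an illustration under stated assumptions: the data are simulated to mimic the 109-row layout described in the thread, and the names dat, B, boot_coefs, boot_se and boot_ci are hypothetical. In line with the advice from Harold Doran and Frank Harrell, the replicates are used to describe the variability of the estimates, while the point estimates stay those of the original maximum likelihood fit.

```r
# Simulated stand-in for the 109-row data frame described in the thread
# (column names follow the post; the generating values are assumptions).
set.seed(1)
n <- 109
dat <- data.frame(
  numeric1 = rnorm(n),
  numeric2 = rnorm(n),
  numeric3 = rnorm(n)
)
dat$outcome <- rbinom(n, 1, plogis(-0.5 + 0.8 * dat$numeric1 - 0.6 * dat$numeric2))

# Maximum likelihood fit on the full data: these are the point estimates.
fit <- glm(outcome ~ numeric1 + numeric2 + numeric3,
           data = dat, family = binomial)

# Refit on B bootstrap samples of rows. Note the comma in dat[idx, ]:
# dat[idx] without the comma indexes columns of a data frame, not rows.
B <- 200
boot_coefs <- t(replicate(B, {
  idx <- sample(nrow(dat), replace = TRUE)
  coef(glm(outcome ~ numeric1 + numeric2 + numeric3,
           data = dat[idx, ], family = binomial))
}))

# Bootstrap standard errors and percentile intervals summarise the
# variability of the estimates; they do not replace coef(fit).
boot_se <- apply(boot_coefs, 2, sd)
boot_ci <- apply(boot_coefs, 2, quantile, probs = c(0.025, 0.975))
print(cbind(estimate = coef(fit), boot_se = boot_se))
print(boot_ci)
```

Each row of boot_coefs holds the coefficients from one resampled fit, in the same order as coef(fit). The same row-indexing pattern would apply inside a statistic function written for 'boot', which is one possible place where columns can end up being selected instead of rows.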