useRs,

I have a data frame with 2547 rows and several hundred columns in R 3.1.3. I am trying to run a small logistic regression on a subset of the data (knowf3, shown below).

    know_fin ~ comp_grp2 + age + gender + education + employment +
        income + ideol + home_lot + home + county
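
In call form, that is roughly the following (a sketch, so the exact arguments in my script may differ slightly; binomial is the family I am using for the logistic regression):

    fit_glm <- glm(know_fin ~ comp_grp2 + age + gender + education + employment +
                       income + ideol + home_lot + home + county,
                   data = knowf3, family = binomial)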

    > str(knowf3)
    'data.frame':   2033 obs. of  18 variables:
    $ userid    : Factor w/ 2542 levels "FNCNM1639","FNCNM1642",..: 1857 157 965 1967 164 315 849 1017 699 189 ...
    $ round_id  : Factor w/ 1 level "Round 11": 1 1 1 1 1 1 1 1 1 1 ...
    $ age       : int  67 66 44 27 32 67 36 76 70 66 ...
    $ county    : Factor w/ 80 levels "Adair","Alfalfa",..: 75 75 75 75 75 75 64 64 64 64 ...
    $ gender    : Factor w/ 2 levels "0","1": 1 2 1 1 2 1 2 1 2 2 ...
    $ education : Factor w/ 8 levels "1","2","3","4",..: 6 7 6 8 2 4 2 4 2 6 ...
    $ employment: Factor w/ 9 levels "1","2","3","4",..: 8 4 4 4 3 8 5 8 4 4 ...
    $ income    : num  550000 80000 90000 19000 42000 30000 18000 50000 800000 10000 ...
    $ home      : num  0 0 0 0 0 0 0 0 0 0 ...
    $ ideol     : Factor w/ 7 levels "1","2","3","4",..: 2 7 4 3 2 4 2 3 2 6 ...
    $ home_lot  : Factor w/ 3 levels "1","2","3": 2 2 2 2 2 2 3 3 1 2 ...
    $ hispanic  : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
    $ comp_grp2 : Factor w/ 16 levels "Cr_Gr","Cr_Ot",..: 13 13 13 13 13 13 10 10 10 10 ...
    $ know_fin  : Factor w/ 3 levels "0","1","2": 2 2 2 2 2 2 2 2 2 2 ...


With the regular glm() function, I get a warning about "perfect or quasi-perfect separation" [1]. I looked for a way to deal with this, and a penalized GLM (Firth's bias-reduced logistic regression [2]) is an accepted approach; it is implemented in logistf() from the logistf package. I used the default settings for the function.
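
If it helps, the call is essentially the following (a sketch; everything apart from the formula and the data argument is left at the package defaults):

    library(logistf)
    fit_firth <- logistf(know_fin ~ comp_grp2 + age + gender + education + employment +
                             income + ideol + home_lot + home + county,
                         data = knowf3)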

Just before I run the model, memory.size() for my session is ~4500 MB and memory.limit() is ~25500 MB. When I start the model, R immediately becomes non-responsive. This is on Windows, and in Task Manager the R instance is, and has been, using ~13% of CPU and ~4997 MB of RAM. It has now been in that state for ~24 hours, and I have no idea how long this should take. If I run the same model in the same session with base glm(), it finishes in about 60 seconds. Is there a way to tell whether the process will eventually produce something useful, or whether it is hanging on some kind of problem?
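
For what it's worth, the stripped-down run I would compare against looks like the following. This is only a sketch based on my reading of ?logistf: the pl argument is an assumption on my part, and as I understand it pl = FALSE skips the profile-penalized-likelihood confidence intervals and reports Wald intervals instead, which I am guessing is the slow part given the number of factor levels here.

    library(logistf)
    system.time(
        fit_nopl <- logistf(know_fin ~ comp_grp2 + age + gender + education + employment +
                                income + ideol + home_lot + home + county,
                            data = knowf3, pl = FALSE)  # Wald CIs only (my reading of ?logistf)
    )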


[1]: https://stats.stackexchange.com/questions/11109/how-to-deal-with-perfect-separation-in-logistic-regression#68917
[2]: https://academic.oup.com/biomet/article-abstract/80/1/27/228364/Bias-reduction-of-maximum-likelihood-estimates


--
Men occasionally stumble
over the truth, but most of them
pick themselves up and hurry off
as if nothing had happened.
-- Winston Churchill
