Re: [R] Quantile regression with complex survey data

Thomas Lumley Thu, 21 Aug 2008 13:52:06 -0700

You can get point estimates by supplying the sampling weights as weights to the quantile regression functions in Roger Koenker's quantreg package. This is useful for smoothing (with the rqss() function); it is not clear how useful it is for straight-line regression.
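
A minimal sketch of the weighted point estimate, assuming a data frame with a sampling-weight column. The data are simulated and the names dat, pw, x and y are hypothetical stand-ins, not NHANES variables:

```r
library(quantreg)

# Simulated stand-in for survey data (hypothetical names, not NHANES)
set.seed(1)
n <- 200
dat <- data.frame(
  x  = runif(n),
  pw = runif(n, 1, 5)   # sampling weights
)
dat$y <- 1 + 2 * dat$x + rnorm(n)

# Weighted median regression: sampling weights go in the 'weights' argument
fit <- rq(y ~ x, tau = 0.5, data = dat, weights = pw)
coef(fit)
```

This gives point estimates only; the standard errors that rq reports do not account for the sampling design.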

You should get valid interval estimates from BRR or bootstrap replicate weights if you have sufficient sample size[*]. If I recall correctly, NHANES has two PSUs per stratum, so BRR replicates are possible. Use as.svrepdesign() to create the BRR replicates and then withReplicates() to run the regression and get the standard errors.
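
A sketch of those two steps, assuming a design with exactly two PSUs per stratum. The data are simulated, and the names dat, pw, psu, stratum, x and y are hypothetical; a real NHANES analysis would use the design variables shipped with the data:

```r
library(survey)
library(quantreg)

# Simulated design: 16 strata, 2 PSUs per stratum, 10 observations per PSU
set.seed(2)
dat <- expand.grid(obs = 1:10, psu = 1:2, stratum = 1:16)
dat$x  <- runif(nrow(dat))
dat$y  <- 1 + 2 * dat$x + rnorm(nrow(dat))
dat$pw <- runif(nrow(dat), 1, 5)   # sampling weights

# PSU ids repeat across strata, hence nest = TRUE
des <- svydesign(ids = ~psu, strata = ~stratum, weights = ~pw,
                 data = dat, nest = TRUE)

# Convert to a replicate-weight design with BRR replicates
brr <- as.svrepdesign(des, type = "BRR")

# Refit the median regression once per set of replicate weights;
# 'wts' is the weight vector that withReplicates() passes in
est <- withReplicates(
  brr,
  function(wts, data) coef(rq(y ~ x, tau = 0.5, data = data, weights = wts))
)
est        # coefficients with BRR standard errors
SE(est)
```

The argument is called wts rather than pw so that it cannot be confused with the weight column in the data frame when rq() evaluates its weights argument.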

You will not get correct interval estimates with jackknife replicates or by any Taylor-series based approach.

As an additional note, Yiling's two copies of the message to the list within half an hour (following one to me less than 24 hours earlier) suggest an unrealistic expectation of response times.


        -thomas

[*] This isn't explicitly in the survey literature, but quantile regression is a Hadamard-differentiable functional of the empirical process, which should give it consistency, asymptotic Normality, and bootstrappability under various standard sets of asymptotics.



On Wed, 20 Aug 2008, Stas Kolenikov wrote:

On Wed, Aug 20, 2008 at 8:12 AM, Cheng, Yiling (CDC/CCHP/NCCDPHP)
<[EMAIL PROTECTED]> wrote:

I am working on the NHANES survey data, and want to apply quantile
regression on these complex survey data. Does anyone know how to do
this?


There are no references in technical literature (thinking, Annals,
JASA, JRSS B, Survey Methodology). Absolutely none. Zero. You might be
able to apply the procedure mechanically and then adjust the standard
errors, but God only knows what the population equivalent is of
whatever that model estimates. If there is a population analogue at
all.

In general, a quantile regression is a heavily model based concept:
for each value of the explanatory variables, there is a well defined
distribution of the response, and quantile regression puts additional
structure on it -- linearity of quantiles with respect to some explanatory
variables. That does not mesh well with the design paradigm according
to which the survey estimation is usually conducted. With the latter,
the finite population and characteristics of every unit are assumed
fixed, and randomness comes only from the sampling procedure. Within
that paradigm, you can define the marginal distribution of the
response (or any other) variable, but the conditional distributions
may simply be unavailable because there are no units in the population
satisfying the conditions.

--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Thomas Lumley                   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]       University of Washington, Seattle

