I am using the R statistical package in a pretty straightforward manner, just
interested in getting summary statistics and reliable estimates of means,
variances, confidence intervals, p-values using standard tests, etc. I've been
using the generalized estimating equation package because I often deal with
repeatedly sampled pairs of data from the same source, and my reading (and my
statistician) tells me that GEE is an appropriate method for that type of data
set.
Im using the geeglm program from the geepack download. My data is correlated
data from two estimates of vascular permeability in a rat model of cerebral
tumor. One estimate, the MRI estimate, can be conducted non-invasively and
repeatedly. The other estimate, the QAR estimate, can be done just once since
it involves sacrificing the animal. The MRI estimate needs to be characterized
as to its correlation with the QAR estimate, and that is what the paper I have
in review does.
The question is whether, and how strongly, the two estimates are correlated.
The estimates are presented in the form of image maps of vascular permeability
call them estimate QAR (the gold standard) and estimate MRI (the proposed
surrogate measure). I align the two image maps of the parametric estimates as
best I can, draw a region-of interest around the cerebral tumor, and get
anywhere from 200 to 400 pairs of estimates per region of interest in one
animal. I have 15 studies. The data looks like this:
ID QAR MRI
EJ4 0.001341 0.001941
EJ4 0.00102 0.001605
EJ4 0.000673 0.001767
EJ4 0.0005 0.000143
EJ4 0.000393 -0.000962
EJ4 0.002681 0.00117
EJ4 0.002529 0.001405
EJ4 0.002097 0.001295
EJ4 0.001448 0.001463
EJ4 0.001101 0.000877
EJ4 0.000834 0.000719
EJ4 0.000567 0.001902
.
.
.
EJ5 0.001084 0.00153
EJ5 0.001437 0.001332
EJ5 0.003087 0.001737
EJ5 0.003479 0.001634
EJ5 0.003541 0.002122
EJ5 0.003355 0.002099
EJ5 0.003149 0.00211
EJ5 0.002881 0.001591
EJ5 0.001469 0.001559
EJ5 0.001191 0.002259
EJ5 0.000518 0.001589
EJ5 0.000689 0.001348
.
.
.
etcetera.
I used this command to generate an estimate of the correlation coefficient:
ll_gee_2 <- geeglm(MRI~QAR, id = ID, family = gaussian, corstr ="exch")
This yielded a very significant but moderate correlation between the two
methods, in general agreement with the results of a linear regression done on
the means of the values of the two estimates in their ROIs.
Now, the reviewer had this query:
Statistical evaluation: How many parameters were used for the GEE evaluation?
Was only the bias changed for each tumor. The correlation of a single tumor
should be presented. How much did the bias change?
In second review, after we explained that we didnt think there were biases in
the analysis, this came back:
The exact fitting approach using the GEE model is difficult to understand. The
authors should clearly state, that they use a different bias for each
individual measurement using the GEE model.
I cant figure out what the reviewer is talking about. According to my reading
of the Liang and Zeger original paper on the Generalized Estimating Equations,
there arent any biases (as I understand bias) generated in this method. Im
guessing that the reviewer is having trouble with language, and is really
referring to the weights assigned to the variances of each of the 15 clusters
of data. And, after this lengthy preamble heres my question:
How do I find the weights assigned to each of the 15 clusters in the GEE
analysis? If this is in summary(), which field would it be? Indeed, how do I
tell what fields are contained in summary()? Clearly, there are fields that
aren't reported in the simple invocation of summary(ll_gee_2). The listserver
comments reveal that summary() produces a list containing a data frame, but I'm
still not clear on how to identify which element in the data frame would be the
weights I'm after.
I did use str(ll_gee_2), and got a listing that (apparently) showed the weights
to be all 1's, as far as I can determine.
Sorry about the length of this query, and about my ignorance of R functions.
Also, I'm aware that I'm actually asking the experts to address two questions:
1) How to print out fields produced by gee that aren't routinely reported in
summary(), and 2) How to deal with a reviewer who is asking a badly-worded
question that he apparently thinks is important. I hope problem 2 evokes a
sympathetic response from those who have had similar problems.
Ive googled R summary() fields, and variations of that query, but have yet
to find a way to list out the fields in the summary() function.
- Jim Ewing
[[alternative HTML version deleted]]
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.