Thanks and sorry,

I didn't get much response on the Bioconductor support site, so I tried here. Still, it is good to know where the best place would be...

Best,
Aileen


Quoting Sean Davis <seand...@gmail.com>:

Hi, Aileen.

This list isn't really the best place to ask questions like this and is
really reserved for discussion around package development.  Could you
please post to:

https://support.bioconductor.org/

That way, you benefit from more eyes and everyone benefits from potential
answers.

Thanks,
Sean


On Tue, Apr 21, 2015 at 12:31 PM, Aileen Bahl <aileen.b...@helsinki.fi>
wrote:

Dear all,

I am having trouble understanding exactly how to include confounders
in my downstream analysis. Below is a short description of my
analysis and the problem; I would be very happy if some of you could help me
understand how to proceed:

I normalized 450k data and then used lmFit() to find differentially
methylated CpGs. My design matrix looks like this:
model.matrix(~Pair+FatPercentage+EstradiolLevel). So, basically I want to
identify CpG sites that are associated with changes in estradiol levels. As
I want to perform within-pair analysis of monozygotic twins I added pair
information looking like c(1,1,2,3,2,3...). I also added the fat percentage
as a confounder as we saw significant correlations with the first principal
component of the data. Does this look right to you?
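
A minimal sketch of the design described above, in base R with simulated values. The variable values are placeholders, not the real data. It illustrates one detail that may be worth checking: pair labels stored as numbers, such as c(1,1,2,3,2,3), are treated by model.matrix() as a single continuous covariate unless they are converted to a factor first.

```r
# Simulated covariates for 3 twin pairs (6 samples); all values are made up.
set.seed(1)
Pair           <- c(1, 1, 2, 3, 2, 3)
FatPercentage  <- rnorm(6, mean = 30, sd = 5)
EstradiolLevel <- rnorm(6, mean = 100, sd = 20)

# Numeric pair labels yield ONE slope column for Pair (not a paired design).
design_numeric <- model.matrix(~ Pair + FatPercentage + EstradiolLevel)

# As a factor, Pair yields one indicator column per pair beyond the first,
# which is what a within-pair (blocked) analysis needs.
PairF         <- factor(Pair)
design_paired <- model.matrix(~ PairF + FatPercentage + EstradiolLevel)

print(colnames(design_numeric))
print(colnames(design_paired))
```

With the factor version, each pair gets its own baseline, so the EstradiolLevel coefficient is estimated from within-pair differences only.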

Now, after having identified significantly differentially methylated CpGs,
we want to use the GSA package and look at correlations between methylation
and expression data. For GSA the pairs can be specified directly in the
function call. Does that also work with continuous traits or only if you
have two groups? Additionally, I am not really sure how to include
confounders then. Do I have to use adjusted or unadjusted data? If I use
adjusted data, would I use the same design matrix as above and not include
pair information in the function call? Would that be still a within-pair
comparison then? And for the adjustment itself, would it be something like
adj.m <- normalizedM-fit$coef[,-1]%*%t(myDesign[,-1]) or do I also have to
include the columns for pair and fat percentage in this adjustment somehow?
If I have to use unadjusted data instead, how would I include information on
fat percentage and the estradiol levels then?
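
Regarding the adjustment formula above: subtracting every non-intercept column would also remove the estradiol effect itself. A hedged sketch of one alternative, subtracting only the nuisance columns (pair and fat percentage), is shown below in base R on simulated data. All object names and values are made up for illustration, and lm.fit() stands in for lmFit(); with no weights, blocking or missing values, both should give the same least-squares coefficients. limma also provides removeBatchEffect(), which is intended for this kind of adjustment and may be worth a look.

```r
# Simulated data: 4 twin pairs (8 samples), 5 CpGs; all values are made up.
set.seed(1)
pair      <- factor(rep(1:4, each = 2))
fat       <- rnorm(8, mean = 30, sd = 5)
estradiol <- rnorm(8, mean = 100, sd = 20)
M         <- matrix(rnorm(5 * 8), nrow = 5)   # rows = CpGs, columns = samples

design <- model.matrix(~ pair + fat + estradiol)

# Per-CpG least-squares coefficients (rows = CpGs, columns = design columns).
coefs <- t(lm.fit(design, t(M))$coefficients)

# Nuisance columns: everything except the intercept and the trait of interest.
nuisance <- !(colnames(design) %in% c("(Intercept)", "estradiol"))

# Subtract only the fitted nuisance effects; the estradiol signal is retained.
adj_M <- M - coefs[, nuisance, drop = FALSE] %*%
  t(design[, nuisance, drop = FALSE])

# Refitting the full design to the adjusted data: the intercept and estradiol
# coefficients should match the original fit.
refit <- t(lm.fit(design, t(adj_M))$coefficients)
print(all.equal(unname(refit[, !nuisance]), unname(coefs[, !nuisance])))
```

Data adjusted this way no longer carry the pair or fat percentage effects, so it is suited to visualisation or correlation rather than to a second round of model fitting with the same covariates.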

Similarly, for the correlations between methylation and expression... Do I
just use the adjusted data sets and then compute correlations over all
individuals? Is that then still considering the within-pair changes? Or
would I use delta betas for correlation analysis? In the latter case, would
I use adjusted data? Would that then be like adjusting for pair twice if I
use the design matrix from above? Or would I have to change the matrix and
if yes, how?
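
On the relation between delta values and the pair term, a small base R sketch on simulated data for a single CpG may help. It assumes complete pairs with the two twins stored in adjacent positions; all names and values are made up. For such data, regressing within-pair differences on the within-pair differences of the covariates, without an intercept, gives the same slopes as fitting all samples with pair as a factor. In that sense delta values already account for the pairing, and no additional pair term is needed.

```r
# Simulated data: 6 twin pairs, twins of a pair in adjacent positions.
set.seed(2)
n_pairs   <- 6
pair      <- factor(rep(seq_len(n_pairs), each = 2))
fat       <- rnorm(2 * n_pairs, mean = 30, sd = 5)
estradiol <- rnorm(2 * n_pairs, mean = 100, sd = 20)
m_value   <- rnorm(2 * n_pairs)            # one CpG, made-up values

# (a) All samples, with pair as a blocking factor.
fit_paired <- lm(m_value ~ pair + fat + estradiol)

# (b) Within-pair differences (first twin minus second twin),
#     fitted without an intercept and without a pair term.
first  <- seq(1, 2 * n_pairs, by = 2)
second <- first + 1
d_m    <- m_value[first]   - m_value[second]
d_fat  <- fat[first]       - fat[second]
d_est  <- estradiol[first] - estradiol[second]
fit_delta <- lm(d_m ~ 0 + d_fat + d_est)

# The slopes for fat and estradiol should agree between the two fits.
slopes_paired <- unname(coef(fit_paired)[c("fat", "estradiol")])
slopes_delta  <- unname(coef(fit_delta))
print(all.equal(slopes_paired, slopes_delta))
```

Which twin is subtracted from which does not matter for the no-intercept regression, but it does affect an ordinary correlation of delta values, because that correlation centres the differences first.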

One last thing - say I wanted to perform differential analysis between two
groups (not within-pair) but still have some twin pairs included in the
analysis, would I then use duplicateCorrelation() instead of including the
pair information directly in the design matrix? Or if that's not the right
way to go, what should I do in that case?

Sorry for so many questions! I would really appreciate any kind
of help or ideas on how to proceed...


Thanks a lot in advance and best regards,

Aileen

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

