[Bioc-devel] Use of confounders in downstream analysis

Aileen Bahl Tue, 21 Apr 2015 09:32:31 -0700

Dear all,

I have some problems understanding how exactly to include confounders in my downstream analysis. I will provide a short description of my analysis and problem, and I would be very happy if some of you could help me understand how exactly to go ahead with that:

I normalized 450k data and then used lmFit() to find differentially methylated CpGs. My design matrix looks like this: model.matrix(~Pair+FatPercentage+EstradiolLevel). So, basically I want to identify CpG sites that are associated with changes in estradiol levels. As I want to perform within-pair analysis of monozygotic twins, I added pair information looking like c(1,1,2,3,2,3...). I also added the fat percentage as a confounder, as we saw significant correlations with the first principal component of the data. Does this look right to you?
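To make the setup concrete, here is a minimal sketch of the design matrix with made-up values for three twin pairs. All numbers are invented for illustration. One assumption is labeled in the code: Pair is wrapped in factor(), because a plain numeric vector like c(1,1,2,3,2,3) would be treated by model.matrix() as a single continuous covariate rather than as one column per pair.

```r
# Made-up example: 3 monozygotic twin pairs, 6 samples in total.
# Assumption: Pair is coded as a factor so that each pair gets its own
# column; a numeric Pair would enter the model as one continuous covariate.
Pair           <- factor(c(1, 1, 2, 3, 2, 3))
FatPercentage  <- c(22.1, 25.4, 30.2, 18.7, 33.0, 20.3)   # invented values
EstradiolLevel <- c(0.21, 0.35, 0.40, 0.18, 0.52, 0.25)   # invented values

design <- model.matrix(~ Pair + FatPercentage + EstradiolLevel)

# Columns: intercept, one indicator per non-reference pair,
# then the two continuous covariates.
print(colnames(design))

# With limma, this design would then be passed on as
# lmFit(normalizedM, design), and the coefficient of interest
# would be "EstradiolLevel".
```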

Now, after having identified significantly differentially methylated CpGs, we want to use the GSA package and look at correlations between methylation and expression data. For GSA, the pairs can be specified directly in the function call. Does that also work with continuous traits, or only if you have two groups? Additionally, I am not really sure how to include confounders then. Do I have to use adjusted or unadjusted data? If I use adjusted data, would I use the same design matrix as above and not include pair information in the function call? Would that still be a within-pair comparison then? And for the adjustment itself, would it be something like adj.m <- normalizedM - fit$coef[,-1] %*% t(myDesign[,-1]), or do I also have to include the columns for pair and fat percentage in this adjustment somehow? If I should use unadjusted data instead, how would I include information on fat percentage and the estradiol levels then?
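To illustrate what the adjustment formula does, here is a small sketch on simulated data. It is one possible reading, not a confirmed answer: as written, fit$coef[,-1] %*% t(myDesign[,-1]) subtracts every non-intercept term, including the estradiol term itself, so the sketch instead subtracts only the columns treated as nuisance (pair and fat percentage). The coefficients are computed with base R's lm.fit() as a stand-in for fit$coef from lmFit(); the names Fat, E2, and nuisance are hypothetical.

```r
set.seed(1)

# Simulated data: 4 twin pairs (8 samples), 5 CpGs. All values are made up.
Pair <- factor(c(1, 1, 2, 2, 3, 3, 4, 4))
Fat  <- rnorm(8, mean = 25, sd = 4)      # hypothetical fat percentage
E2   <- rnorm(8, mean = 0.3, sd = 0.1)   # hypothetical estradiol level
M    <- matrix(rnorm(5 * 8), nrow = 5)   # CpGs x samples, stands in for normalizedM

design <- model.matrix(~ Pair + Fat + E2)

# Least-squares coefficients, arranged as CpGs x coefficients
# (the same layout as fit$coef from limma's lmFit).
coefs <- t(lm.fit(design, t(M))$coefficients)
colnames(coefs) <- colnames(design)

# Nuisance terms: everything except the intercept and the variable of interest.
nuisance <- setdiff(colnames(design), c("(Intercept)", "E2"))

# Subtract only the fitted nuisance effects; the estradiol effect stays in the data.
adj.m <- M - coefs[, nuisance, drop = FALSE] %*% t(design[, nuisance, drop = FALSE])
```

Refitting adj.m against the same design gives back the same estradiol coefficients, with the pair and fat coefficients at zero, which is the sense in which the data are "adjusted" here.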

Similarly, for the correlations between methylation and expression... Do I just use the adjusted data sets and then compute correlations over all individuals? Is that then still considering the within-pair changes? Or would I use delta betas for correlation analysis? In the latter case, would I use adjusted data? Would that then be like adjusting for pair twice if I use the design matrix from above? Or would I have to change the matrix, and if yes, how?
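As a sketch of what is meant by the delta-beta option, the following computes within-pair differences for one CpG and one gene and correlates them. The values and the helper name within_pair_diff are made up for illustration; the only requirement assumed is that both data types use the same sample order, so that the same twin is subtracted from the same co-twin in each.

```r
# Made-up values for one CpG (beta values) and one gene (expression),
# with the same sample order and pair coding as c(1,1,2,3,2,3).
Pair <- c(1, 1, 2, 3, 2, 3)
meth <- c(0.40, 0.55, 0.30, 0.62, 0.35, 0.70)
expr <- c(8.1, 7.4, 9.0, 6.9, 8.8, 6.1)

# Hypothetical helper: for each pair, second twin minus first twin,
# in order of appearance in the sample vector.
within_pair_diff <- function(x, pair) {
  sapply(split(x, pair), function(v) v[2] - v[1])
}

d_meth <- within_pair_diff(meth, Pair)   # delta betas, one per pair
d_expr <- within_pair_diff(expr, Pair)   # delta expression, one per pair

# Correlation of within-pair differences: one data point per twin pair.
print(cor(d_meth, d_expr))
```

Taking differences removes anything shared by both twins of a pair, which is why the question of "adjusting for pair twice" comes up if pair-adjusted data were used as input.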

One last thing - say I wanted to perform differential analysis between two groups (not within-pair) but still have some twin pairs included in the analysis, would I then use duplicateCorrelation() instead of including the pair information directly in the design matrix? Or if that's not the right way to go, what should I do in that case?
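For reference, the duplicateCorrelation() route being asked about would look roughly like the sketch below, on simulated data. Whether it is the right choice here is exactly the open question. The group labels, the Block vector (twin pairs share an id, singletons get their own), and the matrix M are made up; the limma calls are guarded so the snippet still runs where limma or statmod is not installed.

```r
set.seed(1)

# Simulated data: 10 samples, 50 CpGs. All values are made up.
Group <- factor(rep(c("ctrl", "case"), each = 5), levels = c("ctrl", "case"))
# Twin pairs share a block id; unrelated individuals get their own id.
Block <- c(1, 1, 2, 2, 3, 4, 4, 5, 6, 7)
M     <- matrix(rnorm(50 * 10), nrow = 50)   # CpGs x samples

# Pair is NOT in the design; it enters as a random effect via 'block'.
design <- model.matrix(~ Group)

if (requireNamespace("limma", quietly = TRUE) &&
    requireNamespace("statmod", quietly = TRUE)) {
  # Estimate the average correlation between samples from the same block.
  corfit <- limma::duplicateCorrelation(M, design, block = Block)

  # Fit the group comparison while accounting for that correlation.
  fit <- limma::lmFit(M, design, block = Block,
                      correlation = corfit$consensus.correlation)
  fit <- limma::eBayes(fit)
  top <- limma::topTable(fit, coef = "Groupcase")
}
```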

Sorry for so many questions! However, I would really appreciate any kind of help or ideas, to be able to understand how to go on...



Thanks a lot in advance and best regards,

Aileen

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
