Hi, Aileen.

This list is reserved for discussion of package development, so it isn't the best place for questions like this. Could you please post to:
https://support.bioconductor.org/

That way, you benefit from more eyes and everyone benefits from potential answers.

Thanks,
Sean

On Tue, Apr 21, 2015 at 12:31 PM, Aileen Bahl <aileen.b...@helsinki.fi> wrote:

> Dear all,
>
> I have some problems in understanding how exactly to include confounders
> in my downstream analysis. I will provide a short description of my
> analysis and problem and I would be very happy if some of you could help me
> understand how exactly to go ahead with that:
>
> I normalized 450k data and then used lmFit() to find differentially
> methylated CpGs. My design matrix looks like this:
> model.matrix(~Pair+FatPercentage+EstradiolLevel). So, basically I want to
> identify CpG sites that are associated with changes in estradiol levels. As
> I want to perform within-pair analysis of monozygotic twins I added pair
> information looking like c(1,1,2,3,2,3...). I also added the fat percentage
> as a confounder as we saw significant correlations with the first principal
> component of the data. Does this look right to you?
>
> Now, after having identified significantly differentially methylated CpGs,
> we want to use the GSA package and look at correlations between methylation
> and expression data. For GSA the pairs can be specified directly in the
> function call. Does that also work with continuous traits or only if you
> have two groups? Additionally, I am not really sure how to include
> confounders then. Do I have to use adjusted or unadjusted data? If I use
> adjusted data, would I use the same design matrix as above and not include
> pair information in the function call? Would that be still a within-pair
> comparison then? And for the adjustment itself, would it be something like
> adj.m <- normalizedM-fit$coef[,-1]%*%t(myDesign[,-1]) or do I also have to
> include the columns for pair and fat percentage in this adjustment somehow?
> If I don't have to use unadjusted data, how would I include information on
> fat percentage and the estradiol levels then?
>
> Similarly, for the correlations between methylation and expression... Do I
> just use the adjusted data sets and then compute correlations over all
> individuals? Is that then still considering the within-pair changes? Or
> would I use delta betas for correlation analysis? In the latter case, would
> I use adjusted data? Would that then be like adjusting for pair twice if I
> use the design matrix from above? Or would I have to change the matrix and
> if yes, how?
>
> One last thing - say I wanted to perform differential analysis between two
> groups (not within-pair) but still have some twin pairs included in the
> analysis, would I then use duplicateCorrelation() instead of including the
> pair information directly in the design matrix? Or if that's not the right
> way to go, what should I do in that case?
>
> Sorry for so many questions! However, I would really appreciate any kind
> of help or ideas, to be able to understand how to go on...
>
>
> Thanks a lot in advance and best regards,
>
> Aileen
>
> _______________________________________________
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel