Re: [R] R Data

Fowler, Mark Thu, 14 Feb 2019 05:44:59 -0800

I am not sure I would use the word ‘accounted’, more like discounted (tossed 
out).
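
As a rough illustration of what "tossed out" means: with R's default settings, lm drops any observation that has an NA in the response or predictor before fitting. A minimal sketch on made-up numbers (x, y and their values are hypothetical, not taken from the TCGA files):

```r
# Toy data (hypothetical values, not from the TCGA files)
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, NA, 6.2, 7.9, NA, 12.1)

# With the default na.action (na.omit), rows containing NA are dropped
fit <- lm(y ~ x)

n_used    <- nobs(fit)               # observations actually used in the fit
n_dropped <- length(fit$na.action)   # rows removed because of NA
```

Comparing n_used with length(y) shows how many samples each regression really rests on.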

From: Spencer Brackett <spbracket...@saintjosephhs.com>
Sent: February 14, 2019 9:21 AM
To: Fowler, Mark <mark.fow...@dfo-mpo.gc.ca>
Cc: R-help <r-help@r-project.org>; Sarah Goslee <sarah.gos...@gmail.com>; 
Caitlin Gibbons <bioprogram...@gmail.com>; Jeff Newmiller 
<jdnew...@dcn.davis.ca.us>
Subject: Re: R Data

Mr. Fowler,

Thank you! This information is most helpful. So from my understanding, I can 
use the regression coefficients shown (via the coding I originally sent) to 
generate a continuous distribution with what is essentially a line of best fit? 
The data added here had some 30,000 variables (it is genomic data from TCGA). 
Does this mean that any non-NA data is being accounted for in said 
distribution?

Best,

Spencer Brackett

On Thursday, February 14, 2019, Fowler, Mark <mark.fow...@dfo-mpo.gc.ca> wrote:
Hi Spencer,

The an1 syntax is adding regression results (the p-value and t statistic of 
the slope, or NA where a regression could not be done) to the filtered 
annotation data, an, which is a data frame. The cbind function appends them as 
the last columns (i.e. it binds the columns of its inputs in the order given). 
Simple example below. There is no actual need for the separate cbind commands; 
you could have just used an1=cbind(an,p,t). The cbind function expects all the 
columns to be of the same length, hence the use of the tryCatch function to 
capture NA's for failed regression attempts, ensuring that p and t correspond 
row by row with an.

x=seq(1,5)
y=seq(6,10)
z=seq(1,5)
xyz=cbind(x,y,z)
xyz
     x  y z
[1,] 1  6 1
[2,] 2  7 2
[3,] 3  8 3
[4,] 4  9 4
[5,] 5 10 5
dangs=rep(NA,5)
xyzd=cbind(xyz,dangs)
xyzd
     x  y z dangs
[1,] 1  6 1    NA
[2,] 2  7 2    NA
[3,] 3  8 3    NA
[4,] 4  9 4    NA
[5,] 5 10 5    NA
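
The tryCatch step can be sketched on its own. The toy objects below (m_toy, exp_toy, an_toy and the row named "missing_gene") are hypothetical stand-ins for m, exp and an, chosen so that one regression fails; the point is only that the failed row yields NA instead of stopping the loop, so the result still has one entry per row:

```r
# Hypothetical toy data: 3 probes and 3 genes measured on 5 samples
set.seed(1)
m_toy   <- matrix(rnorm(15), nrow = 3,
                  dimnames = list(c("cgA", "cgB", "cgC"), NULL))
exp_toy <- matrix(rnorm(15), nrow = 3,
                  dimnames = list(c("g1", "g2", "g3"), NULL))

# The second row names a gene that is absent from exp_toy, so indexing fails
an_toy <- data.frame(probe = c("cgA", "cgB", "cgC"),
                     gene  = c("g1", "missing_gene", "g3"),
                     stringsAsFactors = FALSE)

# Same pattern as the original script: p-value of the slope, NA on error
p_toy <- apply(an_toy, 1, function(i) {
  tryCatch(
    summary(lm(exp_toy[i[2], ] ~ m_toy[i[1], ]))$coefficients[2, 4],
    error = function(e) NA
  )
})

# One value per row of an_toy, so cbind lines up row by row
an1_toy <- cbind(an_toy, p = p_toy)
```

Without the tryCatch, the error on the second row would abort the whole apply call and no p vector would be produced at all.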

-----Original Message-----
From: R-help <r-help-boun...@r-project.org> On Behalf Of Spencer Brackett
Sent: February 14, 2019 12:32 AM
To: R-help <r-help@r-project.org>; Sarah Goslee <sarah.gos...@gmail.com>; 
Caitlin Gibbons <bioprogram...@gmail.com>; Jeff Newmiller 
<jdnew...@dcn.davis.ca.us>
Subject: [R] R Data

Hello everyone,

The following is a portion of coding that a colleague sent. Given my lack of 
experience in R, I am not quite sure what the significance of the following 
arguments is. Could anyone help me translate? For context, I am aware of the 
downloading portion of the script... library(data.table) etc., but am not 
familiar with the portion pertaining to an1.

library(data.table)

# annotation table (probe, gene, location columns) and methylation values
anno = as.data.frame(fread(file =
  "/rsrch1/bcb/kchen_group/v_mohanty/data/TCGA/450K/mapper.txt",
  sep = "\t", header = TRUE))
meth = read.table(file =
  "/rsrch1/bcb/kchen_group/v_mohanty/data/TCGA/27K/GBM.txt",
  sep = "\t", header = TRUE, row.names = 1)
meth = as.matrix(meth)

# the loop just formats the methylation column names to match format
colnames(meth) = sapply(colnames(meth), function(i){
  c1 = strsplit(i, split = '.', fixed = TRUE)[[1]]
  # keep only the first two characters of the fourth field
  c1[4] = paste(strsplit(c1[4], split = "", fixed = TRUE)[[1]][1:2],
                collapse = "")
  paste(c1, collapse = ".")
})

exp = read.table(file =
  "/rsrch1/bcb/kchen_group/v_mohanty/data/TCGA/RNAseq/GBM.txt",
  sep = "\t", header = TRUE, row.names = 1)
exp = as.matrix(exp)

# keep only the samples (columns) present in both data sets
c = intersect(colnames(exp), colnames(meth))
exp = exp[,c]
meth = meth[,c]

# log2(i/(1-i)) transform of the methylation values, probes back in rows
m = apply(meth, 1, function(i){
  log2(i/(1-i))
})
m = t(as.matrix(m))

# keep annotation rows whose probe and gene are in the data and whose
# location is TSS200 or TSS1500
an = anno[anno$probe %in% rownames(m),]
an = an[an$gene %in% rownames(exp),]
an = an[an$location %in% c("TSS200","TSS1500"),]

# p: p-value of the slope for each probe/gene pair, NA if the fit fails
p = apply(an,1,function(i){
  tryCatch(summary(lm(exp[as.character(i[2]),] ~
    m[as.character(i[1]),]))$coefficients[2,4], error = function(e) NA)
})
# t: t statistic of the slope, NA if the fit fails
t = apply(an,1,function(i){
  tryCatch(summary(lm(exp[as.character(i[2]),] ~
    m[as.character(i[1]),]))$coefficients[2,3], error = function(e) NA)
})
an1 = cbind(an,p)
an1 = cbind(an1,t)
an1$q = p.adjust(as.numeric(an1$p))  # p.adjust defaults to method = "holm"
summary(lm(exp["MAOB",] ~ m["cg00121904",]))$coefficients[2,c(3:4)]
###############################################
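
For readers unsure what the [2,3] and [2,4] indices pick out, here is a minimal sketch on simulated data. All names and numbers (meth_toy, m_val, expr_toy, the extra p-values 0.04 and 0.5) are hypothetical and only illustrate the layout of the coefficient table returned by summary(lm(...)):

```r
set.seed(42)
meth_toy <- runif(20, 0.05, 0.95)            # hypothetical values in (0, 1)
m_val    <- log2(meth_toy / (1 - meth_toy))  # same transform as in the script
expr_toy <- 5 - 0.8 * m_val + rnorm(20, sd = 0.5)

# Rows: intercept, slope. Columns: Estimate, Std. Error, t value, Pr(>|t|)
coefs <- summary(lm(expr_toy ~ m_val))$coefficients

t_val <- coefs[2, "t value"]    # same cell as coefs[2, 3]
p_val <- coefs[2, "Pr(>|t|)"]   # same cell as coefs[2, 4]

# p.adjust with no method argument applies the Holm correction
q_vals <- p.adjust(c(p_val, 0.04, 0.5))
```

Indexing by column name rather than by position makes it harder to mix up the t statistic and the p-value.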

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
