You are right. I had just realized the same thing: a no-intercept model is not 
appropriate for my case.

Ding

From: Bert Gunter <bgunter.4...@gmail.com>
Sent: Saturday, August 10, 2024 1:06 PM
To: Yuan Chun Ding <ycd...@coh.org>
Cc: Ben Bolker <bbol...@gmail.com>; r-help@r-project.org
Subject: Re: [R] a fast way to do my job

Ah, messages crossed.

A no-intercept model **assumes** the straight line fit must pass through the
origin. Unless there is a strong justification for such an assumption, you
should include an intercept.



-- Bert
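
The point about the intercept can be checked on a small simulated data set. The sketch below is only an illustration under stated assumptions: `purity` and `genes` are hypothetical stand-ins for `purity2` and the gene columns of `gem751be.rpkm`, and the numbers are made up. It compares per-gene `lm()` residuals with `lm.fit()` residuals computed with and without a column of ones in `x`.

```r
# Simulated stand-ins for the thread's data (hypothetical values).
set.seed(1)
n <- 50
purity <- runif(n)                              # plays the role of purity2
genes <- cbind(g1 = 5 + 2 * purity + rnorm(n),  # plays the role of the gene columns
               g2 = 1 - 3 * purity + rnorm(n))

# Reference: one lm() per gene; the formula interface adds an intercept.
res_lm <- sapply(colnames(genes),
                 function(g) residuals(lm(genes[, g] ~ purity)))

# lm.fit() uses x exactly as given, so this fits lines through the origin.
res_no_int <- residuals(lm.fit(x = matrix(purity, ncol = 1), y = genes))

# Adding a column of ones to x restores the intercept.
res_int <- residuals(lm.fit(x = cbind(1, purity), y = genes))

max(abs(res_int - res_lm))     # agreement up to floating-point error
max(abs(res_no_int - res_lm))  # clearly non-zero: a different model
```

With the column of ones, the matrix-response `lm.fit()` call should reproduce the per-gene `lm()` residuals while still fitting all columns in one call.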



On Sat, Aug 10, 2024 at 1:02 PM Bert Gunter 
<bgunter.4...@gmail.com<mailto:bgunter.4...@gmail.com>> wrote:

>

> Is it because I failed to add a column of ones for an intercept to
> the x matrix? That would be my bad.

>

> -- Bert

>

>

> On Sat, Aug 10, 2024 at 12:59 PM Bert Gunter 
> <bgunter.4...@gmail.com<mailto:bgunter.4...@gmail.com>> wrote:

> >

> > Probably because you inadvertently ran different models. Without your code, 
> > I haven't a clue.

> >

> >

> > On Sat, Aug 10, 2024, 12:29 Yuan Chun Ding 
> > <ycd...@coh.org<mailto:ycd...@coh.org>> wrote:

> >>

> >> Hi Bert and Ben,

> >>

> >>

> >>

> >> Yes, running lm.fit using the matrix format is much faster. I read a 
> >> couple of online comments explaining why it is faster.

> >>

> >>

> >>

> >> However, the residuals for the three genes I tested differ between the lm 
> >> function and the lm.fit function; the Pearson correlations between the two 
> >> sets of residuals are 0.55, 0.89, and 0.99.

> >>

> >>

> >>

> >> I have not found the reason.

> >>

> >>

> >>

> >> Thanks,

> >>

> >>

> >> Ding

> >>

> >>

> >>

> >> From: Bert Gunter <bgunter.4...@gmail.com<mailto:bgunter.4...@gmail.com>>

> >> Sent: Friday, August 9, 2024 7:11 PM

> >> To: Ben Bolker <bbol...@gmail.com<mailto:bbol...@gmail.com>>

> >> Cc: Yuan Chun Ding <ycd...@coh.org<mailto:ycd...@coh.org>>; 
> >> r-help@r-project.org<mailto:r-help@r-project.org>

> >> Subject: Re: [R] a fast way to do my job

> >>

> >>

> >>


> >> Better idea, Ben!

> >>

> >>

> >>

> >> It would work as you might expect, producing the same results as the above:

> >>

> >>

> >>

> >> ##first make sure your regressor is a matrix:

> >>

> >> pur2 <- matrix(purity2, ncol =1)

> >>

> >> ## convert the data frame variables into a matrix

> >>

> >> dat <- as.matrix(gem751be.rpkm[ , 74:35164])

> >>

> >> ##then

> >>

> >> result <- residuals(lm.fit( x= pur2, y = dat))

> >>

> >>

> >>

> >> Cheers,

> >>

> >> Bert

> >>

> >>

> >>

> >> On Fri, Aug 9, 2024 at 6:38 PM Ben Bolker 
> >> <bbol...@gmail.com<mailto:bbol...@gmail.com>> wrote:

> >>

> >> >

> >>

> >> > You can also fit a linear model with a matrix-valued response

> >>

> >> > variable, which should be even faster (not sure off the top of my head

> >>

> >> > how to get the residuals and reshape them to the dimensions you want)

> >>

> >> >

> >>

> >> > On Fri, Aug 9, 2024 at 9:31 PM Bert Gunter 
> >> > <bgunter.4...@gmail.com<mailto:bgunter.4...@gmail.com>> wrote:

> >>

> >> > >

> >>

> >> > > See ?lm.fit.

> >>

> >> > > I must be missing something, because:

> >>

> >> > >

> >>

> >> > > results <- sapply(74:35164, \(i) residuals(lm.fit(purity2,

> >>

> >> > > gem751be.rpkm[, i] )))

> >>

> >> > >

> >>

> >> > > would give you a 751 x 35091 matrix of the residuals from each of the

> >>

> >> > > regressions.

> >>

> >> > > I assume it will be considerably faster than all the overhead you are

> >>

> >> > > carrying in your current code, but of course you'll have to try it and

> >>

> >> > > see. ... Assuming that I have interpreted your request correctly.

> >>

> >> > > Ignore if not.

> >>

> >> > >

> >>

> >> > > Cheers,

> >>

> >> > > Bert

> >>

> >> > >

> >>

> >> > > On Fri, Aug 9, 2024 at 4:50 PM Yuan Chun Ding via R-help

> >>

> >> > > <r-help@r-project.org<mailto:r-help@r-project.org>> wrote:

> >>

> >> > > >

> >>

> >> > > > Dear R users,

> >>

> >> > > >

> >>

> >> > > > I am running the code below. gem751be.rpkm is a data frame of 
> >> > > > 751 samples by 35164 variables: 73 phenotypic variables in columns 
> >> > > > 1 to 73 and 35091 genomic variables (genes) in columns 74 to 35164. 
> >> > > > What I need to do is calculate the residuals for each gene using 
> >> > > > the simple linear regression model genelist[i] ~ purity2.

> >>

> >> > > >

> >>

> >> > > > The code runs, but it takes a long time, even though I have an 
> >> > > > expensive ThinkStation Windows computer.

> >>

> >> > > > Can you provide a fast way to do it?

> >>

> >> > > >

> >>

> >> > > > Thank you,

> >>

> >> > > >

> >>

> >> > > > Ding

> >>

> >> > > >

> >>

> >> > > > ---------------------------------------------------------------------------------

> >>

> >> > > >

> >>

> >> > > >

> >>

> >> > > > library(dplyr)  # provides %>% and select()
> >> > > >
> >> > > > gem751be.rpkm <- merge(gem751be10, as.data.frame(t(rna849.fpkm2)),
> >> > > >                        by.x = "id2", by.y = 0)
> >> > > > row.names(gem751be.rpkm) <- gem751be.rpkm$id3
> >> > > > colnames(gem751be.rpkm) <- gsub(colnames(gem751be.rpkm),
> >> > > >                                 pattern = "-", replacement = "_")
> >> > > > genelist <- gem751be.rpkm %>% dplyr::select(74:35164)
> >> > > > residuals <- NULL
> >> > > > for (i in 1:length(genelist)) {
> >> > > >   #i=1
> >> > > >   formula <- reformulate("purity2", response = names(genelist)[i])
> >> > > >   model <- lm(formula, data = gem751be.rpkm)
> >> > > >   resi <- as.data.frame(residuals(model))
> >> > > >   colnames(resi)[1] <- names(genelist)[i]
> >> > > >   resi <- as.data.frame(t(resi))
> >> > > >   residuals <- rbind(residuals, resi)
> >> > > > }

> >>

> >> > > >

> >>

> >> > > >

> >>

> >> > > >

> >>



______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
