Hi Rui,

Thank you for the help!

You did not remove the rows where both columns of a pair are zero, right?

Ding

From: Rui Barradas <ruipbarra...@sapo.pt>
Sent: Thursday, July 25, 2024 11:15 AM
To: Yuan Chun Ding <ycd...@coh.org>; r-help@r-project.org
Subject: Re: [R] please help generate a square correlation matrix


At 17:39 on 25/07/2024, Yuan Chun Ding via R-help wrote:

> Hi R users,

>

> I generated a square correlation matrix for the dat dataframe below;

> dat<-data.frame(g1=c(1,0,0,1,1,1,0,0,0),

>                  g2=c(0,1,0,1,0,1,1,0,0),

>                  g3=c(1,1,0,0,0,1,0,0,0),

>                  g4=c(0,1,0,1,1,1,1,1,0))

> library("Hmisc")

> dat.rcorr = rcorr(as.matrix(dat))

> dat.r <-round(dat.rcorr$r,2)

>

> However, I want to modify this correlation calculation.
> My dat has more than 1000 rows and 22 columns;
> in each column, fewer than 10% of the values are 1 and most of them are 0.
> So I want to remove the rows with a value of zero in both columns when
> calculating the correlation between two columns.
> I just want to check whether the values of 1 are correlated between the two
> columns.

> Please look at my code in the following;

>

> cor.4gene <-matrix(0,nrow=4*4, ncol=4)

> for (i in 1:4){

>    #i=1

>    for (j in 1:4) {

>      #j=1

>      d <-dat[,c(i,j)]%>%

>        filter(eval(as.symbol(colnames(dat)[i]))!=0 |

>                 eval(as.symbol(colnames(dat)[j]))!=0)

>      c <-cor.test(d[,1],d[,2])

>      cor.4gene[i*j,]<-c(colnames(dat)[i],colnames(dat)[j],

>                          c$estimate,c$p.value)

>    }

> }

> cor.4gene<-as.data.frame(cor.4gene)%>%filter(V1 !=0)

> colnames(cor.4gene)<-c("gene1","gene2","cor","P")

>

> Can you tell me what mistakes I made?
> First, why is cor NA when the correlation is calculated for g1 and g1? I
> thought it should be 1.

>

> cor.4gene$cor[is.na(cor.4gene$cor)]<-1

> cor.4gene$cor[is.na(cor.4gene$P)]<-0

> cor.4gene.sq <-pivot_wider(cor.4gene, names_from = gene1, values_from = cor)

>

> Then the line of code above did not generate a square matrix like the one
> the Hmisc library produced.
> How can I fix my code?

>

> Thank you,

>

> Ding

>

>



Hello,



You are overcomplicating the code. There's no need for as.symbol/eval; the
column numbers do exactly the same job.
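As a minimal sketch of that point (not code from the thread), the pairwise filter can be written with column indices alone. The names `keep`, `d`, `x_same`, `sd_same` and `r_same` are illustrative. The sketch also suggests why the g1/g1 case came out as NA in the quoted code: once the zero rows are dropped, a column compared with itself contains only 1s, so its standard deviation is zero and the correlation is undefined.

```r
dat <- data.frame(g1 = c(1, 0, 0, 1, 1, 1, 0, 0, 0),
                  g2 = c(0, 1, 0, 1, 0, 1, 1, 0, 0),
                  g3 = c(1, 1, 0, 0, 0, 1, 0, 0, 0),
                  g4 = c(0, 1, 0, 1, 1, 1, 1, 1, 0))

i <- 1
j <- 2

# keep the rows where at least one of the two columns is non-zero;
# dat[[i]] selects a column by position, no eval/as.symbol needed
keep <- dat[[i]] != 0 | dat[[j]] != 0
d <- dat[keep, c(i, j)]
n_kept <- nrow(d)

# with i == j the same filter leaves only the 1s of that column
x_same <- dat[[i]][dat[[i]] != 0]
sd_same <- sd(x_same)

# cor() warns that the standard deviation is zero and returns NA
r_same <- suppressWarnings(cor(x_same, x_same))
```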



# create the two results matrices beforehand
r <- P <- matrix(NA, nrow = 4L, ncol = 4L,
                 dimnames = list(names(dat), names(dat)))

for(i in 1:4) {
   x <- dat[[i]]
   for(j in 1:4) {
     if(i == j) {
       # there's nothing to test, assign correlation 1
       r[i, j] <- 1
     } else {
       tmp <- cor.test(x, dat[[j]])
       r[i, j] <- tmp$estimate
       P[i, j] <- tmp$p.value
     }
   }
}
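The loop above uses all rows for every pair, which is why it reproduces Hmisc::rcorr. A possible variant that drops the rows where both columns are zero, as the original question asked, might look like the sketch below. It is an assumption-laden illustration, not the code posted in the thread: the names `r0`, `P0` and `keep` are made up here, and the guard that skips pairs with fewer than 3 remaining rows or a constant column is one choice among several. Note that after this filtering the values are no longer ordinary Pearson correlations of the full data.

```r
dat <- data.frame(g1 = c(1, 0, 0, 1, 1, 1, 0, 0, 0),
                  g2 = c(0, 1, 0, 1, 0, 1, 1, 0, 0),
                  g3 = c(1, 1, 0, 0, 0, 1, 0, 0, 0),
                  g4 = c(0, 1, 0, 1, 1, 1, 1, 1, 0))

n <- ncol(dat)
r0 <- P0 <- matrix(NA_real_, nrow = n, ncol = n,
                   dimnames = list(names(dat), names(dat)))

for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    if (i == j) {
      # nothing to test on the diagonal
      r0[i, j] <- 1
      next
    }
    # drop the rows where both columns of this pair are zero
    keep <- dat[[i]] != 0 | dat[[j]] != 0
    x <- dat[[i]][keep]
    y <- dat[[j]][keep]
    # skip pairs with too few rows or a constant column; they stay NA
    if (sum(keep) >= 3 && sd(x) > 0 && sd(y) > 0) {
      tmp <- cor.test(x, y)
      r0[i, j] <- tmp$estimate
      P0[i, j] <- tmp$p.value
    }
  }
}
```

With this example data, g4 equals 1 in every row kept for the g2/g4 pair, so that entry stays NA instead of raising a warning.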



# these two results are equal up to floating-point precision
dat.rcorr$r
#>           g1        g2        g3        g4
#> g1 1.0000000 0.1000000 0.3162278 0.1581139
#> g2 0.1000000 1.0000000 0.3162278 0.6324555
#> g3 0.3162278 0.3162278 1.0000000 0.0000000
#> g4 0.1581139 0.6324555 0.0000000 1.0000000
r
#>           g1        g2           g3           g4
#> g1 1.0000000 0.1000000 3.162278e-01 1.581139e-01
#> g2 0.1000000 1.0000000 3.162278e-01 6.324555e-01
#> g3 0.3162278 0.3162278 1.000000e+00 1.355253e-20
#> g4 0.1581139 0.6324555 1.355253e-20 1.000000e+00



# these two results are equal up to floating-point precision
dat.rcorr$P
#>           g1         g2        g3         g4
#> g1        NA 0.79797170 0.4070838 0.68452834
#> g2 0.7979717         NA 0.4070838 0.06758329
#> g3 0.4070838 0.40708382        NA 1.00000000
#> g4 0.6845283 0.06758329 1.0000000         NA
P
#>           g1         g2        g3         g4
#> g1        NA 0.79797170 0.4070838 0.68452834
#> g2 0.7979717         NA 0.4070838 0.06758329
#> g3 0.4070838 0.40708382        NA 1.00000000
#> g4 0.6845283 0.06758329 1.0000000         NA





You can put these two results in a list, like Hmisc::rcorr does.



lst_rcorr <- list(r = r, P = P)
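If a long table with columns gene1, gene2, cor and P is still wanted, as in the original question, one way to get it from such a list is sketched below in base R. This is an illustration under assumptions: the 2 x 2 matrices are a toy subset using the g1/g2 values printed above, and the names `long`, `sq`, `idx_bad` and `idx_good` are hypothetical. The last lines also show why a row index of `i * j`, as in the quoted code, cannot address 16 distinct rows, whereas `(i - 1) * 4 + j` can.

```r
# toy 2 x 2 version of the list built above (values for g1 and g2 only)
r <- matrix(c(1, 0.1, 0.1, 1), nrow = 2,
            dimnames = list(c("g1", "g2"), c("g1", "g2")))
P <- matrix(c(NA, 0.7979717, 0.7979717, NA), nrow = 2,
            dimnames = dimnames(r))
lst_rcorr <- list(r = r, P = P)

# square matrix -> long table; rows come out in column-major order
long <- as.data.frame(as.table(lst_rcorr$r), stringsAsFactors = FALSE)
names(long) <- c("gene1", "gene2", "cor")
# as.vector() is column-major too, so P lines up with cor
long$P <- as.vector(lst_rcorr$P)

# long table -> square matrix again
sq <- matrix(long$cor, nrow = 2,
             dimnames = list(unique(long$gene1), unique(long$gene2)))

# row indices for a 4 x 4 grid of pairs: i * j collides, (i - 1) * 4 + j does not
idx_bad  <- as.vector(outer(1:4, 1:4))
idx_good <- as.vector(outer(1:4, 1:4, function(i, j) (i - 1) * 4 + j))
n_bad  <- length(unique(idx_bad))
n_good <- length(unique(idx_good))
```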





Hope this helps,



Rui Barradas










______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
