[R] How to represent the effect of one covariate on regression results?

2020-09-14 Thread Ana Marija
Hello, I was running association analysis using --glm genotypic from: https://www.cog-genomics.org/plink/2.0/assoc with these covariates: sex,age,PC1,PC2,PC3,PC4,PC5,PC6,PC7,PC8,PC9,PC10,TD,array,HBA1C. The result looks like this: #CHROM POS ID REF ALT A1 TEST OBS_CT B

[R] how to replace values in a named vector

2020-09-14 Thread Ana Marija
Hello, I have a vector like this: > head(geneSymbol) Ku8QhfS0n_hIOABXuE Bx496XsFXiAlj.Eaeo W38p0ogk.wIBVRXllY QIBkqIS9LR5DfTlTS8 BZKiEvS0eQ305U0v34 6TheVd.HiE1UF3lX6g "MACC1" "GGACT" "A4GALT" "NPSR1-AS1" "NPSR1-AS1" "AAAS" it has around 15000 en

Re: [R] how to replace values in a named vector

2020-09-14 Thread Ana Marija
Sorry, not replace with NA but with an empty string for a name. For example, this: > geneSymbol["Ku8QhfS0n_hIOABXuE"] Ku8QhfS0n_hIOABXuE "MACC1" would go when I subject it to > geneSymbol["Ku8QhfS0n_hIOABXuE"] Ku8QhfS0n_hIOABXuE On Mon,

Re: [R] How to represent the effect of one covariate on regression results?

2020-09-15 Thread Ana Marija
. > > > https://groups.google.com/g/plink2-users?pli=1 > > > -- > > David. > > On 9/14/20 6:29 AM, Ana Marija wrote: > > Hello, > > > > I was running association analysis using --glm genotypic from: > > https://www.cog-genomics.org/plink/2.0/assoc w

Re: [R] How to represent the effect of one covariate on regression results?

2020-09-15 Thread Ana Marija
% had Type 1. (my TD covariate is reference for the type of diabetes) In the attach is the description of the data. Cheers, Ana On Tue, Sep 15, 2020 at 7:59 PM David Winsemius wrote: > > > On 9/15/20 8:57 AM, Ana Marija wrote: > > Hi Abby and David, > > > > Thanks

[R] how to overlay two histograms

2020-09-17 Thread Ana Marija
Hello, I am trying to overlay two histograms with this: p <- ggplot(d, aes(CHR, counts, fill = name)) + geom_bar(position = "dodge") p but I am getting this error: Error: stat_count() can only have an x or y aesthetic. Run `rlang::last_error()` to see where the error occurred. my data is this:
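The error quoted above is what ggplot2 reports when geom_bar() receives both an x and a y aesthetic, because geom_bar() counts rows itself. A minimal sketch of one way around it, assuming the counts are already computed; the data frame below is made up for illustration and is not the poster's data:

```r
library(ggplot2)

# Hypothetical stand-in data: one precomputed count per chromosome and set.
d <- data.frame(
  CHR    = rep(1:22, times = 2),
  counts = c(seq(22, 1), seq(1, 22)),
  name   = rep(c("new", "old"), each = 22)
)

# geom_col() draws the y values as given; geom_bar() would try to count rows.
p <- ggplot(d, aes(factor(CHR), counts, fill = name)) +
  geom_col(position = "dodge") +
  labs(x = "CHR")
p
```

geom_bar(stat = "identity", position = "dodge") should behave the same way; geom_col() is just the shorter spelling.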

Re: [R] how to overlay two histograms

2020-09-17 Thread Ana Marija
SE) > barpos<-barplot(counts~name+CHR,data=d,beside=TRUE,names.arg=rep("",22)) > legend(40,22,c("new","old"),fill=c("gray20","gray80")) > library(plotrix) > staxlab(1,at=colMeans(barpos),labels=1:22) > > Jim > > On Fri,

[R] help with nesting if else statements

2020-09-23 Thread Ana Marija
Hello, I have a data frame as shown below. I want to create a new column PHENO which will be defined as follows: if CURRELIG==1 -> PHENO==1 in the above subset those that have: PLASER==2 -> PHENO==2 and those where RTNPTHY==1 -> PHENO==1 I tried doing this: a$PHENO=ifelse(a$CURRELIG==1 | a$RTNPT

Re: [R] help with nesting if else statements

2020-09-23 Thread Ana Marija
, 2020 at 11:43 AM Ana Marija wrote: > > Hello, > > I have a data frame as shown below. > I want to create a new column PHENO which will be defined as follows: > if CURRELIG==1 -> PHENO==1 > in the above subset those that have: > PLASER==2 -> PHENO==2 > and >

Re: [R] help with nesting if else statements

2020-09-23 Thread Ana Marija
t RHS position 1 taken as TRUE when assigning to type 'logical' (column 6 named 'PHENO') Please advise, Ana On Wed, Sep 23, 2020 at 2:48 PM Jeremie Juste wrote: > > > Hello Ana Marija, > > I cannot reproduce your error, > > with a$PHENO=ifelse(a$PLASER==2 |a$

[R] how to turn column into column names and fill it with values

2020-09-29 Thread Ana Marija
Hello, I have a data frame like this: > head(mc) FID IID PLATE 1 fam0110 G110 4RWG569 2 fam0113 G113 cherry 3 fam0114 G114 cherry 4 fam0117 G117 4RWG569 5 fam0118 G118 5XAV049 6 fam0119 G119 cherry ... > dim(mc) [1] 16254 > length(unique(mc$PLATE)) [1] 34 I am trying to make a ne

Re: [R] how to turn column into column names and fill it with values

2020-09-29 Thread Ana Marija
0117 G117 4RWG569 2 1 1 > 5 fam0118 G118 5XAV049 1 1 2 > 6 fam0119 G119 cherry 1 2 1 > > > Bert Gunter > > "The trouble with having an open mind is that people keep coming along and > sticking things into it." > -

Re: [R] how to turn column into column names and fill it with values

2020-09-29 Thread Ana Marija
4:ncol(mt)) mt[,i] <- 1 + (names(mt)[i]== mt$PLATE) Thanks! On Tue, Sep 29, 2020 at 12:08 PM Ana Marija wrote: > > HI Bert, > > thank you for getting back to me. > I tried this: > > > dat <- cbind(mc, matrix(0,ncol = 34)) > > head(dat) > FID IID PLATE 1
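The loop quoted above codes each plate column as 2 where the row's PLATE matches the column name and 1 otherwise. A loop-free sketch of the same idea, using only the six rows visible in the head(mc) output; the object names `plates` and `ind` are made up here:

```r
# Rows copied from the head(mc) output quoted in the thread.
mc <- data.frame(
  FID   = c("fam0110", "fam0113", "fam0114", "fam0117", "fam0118", "fam0119"),
  IID   = c("G110", "G113", "G114", "G117", "G118", "G119"),
  PLATE = c("4RWG569", "cherry", "cherry", "4RWG569", "5XAV049", "cherry"),
  stringsAsFactors = FALSE
)

plates <- unique(mc$PLATE)

# One column per plate: 2 where the row's PLATE matches, 1 otherwise.
ind <- sapply(plates, function(p) 1L + (mc$PLATE == p))

# cbind() on a data frame keeps the plate names as column names.
mt <- cbind(mc, ind)
mt
```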

[R] 2 D density plot interpretation and manipulating the data

2020-10-08 Thread Ana Marija
Hello, I have a data frame like this: > head(SNP) mean var sd FQC.10090295 0.0327 0.002678 0.0517 FQC.10119363 0.0220 0.000978 0.0313 FQC.10132112 0.0275 0.002088 0.0457 FQC.10201128 0.0169 0.000289 0.0170 FQC.10208432 0.0443 0.004081 0.0639 FQC.10218466 0.0116 0.000131 0.

Re: [R] 2 D density plot interpretation and manipulating the data

2020-10-08 Thread Ana Marija
NP[SNP$density>400,] and plot it again: p <- ggplot(a, mapping = aes(x = mean, y = var)) p <- p + geom_density_2d() + geom_point() + my.theme + ggtitle("SNPS_red") On Thu, Oct 8, 2020 at 3:52 PM Ana Marija wrote: > > Hello, > > I have a data frame like this: > >

Re: [R] 2 D density plot interpretation and manipulating the data

2020-10-09 Thread Ana Marija
Hi Abby, thank you for getting back to me and for this useful information. I'm trying to detect the outliers in my distribution based of mean and variance. Can I see that from the plot I provided? Would outliers be outside of ellipses? If so how do I extract those from my data frame, based on whi

Re: [R] 2 D density plot interpretation and manipulating the data

2020-10-09 Thread Ana Marija
an open mind is that people keep coming along and > sticking things into it." > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > > > On Fri, Oct 9, 2020 at 8:25 AM Ana Marija wrote: >> >> Hi Abby, >> >> thank you for gettin

Re: [R] 2 D density plot interpretation and manipulating the data

2020-10-09 Thread Ana Marija
a bad idea. > And if it's a good idea, then how much to trim. > > > On Sat, Oct 10, 2020 at 5:47 AM Ana Marija > wrote: > > > > Hi Bert, > > > > Another confrontational response from you... > > > > You might have noticed that I use the wor

[R] how do I remove entries in data frame from a vector

2020-10-21 Thread Ana Marija
Hello, I have a data frame with one column: > remove V1 1 ABAFT_g_4RWG569_BI_SNP_A10_35096 2 ABAFT_g_4RWG569_BI_SNP_B12_35130 3 ABAFT_g_4RWG569_BI_SNP_E09_35088 4 ABAFT_g_4RWG569_BI_SNP_E12_35136 5 ABAFT_g_4RWG569_BI_SNP_F11_35122 6 ABAFT_g_4RWG569_BI_SNP_F12_351

Re: [R] how do I remove entries in data frame from a vector

2020-10-21 Thread Ana Marija
lename %in% as.character(remove$V1)] > > > > > > Hope this helps, > > > > Rui Barradas > > > > At 22:15 on 21/10/20, Ana Marija wrote: > >> Hello, > >> > >> I have a data frame with one column: > >
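The fragment of Rui's answer above filters a vector with a negated %in% test. A small sketch of that pattern; `samples` is a hypothetical stand-in for the poster's vector (its real name is cut off in the snippet), and the "keep" entries are invented:

```r
# IDs to drop, taken from the first rows of `remove` shown in the question.
remove <- data.frame(
  V1 = c("ABAFT_g_4RWG569_BI_SNP_A10_35096",
         "ABAFT_g_4RWG569_BI_SNP_B12_35130"),
  stringsAsFactors = FALSE
)

# Hypothetical vector to filter.
samples <- c("ABAFT_g_4RWG569_BI_SNP_A10_35096", "sample_keep_1",
             "ABAFT_g_4RWG569_BI_SNP_B12_35130", "sample_keep_2")

# Keep only the entries that do NOT appear in remove$V1.
kept <- samples[!samples %in% as.character(remove$V1)]
kept
```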

Re: [R] how do I remove entries in data frame from a vector

2020-10-21 Thread Ana Marija
Makes sense, thank you! On Wed, 21 Oct 2020 at 17:46, Rolf Turner wrote: > > On Wed, 21 Oct 2020 16:15:22 -0500 > Ana Marija wrote: > > > Hello, > > > > I have a data frame with one column: > > > > > remove > > > >

[R] how to order variables on correlation plot

2020-11-06 Thread Ana Marija
Hello I have data like this: > head(my_data) subjects DIABDUR HBA1C ESRD SEX AGE PHENO C1 C2 1 fam0110_G110 38 9.41 2 51 2 -0.01144980 0.002661140 2 fam0113_G113 30 12.51 2 40 2 -0.00502052 -0.000929061 3 fam0114_G114 23 8.42

Re: [R] how to order variables on correlation plot

2020-11-06 Thread Ana Marija
sorry forgot to attach the plot. On Fri, Nov 6, 2020 at 8:07 AM Ana Marija wrote: > > Hello > > I have data like this: > > > head(my_data) > subjects DIABDUR HBA1C ESRD SEX AGE PHENO C1 C2 > 1 fam0110_G110 38 9.41 2 51 2 -

Re: [R] how to order variables on correlation plot

2020-11-06 Thread Ana Marija
lp page, ?corrplot . > > -- > > David. > > On 11/6/20 6:08 AM, Ana Marija wrote: > > sorry forgot to attach the plot. > > On Fri, Nov 6, 2020 at 8:07 AM Ana Marija wrote: > > Hello > > I have data like this: > > head(my_data) > > subjects DIABDUR HB

[R] making code (loop) more efficient

2020-12-15 Thread Ana Marija
Hello, I made a terribly inefficient code which runs forever but it does run. library(dplyr) library(splitstackshape) datalist = list() files <- list.files("/WEIGHTS1/Retina", pattern=".RDat", ignore.case=T) for(i in files) { a<-get(load(i)) names <- rownames(a) data <- as.data.frame(cbind(name

Re: [R] making code (loop) more efficient

2020-12-15 Thread Ana Marija
much experience with data tables I may be > wrong, but I suspect that the column name "blup" may not be visible or > even present in "data". I don't see it in "dd" above this code > fragment. > > Jim > > On Wed, Dec 16, 2020 at 11:12 AM Ana Mar

Re: [R] making code (loop) more efficient

2020-12-15 Thread Ana Marija
e and see if it contains a > column named "blup" or just the values that were extracted from > a$blup. Also, I assume that weight=blup looks for an object named > "blup", which may not be there. > > Jim > > On Wed, Dec 16, 2020 at 1:20 PM Ana Marija >

Re: [R] making code (loop) more efficient

2020-12-16 Thread Ana Marija
assuming that the filename > # is stored in files[i] > files<-"retina.ENSG0135776.wgt.RDat" > i<-1 > WGT<-rep(files[i],length(rsid)) > data<-data.frame(rsid=rsid,weight=a$top1, > ref_allele=ref_allele,eff_allele,WGT=WGT) > data > > Note that the

Re: [R] making code (loop) more efficient

2020-12-16 Thread Ana Marija
rsplit(names, ":")[-2]] out <- data[, .(rsid, ref_allele, eff_allele)][, WGT := files[i]][] } return(out) rm(data) gc() } parallel::stopCluster(cl) big_data <- rbindlist(lst_out, fill = TRUE) On Wed, Dec 16, 2020 at 9:31 AM Ana Marija wrote: >

[R] how to quantify outliers on multi-dimensional scaling plot?

2021-03-31 Thread Ana Marija
Hello, I am in process of writing a grant where I am explaining my planned methylation analysis using R software "minfi". In the text of the grant I am mentioning looking for samples containing outliers in the multi-dimensional scaling (MDS) plot https://rdrr.io/bioc/minfi/man/mdsPlot.html . My qu

[R] Error in n * rvec : non-numeric argument to binary operator

2021-04-19 Thread Ana Marija
Hello, I have this code, and when I run it: > kbpowerf() Error in n * rvec : non-numeric argument to binary operator this is the code: function (){ #USER SPECIFICATION PORTION alpha=0.05 #DESIGNATED ALPHA g=3 #NUMBER OF GROUPS nvec=c(25,10,15) #GROUP SIZES beta1vec=c(789.93,122.87,1871

[R] unable to remove NAs from a data frame

2021-09-16 Thread Ana Marija
Hi All, I have lines in file that look like this: > df[14509227,] SNP A1 A2 freq b se p N 1: NA NA NA NA NA data looks like this: > head(df) SNP A1 A2 freq b se p N 1: rs74337086 G A 0.0024460 0.1627 0.1231 0.1865 218792 2: rs76388980 G

Re: [R] unable to remove NAs from a data frame

2021-09-16 Thread Ana Marija
) > #[1] 145092258 > > df[14509227,] # beyond nrow(df) by 2 > > > Hope this helps, > > Rui Barradas > > > At 15:12 on 16/09/21, Ana Marija wrote: > > Hi All, > > > > I have lines in file that look like this: > > > >> df[1450922
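Rui's point above is that the row index lies past the end of the table, so R returns a row of NAs that does not exist in the data and cannot be removed. A tiny base-R sketch of that behaviour with a made-up two-row data frame (the poster's object appears to be a data.table, which is an assumption from the print format):

```r
# Toy data frame; values are illustrative only.
df <- data.frame(SNP = c("rs1", "rs2"), p = c(0.2, 0.5),
                 stringsAsFactors = FALSE)

# Indexing beyond nrow(df) does not fail; it returns an all-NA row.
out_of_range <- df[nrow(df) + 2, ]
out_of_range

# There is nothing to remove: the data itself has no missing values.
nrow(na.omit(df)) == nrow(df)
```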

[R] calculate power-linear mixed effect model

2021-09-17 Thread Ana Marija
Hi All, I plan to identify metabolite levels that differ between individuals with various retinopathy outcomes (DR or noDR). I plan to model metabolite levels using linear mixed models ref as implemented in lmm2met software. The model covariates will include: age, sex, SV1, SV, and disease_conditi

Re: [R] calculate power-linear mixed effect model

2021-09-17 Thread Ana Marija
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > > On Fri, Sep 17, 2021 at 12:22 PM Ana Marija > wrote: > > > > Hi All, > > > > I plan to identify metabolite levels that differ between individuals > > with various retinopa

[R] how to do inverse log of every value in every column in data frame

2021-10-14 Thread Ana Marija
Hi All, I have a data frame like this: > head(b) LRET02LRET04LRET06LRET08LRET10LRET12LRET14 1 0 0.6931472 . 1.0986123 1.0986123 1.0986123 0.6931472 2 2.1972246 2.4849066 2.4849066 . 2.5649494 2.6390573 2.6390573 3 1.6094379 1.7917595 1.6094379
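A sketch of one way to back-transform every column, under two assumptions not confirmed in the snippet: the values are natural logs (0.6931472 is log(2)), and "." marks a missing value. The helper name `to_original_scale` is made up:

```r
# First three columns and rows from the head(b) output, read as text
# because of the "." placeholders.
b <- data.frame(
  LRET02 = c("0", "2.1972246", "1.6094379"),
  LRET04 = c("0.6931472", "2.4849066", "1.7917595"),
  LRET06 = c(".", "2.4849066", "1.6094379"),
  stringsAsFactors = FALSE
)

# Turn "." into NA, convert to numeric, then undo the natural log.
to_original_scale <- function(x) {
  x <- as.character(x)
  x[x == "."] <- NA
  exp(as.numeric(x))
}

# b_exp[] <- keeps the data frame shape while replacing every column.
b_exp <- b
b_exp[] <- lapply(b, to_original_scale)
b_exp
```

If the logs were base 10 or base 2 instead, `10^x` or `2^x` would replace `exp(x)`.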

Re: [R] how to do inverse log of every value in every column in data frame

2021-10-14 Thread Ana Marija
with having an open mind is that people keep coming along and > sticking things into it." > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > > > On Thu, Oct 14, 2021 at 10:10 AM Ana Marija > wrote: > >> Hi All, >> >> I hav

[R] Error in readChar(con, 5L, useBytes = TRUE) : cannot open the connection

2019-07-09 Thread Ana Marija
Hello, I am trying to run this program FUSION.assoc_test.R from : http://gusevlab.org/projects/fusion/#typical-analysis-and-output [t.cri.asokovic@cri16in002 fusion_twas-master]$ Rscript FUSION.assoc_test.R \ > --sumstats /gpfs/data/stranger-lab/anamaria/meta_gwas/META_CHR22_1.txt \ > --weights .

[R] Call `rlang::last_error()` to see a backtrace

2019-07-10 Thread Ana Marija
Hello, I am trying to use this program: https://github.com/kenhanscombe/ukbtools > my_ukb_data[1:3,1:3] eid sex_f31_0_0 year_of_birth_f34_0_0 1 117 Female 1938 2 125 Female 1951 3 138 Male 1961 > ukb_icd_diagno

Re: [R] Call `rlang::last_error()` to see a backtrace

2019-07-10 Thread Ana Marija
so, always reply to list. > > On Wed, Jul 10, 2019, 6:08 PM Ana Marija > wrote: > > > Hi Patrick, > > > > thanks for getting back to me, I tried that: > > > > > ukb_icd_diagnosis(my_ukb_data, eid = "117", icd.version = 10) > > Error in u

Re: [R] Call `rlang::last_error()` to see a backtrace

2019-07-10 Thread Ana Marija
; > > Sent from my iPhone > > > > > On Jul 10, 2019, at 1:09 PM, Patrick (Malone Quantitative) < > mal...@malonequantitative.com> wrote: > > > > > > First response: The ID column in your data is labeled "eid" but your > > > function call refers

[R] How to create a new column based on the values from multiple columns which are matching a particular string?

2019-07-29 Thread Ana Marija
I have data frame which looks like this: df=data.frame( eye_problemsdisorders_f6148_0_1=c(A,C,D,NA,D,A,C,NA,B,A), eye_problemsdisorders_f6148_0_2=c(B,C,NA,A,C,B,NA,NA,A,D), eye_problemsdisorders_f6148_0_3=c(C,A,D,D,B,A,NA,NA,A,B), eye_problemsdisorders_f6148_0_4=c(D,D,NA,B,A,C,NA,C,A,B),

Re: [R] How to create a new column based on the values from multiple columns which are matching a particular string?

2019-07-29 Thread Ana Marija
which would be named "case" and values inside would be: 1,1,0,1,1,1,0,0,1,1 so "case" column is where value "A" can be found in any column. On Mon, Jul 29, 2019 at 12:53 PM Eric Berger wrote: > You may have a typo/misstatement in your question. > You define a

Re: [R] How to create a new column based on the values from multiple columns which are matching a particular string?

2019-07-29 Thread Ana Marija
Thank you so much! Just to confirm here MARGIN=1 indicates that "A" should appear at least once per row? On Mon, Jul 29, 2019 at 1:53 PM Eric Berger wrote: > df$case <- apply(df,MARGIN = 1,function(v) { as.integer("A" %in% v) }) > > > On Mon, Jul 29,
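On the MARGIN question: MARGIN = 1 only says that the function is called once per row; the "at least once" part comes from `"A" %in% v`. A sketch using the four columns visible in the question (the original data frame may have had more), with the letters quoted as strings:

```r
df <- data.frame(
  eye_problemsdisorders_f6148_0_1 = c("A","C","D",NA,"D","A","C",NA,"B","A"),
  eye_problemsdisorders_f6148_0_2 = c("B","C",NA,"A","C","B",NA,NA,"A","D"),
  eye_problemsdisorders_f6148_0_3 = c("C","A","D","D","B","A",NA,NA,"A","B"),
  eye_problemsdisorders_f6148_0_4 = c("D","D",NA,"B","A","C",NA,"C","A","B"),
  stringsAsFactors = FALSE
)

# MARGIN = 1: v is one row at a time; the result is 1 if "A" occurs in it.
case_apply <- apply(df, MARGIN = 1, function(v) as.integer("A" %in% v))

# Vectorised alternative: count the "A" cells per row, ignoring NA.
case_rowsums <- as.integer(rowSums(df == "A", na.rm = TRUE) > 0)

df$case <- case_apply
df$case
```

On these four columns both versions give 1,1,0,1,1,1,0,0,1,1, the values listed earlier in the thread.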

[R] filter and add a column

2019-08-06 Thread Ana Marija
Hello, I am filtering my data frame "tot" via: controls=tot %>% filter_all(any_vars(. %in% c("E109", "E119","E149"))) %>% filter_all(any_vars(. %in% c("Caucasian"))) %>% filter_all(any_vars(. %in% c("No kinship found","Ten or more third-degree relatives identified"))) > dim(controls) [1] 15381

Re: [R] filter and add a column

2019-08-06 Thread Ana Marija
some kind of ID variable? If so, it should be > straightforward with the appropriate joining function. > > What have you tried? > > Also, please post in plain text. > > > On Tue, Aug 6, 2019 at 2:16 PM Ana Marija > wrote: > > > > Hello, > > > > I am

Re: [R] filter and add a column

2019-08-06 Thread Ana Marija
guide. > > > On Tue, Aug 6, 2019 at 2:38 PM Ana Marija > wrote: > > > > Hi Patrick, > > > > yes both controls and tot have "eid" column, please see attached > > > > Can you please tell me what it means to post in "plain text"?

Re: [R] filter and add a column

2019-08-06 Thread Ana Marija
tor of 1s to controls before joining the datasets. > > On Tue, Aug 6, 2019 at 3:01 PM Ana Marija > wrote: > > > > I really don't know how I would implement this > > > > On Tue, Aug 6, 2019 at 1:42 PM Patrick (Malone Quantitative) < > mal...@malonequan

[R] Help with if else statement

2019-08-07 Thread Ana Marija
Hello, I have a data frame which looks like this: > head(pt) eidQ phenoQ phenoH 1 117 -9 -9 2 125 -9 -9 3 138 -9 1 4 142 -9 -9 5 156 -9 -9 6 174 -9 -9 7 138 -9 1 8 1000127 2 1 9 1000690 2

Re: [R] Help with if else statement

2019-08-07 Thread Ana Marija
does this look ok: pt$pheno=ifelse(pt$phenoQ==-9 & pt$phenoH==-9,-9,ifelse(pt$phenoH==2 | pt$phenoQ==2,2,1)) On Wed, Aug 7, 2019 at 1:40 PM Ana Marija wrote: > > Hello, > > I have a data frame which looks like this: > > > head(pt) > eidQ phenoQ phenoH > 1 10
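To check what the nested ifelse() above does, it can be run on a few toy rows. The values below are made up to cover the three branches and are not the poster's data:

```r
# Toy rows: both missing, control in one source, case in one source,
# control with the other source missing.
pt <- data.frame(
  eid    = 1:4,
  phenoQ = c(-9, -9,  2,  1),
  phenoH = c(-9,  1,  1, -9)
)

# -9 only if both are -9; otherwise 2 if either is 2; otherwise 1.
pt$pheno <- ifelse(pt$phenoQ == -9 & pt$phenoH == -9, -9,
                   ifelse(pt$phenoH == 2 | pt$phenoQ == 2, 2, 1))
pt
```

Note that a row with -9 in one column and 1 in the other ends up as 1 under this rule, which may or may not be what is wanted.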

[R] how to reverse colors on boxplot

2019-08-08 Thread Ana Marija
Hello, I made plot in attach using: boxplot(flcn_M~subject,data=dx,col = c("royalblue1","palevioletred1"),xlab="subjects",ylab="Expression estimate in delta (log2)",boxwex = 0.2,frame.plot = FALSE) stripchart(flcn_M~subject, vertical = TRUE, data = dx,method = "jitter", add = TRUE,pch = 20, col=r

[R] how to calculate True Positive Rate in R?

2019-09-24 Thread Ana Marija
Hello, I tried using qvalue function: library(qvalue) qval_obj=qvalue(pvalR) pi1=1-qval_obj$pi0 but after running: qval_obj=qvalue(pvalR) Error in smooth.spline(lambda, pi0, df = smooth.df) : missing or infinite values in inputs are not allowed or qval_obj=qvalu

[R] how to add p values to bar plot?

2019-09-27 Thread Ana Marija
Hi, I created a bar plot with this code: library(ggplot2) df <- data.frame("prop" = c(7.75,70.42), "Name" = c("All Genes","RG Genes")) p<-ggplot(data=df, aes(x=Name, y=prop,fill=Name)) + geom_bar(stat="identity")+ labs(x="", y = "Proportion of cis EQTLs")+ scale_fill_brewer(palette="Greens") +

Re: [R] how to add p values to bar plot?

2019-09-27 Thread Ana Marija
70.42-7.75 or fold change 70.42/7.75. If they are absolute value you can > also scale them in log scale and do the same. Hope this helps. Good luck. > > Vivek > > On Fri, Sep 27, 2019 at 9:32 PM Ana Marija > wrote: > >> Hi Vivek, >> >> Thanks for getting back

[R] can not extract rows which match a string

2019-10-03 Thread Ana Marija
Hello, I have a dataframe (t1) with many columns, but the one I care about is this: > unique(t1$sex_chromosome_aneuploidy_f22019_0_0) [1] NA "Yes" it has these two values. I would like to remove from my dataframe t1 all rows which have "Yes" in t1$sex_chromosome_aneuploidy_f22019_0_0 I tried
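What the poster tried is cut off, but a common pitfall with a column that holds only NA and "Yes" is that `x != "Yes"` evaluates to NA for the NA entries, and subsetting with NA yields all-NA rows. A sketch with a made-up four-row `t1`:

```r
t1 <- data.frame(
  id = 1:4,
  sex_chromosome_aneuploidy_f22019_0_0 = c(NA, "Yes", NA, "Yes"),
  stringsAsFactors = FALSE
)
flag <- t1$sex_chromosome_aneuploidy_f22019_0_0

# Naive filter: the NA comparisons produce rows filled with NA.
naive <- t1[flag != "Yes", ]

# Treat NA explicitly as "not Yes" so those rows are kept intact.
kept <- t1[is.na(flag) | flag != "Yes", ]
kept
```

`t1[!(flag %in% "Yes"), ]` is an equivalent spelling, since %in% never returns NA.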

[R] how to select all columns that contain in any of their rows a partial match for a string?

2019-10-05 Thread Ana Marija
Hello, I have a data frame tot which has many columns and many rows. I am trying to find all columns that have say a value in any of their rows that STARTS WITH: "E94" for example there are columns like this: > unique(tot$diagnoses_icd9_f41271_0_44) [1] NA "E9420" I tried: s=select(tot,st

Re: [R] how to select all columns that contain in any of their rows a partial match for a string?

2019-10-05 Thread Ana Marija
tot$newcol <- -9 > tot$newcol[e10] <- 1 > tot$newcol[e11] <- 2 > > > On both cases the 2 lines sapply/rowSums can be made one with > > rowSums(sapply(...)) > 0 > > > Hope this helps, > > Rui Barradas > > At 20:52 on 05/10/19, Ana Marija wrote: >
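For the original question in this thread (selecting the columns in which any value starts with "E94"), a base-R sketch could look like the following. The two-column `tot` is a made-up miniature; only the first column name comes from the post:

```r
tot <- data.frame(
  diagnoses_icd9_f41271_0_44 = c(NA, "E9420", NA),
  other_col                  = c("E109", NA, "X1"),
  stringsAsFactors = FALSE
)

# TRUE for every column that has at least one value starting with "E94";
# startsWith() gives NA for NA input, so na.rm = TRUE is needed in any().
has_e94 <- sapply(tot, function(x) {
  any(startsWith(as.character(x), "E94"), na.rm = TRUE)
})

# drop = FALSE keeps a data frame even when a single column matches.
s <- tot[, has_e94, drop = FALSE]
names(s)
```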

[R] how to compare two distributions and calculate p value?

2019-10-22 Thread Ana Marija
Hello, I would like to calculate a p value from two distributions, one looks like this: > head(b) gene_id number_of_eqtles_per_gene 1: ENSG0237683.5 5 2: ENSG0225972.1 267 3: ENSG0225630.197 4: ENSG0

Re: [R] how to compare two distributions and calculate p value?

2019-10-22 Thread Ana Marija
er > > "The trouble with having an open mind is that people keep coming along and > sticking things into it." > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > > > On Tue, Oct 22, 2019 at 6:18 PM Ana Marija > wrote: >> >

[R] negative vector length when merging data frames

2019-10-23 Thread Ana Marija
Hello, I have two data frames like this: > head(l4) X1X2 X3 X4 X5 variant_id pval_nominal gene_id.LCL 1 chr1 13550 G A b38 1:13550:G:A 0.375614 ENSG0227232 2 chr1 14671 G C b38 1:14671:G:C 0.474708 ENSG0227232 3 chr1 14677 G A b38 1:14677:G:A 0.699887 ENSG0

Re: [R] negative vector length when merging data frames

2019-10-23 Thread Ana Marija
I also tried left_join but I got: Error: std::bad_alloc > df3 <- left_join(l4, asign, by = c("chr","pos")) Error: std::bad_alloc > dim(l4) [1] 166941635 8 > dim(asign) [1] 107371528 5 On Wed, Oct 23, 2019 at 5:32 PM Ana Marija wrote: > >

Re: [R] negative vector length when merging data frames

2019-10-23 Thread Ana Marija
ently the best way to do it if you > have a database manager is to read the two datasets into tables and do > the join via SQL or whatever language is available. > > Jim > > On Thu, Oct 24, 2019 at 10:17 AM Ana Marija > wrote: > > > > no can you please send m

Re: [R] negative vector length when merging data frames

2019-10-23 Thread Ana Marija
X2 X3 X4 X5 > [6] variant_id pval_nominal gene_id.LCL gene chr_pos > [11] p.val.Retina > <0 rows> (or 0-length row.names) > > It works okay, but there are no matches in the join. So I can't even > guess what the proble

Re: [R] negative vector length when merging data frames

2019-10-23 Thread Ana Marija
No, can you please send me an example of how the command would look in my case? On Wed, Oct 23, 2019 at 6:16 PM Jim Lemon wrote: > > Yes. Have you tried the bigmemory package? > > Jim > > On Thu, Oct 24, 2019 at 10:08 AM Ana Marija > wrote: > > > > Hi Jim,

Re: [R] negative vector length when merging data frames

2019-10-23 Thread Ana Marija
I am using R-3.6.1 and these libraries: library(data.table) library(dplyr) On Wed, Oct 23, 2019 at 6:54 PM Duncan Murdoch wrote: > > On 23/10/2019 7:04 p.m., Ana Marija wrote: > > I also tried left_join but I got: Error: std::bad_alloc > > > >> df3 <- left_joi

[R] how to calculate multiple meta p values

2019-10-25 Thread Ana Marija
Hello, I would like to use this package metap to calculate multiple p values I have my data frame with 3 p values > head(tt) RS G E B 1: rs2089177 0.9986 0.7153 0.604716 2: rs4360974 0.9738 0.7838 0.430228 3: rs6502526 0.9744 0.7839 0.4291

Re: [R] how to calculate multiple meta p values

2019-10-25 Thread Ana Marija
this is the function I was referring to: https://www.rdocumentation.org/packages/metap/versions/1.1/topics/sumz On Fri, Oct 25, 2019 at 6:31 PM Ana Marija wrote: > > Hello, > > I would like to use this package metap > to calculate multiple p values > > I have my data
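As a rough illustration of the row-wise calculation being asked about, here is a hand-rolled, unweighted sum-of-z (Stouffer) combination in base R. This is a sketch of the method, not metap's own code; the column names RS, G, E, B are read off the head(tt) output and the helper name `stouffer` is made up:

```r
# First two complete rows from the head(tt) output.
tt <- data.frame(
  RS = c("rs2089177", "rs4360974"),
  G  = c(0.9986, 0.9738),
  E  = c(0.7153, 0.7838),
  B  = c(0.604716, 0.430228),
  stringsAsFactors = FALSE
)

# Unweighted Stouffer: convert each p to a z score, sum, rescale by sqrt(k),
# and convert the combined z back to an upper-tail p value.
stouffer <- function(p) {
  z <- sum(qnorm(p, lower.tail = FALSE)) / sqrt(length(p))
  pnorm(z, lower.tail = FALSE)
}

# One combined p value per row (MARGIN = 1).
tt$META <- apply(tt[, c("G", "E", "B")], 1, stouffer)
tt
```

p values of exactly 0 or 1 map to infinite z scores, so they would need handling before a real run.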

Re: [R] how to calculate multiple meta p values

2019-10-28 Thread Ana Marija
> There must be several ways of doing this but see below for an idea with > comments in-line. > > On 26/10/2019 00:31, Ana Marija wrote: > > Hello, > > > > I would like to use this package metap > > to calculate multiple p values > > > > I have

[R] Error when using qvalue function

2019-10-28 Thread Ana Marija
Hello, I am trying to calculate True Positive Rate, TPR with this procedure: pvals=q$METAge qval_obj=qvalue(pvals) #is false discovery rate pi1=1-qval_obj$pi0 #TPR pi1 #TPR But I am getting this error: Error in smooth.spline(lambda, pi0, df = smooth.df) : missing or

Re: [R] Error when using qvalue function

2019-10-28 Thread Ana Marija
1.053550e-01,4.812686e-01,1.404957e-01,9.835912e-02,4.373995e-01, > 8.803856e-02) > qval_obj=qvalue(pvals) > qval_obj$pi0 > [1] 0.1981095 > pi1=1-qval_obj$pi0 > pi1 > [1] 0.8018905 > > Jiim > > On Tue, Oct 29, 2019 at 8:45 AM Ana Marija > wrote: &g

Re: [R] Error when using qvalue function

2019-10-28 Thread Ana Marija
: missing or infinite values in inputs are not allowed On Mon, Oct 28, 2019 at 6:02 PM Ana Marija wrote: > > can you please send me command you used to install it? > > On Mon, Oct 28, 2019 at 5:12 PM Jim Lemon wrote: > > > > Hi Ana, > > Seems to work without er

Re: [R] Error when using qvalue function

2019-10-28 Thread Ana Marija
not to install packages in my > script but just to use `library` and manually get each of the packages that > `library` complains about onto my machine. Once done, the script runs just > fine after that. > > On October 28, 2019 4:08:03 PM PDT, Ana Marija > wrote: >

Re: [R] negative vector length when merging data frames

2019-10-28 Thread Ana Marija
- (l4join$X1 %in% ajoin$chr) & (l4join$X2 %in% ajoin$pos) > i2 <- (ajoin$chr %in% l4join$X1) & (ajoin$pos %in% l4join$X2) > > rm(l4join, ajoin) # don't need this any more, remove them > > # now the real fread's > l4 <- data.table::fread(l4_file) > asign <

Re: [R] Error when using qvalue function

2019-10-29 Thread Ana Marija
6.024415e-01, >> 2.459322e-02,2.873351e-01,8.477168e-01,1.351068e-02, >> 1.053550e-01,4.812686e-01,1.404957e-01,9.835912e-02,4.373995e-01, >> 8.803856e-02) >> qval_obj=qvalue(pvals) >> qval_obj$pi0 >> [1] 0.1981095 >> pi1=1-qval_obj$pi0 >> pi1 &

Re: [R] how to calculate multiple meta p values

2019-10-30 Thread Ana Marija
ry about that, I > should have thought of it before. > > When I next update metap I will try to get it to degrade more gracefully > when it finds an error. > > Michael > > On 28/10/2019 19:06, Ana Marija wrote: > > Hi Michael, > > > > I tried what y

Re: [R] how to calculate multiple meta p values

2019-10-30 Thread Ana Marija
wr "numeric" > sum(is.na(d$LCL)) [1] 0 > sum(is.na(d$Retina)) [1] 0 > sum(is.na(d$wl)) [1] 0 > sum(is.na(d$wr)) [1] 0 > dim(d) [1] 1668837 7 On Wed, Oct 30, 2019 at 4:52 PM Ana Marija wrote: > > Hi Michael, > > this still doesn't work, by dat

Re: [R] how to calculate multiple meta p values

2019-10-31 Thread Ana Marija
Can you please get back to me about this, I need this meta p values for manuscript I have to submit next week On Wed, Oct 30, 2019 at 5:35 PM Ana Marija wrote: > > I also tried to do it this way: > > d$META <- sapply(seq_len(nrow(d)), function(rn) { > unlist(sumz(as.matr

[R] How to calculate p value and correlation coefficient for Spearman’s correlation of differential expression data with 40000 permutations?

2019-10-31 Thread Ana Marija
Hello, I have 3 groups,let's call them g1, g2, g3. Each of them is a result of analysis in between groups of conditions, and g1 looks like this geneSymbol logFC t P.Value adj.P.Val Beta EXykpF1BRREdXnv9Xk MKI67 -0.3115880 -5.521186 5.77213

[R] How to merge 3 data frames by rownames?

2019-11-05 Thread Ana Marija
Hi, I have 3 data frames like this: > head(s11) B_NoD Ebfrl.7uOZfnjp_E7k 7.583709 ueQUrXd5FH554RlhZc 5.177791 0Uu3XrB6Bd14qoNeuc 4.680306 0t7nhVLii6tSAxtLhc 4.565023 fSUyR.vR7Xu0iR4nUU 2.885992 0Tm7hdRJxd9zoevPlA 2.866847 > head(s22) B_DwoC Ebfrl.7uOZfnj
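One base-R way to merge several data frames on their row names is to wrap merge(by = "row.names") in a helper and fold it over a list with Reduce(). In the sketch below only the two s11 values come from the post; the other numbers, the name `s33`, its column name, and the helper name are assumptions for illustration:

```r
ids <- c("Ebfrl.7uOZfnjp_E7k", "ueQUrXd5FH554RlhZc")

s11 <- data.frame(B_NoD  = c(7.583709, 5.177791), row.names = ids)
s22 <- data.frame(B_DwoC = c(1.5, 2.5),           row.names = ids)      # made-up values
s33 <- data.frame(B_DwC  = c(3.5, 4.5),           row.names = rev(ids)) # made-up, other order

# merge() puts the key in a "Row.names" column; move it back to row names
# so the result can be merged again.
merge_by_rownames <- function(x, y) {
  m <- merge(x, y, by = "row.names")
  rownames(m) <- m$Row.names
  m$Row.names <- NULL
  m
}

merged <- Reduce(merge_by_rownames, list(s11, s22, s33))
merged
```

By default this keeps only the IDs present in all three data frames; passing `all = TRUE` to merge() inside the helper would keep the union instead.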

[R] how to get higher precision p value output

2019-11-05 Thread Ana Marija
Hi, I am running this function: library(psych) corr.test.col.1to3 <- corr.test(allF[1:3], method = "spearman", use = "complete.obs") names(corr.test.col.1to3) corr.test.col.1to3$p and my result looks like this: > corr.test.col.1to3$p B_NoDB_DwoC B_DwC B_NoD 0.000 0.000

[R] How to interpret Mendelian randomization results?

2019-11-06 Thread Ana Marija
Hello, I did Mendelian randomization using this software: https://cran.r-project.org/web/packages/MendelianRandomization/vignettes/Vignette_MR.pdf library(MendelianRandomization) f=read.table("246LDout272Biobank_Retina.txt", header=T) > head(f) rs exposure.beta exposure.se outcome.beta

[R] how to find number of unique rows for combination of r columns

2019-11-08 Thread Ana Marija
Hello, I have a data frame like this: > head(dt,20) chrpos gene_id pval_nominal pval_ret wl wr 1: chr1 54490 ENSG02272320.6084950 0.7837780 31.62278 21.2838 2: chr1 58814 ENSG02272320.2952110 0.8975820 31.62278 21.2838 3: chr1 60351 ENSG02272

Re: [R] how to find number of unique rows for combination of r columns

2019-11-08 Thread Ana Marija
- > Dr. Gerrit Eichner Mathematical Institute, Room 212 > gerrit.eich...@math.uni-giessen.de Justus-Liebig-University Giessen > Tel: +49-(0)641-99-32104 Arndtstr. 2, 35392 Giessen, Germany > http://www.uni-giessen.de/eichner >

Re: [R] how to find number of unique rows for combination of r columns

2019-11-08 Thread Ana Marija
--------- > > On 08.11.2019 at 16:02, Ana Marija wrote: > > I tried it but I got this error: > >> udt <- unique(dt[c("chr", "pos", "gene_id")]) > > Error in `[.data.table`(dt, c("chr", "pos", "gene_id")) :

Re: [R] how to find number of unique rows for combination of r columns

2019-11-08 Thread Ana Marija
: chr1 64931 ENSG0227232 0.276679 0.907037 31.62278 21.2838 0.5974800 On Fri, Nov 8, 2019 at 9:30 AM Ana Marija wrote: > > Thank you so much! Converting it to data frame resolved the issue! > > On Fri, Nov 8, 2019 at 9:19 AM Gerrit Eichner > wrote: > > > > It seem
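The approach that worked in this thread (treating the table as a plain data frame and calling unique() on the three key columns) can be sketched as follows. The rows and gene IDs are made up; only the column names come from the post:

```r
# Toy table with one repeated (chr, pos, gene_id) combination.
dt <- data.frame(
  chr     = c("chr1", "chr1", "chr1", "chr1"),
  pos     = c(54490, 58814, 54490, 60351),
  gene_id = c("geneA", "geneA", "geneA", "geneB"),
  stringsAsFactors = FALSE
)

# Selecting columns by a character vector works on a data.frame;
# on a data.table the same call raised the error quoted in the thread.
udt <- unique(dt[c("chr", "pos", "gene_id")])

# Number of distinct combinations.
nrow(udt)

# With data.table loaded, unique(dt, by = c("chr", "pos", "gene_id")) is
# the usual counterpart (not run here).
```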

Re: [R] how to find number of unique rows for combination of r columns

2019-11-08 Thread Ana Marija
them. About duplicated() function I know as well as about unique On Fri, 8 Nov 2019 at 10:08, Boris Steipe wrote: > Are you trying to eliminate duplicated rows from your dataframe? Because > that would be better achieved with duplicated(). > > > B. > > > > > > On

Re: [R] how to find number of unique rows for combination of r columns

2019-11-08 Thread Ana Marija
the - sign >> >> If you prefer, the "Tidyverse" world has what are purported to be more >> user-friendly versions of such data handling functionality that you can use >> instead. >> >> >> Bert >> >> On Fri, Nov 8, 2019 at 7:38 AM Ana M

[R] QQ plot

2019-11-11 Thread Ana Marija
Hi, I was using this library, qqman https://cran.r-project.org/web/packages/qqman/vignettes/qqman.html to create QQ plot, attached. How would I change this default abline to start from the beginning of my QQ line? This is my code: qq(dd$P, main = "Q-Q plot of GWAS p-values") Thanks Ana

Re: [R] QQ plot

2019-11-12 Thread Ana Marija
details about my data if it is helpful: > median(dd$P,na.rm = FALSE) [1] 0.000444 > mean(dd$P,na.rm = FALSE) [1] 0.000461 > min(dd$P,na.rm = FALSE) [1] 9.89e-08 > max(dd$P,na.rm = FALSE) [1] 0.001 On Tue, Nov 12, 2019 at 2:07 PM Ana Marija wrote: > > Hi, > > what I know

Re: [R] QQ plot

2019-11-12 Thread Ana Marija
I agree with Abby. That would defeat the purpose of a QQ plot. > > > > On Mon, Nov 11, 2019, 9:54 PM Abby Spurdle wrote: > > > > > Hi > > > > > > I'm not familiar with the qqman package, or GWAS studies. > > > However, my guess would be th

Re: [R] QQ plot

2019-11-12 Thread Ana Marija
> wrote: > >> > >> I agree with Abby. That would defeat the purpose of a QQ plot. > >> > >> On Mon, Nov 11, 2019, 9:54 PM Abby Spurdle wrote: > >> > >>> Hi > >>> > >>> I'm not familiar with the qqman package

Re: [R] QQ plot

2019-11-12 Thread Ana Marija
Just do I need to change the axis when I multiply with 1000 and what should I put on my axis? On Tue, Nov 12, 2019 at 3:07 PM Ana Marija wrote: > > Hi Duncan, > > yes I choose for QQ plot only P<1e-3 and multiplying everything with > 1000 works great! > This should no

Re: [R] QQ plot

2019-11-12 Thread Ana Marija
the smallest p value in my dataset goes to 9.89e-08. How do I make that known on the new QQ plot with multiplied with 1000 values On Tue, Nov 12, 2019 at 3:37 PM Ana Marija wrote: > > Just do I need to change the axis when I multiply with 1000 and what > should I put on my axis? > &

Re: [R] QQ plot

2019-11-12 Thread Ana Marija
PM Ana Marija wrote: > > the smallest p value in my dataset goes to 9.89e-08. How do I make > that known on the new QQ plot with multiplied with 1000 values > > On Tue, Nov 12, 2019 at 3:37 PM Ana Marija > wrote: > > > > Just do I need to change the axis when I mult

Re: [R] QQ plot

2019-11-13 Thread Ana Marija
ight grey so the overlap is more > visible. > > Michael > > On 12/11/2019 22:04, Ana Marija wrote: > > why I selected only those with P<0.003 to put on QQ plot is because > > the original data set contains 5556249 points and when I extract only > > P<0.001 I am getting

[R] Remove highly correlated variables from a data frame or matrix

2019-11-14 Thread Ana Marija
Hello, I have a data frame like this (a matrix): head(calc.rho) rs9900318 rs8069906 rs9908521 rs9908336 rs9908870 rs9895995 rs56192520 0.903 0.268 0.327 0.327 0.327 0.582 rs3764410 0.928 0.276 0.336 0.336 0.336 0.598 rs145984817 0.

Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-14 Thread Ana Marija
with having an open mind is that people keep coming along and > sticking things into it." > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > > > On Thu, Nov 14, 2019 at 10:50 AM Ana Marija > wrote: >> >> Hello, >> >>

Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-14 Thread Ana Marija
what would be the approach to remove variable that has at least 2 correlation coefficients >0.8? this is the whole output of the head() > head(calc.rho) rs56192520 rs3764410 rs145984817 rs1807401 rs1807402 rs35350506 rs56192520 1.000 0.976 0.927 0.927 0.927

Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-14 Thread Ana Marija
s of the variables with at most one absolute value greater than > 0.8 ignoring the diagonal values because I don't care about those". If > so: > > colnames(calc.jim)[colSums(abs(calc.jim)>0.8)<3] > > Any more tricks? > > Jim > > On Fri, Nov 15, 2019 at 8:1
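Jim's one-liner above counts, for each column, how many absolute correlations exceed 0.8 (the diagonal always contributes one) and keeps the columns where that count is below 3. A sketch on a made-up 4 x 4 correlation matrix with hypothetical variable names:

```r
vars <- c("a", "b", "c", "d")

# Toy symmetric correlation matrix: "a" is highly correlated with "b" and "c".
calc.rho <- matrix(
  c(1.00, 0.90, 0.85, 0.10,
    0.90, 1.00, 0.20, 0.10,
    0.85, 0.20, 1.00, 0.10,
    0.10, 0.10, 0.10, 1.00),
  nrow = 4, dimnames = list(vars, vars)
)

# Count of |r| > 0.8 per column, diagonal included.
n_high <- colSums(abs(calc.rho) > 0.8)

# Keep variables with at most one off-diagonal value above 0.8.
keep <- colnames(calc.rho)[n_high < 3]
reduced <- calc.rho[keep, keep]
reduced
```

Here "a" is dropped because it has two off-diagonal correlations above 0.8, while "b" and "c" each have only one.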

Re: [R] Remove highly correlated variables from a data frame or matrix

2019-11-15 Thread Ana Marija
contain bugs, but something along > these lines should get you where you want to be. > > Oh, and depending on how strict you want to be with the remaining > correlations, you could use complete linkage clustering (will retain > more variables, some correlations will be above 0.8)
