Here is one way to do it:

> y <- textConnection("UNIQID UniGene Gene 1_SL 2_SL 17_SL 18_SL 38_SL
+ 1175390 Hs.10095 MLLT1 -0.00595 0.62315 0.85315 1.11215 -0.195
+ 1175392 Hs.10101 C1orf166 -0.4945 -0.04025 0.1299 -0.00575 -0.1824
+ 1187428 Hs.101014 CEP57 0.60085 0.2564 -0.42885 -0.57635 -0.14735
+ 1193447 Hs.101014 CEP57 -0.15625 -0.1681 -0.4891 -0.29995 NA
+ 1173756 Hs.1011 PROZ -0.7211 -0.68895 0.4651 0.30815 0.1133")
> x <- read.table(y, header=TRUE)
> closeAllConnections()
> # split and then aggregate so we can carry through some data
> z <- split(x, x$UniGene)
> z.l <- lapply(z, function(.data){
+     .agg <- colMeans(.data[, c(1,4:8)], na.rm=TRUE)
+     data.frame(.data[1, 2], .data[1, 3], lapply(.agg, unlist))
+ })
> do.call(rbind, z.l)
          .data.1..2. .data.1..3.  UNIQID    X1_SL    X2_SL    X17_SL   X18_SL   X38_SL
Hs.10095     Hs.10095       MLLT1 1175390 -0.00595  0.62315  0.853150  1.11215 -0.19500
Hs.10101     Hs.10101    C1orf166 1175392 -0.49450 -0.04025  0.129900 -0.00575 -0.18240
Hs.101014   Hs.101014       CEP57 1190438  0.22230  0.04415 -0.458975 -0.43815 -0.14735
Hs.1011       Hs.1011        PROZ 1173756 -0.72110 -0.68895  0.465100  0.30815  0.11330

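For comparison, the ?aggregate route mentioned earlier in the thread could look roughly like the sketch below. It is only a sketch under a few assumptions: the sample columns are taken to be the ones whose names end in "_SL", the names txt, samples and avg are made up for illustration, and UNIQID is simply dropped rather than averaged, since a mean of probe IDs is not itself a probe ID.

```r
# Same sample data as in the session above. read.table() prefixes names that
# start with a digit with "X", so "1_SL" becomes "X1_SL" (as in the output above).
txt <- "UNIQID UniGene Gene 1_SL 2_SL 17_SL 18_SL 38_SL
1175390 Hs.10095 MLLT1 -0.00595 0.62315 0.85315 1.11215 -0.195
1175392 Hs.10101 C1orf166 -0.4945 -0.04025 0.1299 -0.00575 -0.1824
1187428 Hs.101014 CEP57 0.60085 0.2564 -0.42885 -0.57635 -0.14735
1193447 Hs.101014 CEP57 -0.15625 -0.1681 -0.4891 -0.29995 NA
1173756 Hs.1011 PROZ -0.7211 -0.68895 0.4651 0.30815 0.1133"
x <- read.table(text = txt, header = TRUE)

# Assumption: every expression column ends in "_SL"
samples <- grep("_SL$", names(x), value = TRUE)

# One row per UniGene/Gene pair; na.rm = TRUE is passed through to mean(),
# so the NA in X38_SL for CEP57 is ignored instead of propagating
avg <- aggregate(x[samples], by = x[c("UniGene", "Gene")],
                 FUN = mean, na.rm = TRUE)
print(avg)
```

On this sample the CEP57 row should come out with X1_SL = 0.2223 and X38_SL = -0.14735, matching the split/lapply result above. The row order of aggregate() output may differ from the input order, so look rows up by UniGene rather than by position.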
On Wed, Jul 23, 2008 at 5:08 PM, Kaposi-Novak, Pal <[EMAIL PROTECTED]> wrote:
>
> ________________________________________
> From: Kaposi-Novak, Pal
> Sent: Wednesday, July 23, 2008 5:07 PM
> To: jim holtman
> Subject: RE: [R] average replicate probe values
>
> Dear Dr Holtman,
>
> Thank you very much for your response.
>
> What I want is to average data points in a data.frame from probes which
> represent the same gene (i.e. have the same UniGene ID).
>
> For example, in the table below the probe sets in rows 3 and 4 both
> represent the CEP57 gene.
>
> UNIQID   UniGene    Gene      1_SL      2_SL      17_SL      18_SL     38_SL
> 1175390  Hs.10095   MLLT1     -0.00595  0.62315   0.85315    1.11215   -0.195
> 1175392  Hs.10101   C1orf166  -0.4945   -0.04025  0.1299     -0.00575  -0.1824
> 1187428  Hs.101014  CEP57     0.60085   0.2564    -0.42885   -0.57635  -0.14735
> 1193447  Hs.101014  CEP57     -0.15625  -0.1681   -0.4891    -0.29995  NA
> 1173756  Hs.1011    PROZ      -0.7211   -0.68895  0.4651     0.30815   0.1133
>
> I would like to make R find the matching UniGene IDs and average the
> expression values for each sample.
> The result would look like the table below:
>
> UNIQID   UniGene    Gene      1_SL      2_SL      17_SL      18_SL     38_SL
> 1175390  Hs.10095   MLLT1     -0.00595  0.62315   0.85315    1.11215   -0.195
> 1175392  Hs.10101   C1orf166  -0.4945   -0.04025  0.1299     -0.00575  -0.1824
> 1199466  Hs.101014  CEP57     0.2223    0.04415   -0.458975  -0.43815  -0.14735
> 1173756  Hs.1011    PROZ      -0.7211   -0.68895  0.4651     0.30815   0.1133
>
> I am sorry for the naivety of my question, but I am not a trained
> biostatistician; I just need to analyze data.
>
> Sincerely,
>
> Pal Kaposi-Novak MD PhD
> PIRT Fellow
> University of Pittsburgh
> Department of Pathology
> BST S408, 200 Lothrop Str
> Pittsburgh, PA 15261
> Tel: (412) 383-7748
> [EMAIL PROTECTED]
> ________________________________________
> From: jim holtman [EMAIL PROTECTED]
> Sent: Wednesday, July 23, 2008 7:15 AM
> To: Kaposi-Novak, Pal
> Cc: r-help@r-project.org
> Subject: Re: [R] average replicate probe values
>
> It would be helpful if you included a sample of the data so that we
> could understand what you would like to do with it (before/after
> pictures).
>
> ?aggregate
>
> On Tue, Jul 22, 2008 at 9:57 PM, Kaposi-Novak, Pal
> <[EMAIL PROTECTED]> wrote:
>> Hi,
>>
>> Could somebody tell me how I can average expression values of replicate
>> probe sets in a data frame?
>>
>> Thanks
>>
>> Pal Kaposi-Novak MD PhD
>> PIRT Fellow
>> University of Pittsburgh
>> Department of Pathology
>> BST S408, 200 Lothrop Str
>> Pittsburgh, PA 15261
>> Tel: (412) 383-7748
>> [EMAIL PROTECTED]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?