That sample data set is really hard to read. Could you resent it after having used dput on it?
A file output with dput is easily read into R and makes seeing what you need much easier. BTW what are the = doing? Thanks --- On Mon, 1/10/11, Guy Jett <gj...@itsi.com> wrote: > From: Guy Jett <gj...@itsi.com> > Subject: [R] Help with Data Transformation > To: "r-help@r-project.org" <r-help@r-project.org> > Received: Monday, January 10, 2011, 3:59 PM > Greetings, > I am new to R and am having trouble with parsing a file > with the following characteristics: > > * Individual results > for a single sample are written to multiple lines. > > * First 16 columns > are constant from sample to sample. > > * Remaining 10 need > to be matched up (cross-tabbed?) > > o (the exact contents for the remaining 10 > vary from sample to sample, as indicated in the extract > below) > > * Ultimate goal is to > run various comparisons between the variable columns, > compare samples from separate populations, and graph samples > from the separate populations. > > * (An extract is > provided below) > > The data is initially extracted from an SQL database into > Excel, then saved as a tab-delimited text file for use in > R. > I have been successful in using subset() to extract > specific sample types, but have not yet been able to > transform the data so that all the data needed is on a > single line. I have looked at several R manuals, read > through 'R in a Nutshell', prowled the help resources (R > Site Search and the Google link), tried stack(), subset(), > reshape(), and several other functions, to no avail. > > Thank you very much for your help. This seems like a > wonderful community, > Guy Jett, R.G. > Project Geologist > gj...@itsi.com<mailto:gj...@itsi.com> > > Example Data Input (subset): > > fldsampid > CLP_ID sacode > matrix etc... > prccode > Lab > EXMCODE > Analysis > PARLABEL > PARVQ Result > 2268 LHR020GW-01E2 > > N > WG > > INO > BRLS NONE E300 > CL > = > 23590.9 > 2269 LHR020GW-01E2 > > N > WG > > INO > BRLS NONE E300 > PO4 ND > 50 > 2270 LHR020GW-01E2 > > N > WG > > INO > BRLS NONE E300 > SO4 = > 22460 > 2272 LHR020GW-01E2 > > N > WG > > MET > BRLS FLDFLT > E1631 HG > = > 0.00171 > 2273 LHR020GW-01E2 > > N > WG > > MET > BRLS FLDFLT > E1638 AG > = 2.57 > 2274 LHR020GW-01E2 > > N > WG > > MET > BRLS FLDFLT > E1638 AL > = > 122 > 2275 LHR020GW-01E2 > > N > WG > > MET > BRLS FLDFLT > E1638 AS > = > 317 > 2276 LHR020GW-01E2 > > N > WG > > MET > BRLS FLDFLT > E1638 B > = > 9970 > 2289 LHR020GW-01E2 > > N > WG > > MET > BRLS FLDFLT > E1638 V > = > 131 > 2290 LHR020GW-01E2 > > N > WG > > MET > BRLS FLDFLT > E1638 Zn > = > 1.76 > 2291 LHR020GW-01E2 > > N > WG > > MET > BRLS METHOD > E1638 > PB ND > 0.008 > 2292 LHR020GW-01E2 > > N > WG > > MI > BRLS NONE A2320 > ALK = > 807000 > 2293 LHR020GW-01E2 > > N > WG > > MI > BRLS NONE A2320 > ALKB = > 807000 > 2294 LHR020GW-01E2 > > N > WG > > MI > BRLS NONE A2320 > ALKC ND > 2500 > 2295 LHR020GW-01E2 > > N > WG > > ORG > BRLS NONE > A5310B DOC = > 49330 > 2296 LHR020GW-01E2 > > N > WG > > SN > BRLS NONE E300 > NO3 = > 792 > 2326 LHR020SD-00E2 > > N > SE > > MET > BRLS METHOD > E1630 > MEHG = > 4.28 > 2327 LHR020SD-00E2 > > N > SE > > MI > BRLS METHOD > E160.3 SOLID > = > 48.45 > 2328 LHR020SD-00E2 > > N > SE > > ORG > BRLS NONE > SW9060 > TOC = > 4.823 > 2329 LHR020SD-00E2 MY77J8 > N > SE > > MET > A4SW METHOD > > C245.5 HG > = 5100 > 2330 LHR020SD-00E2 MY77J8 > N > SE > > MET > A4SW METHOD > > E200.8 AG > ND 1050 > 2331 LHR020SD-00E2 MY77J8 > N > SE > > MET > A4SW METHOD > > E200.8 AS > = > 5500 > 2332 LHR020SD-00E2 MY77J8 > N > SE > > MET > A4SW METHOD > > E200.8 B > = > 11400 > 2346 LHR020SD-00E2 MY77J8 > N > SE > > MET > A4SW SW3050B > SW6010B > V > = > 56900 > 2349 LHR020SD-00E2 MY77J8 > N > SE > > MI > A4SW METHOD > A2540G > SOLID = > 47.7 > > Desired output: > > fldsampid > CLP_ID sacode > matrix etc... CL > PO4 > SO4 AG > AL > AS > B > V > Zn > etc... ALK > ALKB ALKC > SOLID DOC > TOC NO3 > > LHR020GW-01E2 > > N > WG > > <value> > <value> > <value> > > <value> > <value> > <value> > > <value> > <value> > <value> > > <value> > <value> > <value> > > <value> > <value> > <value> > > <value> > <value> > > LHR020SD-00E2 MY77J8 N > SE > > NA > NA NA > <value> > <value> > > <value> > <value> > <value> > NA > <value> > NA > NA NA > <value> > NA > NA > NA > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@r-project.org > mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, > reproducible code. > ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.