On Fri, 11 Apr 2008, Zev Ross wrote:

> Hi All,
>
> Can anyone direct me to a read function in R that will allow me to only
> read in rows of a text file that begin with a particular value such as
> the data below. I would read the entire file in and then limit, but the
> files were constructed such that the first two letters determine how
> many variables are in the row (different letters mean different numbers
> of columns and different column names/types).
>
> I can do this in SAS, but I'd prefer to use R. The approximate SAS code
> is below with the key piece of code being "if rectype='RD'" then do.
>
> Thoughts?

If your data are in 'tmp.dat':

    > txt <- readLines( "tmp.dat" )
    > con <- textConnection( grep( "^RD", txt, value=TRUE ) )
    > dat <- read.csv( con, sep='|', header=FALSE)
    > close(con)
    > summary( dat[ , 1:3 ] )
      V1     V2          V3
     RD:6   I:6   Min.   :1
                  1st Qu.:1
                  Median :1
                  Mean   :1
                  3rd Qu.:1
                  Max.   :1

Alternatively, if you have 'grep' in your system and in the path:

    > con2 <- pipe( 'grep "^RD" tmp.dat' )
    > dat2 <- read.csv( con2, sep='|', header=FALSE)

See

    ?connection
    ?textConnection
    ?grep

HTH,

Chuck

>
> Zev
>
>
> RD|I|01|073|0023|68103|5|7|017|810|20070103|00:00|0.6||3|||||||||||||
> RD|I|01|073|0023|68103|5|7|017|810|20070106|00:00|9.5||3|||||||||||||
> RD|I|01|073|0023|68103|5|7|017|810|20070109|00:00|2.5||3|||||||||||||
> RD|I|01|073|0023|68103|5|7|017|810|20070112|00:00|13.7||3|||||||||||||
> RD|I|01|073|0023|68103|5|7|017|810|20070115|00:00|7.3||3|||||||||||||
> RA|I|01|073|0023|A334|5|7|017|810|20070118|00:00|3.7||3|||||||||||||
> RD|I|01|073|0023|68103|5|7|017|810|20070121|00:00|6.9||3|||||||||||||
> RC|I|01|073|0023|Quer|5|7|017|810|20070124|00:00|1.8||3|||||||||||||
>
>
> infile 'C:\junk\RD_501_88101_2006-0.txt'
> dlm='|' firstobs=3 missover;
> rectype $2. @;
> if rectype = 'RD' then do;
>
> --
> Zev Ross
> ZevRoss Spatial Analysis
> 303 Fairmount Ave
> Ithaca, NY 14850
> 607-277-0004 (phone)
> 866-877-3690 (fax, toll-free)
> [EMAIL PROTECTED]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]                  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
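
Since the question mentions that each two-letter prefix implies a different column layout, the readLines()/grep() idea above can be extended to parse every record type in one pass. What follows is only a minimal sketch, not something from the original reply: the helper name read_by_rectype() is hypothetical, the sample lines are a subset of the rows quoted in the question written to a temporary file, and reading every column as character (to keep leading zeros such as "01" and "073") is an assumption about what is wanted.

```r
# Sample lines copied from the question, written to a temporary file so
# the sketch runs without the original 'tmp.dat'.
sample_lines <- c(
  "RD|I|01|073|0023|68103|5|7|017|810|20070103|00:00|0.6||3|||||||||||||",
  "RA|I|01|073|0023|A334|5|7|017|810|20070118|00:00|3.7||3|||||||||||||",
  "RD|I|01|073|0023|68103|5|7|017|810|20070121|00:00|6.9||3|||||||||||||",
  "RC|I|01|073|0023|Quer|5|7|017|810|20070124|00:00|1.8||3|||||||||||||"
)
tmp <- tempfile(fileext = ".dat")
writeLines(sample_lines, tmp)

# Hypothetical helper: read the file once, group the lines by their first
# two characters, and parse each group on its own so that record types
# with different numbers of columns do not interfere with each other.
read_by_rectype <- function(path, sep = "|") {
  txt <- readLines(path)
  groups <- split(txt, substr(txt, 1, 2))
  lapply(groups, function(x) {
    read.table(text = x, sep = sep, header = FALSE,
               colClasses = "character",   # assumption: keep "01", "073" as text
               quote = "", comment.char = "", fill = TRUE)
  })
}

recs <- read_by_rectype(tmp)
names(recs)          # record types found in the file
nrow(recs[["RD"]])   # number of RD rows
```

The result is a named list with one data frame per record type, so recs[["RD"]] plays the role of 'dat' above. Column names and types for each record type would still need to be assigned separately, since the question does not give them.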