Here is one way to do it. I assume that your data has each record on a single line; it came through the email wrapped across multiple lines.

> x <- readLines("/tempxx.txt")
> # remove '#Fields: ' so the line can be used as a header
> x <- sub("^#Fields: ", "", x)
> # remove comment lines (logical indexing also works when nothing matches,
> # whereas x[-grep(...)] would return an empty vector in that case)
> x <- x[!grepl("^#", x)]
> # remove quotes
> x <- gsub('"', '', x)
> # now read in the data
> input <- read.table(textConnection(x), header=TRUE)
>
> str(input)
'data.frame':   2 obs. of  16 variables:
 $ date          : Factor w/ 1 level "2007-12-03": 1 1
 $ time          : Factor w/ 1 level "13:50:17": 1 1
 $ c.ip          : Factor w/ 1 level "200.40.203.197": 1 1
 $ cs.username   : Factor w/ 1 level "-": 1 1
 $ s.ip          : Factor w/ 1 level "200.40.51.20": 1 1
 $ s.port        : int  80 80
 $ cs.method     : Factor w/ 1 level "GET": 1 1
 $ cs.uri.stem   : Factor w/ 2 levels "/localidades/img/cargando.gif",..: 2 1
 $ cs.uri.query  : Factor w/ 1 level "-": 1 1
 $ sc.status     : int  200 200
 $ sc.bytes      : int  328 1150
 $ cs.bytes      : int  447 451
 $ time.taken    : int  0 0
 $ cs.User.Agent.: Factor w/ 1 level "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322)": 1 1
 $ cs.Cookie.    : Factor w/ 1 level "ASPSESSIONIDSQCBSQAB=JOLECDCCBFCKPOFLGDLHMENA": 1 1
 $ cs.Referer.   : Factor w/ 1 level "http://www.teatro.com/localidades/localidades.asp": 1 1

On Tue, Sep 22, 2009 at 9:51 PM, Sebastian Kruk <residuo.so...@gmail.com> wrote:
> If I have a web log file as follows:
>
> #Software: Microsoft Internet Information Services 5.0
> #Version: 1.0
> #Date: 2007-12-03 13:50:17
> #Fields: date time c-ip cs-username s-ip s-port cs-method cs-uri-stem
> cs-uri-query sc-status sc-bytes cs-bytes time-taken cs(User-Agent)
> cs(Cookie) cs(Referer)
> "2007-12-03 13:50:17 200.40.203.197 - 200.40.51.20 80 GET
> /localidades/img/nada.gif - 200 328 447 0
> Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322)
> ASPSESSIONIDSQCBSQAB=JOLECDCCBFCKPOFLGDLHMENA
> http://www.teatro.com/localidades/localidades.asp"
> "2007-12-03 13:50:17 200.40.203.197 - 200.40.51.20 80 GET
> /localidades/img/cargando.gif - 200 1150 451 0
> Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322)
> ASPSESSIONIDSQCBSQAB=JOLECDCCBFCKPOFLGDLHMENA
> http://www.teatro.com/localidades/localidades.asp"
> "2007-12-03 13:50:18 200.40.203.197 - 200.40.51.20 80 GET
> /localidades/img/cerrar.png - 200 450 449 0
> Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1;+SV1;+.NET+CLR+1.1.4322)
>
> how can I turn it into a dataframe with 3 rows, and 16 columns named
> date time c-ip cs-username s-ip s-port cs-method cs-uri-stem
> cs-uri-query sc-status sc-bytes cs-bytes time-taken cs(User-Agent)
> cs(Cookie) cs(Referer) skipping lines beginning with #?
>
> Thanks,
>
> Sebastián.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
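
For readers who want to try the approach without a log file on disk, the same steps can be sketched as a small function that works on lines already held in memory. This is only an illustration under stated assumptions: the name parse_iis_log is hypothetical, the sample lines are a trimmed-down subset of the fields in the question, and the extra read.table() arguments (quote, comment.char, stringsAsFactors) are my own additions rather than part of the reply above.

```r
# Hypothetical helper (not from the original reply): parse IIS W3C log
# lines that are already in memory as a character vector.
parse_iis_log <- function(lines) {
  # Turn the "#Fields:" directive into a plain header line
  lines <- sub("^#Fields: ", "", lines)
  # Drop the remaining "#" directives; logical indexing is safe
  # even when no line matches
  lines <- lines[!grepl("^#", lines)]
  # Strip stray double quotes around records
  lines <- gsub('"', "", lines, fixed = TRUE)
  # quote = "" and comment.char = "" keep apostrophes and "#" inside
  # URLs from being treated as quotes or comments (assumed precaution)
  read.table(text = lines, header = TRUE, stringsAsFactors = FALSE,
             quote = "", comment.char = "")
}

# Trimmed sample with fewer fields than the real log (illustrative only)
sample_lines <- c(
  "#Software: Microsoft Internet Information Services 5.0",
  "#Version: 1.0",
  "#Fields: date time c-ip s-port cs-method cs-uri-stem sc-status",
  "2007-12-03 13:50:17 200.40.203.197 80 GET /localidades/img/nada.gif 200",
  "2007-12-03 13:50:17 200.40.203.197 80 GET /localidades/img/cargando.gif 200"
)

log_df <- parse_iis_log(sample_lines)
str(log_df)
```

As in the reply, read.table() rewrites header names such as c-ip to syntactically valid ones (c.ip). With stringsAsFactors = FALSE the text columns come back as character vectors instead of factors, which is usually easier to work with when the next step is converting date and time.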