Greetings, all.
I've got a datafile I've been working with that has an idiosyncratic, heterogeneous format. It looks roughly like this:

[...]
DISKREAD,metadata about disks
MEM,metadata about memory
ZZZZ,observation-identifier,time,date
DISKREAD,observation-identifier,data about disks
MEM,observation-identifier,data about memory
[ and repeat for each observation ]

What I've done in the past was take the monolithic file and preprocess it into separate files, one per observation type. The observation types are structurally self-similar, so once I have them split up, normal read.csv methods work just fine. Then I read the ZZZZ file to get the timestamps, plus whichever observation files I care about on this run.

But ideally, I'd like to do this entire operation with R features, and without multiple passes through the file. The line lengths vary wildly, so read.table on the whole file doesn't help. I was visualizing the following:

+ create a FIFO for each desired observation class, including the ZZZZ metadata
+ in one pass through the source file, populate the FIFOs with their data
+ read.csv the output sides of the FIFOs

But I have problems right out of the gate: when I set a data.frame element to the output of fifo(), what actually gets inserted seems to be an integer; I am guessing the connection object is being coerced to its underlying connection number when it's stored. Example:

----
desired_slices <- c("ZZZZ", "DISKWRITE")
temps <- data.frame(slice = desired_slices, row.names = 1, handle = I(""))
temps["ZZZZ", ] <- fifo("./ZZZZ", open = "w+")

showConnections()   # you can see that the connection is open
temps               # but the data.frame cell holds only the connection's number
----

Am I just barking up the wrong tree?

- Allen S. Rout
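P.S. Two rough, untested fragments, in case they make the question more concrete. First, a plain named list (rather than a data.frame column) ought to hold the connection objects themselves, so maybe that's the container I should be using:

----
desired_slices <- c("ZZZZ", "DISKREAD")

## a list can hold arbitrary R objects, so the connections stay connections
handles <- setNames(
    lapply(desired_slices, function(s) fifo(file.path(".", s), open = "w+")),
    desired_slices)

showConnections()
class(handles$ZZZZ)   # should still be a "fifo"/"connection" object, not an integer
----

Second, the no-FIFO fallback I would settle for: one readLines() pass, split() on the field before the first comma, then read.csv() on each class through a textConnection(). The file name and class names here are just placeholders standing in for my real ones:

----
lines   <- readLines("monolithic.dat")   # placeholder name for the source file
type    <- sub(",.*$", "", lines)        # leading field identifies the record type
by_type <- split(lines, type)            # one character vector per observation class

## each class is structurally self-similar, so read.csv copes once it is isolated
slices <- lapply(by_type[c("ZZZZ", "DISKREAD")], function(x) {
    con <- textConnection(x)
    on.exit(close(con))
    read.csv(con, header = FALSE)
})

str(slices$ZZZZ)   # observation identifiers, times, dates
----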