Why don't you make one pass through your data and encode your characters as integers? (It would appear that you only have 16 combinations.) You might also want to consider using the 'raw' type, since a raw value takes up only one byte of storage; compared with 4-byte integers, that will reduce your storage requirements by a factor of 4. Then store each row in a 'filehash' object so you can quickly retrieve one row at a time and index directly to the byte(s) that hold the information you want.
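Here is a rough sketch of what I mean, not a finished solution. The 'geno' matrix, the 'pairs' code table, the helper names 'encode_row' and 'decode', and the choice of code 0 for missing values are all made up for illustration. I am also assuming the alleles are A, C, G and T, which gives the 16 combinations. The 'filehash' part only runs if you have that package installed.

```r
# Hypothetical genotype data: one row per sample, one column per marker.
geno <- matrix(c("A_A", "A_G", "G_G",
                 "C_T", NA,    "T_T"),
               nrow = 2, byrow = TRUE)

# Code table (an assumption): all 16 ordered allele pairs.
# Code 0 is reserved for missing or unrecognised values.
alleles <- c("A", "C", "G", "T")
pairs <- as.vector(outer(alleles, alleles, paste, sep = "_"))

# Encode one row of genotype strings as a raw vector (1 byte per value).
encode_row <- function(x) {
  code <- match(x, pairs)   # 1..16, or NA if not in the table
  code[is.na(code)] <- 0L   # missing values become code 0
  as.raw(code)
}

# Decode raw bytes back to genotype strings (code 0 becomes NA).
decode <- function(r) {
  code <- as.integer(r)
  code[code == 0L] <- NA_integer_
  pairs[code]
}

# One pass through the data, one raw vector per row.
rows <- lapply(seq_len(nrow(geno)), function(i) encode_row(geno[i, ]))

# Optional: keep each row under its own key in a 'filehash' database,
# so that a single row can be fetched without loading the rest.
if (requireNamespace("filehash", quietly = TRUE)) {
  db_path <- tempfile("genodb")
  filehash::dbCreate(db_path)
  db <- filehash::dbInit(db_path)
  for (i in seq_along(rows)) {
    filehash::dbInsert(db, paste0("row", i), rows[[i]])
  }
  r2 <- filehash::dbFetch(db, "row2")  # fetch just the second row
  print(decode(r2[2:3]))               # index directly to the bytes wanted
}
```

Because every value is mapped to a code, including the missing ones, each row has the same number of bytes, so position j in a row always corresponds to marker j.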
On Mon, Sep 22, 2008 at 7:00 AM, José E. Lozano <[EMAIL PROTECTED]> wrote:
>> So is each line just ACCGTATAT etc etc?
>
> Exactly, A_G, A_A, G_G and the such.
>
>> If you have fixed width fields in a file, so that every line is the
>> same length, then you can use random access methods to get to a
>> particular value - just multiply the line length by the row number you
>
> Nice hint! I didn't think of this. But I fear that if I have missing values
> in the file I won't be able to read the right information...
>
>> When doing this, it's a good idea to test your dataset first to make
>> sure the lines and fields are right.
>
> Yes, I am trying to figure out if all the lines have the exact same length
> to use a random access method to read it.
>
> Thanks,
> Jose Lozano
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?