There are several things you can tell read.table to make it faster.

First, as mentioned, setting colClasses helps. I think telling read.table how
many rows and columns there are (for example, the row count via nrows) also
helps.
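
Here is a minimal sketch of those two settings together. The temporary file
and its contents are made up so that the example runs on its own, and
num_rows stands for a row count you already know or can over-estimate. The
comment.char = "" setting is an extra that was not discussed in this thread;
the read.table help page describes it as faster than the default.

```r
# Build a small tab-separated stand-in file (hypothetical data):
# one label column followed by nine numeric columns.
path <- tempfile(fileext = ".txt")
demo <- data.frame(id = sprintf("row%02d", 1:20),
                   matrix(round(runif(180), 3), nrow = 20))
write.table(demo, path, sep = "\t", quote = FALSE, row.names = FALSE)

# "NULL" (as a string) skips a column; declaring the other classes
# saves read.table from guessing them.
cc <- c("NULL", rep("numeric", 9))
num_rows <- 20  # known, or a mild over-estimate of, the number of data rows

my_data <- read.table(path, header = TRUE, sep = "\t",
                      colClasses = cc, nrows = num_rows,
                      comment.char = "")
dim(my_data)
```

According to the read.table help page, nrows mainly helps memory usage, so
the speed gain from it may be small compared with colClasses.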

When that was not sufficient, I had to do the data processing
using Python, Perl, or awk.

If that had not been convenient, I would have tried the sqldf solution that was
mentioned.

That covers all the options I'm familiar with. I'm also curious about other ways
to selectively read in rows in R. Let me know what ends up working.
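
One base-R possibility, in the spirit of the readLines suggestion quoted
below, is to read the file through a connection in chunks and keep only the
rows that pass a test. This is only a sketch: read_filtered, keep, and
chunk_size are names I made up for illustration, and the demo file is
invented so that the code runs on its own.

```r
# Hypothetical helper: read a delimited file chunk by chunk and keep only
# the rows for which keep(chunk) is TRUE.
read_filtered <- function(path, keep, chunk_size = 1000, sep = "\t") {
  con <- file(path, open = "r")
  on.exit(close(con))

  # The first line holds the column names.
  col_names <- strsplit(readLines(con, n = 1), sep, fixed = TRUE)[[1]]

  pieces <- list()
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break  # end of file
    chunk <- read.table(text = lines, sep = sep, header = FALSE,
                        col.names = col_names, stringsAsFactors = FALSE,
                        quote = "", comment.char = "")
    # Keep only the wanted rows from this chunk.
    pieces[[length(pieces) + 1]] <- chunk[keep(chunk), , drop = FALSE]
  }
  do.call(rbind, pieces)  # NULL if the file had no data rows
}

# Usage on a small made-up file: keep rows with year >= 2004.
path <- tempfile(fileext = ".txt")
demo <- data.frame(series_id = rep(c("A", "B"), each = 5),
                   year = rep(2001:2005, 2),
                   value = 1:10)
write.table(demo, path, sep = "\t", quote = FALSE, row.names = FALSE)

recent <- read_filtered(path, function(d) d$year >= 2004, chunk_size = 3)
```

Only one chunk is parsed at a time, so memory use stays low, but every line
is still read and parsed, so I would not expect this to be faster than
filtering outside R.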



On Sun, May 31, 2009 at 2:17 PM,  <g...@ucalgary.ca> wrote:
> Since there are many rows, with read.table we spend too much time reading
> in rows that we do not want. We are wondering if there is a way to read
> only the rows that we are interested in. Thanks,
>
> -james
>> I think you can use readLines(n=1) in loop to skip unwanted rows.
>>
>> On Mon, Jun 1, 2009 at 12:56 AM,  <g...@ucalgary.ca> wrote:
>>> Thanks, Juliet.
>>> It works for filtering columns.
>>> I am also wondering if there is a way to filter rows.
>>> Thanks again.
>>> -james
>>>
>>>> One can use colClasses to set which columns get read in. For the
>>>> columns you don't want you can set those to NULL. For example,
>>>>
>>>> cc <- c("NULL", rep("numeric", 9))
>>>>
>>>> myData <- read.table("myFile.txt", header=TRUE, colClasses=cc,
>>>>                      nrows=numRows)
>>>>
>>>>
>>>> On Wed, May 27, 2009 at 12:27 PM,  <g...@ucalgary.ca> wrote:
>>>>> We are reading big tables, such as,
>>>>>
>>>>> Chemicals <-
>>>>>   read.table('ftp://ftp.bls.gov/pub/time.series/wp/wp.data.7.Chemicals',
>>>>>              header = TRUE, sep = '\t', as.is = TRUE)
>>>>>
>>>>> I was wondering if it is possible to set a filter during loading so
>>>>> that we just load what we want, not the whole table each time. Thanks,
>>>>>
>>>>> -james
>>>>>
>>>>> ______________________________________________
>>>>> R-help@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.

