There are also #select: and #select:thenDo: convenience methods.
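For example, something like this (just a sketch: the file name, the header 
skipping and the test on the first column are placeholders, adjust them to 
your data):

'data.csv' asFileReference readStreamDo: [ :in |
    | reader |
    reader := NeoCSVReader on: in.
    reader skipHeader.
    "process only the records matching the condition, one by one"
    reader
        select: [ :record | record first = 'FR' ]
        thenDo: [ :record | Transcript show: record printString; cr ] ].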

NeoCSV is properly streaming, so it should not introduce memory consumption 
problems itself. But note that you cannot load more than about 1 GB of 
permanent data in the current VM.
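Since the reader streams, that limit only applies to what you decide to keep. 
A rough sketch (again with placeholder file name, condition and column 
indices) that keeps just two fields per matching record instead of the whole 
records:

| points |
points := OrderedCollection new.
'data.csv' asFileReference readStreamDo: [ :in |
    | reader |
    reader := NeoCSVReader on: in.
    reader skipHeader.
    reader do: [ :record |
        record first = 'FR' ifTrue: [
            "keep only the two columns actually needed"
            points add: { record second. record third } ] ] ].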

One known performance limitation is in handling extremely long lines/records.

If you have a question or problem, just ask.

Sven

> On 04 Apr 2015, at 19:54, Norbert Hartl <norb...@hartl.name> wrote:
> 
>> 
>> On 04.04.2015 at 19:23, Serge Stinckwich <serge.stinckw...@gmail.com> wrote:
>> 
>> Dear all,
>> We are currently setting up a small ROASSAL team to participate in the
>> #Datathon Data for Development:
>> http://simplon.co/datathon-data-for-development-rdv-les-7-et-8-avril-a-montreuil/
>> 
>> We are looking for ways to load big CSV tables into a Pharo image.
>> Apparently some of the CSV files provided will be huge (around 5 GB for
>> one month of data). The format of the data is described here:
>> http://arxiv.org/abs/1407.4885
>> 
>> Is it possible with NeoCSV to read only a fraction of the lines, based
>> on some conditions?
>> 
> The NeoCSVReader supports the necessary stream protocol. If you set up the
> CSV reader you can call #next on it and filter by condition. There is also
> #atEnd, so a simple loop should work. But I have never used the CSV reader,
> so Sven might have much better options.
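> Something like this perhaps (untested sketch, with a placeholder file name
> and a placeholder condition on the first column):
> 
> 'data.csv' asFileReference readStreamDo: [ :in |
>     | reader |
>     reader := NeoCSVReader on: in.
>     [ reader atEnd ] whileFalse: [
>         | record |
>         record := reader next.
>         record first = 'FR' ifTrue: [ "process the record here" ] ] ].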
> 
> Norbert
> 
>> If some people want to help online, we can set up a chat to coordinate.
>> Regards,
>> -- 
>> Serge Stinckwich
>> UCBN & UMI UMMISCO 209 (IRD/UPMC)
>> Every DSL ends up being Smalltalk
>> http://www.doesnotunderstand.org/

