Re: [Pharo-users] GSOC 2015 Call for Ideas

Andrea Ferretti Wed, 18 Feb 2015 01:27:07 -0800

Thank you Sven. I think this should be emphasized and made prominent on
the home page*. Still, libraries such as pandas are even more lenient,
doing things such as:


- autodetecting which fields are numeric in CSV files
- letting you fill in missing data based on statistics (for instance, you
can say: where the field `age` is missing, use the average age)
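
To make the two points above concrete, here is a rough sketch in Python
(the language pandas lives in). It deliberately uses only the standard
library rather than pandas itself, and `load_lenient`, `is_missing` and
the sample data are hypothetical names made up for illustration. It
treats a column as numeric when every non-empty cell parses as a number,
then fills the gaps in that column with the column average.

```python
import csv
import io
from statistics import mean


def is_missing(value):
    """True for absent cells (None) and for empty or whitespace-only cells."""
    return value is None or value.strip() == ""


def to_number(text):
    """Return text as a float, or None if it does not parse as a number."""
    try:
        return float(text)
    except ValueError:
        return None


def load_lenient(stream):
    """Read CSV rows as dicts, converting numeric columns and
    replacing missing values in those columns with the column mean."""
    rows = list(csv.DictReader(stream))
    if not rows:
        return rows
    for column in list(rows[0].keys()):
        present = [row[column] for row in rows if not is_missing(row[column])]
        numbers = [to_number(value) for value in present]
        # A column counts as numeric only if every non-empty cell parses.
        if not present or any(number is None for number in numbers):
            continue
        average = mean(numbers)
        for row in rows:
            if is_missing(row[column]):
                row[column] = average
            else:
                row[column] = float(row[column])
    return rows


# Hypothetical sample: the second row has no age.
sample = io.StringIO("name,age\nAda,36\nBob,\nCyd,40\n")
people = load_lenient(sample)
```

Here `name` stays a column of strings, while `age` becomes a column of
floats and the missing age is filled with the average of the other two.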

There is probably room for something along these lines built on top of Neo.


* By the way, I suggest that the documentation on Neo could benefit
from a reorganization. Right now, the first topic in the NeoJSON
paper introduces JSON itself. I would argue that everyone who tries
to use the library already knows what JSON is. Still, there is no
example of how to read JSON from a file in the whole document.

2015-02-18 10:12 GMT+01:00 Sven Van Caekenberghe <s...@stfx.eu>:
>
>> On 18 Feb 2015, at 09:52, Andrea Ferretti <ferrettiand...@gmail.com> wrote:
>>
>> Also, these tasks
>> often involve consuming data from various sources, such as CSV and
>> JSON files. NeoCSV and NeoJSON are still a little too rigid for the
>> task - libraries like pandas let you just feed in a CSV file and try to
>> make heads or tails of the content without having to define too much of
>> a schema beforehand
>
> Both NeoCSV and NeoJSON can operate in two ways: (1) without the definition
> of any schemas or (2) with the definition of schemas and mappings. The
> quick and dirty exploratory style is most certainly possible.
>
> 'my-data.csv' asFileReference readStreamDo: [ :in | (NeoCSVReader on: in) upToEnd ].
>
>   => an array of arrays
>
> 'my-data.json' asFileReference readStreamDo: [ :in | (NeoJSONReader on: in) next ].
>
>   => objects structured using dictionaries and arrays
>
> Sven
>
>
