Well, you are certainly free to contribute. Heuristic interpretation of data could be useful, but it looks like an addition on top; the core library should be fast and efficient.
> On 18 Feb 2015, at 10:35, Andrea Ferretti <ferrettiand...@gmail.com> wrote:
>
> For an example of what I am talking about, see
>
> http://pandas.pydata.org/pandas-docs/version/0.15.2/io.html#csv-text-files
>
> I agree that this is definitely too many options, but it gets the job
> done for quick and dirty exploration.
>
> The fact is that working with a dump of a table from your db, whose
> content you know, requires different tools than exploring the latest
> open data that your local municipality has put online, using yet
> another messy format.
>
> Enterprise programmers deal more often with the former, data
> scientists with the latter, and I think there is room for both kinds
> of tools.
>
> 2015-02-18 10:26 GMT+01:00 Andrea Ferretti <ferrettiand...@gmail.com>:
>> Thank you Sven. I think this should be emphasized and prominent on the
>> home page*. Still, libraries such as pandas are even more lenient,
>> doing things such as:
>>
>> - autodetecting which fields are numeric in CSV files
>> - allowing you to fill missing data based on statistics (for instance,
>>   you can say: where the field `age` is missing, use the average age)
>>
>> Probably there is room for something built on top of Neo.
>>
>> * By the way, I suggest that the documentation on Neo could benefit
>> from a reorganization. Right now, the first topic in the NeoJSON
>> paper introduces JSON itself. I would argue that everyone who tries
>> to use the library already knows what JSON is. Still, there is no
>> example of how to read JSON from a file in the whole document.
>>
>> 2015-02-18 10:12 GMT+01:00 Sven Van Caekenberghe <s...@stfx.eu>:
>>>
>>>> On 18 Feb 2015, at 09:52, Andrea Ferretti <ferrettiand...@gmail.com> wrote:
>>>>
>>>> Also, these tasks often involve consuming data from various sources,
>>>> such as CSV and JSON files. NeoCSV and NeoJSON are still a little too
>>>> rigid for the task - libraries like pandas allow you to just feed in
>>>> a CSV file and try to make heads or tails of the content without
>>>> having to define too much of a schema beforehand.
>>>
>>> Both NeoCSV and NeoJSON can operate in two ways: (1) without the
>>> definition of any schemas, or (2) with the definition of schemas and
>>> mappings. The quick and dirty explore style is most certainly possible.
>>>
>>> 'my-data.csv' asFileReference readStreamDo: [ :in | (NeoCSVReader on: in) upToEnd ].
>>>
>>> => an array of arrays
>>>
>>> 'my-data.json' asFileReference readStreamDo: [ :in | (NeoJSONReader on: in) next ].
>>>
>>> => objects structured using dictionaries and arrays
>>>
>>> Sven
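For reference, the pandas behaviour Andrea describes (auto-detecting that a CSV column is numeric, then filling missing values with the column mean) can be sketched roughly like this; the data and column names here are invented for illustration:

```python
import io
import pandas as pd

# A tiny CSV with one missing value in the `age` column
# (made-up sample data, just to show the mechanism).
csv = io.StringIO("name,age\nalice,30\nbob,\ncarol,40\n")

# read_csv infers dtypes: `age` comes back numeric (float64,
# because the missing entry is represented as NaN).
df = pd.read_csv(csv)

# "Where the field `age` is missing, use the average age":
df["age"] = df["age"].fillna(df["age"].mean())

print(df["age"].tolist())  # -> [30.0, 35.0, 40.0]
```

This kind of heuristic layer is what could conceivably sit on top of the stricter NeoCSV/NeoJSON core.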