Rather than jumping straight to adding new quad functions, I'm wondering how long reading that CSV file takes once the APL code is optimized along the lines of the few suggestions Juergen made.
Specifically, we all know APL is a dog when it comes to looping and doing one thing at a time. Reading the whole thing in as a matrix and processing it as a unit is more APL-ish and would probably have beaten the bad version of the Lisp code. (Of course, reading the whole thing in and processing it as a unit could end up taking 1GB of RAM with the intermediate results.)

On the other hand, reading CSV and fixed-length record files is pretty common and useful.

Thanks.

Blake

On Tue, Jan 17, 2017 at 5:01 PM, Juergen Sauermann <juergen.sauerm...@t-online.de> wrote:

> Hi Elias,
>
> I believe in principle what we want is something like this:
>
>     Z←FOO¨Z←⎕FIO[N] 'filename'
>
> where ⎕FIO[N] reads 'filename' line by line, putting each line j
> into the nested item Z[j], and FOO is a decoding function that
> translates a line into whatever Z[j] shall become in the end.
>
> The current performance problem is then solved by the ¨ operator, which
> allocates a big enough Z beforehand and fills it with the result of FOO
> for each line.
>
> I can try to make ⎕FIO an operator so that you can use
>
>     Z←FOO ⎕FIO[N] 'filename'
>
> for the above, and I hope that will be syntactically possible. It looks
> almost like +/[N]B with FOO instead of + and ⎕FIO instead of /, which I
> believe should work somehow. It can become a little tricky, though,
> because ⎕FIO would then have the same ambiguities as / (function versus
> operator).
>
> /// Jürgen
>
> On 01/17/2017 09:37 PM, Elias Mårtenson wrote:
>
> On 18 January 2017 at 04:10, Juergen Sauermann <juergen.sauerm...@t-online.de> wrote:
>
>> What I do not like about ⎕CSV (actually I am only guessing here,
>> because I don't know what it really does, but I assume it is
>> specifically for comma separated lists) is that it supposedly only
>> works for comma separated lists. If we have something more general
>> which solves the performance problem of Z⍪ without only working for
>> specific formats like CSV, then I would prefer that.
>
> You make a good point. The function I envision (whether an external
> function or a built-in one, called ⎕CSV or otherwise) would accept a
> left-hand argument: a format definition telling the function how to
> parse the CSV data.
>
> You are absolutely correct that there are many ways to express CSV
> data, and looking at the flags available in R
> <https://cran.r-project.org/doc/manuals/R-data.html#Spreadsheet_002dlike-data>
> gives some insight into this. My intention is to build something that
> can at least handle the most important of these variations. What the
> left-hand format definition will look like, I have not yet decided,
> except for one thing: I want to be able to specify a function that will
> be called to parse a line. This way it will be possible to handle any
> format that is not natively supported.
>
> Regards,
> Elias
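
For illustration only, here is a minimal sketch of the two ideas quoted above: allocate the result once and fill it line by line (instead of growing it with repeated catenation), and let the caller supply the function that decodes a line. It is written in Python rather than APL because neither ⎕CSV nor the operator form of ⎕FIO exists in the thread as anything more than a proposal. The names read_records, parse_line and delimiter are hypothetical and are not part of GNU APL or any library.

```python
from typing import Callable, List, Optional, Sequence


def read_records(
    text: str,
    parse_line: Optional[Callable[[str], Sequence]] = None,
    delimiter: str = ",",
) -> List[Sequence]:
    """Decode `text` line by line into a preallocated result list.

    `parse_line` plays the role of the decoding function FOO in the
    quoted proposal; `delimiter` stands in for a (much richer) format
    definition. Both names are assumptions made for this sketch.
    """
    lines = text.splitlines()

    if parse_line is None:
        # Default decoder: naive split, no quoting or escaping rules.
        def parse_line(line: str) -> Sequence:
            return line.split(delimiter)

    # Allocate the result once, then fill slot j with the decoded line j.
    result: List[Sequence] = [None] * len(lines)
    for j, line in enumerate(lines):
        result[j] = parse_line(line)
    return result


# Usage: a caller-supplied decoder turns each line into integers.
sample = "1,2,3\n4,5,6\n"
rows = read_records(sample, parse_line=lambda l: [int(x) for x in l.split(",")])
```

The default decoder here is deliberately naive. A real format definition would have to cover quoting, headers and so on, which is exactly the part that is still undecided in the thread.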