You've all made good points, and I changed the code slightly to provide the initial array size in order to avoid the recreation of the array on each iteration. This brought the loading time down to a much more bearable *14 seconds*. I rewrote the Lisp code to be compatible with the APL code and the time was *1.46 seconds*. This suggests that GNU APL is consistently about 10 times slower than non-optimised Lisp code. To me, this is not unexpected, given that GNU APL isn't designed to be high-performance.
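As a rough illustration of why providing the size up front matters, here is a minimal Lisp sketch. It assumes the slow version effectively built a new, larger array for every row. The function names are made up for this example and are not part of the code that was timed.

```lisp
;; Hypothetical helpers for illustration only; not the code measured above.

(defun collect-by-recreating (rows)
  "Build the result by allocating a new, one-element-larger vector per row.
Every step copies everything gathered so far, so total work is quadratic."
  (let ((res (make-array 0)))
    (dolist (row rows res)
      (let ((bigger (make-array (1+ (length res)))))
        (replace bigger res)                   ; copy all earlier rows
        (setf (aref bigger (length res)) row)  ; store the new row at the end
        (setf res bigger)))))

(defun collect-preallocated (rows n)
  "Allocate the result once with N slots and fill it in place (linear work)."
  (let ((res (make-array n)))
    (loop for row in rows
          for i from 0 below n
          do (setf (aref res i) row))
    res))
```

Both functions return the same vector for the same input; only the amount of copying differs, and that copying grows with the square of the row count in the first version.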
However, while 14 seconds for 30k is manageable, I have had the need to work with arrays of over a million rows. Extrapolating this suggests that it would take almost 8 minutes to load such a file. Thus, unless GNU APL can magically improve overall performance by at least 10 times, I still think we need a native CSV loading function.

Regards,
Elias

For reference, here is the APL code:

    ∇Z ← type convert_entry value
      →('n'≡type)/numeric
      →('s'≡type)/string
      ⎕ES 'Illegal conversion type'
    numeric:
      Z←⍎value
      →end
    string:
      Z←value
    end:
    ∇

    ∇Z ← pattern read_csv_n[n] filename ;fd;line;separator;i
      separator ← ' '
      Z ← n (↑⍴pattern) ⍴ 0
      fd ← 'r' FIO∆fopen filename
      i ← ⎕IO
    next:
      line ← FIO∆fgets fd              ⍝ Read one line from the file
      →(⍬≡line)/end
      →(10≠line[⍴line])/skip_nl        ⍝ If the line ends in a newline
      line ← line[⍳¯1+⍴line]           ⍝ Remove the newline
    skip_nl:
      line ← ⎕UCS line
      Z[i;] ← pattern convert_entry¨ (line≠separator) ⊂ line
      i ← i+1
      →next
    end:
      FIO∆fclose fd
    ∇

And here is the Lisp code (the test case was running on SBCL), requires the QL packages SPLIT-SEQUENCE and PARSE-NUMBER:

    (defparameter *result*
      (time
       (with-open-file (s "apjs492452t1_mrt.txt")
         (let ((res (make-array '(34030 11))))
           (dotimes (i (array-dimension res 0))
             (let* ((line (read-line s))
                    (parts (split-sequence:split-sequence #\Space line
                                                          :remove-empty-subseqs t)))
               (loop for ii from 0 below 10
                     for p in parts
                     do (setf (aref res i ii) (parse-number:parse-number p)))
               (setf (aref res i 10) (nth 10 parts))))
           res))))

On 18 January 2017 at 09:57, Blake McBride <blake1...@gmail.com> wrote:

> On Tue, Jan 17, 2017 at 7:39 PM, Xiao-Yong Jin <jinxiaoy...@gmail.com>
> wrote:
>
>> I always feel GNU APL kind of slow compared to Dyalog, but I never really
>> compared two in large dataset.
>> I'm mostly using J now for large dataset.
>> If Elias has the optimized code for GNU APL and a reproducible way to
>> measure timing, I'd like to compare it with Dyalog and J.
>
> I think that's actually a good idea.  It would be a good comparison.  It
> would really make it clear if there is a blaring problem.  But first the
> APL code should be optimized a bit (but nothing crazy like reading it all
> into memory right now.)
>
> --blake