I installed the sxml package out of curiosity, and while it is faster, it 
is not 4 times as fast, which is what your tests suggest. I used the 
following test program with a 14 MB XML file (a bike ride in TCX format):

    ;; Assumed requires: read-xml comes from xml, ssax:xml->sxml from sxml
    (require xml sxml)

    (define file-name
      "../MyPackages/more-df-tests/tcx-data/2015-09-27-0755_Road_Cycling_WF.tcx")
    ;; Make sure the file is in the cache
    (call-with-input-file file-name
      (lambda (in)
        (let loop ([c (read-char in)])
          (unless (eof-object? c)
            (loop (read-char in))))))
    (collect-garbage 'major)
    (time (void (call-with-input-file file-name
                  (lambda (in) (ssax:xml->sxml in null)))))
    (collect-garbage 'major)
    (time (void (call-with-input-file file-name read-xml)))

On my laptop the times are:

     ssax:xml->sxml : cpu time: 4031 real time: 4128 gc time: 157
     read-xml: cpu time: 9578 real time: 10031 gc time: 3270

The big difference I have found so far is that `read-xml` stores the 
location (line number, column and file offset) for each element, and 
enables `port-count-lines!` by default.  If I use:

    (parameterize ([xml-count-bytes #t])
      (time (void (call-with-input-file file-name read-xml))))

The results are much closer together, although `read-xml` is still slower 
and spends more time in the garbage collector:

     ssax:xml->sxml :  cpu time: 4187 real time: 4233 gc time: 202
     read-xml: cpu time: 5797 real time: 5824 gc time: 1251
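
To illustrate what `port-count-lines!` changes, here is a minimal sketch 
using only `racket/base`. It reads one line from a small string port and 
reports the next location, with and without line counting. The helper name 
`location-after-read` is made up for this example, and the sketch only 
shows the extra bookkeeping the port takes on; it does not measure how much 
of the `read-xml` slowdown that bookkeeping accounts for.

```racket
#lang racket/base

;; Hypothetical helper, for illustration only: read one line from a
;; small in-memory port and return the next location as a list of
;; line, column and position.
(define (location-after-read count-lines?)
  (define in (open-input-string "ab\ncd"))
  ;; Line and column tracking is off until port-count-lines! is called
  (when count-lines?
    (port-count-lines! in))
  (read-line in)
  (define-values (line col pos) (port-next-location in))
  (list line col pos))

(location-after-read #f) ; line and column are not tracked
(location-after-read #t) ; line and column are tracked
```

Without line counting the port only tracks a position; with it, every 
character read also updates the line and column.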

Perhaps a note could be added to the documentation indicating that users 
can speed up `read-xml` significantly if they set `xml-count-bytes` to `#t`.
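
For anyone who wants this setting applied consistently, a small wrapper 
might look like the sketch below. The name `read-xml/count-bytes` is 
hypothetical and not part of the `xml` library; only `read-xml`, 
`xml-count-bytes`, `document-element` and `element-name` are. It is tried 
here on a tiny in-memory document rather than the TCX file.

```racket
#lang racket/base
(require xml)

;; Hypothetical wrapper (not provided by the xml library): parse a
;; document with xml-count-bytes set to #t, the setting used in the
;; faster timing above.
(define (read-xml/count-bytes in)
  (parameterize ([xml-count-bytes #t])
    (read-xml in)))

;; Usage on a small in-memory document
(define doc
  (read-xml/count-bytes
   (open-input-string "<ride><lap n=\"1\"/></ride>")))

;; Name of the root element
(element-name (document-element doc))
```

Since `parameterize` limits the change to the dynamic extent of the call, 
other code that reads XML keeps the default location tracking.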

Alex.

On Saturday, June 27, 2020 at 11:05:42 AM UTC+8 'John Clements' via 
users-redirect wrote:

> I’m parsing a large-ish apple plist file, (18 megabytes), and I find that 
> the built-in xml parsing (read-xml) takes about five times as long as the 
> sxml version (11 seconds vs 2.4 seconds on my machine), and that the plist 
> parser is way longer, at 18 seconds.
>
> Would anyone object if I added a margin note to this effect to the xml 
> docs?
>
> John
