I installed the sxml package out of curiosity. It is faster, but in my 
tests it is not 4 times as fast, as your numbers suggest. I used the 
following test program with a 14 MB XML file (a bike ride in TCX format):

    ;; Requires added for completeness (assumed; not in my original paste)
    (require xml sxml)

    (define file-name
      "../MyPackages/more-df-tests/tcx-data/2015-09-27-0755_Road_Cycling_WF.tcx")
    ;; Make sure the file is in the cache
    (call-with-input-file file-name
      (lambda (in)
        (let loop ([c (read-char in)])
          (unless (eof-object? c)
            (loop (read-char in))))))
    (collect-garbage 'major)
    (time (void (call-with-input-file file-name
                  (lambda (in) (ssax:xml->sxml in null)))))
    (collect-garbage 'major)
    (time (void (call-with-input-file file-name read-xml)))

On my laptop the times are:

     ssax:xml->sxml : cpu time: 4031 real time: 4128 gc time: 157
     read-xml: cpu time: 9578 real time: 10031 gc time: 3270

The big difference I have found so far is that `read-xml` stores the 
location (line number, column and file offset) for each element, and 
enables `port-count-lines!` by default.  If I use:

    (parameterize ([xml-count-bytes #t])
      (time (void (call-with-input-file file-name read-xml))))

The results are much closer together, although `read-xml` is still slower 
and spends more time in the garbage collector:

     ssax:xml->sxml : cpu time: 4187 real time: 4233 gc time: 202
     read-xml: cpu time: 5797 real time: 5824 gc time: 1251

Perhaps a note could be added to the documentation indicating that users 
who do not need line and column information can speed up `read-xml` 
significantly by setting `xml-count-bytes` to `#t`.
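
For anyone who wants to apply this without wrapping every call site, a 
small helper along these lines should work. This is only a sketch: 
`read-xml/fast` is a name I made up, not something the xml library 
provides, and the inline document just stands in for a real file.

```racket
#lang racket/base
(require xml)

;; Hypothetical helper (not part of the xml library): parse with
;; `xml-count-bytes` set to #t, as in the timing run above.
(define (read-xml/fast in)
  (parameterize ([xml-count-bytes #t])
    (read-xml in)))

;; Tiny in-memory document so the sketch runs without an external file.
(define doc
  (read-xml/fast (open-input-string "<ride><lap n=\"1\">42</lap></ride>")))

;; The parsed document is used exactly as before.
(define root (document-element doc))
(define root-name (element-name root))
```

The trade-off is that the locations recorded for each element are byte 
based, so this is only appropriate if you do not rely on line and column 
numbers.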

Alex.

On Saturday, June 27, 2020 at 11:05:42 AM UTC+8 'John Clements' via 
users-redirect wrote:

> I’m parsing a large-ish apple plist file, (18 megabytes), and I find that 
> the built-in xml parsing (read-xml) takes about five times as long as the 
> sxml version (11 seconds vs 2.4 seconds on my machine), and that the plist 
> parser is way longer, at 18 seconds.
>
> Would anyone object if I added a margin note to this effect to the xml 
> docs?
>
> John
