I installed the sxml package out of curiosity, and while it is faster, it
is not the 4 times faster that your tests indicate. I used the following
test program with a 14 MB XML file (a bike ride in TCX format):

    ;; Assumes #lang racket, with the sxml package installed
    (require xml sxml)

    (define file-name
      "../MyPackages/more-df-tests/tcx-data/2015-09-27-0755_Road_Cycling_WF.tcx")

    ;; Make sure the file is in the cache
    (call-with-input-file file-name
      (lambda (in)
        (let loop ([c (read-char in)])
          (unless (eof-object? c)
            (loop (read-char in))))))

    ;; Time the sxml parser, starting from a freshly collected heap
    (collect-garbage 'major)
    (time (void (call-with-input-file file-name
                  (lambda (in) (ssax:xml->sxml in null)))))

    ;; Time the built-in parser under the same conditions
    (collect-garbage 'major)
    (time (void (call-with-input-file file-name read-xml)))

On my laptop the times are:
ssax:xml->sxml : cpu time: 4031 real time: 4128 gc time: 157
read-xml: cpu time: 9578 real time: 10031 gc time: 3270
The big difference I have found so far is that `read-xml` stores the
location (line number, column and file offset) for each element, and
enables `port-count-lines!` on the input port by default. If I use:

    (parameterize ([xml-count-bytes #t])
      (time (void (call-with-input-file file-name read-xml))))

the results are much closer together, although `read-xml` is still slower
and spends more time in the garbage collector:
ssax:xml->sxml : cpu time: 4187 real time: 4233 gc time: 202
read-xml: cpu time: 5797 real time: 5824 gc time: 1251
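
For anyone who wants to repeat the comparison without my TCX file, something
along these lines should work. It is only a rough sketch: the synthetic
document and the helper name `parse-cpu-ms` are made up for illustration, and
the numbers will differ from the ones above.

```racket
#lang racket/base
(require xml racket/port)

;; Build a synthetic document in memory (made-up data, not the TCX file).
(define big-doc
  (with-output-to-string
    (lambda ()
      (display "<root>")
      (for ([i (in-range 20000)])
        (printf "<point id=\"~a\"><lat>1.5</lat><lon>2.5</lon></point>\n" i))
      (display "</root>"))))

;; Parse big-doc once with the given xml-count-bytes setting and return
;; the CPU time in milliseconds, as reported by time-apply.
(define (parse-cpu-ms count-bytes?)
  (collect-garbage 'major)
  (define-values (results cpu real gc)
    (parameterize ([xml-count-bytes count-bytes?])
      (time-apply (lambda () (read-xml (open-input-string big-doc))) '())))
  cpu)

(printf "xml-count-bytes #f: ~a ms\n" (parse-cpu-ms #f))
(printf "xml-count-bytes #t: ~a ms\n" (parse-cpu-ms #t))
```

A single run is noisy, so it is worth repeating each measurement a few times
before drawing conclusions.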
Perhaps a note could be added to the documentation indicating that users
can speed up `read-xml` significantly if they set `xml-count-bytes` to #t.
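
Such a note should probably also mention what changes in the recorded
locations. The small sketch below prints the start location of a nested
element under both settings. The sample document and the helper names
(`inner-start`, `describe`) are hypothetical. As far as I understand it, with
`xml-count-bytes` set to `#t` the line and column are not tracked and the
offset is counted in bytes, but I have not checked the exact values.

```racket
#lang racket
(require xml)

;; A tiny inline document, made up for illustration.
(define sample "<a>\n  <b>hello</b>\n</a>")

;; Parse sample with the given xml-count-bytes setting and return the
;; recorded start location of the first child element of the root.
(define (inner-start count-bytes?)
  (parameterize ([xml-count-bytes count-bytes?])
    (define root (document-element (read-xml (open-input-string sample))))
    (source-start (findf element? (element-content root)))))

;; Render a location defensively: source-start may also return a symbol or #f.
(define (describe loc)
  (if (location? loc)
      (format "line ~a, column ~a, offset ~a"
              (location-line loc) (location-char loc) (location-offset loc))
      (format "~a" loc)))

(define default-loc (inner-start #f))
(define bytes-loc (inner-start #t))

(printf "xml-count-bytes #f: ~a\n" (describe default-loc))
(printf "xml-count-bytes #t: ~a\n" (describe bytes-loc))
```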
Alex.
On Saturday, June 27, 2020 at 11:05:42 AM UTC+8 'John Clements' via
users-redirect wrote:
> I’m parsing a large-ish apple plist file, (18 megabytes), and I find that
> the built-in xml parsing (read-xml) takes about five times as long as the
> sxml version (11 seconds vs 2.4 seconds on my machine), and that the plist
> parser is way longer, at 18 seconds.
>
> Would anyone object if I added a margin note to this effect to the xml
> docs?
>
> John