This looks like a bug in the regexp matcher. (The examples run without
error on Racket CS, which has a new implementation of the matcher.)
I'll work on a repair. Meanwhile, as you've noted that the bug seems to
be related to `\W`, using something else in place of `\W` is probably the
easiest workaround.
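
A minimal sketch of that substitution, assuming an explicit negated character class is an acceptable stand-in for `\W` (in Racket's `#px` syntax, `\w` covers only the ASCII word characters a-z, A-Z, 0-9, and `_`). The helper name `match-at-words` is hypothetical, and nothing in this thread confirms that the rewritten pattern avoids the bug on Racket v7.2.

```racket
#lang racket/base

;; Hypothetical helper (not from the thread): the same search as
;; #px"\\W\\wat", with \W spelled out as an explicit negated class.
;; [^a-zA-Z0-9_] matches any character that is not an ASCII word
;; character, which is how the px syntax defines \W.
(define (match-at-words str)
  (regexp-match* #px"[^a-zA-Z0-9_]\\wat" str))

;; Small sample input; the original report used a substring of the book text.
(match-at-words "a cat sat on that mat")
```

With the book loaded as in the original report, the call would be `(match-at-words (substring book 24000 32000))`.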
Laurent,
Thanks for the idea. Unfortunately, that doesn't seem to make a difference.
Here's a search involving only 8000 characters.
> (regexp-match* #px"\\W\\wat" (substring book 24000 32000))
. . ../../Applications/Racket v7.2/collects/racket/private/kw.rkt:1325:47:
substring: ending index
Not sure what the actual problem is, but maybe this excerpt from the docs
is relevant:
The internal size of a regexp value is limited to 32 kilobytes; this
limit roughly corresponds to a source string with 32,000 literal characters
or 5,000 operators.
Source:
http://docs.racket-lang.org/refere
Dear Racket Users,
Some of my students are getting strange results from regexp-match* and I'm
hoping that someone on the list might be able to explain what's happening.
They've selected the book at
http://www.gutenberg.org/cache/epub/37499/pg37499.txt, which is encoded in
UTF-8. The student