I learned some things by studying the Racket version of this program,
thanks.
I tried to rewrite it without using the Racket built-ins. In the end I
wanted a version where edits1 and edits2 wouldn't allocate the lists of
intermediate results, but rather process the items as they were
generat
I just submitted new PR: https://github.com/jmoy/norvig-spell/pull/4
I moved `(in-value (substring rht 1))` so it's calculated only once.
This change reduces the time in my machine from 14,4s to 13,3s.
The new implementation has too many `in-value`. I was wondering if it
would be a good idea to h
I have now tested using v 6.1.1 and the time does drop to 17s from the
earlier 24s though this still remains higher than Python's 13s.
I have updated the GitHub README with the new numbers.
Jyotirmoy
On Mon, Jan 5, 2015 at 6:59 AM, Jyotirmoy Bhattacharya <
jyotir...@jyotirmoy.net> wrote:
> > On
At Sun, 4 Jan 2015 20:18:11 +0100, Jens Axel Søgaard wrote:
> One possibility: Python hash tables are fast(er).
That reminds me: I sped up `equal?` hash tables on strings in v6.1.1.
If Jyotirmoy is using version v6.1 instead of v6.1.1, that could
explain the time differences that he sees versus G
> Both of these caused a serious slowdown. -- Matthias
Too bad.
One possibility: Python hash tables are fast(er).
There is a nice chapter on their implementation in Beautiful Code,
but I couldn't find a pdf to link to.
http://www.amazon.com/Beautiful-Code-Leading-Programmers-Practice/dp/0596510
On Jan 4, 2015, at 7:52 AM, Gustavo Massaccesi wrote:
> The original code is something like:
>
> (define ALPHABET (in-list (string->list "abc...xyz")))
> (for/list ([c ALPHABET]))
>(string-append "?" (string c) "?"))
>
> The first problem is that outside a clause of a `for` the `in-list`
I've just submitted a pull request with two changes
https://github.com/jmoy/norvig-spell/pull/3 . In my machine, these
changes reduce the time of the tests from 17.7 seconds, to 16.2
seconds, and finally to 14.5 seconds.
The original code is something like:
(define ALPHABET (in-list (string->li
On Jan 2, 2015, at 11:38 AM, Jens Axel Søgaard wrote:
> 2015-01-02 17:10 GMT+01:00 Matthias Felleisen :
>>
>> Jens -- thanks for re-directing us to your article. This saved me quite
>> some time. I noticed the lacking abstractions (slicing, comprehension)
>> and considered sketching a prototype
Hi, Jyotirmoy.
>> When I make that change, my run time decreases from ~16s to ~10s, and
>> produces the same output (which differs from output.txt in the same
>> way I mentioned above).
>
>> In relative terms this would probably get it close to the Python version?
>
> I merged a pull request fro
Sorry guys, I tried most of these (including profiler) and I should have
reported so as you could save your time.
I did figure out of course that edits2 is the heavy in the program,
the profiler points there. This isn't actionable advice, however.
[With git we should be able to report all the
> Through the sophisticated profiling technique of commenting-out `or`
> branches in `correct`
Oh, I should mention that I did try Optimization Coach, which is a
wonderful tool! It's given me many great suggestions.
In this case it could only suggest some inlines that didn't help. But
for anyone
On Fri, Jan 2, 2015 at 2:12 PM, Greg Hendershott
wrote:
>> Would it help to use an eq? hash and turn all of the strings into
>> symbols (hopefully up front)? That might make the hashing part faster
>> in the best-known function.
>
> Good idea. But unless I made a mistake, no, that's twice as slow,
> Would it help to use an eq? hash and turn all of the strings into
> symbols (hopefully up front)? That might make the hashing part faster
> in the best-known function.
Good idea. But unless I made a mistake, no, that's twice as slow, not
faster. All the string->symbol conversions seem to outweig
Would it help to use an eq? hash and turn all of the strings into
symbols (hopefully up front)? That might make the hashing part faster
in the best-known function. (Also, you can use two accumulators in the
for*/fold instead of one to avoid creating 'cons' cells but that seems
unlikely to matter mu
Although it can't hurt, I don't think `train` comes close to
dominating the time?
Through the sophisticated profiling technique of commenting-out `or`
branches in `correct`, I got the sense that the third one is the
issue:
(define (correct m s)
(let ([best-known (λ (xs) (best-known m xs))])
On Jan 2, 2015, at 11:38 AM, Jens Axel Søgaard wrote:
> Another trick: the regular expression matchers in Racket works both
> on strings and ports, so train can be written
>
>(define (train fname)
> (freqs (words (open-input-file fname
>
> I am not sure whether it will give a s
2015-01-02 17:10 GMT+01:00 Matthias Felleisen :
>
> Jens -- thanks for re-directing us to your article. This saved me quite
> some time. I noticed the lacking abstractions (slicing, comprehension)
> and considered sketching a prototype to use with this program. I decided
> to wait till today becaus
Jens -- thanks for re-directing us to your article. This saved me quite
some time. I noticed the lacking abstractions (slicing, comprehension)
and considered sketching a prototype to use with this program. I decided
to wait till today because it didn't address what bothered me most:
speed.
Ra
Preface:
1. Your status quo version is about 16s on my laptop.
2. The output differs from your repo's output.txt on one item: It
corrects "accesing" to "accusing" rather than "acceding".
Next:
(append-map edits1 (edits1 s))
is very large -- 40,000+ items for "cat". But it looks like Norvig
pru
I have been trying to translate Norvig's famous spelling correction program
into Racket. Norvig's Python version is:
http://norvig.com/spell-correct.html
and my Racket version is
https://github.com/jmoy/norvig-spell/blob/master/racket/norvig.rkt
On my system the Racket version takes about 31s w
20 matches
Mail list logo