Here's a somewhat-optimized version of the code:

#lang racket/base
(require racket/string racket/vector racket/port)

(define h (make-hash))

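;; Count occurrences of each lowercased word read from stdin.
(time
 (for* ([l (in-lines)]
        [w (in-list (string-split l))]
        [w* (in-value (string-downcase w))])
   (hash-update! h w* add1 0)))

;; Copy the (word . count) pairs into a preallocated vector.
(define v
  (time
   (for/vector #:length (hash-count h)
       ([(k v) (in-hash h)])
     (cons k v))))
;; Sort in place by count, highest first.
(time (vector-sort! v > #:key cdr))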
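;; Swap in (open-output-nowhere) to time the loop without real output.
(define p (current-output-port) #;(open-output-nowhere))
;; Write "word count" lines directly instead of going through `printf`.
(time
 (for ([pair (in-vector v)])
   (write-string (car pair) p)
   (write-string " " p)
   (write-string (number->string (cdr pair)) p)
   (newline p)))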

It's much more imperative, but also pretty nice and compact. Replacing
`printf` with direct `write-string` calls makes a significant
difference to the output loop, but that loop accounts for little of
the running time. The overall running time for 10 copies of the KJV is
about 9 seconds on my laptop.
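
The `printf` difference can be checked in isolation. Here is a minimal
sketch, assuming a synthetic list of pairs and a port that discards its
output; `emit/fprintf`, `emit/direct`, and `sample-pairs` are
hypothetical names, not part of the original program:

```racket
#lang racket/base
(require racket/port)

;; Hypothetical sample data: 100,000 (word . count) pairs.
(define sample-pairs
  (for/list ([i (in-range 100000)])
    (cons "word" i)))

;; Formatted output: the format string is interpreted on every call.
(define (emit/fprintf pair out)
  (fprintf out "~a ~a\n" (car pair) (cdr pair)))

;; Direct output: write each piece without a format string.
(define (emit/direct pair out)
  (write-string (car pair) out)
  (write-string " " out)
  (write-string (number->string (cdr pair)) out)
  (newline out))

;; Discard the output so that only the formatting cost is timed.
(define nowhere (open-output-nowhere))

(time (for ([pair (in-list sample-pairs)])
        (emit/fprintf pair nowhere)))
(time (for ([pair (in-list sample-pairs)])
        (emit/direct pair nowhere)))
```

Both functions write the same bytes, so any gap between the two `time`
lines comes from the formatting path alone. The numbers will vary by
machine and Racket version.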

I think the remaining difference between Racket and other languages is
likely due to the `string-split` and `string-downcase` functions, plus
the relatively inefficient string representation that Racket uses.
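
One way to test that guess is to time the string work and the hash
updates separately. A rough sketch, assuming a made-up input line
repeated many times (`lines`, `words`, and `total-chars` are
illustrative names); real text will behave differently:

```racket
#lang racket/base
(require racket/string)

;; Hypothetical input: one short line repeated 100,000 times.
(define lines
  (for/list ([i (in-range 100000)])
    "In the beginning God created the heaven and the earth"))

;; Phase 1: splitting and downcasing only, no hash table.
;; Summing the lengths keeps the result of string-downcase in use.
(define total-chars
  (time
   (for*/sum ([l (in-list lines)]
              [w (in-list (string-split l))])
     (string-length (string-downcase w)))))

;; Prepare the words up front so phase 2 measures hashing alone.
(define words
  (for*/list ([l (in-list lines)]
              [w (in-list (string-split l))])
    (string-downcase w)))

;; Phase 2: hash updates only, on already-prepared words.
(define h (make-hash))
(time
 (for ([w (in-list words)])
   (hash-update! h w add1 0)))
```

If the first `time` line dominates, the string functions are the place
to look; if the second does, the hash table is.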

Sam


On Thu, Mar 18, 2021 at 10:28 AM Pawel Mosakowski <pa...@mosakowski.net> wrote:
>
> Hi David,
>
> Yes, the 21 seconds includes the interpreter startup time. I ran a
> simple test to see how long startup takes:
>
> $ time racket -e '(displayln "Hello, world")'
> Hello, world
>
> real    0m0.479s
> user    0m0.449s
> sys    0m0.030s
>
> I have also put my code inside a main function and profiled it:
>
> Profiling results
> -----------------
>   Total cpu time observed: 20910ms (out of 20970ms)
>   Number of samples taken: 382 (once every 55ms)
>   (Hiding functions with self<1.0% and local<2.0%: 1 of 12 hidden)
>
> ==============================================================
>                                   Caller
>  Idx    Total         Self      Name+src                Local%
>         ms(pct)       ms(pct)     Callee
> ==============================================================
>  [1] 20910(100.0%)     0(0.0%)  [running body] ...word-occurences-profile.rkt":##f
>                                   profile-thunk [2]     100.0%
> --------------------------------------------------------------
>                                   [running body] [1]    100.0%
>  [2] 20910(100.0%)     0(0.0%)  profile-thunk ...ket/pkgs/profile-lib/main.rkt:9:0
>                                   run [3]               100.0%
> --------------------------------------------------------------
>                                   profile-thunk [2]     100.0%
>  [3] 20910(100.0%)     0(0.0%)  run ...share/racket/pkgs/profile-lib/main.rkt:39:2
>                                   main [4]              100.0%
> --------------------------------------------------------------
>                                   run [3]               100.0%
>  [4] 20910(100.0%)    50(0.2%)  main ...cket/count-word-occurences-profile.rkt:5:0
>                                   read-from-stdin-it [5] 98.5%
>                                   ??? [6]                 0.2%
> --------------------------------------------------------------
>                                   main [4]              100.0%
>  [5] 20606(98.5%)  11796(56.4%) read-from-stdin-it ...-occurences-profile.rkt:19:6
>                                   internal-split [7]     42.8%
> --------------------------------------------------------------
>                                   main [4]              100.0%
>  [6]    51(0.2%)       0(0.0%)  ??? ...cket/collects/racket/private/sort.rkt:369:3
>                                   generic-sort/key [8]  100.0%
> --------------------------------------------------------------
>                                   read-from-stdin-it [5]100.0%
>  [7]  8810(42.1%)   3528(16.9%) internal-split ...collects/racket/string.rkt:117:0
>                                   regexp-split [9]       59.9%
> --------------------------------------------------------------
>                                   ??? [6]               100.0%
>  [8]    51(0.2%)       0(0.0%)  generic-sort/key .../racket/private/sort.rkt:156:2
>                                   copying-mergesort [10]100.0%
> --------------------------------------------------------------
>                                   internal-split [7]    100.0%
>  [9]  5282(25.3%)   2810(13.4%) regexp-split ...ts/racket/private/string.rkt:338:2
>                                   loop [11]              46.8%
> --------------------------------------------------------------
>                                   generic-sort/key [8]   10.0%
>                                   copying-mergesort [10] 90.0%
> [10]    51(0.2%)      51(0.2%)  copying-mergesort ...racket/private/sort.rkt:129:8
>                                   copying-mergesort [10] 90.0%
> --------------------------------------------------------------
>                                   regexp-split [9]      100.0%
> [11]  2471(11.8%)   2471(11.8%) loop ...t/collects/racket/private/string.rkt:169:7
> --------------------------------------------------------------
>
> Kind regards,
> Pawel
>
>
> On Thursday, March 18, 2021 at 2:09:35 PM UTC david....@gmail.com wrote:
>>
>> Hi Pawel,
>>
>> I'll take a look at the code later, but did that 21 seconds include startup 
>> time for the interpreter?
>>
>> On Thu, Mar 18, 2021, 9:24 AM Pawel Mosakowski <pa...@mosakowski.net> wrote:
>>>
>>> Hello,
>>>
>>> I am a Racket beginner and I have come across this article:
>>>
>>> https://benhoyt.com/writings/count-words/
>>>
>>> This is my attempt at solving the challenge:
>>>
>>> https://pastebin.com/kL16w5Hc
>>>
>>> However, when I benchmarked it, it took ~21 seconds to run, compared to
>>> the Python and Ruby versions, which take around 3-4 seconds.
>>>
>>> I understand that both Ruby and Python probably have their string operations
>>> and hash tables implemented in optimized C, but is there anything I can do
>>> to improve the performance of my program?
>>>
>>> Many thanks for all help and suggestions.
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups 
>>> "Racket Users" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>> email to racket-users...@googlegroups.com.
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/racket-users/118c1340-66d1-421d-92a4-6b66c56cb88fn%40googlegroups.com.
>
