Re: guile performance - Ackermann function: way slower than emacs, slower still if compiled

Marijn Schouten (hkBst) Tue, 04 Aug 2009 07:02:04 -0700

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Ken Raeburn wrote:
> I implemented Ackermann's function A(m,n), a recursive function with
> scary performance characteristics, based on the definition given at
> wikipedia involving lots of +1 and -1 operations...  A(0,n) takes
> constant time to compute assuming bounded integer sizes, A(1,n) takes
> linear time, A(2,n) is quadratic, A(3,n) is exponential... A(4,n) takes
> a really long time if n=1, and for anything higher you can just give up
> now.  (A(4,2) is in the neighborhood of 2**65536, and A(4,3) is about 2
> raised to that power.  And it gets there by adding or subtracting 1 in
> each invocation.)  Optimizations are possible, like A(1,n)=n+2,
> A(2,n)=2n+3, A(3,n)=2**(n+3)-3, but I didn't encode those optimizations.
> 
> I wrote both elisp and scheme versions; I changed the recursion to an
> explicitly-managed stack of values, to avoid some of the
> dynamic-vs-lexical argument binding issues, turn the native stack usage
> into large numbers of cell allocations and discards, and avoid exceeding
> the maximum Lisp evaluation depth Emacs permits.  Not surprisingly, it's
> not too fast, and Emacs spends a lot of time in garbage collection
> passes even with the byte-compiled version.  My time-test function here
> counts seconds of real time (on an 8-core Mac without a lot of other
> stuff going on, so CPU usage is probably very close):
> 
> ELISP> (time-test '(ackermann 3 1))
> 1.6e-05
> ELISP> (time-test '(ackermann 3 6))
> 0.036018
> ELISP> (time-test '(ackermann 3 7))
> 0.16574899999999998
> ELISP> (time-test '(ackermann 3 8))
> 0.6457170000000001
> ELISP> (time-test '(ackermann 3 9))
> 2.482042
> ELISP> (time-test '(ackermann 4 0))
> 1.4e-05
> ELISP> (time-test '(ackermann 4 1))
> 642.746514
> ELISP>
> 
> With the installed guile-1.8.7 binary and the interpreted Scheme version
> of the code, performance was worse:
> 
> % time guile -l ack.scm -c '(ackermann 3 1)'
> 0.009u 0.003s 0:00.01 0.0%    0+0k 0+0io 0pf+0w
> % time guile -l ack.scm -c '(ackermann 3 6)'
> 0.409u 0.078s 0:00.52 90.3%    0+0k 0+0io 0pf+0w
> % time guile -l ack.scm -c '(ackermann 3 9)'
> 25.300u 4.540s 0:29.85 99.9%    0+0k 0+0io 0pf+0w
> % time guile -l ack.scm -c '(ackermann 4 1)'
> 6016.150u 1122.109s 1:58:59.32 99.9%    0+0k 8+4io 149pf+0w
> %
> 
> Yes, that's just shy of two hours for the last one, where Emacs took
> less than 11 minutes.
> 
> (Did I make some silly mistake in translating, that causes the Scheme
> version to be far less efficient?)
> 
> It's not a terribly fair comparison, for various reasons, but moving on...
> 
> I built and installed 1.9.1, so I could test with more recent code:
> 
> % time guile -l ack.scm -c '(ackermann 3 1)'
> 0.026u 0.006s 0:00.03 66.6%    0+0k 0+0io 0pf+0w
> % time guile -l ack.scm -c '(ackermann 3 6)'
> 0.761u 0.141s 0:00.90 100.0%    0+0k 0+0io 0pf+0w
> % time guile -l ack.scm -c '(ackermann 3 9)'
> 47.242u 8.644s 0:55.89 99.9%    0+0k 0+0io 0pf+0w
> %
> 
> Since this behaved worse than 1.8.6, I skipped the 4,1 test.  I tested
> the compiled version:
> 
> % time guile -c '(begin (load-compiled "ack.go") (ackermann 3 1))'
> 0.012u 0.004s 0:00.01 100.0%    0+0k 0+0io 0pf+0w
> % time guile -c '(begin (load-compiled "ack.go") (ackermann 3 6))'
> 0.772u 0.333s 0:01.10 100.0%    0+0k 0+0io 0pf+0w
> % time guile -c '(begin (load-compiled "ack.go") (ackermann 3 9))'
> 48.728u 22.016s 1:10.75 99.9%    0+0k 0+0io 0pf+0w
> %
> 
> Not much better than loading the .scm file, and only better for small
> values of n... in fact, with m=3 and n>=6 the performance seems worse in
> the precompiled case:
> 
> % time guile -l ack.scm -c '(ackermann 3 10)'
> 181.997u 34.928s 3:36.94 99.9%    0+0k 0+0io 0pf+0w
> % time guile -c '(begin (load-compiled "ack.go") (ackermann 3 10))'
> 196.678u 85.618s 4:42.34 99.9%    0+0k 0+0io 0pf+0w
> %
> 
> Poking at it a couple of times with Apple's process sampling tool, it
> does seem to be spending a bit of time in garbage collection -- but
> scm_leave_guile (called from scm_async_click via scm_pthread_mutex_lock)
> and getrusage (called from scm_c_get_internal_run_time via mytime) also
> seem to show up on the stack a significant fraction of the time, and
> each of those appears to be a comparable if not bigger time sink.
> 
> According to the execution timing with the compiled code, GC actually
> doesn't seem to be a huge fraction of the run time, maybe 8-9%:
> 
> scheme@(guile-user)> ,t (ackermann 3 8)
> 2045
> clock utime stime cutime cstime gctime
> 18.43 12.96  5.47   0.00   0.00   1.66
> scheme@(guile-user)> ,t (ackermann 3 9)
> 4093
> clock utime stime cutime cstime gctime
> 73.47 51.61 21.85   0.00   0.00   5.95
> scheme@(guile-user)>
> 
> The latter is the case that took under three seconds in Emacs, remember.
> 
> Garbage collection is much more of a factor in the non-compiled version:
> 
> scheme@(guile-user)> (load "ack.scm")
> scheme@(guile-user)> ,t (ackermann 3 8)
> 2045
> clock utime stime cutime cstime gctime
> 21.97 19.74  2.21   0.00   0.00   7.76
> scheme@(guile-user)> ,t (ackermann 3 9)
> 4093
> clock utime stime cutime cstime gctime
> 86.26 77.37  8.88   0.00   0.00  30.20
> scheme@(guile-user)>
> 
> It also looks like execution time measurements may add more to the
> non-compiled execution time, as with the measurements the compiled code
> was still faster at 3,9.
> 
> In retrospect, this is much heavier on the interpretation or execution
> of some of the simple operations than I intended, and Emacs benefits a
> lot from opcodes like "add1", and I'm more interested specifically in
> exercising allocation and GC at the moment, so it's not that useful a
> test for me after all.  But I thought I'd send this along anyways in
> case someone was interested.  In particular the fact that the
> performance of the precompiled version gets worse as n grows might be
> something to investigate, unless the cause is already known....


A more idiomatic way to implement it that also works in other Schemes is:


; $ larceny -- acker.scm -e "(time (pretty-print (A 3 9)))" -e "(quit)"
; $ mzscheme -f acker.scm -e "(time (display (A 3 9))(newline))"
; $ time guile -l acker.scm -c "(display (A 3 9))(newline)"
; $ gsi acker.scm -e "(time (pp (A 3 9)))"

(define (A m n)
  (let loop ((m+ (list m)) (n n))
    (if (null? m+)
        n
        (let loop1 ((m (car m+)) (m* (cdr m+)) (n n))
          (if (= m 0)
              (loop m* (+ n 1))
              (if (= n 0)
                  (loop1 (- m 1) m* 1)
                  (loop1 m (cons (- m 1) m*) (- n 1)) ))))))


On my machine larceny computes (A 3 9) in 83 ms and (A 4 1) in less than 22
seconds. In guile-1.8.6 this version is also twice as fast as your version for
(A 3 9); they run 9s and 18s respectively. I haven't tried the new guile
compiler yet.

Marijn

- --
If you cannot read my mind, then listen to what I say.

Marijn Schouten (hkBst), Gentoo Lisp project, Gentoo ML
<http://www.gentoo.org/proj/en/lisp/>, #gentoo-{lisp,ml} on FreeNode
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.11 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkp4ONoACgkQp/VmCx0OL2wCmwCgh4a3N6mifoRKuQXG/eaylTqb
yzQAn0Kb6YRQy6CXyAEcs8t+yBZsHWLa
=6wJR
-----END PGP SIGNATURE-----

Re: guile performance - Ackermann function: way slower than emacs, slower still if compiled

Reply via email to