Hi! Andy Wingo <wi...@igalia.com> skribis:
> Nice investigation! Perhaps slot-allocation should track live variables > using something that's not bigints, but who knows. Yeah I wondered; it’s not clear whether bitvectors would be more efficient, for instance, although we could make it perhaps locally imperative. > On Wed 05 Feb 2020 17:29, Ludovic Courtès <l...@gnu.org> writes: > >> /* The next three functions (custom_libgmp_*) are passed to >> mp_set_memory_functions (in GMP) so that memory used by the digits >> themselves is known to the garbage collector. This is needed so >> @@ -237,19 +227,20 @@ finalize_bignum (void *ptr, void *data) >> static void * >> custom_gmp_malloc (size_t alloc_size) >> { >> - return scm_malloc (alloc_size); >> + return scm_gc_malloc (alloc_size, "GMP"); >> } >> >> static void * >> custom_gmp_realloc (void *old_ptr, size_t old_size, size_t new_size) >> { >> - return scm_realloc (old_ptr, new_size); >> + return scm_gc_realloc (old_ptr, old_size, new_size, "GMP"); >> } >> >> static void >> custom_gmp_free (void *ptr, size_t size) >> { >> - free (ptr); >> + /* Do nothing: all memory allocated by GMP is under GC control and >> + will be freed when needed. */ >> } > > I think this makes sense to me as a short-term fix. The down-side is > that limbs can alias Scheme objects. Yes. To my surprise, on a pure bignum microbenchmark, this is counterproductive: --8<---------------cut here---------------start------------->8--- $ guile ~/src/guile-debugging/bignum-finalizers.scm # 3.0.0 clock utime stime cutime cstime gctime 2.42 6.20 0.17 0.00 0.00 5.62 heap size: 2.0 MiB $ /data/src/guile-3.0/meta/guile ~/src/guile-debugging/bignum-finalizers.scm clock utime stime cutime cstime gctime 3.97 10.91 0.15 0.00 0.00 10.60 heap size: 3.0 MiB $ cat ~/src/guile-debugging/bignum-finalizers.scm (use-modules (ice-9 time)) (time (let loop ((n (expt 2 18)) (i 1)) (unless (zero? n) ;; (display ".") (loop (- n 1) (logior 0 (ash i 1)))))) (format #t "heap size: ~a MiB~%" (round (/ (assoc-ref (gc-stats) 'heap-size) (expt 2. 20)))) --8<---------------cut here---------------end--------------->8--- (Here we’re creating ~24 bignums, no more.) I wonder if there’s another part of the story that I’m missing here. Perf report for 3.0.0: --8<---------------cut here---------------start------------->8--- 46.93% guile libgc.so.1.3.6 [.] GC_mark_from 17.61% guile libgc.so.1.3.6 [.] GC_header_cache_miss 9.96% guile libgc.so.1.3.6 [.] GC_add_to_black_list_normal 5.20% guile libgmp.so.10.3.2 [.] __gmpn_lshift_coreisbr 4.13% guile libgc.so.1.3.6 [.] GC_find_header 2.28% guile libgc.so.1.3.6 [.] GC_finalize 2.09% guile libgc.so.1.3.6 [.] GC_base --8<---------------cut here---------------end--------------->8--- With the patch: --8<---------------cut here---------------start------------->8--- 48.40% guile libgc.so.1.3.6 [.] GC_mark_from 17.74% guile libgc.so.1.3.6 [.] GC_header_cache_miss 11.90% guile libgc.so.1.3.6 [.] GC_add_to_black_list_normal 4.45% guile libgc.so.1.3.6 [.] GC_find_header 2.31% guile libgmp.so.10.3.2 [.] __gmpn_lshift_coreisbr 2.30% guile libgc.so.1.3.6 [.] GC_base 1.73% guile libgc.so.1.3.6 [.] GC_finalize --8<---------------cut here---------------end--------------->8--- IOW, the relative part of computations drops from 5% to 2%. Thoughts? > In the long-term I think we should be representing bignums as > pointerless objects whose first word is the tag and a word count, > followed by inline "limbs" (in the sense of > https://gmplib.org/manual/Nomenclature-and-Types.html#Nomenclature-and-Types). > Generally we can use the low-level API to work on these > (https://gmplib.org/manual/Low_002dlevel-Functions.html#Low_002dlevel-Functions), > and if we need to use mpz_t, we can easily create an mpz_t that points > to these values. Yes, that sounds like the right approach longer-term. Note that ‘mpz_t’ is exposed through “numbers.h”, which I guess means we cannot change that in 3.0.x. Thanks, Ludo’.