On Mon, 2013-04-01 at 04:36 -0400, Mark H Weaver wrote: > Nala Ginrut <nalagin...@gmail.com> writes: > > > I've tried to implement a function to mimic string multiply like Python: > > "asdf" * 10 > > > > --------------code---------------- > > (define (str* str n) > > (format #f "~{~a~}" (make-list n str))) > > > > or > > > > (define (str* str n) > > (string-join (make-list n str) "")) > > --------------end----------------- > > > > > > Both are very slow when N is large (> 1000000). > > Indeed, the implementation of 'string-join' was very bad: about O(n^2) > in the length of the list (assuming that the strings are roughly the > same length). Thanks for bringing this to my attention. The problem > was that it called 'string-append' repeatedly, adding one component at a > time to the result string. Since each call to 'string-append' copied > the source strings into a fresh new string, this resulted in a lot of > unnecessary copying and allocation. > > I just pushed a much faster O(n) implementation to stable-2.0, which > instead constructs a list of strings, and then calls 'string-append' > only once. > > http://git.savannah.gnu.org/gitweb/?p=guile.git;a=commit;h=786ab4258fbf605f46287da5e7550d3ab4b68589 > > On my system, this makes (string-join (make-list 100000 "test") "-") > over 3000 times faster (about 28.5 milliseconds vs about 98 seconds). > I expect that the same test with 1,000,000 elements would be about > 30,000 times faster (roughly 2.7 hours vs 0.3 seconds), but I didn't > have the patience to wait 2.7 hours to verify this :) >
Thanks Mark! string-join is a common thing for text processing(include web develop). However, our powerful 'format' is not so efficient. I do think it's necessary to we spend some time on it. > Before: > > scheme@(guile-user)> ,time (define s (string-join (make-list 10000 "test") > "-")) > ;; 0.998800s real time, 0.996677s run time. 0.984885s spent in GC. > scheme@(guile-user)> ,time (define s (string-join (make-list 100000 "test") > "-")) > ;; 98.006569s real time, 97.817077s run time. 97.795970s spent in GC. > > After: > > scheme@(guile-user)> ,time (define s (string-join (make-list 10000 "test") > "-")) > ;; 0.006362s real time, 0.006351s run time. 0.000000s spent in GC. > scheme@(guile-user)> ,time (define s (string-join (make-list 100000 "test") > "-")) > ;; 0.028513s real time, 0.028457s run time. 0.022235s spent in GC. > scheme@(guile-user)> ,time (define s (string-join (make-list 1000000 "test") > "-")) > ;; 0.303098s real time, 0.302543s run time. 0.289639s spent in GC. > scheme@(guile-user)> ,time (define s (string-join (make-list 10000000 "test") > "-")) > ;; 3.288105s real time, 3.281922s run time. 3.174460s spent in GC. > > Format is still slow for large numbers of elements, but I'm not > sufficiently motivated to dive into that swamp right now. > > Thanks, > Mark