Thank you, Richard, for the detailed response.

Noury
On Mar 18 2024, at 8:06 am, Richard O'Keefe <rao...@gmail.com> wrote:
> Let me start by giving some figures from my Smalltalk, on an Intel
> Core i5-6200U @ 2.3 GHz CPU laptop with 8GB of memory running Ubuntu
> 22.04 and gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0. Smalltalk is
> compiled to C then finished with the system C compiler. Static
> whole-program compilation is allowed by the ANSI standard and the
> system was originally written to serve as a baseline for Bryce's JIT.
> nsec  technique
>  249  replaceFrom:to:with:startingAt:*5
>  128  withAll:*5
>  486  ,,,,
>  492  (,,),(,)
>  521  streamContents:
>  367  StringWriteStream
>  860  StringBuffer>>addAllLast:
>  385  StringBuffer>>nextPutAll:
>
> replaceFrom:to:with:startingAt:*5 makes a string the right size then
> fills it in using #replaceFrom:to:with:startingAt:.
> withAll:*5 is String withAll: a withAll: b withAll: c withAll: d
> withAll: e (supported up to 6 withAlls.)
> This is interesting because the result can be a [ReadOnly](ByteArray
> -- UTF8 -- or ShortArray -- UTF16 -- or String -- UTF32) and each of the
> up to 6 operands can independently be these things. It wasn't
> intended as a fast alternative to #, .
> ,,,, is a,b,c,d,e
> (,,),(,) is (a,b,c),(d,e).
> streamContents: is what you had
> StringWriteStream is basically the same as streamContents: but using a
> WriteStream specialised to Strings with some extra primitive support.
> There are also StringReadStream and StringReadWriteStream.
> StringBuffer is my version of Java's StringBuilder; it's a cross
> between a String, an OrderedCollection, and a WriteStream. It can
> change size like an OrderedCollection; it has most of the "writing"
> methods (but not the "position" ones) of a WriteStream, and at all
> times you can use it as a String without having to copy the contents.
> You would expect #addAllLast: and #nextPutAll: to have the same
> result, and they do, but they were written at different times and
> #nextPutAll: was optimised for the case where the operand is a string
> while #addAllLast: wasn't.
>
> What does all that mean in practice?
> It means that a benchmark like this is VERY SENSITIVE to the details
> of how the library is written.
> Even just bracketing the commas differently gives you a different time.
>
> It means that techniques which are more efficient for LARGE volumes of
> data may have startup costs
> that make them less efficient for SMALL volumes of data, and that this
> is a very small benchmark.
> The cost of a,b,c,d,e is proportional to |a|*5 + |b|*4 + |c|*3 + |d|*2
> + |e|, while the other techniques
> are proportional to |a| + |b| + |c| + |d| + |e|, BUT have overheads of
> their own.
>
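
A rough worked count for the benchmark strings may make the cost comparison concrete. This is only a sketch under an assumed model: each `,` allocates a fresh string and copies both operands into it, and nothing else is counted. Under that model the left-to-right chain copies a and b four times each, which is slightly less than the estimate quoted above but has the same shape. The variable names are illustrative.

```smalltalk
| sizes chained single |
sizes := #(5 5 5 5 6).   "sizes of a, b, c, d, e in the benchmark"

"Assumed model: the k-th operand joins a result that already holds the
 first k-1 operands, and all k of them are copied into a new string."
chained := 0.
2 to: sizes size do: [:k |
    chained := chained
        + ((sizes first: k) inject: 0 into: [:sum :each | sum + each])].

"A single-pass technique copies every character exactly once."
single := sizes inject: 0 into: [:sum :each | sum + each].
```

With strings this short the difference in characters copied is small, which fits the point that fixed overheads dominate the measurement.
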
> Well, that was astc. What about Pharo?
> 1,950,528 per second  ,,,,
> 6,509,256 per second  withAll:*5
>
> Here it is. I've added withAll:*2 to withAll:*6 to ArrayedCollection class.
> withAll: c1 withAll: c2 withAll: c3 withAll: c4 withAll: c5
>     |e1 e2 e3 e4 e5|
>     e1 := c1 size.
>     e2 := c2 size + e1.
>     e3 := c3 size + e2.
>     e4 := c4 size + e3.
>     e5 := c5 size + e4.
>     ^(self new: e5)
>         replaceFrom: 1 to: e1 with: c1 startingAt: 1;
>         replaceFrom: e1+1 to: e2 with: c2 startingAt: 1;
>         replaceFrom: e2+1 to: e3 with: c3 startingAt: 1;
>         replaceFrom: e3+1 to: e4 with: c4 startingAt: 1;
>         replaceFrom: e4+1 to: e5 with: c5 startingAt: 1;
>         yourself
>
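
For anyone who wants to try the same idea in a playground without adding methods to ArrayedCollection class, here is a minimal sketch. It is a loop-based variant, not the fixed-arity method quoted above, so it carries extra loop overhead and should not be expected to reproduce those timings. The variable names `parts`, `total`, `pos` and `result` are my own.

```smalltalk
| a b c d e parts total pos result |
a := 'aaaaa'. b := 'bbbbb'. c := 'ccccc'. d := 'ddddd'. e := 'eeeeee'.
parts := {a. b. c. d. e}.

"Pass 1: work out the final size so the result is allocated once."
total := parts inject: 0 into: [:sum :each | sum + each size].
result := String new: total.

"Pass 2: copy each fragment into its slot."
pos := 1.
parts do: [:each |
    result
        replaceFrom: pos
        to: pos + each size - 1
        with: each
        startingAt: 1.
    pos := pos + each size].
```
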
> What's the lesson here? Just because A is faster than B doesn't mean
> there isn't a fairly obvious C, D, ..., that will beat A.
>
> Now what is the real argument in favour of StringBuilder in Java and
> streamContents: in Smalltalk?
>
> s := ''.
> 1 to: n do: [:i | s := s , 'X'].
>
> makes a string of n Xs but takes O(n**2) time and turns over O(n**2) memory.
> s := String streamContents: [:o | 1 to: n do: [:i | o nextPut: $X]].
> makes a string of n Xs while taking O(n) time and turning over O(n) memory.
> n does not have to be very big before this gets to be a HUGE difference.
>
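
A quick way to see the growth difference for yourself is to time both loops in a playground. This is only a sketch: the value of `n` is arbitrary, the temporaries are my own names, and the absolute numbers will depend on the image and the machine.

```smalltalk
| n quadratic linear slow fast |
n := 20000.   "arbitrary size, chosen only for illustration"

"Repeated concatenation: every step copies the whole string built so far."
slow := Time millisecondsToRun: [
    quadratic := ''.
    1 to: n do: [:i | quadratic := quadratic , 'X']].

"Stream: characters are appended to a buffer that grows as needed."
fast := Time millisecondsToRun: [
    linear := String streamContents: [:o |
        1 to: n do: [:i | o nextPut: $X]]].
```

Inspect `slow` and `fast` afterwards, then double `n` and run it again: the first figure should grow much faster than the second.
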
> For what it's worth, the Java compiler turns a+b+c+d+e into code that creates
> a StringBuilder, stuffs a ... e into it, and then pulls a string out.
> There is no point in benchmarking a fixed number of concatenations against a
> StringBuilder in Java because they're the same thing. Smalltalk compilers
> don't do that.
>
> In Java and in Smalltalk you should seldom concatenate strings, but should
> send the fragments directly to their final destination. I've never quite
> made up my mind whether being toString()-centric was Java's biggest blunder
> or just the second biggest, but it was a pretty darned big one for sure.
> Smalltalk got this right: #printOn: is the basic notion and #printString
> the derived and best avoided one.
>
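
To illustrate the #printOn: point, here is a small playground sketch contrasting the two styles. The sample objects and variable names are arbitrary choices of mine; the only claim is that both routes produce the same text, while the second never builds the intermediate strings.

```smalltalk
| items viaStrings viaStream |
items := {3 @ 4. 1 / 2. #foo}.   "arbitrary sample objects"

"String-centric: make a printString per item, then concatenate."
viaStrings := ''.
items do: [:each | viaStrings := viaStrings , each printString , ' '].

"Stream-centric: each item prints itself straight onto the destination."
viaStream := String streamContents: [:out |
    items do: [:each |
        each printOn: out.
        out space]].
```
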
> On Sat, 16 Mar 2024 at 08:12, Noury Bouraqadi <bouraq...@gmail.com> wrote:
> >
> > I thought streamContents: was faster than using a comma binary message...
> >
> > I was wrong. Pharo is not Java :-)
> >
> > Noury
> >
> > "Run in P11"
> >
> > a := 'aaaaa'.
> >
> > b := 'bbbbb'.
> >
> > c := 'ccccc'.
> >
> > d := 'ddddd'.
> >
> > e := 'eeeeee'.
> >
> > [ a , b , c , d , e ] bench.
> >
> > "'3958888.090 per second'"
> >
> > "'3808242.503 per second'"
> >
> >
> > [
> >
> > String streamContents: [ :str |
> >
> > str
> >
> > << a;
> >
> > << b;
> >
> > << c;
> >
> > << d;
> >
> > << e ] ] bench
> >
> > "'3083603.838 per second'"
> >
> > "'2927641.144 per second'"
>
