Melvin Smith wrote:
> BUT... for the string_replace internal API it might still be worthwhile
> for performance to add additional semantics so we don't have to
> nest stuff for performance.
>
> I'm open to adding a dest_len arg or something, but consider this a
> request for comments on any additional string semantics that we might
> be missing that I'm unaware of.

My concern here would be where to stop. If string_replace allows a
substring for its replacement string (incidentally, that would be 2
extra arguments, for offset and length), then perhaps string_concat
should allow both its arguments to be substrings, and so on.

If we look at the code in pasm (excuse my rearrangement of the
original post):

> my $orig = "123123";
> my $rep = "456456";
>
> If I want orig to be 123456 I have to do
>
> substr($orig, 3, 3, substr($rep, 0, 3) );

    set S0, "123123"
    set S1, "456456"
    substr S31, S1, 0, 3
    substr S0, 3, 3, S31

As compared to the proposed:

    set S0, "123123"
    set S1, "456456"
    substr S0, 3, 3, S1, 0, 3

There are three things that cause the existing code to be slower:
1) Creating a new string header for the substring
2) Copying the data
3) The interim header stays in use until S31 is reused and a DOD run
   occurs

Point 1 can be helped by reducing the cost of creating headers; I have
done some work in this area which I hope to post soon.

Point 2 is probably the least significant cost, and it can be avoided
by COW (copy-on-write).

Point 3 contributes greatly to the cost of point 1, since header
creation is fairly cheap only when free headers are available. The
best option I have thought of for helping here is to allow voluntary
explicit disposal of unwanted resources, i.e. the addition of
'free S31' after the second substr in the existing code above, since
the compiler knows S31 is not needed after execution of that
statement. This would return the buffer header to the free pool, and
clear the register. The actual string data would only be reclaimed by
GC. This is not a trivial task, as there is currently no method of
determining what resource type an S register points to; however, a
hacked-together test I did a few weeks ago indicated about a 5%
improvement in 'life' with just this change.

Basically, I am suggesting that we put more effort into tuning the
basic engine, rather than add additional opcodes/functions to improve
performance of specific high-level functions.
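
As a rough illustration of the explicit-disposal idea above, here is a minimal sketch in C. It is not Parrot's actual header code: Header, Pool, header_alloc, header_free and run_temporaries are all made-up names, the pool is deliberately tiny, and the "DOD run" is simulated by simply assuming every header is dead (true here, since each one is only a temporary like S31). It only shows why handing a header straight back to the free pool keeps allocation on the cheap path.

```c
#include <stddef.h>

#define POOL_SIZE 4                 /* deliberately tiny, for illustration */

/* Hypothetical string header; the string data itself is left to the GC. */
typedef struct Header {
    const char    *data;
    size_t         len;
    struct Header *next_free;       /* link in the free list */
} Header;

typedef struct Pool {
    Header  slots[POOL_SIZE];
    Header *free_list;
    int     dod_runs;               /* simulated DOD runs so far */
} Pool;

/* Put every slot on the free list. Used at start-up and by the
 * simulated DOD run, which assumes all headers are dead. */
static void pool_rebuild_free_list(Pool *p) {
    p->free_list = NULL;
    for (int i = 0; i < POOL_SIZE; i++) {
        p->slots[i].data = NULL;
        p->slots[i].len = 0;
        p->slots[i].next_free = p->free_list;
        p->free_list = &p->slots[i];
    }
}

void pool_init(Pool *p) {
    p->dod_runs = 0;
    pool_rebuild_free_list(p);
}

/* Cheap path: pop a header off the free list.
 * Slow path: no free headers, so pay for a (simulated) DOD run first. */
Header *header_alloc(Pool *p) {
    if (p->free_list == NULL) {
        p->dod_runs++;
        pool_rebuild_free_list(p);
    }
    Header *h = p->free_list;
    p->free_list = h->next_free;
    h->next_free = NULL;
    return h;
}

/* The hypothetical 'free Sx': return the header to the free pool and
 * clear the register. The string data is not touched here. */
void header_free(Pool *p, Header **reg) {
    Header *h = *reg;
    if (h == NULL)
        return;
    h->data = NULL;
    h->len = 0;
    h->next_free = p->free_list;
    p->free_list = h;
    *reg = NULL;
}

/* Create `iterations` temporaries, optionally freeing each one at once.
 * Returns how many simulated DOD runs were needed. */
int run_temporaries(int iterations, int explicit_free) {
    Pool p;
    pool_init(&p);
    for (int i = 0; i < iterations; i++) {
        Header *s31 = header_alloc(&p);
        if (explicit_free)
            header_free(&p, &s31);
    }
    return p.dod_runs;
}
```

With this 4-header pool, run_temporaries(100, 1) never needs a simulated DOD run, while run_temporaries(100, 0) needs 24 of them. A real engine would of course have to mark live headers rather than assume they are all dead.
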
If, when we get to the stage of running real code, we find a specific
operation is running too slowly, then we can look at specialised
handling for it.

Incidentally, try the following little program sometime:

    substr S0, "constant", 0, 8, "variable"
    print "This is a "
    print "constant"
    print " constant.\n"
    end

This can be disallowed at assembly time (see patch below); but, as a
general rule, where should we be preserving our constants?

--
Peter Gibbs
EmKel Systems

Index: core.ops
===================================================================
RCS file: /home/perlcvs/parrot/core.ops,v
retrieving revision 1.137
diff -u -r1.137 core.ops
--- core.ops 15 May 2002 05:01:15 -0000 1.137
+++ core.ops 15 May 2002 15:45:48 -0000
@@ -1777,7 +1777,7 @@
   goto NEXT();
 }
 
-inline op substr(out STR, in STR, in INT, in INT, in STR) {
+inline op substr(out STR, invar STR, in INT, in INT, in STR) {
   $1 = string_replace(interpreter, $2, $3, $4, $5, &$1);
   goto NEXT();
 }