Melvin Smith wrote:

> BUT... for the string_replace internal API it might still be worthwhile
> for performance to add additional semantics so we don't have to
> nest stuff for performance.
>
> I'm open to adding a dest_len arg or something, but consider this a
> request for comments on any additional string semantics that we might
> be missing that I'm unaware of.

My concern here would be where to stop. If string_replace allows a substring
for its replacement string (incidentally, that would be 2 extra arguments,
for offset and length), then perhaps string_concat should allow both its
arguments to be substrings, and so on.

If we look at the code in pasm (excuse my rearrangement of the original
post):

> my $orig = "123123";
> my $rep = "456456";
>
> If I want orig to be 123456 I have to do
>
> substr($orig, 3, 3, substr($rep, 0, 3) );

 set S0, "123123"
 set S1, "456456"
 substr S31, S1, 0, 3
 substr S0, 3, 3, S31

As compared to the proposed:

 set S0, "123123"
 set S1, "456456"
 substr S0, 3, 3, S1, 0, 3
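
What the six-operand form would have to do can be sketched in C roughly as
follows. This is only an illustration: the name replace_range is made up, and
it works on plain C strings and returns a fresh copy, whereas Parrot's
string_replace operates on STRING headers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of Parrot: build a new string in which
 * dst[off .. off+len) is replaced by src[src_off .. src_off+src_len).
 * Returns NULL on out-of-range arguments or allocation failure. */
char *replace_range(const char *dst, size_t off, size_t len,
                    const char *src, size_t src_off, size_t src_len)
{
    size_t dst_len, full_src_len;
    char *out;

    assert(dst != NULL && src != NULL);
    dst_len = strlen(dst);
    full_src_len = strlen(src);

    /* Both ranges must lie inside their strings. */
    if (off > dst_len || len > dst_len - off)
        return NULL;
    if (src_off > full_src_len || src_len > full_src_len - src_off)
        return NULL;

    out = malloc(dst_len - len + src_len + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, dst, off);                        /* part before the range */
    memcpy(out + off, src + src_off, src_len);    /* slice of the source   */
    memcpy(out + off + src_len, dst + off + len,
           dst_len - off - len + 1);              /* tail, including NUL   */
    return out;
}
```

Calling replace_range("123123", 3, 3, "456456", 0, 3) gives "123456" without
an intermediate substring, which is the saving the proposal is after.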

There are three things that cause the existing code to be slower:
1) Creating a new string header for the substring
2) Copying the data
3) The interim header stays in use until S31 is reused and a DOD run occurs

Point 1 can be helped by reducing the cost of creating headers; I have done
some work in this area which I hope to post soon.
Point 2 is probably the least significant cost, and it can be avoided by
copy-on-write (COW).
Point 3 contributes greatly to point 1, as header creation is fairly cheap
when free headers are available. The best option I have thought of here is
to allow voluntary, explicit disposal of unwanted resources, i.e. adding
'free S31' after the second substr in the code above, since the compiler
knows S31 is not needed once that statement has executed. This would return
the buffer header to the free pool and clear the register. The actual
string data would only be reclaimed by GC. This is not a trivial task, as
there is currently no method of determining what resource type an S
register points to; however, a hacked-together test I did a few weeks ago
indicated about a 5% improvement in 'life' with just this change.
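
The idea behind the explicit 'free' can be sketched in C as a header free
list. This is a minimal illustration and not Parrot's actual allocator: the
Header and HeaderPool types and the header_alloc/header_free names are all
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical string header; the real Parrot STRING has more fields. */
typedef struct Header {
    char          *data;      /* string data, left for the collector   */
    size_t         len;
    struct Header *next_free; /* link used only while on the free list */
} Header;

typedef struct HeaderPool {
    Header *free_list;        /* headers ready for reuse               */
    size_t  fresh_allocs;     /* headers that had to come from malloc  */
} HeaderPool;

/* Take a header from the free list if one is there; otherwise allocate. */
Header *header_alloc(HeaderPool *pool)
{
    Header *h;

    assert(pool != NULL);
    h = pool->free_list;
    if (h != NULL) {
        pool->free_list = h->next_free;   /* cheap path: reuse */
    } else {
        h = malloc(sizeof *h);            /* expensive path    */
        if (h == NULL)
            return NULL;
        pool->fresh_allocs++;
    }
    h->data = NULL;
    h->len = 0;
    h->next_free = NULL;
    return h;
}

/* Voluntary disposal: put the header back on the free list and clear the
 * register that pointed to it. The string data itself is not touched. */
void header_free(HeaderPool *pool, Header **reg)
{
    assert(pool != NULL && reg != NULL);
    if (*reg == NULL)
        return;
    (*reg)->next_free = pool->free_list;
    pool->free_list = *reg;
    *reg = NULL;
}
```

With this arrangement, a header released straight after its last use is
handed back by the very next header_alloc call, so no new allocation and no
DOD run is needed to recycle it.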

Basically, I am suggesting that we put more effort into tuning the basic
engine, rather than add additional opcodes/functions to improve performance
of specific high-level functions. If, when we get to the stage of running
real code, we find a specific operation is running too slowly, then we can
look at specialised handling for it.

Incidentally, try the following little program sometime:

  substr S0, "constant", 0, 8, "variable"
  print "This is a "
  print "constant"
  print " constant.\n"
  end

This can be disallowed at assembly time (see patch below); but, as a general
rule, where should we be preserving our constants?
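
The mechanism behind that program can be sketched in C. The sketch assumes
that each distinct string constant is stored once and shared by every
instruction that uses it, and that the replace happens in place; const_table,
replace_in_place and run_example are hypothetical names, not Parrot code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical constant table with a single entry. Both the substr operand
 * and the later print operand are assumed to refer to this one slot. */
static char const_table[1][16] = { "constant" };

/* In-place replace of buf[off .. off+len) by rep. Simplified: it only
 * handles a replacement exactly as long as the range it overwrites. */
void replace_in_place(char *buf, size_t off, size_t len, const char *rep)
{
    assert(strlen(rep) == len && off + len <= strlen(buf));
    memcpy(buf + off, rep, len);
}

/* Mimics the program: modify the operand in place, then read the
 * "same" constant again, as the later print would. */
const char *run_example(void)
{
    char *operand = const_table[0];          /* substr S0, "constant", ... */
    replace_in_place(operand, 0, 8, "variable");
    return const_table[0];                   /* print "constant"           */
}
```

Under those assumptions run_example() returns "variable": the later read of
the constant sees the modified text because nothing protected the shared slot.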

--
Peter Gibbs
EmKel Systems

Index: core.ops
===================================================================
RCS file: /home/perlcvs/parrot/core.ops,v
retrieving revision 1.137
diff -u -r1.137 core.ops
--- core.ops  15 May 2002 05:01:15 -0000  1.137
+++ core.ops  15 May 2002 15:45:48 -0000
@@ -1777,7 +1777,7 @@
   goto NEXT();
 }

-inline op substr(out STR, in STR, in INT, in INT, in STR) {
+inline op substr(out STR, invar STR, in INT, in INT, in STR) {
   $1 = string_replace(interpreter, $2, $3, $4, $5, &$1);
   goto NEXT();
 }

