Fair enough. I will keep playing with this and report back on any results I
get. Thanks for the test cases. :-)

Regards,
Elias


On 28 April 2014 22:24, Juergen Sauermann <juergen.sauerm...@t-online.de>wrote:

>  Hi,
>
> generally speaking unnecessary clone() of values should of course be
> avoided.
>
> In GNU APL 1.0 and 1.1 there was a flag-based system of value ownership
> where the last owner
> would delete the value when giving up its interest in the value. This
> system began like the
> tmp flag, but then caused stale values on one hand and segfaults for
> values released too early on the
> other hand all over the place.
>
> I then changed to Value_P which is pretty much a shared pointer with
> reference counting. Initially
> I tried the standard shared_ptr<> class of C++ but that caused some
> compiler/portability issues.
> That change has brought a lot of stability into GNU APL,
>
> As you have noticed yourself, going through the code and changing is all
> over the place is bad and
> error-prone, and it would kind of undo the change from the flag-system to
> Value_P (which was also
> a huge effort as you may figure from SVN).
>
> So the only solution remaining is rewriting the Value (or. more likely,
> the Value_P) class.
> Now, the Value class knows how many owners it has (the share_count below
> already exists, it is called owner_count).
> But it does not know who they are. If someone does X[Y]←Z then symbol X
> would need to clone its current value
> when Value::index(Z) os performed, but Value::index(Z) does not know
> anything about Symbol X and other owners
> of the value,
>
> I will send you the 170+ GNU APL testcases so that you can quickly check
> if some change you make is breaking
> something else.
>
> Having several Values pointing to the same ravel is also a bit obscure
> because the Cells of the ravel need to be
> released (sub values and complex numbers need to be deleted) which is
> almost impossible if you have nested
> ravels or ravels of of different sizes.
>
> I also doubt that you can gain 3-4 orders of magnitude because a value is
> normally only cloned very few times.
> That does not mean that you can't prove otherwise, I can speed up +/⍳N  by
> 6 orders of magnitude but that does
> not prove that this is a valuable optimization,
>
> /// Jürgen
>
>
>
> On 04/28/2014 02:57 PM, Elias Mårtenson wrote:
>
> Hello Jürgen,
>
>  I don't know if you have given this issue any thought, but it has
> certainly occupied my mind for the last few days.
>
>  It's clear that heavy array processing does far too much cloning than
> should be necessary. Especially in cases where you have lots of operations
> on smaller arrays (as opposed to few operations on large arrays).
>
>  This is because the code always performs a clone prior to a destructive
> operation because at that point it can never be sure that the array will
> not be used again. The (very few) exceptions to this is handled by the
> "temp" system, which is really only used in the ravel function.
>
>  The way I see it, there are two approaches to solving this: The first
> one being to go through the code with a fine-toothed comb and implement the
> temp system everywhere. This is lots of work and is error-prone.
>
>  The other solution would be to re-engineer the Value class so that it
> can share underlying data with other Value instances. The Value class
> would then get a counter called share_count or something, indicating how
> many other references there are to the same data. When clone() is called,
> it will simply create a new Value instance, share the underlying data and
> increase share_count. Any destructive operations would only copy the
> content if it's not already shared.
>
>  *Now, Value_P already implements some of these semantics. Could it be
> reused for this?*
>
>  While appealing, this copy-on-write solution might not be perfect
> though. The assumption is that the caller would have decremented the share
> count (effectively "releasing" the Value) before the called function tries
> to modify it. Now, this "release" is similar to the the "temp" marking.
> Could there be a better way?
>
>  I've been experimenting with this on and off, and my interim results (as
> I've mentioned before) show that the potential performance boosts are
> massive. We're talking about 3-4 orders of magnitude. Definitely worth
> quite a lot of effort, IMHO.
>
>  I'm likely going to continue pursuing this, because I personally feel
> frustrated when I do something and it's not instantaneous, knowing that I'm
> waiting for the interpreter to perform unnecessary operations. :-)
>
>  What are your thoughts on this? What would be the best approach?
>
>  Regards,
> Elias
>
>
>

Reply via email to