Fair enough. I will keep playing with this and report back on any results I get. Thanks for the test cases. :-)
Regards, Elias On 28 April 2014 22:24, Juergen Sauermann <juergen.sauerm...@t-online.de>wrote: > Hi, > > generally speaking unnecessary clone() of values should of course be > avoided. > > In GNU APL 1.0 and 1.1 there was a flag-based system of value ownership > where the last owner > would delete the value when giving up its interest in the value. This > system began like the > tmp flag, but then caused stale values on one hand and segfaults for > values released too early on the > other hand all over the place. > > I then changed to Value_P which is pretty much a shared pointer with > reference counting. Initially > I tried the standard shared_ptr<> class of C++ but that caused some > compiler/portability issues. > That change has brought a lot of stability into GNU APL, > > As you have noticed yourself, going through the code and changing is all > over the place is bad and > error-prone, and it would kind of undo the change from the flag-system to > Value_P (which was also > a huge effort as you may figure from SVN). > > So the only solution remaining is rewriting the Value (or. more likely, > the Value_P) class. > Now, the Value class knows how many owners it has (the share_count below > already exists, it is called owner_count). > But it does not know who they are. If someone does X[Y]←Z then symbol X > would need to clone its current value > when Value::index(Z) os performed, but Value::index(Z) does not know > anything about Symbol X and other owners > of the value, > > I will send you the 170+ GNU APL testcases so that you can quickly check > if some change you make is breaking > something else. > > Having several Values pointing to the same ravel is also a bit obscure > because the Cells of the ravel need to be > released (sub values and complex numbers need to be deleted) which is > almost impossible if you have nested > ravels or ravels of of different sizes. > > I also doubt that you can gain 3-4 orders of magnitude because a value is > normally only cloned very few times. > That does not mean that you can't prove otherwise, I can speed up +/⍳N by > 6 orders of magnitude but that does > not prove that this is a valuable optimization, > > /// Jürgen > > > > On 04/28/2014 02:57 PM, Elias Mårtenson wrote: > > Hello Jürgen, > > I don't know if you have given this issue any thought, but it has > certainly occupied my mind for the last few days. > > It's clear that heavy array processing does far too much cloning than > should be necessary. Especially in cases where you have lots of operations > on smaller arrays (as opposed to few operations on large arrays). > > This is because the code always performs a clone prior to a destructive > operation because at that point it can never be sure that the array will > not be used again. The (very few) exceptions to this is handled by the > "temp" system, which is really only used in the ravel function. > > The way I see it, there are two approaches to solving this: The first > one being to go through the code with a fine-toothed comb and implement the > temp system everywhere. This is lots of work and is error-prone. > > The other solution would be to re-engineer the Value class so that it > can share underlying data with other Value instances. The Value class > would then get a counter called share_count or something, indicating how > many other references there are to the same data. When clone() is called, > it will simply create a new Value instance, share the underlying data and > increase share_count. Any destructive operations would only copy the > content if it's not already shared. > > *Now, Value_P already implements some of these semantics. Could it be > reused for this?* > > While appealing, this copy-on-write solution might not be perfect > though. The assumption is that the caller would have decremented the share > count (effectively "releasing" the Value) before the called function tries > to modify it. Now, this "release" is similar to the the "temp" marking. > Could there be a better way? > > I've been experimenting with this on and off, and my interim results (as > I've mentioned before) show that the potential performance boosts are > massive. We're talking about 3-4 orders of magnitude. Definitely worth > quite a lot of effort, IMHO. > > I'm likely going to continue pursuing this, because I personally feel > frustrated when I do something and it's not instantaneous, knowing that I'm > waiting for the interpreter to perform unnecessary operations. :-) > > What are your thoughts on this? What would be the best approach? > > Regards, > Elias > > >