On Wed, 21 Oct 2009, Pritpal Bedi wrote:

Hi,

> > Looks that it's a time to extended our GC and add support for references
> > between GC blocks. I'll try to make it soon before we release 2.0.
> Thank you that you thought this way.
> A must needed feature of VM.
> I always wondered why I am unable to free as much
> resources as I have allocated, mainly due to the this fact
> you explained. And it is really hard to detect and clear such 
> type of occurances.

It may happen only when hb_itemNew() is used, so the dangerous places in
the code can be found easily. The only problem is checking whether .prg
code exploits them or not.

A few months ago I discussed this problem and possible solutions with
Mindaugas on this forum. In general it's necessary to add to the GC a way
to detect logical references between different blocks allocated by
hb_gcAlloc(). Our GC knows nothing about the internal structure of user
GC blocks, so it cannot do this without some help from the programmer.
Currently the GC knows the internal format of Harbour GC items like
arrays, hashes or codeblocks well, because we added code which scans all
their internal references in the GC mark phase.
The problem appears when a programmer wants to store a Harbour item in
a custom GC block which is later returned as a collectible pointer item
to PRG code.
Such a complex item has to be copied into an item allocated by
hb_itemNew(). This protects the item against direct release by the HVM
and also against release by the GC, because the GC keeps a list of all
items allocated by hb_itemNew() and marks all complex items inside them
as used. Using hb_itemNew() the programmer can be sure that the item is
always valid and does not have to worry that it will suddenly be freed,
leaving the pointer stored in C internals pointing to an unused memory
block.

So far all is good, but in some cases it may disable the automatic
destructors of GC blocks. It will happen if the user creates a cyclic
reference and then attaches such a "ring" of items to an item allocated
by hb_itemNew(). Because the GC knows all items allocated by hb_itemNew(),
it always marks them as used, together with all other items which can be
accessed from them. There is still nothing wrong if the programmer later
releases the item allocated by hb_itemNew() using hb_itemRelease(). But
if he does not want to release it explicitly, but rather in the destructor
of another GC block where he stored the pointer to the item, and this
external GC block can be accessed from one of the items in the cyclic
reference path, then he has a problem. The GC always marks such blocks as
used, so they are never released.
To resolve the problem it's necessary to unlock the item returned by
hb_itemNew() and inform the GC how it should access it in the GC mark
pass.
The first part is trivial. It's enough to call hb_gcUnlock( pItem ),
where pItem is the pointer to the item returned by hb_itemNew().
But we do not have anything that resolves the second part well.
Using hb_gcRegisterSweep() is not a reasonable solution.
After unlocking the item, without anything to mark it as used during
the GC mark pass, we lose it as soon as the GC completes all its steps.
So we have to modify the existing GC code.
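
The two failure modes described above can be reproduced in a small
standalone model. This is only a sketch under simplifying assumptions:
ToyBlock and all toy_* names are hypothetical and are not Harbour
structures or API. The "locked" flag stands in for an item allocated by
hb_itemNew(), and "hidden" stands in for a pointer kept in C internals
that the collector cannot see.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_REFS 4

/* Hypothetical toy block, not a Harbour GC structure. */
typedef struct ToyBlock
{
   struct ToyBlock * refs[ TOY_MAX_REFS ]; /* references the collector can follow */
   size_t            nRefs;
   void *            hidden;  /* pointer kept in C internals, opaque to the collector */
   int               locked;  /* 1 = treated like an item from hb_itemNew() */
   int               used;    /* mark bit */
} ToyBlock;

static void toy_ref( ToyBlock * from, ToyBlock * to )
{
   assert( from->nRefs < TOY_MAX_REFS );
   from->refs[ from->nRefs++ ] = to;
}

/* Recursively mark a block and everything visible from it. */
static void toy_mark( ToyBlock * b )
{
   size_t i;

   if( b->used )
      return;
   b->used = 1;
   for( i = 0; i < b->nRefs; ++i )
      toy_mark( b->refs[ i ] );
}

/* One mark pass; returns the number of blocks that would be freed. */
static int toy_collect( ToyBlock ** all, size_t nAll,
                        ToyBlock ** roots, size_t nRoots )
{
   size_t i;
   int nFree = 0;

   for( i = 0; i < nAll; ++i )
      all[ i ]->used = 0;
   for( i = 0; i < nRoots; ++i )
      toy_mark( roots[ i ] );
   for( i = 0; i < nAll; ++i )     /* locked items are always marked */
      if( all[ i ]->locked )
         toy_mark( all[ i ] );
   for( i = 0; i < nAll; ++i )
      if( ! all[ i ]->used )
         ++nFree;
   return nFree;
}

/* item -> array -> owner, and owner hides a pointer back to item.
   ownerReachable tells whether PRG code still holds the owner block. */
static int toy_scenario( int itemLocked, int ownerReachable )
{
   ToyBlock array = { { NULL }, 0, NULL, 0, 0 };
   ToyBlock owner = { { NULL }, 0, NULL, 0, 0 };
   ToyBlock item  = { { NULL }, 0, NULL, 0, 0 };
   ToyBlock * all[ 3 ];
   ToyBlock * roots[ 1 ];

   all[ 0 ] = &array; all[ 1 ] = &owner; all[ 2 ] = &item;
   roots[ 0 ] = &owner;
   item.locked = itemLocked;
   toy_ref( &item, &array );   /* the item holds the array */
   toy_ref( &array, &owner );  /* the array holds the owner pointer item */
   owner.hidden = &item;       /* invisible logical reference */
   return toy_collect( all, 3, roots, ownerReachable ? 1 : 0 );
}
```

In this model toy_scenario( 1, 0 ) frees nothing although no PRG code
references the blocks any more (the leak), while toy_scenario( 0, 1 )
frees the item and the array although the owner is still alive and still
points to the item (the item is lost after unlocking).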
We can resolve this problem in three ways.
1. We can add an additional "mark" function which will be registered
   with the GC block together with the cleanup function by hb_gcAlloc().
   During the "mark" pass the GC executes this function for all accessed
   blocks, and the function calls hb_gcItemRef() for all pointers
   to unlocked items stored in the given block. We can implement it
   without additional memory overhead, and even with a small speed
   improvement in the existing GC core code, but it will be necessary
   to modify the hb_gcAlloc() function from:
      void * hb_gcAlloc( ULONG ulSize, HB_GARBAGE_FUNC_PTR pFunc );
   to:
      typedef struct
      {
         HB_GARBAGE_FUNC_PTR  pClean;
         HB_GARBAGE_FUNC_PTR  pMark;
      }
      HB_GC_FUNCS;
      void * hb_gcAllocExt( ULONG ulSize, const HB_GC_FUNCS * pFuncs );
   and then update all existing code which uses hb_gcAlloc(), hb_parptrGC(),
   hb_itemGetPtrGC() and introduce a static table used as the parameter in
   these functions instead of the pointer to the cleanup function. This will
   also be a job for 3rd party code developers. Such modifications break
   binary compatibility, so maybe we should also change the hb_gcAlloc()
   function name to detect possible problems at link time.
2. We can extend the internal GC structure and introduce a new field with
   a list of bound GC blocks. The programmer can attach a given GC block
   to some other GC block using:
      void hb_gcBind( void * pParentBlock, void * pBlock );
   Then, when the GC finds pParentBlock in the mark step, it also marks all
   blocks attached to it. Such a solution slightly increases the total
   memory used by Harbour, because we will need additional space for a
   pointer (on some platforms rounded up to the memory alignment factor -
   usually 8 bytes) in each GC block (each codeblock, array, hash array,
   GC collectible pointer).
3. We can add new function which will be used instead of hb_itemNew(), i.e.:
      PHB_ITEM hb_itemRefNew( void * pParentBlock, PHB_ITEM pSource );
   Such items will be stored in a separate list with references to their
   parent blocks. This list is scanned by the GC at the end of the GC mark
   step, and the GC marks as used (and scans recursively) only the items
   whose parent blocks were marked before. This solution needs some extra
   memory only for items allocated by hb_itemRefNew(), so it's statistically
   insignificant. But it may create some performance problems if an
   application uses a lot of items allocated by hb_itemRefNew(). It's
   necessary to make a linear scan to remove the hb_itemRefNew() item from
   the list when pParentBlock is destroyed (now we have a similar situation
   with hb_gcRegisterSweep() blocks).
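
To make the first option more concrete, here is a minimal standalone
sketch of how a per-block "mark" callback could keep an unlocked item
alive. It is a toy model and not the Harbour implementation: ToyItem,
ToyGcBlock, TOY_GC_FUNCS, MyData and all toy_*/my_* names are
hypothetical. Only the pClean/pMark table shape and the role of
hb_gcItemRef() are taken from the proposal above.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for an unlocked item; only the mark bit is modelled. */
typedef struct { int used; } ToyItem;

typedef void ( * TOY_GC_FUNC )( void * pData );

/* Same shape as the proposed table: cleanup plus mark function. */
typedef struct
{
   TOY_GC_FUNC pClean;
   TOY_GC_FUNC pMark;
} TOY_GC_FUNCS;

typedef struct
{
   const TOY_GC_FUNCS * pFuncs;
   void *               pData;  /* user payload of the GC block */
   int                  used;   /* mark bit */
} ToyGcBlock;

/* Toy counterpart of hb_gcItemRef(): mark the item as used. */
static void toy_gcItemRef( ToyItem * pItem )
{
   pItem->used = 1;
}

/* Mark step for one reachable block: mark it, then let the user
   "mark" function report the items hidden inside the block. */
static void toy_gcMarkBlock( ToyGcBlock * pBlock )
{
   if( ! pBlock->used )
   {
      pBlock->used = 1;
      if( pBlock->pFuncs->pMark )
         pBlock->pFuncs->pMark( pBlock->pData );
   }
}

/* Sweep step for one block: returns 1 if the block was released. */
static int toy_gcSweepBlock( ToyGcBlock * pBlock )
{
   if( pBlock->used )
   {
      pBlock->used = 0;
      return 0;
   }
   if( pBlock->pFuncs->pClean )
      pBlock->pFuncs->pClean( pBlock->pData );
   return 1;
}

/* User side: a block which keeps a pointer to an unlocked item. */
typedef struct
{
   ToyItem * pItem;
   int       cleaned;
} MyData;

static void my_clean( void * p )
{
   MyData * d = ( MyData * ) p;
   d->pItem = NULL;      /* real code would release the item here */
   d->cleaned = 1;
}

static void my_mark( void * p )
{
   MyData * d = ( MyData * ) p;
   if( d->pItem )
      toy_gcItemRef( d->pItem );
}

static const TOY_GC_FUNCS s_myFuncs = { my_clean, my_mark };

/* Runs one mark/sweep cycle; returns 1 if the item was marked as used. */
static int toy_cycle( int ownerReachable )
{
   ToyItem item = { 0 };
   MyData data;
   ToyGcBlock block;

   data.pItem = &item;
   data.cleaned = 0;
   block.pFuncs = &s_myFuncs;
   block.pData = &data;
   block.used = 0;

   if( ownerReachable )
      toy_gcMarkBlock( &block );
   if( toy_gcSweepBlock( &block ) )
      assert( data.cleaned );
   return item.used;
}
```

In this model the item is marked only when its owner block is reached
during the mark step, so toy_cycle( 1 ) keeps the item and
toy_cycle( 0 ) lets the owner's cleanup function run instead.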

Now we have to choose one of the above methods.
I would like to hear your opinions about preferred method.

best regards,
Przemek
