On Wed, 21 Oct 2009, Pritpal Bedi wrote:

Hi,

> > Looks that it's a time to extended our GC and add support for references
> > between GC blocks. I'll try to make it soon before we release 2.0.
> Thank you that you thought this way.
> A must needed feature of VM.
> I always wondered why I am unable to free as much
> resources as I have allocated, mainly due to the this fact
> you explained. And it is really hard to detect and clear such
> type of occurances.

It can happen only when hb_itemNew() is used, so the dangerous places in the C code are easy to find. The hard part is checking whether .prg code exploits them or not. A few months ago I discussed this problem and possible solutions with Mindaugas on this forum.

In general, we need to give the GC a way to detect logical references between different blocks allocated by hb_gcAlloc(). Our GC knows nothing about the internal structure of user GC blocks, so it cannot do this without some help from the programmer. The GC does know the internal format of Harbour GC items such as arrays, hashes and codeblocks, because we added code which scans all their internal references in the GC mark phase.

The problem appears when a programmer wants to store a Harbour item in a custom GC block which is later returned to PRG code as a collectible pointer item. Such a complex item has to be copied into an item allocated by hb_itemNew(). This protects the item against direct release by the HVM and also against release by the GC, because the GC keeps a list of all items allocated by hb_itemNew() and marks all complex items inside them as used. With hb_itemNew() the programmer can be sure that the item is always valid and does not have to worry that it will be freed suddenly, leaving the pointer stored in C internals pointing to an unused memory block.

So far all is good, but in some cases it may disable automatic destructors of GC blocks. It happens when the user creates a cyclic reference and then attaches such a "ring" of items to an item allocated by hb_itemNew().
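
To make the leak concrete, here is a minimal toy model in plain C. It is not Harbour code and does not use the Harbour API: the Block structure and the functions toy_mark(), toy_collect() and toy_demo() are hypothetical names invented for this illustration, and the "locked" flag only imitates the way items allocated by hb_itemNew() are always treated as used.

```c
/* Toy model only, not Harbour code. All names here are hypothetical. */
#include <assert.h>
#include <stddef.h>

#define MAX_REFS 2

typedef struct Block
{
   struct Block * refs[ MAX_REFS ];  /* references the collector can follow */
   int locked;                       /* 1 = always a root, imitating an hb_itemNew() item */
   int marked;                       /* set during the mark pass */
   int alive;                        /* 0 once the sweep pass has released the block */
} Block;

/* Mark a block and everything reachable from it. */
static void toy_mark( Block * b )
{
   int i;

   if( b == NULL || b->marked )
      return;
   b->marked = 1;
   for( i = 0; i < MAX_REFS; i++ )
      toy_mark( b->refs[ i ] );
}

/* One full mark and sweep cycle; returns the number of released blocks. */
static int toy_collect( Block * heap, int count, Block * root )
{
   int i, released = 0;

   assert( heap != NULL && count >= 0 );

   for( i = 0; i < count; i++ )
      heap[ i ].marked = 0;

   toy_mark( root );                 /* ordinary roots, e.g. PRG variables */
   for( i = 0; i < count; i++ )      /* locked blocks are unconditional roots */
      if( heap[ i ].alive && heap[ i ].locked )
         toy_mark( &heap[ i ] );

   for( i = 0; i < count; i++ )      /* sweep everything left unmarked */
      if( heap[ i ].alive && ! heap[ i ].marked )
      {
         heap[ i ].alive = 0;
         released++;
      }
   return released;
}

/* heap[ 0 ]: user GC block; it holds a C pointer to heap[ 1 ] which the
 *            collector cannot see, so its refs[] stay empty
 * heap[ 1 ]: locked item (the hb_itemNew() copy) referencing heap[ 2 ]
 * heap[ 2 ]: array; with_cycle makes it reference the user block again
 * No PRG variable refers to any of them, so the root is NULL.
 */
static int toy_demo( int with_cycle )
{
   Block heap[ 3 ] = { { { NULL, NULL }, 0, 0, 1 },
                       { { NULL, NULL }, 1, 0, 1 },
                       { { NULL, NULL }, 0, 0, 1 } };

   heap[ 1 ].refs[ 0 ] = &heap[ 2 ];
   if( with_cycle )
      heap[ 2 ].refs[ 0 ] = &heap[ 0 ];

   return toy_collect( heap, 3, NULL );
}
```

In this model toy_demo( 0 ) returns 1: the user block is released, and in real code its destructor could then release the locked item. toy_demo( 1 ) returns 0: the locked item keeps the array alive, the array keeps the user block alive, so the destructor of the user block never runs and nothing in the ring is ever freed.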

Because the GC knows all items allocated by hb_itemNew(), it always marks them as used, together with all other items which can be reached from them. There is still nothing wrong if the programmer later releases the item allocated by hb_itemNew() explicitly with hb_itemRelease(). But he has a problem if he does not release it explicitly and instead relies on the destructor of another GC block in which he stored the pointer to the item, and this external GC block can be reached from one of the items in the cyclic reference path. The GC always marks such blocks as used, so they are never released.

To resolve the problem it's necessary to unlock the item returned by hb_itemNew() and inform the GC how it should reach it in the GC mark pass. The first part is trivial: it's enough to call hb_gcUnlock( pItem ), where pItem is the pointer to the item returned by hb_itemNew(). But we do not have anything that resolves the second part well. Using hb_gcRegisterSweep() is not a reasonable solution. If we unlock the item without anything to mark it as used during the GC mark pass, we lose it as soon as the GC completes all its steps. So we have to modify the existing GC code. We can resolve this problem in three ways.

1. We can add an additional "mark" function which will be registered with the GC block, together with the cleanup function, by hb_gcAlloc(). During the "mark" pass the GC executes this function for all reached blocks, and the function calls hb_gcItemRef() for every pointer to an unlocked item stored in the given block.
   We can implement it without additional memory overhead, and even with a small speed gain in the existing GC core code, but it will be necessary to change the hb_gcAlloc() function from:

      void * hb_gcAlloc( ULONG ulSize, HB_GARBAGE_FUNC_PTR pFunc );

   to:

      typedef struct
      {
         HB_GARBAGE_FUNC_PTR pClean;
         HB_GARBAGE_FUNC_PTR pMark;
      } HB_GC_FUNCS;

      void * hb_gcAllocExt( ULONG ulSize, const HB_GC_FUNCS * pFuncs );

   and then to update all existing code which uses hb_gcAlloc(), hb_parptrGC() and hb_itemGetPtrGC(), introducing a static table passed to these functions instead of the pointer to the cleanup function. This will also be a job for 3-rd party code developers. Such modifications break binary compatibility, so maybe we should also change the hb_gcAlloc() function name to detect possible problems at link time.

2. We can extend the internal GC structure and introduce a new field with a list of bound GC blocks. The programmer can attach a given GC block to some other GC block using:

      void hb_gcBind( void * pParentBlock, void * pBlock );

   Then, when the GC finds pParentBlock in the mark step, it also marks all blocks attached to it. This solution slightly increases the total memory used by Harbour, because we will need additional space for a pointer (on some platforms rounded up to the memory alignment factor, usually 8 bytes) in each GC block (each codeblock, array, hash array, GC collectible pointer).

3. We can add a new function which will be used instead of hb_itemNew(), e.g.:

      PHB_ITEM hb_itemRefNew( void * pParentBlock, PHB_ITEM pSource );

   Such items will be stored in a separate list with references to their parent blocks. This list is scanned by the GC at the end of the GC mark step, and the GC marks as used (and scans recursively) only items whose parent blocks were marked before. This solution needs some extra memory only for items allocated by hb_itemRefNew(), so it's statistically unimportant. But it may create some performance problems if the application uses a lot of items allocated by hb_itemRefNew().
   A linear scan is necessary to remove an hb_itemRefNew() item from the list when pParentBlock is destroyed (we have a similar situation now with hb_gcRegisterSweep() blocks).

Now we have to choose one of the above methods. I would like to hear your opinions about the preferred method.

best regards,
Przemek
_______________________________________________
Harbour mailing list
Harbour@harbour-project.org
http://lists.harbour-project.org/mailman/listinfo/harbour