Author: paultcochrane
Date: Thu Oct 18 11:58:53 2007
New Revision: 22234

Modified:
   trunk/docs/pdds/draft/pdd09_gc.pod
   trunk/docs/pdds/draft/pdd16_native_call.pod

Log:
[docs/pdds] Basic reformat of some paragraphs. This has the positive side
effect of lines not exceeding the 100 character limit.

Modified: trunk/docs/pdds/draft/pdd09_gc.pod
==============================================================================
--- trunk/docs/pdds/draft/pdd09_gc.pod (original)
+++ trunk/docs/pdds/draft/pdd09_gc.pod Thu Oct 18 11:58:53 2007
@@ -11,9 +11,9 @@

 =head1 DESCRIPTION

-Doing DOD takes a bit of work--we need to make sure that everything is findable
-from the root set, and that we don't go messing up data shared between
-interpreters.
+Doing DOD takes a bit of work--we need to make sure that everything is
+findable from the root set, and that we don't go messing up data shared
+between interpreters.

 =head1 GC TERMS

@@ -26,28 +26,31 @@
 =item Reference counting

 All objects have a count, how often they are refered to by other objects. If
-that count reaches zero, the object's space can be reclaimed. This scheme can't
-cope with reference loops, i.e, a loop of dead objects, all referencing one
-another but not reachable from elsewhere, never gets collected.
+that count reaches zero, the object's space can be reclaimed. This scheme
+can't cope with reference loops, i.e, a loop of dead objects, all
+referencing one another but not reachable from elsewhere, never gets
+collected.

 =item Mark and Sweep

-Starting from the root set (Parrot registers, stacks, internal structures) all
-reachable objects (and objects reachable from these) are marked being alive.
+Starting from the root set (Parrot registers, stacks, internal structures)
+all reachable objects (and objects reachable from these) are marked being
+alive.

-Objects not reached are considered dead and get collected by a sweep
-through the objects arenas.
+Objects not reached are considered dead and get collected by a sweep through
+the objects arenas.

 =item Copying collection

-Live objects are copied into a new memory region. The old memory space can then
-be reclaimed.
+Live objects are copied into a new memory region. The old memory space can
+then be reclaimed.

 =back

 =head2 GC Variants

-There are several variants possible with the preceding schemes. These variants achieve different goals:
+There are several variants possible with the preceding schemes. These
+variants achieve different goals:

 =over 4

@@ -88,9 +91,9 @@

 All managed objects (PMCs, Strings, Buffers) inside Parrot are subject to
 garbage collection. As these objects aren't allowed to move after creation,
-garbage collection is done by a non-copying scheme. Further: as we have to cope
-with pointers to objects on the C stack and in CPU registers, the garbage
-collection scheme is a conservative one.
+garbage collection is done by a non-copying scheme. Further: as we have to
+cope with pointers to objects on the C stack and in CPU registers, the
+garbage collection scheme is a conservative one.

 DOD/GC is normally triggered by allocation of new objects, which happens
 usually from some stack nesting below the run-loop. There is a small chance
@@ -104,9 +107,10 @@

 Variable-sized memory like string memory gets collected, when the associated
 header isn't found to be alive during DOD. While a copying collection could
-basically[1] be done at any time, it's inefficient to copy buffers of objects
-that are non yet detected being dead. This implies that before a collection in
-the memory pools is run, a DOD run for fixed-sized headers is triggered.
+basically[1] be done at any time, it's inefficient to copy buffers of
+objects that are non yet detected being dead. This implies that before a
+collection in the memory pools is run, a DOD run for fixed-sized headers is
+triggered.

 [1] Dead objects stay dead, there is no possibility of a reusal of dead
 objects.
@@ -119,15 +123,16 @@
 new object headers in the fastest possible way. How that is achieved can be
 considered as an implementation detail.

-While GC subsystems are independent they may share some code to reduce Parrot
-memory footprint. E.g. stop-the-world mark and sweep and incremental mark and
-sweep use the same arena structures and share arena creation and DOD routines.
+While GC subsystems are independent they may share some code to reduce
+Parrot memory footprint. E.g. stop-the-world mark and sweep and incremental
+mark and sweep use the same arena structures and share arena creation and
+DOD routines.

 =head2 Initialization

-Currently only one GC system is active (selected at configure or compile time).
-But future versions might support switching GC systems during runtime to
-accommodate for different work loads.
+Currently only one GC system is active (selected at configure or compile
+time). But future versions might support switching GC systems during
+runtime to accommodate for different work loads.

 =over 4

@@ -135,9 +140,10 @@

 Initialize GC system named C<XXX>.

-Called from F<src/memory.c:mem_setup_allocator()> after creating C<arena_base>.
-The initialization code is responsible for the creation of the header pools and
-has to fill the following function pointer slots in C<arena_base>:
+Called from F<src/memory.c:mem_setup_allocator()> after creating
+C<arena_base>. The initialization code is responsible for the creation of
+the header pools and has to fill the following function pointer slots in
+C<arena_base>:

 =back

@@ -157,18 +163,18 @@

 Run a normal GC cycle. This is normally called by resource shortage in the
 buffer memory pools before a collection is run. The bit named
-C<DOD_trace_stack_FLAG> indicates that the C-stack (and other system areas like
-the processor registers) have to be traced too.
+C<DOD_trace_stack_FLAG> indicates that the C-stack (and other system areas
+like the processor registers) have to be traced too.

-The implementation might or might not actually run a full GC cycle. If e.g an
-incremental GC system just has finished the mark phase, it would do nothing.
-OTOH if no objects are currently marked live, the implementation should run the
-mark phase, so that copying of dead objects is avoided.
+The implementation might or might not actually run a full GC cycle. If e.g
+an incremental GC system just has finished the mark phase, it would do
+nothing. OTOH if no objects are currently marked live, the implementation
+should run the mark phase, so that copying of dead objects is avoided.

 =item DOD_lazy_FLAG

-Do a timely destruction run. The goal is to either detect all objects that need
-timely destruction or to do a full collection. In the former case the
+Do a timely destruction run. The goal is to either detect all objects that
+need timely destruction or to do a full collection. In the former case the
 collection can be interrupted or postponed. This is called from the Parrot
 run-loop. No system areas have to be traced.

@@ -198,8 +204,8 @@

 =item C<void (*init_pool) (Interp *, struct Small_Object_Pool *)>

-A function to initialize the given pool. This function should set the following
-object allocation functions for the given pool.
+A function to initialize the given pool. This function should set the
+following object allocation functions for the given pool.

 =back

@@ -212,9 +218,9 @@

 =item C<PObj * (*get_free_object) (Interp *, struct Small_Object_Pool*)>

-It should return one free object from the given pool. Object flags are returned
-clear, except flags that are used by the garbage collector itself. If the pool
-is a buffer header pool, all other object memory is zeroed.
+It should return one free object from the given pool. Object flags are
+returned clear, except flags that are used by the garbage collector itself.
+If the pool is a buffer header pool, all other object memory is zeroed.

 =back

@@ -239,18 +245,18 @@

 =head2 The Arena_base structure

-The C<arena_base> holds the mentioned function pointers, pointers to the header
-pools, some statistic counters, and a pointer C<void *gc_private> reserved for
-the GC subsystem.
+The C<arena_base> holds the mentioned function pointers, pointers to the
+header pools, some statistic counters, and a pointer C<void *gc_private>
+reserved for the GC subsystem.

-The GC subsystem is responsible for updating the appropriate statistic fields
-of the structure.
+The GC subsystem is responsible for updating the appropriate statistic
+fields of the structure.

 =head2 Blocking GC

 Being able to block GC and DOD is important--you'd hate to have the newly
-allocated Buffers or PMCs you've got yanked out from underneath you. That'd be
-no fun. Use the following routines to control GC:
+allocated Buffers or PMCs you've got yanked out from underneath you. That'd
+be no fun. Use the following routines to control GC:

 =over 4

@@ -272,17 +278,17 @@

 =back

-Note that the blocking is recursive--if you call Parrot_block_DOD() three times
-in succession, you need to call Parrot_unblock_DOD() three times to re-enable
-DOD.
+Note that the blocking is recursive--if you call Parrot_block_DOD() three
+times in succession, you need to call Parrot_unblock_DOD() three times to
+re-enable DOD.

 =head2 Important flags

-For PMCs and Buffers to be collected properly, you B<must> get the flags set on
-them properly. Otherwise Bad Things Will Happen.
+For PMCs and Buffers to be collected properly, you B<must> get the flags set
+on them properly. Otherwise Bad Things Will Happen.

-Note: don't manipulate these flags directly. Always use the macros defined in
-F<include/parrot/pobj.h>.
+Note: don't manipulate these flags directly. Always use the macros defined
+in F<include/parrot/pobj.h>.

 =over 4

@@ -293,15 +299,15 @@

 =item PObj_custom_mark_FLAG

-The C<mark> vtable slot will be called during DOD. The mark function must call
-C<pobject_lives> for all non-NULL objects that PMC refers to.
+The C<mark> vtable slot will be called during DOD. The mark function must
+call C<pobject_lives> for all non-NULL objects that PMC refers to.

 Please note that C<pobject_lives> may be a macro.

 =item PObj_data_is_PMC_array_FLAG

-Set if the data pointer points to an array of objects. The length of the array
-is C<PMC_int_val(SELF)>.
+Set if the data pointer points to an array of objects. The length of the
+array is C<PMC_int_val(SELF)>.

 =item PObj_external_FLAG

@@ -315,8 +321,8 @@

 =item PObj_COW_FLAG

-The buffer's memory is copy on write. Any changes to the buffer must first have
-the buffer's memory copied. The COW flag should then be removed.
+The buffer's memory is copy on write. Any changes to the buffer must first
+have the buffer's memory copied. The COW flag should then be removed.

 =back


Modified: trunk/docs/pdds/draft/pdd16_native_call.pod
==============================================================================
--- trunk/docs/pdds/draft/pdd16_native_call.pod (original)
+++ trunk/docs/pdds/draft/pdd16_native_call.pod Thu Oct 18 11:58:53 2007
@@ -12,26 +12,27 @@

 =head1 DESCRIPTION

-The NCI is designed to allow Parrot to interface to I<most> of the functions in
-a C library without having to write any C code for that interface. It isn't
-designed to be a universal C-less interface--there will always be libraries
-that have some bizarre parameter list that requires that some C be written. It
-should, however, handle all the simple cases.
-
-Using the NCI, parrot automatically wraps the C functions and presents them as
-prototyped subroutines that follow normal parrot calling conventions, and can
-be called like any other parrot subroutine.
+The NCI is designed to allow Parrot to interface to I<most> of the functions
+in a C library without having to write any C code for that interface. It
+isn't designed to be a universal C-less interface--there will always be
+libraries that have some bizarre parameter list that requires that some C be
+written. It should, however, handle all the simple cases.
+
+Using the NCI, parrot automatically wraps the C functions and presents them
+as prototyped subroutines that follow normal parrot calling conventions, and
+can be called like any other parrot subroutine.

 The NCI uses the platform native dynamic by-name function loading mechanism
-(dlopen/dlsym on unix and LoadLibrary/GetProcAddress on Win32, for example) to
-get the function pointer, then dynamically generates the wrapper based on the
-signature of that function.
-
-As there is no good platform-independent way to determine function signatures
-(C header files are not always available (certainly not for libraries not
-designed for access from C) and not always reasonably parseable anyway, and
-there is no generic way to query a function for its signature) the signature
-must be passed in when the linkage between the C function and parrot is made.
+(dlopen/dlsym on unix and LoadLibrary/GetProcAddress on Win32, for example)
+to get the function pointer, then dynamically generates the wrapper based on
+the signature of that function.
+
+As there is no good platform-independent way to determine function
+signatures (C header files are not always available (certainly not for
+libraries not designed for access from C) and not always reasonably
+parseable anyway, and there is no generic way to query a function for its
+signature) the signature must be passed in when the linkage between the C
+function and parrot is made.

 =head2 Function signatures

@@ -40,12 +41,13 @@
 represents a single parameter passed into the NCI. Note that the letters are
 case-sensitive, and must be within the base 7-bit ASCII character set.

-At some point punctuation may be used as modifiers on the function parameters,
-in which case each parameter may be represented by multiple letters.
+At some point punctuation may be used as modifiers on the function
+parameters, in which case each parameter may be represented by multiple
+letters.

 In I<no> case should the signature letters be separated by whitespace. This
-restriction may be lifted in the future, but for now remains as an avenue for
-adding additional functionality.
+restriction may be lifted in the future, but for now remains as an avenue
+for adding additional functionality.

 =over 4

@@ -89,10 +91,10 @@

 =item p

-PMC thingie. A generic pointer, taken from or stuck into a PMC's data pointer.
-If this is a return type, parrot will create a new UnManagedStruct PMC type,
-which is just a generic "pointer to some damn thing or other" PMC type which
-Parrot does I<no> management of.
+PMC thingie. A generic pointer, taken from or stuck into a PMC's data
+pointer. If this is a return type, parrot will create a new UnManagedStruct
+PMC type, which is just a generic "pointer to some damn thing or other" PMC
+type which Parrot does I<no> management of.

 =item 2

@@ -129,8 +131,8 @@
 invocation, there are two parts: the native function to be invoked, and the
 PIR code to do the invocation.

-First the native function, to be written in C.
-On Windows, it is necessary to do a DLL export specification of the NCI function:
+First the native function, to be written in C. On Windows, it is necessary
+to do a DLL export specification of the NCI function:

   /* foo.c */

@@ -185,28 +187,28 @@

   void (function *)(void *user_data, void *library_data);

-The information C<library_data> is normally coming from C code and can be any C
-type that Parrot supports as NCI value.
+The information C<library_data> is normally coming from C code and can be
+any C type that Parrot supports as NCI value.

-The position of the C<user_data> is specified with the C<U> function signature,
-when creating the callback PMC:
+The position of the C<user_data> is specified with the C<U> function
+signature, when creating the callback PMC:

   cb_PMC = new_callback cb_Sub, user_data, "tU"

 Given a Parrot function C<cb_Sub>, and a C<user_data> PMC, this creates a
-callback PMC C<cb_PMC>, which expects the user data as the second argument. The
-information returned by the callback (C<library_data>) is a C string.
+callback PMC C<cb_PMC>, which expects the user data as the second argument.
+The information returned by the callback (C<library_data>) is a C string.

-Since parrot needs more than just a pointer to a generic function to figure out
-what to do, it stuffs all the extra information into the C<user_data> pointer,
-which contains a custom PMC holding all the information that Parrot needs. This
-also implies that the C function that installs the callback, must not make any
-assumptions on the C<user_data> argument. This argument must be handled
-transparently by the C code.
-
-The callback function takes care of wrapping the external data pointer into an
-UnManagedStruct PMC, the same as if it were a p return type of a normal NCI
-function.
+Since parrot needs more than just a pointer to a generic function to figure
+out what to do, it stuffs all the extra information into the C<user_data>
+pointer, which contains a custom PMC holding all the information that Parrot
+needs. This also implies that the C function that installs the callback,
+must not make any assumptions on the C<user_data> argument. This argument
+must be handled transparently by the C code.
+
+The callback function takes care of wrapping the external data pointer into
+an UnManagedStruct PMC, the same as if it were a p return type of a normal
+NCI function.

 The signature of the I<parrot> subroutine which is called by the callback
 should be:
@@ -280,9 +282,10 @@
     print "Foo callback\n"
   .end

-The C code contains the function to be invoked through NCI. In the function C<sayhello>
-a function call is done to a Parrot subroutine. The C<sayhello> function gets a reference
-to this callback function, so its signature needs to be known.
+The C code contains the function to be invoked through NCI. In the function
+C<sayhello> a function call is done to a Parrot subroutine. The C<sayhello>
+function gets a reference to this callback function, so its signature needs
+to be known.

   #include <stdio.h>
   #include <parrot/parrot.h>
@@ -303,9 +306,8 @@
     cb(result, userdata);
   }

-The file containing this C code should be compiled as a shared library (specifying the C<include> directory so
-C<<parrot/parrot.h>> can be found.)
-
+The file containing this C code should be compiled as a shared library
+(specifying the C<include> directory so C<<parrot/parrot.h>> can be found.)

 =head1 REFERENCES