I'm happy to see new documentation, including the .dev files, appearing in parrot. However, I do have a small concern: we should avoid putting ourselves in the position of maintaining multiple copies of the same information.
To be specific, I looked at byteorder.dev and noted a listing of all the functions. That's fine, but if the list of functions changes in the .c file, someone has to remember to go back and update the list in byteorder.dev as well. (I don't mean to pick on byteorder at all -- quite the contrary. Because it's small, well-commented, and easy to follow, it's an easy practice ground for documentation!)

I tend to think that a better plan is to not bother listing the functions (unless it makes the implementation discussion easier to understand) and to rely on being able to pull them out of the relevant source file with some sort of tool. Further, keeping the function descriptions right next to the appropriate source in the .c file makes it more likely that they will be maintained in sync.

Concretely, I propose three patches to illustrate how I think these different files ought to work together.

1. docs/pdds/pdd07_codingstd.pod: I clarify where the list of functions goes.

2. byteorder.dev: I took the existing one, put it in POD format, added a few more implementation notes, and removed all the specific functions.

3. byteorder.c: I put in POD documentation for all the functions. The exact format is still to be determined, but I used perl5's utf8.c as an example/model. I chose the arbitrary apiname of 'byteorder'. That's almost certainly wrong but is good enough to get started.

I'd welcome discussion on whether this looks like a reasonable way to think about these .dev files, before we get too far along.

    Andy Dougherty [EMAIL PROTECTED]

--- parrot-cvs/docs/pdds/pdd07_codingstd.pod Wed Jul 17 11:35:19 2002
+++ parrot-andy/docs/pdds/pdd07_codingstd.pod Wed Jul 17 12:14:55 2002
@@ -538,12 +538,17 @@
 this is in contrast to PDDs, which describe design decisions). This is
 the place for mini-essays on how to avoid overflows in unsigned
 arithmetic, or on the pros and cons of differing hash algorithms, and
-why the current one was chosen, and how it works.
+why the current one was chosen, and how it works.

 In principle, someone coming to a particular source file for the first
 time should be able to read the F<.dev> file and gain an immediate
 overview of what the source file is for, the algorithms it implements,
 etc.
+
+The F<.dev> file is not usually the place for a complete listing of all
+functions in the source file. That information is (presumably) already
+in the file itself, and duplicating it would only lead to more
+maintenance work.

 Currently no particular format or structure is imposed on the developer
 file, but it should have as a minimum the following sections:
--- parrot-cvs/byteorder.dev Tue Jul 16 23:31:03 2002
+++ parrot-andy/byteorder.dev Wed Jul 17 13:18:16 2002
@@ -1,57 +1,47 @@
-Overview
-The byteorder code will check the endianness of an INTVAL or
-an opcode_t value and swap from little to big, or big to little
-when appropriate. Functions also exist to convert a 4, 8, 12,
-or 16 byte character buffer to big or little endian.
-The functions will be placed in the PackFile
-vtable and will be called when necessary. It is hoped that
-the Parrot interpreter will not call these functions when
+=head1 Name
+
+ byteorder.c
+
+=head1 Overview
+
+The byteorder code will check the endianness of an INTVAL or an
+opcode_t value and swap from little to big, or big to little when
+appropriate. Functions also exist to convert a 4, 8, 12, or 16 byte
+character buffer to big or little endian. The functions will be placed
+in the PackFile vtable and will be called when necessary. It is hoped
+that the Parrot interpreter will not call these functions when
 converting from and to the same byteorder.

-Data Structures and Algorithms
-The algorithm to change from one endian to another is
-identical and simplistic to understand. Basically,
-the size of an INTVAL or opcode_t is used to
-determine at compile time how many bits should
-be shifted around. Then, the correct bits are shifted
-the correct amounts (please look at source code for
-exact amounts). The buffer change functions are implemented
-by a straight forward algorithm that assigns swaps all
-of the bytes.
-
-Important Functions
-fetch_iv_le - This function will convert an INTVAL into
-little endian format. It is a noop if the native
-format is already little endian.
-fetch_iv_be - This function will convert an INTVAL into
-big endian format. It is a noop if the native
-format is already big endian.
-fetch_op_be - This function will convert an opcode_t into
-big endian format. It is a noop if the native
-format is already big endian.
-fetch_op_le - This function will convert an opcode_t into
-little endian fommat. It is a noop if the native
-format is already little endian.
-fetch_buf_le_(4,8,12,16) - This set of functions
-will convert an unsigned character buffer into
-little endian format. Only a memcpy is performed
-if the native format is already little endian.
-fetch_buf_be_(4,8,12,16) - This set of functions
-will convert an unsigned character buffer into
-big endian format. Only a memcpy is performed
-if the native format is already big endian.
+=head1 Data Structures and Algorithms
+
+The algorithms to change from one endian to another are identical and
+easy to understand. Basically, the size of an INTVAL or opcode_t
+is used to determine at compile time how many bits should be shifted
+around. Then, the correct bits are shifted the correct amounts (please
+look at source code for exact amounts). The buffer change functions
+are implemented by a straightforward algorithm that assigns swaps all
+of the bytes. All loops are unrolled for simplicity.
+
+On some systems, the htonl() family of functions may exist and be
+implemented in assembly. (This is the case with glibc2-based Linux
+systems, if Parrot is compiled with -O or better). It may be
+appropriate, at some point, to have Configure probe for such functions
+and use them where appropriate.
+
+=head1 Unimplemented Functions

-Unimplemented Functions
 endianize_fetch_int - fetch an INTVAL directly from a bytestream
 endianize_put_int - put an INTVAL directly on a bytestream

-History
+=head1 History
+
 Initial version by Melvin on 2002/05/01

-Notes
+=head1 Notes
+
 This assumes big or little endianness...other, more
 esoteric forms (such as middle endian) are not supported.
 Also, an assumption of 4 or 8 byte INTVAL's and opcode_t's
 is made.

-References
+=head1 References
--- parrot-cvs/byteorder.c Wed Jul 17 11:35:16 2002
+++ parrot-andy/byteorder.c Wed Jul 17 13:12:53 2002
@@ -14,6 +14,7 @@
  * Initial version by Melvin on 2002/05/1
  * Notes:
  * References:
+ * See byteorder.dev.
  */

 #include "parrot/parrot.h"
@@ -27,9 +28,14 @@
  */

 /* fetch_iv_le
- * This function converts a 4 or 8 byte INTVAL into little
- * endian format. If the native format is already little
- * endian, then no conversion is done.
+
+=for api byteorder INTVAL|fetch_iv_le|INTVAL w
+
+ This function converts a 4 or 8 byte INTVAL into little endian
+ format. If the native format is already little endian, then no
+ conversion is done.
+
+=cut
  */
 INTVAL
 fetch_iv_le(INTVAL w)
@@ -54,9 +60,14 @@
 }

 /* fetch_iv_be
- * This function converts a 4 or 8 byte INTVAL into big
- * endian format. If the native format is already big
- * endian, then no conversion is done.
+
+=for api byteorder INTVAL|fetch_iv_be|INTVAL w
+
+ This function converts a 4 or 8 byte INTVAL into big endian
+ format. If the native format is already big endian, then no
+ conversion is done.
+
+=cut
  */
 INTVAL
 fetch_iv_be(INTVAL w)
@@ -82,7 +93,13 @@


 /*
- * Same as above for opcode_t
+=for api byteorder opcode_t|fetch_op_be|opcode_t w
+
+ This function converts a 4 or 8 byte opcode_t into big endian
+ format. If the native format is already big endian, then no
+ conversion is done.
+
+=cut
  */
 opcode_t
 fetch_op_be(opcode_t w)
@@ -107,6 +124,15 @@
 #endif
 }

+/*
+=for api byteorder opcode_t|fetch_op_le|opcode_t w
+
+ This function converts a 4 or 8 byte opcode_t into little endian
+ format. If the native format is already little endian, then no
+ conversion is done.
+
+=cut
+*/
 opcode_t
 fetch_op_le(opcode_t w)
 {
@@ -131,9 +157,57 @@
 }

 /*
- * Unrolled routines for swapping various sizes from 32-128 bits
- * These should only be used if alignment is unknown or we are
- * pulling something out of a padded buffer.
+=for api byteorder void|fetch_buf_be_4|unsigned char *rb|unsigned char *b
+
+ Fetches 4 bytes from b and copies them to rb in big-endian order.
+
+ These functions provide a set of routines for copying (and swapping
+ bytes, if appropriate) objects 4, 8, 12, or 16 bytes long. Source
+ bytes are in b; the destination is rb. These functions do not do
+ in-place swaps. The source and destination must be different.
+
+ If no byte-swapping needs to be done, a simple memcpy() is
+ performed.
+
+ These functions should only be used if alignment is unknown or we
+ are pulling something out of a padded buffer.
+
+=for api byteorder void|fetch_buf_le_4|unsigned char *rb|unsigned char *b
+
+ Fetches 4 bytes from b and copies them to rb in little-endian order.
+ See fetch_buf_be_4.
+
+=for api byteorder void|fetch_buf_be_8|unsigned char *rb|unsigned char *b
+
+ Fetches 8 bytes from b and copies them to rb in big-endian order.
+ See fetch_buf_be_4.
+
+=for api byteorder void|fetch_buf_le_8|unsigned char *rb|unsigned char *b
+
+ Fetches 8 bytes from b and copies them to rb in little-endian order.
+ See fetch_buf_be_4.
+
+=for api byteorder void|fetch_buf_be_12|unsigned char *rb|unsigned char *b
+
+ Fetches 12 bytes from b and copies them to rb in big-endian order.
+ See fetch_buf_be_4.
+
+=for api byteorder void|fetch_buf_le_12|unsigned char *rb|unsigned char *b
+
+ Fetches 12 bytes from b and copies them to rb in little-endian order.
+ See fetch_buf_be_4.
+
+=for api byteorder void|fetch_buf_be_16|unsigned char *rb|unsigned char *b
+
+ Fetches 16 bytes from b and copies them to rb in big-endian order.
+ See fetch_buf_be_4.
+
+=for api byteorder void|fetch_buf_le_16|unsigned char *rb|unsigned char *b
+
+ Fetches 16 bytes from b and copies them to rb in little-endian order.
+ See fetch_buf_be_4.
+
+=cut
  */
 void
 fetch_buf_be_4(unsigned char *rb, unsigned char *b)