I found several things that either seemed wrong or confusing here. Forgive me if they are explained elsewhere or I missed something obvious ....
On Tue, 2008-04-22 at 10:56 -0700, [EMAIL PROTECTED] wrote: > -Conversion will be done with a function called C<string_grapheme_copy>: > +Conversion will be done with a function called > C<Parrot_string_grapheme_copy>: > > INTVAL string_grapheme_copy(STRING *src, STRING *dst) The prototype doesn't have the new name. > +=head3 Parrot_string_set (was string_set) > + > +Set one string to a copy of the value of another string. I assume this *doesn't* set the COW flag, but it might be clearer to state so. > +=head3 Parrot_string_new_COW (was Parrot_make_COW_reference) > + > +Create a new copy-on-write string. Creating a new string header, clone the > +struct members of the original string, and point to the same string buffer as > +the original string. > [...] > +=head3 Parrot_string_copy (was string_copy) > + > +Make a COW copy a string (a new string header pointing to the same string > +buffer). How are these two different? > +=head3 Parrot_string_from_cstring (was string_from_cstring) > + > +Create a Parrot string from a C string (a C<char *>). Takes two arguments, a > C > +string, and an integer length of the string (number of characters). If the > +integer length isn't passed, the function will calculate the length. > + > +{{NOTE: the integer length isn't really necessary, and is under consideration > +for deprecation.}} I've seen this in a few APIs, and found it to be a useful optimization for numerous special cases. I would suggest not deprecating it yet. > +=head3 Parrot_constant_string_new (was const_string) > + > +Creates and returns a new Parrot constant string. Takes one C string (a > C<char > +*>) as an argument, the value of the constant string. The length of the C > +string is calculated internally. Why not include the optional integer length here as well? It seems to me even more likely than with Parrot_string_from_cstring that the string's size will already be known before this call. > +=head3 Parrot_string_grapheme_copy (new) > + > +Accepts two string arguments: a destination and a source. Iterates through > the > +source string one grapheme at a time and appends it to the destination > string. The name and description don't match -- perhaps Parrot_string_grapheme_append is a better name, with semantics analagous to Parrot_string_append? > +=head3 Parrot_string_chopn (was string_chopn) > +=head3 Parrot_string_chopn_inplace (was string_chopn_inplace). > +=head3 Parrot_string_grapheme_chopn Why no Parrot_string_grapheme_chopn_inplace? > +=head3 string_ord > + > +Replaced by C<Parrot_string_index>. > + > +=head3 string_chr > + > +This is handled just fine by C<Parrot_string_new>, we don't need a special > +version for a single character. I don't understand either of these, assuming 'ord' and 'chr' are used in the standard way. Can you explain this a bit more fully? > +=item new_from_string > + > +Create a new String PMC from a Parrot string argument. > + > +=item clone > + > +Clone a String PMC. > + > +=item get_string > + > +Return the string value of the String PMC. Is the buffer reused for any of these? > +=item get_integer > +=item get_number > +=item get_bignum What bases and formats are understood by each of these? > +=item get_bool > + > +Return the boolean value of the string. What are the rules for this? Since this is a base Parrot PMC, and not a HLL PMC, I'm assuming Perl's rules are not a priori correct. (This may be explained elsewhere, but it should probably be noted here.) > +=item set_integer_native > + > +Set the string to an integer value, transforming the integer to its string > +equivalent. Using what base? > +=item set_bool > + > +Set the string to a boolean (integer) value, transforming the boolean to its > +string equivalent. What do true and false become as strings? '0' and '1', given the integer reference in the description? > +=item set_number_native > + > +Set the string to a floating-point value, transforming the number to its > string > +equivalent. In what format, and with how many decimal places? > +=item set_string_native > + > +Set the String PMC's stored string value to be the string argument. If the > +passed in string is a constant, store a copy. If it's a constant, why bother to make a copy? Why not just mark the PMC read-only or COW? And if not a constant, does this reuse the string buffer, and if so, does it set the string's COW bit or the PMC or both? > +=item assign_string_native > + > +Set the String PMC's stored string value to a copy of the string argument. > + > +=item set_string_same > + > +Set the String PMC's stored string value to be the same as another String > PMC's > +stored string value. {{NOTE: uses direct access into the storage of the two > +PMCs, very ugly.}} Is set_string_native just a cross between these two? Or equivalent to one of them? > +=item set_pmc > + > +Set the String PMC's stored string value to be the same as another PMC's > string > +value, as returned by that PMC's C<get_string> vtable function. Again, is this a copy or a COW? > +=item cmp > + > +Compares two PMCs and returns 1 if SELF is shorter, 0 if they are equal > length > +strings, and -1 if the passed in string argument is shorter. > + > +=item cmp_string > + > +Compares two PMCs and returns 1 if SELF is shorter, 0 if they are equal > length > +strings, and -1 if the passed in string argument is shorter. The names don't match the descriptions, assuming common usage of 'cmp'. Also, they appear to be identical in function. Is this intended? > +=item substr > +=item substr_str > +=item exists_keyed > +=item get_string_keyed > +=item set_string_keyed These are all grapheme specific. Are there character variants? > +=item get_integer_keyed > + > +Returns the integer value of the Nth C<char> in the string. {{DEPRECATE}} > + > +=item set_integer_keyed > + > +Replace the C<char> at the Nth character position in the string with the > +C<char> that corresponds to the passed integer value key. {{DEPRECATE}} I have no problem with deprecating these, as long as there is some other way to do chr and ord from PIR. > +=item to_int > + > +Return the integer equivalent of a string. Using what base? > +=item lower > + > +Change the string to all lowercase. What about the other casing functions? > +=item is_integer > + > +Checks if the string is just an integer. {{NOTE: Currently only works for > ASCII > +strings, fix or deprecate.}} Assuming which base? And does it expect base notations ('Ox' and friends) or not? -'f