On Mon, Apr 9, 2012 at 5:14 AM, Kaz Kylheku <k...@kylheku.com> wrote:
> Not only can compilers compress storage by recognizing that string literals
> are the suffixes of other string literals, but a lot of string manipulation
> code is simplified, because you can treat a pointer to interior of any
> string as a string.

I'm not sure about the value of tail recursion in C, but this is definitely a
majorly useful feature, as is the related technique of parsing by dropping
null bytes into the string (see for instance the strtok function, which need
not do any memory movement; I wrote a CSV parser that works the same way).
Often I use both techniques simultaneously, for instance in parsing this sort
of string:

"A:100 B:200 C:300"

First, tokenize on the spaces by looking for a space, retaining a pointer,
and putting in a NUL:

    char *next=strchr(str,' ');
    if (!next) break;
    *next++=0;

Then read a character, and increment the pointer through that string as you
parse. Try doing THAT in a high level language without any memory copying.

And "without any memory copying" may not be important with this trivial
example, but suppose you've just read in a huge CSV file to parse - maybe
16MB in the normal case, with no actual limit other than virtual memory. (And
yes, I read the whole thing in at once, because it comes from a Postgres
database and reading it in pieces would put more load on the central database
server.)

Don't get me wrong, I wouldn't want to do _everything_ in C; but I also
wouldn't want to do everything in length-preceded strings. The nearest
equivalent that would be able to use the shared buffer is a length-external
string like BASIC uses (or used, back when I used to write BASIC code and
8086 assembly to interface with it) - a string "object" consists of a length
and a pointer. But that has issues with freeing up memory, if you're using
parts of a string.

ChrisA
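
For anyone who wants to try the tokenizing approach described above, here is a rough sketch in C rather than the actual parser. The names parse_pairs and struct pair are made up for illustration. It assumes every token is one key character, a colon, then a decimal number, and it silently skips anything else. Unlike the three-line fragment, it handles the last token before leaving the loop.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one key character and the number after the ':'. */
struct pair {
    char key;
    long value;
};

/*
 * Parse a string such as "A:100 B:200 C:300" in place.
 * buf must be writable: each space is overwritten with a NUL so that
 * every token becomes a string in its own right, with no copying.
 * Returns the number of pairs stored in out (at most max).
 */
size_t parse_pairs(char *buf, struct pair *out, size_t max)
{
    size_t n = 0;
    char *str = buf;

    assert(buf != NULL && out != NULL);

    while (n < max && *str != '\0') {
        /* Look for the space, retain the pointer, drop in a NUL. */
        char *next = strchr(str, ' ');
        if (next)
            *next++ = '\0';

        /* str is now a NUL-terminated token such as "A:100". */
        if (str[0] != '\0' && str[1] == ':') {
            out[n].key = str[0];
            /* str + 2 points into the same buffer and is itself a string. */
            out[n].value = strtol(str + 2, NULL, 10);
            n++;
        }

        if (!next)
            break;      /* no space found: that was the last token */
        str = next;
    }
    return n;
}
```

Calling it on a writable array initialized with "A:100 B:200 C:300" should fill in three pairs. Afterwards the buffer itself reads as just "A:100", because the first space has become a NUL. Passing a string literal directly would be undefined behaviour, since the function writes into its argument.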