On Apr 18, 2004, at 8:06 AM, Leopold Toetsch wrote:

Leopold Toetsch <[EMAIL PROTECTED]> wrote:

[ initial proposal ]

I've now checked in a working version.
* c2str.pl generates a .str header from a .c file
* c2str.pl --all generates $(INC)/string_private_cstring.h
* this is used in string_init() to finally generate entries
  in the interpreter's constant string table

* to add new files, makefiles/root.in or such has to be edited;
  see objects.str as a template

Using multiple files is only slightly tested though.

To handle multiple files, we'll probably need to generate a .c file to hold the C strings (instead of the .h), with an extern declaration in the .h, since the header will be included in multiple files. That assumes the strings will all be aggregated into a single file, which makes sense.
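As a rough illustration of that layout, here is a minimal sketch of a table defined once and declared extern for every includer. The names sketch_cstrings and sketch_cstring_count and the string contents are hypothetical, not what c2str.pl actually emits, and both halves are shown in one block only so the sketch compiles on its own.

```c
#include <stddef.h>

/* ---- would live in the shared .h (hypothetical layout) ----
 * Every file that includes the header sees only declarations,
 * so including it from many .c files creates no duplicate data. */
extern const char *const sketch_cstrings[];
extern const size_t sketch_cstring_count;

/* ---- would live in the single generated .c file ----
 * The one and only definition of the table. The entries are
 * placeholders, not real Parrot constant strings. */
const char *const sketch_cstrings[] = {
    "example_one",
    "example_two"
};

const size_t sketch_cstring_count =
    sizeof sketch_cstrings / sizeof sketch_cstrings[0];
```

With this split, the table is stored once no matter how many files include the header.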

Here is a related patch that caches the hash value of every string, computed on demand. The important part is that the cached value is cleared in unmake_COW, which is called any time we might mutate the string and thus invalidate the cached value. As a side effect, c2str.pl can be slightly simpler: it no longer needs to pre-calculate the hash value, since const strings are treated the same as any others and their hash values will be calculated and cached if they are ever needed.
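The idea can be sketched roughly as follows. This is not the patch itself: SketchString, the sketch_* functions, and the choice of hash function are all assumptions made for illustration, and Parrot's real STRING structure and unmake_COW do considerably more.

```c
#include <stddef.h>

/* Hypothetical, simplified string; not Parrot's STRING struct. */
typedef struct {
    char  *buf;
    size_t len;
    size_t hashval;   /* 0 means "no cached hash" */
} SketchString;

/* Counts real hash computations, so the caching is observable. */
static unsigned long sketch_hash_computations = 0;

/* A simple multiplicative hash, used here only as a stand-in. */
static size_t sketch_compute_hash(const SketchString *s)
{
    size_t h = 5381;
    size_t i;
    for (i = 0; i < s->len; i++)
        h = h * 33 + (unsigned char)s->buf[i];
    sketch_hash_computations++;
    return h ? h : 1;   /* never return the "not cached" marker */
}

/* Compute the hash on first use, then reuse the cached value. */
size_t sketch_string_hash(SketchString *s)
{
    if (s->hashval == 0)
        s->hashval = sketch_compute_hash(s);
    return s->hashval;
}

/* Stand-in for unmake_COW: the real function also deals with the
 * shared buffer; this sketch only drops the cached hash. */
void sketch_unmake_cow(SketchString *s)
{
    s->hashval = 0;
}

/* Every mutator goes through sketch_unmake_cow before writing. */
void sketch_set_char(SketchString *s, size_t i, char c)
{
    sketch_unmake_cow(s);
    s->buf[i] = c;
}
```

Because every mutation path passes through the one invalidation point, repeated lookups with an unchanged key pay for the hash only once.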

The patch speeds up the attached benchmark by a factor of 1.86 in the optimized case (via --optimize, so -Os), or 3.73 in the unoptimized case (on Mac OS X):

# without the patch, optimized build
% ./parrot hash-timing.pasm
679 hash entries
128 characters per test key
1000000 lookups each
rep 1:  1.093889 sec
rep 2:  1.109484 sec
rep 4:  1.095041 sec

# with the patch, optimized build
% ./parrot hash-timing.pasm
679 hash entries
128 characters per test key
1000000 lookups each
rep 1:  0.608547 sec
rep 2:  0.586352 sec
rep 4:  0.575159 sec

JEff


Attachment: hash-caching.patch
Description: Binary data

Attachment: hash-timing.pasm
Description: application/text

