The first draft of the dynamic chartype loading code has been committed. In the process, several changes have been made to the CHARTYPE struct. The main purpose is to provide data fields in the structure, so generic functions can be used for transcoding and digit handling.
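To illustrate the idea of data fields driving generic functions, here is a minimal sketch in C. It is not the committed code: the struct layout and every name in it (chartype_sketch, sketch_to_unicode, and so on) are hypothetical and chosen only for this example. It assumes a single-byte chartype with a 256-entry table, a reverse lookup done by scanning the whole table, and digits stored as ten consecutive byte values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout, NOT the real Parrot CHARTYPE struct. */
typedef struct {
    const char   *name;             /* e.g. "8859-1" */
    unsigned int  to_unicode[256];  /* byte value -> Unicode code point */
    unsigned char digit_zero;       /* byte that represents the digit 0 */
} chartype_sketch;

#define NO_MAPPING (-1)

/* Fill the table with the identity mapping. ISO 8859-1 maps every
 * byte value n to code point U+00n, so this is enough for that set. */
static void sketch_init_identity(chartype_sketch *ct, const char *name)
{
    int i;
    assert(ct != NULL);
    ct->name = name;
    for (i = 0; i < 256; i++)
        ct->to_unicode[i] = (unsigned int)i;
    ct->digit_zero = '0';
}

/* Generic byte -> Unicode lookup: one table read, no per-chartype code. */
static unsigned int sketch_to_unicode(const chartype_sketch *ct,
                                      unsigned char c)
{
    return ct->to_unicode[c];
}

/* Generic Unicode -> byte lookup by scanning the whole table.
 * Returns NO_MAPPING when the code point is not in this chartype. */
static int sketch_from_unicode(const chartype_sketch *ct, unsigned int cp)
{
    int i;
    for (i = 0; i < 256; i++) {
        if (ct->to_unicode[i] == cp)
            return i;
    }
    return NO_MAPPING;
}

/* Generic digit handling: value 0..9, or -1 if c is not a digit. */
static int sketch_digit_value(const chartype_sketch *ct, unsigned char c)
{
    if (c >= ct->digit_zero && c <= ct->digit_zero + 9)
        return c - ct->digit_zero;
    return -1;
}
```

With a table like this, transcoding between two single-byte chartypes would amount to sketch_to_unicode on the source followed by sketch_from_unicode on the target, and a loaded mapping file would only need to fill in the table.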
Several ops have also been added to simplify testing:

  set_encoding S0, I0 - set encoding to specified index value
  set_chartype S0, I0 - set chartype to specified index value
    - these two can be deleted once these fields can be set via IO etc
  transcode S0, S1, I0, I1 - transcode to specified encoding/chartype

Dynamic loading of a chartype is automatic if a request is made for a non-existent chartype, e.g.

  find_chartype I0, "8859-1"
    - this will load parrot/runtime/chartypes/8859-1.TXT
  set_chartype S0, I0

The search path and extension are hard-coded for now.

The mapping file format is that used by the Unicode consortium, and 8859-1.TXT was downloaded directly from their web site; there are lots more mapping files there that we can use (Dan - can you confirm that the license is okay?)

There are a lot of limitations in the mechanism so far, including:

  - only singlebyte encoding
  - digit mapping is assumed to be standard ascii '0' to '9'
  - mapping from unicode uses full table scan

However, it should allow us to get a start on testing support for multiple character sets in the rest of Parrot, and I wanted to get something in for comments while I continue with further development.

All feedback welcome

--
Peter Gibbs
EmKel Systems