> Ludo sez:
> The (undocumented) ‘scm_to_stringn ()’ returns the number of characters,
> AFAICS.

I'll try to get a doc patch in this weekend.  The second parameter to
scm_to_stringn gets filled with the result buffer length from either
mem_iconveh or u32_conv_to_encoding, so it is definitely the length in
bytes of the locale-encoded string.  Maybe I can also make a unit test
for this function.

> > Also, in the big scheme of things, I wonder if the name "string port"
> > is misleading now.  Strings can contain the whole codepoint range.
> > But string ports can't store the whole range depending on their encoding.
> > (That's what the "UTF-8" hack was about.)
>
> Yes, it’s tricky.  The problem is that currently we can send both
> textual and binary data to a given port (unlike the R6RS port API, which
> judiciously distinguishes textual and binary ports.)  Because of that, I
> think string ports can’t just use a fixed encoding.
>
> What do you think?

I'm fine with having the string ports operate this way.  I think the
parallelism to other ports is a good thing.

I know that I'm 'splitting hairs', but there are a couple of places in
the docs that refer to string ports being "ports *on* a scheme string",
when in truth they are ports initialized by strings or that output to
strings.  That is a trivial point of nomenclature, though.

Thanks,

Mike
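
P.S.  To make the bytes-versus-characters distinction concrete, here is a
minimal sketch in plain C.  It does not call Guile at all: the helper names
utf8_char_count and utf8_byte_length are hypothetical, and it assumes the
input is well-formed, NUL-terminated UTF-8.  It only shows why a length
taken from the converter's result buffer can differ from the number of
characters in the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the code points in a well-formed UTF-8
   string by skipping continuation bytes (those of the form 10xxxxxx).  */
size_t
utf8_char_count (const char *s)
{
  size_t n = 0;

  assert (s != NULL);
  for (; *s != '\0'; s++)
    if (((unsigned char) *s & 0xC0) != 0x80)
      n++;
  return n;
}

/* Hypothetical helper: the length in bytes of the encoded string, which
   is the kind of value a result-buffer length reports.  */
size_t
utf8_byte_length (const char *s)
{
  assert (s != NULL);
  return strlen (s);
}
```

For "caf\xC3\xA9" (the word "café" in UTF-8) the character count is 4 but
the byte length is 5; for a pure ASCII string the two agree, which is
probably why the difference is easy to miss.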