l...@gnu.org (Ludovic Courtès) writes:
> Andy Wingo <wi...@pobox.com> skribis:
>
>>   The (newline) function can write CRLF
>>   The ~% format directive should DTRT
>>   read-line should DTRT
>
> IMO the correct abstraction here is transcoders à la R6RS.

Agreed.

> The problem is that scm_t_port doesn’t have any slot to specify the
> EOL style, but it would need one.

I think it's important that we find a way to add new information to
scm_t_port in 2.0.  We also need this to properly fix the BOM issue.

Here's a proposal: let's slightly redefine the meaning of 'input_cd' and
'output_cd'.  Users are already unable to use these, because in the
common case (UTF-8) they are both -1.

Instead of having 'input_cd' and 'output_cd' point directly to the
platform's iconv_t structures, let's have them point to our own internal
structure(s) that hold the needed transcoder state.  This could include
things like the state for internally-implemented encoding(s) (e.g. UTF-8
BOM handling), EOL style, and iconv_t pointer(s) if appropriate.
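
As a rough illustration only, such a structure might look something like
the sketch below.  Every name in it (port_transcoder, eol_style,
transcoder_init, transcoder_newline) is hypothetical and does not exist
in Guile; the only real API used is the iconv_t type from <iconv.h>.
It assumes a single iconv_t descriptor per direction and only two EOL
styles.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical EOL styles; a real design might need more (e.g. CR).  */
enum eol_style { EOL_LF, EOL_CRLF };

/* Hypothetical per-direction transcoder state that 'input_cd' or
   'output_cd' could point to instead of a bare iconv_t.  */
struct port_transcoder
{
  iconv_t cd;          /* (iconv_t) -1 when iconv is not used, e.g. UTF-8 */
  enum eol_style eol;  /* which sequence a newline maps to */
  int bom_pending;     /* nonzero until the start of the stream is handled */
};

/* Initialize a transcoder that does no iconv conversion.  */
void
transcoder_init (struct port_transcoder *t, enum eol_style eol)
{
  assert (t != NULL);
  t->cd = (iconv_t) -1;
  t->eol = eol;
  t->bom_pending = 1;
}

/* Copy the newline sequence for T into BUF.  Return the number of bytes
   written, or 0 if BUF (of SIZE bytes) is too small.  */
size_t
transcoder_newline (const struct port_transcoder *t, char *buf, size_t size)
{
  const char *seq = (t->eol == EOL_CRLF) ? "\r\n" : "\n";
  size_t len = strlen (seq);

  if (size < len)
    return 0;
  memcpy (buf, seq, len);
  return len;
}
```

With something along these lines, (newline) and the ~% directive could
consult the port's transcoder for the sequence to emit, rather than
hard-coding "\n".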

What do you think?

    Mark
