Thank you, Mark. The problem is rather obscure and may have been fixed in 2.2.
I've taken the reins of handling the Guile code in GnuCash. For reasons I can't fathom, the Windows build includes Guile 2.0.14 rather than Guile 2.2. I've checked NEWS, and there was a change in 2.0.6 to make SRFI-6 string ports Unicode-capable.

Bear in mind that the majority of the string code in GnuCash handles Unicode just fine. However, some currencies, e.g. TRY (https://en.wikipedia.org/wiki/Turkish_lira), need extended Unicode, and their symbols are misprinted as ? in the reports.

I've delved down and figure there are only 2 offending functions: (format #f "~a bla" str) and (with-output-to-string), as described above. After much experimentation, I can fix the problem by changing (format) to (string-append), and by changing (with-output-to-string) to an explicit SRFI-6 string port (open-output-string), importing srfi-6 as described in the original post. These changes fix the TRY symbol display. Hence my suspicion that string ports on Windows are munging Unicode.

To try to elucidate this, I've also tried removing (setlocale LC_ALL "") and dumping (locale-encoding), which is "CP1252". There are also other bits where UTF-8 is being interpreted as CP1252, but these are outside the scope of this post.

So, I'm rather late to this game (I started diving into Scheme 18 months ago) and have probably missed many controversial changes over the past years, but the issue above seems weird to me: why is the Windows port munging Unicode? :)

On Tue, 16 Apr 2019 at 17:29, Mark H Weaver <m...@netris.org> wrote:

> Hi Christopher,
>
> Christopher Lam <christopher....@gmail.com> writes:
>
> > I'm struggling with string-ports on Windows.
> >
> > Last para of
> > https://www.gnu.org/software/guile/manual/html_node/String-Ports.html
> > "With string ports, the port-encoding is treated differently than other
> > types of ports. When string ports are created, they do not inherit a
> > character encoding from the current locale. They are given a default
> > locale that allows them to handle all valid string characters."
> >
> > This causes a string-sanitize function to not run correctly in Windows.
> > (locale-encoding) says "CP1252" no matter what LANG or setlocale I try.
> >
> > The use case is to sanitize string for html, but on Windows it munges
> > extended-unicode.
>
> Can you explain more fully what the problem is? I know a fair amount
> about Unicode, but my knowledge of Windows is extremely weak.
>
> What exactly is "extended-unicode" in this context? References welcome.
>
> Thanks,
> Mark
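
For concreteness, the two rewrites described in my reply can be sketched roughly as follows. This is only an illustration, not the actual GnuCash code: the procedure names and the sample value are made up, and it assumes the standard SRFI-6 procedures open-output-string and get-output-string.

```scheme
(use-modules (srfi srfi-6))   ; SRFI-6 string ports

;; Hypothetical sample value: U+20BA (Turkish lira sign) plus an amount.
;; Built with integer->char so it does not depend on the source file encoding.
(define amount
  (string-append (string (integer->char #x20BA)) " 12.50"))

;; Before: formatting via format.
(define (label-with-format str)
  (format #f "~a bla" str))

;; After: plain concatenation.
(define (label-with-append str)
  (string-append str " bla"))

;; Before: capturing output with with-output-to-string.
(define (render-implicit str)
  (with-output-to-string
    (lambda () (display str))))

;; After: an explicit SRFI-6 output string port.
(define (render-explicit str)
  (let ((port (open-output-string)))
    (display str port)
    (get-output-string port)))
```

On a build that handles Unicode correctly, each before/after pair should return equal strings. Comparing them with string=? on the Windows build might be one way to narrow down which call does the munging.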