Hans-Peter Diettrich wrote:
Mark Morgan Lloyd wrote:
I've got a couple of terminal emulators using WideChar and WideString
for internal manipulation, what /should/ I be using? and where does it
leave things like Sorokin's regex unit, which similarly use WideChar
and WideString?
Depends on which libraries you use. AFAIK SBCS RegEx works for both ANSI
and UTF-8 strings, so a UTF-16 library is optional. For the
terminal emulators I'd think it's sufficient to introduce an
internal string type that allows switching between UTF-8 and UTF-16, so
that the (different?) behaviour can be tested. If differences
exist, this indicates that the WideString emulators handle *only*
Unicode BMP characters, not surrogate pairs, and you have to decide
whether this restriction is okay for you.
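
To illustrate the BMP versus surrogate-pair distinction mentioned above, here is a minimal sketch. It is written in Python rather than Pascal purely for brevity, and the helper names (utf16_units, needs_surrogates) are made up for this example, not part of any library. It shows that a character outside the BMP occupies two UTF-16 code units, which is where code assuming "one WideChar = one character" would start to misbehave.

```python
def utf16_units(text: str) -> list[int]:
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-le")  # little-endian, no byte-order mark
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def needs_surrogates(text: str) -> bool:
    """True if any character lies outside the BMP (code point > U+FFFF)."""
    return any(ord(ch) > 0xFFFF for ch in text)


bmp_char = "\u234B"          # an APL symbol, inside the BMP
non_bmp_char = "\U0001D11E"  # musical G clef, outside the BMP

print(len(utf16_units(bmp_char)), needs_surrogates(bmp_char))
print([hex(u) for u in utf16_units(non_bmp_char)],
      needs_surrogates(non_bmp_char))
```

If an emulator only ever produces characters like the first one, the BMP-only restriction should never be noticed.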
I think I need to clarify. The terminal emulators do not handle a standard
encoding such as UTF-8; they accept a non-standard byte sequence over e.g.
a telnet or serial connection and convert it to a particular set of
characters, to emulate e.g. an IBM Selectric APL golfball.
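
A rough sketch of that kind of conversion, in Python for brevity. The table below is entirely hypothetical: the byte values are invented for illustration and are not the real Selectric APL codes, and decode_line is a made-up name. The point is only that each incoming byte is looked up in a fixed table and replaced by one output character, so no standard decoder is involved.

```python
# HYPOTHETICAL mapping: these byte values are invented for this sketch
# and do not reflect any real terminal's character codes.
HYPOTHETICAL_TABLE = {
    0x41: "\u237A",  # APL alpha
    0x49: "\u2373",  # APL iota
    0x52: "\u2374",  # APL rho
}

REPLACEMENT = "\uFFFD"  # shown for bytes that have no table entry


def decode_line(raw: bytes, table: dict[int, str]) -> str:
    """Translate each incoming byte to its character via table lookup."""
    return "".join(table.get(b, REPLACEMENT) for b in raw)


received = bytes([0x41, 0x49, 0x52, 0x00])  # as if read from a serial port
text = decode_line(received, HYPOTHETICAL_TABLE)
print([hex(ord(ch)) for ch in text])
```

Every character produced by this particular table is below U+10000, so each one fits in a single 16-bit unit.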
Sorokin's regex unit is a separate issue, and applies to FPC's regexpr
package, which uses WideChar: I don't know whether this would be
problematic on Windows.
--
Mark Morgan Lloyd
markMLl .AT. telemetry.co .DOT. uk
[Opinions above are the author's, not those of his employers or colleagues]