On Sun, Jan 19, 2020 at 12:20:18AM -0500, Paul Procacci wrote:
> On Sun, Jan 19, 2020 at 12:12 AM yary <not....@gmail.com> wrote:
> 
> > In UTF-16 every character is 16 bits, so all 8 bits of zeros tells you is
> > that it's possibly a big-endian ascii character or a little-endian
> > non-ascii character at a position divisible by 256. All zeros U+0000 is
> > unicode NULL, which the windows UTF-16 C convention uses to terminate the
> > string.
> 
> Perfect.  Obviously didn't know that.  My assumption that only the first
> byte gets checked was obviously wrong.

That assumption is correct if you're talking about UTF-8, just not UTF-16 :)
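
A minimal sketch of the difference, in Python purely for illustration. The helper names and the sample string are made up for this example and are not from the code discussed in this thread. In UTF-8 a 0x00 byte can only be U+0000, so a byte-wise check finds the terminator; in UTF-16 a zero byte also appears inside ordinary characters, so the check has to look at whole 16-bit code units.

```python
def first_zero_byte(data: bytes) -> int:
    """Index of the first 0x00 byte, or -1 if there is none."""
    return data.find(b"\x00")


def utf16le_terminator(data: bytes) -> int:
    """Byte offset of the first all-zero 16-bit code unit, or -1."""
    for i in range(0, len(data) - 1, 2):
        if data[i] == 0 and data[i + 1] == 0:
            return i
    return -1


# Hypothetical sample: two ASCII letters plus U+0100, whose low byte is zero.
text = "hi\u0100"
utf8 = text.encode("utf-8") + b"\x00"            # NUL-terminated UTF-8
utf16 = text.encode("utf-16-le") + b"\x00\x00"   # NUL-terminated UTF-16LE

print(first_zero_byte(utf8))      # the real terminator
print(first_zero_byte(utf16))     # a false hit inside 'h'
print(utf16le_terminator(utf16))  # the real terminator
```

The byte-wise check stops inside the first character of the UTF-16 buffer, while the 16-bit check skips both the high byte of each ASCII letter and the low byte of U+0100.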

G'luck,
Peter

-- 
Peter Pentchev  roam@{ringlet.net,debian.org,FreeBSD.org} p...@storpool.com
PGP key:        http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint 2EE7 A7A5 17FC 124C F115  C354 651E EFB0 2527 DF13
