On Sun, Jan 19, 2020 at 12:20:18AM -0500, Paul Procacci wrote:
> On Sun, Jan 19, 2020 at 12:12 AM yary <not....@gmail.com> wrote:
>
> > In UTF-16 every character is 16 bits, so all 8 bits of zeros tells you is
> > that it's possibly a big-endian ascii character or a little-endian
> > non-ascii character at a position divisible by 256. All zeros U+0000 is
> > unicode NULL, which the windows UTF-16 C convention uses to terminate the
> > string.
>
> Perfect. Obviously didn't know that. My assumption that only the first
> byte gets checked was obviously wrong.

It is correct if you're talking about UTF-8, not UTF-16 :)

G'luck,
Peter

--
Peter Pentchev  roam@{ringlet.net,debian.org,FreeBSD.org} p...@storpool.com
PGP key:        http://people.FreeBSD.org/~roam/roam.key.asc
Key fingerprint 2EE7 A7A5 17FC 124C F115 C354 651E EFB0 2527 DF13
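
For anyone who wants to see the difference between the two encodings, here is a minimal Python sketch. It is only an illustration, not code from this thread: the sample string and the helper names zero_byte_offsets and utf16_code_units_before_nul are made up for the example. It lists where zero bytes fall in each encoding, and then scans UTF-16 data two bytes at a time, which is roughly what a terminator check on 16-bit code units has to do.

```python
# Illustration only: where do zero bytes show up in UTF-8 vs UTF-16?
# The sample text and both helper names are hypothetical.

text = "Hi\u0100"  # 'H', 'i', and U+0100, whose low byte is zero

utf8 = text.encode("utf-8")
utf16_le = text.encode("utf-16-le")  # little-endian, no byte order mark
utf16_be = text.encode("utf-16-be")  # big-endian, no byte order mark


def zero_byte_offsets(data):
    """Return the offsets of every single zero byte in data."""
    return [i for i, b in enumerate(data) if b == 0]


def utf16_code_units_before_nul(data):
    """Count 16-bit code units before the first all-zero unit (U+0000).

    Steps through the data two bytes at a time, so a lone zero byte that
    is only half of a code unit does not end the scan.
    """
    for i in range(0, len(data) - 1, 2):
        if data[i] == 0 and data[i + 1] == 0:
            return i // 2
    return len(data) // 2


# UTF-8 only produces a zero byte for U+0000 itself, so this list is empty.
print("UTF-8    ", utf8.hex(" "), zero_byte_offsets(utf8))

# UTF-16 has zero bytes inside ordinary characters: the high byte of each
# ASCII character and the low byte of U+0100.
print("UTF-16LE ", utf16_le.hex(" "), zero_byte_offsets(utf16_le))
print("UTF-16BE ", utf16_be.hex(" "), zero_byte_offsets(utf16_be))

# A byte-wise scan would stop at offset 1 of the little-endian data;
# the 16-bit scan only stops at the real terminator.
terminated = utf16_le + b"\x00\x00" + b"zz"
print("code units before NUL:", utf16_code_units_before_nul(terminated))
```

So checking one byte at a time is enough to find the end of a NUL-terminated UTF-8 string, while for UTF-16 the check has to look at whole 16-bit units.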