-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wednesday, March 19 at 02:14 PM, quoth Chris G:
>Well I'm still not sure things are right, even after getting my
>editor to do (approximately) the right thing.
>
>Here are some incorrect pound signs:-

Those are all encoded as three bytes: 0xEF 0xBF 0xBD

>Here are some correct (as in correctly encoded as utf-8 by my editor)
>pound signs:-

Those are also all the same three bytes: 0xEF 0xBF 0xBD

That *looks* like valid utf-8. For a quick tutorial in three-byte
utf-8, the way three-byte characters are encoded (in binary) is like
this:

    1110xxxx 10yyyyyy 10zzzzzz

The three bytes 0xEF 0xBF and 0xBD are, in binary, this:

    11101111 10111111 10111101

Thus, the decoded portions are:

    1111 111111 111101

Put them back together as a single binary number:

    1111111111111101

That's 65533 in decimal (0xfffd in hex). In Unicode, that's referred
to as U+FFFD, which (according to the Unicode specification) is:

    REPLACEMENT CHARACTER
    - used to replace an incoming character whose value is unknown
      or unrepresentable in Unicode
    - compare the use of U+001A as a control character to indicate
      the substitute function

In other words, if that's what your editor is generating, then it
obviously doesn't know how to handle a pound symbol, even though it
DOES seem to understand UTF-8 (kinda).

For what it's worth, the CORRECT utf-8 encoding of the pound symbol
(U+00A3) is only two bytes. Here's how we get it. Two-byte UTF-8
characters are encoded like this (in binary):

    110yyyyy 10zzzzzz

U+00A3 translates to the hex number 0xA3, which in binary is this:

    10100011

If we split that up into the top two bits and the low six bits, that
becomes:

    10 100011

Thus, in UTF-8 (padding the first group with zeros to five bits) it's
encoded as:

    11000010 10100011

Thus, the correct UTF-8 encoding for a pound symbol is 0xC2 0xA3.
Here's an example: £

~Kyle
- -- 
I contend that we are both atheists. I just believe in one fewer god
than you do. When you understand why you dismiss all the other
possible gods, you will understand why I dismiss yours.
                                        -- Sir Stephen Henry Roberts
-----BEGIN PGP SIGNATURE-----
Comment: Thank you for using encryption!

iEYEARECAAYFAkfhMFsACgkQBkIOoMqOI144+gCg5bLJ2t7fK7+Ih1A6qBFgeuka
jO0AoKDy+JgwsknmCiSDkOwG4OTE2p0Z
=euIx
-----END PGP SIGNATURE-----
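
P.S. For anyone who wants to check the bit arithmetic above, here is a
minimal Python sketch of the same hand-decoding. The function name
decode_utf8_char is made up for illustration (it is not from any
library), and it only handles the one-, two- and three-byte forms
discussed in this message, with no validation of the continuation
bytes.

```python
def decode_utf8_char(data: bytes) -> int:
    """Hand-decode a single 1-, 2- or 3-byte UTF-8 sequence to a code point.

    Hypothetical helper for illustration only: it does not check that
    continuation bytes start with the bits 10, and it ignores 4-byte forms.
    """
    first = data[0]
    if first & 0b10000000 == 0:
        # 0xxxxxxx: plain ASCII, the byte is the code point
        return first
    if first & 0b11100000 == 0b11000000:
        # 110yyyyy 10zzzzzz
        return ((first & 0b00011111) << 6) | (data[1] & 0b00111111)
    if first & 0b11110000 == 0b11100000:
        # 1110xxxx 10yyyyyy 10zzzzzz
        return (
            ((first & 0b00001111) << 12)
            | ((data[1] & 0b00111111) << 6)
            | (data[2] & 0b00111111)
        )
    raise ValueError("lead byte not covered by this sketch")


# The three bytes the editor produced
replacement = decode_utf8_char(bytes([0xEF, 0xBF, 0xBD]))
# The two bytes a pound sign should produce
pound = decode_utf8_char(bytes([0xC2, 0xA3]))

print(hex(replacement))  # 0xfffd
print(hex(pound))        # 0xa3

# Cross-check against Python's own encoder
print("\u00a3".encode("utf-8").hex())  # c2a3
```

Running the editor's output through something like this is a quick way
to tell whether a file really contains 0xC2 0xA3 or whether the pound
sign has already been replaced by U+FFFD.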