On Tue, Mar 15, 2016 at 11:18:47AM +0100, Ionel Mugurel Ciobîcă wrote:
> On 14-03-2016, at 17h 30'55", Jon LaBadie wrote about "decoding UTF-8"
> > I frequently find headers (mostly Subject, but also From/To)
> > that I assume are some representation form for a UTF-8 encoded
> > string as they start with "=?UTF-8?" and end with "=?= ".
> > For example:
> >
> > To: =?UTF-8?B?Z3VuZGk=?= <user@domain>
> >
> > Is my assumption correct? What is the representation called?
> > Is there a tool to regain the original string?
> > I believe my video system can display the larger
> > character set.
> >
>
> If after =?UTF-8 there is ?Q then the non-ascii characters (and =) are
> represented by their hexadecimal representation, for example ç is
> =C3=A7.
>
> If after =?UTF-8 there is ?B then all characters are encoded using an
> algorithm that takes 6bits at the time. You can encode and decode this
> with base64:
>
> # echo "something" | base64
> # c29tZXRoaW5nCg==
>
> #echo "Z3VuZGk=" | base64 -d
> gundi
>
>
> Ionel
>
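
For reference, the "=?charset?encoding?text?=" form is the MIME
encoded-word syntax from RFC 2047. B means base64, Q is a variant of
quoted-printable in which "_" stands for a space, and the encoding
letter is case-insensitive, so ?q? is the same as ?Q?. Below is a
minimal sketch that handles both, assuming Python 3 is available. The
helper name decode_mime_header is made up for illustration; the
email.header functions it calls are part of the standard library.

```python
from email.header import decode_header, make_header


def decode_mime_header(raw: str) -> str:
    """Decode RFC 2047 encoded-words (?B? and ?Q?/?q?) in a header value."""
    # decode_header() splits the value into (bytes_or_str, charset) chunks,
    # undoing the base64 or Q encoding of each encoded-word.
    # make_header() then joins the chunks into a single Unicode string.
    return str(make_header(decode_header(raw)))


if __name__ == "__main__":
    # The base64 example from the original question.
    print(decode_mime_header("=?UTF-8?B?Z3VuZGk=?= <user@domain>"))
    # The Q-encoded example from the quoted reply (ç is =C3=A7).
    print(decode_mime_header("=?UTF-8?Q?=C3=A7?="))
    # Lowercase q and "_" for a space (hypothetical sample value).
    print(decode_mime_header("=?utf-8?q?fran=C3=A7ais_text?="))
```

The output only displays correctly if the terminal uses a character
set that can show the decoded characters, for example a UTF-8 locale.
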

Thank you Ionel.

I looked at over 400 messages in my spam quarantine directory
where I see a lot of these encodings. The vast majority
had ?B? after the UTF-8. I tried a few of them and they
did decode with base64.

Instead of ?B?, seven had ?Q? and two had ?q?. These did
not decode with base64. Gee, maybe they are ROT13 :-)

jl
--
Jon H. LaBadie                 j...@jgcomp.com
 11226 South Shore Rd.         (703) 787-0688 (H)
 Reston, VA 20190              (703) 935-6720 (C)