On Jul 9, 2009, at 3:02 AM, Ethan Grammatikidis wrote:
; hget http://google.gr/
<!doctype html><html><head><meta http-equiv="content-type"
content="text/html; charset=ISO-8859-7">
i'm pretty sure that ISO-8859-7 != utf-8.
I guess that's server-side mucking about based on the user-agent not
reporting utf-8 capability or something stupid. Firefox's page info
feature reports the page as utf-8, and on inspection of the source:
<!doctype html><html><head><meta http-equiv="content-type"
content="text/html; charset=UTF-8">
I wonder if there's some 'preferred encoding' message the UA can send
to the server.
Accept-Charset is the HTTP request header that you want, but to do it
`right' you probably want to muck about with HTTP's q-value weighting
system, which ranks each charset you list from 0 (unacceptable) to 1
(most preferred). The shorter form is that you'll have to tell the
server you're ok with UTF-8, or it'll fall back to its best-guess
techniques, with a default fallback in the ISO-8859 family (ISO-8859-7
in the hget output above).
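
For illustration, here is a rough sketch in Python of what such a
header could look like. The accept_charset helper and the particular
weights are made up for this example; only the Accept-Charset header
name and the ";q=" syntax come from HTTP itself. It builds the request
but does not send it, and whether a given server honours the header is
a separate question.

```python
from urllib.request import Request

def accept_charset(prefs):
    """Build an Accept-Charset value from (charset, q) pairs.

    Hypothetical helper for illustration: a q of 1 is the default
    weight, so it is left implicit; anything else is written as ';q=N'.
    """
    parts = []
    for charset, q in prefs:
        if q == 1:
            parts.append(charset)
        else:
            parts.append(f"{charset};q={q:g}")
    return ", ".join(parts)

# Prefer UTF-8, tolerate ISO-8859-7, accept anything else as a last resort.
# The weights are arbitrary example values.
header = accept_charset([("utf-8", 1.0), ("iso-8859-7", 0.5), ("*", 0.1)])

# Construct (but do not send) a request carrying the header.
req = Request("http://google.gr/", headers={"Accept-Charset": header})

# urllib stores header names capitalized as 'Accept-charset'.
print(req.get_header("Accept-charset"))
```

Passing req to urllib.request.urlopen would send it; the same header
value could equally be supplied by any other client that lets you set
request headers.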
*Chad