>
> Hi Attila,
>
> Well, sounds simple, but I think it's a bit more...
>
> How can the server guess what the encoding will be in the first page it
> sends (the one that will be used to send the form back, etc.)? Even if you
> know the language (say it's Japanese), there are many charsets that could
> be used.

An even better idea, which came to mind since my previous letter, is to always
assume that the session charset is the same as the charset of the last response
Writer. I think this would suffice for most real-world scenarios. It would
break only when a) a session uses multiple charsets, or b) a session receives
multiple requests concurrently. Both cases are rare, in my opinion.
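
A minimal sketch of this idea in plain Java, deliberately kept outside any
servlet container API. The class SessionCharsetTracker, its method names, and
the use of a session id string as the key are all hypothetical, chosen only
for illustration; a real container would hook this into its own session and
response objects.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: remembers, per session, the charset of the last
// response Writer and reuses it to decode the next request's parameters.
class SessionCharsetTracker {
    // Session id -> charset used by the most recent response in that session.
    private final Map<String, Charset> lastResponseCharset = new ConcurrentHashMap<>();
    // Charset assumed when the session has not produced a response yet.
    private final Charset fallback;

    SessionCharsetTracker(Charset fallback) {
        this.fallback = fallback;
    }

    // Call whenever a response Writer is created for the session.
    void recordResponse(String sessionId, Charset charset) {
        lastResponseCharset.put(sessionId, charset);
    }

    // Charset to assume for the next request arriving in this session.
    Charset charsetForRequest(String sessionId) {
        return lastResponseCharset.getOrDefault(sessionId, fallback);
    }

    // Decode raw (already URL-unescaped) parameter bytes with that charset.
    String decodeParameter(String sessionId, byte[] raw) {
        return new String(raw, charsetForRequest(sessionId));
    }
}
```

Because only one charset is stored per session, the two failure cases above
show up directly: a second response in a different charset overwrites the
entry, and concurrent requests can race on it.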

>
> If we are going to rely on the fact that the browser will use the original
> encoding, I would rather send UTF in the first page.
>
> One case I don't know how to deal with: say we have a form to type a
> name, and we have people from Japan, Europe, etc., who might have
> non-ASCII letters in their names. There is no way to know what charset to
> use to decode, or to send the original page. We can't set 8859-2 on the
> first page, or JIS or anything else (because we don't know what the user
> is going to type).
>

In my eyes, UTF-8 is definitely the way to go if you think globally.
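
To illustrate the name-form case, here is a small sketch using only the
standard java.nio.charset classes. The class CharsetCoverage and its two
methods are hypothetical names made up for this example; it assumes the JDK
ships the ISO-8859-2 charset, which is not one of the charsets Java
guarantees on every platform.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for checking what a charset can carry.
class CharsetCoverage {
    // True if every character of the text can be encoded in the charset.
    static boolean canRepresent(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    // Encode and decode again; unencodable characters are replaced on the
    // way out (typically with '?'), so the result then differs from the input.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }
}
```

With Charset.forName("ISO-8859-2"), a Polish name such as "\u0141ukasz"
passes canRepresent but a Japanese name such as "\u5c71\u7530" does not,
whereas StandardCharsets.UTF_8 accepts both and round-trips them unchanged.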

Cheers,
  Attila.
