On Thu, 28 Jun 2012, frantisek holop wrote:

>hmm, on Thu, Jun 28, 2012 at 09:47:00AM -0400, Dave Anderson said that
>> Using META is _ugly_, especially for specifying a charset (since the
>> page will be read up through the META element using the charset
>> specified in the real header or assumed by the browser -- and that
>> charset could be incompatible with the actual encoding.)  Why not just
>> use the AddDefaultCharset directive to ensure that a charset is
>> specified in the real header for all pages?  Or is this known to break
>> some browsers that are still in use?
>
>because AddDefaultCharset is a braindead concept.

No, just one that needs to be applied only when appropriate.  The truly
braindead idea is that of partially parsing a file in order to find out
what charset you should have been using in doing that parsing.  This
only "mostly works" because the typical page content from the beginning
through any META elements is plain ASCII, and most charset values
decode those bytes exactly as 8859-1 does.
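
A minimal Python sketch of why this "mostly works". The page prefix and
the list of charsets below are made up for illustration; the point is
only that an ASCII-only prefix decodes identically under many charsets,
and not at all sensibly under one that is not ASCII-compatible.

```python
# Hypothetical page prefix: everything up to and including the META element.
PREFIX = (
    '<html><head>'
    '<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-2">'
)


def decodes_like_latin1(raw: bytes, charset: str) -> bool:
    """Return True if decoding raw under charset gives the same text as 8859-1."""
    try:
        return raw.decode(charset) == raw.decode("iso-8859-1")
    except UnicodeDecodeError:
        # The bytes are not even valid in this charset.
        return False


raw = PREFIX.encode("ascii")

# Illustrative selection of charsets, not an exhaustive survey.
compatible = {
    cs: decodes_like_latin1(raw, cs)
    for cs in ("utf-8", "iso-8859-2", "windows-1251", "koi8-r", "utf-16-le")
}

for cs, same in compatible.items():
    print(f"{cs:14} {'same as 8859-1' if same else 'DIFFERENT'}")
```

A parser that guesses 8859-1 can therefore find the META element in a
UTF-8 or 8859-2 page, but has no such luck if the real encoding is
something like UTF-16.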

>as the apache config file comment says (on debian):
>
># In general, it is only a good idea if you know that all your files
># have this encoding. It will override any encoding given in the files
># in meta http-equiv or xml encoding tags.

Precisely.  In the case under discussion (where, IIRC, the files in
question were all 8859-1 but some of them did not get a charset
specified in the real headers) it does exactly what is needed.  In more
complicated situations more configuration is needed and, if this is done
properly, setting a default charset may not be appropriate.

>setting AddDefaultCharset is a sure way to break any
>content on your site that happens to be written
>in the non-default-charset, as the server setting
>overrides the explicit meta-tag.

Not true at all.  If you're using different charset values for different
files, you need to set up a pattern in your file naming which encodes
which charset value is appropriate for each type of file and tell the
webserver about it; it then emits the appropriate header for each file.
For dynamic content it's even simpler -- the program producing the
content should also provide the corresponding header information.
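
Both cases can be sketched in a few lines of Python. This is only an
illustration under assumed names: the suffix convention, the
CHARSET_BY_SUFFIX table, content_type_for() and the tiny WSGI app are
all hypothetical, and a real server would do the static-file half in
its own configuration (Apache's mod_mime has an AddCharset directive
for mapping extensions to charsets).

```python
# Hypothetical naming convention: the suffix before ".html" names the charset.
CHARSET_BY_SUFFIX = {
    ".latin1.html": "ISO-8859-1",
    ".latin2.html": "ISO-8859-2",
    ".utf8.html": "UTF-8",
}


def content_type_for(filename: str) -> str:
    """Pick the Content-Type header value for a static file from its name."""
    for suffix, charset in CHARSET_BY_SUFFIX.items():
        if filename.endswith(suffix):
            return f"text/html; charset={charset}"
    # Unknown pattern: state no charset rather than guess one.
    return "text/html"


def app(environ, start_response):
    """Minimal WSGI app: dynamic content that declares its own charset."""
    charset = "UTF-8"
    body = "<p>na\u00efve caf\u00e9</p>\n".encode(charset)
    start_response(
        "200 OK",
        [
            ("Content-Type", f"text/html; charset={charset}"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

In both halves the party that knows the encoding (the site's naming
convention, or the program generating the page) is the one that puts it
in the real header, so nothing has to be guessed from the body.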

>the webserver has no business telling the client
>what charset the content will be in.  it cannot know.
>especially for dynamic content.  the webserver simply
>shuffles bytes.  sometimes it can give a hint with mime-types,
>sometimes not.

Nonsense!

        Dave

-- 
Dave Anderson
<d...@daveanderson.com>
