On 2006-02-04 06:30:27 +0100, Mike Hommey wrote:
> Actually, it's the contrary. You must have a font that provides latin
> character set and chinese characters.

I think I have one (a bitmap font, for xterm). This would explain the
problem.

> The way it works is pretty simple, based on the character set from the
> page (GB2312), it chooses the chinese simplified font set in firefox
> preference, which, by default, uses the generic serif, sans-seric and
> monotype fonts. When displaying, it says to fontconfig that the text is
> to display with the serif, sans-serif or monotone font, and the chinese
> language.

IMHO, it would be better to look at the actual characters instead of the
encoding. The encoding doesn't have much meaning (one can even use a
US-ASCII encoding for any language, thanks to entities), and AFAIK, the
specifications from the W3C never say that the encoding has any meaning
concerning the language. And in particular, the encoding is something
global to the page, whereas the language (or the script) is local.
Moreover, the browser should be able to use different fonts on the same
page. For instance, what if a Chinese author writes an English text, then
his name with Chinese characters? Choosing GB2312 would be a natural
choice.

> Then, fontconfig, knowing it's chinese, tries to find the fittest font
> for the generic name, which means it will look first at chinese fonts,
> and will pick the first one it finds that contain the latin character
> set, which they are, for most of them, providing.

Again, I think that it is a bad choice. One should consider that fonts
can be displayed under different conditions. For instance, a bitmap font
is OK for an xterm, but not for Firefox.

> Anyways, it's not firefox's fault if the author of the web page is
> dumb and said the page is in GB2312 charset while it is basic ASCII
> and if CJK fonts usually include ugly latin characters.

I completely disagree: see above. And anyway, GB2312 is *correct* even
when there isn't a good reason to use it. On the contrary, Firefox is
dumb since it looks at the encoding instead of the actual characters
(after resolving entities).

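To make "look at the actual characters" concrete, here is a minimal sketch
in Python. It is only an illustration of the idea, not what Firefox or
fontconfig actually does. The function names (script_of, script_runs) and
the script labels are made up for this example, and the script detection is
a crude guess based on Unicode character names, not the real Unicode Script
property.

```python
import html
import unicodedata
from itertools import groupby


def script_of(ch):
    """Rough script guess from the Unicode character name (an assumption,
    not the real Unicode Script property)."""
    name = unicodedata.name(ch, "")
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "han"
    if name.startswith(("HIRAGANA", "KATAKANA")):
        return "kana"
    # ASCII spaces, digits and punctuation are lumped in with Latin here.
    if name.startswith("LATIN") or ch.isascii():
        return "latin"
    return "other"


def script_runs(markup_text):
    """Resolve character entities, then split the text into consecutive
    runs of characters that share the same guessed script."""
    text = html.unescape(markup_text)
    return [(script, "".join(chars))
            for script, chars in groupby(text, key=script_of)]


# Pure US-ASCII input that nevertheless contains two Chinese characters.
print(script_runs("Written by &#26446;&#26126;"))
```

The input above is plain US-ASCII at the byte level, yet after resolving
the entities the second run consists of Chinese characters. A font could
then be chosen per run, whatever the declared encoding of the page is.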
The only thing one could say concerning the author is that he should have
provided the language (with the lang or xml:lang attribute). But one
should note that this is not always easy, e.g. when the document comes
from an external source where the language isn't provided and it has been
processed automatically. Moreover, after some thought, choosing the font
based on the declared language would also be incorrect. Indeed, one may
write a Chinese or Japanese person's name with western characters, e.g.

  <p xml:lang="en">some English text <span xml:lang="ja">Ishii</span></p>

All this text should be displayed with the best font providing ASCII.

> You have several options to work around this :
> - say firefox the page is in a latin language (view -> encoding menu),
>   which will be fine, since the charset is ASCII,

This is not acceptable as a general workaround.

> - find the chinese font you have on your system that provides those
>   characters you don't want to see and uninstall it.

Instead of uninstalling it (I'd like to keep it for text terminals), there
should be a way to disable this font for Firefox.

> - change the chinese font settings in the preferences to use a given
>   font instead of the generic ones.

Yes, choosing Bitstream Vera seems to be OK, and it seems to provide
Japanese characters (I haven't tried Chinese). But isn't this something
that should be done at the fontconfig level (so that any application would
benefit)?

Also, I've noticed that when I look at results returned by Google (e.g.
for "sharp"), I get nice fonts (both western and Japanese), but when I look
at the page that uses ISO-2022-JP, I get an ugly (bitmap?) monospace font.
There's something wrong in the font selection! Ditto for Chinese
characters (e.g. http://en.wikipedia.org/wiki/China looks OK for both
western and Chinese characters). In both cases, the encoding is Unicode.
I wonder why Firefox doesn't use a Unicode font for everything!

BTW, is there any way to know what request Firefox made to fontconfig for
a given part of the text?

-- 
Vincent Lefèvre <[EMAIL PROTECTED]> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / SPACES project at LORIA