On Wed, Feb 18 2009, Jens Seidel wrote:

> On Wed, Feb 18, 2009 at 02:43:11AM +0800, Anthony Wong wrote:
>> I have been thinking that using Big5 as the primary encoding for both TC
>> (Traditional Chinese) and SC (Simplified Chinese) versions of Debian website
>> is detrimental to user contributions. To summarize the current situation of
>> the Chinese versions of Debian website, translations must be done in Big5
>> WML files, TC version is basically converted simply from WML to HTML, but to
>> generate the SC versions, Big5 files must be converted to GB2312 first. It
>> is done so due to the one-to-many SC-TC mappings problem. To deal with the
>> differences of terms for the same meaning in TC and SC, like 文件 and 檔案, we
>> use a simple mapping table written in Perl and for some terms that are
>> rarely used, inline WML substitution syntax is used, like
>> [CN:文件:][HKTW:檔案:].
>
>> I suggest 1. to convert all existing Chinese WML files for the Debian
>> website from Big5 to UTF-8, and 2. to use MediaWiki's Chinese conversion
>> table to do both TC-SC and SC-TC conversions.
>
> Will conditionals such as [CN:...] still be used/required?
>
I think they are still required, in case the automatic conversion gets a
term wrong.

> What dialect (TC resp. SC) do you suggest for committed files? Both is
> probably not possible or requires that the build system is extended to
> either support a file name suffix to recognize the dialect or a WML
> tag such as #use debian::chinese::Traditional_Chinese.
>

If we migrate to a UTF-8 based build system, then both dialects can be used
in one file.

- Kanru
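
To make the conditional syntax discussed above concrete, here is a minimal Python sketch of how such tags could be resolved once the files are UTF-8. It is only an illustration, not the website's actual build code (which the thread says uses Perl and WML). The tag shape is inferred solely from the example [CN:文件:][HKTW:檔案:], and the names `COND`, `resolve_conditionals` and `big5_to_utf8` are hypothetical.

```python
import re

# Assumed tag shape, inferred only from the example "[CN:文件:][HKTW:檔案:]":
# "[" + uppercase locale tag + ":" + text + ":]".
COND = re.compile(r"\[([A-Z]+):(.*?):\]")


def resolve_conditionals(text: str, target: str) -> str:
    """Keep the body of tags equal to `target`; drop the body of all others."""
    def pick(match: re.Match) -> str:
        tag, body = match.group(1), match.group(2)
        return body if tag == target else ""
    return COND.sub(pick, text)


def big5_to_utf8(raw: bytes) -> bytes:
    """Re-encode Big5 bytes as UTF-8 (step 1 of the proposal)."""
    return raw.decode("big5").encode("utf-8")


if __name__ == "__main__":
    line = "開啟[CN:文件:][HKTW:檔案:]"
    print(resolve_conditionals(line, "CN"))
    print(resolve_conditionals(line, "HKTW"))
```

The sketch matches tags by exact name, so a real build would still need a rule for which tags apply to which output (for example, whether HKTW applies to both a Hong Kong and a Taiwan build).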