Kobayashi-san,

> First, the name of the mailing list suggests that a Chinese version of JLREQ 
> and (the still-in-development) KLREQ is in the works. If so, that's great 
> news.

Yes, the Chinese text layout task force is going to publish several "REQs" for 
Asian languages: Mongolian, Tibetan, and of course Traditional/Simplified 
Chinese (Hanzi). [1]

I'm working on the first working draft for TC/SC Hanzi. We also want to list 
the code points of punctuation marks as a reference, to help sort out the 
mess. The middle dot is one of them.
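
For reference, here is a small sketch of what such a listing could start from. Python's standard `unicodedata` module reports the character name and the East Asian Width class (UAX #11) of each code point. The candidate list is only the set of code points raised in this thread, and `describe` is a hypothetical helper name, not part of any draft.

```python
import unicodedata

# Code points raised in this thread as middle dot candidates.
CANDIDATES = [0x00B7, 0x2022, 0x2027, 0x30FB, 0xFF0E]


def describe(code_point):
    """Return (U+XXXX label, character name, East Asian Width class)."""
    ch = chr(code_point)
    return (
        f"U+{code_point:04X}",
        unicodedata.name(ch),
        unicodedata.east_asian_width(ch),
    )


if __name__ == "__main__":
    for cp in CANDIDATES:
        label, name, width = describe(cp)
        print(f"{label}  {name:<22} {width}")
```

The width class is a property of the code point, not of any font, so it does not tell us how a particular Traditional Chinese font actually draws the glyph.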

> About the middle dot in Traditional Chinese, based on the exchange between 
> Addison and me yesterday, both U+30FB and U+FF0E must be removed from the 
> equation, because the former has strong ties to Japanese-only usage (and 
> because Chinese fonts may not include a glyph for this character) and the 
> latter is a full-stop (aka period) that happens to be centered within the 
> em-box for Traditional Chinese use.


Agreed.

> That leaves U+00B7 and U+2027, but U+2022 should also be considered.

I think U+2022 is a bullet, and its glyph is usually larger than the middle 
dot. It may also be used as an emphasis dot (the filled one).

If we make its glyph smaller, authors may need to do extra work to fix it.

> The U+2022 versus U+2027 mapping difference is likely yet another platform 
> (Windows versus OS X) difference in the treatment of some punctuation and 
> other symbols. I suggest that someone use the standard Traditional Chinese 
> IME on both platforms to input the character for 0xA145, then inspect which 
> Unicode character was emitted into the document.

I will try to sort them out. Yosemite's default IME lists these candidates: 
4 U+30FB, 5 U+00B7, 6 U+FF0E [2]. Quite terrible.
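
As a cross-check that does not need an IME, the Big Five codes can also be decoded with the mapping tables bundled in a programming language. The sketch below uses three codec names that ship with CPython (`big5`, `cp950`, `big5hkscs`). Each one is just one implementation's table, so it complements the IME test rather than replacing it. `decode_big5_code` is a hypothetical helper name.

```python
import unicodedata

# Codec names that ship with CPython. Each is one implementation's
# Big Five mapping table, not an IME.
CODECS = ["big5", "cp950", "big5hkscs"]


def decode_big5_code(code, codecs=CODECS):
    """Decode a two-byte Big Five code (e.g. 0xA145) with each codec."""
    raw = code.to_bytes(2, "big")
    results = {}
    for codec in codecs:
        ch = raw.decode(codec)
        results[codec] = f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
    return results


if __name__ == "__main__":
    for code in (0xA145, 0xA150):
        for codec, result in decode_big5_code(code).items():
            print(f"0x{code:04X} via {codec}: {result}")
```

If the codecs disagree for 0xA145, that would be one more sign that the difference comes from the mapping tables rather than from the fonts.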

> Source Han Sans (and Noto Sans CJK), the Pan-CJK typeface family that we 
> released earlier this year, do not conform to the above recommendation, but I 
> can implement these recommendations for Traditional Chinese fonts and font 
> instances in the Version 1.002 update that I am planning for the early part 
> of 2015. This would affect only U+00B7 (it is currently proportional) and 
> U+2022 (ditto). U+2027 is a full-width middle dot.

Thank you. I may list them in the ZHREQ for the middle dot. I'm thinking about:

[U+00B7] en, zh-Hant: it will be half-width.
[U+2027] en, zh-Hant: probably falls back to a Chinese font, so full-width.

[U+00B7] zh-Hant <-> zh-Hans: same.
[U+2027] zh-Hant <-> zh-Hans: needs mapping.
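
To illustrate the "needs mapping" case, here is a minimal sketch of the zh-Hant to zh-Hans direction only, assuming zh-Hans text uses U+00B7 exclusively. The function name is hypothetical and nothing here is decided for the ZHREQ. The reverse direction is left out on purpose, because a U+00B7 in zh-Hans text does not say whether the zh-Hant side should be U+00B7 or U+2027.

```python
MIDDLE_DOT = "\u00B7"         # MIDDLE DOT
HYPHENATION_POINT = "\u2027"  # HYPHENATION POINT

# Assumption: zh-Hans text uses U+00B7 only, so U+2027 is folded into it.
# U+00B7 itself is left untouched ("same" in the list above).
_TO_HANS = str.maketrans({HYPHENATION_POINT: MIDDLE_DOT})


def fold_middle_dot_for_hans(text):
    """Replace U+2027 with U+00B7 for a zh-Hans target (illustrative)."""
    return text.translate(_TO_HANS)


if __name__ == "__main__":
    sample = "詩經\u2027魏風\u2027碩鼠"
    folded = fold_middle_dot_for_hans(sample)
    print([f"U+{ord(c):04X}" for c in folded if not c.isalpha()])
```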

There are still a lot of issues to be solved, such as the ellipsis in TC. I 
hope the ZHREQ will help. Better late than never.

Regards.


[1] http://www.w3.org/International/groups/chinese-layout/charter
[2] https://www.dropbox.com/s/1muxxy6rucb5zyz/yosemiteimedot.png

Bobby


> On Dec 11, 2014, at 11:09 PM, Ken Lunde <[email protected]> wrote:
> 
> Bobby,
> 
> Allow me to insert a few comments about this particular issue.
> 
> 
> About the middle dot in Traditional Chinese, based on the exchange between 
> Addison and me yesterday, both U+30FB and U+FF0E must be removed from the 
> equation, because the former has strong ties to Japanese-only usage (and 
> because Chinese fonts may not include a glyph for this character) and the 
> latter is a full-stop (aka period) that happens to be centered within the 
> em-box for Traditional Chinese use.
> 
> 
> If you examine Big Five and the near identical CNS 11643 Planes 1 and 2, 
> 0xA150 is grouped together with the so-called "small punctuation" whose 
> practical use still escapes me and most others. These "small" characters are 
> in Unicode starting from U+FE50:
> 
>  http://www.unicode.org/charts/PDF/UFE50.pdf
> 
> In fact, the gap at U+FE53 would correspond to 0xA150 because the characters 
> are in Big Five order (and are in Unicode only because they were in Big 
> Five), so when Unicode Version 1.1 was compiled, U+FE53 was likely occupied 
> in a draft version, then removed, possibly to unify with U+00B7.
> 
> Because the usage of these "small" characters is unclear, I would put less 
> emphasis on it and the use of U+00B7. Instead, the more common middle dot 
> would be 0xA145, which corresponds to U+2027 (according to your notes below), 
> but I think that U+2022 is the better mapping.
> 
> The U+2022 versus U+2027 mapping difference is likely yet another platform 
> (Windows versus OS X) difference in the treatment of some punctuation and 
> other symbols. I suggest that someone use the standard Traditional Chinese 
> IME on both platforms to input the character for 0xA145, then inspect which 
> Unicode character was emitted into the document.
> 
> My recommendation would thus be for Traditional Chinese fonts to include 
> full-width versions of U+00B7 (mainly for compatibility reasons), U+2022, and 
> U+2027. The latter two are to compensate for platform mapping differences, 
> and I would consider them to be much more important than U+00B7 in a 
> Traditional Chinese context.
> 
> Source Han Sans (and Noto Sans CJK), the Pan-CJK typeface family that we 
> released earlier this year, do not conform to the above recommendation, but I 
> can implement these recommendations for Traditional Chinese fonts and font 
> instances in the Version 1.002 update that I am planning for the early part 
> of 2015. This would affect only U+00B7 (it is currently proportional) and 
> U+2022 (ditto). U+2027 is a full-width middle dot.
> 
> Regards...
> 
> -- Ken
> 
>> On Dec 10, 2014, at 6:23 AM, Bobby Tung <[email protected]> wrote:
>> 
>> Hello,
>> 
>> I found a problem with middle dot usage in Traditional Chinese.
>> 
>> --Usage
>> 
>> The middle dot in Traditional Chinese has 3 usages, listed below: 
>> 
>> 1, separates the parts of a transliterated Latin name in Hanzi, e.g. 理查・石田
>> 
>> 2, as the decimal point in Hanzi numerals, e.g. 三・一四
>> 
>> 3, separates book, chapter, and title, e.g. 詩經・魏風・碩鼠
>> 
>> In Traditional Chinese, the middle dot should be full-width: a filled 
>> round dot in the middle.
>> 
>> --Codepoint
>> 
>> Several code points are in general use for the middle dot in Traditional 
>> Chinese.
>> 
>> ·    U+00B7  MIDDLE DOT
>> ‧    U+2027  HYPHENATION POINT
>> ・    U+30FB  KATAKANA MIDDLE DOT
>> ．    U+FF0E  FULLWIDTH FULL STOP
>> 
>> And in Simplified Chinese usage, the middle dot is U+00B7.
>> 
>> U+00B7 comes from A150 and U+2027 from A145 in the Big Five code table [1]. 
>> 
>> But I think the definition of U+00B7 is more suitable for the middle dot 
>> than that of U+2027 or U+FF0E. 
>> 
>> --Solutions
>> 
>> Considering interoperability and the code point definitions, I have 2 
>> proposals.
>> 
>> 1. Use U+00B7 as the general middle dot; if authors want it full-width, 
>> they use U+30FB. But most Chinese fonts do not have the glyph, so it will 
>> certainly fall back to a Japanese font. [2]
>> 
>> 2. Use U+00B7 as the general middle dot, and in the Traditional Chinese 
>> subset, make the glyph full-width. 
>> 
>> 
>> =====
>> 
>> 
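>> Everyone, I have found that middle dot usage in Traditional Chinese is 
>> quite chaotic. I would like to settle a standard while writing the Chinese 
>> layout requirements, so I am putting forward two proposals.
>> 
>> First, how the "connector mark" (連接號, formerly called the syllable mark, 
>> 音節號) is used in Traditional Chinese:
>> 
>> 1, To separate the given name and family name of a transliterated name, 
>> e.g. 理查・石田
>> 
>> 2, As the decimal point in Hanzi numerals, e.g. 三・一四
>> 
>> 3, To separate book, chapter, and work titles, e.g. 詩經・魏風・碩鼠
>> 
>> In Traditional Chinese usage, this mark should be full-width: a centered, 
>> filled dot.
>> 
>> Next, in actual documents, four code points turn out to be the most 
>> commonly used:
>> 
>> ·    U+00B7  MIDDLE DOT
>> ‧    U+2027  HYPHENATION POINT
>> ・    U+30FB  KATAKANA MIDDLE DOT
>> ．    U+FF0E  FULLWIDTH FULL STOP
>> 
>> Simplified Chinese uniformly uses U+00B7, and U+00B7 comes from A150 in 
>> Big Five. I think the definition of U+00B7 better matches the actual 
>> usage, so I am not considering U+2027 or U+FF0E.
>> 
>> The proposals are therefore as follows:
>> 
>> 1, Use U+00B7 as the standard middle dot; if authors want it full-width, 
>> they use U+30FB. However, because many Chinese fonts do not include a 
>> glyph for this code point, it will almost certainly fall back to a 
>> Japanese font.
>> 
>> 2, Use U+00B7 as the standard middle dot, but design it as full-width in 
>> Traditional Chinese fonts.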
>> 
>> 
>> [1]: http://www.khngai.com/chinese/charmap/tblbig.php?page=0
>> [2]: http://www.unicode.org/reports/tr11/
>> 
>> 
>> 
>> WANDERER Digital Publishing Inc.
>> Bobby Tung @bobtung
>> Mobile:+886-975068558
>> [email protected]
>> http://wanderer.tw
>> 
> 

