I know the Unicode ranges for CJK are messy
(https://github.com/go-cc/cc-table/tree/master/text/info/UnicodeCJK),
so I was glad to find the `unicode.Han` range definition in Go.
My questions are:

- Is there also a range definition for CJK as a whole?
- Does unicode.Han take into account the many separate ranges that Chinese
  characters occupy in Unicode?
  (https://stackoverflow.com/a/1366113/2125837)

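I wrote a small program to test unicode.Han,
https://github.com/suntong/lang/blob/master/lang/Go/src/text/unicode/unicodeHan.go
and here is its output. Each pair of lines probes the code points just
before, at, and just after the start and the end of one block: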

\u2E7F: false   \u2E80: true    \u2E81: true
\u2EFE: false   \u2EFF: false   \u2F00: true

\u2FFF: false   \u3000: false   \u3001: false
\u303E: false   \u303F: false   \u3040: false

\u31FF: false   \u3200: false   \u3201: false
\u32FE: false   \u32FF: false   \u3300: false

\u32FF: false   \u3300: false   \u3301: false
\u33FE: false   \u33FF: false   \u3400: true

\u33FF: false   \u3400: true    \u3401: true
\u4DBE: false   \u4DBF: false   \u4DC0: false

\u4DFF: false   \u4E00: true    \u4E01: true
\u9FFE: false   \u9FFF: false   \uA000: false

\uF8FF: false   \uF900: true    \uF901: true
\uFAFE: false   \uFAFF: false   \uFB00: false

\uFE2F: false   \uFE30: false   \uFE31: false
\uFE4E: false   \uFE4F: false   \uFE50: false

\u1FFFF: false  \u20000: true   \u20001: true
\u2A6DE: false  \u2A6DF: false  \u2A6E0: false

\u2F7FF: false  \u2F800: true   \u2F801: true
\u2FA1E: false  \u2FA1F: false  \u2FA20: false
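
A boundary test of this kind can be sketched as follows. This is a minimal
sketch and not the linked program itself: the `probe` helper and the `bounds`
list are illustrative names, and the labels are printed in `U+XXXX` form
rather than the `\uXXXX` form shown above. The exact true/false values at the
upper boundaries may depend on the Unicode version bundled with the Go release.

```go
package main

import (
	"fmt"
	"unicode"
)

// probe formats one code point together with its unicode.Han membership.
func probe(r rune) string {
	return fmt.Sprintf("U+%04X: %t", r, unicode.Is(unicode.Han, r))
}

func main() {
	// Inclusive start/end of a few of the blocks under test (illustrative subset).
	bounds := [][2]rune{
		{0x2E80, 0x2EFF},   // CJK Radicals Supplement
		{0x3000, 0x303F},   // CJK Symbols and Punctuation
		{0x4E00, 0x9FFF},   // CJK Unified Ideographs
		{0x20000, 0x2A6DF}, // CJK Unified Ideographs Extension B
	}
	for _, b := range bounds {
		// Three code points around the start, then three around the end.
		for _, edge := range b {
			for r := edge - 1; r <= edge+1; r++ {
				fmt.Printf("%s\t", probe(r))
			}
			fmt.Println()
		}
		fmt.Println()
	}
}
```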


This output does not agree with the list at
http://jrgraphix.net/research/unicode.php,
specifically:

       2E80 - 2EFF     CJK Radicals Supplement
       3000 - 303F     CJK Symbols and Punctuation
       3200 - 32FF     Enclosed CJK Letters and Months
       3300 - 33FF     CJK Compatibility
       3400 - 4DBF     CJK Unified Ideographs Extension A
       4E00 - 9FFF     CJK Unified Ideographs
       F900 - FAFF     CJK Compatibility Ideographs
       FE30 - FE4F     CJK Compatibility Forms
       20000 - 2A6DF   CJK Unified Ideographs Extension B
       2F800 - 2FA1F   CJK Compatibility Ideographs Supplement

-- 
https://github.com/go-cc/cc-table/tree/master/text/info/UnicodeCJK#unicode-character-ranges-for-cjk
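
One way to quantify the mismatch is to count, for each listed block, how many
of its code points unicode.Han reports as true. The following is only a
sketch under that assumption: the `block` type and the `countHan` helper are
hypothetical names introduced here, and the counts it prints will vary with
the Unicode version of the Go release used.

```go
package main

import (
	"fmt"
	"unicode"
)

// block is a named, inclusive code point range taken from the list above.
type block struct {
	name   string
	lo, hi rune
}

// countHan returns how many code points in [lo, hi] are in unicode.Han.
func countHan(lo, hi rune) int {
	n := 0
	for r := lo; r <= hi; r++ {
		if unicode.Is(unicode.Han, r) {
			n++
		}
	}
	return n
}

func main() {
	blocks := []block{
		{"CJK Radicals Supplement", 0x2E80, 0x2EFF},
		{"CJK Symbols and Punctuation", 0x3000, 0x303F},
		{"Enclosed CJK Letters and Months", 0x3200, 0x32FF},
		{"CJK Compatibility", 0x3300, 0x33FF},
		{"CJK Unified Ideographs Extension A", 0x3400, 0x4DBF},
		{"CJK Unified Ideographs", 0x4E00, 0x9FFF},
		{"CJK Compatibility Ideographs", 0xF900, 0xFAFF},
		{"CJK Compatibility Forms", 0xFE30, 0xFE4F},
		{"CJK Unified Ideographs Extension B", 0x20000, 0x2A6DF},
		{"CJK Compatibility Ideographs Supplement", 0x2F800, 0x2FA1F},
	}
	for _, b := range blocks {
		total := int(b.hi-b.lo) + 1
		// Columns: range, Han count / block size, block name.
		fmt.Printf("%05X-%05X  %5d/%5d  %s\n",
			b.lo, b.hi, countHan(b.lo, b.hi), total, b.name)
	}
}
```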

As you can see, I'm really confused, e.g., about which of these sources is
the real "bible".

Can someone give a clear explanation? 

Thanks a lot!
