On 23.06.2018 at 20:46, Brian Inglis wrote:
On 2018-06-22 17:06, Takashi Yano wrote:
On Sat, 23 Jun 2018 05:39:27 +0900
Takashi Yano wrote:
I looked into this problem and found that it is caused by an incorrect
return value from iswprint().
I have found the cause: the file categories.t is not correct.

For example, http://ftp.unicode.org/Public/UNIDATA/UnicodeData.txt says:

3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;;
4DB5;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;;
...
4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;
9FEF;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;

However, categories.t is:
     {CAT_Lo, 0x3400, 0},
     {CAT_Lo, 0x4DB5, 0},
...
     {CAT_Lo, 0x4E00, 0},
     {CAT_Lo, 0x9FEA, 0},

Therefore, the script mkcategories which generates categories.t should be fixed.
Obviously. I will check why the script failed here. Thanks for the
patch already.

Why are the categories entries not generated from UnicodeData.txt by an (awk) 
script?
They are generated, with a shell script. They were hard-coded before my
patch.
Why not awk? Because I am not familiar with awk.
These entries change with every Unicode release, and a new one came out a few
weeks ago, updated here yesterday.
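
For illustration, the First/Last handling discussed above can be sketched as
follows. This is a minimal sketch in Python, not the actual mkcategories
script: the names parse_ranges and format_entry are hypothetical, and the
meaning of the third table field (taken here to be last minus first) is an
assumption that should be checked against the code that reads categories.t.
It also does not merge adjacent code points that share a category.

```python
import re

# Matches range markers such as "<CJK Ideograph, First>" and
# "<CJK Ideograph, Last>" in the name field of UnicodeData.txt.
RANGE_NAME = re.compile(r"^<(.+), (First|Last)>$")


def parse_ranges(lines):
    """Return (category, first, last) tuples from UnicodeData.txt lines.

    A First>/Last> pair becomes one tuple spanning the whole range;
    every other line becomes a single-code-point tuple.
    """
    ranges = []
    pending = None  # (range name, category, first code point) of an open First> line
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        code, name, category = int(fields[0], 16), fields[1], fields[2]
        marker = RANGE_NAME.match(name)
        if marker and marker.group(2) == "First":
            pending = (marker.group(1), category, code)
        elif marker and marker.group(2) == "Last":
            if pending is None or pending[0] != marker.group(1):
                raise ValueError(f"unmatched Last> marker at U+{code:04X}")
            ranges.append((pending[1], pending[2], code))
            pending = None
        else:
            ranges.append((category, code, code))
    if pending is not None:
        raise ValueError(f"First> marker without Last>: {pending[0]}")
    return ranges


def format_entry(category, first, last):
    # Hypothetical layout: the third field is ASSUMED to be (last - first).
    return f"    {{CAT_{category}, 0x{first:04X}, {last - first}}},"


sample = """\
3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;;
4DB5;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;;
4E00;<CJK Ideograph, First>;Lo;0;L;;;;;N;;;;;
9FEF;<CJK Ideograph, Last>;Lo;0;L;;;;;N;;;;;
""".splitlines()

for cat, first, last in parse_ranges(sample):
    print(format_entry(cat, first, last))
```

Because the range end is read from the Last> line of whatever UnicodeData.txt
is supplied, a new Unicode release only requires rerunning the generator on
the updated file.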



