Hello,

This is an update of the patches I posted on bug-libunistring two weeks ago:

  https://lists.gnu.org/archive/html/bug-libunistring/2014-10/msg00001.html

Changes from v1:

 - The line-breaking implementation has been updated for LBP_HL and LBP_RI
 - The changes to gen-uninames.lisp are included
 - Some minor cleanup of gen-uni-tables.c

As in the previous post, I didn't include the generated files, which you can generate with:

  http://du-a.org/~ueno/gen-uni-tables.sh

ChangeLog updates are also omitted in order to avoid merge conflicts while testing.

A snapshot libunistring distribution is also available:

  ftp://alpha.gnu.org/gnu/libunistring/libunistring-0.9.5-alpha3.tar.xz

I'd really appreciate it if someone could go through the patches and point out any mistakes before they are committed.

Thanks in advance.

Daiki Ueno (10):
  gen-uni-tables: Minor style fixes
  gen-uni-tables: Check out-of-range values added to 3-level tables
  unictype/joininggroup-of: Switch to 3-level table
  uniwbrk: Ignore Extended/Format at the beginning of the line
  uniwbrk/u32-wordbreaks-tests: Test using WordBreakTest.txt from UCD
  uniname: Make codepoint transformation more flexible
  Update to Unicode 6.1.0
  Update to Unicode 6.2.0
  Update to Unicode 6.3.0
  Update to Unicode 7.0.0

 lib/gen-uni-tables.c                      | 436 ++++++++++++++++++++++++------
 lib/unictype.in.h                         |  37 ++-
 lib/unictype/bidi_byname.gperf            |  12 +
 lib/unictype/joininggroup_byname.gperf    |  59 ++++
 lib/unictype/joininggroup_name.h          |  29 ++
 lib/unictype/joininggroup_of.c            |  29 +-
 lib/unigbrk.in.h                          |   3 +-
 lib/unigbrk/uc-is-grapheme-break.c        |   9 +-
 lib/unilbrk/lbrktables.c                  |  56 ++--
 lib/unilbrk/lbrktables.h                  |  23 +-
 lib/uniname/gen-uninames.lisp             |  86 +++---
 lib/uniname/uniname.c                     | 203 ++++++++------
 lib/uniwbrk.in.h                          |   6 +-
 lib/uniwbrk/u-wordbreaks.h                |  83 ++++--
 lib/uniwbrk/wbrktable.c                   |  52 ++--
 lib/uniwbrk/wbrktable.h                   |   2 +-
 modules/uniwbrk/u32-wordbreaks-tests      |   9 +-
 tests/unigbrk/test-uc-gbrk-prop.c         |   1 +
 tests/unigbrk/test-uc-is-grapheme-break.c |   1 +
 tests/uniwbrk/test-uc-wordbreaks.c        | 181 +++++++++++++
 tests/uniwbrk/test-uc-wordbreaks.sh       |   3 +
 21 files changed, 1009 insertions(+), 311 deletions(-)
 create mode 100644 tests/uniwbrk/test-uc-wordbreaks.c
 create mode 100755 tests/uniwbrk/test-uc-wordbreaks.sh

-- 
1.9.3