On Sun, 2012-07-08 at 08:08 -0700, sungkhum wrote:
> I have two questions: is there a way to have the LibreOffice spelling
> checker (Hunspell) also recognize word-breaks using the ICU break iterator
> for Khmer so that Cambodians no longer have to add zero-width spaces
> manually (as it seems to work for Thai now?)? Currently, lines without
> zero-width spaces are seen as one long word to the spelling checker in
> LibreOffice 3.6. But since the line-breaking is working, it would seem
> breaking words for the spelling checker should also be able to work. Should
> I submit a bug? How should I proceed?

Sounds like a bug really. I mean, hunspell itself generally doesn't do
the parsing of text into words; the app gives each word to hunspell. And
we're *supposed* to be using the icu breakiterator to split words. I
suspect it's a similar bug to this original one.

So... sure, file a bug, assign it to me (caol...@redhat.com), paste a
short two-word example text into the bug, and indicate where the word
break should be. I'll add a regression test for it and see if it's a
trivial fix for Khmer too, now that we're using the latest-and-greatest
icu.

> Also, since many other programs do not incorporate ICU's code, is there a
> way to make the line breaks "real" when a document is saved in another
> format (such as a .doc?). And by "real" I mean that a zero-width space is
> actually added to the text where a line-break should be.

That should at least be theoretically possible, albeit a bit tricky,
seeing as the layout code is the bit that knows the width of the page
and does the line breaking, while the export filters don't get to know
that information. There was something similar done in the past, IIRC, to
pass around soft-page-break information so that export filters could
know where the layout last put the page breaks. I forget the details of
that though.

C.

_______________________________________________
LibreOffice mailing list
LibreOffice@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/libreoffice
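
For illustration only, here is a rough sketch of the two ideas discussed above: the app splits the text into words and hands each one to the spelling checker, and an exporter could make the breaks "real" by writing a zero-width space (U+200B) at each boundary. None of this is LibreOffice, ICU, or Hunspell code. The greedy longest-match `word_breaks` function is an assumed stand-in for ICU's dictionary-based break iterator, the two-word `LEXICON` is a made-up stand-in for a Hunspell dictionary, and all function names are hypothetical.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE

# Hypothetical two-word list standing in for a real dictionary.
LEXICON = {"ភាសា", "ខ្មែរ"}


def word_breaks(text, lexicon=LEXICON):
    """Return the end offset of each word, using greedy longest match.

    This is only a toy stand-in for a dictionary-based break iterator.
    """
    breaks = []
    pos = 0
    longest = max(map(len, lexicon))
    while pos < len(text):
        for size in range(min(longest, len(text) - pos), 0, -1):
            if text[pos:pos + size] in lexicon:
                break
        else:
            size = 1  # no dictionary match: emit a one-character chunk
        pos += size
        breaks.append(pos)
    return breaks


def split_words(text, breaks):
    """Cut text into words at the given end offsets."""
    starts = [0] + breaks[:-1]
    return [text[s:e] for s, e in zip(starts, breaks)]


def misspelled(text, lexicon=LEXICON):
    """Check each word separately, as the app would when calling the checker."""
    words = split_words(text, word_breaks(text, lexicon))
    return [w for w in words if w not in lexicon]


def insert_zwsp(text, breaks):
    """Make the breaks explicit by joining the words with U+200B."""
    return ZWSP.join(split_words(text, breaks))
```

With this sketch, a run of text with no spaces is checked word by word instead of as one long word, and `insert_zwsp(text, word_breaks(text))` gives the same text with explicit break opportunities for programs that do not segment Khmer themselves. Note that it marks every word boundary, not just the places where the layout actually wrapped a line, so it sidesteps the page-width problem instead of solving it.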