Hi Santiago, Samuel,

> The upload of gettext 0.21 for Debian unstable has made package "dasher",
> maintained by Samuel Thibault (in Cc), not to build anymore, as reported here
> by Lucas Nussbaum:
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=978315
>
> We are not sure where is exactly the problem (either "dasher" or "gettext").
>
> In short: xgettext seems to parse and complain about UTF conformance
> of strings even if they are not marked for translation.
>
> Here is a minimal test case provided by Samuel:
>
> ----- Begin forwarded message -----
>
> € cat test.c
>
> #include <wchar.h>
>
> void f(const wchar_t *str) { }
>
> void g(void) {
>   f(L"\xABCDFF");
> }
>
>
> € xgettext test.c
> xgettext: x-c.c:1666: phase5_get: Assertion `UNICODE_VALUE (c) >= 0 &&
> UNICODE_VALUE (c) < 0x110000' failed.
>
> Samuel
>
> ----- End forwarded message -----

This behaviour was introduced in gettext 0.20, with the ability to grok
C11 and C++11 string literals.

In the next gettext release, functions like 'f' (which take a
'const wchar_t *' argument) can be designated as gettext-like functions,
for which the argument needs to be extracted and put into the POT file.
For this, it must be possible to convert it to UTF-8.

The assertion could be converted to a reasonable error message, sure.

Having a reasonable error message (with line number) *and* emitting this
error message only when the string actually gets extracted would make
xgettext more complex.

Since Samuel says:

  ... the file that poses problem is
  Testing/gtest/test/gtest_unittest.cc
  This is not something that contains anything to be translated, we'd
  need some option to just ignore Testing/ entirely.

this looks like the better option.

Bruno
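
P.S. To illustrate why the literal in the test case cannot be converted to
UTF-8: the escape \xABCDFF denotes the value 0xABCDFF, which lies above the
largest Unicode code point, 0x10FFFF. The following is only a minimal sketch
of that range test and of a UTF-8 encoder that depends on it. It is not the
xgettext code; the helper names in_unicode_range and encode_utf8 are
hypothetical, and surrogate values are deliberately not rejected because the
assertion quoted above does not reject them either.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of gettext): the same range test as in the
   failed assertion, value >= 0 && value < 0x110000. */
static bool in_unicode_range(int32_t value)
{
    return value >= 0 && value < 0x110000;
}

/* Hypothetical helper: encode one value as UTF-8 into buf (4 bytes).
   Returns the number of bytes written, or 0 if the value is out of range.
   Assumption: surrogates (0xD800..0xDFFF) are not treated specially here. */
static size_t encode_utf8(int32_t value, unsigned char buf[4])
{
    assert(buf != NULL);

    if (!in_unicode_range(value))
        return 0;                       /* e.g. 0xABCDFF ends up here */

    if (value < 0x80) {
        buf[0] = (unsigned char) value;
        return 1;
    }
    if (value < 0x800) {
        buf[0] = (unsigned char) (0xC0 | (value >> 6));
        buf[1] = (unsigned char) (0x80 | (value & 0x3F));
        return 2;
    }
    if (value < 0x10000) {
        buf[0] = (unsigned char) (0xE0 | (value >> 12));
        buf[1] = (unsigned char) (0x80 | ((value >> 6) & 0x3F));
        buf[2] = (unsigned char) (0x80 | (value & 0x3F));
        return 3;
    }
    buf[0] = (unsigned char) (0xF0 | (value >> 18));
    buf[1] = (unsigned char) (0x80 | ((value >> 12) & 0x3F));
    buf[2] = (unsigned char) (0x80 | ((value >> 6) & 0x3F));
    buf[3] = (unsigned char) (0x80 | (value & 0x3F));
    return 4;
}
```

With these helpers, encode_utf8 (0xABCDFF, buf) returns 0, whereas a value
such as 0x20AC is encoded in three bytes.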