On 23 Jul 2005, at 09:41, Hieu Le Trung wrote:
> I have a project that uses Bison to parse some files.
> I need to change it to a Unicode version, and I don't know whether
> Bison supports Unicode or not.
The Bison-generated parser works on tokens, so from that point of
view it is a non-issue. You cannot, though, use the character token
form '...' for non-ASCII Unicode characters, which take several bytes
in UTF-8. Instead, declare named tokens with "%token ...". As for the
error messages that Bison writes, at least those in English are in
the 7-bit ASCII subset. If you need error messages in other
languages, you could use UTF-8.
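
To illustrate the point about named tokens, here is a minimal sketch in C of a hand-written scanner step that recognizes the UTF-8 byte sequences of two Unicode characters and returns a token code for each. The names TOK_LAMBDA, TOK_ARROW, TOK_OTHER and next_token are hypothetical, and the numeric codes are placeholders: in a real project Bison would generate them from a declaration such as "%token LAMBDA ARROW".

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token codes, for illustration only. In a real project
   these come from the header Bison generates for "%token" names. */
enum { TOK_EOF = 0, TOK_LAMBDA = 258, TOK_ARROW = 259, TOK_OTHER = 260 };

/* Match one token at *p and advance *p past it.
   Multi-byte characters are matched as whole UTF-8 byte sequences,
   so the parser only ever sees a token code, never raw bytes. */
static int next_token(const char **p) {
    static const struct { const char *bytes; int tok; } table[] = {
        { "\xCE\xBB",     TOK_LAMBDA },  /* U+03BB, two bytes in UTF-8   */
        { "\xE2\x86\x92", TOK_ARROW  },  /* U+2192, three bytes in UTF-8 */
    };
    assert(p != NULL && *p != NULL);
    if (**p == '\0') return TOK_EOF;
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        size_t n = strlen(table[i].bytes);
        if (strncmp(*p, table[i].bytes, n) == 0) {
            *p += n;
            return table[i].tok;
        }
    }
    ++*p;              /* anything else: consume a single byte */
    return TOK_OTHER;
}
```

Calling next_token repeatedly on a UTF-8 string yields one code per recognized character, which is the only thing the grammar needs to refer to.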
The problem is mostly with the lexer. I have started to use Flex by
simply writing the .l file in UTF-8, with the Unicode characters
inside the "..." patterns, thus generating a UTF-8 lexer, and it
seems to work (which it should, unless there is some unforeseen snag
to it). I have also posted, on the Flex list, a Haskell program that
generates Flex-like regular expressions from Unicode character
classes. I think there might be work on Flex support for Unicode, but
I do not know how this work progresses.
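
As a rough sketch of why a UTF-8 .l file yields a UTF-8 lexer: Flex matches bytes, so a Unicode character in a pattern is simply its UTF-8 byte sequence. The C code below (not the Haskell program mentioned above, and handling single code points only, not character classes) encodes a code point and writes it as a byte pattern using Flex's \xHH escapes. The function names utf8_encode and flex_byte_pattern are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Encode one Unicode code point as UTF-8. Returns the byte count (1-4),
   or 0 for surrogates and values above U+10FFFF. */
static int utf8_encode(uint32_t cp, unsigned char out[4]) {
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0;   /* surrogates */
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Write cp as a Flex-style byte pattern such as \xCE\xBB into buf.
   Returns the pattern length, or -1 if cp is invalid or buf is too small. */
static int flex_byte_pattern(uint32_t cp, char *buf, size_t size) {
    unsigned char bytes[4];
    size_t used = 0;
    int n = utf8_encode(cp, bytes);
    assert(buf != NULL);
    if (n == 0) return -1;
    for (int i = 0; i < n; i++) {
        int w = snprintf(buf + used, size - used, "\\x%02X", bytes[i]);
        if (w < 0 || (size_t)w >= size - used) return -1;
        used += (size_t)w;
    }
    return (int)used;
}
```

Typing the character directly into a "..." pattern in a UTF-8 encoded .l file should give Flex the same bytes that the escaped form spells out.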
Hans Aberg
_______________________________________________
Help-bison@gnu.org http://lists.gnu.org/mailman/listinfo/help-bison