At 07:35 27/08/2009, Kieran Beltran wrote:
>I have encountered a problem when attempting to recognize two
>required Standard Z symbols which are "above" the four-hex set
>recognized by my generated lexer. The two symbols are \u1D538 and
>\u1D53D.
[...]
>Is the solution to include a fifth digit to be recognized
>optionally? Could I simply replace line 495 (as below) and add a
>new fragment
>
>'u' ZDIGIT? XDIGIT XDIGIT XDIGIT XDIGIT

No. It also depends on the stream encoding. IIRC the Java target, at least, reads files in as UTF-16, so a single 16-bit character has no "room" to store that fifth digit. Instead, you need to encode the symbol as a surrogate pair: \u1D538, for example, would be encoded as \uD835\uDD38.

I'm not entirely sure how it works in the C target, which uses UTF-32 encoding by default; I've never really needed to use characters that high up.
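
For reference, the surrogate pair can be worked out by hand or with a few lines of Java. The following is only a sketch of the standard UTF-16 arithmetic; the class name SurrogatePairs and the method toEscapes are made up for illustration and are not part of ANTLR or its runtime.

```java
// Sketch: split a code point above U+FFFF into its UTF-16 surrogate pair.
// SurrogatePairs and toEscapes are hypothetical names, not ANTLR API.
final class SurrogatePairs {

    // Returns the pair written as two four-hex-digit escapes.
    static String toEscapes(int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            throw new IllegalArgumentException("code point fits in one 16-bit unit");
        }
        int offset = codePoint - 0x10000;                // 20 significant bits remain
        char high = (char) (0xD800 + (offset >> 10));    // top 10 bits
        char low  = (char) (0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return String.format("\\u%04X\\u%04X", (int) high, (int) low);
    }

    public static void main(String[] args) {
        System.out.println(toEscapes(0x1D538));
        System.out.println(toEscapes(0x1D53D));
    }
}
```

For the two symbols in question this gives \uD835\uDD38 and \uD835\uDD3D, so both fit the existing four-hex-digit escape form when written as two consecutive escapes. The JDK's Character.toChars(int) produces the same two chars if you would rather not do the arithmetic yourself.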