Reviewers: lemzwerg, Keith, carl.d.sorensen_gmail.com
Message: On 2012/01/01 02:01:11, Keith wrote:
Works nicely.
Showing the input location will probably be very helpful. We probably want to
remove the similar message from lily/misc.cc, because both messages together
are very noisy.
That depends on what the message in lily/misc.cc does. This patch checks only the input to the lexer/parser. There are ways of generating strings for the backend programmatically, however, and this patch does not see those. I have decided to check strings, comments and file names here as well. This means that if you use literal strings as binary containers, or have to encode file names in something other than UTF-8 because of other deficiencies in LilyPond, you'll get complaints.
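For illustration only, here is a minimal C++ sketch of the kind of check discussed here: scanning a string, comment or file name for bytes that are not well-formed UTF-8. It is not the code in the patch. The name first_invalid_utf8 is hypothetical, and the byte ranges follow the usual well-formedness table (no overlong forms, no surrogates, nothing above U+10FFFF).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not taken from the patch.  Returns the offset of
// the byte at which the input stops being well-formed UTF-8, or
// std::string::npos if the whole string is valid.
std::size_t
first_invalid_utf8 (const std::string &s)
{
  const std::size_t n = s.size ();
  std::size_t i = 0;
  while (i < n)
    {
      unsigned char c = s[i];
      if (c <= 0x7F)
        {
          ++i;                  // plain ASCII
          continue;
        }
      std::size_t len = 0;      // total length of the sequence
      unsigned char lo = 0x80;  // allowed range of the second byte
      unsigned char hi = 0xBF;
      if (c >= 0xC2 && c <= 0xDF)
        len = 2;
      else if (c >= 0xE0 && c <= 0xEF)
        {
          len = 3;
          if (c == 0xE0) lo = 0xA0;  // exclude overlong forms
          if (c == 0xED) hi = 0x9F;  // exclude surrogates
        }
      else if (c >= 0xF0 && c <= 0xF4)
        {
          len = 4;
          if (c == 0xF0) lo = 0x90;  // exclude overlong forms
          if (c == 0xF4) hi = 0x8F;  // exclude values above U+10FFFF
        }
      else
        return i;  // stray continuation byte, 0xC0, 0xC1 or 0xF5..0xFF
      for (std::size_t k = 1; k < len; ++k)
        {
          if (i + k >= n)
            return i;           // truncated sequence: report its lead byte
          unsigned char d = s[i + k];
          if (d < lo || d > hi)
            return i + k;       // bad continuation byte
          lo = 0x80;            // later continuation bytes use the full range
          hi = 0xBF;
        }
      i += len;
    }
  return std::string::npos;
}
```

Because the function returns a byte offset rather than a yes/no answer, a caller could in principle turn that offset into a more precise input location for the warning.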
I wish I could think of a way to check the input with a canned regular
expression like <http://flex.sourceforge.net/manual/Identifiers.html#Identifiers>
or the better one with comments at
<http://www.w3.org/International/questions/qa-forms-utf-8>
Doing so seems to require backing up (which probably won't cause any harm), or
maybe I'm just not seeing an easy way.
Our lexer was written with the decision to use non-compressed tables and no backing up. I spent more than a day's worth of work on doing UTF-8 right in the grammar. That's pretty pointless. It also means that we need to provide an error path for every item containing non-UTF-8 characters in order to get a UTF-8 related error message instead of something more mysterious. So I don't think it is really worth the trouble.

Description:
lexer.ll: Warn about non-UTF-8 characters

Making the warnings point to the exact bad byte rather than the enclosing construct would be nice.

Please review this at http://codereview.appspot.com/5505090/

Affected files:
  M lily/include/lily-lexer.hh
  M lily/lexer.ll

_______________________________________________
lilypond-devel mailing list
lilypond-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/lilypond-devel