romanb commented on code in PR #1803: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1803#discussion_r2037530085
########## src/tokenizer.rs: ########## @@ -895,7 +895,7 @@ impl<'a> Tokenizer<'a> { }; let mut location = state.location(); - while let Some(token) = self.next_token(&mut state)? { + while let Some(token) = self.next_token(&mut state, buf.last().map(|t| &t.token))? { Review Comment: Thank you for the review and the suggestion. The latest commit implements an alternative approach within `Parser#parse_compound_expr`. However, I have to admit that I personally prefer the previous fix in the tokenizer, for a few reasons: 1. The `Tokenizer` is part of the public API of the crate, and without fixing this in the tokenizer, it will continue to exhibit the following behavior with the Mysql dialect: * `t.1to2` tokenizes to `t` (Word), `.1to2` (Word) * `t.1e2` tokenizes to `t` (Word), `.1e2` (Number) * ``t.`1to2` `` tokenizes to `t` (Word), `.` (Period), `1to2` (Word) * ``t.`1e2` `` tokenizes to `t` (Word), `.` (Period) `1e2` (Word) This could be very surprising for users and arguably be considered incorrect (when using the Mysql dialect). 2. The handling of `Word` and `Number` tokens in `parse_compound_expr` is not the most elegant with having to split off the `.` and having to create the correct off-by-one spans accordingly. 3. Unqualified identifiers that start with digits are already handled in the tokenizer - and I think have to be (see https://github.com/apache/datafusion-sqlparser-rs/pull/856). Handling the same problem of misinterpreted tokens for qualified identifiers in the parser seems a bit disconnected and a like a downstream solution to the tokenizer producing incorrect tokens, at least that is how I currently perceive it (see point 1). Let me know which solution you prefer (with or without the latest commit) or if you have another idea. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org