iffyio commented on code in PR #1723: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1723#discussion_r1951078414
########## src/ast/operator.rs: ########## @@ -53,6 +53,16 @@ pub enum UnaryOperator { PGAbs, /// Unary logical not operator: e.g. `! false` (Hive-specific) BangNot, + /// `#` Number of points in path or polygon (PostgreSQL/Redshift geometric operator), e.g. `# path '((1,0),(0,1),(-1,0))'` Review Comment: Could we include a link to the docs where the operators are described? that would be helpful for folks that come across it and need to find more context ########## src/ast/operator.rs: ########## @@ -77,13 +92,13 @@ impl fmt::Display for UnaryOperator { #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))] #[cfg_attr(feature = "visitor", derive(Visit, VisitMut))] pub enum BinaryOperator { - /// Plus, e.g. `a + b` + /// `+` Addition, also Translation (PostgreSQL/Redshift geometric operator), e.g. `box '((0,0),(1,1))' + point '(2.0,0)'` Plus, - /// Minus, e.g. `a - b` + /// `-` Subtraction, Translation (PostgreSQL/Redshift geometric operator), e.g. `box '((0,0),(1,1))' - point '(2.0,0)'` Minus, - /// Multiply, e.g. `a * b` + /// `*` Multiplication, also Scaling (PostgreSQL/Redshift geometric operator), e.g. `box '((0,0),(1,1))' * point '(2.0,0)'` Multiply, - /// Divide, e.g. `a / b` + /// `/` Division, also Scaling (PostgreSQL/Redshift geometric operator), e.g. `box '((0,0),(2,2))' / point '(2.0,0)'` Review Comment: Can we skip this diff? I'm thinking since these operators are fundamental and apply to various scenarios, no need to mention specific dialect semantics given neither the parser nor AST handles them specially as a result ########## src/keywords.rs: ########## @@ -141,6 +141,7 @@ define_keywords!( BOOL, BOOLEAN, BOTH, + BOX, // PostgreSQL/Redshift geometry type, rectangular box Review Comment: same lets drop these comments, the keywords will likely be applicable in other contexts ########## src/parser/mod.rs: ########## @@ -1350,7 +1351,15 @@ impl<'a> Parser<'a> { Ok(Some(expr)) => Ok(expr), // No expression prefix associated with this word - Ok(None) => Ok(self.parse_expr_prefix_by_unreserved_word(&w, span)?), + Ok(None) => { + if dialect_of!(self is PostgreSqlDialect | RedshiftSqlDialect) + && GEOMETRIC_TYPES.contains(&w.keyword) + { + self.parse_geometric_type(w.keyword) Review Comment: could this be its own branch within `parse_expr_prefix_by_reserved_word` instead? It looks like we should be able to do something like ```rust kw if self.dialect.supports_pg_geometric_types() && PG_GEOMETRIC_TYPES.contains(kw) => { self.parse_geometric_expr(kw) } ``` ########## tests/sqlparser_postgres.rs: ########## @@ -2057,13 +2057,13 @@ fn parse_pg_custom_binary_ops() { // Here, we test the ones used by common extensions let operators = [ // PostGIS - "&&&", // n-D bounding boxes intersect - "&<", // (is strictly to the left of) - "&>", // (is strictly to the right of) + "&&&", // n-D bounding boxes intersect + //"&<", // (is strictly to the left of) // Same as OverlapsToLeft, causes duplicate failure + //"&>", // (is strictly to the right of) // Same as OverlapsToRight, causes duplicate failure "|=|", // distance between A and B trajectories at their closest point of approach "<<#>>", // n-D distance between A and B bounding boxes - "|>>", // A's bounding box is strictly above B's. - "~=", // bounding box is the same + //"|>>", // A's bounding box is strictly above B's. // Same as IsStrictlyAbove, causes duplicate failure + //"~=", // bounding box is the same // Same as SameAs, causes duplicate failure Review Comment: not sure I followed why these are commented out, what do we mean by causes duplicate failure? ########## src/ast/operator.rs: ########## @@ -253,6 +268,40 @@ pub enum BinaryOperator { /// Specifies a test for an overlap between two datetime periods: /// <https://jakewheat.github.io/sql-overview/sql-2016-foundation-grammar.html#overlaps-predicate> Overlaps, + /// `##` Point of closest proximity (PostgreSQL/Redshift geometric operator), e.g. `point '(0,0)' ## lseg '((2,0),(0,2))'` + PointOfClosestProximity, Review Comment: hmm actually thinking about it, I feel like the naming of the operators might become problematic, e.g. if another dialect supports `##` operator then calling it `PointOfClosestProximity` would be confusing vs if the operator was called `DoubleHash` dialect agnostic. Would it make sense to name the operators in such a manner? i.e. by their characters or if theres a well known description that isnt describing the pg behavior ########## src/ast/operator.rs: ########## @@ -53,6 +53,16 @@ pub enum UnaryOperator { PGAbs, /// Unary logical not operator: e.g. `! false` (Hive-specific) BangNot, + /// `#` Number of points in path or polygon (PostgreSQL/Redshift geometric operator), e.g. `# path '((1,0),(0,1),(-1,0))'` + NumOfPoints, + /// `@-@` Length or circumference (PostgreSQL/Redshift geometric operator), e.g. `@-@ path '((0,0),(1,0))'` + LenOfCircumference, Review Comment: the doc says length _or_ circumference, is the `Of` a typo? ########## src/parser/mod.rs: ########## @@ -1411,6 +1420,33 @@ impl<'a> Parser<'a> { ), }) } + tok @ Token::Sharp + | tok @ Token::AtDashAt + | tok @ Token::AtAt + | tok @ Token::QuestionMarkDash + | tok @ Token::QuestionPipe + if dialect_of!(self is PostgreSqlDialect | RedshiftSqlDialect) => Review Comment: Can we replace these with a dialect method? the `dialect_of` has been deprecated, we can introduce a method like `supports_pg_geometric_types()` or similar ########## src/test_utils.rs: ########## @@ -284,6 +284,17 @@ where dialects } +pub fn only_psql_redshift() -> TestedDialects { + TestedDialects { + dialects: vec![ + Box::new(PostgreSqlDialect {}), + Box::new(RedshiftSqlDialect {}), + ], + options: None, + recursion_limit: None, + } +} + Review Comment: by switching to the dialect method, it would be possible to reuse `all_dialects_where` in place of this ########## src/parser/mod.rs: ########## @@ -1496,6 +1532,33 @@ impl<'a> Parser<'a> { } } + fn parse_geometric_type(&mut self, keyword: Keyword) -> Result<Expr, ParserError> { + let token_with_span = self.next_token(); // Get the full TokenWithSpan + let value = match token_with_span.token { + // Extract the Token field + Token::SingleQuotedString(s) => Value::SingleQuotedString(s), + _ => return self.expected("SingleQuotedString", token_with_span), // Pass full TokenWithSpan + }; Review Comment: ```suggestion let token_with_span = self.next_token(); let value = match token_with_span.token { Token::SingleQuotedString(s) => Value::SingleQuotedString(s), _ => return self.expected("SingleQuotedString", token_with_span), }; ``` thinking we can drop the comments in this case since the code is as descriptive ########## src/ast/operator.rs: ########## @@ -253,6 +268,40 @@ pub enum BinaryOperator { /// Specifies a test for an overlap between two datetime periods: /// <https://jakewheat.github.io/sql-overview/sql-2016-foundation-grammar.html#overlaps-predicate> Overlaps, + /// `##` Point of closest proximity (PostgreSQL/Redshift geometric operator), e.g. `point '(0,0)' ## lseg '((2,0),(0,2))'` + PointOfClosestProximity, + /// `<->` Distance between (PostgreSQL/Redshift geometric operator), e.g. `circle '((0,0),1)' <-> circle '((5,0),1)'` + DistanceBetween, + /// `&&` Overlaps? (PostgreSQL/Redshift geometric operator), e.g. box '((0,0),(1,1))' && box '((0,0),(2,2))' Review Comment: are we missing a variant for this doc comment? ########## src/ast/operator.rs: ########## @@ -121,11 +136,11 @@ pub enum BinaryOperator { MyIntegerDivide, /// Support for custom operators (such as Postgres custom operators) Custom(String), - /// Bitwise XOR, e.g. `a # b` (PostgreSQL-specific) + // `#` PostgreSQL Bitwise XOR, also Intersection (PostgreSQL/Redshift geometric operator), e.g. `box '((1,-1),(-1,1))' # box '((1,1),(-2,-2))'` PGBitwiseXor, - /// Bitwise shift left, e.g. `a << b` (PostgreSQL-specific) + /// `<<` PostgreSQL Bitwise Shift Left, also Left of? (PostgreSQL/Redshift geometric operator), e.g. circle '((0,0),1)' << circle '((5,0),1)' PGBitwiseShiftLeft, - /// Bitwise shift right, e.g. `a >> b` (PostgreSQL-specific) + /// `>>` PostgreSQL Bitwise Shift Right, Right of? (PostgreSQL/Redshift geometric operator), e.g. circle '((5,0),1)' >> circle '((0,0),1)' Review Comment: actually same with these, I think just mentioning the operator suffices, without going into semantics -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org