iffyio commented on code in PR #1808: URL: https://github.com/apache/datafusion-sqlparser-rs/pull/1808#discussion_r2040586609
########## src/ast/mod.rs: ########## @@ -8368,6 +8387,22 @@ pub enum CreateFunctionBody { /// /// [BigQuery]: https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#syntax_11 AsAfterOptions(Expr), + /// Function body with statements before the `RETURN` keyword. + /// + /// Example: + /// ```sql + /// CREATE FUNCTION my_scalar_udf(a INT, b INT) + /// RETURNS INT + /// AS + /// BEGIN + /// DECLARE c INT; + /// SET c = a + b; + /// RETURN c; + /// END; + /// ``` + /// + /// [SQL Server]: https://learn.microsoft.com/en-us/sql/t-sql/statements/create-function-transact-sql + MultiStatement(Vec<Statement>), Review Comment: Ah yeah I think if possible it would be good to reuse the same representation as `ConditionalStatements`.Maybe something like? ```rust CreateFunctionBody::AsBeginEnd(BeginEndStatements) ConditionalStatements::BeginEnd(BeginEndStatements) ``` Where we pull out the current representation into its own type, it'll also get to implement display once as a result too ########## src/parser/mod.rs: ########## @@ -580,10 +580,12 @@ impl<'a> Parser<'a> { // `BEGIN` is a nonstandard but common alias for the // standard `START TRANSACTION` statement. It is supported // by at least PostgreSQL and MySQL. + // `BEGIN` is also used for multi-statement blocks in SQL Server Keyword::BEGIN => self.parse_begin(), // `END` is a nonstandard but common alias for the // standard `COMMIT TRANSACTION` statement. It is supported // by PostgreSQL. + // `END` is also used for multi-statement blocks in SQL Server Review Comment: ```suggestion Keyword::BEGIN => self.parse_begin(), ``` Thinking we can skip the comment entirely since they don't add much, it will avoid the dilemma of being inconsistent in the comment by not mentioning all supported dialects/use cases vs having to mention all of them ########## src/parser/mod.rs: ########## @@ -617,6 +619,9 @@ impl<'a> Parser<'a> { } // `COMMENT` is snowflake specific https://docs.snowflake.com/en/sql-reference/sql/comment Keyword::COMMENT if self.dialect.supports_comment_on() => self.parse_comment(), + Keyword::RETURN if dialect_of!(self is MsSqlDialect) => { Review Comment: ```suggestion Keyword::RETURN => { ``` I think its reasonable to have the parser always parse the statement if it shows up. e.g. snowflake and bigquery also support the statement so we'd get support for them automatically as a result ########## src/parser/mod.rs: ########## @@ -5135,6 +5146,63 @@ impl<'a> Parser<'a> { })) } + /// Parse `CREATE FUNCTION` for [SQL Server] + /// + /// [SQL Server]: https://learn.microsoft.com/en-us/sql/t-sql/statements/create-function-transact-sql + fn parse_mssql_create_function( + &mut self, + or_alter: bool, + or_replace: bool, + temporary: bool, + ) -> Result<Statement, ParserError> { + let name = self.parse_object_name(false)?; + + let parse_function_param = + |parser: &mut Parser| -> Result<OperateFunctionArg, ParserError> { + let name = parser.parse_identifier()?; + let data_type = parser.parse_data_type()?; + Ok(OperateFunctionArg { + mode: None, + name: Some(name), + data_type, + default_expr: None, + }) + }; + self.expect_token(&Token::LParen)?; + let args = self.parse_comma_separated0(parse_function_param, Token::RParen)?; + self.expect_token(&Token::RParen)?; + + let return_type = if self.parse_keyword(Keyword::RETURNS) { + Some(self.parse_data_type()?) + } else { + None + }; + + self.expect_keyword_is(Keyword::AS)?; + self.expect_keyword_is(Keyword::BEGIN)?; + let function_body = Some(CreateFunctionBody::MultiStatement(self.parse_statements()?)); Review Comment: Ah if the desired behavior is to have optional semicolon delimited statements for mssql, what we could do would be to make the behavior configurable when [parsing the statements list](https://github.com/apache/datafusion-sqlparser-rs/blob/2457d079daeb564782f8d408c19846fe45080caf/src/parser/mod.rs#L461). We could do something like this? ```rust pub fn parse_statements(&mut self) -> Result<Vec<Statement>, ParserError> { self.parse_statements_inner(true) } fn parse_statements_inner(&mut self, require_semicolon_delimiter) -> Result<Vec<Statement>, ParserError> { // ... same as the body of `parse_statements()` on main // ... I _think_ except for this part to make the behavior configurable if require_semicolon_delimiter && expecting_statement_delimiter { return self.expected("end of statement", self.peek_token()); } // ... } ``` then the mssql code can call the inner function directly ########## src/ast/mod.rs: ########## @@ -3603,13 +3603,14 @@ pub enum Statement { managed_location: Option<String>, }, /// ```sql - /// CREATE FUNCTION + /// CREATE [OR ALTER] FUNCTION Review Comment: ```suggestion /// CREATE FUNCTION ``` similar to `ON REPLACE` etc, I think no need to enumerate the different syntax variants since they differ across dialects ########## src/parser/mod.rs: ########## @@ -5135,6 +5146,63 @@ impl<'a> Parser<'a> { })) } + /// Parse `CREATE FUNCTION` for [SQL Server] + /// + /// [SQL Server]: https://learn.microsoft.com/en-us/sql/t-sql/statements/create-function-transact-sql Review Comment: ```suggestion /// [MsSql]: https://learn.microsoft.com/en-us/sql/t-sql/statements/create-function-transact-sql ``` for this an similar places in the PR maybe we switch to `MsSql` since that's what the dialect is called? vs `Sql Server` ########## src/ast/mod.rs: ########## @@ -4050,6 +4051,16 @@ pub enum Statement { arguments: Vec<Expr>, options: Vec<RaisErrorOption>, }, + /// Return (SQL Server) + /// + /// for Functions: + /// RETURN + /// RETURN scalar_expression + /// RETURN select_statement Review Comment: 1. Yeah, I think either we include support with tests for the `RETURN` statement in this PR or we can remove related new code and defer it to a follow up PR 2. Yeah due to the `Expr` vs `Statement` I suspect an enum would be required, maybe something like? ```rust enum ReturnStatementValue { Expr(Expr), Statement(Statement) } struct ReturnStatement { value: Option<ReturnStatementValue> } Statement::Return(ReturnStatement) ``` ########## src/ast/ddl.rs: ########## @@ -2157,6 +2157,10 @@ impl fmt::Display for ClusteredBy { #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))] #[cfg_attr(feature = "visitor", derive(Visit, VisitMut))] pub struct CreateFunction { + /// Conditionally alters the function only if it already exists. Review Comment: ```suggestion /// True if this is a `CREATE OR ALTER FUNCTION` statement ``` I think something like this would be more descriptive to someone reading the code or using the library? vs describing the database semantics of the flag -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org