Viicos opened a new issue, #2236:
URL: https://github.com/apache/datafusion-sqlparser-rs/issues/2236

   https://github.com/apache/datafusion-sqlparser-rs/pull/235 introduced support for HiveQL, and modified how CTEs are parsed to [parse an additional `FROM` keyword](https://github.com/apache/datafusion-sqlparser-rs/commit/9d9d681cbabf31d1d07ad166da7a1fd87d07d960#diff-4a04259da480a6b794a2e947e4cc03eff4d1aa9330836f5b91cac68c5398193fR2182-L2173).
   
   A user [raised some questions](https://github.com/apache/datafusion-sqlparser-rs/pull/235#issuecomment-1189199817) on the PR. Judging by the documentation links provided there, HiveQL allows `FROM` directly after a CTE, but it is unclear what for.
   [This link](https://docs-archive.cloudera.com/HDPDocuments/HDP3/HDP-3.0.1/using-hiveql/content/hive_create_a_table_using_a_cte.html) shows an example that inserts from a CTE, and [this one](https://cwiki.apache.org/confluence/display/Hive/Common+Table+Expression) shows a `SELECT` statement using the `FROM`-first variant (but it also seems like the Hive dialect doesn't have [`supports_from_first_select()`](https://github.com/apache/datafusion-sqlparser-rs/blob/d9b53a0cdb369124d9b6ce6237959e66bad859af/src/dialect/mod.rs#L640-L649)?).
   
   The issue is that with the generic dialect (or any dialect supporting `FROM`-first selects), the CTE consumes the `FROM` keyword that belongs to the outer query, e.g.:
   
   ```sql
   WITH test AS (FROM t SELECT a) FROM test SELECT a
   ```
   
   The AST looks like (reduced for visibility):
   
   ```rs
   Query(
       Query {
           with: Some(
               With {
                   with_token: TokenWithSpan {
                       token: Word(
                           Word {
                               value: "WITH",
                               quote_style: None,
                               keyword: WITH,
                           },
                       ),
                       span: Span(Location(1,1)..Location(1,5)),
                   },
                   recursive: false,
                   cte_tables: [
                       Cte {
                           alias: TableAlias {
                               name: Ident {
                                   value: "test",
                                   quote_style: None,
                                   span: Span(Location(1,6)..Location(1,10)),
                               },
                               columns: [],
                           },
                           query: Query {
                               with: None,
                               body: Select(
                                   Select {
                                       select_token: Some(
                                           TokenWithSpan {
                                               token: Word(
                                                   Word {
                                                       value: "SELECT",
                                                       quote_style: None,
                                                       keyword: SELECT,
                                                   },
                                               ),
                                               span: Span(Location(1,22)..Location(1,28)),
                                           },
                                       ),
                                       projection: [
                                           UnnamedExpr(
                                               Identifier(
                                                   Ident {
                                                       value: "a",
                                                       quote_style: None,
                                                       span: Span(Location(1,29)..Location(1,30)),
                                                   },
                                               ),
                                           ),
                                       ],
                                       from: [
                                           TableWithJoins {
                                               relation: Table {
                                                   name: ObjectName(
                                                       [
                                                           Identifier(
                                                               Ident {
                                                                   value: "t",
                                                                   quote_style: None,
                                                                   span: Span(Location(1,20)..Location(1,21)),
                                                               },
                                                           ),
                                                       ],
                                                   ),
                                               },
                                           },
                                       ],
                                       flavor: FromFirst,
                                   },
                               ),
                           },
                           from: Some(  // CTE parsed the FROM
                               Ident {
                                   value: "test",
                                   quote_style: None,
                                   span: Span(Location(1,37)..Location(1,41)),
                               },
                           ),
                           closing_paren_token: TokenWithSpan {
                               token: RParen,
                               span: Span(Location(1,30)..Location(1,31)),
                           },
                       },
                   ],
               },
           ),
           body: Select(
               Select {
                   select_token: Some(
                       TokenWithSpan {
                           token: Word(
                               Word {
                                   value: "SELECT",
                                   quote_style: None,
                                   keyword: SELECT,
                               },
                           ),
                           span: Span(Location(1,42)..Location(1,48)),
                       },
                   ),
                   from_token: None,  // The actual SELECT query doesn't have the FROM
                   projection: [
                       UnnamedExpr(
                           Identifier(
                               Ident {
                                   value: "a",
                                   quote_style: None,
                                   span: Span(Location(1,49)..Location(1,50)),
                               },
                           ),
                       ),
                   ],
                   from: [],  // and no FROM available
                   flavor: Standard,
               },
           ),
       },
   )
   ```
   I think the simplest fix (although not ideal according to https://github.com/apache/datafusion-sqlparser-rs/issues/1430) would be to parse the `FROM` keyword after a CTE only when the current dialect is Hive.
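
To make the proposed gating concrete, here is a minimal, self-contained sketch. It is a toy model, not the crate's parser: `Dialect`, `Cte`, `cte_from_allowed` and `parse_cte_tail` are hypothetical names invented for illustration, and the tokens are plain strings rather than real `sqlparser` tokens. It only models the decision made right after the CTE's closing parenthesis.

```rust
// Toy model only: none of these types or functions come from sqlparser-rs.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Dialect {
    Hive,
    Generic,
}

#[derive(Debug, PartialEq)]
struct Cte {
    alias: String,
    // Mirrors the `from` field seen in the AST dump above.
    from: Option<String>,
}

/// Hypothetical gate: only Hive may consume a `FROM <ident>` after a CTE.
fn cte_from_allowed(dialect: Dialect) -> bool {
    matches!(dialect, Dialect::Hive)
}

/// Handles the tokens that follow a CTE's closing parenthesis.
/// Returns the CTE plus the tokens left over for the outer query.
fn parse_cte_tail<'a>(
    alias: &str,
    rest: &'a [&'a str],
    from_allowed: bool,
) -> (Cte, &'a [&'a str]) {
    match rest {
        // `FROM <ident>` is attached to the CTE only when the gate is open.
        [kw, ident, tail @ ..] if from_allowed && kw.eq_ignore_ascii_case("FROM") => (
            Cte { alias: alias.to_string(), from: Some(ident.to_string()) },
            tail,
        ),
        // Otherwise every remaining token stays with the outer query.
        _ => (Cte { alias: alias.to_string(), from: None }, rest),
    }
}

fn main() {
    // Tokens after `)` in: WITH test AS (FROM t SELECT a) FROM test SELECT a
    let rest = ["FROM", "test", "SELECT", "a"];

    for dialect in [Dialect::Generic, Dialect::Hive] {
        let (cte, outer) = parse_cte_tail("test", &rest, cte_from_allowed(dialect));
        println!("{:?}: cte = {:?}, outer query tokens = {:?}", dialect, cte, outer);
    }
}
```

With the gate closed (generic dialect), `FROM test` is left for the outer query, so a `FROM`-first select can be parsed from it. With the gate open (Hive), the CTE keeps the `FROM`, which matches the behavior introduced for HiveQL.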
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

