I am using Spark 3.3.1. Here is the sql_string that queries a ds-partitioned table:

```
SELECT
  '2024-09-09' AS ds,
  AVG(v1) AS avg_v1,
  AVG(v2) AS avg_v2,
  AVG(v3) AS avg_v3
FROM schema.t1
WHERE ds = '2024-09-09'
GROUP BY 1
```

If I pass the sql_string directly to spark.sql(sql_string), it executes without issue.

If I instead pass the string to the Catalyst parser, here is the string representation of the resulting logical plan:
```
Aggregate [1], [2024-09-09 AS ds#164, 'AVG('v1) AS avg_v1#165, 'AVG('v2) AS avg_v2#166, 'AVG('v3) AS avg_v3#167]
+- 'Filter ('ds = 2024-09-09)
   +- 'UnresolvedRelation [schema, t1], [], false
```
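
For reference, I get the logical plan from the parser roughly like this (a sketch, assuming the standard session parser; `sql_string` is the query above):

```
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan

// Parse the raw SQL into an unresolved logical plan; no analysis happens here,
// which is why the relation and functions are still 'Unresolved* in the output.
val logical_plan: LogicalPlan =
  sparkSession.sessionState.sqlParser.parsePlan(sql_string)
```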

I then want to execute the logical plan:
```
val tracker = new QueryPlanningTracker()
// Analyze the logical plan
val analyzedPlan =
  sparkSession.sessionState.analyzer.executeAndTrack(logical_plan, tracker)
// Optimize the analyzed plan
val optimizedPlan =
  sparkSession.sessionState.optimizer.executeAndTrack(analyzedPlan, tracker)
```

This throws:
>[GROUP_BY_POS_OUT_OF_RANGE] GROUP BY position 0 is not in select list (valid range is [1, 4])
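
For comparison, here is a sketch of what I am trying to reproduce by hand, assuming the same session: letting `sessionState.executePlan` drive the whole pipeline from the parsed plan instead of invoking the analyzer and optimizer individually.

```
// QueryExecution runs the full pipeline lazily: analysis, then optimization,
// then physical planning, each triggered on first access.
val qe = sparkSession.sessionState.executePlan(logical_plan)

val analyzed  = qe.analyzedPlan   // triggers analysis
val optimized = qe.optimizedPlan  // triggers optimization
```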


-- 
     Yours
     Rommel
