On Wed, Sep 11, 2024 at 1:51 PM Rommel Holmes <rommelhol...@gmail.com> wrote:
> I am using Spark 3.3.1. Here is the sql_string to query a ds-partitioned table:
>
> ```
> SELECT
>   '2024-09-09' AS ds,
>   AVG(v1) AS avg_v1,
>   AVG(v2) AS avg_v2,
>   AVG(v3) AS avg_v3
> FROM schema.t1
> WHERE ds = '2024-09-09'
> GROUP BY 1
> ```
>
> If I pass the sql_string directly into spark.sql(sql_string), it executes
> without issue.
>
> If I pass the string into the Catalyst parser instead, here is the string
> representation of the resulting logical plan:
>
> ```
> Aggregate [1], [2024-09-09 AS ds#164, 'AVG('v1) AS avg_v1#165, 'AVG('v2) AS avg_v2#166, 'AVG('v3) AS avg_v3#167]
> +- 'Filter ('ds = 2024-09-09)
>    +- 'UnresolvedRelation [schema, t1], [], false
> ```
>
> I then want to execute the logical plan:
>
> ```
> val tracker = new QueryPlanningTracker()
> // Analyze the logical plan
> val analyzedPlan =
>   sparkSession.sessionState.analyzer.executeAndTrack(logical_plan, tracker)
> // Optimize the analyzed plan
> val optimizedPlan =
>   sparkSession.sessionState.optimizer.executeAndTrack(analyzedPlan, tracker)
> ```
>
> This throws the error:
>
> [GROUP_BY_POS_OUT_OF_RANGE] GROUP BY position 0 is not in select list
> (valid range is [1, 4])
>
> --
> Yours
> Rommel
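For reference, here is a minimal, self-contained sketch of the parse → analyze → optimize pipeline the question describes, assuming an existing SparkSession (named `spark` here) with `schema.t1` available. The parser entry point `sessionState.sqlParser.parsePlan` is the standard way to obtain the unresolved logical plan shown above; this is a sketch, not a fix for the reported error.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.QueryPlanningTracker

// Assumes a SparkSession is already available and schema.t1 exists.
val spark = SparkSession.builder().getOrCreate()

val sqlString =
  """SELECT
    |  '2024-09-09' AS ds,
    |  AVG(v1) AS avg_v1,
    |  AVG(v2) AS avg_v2,
    |  AVG(v3) AS avg_v3
    |FROM schema.t1
    |WHERE ds = '2024-09-09'
    |GROUP BY 1""".stripMargin

// Parse the SQL text into an unresolved logical plan.
val logicalPlan = spark.sessionState.sqlParser.parsePlan(sqlString)

val tracker = new QueryPlanningTracker()
// Resolve relations, columns, and GROUP BY ordinals.
val analyzedPlan =
  spark.sessionState.analyzer.executeAndTrack(logicalPlan, tracker)
// Apply optimizer rules to the resolved plan.
val optimizedPlan =
  spark.sessionState.optimizer.executeAndTrack(analyzedPlan, tracker)
```

Note that `spark.sql(...)` drives the analyzer through the session's query-execution machinery rather than calling it directly, which is why the two code paths can behave differently for the same SQL text.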