Indeed, valid points have been raised, including the potential typo in the new Spark version. In the meantime, I suggest you try some alternative debugging methods:
- Simpler explain(): try the basic explain() or explain("extended"). This might provide a less detailed, but potentially functional, explanation.
- Manual analysis: analyze the query structure and logical steps yourself.
- Spark UI: review the Spark UI (accessible through your Spark application on port 4040) to delve into query execution and potential bottlenecks.

HTH

Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom

view my Linkedin profile <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

https://en.everybodywiki.com/Mich_Talebzadeh

*Disclaimer:* The information provided is correct to the best of my knowledge but of course cannot be guaranteed. It is essential to note that, as with any advice, "one test result is worth one-thousand expert opinions" (Wernher von Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>).

On Wed, 21 Feb 2024 at 08:37, Holden Karau <holden.ka...@gmail.com> wrote:

> Do you mean Spark 3.4? 4.0 is very much not released yet.
>
> Also it would help if you could share your query & more of the logs
> leading up to the error.
>
> On Tue, Feb 20, 2024 at 3:07 PM Sharma, Anup <anu...@amazon.com.invalid>
> wrote:
>
>> Hi Spark team,
>>
>> We ran into a dataframe issue after upgrading from spark 3.1 to 4.
>>
>> query_result.explain(extended=True)
>>   File "…/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py"
>>
>> raise Py4JJavaError(
>> py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.sql.api.python.PythonSQLUtils.explainString.
>> : java.lang.IllegalStateException: You hit a query analyzer bug. Please report your query to Spark user mailing list.
>>     at org.apache.spark.sql.execution.SparkStrategies$Aggregation$.apply(SparkStrategies.scala:516)
>>     at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$1(QueryPlanner.scala:63)
>>     at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
>>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
>>     at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
>>     at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
>>     at org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:72)
>>     at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$3(QueryPlanner.scala:78)
>>     at scala.collection.TraversableOnce$folder$1.apply(TraversableOnce.scala:196)
>>     at scala.collection.TraversableOnce$folder$1.apply(TraversableOnce.scala:194)
>>     at scala.collection.Iterator.foreach(Iterator.scala:943)
>>     at scala.collection.Iterator.foreach$(Iterator.scala:943)
>>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
>>     at scala.collection.TraversableOnce.foldLeft(TraversableOnce.scala:199)
>>     at scala.collect...
>>
>> Could you please let us know if this is already being looked at?
>>
>> Thanks,
>>
>> Anup
>
> --
> Cell : 425-233-8271
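
P.S. A minimal sketch of the "simpler explain()" suggestion at the top of this message, in case it is useful. It assumes a PySpark DataFrame such as query_result from the original report. The helper name try_explain is hypothetical (it is not part of PySpark); it only relies on DataFrame.explain() accepting a mode keyword, and it simply tries each mode in turn until one does not raise. There is no guarantee that any mode will get past the planner error shown in the trace.

```python
def try_explain(df, modes=("simple", "extended", "formatted")):
    """Call df.explain(mode=...) for each mode until one succeeds.

    `df` is assumed to be a PySpark DataFrame (anything exposing an
    explain(mode=...) method works). Returns the first mode that did
    not raise, or None if every mode failed.
    """
    for mode in modes:
        try:
            # DataFrame.explain() prints the plan to stdout and returns None.
            df.explain(mode=mode)
            return mode
        except Exception as exc:  # e.g. py4j.protocol.Py4JJavaError
            print(f"explain(mode={mode!r}) failed: {type(exc).__name__}")
    return None
```

With a live session this would be called as try_explain(query_result). If it returns None, every mode hit an error, and the Spark UI or a manual read of the query is probably the next thing to try.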