Hello, I am working as a Data Scientist and using Spark with Python while developing my models.
Our data model has multiple nested structures (*Ex.* attributes.product.feature.name). For the first two-level, I can read them as the following with PySpark: *data.where(col('attributes.product') != "")* However, I cannot read the three-level nested structure: *data.where(col('attributes.product.feature') != "")* I was hoping to overcome this problem with Spark 3 with the advancements of Predicate & Projection Pushdown; however, after I tested them I still cannot read the third-level *without flattening the data*. I would like to ask whether there will be an improvement in reading nested data (JSON/Parquet) that has *more than two levels *in the upcoming versions of Spark 3. Or if I am missing something in the existing Spark versions released, please let me know how to proceed. Kindest Regards, Pınar Ersoy