rangareddy opened a new issue, #14352: URL: https://github.com/apache/hudi/issues/14352
### Bug Description

**What happened:** The column stats expression index example in the Spark Quick Start Guide does not produce the expected results. The query below, which should prune data using the `idx_column_ts` index, returns an empty result set.

```scala
scala> // Query on ts column would prune the data using the idx_column_ts index
scala> spark.sql(s"SELECT * FROM hudi_indexed_table WHERE from_unixtime(ts, 'yyyy-MM-dd') = '2023-09-24'").show(false);
25/11/24 11:20:31 WARN CacheManager: Asked to cache already cached data.
25/11/24 11:20:32 WARN CacheManager: Asked to cache already cached data.
+-------------------+--------------------+------------------+----------------------+-----------------+---+----+-----+------+----+----+
|_hoodie_commit_time|_hoodie_commit_seqno|_hoodie_record_key|_hoodie_partition_path|_hoodie_file_name|ts |uuid|rider|driver|fare|city|
+-------------------+--------------------+------------------+----------------------+-----------------+---+----+-----+------+----+----+
+-------------------+--------------------+------------------+----------------------+-----------------+---+----+-----+------+----+----+
```

**What you expected:** I expected the Scala code to use the column stats expression index and deliver the query optimization and improved performance (e.g., predicate pushdown or faster data filtering) documented in the Quick Start Guide.

**Steps to reproduce:**

1. Follow the index example in the Spark quick start guide (https://hudi.apache.org/docs/quick-start-guide#indexing).
2. Query the table data; the result is empty.

### Environment

**Hudi version:**

**Query engine:** (Spark/Flink/Trino etc)

**Relevant configs:**

### Logs and Stack Trace

_No response_

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
