xushiyan opened a new pull request, #9188:
URL: https://github.com/apache/hudi/pull/9188

   ### Change Logs
   
   Currently, when the bloom/simple index tags locations for input records, the 
RDDs are supposed to be cached, but `rdd.unpersist()` was invoked prematurely, 
which made the caching ineffective. This PR fixes the behavior by marking the 
cached RDDs for uncaching at the `SparkRDDWriteClient#releaseResources` stage.
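
   The effect described above can be illustrated with a minimal, Spark-free 
sketch. Everything in it is a hypothetical stand-in and not Hudi or Spark code: 
`LazyData` is a toy model of a lazily evaluated, cacheable dataset, and 
`CacheRegistry` is an assumed helper that remembers what to uncache when the 
write finishes. It only shows why unpersisting before the actions run forces 
recomputation, while deferring the unpersist to a release step does not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

public class DeferredUnpersistSketch {

    /** Toy stand-in for a lazily evaluated, cacheable dataset (NOT Spark's RDD). */
    static final class LazyData<T> {
        private final Supplier<List<T>> compute;
        private boolean persistRequested = false;
        private List<T> cached = null;
        int computeCount = 0; // how many times the lineage was actually evaluated

        LazyData(Supplier<List<T>> compute) {
            this.compute = compute;
        }

        /** Only marks the data for caching; nothing is computed yet. */
        LazyData<T> persist() {
            persistRequested = true;
            return this;
        }

        /** Drops the caching mark and any cached copy. */
        void unpersist() {
            persistRequested = false;
            cached = null;
        }

        /** An "action": evaluates the lineage unless a cached copy exists. */
        List<T> collect() {
            if (cached != null) {
                return cached;
            }
            computeCount++;
            List<T> result = compute.get();
            if (persistRequested) {
                cached = result;
            }
            return result;
        }
    }

    /** Hypothetical registry that defers uncaching until the write is done. */
    static final class CacheRegistry {
        private final List<LazyData<?>> tracked = new ArrayList<>();

        <T> LazyData<T> track(LazyData<T> data) {
            tracked.add(data);
            return data;
        }

        void releaseResources() {
            for (LazyData<?> data : tracked) {
                data.unpersist();
            }
            tracked.clear();
        }
    }

    /** Unpersist before any action runs: both actions recompute the data. */
    static int prematureUnpersist() {
        LazyData<String> records =
            new LazyData<String>(() -> Arrays.asList("r1", "r2")).persist();
        records.unpersist();
        records.collect();
        records.collect();
        return records.computeCount;
    }

    /** Unpersist only at release time: the second action reuses the cache. */
    static int deferredUnpersist() {
        CacheRegistry registry = new CacheRegistry();
        LazyData<String> records = registry.track(
            new LazyData<String>(() -> Arrays.asList("r1", "r2")).persist());
        records.collect();
        records.collect();
        registry.releaseResources();
        return records.computeCount;
    }

    public static void main(String[] args) {
        System.out.println("premature unpersist, computations: " + prematureUnpersist());
        System.out.println("deferred unpersist, computations: " + deferredUnpersist());
    }
}
```

   In this model the premature variant evaluates the data once per action, 
whereas the deferred variant evaluates it once in total and frees the cache 
only in `releaseResources()`.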
   
   ### Impact
   
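   Improves indexing performance: RDDs cached during bloom/simple index 
tagging now stay cached until the write client releases resources.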
   
   ### Risk level
   
   Medium
   
   - [ ] e2e testing & verification
   
   ### Documentation Update
   
   NA
   
   ### Contributor's checklist
   
   - [ ] Read through [contributor's guide](https://hudi.apache.org/contribute/how-to-contribute)
   - [ ] Change Logs and Impact were stated clearly
   - [ ] Adequate tests were added if applicable
   - [ ] CI passed
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.