Jason-liujc commented on issue #11535: URL: https://github.com/apache/hudi/issues/11535#issuecomment-2204845455
@danny0405 Ahh gotcha, we do have an async cleaner that runs for our Hudi tables. @ad1happy2go I don't see any compaction on the metadata table after a certain date (I believe that's when we moved Hudi cleaning from sync to async, based on Danny's comment). When I delete the metadata table and try to reinitialize it, I see this warning, which I believe lists the blocking instants: ``` 24/06/15 01:06:20 ip-10-0-157-87 WARN HoodieBackedTableMetadataWriter: Cannot initialize metadata table as operation(s) are in progress on the dataset: [[==>20240523221631416__commit__INFLIGHT__20240523224939000], [==>20240523225648799__commit__INFLIGHT__20240523232254000], [==>20240524111304660__commit__INFLIGHT__20240524142426000], [==>20240524235127638__commit__INFLIGHT__20240525000640000], [==>20240525005114829__commit__INFLIGHT__20240525011802000], [==>20240525065356540__commit__INFLIGHT__20240525071004000], [==>20240525170219523__commit__INFLIGHT__20240525192315000], [==>20240527184608604__commit__INFLIGHT__20240527190327000], [==>20240528190417601__commit__INFLIGHT__20240528192418000], [==>20240529054718316__commit__INFLIGHT__20240529060542000], [==>20240530125710177__commit__INFLIGHT__20240531081522000], [==>20240530234238360__commit__INFLIGHT__20240530234726000], [==>20240531082713041__commit__REQUESTED__20240531082715000], [==>20240601164223688__commit__INFLIGHT__20240601190853000], [==>20240602072248313__commit__INFLIGHT__20240603005951000], [==>20240603010859993__commit__INFLIGHT__20240603100305000], [==>20240604043334594__commit__INFLIGHT__20240604061732000], [==>20240605061406367__commit__REQUESTED__20240605061412000], [==>20240605063936872__commit__REQUESTED__20240605063943000], [==>20240605071904045__commit__REQUESTED__20240605071910000], [==>20240605074456040__commit__REQUESTED__20240605074502000], [==>20240605082437667__commit__REQUESTED__20240605082443000], [==>20240605085008272__commit__REQUESTED__20240605085014000], [==>20240605123632368__commit__REQUESTED__20240605123638000],
[==>20240605130201503__commit__REQUESTED__20240605130207000], [==>20240605134213113__commit__REQUESTED__20240605134219000], [==>20240605140741158__commit__REQUESTED__20240605140747000], [==>20240605144756228__commit__REQUESTED__20240605144802000], [==>20240605151313557__commit__REQUESTED__20240605151319000], [==>20240605195405678__commit__REQUESTED__20240605195411000], [==>20240605202017653__commit__REQUESTED__20240605202023000], [==>20240605205949232__commit__REQUESTED__20240605205955000], [==>20240605212536568__commit__REQUESTED__20240605212542000], [==>20240605220432089__commit__REQUESTED__20240605220438000], [==>20240606152537217__commit__INFLIGHT__20240607031027000], [==>20240606181110800__commit__INFLIGHT__20240608000043000], [==>20240607112530977__commit__INFLIGHT__20240607212013000], [==>20240607213124841__commit__INFLIGHT__20240609024214000], [==>20240608001245366__commit__INFLIGHT__20240609045530000], [==>20240609030620894__commit__INFLIGHT__20240609180310000], [==>20240609181330488__commit__REQUESTED__20240609181336000], [==>20240609194304829__commit__INFLIGHT__20240611095337000], [==>20240611003906613__commit__INFLIGHT__20240611014341000], [==>20240611100258837__commit__INFLIGHT__20240612075536000], [==>20240611174425406__commit__INFLIGHT__20240611184626000], [==>20240612081821910__commit__INFLIGHT__20240612102427000], [==>20240612204659323__commit__REQUESTED__20240612204705000], [==>20240613044301243__commit__INFLIGHT__20240613075101000], [==>20240613085334404__commit__INFLIGHT__20240613105718000], [==>20240613113055212__commit__REQUESTED__20240613113101000], [==>20240613122745696__commit__REQUESTED__20240613122751000], [==>20240614094542418__commit__REQUESTED__20240614094548000], [==>20240614172456990__commit__REQUESTED__20240614172503000], [==>20240614175526954__commit__REQUESTED__20240614175529000], [==>20240614181441857__commit__REQUESTED__20240614181444000], [==>20240614222012190__commit__REQUESTED__20240614222015000],
[==>20240614225952031__commit__REQUESTED__20240614225954000], [==>20240614235545094__commit__REQUESTED__20240614235547000]] ```

I guess my next questions are:

1. Is there a way to run compaction of the metadata table asynchronously, without cleaning up commits, deleting the metadata table, and recreating it? That process is a bit expensive, and based on what Danny said, metadata table compaction still won't work going forward.
2. If we just increase the `hoodie.metadata.max.deltacommits.when_pending` parameter to, say, 1000000, what kind of performance hit should we expect? Is it mostly at the S3 file listing level?
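To triage a long list of pending instants like the one in the warning above, a minimal Python sketch could tally them by state and find the oldest one. It assumes only the `[==>TIMESTAMP__ACTION__STATE__TIMESTAMP]` layout visible in that log line; the function names `parse_pending_instants` and `summarize` are hypothetical helpers for illustration and are not part of Hudi.

```python
import re
from collections import Counter

# Matches entries shaped like the ones in the warning, e.g.
# [==>20240523221631416__commit__INFLIGHT__20240523224939000]
# Assumption: the action name is lowercase letters and the state is
# REQUESTED or INFLIGHT, as seen in the log above.
INSTANT_RE = re.compile(r"\[==>(\d+)__([a-z]+)__(REQUESTED|INFLIGHT)__(\d+)\]")


def parse_pending_instants(warning: str):
    """Return (instant_time, action, state) tuples found in a warning line."""
    return [(m.group(1), m.group(2), m.group(3)) for m in INSTANT_RE.finditer(warning)]


def summarize(instants):
    """Count instants per state and report the oldest instant time."""
    by_state = Counter(state for _, _, state in instants)
    # Instant times are fixed-width digit strings, so string comparison
    # orders them chronologically.
    oldest = min((ts for ts, _, _ in instants), default=None)
    return {"total": len(instants), "by_state": dict(by_state), "oldest": oldest}


if __name__ == "__main__":
    # Two entries copied from the warning above.
    sample = (
        "Cannot initialize metadata table as operation(s) are in progress on the dataset: "
        "[[==>20240523221631416__commit__INFLIGHT__20240523224939000], "
        "[==>20240531082713041__commit__REQUESTED__20240531082715000]]"
    )
    print(summarize(parse_pending_instants(sample)))
```

Pasting the full warning line into `sample` would give the split between `INFLIGHT` and `REQUESTED` commits and the earliest blocking instant, which may help decide which instants to look at first.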
