abhibhat98 opened a new issue #1675: URL: https://github.com/apache/hudi/issues/1675
**Describe the problem you faced**

When I run an incremental query, I only get the latest event per key. I want to get all the events as a log. For example:

at time T1, the key value is K1-V1
at time T2, the updated key value is K1-V2
at time T3, the updated key value is K1-V3

When I run an incremental query between time 0 (start) and T3, I only get K1-V3.

Is there a way I can set maxCommits so that I can stream all these events back from a certain time? (I see that HiveIncrementalPuller has an option: "Setting fromCommitTime=0 and maxCommits=-1 will fetch the entire source table".) As an example, if I ask for incremental updates after T1+1, I'd get:

K1-V2
K1-V3

I am able to get it using spark.read.parquet ... Is there a way I can get it from Hudi?

The environment I am on is EMR 6.0.0 on AWS with Hudi
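To make the difference concrete, here is a minimal plain-Python sketch of the two behaviours described above. It does not use Hudi or Spark at all; `Event`, `latest_per_key`, and `change_log` are hypothetical names introduced only for illustration, and commit times T1, T2, T3 are modelled as the integers 1, 2, 3 with an exclusive lower bound.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    commit_time: int  # hypothetical stand-in for a commit instant (T1=1, T2=2, T3=3)
    key: str
    value: str


def latest_per_key(events: List[Event], begin_time: int) -> List[Event]:
    """What I currently observe: only the newest event per key after begin_time."""
    latest = {}
    for event in sorted(events, key=lambda e: e.commit_time):
        if event.commit_time > begin_time:
            latest[event.key] = event  # a later commit overwrites an earlier one
    return list(latest.values())


def change_log(events: List[Event], begin_time: int) -> List[Event]:
    """What I would like: every event after begin_time, in commit order."""
    ordered = sorted(events, key=lambda e: e.commit_time)
    return [event for event in ordered if event.commit_time > begin_time]


events = [
    Event(1, "K1", "V1"),
    Event(2, "K1", "V2"),
    Event(3, "K1", "V3"),
]

observed = latest_per_key(events, begin_time=0)  # the single K1-V3 row
wanted = change_log(events, begin_time=1)        # the K1-V2 and K1-V3 rows
```

In this model, `observed` corresponds to the single K1-V3 row I get back today, and `wanted` corresponds to the K1-V2, K1-V3 stream I am asking for.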
