Thank you so much for helping! You were correct that the Hudi sink requires checkpointing, which is not enabled by default in Zeppelin or SQL Client. I added the checkpoint interval setting and now it works.
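
For anyone who finds this thread later, here is a minimal sketch of the kind of setting involved, as it might look in SQL Client. It assumes the quoted SET syntax of Flink 1.13 and later (older clients take the unquoted form, SET execution.checkpointing.interval=10s;). The 10s value is only an illustrative choice, not a recommendation, and Zeppelin takes the same key through its own interpreter configuration.

```sql
-- Enable periodic checkpoints for jobs submitted from this session.
-- The Hudi sink commits on each successful checkpoint, so with an
-- unbounded Kafka source nothing is committed until this is set.
-- '10s' is an assumed example interval.
SET 'execution.checkpointing.interval' = '10s';
```

The setting has to be issued before the INSERT INTO statement is submitted, since it applies to jobs started afterwards in the session.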

On Tue, Mar 15, 2022 at 10:58 AM Caizhi Weng <tsreape...@gmail.com> wrote:

> Hi!
>
> Hudi sink will commit only after a successful checkpoint or at the end of
> input. I guess you did not enable checkpointing and as Kafka is a never
> ending source Hudi will never commit the records. For your testing job, as
> value sources are finite and will end soon you can see records in Hudi
> instantly.
>
> dz902 <dz9...@gmail.com> wrote on Mon, Mar 14, 2022 at 19:28:
>
>> Hi,
>>
>> I have two connectors created with SQL CLI. Source from Kafka/Debezium,
>> and the sink S3 Hudi.
>>
>> I can SELECT from the source table OK. I can issue INSERT INTO the sink
>> OK. So I think both of them work fine. Both have same table structure, jus
>>
>> However when I do:
>>
>> INSERT INTO sink
>> SELECT id, LAST_VALUE(`value`)
>> GROUP BY id
>>
>> I see no data going to the sink. The hoodie metadata were there, but
>> waiting for a very long time I get no data going to the sink.
>>
>> There were no errors on the tasks, and I could directly use INSERT INTO
>> sink VALUES(1, 'test') and immediately get the result, but not with the
>> INSERT SELECT command. Is there a buffer or something?
>>
>> Thanks!
>> Dai