Re: CSV join in batch mode

2022-02-23 Thread Guowei Ma
Hi Killian, sorry for the late response! I don't think there is a simple built-in way to catch CSV processing errors, so you would need to handle them yourself. (Correct me if I am missing something.) I think you could use the RocksDB State Backend [1], which spills data to disk. [1] https://nightlies.apac
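
As a rough illustration of the "handle them yourself" approach, here is a minimal sketch in plain Python. It is not Flink API. It assumes each record is a two-column "key,value" line with an integer key; the names parse_lines and EXPECTED_COLUMNS are made up for this example. Lines that fail to parse go into a separate error list, which plays the role a side output would play in a real job.

```python
import csv

EXPECTED_COLUMNS = 2  # assumption: each record is "key,value"


def parse_lines(lines):
    """Split raw CSV lines into parsed records and rejected lines."""
    records, errors = [], []
    for line_no, line in enumerate(lines, start=1):
        try:
            # Parse a single line with the standard csv module.
            row = next(csv.reader([line]))
            if len(row) != EXPECTED_COLUMNS:
                raise ValueError(
                    f"expected {EXPECTED_COLUMNS} columns, got {len(row)}"
                )
            key = int(row[0])  # assumption: the join key is an integer
            records.append((key, row[1]))
        except (ValueError, csv.Error) as exc:
            # Keep the offending line and the reason instead of failing the job.
            errors.append((line_no, line, str(exc)))
    return records, errors
```

Only the records list would then be fed into the join, while the errors list can be written out separately for inspection.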

CSV join in batch mode

2022-02-21 Thread Killian GUIHEUX
Hello all, I have to perform a join between two large CSV data sets that do not fit in RAM. I process these two files in batch mode. I also need a side output to catch CSV processing errors. So my question is: what is the best way to do this kind of join operation? I think I should use a valueState stat