BTW, I'm adding the user@ mailing list since this is a user question and should be asked there. The dev@ mailing list is only for discussions of Flink development. Please see https://flink.apache.org/community.html#mailing-lists

On Wed, Jul 3, 2019 at 12:34 PM Bowen Li <bowenl...@gmail.com> wrote:

> Hi Youssef,
>
> You need to provide more background context:
>
> - Which Hive sink are you using? We are working on the official Hive sink
> for the community, which will be released in 1.9. Did you develop yours
> in house?
> - What do you mean by the 1st, 2nd, and 3rd windows? Do you mean parallel
> instances of the same operator, or do you have three windowing operations
> chained?
> - What does your Hive table look like? E.g. is it partitioned or
> non-partitioned? If partitioned, how many partitions do you have? Is it
> written in static or dynamic partition mode? What format? How large?
> - What does your sink do: is each parallel instance writing to multiple
> partitions or to a single partition/table? Is it only appending data, or
> is it upserting?
>
> On Wed, Jul 3, 2019 at 1:38 AM Youssef Achbany <youssef.achb...@euranova.eu> wrote:
>
>> Dear all,
>>
>> I'm working on a big project, and one of the challenges is to read Kafka
>> topics and copy them via Hive commands into Hive managed tables in order
>> to enable Hive's ACID properties.
>>
>> I tried it, but I have an issue with backpressure:
>> - The first window read 20,000 events and wrote them to Hive tables.
>> - The second, third, ... windows send only 100 events, because writing to
>> Hive takes more time than reading from the Kafka topic. But writing 100
>> events or 50,000 events takes roughly the same time in Hive.
>>
>> Has anyone already built such a source and sink? Could you help with
>> this, or do you have any tips?
>> It seems that defining a window size by number of events instead of by
>> time is not possible. Is that true?
>>
>> Thank you for your help.
>>
>> Youssef
>>
>> --
>> ♻ Be green, keep it on the screen
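On Youssef's count-window question: the DataStream API does support count-based windows. countWindow(n) on a keyed stream, or countWindowAll(n) on a non-keyed stream, fires after a fixed number of elements rather than after a time interval. Below is a minimal sketch in Java (Flink 1.8-era DataStream API) that batches Kafka records by count before handing each batch to a sink; the topic name, Kafka properties, and the batch size of 20,000 are placeholder assumptions, and print() stands in for the actual Hive writer.

import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.windowing.AllWindowFunction;
import org.apache.flink.streaming.api.windowing.windows.GlobalWindow;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.util.Collector;

public class KafkaCountWindowJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder Kafka settings -- adjust to your cluster.
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "hive-loader");

        DataStream<String> events = env.addSource(
                new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), props));

        // Count-based window: fires once 20,000 elements have arrived,
        // no matter how long that takes.
        events.countWindowAll(20_000)
              .apply(new AllWindowFunction<String, List<String>, GlobalWindow>() {
                  @Override
                  public void apply(GlobalWindow window,
                                    Iterable<String> values,
                                    Collector<List<String>> out) {
                      List<String> batch = new ArrayList<>();
                      for (String value : values) {
                          batch.add(value);
                      }
                      // Emit the whole batch, so the downstream sink can do
                      // one Hive write per 20,000 events instead of per event.
                      out.collect(batch);
                  }
              })
              .print(); // stand-in for a Hive sink that writes one batch per call

        env.execute("kafka-count-window-sketch");
    }
}

One caveat with this sketch: countWindowAll runs with parallelism 1, so a single slow Hive write will still backpressure the whole pipeline. If the table is partitioned, keying the stream (e.g. by partition value) and using countWindow(n) instead lets several batches be written in parallel.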