Re: [DISCUSS] Drop support for Flink 1.8 and 1.9

2021-03-11 Thread Ismaël Mejía
+user

> Should we add a warning or something to 2.29.0?

Sounds like a good idea.

On Thu, Mar 11, 2021 at 7:24 PM Kenneth Knowles wrote:
>
> Should we add a warning or something to 2.29.0?
>
> On Thu, Mar 11, 2021 at 10:19 AM Ismaël Mejía wrote:
>>
>> Hello,
>>
>> We have been supporting old

Do you use TimestampCombiner.EARLIEST with SlidingWindows or other overlapping windows?

2021-03-11 Thread Kenneth Knowles
Hi users, ** Do you use TimestampCombiner.EARLIEST with SlidingWindows or other overlapping windows? If no, then you can stop reading now. ** We are considering a change to simplify how timestamps are computed for aggregations in this case. Warning: this is a bit complicated and a curious corner

Re: Regarding the over window query support from Beam SQL

2021-03-11 Thread Tao Li
Rui, I think I found another potential bug with rank().

+--------+--------+
|column_1|column_2|
+--------+--------+
|1       |100     |
|1       |200     |
+--------+--------+

Query using Beam SQL: SELECT *, RANK() over (PARTITION BY column_1 ORDER BY column_2 DESC) AS agg FROM PCOLLECTION Res

Re: Apache Beam Python SDK ReadFromKafka does not receive data

2021-03-11 Thread Sumeet Malhotra
Took me some time to set up the Java test (using Java after more than a decade!), but yes, a similar pipeline with KafkaIO and Flink seems to work fine. Here's the relevant Java code. The only difference from the Python version is that I had to extract the KV from the KafkaRecord object and construc

Setting rowGroupSize in ParquetIO

2021-03-11 Thread Bashir Sadjad
Hi all, I wonder how I can set the row group size for files generated by ParquetIO.Sink. It doesn't seem to provide the option for setting that and IIUC from the code