Re: [Question] - Time series - cumulative sum in right order with python api in a batch process

2023-04-25 Thread Jan Lukavský
Hi, there is a (rather old and long) discussion of this for the Java SDK in [1]. This discussion resulted in adding the @RequiresTimeSortedInput annotation [2]. Unfortunately, this probably has not been transferred to the Python SDK. I'll sum up the reasons why it was added: a) inputs to stateful DoFn are nat

Question about BeamSqlSeekableTable

2023-04-25 Thread Jeff Zhang
Hi all, I am a little confused about the implementation of BeamSqlSeekableTable. It looks like the join condition is implemented in the method BeamSqlSeekableTable#seekRow. Does that mean that whatever join condition I specify in SQL will always be ignored? Is that correct? -- Best Regards

Re: Apache Beam pipeline stuck indefinitely using Wait.on transform with JdbcIO

2023-04-25 Thread Juan Cuzmar
I did the following test and it inserted data correctly, but when I try to pull the data it does not arrive. Pipeline p = Pipeline.create(options); Coder<String> utf8Coder = StringUtf8Coder.of(); Coder<Map<String, String>> mapCoder = MapCoder.of(StringUtf8Coder.of(), StringUtf8Coder.of()); C

Re: [java] Trouble with gradle and using ParquetIO

2023-04-25 Thread Evan Galpin
The root cause was actually "java.lang.ClassNotFoundException: org.apache.hadoop.io.Writable", which I eventually fixed by including hadoop-common as a dep for my pipeline (below). Should hadoop-common be listed as a dep of ParquetIO in the Beam repo itself? implementation "org.apache.hadoop:hadoop