Hi,
there is a (rather old and long) discussion of this for the Java SDK in [1].
This discussion resulted in adding the @RequiresTimeSortedInput annotation
[2]. Unfortunately, this probably has not been ported to the Python SDK.
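To illustrate the kind of problem the annotation addresses, here is a minimal, dependency-free sketch. It is not Beam code: the class TimeSortedSketch, the Reading type, and both methods are hypothetical names made up for this example. It only shows that logic which keeps state between elements (here, the previous value) can produce different results depending on arrival order, and that sorting by timestamp first makes the result deterministic.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class TimeSortedSketch {

    /** A timestamped value; hypothetical type, used for illustration only. */
    public static final class Reading {
        final long timestampMillis;
        final int value;

        public Reading(long timestampMillis, int value) {
            this.timestampMillis = timestampMillis;
            this.value = value;
        }
    }

    /**
     * Order-dependent logic: emits the difference between each reading and
     * the one processed just before it. The variable "previous" plays the
     * role of per-key state in a stateful function.
     */
    public static List<Integer> deltas(List<Reading> input) {
        List<Integer> out = new ArrayList<>();
        Integer previous = null;
        for (Reading r : input) {
            if (previous != null) {
                out.add(r.value - previous);
            }
            previous = r.value;
        }
        return out;
    }

    /** Returns a copy of the input ordered by ascending timestamp. */
    public static List<Reading> sortedByTimestamp(List<Reading> input) {
        List<Reading> copy = new ArrayList<>(input);
        copy.sort(Comparator.comparingLong((Reading r) -> r.timestampMillis));
        return copy;
    }

    public static void main(String[] args) {
        // Elements arrive out of timestamp order.
        List<Reading> arrival = List.of(
            new Reading(3000L, 30),
            new Reading(1000L, 10),
            new Reading(2000L, 25));

        System.out.println(deltas(arrival));                    // [-20, 15]
        System.out.println(deltas(sortedByTimestamp(arrival))); // [15, 5]
    }
}
```

The same three readings give [-20, 15] in arrival order and [15, 5] once sorted by timestamp, which is the kind of discrepancy a time-sorted-input guarantee is meant to remove.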
I'll sum up the reasons why it was added:
a) inputs to stateful DoFn are nat
Hi all,
I am a little confused about the implementation of BeamSqlSeekableTable. It
looks like the join condition is implemented in the method
BeamSqlSeekableTable#seekRow. Does that mean that whatever join condition
I specify in SQL will always be ignored? Is that correct?
--
Best Regards
I ran the following test and it inserted the data correctly, but when I try to
read the data back, nothing arrives.
Pipeline p = Pipeline.create(options);
Coder<String> utf8Coder = StringUtf8Coder.of();
Coder<Map<String, String>> mapCoder = MapCoder.of(StringUtf8Coder.of(),
StringUtf8Coder.of());
C
The root cause was actually "java.lang.ClassNotFoundException:
org.apache.hadoop.io.Writable" which I eventually fixed by including
hadoop-common as a dep for my pipeline (below). Should hadoop-common be
listed as a dep of ParquetIO in the Beam repo itself?
implementation "org.apache.hadoop:hadoop