Re: RenameFields behaves differently in DirectRunner

2021-06-01 Thread Brian Hulette
Hi Matthew, > The unit tests also seem to be disabled for this as well and so I don’t know if the PTransform behaves as expected. The exclusion for NeedsRunner tests is just a quirk in our testing framework. NeedsRunner indicates that a test suite can't be executed with the SDK alone, it needs a

Re: RenameFields behaves differently in DirectRunner

2021-06-01 Thread Kenneth Knowles
On Tue, Jun 1, 2021 at 12:42 PM Brian Hulette wrote: > Hi Matthew, > > > The unit tests also seem to be disabled for this as well and so I don’t > know if the PTransform behaves as expected. > > The exclusion for NeedsRunner tests is just a quirk in our testing > framework. NeedsRunner indicates

Re: RenameFields behaves differently in DirectRunner

2021-06-01 Thread Reuven Lax
This transform is the same across all runners. A few comments on the test:
- Using attachValues directly is error-prone (per the comment on the method). I recommend using the withFieldValue builders instead.
- I recommend capturing the RenameFields PCollection into a local variable of type PCo

Re: RenameFields behaves differently in DirectRunner

2021-06-01 Thread Matthew Ouyang
Thank you everyone for your input. I believe it will be easiest to respond to all feedback in a single message rather than messages per person. - NeedsRunner - The tests are run eventually, so obviously all good on my end. I was trying to run the smallest subset of test cases possible and

Re: RenameFields behaves differently in DirectRunner

2021-06-01 Thread Reuven Lax
Aha, yes, this is indeed another bug in the transform. The schema is set on the top-level Row but not on any nested rows. On Tue, Jun 1, 2021 at 6:37 PM Matthew Ouyang wrote: > Thank you everyone for your input. I believe it will be easiest to > respond to all feedback in a single message rather th

Re: RenameFields behaves differently in DirectRunner

2021-06-01 Thread Reuven Lax
Some more context: the problem is that RenameFields outputs (in this case) Java Row objects that are inconsistent with the actual schema. For example, if you have the following schema: Row { field1: Row { field2: string } } and rename field1.field2 -> renamed, you'll get the followin

Issues running Kafka streaming pipeline in Python

2021-06-01 Thread Alex Koay
Hi all, I have created a simple snippet as such:
    import apache_beam as beam
    from apache_beam.io.kafka import ReadFromKafka
    from apache_beam.options.pipeline_options import PipelineOptions
    import logging
    logging.basicConfig(level=logging.WARNING)
    opts = direct_opts
    with beam.Pipeline(options=Pip