[Question] Python Batch Pipeline Errors

2021-06-03 Thread Shankar Mane
# *BATCH PIPELINE : * python3 batch.py \ --input beam-userbase.csv \ --output output/batch \ --runner=SparkRunner \ --spark_submit_uber_jar \ --job_endpoint=localhost:8099 \ --spark_master_

[2.28.0] [Java] [protobuf] ProtoMessageSchema doesn't create fields as nullable

2021-06-03 Thread Andrew Kettmann
Using org.apache.beam.sdk.extensions.protobuf.ProtoMessageSchema to create a beam schema from generated protobuf3 classes. However, org.apache.beam.sdk.extensions.protobuf.ProtoSchemaTranslator#beamFieldTypeFromSingularProtoField doesn't apply nullable to fields in the message. My understanding

Re: RenameFields behaves differently in DirectRunner

2021-06-03 Thread Kenneth Knowles
I still don't quite grok the details of how this succeeds or fails in different situations. The invalid row succeeds in serialization because the coder is not sensitive to the way in which it is invalid? Kenn On Wed, Jun 2, 2021 at 2:54 PM Brian Hulette wrote: > > One thing that's been on the b

Re: RenameFields behaves differently in DirectRunner

2021-06-03 Thread Reuven Lax
Correct. On Thu, Jun 3, 2021 at 9:51 AM Kenneth Knowles wrote: > I still don't quite grok the details of how this succeeds or fails in > different situations. The invalid row succeeds in serialization because the > coder is not sensitive to the way in which it is invalid? > > Kenn > > On Wed, Ju

Merging two rows

2021-06-03 Thread Matthew Ouyang
I know there is a method to merge two Beam Schemas into a new Schema. ( https://beam.apache.org/releases/javadoc/2.26.0/org/apache/beam/sdk/schemas/SchemaUtils.html#mergeWideningNullable-org.apache.beam.sdk.schemas.Schema-org.apache.beam.sdk.schemas.Schema- ). Is there a similar method for Beam R

Re: Merging two rows

2021-06-03 Thread Reuven Lax
Do you want them to be flattened, or as two subschemas of a top-level schema? On Thu, Jun 3, 2021 at 12:28 PM Matthew Ouyang wrote: > I know there is a method to merge two Beam Schemas into a new Schema. ( > https://beam.apache.org/releases/javadoc/2.26.0/org/apache/beam/sdk/schemas/SchemaUtils

Re: Allyship workshops for open source contributors

2021-06-03 Thread Austin Bennett
+1, assuming timing can work. On Wed, Jun 2, 2021 at 2:07 PM Aizhamal Nurmamat kyzy wrote: > If we have a good number of people who express interest in this thread, I >> will set up training for the Airflow community. >> > > I meant Beam ^^' I am organizing it for the Airflow community as well.

Re: Allyship workshops for open source contributors

2021-06-03 Thread Ratnakar Malla
+1 From: Austin Bennett Sent: Thursday, June 3, 2021 6:20:25 PM To: user@beam.apache.org Cc: dev Subject: Re: Allyship workshops for open source contributors +1, assuming timing can work. On Wed, Jun 2, 2021 at 2:07 PM Aizhamal Nurmamat kyzy mailto:aizha...@

Custom metrics with OpenCensus in Dataflow

2021-06-03 Thread Rohith Kumar Uppala
Hi All Currently, I am trying to enable OpenCensus (Stats/Metrics) for a Dataflow job. I am trying to count different types of messages I am consuming from Pubsub and tag them based on input. I followed all the steps listed in the Custom metrics with OpenCensus ( https://cloud.google.com/monitor