Unfortunately, our problem is that we take serialized protobuf messages from 
Pub/Sub and write them to Avro. I was taking each message, using the payload 
to create the protobuf object, then converting it -> (Beam) Row -> 
GenericRecord (Avro) -> write to storage. I was using 
ProtoMessageSchema.schemaFor to go from the protobuf generated-code object -> 
Beam Row, and any unset (null) fields make it complain. I was hoping to use 
schemas and the like to avoid writing manual conversion code. Sadly, that does 
not work here because of the Java nullability issues and the way field names 
are resolved (it tries to access getIde24 instead of getIdE24).

Writing up some wrapper classes to deal with this for now.
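
For anyone reading along, a minimal sketch of what such a wrapper class might 
look like. Everything here is hypothetical: EventProto is a hand-written 
stand-in for a protobuf-generated class (not real generated code), and the 
field names "id_e24" and "name" are made up for illustration. The idea is only 
that the wrapper returns boxed values (null when unset) and names its fields 
explicitly, so nothing has to derive getter names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a protobuf-generated class; not real generated code. */
final class EventProto {
    private final boolean hasIdE24;
    private final long idE24;
    private final String name;

    EventProto(Long idE24, String name) {
        this.hasIdE24 = idE24 != null;
        this.idE24 = idE24 == null ? 0L : idE24;
        this.name = name;
    }

    boolean hasIdE24() { return hasIdE24; }
    long getIdE24() { return idE24; } // primitive getter: yields 0 when unset
    String getName() { return name; }
}

/** Hand-written wrapper exposing nullable values under explicit field names. */
final class EventWrapper {
    private final EventProto proto;

    EventWrapper(EventProto proto) { this.proto = proto; }

    /** Boxed getter: null when the field was never set. */
    Long getIdE24() { return proto.hasIdE24() ? proto.getIdE24() : null; }

    String getName() { return proto.getName(); }

    /** Field values keyed by explicitly chosen names; unset fields map to null. */
    Map<String, Object> toFieldMap() {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("id_e24", getIdE24());
        fields.put("name", getName());
        return fields;
    }
}
```

The map returned by toFieldMap() is only a placeholder for whatever nullable 
record type the pipeline actually writes.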
________________________________
From: Reuven Lax <re...@google.com>
Sent: Monday, June 7, 2021 11:27 AM
To: user <user@beam.apache.org>
Subject: Re: [2.28.0] [Java] [protobuf] ProtoMessageSchema doesn't create 
fields as nullable

That's why separate has_xxx methods are generated to test whether the specified 
field is present or not.
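
As a rough sketch of how a presence check can be turned into a nullable value 
in Java: the helper below takes the has-method and the getter and returns null 
when the field is absent. Presence, orNull and Reading are hypothetical names, 
and Reading is a hand-written stand-in rather than real generated code. Note 
that generated has-methods only exist for fields with explicit presence (for 
example message-typed fields, or proto3 fields declared optional).

```java
import java.util.function.BooleanSupplier;
import java.util.function.Supplier;

final class Presence {
    private Presence() {}

    /** Returns the getter's value if the presence check passes, otherwise null. */
    static <T> T orNull(BooleanSupplier isPresent, Supplier<T> getter) {
        return isPresent.getAsBoolean() ? getter.get() : null;
    }
}

/** Hypothetical stand-in for a generated message with one presence-tracked int field. */
final class Reading {
    private final boolean hasCount;
    private final int count;

    Reading() { this.hasCount = false; this.count = 0; }
    Reading(int count) { this.hasCount = true; this.count = count; }

    boolean hasCount() { return hasCount; }
    int getCount() { return count; } // primitive: cannot be null, 0 when unset
}
```

With this, `Integer c = Presence.orNull(r::hasCount, r::getCount);` gives null 
for an unset field and a boxed value (including an explicit 0) for a set one.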

On Mon, Jun 7, 2021 at 9:17 AM Thomas Fredriksen(External) 
<thomas.fredrik...@cognite.com> wrote:
The problem is that protobuf primitives are represented in Java as primitives, 
which are not nullable.

Ideally, they should be objects instead, but alas - no.

The wrapper is a decent (but not perfect) workaround.

On Mon, Jun 7, 2021, 18:01 Reuven Lax 
<re...@google.com> wrote:
I believe that as of proto 3.12, optional fields are supported directly - 
https://github.com/pseudomuto/protoc-gen-doc/issues/422 . I _think_ this 
should be supported by Beam (assuming Beam uses a new-enough proto library), 
but I'm not sure if it's been tested.

On Mon, Jun 7, 2021 at 8:53 AM Andrew Kettmann 
<akettm...@evolve24.com> wrote:
Thanks, that looks like the issue. I appreciate the help.
________________________________
From: Thomas Fredriksen(External) 
<thomas.fredrik...@cognite.com>
Sent: Monday, June 7, 2021 12:53 AM
To: user@beam.apache.org <user@beam.apache.org>
Subject: Re: [2.28.0] [Java] [protobuf] ProtoMessageSchema doesn't create 
fields as nullable

Hi Andrew,

From the documentation 
(https://beam.apache.org/releases/javadoc/2.19.0/org/apache/beam/sdk/extensions/protobuf/ProtoSchemaTranslator.html):


Protobuf wrapper classes are translated to nullable types, as follows.

  *   google.protobuf.Int32Value maps to a nullable FieldType.INT32
  *   google.protobuf.Int64Value maps to a nullable FieldType.INT64
  *   google.protobuf.UInt32Value maps to a nullable FieldType.logicalType(new UInt32())
  *   google.protobuf.UInt64Value maps to a nullable FieldType.logicalType(new UInt64())
  *   google.protobuf.FloatValue maps to a nullable FieldType.FLOAT
  *   google.protobuf.DoubleValue maps to a nullable FieldType.DOUBLE
  *   google.protobuf.BoolValue maps to a nullable FieldType.BOOLEAN
  *   google.protobuf.StringValue maps to a nullable FieldType.STRING
  *   google.protobuf.BytesValue maps to a nullable FieldType.BYTES

This means that you should use the Google wrapper types in order to get 
nullable fields.
The wrappers are defined here: 
https://github.com/protocolbuffers/protobuf/blob/master/src/google/protobuf/wrappers.proto

Hope it helps :)

On Thu, Jun 3, 2021 at 6:48 PM Andrew Kettmann 
<akettm...@evolve24.com> wrote:
I am using org.apache.beam.sdk.extensions.protobuf.ProtoMessageSchema to 
create a Beam schema from generated protobuf3 classes. However, 
org.apache.beam.sdk.extensions.protobuf.ProtoSchemaTranslator#beamFieldTypeFromSingularProtoField
does not mark fields in the message as nullable. My understanding is that 
protobuf fields ARE optional by default; is that incorrect? Converting a 
serialized message that has no values for some fields crashes when it is 
converted to a Row, since the Row schema does not treat those fields as 
nullable.

Does anyone have any advice regarding this? Should I modify the schema after 
ProtoMessageSchema generates it, or is there another method/option I am 
missing?

 Andrew Kettmann
DevOps Engineer