Unfortunately, the problem is that we are taking serialized protobuf messages from Pub/Sub and writing them to Avro. I was taking the message, using the payload to create the protobuf object, then converting it: protobuf object -> (Beam) Row -> GenericRecord (Avro) -> write to storage. I was using ProtoMessageSchema.schemaFor to go from the protobuf generated-code object to a Beam Row, and any null fields make it complain. I was hoping to use schemas so that I would not have to write manual conversion code. Sadly, that is just not the case, due to the Java nullable issues and the naming of fields (for example, it tries to access getIde24 instead of getIdE24).
Writing up some wrapper classes to deal with this for now.

________________________________
From: Reuven Lax <re...@google.com>
Sent: Monday, June 7, 2021 11:27 AM
To: user <user@beam.apache.org>
Subject: Re: [2.28.0] [Java] [protobuf] ProtoMessageSchema doesn't create fields as nullable

That's why separate has_xxx methods are generated to test whether the specified field is present or not.

On Mon, Jun 7, 2021 at 9:17 AM Thomas Fredriksen(External) <thomas.fredrik...@cognite.com> wrote:

The problem is that protobuf primitives are represented in Java as primitives, which are not nullable. Ideally, they should be objects instead, but alas - no. The wrapper is a decent (but not perfect) workaround.

On Mon, Jun 7, 2021, 18:01 Reuven Lax <re...@google.com> wrote:

I believe that as of proto 3.12, optional fields are supported directly - https://github.com/pseudomuto/protoc-gen-doc/issues/422 . I _think_ this should be supported by Beam (assuming Beam uses a new-enough proto library), but I'm not sure if it's been tested.

On Mon, Jun 7, 2021 at 8:53 AM Andrew Kettmann <akettm...@evolve24.com> wrote:

Thanks that looks like it is the issue, I appreciate the help.
________________________________
From: Thomas Fredriksen(External) <thomas.fredrik...@cognite.com>
Sent: Monday, June 7, 2021 12:53 AM
To: user@beam.apache.org <user@beam.apache.org>
Subject: Re: [2.28.0] [Java] [protobuf] ProtoMessageSchema doesn't create fields as nullable

Hi Andrew,

From the documentation (https://beam.apache.org/releases/javadoc/2.19.0/org/apache/beam/sdk/extensions/protobuf/ProtoSchemaTranslator.html):

Protobuf wrapper classes are translated to nullable types, as follows.

* google.protobuf.Int32Value maps to a nullable FieldType.INT32
* google.protobuf.Int64Value maps to a nullable FieldType.INT64
* google.protobuf.UInt32Value maps to a nullable FieldType.logicalType(new UInt32())
* google.protobuf.UInt64Value maps to a nullable Field.logicalType(new UInt64())
* google.protobuf.FloatValue maps to a nullable FieldType.FLOAT
* google.protobuf.DoubleValue maps to a nullable FieldType.DOUBLE
* google.protobuf.BoolValue maps to a nullable FieldType.BOOLEAN
* google.protobuf.StringValue maps to a nullable FieldType.STRING
* google.protobuf.BytesValue maps to a nullable FieldType.BYTES

This means that you should use the google wrapper-types in order to achieve nullable fields.
The wrapper is available here: https://github.com/protocolbuffers/protobuf/blob/master/src/google/protobuf/wrappers.proto

Hope it helps :)

On Thu, Jun 3, 2021 at 6:48 PM Andrew Kettmann <akettm...@evolve24.com> wrote:

Using org.apache.beam.sdk.extensions.protobuf.ProtoMessageSchema to create a beam schema from generated protobuf3 classes. However, org.apache.beam.sdk.extensions.protobuf.ProtoSchemaTranslator#beamFieldTypeFromSingularProtoField doesn't apply nullable to fields in the message. My understanding is that by default protobuf fields ARE optional, is that incorrect?

Converting from a serialized message without values for some fields crashes when it tries to cast them to a Row since the Row is not expecting a field as nullable.

Anyone have any advice regarding this? Modify the schema after it is generated by ProtoMessageSchema or is there another method/option I am missing?
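
For anyone landing on this thread later, the point about primitives and has_xxx methods can be pictured with a small, hand-written sketch. It does not use protobuf, Beam, or Avro at all: `Reading` is a hypothetical stand-in for a generated message with a single int32 field, and `NullableAccess`, `countOrNull`, and `toNullableRecord` are names invented purely for illustration. The assumption is that a presence check is available for the field, which in proto3 is, as far as I know, only the case for fields declared optional or for message-typed fields such as the wrappers.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a generated message with one int32 field named "count". */
final class Reading {
    private final boolean countSet; // presence bit, what a has-method would report
    private final int count;        // primitive, so it holds 0 when the field is unset

    private Reading(boolean countSet, int count) {
        this.countSet = countSet;
        this.count = count;
    }

    static Reading withCount(int count) {
        return new Reading(true, count);
    }

    static Reading empty() {
        return new Reading(false, 0);
    }

    boolean hasCount() {
        return countSet;
    }

    int getCount() {
        return count;
    }
}

/** Illustrative helpers (not a Beam or Avro API) that turn presence into nullability. */
final class NullableAccess {
    private NullableAccess() {}

    /** Returns the boxed value when the field is present, otherwise null. */
    static Integer countOrNull(Reading reading) {
        return reading.hasCount() ? Integer.valueOf(reading.getCount()) : null;
    }

    /** Builds a field-name to value map, standing in for a record with nullable fields. */
    static Map<String, Object> toNullableRecord(Reading reading) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("count", countOrNull(reading));
        return record;
    }
}
```

With this, `NullableAccess.countOrNull(Reading.empty())` yields null, while `NullableAccess.countOrNull(Reading.withCount(0))` yields a boxed 0, so an unset field and an explicit zero stay distinguishable. Calling `getCount()` alone cannot tell the two apart, which is the limitation described above.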