Hello Everyone,

*For those new to Beam, even if this is your first day, consider yourself a welcome contributor to this conversation. Below are definitions/references and a suggested learning path to help you understand this email.*

Proposal | Java

For Java, I would like to add the following to PubsubClient [1].

    import org.apache.beam.sdk.schemas.Schema;

    public Schema getSchema(SchemaPath schemaPath) throws IOException;

    public static class SchemaPath {
      /* Supports the projects/<project>/schemas/<schema> resource path [2]. */
    }

Additionally, I would like to propose two static helper methods to support PubsubGrpcClient [3] and PubsubJsonClient [4].

    static Schema fromPubsubSchema(com.google.api.services.pubsub.model.Schema pubsubSchema) {
      /* Converts Pub/Sub model Schema to Beam Schema; for use by PubsubJsonClient. */
    }

    static Schema fromPubsubSchema(com.google.pubsub.v1.Schema pubsubSchema) {
      /* Converts Pub/Sub model Schema to Beam Schema; for use by PubsubGrpcClient. */
    }

Finally, to support tests, I would like to add a Schema field to the private State class of PubsubTestClient [5].

    import org.apache.beam.sdk.schemas.Schema;

    private static class State {
      /** Expected Pub/Sub mapped Beam Schema. */
      @Nullable Schema schema;
    }

Proposal | Go

For Go, I would like to add the following to pubsubx [6].

    func GetSchema(ctx context.Context, client *pubsub.SchemaClient, schemaId string) (*pubsub.SchemaConfig, error) { ... }

    func EncodeSchema(dst reflect.Type, src *pubsub.SchemaConfig) ([]byte, error) { ... }

Rationale

Querying the Pub/Sub Schema [7] and converting it to a Beam Schema [8] would allow us to validate that the two schemas match and so prevent potential errors. This supports the design goals of Pub/Sub schemas: to establish a contract between publisher and subscriber, and to provide a single source of truth for inter-team production and consumption. The need for this feature surfaced while implementing work related to Pub/Sub and Beam Schemas, the details of which are outside the scope of this email.

Definitions/References

[1] *PubsubClient*: An (abstract) helper class for talking to Pubsub via an underlying transport.
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/pubsub/PubsubClient.html

[2] *Resource Path*: Google Cloud resource naming adheres to the Google API design guide using a '/' delimited pattern.
https://cloud.google.com/apis/design/resource_names

[3] *PubsubGrpcClient*: An implementation of PubsubClient [1] using gRPC.
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/pubsub/PubsubGrpcClient.html

[4] *PubsubJsonClient*: An implementation of PubsubClient [1] using JSON transport.
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/pubsub/PubsubJsonClient.html

[5] *PubsubTestClient*: A partial implementation of PubsubClient [1] for use by unit tests.
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/pubsub/PubsubTestClient.html

[6] *pubsubx*: The Beam Go SDK package containing utilities for working with Pub/Sub.
https://pkg.go.dev/github.com/apache/beam/sdks/v2@v2.42.0/go/pkg/beam/util/pubsubx

[7] *Pub/Sub Schema*: A format to which Pub/Sub data must adhere. It facilitates a contract between publisher and subscriber that Pub/Sub will enforce.
https://cloud.google.com/pubsub/docs/schemas

[8] *Beam Schema*: An object that describes Beam data elements, such as field names and their data types.
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/schemas/Schema.html

Suggested Learning Path To Understand This Email

1. *What is Pub/Sub?* - https://www.youtube.com/playlist?list=PLIivdWyY5sqKwVLe4BLJ-vlh9r9zCdOse
2. *What is Apache Beam?* - https://www.youtube.com/watch?v=65lmwL7rSy4&t=223s
3. *Apache Beam Overview* - https://beam.apache.org/documentation/programming-guide/#overview
4. *Transforms (Up to section 4.1)* - https://beam.apache.org/documentation/programming-guide/#transforms
5. *Pipeline I/O* - https://beam.apache.org/documentation/programming-guide/#pipeline-io
6. *Schemas* - https://beam.apache.org/documentation/programming-guide/#schemas
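
To make the SchemaPath part of the Java proposal a little more concrete, here is a minimal, standalone sketch of a type that parses and validates the projects/<project>/schemas/<schema> resource path [2]. It is illustrative only: the factory names (fromPath, of), the getters, and the simplified validation rule (any non-empty segment without a '/') are my assumptions and are not part of the proposal or of the existing PubsubClient API. It is written as a top-level class with no Beam dependencies so that it can be compiled on its own.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of a path type for projects/<project>/schemas/<schema>. */
final class SchemaPath {
  // Assumed shape of the resource path. Real Pub/Sub ID rules are stricter;
  // here any non-empty segment without '/' is accepted.
  private static final Pattern PATH =
      Pattern.compile("^projects/([^/]+)/schemas/([^/]+)$");

  private final String projectId;
  private final String schemaId;

  private SchemaPath(String projectId, String schemaId) {
    this.projectId = projectId;
    this.schemaId = schemaId;
  }

  /** Parses a full resource path, rejecting anything that does not match the pattern. */
  static SchemaPath fromPath(String path) {
    Matcher m = PATH.matcher(path);
    if (!m.matches()) {
      throw new IllegalArgumentException("Malformed schema path: " + path);
    }
    return new SchemaPath(m.group(1), m.group(2));
  }

  /** Builds a path from its parts, reusing the same validation as fromPath. */
  static SchemaPath of(String projectId, String schemaId) {
    return fromPath("projects/" + projectId + "/schemas/" + schemaId);
  }

  String getProjectId() {
    return projectId;
  }

  String getSchemaId() {
    return schemaId;
  }

  /** Returns the full resource path. */
  String getPath() {
    return "projects/" + projectId + "/schemas/" + schemaId;
  }

  @Override
  public String toString() {
    return getPath();
  }

  public static void main(String[] args) {
    SchemaPath path = SchemaPath.of("my-project", "my-schema");
    System.out.println(path.getPath()); // prints "projects/my-project/schemas/my-schema"
  }
}
```

Validating in one place means a malformed path fails fast, before any call to Pub/Sub is attempted. In the actual proposal, SchemaPath would be a nested static class of PubsubClient.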