Hello Everyone,

*For those new to Beam, even if this is your first day, consider yourselves
a welcome contributor to this conversation.  Below are
definitions/references and a suggested learning path to understand this
email.*

Proposal | Java

For Java, I would like to add the following to PubsubClient [1].

import org.apache.beam.sdk.schemas.Schema;


public Schema getSchema(SchemaPath schemaPath) throws IOException;

public static class SchemaPath { /* Supports the
projects/<project>/schemas/<schema> resource path[2]. */ }


Additionally, I would like to propose two static helper methods to support
the PubsubGrpcClient [3], and PubsubJsonClient [4].

static Schema fromPubsubSchema(com.google.api.services.pubsub.model.Schema
pubsubSchema) { /* Converts Pub/Sub model Schema to Beam Schema; for use by
PubsubJsonClient. */ }


static Schema fromPubsubSchema(com.google.pubsub.v1.Schema pubsubSchema) {
/* Converts Pub/Sub model Schema to Beam Schema; for use by
PubsubGrpcClient. */ }


Finally, to support tests, I would like to add this new Schema feature
in PubsubTestClient's [5] private State class.

import org.apache.beam.sdk.schemas.Schema;

private static class State {

/** Expected Pub/Sub mapped Beam Schema. */
@Nullable Schema schema;

}

Proposal | Go

For Go, I would like to add the following to pubsubx [6].

func GetSchema(ctx context.Context, client *pubsub.SchemaClient, schemaId
string) (*pubsub.SchemaConfig, error) { ... }

func EncodeSchema(dst reflect.Type, src *pubsub.SchemaConfig) ([]byte,
error) { ... }


Rationale

Querying from and converting the Pub/Sub Schema [7] to a Beam Schema[8]
would allow us to validate that both schemas match to prevent potential
errors.  This supports the design goals of Pub/Sub schemas to facilitate a
contract between publisher and subscriber and facilitate a single source of
truth for inter-team production and consumption.  This feature surfaced
while implementing work related to Pub/Sub and Beam Schemas whose detail is
excluded from this email.

Definitions/References

[1] PubsubClient: An (abstract) helper class for talking to Pubsub via an
underlying transport.
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/pubsub/PubsubClient.html

[2] *Resource Path*: Google Cloud resource naming adheres to the Google API
design guide using a '/' delimited pattern.
https://cloud.google.com/apis/design/resource_names

[3] PubsubGrpcClient: An implementation of PubsubClient [1] using gRPC.
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/pubsub/PubsubGrpcClient.html

[4] *PubsubJsonClient*: An implementation of PubsubClient [1] using JSON
transport.
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/pubsub/PubsubJsonClient.html

[5] PubsubTestClient: A partial implementation of PubsubClient [1] for use
by unit tests.
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/io/gcp/pubsub/PubsubTestClient.html

[6] pubsubx: The Beam Go SDK's package contains utilities for working with
Pub/Sub.
https://pkg.go.dev/github.com/apache/beam/sdks/v2@v2.42.0/go/pkg/beam/util/pubsubx

[7] *Pub/Sub Schema*: A format to which Pub/Sub data must adhere.  It
facilitates a contract between publisher and subscriber that Pub/Sub will
enforce.
https://cloud.google.com/pubsub/docs/schemas

[8] *Beam Schema*:  An object that describes Beam data elements such as
field names and their data types.
https://beam.apache.org/releases/javadoc/current/org/apache/beam/sdk/schemas/Schema.html

Suggested Learning Path To Understand This Email

1. *What is Pub/Sub?* -
https://www.youtube.com/playlist?list=PLIivdWyY5sqKwVLe4BLJ-vlh9r9zCdOse
2. *What is Apache Beam?* -
https://www.youtube.com/watch?v=65lmwL7rSy4&t=223s
3. *Apache Beam Overview* -
https://beam.apache.org/documentation/programming-guide/#overview
4. *Transforms (Up to section 4.1)* -
https://beam.apache.org/documentation/programming-guide/#transforms
5. *Pipeline I/O* -
https://beam.apache.org/documentation/programming-guide/#pipeline-io
6. Schemas -
https://beam.apache.org/documentation/programming-guide/#schemas

Reply via email to