Sorry for the typo. I mean I think we can go with *(3)* and (4): use the
data type that is schema-aware as the input of ReadAll.
On Wed, Jun 24, 2020 at 7:42 PM Boyuan Zhang wrote:
> Thanks for the summary, Cham!
>
> I think we can go with (2) and (4): use the data type that is schema-aware
> as
Thanks for the summary, Cham!
I think we can go with (2) and (4): use the data type that is schema-aware
as the input of ReadAll.
Converting Read into ReadAll helps us stick with SDF-like IO. But only
having (3) is not enough to solve the problem of using ReadAll in the x-lang
case.
The key poin
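To make the schema-aware option concrete, here is a minimal sketch of what such a
descriptor could look like (the connector and field names are purely hypothetical):
because the POJO carries a Beam schema, it can be encoded as a Row, which is the
property that makes it usable as the ReadAll input from cross-language pipelines.

  import org.apache.beam.sdk.schemas.JavaFieldSchema;
  import org.apache.beam.sdk.schemas.annotations.DefaultSchema;

  // Hypothetical source descriptor: a plain POJO with an inferred schema.
  // Row-encoded descriptors can be constructed from any SDK, which is what
  // matters for the x-lang case discussed above.
  @DefaultSchema(JavaFieldSchema.class)
  public class MySourceDescriptor {
    public String resourceName; // e.g. a topic, table, or file pattern
    public Long startOffset;    // where to start reading (nullable)
    public Long stopOffset;     // where to stop reading (nullable)

    public MySourceDescriptor() {} // keeps schema inference simple
  }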
I see. So it seems like there are three options discussed so far when it
comes to defining source descriptors for ReadAll-type transforms:
(1) Use Read PTransform as the element type of the input PCollection
(2) Use a POJO that describes the source as the data element of the input
PCollection
(3) P
I'm not aware of any ZeroMQ connector implementations that are part of
Apache Beam.
On Wed, Jun 24, 2020 at 11:44 AM Sherif A. Kozman <
sherif.koz...@extremesolution.com> wrote:
> Hello,
>
> We were in the process of planning a deployment of exporting stream data
> from Aruba Networks Analytics e
Hello,
We were in the process of planning a deployment of exporting stream data
from the Aruba Networks Analytics engine through Apache Beam, and it turns out
that it utilizes ZeroMQ for messaging.
We couldn't find any ZeroMQ connectors and were wondering whether one exists
or whether it would be compatible with
I believe we do require PTransforms to be serializable since anonymous
DoFns typically capture the enclosing PTransform.
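A small, self-contained illustration of that capture (the transform here is
hypothetical, not from any existing connector): the anonymous DoFn reads the outer
field, so javac gives it an implicit reference to the enclosing PTransform instance,
and serializing the DoFn drags the transform along with it.

  import org.apache.beam.sdk.transforms.DoFn;
  import org.apache.beam.sdk.transforms.PTransform;
  import org.apache.beam.sdk.transforms.ParDo;
  import org.apache.beam.sdk.values.PCollection;

  public class PrefixWords extends PTransform<PCollection<String>, PCollection<String>> {
    private final String prefix;

    public PrefixWords(String prefix) {
      this.prefix = prefix;
    }

    @Override
    public PCollection<String> expand(PCollection<String> input) {
      return input.apply(
          ParDo.of(
              new DoFn<String, String>() {
                @ProcessElement
                public void processElement(@Element String word, OutputReceiver<String> out) {
                  // Reading `prefix` captures the enclosing PrefixWords instance,
                  // so the whole PTransform must be serializable.
                  out.output(prefix + word);
                }
              }));
    }
  }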
On Wed, Jun 24, 2020 at 10:52 AM Chamikara Jayalath
wrote:
> Seems like Read in PCollection<Read> refers to a transform, at least here:
> https://github.com/apache/beam/blob/master/
Seems like Read in PCollection<Read> refers to a transform, at least here:
https://github.com/apache/beam/blob/master/sdks/java/io/hbase/src/main/java/org/apache/beam/sdk/io/hbase/HBaseIO.java#L353
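For readers following the link, a rough sketch of that pattern, where the configured
Read transform itself is the element type of the ReadAll input (builder method names
are best-effort from memory and may not match the HBaseIO API exactly):

  import org.apache.beam.sdk.Pipeline;
  import org.apache.beam.sdk.io.hbase.HBaseIO;
  import org.apache.beam.sdk.transforms.Create;
  import org.apache.beam.sdk.values.PCollection;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.client.Result;

  public class ReadTransformAsElementSketch {
    public static void main(String[] args) {
      Pipeline p = Pipeline.create();
      Configuration conf = new Configuration(); // HBase connection settings

      // The elements of the input PCollection are fully configured Read
      // transforms, which is why they need to be (de)serializable.
      PCollection<HBaseIO.Read> reads =
          p.apply(
              Create.of(
                  HBaseIO.read().withConfiguration(conf).withTableId("table-a"),
                  HBaseIO.read().withConfiguration(conf).withTableId("table-b")));

      PCollection<Result> rows = reads.apply(HBaseIO.readAll());
      p.run().waitUntilFinish();
    }
  }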
I'm in favour of separating construction-time transforms from execution-time
data objects that we store in
Hi Ismael,
I think the ReadAll in the IO connector refers to the IO with an SDF
implementation regardless of the type of input, whereas Read refers to
UnboundedSource. One major pushback against using KafkaIO.Read as the source
descriptor is that not all configurations of KafkaIO.Read are meaningful
to populate dur
To provide additional context, the KafkaIO ReadAll transform takes a
PCollection<KafkaSourceDescriptor>. This KafkaSourceDescriptor is a POJO
that contains the configurable parameters for reading from Kafka. This is
different from the pattern that Ismael listed because they take
PCollection<Read> as input and the Read is the s
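To make that shape concrete, a rough sketch of the descriptor-based variant (method
names reflect my recollection of the then-new KafkaIO API and may differ from what
is actually checked in):

  import org.apache.beam.sdk.io.kafka.KafkaIO;
  import org.apache.beam.sdk.io.kafka.KafkaRecord;
  import org.apache.beam.sdk.io.kafka.KafkaSourceDescriptor;
  import org.apache.beam.sdk.values.PCollection;
  import org.apache.kafka.common.serialization.StringDeserializer;

  public class SourceDescriptorAsElementSketch {
    // The input elements are plain data objects (KafkaSourceDescriptor),
    // not configured transforms. Building `descriptors` (e.g. from a
    // topic-listing step) is out of scope for this sketch.
    static PCollection<KafkaRecord<String, String>> readAll(
        PCollection<KafkaSourceDescriptor> descriptors) {
      return descriptors.apply(
          KafkaIO.<String, String>readSourceDescriptors()
              .withKeyDeserializer(StringDeserializer.class)
              .withValueDeserializer(StringDeserializer.class));
    }
  }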
Hi Ismael,
Thanks for taking this on. Have you considered an approach similar (or
dual) to FileIO.write(), where we in a sense also have to configure a
dynamic number of different IO transforms of the same type (file writes)?
E.g. how in this example we configure many aspects of many file writes:
t
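For context, a rough sketch of the FileIO.writeDynamic() shape being referred to,
where a single transform configures many per-destination writes; bucket paths and
names below are illustrative only:

  import org.apache.beam.sdk.coders.StringUtf8Coder;
  import org.apache.beam.sdk.io.FileIO;
  import org.apache.beam.sdk.io.TextIO;
  import org.apache.beam.sdk.transforms.Contextful;
  import org.apache.beam.sdk.values.KV;
  import org.apache.beam.sdk.values.PCollection;

  public class DynamicWriteSketch {
    // Each element is (category, line); the destination is derived from the
    // element, so one transform fans out into many configured file writes.
    static void writeByCategory(PCollection<KV<String, String>> logsByCategory) {
      logsByCategory.apply(
          FileIO.<String, KV<String, String>>writeDynamic()
              .by((KV<String, String> kv) -> kv.getKey())
              .via(
                  Contextful.fn((KV<String, String> kv) -> kv.getValue()),
                  TextIO.sink())
              .to("gs://my-bucket/logs/")
              .withDestinationCoder(StringUtf8Coder.of())
              .withNaming(
                  (String category) -> FileIO.Write.defaultNaming(category, ".txt")));
    }
  }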
Thanks for the information. So it looks like we can't easily run portable
pipelines on a Dataproc cluster at the moment.
> you can set --output_executable_path to create a jar that you can then
submit to yarn via spark-submit.
I tried to create a jar, but I ran into a problem. I left an error messa
Hi Brian,
Done. Welcome to the project!
> On 24 Jun 2020, at 01:52, Brian Michalski wrote:
>
> Greetings!
>
> I'm wading my way a few small Go SDK tickets. Can I have contributor
> permissions on JIRA? My username is bamnet.
>
> Thanks,
> ~Brian M
Hello,
(my apologies for the long email, but this requires context)
As part of the move from Source-based IOs to DoFn-based ones, one pattern
has emerged due to the composable nature of DoFn. The idea is to have a different
kind of composable read where we take a PCollection of different sorts of
inter
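For readers new to the thread, a minimal sketch of the general shape under
discussion, with a PCollection of descriptors expanded at execution time; everything
below is hypothetical naming, not an existing connector:

  import org.apache.beam.sdk.transforms.DoFn;
  import org.apache.beam.sdk.transforms.PTransform;
  import org.apache.beam.sdk.transforms.ParDo;
  import org.apache.beam.sdk.values.PCollection;

  // Hypothetical ReadAll-style composite: instead of a single, statically
  // configured Read, it consumes a PCollection of source descriptors (here
  // just table names, for simplicity) and reads each one at execution time,
  // typically with a splittable DoFn in a real connector.
  public class ReadAllTables extends PTransform<PCollection<String>, PCollection<String>> {
    @Override
    public PCollection<String> expand(PCollection<String> tableNames) {
      return tableNames.apply(
          "ReadEachTable",
          ParDo.of(
              new DoFn<String, String>() {
                @ProcessElement
                public void processElement(@Element String table, OutputReceiver<String> out) {
                  // A real connector would open the table named by `table`
                  // and emit its rows; omitted here because this is a sketch.
                }
              }));
    }
  }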