Hi Folks,
Question from a Newbie for both Calcite and Beam:
I understand Calcite can build an execution plan tree using relational
algebra and push certain operations down to a "data source". At the same
time, it allows source-specific optimizations.
I also understand that Beam SQL can run SQL ...
> [2]
> https://beam.apache.org/documentation/dsls/sql/extensions/create-external-table/#kafka
> [3]
> https://beam.apache.org/releases/javadoc/2.35.0/org/apache/beam/sdk/extensions/sql/SqlTransform.html
> [4]
> https://beam.apache.org/documentation/dsls/sql/extensions/create-external-table/#generic-payload-handling
>
> On Mon, Jan 10, 2
> ... PCollectionTuple, which, in turn, already contains another
> PCollection from your Kafka source. So, once you have
> a PCollectionTuple with two TupleTags (from Kafka and MySql), you can apply
> SqlTransform over it.
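A minimal sketch of the pattern Alexey describes above, assuming both inputs are
already schema-aware PCollection<Row>s; the tag names and the query are placeholders,
not from the original thread:

import org.apache.beam.sdk.extensions.sql.SqlTransform;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionTuple;
import org.apache.beam.sdk.values.Row;
import org.apache.beam.sdk.values.TupleTag;

// kafkaRows and mysqlRows are PCollection<Row> with schemas attached.
PCollectionTuple inputs = PCollectionTuple
    .of(new TupleTag<Row>("kafka_stream"), kafkaRows)
    .and(new TupleTag<Row>("mysql_table"), mysqlRows);

// The TupleTag ids become the table names visible to the query.
PCollection<Row> joined = inputs.apply(
    SqlTransform.query(
        "SELECT k.user_id, m.user_name "
            + "FROM kafka_stream k JOIN mysql_table m ON k.user_id = m.user_id"));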
>
> —
> Alexey
>
>
>
>
> On 11 Jan 2022, at 03:54, Yushu Yao wrote:
Hi Team,
We have a use case that needs to add `--acl bucket-owner-full-control`
whenever we upload a file to S3. With the aws cli, that would be:
aws s3 cp myfile s3://bucket/path/file --acl bucket-owner-full-control
To do the same in Java code (assuming aws s3 sdk v1), we use: ...
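For reference, a minimal sketch of the equivalent call with the plain S3 SDK v1
(bucket, key, and file names are placeholders):

import java.io.File;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.CannedAccessControlList;
import com.amazonaws.services.s3.model.PutObjectRequest;

AmazonS3 s3 = AmazonS3ClientBuilder.defaultClient();

// Same effect as the --acl bucket-owner-full-control flag on the cli.
PutObjectRequest request =
    new PutObjectRequest("bucket", "path/file", new File("myfile"))
        .withCannedAcl(CannedAccessControlList.BucketOwnerFullControl);
s3.putObject(request);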
> ... workaround for that, since the related Jira
> issue [1] is still open.
>
> Side question: are you interested only in the multipart version or in both?
>
> —
> Alexey
>
> [1] https://issues.apache.org/jira/browse/BEAM-10850
>
> On 9 Mar 2022, at 00:19, Yushu Yao wrote:
>
>
Hi Experts,
We have a dynamically configurable list of transformations to be performed
on an input stream. Each input element will be transformed by one of those
transformations, and each transformation can be expressed as SQL.
I am wondering whether this can be achieved with Beam SQL? Say, for each input, it
will ...
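One sketch of how this is sometimes approached, assuming the input is a schema-aware
PCollection<Row> whose hypothetical "rule_id" field selects the transformation, and
each configured SQL string reads FROM PCOLLECTION:

import java.util.List;
import org.apache.beam.sdk.extensions.sql.SqlTransform;
import org.apache.beam.sdk.transforms.Partition;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionList;
import org.apache.beam.sdk.values.Row;

// queries: the dynamically configured SQL strings, e.g. "SELECT ... FROM PCOLLECTION".
List<String> queries = loadConfiguredQueries();  // hypothetical helper

// Route each element to the partition of the query that should handle it.
PCollectionList<Row> routed = input.apply(
    Partition.of(queries.size(),
        (Row row, int numPartitions) -> row.getInt32("rule_id") % numPartitions));

// Apply one SqlTransform per configured query.
PCollectionList<Row> transformed = PCollectionList.empty(input.getPipeline());
for (int i = 0; i < queries.size(); i++) {
  transformed = transformed.and(
      routed.get(i).apply("sql_" + i, SqlTransform.query(queries.get(i))));
}

Whether the per-partition outputs can be flattened back together afterwards depends on
the configured queries producing a common output schema.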
Hi Folks,
I know this is not the optimal way to use Beam :-) But assume I only use
the Spark runner.
I have a Spark library (very complex) that emits a Spark dataframe (or
RDD).
I also have an existing, complex Beam pipeline that can do post-processing
on the data inside that dataframe.
However, ...
> On Mon, May 23, 2022 at 11:35 AM Robert Bradshaw
> wrote:
>
>> The easiest way to do this would be to write the RDD somewhere then
>> read it from Beam.
>>
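A rough sketch of that hand-off, under the assumption that the Spark job runs first and
persists the dataframe as Parquet to a shared location (the path and Avro schema below
are placeholders, not from the thread):

// Spark side (run first): df.write().parquet("s3://my-bucket/handoff/");

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.parquet.ParquetIO;
import org.apache.beam.sdk.values.PCollection;

// Beam side: read the persisted files back and continue with the existing pipeline.
Pipeline p = Pipeline.create(options);
Schema avroSchema = new Schema.Parser().parse(avroSchemaJson);  // schema of the written files
PCollection<GenericRecord> records =
    p.apply(ParquetIO.read(avroSchema).from("s3://my-bucket/handoff/*.parquet"));
// ... existing Beam post-processing here ...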
>> On Mon, May 23, 2022 at 9:39 AM Yushu Yao wrote:
>> >
>> > Hi Folks,
>> >
>> >
> ... be supported). And,
> dreaming a bit more, for those who need to have a mixed pipeline (e.g.
> Spark + Beam), such connectors could support push-down of pure Spark
> pipelines and then use the result downstream in Beam.
>
> —
> Alexey
>
>
>
> Brian
>
>
Hi Team,
Does anyone have a working example of a Beam job running on top of Spark,
so that I can use the Beam metrics syntax and have the metrics shipped
out via Spark's metrics infrastructure?
The only thing I have achieved is to call queryMetrics() every half second
and copy all the metrics into the Spark metrics system.
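For context, the Beam side of this is just the standard metrics API; a minimal counter
inside a DoFn looks roughly like this (class and metric names are placeholders), and the
open question is how its values reach Spark's sinks:

import org.apache.beam.sdk.metrics.Counter;
import org.apache.beam.sdk.metrics.Metrics;
import org.apache.beam.sdk.transforms.DoFn;

// A DoFn that increments a Beam counter for every element it processes.
class CountingFn extends DoFn<String, String> {
  private final Counter processed = Metrics.counter(CountingFn.class, "processed");

  @ProcessElement
  public void processElement(ProcessContext c) {
    processed.inc();
    c.output(c.element());
  }
}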
> ... yet. Typically, that means
> distributing such code by other means.
>
>
>
> /Moritz
>
>
>
>
>
> On 14.07.22, 21:26, "Yushu Yao" wrote:
>
>
>
> Hi Team, Does anyone have a working example of a beam job running on top
>
... cannot be instantiated
Caused by: java.lang.ClassNotFoundException:
com.salesforce.einstein.data.platform.connectors.JmxSink
I guess this is the classpath issue you were referring to above. Any hints
on how to fix it will be greatly appreciated.
Thanks!
-Yushu