Hey,
The documentation lists two options for running Python Beam on Flink:
"Portable (Python)" and "Portable (Python / Go / Java)". What are the
differences between them, and which one is recommended?
Thanks!
Nir
Hi,
I'm trying to deploy the word count job on a Flink cluster in Kubernetes.
However, when trying to run the job (with Python workers as a sidecar to
the Flink task managers), I get the following error:
2021/01/14 14:11:36 Initializing python harness: /opt/apache/beam/boot
--id=1-1 --logging_endp
It's mostly just a different entry point. With (Python / Go / Java), the
user has to start a job server themselves before submitting a job. With
(Python), the job server is managed automatically in the background. Aside
from that, though, there's very little difference. We recommend (Python) to
new users.
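In case it helps to see the difference concretely, here's a minimal Python
sketch contrasting the two entry points. The Flink address, job server port,
and image tag are assumptions for illustration, not values from this thread.

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    # "Portable (Python)": FlinkRunner starts and manages a job server for you.
    managed = PipelineOptions([
        "--runner=FlinkRunner",
        "--flink_master=localhost:8081",  # assumed Flink REST address
        "--environment_type=LOOPBACK",    # convenient for local testing
    ])

    # "Portable (Python / Go / Java)": start a job server yourself first, e.g.
    #   docker run --net=host apache/beam_flink1.10_job_server:latest
    # (image tag is an assumption), then point the PortableRunner at it.
    external = PipelineOptions([
        "--runner=PortableRunner",
        "--job_endpoint=localhost:8099",  # default job server port
        "--environment_type=LOOPBACK",
    ])

    with beam.Pipeline(options=managed) as p:
        p | beam.Create(["hello", "world"]) | beam.Map(print)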
Hi Nir,
I have a simple repo where I have a proof of concept deployment setup for
doing this.
https://github.com/sambvfx/beam-flink-k8s
Depending on the type of runner you're using, there are a few explanations.
That repo should hopefully point you in the right direction.
On Thu, Jan 14, 2021 at
Should Unnest be allowed to specify multiple array fields, or just one?
On Wed, Jan 13, 2021 at 11:59 PM Manninger, Matyas <
matyas.mannin...@veolia.com> wrote:
> I would also not unnest arrays nested in arrays just the top-level array
> of the specified fields.
>
> On Wed, 13 Jan 2021 at 20:58,
I think it makes sense to allow specifying more than one, if desired. This
is equivalent to just stacking multiple Unnests. (Possibly one could even
have a special syntax like "*" for all array fields.)
On Thu, Jan 14, 2021 at 10:05 AM Reuven Lax wrote:
> Should Unnest be allowed to specify multiple array fields, or just one?
And the result is essentially a cross product of all the different array
elements?
On Thu, Jan 14, 2021 at 11:25 AM Robert Bradshaw
wrote:
> I think it makes sense to allow specifying more than one, if desired. This
> is equivalent to just stacking multiple Unnests. (Possibly one could even
> have a special syntax like "*" for all array fields.)
I don't see any other reasonable interpretation. (One could use this as an
argument to only support one field at a time, to make the potential
explosion in data size all the more obvious.)
On Thu, Jan 14, 2021 at 11:30 AM Reuven Lax wrote:
> And the result is essentially a cross product of all the different array
> elements?
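To make the cross-product semantics concrete, here's a small plain-Python
sketch (the field names and values are made up for illustration):

    import itertools

    row = {"id": 1, "tags": ["a", "b"], "scores": [10, 20]}

    # Unnesting both "tags" and "scores" yields the cross product of their
    # elements -- the same result as stacking two single-field Unnests.
    unnested = [
        {"id": row["id"], "tags": tag, "scores": score}
        for tag, score in itertools.product(row["tags"], row["scores"])
    ]

    print(unnested)  # 2 x 2 = 4 output rows; large arrays blow up quickly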
Thanks! I actually managed to put together my own deployment for running it
locally, and it works well with local files, but I get this weird error while
trying to run the word count example and I'm trying to understand what the
cause of it might be.
On Thu, Jan 14, 2021 at 7:29 PM Sam Bourne wrote:
> Hi Nir
Thanks! And is there a recommended approach for deploying the SDK harness?
On Thu, Jan 14, 2021 at 7:19 PM Kyle Weaver wrote:
> It's mostly just a different entry point. With (Python / Go / Java), the
> user has to start a job server themselves before submitting a job. With
> (Python), the job s
BTW - is this something we may want to add to the official repo? (I can do
it, of course.) It took me a while to figure out how to set up a local
deployment for Flink, and the docs weren't super detailed.
On Thu, Jan 14, 2021 at 9:39 PM Nir Gazit wrote:
> Thanks! I actually managed to have my own deployment f
DOCKER is the recommended way, but for testing and development you can also
use LOOPBACK.
More details here:
https://github.com/apache/beam/blob/17fbe8be70b71bc1236e553a5fe7902a2df5dda4/sdks/python/apache_beam/options/pipeline_options.py#L1115
On Thu, Jan 14, 2021 at 11:40 AM Nir Gazit wrote:
> Th
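If it helps, here's a minimal sketch of how those environment types are
selected via pipeline options; the image tag and worker pool address below
are assumptions, not values from this thread.

    from apache_beam.options.pipeline_options import PipelineOptions

    # DOCKER: workers launch the SDK harness in a container (recommended).
    docker_opts = PipelineOptions([
        "--environment_type=DOCKER",
        "--environment_config=apache/beam_python3.7_sdk:2.27.0",  # assumed image
    ])

    # EXTERNAL: point at an already-running worker pool, e.g. a sidecar
    # container started with --worker_pool (address is an assumption).
    external_opts = PipelineOptions([
        "--environment_type=EXTERNAL",
        "--environment_config=localhost:50000",
    ])

    # LOOPBACK: run the harness in the submitting process (testing/dev only).
    loopback_opts = PipelineOptions([
        "--environment_type=LOOPBACK",
    ])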
> Thanks! I actually managed to have my own deployment for running it locally
> and it works well for local file, but I get this weird error while trying
> to run the word count example and I'm trying to understand what can be the
> cause of it?
If you’re referring to this error:
2021/01/14 14:11:45 Fail