Hi Folks,
I'm running Beam Python on Flink on Kubernetes. One thing I'm noticing is
that it takes a really long time for jobs to start. It looks like this
slowdown is due to the cost of uploading the Flink Beam Uber Jar (~225 Mb)
to the Job server.
Is there any way to speed this up?
1. Can the J
aving to upload the entire job server jar to Flink.
> Right now the best way to avoid this cost is to start a dedicated job
> server and submit your Beam Python job to that rather than using
> --flink_submit_uber_jar.
>
> On Tue, Aug 3, 2021 at 2:06 PM Jeremy Lewi wrote:
>
>&g
Hi Folks,
I'm running Beam Python (2.31.0) on Flink on Kubernetes. I'm using the
PortableRunner and a beam job server.
I'm using S3 for the artifacts dir. The job server is throwing exceptions
like the one below complaining that filesystem scheme S3 isn't registered.
I'm using the stock job serv
Thanks.
I opened
https://issues.apache.org/jira/browse/BEAM-12739
And submitted a patch
https://github.com/apache/beam/pull/15313
On Fri, Aug 6, 2021 at 7:57 AM Chamikara Jayalath
wrote:
> Hi Jeremy,
>
> On Thu, Aug 5, 2021 at 7:36 PM Jeremy Lewi wrote:
>
>> Hi Folks,
>
Hi Folks,
I'm trying to run Beam Python 2.31 on Flink 1.13.
I've created a simple streaming program to count Kafka messages. Running on
the DirectRunner this works fine. But when I try to submit to my Flink
cluster. I get the exception below in my taskmanager.
I'm using the PortableRunner. Any s
che/beam/vendor/grpc/v1p36p0/io/grpc/internal/DnsNameResolver.class
>
> Java has a tendency to only report the full cause on the first failure of
> this kind with all subsequent failures only reporting the
> ClassNotFoundException. This happens because the ClassLoader remembers
> which class
ntation/programming-guide/#multi-language-pipelines>
make
it seem like the expansion service only runs at job submission time and
only needs to be accessible from the machine where you are running your
python program to submit the job.
Thanks
J
On Wed, Aug 25, 2021 at 12:09 PM Jeremy Lewi wrote:
k
> is trying to do.
>
> On Wed, Aug 25, 2021 at 4:26 PM Jeremy Lewi wrote:
>
>> Hi Folks,
>>
>> So I tried putting the beam job server on the Flink JobManager and
>> Taskmanager containers and setting classloader.resolve-order
>> <https://ci.apache.org/pro
> Hope this helps, if you had any more questions, I'd be glad to help.
>
> Jan
>
> [1] https://issues.apache.org/jira/browse/BEAM-11998
>
> [2] https://issues.apache.org/jira/browse/BEAM-12538
>
> [3] https://issues.apache.org/jira/browse/BEAM-12704
> On 8/2
/java/core/src/main/java/org/apache/beam/sdk/options/PortablePipelineOptions.java
> On 8/26/21 3:14 PM, Jeremy Lewi wrote:
>
> HI Jan,
>
> That's very helpful. Do you have a timeline for 2.33? Until then I will
> try building from source.
>
> So if I understand correctl
Hi Folks,
I'm running Beam 2.33 using a JobServer on Flink on Kubernetes.
My client is unable to connect to the expansion service running on the job
server. Connecting to the other services works fine and I can submit jobs
that don't require the expansion service.
When I debugged this it looks l
>>> done due to a real failure in a different environment.
>>>
>>> On Fri, Aug 27, 2021 at 12:44 PM Jeremy Lewi
>>> wrote:
>>>
>>>> Hi Folks,
>>>>
>>>> I'm running Beam 2.33 using a JobServer on Flink on Kubernetes.
/jobsubmission/JobServerDriver.java#L260
Should I file a JIRA for this?
Thanks.
J
On Fri, Aug 27, 2021 at 12:36 PM Jeremy Lewi wrote:
> Thank you very much for the pointers. I'm working on getting the code
> built from 2.33 branch and trying that out.
>
> J
>
> On Thu, Aug
he job server
so that I could set the options to control the default environment and pass
the use_deprecated_read_flag.
J
On Fri, Aug 27, 2021 at 7:16 PM Jeremy Lewi wrote:
> It looks to me like https://github.com/apache/beam/pull/15082 added the
> ability configure the default en
I filed https://issues.apache.org/jira/browse/BEAM-12814 to support
external environments for the JAVA SDK harness to better support K8s.
On Sat, Aug 28, 2021 at 8:52 AM Jeremy Lewi wrote:
> Hi Folks,
>
> Thank you so much for all your help. I was able to get this working
> altho
_ that it might
> work for cluster as well. Did you test it and it did not work?
>
> Jan
>
> [1]
> https://github.com/apache/beam/blob/cbb363f2f01d44dd3f7c063c6cd9d529b5fa9104/sdks/python/apache_beam/runners/portability/flink_runner.py#L51
> On 8/28/21 5:52 PM, Jeremy Lewi wro
I filed BEAM-12836 <https://issues.apache.org/jira/browse/BEAM-12836> to
add options to the Flink Job Server to configure the expansion service.
J
On Mon, Aug 30, 2021 at 6:27 AM Jeremy Lewi wrote:
> HI Jan,
>
> I should clarify that I'm using the portable job runner not th
Hi Folks,
I'm running Python Beam on Flink and I'm trying to change the logging
levels. Specifically
- Set the log level to Debug
in org.apache.beam.runners.fnexecution.environment (so I log container
starts)
- Set the log level for Kafka called from the Java Kafka IO to WARN (I'm
issue.
Here's an updated gist.
https://gist.github.com/jlewi/759505b754ea0e84716afd58d59aedc0
J
On Tue, Oct 19, 2021 at 11:06 AM Jeremy Lewi wrote:
> Hi Folks,
>
> I'm running Python Beam on Flink and I'm trying to change the logging
> levels. Specifically
>
>
J
On Tue, Oct 19, 2021 at 5:44 PM Jeremy Lewi wrote:
> So I discovered this issue with the Flink Operator producing incorrect
> Log4J files
> https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/issues/472
> I based my config on those.
>
> I updated my configs based on
Hi Folks,
I'm using a patched version of apache Beam 2.35 python and running on Flink
on Kubernetes using the PortableJobRunner.
It looks like when submitting the job, the runner tries to upload a large
274 Mb flink job server jar to the staging service.
This doesn't seem right. I already have a
working so its
possible I broke something).
Is this working as intended? Is there someway to avoid this without having
to hack the runner code?
Interestingly, it seems like downloading the jars is much faster than
uploading them but I haven't investigated this.
J
On Fri, Feb 11, 2022 at 8:23 A
m/apache/beam/blob/7fa5387ffac4f2801077f2e55aa2eba7a47036d9/sdks/java/core/src/main/java/org/apache/beam/sdk/options/FileStagingOptions.java#L38
>
> Thanks,
> Cham
>
> On Fri, Feb 11, 2022 at 2:09 PM Jeremy Lewi wrote:
>
>> Hi Folks,
>>
>> So I think this is what
23 matches
Mail list logo