I have a feeling there has to be something going wrong with your
source. If I were you, I would probably do the following:
- verify that I never produce an UnboundedSource split with no queue
associated (Preconditions.checkState())
- make sure that I *really never* block in a call to advance()
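The non-blocking advance() check above can be sketched as follows; the class and attribute names are illustrative, not the actual Beam UnboundedSource reader API:

```python
import queue

class QueueBackedReader:
    """Sketch of a reader that polls a local queue without ever blocking."""

    def __init__(self):
        self.buffer = queue.Queue()
        self.current = None

    def advance(self):
        try:
            # get_nowait() never blocks: return False immediately when empty.
            self.current = self.buffer.get_nowait()
            return True
        except queue.Empty:
            return False

r = QueueBackedReader()
print(r.advance())        # False: queue is empty
r.buffer.put('record-1')
print(r.advance())        # True
print(r.current)          # record-1
```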
Hi, all. I am still ramping up on my learning of how to use Beam, and I
have a couple of questions for the experts. And, while I have read the
documentation, I have either looked at the wrong parts, or my particular
questions were not specifically answered. If I have missed something, then
pleas
(1) I don't have direct experience with contacting MongoDB in Beam, but my
general expectation is that yes, Beam IOs will do reasonable things for
connection setup and teardown. For the case of MongoDbIO in the Java SDK,
it looks like this connection setup happens in BoundedMongoDbReader [0]. In
ge
I already poll a local queue in advance() and return false if the queue is empty.
Is there an example of restricting UnboundedSource splits?
On 2019/09/25 08:13:43, Jan Lukavský wrote:
> I have a feeling there has to be something going wrong with your
> source. If I were you, I would probably do
Hi Magnus,
Your observation seems to be correct. There is an issue with the file
system registration.
The two types of errors you are seeing, as well as the successful run,
are just due to the different structure of the generated transforms. The
Flink scheduler will distribute them differently.
Hi Preston,
Sorry about the name mixup, of course I meant to write Preston not
Magnus :) See my reply below.
cheers,
Max
On 25.09.19 08:31, Maximilian Michels wrote:
Hi Magnus,
Your observation seems to be correct. There is an issue with the file
system registration.
The two types of err
Not a problem! Thanks for looking into this. In reading through the source
associated with the stacktrace, I also noticed that there's neither user-code,
nor beam-to-flink lifecycle code available for initialization. As far as I
could tell, it was pure Flink down to the coders. Nothing new here.
Hi Steve,
1) In general, managing client connections is the responsibility of the IO
transform. Usually, one client instance is used per input split (bounded or
unbounded source): it opens a connection at the beginning of reading the
split and closes it at the end. That is the theory. In practice,
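The open-at-start / close-at-end lifecycle described above can be sketched with a stand-in client; FakeClient and ReadFromSplit are illustrative names, not Beam or MongoDB APIs:

```python
class FakeClient:
    """Stand-in for a real database client (e.g. a MongoDB client)."""

    def __init__(self):
        self.open = True

    def query(self, x):
        return x * 2

    def close(self):
        self.open = False

class ReadFromSplit:
    """One client per split: open at the start of reading, close at the end."""

    def __init__(self, elements):
        self.elements = elements

    def run(self):
        client = FakeClient()              # connection opened for this split
        try:
            return [client.query(e) for e in self.elements]
        finally:
            client.close()                 # always closed, even on failure

print(ReadFromSplit([1, 2, 3]).run())      # [2, 4, 6]
```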
+1 for local execution using Flink.
On Tue, Sep 17, 2019 at 4:24 AM Paweł Kordek
wrote:
> Hi Steve
>
> Maybe local execution on a Flink cluster will work for you:
> https://beam.apache.org/documentation/runners/flink/ ?
>
> Cheers
> Pawel
>
> On Tue, 17 Sep 2019 at 10:51, Steve973 wrote:
>
>> H
Gentle reminder that Seattle Apache Beam meetup is happening tomorrow!
Here is a quick agenda:
- 18:00 - Registrations, speed networking, pizza and drinks.
- 18:30 - kick-off
- 18:40 - Making Beam Schemas Portable by Brian Hulette
- 19:10 - Apache Beam @ Brightcove - A case study
- 19:40 - ZetaSQL
Hello.
I would like to ask a question about ParDo.
I am getting below error inside TaskManager when running code on Apache
Flink using Portable Runner.
=
File
"/usr/local/lib/python3.5/site-packages/apache_beam/runners/worker/bundle_processo
You will need to set the save_main_session pipeline option to True.
Kyle Weaver | Software Engineer | github.com/ibzib | kcwea...@google.com
On Wed, Sep 25, 2019 at 3:44 PM Yu Watanabe wrote:
> Hello.
>
> I would like to ask a question about ParDo.
>
> I am getting below error inside TaskManager
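The suggested fix can be sketched as follows; save_main_session is a real Beam Python pipeline option, and this is a minimal configuration sketch rather than a full pipeline:

```python
from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions

# Enable save_main_session so names defined in the main module (imports,
# globals) are pickled and made available on the workers; without it,
# DoFns referencing main-session names can fail with NameError on Flink.
options = PipelineOptions()
options.view_as(SetupOptions).save_main_session = True
```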
Google Dataflow currently uses a JSON representation of the pipeline graph
and also the pipeline proto. We represent the graph in two different ways
which leads to some wonderful *features*.
Google Dataflow also side steps the Beam job service since Dataflow has its
own Job API. Supporting the Beam
> The only requirement is that TaskManager nodes must have access to Docker.
That's not strictly true anymore; there is also the PROCESS option, but
I'll warn you it's still under-documented. See
https://beam.apache.org/roadmap/portability/#sdk-harness-config
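A minimal sketch of selecting the PROCESS environment, assuming the portable runner's standard options (--environment_type and --environment_config are real options; the endpoint and boot-command path are placeholders):

```python
from apache_beam.options.pipeline_options import PipelineOptions

# PROCESS runs the SDK harness as a local process on each TaskManager
# instead of inside a Docker container.
options = PipelineOptions([
    '--runner=PortableRunner',
    '--job_endpoint=localhost:8099',
    '--environment_type=PROCESS',
    '--environment_config={"command": "/path/to/boot"}',
])
```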
> Sorry if/that this is stupid questi
Thank you for the reply.
"save_main_session" did not work; however, the situation has changed.
1. get_all_options() output: "save_main_session" is set to True.
=
2019-09-26 09:04:11,586 DEBUG Pipeline Options:
{'wait_until_
super() has some issues while pickling in Python 3. Please refer to
https://github.com/uqfoundation/dill/issues/300 for more details.
Removing the reference to super() in your DoFn should help.
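A minimal sketch of the workaround, using a stand-in base class instead of apache_beam.DoFn so the snippet is self-contained; the fix itself is the same: call the base class directly instead of super():

```python
class DoFn:
    """Stand-in for apache_beam.DoFn, only here to keep the sketch runnable."""

    def __init__(self):
        self.initialized = True

class FormatFn(DoFn):
    def __init__(self, prefix):
        # Workaround: explicit base-class call instead of super().__init__(),
        # which dill can fail to pickle under Python 3 (dill issue #300).
        DoFn.__init__(self)
        self.prefix = prefix

    def process(self, element):
        yield '%s:%s' % (self.prefix, element)

fn = FormatFn('out')
print(list(fn.process('a')))  # ['out:a']
```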
On Wed, Sep 25, 2019 at 5:13 PM Yu Watanabe wrote:
> Thank you for the reply.
>
> " save_main_session" did not wor
Thank you for the help.
I have chosen to remove the super().__init__() call.
Thanks,
Yu
On Thu, Sep 26, 2019 at 9:18 AM Ankur Goenka wrote:
> super() has some issues while pickling in Python 3. Please refer to
> https://github.com/uqfoundation/dill/issues/300 for more details.
>
> Removing reference to s
Thanks for organizing. I'll be there!
Kenn
On Wed, Sep 25, 2019 at 2:50 PM Aizhamal Nurmamat kyzy
wrote:
> Gentle reminder that Seattle Apache Beam meetup is happening tomorrow!
>
> Here is a quick agenda:
> - 18:00 - Registrations, speed networking, pizza and drinks.
> - 18:30 - kick-off
> - 1
Actually, there is a good example in the latest wordcount.py in the master repo.
https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount.py
On Thu, Sep 26, 2019 at 12:00 PM Yu Watanabe wrote:
> Thank you for the help.
>
> I have chosen to remove the super().__init__()
Hello.
I would like to ask for help with resolving a dependency issue for an
imported module.
I have a directory structure as below, and I am trying to import the Frames
class from frames.py into main.py.
=
quality-validation/
bin/setup.py
main.py
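One common way to make a local module importable on remote workers is the --setup_file pipeline option; a minimal sketch, assuming a layout like the one above (setup_file is a real Beam Python option; the path is a placeholder):

```python
from apache_beam.options.pipeline_options import PipelineOptions, SetupOptions

# Point the pipeline at the package's setup.py so the local package is
# built and installed on each worker, making `from frames import Frames`
# resolvable there as well as locally.
options = PipelineOptions()
options.view_as(SetupOptions).setup_file = './setup.py'
</imports>
```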