Hi Nam-Luc,
Several of your observations in the blog post are to the point. Iterations
are already pipelined, and the distributed state that the delta iterations
access could possibly be lifted to a parameter server API.
We need to work a bit through the details on how fault tolerance and
terminati
Hey Tran Nam-Luc!
Great post with some really cool thoughts.
I just posted this answer to your LinkedIn post.
Greetings,
Stephan
=
Nice post, very cool idea! Your understanding of Flink in that respect is
really good. I had not heard of SSP before,
bu
As a workaround, it should always work to get the Edge and Vertex data set
from the graph and use the regular Flink iteration operators?
On Sun, Feb 22, 2015 at 4:53 PM, Vasiliki Kalavri wrote:
> Hi,
>
> yes, I was referring to the parallel Boruvka algorithm. There are several
> ways to implemen
Does anyone know of a good, simple and realistic streaming iteration
example? The current example tests a random generator, but it should be
replaced by something deterministic in order to be testable.
Peter
I think that the Samoa people have quite a few nice examples along the
lines of model training with feedback.
@Paris: What would be the simplest example?
On Mon, Feb 23, 2015 at 11:27 AM, Szabó Péter
wrote:
> Does anyone know of a good, simple and realistic streaming iteration
> example? The
Hi Stephan,
yes, this would work for the cases where an algorithm only updates the
vertex values or only updates the edge values.
What we would like to also support is
(a) algorithms where both vertices and edges are updated in one iteration
(b) algorithms where the graph structure changes from o
Some things may not work well as "closed-loop" iterations.
Is it possible to express those as for-loop iterations?
On Mon, Feb 23, 2015 at 1:03 PM, Vasiliki Kalavri wrote:
> Hi Stephan,
>
> yes, this would work for the cases where an algorithm only updates the
> vertex values or only updates th
Stephan Ewen created FLINK-1598:
---
Summary: Give better error messages when serializers run out of
space.
Key: FLINK-1598
URL: https://issues.apache.org/jira/browse/FLINK-1598
Project: Flink
Is
Hello Peter,
Streaming machine learning algorithms make use of iterations quite widely. One
simple example is implementing distributed stream learners. There, in many
cases you need some central model aggregator, distributed estimators to offload
the central node and of course feedback loops to
Nice. Thank you guys!
@Paris
Are there any Flink implementations of this model? The GitHub doc is quite
general.
Peter
2015-02-23 14:05 GMT+01:00 Paris Carbone :
> Hello Peter,
>
> Streaming machine learning algorithms make use of iterations quite widely.
> One simple example is implementing di
Max Michels created FLINK-1599:
--
Summary: TypeComperator
Key: FLINK-1599
URL: https://issues.apache.org/jira/browse/FLINK-1599
Project: Flink
Issue Type: Bug
Components: Distributed Ru
for-loop iterations could cover some cases, I guess, when the number of
iterations is known beforehand.
Are there currently any restrictions on what can be used inside a for-loop?
How are they translated into execution plans?
On 23 February 2015 at 13:08, Stephan Ewen wrote:
> Some things may no
Stephan Ewen created FLINK-1600:
---
Summary: Failure when submitting a job leaves user code libraries
in the BLOB manager
Key: FLINK-1600
URL: https://issues.apache.org/jira/browse/FLINK-1600
Project: Fli
For loops are basically rolled out - they yield long execution plans.
On Mon, Feb 23, 2015 at 2:44 PM, Vasiliki Kalavri wrote:
> for-loop iterations could cover some cases, I guess, when the number of
> iterations is known beforehand.
> Are there currently any restrictions on what can be used in
Closed-loop iterations are much more efficient right now. Long for loops
suffer from memory fragmentation (an issue that is on the list to fix).
Also, closed loops can be stateful (delta iterations) and do not require
task re-deployment.
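
To make the distinction concrete, here is a minimal plain-Python sketch. It is
a toy model, not Flink API: `unrolled_plan`, `run_unrolled` and
`run_closed_loop` are hypothetical names chosen for illustration. It only shows
that a rolled-out for loop yields a plan with one operator copy per iteration,
while a closed loop keeps a single step operator and can stop on a convergence
criterion.

```python
from typing import Callable, List, Tuple

Step = Callable[[List[int]], List[int]]


def unrolled_plan(step: Step, iterations: int) -> List[Step]:
    """Toy 'for loop': the plan contains one copy of the step per iteration."""
    return [step] * iterations


def run_unrolled(plan: List[Step], data: List[int]) -> List[int]:
    """Execute every operator of the rolled-out plan in order."""
    for op in plan:
        data = op(data)
    return data


def run_closed_loop(
    step: Step,
    data: List[int],
    max_iterations: int,
    converged: Callable[[List[int], List[int]], bool],
) -> Tuple[List[int], int]:
    """Toy 'closed loop': one step operator, output fed back as input.

    Stops early when the convergence criterion holds and returns the
    result together with the number of iterations actually executed.
    """
    executed = 0
    for _ in range(max_iterations):
        new_data = step(data)
        executed += 1
        done = converged(data, new_data)
        data = new_data
        if done:
            break
    return data, executed


def halve(xs: List[int]) -> List[int]:
    # Example step function: integer-halve every element.
    return [x // 2 for x in xs]


if __name__ == "__main__":
    plan = unrolled_plan(halve, 10)
    print(len(plan), run_unrolled(plan, [8, 5]))
    print(run_closed_loop(halve, [8, 5], 10, lambda old, new: old == new))
```

In this toy run the unrolled plan always holds 10 operators, whereas the closed
loop holds one and stops after 5 iterations once the data no longer changes.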
On Mon, Feb 23, 2015 at 4:15 PM, Vasiliki Kalavri wrote:
I see, that's cool :-)
So, what is the advantage of closed-loop versus for-loop iterations?
Custom convergence criteria / aggregators and more efficient execution
plans?
On 23 February 2015 at 15:01, Stephan Ewen wrote:
> For loops are basically rolled out - they yield long execution plans.
>
> O
Till Rohrmann created FLINK-1601:
Summary: Sometimes the YARNSessionFIFOITCase fails on Travis
Key: FLINK-1601
URL: https://issues.apache.org/jira/browse/FLINK-1601
Project: Flink
Issue Type:
Hello guys,
Thank you for your replies.
>(1) How to checkpoint such computations in order to recover them
upon failures.
Approximate snapshots are possible with SSP. In this case, a snapshot
(solution set and/or working set; what exactly is snapshotted can be
discussed) is triggered once a "clock
I’m getting "Could not build up connection to JobManager." when I try to run
the WordCount example. Can anyone help?
Dulaj
I see, thanks a lot for the answers!
To rephrase my original question, would it make sense to define a
closed-loop iteration where the state is the whole graph?
If you want to take a look at the current implementation of DMST using
delta iteration, Andra has made a PR [1].
On a high-level, this al
Hi,
you said in the other email thread that the error only occurs for
WordCount, not for KMeans.
Can you copy me the commands for both examples?
I cannot really believe that there is a difference between the two jobs.
Can you also send us the contents of the jobmanager log file?
Best,
Robert
O
Yes. It seems it is not a problem with the arguments. I tried for two days, but
different errors occur. It seems the web client can’t connect to the job
manager although it is running.
Right now, I can’t even get the webclient to run. ./bin/start-webclient.sh
executes fine but I cannot connect to loca
Thank you for the quick reply.
The log you've sent is from the webclient. Can you also send the log of the
JobManager?
On Mon, Feb 23, 2015 at 7:28 PM, Dulaj Viduranga
wrote:
> Yes. It seems it is not a problem with the arguments. I tried for two days, but
> different errors occur. It seems the web
Hi all,
we’re currently playing/messing around with scheduling in Flink 0.7. We found
out that if we run a single job with a certain degree of parallelism, multiple
tasks/vertices are executed within a single task manager at the same time (or
at least before the prior stage is switched to finis
Hi Nico,
yes, Flink runs tasks in threads. Each TaskManager runs in its own JVM,
everything within a TaskManager is parallelized in threads. Since a TM can
offer multiple slots, also tasks across slots run in the same JVM and in
different threads.
Flink has a pipelined processing model, which mea
In addition to what Fabian wrote:
Yes, one slot can run multiple tasks. In the batch API, one slot can
concurrently run one task of each operation (for example one source, one
mapper, one reducer, one sink).
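
As a rough illustration of that rule, here is a small plain-Python sketch. It
is not Flink code: `assign_to_slots` is a hypothetical helper, and it assumes
every operation has the same parallelism, equal to the number of slots.

```python
from typing import Dict, List


def assign_to_slots(operations: List[str], parallelism: int) -> Dict[int, List[str]]:
    """Toy model: slot i runs subtask i of every operation.

    Assumption (not from the thread): all operations share the same
    parallelism, so each slot gets exactly one task per operation.
    """
    slots: Dict[int, List[str]] = {i: [] for i in range(parallelism)}
    for op in operations:
        for i in range(parallelism):
            slots[i].append(f"{op}[{i}]")
    return slots


if __name__ == "__main__":
    layout = assign_to_slots(["source", "mapper", "reducer", "sink"], 2)
    for slot, tasks in layout.items():
        print(slot, tasks)
```

Under these assumptions, slot 0 holds source[0], mapper[0], reducer[0] and
sink[0], and never two subtasks of the same operation.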
On Mon, Feb 23, 2015 at 10:38 PM, Fabian Hueske wrote:
> Hi Nico,
>
> yes, Flink runs
Hi Dulaj,
That error message indicates that the JobManager is not running. Are you
sure that the JobManager runs properly? Anything in the JobManager logs?
BTW: The 0.9 branch is under heavy development / changes. That is why it
may behave a bit differently on different days right now. I would reco
Hi All,
I am seeing some identical class names, albeit in different packages,
that could confuse new contributors. One of the attractions
of Spark is that its code structure is simpler to follow than Hadoop's
(or Hive's, for that matter).
For example we have IntermediateResultPartition in bo
Henry Saputra created FLINK-1602:
Summary: Remove 0.6-incubating from Flink website SVN
Key: FLINK-1602
URL: https://issues.apache.org/jira/browse/FLINK-1602
Project: Flink
Issue Type: Task
Henry Saputra created FLINK-1603:
Summary: Update how to contribute doc to include information to
send Github PR instead of attaching patch
Key: FLINK-1603
URL: https://issues.apache.org/jira/browse/FLINK-1603
Just to be clear, I was not advocating that Flink simplify the code
just for the sake of clarity :)
Flink has a lot to offer by providing simple APIs that hide the complexity needed to
achieve performance, which I think is one of the key differentiators compared
to other general distributed processing platfo
The JobManager seems to run fine. I don't know. When I tried to run
start-local.sh again, it shows the PID of the running JobManager and also :8081
runs fine. I want to contribute to the project and I could get a little boost
if I could see the capabilities of Flink. :)
Will it be OK to use 0.