Hi Matthias,
Of course, here is the package that contains the example's source classes.
https://github.com/mbalassi/flink/tree/storm-backup/flink-staging/flink-streaming/flink-storm-examples/src/main/java/org/apache/flink/stormcompatibility/singlejoin
It is mostly a copy-paste of SimpleJoin from s
Aljoscha Krettek created FLINK-2103:
---
Summary: Expose partitionBy to the user in Stream API
Key: FLINK-2103
URL: https://issues.apache.org/jira/browse/FLINK-2103
Project: Flink
Issue Type:
Hi!
If you want to have type hierarchies (like base tuples and different
classes), you cannot use tuples (they are expected to be 'exact schema'),
but you can use other classes. Create your own tuple POJO with subclasses,
and it should work.
Stephan
On Thu, May 28, 2015 at 1:30 AM, Amit Pawar
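A minimal sketch of the "own tuple POJO with subclasses" suggestion above, in plain Java. The class names (BaseRecord, NamedRecord) and the describeAll helper are hypothetical and only for illustration. No Flink API is used, so this shows only the shape of the type hierarchy, not how Flink analyzes or serializes it.

```java
import java.util.ArrayList;
import java.util.List;

public class PojoHierarchySketch {

    // Hypothetical base type: public fields plus a public no-argument
    // constructor, following the usual POJO conventions.
    public static class BaseRecord {
        public int id;

        public BaseRecord() {}

        public BaseRecord(int id) {
            this.id = id;
        }

        public String describe() {
            return "base:" + id;
        }
    }

    // Hypothetical subclass that adds a field, which a fixed-schema tuple
    // could not express.
    public static class NamedRecord extends BaseRecord {
        public String name;

        public NamedRecord() {}

        public NamedRecord(int id, String name) {
            super(id);
            this.name = name;
        }

        @Override
        public String describe() {
            return "named:" + id + ":" + name;
        }
    }

    // Processes a mixed collection through the base type.
    public static List<String> describeAll(List<BaseRecord> records) {
        List<String> out = new ArrayList<>();
        for (BaseRecord r : records) {
            out.add(r.describe());
        }
        return out;
    }

    public static void main(String[] args) {
        List<BaseRecord> records = new ArrayList<>();
        records.add(new BaseRecord(1));
        records.add(new NamedRecord(2, "a"));
        System.out.println(describeAll(records));
    }
}
```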
Thanks Stephan :)
Will try the same.
Thanks and Regards
Amit Pawar
On Thu, May 28, 2015 at 10:41 AM, Stephan Ewen wrote:
> Hi!
>
> If you want to have type hierarchies (like base tuples and different
> classes), you cannot use tuples (they are expected to be 'exact schema'),
> but you can use
Till Rohrmann created FLINK-2104:
Summary: Fallback implicit values for PredictOperation and
TransformOperation don't work if Nothing is inferred as the output type
Key: FLINK-2104
URL: https://issues.apache.org/j
Hi everyone,
I am a bit worried about the recent change to the print() method. I can
understand the rationale that obtaining the stdout from all the taskmanagers is
cumbersome (although for local debugging, the old print() was fine).
However, a major problem I see with the new print() is th
Hi Sebastian,
thank you for the feedback. I agree that both variants have a right to
exist.
I would vote for adding another method to the DataSet called "printLocal()"
that has the old behavior.
On Thu, May 28, 2015 at 1:01 PM, Kruse, Sebastian
wrote:
> Hi everyone,
>
> I am a bit worried abou
Fabian Hueske created FLINK-2105:
Summary: Implement Sort-Merge Outer Join algorithm
Key: FLINK-2105
URL: https://issues.apache.org/jira/browse/FLINK-2105
Project: Flink
Issue Type: Sub-task
+1 for both.
printLocal() might not be the best name, because "local" is not well
defined and could also be understood as the local machine of the user.
How about naming the method something completely different (writeToWorkerStdOut()?)
to make sure users do not confuse eager and lazy execution?
20
Fabian Hueske created FLINK-2106:
Summary: Add outer joins to API, Optimizer, and Runtime
Key: FLINK-2106
URL: https://issues.apache.org/jira/browse/FLINK-2106
Project: Flink
Issue Type: Sub-
Okay, you are right, local is actually confusing.
I'm against introducing "worker" as a term in the API. It's still called
"TaskManager". Maybe "printOnTaskManager()" ?
On Thu, May 28, 2015 at 2:06 PM, Fabian Hueske wrote:
> +1 for both.
>
> printLocal() might not be the best name, because "local
I would avoid calling it printXYZ, since print()'s behavior changed to
eager execution.
2015-05-28 14:10 GMT+02:00 Robert Metzger :
> Okay, you are right, local is actually confusing.
> I'm against introducing "worker" as a term in the API. Its still called
> "TaskManager". Maybe "printOnTaskMana
Actually, there is a method "print(String prefix)" which still goes to the
sysout of where the job is executed.
Let's give that one the name "printOnTaskManager()" and then we should have
it...
On Thu, May 28, 2015 at 2:13 PM, Fabian Hueske wrote:
> I would avoid to call it printXYZ, since prin
As I said, the common print prefix might indicate eager execution.
I know that writeToTaskManagerStdOut() is quite bulky, but we should make
the difference in the behavior very clear, IMO.
2015-05-28 14:29 GMT+02:00 Stephan Ewen :
> Actually, there is a method "print(String prefix)" which still
Thanks for your quick responses!
I also think that renaming the old print method should do the trick. As a
contribution to your brainstorming for a name, I propose logOnTaskManager() ;)
Cheers,
Sebastian
-Original Message-
From: Fabian Hueske [mailto:fhue...@gmail.com]
Sent: Donnersta
Hi,
At Ericsson, we are implementing something similar to what the
SessionWindowing example does:
There are events belonging to phone calls (sessions), and every event
has a call_id, which tells us which call it belongs to. At the end of
every call, a large event has to be emitted that contains s
Thanks for debugging this, Gabor; indeed a good catch.
I am not so sure about surfacing it in the API though - it seems very
specific to the session windowing case. I am also wondering whether maybe
this should actually be the default behavior - if there are already empty
windows for a group why n
What tweaks would those be? I mean, what is required to implement L-BFGS?
I guess that we won’t get rid of the case statements because we have to
decide between two code paths: one with and the other without a convergence
criterion. But I think by pulling each branch into its own function, it
becomes cl
Hi Till and Theodore,
I think the code is cleaned up a lot now; introducing
mapWithBcVariable helped a lot.
I also get that the goal was to make the cost function for learning
linear models easily configurable. My main concern was that the solver
itself was already too specifically bound to the ki
Oh wait... I'll continue typing. I accidentally sent out the message too early.
On Thu, May 28, 2015 at 4:03 PM, Mikio Braun wrote:
> Hi Till and Theodore,
>
> I think the code is cleaned up a lot now, introducing the
> mapWithBcVariable helped a lot.
>
> I also get that the goal was to make a cost funct
[Ok, so maybe this is exactly what is implemented, sorry if I'm just
repeating you... ]
So
C(w, xys) = C regularization(w) + sum over xys of losses
Gradient is
C grad reg(w) + sum grad losses(w, xy)
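Written out as a formula, the decomposition above reads as follows. The symbols R (regularization term) and \ell (per-example loss) are names assumed here for illustration. The gradient form also assumes R is differentiable, which does not hold for every regularizer.

```latex
C(w) = R(w) + \sum_{(x, y)} \ell(w; x, y),
\qquad
\nabla_w C(w) = \nabla_w R(w) + \sum_{(x, y)} \nabla_w \ell(w; x, y)
```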
For some regularization functions, regularization is better performed
by some explicit op
+1 for printOnTaskManager()
On Thu, May 28, 2015 at 2:53 PM, Kruse, Sebastian
wrote:
> Thanks, for your quick responses!
>
> I also think that renaming the old print method should do the trick. As a
> contribution to your brainstorming for a name, I propose logOnTaskManager()
> ;)
>
> Cheers,
>
Fabian Hueske created FLINK-2107:
Summary: Implement Hash Outer Join algorithm
Key: FLINK-2107
URL: https://issues.apache.org/jira/browse/FLINK-2107
Project: Flink
Issue Type: Sub-task
Hey Mikio,
yes, you’re right. The SGD only needs to know the gradient of the loss
function and some means to update the weights in accordance with the
regularization scheme. Additionally, we also need to be able to compute the
loss for the convergence criterion.
That’s also how it is implemented in
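The three ingredients named above (gradient of the loss, a regularization-aware weight update, and the loss value for the convergence criterion) can be sketched in plain Java as follows. All names here (Loss, WeightUpdate, SQUARED, l2, optimize) are hypothetical and are not the FlinkML interfaces. The sketch runs on local arrays with a full-batch gradient and an L2 penalty, purely to illustrate the separation of concerns.

```java
public class SgdSketch {

    /** Hypothetical loss: value and derivative with respect to the prediction. */
    public interface Loss {
        double loss(double prediction, double label);
        double derivative(double prediction, double label);
    }

    /** Hypothetical weight update that folds in the regularization scheme. */
    public interface WeightUpdate {
        double[] apply(double[] weights, double[] gradient, double stepSize);
    }

    /** Squared loss 0.5 * (p - y)^2. */
    public static final Loss SQUARED = new Loss() {
        public double loss(double p, double y) { return 0.5 * (p - y) * (p - y); }
        public double derivative(double p, double y) { return p - y; }
    };

    /** Plain gradient step with an L2 penalty of strength lambda. */
    public static WeightUpdate l2(final double lambda) {
        return (w, g, step) -> {
            double[] out = new double[w.length];
            for (int i = 0; i < w.length; i++) {
                out[i] = w[i] - step * (g[i] + lambda * w[i]);
            }
            return out;
        };
    }

    static double dot(double[] a, double[] b) {
        double s = 0.0;
        for (int i = 0; i < a.length; i++) s += a[i] * b[i];
        return s;
    }

    /** Mean loss over the data set; used only for the convergence check. */
    static double meanLoss(double[] w, double[][] xs, double[] ys, Loss loss) {
        double s = 0.0;
        for (int n = 0; n < xs.length; n++) s += loss.loss(dot(w, xs[n]), ys[n]);
        return s / xs.length;
    }

    public static double[] optimize(double[][] xs, double[] ys, Loss loss,
                                    WeightUpdate update, double stepSize,
                                    int maxIterations, double tolerance) {
        double[] w = new double[xs[0].length];
        double previous = meanLoss(w, xs, ys, loss);
        for (int it = 0; it < maxIterations; it++) {
            // Average gradient of the loss over all examples (full batch).
            double[] grad = new double[w.length];
            for (int n = 0; n < xs.length; n++) {
                double d = loss.derivative(dot(w, xs[n]), ys[n]);
                for (int i = 0; i < w.length; i++) grad[i] += d * xs[n][i] / xs.length;
            }
            w = update.apply(w, grad, stepSize);
            // Convergence criterion: stop once the loss barely changes.
            double current = meanLoss(w, xs, ys, loss);
            if (Math.abs(previous - current) < tolerance) break;
            previous = current;
        }
        return w;
    }
}
```

Because the optimizer only sees the two interfaces, a different regularization scheme would, in this sketch, be a different WeightUpdate rather than a change to the loop.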
I agree that avoiding a name that starts with “print” is better.
Regards,
Chiwan Park
> On May 28, 2015, at 11:35 PM, Maximilian Michels wrote:
>
> +1 for printOnTaskManager()
>
> On Thu, May 28, 2015 at 2:53 PM, Kruse, Sebastian
> wrote:
>
>> Thanks, for your quick responses!
>>
>> I also t
GradientDescent is just the (batch-)SGD optimizer, right? Actually
I think the parameter update should be done by a
RegularizationFunction.
IMHO the structure should be like this:
GradientDescent
- collects gradient and regularization updates from - CostFunction
LinearModelCostFunction
- i
Yes, GradientDescent == (batch-)SGD.
That was also my first idea of how to implement it. However, what happens
if the regularization is specific to the algorithm actually used? For
example, for L-BFGS with L1 regularization you have a different
`parameterUpdate` step (Orthant-wise Limited Memory Qu
Ah yeah, I see...
Yes, it's right that many algorithms perform quite differently
depending on the kind of regularization. The same holds for cutting
plane algorithms which either reduce to linear or quadratic programs
depending on L1 or L2. Generally speaking, I think this is also not
surprising
I think so too. Ok, I'll try to update the PR accordingly.
On Thu, May 28, 2015 at 5:36 PM, Mikio Braun
wrote:
> Ah yeah, I see.. .
>
> Yes, it's right that many algorithms perform quite differently
> depending on the kind of regularization... . Same holds for cutting
> plane algorithms which ei
Theodore Vasiloudis created FLINK-2108:
--
Summary: Add score function for Predictors
Key: FLINK-2108
URL: https://issues.apache.org/jira/browse/FLINK-2108
Project: Flink
Issue Type: Impro
+1
This separation was the idea from the start; there is a trade-off between
having highly configurable optimizers and ensuring that the right types of
regularization can only be applied to optimization algorithms that support
them.
It comes down to viewing the optimization framework mostly as a b
Ufuk Celebi created FLINK-2109:
--
Summary: CancelTaskException leads to FAILED task state
Key: FLINK-2109
URL: https://issues.apache.org/jira/browse/FLINK-2109
Project: Flink
Issue Type: Bug
Ufuk Celebi created FLINK-2110:
--
Summary: Early slot release after Execution failure
Key: FLINK-2110
URL: https://issues.apache.org/jira/browse/FLINK-2110
Project: Flink
Issue Type: Bug
Hi,
Indeed a good catch, and a valid issue exactly because of the stateful
nature of the trigger and eviction policies.
I agree with the suggested approach that this should be configurable for
the discretizers (and could be set through the API).
As for the default behaviour, I am not 100% sure. It co
Hi,
I would vote for making the default behaviour to drop all state for
empty groups, and allow a configuration to set the current behaviour
instead. This issue will probably have a paragraph in the
documentation, but if someone overlooks this, then there is potential
for a greater disaster with t
Let's not get all dramatic :D
If we don't call any methods on the empty groups we can still keep them
off-memory in a persistent storage with a lazy checkpoint/state-access
logic with practically 0 memory overhead.
Automatically dropping everything will break a lot of programs without
people noti
Hi All,
I would like to announce the new Apache Flink meetup in the Bay Area:
http://www.meetup.com/Bay-Area-Apache-Flink-Meetup/
We are cooking the first event for the meetup soon and will have
awesome speakers to talk about Apache Flink =)
Please join the Bay Area meetup to get the latest news ab
> Let's not get all dramatic :D
Ok, sorry :D
> If we don't call any methods on the empty groups we can still keep them
> off-memory in a persistent storage with a lazy checkpoint/state-access
> logic with practically 0 memory overhead.
So you mean that whether to call notifyOnLastGlobalElement w