Aljoscha Krettek created FLINK-2203:
---
Summary: Add Support for Null-Values in RowSerializer
Key: FLINK-2203
URL: https://issues.apache.org/jira/browse/FLINK-2203
Project: Flink
Issue Type:
In StreamingJobGraphGenerator.connect(headOfChain, edge) it checks the
strategy. If it is FORWARD it only does a POINTWISE connection to the
low-level downstream vertex. I know, this is all very unclear... :D
On Thu, 11 Jun 2015 at 00:13 Matthias J. Sax
wrote:
> I am not sure about this... You a
I am not sure about this... You are right about the super constructor,
however, selectChannels(...) does not call super.getStrategy(), which is
the only way to get back the value set in the super class (i.e.,
StreamPartitioner.strategy).
selectChannels() computes the return value independently from
th
Thanks :) works like a charm.
2015-06-10 22:28 GMT+02:00 Fabian Hueske :
> Hi,
>
> use ./bin/flink run -c your.MainClass yourJar to specify the Main class.
> Check the documentation of the CLI client for details.
>
> Cheers, Fabian
> On Jun 10, 2015 22:24, "Felix Neutatz" wrote:
>
> > Hi,
> >
>
Hi,
use ./bin/flink run -c your.MainClass yourJar to specify the Main class.
Check the documentation of the CLI client for details.
Cheers, Fabian
On Jun 10, 2015 22:24, "Felix Neutatz" wrote:
> Hi,
>
> I try to run this Scala program:
>
> https://github.com/FelixNeutatz/wikiTrends/blob/master/
Hi Felix,
You need to also specify the path to your main class.
For example: ./flink run -p 10 -v --class
org.apache.flink.graph.example.MusicProfiles followed by the jar and its arguments.
-p is the desired degree of parallelism and --class needs to be followed by
the path I was talking about.
Best of luck :)
A
Hi,
I try to run this Scala program:
https://github.com/FelixNeutatz/wikiTrends/blob/master/extraction/src/main/scala/io/sanfran/wikiTrends/extraction/flink/DownloadTopKPages.scala
on a cluster.
I tried this command:
/share/flink/flink-0.9-SNAPSHOT/bin/flink run
/home/neutatz/jars/extraction-1.0
It doesn't evaluate the member forward, but it calls the super constructor
with a partitioning strategy that depends on the forward parameter.
That's how it works.
On Wed, 10 Jun 2015 at 18:51 Márton Balassi
wrote:
> Thanks for spotting the documentation issues. I'm fixing them quickly for
>
Thanks for spotting the documentation issues. I'm fixing them quickly for
the release then.
The RebalancePartitioner indeed is obfuscated, let me see what can be done
there.
On Wed, Jun 10, 2015 at 6:34 PM, Matthias J. Sax <
mj...@informatik.hu-berlin.de> wrote:
> Thanks!
>
>
> About shuffle() vs
Thanks!
About shuffle() vs rebalance(): I would suggest explaining the
difference (random vs. round-robin) in the JavaDoc of DataStream.
Furthermore, I was wondering if the JavaDoc for @return is correct for
"forward()", "rebalance()", and "global()". They all state
> "@return The DataStream wit
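The random-vs-round-robin distinction discussed above can be sketched without any Flink code; the class and method names below are illustrative stand-ins, not Flink's actual partitioner API:

```java
import java.util.Random;

// Illustrates the channel-selection difference: shuffle() picks a random
// output channel per record, rebalance() cycles through channels
// round-robin. All names here are illustrative, not Flink's API.
public class PartitionDemo {
    interface ChannelSelector { int select(int numChannels); }

    // shuffle(): uniformly random channel per record
    static class ShuffleSelector implements ChannelSelector {
        private final Random rnd = new Random();
        public int select(int numChannels) {
            return rnd.nextInt(numChannels);
        }
    }

    // rebalance(): strict round-robin over the channels
    static class RebalanceSelector implements ChannelSelector {
        private int next = -1;
        public int select(int numChannels) {
            next = (next + 1) % numChannels;
            return next;
        }
    }

    public static void main(String[] args) {
        ChannelSelector rr = new RebalanceSelector();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 6; i++) {
            sb.append(rr.select(3)); // round-robin over 3 channels
        }
        System.out.println(sb); // 0,1,2 repeating

        ChannelSelector rand = new ShuffleSelector();
        int c = rand.select(3);
        System.out.println(c >= 0 && c < 3); // random, but always in range
    }
}
```

Round-robin gives an even per-channel record count; random only approaches evenness in expectation.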
Yes, that needs to be fixed IMO
2015-06-10 17:51 GMT+02:00 Till Rohrmann :
> Yes since it is clearly a deadlock in the scheduler, the current version
> shouldn't be released.
>
> On Wed, Jun 10, 2015 at 5:48 PM Ufuk Celebi wrote:
>
> >
> > On 10 Jun 2015, at 16:18, Maximilian Michels wrote:
> >
Yes since it is clearly a deadlock in the scheduler, the current version
shouldn't be released.
On Wed, Jun 10, 2015 at 5:48 PM Ufuk Celebi wrote:
>
> On 10 Jun 2015, at 16:18, Maximilian Michels wrote:
>
> > I'm debugging the TaskManagerFailsWithSlotSharingITCase. I've located its
> > cause bu
On 10 Jun 2015, at 16:18, Maximilian Michels wrote:
> I'm debugging the TaskManagerFailsWithSlotSharingITCase. I've located its
> cause but still need to find out how to fix it.
Very good find, Max!
Max, Till, and I have looked into this and it is a reproducible deadlock in the
scheduler duri
Theodore Vasiloudis created FLINK-2202:
--
Summary: Calling distinct() requires tuple input
Key: FLINK-2202
URL: https://issues.apache.org/jira/browse/FLINK-2202
Project: Flink
Issue Type:
Márton Balassi created FLINK-2201:
-
Summary: Inconsistent use of ClosureCleaner in streaming window
helpers
Key: FLINK-2201
URL: https://issues.apache.org/jira/browse/FLINK-2201
Project: Flink
I see four options to solve this without adding the dependency:
1. Move CaseClassTypeInfo and CaseClassComparator to flink-core. Till
said that we want to avoid mixed Scala and Java modules, which rules
this out.
2. Create a new top-level Maven project scala-core, and move things there.
3. Hacky sol
If they can be easily converted to Java code, that would be a good solution.
On Wed, 10 Jun 2015 at 15:56 Gábor Gévay wrote:
> > "it does not feel right to add an API package to a core package
>
> Yes, that makes sense. I just tried removing the flink-java dependency
> from flink-streaming to se
I'm debugging the TaskManagerFailsWithSlotSharingITCase. I've located its
cause but still need to find out how to fix it.
On Wed, Jun 10, 2015 at 2:25 PM, Aljoscha Krettek
wrote:
> I added a section at the top of the release testing document to keep
> track of commits that we might want to cherr
> "it does not feel right to add an API package to a core package
Yes, that makes sense. I just tried removing the flink-java dependency
from flink-streaming to see what needs it, and it builds fine without
it :)
What do you think about the second option? (to move the Scala
typeutils (or just Cas
As for the current streaming dependencies: flink-java is not needed at all,
flink-clients is only needed in specific cases. Thanks for spotting it, I
am cleaning it.
As for Gabor's problem: maybe we should move the CaseClassTypeInfo to
flink-core then given Till's comment. Any opinions on that?
O
Btw: I noticed that all streaming modules depend on flink-core,
flink-runtime, flink-clients and flink-java. Is there a particular reason
why the streaming connectors depend on flink-clients and flink-java?
On Wed, Jun 10, 2015 at 3:41 PM Till Rohrmann wrote:
> I see the reason why you want to a
Philipp Götze created FLINK-2200:
Summary: Flink API with Scala 2.11 - Maven Repository
Key: FLINK-2200
URL: https://issues.apache.org/jira/browse/FLINK-2200
Project: Flink
Issue Type: Wish
I see the reason why you want to add flink-scala as a dependency to
flink-streaming-core. However, it does not feel right to add an API package
to a core package IMHO.
But I noticed that flink-streaming-core also depends on flink-java. Which
seems odd to me as well. I'm not a streaming expert and
Done, I will merge it after travis passes.
Maximilian Michels wrote (at 15:25 on Wed, Jun 10, 2015):
> Let's mark the method of the environment as deprecated like Aljoscha
> suggested. Then I think we could merge it.
>
> On Wed, Jun 10, 2015 at 2:50 PM, Gyula Fóra wrote:
>
> > Then I s
Let's mark the method of the environment as deprecated like Aljoscha
suggested. Then I think we could merge it.
On Wed, Jun 10, 2015 at 2:50 PM, Gyula Fóra wrote:
> Then I suggest we leave it in the environment along with the other
> checkpointing methods.
>
> I updated my PR so it includes hint
Then I suggest we leave it in the environment along with the other
checkpointing methods.
I updated my PR so it includes hints how to force enable checkpoints (and
the reduced guarantees) when an error is thrown for iterative jobs.
On Wed, Jun 10, 2015 at 2:46 PM, Aljoscha Krettek
wrote:
> We c
We could add a method on the ExecutionConfig but mark it as deprecated
and explain that it will go away once the interplay of iterations,
state and so on is properly figured out.
On Wed, Jun 10, 2015 at 2:36 PM, Ufuk Celebi wrote:
> On 10 Jun 2015, at 14:29, Gyula Fóra wrote:
>
>> Max suggested
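The "deprecated method plus explanation" idea being discussed could look roughly like the sketch below; the class and method names are my own stand-ins, not the actual Flink ExecutionConfig API:

```java
// Sketch of a config flag exposed via a deprecated method, as proposed
// in the thread. Names are illustrative, not Flink's real API.
public class ConfigDemo {
    static class ExecutionConfig {
        private boolean forceCheckpointing;

        /**
         * Forces checkpointing even for iterative jobs, with reduced
         * guarantees (in-flight loop records may be lost on failure).
         *
         * @deprecated Will be removed once the interplay of iterations,
         *             state and so on is properly figured out.
         */
        @Deprecated
        public ExecutionConfig setForceCheckpointing(boolean force) {
            this.forceCheckpointing = force;
            return this;
        }

        public boolean isForceCheckpointing() {
            return forceCheckpointing;
        }
    }

    public static void main(String[] args) {
        ExecutionConfig cfg = new ExecutionConfig().setForceCheckpointing(true);
        System.out.println(cfg.isForceCheckpointing());
    }
}
```

Marking the method @Deprecated keeps the escape hatch available while signaling in the JavaDoc that it will go away.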
Hey,
As the storm-compatibility-core build goes fine this is a dependency issue
with storm-compatibility-examples. As a first try replace:
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-streaming-core</artifactId>
        <version>${project.version}</version>
        <scope>test</scope>
        <classifier>tests</classifier>
    </dependency>
with
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-streaming-core</artifactId>
        <version>${project.version}</version>
        <scope>test</scope>
    </dependency>
Travis caches Maven dependencies and sometimes fails to update them.
Try to clear your Travis cache via "Settings" (top right) -> "Caches".
Cheers, Fabian
2015-06-10 14:22 GMT+02:00 Matthias J. Sax :
> Hi,
>
> the current PR of storm compatibility layer builds successfully on my
> laptop (mvn clean
This seems like a version mismatch. For example,
DataStream.distribute() was changed to DataStream.rebalance()
recently. Maybe your build is getting some outdated jars from the Travis
cache.
On Wed, Jun 10, 2015 at 2:22 PM, Matthias J. Sax
wrote:
> Hi,
>
> the current PR of storm compatibility layer
On 10 Jun 2015, at 14:29, Gyula Fóra wrote:
> Max suggested that I add this feature slightly hidden to the execution
> config instance.
>
> The problem then is that I either make a public field in the config or once
> again add a method.
>
> Any ideas?
I thought about this as well. The only wa
Max suggested that I add this feature slightly hidden to the execution
config instance.
The problem then is that I either make a public field in the config or once
again add a method.
Any ideas?
Aljoscha Krettek wrote (at 14:07 on Wed, Jun 10, 2015):
> Thanks :D, now I see. It makes s
Hi,
the current PR of storm compatibility layer builds successfully on my
laptop (mvn clean install). However, on travis I get strange error
messages in the IT-Cases:
https://travis-ci.org/mjsax/flink/builds/66137928
For example:
> Caused by: java.lang.AbstractMethodError:
> org.apache.flink.st
I added a section at the top of the release testing document to keep
track of commits that we might want to cherry-pick to the release. I
included the YARNSessionFIFOITCase fix and the optional stream
iteration partitioning (both already on release branch).
On Wed, Jun 10, 2015 at 12:51 PM, Fabian
Hello,
I would like to ask if it would be OK if I added flink-scala as a
dependency to flink-streaming-core. An alternative would be to move
the Scala typeutils to flink-core (to where the Java typeutils are).
Why I need this:
While I am implementing the fast median calculation for windows as
par
Thanks :D, now I see. It makes sense because we don't have another way
of keeping the cluster state synced/distributed across parallel
instances of the operators.
On Wed, Jun 10, 2015 at 12:52 PM, Gyula Fóra wrote:
> Here is an example for you:
>
> Parallel streaming kmeans, the state we keep is
Fabian Hueske created FLINK-2199:
Summary: Build with skipped tests fails on Windows
Key: FLINK-2199
URL: https://issues.apache.org/jira/browse/FLINK-2199
Project: Flink
Issue Type: Bug
Fabian Hueske created FLINK-2198:
Summary: Build fails on Windows
Key: FLINK-2198
URL: https://issues.apache.org/jira/browse/FLINK-2198
Project: Flink
Issue Type: Bug
Components: Bu
Here is an example for you:
Parallel streaming kmeans, the state we keep is the current cluster
centers, and we use iterations to sync the centers across parallel
instances.
We can afford lost model updates in the loop, but we need to checkpoint
the models.
https://github.com/gyfora/stream-cluste
@Sachin: I reproduced the build error on my Windows machine.
2015-06-10 12:22 GMT+02:00 Maximilian Michels :
> @Sachin: This looks like a file permission issue. We should have someone
> else verify that on a Windows system.
>
> On Wed, Jun 10, 2015 at 11:28 AM, Sachin Goel
> wrote:
>
> > I have
You are right, to have consistent results we would need to persist the
records.
But since we cannot do that right now, we can still checkpoint all operator
states and understand that inflight records in the loop are lost on failure.
This is acceptable for most of the use-cases that we have developed
The elements that are in-flight in an iteration are also state of the
job. I'm wondering whether the state inside iterations still makes
sense without these in-flight elements. But I also don't know the King
use-case, that's why I thought an example could be helpful.
On Wed, Jun 10, 2015 at 12:37 P
I don't understand the question, I vote for checkpointing all state in the
job, even inside iterations (it's more of a loop).
Aljoscha Krettek wrote (at 12:34 on Wed, Jun 10, 2015):
> I don't understand why having the state inside an iteration but not
> the elements that correspond to th
I don't understand why having the state inside an iteration but not
the elements that correspond to this state or created this state is
desirable. Maybe an example could help understand this better?
On Wed, Jun 10, 2015 at 11:27 AM, Gyula Fóra wrote:
> The other tests verify that the checkpointin
I agree with Theo. I think it’s a nice feature to have as part of the
standard API because only few users will be aware of something like
DataSetUtils. However, as a first version we can make it part of
DataSetUtils.
Cheers,
Till
On Wed, Jun 10, 2015 at 11:52 AM Theodore Vasiloudis <
theodoros.
@Sachin: This looks like a file permission issue. We should have someone
else verify that on a Windows system.
On Wed, Jun 10, 2015 at 11:28 AM, Sachin Goel
wrote:
> I have run "mvn clean verify" five times now and every time I'm getting
> these failed tests:
>
> BlobUtilsTest.before:45 null
>
Regarding the iteration partitioning feature, since I use it of course I
find it very useful, but it is true that it needs to be tested more
extensively and also be discussed by the community before it is added in a
release.
Moreover, given the fact that I can still use it for research purposes (I
Till Rohrmann created FLINK-2197:
Summary: Scala API is not working when using batch and streaming
API in the same program
Key: FLINK-2197
URL: https://issues.apache.org/jira/browse/FLINK-2197
Project
+1 for Fabian, but I would very much like to see this as part of the API in
the future.
This function would be very useful for FlinkML as well, as we noted in a
recent discussion on the mailing list regarding time series datasets.
On Wed, Jun 10, 2015 at 10:56 AM, Fabian Hueske wrote:
> As Andr
I have run "mvn clean verify" five times now and every time I'm getting
these failed tests:
BlobUtilsTest.before:45 null
BlobUtilsTest.before:45 null
BlobServerDeleteTest.testDeleteFails:291 null
BlobLibraryCacheManagerTest.testRegisterAndDownload:196 Could not
remove write permissions from
The other tests verify that the checkpointing algorithm runs properly. That
also ensures that it runs for iterations because a loop is just an extra
source and sink in the jobgraph (so it is the same for the algorithm).
Fabian Hueske wrote (at 11:19 on Wed, Jun 10, 2015):
> Without goin
Without going into the details, how well tested is this feature? The PR
only extends one test by a few lines.
Is that really enough to ensure that
1) the change does not cause trouble
2) it works as expected?
If this feature should go into the release, it must be thoroughly checked
and we must t
The KMeans quickstart example does not work with the current state of
the KMeansDataGenerator. I created PR that brings the two in sync.
This should probably go into the release since it affects initial user
"satisfaction".
On Wed, Jun 10, 2015 at 11:14 AM, Márton Balassi
wrote:
> As for the stre
As for the streaming commit cherry-picked to the release branch:
This is an unfortunate communication issue, let us make sure that we
clearly communicate similar issues in the future.
As for FLINK-2192: This is essentially a duplicate issue of the testability
of the streaming iteration. Not a bloc
Hey Gyula, Max,
On 10 Jun 2015, at 10:54, Gyula Fóra wrote:
> This feature needs to be included in the release, it has been tested and
> used extensively. And many applications depend on it.
It would be nice to announce/discuss this before just cherry-picking it into
the release branch. The is
And also I would like to remind everyone that any fault tolerance we
provide is only as good as the fault tolerance of the master node. Which is
non existent at the moment.
So I don't see a reason why a user should not be able to choose whether he
wants state checkpoints for iterations as well.
I
Adding one more thing to the list:
The code contains a misplaced class (mea culpa) in flink-java,
org.apache.flink.api.java.SortPartitionOperator which is API facing and
should be moved to the operators package. If we do that after the release,
it will break binary compatibility. I created FLINK-2
Fabian Hueske created FLINK-2196:
Summary: Missplaced Class in flink-java SortPartitionOperator
Key: FLINK-2196
URL: https://issues.apache.org/jira/browse/FLINK-2196
Project: Flink
Issue Type
I'm not against including the feature but I'd like to discuss it first. I
believe that only very carefully selected commits should be added to
release-0.9. If that feature happens to be tested extensively and is very
important for user satisfaction then we might include it.
On Wed, Jun 10, 2015 at
As Andra said, I would not add it to the API at this point.
However, I don't think it should go into a separate Maven module
(flink-contrib) that needs to be added as dependency but rather into some
DataSetUtils class in flink-java.
We can easily add it to the API later, if necessary. We should
I agree with Gyula regarding the iteration partitioning.
I have also been using this feature for developing machine learning
algorithms. And I think SAMOA also needs this feature.
Faye
2015-06-10 10:54 GMT+02:00 Gyula Fóra :
> This feature needs to be included in the release, it has been tested
This feature needs to be included in the release, it has been tested and
used extensively. And many applications depend on it.
Maximilian Michels wrote (at 10:47 on Wed, Jun 10, 2015):
> With all the issues discovered, it looks like we'll have another release
> candidate. Right now, we
With all the issues discovered, it looks like we'll have another release
candidate. Right now, we have discovered the following problems:
1. YARN ITCase fails [fixed via 2eb5cfe]
2. No Jar for SessionWindowing example [fixed in #809]
3. Wrong description of the input format for the graph examples (eg
Hey everyone,
We needed to assign unique labels as vertex values in Gelly at some point.
We got a nice suggestion on how to do that in parallel (Implemented in
https://github.com/apache/flink/pull/801#issuecomment-110654447).
Now the question is where should these two functions go? Should they be
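For context, the coordination-free labeling trick referenced in the PR comment boils down to giving each parallel task a disjoint arithmetic sequence of ids; the sketch below is my own illustration of that scheme, not the code from the linked PR:

```java
// Each of p parallel tasks assigns ids from the sequence
// taskIndex, taskIndex + p, taskIndex + 2p, ... with no coordination:
// the sequences are disjoint, so every id is globally unique.
public class UniqueIdDemo {
    static long idFor(int taskIndex, int parallelism, long localCount) {
        return taskIndex + localCount * (long) parallelism;
    }

    public static void main(String[] args) {
        // Parallelism 2: task 0 emits 0, 2, 4, ... and task 1 emits 1, 3, 5, ...
        for (int task = 0; task < 2; task++) {
            StringBuilder sb = new StringBuilder();
            for (long k = 0; k < 3; k++) {
                sb.append(idFor(task, 2, k)).append(' ');
            }
            System.out.println("task " + task + ": " + sb.toString().trim());
        }
    }
}
```

Each task only needs its own index, the total parallelism, and a local counter, which is why the assignment parallelizes trivially.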
Ufuk Celebi created FLINK-2195:
--
Summary: Set Configuration for Configurable Hadoop InputFormats
Key: FLINK-2195
URL: https://issues.apache.org/jira/browse/FLINK-2195
Project: Flink
Issue Type:
To continue Gyula's point, for consistent snapshots we need to persist the
records in transit within the loop and also slightly change the current
protocol since it works only for DAGs. Before going into that direction though
I would propose we first see whether there is a nice way to make ite
I disagree. Not having checkpointed operators inside the iteration still
breaks the guarantees.
It is not about the states, it is about the loop itself.
On Wed, Jun 10, 2015 at 10:12 AM Aljoscha Krettek
wrote:
> This is the answer I gave on the PR (we should have one place for
> discussing this,
This is the answer I gave on the PR (we should have one place for
discussing this, though):
I would be against merging this in the current form. What I propose is
to analyse the topology to verify that there are no checkpointed
operators inside iterations. Operators before and after iterations can
I agree that for the sake of the above mentioned use cases it is reasonable
to add this to the release with the right documentation, for machine
learning, potentially losing one round of feedback data should not matter.
Let us not block prominent users until the next release on this.
On Wed, Jun
Till Rohrmann created FLINK-2194:
Summary: Type extractor does not support Writable type
Key: FLINK-2194
URL: https://issues.apache.org/jira/browse/FLINK-2194
Project: Flink
Issue Type: Bug
Sebastian Kruse created FLINK-2193:
--
Summary: Partial shuffling
Key: FLINK-2193
URL: https://issues.apache.org/jira/browse/FLINK-2193
Project: Flink
Issue Type: Improvement
Repor