Re: Tuple

2015-08-02 Thread Stephan Ewen
The idea of the dedicated project was to make the tuples usable in other programs, that may interact with Flink, but won't want the full dependencies. I share the concern about too many small projects... On Mon, Aug 3, 2015 at 1:01 AM, Matthias J. Sax < mj...@informatik.hu-berlin.de> wrote: > Th

Re: Tuple

2015-08-02 Thread Matthias J. Sax
Thanks for the advice about Tuple0. I personally don't see any advantage in having a "flink-tuple" project. Am I missing anything about it? Furthermore, I am not sure if it is a good idea to have too many small projects. On 08/03/2015 12:48 AM, Stephan Ewen wrote: > Tuple0 would need special ser

Re: Tuple

2015-08-02 Thread Stephan Ewen
Tuple0 would need special serialization and comparator logic. If that is given, I see no reason not to support it. There is, BTW, a request to create a dedicated "flink-tuple" project that only contains the tuple classes. Any opinions on that? On Mon, Aug 3, 2015 at 12:45 AM, Matthias J. Sax <
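To make the point about "special serialization and comparator logic" concrete, here is a minimal, self-contained Java sketch. It is not Flink's actual Tuple0, serializer, or comparator API: the class name Tuple0Sketch, the methods serialize/deserialize, and the choice to write a single marker byte per record are all assumptions made for illustration. The thread does not say how Flink would encode a zero-field tuple.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Comparator;

/** Hypothetical sketch only; not the Flink Tuple0 or its serializer. */
public class Tuple0Sketch {

    /** A tuple with zero fields; every instance is equal to every other. */
    public static final class Tuple0 {
        public static final Tuple0 INSTANCE = new Tuple0();

        private Tuple0() {}

        public int getArity() { return 0; }

        @Override public boolean equals(Object o) { return o instanceof Tuple0; }
        @Override public int hashCode() { return 0; }
        @Override public String toString() { return "()"; }
    }

    // Assumption: with no fields to write, one marker byte is emitted so that
    // each record still occupies space in the stream and records can be counted.
    public static void serialize(Tuple0 value, DataOutputStream out) throws IOException {
        out.writeByte(0);
    }

    // Consumes the marker byte and returns the shared instance.
    public static Tuple0 deserialize(DataInputStream in) throws IOException {
        in.readByte();
        return Tuple0.INSTANCE;
    }

    // With no fields to compare, all values are considered equal.
    public static final Comparator<Tuple0> COMPARATOR = (a, b) -> 0;

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            serialize(Tuple0.INSTANCE, out);
            serialize(Tuple0.INSTANCE, out);
        }
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        Tuple0 first = deserialize(in);
        Tuple0 second = deserialize(in);
        // prints "() () compare=0"
        System.out.println(first + " " + second + " compare=" + COMPARATOR.compare(first, second));
    }
}
```

The sketch shows why a zero-field tuple is a special case: the usual per-field loops in a serializer or comparator have nothing to iterate over, so the record boundary and the ordering must be defined explicitly.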

Re: Tuple

2015-08-02 Thread Matthias J. Sax
Thanks for the explanation! As I mentioned before, Tuple0 might also be helpful for streaming. And I guess I will need it for the Storm compatibility layer, too. (I need to double check, but Storm supports zero-attribute tuples, too). With regard to the information I collected during the discussion,

New failing Kafka Test

2015-08-02 Thread Matthias J. Sax
Hi, I hit a failing test KafkaITCase.bigRecordTestTopology https://travis-ci.org/mjsax/flink/jobs/73807115 I know that Kafka tests have been unstable all the time. However, I want to report it, to keep the problem on the agenda. -Matthias

Re: Tuple

2015-08-02 Thread Chesnay Schepler
First of all, it was a really good idea to start a discussion about this. So the general idea behind Tuple0 was this: The Python API maps Python tuples to Flink tuples. Python can have empty tuples, so I thought "well duh, let's make a Tuple0 class!". What I did not want to do is create some non

Re: Question About "Preserve Partitioning" in Stream Iteration

2015-08-02 Thread Stephan Ewen
This model strikes me as pretty complicated. Imagine the extra logic and code path necessary for proper checkpointing as well. Why not take a simple approach: - There is one parallel head, one parallel tail, both with the same parallelism - Any computation in between may have its own parallelism

Re: Question About "Preserve Partitioning" in Stream Iteration

2015-08-02 Thread Aljoscha Krettek
To answer the question plain and simple: No, there are several different parallel heads and tails. For example in this:
    val iter = ds.iteration()
    val head_tail1 = iter.map().parallelism(2)
    val head_tail2 = iter.map().parallelism(4)
    iter.closeWith(head_tail1.union(head_tail2))
We have one head/t

[jira] [Created] (FLINK-2464) BufferSpillerTest sometimes fails

2015-08-02 Thread Gyula Fora (JIRA)
Gyula Fora created FLINK-2464: - Summary: BufferSpillerTest sometimes fails Key: FLINK-2464 URL: https://issues.apache.org/jira/browse/FLINK-2464 Project: Flink Issue Type: Bug Component

Re: Tuple

2015-08-02 Thread Matthias J. Sax
Can you elaborate on how and why Python used Tuple0? If it cannot be serialized similarly to regular Tuples, what is the usage in Python? Right now, it seems there is no special serialization code for Tuple0. I just want to understand the topic in detail. -Matthias On 08/01/2015 03:38 PM, Stephan

Re: Question About "Preserve Partitioning" in Stream Iteration

2015-08-02 Thread Gyula Fóra
In a streaming program when we create an IterativeDataStream, we practically mark the union point of some later feedback stream (the one passed in to closeWith(..)). The operators applied on this IterativeDataStream will receive the feedback input as well. We call the operators applied on the iter

Re: Question About "Preserve Partitioning" in Stream Iteration

2015-08-02 Thread Stephan Ewen
I don't get the discussion here, can you help me with what you mean by "different iteration heads and tails"? Doesn't an iteration have one parallel head and one parallel tail? On Fri, Jul 31, 2015 at 6:52 PM, Gyula Fóra wrote: > Maybe you can reuse some of the logic that is currently there o

[jira] [Created] (FLINK-2463) Show execution config in dashboard

2015-08-02 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2463: --- Summary: Show execution config in dashboard Key: FLINK-2463 URL: https://issues.apache.org/jira/browse/FLINK-2463 Project: Flink Issue Type: Sub-task

Re: Question about DataStream class hierarchy

2015-08-02 Thread Stephan Ewen
I agree with Gyula here. Getting the API right is too important to "quick fix" it. On Fri, Jul 31, 2015 at 10:06 PM, Gyula Fóra wrote: > Hi Matthias, > > I think Aljoscha is preparing a nice PR that completely reworks the > DataStream classes and the information they actually contain. I don't t

[jira] [Created] (FLINK-2462) Wrong exception reporting in streaming jobs

2015-08-02 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2462: --- Summary: Wrong exception reporting in streaming jobs Key: FLINK-2462 URL: https://issues.apache.org/jira/browse/FLINK-2462 Project: Flink Issue Type: Bug

[jira] [Created] (FLINK-2461) RemoteExecutorHostnameResolutionTest and ClientHostnameResolutionTest

2015-08-02 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2461: --- Summary: RemoteExecutorHostnameResolutionTest and ClientHostnameResolutionTest Key: FLINK-2461 URL: https://issues.apache.org/jira/browse/FLINK-2461 Project: Flink

On integrating Flink with HBase

2015-08-02 Thread Slim Baltagi
Hi, I came across this https://issues.apache.org/jira/browse/HBASE-13992 on an hbase-spark module. This might be inspiring for integrating Flink with HBase. Maybe an hbase-flink module for use cases such as the ones referred to in the documents listed in the above Jira description. What do you thi