If the community can agree that the proposal Gábor Horváth has suggested is
a nice approach and can accept that the results will be coming around
mid-summer, then I would strongly suggest "reserving" this topic for him.
His previous experience makes him a strong candidate for the task. To add
to
Very nice proposal!
On Wed, Mar 9, 2016 at 6:35 PM, Stephan Ewen wrote:
> Thanks for posting this.
>
> I think it is not super urgent (in the sense of weeks or a few months), so
> results around mid-summer are probably fine.
> The background in LLVM is a very good basis for this!
>
> On Wed, Mar 9, 2
Thanks for posting this.
I think it is not super urgent (in the sense of weeks or a few months), so
results around mid-summer are probably fine.
The background in LLVM is a very good basis for this!
On Wed, Mar 9, 2016 at 3:56 PM, Gábor Horváth wrote:
> Hi,
>
> In the meantime I sent out the curren
Hi,
In the meantime I sent out the current version of the proposal draft [1].
Hopefully it will help you triage this task and contribute to the
discussion of the problem.
How urgent is this issue? In what time frame should there be results?
Best Regards,
Gábor
[1]
http://apache-flink-mailing-lis
Do we have consensus that we want to "reserve" this topic for a GSoC
student?
It is becoming a feature of growing importance. To see whether we can "hold
off" on working on it, it would be good to know a bit more, like:
- when is it decided whether this project takes place?
- when would results be
@Fabian: That is my bad, but I think we should still be on time. I pinged Uli
just to make sure. The proposal from Gábor and the JIRA issue from me are coming soon.
On Tue, Mar 8, 2016 at 11:43 AM, Fabian Hueske wrote:
> Hi Gabor,
>
> I did not find any Flink proposals for this year's GSoC in JIRA (should be
>
Hi Gabor,
I did not find any Flink proposals for this year's GSoC in JIRA (should be
labeled with gsoc2016).
I am also not sure whether any of the Flink committers signed up as a GSoC
mentor.
Maybe there is still time to do that, but as it looks right now, there are
no GSoC projects offered by Flink.
Best,
Hi!
I am planning to do GSoC and I would like to work on the serializers. More
specifically, I would like to implement code generation. I am planning to
send the first draft of the proposal to the mailing list early next week.
If everything goes well, that will include some preliminary benchmar
Ah, very good, that makes sense!
I would guess that this performance difference could probably be seen at
various points where generic serializers and comparators are used (also for
Comparable, Writable) or
where the TupleSerializer delegates to a sequence of other TypeSerializers.
I guess creati
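To make the delegation pattern concrete, here is a deliberately simplified
sketch of a generic tuple serializer that just loops over per-field
serializers. It is not Flink's actual TupleSerializer (the interface below is
invented for the illustration); the point is that every field goes through a
virtual call, which is exactly what generated, type-specialized serializers
could avoid.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// A deliberately simplified field serializer interface (not Flink's
// TypeSerializer), used only to illustrate the delegation pattern.
interface FieldSerializer<T> {
    void write(T value, DataOutput out) throws IOException;
    T read(DataInput in) throws IOException;
}

// Generic tuple serialization: a loop over an array of field serializers.
// Every write/read is a virtual call through the interface, and the JIT
// cannot specialize the loop for one concrete tuple shape. Code generation
// (the GSoC proposal) would instead emit straight-line code per tuple type.
final class GenericTupleSerializer {
    private final FieldSerializer<Object>[] fieldSerializers;

    @SuppressWarnings("unchecked")
    GenericTupleSerializer(FieldSerializer<?>... fieldSerializers) {
        this.fieldSerializers = (FieldSerializer<Object>[]) fieldSerializers;
    }

    void write(Object[] fields, DataOutput out) throws IOException {
        for (int i = 0; i < fieldSerializers.length; i++) {
            fieldSerializers[i].write(fields[i], out); // virtual dispatch per field
        }
    }

    Object[] read(DataInput in) throws IOException {
        Object[] fields = new Object[fieldSerializers.length];
        for (int i = 0; i < fieldSerializers.length; i++) {
            fields[i] = fieldSerializers[i].read(in);  // virtual dispatch per field
        }
        return fields;
    }
}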
The issue is not with the Tuple hierarchy (running Gelly examples had no
effect on runtime, and as you note there aren't any subclass overrides) but
with CopyableValue. I had been using IntValue exclusively but had switched
to using LongValue for graph generation. CopyableValueComparator and
Copyab
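As a generic illustration of the JIT effect described here (plain Java, not
Flink code): once more than one implementation of an interface flows through
the same hot call site, HotSpot's devirtualization and inlining become less
effective. The class names below are hypothetical stand-ins for
IntValue/LongValue-like types.

// Hypothetical stand-ins for two value types sharing a common interface.
interface ValueLike {
    long asLong();
}

final class IntLike implements ValueLike {
    private final int v;
    IntLike(int v) { this.v = v; }
    @Override public long asLong() { return v; }
}

final class LongLike implements ValueLike {
    private final long v;
    LongLike(long v) { this.v = v; }
    @Override public long asLong() { return v; }
}

public class CallSiteDemo {
    // One hot call site: while only IntLike is ever observed here, the JIT
    // can devirtualize and inline asLong(); once LongLike flows through as
    // well, the site becomes polymorphic and inlining is less aggressive.
    static long sum(ValueLike[] values) {
        long s = 0;
        for (ValueLike v : values) {
            s += v.asLong();
        }
        return s;
    }

    public static void main(String[] args) {
        ValueLike[] mixed = new ValueLike[1_000_000];
        for (int i = 0; i < mixed.length; i++) {
            mixed[i] = (i % 2 == 0) ? new IntLike(i) : new LongLike(i);
        }
        // Run with -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining to see
        // the inlining decisions for the sum() call site.
        System.out.println(sum(mixed));
    }
}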
Hi Greg!
Sounds very interesting.
Do you have a hunch which "virtual" Tuple methods are being used that become
less JIT-able? In many cases, tuples use only field accesses (like
"value.f1") in the user functions.
I have to dig into the serializers to see if they could suffer from that.
The "getF
I set the parallelism of the map to 4 (and I double-checked that the 4
mappers are running on different machines). Furthermore, the fromElements()
source has a parallelism of 1. Thus, some data is going over the network for sure.
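For reference, a minimal sketch of a test job along these lines (a
hypothetical reconstruction, since the enclosed code is not visible in the
archive; the exact classes and calls are assumptions about the DataSet API of
that time):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple0;

public class Tuple0NetworkTest {

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // fromElements() source runs with parallelism 1 ...
        DataSet<Tuple0> source = env.fromElements(new Tuple0(), new Tuple0(), new Tuple0());

        // ... while the map runs with parallelism 4, so records have to be
        // serialized and shipped over the network between the two operators.
        DataSet<Tuple0> forwarded = source
                .map(new MapFunction<Tuple0, Tuple0>() {
                    @Override
                    public Tuple0 map(Tuple0 value) {
                        return value;
                    }
                })
                .setParallelism(4);

        forwarded.print();
    }
}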
On 08/04/2015 02:31 PM, Chesnay Schepler wrote:
> I think this job would be chai
I think this job would be chained completely and never do any serialization.
On 04.08.2015 14:25, Matthias J. Sax wrote:
Works for batch job, too. See enclosed.
On 08/04/2015 01:34 PM, Matthias J. Sax wrote:
Yes, that is what the program does. However, streaming is not lazy, so
deserialization s
Works for batch job, too. See enclosed.
On 08/04/2015 01:34 PM, Matthias J. Sax wrote:
> Yes, that is what the program does. However, streaming is not lazy, so
> deserialization should have happened.
>
> I will try a batch job later today.
>
> On 08/04/2015 01:27 PM, Chesnay Schepler wrote:
>> so
Yes, that is what the program does. However, streaming is not lazy, so
deserialization should have happened.
I will try a batch job later today.
On 08/04/2015 01:27 PM, Chesnay Schepler wrote:
> so I'm not too much into the streaming API, but as I see it this program
> creates an infinite number of
I think in the streaming case it works because every serializer ends up
being wrapped in a StreamRecordSerializer. When the
StreamRecordSerializer serializes/deserializes stuff, it should be OK that
the Tuple0 doesn't actually serialize/deserialize anything.
On Tue, 4 Aug 2015 at 13:27 Chesnay S
so I'm not too much into the streaming API, but as I see it this program
creates an infinite number of tuples and then counts them, right?
The problem with serialization, as I understand it, is that the receiver
can't tell how many Tuple0s are sent, since you never actually read any
data when dese
Hi,
I just opened a PR for this. https://github.com/apache/flink/pull/983
However, I was not able to "reproduce" the serialization issues... I tested
Tuple0 (see enclosed code) in a cluster, and the program worked. Am I
missing anything?
-Matthias
On 08/03/2015 01:01 AM, Matthias J. Sax wrote:
> Tha
The idea of the dedicated project was to make the tuples usable in other
programs that may interact with Flink but don't want the full
dependencies.
I share the concern about too many small projects...
On Mon, Aug 3, 2015 at 1:01 AM, Matthias J. Sax <
mj...@informatik.hu-berlin.de> wrote:
> Th
Thanks for the advice about Tuple0.
I personally don't see any advantage in having a "flink-tuple" project. Am
I missing anything about it? Furthermore, I am not sure it is a good
idea to have too many small projects.
On 08/03/2015 12:48 AM, Stephan Ewen wrote:
> Tuple0 would need special ser
Tuple0 would need special serialization and comparator logic. If that is
given, I see no reason not to support it.
There is, BTW, a request to create a dedicated "flink-tuple" project that
only contains the tuple classes. Any opinions on that?
On Mon, Aug 3, 2015 at 12:45 AM, Matthias J. Sax <
Thanks for the explanation!
As I mentioned before, Tuple0 might also be helpful for streaming. And I
guess I will need it for the Storm compatibility layer, too. (I need to
double-check, but Storm supports zero-attribute tuples as well.)
With regard to the information I collected during the discussion,
First of all, it was a really good idea to start a discussion about this.
So the general idea behind Tuple0 was this:
The Python API maps Python tuples to Flink tuples. Python can have empty
tuples, so I thought "well duh, let's make a Tuple0 class!". What I did
not want to do is create some non
Can you elaborate on how and why the Python API uses Tuple0? If it cannot
be serialized like regular Tuples, how is it used in Python? Right now it
seems that there is no special serialization code for Tuple0.
I just want to understand the topic in detail.
-Matthias
On 08/01/2015 03:38 PM, Stephan
I think a Tuple0 cannot be implemented like the current tuples, at least
with respect to runtime serialization.
The system makes the assumption that it makes progress in consuming bytes
when deserializing values. If a Tuple0 never consumes data from the byte
stream, this assumption is broken. It w
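A minimal sketch of the kind of special-casing that would preserve this
progress assumption: a Tuple0 serializer that always writes and reads a
single marker byte. This is only an illustration of the idea, not Flink's
actual implementation (a real serializer would extend TypeSerializer<Tuple0>
and implement its full contract).

import java.io.IOException;

import org.apache.flink.api.java.tuple.Tuple0;
import org.apache.flink.core.memory.DataInputView;
import org.apache.flink.core.memory.DataOutputView;

// Even though Tuple0 carries no data, serializing one marker byte per record
// guarantees that deserialization always consumes at least one byte, so the
// runtime keeps making progress through the byte stream.
public class Tuple0SerializerSketch {

    public void serialize(Tuple0 record, DataOutputView target) throws IOException {
        target.writeByte(42); // arbitrary marker; the value is never interpreted
    }

    public Tuple0 deserialize(DataInputView source) throws IOException {
        source.readByte();    // consume exactly the one marker byte
        return new Tuple0();
    }

    public int getLength() {
        return 1;             // fixed length: one byte per record
    }
}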
I just double-checked. Scala does not have a Tuple0 type. IMHO, it would
be best to remove Tuple0 for consistency. Having the Tuple types is for
consistency with Scala in the first place, right? Please give
feedback.
-Matthias
On 08/01/2015 01:04 PM, Matthias J. Sax wrote:
> I see.
>
> I think
yes, if it is present in the core Flink files it must work just like any
other tuple in Flink.
Removing it is not an option, though; moving it is. The Python API uses it
(that's the reason Tuple0 was added in the first place).
On 01.08.2015 13:04, Matthias J. Sax wrote:
I see.
I think that it might be
I see.
I think that it might be useful to have Tuple0, because in rare cases
you only want to "notify" downstream operators (talking about
streaming) that something happened but there is no actual data to be
processed. Furthermore, if Flink cannot deal with Tuple0 it should be
removed completely
also, I'm not sure if I ever sent a Tuple0 through a program; it could
be that the system freaks out.
On 31.07.2015 22:40, Chesnay Schepler wrote:
there's no specific reason. It was added fairly recently by me (in
mid-April), and you're most likely the second person to use it.
I didn't integr
there's no specific reason. It was added fairly recently by me (in
mid-April), and you're most likely the second person to use it.
I didn't integrate it into all our tuple-related stuff because, well, I
never thought anyone would actually need it, so I saved myself the trouble.
Hi,
is there an
It would be an interesting addition.
Such a method cannot be made fully type-safe in Java, but that might be
okay, since it is user-code internal.
On Wed, May 27, 2015 at 11:52 AM, Flavio Pompermaier
wrote:
> Sorry, to be effective the projection should also take the target tuple
> itself as input
Sorry, to be effective the projection should also take the target tuple
itself as input :)
Tuple3 reuse = tuple.project(reuse, 0, 2, 5)?
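A rough sketch of what such a (not fully type-safe) projection helper could
look like, using the generic Tuple.getField/setField accessors. The method
name and signature simply mirror the proposal above; this is not an existing
Flink API.

import org.apache.flink.api.java.tuple.Tuple;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.api.java.tuple.Tuple6;

public final class TupleProjection {

    // Copies the fields at the given positions of 'source' into 'reuse'.
    // Not type-safe: the caller must ensure the types at the selected
    // positions match the target tuple's field types.
    public static <T extends Tuple> T project(Tuple source, T reuse, int... fields) {
        for (int i = 0; i < fields.length; i++) {
            reuse.setField(source.getField(fields[i]), i);
        }
        return reuse;
    }

    public static void main(String[] args) {
        Tuple6<Integer, String, Long, Double, String, Integer> tuple =
                new Tuple6<>(1, "a", 2L, 3.0, "b", 4);
        Tuple3<Integer, Long, Integer> reuse = new Tuple3<>();

        // The call proposed in the thread: project fields 0, 2 and 5.
        reuse = project(tuple, reuse, 0, 2, 5);
        System.out.println(reuse); // prints (1,2,4)
    }
}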
On Wed, May 27, 2015 at 11:51 AM, Flavio Pompermaier
wrote:
> Hi flinkers,
>
> it happens very often to me that I have to output a reuse tuple that
> basically is