RE: Guarantees for object reuse modes and documentation

2016-03-03 Thread Ken Krugler
ng is done, which prevents a problem with output tuple re-use. Is this the same model used by Flink? Thanks for clarifying, -- Ken > From: Gábor Gévay > Sent: February 20, 2016 4:04:09am PST > To: dev@flink.apache.org > Subject: Re: Guarantees for object reuse modes and documentation

Re: Guarantees for object reuse modes and documentation

2016-02-25 Thread Fabian Hueske
Gabor and Greg gave some good comments on the proposal. If there is no more feedback, I'll go ahead and open a PR to update the documentation tomorrow. Thanks, Fabian 2016-02-24 12:24 GMT+01:00 Fabian Hueske: > Regarding the scope of the object-reuse setting, I agree with Greg. > It would be v

Re: Guarantees for object reuse modes and documentation

2016-02-24 Thread Fabian Hueske
Regarding the scope of the object-reuse setting, I agree with Greg. It would be very nice if we could specify the object-reuse mode for each user function. Greg, do you want to open a JIRA for that so that we can continue the discussion there? 2016-02-24 12:07 GMT+01:00 Fabian Hueske: > Hi ev

Re: Guarantees for object reuse modes and documentation

2016-02-24 Thread Fabian Hueske
Hi everybody, thanks for your input. I sketched a proposal for updated object-reuse semantics and documentation, based on Gabor's proposal (1), Greg's input, and the changed semantics that I discussed earlier in this thread. --> https://docs.google.com/document/d/1jpPr2UuWlqq1iIDIo_1kmPL9QjA-sXA

Re: Guarantees for object reuse modes and documentation

2016-02-20 Thread Gábor Gévay
Thanks, Ken! I was wondering how other systems handle these issues. Fortunately, the deep copy vs. shallow copy problem doesn't arise in Flink: when we copy an object, it is always a deep copy (at least, I hope so :)). Best, Gábor 2016-02-19 22:29 GMT+01:00 Ken Krugler: > Not sure how useful th

RE: Guarantees for object reuse modes and documentation

2016-02-19 Thread Ken Krugler
Not sure how useful this is, but we'd run into similar issues with Cascading over the years. This wasn't an issue for input data, as Cascading "locks" the Tuple such that attempts to modify it will fail. And in general Hadoop always re-uses the data container being passed to operations, so you

Re: Guarantees for object reuse modes and documentation

2016-02-18 Thread Greg Hogan
Hi Fabian, I would only add to your citations Stephan's comment [1] concerning the design, implementation, and use of object reuse. I see two separate concerns addressed in code. First, as Stephan noted, for certain classes deserialization is sufficiently expensive relative to object creation and

Re: Guarantees for object reuse modes and documentation

2016-02-18 Thread Till Rohrmann
Judging from our chaining condition

    ds.getPushChainDriverClass() != null &&
    !(pred instanceof NAryUnionPlanNode) && // first op after union is stand-alone, because union is merged
    !(pred instanceof BulkPartialSolutionPlanNode) && // partial solution merges anyways
    !(pred instanceof WorksetPl

Re: Guarantees for object reuse modes and documentation

2016-02-18 Thread Fabian Hueske
Thanks, Matthias. Maybe I should clarify that I do not want to change the guarantees for the enableObjectReuse mode, only those for the disableObjectReuse mode. The rules for the enableObjectReuse mode should remain the same. 2016-02-18 9:37 GMT+01:00 Matthias J. Sax: > Hi, > > I like Fabian's proposa

Re: Guarantees for object reuse modes and documentation

2016-02-18 Thread Matthias J. Sax
Hi, I like Fabian's proposal. The point of object reuse is the performance gain, and we should not sacrifice it. Even more important is that the rules are easy to understand! -Matthias On 02/17/2016 06:17 PM, Fabian Hueske wrote: > Hi, > > > > Flink's DataSet API features a configuration parame