Hi all,

We would like to revive the discussion and provide a quick update on
the recent work of the FLIP. We have implemented a POC[1], run cases
in the flink-benchmarks[2] against the POC, and verified that many of
the operators in the benchmark will enable object-reuse without code
changes, while the global object-reuse is disabled.

Please let me know if you have any further comments on the FLIP. If
there are no more comments, we will open the voting in 3 days.

Best regards,
Xuannan

[1] https://github.com/apache/flink/pull/22897
[2] https://github.com/apache/flink-benchmarks


On Fri, Jul 7, 2023 at 9:18 AM Dong Lin <lindon...@gmail.com> wrote:
>
> Hi Jing,
>
> Thank you for the suggestion. Yes, we can extend it to support null if in
> the future we find any use-case for this flexibility.
>
> Best,
> Dong
>
> On Thu, Jul 6, 2023 at 7:55 PM Jing Ge <j...@ververica.com> wrote:
>
> > Hi Dong,
> >
> > one scenario I could imagine is that users could enable global object
> > reuse features but force deep copy for some user defined specific functions
> > because of any limitations. But that is only my gut feeling. And agree, we
> > could keep the solution simple for now as FLIP described and upgrade to 3VL
> > once there are such real requirements that are rising.
> >
> > Best regards,
> > Jing
> >
> > On Thu, Jul 6, 2023 at 12:30 PM Dong Lin <lindon...@gmail.com> wrote:
> >
> >> Hi Jing,
> >>
> >> Thank you for the detailed explanation. Please see my reply inline.
> >>
> >> On Thu, Jul 6, 2023 at 3:17 AM Jing Ge <j...@ververica.com> wrote:
> >>
> >>> Hi Xuannan, Hi Dong,
> >>>
> >>> Thanks for your clarification.
> >>>
> >>> @Xuannan
> >>>
> >>> A Jira ticket has been created for the doc update:
> >>> https://issues.apache.org/jira/browse/FLINK-32546
> >>>
> >>> @Dong
> >>>
> >>> I don't have a concrete example. I just thought about it from a
> >>> conceptual or pattern's perspective. Since we have 1. coarse-grained 
> >>> global
> >>> switch(CGS as abbreviation), i.e. the pipeline.object-reuse and 2.
> >>> fine-grained local switch(FGS as abbreviation), i.e. the
> >>> objectReuseCompliant variable for specific operators/functions, there will
> >>> be the following patterns with appropriate combinations:
> >>>
> >>> pattern 1: coarse-grained switch only. Local object reuse will be
> >>> controlled by the coarse-grained switch:
> >>> 1.1 cgs == true -> local object reused enabled
> >>> 1.2 cgs == true  -> local object reused enabled
> >>> 1.3 cgs == false -> local object reused disabled, i.e. deep copy enabled
> >>> 1.4 cgs == false -> local object reused disabled, i.e. deep copy enabled
> >>>
> >>> afaiu, this is the starting point. I wrote 4 on purpose to make the
> >>> regression check easier. We can consider it as the combinations with
> >>> cgs(true/false) and fgs(true/false) while fgs is ignored.
> >>>
> >>> Now we introduce fine-grained switch. There will be two patterns:
> >>>
> >>> pattern 2: fine-grained switch over coarse-grained switch.
> >>> Coarse-grained switch will be ignored when the local fine-grained switch
> >>> has different value:
> >>> 2.1 cgs == true and fgs == true -> local object reused enabled
> >>> 2.2 cgs == true and fgs == false -> local object reused disabled, i.e.
> >>> deep copy enabled
> >>> 2.3 cgs == false and fgs == true -> local object reused enabled
> >>> 2.4 cgs == false and fgs == false -> local object reused disabled, i.e.
> >>> deep copy enabled
> >>>
> >>> cgs is actually ignored.
> >>>
> >>> Current FLIP is using a slightly different pattern:
> >>>
> >>> pattern 3: fine-grained switch over coarse-grained switch only when
> >>> coarse-grained switch is off, i..e cgs OR fgs:
> >>> 3.1 cgs == true and fgs == true -> local object reused enabled
> >>> 3.2 cgs == true and fgs == false -> local object reused enabled
> >>> 3.3 cgs == false and fgs == true -> local object reused enabled
> >>> 3.4 cgs == false and fgs == false -> local object reused disabled, i.e.
> >>> deep copy enabled
> >>>
> >>> All of those patterns are rational and each has different focus. It
> >>> depends on the real requirement to choose one of them.
> >>>
> >>> As we can see, if fgs is using 2VL, there is a regression between
> >>> pattern 1 and pattern 2. You are absolutely right in this case. That's why
> >>> I suggested 3VL, i.e. fgs will have triple values: true, false,
> >>> unknown(e.g. null)
> >>>
> >>> pattern 4: 3VL fgs with the null as init value (again, there are just
> >>> two combination, I made it 4 on purpose):
> >>> 4.1 cgs == true and fgs == null -> local object reused enabled
> >>> 4.2 cgs == true and fgs == null -> local object reused enabled
> >>> 4.3 cgs == false and fgs == null -> local object reused disabled, i.e.
> >>> deep copy enabled
> >>> 4.4 cgs == false and fgs == null -> local object reused disabled, i.e.
> >>> deep copy enabled
> >>>
> >>> Since the default value of fgs is null, pattern 4 is backward compatible
> >>> with pattern 1, which means no regression.
> >>>
> >>> Now we will set value to fgs and follow the pattern 2:
> >>> 4.5 cgs == true and fgs == true -> local object reused enabled
> >>> 4.6 cgs == true and fgs == false -> local object reused disabled, i.e.
> >>> deep copy enabled
> >>> 4.7 cgs == false and fgs == true -> local object reused enabled
> >>> 4.8 cgs == false and fgs == false -> local object reused disabled, i.e.
> >>> deep copy enabled
> >>>
> >>> Pattern 4 contains pattern 3 with the following combinations(force
> >>> enabling local object reuse):
> >>> 4.5 cgs == true and fgs == true -> local object reused enabled
> >>> 4.2 cgs == true and fgs == null -> local object reused enabled
> >>> 4.7 cgs == false and fgs == true -> local object reused enabled
> >>> 4.4 cgs == false and fgs == null -> local object reused disabled, i.e.
> >>> deep copy enabled
> >>>
> >>> Comparing pattern 4 to pattern 3, user will have one additional
> >>> flexibility to control(force disabling) the local object reuse capability
> >>> because of 3VL, i.e. 4.2+4.6 vs. 3.2.
> >>>
> >>
> >> I think you are suggesting to allow the user setting fgs to null so that
> >> "user will have one additional flexibility to control(force disabling) the
> >> local object reuse capability".
> >>
> >> In general, an API that only allows false/true is a bit more complex to
> >> implement than an API that allows false/true/null. All things being equal,
> >> I believe it is preferred to have a simpler public API.
> >>
> >> I understand you are coming from a conceptual perspective and trying to
> >> make it similar to hierarchical RBAC. However, after thinking through this,
> >> I still could not find a use-case where we actually need this flexibility.
> >> In particular, for cases where a user has explicitly configured
> >> pipeline.object-reuse to true, that means the user already knows (or takes
> >> the responsibility of ensuring) that correctness can be achieved, why would
> >> the user want to explicitly disable the object-reuse for an operator?
> >>
> >> Regards,
> >> Dong
> >>
> >>
> >>>
> >>> It is commonly used in the hierarchical RBAC to enable more fine-grained
> >>> access control of sub role.
> >>>
> >>> I hope I have been able to explain myself clearly. Looking forward to
> >>> your feedback.
> >>>
> >>> Best regards,
> >>> Jing
> >>>
> >>>
> >>>
> >>> On Wed, Jul 5, 2023 at 12:47 PM Dong Lin <lindon...@gmail.com> wrote:
> >>>
> >>>> Hi Jing,
> >>>>
> >>>> Thanks for the comments! Please find below my comments, which are based
> >>>> on the offline discussion with Xuannan.
> >>>>
> >>>> On Wed, Jul 5, 2023 at 1:36 AM Jing Ge <j...@ververica.com> wrote:
> >>>>
> >>>>> Hi Xuannan, Hi Dong
> >>>>>
> >>>>> Thanks for the Proposal! After reading the FLIP, I'd like to ask some
> >>>>> questions:
> >>>>>
> >>>>> 1. Naming convention for boolean variables. It is recommended to
> >>>>> follow JavaBean [1], i.e. objectReuseCompliant as the variable name with
> >>>>> isObjectReuseCompliant() and setObjectReuseCompliant() as the methods' 
> >>>>> name.
> >>>>>
> >>>>>
> >>>> Good point. We have updated the FLIP as suggested.
> >>>>
> >>>>
> >>>>>
> >>>>> 2.
> >>>>>
> >>>>>    -
> >>>>>
> >>>>>    *If pipeline.object-reuse is set to true, records emitted by this
> >>>>>    operator will be re-used.*
> >>>>>    -
> >>>>>
> >>>>>    *Otherwise, if getIsObjectReuseCompliant() returns true, records
> >>>>>    emitted by this operator will be re-used.*
> >>>>>    -
> >>>>>
> >>>>>    *Otherwise, records emitted by this operator will be deep-copied
> >>>>>    before being given to the next operator in the chain.*
> >>>>>
> >>>>>
> >>>>> If I understand you correctly,  the hard coding objectReusedCompliant
> >>>>> should have higher priority over the configuration, the checking logic
> >>>>> should be:
> >>>>>
> >>>>>    -
> >>>>>
> >>>>>    *If getIsObjectReuseCompliant() returns true, records emitted by
> >>>>>    this operator will be re-used.*
> >>>>>    -
> >>>>>
> >>>>>    *Otherwise, if pipeline.object-reuse is set to true, records
> >>>>>    emitted by this operator will be re-used.*
> >>>>>    -
> >>>>>
> >>>>>    *Otherwise, records emitted by this operator will be deep-copied
> >>>>>    before being given to the next operator in the chain.*
> >>>>>
> >>>>>
> >>>>> The results are the same but the checking logics are different, but
> >>>>> there are some additional thoughts, which lead us to the next question.
> >>>>>
> >>>>
> >>>>>
> >>>>
> >>>>> 3. Current design lets specific operators enable object reuse and
> >>>>> ignore the global config. There could be another thought, on the 
> >>>>> contrary:
> >>>>> if an operator has hard coded the objectReuseCompliant as false, i.e.
> >>>>> disable object reuse on purpose, records should not be reused even if 
> >>>>> the
> >>>>> global config pipeline.object-reused is set to true, which turns out 
> >>>>> that
> >>>>> the objectReuseCompliant could be a triple value logic: ture: force 
> >>>>> object
> >>>>> reusing; false: force deep-copying; unknown: depends on
> >>>>> pipeline.object-reuse config.
> >>>>>
> >>>>
> >>>> With the current proposal, if pipeline.object-reused == true and
> >>>> operatorA's objectReuseCompliant == false, Flink will not deep-copy
> >>>> operatorA's output. I think you are suggesting changing the behavior such
> >>>> that Flink should deep-copy the operatorA's output.
> >>>>
> >>>> Could you explain what is the advantage of this approach compared to
> >>>> the approach described in the FLIP?
> >>>>
> >>>> My concern with this approach is that it can cause performance
> >>>> regression. This is an operator's objectReuseCompliant will be false by
> >>>> default unless it is explicitly overridden. For those jobs which are
> >>>> currently configured with pipeline.object-reused = true, these jobs will
> >>>> likely start to have lower performance (due to object deep-copy) after
> >>>> upgrading to the newer Flink version.
> >>>>
> >>>> Best,
> >>>> Dong
> >>>>
> >>>>
> >>>>>
> >>>>> Best regards,
> >>>>> Jing
> >>>>>
> >>>>>
> >>>>> [1] https://en.wikipedia.org/wiki/JavaBeans
> >>>>>
> >>>>> On Mon, Jul 3, 2023 at 4:25 AM Xuannan Su <suxuanna...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>>> Hi all,
> >>>>>>
> >>>>>> Dong(cc'ed) and I are opening this thread to discuss our proposal to
> >>>>>> add operator attribute to allow operator to specify support for
> >>>>>> object-reuse [1].
> >>>>>>
> >>>>>> Currently, the default configuration for pipeline.object-reuse is set
> >>>>>> to false to avoid data corruption, which can result in suboptimal
> >>>>>> performance. We propose adding APIs that operators can utilize to
> >>>>>> inform the Flink runtime whether it is safe to reuse the emitted
> >>>>>> records. This enhancement would enable Flink to maximize its
> >>>>>> performance using the default configuration.
> >>>>>>
> >>>>>> Please refer to the FLIP document for more details about the proposed
> >>>>>> design and implementation. We welcome any feedback and opinions on
> >>>>>> this proposal.
> >>>>>>
> >>>>>> Best regards,
> >>>>>>
> >>>>>> Dong and Xuannan
> >>>>>>
> >>>>>> [1]
> >>>>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=255073749
> >>>>>>
> >>>>>

Reply via email to