One moment, let me take a look at your feedback.


Sent from my iPhone


------------------ Original ------------------
From: Zhongpu Chen <chenlov...@gmail.com>
Date: Fri, Feb 24, 2023 7:47 PM
To: user <user@flink.apache.org>
Subject: Re: RE: Re: Re: Should we always mark ValueState as "transient" for RichFunctions



Hi Shammon,

Sorry for the inaccurate description in my last reply. Let me restate my
question:

Fact 1: we know that the ValueState here should not be
serialized/deserialized, so it is good practice to mark it as
"transient".

Fact 2: on the other hand, if we don't mark it as "transient", it will
be initialized to null, and this null value will be
serialized/deserialized. I think this will incur some overhead if the
number of partitions is very large.

Given the two facts above, the program produces correct results in both
cases. My question is: is there any performance benchmark from real
(large) applications that compares the two cases?

Feel free to point out if I've misunderstood.
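
To make the two facts above concrete, here is a minimal sketch that uses only plain java.io serialization, not Flink itself. `Handle`, `PlainFunction`, and `MarkedFunction` are hypothetical stand-ins I made up for illustration (`Handle` plays the role of a non-serializable runtime handle such as ValueState). It assumes the function object is shipped with standard Java serialization before `open()` runs, which is my understanding of the deployment step but is not measured here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class TransientDemo {

    /** Hypothetical stand-in for a runtime handle such as ValueState; deliberately not Serializable. */
    public static class Handle {}

    /** Function whose handle field is NOT marked transient. */
    public static class PlainFunction implements Serializable {
        private static final long serialVersionUID = 1L;
        Handle state; // stays null until open() is called at runtime

        public void open() { state = new Handle(); }
    }

    /** Same function, but with the handle field marked transient. */
    public static class MarkedFunction implements Serializable {
        private static final long serialVersionUID = 1L;
        transient Handle state; // skipped entirely by Java serialization

        public void open() { state = new Handle(); }
    }

    /** Serializes an object with plain Java serialization and returns the bytes. */
    public static byte[] serialize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns false if serialization fails with NotSerializableException. */
    public static boolean serializes(Object o) {
        try {
            serialize(o);
            return true;
        } catch (UncheckedIOException e) {
            if (e.getCause() instanceof NotSerializableException) {
                return false;
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        PlainFunction plain = new PlainFunction();
        MarkedFunction marked = new MarkedFunction();

        // Before open(): both serialize, because the non-transient field is still null.
        System.out.println("plain bytes (null field):  " + serialize(plain).length);
        System.out.println("marked bytes (transient):  " + serialize(marked).length);

        // After open(): only the version with the transient field still serializes.
        plain.open();
        marked.open();
        System.out.println("plain serializes after open():  " + serializes(plain));
        System.out.println("marked serializes after open(): " + serializes(marked));
    }
}
```

In this sketch the non-transient null field only adds a field descriptor and a null marker to the serialized function object, so the cost is paid once per serialized function instance. Whether that adds up to anything noticeable in a large job is exactly what a real benchmark would have to show.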

On 2023/02/24 11:01:51 Shammon FY wrote:
> Hi
>
> Sorry that I don't quite understand your question. I think the above
> functions will only be deserialized when the job is submitted, do you want
> to test the impact of this on submission throughput?
>
> Best,
> Shammon
>
>
> On Fri, Feb 24, 2023 at 3:04 PM Zhongpu Chen wrote:
>
> > Hi Gen,
> >
> > Thanks for your explanation.
> >
> > Back to this code snippet, since they are not marked with "transient"
> > now, I suppose Flink will use avro to serialize them (null values). Is
> > there any benchmark that compares the performance of null-value
> > serialization with "transient"? I mean, it is indeed not good to write
> > them without "transient", but it works. So is there any performance
> > loss here?
> >
> >
> > On 2023/02/24 06:47:21 Gen Luo wrote:
> > > Hi,
> > >
> > > ValueState is a handle rather than an actual value. So it should never be
> > > serialized. In fact, ValueState itself is not a Serializable. It should be
> > > ok to always mark it as transient.
> > >
> > > In this case, I suppose it works because the ValueState is not set (which
> > > happens during the runtime) when the function is serialized (while
> > > deploying). But it's not good.
> > >
> > > On Fri, Feb 24, 2023 at 10:29 AM Zhongpu Chen wrote:
> > >
> > > > Hi,
> > > >
> > > > When I was reading the code from flink-training-repo [1], I noticed the
> > > > following code:
> > > >
> > > > ```java
> > > >
> > > > public static class EnrichmentFunction
> > > >         extends RichCoFlatMapFunction {
> > > >
> > > >     private ValueState rideState;
> > > >     private ValueState fareState;
> > > >     ...
> > > > }
> > > >
> > > > ```
> > > >
> > > > From my understanding, since the ValueState variables here are scoped to
> > > > each instance, they should not be serialized, for the sake of
> > > > performance. Thus, we should always mark them as "transient". A similar
> > > > discussion can be found here [2].
> > > >
> > > > Should we always mark ValueState as "transient", and why? Please help me
> > > > figure it out.
> > > >
> > > > [1] https://github.com/apache/flink-training/blob/master/rides-and-fares/src/solution/java/org/apache/flink/training/solutions/ridesandfares/RidesAndFaresSolution.java
> > > >
> > > > [2] https://stackoverflow.com/questions/72556202/flink-managed-state-as-transient
