Hi all,

Unless there are some objections or new suggestions, I will open
a voting thread next Monday, thanks~

Best,
Rui

On Wed, Nov 15, 2023 at 4:11 PM Rui Fan <1996fan...@gmail.com> wrote:

> Thanks to Hangxiang and Feng's feedback!
>
> Best,
> Rui
>
> On Wed, Nov 15, 2023 at 3:37 PM Feng Jin <jinfeng1...@gmail.com> wrote:
>
>> Hi, Rui
>>
>> Thanks for driving this. over LGTM.
>>
>> I agree that redirecting stdout to LOG at first as it is not the default
>> behavior .
>> However, if there are other scenarios in the future, we can consider
>> adding
>> additional parameters to redirect stdout to a separate file.
>>
>>
>> Best,
>> Feng
>>
>> On Wed, Nov 15, 2023 at 2:23 PM Hangxiang Yu <master...@gmail.com> wrote:
>>
>> > Hi, Rui.
>> > Overall LGTM now.
>> > Thanks for driving this again!
>> >
>> > On Wed, Nov 15, 2023 at 2:07 PM Rui Fan <1996fan...@gmail.com> wrote:
>> >
>> > > Hi Hangxiang,
>> > >
>> > > Thanks for your feedback!
>> > >
>> > > > I saw some users may debug operator correctness using print() or
>> > > System.out
>> > > > which may bring lots of info.
>> > > > It's fine if files are separate, otherwise these will make
>> > > taskMnanager.log
>> > > > roll quickly.
>> > > > But I think it's fine if we don't use it as default first.
>> > >
>> > > Got it, we will support redirect System.out to LOG, and keep
>> > > original behaviour(DEFAULT enum) as default value.
>> > >
>> > >
>> > > > > Do you mean like this: LOG.info("taskName {} : {}", taskName,
>> > > > > userLogContext)?
>> > > >
>> > > > Yes, That's what I mean.
>> > >
>> > > Sounds make sense, I added a new option to FLIP[1],
>> > > it's taskmanager.system-out.log.thread-name.enabled: false.
>> > >
>> > > Why log the thread name here?
>> > >
>> > > Flink's RichFunction can get subtaskId, but redirecting
>> > > System.out to LOG is public code, so it is difficult to get subtaskId.
>> > > After consideration, a workaround solution is to log thread name,
>> > > because the thread name of the default Task Thread contains the
>> > > task name and subtask id. And this solution can support all threads,
>> > > because some non-task threads may also call println.
>> > >
>> > > [1] https://cwiki.apache.org/confluence/x/4guZE
>> > >
>> > > Best,
>> > > Rui
>> > >
>> > > On Wed, Nov 15, 2023 at 10:26 AM Hangxiang Yu <master...@gmail.com>
>> > wrote:
>> > >
>> > > > Hi, Rui.
>> > > >
>> > > > As I mentioned before: the user didn't really want
>> > > > > to log into taskmanager.out, it just happened by accident.
>> > > > > So, if users change the System.out to LOG.info, it still happen.
>> > > >
>> > > > Thanks for the feedback.
>> > > > I saw some users may debug operator correctness using print() or
>> > > System.out
>> > > > which may bring lots of info.
>> > > > It's fine if files are separate, otherwise these will make
>> > > taskMnanager.log
>> > > > roll quickly.
>> > > > But I think it's fine if we don't use it as default first.
>> > > >
>> > > > Do you mean like this: LOG.info("taskName {} : {}", taskName,
>> > > > > userLogContext)?
>> > > > >
>> > > >
>> > > > Yes, That's what I mean.
>> > > >
>> > > > On Sat, Nov 11, 2023 at 12:54 AM Piotr Nowojski <
>> > > piotr.nowoj...@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > Thanks! :)
>> > > > >
>> > > > > Best, Piotrek
>> > > > >
>> > > > > czw., 9 lis 2023 o 16:15 Rui Fan <1996fan...@gmail.com>
>> napisał(a):
>> > > > >
>> > > > > > Hi Piotr,
>> > > > > >
>> > > > > > Thanks for your feedback!
>> > > > > >
>> > > > > > > Or implement your own loop? It shouldn't be more than a
>> couple of
>> > > > > lines.
>> > > > > >
>> > > > > > Implementing it directly is fine, I have updated the FLIP.
>> > > > > > And this logic can be found in the  `isLineEnded` method.
>> > > > > >
>> > > > > > Best,
>> > > > > > Rui
>> > > > > >
>> > > > > > On Thu, Nov 9, 2023 at 11:00 PM Piotr Nowojski <
>> > > > piotr.nowoj...@gmail.com
>> > > > > >
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Hi Rui,
>> > > > > > >
>> > > > > > > > I see java8 doesn't have `Arrays.equals(int[] a, int
>> > aFromIndex,
>> > > > int
>> > > > > > > > aToIndex, int[] b, int bFromIndex, int bToIndex)`,
>> > > > > > > > and java11 has it. Do you have any other suggestions for
>> java8?
>> > > > > > >
>> > > > > > > Maybe use `ByteBuffer.wrap`?
>> > > > > > >
>> > > > > > > ByteBuffer.wrap(array, ...,
>> ...).equals(ByteBuffer.wrap(array2,
>> > > ...,
>> > > > > > ...))
>> > > > > > >
>> > > > > > > This shouldn't have overheads as far as I remember.
>> > > > > > >
>> > > > > > > Or implement your own loop? It shouldn't be more than a
>> couple of
>> > > > > lines.
>> > > > > > >
>> > > > > > > Best,
>> > > > > > > Piotrek
>> > > > > > >
>> > > > > > > czw., 9 lis 2023 o 06:43 Rui Fan <1996fan...@gmail.com>
>> > > napisał(a):
>> > > > > > >
>> > > > > > > > Hi Piotr, Archit, Feng and Hangxiang:
>> > > > > > > >
>> > > > > > > > Thanks a lot for your feedback!
>> > > > > > > >
>> > > > > > > > Following is my comment, please correct me if I
>> misunderstood
>> > > > > anything!
>> > > > > > > >
>> > > > > > > > To Piotr:
>> > > > > > > >
>> > > > > > > > > Is there a reason why you are suggesting to copy out bytes
>> > from
>> > > > > `buf`
>> > > > > > > to
>> > > > > > > > `bytes`,
>> > > > > > > > > instead of using `Arrays.equals(int[] a, int aFromIndex,
>> int
>> > > > > > aToIndex,
>> > > > > > > > int[] b, int bFromIndex, int bToIndex)`?
>> > > > > > > >
>> > > > > > > > I see java8 doesn't have `Arrays.equals(int[] a, int
>> > aFromIndex,
>> > > > int
>> > > > > > > > aToIndex, int[] b, int bFromIndex, int bToIndex)`,
>> > > > > > > > and java11 has it. Do you have any other suggestions for
>> java8?
>> > > > > > > >
>> > > > > > > > Also, this code doesn't run in production. As the comment of
>> > > > > > > > System.lineSeparator():
>> > > > > > > >
>> > > > > > > > > On UNIX systems, it returns {@code "\n"}; on Microsoft
>> > > > > > > > > Windows systems it returns {@code "\r\n"}.
>> > > > > > > >
>> > > > > > > > So Mac and Linux just return one character, we will compare
>> > > > > > > > one byte directly.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > To Feng:
>> > > > > > > >
>> > > > > > > > > Will they be written to the taskManager.log file by
>> default
>> > > > > > > > > or the taskManager.out file?
>> > > > > > > >
>> > > > > > > > I prefer LOG as the default value for
>> > > taskmanager.system-out.mode.
>> > > > > > > > It's useful for job stability and doesn't introduce
>> significant
>> > > > > impact
>> > > > > > to
>> > > > > > > > users. Also, our production has already used this feature
>> for
>> > > > > > > > more than 1 years, it works well.
>> > > > > > > >
>> > > > > > > > However, I write the DEFAULT as the default value for
>> > > > > > > > taskmanager.system-out.mode, because when the community
>> > > introduces
>> > > > > > > > new options, the default value often selects the original
>> > > behavior.
>> > > > > > > >
>> > > > > > > > Looking forward to hear more thoughts from community about
>> this
>> > > > > > > > default value.
>> > > > > > > >
>> > > > > > > > > If we can make taskmanager.out splittable and rolling,
>> would
>> > it
>> > > > be
>> > > > > > > > > easier for users to use this feature?
>> > > > > > > >
>> > > > > > > > Making taskmanager.out splittable and rolling is a good
>> choice!
>> > > > > > > > I have some concerns about it:
>> > > > > > > >
>> > > > > > > > 1. Users may also want to use LOG.info in their code and
>> just
>> > > > > > > >   accidentally use System.out.println. It is possible that
>> they
>> > > > will
>> > > > > > > >   also find the logs directly in taskmanager.log.
>> > > > > > > > 2. I'm not sure whether the rolling strategy is easy to
>> > > implement.
>> > > > > > > >   If we do it, it's necessary to define a series of flink
>> > options
>> > > > > > similar
>> > > > > > > >   to log options, such as: fileMax(how many files should be
>> > > > > retained),
>> > > > > > > >   fileSize(The max size each file), fileNamePatten (The
>> suffix
>> > of
>> > > > > file
>> > > > > > > > name),
>> > > > > > > > 3. Check the file size periodically: all logs are written by
>> > log
>> > > > > > plugin,
>> > > > > > > >   they can check the log file size after writing. However,
>> > > > System.out
>> > > > > > > >   are written directly. And flink must start a thread to
>> check
>> > > the
>> > > > > > latest
>> > > > > > > >   taskmanager.out size periodically. If it's too quick,
>> most of
>> > > job
>> > > > > > > aren't
>> > > > > > > >   necessary. If it's too slow, the file size cannot be
>> > controlled
>> > > > > > > properly.
>> > > > > > > >
>> > > > > > > > Redirect it to LOG.info may be a reasonable and easy choice.
>> > > > > > > > The user didn't really want to log into taskmanager.out, it
>> > just
>> > > > > > > > happened by accident.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > To Hangxiang:
>> > > > > > > >
>> > > > > > > > > 1. I have a similar concern as Feng. Will we redirect to
>> > > another
>> > > > > log
>> > > > > > > file
>> > > > > > > > > not taskManager.log ?
>> > > > > > > >
>> > > > > > > > Please see my last comment, thanks!
>> > > > > > > >
>> > > > > > > > > taskManager.log contains lots of important information
>> like
>> > > init
>> > > > > log.
>> > > > > > > It
>> > > > > > > > > will be rolled quickly if we redirect out and error here.
>> > > > > > > >
>> > > > > > > > IIUC, this issue isn't caused by System.out, and it can
>> happen
>> > if
>> > > > > user
>> > > > > > > > call a lot of LOG.info. As I mentioned before: the user
>> didn't
>> > > > really
>> > > > > > > want
>> > > > > > > > to log into taskmanager.out, it just happened by accident.
>> > > > > > > > So, if users change the System.out to LOG.info, it still
>> > happen.
>> > > > > > > >
>> > > > > > > > > 2. Since we have redirected to LOG mode, Could we also log
>> > the
>> > > > > > subtask
>> > > > > > > > info
>> > > > > > > > > ? It may help us to debug granularly.
>> > > > > > > >
>> > > > > > > > I'm not sure what `log the subtask info` means. Let me
>> confirm
>> > > with
>> > > > > you
>> > > > > > > > first.
>> > > > > > > > Do you mean like this: LOG.info("taskName {} : {}",
>> taskName,
>> > > > > > > > userLogContext)?
>> > > > > > > >
>> > > > > > > > Best,
>> > > > > > > > Rui
>> > > > > > > >
>> > > > > > > > On Thu, Nov 9, 2023 at 11:47 AM Hangxiang Yu <
>> > > master...@gmail.com>
>> > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > Hi, Rui.
>> > > > > > > > > Thanks for the proposal. It sounds reasonable.
>> > > > > > > > > I have some questions, PTAL:
>> > > > > > > > > 1. I have a similar concern as Feng. Will we redirect to
>> > > another
>> > > > > log
>> > > > > > > file
>> > > > > > > > > not taskManager.log ?
>> > > > > > > > > taskManager.log contains lots of important information
>> like
>> > > init
>> > > > > log.
>> > > > > > > It
>> > > > > > > > > will be rolled quickly if we redirect out and error here.
>> > > > > > > > > 2. Since we have redirected to LOG mode, Could we also log
>> > the
>> > > > > > subtask
>> > > > > > > > info
>> > > > > > > > > ? It may help us to debug granularly.
>> > > > > > > > >
>> > > > > > > > > On Thu, Nov 9, 2023 at 9:47 AM Feng Jin <
>> > jinfeng1...@gmail.com
>> > > >
>> > > > > > wrote:
>> > > > > > > > >
>> > > > > > > > > > Hi, Rui.
>> > > > > > > > > >
>> > > > > > > > > > Thank you for initiating this proposal.
>> > > > > > > > > >
>> > > > > > > > > > I have a question regarding redirecting stdout and
>> stderr
>> > to
>> > > > LOG:
>> > > > > > > > > >
>> > > > > > > > > > Will they be written to the taskManager.log file by
>> default
>> > > or
>> > > > > the
>> > > > > > > > > > taskManager.out file?
>> > > > > > > > > > If we can make taskmanager.out splittable and rolling,
>> > would
>> > > it
>> > > > > be
>> > > > > > > > easier
>> > > > > > > > > > for users to use this feature?
>> > > > > > > > > >
>> > > > > > > > > > Best,
>> > > > > > > > > > Feng
>> > > > > > > > > >
>> > > > > > > > > > On Thu, Nov 9, 2023 at 3:15 AM Archit Goyal
>> > > > > > > > <argo...@linkedin.com.invalid
>> > > > > > > > > >
>> > > > > > > > > > wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Hi Rui,
>> > > > > > > > > > >
>> > > > > > > > > > > Thanks for the proposal.
>> > > > > > > > > > >
>> > > > > > > > > > > The proposed solution of supporting System out and
>> err to
>> > > be
>> > > > > > > > redirected
>> > > > > > > > > > to
>> > > > > > > > > > > LOG or discarded and introducing an enum and two
>> options
>> > to
>> > > > > > manage
>> > > > > > > > > this,
>> > > > > > > > > > > seems reasonable.
>> > > > > > > > > > >
>> > > > > > > > > > > +1
>> > > > > > > > > > >
>> > > > > > > > > > > Thanks,
>> > > > > > > > > > > Archit Goyal
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > From: Piotr Nowojski <pnowoj...@apache.org>
>> > > > > > > > > > > Date: Wednesday, November 8, 2023 at 7:38 AM
>> > > > > > > > > > > To: dev@flink.apache.org <dev@flink.apache.org>
>> > > > > > > > > > > Subject: Re: [DISCUSS] FLIP-390: Support System out
>> and
>> > err
>> > > > to
>> > > > > be
>> > > > > > > > > > > redirected to LOG or discarded
>> > > > > > > > > > > Hi Rui,
>> > > > > > > > > > >
>> > > > > > > > > > > Thanks for the proposal.
>> > > > > > > > > > >
>> > > > > > > > > > > +1 I don't have any major comments :)
>> > > > > > > > > > >
>> > > > > > > > > > > One nit. In `SystemOutRedirectToLog` in this code:
>> > > > > > > > > > >
>> > > > > > > > > > >            System.arraycopy(buf, count -
>> > > > LINE_SEPARATOR_LENGTH,
>> > > > > > > > bytes,
>> > > > > > > > > 0,
>> > > > > > > > > > > LINE_SEPARATOR_LENGTH);
>> > > > > > > > > > >             return Arrays.equals(LINE_SEPARATOR_BYTES,
>> > > bytes)
>> > > > > > > > > > >
>> > > > > > > > > > > Is there a reason why you are suggesting to copy out
>> > bytes
>> > > > from
>> > > > > > > `buf`
>> > > > > > > > > to
>> > > > > > > > > > > `bytes`,
>> > > > > > > > > > > instead of using `Arrays.equals(int[] a, int
>> aFromIndex,
>> > > int
>> > > > > > > > aToIndex,
>> > > > > > > > > > > int[] b, int bFromIndex, int bToIndex)`?
>> > > > > > > > > > >
>> > > > > > > > > > > Best,
>> > > > > > > > > > > Piotrek
>> > > > > > > > > > >
>> > > > > > > > > > > śr., 8 lis 2023 o 11:53 Rui Fan <1996fan...@gmail.com
>> >
>> > > > > > napisał(a):
>> > > > > > > > > > >
>> > > > > > > > > > > > Hi all!
>> > > > > > > > > > > >
>> > > > > > > > > > > > I would like to start a discussion of FLIP-390:
>> Support
>> > > > > System
>> > > > > > > out
>> > > > > > > > > and
>> > > > > > > > > > > err
>> > > > > > > > > > > > to be redirected to LOG or discarded[1].
>> > > > > > > > > > > >
>> > > > > > > > > > > > In various production environments, either cloud
>> native
>> > > or
>> > > > > > > physical
>> > > > > > > > > > > > machines, the disk space that Flink TaskManager can
>> use
>> > > is
>> > > > > > > limited.
>> > > > > > > > > > > >
>> > > > > > > > > > > > In general, the flink users shouldn't use the
>> > > > > > > `System.out.println`
>> > > > > > > > in
>> > > > > > > > > > > > production,
>> > > > > > > > > > > > however this may happen when the number of Flink
>> jobs
>> > and
>> > > > job
>> > > > > > > > > > developers
>> > > > > > > > > > > > is very large. Flink job may use System.out to
>> output a
>> > > > large
>> > > > > > > > amount
>> > > > > > > > > of
>> > > > > > > > > > > > data
>> > > > > > > > > > > > to the taskmanager.out file. This file will not
>> roll,
>> > it
>> > > > will
>> > > > > > > > always
>> > > > > > > > > > > > increment.
>> > > > > > > > > > > > Eventually the upper limit of what the TM can be
>> used
>> > for
>> > > > is
>> > > > > > > > reached.
>> > > > > > > > > > > >
>> > > > > > > > > > > > We can support System out and err to be redirected
>> to
>> > LOG
>> > > > or
>> > > > > > > > > discarded,
>> > > > > > > > > > > > the LOG can roll and won't increment forever.
>> > > > > > > > > > > >
>> > > > > > > > > > > > This feature is useful for SREs who maintain Flink
>> > > > > > environments,
>> > > > > > > > they
>> > > > > > > > > > can
>> > > > > > > > > > > > redirect System.out to LOG by default. Although the
>> > cause
>> > > > of
>> > > > > > this
>> > > > > > > > > > problem
>> > > > > > > > > > > > is
>> > > > > > > > > > > > that the user's code is not standardized, for SRE,
>> > > pushing
>> > > > > > users
>> > > > > > > to
>> > > > > > > > > > > modify
>> > > > > > > > > > > > the code one by one is usually a very time-consuming
>> > > > > operation.
>> > > > > > > > It's
>> > > > > > > > > > also
>> > > > > > > > > > > > useful for job stability where System.out is
>> > accidentally
>> > > > > used.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Looking forward to your feedback, thanks~
>> > > > > > > > > > > >
>> > > > > > > > > > > > [1]
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fx%2F4guZE&data=05%7C01%7Cargoyal%40linkedin.com%7C937821de7bd846e6b97408dbe070beae%7C72f988bf86f141af91ab2d7cd011db47%7C0%7C0%7C638350547072823674%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=zEv6B0Xiq2SNuU6Fm%2BAXnH%2BRILbm6Q0uDRbN7h6iaPM%3D&reserved=0
>> > > > > > > > > > > <https://cwiki.apache.org/confluence/x/4guZE>
>> > > > > > > > > > > >
>> > > > > > > > > > > > Best,
>> > > > > > > > > > > > Rui
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > --
>> > > > > > > > > Best,
>> > > > > > > > > Hangxiang.
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > > >
>> > > > --
>> > > > Best,
>> > > > Hangxiang.
>> > > >
>> > >
>> >
>> >
>> > --
>> > Best,
>> > Hangxiang.
>> >
>>
>

Reply via email to