Hi, Rui

Thanks for driving this. over LGTM.

I agree that redirecting stdout to LOG at first as it is not the default
behavior .
However, if there are other scenarios in the future, we can consider adding
additional parameters to redirect stdout to a separate file.


Best,
Feng

On Wed, Nov 15, 2023 at 2:23 PM Hangxiang Yu <master...@gmail.com> wrote:

> Hi, Rui.
> Overall LGTM now.
> Thanks for driving this again!
>
> On Wed, Nov 15, 2023 at 2:07 PM Rui Fan <1996fan...@gmail.com> wrote:
>
> > Hi Hangxiang,
> >
> > Thanks for your feedback!
> >
> > > I saw some users may debug operator correctness using print() or
> > System.out
> > > which may bring lots of info.
> > > It's fine if files are separate, otherwise these will make
> > taskMnanager.log
> > > roll quickly.
> > > But I think it's fine if we don't use it as default first.
> >
> > Got it, we will support redirect System.out to LOG, and keep
> > original behaviour(DEFAULT enum) as default value.
> >
> >
> > > > Do you mean like this: LOG.info("taskName {} : {}", taskName,
> > > > userLogContext)?
> > >
> > > Yes, That's what I mean.
> >
> > Sounds make sense, I added a new option to FLIP[1],
> > it's taskmanager.system-out.log.thread-name.enabled: false.
> >
> > Why log the thread name here?
> >
> > Flink's RichFunction can get subtaskId, but redirecting
> > System.out to LOG is public code, so it is difficult to get subtaskId.
> > After consideration, a workaround solution is to log thread name,
> > because the thread name of the default Task Thread contains the
> > task name and subtask id. And this solution can support all threads,
> > because some non-task threads may also call println.
> >
> > [1] https://cwiki.apache.org/confluence/x/4guZE
> >
> > Best,
> > Rui
> >
> > On Wed, Nov 15, 2023 at 10:26 AM Hangxiang Yu <master...@gmail.com>
> wrote:
> >
> > > Hi, Rui.
> > >
> > > As I mentioned before: the user didn't really want
> > > > to log into taskmanager.out, it just happened by accident.
> > > > So, if users change the System.out to LOG.info, it still happen.
> > >
> > > Thanks for the feedback.
> > > I saw some users may debug operator correctness using print() or
> > System.out
> > > which may bring lots of info.
> > > It's fine if files are separate, otherwise these will make
> > taskMnanager.log
> > > roll quickly.
> > > But I think it's fine if we don't use it as default first.
> > >
> > > Do you mean like this: LOG.info("taskName {} : {}", taskName,
> > > > userLogContext)?
> > > >
> > >
> > > Yes, That's what I mean.
> > >
> > > On Sat, Nov 11, 2023 at 12:54 AM Piotr Nowojski <
> > piotr.nowoj...@gmail.com>
> > > wrote:
> > >
> > > > Thanks! :)
> > > >
> > > > Best, Piotrek
> > > >
> > > > czw., 9 lis 2023 o 16:15 Rui Fan <1996fan...@gmail.com> napisał(a):
> > > >
> > > > > Hi Piotr,
> > > > >
> > > > > Thanks for your feedback!
> > > > >
> > > > > > Or implement your own loop? It shouldn't be more than a couple of
> > > > lines.
> > > > >
> > > > > Implementing it directly is fine, I have updated the FLIP.
> > > > > And this logic can be found in the  `isLineEnded` method.
> > > > >
> > > > > Best,
> > > > > Rui
> > > > >
> > > > > On Thu, Nov 9, 2023 at 11:00 PM Piotr Nowojski <
> > > piotr.nowoj...@gmail.com
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi Rui,
> > > > > >
> > > > > > > I see java8 doesn't have `Arrays.equals(int[] a, int
> aFromIndex,
> > > int
> > > > > > > aToIndex, int[] b, int bFromIndex, int bToIndex)`,
> > > > > > > and java11 has it. Do you have any other suggestions for java8?
> > > > > >
> > > > > > Maybe use `ByteBuffer.wrap`?
> > > > > >
> > > > > > ByteBuffer.wrap(array, ..., ...).equals(ByteBuffer.wrap(array2,
> > ...,
> > > > > ...))
> > > > > >
> > > > > > This shouldn't have overheads as far as I remember.
> > > > > >
> > > > > > Or implement your own loop? It shouldn't be more than a couple of
> > > > lines.
> > > > > >
> > > > > > Best,
> > > > > > Piotrek
> > > > > >
> > > > > > czw., 9 lis 2023 o 06:43 Rui Fan <1996fan...@gmail.com>
> > napisał(a):
> > > > > >
> > > > > > > Hi Piotr, Archit, Feng and Hangxiang:
> > > > > > >
> > > > > > > Thanks a lot for your feedback!
> > > > > > >
> > > > > > > Following is my comment, please correct me if I misunderstood
> > > > anything!
> > > > > > >
> > > > > > > To Piotr:
> > > > > > >
> > > > > > > > Is there a reason why you are suggesting to copy out bytes
> from
> > > > `buf`
> > > > > > to
> > > > > > > `bytes`,
> > > > > > > > instead of using `Arrays.equals(int[] a, int aFromIndex, int
> > > > > aToIndex,
> > > > > > > int[] b, int bFromIndex, int bToIndex)`?
> > > > > > >
> > > > > > > I see java8 doesn't have `Arrays.equals(int[] a, int
> aFromIndex,
> > > int
> > > > > > > aToIndex, int[] b, int bFromIndex, int bToIndex)`,
> > > > > > > and java11 has it. Do you have any other suggestions for java8?
> > > > > > >
> > > > > > > Also, this code doesn't run in production. As the comment of
> > > > > > > System.lineSeparator():
> > > > > > >
> > > > > > > > On UNIX systems, it returns {@code "\n"}; on Microsoft
> > > > > > > > Windows systems it returns {@code "\r\n"}.
> > > > > > >
> > > > > > > So Mac and Linux just return one character, we will compare
> > > > > > > one byte directly.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > To Feng:
> > > > > > >
> > > > > > > > Will they be written to the taskManager.log file by default
> > > > > > > > or the taskManager.out file?
> > > > > > >
> > > > > > > I prefer LOG as the default value for
> > taskmanager.system-out.mode.
> > > > > > > It's useful for job stability and doesn't introduce significant
> > > > impact
> > > > > to
> > > > > > > users. Also, our production has already used this feature for
> > > > > > > more than 1 years, it works well.
> > > > > > >
> > > > > > > However, I write the DEFAULT as the default value for
> > > > > > > taskmanager.system-out.mode, because when the community
> > introduces
> > > > > > > new options, the default value often selects the original
> > behavior.
> > > > > > >
> > > > > > > Looking forward to hear more thoughts from community about this
> > > > > > > default value.
> > > > > > >
> > > > > > > > If we can make taskmanager.out splittable and rolling, would
> it
> > > be
> > > > > > > > easier for users to use this feature?
> > > > > > >
> > > > > > > Making taskmanager.out splittable and rolling is a good choice!
> > > > > > > I have some concerns about it:
> > > > > > >
> > > > > > > 1. Users may also want to use LOG.info in their code and just
> > > > > > >   accidentally use System.out.println. It is possible that they
> > > will
> > > > > > >   also find the logs directly in taskmanager.log.
> > > > > > > 2. I'm not sure whether the rolling strategy is easy to
> > implement.
> > > > > > >   If we do it, it's necessary to define a series of flink
> options
> > > > > similar
> > > > > > >   to log options, such as: fileMax(how many files should be
> > > > retained),
> > > > > > >   fileSize(The max size each file), fileNamePatten (The suffix
> of
> > > > file
> > > > > > > name),
> > > > > > > 3. Check the file size periodically: all logs are written by
> log
> > > > > plugin,
> > > > > > >   they can check the log file size after writing. However,
> > > System.out
> > > > > > >   are written directly. And flink must start a thread to check
> > the
> > > > > latest
> > > > > > >   taskmanager.out size periodically. If it's too quick, most of
> > job
> > > > > > aren't
> > > > > > >   necessary. If it's too slow, the file size cannot be
> controlled
> > > > > > properly.
> > > > > > >
> > > > > > > Redirect it to LOG.info may be a reasonable and easy choice.
> > > > > > > The user didn't really want to log into taskmanager.out, it
> just
> > > > > > > happened by accident.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > To Hangxiang:
> > > > > > >
> > > > > > > > 1. I have a similar concern as Feng. Will we redirect to
> > another
> > > > log
> > > > > > file
> > > > > > > > not taskManager.log ?
> > > > > > >
> > > > > > > Please see my last comment, thanks!
> > > > > > >
> > > > > > > > taskManager.log contains lots of important information like
> > init
> > > > log.
> > > > > > It
> > > > > > > > will be rolled quickly if we redirect out and error here.
> > > > > > >
> > > > > > > IIUC, this issue isn't caused by System.out, and it can happen
> if
> > > > user
> > > > > > > call a lot of LOG.info. As I mentioned before: the user didn't
> > > really
> > > > > > want
> > > > > > > to log into taskmanager.out, it just happened by accident.
> > > > > > > So, if users change the System.out to LOG.info, it still
> happen.
> > > > > > >
> > > > > > > > 2. Since we have redirected to LOG mode, Could we also log
> the
> > > > > subtask
> > > > > > > info
> > > > > > > > ? It may help us to debug granularly.
> > > > > > >
> > > > > > > I'm not sure what `log the subtask info` means. Let me confirm
> > with
> > > > you
> > > > > > > first.
> > > > > > > Do you mean like this: LOG.info("taskName {} : {}", taskName,
> > > > > > > userLogContext)?
> > > > > > >
> > > > > > > Best,
> > > > > > > Rui
> > > > > > >
> > > > > > > On Thu, Nov 9, 2023 at 11:47 AM Hangxiang Yu <
> > master...@gmail.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hi, Rui.
> > > > > > > > Thanks for the proposal. It sounds reasonable.
> > > > > > > > I have some questions, PTAL:
> > > > > > > > 1. I have a similar concern as Feng. Will we redirect to
> > another
> > > > log
> > > > > > file
> > > > > > > > not taskManager.log ?
> > > > > > > > taskManager.log contains lots of important information like
> > init
> > > > log.
> > > > > > It
> > > > > > > > will be rolled quickly if we redirect out and error here.
> > > > > > > > 2. Since we have redirected to LOG mode, Could we also log
> the
> > > > > subtask
> > > > > > > info
> > > > > > > > ? It may help us to debug granularly.
> > > > > > > >
> > > > > > > > On Thu, Nov 9, 2023 at 9:47 AM Feng Jin <
> jinfeng1...@gmail.com
> > >
> > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi, Rui.
> > > > > > > > >
> > > > > > > > > Thank you for initiating this proposal.
> > > > > > > > >
> > > > > > > > > I have a question regarding redirecting stdout and stderr
> to
> > > LOG:
> > > > > > > > >
> > > > > > > > > Will they be written to the taskManager.log file by default
> > or
> > > > the
> > > > > > > > > taskManager.out file?
> > > > > > > > > If we can make taskmanager.out splittable and rolling,
> would
> > it
> > > > be
> > > > > > > easier
> > > > > > > > > for users to use this feature?
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Feng
> > > > > > > > >
> > > > > > > > > On Thu, Nov 9, 2023 at 3:15 AM Archit Goyal
> > > > > > > <argo...@linkedin.com.invalid
> > > > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Rui,
> > > > > > > > > >
> > > > > > > > > > Thanks for the proposal.
> > > > > > > > > >
> > > > > > > > > > The proposed solution of supporting System out and err to
> > be
> > > > > > > redirected
> > > > > > > > > to
> > > > > > > > > > LOG or discarded and introducing an enum and two options
> to
> > > > > manage
> > > > > > > > this,
> > > > > > > > > > seems reasonable.
> > > > > > > > > >
> > > > > > > > > > +1
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Archit Goyal
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > From: Piotr Nowojski <pnowoj...@apache.org>
> > > > > > > > > > Date: Wednesday, November 8, 2023 at 7:38 AM
> > > > > > > > > > To: dev@flink.apache.org <dev@flink.apache.org>
> > > > > > > > > > Subject: Re: [DISCUSS] FLIP-390: Support System out and
> err
> > > to
> > > > be
> > > > > > > > > > redirected to LOG or discarded
> > > > > > > > > > Hi Rui,
> > > > > > > > > >
> > > > > > > > > > Thanks for the proposal.
> > > > > > > > > >
> > > > > > > > > > +1 I don't have any major comments :)
> > > > > > > > > >
> > > > > > > > > > One nit. In `SystemOutRedirectToLog` in this code:
> > > > > > > > > >
> > > > > > > > > >            System.arraycopy(buf, count -
> > > LINE_SEPARATOR_LENGTH,
> > > > > > > bytes,
> > > > > > > > 0,
> > > > > > > > > > LINE_SEPARATOR_LENGTH);
> > > > > > > > > >             return Arrays.equals(LINE_SEPARATOR_BYTES,
> > bytes)
> > > > > > > > > >
> > > > > > > > > > Is there a reason why you are suggesting to copy out
> bytes
> > > from
> > > > > > `buf`
> > > > > > > > to
> > > > > > > > > > `bytes`,
> > > > > > > > > > instead of using `Arrays.equals(int[] a, int aFromIndex,
> > int
> > > > > > > aToIndex,
> > > > > > > > > > int[] b, int bFromIndex, int bToIndex)`?
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Piotrek
> > > > > > > > > >
> > > > > > > > > > śr., 8 lis 2023 o 11:53 Rui Fan <1996fan...@gmail.com>
> > > > > napisał(a):
> > > > > > > > > >
> > > > > > > > > > > Hi all!
> > > > > > > > > > >
> > > > > > > > > > > I would like to start a discussion of FLIP-390: Support
> > > > System
> > > > > > out
> > > > > > > > and
> > > > > > > > > > err
> > > > > > > > > > > to be redirected to LOG or discarded[1].
> > > > > > > > > > >
> > > > > > > > > > > In various production environments, either cloud native
> > or
> > > > > > physical
> > > > > > > > > > > machines, the disk space that Flink TaskManager can use
> > is
> > > > > > limited.
> > > > > > > > > > >
> > > > > > > > > > > In general, the flink users shouldn't use the
> > > > > > `System.out.println`
> > > > > > > in
> > > > > > > > > > > production,
> > > > > > > > > > > however this may happen when the number of Flink jobs
> and
> > > job
> > > > > > > > > developers
> > > > > > > > > > > is very large. Flink job may use System.out to output a
> > > large
> > > > > > > amount
> > > > > > > > of
> > > > > > > > > > > data
> > > > > > > > > > > to the taskmanager.out file. This file will not roll,
> it
> > > will
> > > > > > > always
> > > > > > > > > > > increment.
> > > > > > > > > > > Eventually the upper limit of what the TM can be used
> for
> > > is
> > > > > > > reached.
> > > > > > > > > > >
> > > > > > > > > > > We can support System out and err to be redirected to
> LOG
> > > or
> > > > > > > > discarded,
> > > > > > > > > > > the LOG can roll and won't increment forever.
> > > > > > > > > > >
> > > > > > > > > > > This feature is useful for SREs who maintain Flink
> > > > > environments,
> > > > > > > they
> > > > > > > > > can
> > > > > > > > > > > redirect System.out to LOG by default. Although the
> cause
> > > of
> > > > > this
> > > > > > > > > problem
> > > > > > > > > > > is
> > > > > > > > > > > that the user's code is not standardized, for SRE,
> > pushing
> > > > > users
> > > > > > to
> > > > > > > > > > modify
> > > > > > > > > > > the code one by one is usually a very time-consuming
> > > > operation.
> > > > > > > It's
> > > > > > > > > also
> > > > > > > > > > > useful for job stability where System.out is
> accidentally
> > > > used.
> > > > > > > > > > >
> > > > > > > > > > > Looking forward to your feedback, thanks~
> > > > > > > > > > >
> > > > > > > > > > > [1]
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fx%2F4guZE&data=05%7C01%7Cargoyal%40linkedin.com%7C937821de7bd846e6b97408dbe070beae%7C72f988bf86f141af91ab2d7cd011db47%7C0%7C0%7C638350547072823674%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=zEv6B0Xiq2SNuU6Fm%2BAXnH%2BRILbm6Q0uDRbN7h6iaPM%3D&reserved=0
> > > > > > > > > > <https://cwiki.apache.org/confluence/x/4guZE>
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Rui
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Best,
> > > > > > > > Hangxiang.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best,
> > > Hangxiang.
> > >
> >
>
>
> --
> Best,
> Hangxiang.
>

Reply via email to