Hi Hangxiang,

Thanks for your feedback!

> I saw some users may debug operator correctness using print() or
System.out
> which may bring lots of info.
> It's fine if files are separate, otherwise these will make
taskMnanager.log
> roll quickly.
> But I think it's fine if we don't use it as default first.

Got it, we will support redirect System.out to LOG, and keep
original behaviour(DEFAULT enum) as default value.


> > Do you mean like this: LOG.info("taskName {} : {}", taskName,
> > userLogContext)?
>
> Yes, That's what I mean.

Sounds make sense, I added a new option to FLIP[1],
it's taskmanager.system-out.log.thread-name.enabled: false.

Why log the thread name here?

Flink's RichFunction can get subtaskId, but redirecting
System.out to LOG is public code, so it is difficult to get subtaskId.
After consideration, a workaround solution is to log thread name,
because the thread name of the default Task Thread contains the
task name and subtask id. And this solution can support all threads,
because some non-task threads may also call println.

[1] https://cwiki.apache.org/confluence/x/4guZE

Best,
Rui

On Wed, Nov 15, 2023 at 10:26 AM Hangxiang Yu <master...@gmail.com> wrote:

> Hi, Rui.
>
> As I mentioned before: the user didn't really want
> > to log into taskmanager.out, it just happened by accident.
> > So, if users change the System.out to LOG.info, it still happen.
>
> Thanks for the feedback.
> I saw some users may debug operator correctness using print() or System.out
> which may bring lots of info.
> It's fine if files are separate, otherwise these will make taskMnanager.log
> roll quickly.
> But I think it's fine if we don't use it as default first.
>
> Do you mean like this: LOG.info("taskName {} : {}", taskName,
> > userLogContext)?
> >
>
> Yes, That's what I mean.
>
> On Sat, Nov 11, 2023 at 12:54 AM Piotr Nowojski <piotr.nowoj...@gmail.com>
> wrote:
>
> > Thanks! :)
> >
> > Best, Piotrek
> >
> > czw., 9 lis 2023 o 16:15 Rui Fan <1996fan...@gmail.com> napisał(a):
> >
> > > Hi Piotr,
> > >
> > > Thanks for your feedback!
> > >
> > > > Or implement your own loop? It shouldn't be more than a couple of
> > lines.
> > >
> > > Implementing it directly is fine, I have updated the FLIP.
> > > And this logic can be found in the  `isLineEnded` method.
> > >
> > > Best,
> > > Rui
> > >
> > > On Thu, Nov 9, 2023 at 11:00 PM Piotr Nowojski <
> piotr.nowoj...@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi Rui,
> > > >
> > > > > I see java8 doesn't have `Arrays.equals(int[] a, int aFromIndex,
> int
> > > > > aToIndex, int[] b, int bFromIndex, int bToIndex)`,
> > > > > and java11 has it. Do you have any other suggestions for java8?
> > > >
> > > > Maybe use `ByteBuffer.wrap`?
> > > >
> > > > ByteBuffer.wrap(array, ..., ...).equals(ByteBuffer.wrap(array2, ...,
> > > ...))
> > > >
> > > > This shouldn't have overheads as far as I remember.
> > > >
> > > > Or implement your own loop? It shouldn't be more than a couple of
> > lines.
> > > >
> > > > Best,
> > > > Piotrek
> > > >
> > > > czw., 9 lis 2023 o 06:43 Rui Fan <1996fan...@gmail.com> napisał(a):
> > > >
> > > > > Hi Piotr, Archit, Feng and Hangxiang:
> > > > >
> > > > > Thanks a lot for your feedback!
> > > > >
> > > > > Following is my comment, please correct me if I misunderstood
> > anything!
> > > > >
> > > > > To Piotr:
> > > > >
> > > > > > Is there a reason why you are suggesting to copy out bytes from
> > `buf`
> > > > to
> > > > > `bytes`,
> > > > > > instead of using `Arrays.equals(int[] a, int aFromIndex, int
> > > aToIndex,
> > > > > int[] b, int bFromIndex, int bToIndex)`?
> > > > >
> > > > > I see java8 doesn't have `Arrays.equals(int[] a, int aFromIndex,
> int
> > > > > aToIndex, int[] b, int bFromIndex, int bToIndex)`,
> > > > > and java11 has it. Do you have any other suggestions for java8?
> > > > >
> > > > > Also, this code doesn't run in production. As the comment of
> > > > > System.lineSeparator():
> > > > >
> > > > > > On UNIX systems, it returns {@code "\n"}; on Microsoft
> > > > > > Windows systems it returns {@code "\r\n"}.
> > > > >
> > > > > So Mac and Linux just return one character, we will compare
> > > > > one byte directly.
> > > > >
> > > > >
> > > > >
> > > > > To Feng:
> > > > >
> > > > > > Will they be written to the taskManager.log file by default
> > > > > > or the taskManager.out file?
> > > > >
> > > > > I prefer LOG as the default value for taskmanager.system-out.mode.
> > > > > It's useful for job stability and doesn't introduce significant
> > impact
> > > to
> > > > > users. Also, our production has already used this feature for
> > > > > more than 1 years, it works well.
> > > > >
> > > > > However, I write the DEFAULT as the default value for
> > > > > taskmanager.system-out.mode, because when the community introduces
> > > > > new options, the default value often selects the original behavior.
> > > > >
> > > > > Looking forward to hear more thoughts from community about this
> > > > > default value.
> > > > >
> > > > > > If we can make taskmanager.out splittable and rolling, would it
> be
> > > > > > easier for users to use this feature?
> > > > >
> > > > > Making taskmanager.out splittable and rolling is a good choice!
> > > > > I have some concerns about it:
> > > > >
> > > > > 1. Users may also want to use LOG.info in their code and just
> > > > >   accidentally use System.out.println. It is possible that they
> will
> > > > >   also find the logs directly in taskmanager.log.
> > > > > 2. I'm not sure whether the rolling strategy is easy to implement.
> > > > >   If we do it, it's necessary to define a series of flink options
> > > similar
> > > > >   to log options, such as: fileMax(how many files should be
> > retained),
> > > > >   fileSize(The max size each file), fileNamePatten (The suffix of
> > file
> > > > > name),
> > > > > 3. Check the file size periodically: all logs are written by log
> > > plugin,
> > > > >   they can check the log file size after writing. However,
> System.out
> > > > >   are written directly. And flink must start a thread to check the
> > > latest
> > > > >   taskmanager.out size periodically. If it's too quick, most of job
> > > > aren't
> > > > >   necessary. If it's too slow, the file size cannot be controlled
> > > > properly.
> > > > >
> > > > > Redirect it to LOG.info may be a reasonable and easy choice.
> > > > > The user didn't really want to log into taskmanager.out, it just
> > > > > happened by accident.
> > > > >
> > > > >
> > > > >
> > > > > To Hangxiang:
> > > > >
> > > > > > 1. I have a similar concern as Feng. Will we redirect to another
> > log
> > > > file
> > > > > > not taskManager.log ?
> > > > >
> > > > > Please see my last comment, thanks!
> > > > >
> > > > > > taskManager.log contains lots of important information like init
> > log.
> > > > It
> > > > > > will be rolled quickly if we redirect out and error here.
> > > > >
> > > > > IIUC, this issue isn't caused by System.out, and it can happen if
> > user
> > > > > call a lot of LOG.info. As I mentioned before: the user didn't
> really
> > > > want
> > > > > to log into taskmanager.out, it just happened by accident.
> > > > > So, if users change the System.out to LOG.info, it still happen.
> > > > >
> > > > > > 2. Since we have redirected to LOG mode, Could we also log the
> > > subtask
> > > > > info
> > > > > > ? It may help us to debug granularly.
> > > > >
> > > > > I'm not sure what `log the subtask info` means. Let me confirm with
> > you
> > > > > first.
> > > > > Do you mean like this: LOG.info("taskName {} : {}", taskName,
> > > > > userLogContext)?
> > > > >
> > > > > Best,
> > > > > Rui
> > > > >
> > > > > On Thu, Nov 9, 2023 at 11:47 AM Hangxiang Yu <master...@gmail.com>
> > > > wrote:
> > > > >
> > > > > > Hi, Rui.
> > > > > > Thanks for the proposal. It sounds reasonable.
> > > > > > I have some questions, PTAL:
> > > > > > 1. I have a similar concern as Feng. Will we redirect to another
> > log
> > > > file
> > > > > > not taskManager.log ?
> > > > > > taskManager.log contains lots of important information like init
> > log.
> > > > It
> > > > > > will be rolled quickly if we redirect out and error here.
> > > > > > 2. Since we have redirected to LOG mode, Could we also log the
> > > subtask
> > > > > info
> > > > > > ? It may help us to debug granularly.
> > > > > >
> > > > > > On Thu, Nov 9, 2023 at 9:47 AM Feng Jin <jinfeng1...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > > Hi, Rui.
> > > > > > >
> > > > > > > Thank you for initiating this proposal.
> > > > > > >
> > > > > > > I have a question regarding redirecting stdout and stderr to
> LOG:
> > > > > > >
> > > > > > > Will they be written to the taskManager.log file by default or
> > the
> > > > > > > taskManager.out file?
> > > > > > > If we can make taskmanager.out splittable and rolling, would it
> > be
> > > > > easier
> > > > > > > for users to use this feature?
> > > > > > >
> > > > > > > Best,
> > > > > > > Feng
> > > > > > >
> > > > > > > On Thu, Nov 9, 2023 at 3:15 AM Archit Goyal
> > > > > <argo...@linkedin.com.invalid
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Rui,
> > > > > > > >
> > > > > > > > Thanks for the proposal.
> > > > > > > >
> > > > > > > > The proposed solution of supporting System out and err to be
> > > > > redirected
> > > > > > > to
> > > > > > > > LOG or discarded and introducing an enum and two options to
> > > manage
> > > > > > this,
> > > > > > > > seems reasonable.
> > > > > > > >
> > > > > > > > +1
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Archit Goyal
> > > > > > > >
> > > > > > > >
> > > > > > > > From: Piotr Nowojski <pnowoj...@apache.org>
> > > > > > > > Date: Wednesday, November 8, 2023 at 7:38 AM
> > > > > > > > To: dev@flink.apache.org <dev@flink.apache.org>
> > > > > > > > Subject: Re: [DISCUSS] FLIP-390: Support System out and err
> to
> > be
> > > > > > > > redirected to LOG or discarded
> > > > > > > > Hi Rui,
> > > > > > > >
> > > > > > > > Thanks for the proposal.
> > > > > > > >
> > > > > > > > +1 I don't have any major comments :)
> > > > > > > >
> > > > > > > > One nit. In `SystemOutRedirectToLog` in this code:
> > > > > > > >
> > > > > > > >            System.arraycopy(buf, count -
> LINE_SEPARATOR_LENGTH,
> > > > > bytes,
> > > > > > 0,
> > > > > > > > LINE_SEPARATOR_LENGTH);
> > > > > > > >             return Arrays.equals(LINE_SEPARATOR_BYTES, bytes)
> > > > > > > >
> > > > > > > > Is there a reason why you are suggesting to copy out bytes
> from
> > > > `buf`
> > > > > > to
> > > > > > > > `bytes`,
> > > > > > > > instead of using `Arrays.equals(int[] a, int aFromIndex, int
> > > > > aToIndex,
> > > > > > > > int[] b, int bFromIndex, int bToIndex)`?
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Piotrek
> > > > > > > >
> > > > > > > > śr., 8 lis 2023 o 11:53 Rui Fan <1996fan...@gmail.com>
> > > napisał(a):
> > > > > > > >
> > > > > > > > > Hi all!
> > > > > > > > >
> > > > > > > > > I would like to start a discussion of FLIP-390: Support
> > System
> > > > out
> > > > > > and
> > > > > > > > err
> > > > > > > > > to be redirected to LOG or discarded[1].
> > > > > > > > >
> > > > > > > > > In various production environments, either cloud native or
> > > > physical
> > > > > > > > > machines, the disk space that Flink TaskManager can use is
> > > > limited.
> > > > > > > > >
> > > > > > > > > In general, the flink users shouldn't use the
> > > > `System.out.println`
> > > > > in
> > > > > > > > > production,
> > > > > > > > > however this may happen when the number of Flink jobs and
> job
> > > > > > > developers
> > > > > > > > > is very large. Flink job may use System.out to output a
> large
> > > > > amount
> > > > > > of
> > > > > > > > > data
> > > > > > > > > to the taskmanager.out file. This file will not roll, it
> will
> > > > > always
> > > > > > > > > increment.
> > > > > > > > > Eventually the upper limit of what the TM can be used for
> is
> > > > > reached.
> > > > > > > > >
> > > > > > > > > We can support System out and err to be redirected to LOG
> or
> > > > > > discarded,
> > > > > > > > > the LOG can roll and won't increment forever.
> > > > > > > > >
> > > > > > > > > This feature is useful for SREs who maintain Flink
> > > environments,
> > > > > they
> > > > > > > can
> > > > > > > > > redirect System.out to LOG by default. Although the cause
> of
> > > this
> > > > > > > problem
> > > > > > > > > is
> > > > > > > > > that the user's code is not standardized, for SRE, pushing
> > > users
> > > > to
> > > > > > > > modify
> > > > > > > > > the code one by one is usually a very time-consuming
> > operation.
> > > > > It's
> > > > > > > also
> > > > > > > > > useful for job stability where System.out is accidentally
> > used.
> > > > > > > > >
> > > > > > > > > Looking forward to your feedback, thanks~
> > > > > > > > >
> > > > > > > > > [1]
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fx%2F4guZE&data=05%7C01%7Cargoyal%40linkedin.com%7C937821de7bd846e6b97408dbe070beae%7C72f988bf86f141af91ab2d7cd011db47%7C0%7C0%7C638350547072823674%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=zEv6B0Xiq2SNuU6Fm%2BAXnH%2BRILbm6Q0uDRbN7h6iaPM%3D&reserved=0
> > > > > > > > <https://cwiki.apache.org/confluence/x/4guZE>
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Rui
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Best,
> > > > > > Hangxiang.
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
> --
> Best,
> Hangxiang.
>

Reply via email to