Hi Piotr, Archit, Feng and Hangxiang: Thanks a lot for your feedback!
Following is my comment, please correct me if I misunderstood anything! To Piotr: > Is there a reason why you are suggesting to copy out bytes from `buf` to `bytes`, > instead of using `Arrays.equals(int[] a, int aFromIndex, int aToIndex, int[] b, int bFromIndex, int bToIndex)`? I see java8 doesn't have `Arrays.equals(int[] a, int aFromIndex, int aToIndex, int[] b, int bFromIndex, int bToIndex)`, and java11 has it. Do you have any other suggestions for java8? Also, this code doesn't run in production. As the comment of System.lineSeparator(): > On UNIX systems, it returns {@code "\n"}; on Microsoft > Windows systems it returns {@code "\r\n"}. So Mac and Linux just return one character, we will compare one byte directly. To Feng: > Will they be written to the taskManager.log file by default > or the taskManager.out file? I prefer LOG as the default value for taskmanager.system-out.mode. It's useful for job stability and doesn't introduce significant impact to users. Also, our production has already used this feature for more than 1 years, it works well. However, I write the DEFAULT as the default value for taskmanager.system-out.mode, because when the community introduces new options, the default value often selects the original behavior. Looking forward to hear more thoughts from community about this default value. > If we can make taskmanager.out splittable and rolling, would it be > easier for users to use this feature? Making taskmanager.out splittable and rolling is a good choice! I have some concerns about it: 1. Users may also want to use LOG.info in their code and just accidentally use System.out.println. It is possible that they will also find the logs directly in taskmanager.log. 2. I'm not sure whether the rolling strategy is easy to implement. If we do it, it's necessary to define a series of flink options similar to log options, such as: fileMax(how many files should be retained), fileSize(The max size each file), fileNamePatten (The suffix of file name), 3. Check the file size periodically: all logs are written by log plugin, they can check the log file size after writing. However, System.out are written directly. And flink must start a thread to check the latest taskmanager.out size periodically. If it's too quick, most of job aren't necessary. If it's too slow, the file size cannot be controlled properly. Redirect it to LOG.info may be a reasonable and easy choice. The user didn't really want to log into taskmanager.out, it just happened by accident. To Hangxiang: > 1. I have a similar concern as Feng. Will we redirect to another log file > not taskManager.log ? Please see my last comment, thanks! > taskManager.log contains lots of important information like init log. It > will be rolled quickly if we redirect out and error here. IIUC, this issue isn't caused by System.out, and it can happen if user call a lot of LOG.info. As I mentioned before: the user didn't really want to log into taskmanager.out, it just happened by accident. So, if users change the System.out to LOG.info, it still happen. > 2. Since we have redirected to LOG mode, Could we also log the subtask info > ? It may help us to debug granularly. I'm not sure what `log the subtask info` means. Let me confirm with you first. Do you mean like this: LOG.info("taskName {} : {}", taskName, userLogContext)? Best, Rui On Thu, Nov 9, 2023 at 11:47 AM Hangxiang Yu <master...@gmail.com> wrote: > Hi, Rui. > Thanks for the proposal. It sounds reasonable. > I have some questions, PTAL: > 1. I have a similar concern as Feng. Will we redirect to another log file > not taskManager.log ? > taskManager.log contains lots of important information like init log. It > will be rolled quickly if we redirect out and error here. > 2. Since we have redirected to LOG mode, Could we also log the subtask info > ? It may help us to debug granularly. > > On Thu, Nov 9, 2023 at 9:47 AM Feng Jin <jinfeng1...@gmail.com> wrote: > > > Hi, Rui. > > > > Thank you for initiating this proposal. > > > > I have a question regarding redirecting stdout and stderr to LOG: > > > > Will they be written to the taskManager.log file by default or the > > taskManager.out file? > > If we can make taskmanager.out splittable and rolling, would it be easier > > for users to use this feature? > > > > Best, > > Feng > > > > On Thu, Nov 9, 2023 at 3:15 AM Archit Goyal <argo...@linkedin.com.invalid > > > > wrote: > > > > > Hi Rui, > > > > > > Thanks for the proposal. > > > > > > The proposed solution of supporting System out and err to be redirected > > to > > > LOG or discarded and introducing an enum and two options to manage > this, > > > seems reasonable. > > > > > > +1 > > > > > > Thanks, > > > Archit Goyal > > > > > > > > > From: Piotr Nowojski <pnowoj...@apache.org> > > > Date: Wednesday, November 8, 2023 at 7:38 AM > > > To: dev@flink.apache.org <dev@flink.apache.org> > > > Subject: Re: [DISCUSS] FLIP-390: Support System out and err to be > > > redirected to LOG or discarded > > > Hi Rui, > > > > > > Thanks for the proposal. > > > > > > +1 I don't have any major comments :) > > > > > > One nit. In `SystemOutRedirectToLog` in this code: > > > > > > System.arraycopy(buf, count - LINE_SEPARATOR_LENGTH, bytes, > 0, > > > LINE_SEPARATOR_LENGTH); > > > return Arrays.equals(LINE_SEPARATOR_BYTES, bytes) > > > > > > Is there a reason why you are suggesting to copy out bytes from `buf` > to > > > `bytes`, > > > instead of using `Arrays.equals(int[] a, int aFromIndex, int aToIndex, > > > int[] b, int bFromIndex, int bToIndex)`? > > > > > > Best, > > > Piotrek > > > > > > śr., 8 lis 2023 o 11:53 Rui Fan <1996fan...@gmail.com> napisał(a): > > > > > > > Hi all! > > > > > > > > I would like to start a discussion of FLIP-390: Support System out > and > > > err > > > > to be redirected to LOG or discarded[1]. > > > > > > > > In various production environments, either cloud native or physical > > > > machines, the disk space that Flink TaskManager can use is limited. > > > > > > > > In general, the flink users shouldn't use the `System.out.println` in > > > > production, > > > > however this may happen when the number of Flink jobs and job > > developers > > > > is very large. Flink job may use System.out to output a large amount > of > > > > data > > > > to the taskmanager.out file. This file will not roll, it will always > > > > increment. > > > > Eventually the upper limit of what the TM can be used for is reached. > > > > > > > > We can support System out and err to be redirected to LOG or > discarded, > > > > the LOG can roll and won't increment forever. > > > > > > > > This feature is useful for SREs who maintain Flink environments, they > > can > > > > redirect System.out to LOG by default. Although the cause of this > > problem > > > > is > > > > that the user's code is not standardized, for SRE, pushing users to > > > modify > > > > the code one by one is usually a very time-consuming operation. It's > > also > > > > useful for job stability where System.out is accidentally used. > > > > > > > > Looking forward to your feedback, thanks~ > > > > > > > > [1] > > > > > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fx%2F4guZE&data=05%7C01%7Cargoyal%40linkedin.com%7C937821de7bd846e6b97408dbe070beae%7C72f988bf86f141af91ab2d7cd011db47%7C0%7C0%7C638350547072823674%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=zEv6B0Xiq2SNuU6Fm%2BAXnH%2BRILbm6Q0uDRbN7h6iaPM%3D&reserved=0 > > > <https://cwiki.apache.org/confluence/x/4guZE> > > > > > > > > Best, > > > > Rui > > > > > > > > > > > > -- > Best, > Hangxiang. >