Re: [DISCUSS] Improve history server with log support

2020-02-17 Thread SHI Xiaogang
Hi all, Thanks a lot for your interest. We are very interested in contributing the trace system to the community. We will draft a design document and share it soon. The trace system is actually a complement to the existing metric and logging systems, and definitely cannot replace the logging system.

Re: [DISCUSS] Improve history server with log support

2020-02-17 Thread Rong Rong
Hi All, Thank you all for the prompt feedback. Based on the discussion, I think this seems to be a very useful feature. I will start an initial draft of a design doc (or should it be a FLIP?) and share it with the community. Hi Yang, Thanks for the interest and thanks for sharing the ideas. In f

Re: [DISCUSS] Improve history server with log support

2020-02-16 Thread Yang Wang
Hi Rong Rong, Thanks for starting this discussion. I think the log is an important part of improving the user experience of Flink. Logs are very important for debugging problems or checking the expected output. Some users, especially for machine learning, print global steps or residual to the log

Re: [DISCUSS] Improve history server with log support

2020-02-14 Thread Venkata Sanath Muppalla
@Xiaogang Could you please share more details about the trace mechanism you mentioned? As Rong mentioned, we are also working on something similar. On Fri, Feb 14, 2020, 9:12 AM Rong Rong wrote: > Thank you for the prompt feedback > > @Aljoscha. Yes you are absolutely correct - adding Hadoop depend

Re: [DISCUSS] Improve history server with log support

2020-02-14 Thread Rong Rong
Thank you for the prompt feedback. @Aljoscha: Yes, you are absolutely correct - adding a Hadoop dependency to the cluster runtime component is definitely not what we are proposing. We were trying to see what the community thinks about the idea of adding log support to the History server. - The reference t

Re: [DISCUSS] Improve history server with log support

2020-02-13 Thread Aljoscha Krettek
Hi, what's the difference in approach to the mentioned related Jira Issue ([1])? I commented there because I'm skeptical about adding Hadoop-specific code to the generic cluster components. Best, Aljoscha [1] https://issues.apache.org/jira/browse/FLINK-14317 On 13.02.20 03:47, SHI Xiaogang

Re: [DISCUSS] Improve history server with log support

2020-02-12 Thread SHI Xiaogang
Hi Rong Rong, Thanks for the proposal. We are also suffering from some pain points caused by the history server. To address them, we propose a trace system, which is very similar to the metric system, for historical information. A trace is semi-structured information about events in Flink. Useful traces i

[DISCUSS] Improve history server with log support

2020-02-12 Thread Rong Rong
Hi All, Recently we have been experimenting using Flink’s history server as a centralized debugging service for completed streaming jobs. Specifically, we dynamically generate links to access log files on the YARN host; in the meantime, we use the Flink history server to show job graphs, exceptio