RE: Flink 1.11 job hit error "Job leader lost leadership" or "ResourceManager leader changed to new address null"

2021-03-27 Thread Colletta, Edward
FYI, we experienced a similar error again: leadership was lost, not because of a timeout but because of a disconnect from ZooKeeper. This time I examined the logs for other errors related to ZooKeeper and found that the Kafka cluster that uses the same ZooKeeper was also disconnected. We run on AWS and this seems to be

Re: How to visualize the results of Flink processing or aggregation?

2021-03-27 Thread Xiong Qiang
Thank you, @David Anderson and @Fuyao Li. This answered my question and cleared up my confusion.

On Fri, Mar 26, 2021 at 11:08 AM David Anderson wrote:
> Prometheus is a metrics system; you can use Flink's Prometheus metrics
> reporter to send metrics to Prometheus.
>
> Grafana can also be connect