Re: [DISCUSS] KIP-551: Expose disk read and write metrics

2020-01-09 Thread Colin McCabe
On Thu, Jan 9, 2020, at 16:39, Jose Garcia Sancio wrote: > Thanks Colin, > > LGTM in general. The Linux documentation ( > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/proc.txt?id=HEAD#n1644) > defines these metrics as > > read_bytes > > ---

Re: [DISCUSS] KIP-551: Expose disk read and write metrics

2020-01-09 Thread Colin McCabe
On Thu, Jan 9, 2020, at 16:34, Lucas Bradstreet wrote: > Hi Colin, > > This is a great idea, as it is very useful to have these metrics in > addition to the usual Kafka metrics given the impact of hitting disk > outside of page cache. Describing it as a gauge did initially strike me as > odd, but

Re: [DISCUSS] KIP-551: Expose disk read and write metrics

2020-01-09 Thread Jose Garcia Sancio
Thanks Colin, LGTM in general. The Linux documentation ( https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/filesystems/proc.txt?id=HEAD#n1644) defines these metrics as read_bytes > -- > > I/O counter: bytes read > Attempt to count the number of bytes wh

Re: [DISCUSS] KIP-551: Expose disk read and write metrics

2020-01-09 Thread Lucas Bradstreet
Hi Colin, This is a great idea, as it is very useful to have these metrics in addition to the usual Kafka metrics given the impact of hitting disk outside of page cache. Describing it as a gauge did initially strike me as odd, but given the way this works it makes sense to me. /proc/[pid]/io

Re: [DISCUSS] KIP-551: Expose disk read and write metrics

2020-01-06 Thread Colin McCabe
On Tue, Dec 10, 2019, at 11:10, Magnus Edenhill wrote: > Hi Colin, > Hi Magnus, Thanks for taking a look. > aren't those counters (ever increasing), rather than gauges (fluctuating)? Since this is in the Kafka broker, we're using Yammer. This might be confusing, but Yammer's concept of a "co

Re: [DISCUSS] KIP-551: Expose disk read and write metrics

2019-12-10 Thread Magnus Edenhill
Hi Colin, aren't those counters (ever increasing), rather than gauges (fluctuating)? You also mention CPU usage as a side note; you could use getrusage(2)'s ru_utime (user) and ru_stime (sys) to allow the broker to monitor its own CPU usage. /Magnus On Tue, 10 Dec 2019 at 19:33, Colin McC