Hi Colin,

This is a great idea, as it is very useful to have these metrics in addition to the usual Kafka metrics, given the impact of hitting disk outside of page cache. Describing it as a gauge did initially strike me as odd, but given the way this works it makes sense to me.
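To check my understanding of why a gauge fits, here is a rough Python sketch of the idea (not the KIP's code, which would live in the broker and use Yammer). The names `parse_proc_io`, `ProcIoGauge`, and `SAMPLE` are made up for illustration, and the sample numbers are invented. The only things taken from the kernel are the `key: value` layout of /proc/[pid]/io and the `read_bytes` / `write_bytes` field names.

```python
from pathlib import Path

# Invented sample in the layout of /proc/[pid]/io (values are made up).
SAMPLE = """rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 4096
write_bytes: 323932160
cancelled_write_bytes: 0
"""


def parse_proc_io(text):
    """Parse 'key: value' lines from /proc/[pid]/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters


def read_self_io():
    """Read this process's I/O accounting file (Linux only)."""
    return Path("/proc/self/io").read_text()


class ProcIoGauge:
    """Hypothetical gauge: nothing calls inc()/dec(); the value is
    re-read from the kernel every time the metric is sampled."""

    def __init__(self, field, reader=read_self_io):
        self.field = field
        self.reader = reader

    def value(self):
        return parse_proc_io(self.reader())[self.field]


# Usage on a Linux host (hypothetical):
#   ProcIoGauge("read_bytes").value()
#   ProcIoGauge("write_bytes").value()
```

The underlying kernel numbers only ever go up, but since the broker samples them on demand rather than incrementing anything itself, the gauge shape seems like the natural match.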
/proc/[pid]/io appears to only be supported as of kernel 2.6.20. Given that was released back in 2007, maybe it's safe enough to assume it exists, but I thought I would mention that anyway.

Without bikeshedding the metric names, would including a "Total" in the name be better, e.g. kafka.server:type=KafkaServer,name=DiskReadBytesTotal?

Cheers,
Lucas

On Mon, Jan 6, 2020 at 5:28 PM Colin McCabe <cmcc...@apache.org> wrote:

> On Tue, Dec 10, 2019, at 11:10, Magnus Edenhill wrote:
> > Hi Colin,
> >
>
> Hi Magnus,
>
> Thanks for taking a look.
>
> > aren't those counters (ever increasing), rather than gauges (fluctuating)?
>
> Since this is in the Kafka broker, we're using Yammer. This might be
> confusing, but Yammer's concept of a "counter" is not actually monotonic.
> It can decrease as well as increase.
>
> In general Yammer counters require you to call inc(amount) or dec(amount)
> on them. This doesn't match up with what we need to do here, which is to
> (essentially) make a callback into the kernel by reading from /proc.
>
> The counter/gauge dichotomy doesn't affect the JMX, (I think?), so it's
> really kind of an implementation detail.
>
> >
> > You also mention CPU usage as a side note, you could use getrusage(2)'s
> > ru_utime (user) and ru_stime (sys)
> > to allow the broker to monitor its own CPU usage.
> >
>
> Interesting idea. It might be better to save that for a future KIP,
> though, to avoid scope creep.
>
> best,
> Colin
>
> > /Magnus
> >
> > On Tue, Dec 10, 2019 at 19:33, Colin McCabe <cmcc...@apache.org> wrote:
> >
> > > Hi all,
> > >
> > > I wrote a KIP about adding support for exposing disk read and write
> > > metrics. Check it out here:
> > >
> > > https://cwiki.apache.org/confluence/x/sotSC
> > >
> > > best,
> > > Colin