Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-06-07 Thread Mickael Maison
Hi Cong, Some of the use cases I have in mind are more around validating that an operation was successful. - Let's say you trigger a reassignment to even out disk usage on some brokers. Your tool will submit the reassignment and wait for completion using the AlterPartitionReassignments and ListPartiti

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-06-03 Thread Cong Ding
Thank you, Mickael. One more question: are you imagining this tooling/automation calling this API at a very low frequency? I ask because high-frequency calls to this API are prohibitively expensive. Can you give some examples of low-frequency call use cases? I can think of some high-frequency call use cases

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-06-03 Thread Mickael Maison
Hi Cong, Maybe some people can do without this KIP. But in many cases, especially around tooling and automation, it's useful to be able to retrieve disk utilization values via the Kafka API rather than interfacing with a metrics system. Does that clarify the motivation? Thanks, Mickael On Wed,

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-06-01 Thread Cong Ding
Thanks for the explanation. I think the question is that if we have disk utilization in our environment, what is the use case for KIP-827? The disk utilization in our environment can already do the job. Is there anything I missed? Thanks, Cong On Tue, May 31, 2022 at 2:57 AM Mickael Maison wrote

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-31 Thread Jun Rao
Hi, Mickael, Thanks for the explanation. The KIP looks good to me now. Jun On Tue, May 31, 2022 at 6:44 AM Mickael Maison wrote: > Hi Jun, > > Igor answered your question. > Users should rely on their host metrics to monitor disk usage. But > with tooling and automation it's sometimes not ideal

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-31 Thread Mickael Maison
Hi Jun, Igor answered your question. Users should rely on their host metrics to monitor disk usage. But with tooling and automation it's sometimes not ideal to retrieve values from metrics. So exposing disk usage via the Kafka API will simplify coordinating disk operations. I've updated the mo

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-31 Thread Mickael Maison
Hi Raman, Unfortunately the replica size only includes the log files; it does not include indexes or other metadata files. Obviously, any extra non-Kafka files are not included either. For these reasons, I decided to have a separate field with the actual usable space reported by the volume
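
To illustrate the gap described above, here is a minimal sketch in Java. The class name, partition names, and byte counts are hypothetical and not taken from the KIP. It only shows that summing per-partition log sizes gives a lower bound on used space, whereas total minus usable reflects everything stored on the volume.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class UsedSpaceEstimate {
    // Sum of the per-partition sizes (log segments only, as noted in the thread).
    static long sumPartitionSizes(Map<String, Long> partitionSizes) {
        long sum = 0L;
        for (long size : partitionSizes.values()) {
            sum += size;
        }
        return sum;
    }

    // Space consumed on the volume, derived from the two volume-level values.
    static long usedOnVolume(long totalBytes, long usableBytes) {
        return totalBytes - usableBytes;
    }

    public static void main(String[] args) {
        // Hypothetical numbers, for illustration only.
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("topic-a-0", 4_000L);
        sizes.put("topic-a-1", 6_000L);

        long fromPartitions = sumPartitionSizes(sizes);
        long fromVolume = usedOnVolume(100_000L, 85_000L);

        // The difference covers indexes, metadata files and any non-Kafka files.
        System.out.println("partitions=" + fromPartitions
                + " volume=" + fromVolume
                + " unaccounted=" + (fromVolume - fromPartitions));
    }
}
```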

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-31 Thread Mickael Maison
Hi Cong, Kafka does not expose disk utilization metrics. This is something you need to provide in your environment. You definitely should have a mechanism for exposing metrics from your Kafka broker hosts and you should absolutely monitor disk usage and have appropriate alerts. Thanks, Mickael

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-26 Thread Jun Rao
Hi, Igor, Thanks for the reply. I agree that this KIP could be useful for improving the tool for moving data across disks. It would be useful to clarify the main motivation of the KIP. Also, DescribeLogDirsResponse already includes the size of each partition on a disk. So, it seems that Usable

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-26 Thread Igor Soarez
Hi, This can also be quite useful for making better use of existing functionality in the Kafka API: moving replicas between log directories via ALTER_REPLICA_LOG_DIRS. If usable space information is also available, the caller can make better decisions using the same API. It means a more consistent
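
As a rough sketch of the kind of decision mentioned above, the Java snippet below picks a destination log directory from usable-space values. The class, method, and directory names are hypothetical, and the selection rule (most usable bytes that still fits the replica) is an assumption made for illustration, not something the thread or the KIP prescribes. It does not call any Kafka API; it only operates on values a caller would already have fetched.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

class TargetLogDirChooser {
    // Returns the directory with the most usable bytes that can hold the replica.
    // Negative values (unknown space) never qualify because replicaSizeBytes >= 0.
    static Optional<String> choose(Map<String, Long> usableBytesByDir, long replicaSizeBytes) {
        String best = null;
        long bestUsable = -1L;
        for (Map.Entry<String, Long> entry : usableBytesByDir.entrySet()) {
            long usable = entry.getValue();
            if (usable >= replicaSizeBytes && usable > bestUsable) {
                best = entry.getKey();
                bestUsable = usable;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        // Hypothetical directories and sizes.
        Map<String, Long> usable = new LinkedHashMap<>();
        usable.put("/data/d1", 500L);
        usable.put("/data/d2", 2_000L);
        usable.put("/data/d3", -1L); // space unknown

        System.out.println(choose(usable, 1_000L).orElse("no suitable directory"));
    }
}
```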

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-25 Thread Jun Rao
Hi, Mickael, Thanks for the KIP. Since this is mostly for monitoring and alerting, could we expose them as metrics instead of as part of the API? We already have a size metric per log. Perhaps we could extend that to add used/total metrics per disk? Thanks, Jun On Thu, May 19, 2022 at 10:21 PM

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-19 Thread Raman Verma
Hello Mickael, Thanks for the KIP. I see that the API response contains some information about each partition.
```
{ "name": "PartitionSize", "type": "int64", "versions": "0+", "about": "The size of the log segments in this partition in bytes." }
```
Can this be summed up to provide a used space

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-19 Thread Cong Ding
Hey Mickael, Great KIP! I have one question: You mentioned "DescribeLogDirs is usually a low volume API. This change should not significantly affect the latency of this API." and "That would allow to easily validate whether disk operations (like a resize), or topic deletion (log deletion only ha

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-19 Thread Mickael Maison
Hi Ismael, 1. I'm fine dropping "Space" from the field names; I think the names are clear enough. I've updated the KIP. 2. These values are properties of the volume each log directory is in. If you have multiple log directories in the same volume, they will both return the usable and total size o

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-17 Thread Ismael Juma
Hi Mickael, Thanks for the KIP. Two questions: 1. Is `space` redundant? Are `totalBytes` and `usableBytes` a more concise description of the same thing? 2. Is usable space a property of the log directory? What if you have multiple log directories in the same underlying OS partition? Ismael On T

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-16 Thread Divij Vaidya
Thanks for addressing my comments Mickael. No more comments/suggestions from my side. LGTM. Divij Vaidya On Tue, May 10, 2022 at 6:10 PM Mickael Maison wrote: > Hi Colin, > > Thanks for the suggestion. > > I guess there are pros and cons with both methods. In my mind I'm > expecting these val

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-10 Thread Mickael Maison
Hi Colin, Thanks for the suggestion. I guess there are pros and cons with both methods. In my mind I'm expecting these values to always be there in the long run (once people have upgraded to brokers that support this feature). So I thought having a primitive directly may be nicer to use in the fu

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-04 Thread Igor Soarez
Hi Mickael, Thanks for writing this KIP. This would be a very useful improvement! -- Igor On Thu, Apr 7, 2022, at 10:16 AM, Mickael Maison wrote: > Hi, > > I wrote a small KIP to expose the total and usable space of logdirs > via the DescribeLogDirs API: > https://cwiki.apache.org/confluence/dis

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-03 Thread Colin McCabe
Hi Mickael, Thanks for the KIP. In the API, I would suggest using an OptionalLong rather than a "magic value" of -1. best, Colin On Thu, Apr 7, 2022, at 02:16, Mickael Maison wrote: > Hi, > > I wrote a small KIP to expose the total and usable space of logdirs > via the DescribeLogDirs API: >

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-03 Thread Divij Vaidya
I understand your point but I think we could potentially improve the user experience here by sending different error codes for the following two different situations. Situation 1: "when broker hits an Exception accessing a logdir" -> UNKNOWN_SPACE Situation 2: "when feature is not supported" -> UNS

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-05-02 Thread Mickael Maison
Hi Divij, The new fields default to -1 in the protocol too. So in case a broker hits an Exception accessing a logdir and sends an error response back, the client will get -1. For this reason I think UNKNOWN_SPACE is slightly better than UNSUPPORTED. Thanks, Mickael On Fri, Apr 22, 2022 at 9:35 A
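
The trade-off between a -1 sentinel and an optional value, raised in this thread, can be sketched in Java as follows. The class and method names are hypothetical. The constant reuses the name UNKNOWN_SPACE from the discussion, and giving it the value -1 is an assumption based on the default described above, not a quote from the KIP.

```java
import java.util.OptionalLong;

class SpaceValue {
    // Assumed sentinel: the thread says the new fields default to -1.
    static final long UNKNOWN_SPACE = -1L;

    // Converts a raw wire value into an optional, hiding the sentinel from callers.
    static OptionalLong toOptional(long raw) {
        return raw < 0 ? OptionalLong.empty() : OptionalLong.of(raw);
    }

    public static void main(String[] args) {
        OptionalLong known = toOptional(1_024L);
        OptionalLong unknown = toOptional(UNKNOWN_SPACE);

        System.out.println("known present: " + known.isPresent());
        System.out.println("unknown present: " + unknown.isPresent());
    }
}
```

With the primitive, callers must remember to compare against the sentinel; with the optional, a missing value cannot be mistaken for a size.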

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-04-22 Thread Tom Bentley
Hi Mickael, Thanks for the KIP, I can see this would be useful. I guess you could have used optional tagged fields, rather than bumping the version, but then again I don't see it being particularly advantageous in this case either. Kind regards, Tom On Tue, 19 Apr 2022 at 10:23, Divij Vaidya

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-04-19 Thread Divij Vaidya
I have a minor suggestion below, but overall the KIP looks good to me to start a vote. *Reg#6* Would you consider replacing UNKNOWN_SPACE with UNSUPPORTED? UNSUPPORTED tells the user explicitly that the value is missing due to a client/server version mismatch whereas with UNKNOWN_SPACE, the user is left

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-04-15 Thread Mickael Maison
Hi Luke, 7. I've updated the KIP to clarify these sizes are in bytes. Thanks, Mickael On Fri, Apr 15, 2022 at 12:16 PM Luke Chen wrote: > > Hi Mickael, > > Thanks for the KIP! > This is a good improvement. > > (3) +1 for not adding the number of files in the directory. Counting the > file numbe

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-04-15 Thread Luke Chen
Hi Mickael, Thanks for the KIP! This is a good improvement. (3) +1 for not adding the number of files in the directory. Counting the files would be slow. (7) Could you clarify the fields in `DescribeLogDirsResponse` to state whether the returned number is a size in bytes? Thank you.

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-04-15 Thread Mickael Maison
Hi, Thanks for the feedback. 3. Yes that's right. Also the number of file descriptors is really not a property of log directories. Administrators typically track that count per process and for the whole operating system. 5. That's a good point, I've updated the KIP to mention sizes will be cap

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-04-08 Thread Divij Vaidya
Thanks for replying. I still have a few lingering questions/comments. *Reg#1* Understood. I checked and the underlying system call is statvfs on Unix systems, which should be ok to call here. *Reg#2* Fair point. I checked again and yes, log.dir always means local storage even when tiered storage i

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-04-08 Thread Mickael Maison
Hi Divij, Thanks for taking a look! 1. In order to retrieve the sizes, the plan is to use getTotalSpace() and getUsableSpace() from java.nio.file.FileStore. The implementations may vary depending on the filesystem but these calls typically don't depend on the size of storage but instead just retu
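
A minimal sketch of the two `java.nio.file.FileStore` calls named above, assuming the directory is resolved to its file store with `Files.getFileStore`. The class and method names are hypothetical and this is not the broker's actual code; it only shows what the calls return for a given directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class LogDirSpace {
    // Returns {totalBytes, usableBytes} for the volume holding the given directory.
    static long[] volumeSpace(Path dir) {
        try {
            FileStore store = Files.getFileStore(dir);
            return new long[] { store.getTotalSpace(), store.getUsableSpace() };
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Defaults to the current directory when no path is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        long[] space = volumeSpace(dir);
        System.out.println("total=" + space[0] + " usable=" + space[1]);
    }
}
```

Both values describe the volume, not the directory, so two directories on the same volume report the same numbers.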

Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-04-07 Thread Divij Vaidya
Hi Mickael, Thanks for starting this. It is a very useful feature. Some initial thoughts (I am new to Kafka so please excuse me if these are naive suggestions): 1. What is the impact on latency of the DescribeLogDirs API due to this change? Would calculating the totalSpace from each logdir be a bottl

[DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-04-07 Thread Mickael Maison
Hi, I wrote a small KIP to expose the total and usable space of logdirs via the DescribeLogDirs API: https://cwiki.apache.org/confluence/display/KAFKA/KIP-827%3A+Expose+logdirs+total+and+usable+space+via+Kafka+API Please take a look and let me know if you have any feedback. Thanks, Mickael