Hi Chia-Ping,

Thanks for the review. It looks like there is another thread
that doesn’t match the discussion thread on the KIP, so
I am responding here instead.

chia_00: Renamed `setIncludeRemoteInfo(boolean)` to 
`includeRemoteInfo(boolean)`.

chia_01: Yes, as Kamal mentioned, `PartitionSize` is the
total size of the local segments. `RemoteLogSize` is the total
size of the remote segments. `OnlyLocalLogSize` is the size
of the local segments that are not also in remote storage.
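
To make the relationship concrete, here is a rough sketch using
Kamal's example (48 hrs retention, 12 hrs local retention). The
class and method names are hypothetical and not part of the KIP or
the Admin API. Sizes are expressed in hours of data for readability,
and the 1 hr active segment is an assumption for illustration.

```java
// Illustrative only: hypothetical helper, not part of the Kafka Admin API.
public final class LogSizeSketch {

    // Naive sum: double-counts segments that exist both locally and remotely.
    public static long naiveTotal(long partitionSize, long remoteLogSize) {
        return partitionSize + remoteLogSize;
    }

    // Sum without overlap: remote segments plus segments that exist only locally.
    public static long uniqueTotal(long remoteLogSize, long onlyLocalLogSize) {
        return remoteLogSize + onlyLocalLogSize;
    }

    public static void main(String[] args) {
        long partitionSize = 12;    // local log, including segments already uploaded
        long remoteLogSize = 47;    // everything except the active segment
        long onlyLocalLogSize = 1;  // assumed: only the active segment is not uploaded

        System.out.println(naiveTotal(partitionSize, remoteLogSize));     // prints 59
        System.out.println(uniqueTotal(remoteLogSize, onlyLocalLogSize)); // prints 48
    }
}
```

With these numbers, `PartitionSize + RemoteLogSize` overstates the
topic by the overlapping part, while `RemoteLogSize + OnlyLocalLogSize`
matches the configured retention.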

chia_02: I agree with Kamal. Not all remote storage
implementations can provide information like `TotalBytes` and
`UsableBytes`. How about we add it when we need it?

---

Hi Kamal,

I added `OnlyLocalLogSize` to `DescribeLogDirsResponse`
and `onlyLocalSize` to `ReplicaInfo`. Thanks for the suggestion.

If there is no further discussion, I will start a vote thread tomorrow.

Thanks,
PoAn

> On Jul 21, 2025, at 12:26 PM, Kamal Chandraprakash 
> <kamal.chandraprak...@gmail.com> wrote:
> 
> Hi PoAn,
> 
>> Regarding `onlyLocalLogSize`, does it refer to local log size minus
> overlapping retention part?
> 
> Yes, correct.
> https://sourcegraph.com/github.com/apache/kafka/-/blob/storage/src/main/java/org/apache/kafka/storage/internals/log/UnifiedLog.java?L2005
> 
> Thanks,
> Kamal
> 
> On Mon, Jul 21, 2025 at 7:37 AM PoAn Yang <yangp...@gmail.com> wrote:
> 
>> Hi Kamal,
>> 
>> Thanks for the feedback. It’s helpful to understand the overlapping
>> parts between local and remote logs.
>> 
>> Regarding `onlyLocalLogSize`, does it refer to local log size minus
>> overlapping retention part? If so, I prefer to set the field when
>> IncludeRemoteInfo is true.
>> 
>> Renamed the `RemotePartitionSize` to `RemoteLogSize`.
>> 
>> Updated `PartitionSize` description.
>> 
>> ---
>> 
>> Hi Ally,
>> 
>> The log dirs can also include directories in remote storage. If we
>> introduce `describeRemoteLog`, it will be a new RPC. However,
>> for retrieving the remote log size, we don’t need a new RPC. We
>> can include this information in the existing `describeLogDirs` RPC.
>> If we want to provide more details about remote logs in the future,
>> we can consider introducing a new RPC at that time.
>> 
>> Thank you,
>> PoAn
>> 
>>> On Jun 27, 2025, at 3:46 PM, Kamal Chandraprakash <
>> kamal.chandraprak...@gmail.com> wrote:
>>> 
>>> Hi PoAn,
>>> 
>>> Thanks for the KIP! Having the remoteLogSize as part of the
>> DescribeLogDirs
>>> response will be useful to determine
>>> the exact cost of the topic.
>>> 
>>> Kafka uploads the segments eagerly. Assume a topic is configured with 48
>>> hrs of retention time and 12 hrs of
>>> local-retention time. Then, the remote storage might contain ~47 hrs
>>> (excluding the active segment) of data and
>>> local storage might contain 12 hrs of data. When you combine the
>>> (PartitionSize + RemotePartitionSize) sizes,
>>> then it might be ~60 hrs.
>>> 
>>> The current attributes in DescribeLogDirsResult:
>>> 
>>> 1. `PartitionSize` - provides the size of the local-log and
>>> 2. `RemotePartitionSize` - provides the size of the remote-log.
>>> 
>>> We may need another attribute to provide the size of the segments that
>>> exist only in the local-log: `onlyLocalLogSize`.
>>> The user can determine the cost of the topic as per their requirements.
>>> 
>>> nit:
>>> 1. Shall we rename the `RemotePartitionSize` to `RemoteLogSize` since all
>>> the replicas send the values in the response?
>>> 2. Shall we update the about/description of the `PartitionSize` attribute
>>> to mention that the size can be of local-log when remote storage is
>> enabled
>>> on the topic?
>>> 
>>> Thanks,
>>> Kamal
>>> 
>>> On Mon, Jun 16, 2025 at 8:03 PM PoAn Yang <yangp...@gmail.com> wrote:
>>> 
>>>> Hi all,
>>>> 
>>>> I would like to start a discussion thread about KIP-1187.
>>>> 
>>>> Please take a look and feel free to share any thought.
>>>> 
>>>> https://cwiki.apache.org/confluence/x/sYkhFg
>>>> 
>>>> Thanks,
>>>> PoAn
>> 
>> 
