Hi Igor, > IS1: Without an indication of how many pages or items are left, > it is impossible for the client to calculate progress. > Having the cursor in the response just indicates whether there > are more pages or not, it does not distinguish whether there is > just one left, or a thousand more to go. > Is it possible to include the indication, taking authorization > into account? I originally added this field after our offline discussion however after rethinking about it. I think there are a couple of misconceptions here (even with authorisation taken in account), 1. There is no consistency guarantee between requests (similar to KIP-966), so this number not necessarily accurate between requests and it just will add to the inconsistency. 2. The operator would not manually provide the next cursor or fire the pagination between requests manually instead AdminClient will handle this similar to how we handled it in KIP-966 check `KafkaAdminClient::describeTopics`. So technically the operator will never see this value or adjust the ResponsePagingationLimit in between requests.
Based on this I don’t see this adding much value especially that AdminClient is the one handling the iterating through the pages. > IS2: It is a shame that the pagination fields are different for > each RPC. This allows for divergences or mistakes as e.g. the KIP > currently includes GroupId under the Cursor for MetadataRequest. > I was wondering if instead of identifying the next object in > the cursor, we could instead indicate its index, always using > an integer value. Or perhaps the page number, considering the > value of ResponsePagingationLimit. Is is possible to have a standard > way to deal with pagination, and use consistently for all RPCs? Due to the inconsistency between requests I would say having the cursor as “item” instead of index would be better as adding resources between pagination requests that change the index might lead some odd cases as the request aren’t transparent regarding what resources we actually waiting for. For example, Next cursor : TopicPartition XX-0 has index 100 But by the time the next pagination request get fired someone else either a- created other topic partitions that pushed XX-0 down and take over index 100 instead which mean this pagination request will be redundant as it might return same resources as the previous page (depend on how many topic partitions were added). b- or someone deleted topic partition before index 100 making XX-0 and everything after index 100 jump to different position now fetching the next page with cursor 100 will lead to data loss in metadata and this case is much worse. We arguable can hit the similar issues with the proposed cursor but - the transparency about what you resources the cursor is targeting make the requests more clear. - I also don’t believe it lead to loss of metadata of old existing resources (as point b) and we only might miss new created resources if it happened to be before XX-0 if we sorted the resources or might return metadata for some recently deleted resources (and both cases existed without pagination as well). > IS3: I don't quite understand the following rejected alternative: > >> Make the pagination more generic by extending FieldSpec to include >> pagination flag that indicate if a field will be used for pagination >> or not. This would simplify the code however, it will be tricky to >> use on fields like TopicPartition which usually is defined using >> two fields topic and partition index which usually represented as >> nested field. We can consider this only for Response to add an >> extra flag to FieldSpec indicating that a field will have NextCursor >> field but it is not clear what will be the usage of this except >> making getting the next cursor maybe more generic to build but >> arguable we can do this without touching FieldSpec. > > The message specification format is only used in this project, > so we can extend it or redefine it freely. If pagination is implemented > in a consistent manner across RPCs, it would be valuable to have some > high-level way to enable it in the RPC definition. > Could you add a bit more detail about why that's not possible? The reason why I think it wouldn’t work is the nested fields like what we have in `OffsetFetchResponse` or `DescribeLogDirsResponse` when we have nested levels like topic partitions within consumer group or topic partitions within logDir. Which what this section is mentioning. I’ll update it to make it a bit clearer. >> it will be tricky to >> use on fields like TopicPartition which usually is defined using >> two fields topic and partition index which usually represented as >> Nested field Thanks Omnia > On 4 Aug 2024, at 11:29, Igor Soarez <soa...@apache.org> wrote: > > Hi Omnia, > > Thanks for this KIP, this is a valuable improvement, and > apologies for my late feedback on this. > > IS1: Without an indication of how many pages or items are left, > it is impossible for the client to calculate progress. > Having the cursor in the response just indicates whether there > are more pages or not, it does not distinguish whether there is > just one left, or a thousand more to go. > Is it possible to include the indication, taking authorization > into account? > > IS2: It is a shame that the pagination fields are different for > each RPC. This allows for divergences or mistakes as e.g. the KIP > currently includes GroupId under the Cursor for MetadataRequest. > I was wondering if instead of identifying the next object in > the cursor, we could instead indicate its index, always using > an integer value. Or perhaps the page number, considering the > value of ResponsePagingationLimit. Is is possible to have a standard > way to deal with pagination, and use consistently for all RPCs? > > IS3: I don't quite understand the following rejected alternative: > >> Make the pagination more generic by extending FieldSpec to include >> pagination flag that indicate if a field will be used for pagination >> or not. This would simplify the code however, it will be tricky to >> use on fields like TopicPartition which usually is defined using >> two fields topic and partition index which usually represented as >> nested field. We can consider this only for Response to add an >> extra flag to FieldSpec indicating that a field will have NextCursor >> field but it is not clear what will be the usage of this except >> making getting the next cursor maybe more generic to build but >> arguable we can do this without touching FieldSpec. > > The message specification format is only used in this project, > so we can extend it or redefine it freely. If pagination is implemented > in a consistent manner across RPCs, it would be valuable to have some > high-level way to enable it in the RPC definition. > Could you add a bit more detail about why that's not possible? > > Thanks, > > -- > Igor >