[ https://issues.apache.org/jira/browse/KAFKA-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16094629#comment-16094629 ]
Joe Wood commented on KAFKA-5243:
---------------------------------

I'll raise a KIP for this, as it's become essential for our distributed queries to scale. Guessing at the key range can end up returning millions of values, which causes unpredictable behavior. I agree that there needs to be an explicit contract for ordered keys, although I think the built-in stores already comply with this.

> Request to add row limit in ReadOnlyKeyValueStore range function
> ----------------------------------------------------------------
>
>                 Key: KAFKA-5243
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5243
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 0.10.1.1
>            Reporter: Joe Wood
>
> When using distributed queries across a cluster of stream stores, it is quite
> common to use query pagination to limit the number of rows returned. The
> {{range}} function on {{ReadOnlyKeyValueStore}} only accepts the {{from}} and
> {{to}} keys. This means that a query either unnecessarily retrieves the
> entire range and limits the rows manually, or estimates the range based on
> the key values. Neither option is ideal for processing distributed queries.
> This suggestion is to add an overload of the {{range}} function with a
> third (or replacement second) argument giving a suggested row limit. The
> range of keys returned would then not exceed the supplied count.
> {code:java}
> // Get an iterator over a given range of keys, returning at most limit elements.
> KeyValueIterator<K, V> range(K from, K to, int limit)
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
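For illustration, the manual workaround the issue describes (retrieving the full range and truncating client-side) can be sketched as a small generic iterator helper in plain Java. This is a sketch only: the class and method names here are illustrative and not part of the Kafka Streams API, and no Kafka dependency is assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Illustrative helper: drain at most `limit` elements from an iterator.
// This mimics what callers must do today against range(from, to) --
// the store still materializes/iterates the full range under the hood,
// which is the inefficiency the proposed overload would avoid.
public final class IteratorLimit {

    public static <T> List<T> take(Iterator<T> it, int limit) {
        List<T> out = new ArrayList<>();
        while (it.hasNext() && out.size() < limit) {
            out.add(it.next());
        }
        return out;
    }

    public static void main(String[] args) {
        // Stand-in for a KeyValueIterator returned by range(from, to).
        Iterator<Integer> fullRange = Arrays.asList(1, 2, 3, 4, 5).iterator();
        System.out.println(take(fullRange, 3)); // prints [1, 2, 3]
    }
}
```

With the proposed `range(K from, K to, int limit)` overload, the truncation would instead happen inside the store, so a paginated query never pulls the whole range across the wire.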