Regarding the null bug: I had time to open a JIRA today. Looks like an
issue already exists: https://issues.apache.org/jira/browse/KAFKA-4750

Regarding scan order: I would gladly produce a sample that reproduces this
behavior if you can confirm that you would consider it a defect. I
would really love to be able to do ordered prefixed range scans with
interactive queries. But if you don't think the lack of this facility is a
defect, then I can't spend more time on this.
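
For context on the serialization point raised earlier in the thread, here is
a minimal sketch of how key encoding can affect scan order in a store that
sorts keys by their serialized bytes. It is not taken from my application or
from Kafka's code: the class and method names are hypothetical, and the
comparison is a plain unsigned lexicographic one standing in for the store's
comparator, which is an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names come from Kafka Streams.
class KeyOrderSketch {

    // Unsigned lexicographic comparison of two byte arrays. This is assumed
    // to approximate how a byte-ordered store sorts serialized keys.
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return a.length - b.length;
    }

    // Composite key as two fixed-width big-endian longs (ByteBuffer's default
    // byte order). For non-negative values, byte order matches numeric order.
    static byte[] fixedWidthKey(long userId, long timestamp) {
        return ByteBuffer.allocate(16).putLong(userId).putLong(timestamp).array();
    }

    // Composite key as a decimal string such as "7:10". Byte order does not
    // match numeric order here, because "9" sorts after "10".
    static byte[] decimalStringKey(long userId, long timestamp) {
        return (userId + ":" + timestamp).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        boolean fixedOk =
            compareBytes(fixedWidthKey(7L, 9L), fixedWidthKey(7L, 10L)) < 0;
        boolean decimalOk =
            compareBytes(decimalStringKey(7L, 9L), decimalStringKey(7L, 10L)) < 0;
        System.out.println("fixed-width keeps numeric order: " + fixedOk);      // true
        System.out.println("decimal string keeps numeric order: " + decimalOk); // false
    }
}
```

If the key serializer behaves like the second encoding, a range scan could
look unordered even when the store itself iterates in sorted byte order.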

Thank you!

On Fri, Mar 17, 2017 at 1:18 PM, Dmitry Minkovsky <dminkov...@gmail.com>
wrote:

> Ah! Yes. Thank you! That makes sense.
>
> Anyway, I _think_ that's not what I was doing given that all items were
> being routed to and then read from a partition identified by one key.
>
> On Fri, Mar 17, 2017 at 12:50 PM, Damian Guy <damian....@gmail.com> wrote:
>
>> > When you use Queryable State you are actually querying multiple
>> > underlying stores, i.e., one per partition.
>> >
>> > Huh? I was only querying one partition. In my example, I have a user's
>> > posts. Upon creation, they are routed to a particular partition using a
>> > partitioner that hashes the post's user ID. The posts are then indexed on
>> > that partition by prefixed keys using the method described above. When
>> > querying, I am only querying the one partition that has all of the user's
>> > posts. As far as I know, I am not querying across multiple partitions.
>> > Furthermore, I did not even think this was possible, given the fact that
>> > Interactive Queries require you to manually forward requests that should
>> > go to other partitions.
>> >
>> >
>> Each KafkaStreams instance is potentially responsible for multiple
>> partitions, so when you use Queryable State on a particular instance you
>> are querying all partitions for that store on the given instance.
>>
>>
>>
>> >
>> >
>> >
>> >
>> >
>> > On Thu, Mar 16, 2017 at 2:11 PM, Damian Guy <damian....@gmail.com>
>> wrote:
>> >
>> > > I think what you are seeing is that the order is not guaranteed across
>> > > partitions. When you use Queryable State you are actually querying
>> > > multiple underlying stores, i.e., one per partition. The implementation
>> > > iterates over one store/partition at a time, so the ordering will appear
>> > > random. This could be improved.
>> > >
>> > > The tombstone records appearing in the results seems like a bug.
>> > >
>> > > Thanks,
>> > > Damian
>> > >
>> > > On Thu, 16 Mar 2017 at 17:37 Matthias J. Sax <matth...@confluent.io>
>> > > wrote:
>> > >
>> > > > Can you check if the problem exists for 0.10.2, too? (0.10.2 is
>> > > > compatible with 0.10.1 brokers -- so you can upgrade your Streams code
>> > > > independently from the brokers).
>> > > >
>> > > > About the range: I double-checked this, and I guess my last answer was
>> > > > not correct, and range() should return ordered data. But I have a
>> > > > follow-up question: what key type and serializer do you use? Internally,
>> > > > data is stored in serialized form and ordered according to
>> > > > `LexicographicByteArrayComparator` -- thus, if the serialized bytes
>> > > > don't reflect the order of the deserialized data, the returned range
>> > > > shows up unordered to you.
>> > > >
>> > > >
>> > > > -Matthias
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On 3/16/17 10:14 AM, Dmitry Minkovsky wrote:
>> > > > > Hi Matthias. Thank you for your response.
>> > > > >
>> > > > > Yes, I was able to reproduce the null issue reliably. I can't open a
>> > > > > JIRA at this time, but I can say I was using 0.10.1.0 and it was
>> > > > > trivial to reproduce. Just send records and the tombstones to a table
>> > > > > topic. Then scan the range. You'll see the tombstones.
>> > > > >
>> > > > > Indeed, ranges are returned with no specific order. I'm not sure what
>> > > > > you mean that default stores are hash-based, but this ordering thing is
>> > > > > a shame because it kind of kills the ability to use KS as a
>> > > > > full-fledged DB that lets you index things like HBase (composite keys
>> > > > > for lists of items). Is that how RocksDB works? Just returns range
>> > > > > scans in random order? I don't know C++ so the documentation is a bit
>> > > > > opaque to me. But what's the point of scanning a range if the data
>> > > > > comes in some random order? That being the case, the number of possible
>> > > > > use-case scenarios seems to become significantly limited.
>> > > > >
>> > > > >
>> > > > > Thank you!
>> > > > > Dmitry
>> > > > >
>> > > > > On Tue, Mar 14, 2017 at 1:12 PM, Matthias J. Sax <matth...@confluent.io>
>> > > > > wrote:
>> > > > >
>> > > > >>> However,
>> > > > >>> for keys that have been tombstoned, it does return null for me.
>> > > > >>
>> > > > >> Sounds like a bug. Can you reliably reproduce this? Would you mind
>> > > > >> opening a JIRA?
>> > > > >>
>> > > > >> Can you check if this happens for both cases: caching enabled and
>> > > > >> disabled? Or only for one case?
>> > > > >>
>> > > > >>
>> > > > >>> "No ordering guarantees are provided."
>> > > > >>
>> > > > >> That is correct. Internally, default stores are hash-based -- thus, we
>> > > > >> don't give a sorted list/iterator back. You could replace RocksDB with
>> > > > >> a custom store though.
>> > > > >>
>> > > > >>
>> > > > >> -Matthias
>> > > > >>
>> > > > >>
>> > > > >> On 3/13/17 3:56 PM, Dmitry Minkovsky wrote:
>> > > > >>> I am using interactive streams to query tables:
>> > > > >>>
>> > > > >>>     ReadOnlyKeyValueStore<Messages.ByUserAndDate, Messages.UserLetter> store =
>> > > > >>>         streams.store("view-user-drafts", QueryableStoreTypes.keyValueStore());
>> > > > >>>
>> > > > >>> Documentation says that #range() should not return null values.
>> > > > >>> However, for keys that have been tombstoned, it does return null for
>> > > > >>> me.
>> > > > >>>
>> > > > >>> Also, I noticed only just now that "No ordering guarantees are
>> > > > >>> provided." I haven't done enough testing or looked at the code
>> > > > >>> carefully enough yet and wonder if someone who knows could confirm: is
>> > > > >>> this true? Is this common to all store implementations? I was hoping
>> > > > >>> to use interactive streams like HBase to scan ranges. It appears this
>> > > > >>> is not possible.
>> > > > >>>
>> > > > >>> Thank you,
>> > > > >>> Dmitry
>> > > > >>>
>> > > > >>
>> > > > >>
>> > > > >
>> > > >
>> > > >
>> > >
>> >
>>
>
>
