If you are referring to, for example:

-4611686018427387904 > 0
-4611686018427387904 > 4611686018427387903

It is because when we compare the bytes we do not treat them as longs at
all; we just compare them byte by byte. I admit that if users' header types
have some semantic meaning (e.g. the value is encoded from a long) then we
are forcing them to choose an encoder that obeys the key's lexicographic
ordering; but I felt this is more general than requiring any field that may
be used by the log cleaner to be defined as a special type.

Guozhang

On Wed, Apr 11, 2018 at 10:18 AM, Guozhang Wang <wangg...@gmail.com> wrote:

>
> I do not mean that it is "used", but if what you meant is that you
> would prefer to use that field instead of a header?
>
> This is in relation to a previous point of yours:
>
> I think maybe we have a mis-communication here: I'm not against the idea
> of using headers, but just trying to argue that we could make `timestamp`
> field a special config value that is referring to the timestamp field in
> the metadata. So from log cleaner's pov:
>
> 1. if the config value is "offset", look into the offset field,
> 2. if the config value is "timestamp", look into the timestamp field;
> 3. otherwise, say the config value is "foo", search for key "foo" in the
> message header.
>
>
>
> get super-inconsistent results, which make me reluctant to rely on it:
> https://codebunk.com/b/704211525/
>
> Hmm, could you elaborate which part of the results is inconsistent? I
> cannot tell directly from the console output of the code you posted.
>
>
>
> Guozhang
>
>
>
> On Wed, Apr 11, 2018 at 9:16 AM, Luís Cabral <
> luis_cab...@yahoo.com.invalid> wrote:
>
>> Hi Guozhang,
>>
>>
>> bq. I'm not sure I understand your statement that it is used to determine
>> the "version" of the record
>>
>> I do not mean that it is "used", but if what you meant is that you would
>> prefer to use that field instead of a header?
>> This is in relation to a previous point of yours:
>>
>>> 1) I'm also in favor of making the `timestamp` a preserved config
>>> value along with `offset`, for which we would not go into the headers to
>>> look for the matching key, but directly look into the timestamp field of
>>> the message.
>>
>>
>>
>> bq. Regarding the byte arrays: I think byte arrays are indeed
>> comparable, right?
>>
>> As far as I am aware, they are not comparable. Then again, I am not aware
>> of everything that exists everywhere :)
>> I just experimented with the code you mentioned and get
>> super-inconsistent results, which make me reluctant to rely on it:
>> https://codebunk.com/b/704211525/
>>
>>
>>
>> Thank you again for the comments.
>>
>
>
>
> --
> -- Guozhang
>

--
-- Guozhang
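
P.S. To make the byte-comparison point concrete, here is a minimal sketch in
Java. It assumes the comparison is unsigned and lexicographic over a
big-endian two's-complement encoding of the long (what ByteBuffer.putLong
writes by default); that assumption is mine and is not confirmed anywhere in
this thread. The names ByteOrderDemo, encode, encodeOrdered and compareBytes
are hypothetical and are not part of Kafka. It also shows one possible
order-preserving encoder (flipping the sign bit before encoding), purely as
an illustration of "an encoder that obeys lexicographic ordering".

```java
import java.nio.ByteBuffer;

class ByteOrderDemo {
    // Plain big-endian two's-complement encoding of a long (8 bytes).
    static byte[] encode(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    // Hypothetical order-preserving encoder: flipping the sign bit maps
    // Long.MIN_VALUE..Long.MAX_VALUE onto 0x00..00 .. 0xFF..FF, so unsigned
    // byte order matches numeric order.
    static byte[] encodeOrdered(long v) {
        return encode(v ^ Long.MIN_VALUE);
    }

    // Unsigned lexicographic comparison of two byte arrays (assumed here;
    // a signed comparison would give different results).
    static int compareBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        long neg = -4611686018427387904L; // 0xC000000000000000
        long max = 4611686018427387903L;  // 0x3FFFFFFFFFFFFFFF

        // Plain encoding: the negative value sorts above both 0 and max,
        // because its first byte is 0xC0.
        System.out.println(compareBytes(encode(neg), encode(0L)) > 0);  // prints true
        System.out.println(compareBytes(encode(neg), encode(max)) > 0); // prints true

        // Sign-flipped encoding: byte order now agrees with numeric order.
        System.out.println(compareBytes(encodeOrdered(neg), encodeOrdered(0L)) < 0); // prints true
        System.out.println(compareBytes(encodeOrdered(0L), encodeOrdered(max)) < 0); // prints true
    }
}
```

With the plain encoding the results match the two comparisons at the top of
this mail; with the sign-flipped encoding the same values compare in numeric
order, which is the kind of encoder choice I was referring to.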