If you are referring to, for example:

-4611686018427387904 > 0
-4611686018427387904 > 4611686018427387903


That is because when we compare the bytes we do not treat them as longs at
all; we simply compare them byte by byte. I admit that if users' header
values carry some semantic meaning (e.g. they are encoded from a long), then
we are forcing them to choose an encoder that obeys lexicographic ordering
of the bytes; but I felt this is more general than requiring any field that
may be used by the log cleaner to be defined as a special type.
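
To make this concrete, here is a minimal sketch, assuming the comparison
treats each byte as unsigned and that the long was written big-endian. The
class and method names (LexOrderSketch, compareUnsigned, plain, ordered) are
made up for illustration and are not part of Kafka. It reproduces the two
results above and shows one possible encoder (flipping the sign bit) that
keeps numeric order under a byte-wise comparison:

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; none of these names exist in Kafka.
final class LexOrderSketch {

    // Lexicographic comparison, treating every byte as unsigned (0..255).
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        // All shared bytes are equal: the shorter array sorts first.
        return Integer.compare(a.length, b.length);
    }

    // Plain big-endian two's-complement encoding. The sign bit is the
    // highest bit of the first byte, so negative values sort after
    // positive ones when the bytes are compared as unsigned.
    static byte[] plain(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v).array();
    }

    // One order-preserving alternative: flip the sign bit before writing,
    // so that byte-wise order matches numeric order.
    static byte[] ordered(long v) {
        return ByteBuffer.allocate(Long.BYTES).putLong(v ^ Long.MIN_VALUE).array();
    }

    public static void main(String[] args) {
        long neg = -4611686018427387904L;
        long pos = 4611686018427387903L;

        // Plain encoding: the negative value compares as the larger one.
        System.out.println(compareUnsigned(plain(neg), plain(0L)) > 0);
        System.out.println(compareUnsigned(plain(neg), plain(pos)) > 0);

        // Sign-flipped encoding: numeric order is preserved.
        System.out.println(compareUnsigned(ordered(neg), ordered(0L)) < 0);
        System.out.println(compareUnsigned(ordered(neg), ordered(pos)) < 0);
    }
}
```

With an encoder along the lines of ordered(), users who derive the header
value from a long would get the ordering they expect without the log cleaner
needing to know the value's type.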

Guozhang



On Wed, Apr 11, 2018 at 10:18 AM, Guozhang Wang <wangg...@gmail.com> wrote:

> > I do not mean that it is "used", but if what you meant is that you
> would prefer to use that field instead of a header?
> > This is in relation to a previous point of yours:
>
> I think maybe we have a miscommunication here: I'm not against the idea
> of using headers, but am just trying to argue that we could make
> `timestamp` a special config value that refers to the timestamp field in
> the metadata. So from the log cleaner's pov:
>
> 1. if the config value is "offset", look into the offset field;
> 2. if the config value is "timestamp", look into the timestamp field;
> 3. otherwise, say the config value is "foo", search for key "foo" in the
> message header.
>
>
> > get super-inconsistent results, which make me reluctant to rely on it:
> https://codebunk.com/b/704211525/
>
> Hmm, could you elaborate on which part of the results is inconsistent? I
> cannot tell directly from the console output of the code you posted.
>
>
>
> Guozhang
>
>
>
> On Wed, Apr 11, 2018 at 9:16 AM, Luís Cabral <
> luis_cab...@yahoo.com.invalid> wrote:
>
>> Hi Guozhang,
>>
>>
>> bq. I'm not sure I understand your statement that it is used to determine
>> the "version" of the record
>>
>> I do not mean that it is "used", but if what you meant is that you would
>> prefer to use that field instead of a header?
>> This is in relation to a previous point of yours:
>> >>> 1) I'm also in favor of making the `timestamp` a preserved config
>> value along with `offset`, for which we would not go into the headers to
>> look for the matching key, but directly look into the timestamp field of
>> the message.
>>
>>
>>
>> bq. Regarding the byte arrays: I think byte arrays are indeed
>> comparable, right?
>>
>> As far as I am aware, they are not comparable. Then again, I am not aware
>> of everything that exists everywhere :)
>> I just experimented with the code you mentioned and get
>> super-inconsistent results, which make me reluctant to rely on it:
>> https://codebunk.com/b/704211525/
>>
>>
>>
>> Thank you again for the comments.
>>
>
>
>
> --
> -- Guozhang
>



-- 
-- Guozhang
