Hi Guozhang, Thank you very much for the patience in explaining your points, I've learnt quite a bit in researching and experimenting after your replies.
bq. I still think it is worth defining `timestamp` as a special compaction value I don't personally see advantages in it, but also the only disadvantage that I can think of is putting multiple meanings on this field, which does not seem enough to dissuade anyone, so I've added it to the KIP as a compromise. (please also see the pull request in case you want to confirm the implementation matches your idea) bq. Should it be "the record with the highest value will be kept"? That is describing a scenario where the records being compared have the same value, in which case the offset is used as a tie-breaker. With trying to cover as much as possible, the "Proposed Changes" may have became confusing to read, sorry for that... bq. Users are then responsible to encode their compaction field according to the byte array lexico-ordering to full fill their ordering semantics. It is more flexible to enforce users to encode their compaction field always as a long type. This was indeed my focus on the previous replies, since I am not sure how this would work without adding a lot of responsibility on the client side. So, rather than trying to debate best practices, since I don't know which ones are being followed in this project, I will instead debate my own selfish need for this feature: Having it this way would jeopardize my own particular use case, as I need to have an incremental number representing the version (i.e.: 1, 2, 3, 5, 52, et cetera). It does not totally invalidate it, since we can always convert it to String on the client side and left-pad with 0's to the max length of a long, but it seems a shame to have to do this as it would increase the data transfer size (I'm trying to avoid it becoming a bottleneck during high throughput periods). This would likely mean that I would start abusing the "timestamp" approach discussed above, as it keeps the messages nimble, but it would again be a shame to be forced into such a hacky solution. This is how I see it, and why I would like to avoid it. But maybe there is some smarter way that you know of on how to handle it on the client side that would invalidate these concerns? Please let me know, and I would also greatly value some more feedback from other people regarding this topic, so please don't be shy! Kind Regards,Luis On Friday, April 20, 2018, 7:41:30 PM GMT+2, Guozhang Wang <wangg...@gmail.com> wrote: Hi Luís, What I'm thinking primarily is that we only need to compare the compaction values as LONG for the offset and timestmap "type" (I still think it is worth defining `timestamp` as a special compaction value, with the reasons below). Not sure if you've seen my other comment earlier regarding the offset / timestmap, I'm pasting / editing them here to illustrate my idea: -------------- I think maybe we have a mis-communication here: I'm not against the idea of using headers, but just trying to argue that we could make `timestamp` field a special config value that is referring to the timestamp field in the metadata. So from log cleaner's pov: 1. if the config value is "offset", look into the offset field, *comparing their value as long* 2. if the config value is "timestamp", look into the timestamp field, *comparing their value as long* 3. otherwise, say the config value is "foo", search for key "foo" in the message header, comparing the value as *byte arrays* I.e. "offset" and "timestamp" are treated as special cases other than case 3) above. -------------- I think your main concern is that "Although the byte[] can be compared, it is not actually comparable as the versioning is based on a long", while I'm thinking we can indeed generalize it: there is not hard reasons that the "compaction value" has to be a long, and since the goal of this KIP is to generalize the log compaction logic to consider header fields, why not allowing it to be of any types than enforcing them still to be a long type? Users are then responsible to encode their compaction field according to the byte array lexico-ordering to full fill their ordering semantics. It is more flexible to enforce users to encode their compaction field always as a long type. Let me know WDYT. Also I have some minor comments on the wiki itself: 1) "When both records being compared contain a matching "compaction value", then the record with the highest offset will be kept;" Should it be "the record with the highest value will be kept"? Guozhang On Fri, Apr 20, 2018 at 1:05 AM, Luís Cabral <luis_cab...@yahoo.com.invalid> wrote: > Guozhang, is this reply ok with you? > > > If you insist on the byte[] comparison directly, then I would need some > suggestions on how to represent a "version" with it, and then the KIP could > be changed to that. > On Tuesday, April 17, 2018, 2:44:16 PM GMT+2, Luís Cabral < > luis_cab...@yahoo.com> wrote: > > Oops, missed that email... > > bq. It is because when we compare the bytes we do not treat them as longs > atall, so we just compare them based on bytes; I admit that if users's > headertypes have some semantic meanings (e.g. it is encoded from a long) > they weare forcing them to choose the encoder that obeys key > lexicographicordering; but I felt it is more general than enforcing any > fields that maybe used for log cleaner to be defined as a special type. > > Yes, you can compare bytes between each other (its what that code does). > You can then assume (or infer) that the encoding used allows for > lexicographic ordering, which I hope you do not do a lot of. This is > (logically) the same as converting to String and then comparing the > strings, except that it allows for abstracting from the String encoding > (again, either with assumptions or with inferred knowledge). > This is purely academic, however, as the versioning is based on a long, > which is not compatible with this approach. So, is this comment a > fact-check stating that it is possible to compare byte[] overall, or is it > about trying to use it in this KIP? > > Cheers > > PS (because I'm stubborn): It is still not comparable, this comparison is > all based on assumptions about the content of the byte array, but I hope we > can leave this stuff to Stack Overflow instead of debating it here :) > > -- -- Guozhang