[ https://issues.apache.org/jira/browse/KAFKA-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16065155#comment-16065155 ]

Guozhang Wang commented on KAFKA-4750:
--------------------------------------

[~evis] Inside the RocksDB store, after serialization, if we get a "null" byte 
array (NOTE: this is the serialized bytes, not a "null" object passed into the 
API), we should always treat it as a delete call; i.e. the current 
implementation inside RocksDB is ok:

{code}
private void putInternal(byte[] rawKey, byte[] rawValue) {
    if (rawValue == null) {
        // a null serialized value is always treated as a delete of the key
        try {
            db.delete(wOptions, rawKey);
        } catch (RocksDBException e) {
            ...
        }
    } else {
        // otherwise write the serialized value as-is
        try {
            db.put(wOptions, rawKey, rawValue);
        } catch (RocksDBException e) {
            ...
        }
    }
}
{code}

The question is whether, at the API layer, we also want to enforce that a 
"null" object indicates deletion. Currently we are a bit vague on this, so I 
was proposing two options to make it clear:

1) Clarify in the javadoc that a null value in {{put(key, value)}} indicates 
deletion; if the value is a "null" object, by-pass the serde and send "null" 
bytes directly into the inner functions, and vice versa for deserialization; do 
not dictate how user-customized serdes handle null values, since we are not 
going to call them with null values any more.

2) Do NOT state in the javadoc that a null value in {{put(key, value)}} 
indicates deletion; instead implement {{delete(key)}} directly throughout all 
the store layers rather than calling {{put(key, null)}}; recommend that 
user-customized serdes handle null values themselves. (Both options are 
sketched below.)
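
For concreteness, here is a minimal sketch of what the two options could look 
like in the serde-aware layer that sits above the byte store. The class and 
method names ({{SerializedStoreSketch}}, {{putOption1}}, {{deleteOption2}}) are 
hypothetical and only illustrate the idea; this is not the actual Streams code:

{code}
import org.apache.kafka.common.serialization.Serde;
import org.apache.kafka.streams.state.KeyValueStore;

// Hypothetical serde-aware wrapper around the byte store, for illustration only.
class SerializedStoreSketch<K, V> {
    private final KeyValueStore<byte[], byte[]> inner;  // e.g. the RocksDB-backed byte store
    private final Serde<K> keySerde;
    private final Serde<V> valueSerde;
    private final String topic;

    SerializedStoreSketch(KeyValueStore<byte[], byte[]> inner,
                          Serde<K> keySerde,
                          Serde<V> valueSerde,
                          String topic) {
        this.inner = inner;
        this.keySerde = keySerde;
        this.valueSerde = valueSerde;
        this.topic = topic;
    }

    // Option 1: a null value object means delete; the serde is by-passed and
    // null bytes are sent down, which putInternal() above turns into a delete.
    void putOption1(K key, V value) {
        byte[] rawValue = value == null
                ? null  // never call the user serde with a null value
                : valueSerde.serializer().serialize(topic, value);
        inner.put(keySerde.serializer().serialize(topic, key), rawValue);
    }

    // Option 2: put(key, null) carries no special meaning; deletion is its own
    // code path through every layer, and user serdes handle null themselves.
    void deleteOption2(K key) {
        inner.delete(keySerde.serializer().serialize(topic, key));
    }
}
{code}

The difference is whether null handling lives in the store layers (option 1) or 
is left to the user serdes plus an explicit delete path (option 2).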

I am a bit inclined towards the second option, while [~mjsax] seems to favor 
the first. I'd like to hear how others think.

> KeyValueIterator returns null values
> ------------------------------------
>
>                 Key: KAFKA-4750
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4750
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.10.1.1, 0.11.0.0, 0.10.2.1
>            Reporter: Michal Borowiecki
>            Assignee: Evgeny Veretennikov
>              Labels: newbie
>         Attachments: DeleteTest.java
>
>
> The API for the ReadOnlyKeyValueStore.range method promises that the returned 
> iterator will not return null values. However, after upgrading from 0.10.0.0 
> to 0.10.1.1 we found that null values are returned, causing NPEs on our side.
> I found this happens after removing entries from the store, and it resembles 
> the SAMZA-94 defect. The problem seems to be the same as it was there: when 
> deleting entries with a serializer that does not return null when null is 
> passed in, the state store doesn't actually delete that key/value pair, but 
> the iterator will return a null value for that key.
> When I modified our serializer to return null when null is passed in, the 
> problem went away. However, I believe this should be fixed in Kafka Streams, 
> perhaps with a similar approach as in SAMZA-94.
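
For reference, a minimal sketch (hypothetical classes, not the attached 
DeleteTest.java) of the kind of serializer/deserializer pair described above: 
null is serialized to non-null bytes, so {{put(key, null)}} never reaches the 
byte store as a delete, and the matching deserializer later maps those bytes 
back to null, which is what the range iterator then surfaces:

{code}
import java.nio.charset.StandardCharsets;
import java.util.Map;
import org.apache.kafka.common.serialization.Deserializer;
import org.apache.kafka.common.serialization.Serializer;

// Hypothetical illustration of the reported failure mode, for reference only.
class NullUnsafeStringSerializer implements Serializer<String> {
    @Override public void configure(Map<String, ?> configs, boolean isKey) { }
    @Override public byte[] serialize(String topic, String data) {
        // returns empty bytes instead of null, so the store never sees a delete
        return data == null ? new byte[0] : data.getBytes(StandardCharsets.UTF_8);
    }
    @Override public void close() { }
}

class NullUnsafeStringDeserializer implements Deserializer<String> {
    @Override public void configure(Map<String, ?> configs, boolean isKey) { }
    @Override public String deserialize(String topic, byte[] data) {
        // maps the empty bytes back to null, which the iterator then returns
        return data == null || data.length == 0 ? null : new String(data, StandardCharsets.UTF_8);
    }
    @Override public void close() { }
}
{code}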


