Thanks Fabian! At least now I know the bug is probably not in the driver where I was looking :)
On 16 September 2015 at 17:33, Fabian Hueske <fhue...@gmail.com> wrote:

> Yes, probing the HashTable with a key that does not exist will yield a join
> function call with a null value (or empty iterator in case of CoGroup).
>
> The semantics of the join are the same regardless of the hash table
> implementation.
> The fact that the error only occurs with the managed HT, indicates that
> there is a bug somewhere :-(
>
> 2015-09-16 17:26 GMT+02:00 Vasiliki Kalavri <vasilikikala...@gmail.com>:
>
> > Hi,
> >
> > thanks a lot Fabian!
> >
> > I didn't know that join with the solution set is an outer join. That's a
> > surprise :)
> >
> > So, if I understand correctly, I should have a null value when my other
> > input to the join contains some key that doesn't exist in the solution set,
> > right? That's not the case in my application; I'm not generating any new
> > keys.
> >
> > Also, when setting the solutionSetUnManaged option, the exception doesn't
> > occur anymore. Are the join semantics different when the solution set is in
> > unmanaged memory?
> >
> > Cheers,
> > Vasia.
> >
> > On 16 September 2015 at 16:50, Fabian Hueske <fhue...@gmail.com> wrote:
> >
> > > Hi Vasia,
> > >
> > > I looked into the code. A serializer should never return null when
> > > deserializing. Either it does not detect that something went wrong with the
> > > deserialization or it should throw an exception.
> > >
> > > Regarding the handling of null returns in the Drivers. If there is no entry
> > > in the HT for a certain key, the HT will return null which is expected.
> > > If a CoGroupWithSolutionSet*Driver receives a null value, it gives an empty
> > > iterator to the user function. The JoinWithSolutionSet*Driver calls the
> > > join function with a null value. Both behaviors are expected. A join with a
> > > solution set is actually an outer join and a join function in such a join
> > > needs to be able to handle null values on the solution set side.
> > >
> > > Cheers, Fabian
> > >
> > > 2015-09-15 17:41 GMT+02:00 Vasiliki Kalavri <vasilikikala...@gmail.com>:
> > >
> > > > Hello to my squirrels,
> > > >
> > > > I ran into an NPE for some iterations code and it looks like what's
> > > > described in FLINK-2443 <https://issues.apache.org/jira/browse/FLINK-2443>.
> > > > I'm trying to understand the problem and I could really use your help :)
> > > >
> > > > So far, it seems that the exception is caused by a null value returned by
> > > > CompactingHashTable.*getMatchFor*(PT probeSideRecord).
> > > >
> > > > This method returns null in the following cases:
> > > > - when the hash table is "closed"
> > > > - when the segment is done
> > > > - if the serializer actually returns a null record
> > > >
> > > > It seems that on the join/cogroup driver side there is no check or special
> > > > handling when the build side record is null, i.e. the null record is still
> > > > passed to the join function.
> > > > Is this correct and if not, what should the driver do in this case?
> > > >
> > > > Thank you!
> > > >
> > > > Cheers,
> > > > Vasia.
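
For anyone reading this thread later, here is a minimal sketch of the outer-join behaviour described above: a probe that misses hands null to the join function, so the function has to check for it. This is plain Java and not Flink code. The class SolutionSetJoinSketch, the interface NullAwareJoin, and the methods probeAll and describe are all hypothetical names made up for illustration, and a java.util.HashMap stands in for the solution set hash table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SolutionSetJoinSketch {

    /** Hypothetical stand-in for a join function; this is NOT Flink's interface. */
    interface NullAwareJoin<W, S, O> {
        O join(W worksetRecord, S solutionOrNull);
    }

    /**
     * Probes the (stand-in) solution set once per workset key.
     * A missing key yields null, which is still passed to the join
     * function, mirroring the outer-join semantics from the thread.
     */
    static <O> List<O> probeAll(List<String> worksetKeys,
                                Map<String, Long> solutionSet,
                                NullAwareJoin<String, Long, O> fn) {
        List<O> out = new ArrayList<>();
        for (String key : worksetKeys) {
            Long match = solutionSet.get(key); // null when the key is absent
            out.add(fn.join(key, match));
        }
        return out;
    }

    /** Example join function that handles a null solution set side explicitly. */
    static String describe(String key, Long solutionOrNull) {
        if (solutionOrNull == null) {
            return key + " -> no solution set entry";
        }
        return key + " -> " + solutionOrNull;
    }

    public static void main(String[] args) {
        Map<String, Long> solutionSet = new HashMap<>();
        solutionSet.put("a", 1L);
        solutionSet.put("b", 2L);

        // "c" is not in the solution set, so the join function sees null for it.
        List<String> workset = Arrays.asList("a", "c");
        for (String line : probeAll(workset, solutionSet, SolutionSetJoinSketch::describe)) {
            System.out.println(line);
        }
    }
}
```

Without the null check in describe, a join function that dereferences the solution set value (for example by unboxing it) would throw an NPE for key "c". That is the failure a function written for inner-join semantics would hit.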