As an update, I added the following debug messages 
to c++/src/capnp/arena.c++:

   SegmentMap* segments = nullptr;
   KJ_IF_MAYBE(s, *lock) {
     KJ_IF_MAYBE(segment, s->find(id.value)) {
       return *segment;
     }
     segments = s;
+  } else {
+    size_t this_id = std::hash<std::thread::id>{}(std::this_thread::get_id());
+    KJ_DBG("map doesn't exist", this_id, this);
   }

It looks like (just before the crash) multiple threads print "map doesn't 
exist" for the same 'this' value. It's as if the lock did not work for some 
reason. I have not yet been able to reproduce the issue in a pure capnp test.
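
For a reproduction attempt outside capnp, a minimal sketch of the pattern in 
question (a lazily created map, with find-then-insert done under a single 
mutex) might look like the following. LazySegmentCache, getOrCreate, 
StressResult and runStress are hypothetical names, and std::mutex and 
std::unordered_map stand in for the kj types, so this does not exercise the 
real ReaderArena code. It only shows what the locking is expected to guarantee.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <vector>

// Counters reported by the stress run.
struct StressResult {
  int mapCreations;      // how many times the map was lazily allocated
  int duplicateInserts;  // inserts that found the key already present
  std::size_t entries;   // final number of entries in the map
};

// Hypothetical stand-in for a lazily created segment table guarded by a mutex.
class LazySegmentCache {
public:
  // Looks up `id`; on a miss, creates the map if needed and inserts the entry.
  // The find and the insert both happen while holding the same lock.
  int getOrCreate(uint32_t id) {
    std::lock_guard<std::mutex> guard(mutex_);
    if (!map_) {
      map_.reset(new std::unordered_map<uint32_t, int>());
      ++mapCreations_;
    }
    auto it = map_->find(id);
    if (it != map_->end()) return it->second;
    auto result = map_->emplace(id, static_cast<int>(id));
    if (!result.second) ++duplicateInserts_;  // should be unreachable under the lock
    return result.first->second;
  }

  StressResult stats() {
    std::lock_guard<std::mutex> guard(mutex_);
    return StressResult{mapCreations_, duplicateInserts_,
                        map_ ? map_->size() : 0};
  }

private:
  std::mutex mutex_;
  std::unique_ptr<std::unordered_map<uint32_t, int>> map_;
  int mapCreations_ = 0;
  int duplicateInserts_ = 0;
};

// Starts `threadCount` threads that each request ids 0..idCount-1 from one
// shared cache, then returns the counters.
StressResult runStress(unsigned threadCount, uint32_t idCount) {
  LazySegmentCache cache;
  std::vector<std::thread> threads;
  for (unsigned t = 0; t < threadCount; ++t) {
    threads.emplace_back([&cache, idCount] {
      for (uint32_t id = 0; id < idCount; ++id) cache.getOrCreate(id);
    });
  }
  for (auto& th : threads) th.join();
  return cache.stats();
}
```

If the lock is held correctly, runStress(8, 500) should report exactly one map 
creation and no duplicate inserts. Seeing the "map doesn't exist" branch taken 
by several threads for the same object would correspond to mapCreations being 
greater than one here.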

For context, we have the same type of message, with 2 segments, printed at a 
high frequency. We have a stack of them being read by multiple readers. 
Apart from the mentioned exception being thrown, we often get segfaults in 
the insert() function.

On Tuesday, October 1, 2019 at 10:39:16 AM UTC-7, Cenk Oguz Saglam wrote:
>
> Thanks for the quick response Kenton.
>
> I was also suspecting a race condition. Thanks for checking the mutex. It 
> is very likely that the issue is due to our usage. I'll share what I find 
> as I debug this further.
>
> On Tuesday, October 1, 2019 at 9:30:42 AM UTC-7, Kenton Varda wrote:
>>
>> Hi Oguz,
>>
>> You can get better stack traces by compiling in debug mode (both Cap'n 
>> Proto itself, and your project). You should then see a symbolic trace 
>> instead of a bunch of addresses.
>>
>> This is a strange error, though. Looking at the code for 
>> ReaderArena::tryGetSegment(), the insert() call only happens after a find() 
>> call looking for the same key has failed. How could the inserted row 
>> already exist, then?
>>
>> Moreover, the whole sequence is performed under a mutex lock, seemingly 
>> ruling out any race conditions.
>>
>> I'm not sure what to say here. If you can come up with a self-contained 
>> test case that reproduces the issue, I'd be happy to debug.
>>
>> -Kenton
>>
>> On Tue, Oct 1, 2019 at 9:10 AM <[email protected]> wrote:
>>
>>> Thanks for this amazing software.
>>>
>>> We are using v0.7.0. I would like to ask for help debugging the following 
>>> exception, which we get rarely but consistently:
>>>
>>> terminate called after throwing an instance of 'kj::ExceptionImpl'
>>>   what():  kj/table.c++:44: failed: inserted row already exists in table
>>> stack: 7f6f7f0697 7f6f7f1ee3 7f6f802623 7f6f5dc9ab 7f6f5b4823 7f77fdd5bf 
>>> 557c6f2957 557c63df2b 7f78a30e13 7f78b0f087
>>>
>>> Our backtrace shows that we were trying to read from a proto, then the 
>>> following two functions in capnp were called:
>>>
>>>    - kj::Table<kj::HashMap<unsigned int, 
>>>    kj::Own<capnp::_::SegmentReader> >::Entry, 
>>>    kj::HashIndex<kj::HashMap<unsigned int, kj::Own<capnp::_::SegmentReader> 
>>>    >::Callbacks> >::insert(kj::HashMap<unsigned int, 
>>>    kj::Own<capnp::_::SegmentReader> >::Entry&&)
>>>    - capnp::_::ReaderArena::tryGetSegment(kj::Id<unsigned int, 
>>>    capnp::_::Segment>)
>>>
>>> Why would reading from proto trigger an insert call?
>>>
>>> How can I make use of the "stack: 7f6f7f0697 7f6f7f1ee3..." to debug 
>>> this further?
>>>
>>> If the way we read proto is OK, can this perhaps be caused by how we 
>>> populate the proto contents? Perhaps some missing or corrupt members?
>>>
>>> Thank you very much for help in advance,
>>> Oguz
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "Cap'n Proto" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/capnproto/d6512e2e-b990-47c6-9892-a252c0c629c3%40googlegroups.com
>>>
>>

