It's trivial to avoid repeatedly accessing the array in the test case, but for the data I'm actually using, which is deeply nested and gets accessed randomly, it would require a nontrivial amount of added complexity. It would partly negate the benefit of moving away from protos.
I still also have the concern from my initial email -- I get that 2^63 is a large limit, but given that it's basically a termination condition on my entire app, it worries me that I just have to trust that it won't get hit, either from accessing the data for a long time or from bugs in the library.

On Tue, Feb 14, 2017 at 10:21 PM, Kenton Varda <[email protected]> wrote:
> On Tue, Feb 14, 2017 at 5:19 AM, Christian Plesner Hansen <
> [email protected]> wrote:
>
>> Thanks for the quick response! I've filed an issue,
>> https://github.com/jparyani/pycapnp/issues/137. It's easy to reproduce.
>
> Yeah, seems like a bug.
>
>> Should I take what you're saying to mean that, in the absence of bugs,
>> the reader should only ever count data that is actually, "physically",
>> read against the limit? So if, say, I have a large array of structs and
>> I read a field of one of those structs, it's only the size of that field
>> that gets counted -- not the size of the struct or the size of the
>> entire array?
>
> Technically, when dereferencing a pointer, the shallow size of the
> pointed-to object (not counting any further objects it points to in turn)
> is counted. Since struct arrays have a flat layout, it would in fact count
> the size of all the structs in the array every time you request the field
> from the parent object.
>
> So this code:
>
>     for i in range(0, count):
>         assert message.entries[i].f0 == i
>         assert message.entries[i].f1 == 10 + i
>         assert message.entries[i].f2 == 20 + i
>         assert message.entries[i].f3 == 30 + i
>
> will count the full size of the array 4 * count times -- so you will
> traverse O(count^2) words. However, even then, it is unlikely that you
> are hitting the traversal limit, because you'd need the array to have on
> the order of 2^30 (~1 billion) elements in order for the square of the
> count to be near 2^60.
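[Editor's note: the quadratic traversal cost described above can be sketched with plain arithmetic. This is a cost model only, not the pycapnp API; `count` and `words_per_struct` are made-up illustrative values (four 64-bit fields per struct, matching f0..f3).]

```python
# Cost model for the loop above. Each `message.entries[i]` dereferences the
# list pointer, and dereferencing counts the *entire* flat struct array
# (its shallow size) against the traversal limit.
count = 1_000_000        # assumed number of structs in the array
words_per_struct = 4     # assumed: four 8-byte fields per struct
array_words = count * words_per_struct

# Naive loop: 4 field reads per element, each re-dereferencing the list
# pointer, so the whole array is counted 4 * count times.
naive_words = 4 * count * array_words

# Cached rewrite: the list pointer is dereferenced once, so the array is
# counted once.
cached_words = array_words

limit = 2 ** 63
print(naive_words, cached_words, naive_words < limit)
# Even the quadratic cost (1.6e13 words here) stays far below the 2^63
# default limit, matching the point made above.
```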
> That said, you should definitely rewrite your code like this:
>
>     entries = message.entries
>     for i in range(0, count):
>         entry = entries[i]
>         assert entry.f0 == i
>         assert entry.f1 == 10 + i
>         assert entry.f2 == 20 + i
>         assert entry.f3 == 30 + i
>
> This avoids re-traversing the same pointer repeatedly, which not only
> helps avoid the traversal limit, but will also make your code considerably
> faster, because traversing a pointer is "sort of slow" given the need for
> bounds checking and FFI overhead.
>
> -Kenton

--
You received this message because you are subscribed to the Google Groups
"Cap'n Proto" group. To unsubscribe from this group and stop receiving
emails from it, send an email to [email protected].
Visit this group at https://groups.google.com/group/capnproto.
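[Editor's note: the effect of the caching rewrite can be demonstrated with a plain-Python stand-in that counts pointer traversals. `Message`, `Entry`, and the field names are hypothetical stand-ins for the real pycapnp reader objects, which behave analogously.]

```python
class Entry:
    """Stand-in for one struct in the flat array (fields f0..f3)."""
    def __init__(self, i):
        self.f0, self.f1, self.f2, self.f3 = i, 10 + i, 20 + i, 30 + i

class Message:
    """Stand-in for a message reader that counts list-pointer traversals."""
    def __init__(self, count):
        self._entries = [Entry(i) for i in range(count)]
        self.deref_count = 0   # models words charged to the traversal limit

    @property
    def entries(self):
        self.deref_count += 1  # each access models one list-pointer traversal
        return self._entries

count = 100
message = Message(count)

# Naive loop: four list-pointer traversals per element.
for i in range(count):
    assert message.entries[i].f0 == i
    assert message.entries[i].f1 == 10 + i
    assert message.entries[i].f2 == 20 + i
    assert message.entries[i].f3 == 30 + i
naive = message.deref_count    # 4 * count = 400 traversals

# Cached loop, as recommended above: one traversal total.
message.deref_count = 0
entries = message.entries
for i in range(count):
    entry = entries[i]
    assert entry.f0 == i
    assert entry.f1 == 10 + i
    assert entry.f2 == 20 + i
    assert entry.f3 == 30 + i
cached = message.deref_count   # 1 traversal

print(naive, cached)  # 400 1
```

With real pycapnp readers the same hoisting applies, and it also avoids the per-access bounds-checking and FFI overhead mentioned above.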
