It's trivial to avoid the repeated array accesses in the test case, but for
the data I'm actually using, which is deeply nested and accessed randomly,
doing so would add a nontrivial amount of complexity. That would partly
negate the benefit of moving away from protos.
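For concreteness, the kind of wrapper I'd have to maintain looks roughly
like this. It's only a plain-Python sketch over dicts and lists; CachedNode
and its methods are made-up names, and a real version would have to mirror
pycapnp's reader API across every struct type I use:

```python
# Hypothetical sketch (not pycapnp): a memoizing wrapper so each
# pointer in a nested structure is dereferenced at most once, even
# under random access.

class CachedNode:
    """Wraps any indexable reader and caches its children."""

    def __init__(self, node):
        self._node = node
        self._children = {}  # key -> CachedNode, filled on first access

    def child(self, key):
        cached = self._children.get(key)
        if cached is None:
            # The only path that touches the underlying reader;
            # runs once per key.
            cached = CachedNode(self._node[key])
            self._children[key] = cached
        return cached

    def get(self):
        return self._node

# Example: repeated random access dereferences each edge only once.
root = CachedNode({"entries": [{"f0": 0}, {"f0": 1}]})
for _ in range(3):
    assert root.child("entries").child(1).child("f0").get() == 1
```

That's manageable for one flat list, but threading it through deeply
nested, randomly accessed data is where the complexity comes in.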

I still also have the concern from my initial email -- I understand that
2^63 is a large limit, but given that hitting it is effectively a
termination condition for my entire app, it worries me that I just have to
trust it won't be reached, whether from accessing the data over a long
lifetime or from bugs in the library.
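To illustrate why this worries me, here's a toy model of a read-limit
budget. This is purely illustrative -- it is not pycapnp's actual
accounting, and all names in it are made up -- but it shows how the
"charge the shallow size on every dereference" rule makes naive repeated
access drain the budget quadratically:

```python
# Toy model of a traversal-limit budget (illustrative only; not
# pycapnp's real implementation, all names made up).

class TraversalLimitExceeded(Exception):
    pass

class LimitedReader:
    def __init__(self, budget_words):
        self.remaining = budget_words

    def charge(self, words):
        self.remaining -= words
        if self.remaining < 0:
            raise TraversalLimitExceeded("read limit exhausted")

def read_entry(reader, entries, i, words_per_entry=4):
    # Dereferencing the list pointer charges the whole flat array,
    # mirroring the "shallow size of the pointed-to object" rule.
    reader.charge(len(entries) * words_per_entry)
    return entries[i]

reader = LimitedReader(budget_words=10_000)
entries = list(range(100))  # 100 entries * 4 words = 400 words/deref
try:
    for i in range(100):
        read_entry(reader, entries, i)  # naive: re-dereference each time
    hit_limit = False
except TraversalLimitExceeded:
    hit_limit = True

assert hit_limit  # 100 derefs * 400 words = 40,000 > 10,000 budget
```

Hoisting the dereference out of the loop charges the 400 words exactly
once, which is the rewrite suggested below -- but in my real code the
equivalent hoisting isn't nearly as mechanical.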


c

On Tue, Feb 14, 2017 at 10:21 PM, Kenton Varda <[email protected]> wrote:

> On Tue, Feb 14, 2017 at 5:19 AM, Christian Plesner Hansen <
> [email protected]> wrote:
>
>> Thanks for the quick response! I've filed an issue,
>> https://github.com/jparyani/pycapnp/issues/137. It's easy to reproduce.
>>
>
> Yeah seems like a bug.
>
>
>> Should I take what you're saying to mean that in the absence of bugs the
>> reader should only ever count data that is actually, "physically", read
>> against the limit? So if, say, I have a large array of structs and I read a
>> field of one of those structs, it's only the size of that field that gets
>> counted -- not the size of the struct or the size of the entire array?
>>
>
> Technically, when dereferencing a pointer, the shallow size of the
> pointed-to object (not counting any further objects it points to in turn)
> is counted. Since struct arrays have flat layout, it would in fact count
> the size of all the structs in the array every time you request the field
> from the parent object.
>
> So this code:
>
>   for i in range(0, count):
>     assert message.entries[i].f0 == i
>     assert message.entries[i].f1 == 10 + i
>     assert message.entries[i].f2 == 20 + i
>     assert message.entries[i].f3 == 30 + i
>
> will count the full size of the array 4 * count times -- so, you will
> traverse O(count^2) words. However, even then, it is unlikely that you are
> hitting the traversal limit because you'd need the array to have on the
> order of 2^30 (~1 billion) elements in order for the square of the count to
> be near 2^60.
>
> That said, you definitely should rewrite your code like this:
>
>   entries = message.entries
>   for i in range(0, count):
>     entry = entries[i]
>     assert entry.f0 == i
>     assert entry.f1 == 10 + i
>     assert entry.f2 == 20 + i
>     assert entry.f3 == 30 + i
>
> This avoids re-traversing the same pointer repeatedly, which not only
> helps avoid the traversal limit, but will also make your code considerably
> faster, because traversing a pointer is "sort of slow" given the need for
> bounds checking and FFI overhead.
>
> -Kenton
>

-- 
You received this message because you are subscribed to the Google Groups 
"Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
Visit this group at https://groups.google.com/group/capnproto.