It's trivial to avoid repeatedly accessing the array in the test case, but for the data I'm actually using, which is deeply nested and gets accessed randomly, it would require a nontrivial amount of added complexity. It would partly negate the benefit of moving away from protos.
I still also have the concern from my initial email -- I get that 2^63 is a large limit, but given that it's basically a termination condition on my entire app, it worries me that I just have to trust that it won't get hit, either from accessing the data for a long time or from bugs in the library.

On Tue, Feb 14, 2017 at 10:21 PM, Kenton Varda <[email protected]> wrote:
> On Tue, Feb 14, 2017 at 5:19 AM, Christian Plesner Hansen <
> [email protected]> wrote:
>
>> Thanks for the quick response! I've filed an issue,
>> https://github.com/jparyani/pycapnp/issues/137. It's easy to reproduce.
>
> Yeah, seems like a bug.
>
>> Should I take what you're saying to mean that, in the absence of bugs,
>> the reader should only ever count data that is actually, "physically",
>> read against the limit? So if, say, I have a large array of structs and
>> I read a field of one of those structs, it's only the size of that field
>> that gets counted -- not the size of the struct or the size of the
>> entire array?
>
> Technically, when dereferencing a pointer, the shallow size of the
> pointed-to object (not counting any further objects it points to in turn)
> is counted. Since struct arrays have a flat layout, it would in fact count
> the size of all the structs in the array every time you request the field
> from the parent object.
>
> So this code:
>
>     for i in range(0, count):
>         assert message.entries[i].f0 == i
>         assert message.entries[i].f1 == 10 + i
>         assert message.entries[i].f2 == 20 + i
>         assert message.entries[i].f3 == 30 + i
>
> will count the full size of the array 4 * count times -- so you will
> traverse O(count^2) words. However, even then, it is unlikely that you
> are hitting the traversal limit, because you'd need the array to have on
> the order of 2^30 (~1 billion) elements in order for the square of the
> count to be near 2^60.
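[Editor's note: the quadratic traversal cost described above can be sketched with plain arithmetic. This is a cost model only, not the pycapnp API; `count` and `words_per_struct` are made-up illustrative values (four 64-bit fields per struct, matching f0..f3).]

```python
# Cost model for the loop above. Each `message.entries[i]` dereferences the
# list pointer, and dereferencing counts the *entire* flat struct array
# (its shallow size) against the traversal limit.
count = 1_000_000        # assumed number of structs in the array
words_per_struct = 4     # assumed: four 8-byte fields per struct
array_words = count * words_per_struct

# Naive loop: 4 field reads per element, each re-dereferencing the list
# pointer, so the whole array is counted 4 * count times.
naive_words = 4 * count * array_words

# Cached rewrite: the list pointer is dereferenced once, so the array is
# counted once.
cached_words = array_words

limit = 2 ** 63
print(naive_words, cached_words, naive_words < limit)
# Even the quadratic cost (1.6e13 words here) stays far below the 2^63
# default limit, matching the point made above.
```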
> That said, you should definitely rewrite your code like this:
>
>     entries = message.entries
>     for i in range(0, count):
>         entry = entries[i]
>         assert entry.f0 == i
>         assert entry.f1 == 10 + i
>         assert entry.f2 == 20 + i
>         assert entry.f3 == 30 + i
>
> This avoids re-traversing the same pointer repeatedly, which not only
> helps avoid the traversal limit, but will also make your code considerably
> faster, because traversing a pointer is "sort of slow" given the need for
> bounds checking and FFI overhead.
>
> -Kenton

--
You received this message because you are subscribed to the Google Groups
"Cap'n Proto" group. To unsubscribe from this group and stop receiving
emails from it, send an email to [email protected].
Visit this group at https://groups.google.com/group/capnproto.
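[Editor's note: the effect of the caching rewrite can be demonstrated with a plain-Python stand-in that counts pointer traversals. `Message`, `Entry`, and the field names are hypothetical stand-ins for the real pycapnp reader objects, which behave analogously.]

```python
class Entry:
    """Stand-in for one struct in the flat array (fields f0..f3)."""
    def __init__(self, i):
        self.f0, self.f1, self.f2, self.f3 = i, 10 + i, 20 + i, 30 + i

class Message:
    """Stand-in for a message reader that counts list-pointer traversals."""
    def __init__(self, count):
        self._entries = [Entry(i) for i in range(count)]
        self.deref_count = 0   # models words charged to the traversal limit

    @property
    def entries(self):
        self.deref_count += 1  # each access models one list-pointer traversal
        return self._entries

count = 100
message = Message(count)

# Naive loop: four list-pointer traversals per element.
for i in range(count):
    assert message.entries[i].f0 == i
    assert message.entries[i].f1 == 10 + i
    assert message.entries[i].f2 == 20 + i
    assert message.entries[i].f3 == 30 + i
naive = message.deref_count    # 4 * count = 400 traversals

# Cached loop, as recommended above: one traversal total.
message.deref_count = 0
entries = message.entries
for i in range(count):
    entry = entries[i]
    assert entry.f0 == i
    assert entry.f1 == 10 + i
    assert entry.f2 == 20 + i
    assert entry.f3 == 30 + i
cached = message.deref_count   # 1 traversal

print(naive, cached)  # 400 1
```

With real pycapnp readers the same hoisting applies, and it also avoids the per-access bounds-checking and FFI overhead mentioned above.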
