I found a solution that I think I can live with:
https://dwrensha.github.io/capnproto-rust/2020/01/19/new-feature-to-allow-unaligned-buffers.html

On Tue, Jan 14, 2020 at 6:45 PM David Renshaw <[email protected]> wrote:

> Ah, indeed it does! Now I’m feeling silly for failing at reading
> comprehension. :)
>
> On Tue, Jan 14, 2020 at 6:33 PM Kenton Varda <[email protected]>
> wrote:
>
>> UnalignedFlatArrayMessageReader's own doc comment mentions this fact. :)
>>
>> FWIW I don't actually recommend using that class, but I was convinced to
>> add it when enough people demanded it.
>>
>>
>> -Kenton
>>
>> On Tue, Jan 14, 2020 at 5:13 PM David Renshaw <[email protected]>
>> wrote:
>>
>>> Ralf Jung, who knows a lot about software verification, suggests that
>>> capnproto-c++'s UnalignedFlatArrayMessageReader might cause undefined
>>> behavior even on x86_64:
>>>
>>> https://www.reddit.com/r/rust/comments/en9fmn/should_capnprotorust_force_users_to_worry_about/fedi5hk/?context=8&depth=9
>>>
>>> On Sat, Jan 11, 2020 at 11:11 AM David Renshaw <[email protected]>
>>> wrote:
>>>
>>>> Thanks for the feedback!
>>>>
>>>> I figured out how to get rustc to emit assembly for a variety of
>>>> targets. Results are in this blog post:
>>>> https://dwrensha.github.io/capnproto-rust/2020/01/11/unaligned-memory-access.html
>>>>
>>>> I don't think there's any case in which the extra copy will actually be
>>>> an out-of-line memcpy function call.
>>>>
>>>> - David
>>>>
>>>> On Fri, Jan 10, 2020 at 10:25 AM Kenton Varda <[email protected]>
>>>> wrote:
>>>>
>>>>> First, make sure you add the -O2 compiler option in godbolt, so that
>>>>> these are actually optimized. If you do that, `direct()` becomes two
>>>>> instructions (on both architectures), while `indirect()` on ARM is still 9
>>>>> instructions.
>>>>>
>>>>> It's true that on x86_64, this change will have no negative impact, as
>>>>> you observed. But that's specifically because x86_64 supports unaligned
>>>>> reads and writes, and so on this platform you don't actually need to
>>>>> change anything to support unaligned buffers.
>>>>>
>>>>> On ARM, your example is generating an out-of-line function call to
>>>>> memcpy. I could be wrong, but I think this will be heavier than you are
>>>>> imagining. There are three issues:
>>>>>
>>>>> - The function call itself takes several instructions.
>>>>> - An out-of-line function call will force the compiler to be more
>>>>> conservative about optimizations around it. When a getter is inlined into
>>>>> a larger function body, this could lead to a lot more overhead than is
>>>>> visible in the godbolt example. For example, caller-saved registers used
>>>>> by that outer function would need to be saved and restored around each
>>>>> call.
>>>>> - The glibc implementation of memcpy() itself needs to be designed to
>>>>> handle any size of memcpy, and is optimized for larger, variable-sized
>>>>> copies, since small fixed copies would normally be inlined. Several
>>>>> branches will be needed even for a small copy.
>>>>>
>>>>> Here's the code:
>>>>> https://github.com/lattera/glibc/blob/master/string/memcpy.c
>>>>> And macros it depends on:
>>>>> https://github.com/lattera/glibc/blob/master/sysdeps/generic/memcopy.h
>>>>>
>>>>> It's hard to say how much effect all this would really have, but it
>>>>> would make me uncomfortable.
>>>>>
>>>>> But it might not be too hard to convince the compiler to generate a
>>>>> fixed sequence of byte copies, rather than a memcpy call. That could be a
>>>>> lot better. I'm kind of surprised that GCC doesn't optimize it this way
>>>>> automatically, TBH.
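A minimal Rust sketch of the "fixed sequence of byte copies" idea mentioned above. The function name `read_u32_bytewise` is hypothetical and is not part of capnproto-rust or capnproto-c++. It assumes the value is stored little-endian, as Cap'n Proto data is. Whether a given compiler turns this into byte loads, a single unaligned load, or something else depends on the target and optimization level, so the generated assembly should still be checked.

```rust
/// Hypothetical helper: reads a little-endian u32 at `offset` using four
/// explicit single-byte copies, so there is no variable-length copy for the
/// compiler to lower into an out-of-line memcpy call.
/// Panics if fewer than four bytes are available at `offset`.
pub fn read_u32_bytewise(buf: &[u8], offset: usize) -> u32 {
    // Slicing performs the bounds check once for all four bytes.
    let src = &buf[offset..offset + 4];
    let mut tmp = [0u8; 4];
    tmp[0] = src[0];
    tmp[1] = src[1];
    tmp[2] = src[2];
    tmp[3] = src[3];
    // Decode the local copy; the local array has no alignment requirement.
    u32::from_le_bytes(tmp)
}
```

Calling `read_u32_bytewise(&buf, 1)` reads from an odd offset without ever forming a `*const u32` into the buffer.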
>>>>>
>>>>> BTW it looks like arm64 gets optimized to an unaligned load just like
>>>>> x86_64. So the future seems to be one where we don't need to worry about
>>>>> alignment anymore. Maybe that's a good argument for going ahead with this
>>>>> approach now.
>>>>>
>>>>> -Kenton
>>>>>
>>>>> On Thu, Jan 9, 2020 at 10:03 PM David Renshaw <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I want to make it easy and safe for users of capnproto-rust to read
>>>>>> messages from unaligned buffers without copying.  (See this github
>>>>>> issue <https://github.com/capnproto/capnproto-rust/issues/101>.)
>>>>>>
>>>>>> Currently, a user must pass their unaligned buffer through unsafe fn
>>>>>> bytes_to_words()
>>>>>> <https://github.com/capnproto/capnproto-rust/blob/d1988731887b2bbb0ccb35c68b9292d98f317a48/capnp/src/lib.rs#L82-L88>,
>>>>>> asserting that they believe their hardware to be okay with unaligned
>>>>>> reads. In other words, we require that the user understand some tricky
>>>>>> low-level processor details, and that the user preclude their software
>>>>>> from running on many platforms.
>>>>>>
>>>>>> (With libraries like sqlite, zmq, redis, and many others, there
>>>>>> simply is no way to request that a buffer be aligned -- you are just
>>>>>> given an array of bytes. You can copy the bytes into an aligned buffer,
>>>>>> but that has a performance cost and a complexity cost (who owns the new
>>>>>> buffer?).)
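For illustration, the copy-into-an-aligned-buffer workaround described above might look roughly like the following sketch. The name `copy_to_aligned_words` is hypothetical and is not an API of capnproto-rust. A `Vec<u64>` is used only because its storage is aligned for `u64`. The new vector owns its data, which is exactly the extra allocation and ownership question raised in the paragraph.

```rust
/// Hypothetical helper: copies a byte slice into freshly allocated,
/// u64-aligned storage. Returns None if the length is not a whole
/// number of 8-byte words.
pub fn copy_to_aligned_words(bytes: &[u8]) -> Option<Vec<u64>> {
    if bytes.len() % 8 != 0 {
        return None;
    }
    let words = bytes
        .chunks_exact(8)
        .map(|chunk| {
            let mut word = [0u8; 8];
            word.copy_from_slice(chunk);
            // Native-endian conversion keeps the in-memory byte layout
            // identical to the source buffer.
            u64::from_ne_bytes(word)
        })
        .collect();
    Some(words)
}
```

The cost is one allocation plus a full pass over the message, which is what reading directly from the unaligned buffer would avoid.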
>>>>>>
>>>>>> I believe that it would be better for capnproto-rust to work natively
>>>>>> on unaligned buffers. In fact, I have a work-in-progress branch that
>>>>>> achieves this, essentially by changing a bunch of direct memory accesses
>>>>>> into tiny memcpy() calls. This C++ godbolt snippet
>>>>>> <https://godbolt.org/z/Wki7uy> captures the main idea, and shows
>>>>>> that, on x86_64 at least, the extra indirection gets optimized away
>>>>>> completely. Indeed, my performance measurements so far support the
>>>>>> hypothesis that there will be no performance cost in the x86_64 case. For
>>>>>> processors that don't support unaligned access, the extra copy will still
>>>>>> be there (e.g. https://godbolt.org/z/qgsGMT), but I hypothesize that
>>>>>> it will be fast.
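A rough Rust sketch of the "tiny memcpy()" idea from the paragraph above, for readers who prefer it to the C++ godbolt snippet. It is not the code from the work-in-progress branch. The names `read_u32_memcpy` and `read_u32_checked` are hypothetical, and little-endian encoding is assumed because that is what Cap'n Proto uses on the wire.

```rust
use std::convert::TryInto;
use std::ptr;

/// Hypothetical helper: reads a little-endian u32 starting at `offset` by
/// copying four bytes into a local variable instead of dereferencing a
/// `*const u32`. No alignment of `buf` is assumed.
pub fn read_u32_memcpy(buf: &[u8], offset: usize) -> u32 {
    // Bounds check up front so the raw copy below stays in range.
    let end = offset.checked_add(4).expect("offset overflow");
    assert!(end <= buf.len(), "read past end of buffer");
    let mut value: u32 = 0;
    unsafe {
        // Byte-wise copy into the aligned local; this is the Rust
        // counterpart of a 4-byte memcpy.
        ptr::copy_nonoverlapping(
            buf.as_ptr().add(offset),
            &mut value as *mut u32 as *mut u8,
            4,
        );
    }
    // The copied bytes are little-endian; convert to native order.
    u32::from_le(value)
}

/// The same read in safe Rust: slice, convert to a fixed-size array, decode.
pub fn read_u32_checked(buf: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let bytes: [u8; 4] = buf.get(offset..end)?.try_into().ok()?;
    Some(u32::from_le_bytes(bytes))
}
```

Both functions accept any offset, odd ones included. How cheaply each compiles on a given target is the open question in this thread, so neither should be read as a performance claim.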
>>>>>>
>>>>>> All in all, this change seems to me like a big usability win. So I'm
>>>>>> wondering: have I missed anything in the above analysis? Are there good
>>>>>> reasons I shouldn't make the change?
>>>>>>
>>>>>> - David
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "Cap'n Proto" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> To view this discussion on the web visit
>>>>>> https://groups.google.com/d/msgid/capnproto/CABR6rW-JpiJntc0i7O4cVywzfvd2YnVp89BgYeJp_Gwzoc_Edg%40mail.gmail.com
>>>>>> <https://groups.google.com/d/msgid/capnproto/CABR6rW-JpiJntc0i7O4cVywzfvd2YnVp89BgYeJp_Gwzoc_Edg%40mail.gmail.com?utm_medium=email&utm_source=footer>
>>>>>> .
>>>>>>
>>>>>

