From: Tom Herbert
Date: Sun, 3 Jan 2016 15:22:27 -0800
> Implement assembly routine for csum_partial for 64 bit x86. This
> primarily speeds up checksum calculation for smaller lengths such as
> those that are present when doing skb_postpull_rcsum when getting
> CHECKSUM_COMPLETE from device or a
On Mon, Jan 4, 2016 at 3:52 PM, Eric Dumazet wrote:
> On Mon, 2016-01-04 at 15:34 -0800, Tom Herbert wrote:
>> On Mon, Jan 4, 2016 at 2:36 PM, Eric Dumazet wrote:
>> > On Sun, 2016-01-03 at 15:22 -0800, Tom Herbert wrote:
>> > ...
>> >> +402: /* Length 2, align is 1, 3, or 5 */
>> >> + movb
On Mon, 2016-01-04 at 15:34 -0800, Tom Herbert wrote:
> On Mon, Jan 4, 2016 at 2:36 PM, Eric Dumazet wrote:
> > On Sun, 2016-01-03 at 15:22 -0800, Tom Herbert wrote:
> > ...
> >> +402: /* Length 2, align is 1, 3, or 5 */
> >> + movb (%rdi), %al
> >> + movb 1(%rdi), %ah
> >
> > Looks
On 05.01.2016 00:34, Tom Herbert wrote:
On Mon, Jan 4, 2016 at 2:36 PM, Eric Dumazet wrote:
On Sun, 2016-01-03 at 15:22 -0800, Tom Herbert wrote:
...
+402: /* Length 2, align is 1, 3, or 5 */
+ movb (%rdi), %al
+ movb 1(%rdi), %ah
Looks like a movw (%rdi),%ax
Wouldn't that b
On Mon, Jan 4, 2016 at 2:36 PM, Eric Dumazet wrote:
> On Sun, 2016-01-03 at 15:22 -0800, Tom Herbert wrote:
> ...
>> +402: /* Length 2, align is 1, 3, or 5 */
>> + movb (%rdi), %al
>> + movb 1(%rdi), %ah
>
> Looks like a movw (%rdi),%ax
>
Wouldn't that be an unaligned access?
>
> A
On Sun, 2016-01-03 at 15:22 -0800, Tom Herbert wrote:
...
> +402: /* Length 2, align is 1, 3, or 5 */
> + movb (%rdi), %al
> + movb 1(%rdi), %ah
Looks like a movw (%rdi),%ax
Also you probably should send this patch to x86 maintainers.
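
For readers following the movb versus movw question, here is a minimal C sketch of the two load strategies. It is not part of the patch, and the function names load16_bytes and load16_word are hypothetical. On little-endian x86-64 both read the same 16-bit value from an odd address, and ordinary x86 loads are generally allowed to be unaligned, which seems to be the point behind the movw suggestion.

```c
#include <stdint.h>
#include <string.h>

/* Two byte loads, mirroring the quoted movb pair:
 * low byte from p[0] (%al), high byte from p[1] (%ah). */
static uint16_t load16_bytes(const unsigned char *p)
{
	return (uint16_t)(p[0] | ((unsigned)p[1] << 8));
}

/* One 16-bit load at any alignment. memcpy keeps this well defined in C;
 * compilers for x86-64 typically turn it into a single 16-bit move. */
static uint16_t load16_word(const unsigned char *p)
{
	uint16_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```

The two helpers agree only on a little-endian machine; on a big-endian one load16_word would return the bytes in the opposite order.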
On 04.01.2016 00:22, Tom Herbert wrote:
Implement assembly routine for csum_partial for 64 bit x86. This
primarily speeds up checksum calculation for smaller lengths such as
those that are present when doing skb_postpull_rcsum when getting
CHECKSUM_COMPLETE from device or after CHECKSUM_UNNECESSA
Implement assembly routine for csum_partial for 64 bit x86. This
primarily speeds up checksum calculation for smaller lengths such as
those that are present when doing skb_postpull_rcsum when getting
CHECKSUM_COMPLETE from device or after CHECKSUM_UNNECESSARY
conversion.
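
As background on what such a routine computes, below is a rough reference sketch in C of a ones' complement partial sum. It is not the assembly from this patch and makes no attempt at speed; csum_partial_ref and csum_fold_ref are hypothetical names, and the byte order is assumed to be little-endian as on x86. Unlike the kernel's csum_fold, the fold helper here does not complement the result.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference: add 16-bit little-endian words into a wide
 * accumulator, starting from an existing 32-bit partial sum. */
static uint32_t csum_partial_ref(const unsigned char *buf, size_t len,
				 uint32_t sum)
{
	uint64_t acc = sum;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		acc += (uint32_t)buf[i] | ((uint32_t)buf[i + 1] << 8);
	if (len & 1)
		acc += buf[len - 1];	/* low half of a zero-padded word */

	/* Fold carries above bit 31 back in (end-around carry). */
	acc = (acc & 0xffffffffu) + (acc >> 32);
	acc = (acc & 0xffffffffu) + (acc >> 32);
	return (uint32_t)acc;
}

/* Fold a 32-bit partial sum down to a 16-bit ones' complement sum
 * (no final complement, unlike the kernel helper). */
static uint16_t csum_fold_ref(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)sum;
}
```

An optimized routine may return a different 32-bit intermediate value than this sketch for the same input; what has to match is the result after folding to 16 bits.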
This implementation is sim