2013/8/9 Florian Fainelli <f.faine...@gmail.com>:
> I am looking at bgmac_dma_rx_read() and I do not quite understand why
> you would need to copy data to the newly allocated SKB as it might
> really be killing performance here. Looking at b44, the code path
> doing this is just when the packet is smaller (say less than 256
> bytes) because in that case, the cost of a data cache invalidate might
> be higher than a fresh allocation plus memcpy(). Rather, the logic I
> would use is the following:
>
> - consume a packet from the DMA RX ring at a given index
> - dma_sync_single_for_cpu() this packet
> - call netif_receive_skb() for this packet
> - allocate a new SKB for the same RX ring index
>
> Eventually if you realize that for small packets you had better do a
> new allocation plus memcpy() (aka: copybreak) you could try that.

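
For reference, the receive flow suggested in the quote, including the optional copybreak, might look roughly like the userspace model below. This is only a sketch under stated assumptions: rx_slot, rx_pkt, rx_consume, RX_BUF_SIZE and COPYBREAK are hypothetical names, malloc() stands in for SKB allocation, and the real kernel calls (dma_sync_single_for_cpu(), netif_receive_skb()) appear only as comments. The 256-byte threshold is taken from the b44 remark, not from bgmac.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define RX_BUF_SIZE 2048  /* hypothetical per-slot buffer size */
#define COPYBREAK   256   /* threshold borrowed from the b44 remark */

/* Hypothetical stand-in for one RX ring slot and its buffer. */
struct rx_slot {
    unsigned char *buf;
};

/* Stand-in for a packet passed up the stack; the receiver frees data. */
struct rx_pkt {
    unsigned char *data;
    size_t len;
    int copied;  /* 1 if the copybreak path was taken */
};

/*
 * Consume the packet of length len sitting in slot.
 * Returns 0 on success, -1 if allocation fails; on failure the slot
 * keeps its buffer, so the ring never ends up with an empty entry.
 */
static int rx_consume(struct rx_slot *slot, size_t len, struct rx_pkt *out)
{
    unsigned char *fresh;

    assert(len <= RX_BUF_SIZE);

    /* A driver would call dma_sync_single_for_cpu() here. */

    if (len < COPYBREAK) {
        /* Small packet: copy into a right-sized buffer, reuse the slot. */
        fresh = malloc(len ? len : 1);
        if (!fresh)
            return -1;
        memcpy(fresh, slot->buf, len);
        out->data = fresh;
        out->copied = 1;
    } else {
        /* Large packet: hand the buffer up and refill the slot. */
        fresh = malloc(RX_BUF_SIZE);
        if (!fresh)
            return -1;
        out->data = slot->buf;
        slot->buf = fresh;
        out->copied = 0;
    }
    out->len = len;

    /* A driver would call netif_receive_skb() on the packet here. */
    return 0;
}
```

The model allocates the replacement buffer before handing the old one up, so a failed allocation leaves the ring intact. That ordering is a choice of this sketch rather than something stated in the quote.
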
I've implemented that solution, but it didn't really help much :(

Tx try:

# readprofile -r; iperf -t 60 -c 192.168.1.218; readprofile | sort -nr
------------------------------------------------------------
Client connecting to 192.168.1.218, TCP port 5001
TCP window size: 20.5 KByte (default)
------------------------------------------------------------
[  3] local 192.168.1.1 port 38653 connected with 192.168.1.218 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0-60.0 sec   614 MBytes  85.8 Mbits/sec
 11814 total                      0.0046
  3179 *unknown*
  1119 __do_softirq               2.5665
  1037 __copy_user_common         1.4899
   550 csum_partial               0.3852
   388 tcp_transmit_skb           0.1537
   336 tcp_sendmsg                0.0844
   271 dev_hard_start_xmit        0.1656
   247 nf_hook_slow               0.6938
   237 r4k_dma_cache_wback_inv    1.0972
   222 bgmac_poll                 0.1888
   183 kmem_cache_alloc           0.6267
   174 bgmac_start_xmit           0.1570
   172 __kmalloc                  0.4886
   163 tcp_v4_rcv                 0.0619
   162 tcp_write_xmit             0.0568
   161 dev_queue_xmit             0.1227

__copy_user_common is still appearing, and I still have no idea about
that *unknown*.

Also no real improvement for Rx:

# readprofile -r; iperf -s; readprofile | sort -nr
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 85.3 KByte (default)
------------------------------------------------------------
[  4] local 192.168.1.1 port 5001 connected with 192.168.1.218 port 56297
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-60.0 sec  1.25 GBytes   178 Mbits/sec
^C
 11302 total                      0.0044
  4132 *unknown*
  1860 csum_partial               1.3025
  1720 __copy_user_common         2.4713
   586 tcp_v4_rcv                 0.2226
   480 r4k_dma_cache_inv          2.0339
   479 ip_rcv                     0.5162
   442 cpu_idle                   4.6042
   394 nf_hook_slow               1.1067
   376 skb_copy_ubufs             0.7705
   329 __netif_receive_skb        0.1685
   266 ip_local_deliver_finish    0.4890
   252 tcp_rcv_established        0.1620
   234 __bzero                    0.6573
   222 bgmac_poll                 0.1888
   172 process_backlog            0.3739
   164 mips_dma_map_page          0.9318

I'll post my patches soon, so you can verify my changes.

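
To put the tick counts in proportion, each row can be expressed as a share of the "total" line. The helper below is a small illustrative sketch, not part of readprofile: share_permille, print_shares and the prof_row type are hypothetical names, and the tables only copy a few of the top rows from the two profiles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One readprofile row: symbol name and tick count. */
struct prof_row {
    const char *sym;
    unsigned long ticks;
};

/* Share of the total in tenths of a percent, rounded to nearest. */
static unsigned long share_permille(unsigned long ticks, unsigned long total)
{
    assert(total > 0);
    return (ticks * 1000 + total / 2) / total;
}

/* Top rows copied from the Tx and Rx profiles. */
static const struct prof_row tx_rows[] = {
    { "*unknown*",          3179 },
    { "__do_softirq",       1119 },
    { "__copy_user_common", 1037 },
    { "csum_partial",        550 },
};

static const struct prof_row rx_rows[] = {
    { "*unknown*",          4132 },
    { "csum_partial",       1860 },
    { "__copy_user_common", 1720 },
};

/* Print each row with its percentage of the given total. */
static void print_shares(const char *label, const struct prof_row *rows,
                         size_t n, unsigned long total)
{
    size_t i;

    printf("%s (total %lu ticks)\n", label, total);
    for (i = 0; i < n; i++) {
        unsigned long pm = share_permille(rows[i].ticks, total);

        printf("  %-20s %5lu  %lu.%lu%%\n",
               rows[i].sym, rows[i].ticks, pm / 10, pm % 10);
    }
}
```

Calling print_shares("Tx", tx_rows, 4, 11814) and print_shares("Rx", rx_rows, 3, 11302) from a main() puts *unknown* at roughly 27% of the Tx ticks and 37% of the Rx ticks, with __copy_user_common at about 9% and 15% respectively.
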
--
Rafał