[dpdk-dev] Who can correct me about 82599 RSS Hash Function

2013-12-13 Thread chen_lp

Thank you very much.
With your correction, the problem has been resolved.


On 2013/12/12 23:17, Vladimir Medvedkin wrote:
> Hi,
>
> First, I hope you configure
> port_conf->rx_adv_conf.rss_conf.rss_key and .rss_hf
> properly.
> Secondly,
>
> -for(j=0;j<8;j++){
> +for(j=7;j>=0;j--){
>
>
> Regards,
> Vladimir
>
> 2013/12/11 chen_lp at neusoft.com:
>
>
> I want to calculate the NIC RSS hash result in software, but the
> result is not right and I don't know where I went wrong.
>
>
> struct mbf_cb{
> uint32_t sip;
> uint32_t dip;
> uint16_t sport;
> uint16_t dport;
> };
>
> static uint8_t test_rss[]={
> 0x6d,0x5a,0x56,0xda,0x25,0x5b,0x0e,0xc2,
> 0x41,0x67,0x25,0x3d,0x43,0xa3,0x8f,0xb0,
> 0xd0,0xca,0x2b,0xcb,0xae,0x7b,0x30,0xb4,
> 0x77,0xcb,0x2d,0xa3,0x80,0x30,0xf2,0x0c,
> 0x6a,0x42,0xb7,0x3b,0xbe,0xac,0x01,0xfa,
> };
>
> static uint8_t input_mask[]={
> 0x01,0x02,0x04,0x08,
> 0x10,0x20,0x40,0x80,
> };
>
> mcb.sip=rte_cpu_to_be_32(IPv4(66,9,149,187));
> mcb.dip=rte_cpu_to_be_32(IPv4(161,142,100,80));
> mcb.sport=rte_cpu_to_be_16(2794);
> mcb.dport=rte_cpu_to_be_16(1766);
>
>
> uint32_t compute_hash(uint8_t *input, int n)
> {
> int i,j,k;
> uint32_t result=0;
> uint32_t *lk;
> uint8_t rss_key[40];
>
> memcpy(rss_key,test_rss,40);
>
> lk=(uint32_t *)rss_key;
> for(i=0;i<n;i++){
> for(j=0;j<8;j++){
> if((input_mask[j])&input[i]){
> result^=*lk;
> }
>
> // shift the 40-byte key left by 1 bit
> rss_key[0]=rss_key[0]<<1;
> for(k=1;k<40;k++){
> if(rss_key[k]&0x80){
> rss_key[k-1]|=0x01;
> }
> rss_key[k]=rss_key[k]<<1;
> }
> }
> }
> return result;
> }
>
> printf("rss_hash=%#x\n",compute_hash((uint8_t *)&mcb,sizeof(struct
> mbf_cb)));
>
> rss_hash=0x57476eca
> but the correct result is 0x51ccc178
>
>
>
>
>
>
>
> 
> ---
> Confidentiality Notice: The information contained in this e-mail
> and any accompanying attachment(s)
> is intended only for the use of the intended recipient and may be
> confidential and/or privileged of
> Neusoft Corporation, its subsidiaries and/or its affiliates. If
> any reader of this communication is
> not the intended recipient, unauthorized use, forwarding,
> printing,  storing, disclosure or copying
> is strictly prohibited, and may be unlawful.If you have received
> this communication in error,please
> immediately notify the sender by return e-mail, and delete the
> original message and all copies from
> your system. Thank you.
> 
> ---
>
>



[dpdk-dev] kni vs. pmd

2013-12-13 Thread Jose Gavine Cueto
Hi Pashupati,

Thanks for mentioning the extra copy. But I couldn't quite grasp "I look at
KNI as more for control path operation and PMDs for data path".
Could you please give a simple example if you have time?

Thanks,
Pepe


On Fri, Dec 13, 2013 at 7:07 AM, Pashupati Kumar  wrote:

> > -Original Message-
> > From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Jose Gavine Cueto
> > Sent: Tuesday, December 10, 2013 3:16 PM
> > To: dev at dpdk.org
> > Subject: Re: [dpdk-dev] kni vs. pmd
> >
> > Additional question:
> >
> > Apart from the possible fact that KNI performs zero-copy in the driver
> > layer, does this also apply at the sockets layer, or are the socket
> > operations (+ syscalls) not avoided? This is assuming that the
> > application uses regular sockets to read/write to KNIs.
If you are going to use KNI, there is a copy involved from iovec to RTE
mbuf memory (assuming you are going to use the ring library for communication
between the DPDK application and KNI). I look at KNI as more for control path
operation and PMDs for data path.
> >
> > Cheers,
> > Pepe
> >
> >
> > On Wed, Dec 11, 2013 at 7:12 AM, Jose Gavine Cueto
> > wrote:
> >
> > > Hi,
> > >
> > > Correct me if I'm wrong, but from a high-level perspective I see that
> > > KNI provides an option for applications to use their regular
> > > interfaces (e.g. sockets) and abstracts the usage of PMDs.
> > >
> > > If this is somehow correct, are there any performance differences
> > > between directly using the PMD APIs and using KNI?
> > >
> > > I see that KNI is easier to use; however, at first look (without
> > > code inspection), it interfaces with the kernel, which might
> > > introduce some overhead.
> > >
> > > Cheers,
> > > Pepe
> > >
> > >
> > > --
> > > To stop learning is like to stop loving.
> > >
> >
> >
> >
> > --
> > To stop learning is like to stop loving.
>
> Thanks
> Pash
>
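A minimal sketch of the split Pash describes, under stated assumptions: port 0 / queue 0, an already-initialized struct rte_kni *kni, and hypothetical application helpers is_data_packet() and forward(). rte_eth_rx_burst(), rte_eth_tx_burst(), rte_kni_rx_burst() and rte_kni_tx_burst() are real DPDK calls; everything else is illustrative, not a definitive implementation:

```c
#include <rte_ethdev.h>
#include <rte_kni.h>
#include <rte_mbuf.h>

#define BURST 32

extern int  is_data_packet(struct rte_mbuf *m);  /* hypothetical classifier */
extern void forward(struct rte_mbuf *m);         /* hypothetical fast path  */

static void lcore_loop(struct rte_kni *kni)
{
    struct rte_mbuf *pkts[BURST];
    unsigned i, n;

    for (;;) {
        /* Data path: poll the NIC directly through the PMD. */
        n = rte_eth_rx_burst(0, 0, pkts, BURST);
        for (i = 0; i < n; i++) {
            if (is_data_packet(pkts[i]))
                forward(pkts[i]);                   /* stays in user space */
            else
                rte_kni_tx_burst(kni, &pkts[i], 1); /* punt ARP/ICMP/routing
                                                       traffic to the kernel */
        }
        /* Control path: anything the kernel stack wants to transmit. */
        n = rte_kni_rx_burst(kni, pkts, BURST);
        if (n > 0)
            rte_eth_tx_burst(0, 0, pkts, n);
    }
}
```

The data path never crosses the kernel; only the occasional control packet pays the KNI copy and the kernel round trip.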



-- 
To stop learning is like to stop loving.


[dpdk-dev] [dpdk-ovs] Problem with OVS

2013-12-13 Thread Paul Barrette

On 12/12/2013 05:25 PM, Romulo Rosa wrote:
> Hi,
>
> I'm trying to run OVS and I get an error. The documentation mentioned a
> possible solution for this problem, but it didn't work for me. Does anyone
> have any idea how to solve it?
> The uio module is loaded.
>
> *Command: *./datapath/dpdk/build/ovs_dpdk -c 0xf -n 4 -- -p 0xc
I think you have the wrong port mask.  Try using -p 0x3.

Pb
>   -n 2 -k 2
> --stats=1 --vswitchd=0 --client_switching_core=1 --config="(0,0,2),(1,0,3)"
>
> *Error: *
> EAL: Core 2 is ready (tid=43fd700)
> EAL: Core 3 is ready (tid=3bfc700)
> EAL: Core 1 is ready (tid=4bfe700)
> WARNING: requested port 2 not present - ignoring
> WARNING: requested port 3 not present - ignoring
> config = 16,0,2
> config = 17,0,3
> nb_cfg_params = 2
> Creating mbuf pool 'MProc_pktmbuf_pool' [14336 mbufs] ...
> HASH: Enabling CRC32 instruction for hashing
> APP: memzone address is 2ef33980
> EAL: Error - exiting with code: 1
>Cause: Cannot allocate memory for port tx_q details
>
> Thanks!



[dpdk-dev] outw() in virtio_ring_doorbell() in DPDK+virtio consume 40% of the CPU in oprofile

2013-12-13 Thread James Yu
I am using Spirent to send 2 Gbps of traffic to a 10G port that is looped
back by l2fwd+DPDK+virtio in a 32-bit CentOS guest, and I receive on the
other port only at 700 Mbps. The 32-bit CentOS guest runs on a Fedora 18 KVM
host. The virtual interfaces are configured as virtio port type, not e1000.
vhost-net was automatically used in qemu-kvm when virtio ports are used in
the guest.

The questions are:
A. Why can it only reach 2 Gbps?
B. Why does outw() account for 40% of the entire measurement when it only
writes 2 bytes to the I/O port using the assembly outw instruction? Is it a
blocking call? Or does it waste time mapping the I/O address of the guest to
the physical address of the I/O port on the host?
C. Is there any way to improve it?
D. The vmxnet PMD code uses memory-mapped I/O addresses, not port I/O
addresses. Would memory-mapped I/O be faster?

Any pointers or feedback will help.
Thanks

James

---
While the traffic is on, I run oprofile and opreport using the following
scripts in a separate xterm window.
1. ./oprofile_start.sh
2. wait for 10 seconds
3. ./oprofile_stop.sh
::
oprofile_start.sh
::
#!/bin/bash
opcontrol --reset
opcontrol --deinit
modprobe oprofile timer=1
opcontrol --no-vmlinux --separate=cpu,thread --callgraph=10 --separate=kernel
opcontrol --session-dir=/root
opcontrol --start

::
oprofile_stop.sh
::
opcontrol --dump
opcontrol --stop
opcontrol --shutdown
opreport --session-dir=/root --details --merge tgid --symbols /root/dpdk/dpdk-1.3.1r2/examples/l2fwd/build/l2fwd

Profiling through timer interrupt
vma  samples  %image name   symbol name
0d36 5445 40.1105  librte_pmd_virtio.so outw
  0d54 5442 99.9449
  0d55 3 0.0551
3032 3513 25.8785  librte_pmd_virtio.so virtio_recv_buf

---
static void outw_jyu1(unsigned short int value, unsigned short int __port){
  __asm__ __volatile__ ("outw %w0,%w1": :"a" (value), "Nd" (__port));
}
---
This link:
http://www.cs.nthu.edu.tw/~ychung/slides/Virtualization/VM-Lecture-2-3-IO%20Virtualization.pptx
(pages 17-22) describes how I/O ports can be accessed.


[dpdk-dev] outw() in virtio_ring_doorbell() in DPDK+virtio consume 40% of the CPU in oprofile

2013-12-13 Thread James Yu
Resending it due to missing [dpdk-dev] in the subject line.



[dpdk-dev] outw() in virtio_ring_doorbell() in DPDK+virtio consume 40% of the CPU in oprofile

2013-12-13 Thread Stephen Hemminger
On Fri, 13 Dec 2013 14:04:35 -0800
James Yu  wrote:

> Resending it due to missing [dpdk-dev] in the subject line.
> 
> I am using Spirent to send 2 Gbps of traffic to a 10G port that is looped
> back by l2fwd+DPDK+virtio in a 32-bit CentOS guest, and I receive on the
> other port only at 700 Mbps. The 32-bit CentOS guest runs on a Fedora 18
> KVM host. The virtual interfaces are configured as virtio port type, not
> e1000. vhost-net was automatically used in qemu-kvm when virtio ports are
> used in the guest.
> 
> The questions are:
> A. Why can it only reach 2 Gbps?
> B. Why does outw() account for 40% of the entire measurement when it only
> writes 2 bytes to the I/O port using the assembly outw instruction? Is it
> a blocking call? Or does it waste time mapping the I/O address of the
> guest to the physical address of the I/O port on the host?
> C. Is there any way to improve it?
> D. The vmxnet PMD code uses memory-mapped I/O addresses, not port I/O
> addresses. Would memory-mapped I/O be faster?
> 
> Any pointers or feedback will help.
> Thanks
> 
> James

The outw is a VM exit to the hypervisor. It informs the hypervisor that data
is ready to send, and the hypervisor then runs. To really get better
performance, virtio needs to be able to send multiple packets per doorbell.
For bulk throughput, GSO support would help, but that is a generic DPDK issue.

Virtio uses port I/O to signal the hypervisor (there is talk of using MMIO in
later versions, but it won't be faster).