On Fri, Apr 21, 2017 at 09:47:52AM +0000, Damjan Marion (damarion) wrote:
> 
> 
> > On 21 Apr 2017, at 04:10, Steven Luong (sluong) <slu...@cisco.com> wrote:
> > 
> > Eric,
> > 
> > How do you configure the startup.conf with multiple worker threads? Did you 
> > change both corelist-workers and workers? For example, this is how I 
> > configure 2 worker threads using core 2 and 14.
> > 
> >     corelist-workers 2,14
> >     workers 2
> > 
> > Any chance you can start vpp with gdb to get the backtrace to see where it 
> > went belly up?
> 
> Those 2 options are exclusive to each other, either corelist-workers or 
> workers should be used…
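> 
> For reference, the worker configuration lives in the cpu stanza of
> startup.conf, and exactly one of the two forms should be used (core
> numbers here are illustrative):
> 
>     cpu {
>       corelist-workers 2,14    # pin workers to specific cores
>     }
> 
> or
> 
>     cpu {
>       skip-cores 4             # optionally leave the first cores alone
>       workers 2                # let vpp choose cores for the workers
>     }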
> 
> 

Yeah...  I just told it to skip the first four cores, and selected "workers 2"

Will try to get gdb going this morning.  Damjan, are there other settings
that should be changed as well when adding more workers?
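
For reference, a minimal way to capture a backtrace, assuming the stock
paths from the packaged systemd unit (adjust if your install differs):

    sudo gdb --args /usr/bin/vpp -c /etc/vpp/startup.conf
    (gdb) run
    ... reproduce the crash ...
    (gdb) bt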

Eric


> > 
> > Steven
> > 
> > On 4/20/17, 5:32 PM, "Ernst, Eric" <eric.er...@intel.com> wrote:
> > 
> >    Makes sense, thanks Steven.
> > 
> >    One more round of questions -- I expected the numbers I got between the 
> > two VMs (~2 Gbps) given that I had just a single core running for VPP.  I 
> > went ahead and amended my startup.conf to use 2, and then 4, worker 
> > threads, all within the same socket.
> > 
> >    After booting the VMs and testing basic connectivity (ping!), I then 
> > either run ab against nginx, or just iperf between the VMs.  In either 
> > case, VPP crashes in short order.  Does this ring a bell?  I am still 
> > ramping up on VPP and understand I am likely making some wrong 
> > assumptions.  Guidance?
> > 
> >    With two workers:
> >    Apr 20 17:17:03 eernstworkstation systemd[1]: dev-disk-by\x2duuid-def55f66\x2d6b20\x2d47c6\x2da02f\x2dbdaf324ed3b7.device: Job dev-disk-by\x2duuid-def55f66\x2d6b20\x2d47c6\x2da02f\x2dbdaf324ed3b7.device/start timed out.
> >    Apr 20 17:17:03 eernstworkstation systemd[1]: Timed out waiting for device dev-disk-by\x2duuid-def55f66\x2d6b20\x2d47c6\x2da02f\x2dbdaf324ed3b7.device.
> >    Apr 20 17:17:03 eernstworkstation systemd[1]: Dependency failed for /dev/disk/by-uuid/def55f66-6b20-47c6-a02f-bdaf324ed3b7.
> >    Apr 20 17:17:03 eernstworkstation systemd[1]: dev-disk-by\x2duuid-def55f66\x2d6b20\x2d47c6\x2da02f\x2dbdaf324ed3b7.swap: Job dev-disk-by\x2duuid-def55f66\x2d6b20\x2d47c6\x2da02f\x2dbdaf324ed3b7.swap/start failed with result 'dependenc
> >    Apr 20 17:17:03 eernstworkstation systemd[1]: dev-disk-by\x2duuid-def55f66\x2d6b20\x2d47c6\x2da02f\x2dbdaf324ed3b7.device: Job dev-disk-by\x2duuid-def55f66\x2d6b20\x2d47c6\x2da02f\x2dbdaf324ed3b7.device/start failed with result 'timeo
> >    Apr 20 17:17:06 eernstworkstation vpp[38637]: /usr/bin/vpp[38637]: received signal SIGSEGV, PC 0x7f0d02b5b49c, faulting address 0x7f1cc12f5770
> >    Apr 20 17:17:06 eernstworkstation /usr/bin/vpp[38637]: received signal SIGSEGV, PC 0x7f0d02b5b49c, faulting address 0x7f1cc12f5770
> >    Apr 20 17:17:06 eernstworkstation systemd[1]: vpp.service: Main process exited, code=killed, status=6/ABRT
> >    Apr 20 17:17:06 eernstworkstation systemd[1]: vpp.service: Unit entered failed state.
> >    Apr 20 17:17:06 eernstworkstation systemd[1]: vpp.service: Failed with result 'signal'.
> >    Apr 20 17:17:06 eernstworkstation systemd[1]: vpp.service: Service hold-off time over, scheduling restart.
> > 
> >    Apr 20 17:17:06 eernstworkstation systemd[1]: Stopped vector packet processing engine.
> > 
> > 
> > 
> >    -----Original Message-----
> >    From: Steven Luong (sluong) [mailto:slu...@cisco.com] 
> >    Sent: Thursday, April 20, 2017 4:33 PM
> >    To: Ernst, Eric <eric.er...@intel.com>; Billy McFall <bmcf...@redhat.com>
> >    Cc: Damjan Marion (damarion) <damar...@cisco.com>; vpp-dev@lists.fd.io
> >    Subject: Re: [vpp-dev] Connectivity issue when using vhost-user on 17.04?
> > 
> >    Eric,
> > 
> >    In my testing, I see 2 to 3x better throughput when coalesce is 
> > disabled; I am using Ivy Bridge. So it looks like the mileage varies a lot 
> > by platform -- on your Sandy Bridge it is 40x better.
> > 
> >    What is coalesce?
> >    When the driver places descriptors into the vring, it may request an 
> > interrupt, or no interrupt, once the device is done processing the 
> > descriptors. If the driver wants an interrupt and coalesce is not enabled, 
> > the device may send it immediately. If coalesce is enabled, the device 
> > delays posting the interrupt until enough descriptors have been received 
> > to meet the coalesce number. This is an attempt to reduce the number of 
> > interrupts delivered to the driver. My guess is that when coalesce is 
> > enabled, the application, iperf3 in this case, does not shoot packets as 
> > fast as it can until it receives the interrupt for the packets already 
> > sent, so the total bandwidth number looks bad. By disabling coalesce, the 
> > application shoots a lot more packets in the same interval, at the expense 
> > of more interrupts being generated in the VM.
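> > 
> >    To make the mechanism concrete, here is a rough sketch of the 
> > device-side decision in C. This is illustrative only -- it is not the 
> > actual VPP vhost code, and the threshold name and value are made up:
> > 
> >        /* Illustrative sketch only, not the actual VPP vhost-user code. */
> >        #define COALESCE_FRAMES 32        /* hypothetical threshold */
> > 
> >        static unsigned pending_frames;   /* frames since last interrupt */
> > 
> >        /* Device-side decision after finishing a descriptor chain whose
> >         * driver requested an interrupt. */
> >        static int should_post_interrupt (int coalesce_enabled)
> >        {
> >          if (!coalesce_enabled)
> >            return 1;                     /* interrupt immediately */
> >          if (++pending_frames < COALESCE_FRAMES)
> >            return 0;                     /* defer: batch more frames */
> >          pending_frames = 0;
> >          return 1;                       /* threshold met: post it now */
> >        }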
> > 
> >    I don’t know why coalesce is enabled by default. This was done before I 
> > was born. Damjan or others may chime in on that, and on the answer to 2) 
> > as well. "show errors" is all I know.
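> > 
> >    As a rough starting point for 2), these are standard VPP debug CLI 
> > commands to watch, though which counters matter for vhost varies by 
> > release:
> > 
> >        vppctl show errors       # per-node error/drop counters
> >        vppctl clear errors      # reset counters before a test run
> >        vppctl show interface    # per-interface rx/tx and drop counters
> >        vppctl show runtime      # per-node vector rates and clocks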
> > 
> >    Steven
> > 
> >    On 4/20/17, 3:54 PM, "Ernst, Eric" <eric.er...@intel.com> wrote:
> > 
> >        Steven,
> > 
> >        Thanks for the help.  As before, the setup is described @ 
> > https://gist.github.com/egernst/5982ae6f0590cd83330faafacc3fd545 (updated, 
> > since I am no longer using the evil feature-mask).
> > 
> >        I'm going to need to read up on what the coalesce-frames setting is 
> > actually doing.... 
> > 
> >        Without it set, you can find my iperf3 output appended.  No 
> > retransmissions in the output, and no errors observed on the VPP side 
> > (that is, nothing notable in systemctl status vpp).
> > 
> >        When I set coalesce-frames to 0 I see *major* improvements -- 
> > getting in the ballpark of what I would expect for a single thread: about 
> > 2 Gbps.  Phew, a major relief.  A couple of things:
> >        1) Can you tell me more about what this is doing, and why this 
> > isn't the default?
> >        2) Is there a straightforward way to monitor VPP (particular 
> > counters) to identify where an issue like this lies?
> > 
> >        Thanks again!
> > 
> >        Cheers,
> >        Eric
> > 
> >        -------
> >        *Server*:
> >        # iperf3 -s
> >        -----------------------------------------------------------
> >        Server listening on 5201
> >        -----------------------------------------------------------
> >        Accepted connection from 192.168.0.2, port 41058
> >        [  5] local 192.168.0.1 port 5201 connected to 192.168.0.2 port 41060
> >        [ ID] Interval           Transfer     Bandwidth
> >        [  5]   0.00-1.00   sec  12.8 MBytes   107 Mbits/sec
> >        [  5]   1.00-2.00   sec  7.93 MBytes  66.5 Mbits/sec
> >        [  5]   2.00-3.00   sec  7.94 MBytes  66.6 Mbits/sec
> >        [  5]   3.00-4.00   sec  5.37 MBytes  45.0 Mbits/sec
> >        [  5]   4.00-5.00   sec  5.29 MBytes  44.4 Mbits/sec
> >        [  5]   5.00-6.00   sec  4.28 MBytes  35.9 Mbits/sec
> >        [  5]   6.00-7.00   sec  4.14 MBytes  34.8 Mbits/sec
> >        [  5]   7.00-8.00   sec  4.14 MBytes  34.7 Mbits/sec
> >        [  5]   8.00-9.00   sec  4.14 MBytes  34.8 Mbits/sec
> >        [  5]   9.00-10.00  sec  4.14 MBytes  34.7 Mbits/sec
> >        [  5]  10.00-10.03  sec   133 KBytes  34.9 Mbits/sec
> >        - - - - - - - - - - - - - - - - - - - - - - - - -
> >        [ ID] Interval           Transfer     Bandwidth
> >        [  5]   0.00-10.03  sec  0.00 Bytes  0.00 bits/sec                  sender
> >        [  5]   0.00-10.03  sec  60.3 MBytes  50.4 Mbits/sec                  receiver
> >        -----------------------------------------------------------
> >        Server listening on 5201
> >        -----------------------------------------------------------
> > 
> >        *Client*:
> >        # iperf3 -c 192.168.0.1
> >        Connecting to host 192.168.0.1, port 5201
> >        [  4] local 192.168.0.2 port 41060 connected to 192.168.0.1 port 5201
> >        [ ID] Interval           Transfer     Bandwidth       Retr  Cwnd
> >        [  4]   0.00-1.00   sec  13.8 MBytes   116 Mbits/sec    0   8.48 KBytes
> >        [  4]   1.00-2.00   sec  8.05 MBytes  67.5 Mbits/sec    0   8.48 KBytes
> >        [  4]   2.00-3.00   sec  7.74 MBytes  64.9 Mbits/sec    0   8.48 KBytes
> >        [  4]   3.00-4.00   sec  5.28 MBytes  44.3 Mbits/sec    0   5.66 KBytes
> >        [  4]   4.00-5.00   sec  5.28 MBytes  44.3 Mbits/sec    0   5.66 KBytes
> >        [  4]   5.00-6.00   sec  4.35 MBytes  36.5 Mbits/sec    0   5.66 KBytes
> >        [  4]   6.00-7.00   sec  4.04 MBytes  33.9 Mbits/sec    0   5.66 KBytes
> >        [  4]   7.00-8.00   sec  4.35 MBytes  36.5 Mbits/sec    0   5.66 KBytes
> >        [  4]   8.00-9.00   sec  4.04 MBytes  33.9 Mbits/sec    0   5.66 KBytes
> >        [  4]   9.00-10.00  sec  4.04 MBytes  33.9 Mbits/sec    0   5.66 KBytes
> >        - - - - - - - - - - - - - - - - - - - - - - - - -
> >        [ ID] Interval           Transfer     Bandwidth       Retr
> >        [  4]   0.00-10.00  sec  61.0 MBytes  51.2 Mbits/sec    0             sender
> >        [  4]   0.00-10.00  sec  60.3 MBytes  50.6 Mbits/sec                  receiver
> > 
> >        iperf Done.
> >        -----
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> >        -----Original Message-----
> >        From: Steven Luong (sluong) [mailto:slu...@cisco.com] 
> >        Sent: Thursday, April 20, 2017 3:05 PM
> >        To: Ernst, Eric <eric.er...@intel.com>; Billy McFall 
> > <bmcf...@redhat.com>
> >        Cc: Damjan Marion (damarion) <damar...@cisco.com>; 
> > vpp-dev@lists.fd.io
> >        Subject: Re: [vpp-dev] Connectivity issue when using vhost-user on 
> > 17.04?
> > 
> >        Eric,
> > 
> >        As a first step, please share the iperf3 output so we can see how 
> > many retransmissions you have for the run. On the VPP side, please collect 
> > "show errors" output to see if vhost drops anything. As an additional data 
> > point for comparison, please also try disabling vhost coalesce to see if 
> > you get a better result, by adding the following to /etc/vpp/startup.conf:
> > 
> >        vhost-user {
> >          coalesce-frames 0
> >        }
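> > 
> >        A restart is needed for the change to take effect; assuming the 
> > packaged systemd unit and stock CLI, something like:
> > 
> >            sudo systemctl restart vpp
> >            vppctl show vhost-user    # check that the setting took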
> > 
> >        Steven
> > 
> >        On 4/20/17, 2:19 PM, "vpp-dev-boun...@lists.fd.io on behalf of 
> > Ernst, Eric" <vpp-dev-boun...@lists.fd.io on behalf of 
> > eric.er...@intel.com> wrote:
> > 
> >            Thanks Billy - it was through some examples I had found that I 
> > ended up grabbing that.  I reinstalled 17.04 and can verify connectivity 
> > after removing the evil feature-mask.
> > 
> >            Thanks for the quick feedback, Damjan.  If we could only go back 
> > in time!  
> > 
> >            Now if I could just figure out why I'm getting capped bandwidth 
> > (via iperf) of ~45 Mbps between two VMs on the same socket of a Sandy 
> > Bridge Xeon, I'd be really happy!  If anyone has suggestions on debug 
> > methods for this, it'd be appreciated.  I see a huge difference when 
> > switching to OVS vhost-user, keeping all else the same.
> > 
> >            --Eric
> > 
> > 
> >            On Thu, Apr 20, 2017 at 04:29:23PM -0400, Billy McFall wrote:
> >> The vHost examples on the Wiki used a feature-mask of 0xFF; I think that
> >> is how it got propagated. In 16.09, when I did the CLI documentation for
> >> vHost, I expanded on what the bits mean and used feature-mask 0x40400000
> >> as the example. I will gladly add a comment indicating that the
> >> recommended usage is to leave it blank, since it was intended only for
> >> debugging.
> >> 
> >> https://docs.fd.io/vpp/17.07/clicmd_src_vnet_devices_virtio.html
> >> 
> >> Billy
> >> 
> >> On Thu, Apr 20, 2017 at 4:17 PM, Damjan Marion (damarion) <
> >> damar...@cisco.com> wrote:
> >> 
> >>> 
> >>> Eric,
> >>> 
> >>> A long time ago (I think 3+ years), when I wrote the original vhost-user
> >>> driver in vpp, I added the feature-mask knob to the CLI. It messes with
> >>> the feature bitmap and exists purely for debugging reasons.
> >>> 
> >>> And I have regretted it many times…
> >>> 
> >>> Somebody dug it out and documented it somewhere, for reasons unknown to
> >>> me. Now it spreads like a virus and I cannot stop it :)
> >>> 
> >>> So please don’t use it, it is evil….
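> >>> 
> >>> A clean invocation simply omits the knob, e.g. (assuming the 17.04 CLI
> >>> form; the socket path is illustrative):
> >>> 
> >>>     create vhost-user socket /tmp/vhost1.sock server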
> >>> 
> >>> Thanks,
> >>> 
> >>> Damjan
> >>> 
> >>>> On 20 Apr 2017, at 20:49, Ernst, Eric <eric.er...@intel.com> wrote:
> >>>> 
> >>>> All,
> >>>> 
> >>>> After updating the startup.conf to not reference DPDK, per the direction
> >>>> in the release notification thread, I was able to start up vpp and
> >>>> create interfaces.
> >>>> 
> >>>> Now that I'm testing, I noticed that I can no longer ping between VMs
> >>>> which make use of vhost-user interfaces and are connected via an l2
> >>>> bridge domain (nor via l2 xconnect).  I double checked, then reverted
> >>>> back to 17.01, where I could again verify connectivity between the
> >>>> guests.
> >>>> 
> >>>> Is anyone else seeing this, or was there a change in how this should be
> >>>> set up?  For reference, I have my (simple) setup described in a gist at
> >>>> [1].
> >>>> 
> >>>> Thanks,
> >>>> eric
> >>>> 
> >>>> 
> >>>> [1] - https://gist.github.com/egernst/5982ae6f0590cd83330faafacc3fd545
> >> 
> >> 
> >> 
> >> 
> >> -- 
> >> *Billy McFall*
> >> SDN Group
> >> Office of Technology
> >> *Red Hat*
> > 
> > 
> > 
> > 
> > 
> 
_______________________________________________
vpp-dev mailing list
vpp-dev@lists.fd.io
https://lists.fd.io/mailman/listinfo/vpp-dev
