Hi, Neale,

Thanks for the info; that solved the problem!

This might be worth documenting in the main section rather than under
"advanced", and possibly also in the README.md that accompanies
vppsb/router.

It would probably actually be more helpful if the parameter were an IPv4
prefix-count rather than a memory amount, as users will know the former but
not the latter.

The syntax might be something like "ip { max-prefix 1M }", with the
ip4_mtrie code working out the memory it needs from that count.

It would probably also be beneficial to have the "heapsize N" calculated
from the other values, if that is possible.
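
To make the idea concrete, here is a rough sketch of the kind of calculation I mean. This is not how VPP does it today: the function names, the 128-bytes-per-prefix figure and the 25% headroom are placeholders I made up, and the real per-prefix cost would have to be measured from the ip4_mtrie and FIB entry structures.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Placeholder figures for illustration only; NOT measured from VPP. */
#define ASSUMED_BYTES_PER_PREFIX 128ULL
#define ASSUMED_HEADROOM_PERCENT 25ULL

/* Parse a prefix count such as "50000", "750K" or "1M".
 * K and M are decimal multipliers (1e3, 1e6). Returns 0 if malformed. */
static uint64_t
parse_prefix_count (const char *s)
{
  char *end;
  uint64_t n = strtoull (s, &end, 10);

  if (end == s)
    return 0;                   /* no digits at all */
  if (*end == 'K' || *end == 'k')
    {
      n *= 1000ULL;
      end++;
    }
  else if (*end == 'M' || *end == 'm')
    {
      n *= 1000000ULL;
      end++;
    }
  return (*end == '\0') ? n : 0;        /* reject trailing junk */
}

/* Heap size in bytes for the given prefix count, under the assumed
 * per-prefix cost plus headroom. */
static uint64_t
heap_bytes_for_prefixes (uint64_t prefixes)
{
  const uint64_t scale = 100ULL + ASSUMED_HEADROOM_PERCENT;
  uint64_t base;

  /* Guard both multiplications against 64-bit overflow. */
  assert (prefixes <= UINT64_MAX / (ASSUMED_BYTES_PER_PREFIX * scale));
  base = prefixes * ASSUMED_BYTES_PER_PREFIX;
  return base * scale / 100ULL;
}
```

With those placeholder numbers, "max-prefix 1M" would work out to heap_bytes_for_prefixes (parse_prefix_count ("1M")), i.e. 160,000,000 bytes. The point is only that the user supplies a number they already know, and the code owns the conversion.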

Thanks again,
Brian

On Wed, Dec 5, 2018 at 11:47 AM Neale Ranns (nranns) <nra...@cisco.com>
wrote:

>
>
> Hi Brian,
>
>
>
> If you’re adding lots of routes, you’ll need to increase the heap size
> for the IP FIBs as well as the main heap:
>
>
> https://fdio-vpp.readthedocs.io/en/latest/gettingstarted/users/configuring/startup.html#ip
>
>
>
> to run in gdb:
>
>   sudo service vpp stop (or your OS equivalent)
>
>   make build
>
>   sudo gdb --args ./build-root/install-vpp_debug-native/vpp/bin/vpp -c
> <YOUR_CONF_FILE> plugin_path <PATH/TO/ALL/PLUGINS>
>
>
>
> hope that helps,
>
>
>
> /neale
>
> *From: *<vpp-dev@lists.fd.io> on behalf of Brian Dickson <
> brian.peter.dick...@gmail.com>
> *Date: *Wednesday, 5 December 2018 at 19:31
> *To: *"vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
> *Subject: *[vpp-dev] vnet crashes, and problems building debug version
> (was Re: netlink & router (vppsb or patch->vpp) - help building/running)
>
>
>
> Greetings again,
>
>
>
> Here is more context on the problem I'm seeing.
>
> The problem occurs if a large-ish number of IPv4 prefixes are added to the
> FIB (by way of the netlink and router plugin).
>
>
>
> If the prefix count is below some threshold (e.g. 50,000 prefixes), things
> work fine.
>
> At some prefix count (haven't narrowed it down to a specific number, but I
> don't think the actual number is relevant), vnet crashes, in a failure
> within ip4_mtrie.c.
>
>
>
> I have been trying to run in debug mode, but am having a lot of difficulty
> building everything with debug.
>
> Basically, the only way I can successfully build everything is to use the
> script vagrant/build.sh (which does a make pkg-rpm that generates a bunch
> of rpm files that I then install with yum).
>
> Then, I have to rebuild things using the instructions from
> vppsb/router/README.md (doing 4 symlinks and various make iterations, and
> THEN having to run some of those with a bunch of CFLAGS values just to get
> it to compile).
>
>
>
> I don't see any good/easy way to build debug images from this environment,
> without a LOT of work/investigation on how all the various build components
> work.
>
>
>
> Is the problem easy enough to diagnose from a non-symbolic stack dump, or
> can someone provide details on how to build and run vpp with everything to
> use gdb, including the plugins for netlink/router, so the problem can be
> further isolated?
>
>
>
> I think there's basically some kind of bug related to the fib stuff in
> vnet, that really needs to be fixed.
>
>
>
> The box has an unreasonably large amount of memory (128GB, doing nothing
> but VPP), and I get the same error even if I up the initial heap size by a
> factor of 2^12 (changing 32<<20 to 32ULL<<32).
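
For reference, the arithmetic behind that change can be checked with a small C sketch. The macro and function names are mine, not VPP's; the only values taken from the message are 32<<20 and 32ULL<<32. The ULL suffix matters: on a platform where int is 32 bits, a plain 32<<32 shifts by the full width of the type, which is undefined behaviour in C.

```c
#include <assert.h>
#include <stdint.h>

/* The two initial heap sizes mentioned above, as 64-bit values. */
#define OLD_HEAP_BYTES (32ULL << 20)    /* 2^25 bytes = 32 MiB  */
#define NEW_HEAP_BYTES (32ULL << 32)    /* 2^37 bytes = 128 GiB */

/* How many times larger the new heap is than the old one. */
static uint64_t
heap_growth_factor (void)
{
  return NEW_HEAP_BYTES / OLD_HEAP_BYTES;
}

/* Sanity check of the arithmetic quoted in the message. */
static void
check_heap_arithmetic (void)
{
  assert (OLD_HEAP_BYTES == 33554432ULL);
  assert (NEW_HEAP_BYTES == 137438953472ULL);
  assert (heap_growth_factor () == (1ULL << 12));
}
```

So the new value is 128 GiB, which is 4096 (2^12) times the original 32 MiB.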
>
>
>
> Please help.
>
>
>
> Brian
>
>
>
> (In the following, the buffer space message is likely a consequence of the
> thread handling netlink messages dying, rather than a cause.)
>
> Here are the log messages:
>
> Dec  4 17:08:14 sj2tldnslab09 vnet[19785]: dpdk_pool_create:535:
> ioctl(VFIO_IOMMU_MAP_DMA) pool 'dpdk_mbuf_pool_socket0': Inappropriate
> ioctl for device (errno 25)
>
> Dec  4 17:08:14 sj2tldnslab09 vnet[19785]: dpdk_ipsec_process:1026: not
> enough DPDK crypto resources, default to OpenSSL
>
> Dec  4 17:08:16 sj2tldnslab09 vnet[19785]: rtnl_ns_recv:403: Received
> notification while in sync. Restart synchronization.
>
> Dec  4 17:08:16 sj2tldnslab09 vnet[19785]: rtnl_process_read:467:
> rtnetlink recv error (31) []: Bad file descriptor
>
> Dec  4 17:08:58 sj2tldnslab09 vnet[19785]: rtnl_process_read:467:
> rtnetlink recv error (27) []: No buffer space available
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: rtnl_process_read:467:
> rtnetlink recv error (27) []: No buffer space available
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: received signal SIGABRT, PC
> 0x7f043c3c7277
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: #0  0x00007f043e5c18c5
> 0x7f043e5c18c5
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: #1  0x00007f043c9716d0
> 0x7f043c9716d0
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: #2  0x00007f043c3c7277 gsignal
> + 0x37
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: #3  0x00007f043c3c8968 abort +
> 0x148
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: #4  0x00005569eb7900d3
> 0x5569eb7900d3
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: #5  0x00007f043d0e8512
> vec_resize_allocate_memory + 0x2f2
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: #6  0x00007f043dd9809f
> 0x7f043dd9809f
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: #7  0x00007f043dd985cd
> ip4_fib_mtrie_route_add + 0x17d
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: #8  0x00007f043e129b08
> fib_entry_src_action_install + 0xb8
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: #9  0x00007f043e1274a0
> fib_entry_create + 0x70
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: #10 0x00007f043e11e890
> fib_table_entry_path_add2 + 0x190
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: #11 0x00007f03f86833fd
> add_del_route + 0x34c
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: #12 0x00007f03f8683594
> netns_notify_cb + 0x8c
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: #13 0x00007f03f8466e71
> netns_notify + 0x1f3
>
> Dec  4 17:09:07 sj2tldnslab09 vnet[19785]: #14 0x00007f03f84684ed
> ns_rcv_route + 0x825
>
>
>
> On Tue, Nov 27, 2018 at 6:17 PM Brian Dickson <
> brian.peter.dick...@gmail.com> wrote:
>
> I have been working with the netlink and router plugins, which I was able
> to build from the 18.07 tree via the instructions in vppsb/router.
>
>
>
> (NB: trying to build from anything more recent, e.g. 18.10 or 19.01,
> breaks, with no obvious easy resolution).
>
>
>
> When running with these plugins, connected with an open source router
> (bird version 1.6.4 or 2.02) and with a very small routing table, it works
> really really well.
>
>
>
> (I was able to run roughly line-rate 10g even with small packets, and when
> using a second host with vpp and the span->pg->pcap to /tmp, didn't lose
> any data.)
>
>
>
> However, when trying to load up the routing table, things went sideways,
> and it seems to be something netlink-related. (This was using BGP to feed in
> 3 copies of the full routing table, each copy of which is about 750K
> routes.)
>
>
>
> I was hoping someone could provide good instructions (good == tested and
> works) on building from a more recent release of VPP to see if it's an
> issue that has been fixed.
>
>
>
> If the issue persists and/or looks to be netlink-specific, would anyone be
> able to look into it? I'm happy to provide logs etc.
>
>
>
> System is bare metal centos7.5, tons of cores, memory, etc.
>
>
>
> The first few messages in syslog look like:
>
> Nov 27 17:57:30 sj2tldnslab09 bird: Kernel dropped some netlink messages,
> will resync on next scan.
>
> Nov 27 17:57:30 sj2tldnslab09 vnet[127960]: rtnl_process_read:467:
> rtnetlink recv error (27) []: No buffer space available
>
> Nov 27 17:57:30 sj2tldnslab09 vnet[127960]: rtnl_process_read:467:
> rtnetlink recv error (27) []: No buffer space available
>
>
>
> After a bunch of similar groups of messages, VPP appears to crash.
>
>
>
> If this is a known problem or if there's something that needs to be
> tweaked on the host, any assistance would be greatly appreciated.
>
>
>
> Brian
>
>