[dpdk-dev] [PATCH] eth_dev: make ether dev_ops const
-Original Message- >From: Stephen Hemminger [mailto:stephen at networkplumber.org] >Sent: Monday, April 06, 2015 11:05 AM >To: dev at dpdk.org >Subject: [dpdk-dev] [PATCH] eth_dev: make ether dev_ops const > >Ethernet device function tables should be immutable for correctness and >security. Special case for the test code driver. ... >diff --git a/app/test/virtual_pmd.c b/app/test/virtual_pmd.c index >f163562..f579558 100644 >--- a/app/test/virtual_pmd.c >+++ b/app/test/virtual_pmd.c ... >+/* This driver uses private mutable eth_dev_ops for each >+ * instance so it is safe to override const here. >+ */ >+#pragma GCC diagnostic push >+#pragma GCC diagnostic ignored "-Wcast-qual" > void > virtual_ethdev_start_fn_set_success(uint8_t port_id, uint8_t success) { > struct rte_eth_dev *vrtl_eth_dev = &rte_eth_devices[port_id]; >+ struct eth_dev_ops *dev_ops >+ = (struct eth_dev_ops *) vrtl_eth_dev->dev_ops; ... If this is really safe, then you should be able to accomplish it without disabling a bunch of protection. I suggest adding a pointer that isn't const to the private data block and adjusting the allocated dispatch table through that instead of through the pointer to the immutable dispatch table you've established in struct rte_eth_dev. That reinforces the fact that modifying the dispatch table is a private matter within the driver while showing structurally exactly why it's safe to change it. And it's not nearly so ugly. -don provan dprovan at bivio.net
[dpdk-dev] tools brainstorming
>NOTE Please avoid, as much as possible, including headers from other headers >file. Doing so should be properly explained and justified. Actually, I think a *failure* to #include other header files that this header file depends on should be what needs explained and justified. It drives me crazy when a header file forces me to figure out what header files it depends on and then forces me to include them in my sources even though my sources don't use them. Especially when #include is such a straightforward way to document the dependency while keeping me out of it. Or are you only talking about when the header file doesn't depend on the header files its #including? If so, then I'd prefer ruling it out entirely rather than saying it needs to be justified. >For consistency, getopt(3) should be used to parse options. I assume this means the getopt() family, and I'd expect getopt_long() to be used normally. (If it were me, I'd *encourage* getopt_long() over getopt().) >not:: > if (!*p) I'm not sure why you'd bother to rule out this common idiom or the similar NULL pointer check. Are "if (*p)" and "if (p)" also prohibited, or just their negations? >Values in return statements should be enclosed in parentheses. Please don't encourage people to have this silly habit. It makes no more sense than insisting variables be set with "x = (-1)". >static void > usage() This has nothing to do with the point being made here in your document, but surely you want to insist on "static void usage(void)", right? In fact, you might mention parameterless functions explicitly in the section on function declarations. Everything else looks pretty cool. I'm surprised and impressed. -don provan
[dpdk-dev] [PATCH v2 1/2] mbuf: provide rte_pktmbuf_alloc_bulk API
>From: Xie, Huawei [mailto:huawei.xie at intel.com] >Sent: Monday, December 21, 2015 7:22 AM >Subject: Re: [dpdk-dev] [PATCH v2 1/2] mbuf: provide rte_pktmbuf_alloc_bulk API > >The loop unwinding could give performance gain. The only problem is the >switch/loop >combination makes people feel weird at the first glance but soon they will >grasp this style. >Since this is inherited from old famous duff's device, i prefer to keep this >style which saves >lines of code. You don't really mean "lines of code", of course, since it increases the lines of code. It reduces the number of branches. Is Duff's Device used in other "bulk" routines? If not, what justifies making this a special case? -don provan dprovan at bivio.net
Re: [dpdk-dev] [PATCH] eal: bus scan and probe never fail
> -Original Message- > From: Shreyansh Jain [mailto:shreyansh.j...@nxp.com] > Sent: Monday, October 09, 2017 4:10 AM > To: Jan Blunck ; Thomas Monjalon > > Cc: dev ; Hemant Agrawal > Subject: Re: [dpdk-dev] [PATCH] eal: bus scan and probe never fail > >... > This is where I have disagreement/doubt. > Reporting error code from rte_bus_scan would do two things: > > 1. rte_eal_init is not designed to ignore/log-only these errors - it > would quit initialization. (But, this can be changed) > 2. What should rte_eal_init do with this error? rte_bus_scan would have > already printed the problematic bus->scan() failure. These practical problems confirm to me that the failure of a bus scan is more of a strategic issue: when asking "which devices can I use?", "none" is a perfectly valid answer that does not seem like an error to me even when a failed bus scan is the reason for that answer. From the application's point of view, the potential error here is that the device it wants to use isn't available. I don't see that either the init function or the probe function will have enough information to understand that application-level problem, so they should leave it to the application to detect it. -don provan dpro...@bivio.net
Re: [dpdk-dev] [PATCH] eal: bus scan and probe never fail
> -Original Message- > From: Shreyansh Jain [mailto:shreyansh.j...@nxp.com] > Sent: Monday, October 09, 2017 10:01 PM > To: Don Provan ; Jan Blunck ; > Thomas Monjalon > Cc: dev ; Hemant Agrawal > Subject: Re: [dpdk-dev] [PATCH] eal: bus scan and probe never fail > > ... > > > From the application's point of view, the potential error here > > is that the device it wants to use isn't available. I don't see that > > either the init function or the probe function will have enough > > information to understand that application-level problem, so > > they should leave it to the application to detect it. > > I think I understand you comment but just want to cross check again: > Scan or probe error should simply be ignored by EAL layer and let the > application take stance when it detects that the device it was looking > for is missing. Is my understanding correct? > > I am trying to come a conclusion so that this patch can either be > modified or pushed as it is. If the above understanding is correct, I > don't see any changes required in the patch. Yes, I agree my comments support the patch as is. -don
[dpdk-dev] [PATCH] mbuf: add comment explaining confusing code
> > > > > > > > if (likely (rte_mbuf_refcnt_read(m) == 1) || > > > > > > > > likely (rte_mbuf_refcnt_update(m, -1) > > > > > > > > == 0)) In all the debate about atomics, I don't think anyone got around to pointing out that in the unlikely case that the refcnt is not 1, then it's equally unlikely that decrementing it will result in 0 despite the code's claim to the contrary. That's the part that confused me. Would it make sense to fix this while adding the comment? -don dprovan at bivio.net
[dpdk-dev] [PATCH v2] Implement memcmp using AVX/SSE instructions.
I probably shouldn't stick my nose into this, but I can't help myself. An experienced programmer will tend to ignore the documentation for a routine named "blahblah_memcmp" and just assume it functions like memcmp. Whether or not there's currently a use case in DPDK is completely irrelevant because as soon as there *is* a use case, some poor DPDK developer will try to use rte_memcmp for that and may or may not have a test case that reveals their mistake. The term "compare" suggests checking for larger or smaller. If you want to check for equality, use "equal" or "eq" in the name and return true if they're equal. But personally, I'd compare unless there was a good reason not to. Indeed, I would just implement full memcmp functionality and be done with it, even if that meant using my fancy new assembly code for the cases I handle and then calling memcmp itself for the cases I didn't. If a routine that appears to take an arbitrary size doesn't, the name should in some manner reflect what sizes it takes. Better would be for a routine that only handles specific sizes to be split into versions that only take fixed sizes, but I don't know enough about your use cases to say whether that makes sense here. -don provan dprovan at bivio.net -Original Message- From: Ravi Kerur [mailto:rke...@gmail.com] Sent: Monday, May 11, 2015 1:47 PM To: Ananyev, Konstantin Cc: dev at dpdk.org Subject: Re: [dpdk-dev] [PATCH v2] Implement memcmp using AVX/SSE instructions. ... Following memcmp semantics is not hard but there are no use-cases for it in DPDK currently. Keeping it specific to DPDK usage simplifies code as well. I can change the name to "rte_compare" and add comments to the function. Will it work? ...
[dpdk-dev] [PATCH] eal/bsd: reinitialize optind and optreset to 1
Actually, this is a good opportunity to fix a bug that's been in this code forever: it shouldn't be resetting optind to some arbitrary value: it should be saving optind (and optarg and optopt) at the beginning, initializing optind to 1 before calling getopt_long(), then restoring all the values after. (And, from what you're saying, optreset should be handled the same as optind.) This avoids broken behavior if rte_eal_init() is called by code that's in the middle of using getopt() to parse its own unrelated argc/argv parameters. -don provan dprovan at bivio.net -Original Message- From: Tiwei Bie [mailto:b...@mail.ustc.edu.cn] Sent: Tuesday, October 13, 2015 1:54 AM To: dev at dpdk.org Subject: [dpdk-dev] [PATCH] eal/bsd: reinitialize optind and optreset to 1 The variable optind must be reinitialized to 1 in order to skip over argv[0] on FreeBSD. Because getopt() on FreeBSD will return -1 when it meets an argument which doesn't start with '-'. The variable optreset is provided on FreeBSD to indicate the additional set of calls to getopt(). So, also reinitialize it to 1. Signed-off-by: Tiwei Bie --- lib/librte_eal/bsdapp/eal/eal.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c index 1b6f705..35feaee 100644 --- a/lib/librte_eal/bsdapp/eal/eal.c +++ b/lib/librte_eal/bsdapp/eal/eal.c @@ -334,7 +334,8 @@ eal_log_level_parse(int argc, char **argv) break; } - optind = 0; /* reset getopt lib */ + optind = 1; /* reset getopt lib */ + optreset = 1; } /* Parse the argument given in the command line of the application */ @@ -403,7 +404,8 @@ eal_parse_args(int argc, char **argv) if (optind >= 0) argv[optind-1] = prgname; ret = optind-1; - optind = 0; /* reset getopt lib */ + optind = 1; /* reset getopt lib */ + optreset = 1; return ret; } -- 2.6.0
[dpdk-dev] [PATCH] eal/bsd: reinitialize optind and optreset to 1
> > On Wed, Oct 14, 2015 at 10:28:44AM +0800, Tiwei Bie wrote: > > > It is designed to have DPDK's parameters specified in the front of the > > > cmd line and terminated by '--'. In other words, it is designed assuming the DPDK library can dictate the application's command line. This is an incorrect assumption that should be eliminated. I don't think I'm the only one that has figured out that layering a service on top of DPDK requires creating a fake argc/argv array. The original plan of having rte_eal_init() parse the actual application command line organized as dictated by DPDK is not reasonable. > > > And 1 or 0 are not some > > > arbitrary values. They are used to put the index back to the beginning > > > of the new argv[] array. They're arbitrary values from the point of view of the calling application. If the caller is using getopt() for its own purposes, it has every reason to expect optind to have the same value after the call to rte_eal_init() that it had before, not some other value that the DPDK library happened to think was useful. > > > We shouldn't mix up DPDK's parameters and application's parameters. > > > And we should group them using '--'. Exactly! Yet we do confuse the two by using getopt() without considering that the application's parameters might not be in the argc/argv list that's passed to rte_eal_init(). > I'm considering updating optind to make it point to the option after > '--' before eal_parse_args() returns, and eal_parse_args() will return > 0 to avoid breaking the current applications which use the return value > of rte_eal_init() to update argc/argv. > > Any comments? Thanks! :-) Don't do it. Last time I looked, optind wasn't even mentioned in the documentation. Even if I thought it was reasonable to change it, I would still argue against using it as an undocumented return value, particularly when the documented return value works fine. -don provan dprovan at bivio.net
[dpdk-dev] [PATCH] eal: don't reset getopt lib
Looks perfect. Thanks! -don -Original Message- From: Tiwei Bie [mailto:b...@mail.ustc.edu.cn] Sent: Thursday, October 15, 2015 4:46 AM To: Don Provan ; bruce.richardson at intel.com; dev at dpdk.org Subject: [PATCH] eal: don't reset getopt lib Someone may need to call rte_eal_init() with a fake argc/argv array in the middle of using getopt() to parse its own unrelated argc/argv parameters. So getopt lib shouldn't be reset by rte_eal_init(). Now eal will always save optind, optarg and optopt (and optreset on FreeBSD) at the beginning, initialize optind (and optreset on FreeBSD) to 1 before calling getopt_long(), then restore all values after. Suggested-by: Don Provan Suggested-by: Bruce Richardson Signed-off-by: Tiwei Bie --- lib/librte_eal/bsdapp/eal/eal.c | 59 +++-- lib/librte_eal/linuxapp/eal/eal.c | 69 ++- 2 files changed, 102 insertions(+), 26 deletions(-) diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c index 1b6f705..bd09377 100644 --- a/lib/librte_eal/bsdapp/eal/eal.c +++ b/lib/librte_eal/bsdapp/eal/eal.c @@ -312,8 +312,20 @@ eal_log_level_parse(int argc, char **argv) int opt; char **argvopt; int option_index; + int old_optind; + int old_optopt; + int old_optreset; + char *old_optarg; + + /* save getopt lib */ + old_optind = optind; + old_optopt = optopt; + old_optreset = optreset; + old_optarg = optarg; argvopt = argv; + optind = 1; + optreset = 1; eal_reset_internal_config(&internal_config); @@ -334,7 +346,11 @@ eal_log_level_parse(int argc, char **argv) break; } - optind = 0; /* reset getopt lib */ + /* restore getopt lib */ + optind = old_optind; + optopt = old_optopt; + optreset = old_optreset; + optarg = old_optarg; } /* Parse the argument given in the command line of the application */ @@ -345,25 +361,37 @@ eal_parse_args(int argc, char **argv) char **argvopt; int option_index; char *prgname = argv[0]; + int old_optind; + int old_optopt; + int old_optreset; + char *old_optarg; + + /* save getopt lib */ + old_optind = optind; + old_optopt = optopt; + old_optreset = optreset; + old_optarg = optarg; argvopt = argv; + optind = 1; + optreset = 1; while ((opt = getopt_long(argc, argvopt, eal_short_options, eal_long_options, &option_index)) != EOF) { - int ret; - /* getopt is not happy, stop right now */ if (opt == '?') { eal_usage(prgname); - return -1; + ret = -1; + goto out; } ret = eal_parse_common_option(opt, optarg, &internal_config); /* common parser is not happy */ if (ret < 0) { eal_usage(prgname); - return -1; + ret = -1; + goto out; } /* common parser handled this option */ if (ret == 0) @@ -387,23 +415,34 @@ eal_parse_args(int argc, char **argv) "on FreeBSD\n", opt); } eal_usage(prgname); - return -1; + ret = -1; + goto out; } } - if (eal_adjust_config(&internal_config) != 0) - return -1; + if (eal_adjust_config(&internal_config) != 0) { + ret = -1; + goto out; + } /* sanity checks */ if (eal_check_common_options(&internal_config) != 0) { eal_usage(prgname); - return -1; + ret = -1; + goto out; } if (optind >= 0) argv[optind-1] = prgname; ret = optind-1; - optind = 0; /* reset getopt lib */ + +out: + /* restore getopt lib */ + optind = old_optind; + optopt = old_optopt; + optreset = old_optreset; + optarg = old_optarg; + return ret; } diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c index 33e1067..4796030 100644 --- a/lib/librte_eal/linuxapp/eal/eal.c +++ b/lib/librte_eal/linuxapp/eal/eal.c @@ -505,8 +505,17 @@ eal_log_level_parse(int argc, char **argv) int opt; char **argvopt; int option_index; + int old_optind; + int old_optopt; + char *old_optarg; + + /* save getopt lib */ + old_optind = optind; + old_optopt = optopt; + old_optarg = optarg; argvopt =
[dpdk-dev] rte_eal_init() alternative?
Thomas Monjalon: >Yes but please, do not create an alternative init function. >We just need to replace panic/exit with error codes and be sure that apps and >examples handle them correctly. I understand your concerns, but the panics are really just the tip of the iceberg of the EAL library not realizing it's a library. It really makes no sense to think the library should define the application's command line, or that the PCI bus should be probed without considering whether this application is going to use PCI, and or to insist that EAL work be done on internal EAL threads. So I'd say it's way past time to consider revamping initialization to start the process of ending the DPDK library's tail wagging the application's dog. Naturally this would have to be done while retaining the existing init routine on top of a real library initialization, but that's just an unfortunate artifact of the library's history, not a rational design decision for moving forward. -don provan
[dpdk-dev] rte_eal_init() alternative?
From: Wiles, Keith: >That stated I am not a big fan of huge structures being passed into >a init routine as that structure would need to be versioned and it will >grow/change. Plus he did not really want to deal in strings, so the >structure would be binary values and strings as required. A typical library has an init routine which establishes defaults, and then the application adjusts parameters through targeted set routines before starting to use the library operationally. In the argc/argv wrapper, the parsing code would call one of those individual routines when it parses the corresponding command line flag. The idea that there has to be one massive init routine which is passed every possible configuration parameter is more of the same monolithic thinking that DPDK needs to shake. -don provan dprovan at bivio.net
[dpdk-dev] [PATCH 04/12] mbuf: add function to calculate a checksum
> -Original Message- > From: Ananyev, Konstantin [mailto:konstantin.ananyev at intel.com] > Sent: Thursday, July 21, 2016 3:51 AM > Subject: Re: [dpdk-dev] [PATCH 04/12] mbuf: add function to calculate a > checksum > >... > > + Added a new function ``rte_pktmbuf_cksum()`` to process the checksum > > + of data embedded in an mbuf chain. > >... > > +#include > > As a nit, do we need to introduce a dependency for librte_mbuf on librte_net? > Might be better to put this functionality into librte_net? That's not a nit at all. This is clearly a net function that has no place in the mbuf code. That should be obvious even before we notice this circular dependency. -don dprovan at bivio.net
[dpdk-dev] [PATCH] mbuf: remove inconsistent assert statements
> -Original Message- > From: Ananyev, Konstantin [mailto:konstantin.ananyev at intel.com] > Sent: Thursday, June 09, 2016 8:45 AM > To: Thomas Monjalon > Cc: dev at dpdk.org; Olivier Matz ; Adrien > Mazarguil ; Zhang, Helin intel.com> > Subject: Re: [dpdk-dev] [PATCH] mbuf: remove inconsistent assert statements >... > But as I said, if everyone are that desperate about symmetry, we can just > create a new public function: > > void > rte_mbuf_raw_free(stuct rte_mbuf *m) > { > if (rte_mbuf_refcnt_update(m, -1) == 0) > __rte_mbuf_raw_free(m); > } Well, if you're going to do that, let's recognize that rte_mbuf_raw_free() was misnamed: it doesn't free an mbuf, it returns an mbuf that's already free back to the pool. So instead of "__rte_mbuf_raw_free", I'd call it "rte_mbuf_raw_release". Logically I like this solution, as I'm very uncomfortable with the idea that a free mbuf might sometimes have refcnt set to one. On the other hand, if mbufs can often be freed without touching refcnt, that could be a big win that might persuade me to accept being uncomfortable. -don provan dprovan at bivio.net
[dpdk-dev] Making rte_eal_pci_probe() in rte_eal_init() optional?
--no-pci is cool. I'm pretty sure that wasn't there when the PCI scan was first added to the library init routine. I'm glad to see it, so thanks for pointing it out. Just for the record: The comment says, "for debug purposes, PCI can be disabled". This exhibits one of those classic DPDK blindspots. The library can be used for many things entirely unrelated to hardware. My project has several DPDK applications intended to be used by users that do not have privs to mess around with PCI hardware, so, for them, turning off PCI wouldn't be just for debugging purposes. -don provan dprovan at bivio.net -Original Message- From: David Marchand [mailto:david.march...@6wind.com] Sent: Friday, November 13, 2015 12:50 AM To: Roger B Melton Cc: dev at dpdk.org Subject: Re: [dpdk-dev] Making rte_eal_pci_probe() in rte_eal_init() optional? ... Did you try the --no-pci option ? It avoids the initial sysfs scan, so with no pci device, the initial pci_probe should do nothing. ...
[dpdk-dev] DPDK namespace
I can't believe you guys are seriously considering changing the prefix from rte_. That's a nightmare at the practical level, but it really doesn't make as much sense as some of you seem to think. I've always been really impressed that the names were prefixed with rte_ instead of dpdk_. While the primary goal was to provide dataplane capabilities, so much of the library -- mempools and rings, for example -- is general purpose, so a dpdk_ prefix wouldn't be appropriate for those routines, anyway. The idea that everything in the library should be named "dpdk" is the same thinking that leads to the monolithic initialization function I've complained about before. I have major applications that use the DPDK library but do nothing at all with hardware, yet the library still insists on initializing the PCI components because there's no concept of using the library for anything other than receiving packets from hardware. -don provan dprovan at bivio.net
[dpdk-dev] removing mbuf error flags
>From: Olivier Matz [mailto:olivier.matz at 6wind.com] >Subject: [dpdk-dev] removing mbuf error flags > >My opinion is that invalid packets should not be given to the application and >only a statistic counter should be incremented. The idea of an application that handles bad packets is perfectly valid. Most applications don't want to see them, of course, but, conceptually, some applications would want to ask for bad packets because they are specifically designed to handle various networking problems including those that result in bad packets that the application can look at and report. Furthermore, it makes technical sense for DPDK to support such applications. Having said that, I have no idea if that's why that field was added, and I don?t myself care if DPDK provides that feature in the future. I just thought I'd put the idea out there in case it makes any difference to you. If it were me, I'd probably decide it isn't hurting anything and not bother to remove it in case some day someone wants to implement that feature in one driver or another. -don provan dprovan at bivio.net
[dpdk-dev] removing mbuf error flags
>From: Olivier MATZ [mailto:olivier.matz at 6wind.com] >Sent: Friday, April 29, 2016 1:58 PM > >The point is today it's broken, and no application running on top of DPDK >check these flags because they are set to 0. If we decide to assign a value >to these flags, it will break the working applications because they don't >expect to receive invalid packets. Maybe a proper solution would be to >enable these flags on demand in PMD configuration, and add a feature >flag for this feature. It's not broken, it just doesn't do anything. Yes, such a feature *has* to be explicitly requested by the application. By default, broken packets should not be delivered. >I think we should not keep things half-done too long. It's confusing and >useless as-is. Fine with me. I don't see how it's confusing, but, from what you're saying, it is clearly useless. The only reason to keep it would be that if such a feature is added in the future, it could be added without changing the mbuf structure, but I don't know whether that's important. -don provan dprovan at bivio.net
[dpdk-dev] [PATCH v2] log: respect rte_openlog_stream calls before rte_eal_init
> -Original Message- > From: John Ousterhout [mailto:ouster at cs.stanford.edu] > Sent: Tuesday, October 11, 2016 9:30 AM > To: Thomas Monjalon > Cc: dev at dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2] log: respect rte_openlog_stream calls > before rte_eal_init > > On Tue, Oct 11, 2016 at 1:08 AM, Thomas Monjalon > > wrote: > > I don't know either. > > What is best between stdout and stderr for logs? > > I would guess that stdout makes more sense, since most log entries describe > normal operation, not errors. I'm happy to make these consistent, but this > would introduce a behavior change for BSD (which currently uses stderr); > would that be considered antisocial? I've never seen a pronouncement or anything, but as a linux programmer, my attitude is that stdout should be the output the application is producing when carrying out its function. Debugging output isn't part of what the application is trying to accomplish, so it should be sent to stderr where it can be segregated from the functional output when needed. -don dprovan at bivio.net