Re: [dpdk-dev] [PATCH 2/2] eal: fix not checking unlock result

2018-04-25 Thread Tan, Jianfeng



On 4/17/2018 11:42 PM, Anatoly Burakov wrote:

Coverity issue: 272607

Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Cc: anatoly.bura...@intel.com

Signed-off-by: Anatoly Burakov 
---
  lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c 
b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 009f963..01fee51 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -304,8 +304,11 @@ clear_hugedir(const char * hugedir)
/* if lock succeeds, unlock and remove the file */
if (lck_result != -1) {
lck.l_type = F_UNLCK;
-   fcntl(fd, F_SETLK, &lck);
-   unlinkat(dir_fd, dirent->d_name, 0);
+   if (fcntl(fd, F_SETLK, &lck) < 0)
+   RTE_LOG(ERR, EAL, "Couldn't unlock %s: %s\n",
+   dirent->d_name, strerror(errno));


It seems that we shall return error if this nearly-impossible error 
happens, no?



+   else
+   unlinkat(dir_fd, dirent->d_name, 0);
}
close (fd);
dirent = readdir(dir);




Re: [dpdk-dev] [PATCH v2] baseband/turbo_sw: update Turbo Software driver

2018-04-25 Thread De Lara Guarch, Pablo
Hi Amr,

> -Original Message-
> From: Mokhtar, Amr
> Sent: Tuesday, April 24, 2018 7:53 PM
> To: De Lara Guarch, Pablo ; Chalupnik, KamilX
> ; dev@dpdk.org
> Cc: Chalupnik, KamilX 
> Subject: RE: [dpdk-dev] [PATCH v2] baseband/turbo_sw: update Turbo Software
> driver
> 
> 
> > -Original Message-
> > From: De Lara Guarch, Pablo
> > Sent: Tuesday 24 April 2018 18:56
> > To: Chalupnik, KamilX ; dev@dpdk.org
> > Cc: Mokhtar, Amr ; Chalupnik, KamilX
> > 
> > Subject: RE: [dpdk-dev] [PATCH v2] baseband/turbo_sw: update Turbo
> > Software driver
> >
> >
> >
> > > -Original Message-
> > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of KamilX
> > > Chalupnik
> > > Sent: Tuesday, April 17, 2018 3:44 PM
> > > To: dev@dpdk.org
> > > Cc: Mokhtar, Amr ; Chalupnik, KamilX
> > > 
> > > Subject: [dpdk-dev] [PATCH v2] baseband/turbo_sw: update Turbo
> > Software
> > > driver
> > >
> > > Update Turbo Software driver for Wireless Baseband Device:
> > > - support for optional CRC overlap in decode processing implemented
> > > - function scaling input LLR values to specific range [-16, 16]
> > > added
> > > - sizes of the internal buffers used by decoding were increased due to
> > >   problem with memory for large test vectors
> > > - new test vectors to check device capabilities added
> > >
> >
> > Split this patch into multiple patches, each one doing a single item
> > of your above list.
> > Again, make sure that it can be compiled and that is functional along
> > the patches.
> >
> 
> Too much splits is a bit an overkill.
> All the above changes are enhancements of Turbo coding operations.
> They all fall under one common topic and appears like they are good to stay
> combined in one patch.
> The new test vectors are related to the added enhancements.

I understand that they fall under the same top, that's why you should send them
in the same patchset. In DPDK, we aim at shorter patches (when possible),
with are easier to review. We tend to avoid patches making multiple changes, 
when they
can be breakable (generally, when you have a list of changes in your commit 
message,
that means they should go into separate patches).

Thanks,
Pablo



[dpdk-dev] [PATCH v2] app/testpmd: fix testpmd failure due to RSS offload check

2018-04-25 Thread Qi Zhang
After add RSS hash offload check, default rss_hf will fail on
devices that not support all bits, the patch introduce RSS
negotiate flag, when it is set, RSS configuration will negotiate
with device capability. By default negotiate is turn on, so it
will not break exist PMD. It can be disabled by "--disable-rss-neg"
in start parameters or be enable/disable by testpmd command
"port config all rss neg|noneg".

Fixes: 527624c663f8 ("ethdev: add supported hash function check")
Signed-off-by: Qi Zhang 
---

v2:
- fix command help string.

 app/test-pmd/cmdline.c  | 22 --
 app/test-pmd/parameters.c   |  5 +
 app/test-pmd/testpmd.c  | 12 +++-
 app/test-pmd/testpmd.h  |  1 +
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 +++-
 5 files changed, 36 insertions(+), 8 deletions(-)

diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
index 333852194..9390be741 100644
--- a/app/test-pmd/cmdline.c
+++ b/app/test-pmd/cmdline.c
@@ -822,7 +822,7 @@ static void cmd_help_long_parsed(void *parsed_result,
" for ports.\n\n"
 
"port config all rss (all|default|ip|tcp|udp|sctp|"
-   "ether|port|vxlan|geneve|nvgre|none|)\n"
+   
"ether|port|vxlan|geneve|nvgre|none|neg|noneg|)\n"
"Set the RSS mode.\n\n"
 
"port config port-id rss reta 
(hash,queue)[,(hash,queue)]\n"
@@ -2029,7 +2029,13 @@ cmd_config_rss_parsed(void *parsed_result,
else if (isdigit(res->value[0]) && atoi(res->value) > 0 &&
atoi(res->value) < 64)
rss_conf.rss_hf = 1ULL << atoi(res->value);
-   else {
+   else if (!strcmp(res->value, "neg")) {
+   rss_hf_noneg = 0;
+   return;
+   } else if (!strcmp(res->value, "noneg")) {
+   rss_hf_noneg = 1;
+   return;
+   } else {
printf("Unknown parameter\n");
return;
}
@@ -2037,10 +2043,14 @@ cmd_config_rss_parsed(void *parsed_result,
/* Update global configuration for RSS types. */
rss_hf = rss_conf.rss_hf;
RTE_ETH_FOREACH_DEV(i) {
-   if (!strcmp(res->value, "default")) {
-   rte_eth_dev_info_get(i, &dev_info);
+   rte_eth_dev_info_get(i, &dev_info);
+   if (!strcmp(res->value, "default"))
rss_conf.rss_hf = dev_info.flow_type_rss_offloads;
-   }
+   else
+   rss_conf.rss_hf &= (rss_hf_noneg ?
+   rss_conf.rss_hf :
+   dev_info.flow_type_rss_offloads);
+
diag = rte_eth_dev_rss_hash_update(i, &rss_conf);
if (diag < 0)
printf("Configuration of RSS hash at ethernet port %d "
@@ -2064,7 +2074,7 @@ cmdline_parse_inst_t cmd_config_rss = {
.f = cmd_config_rss_parsed,
.data = NULL,
.help_str = "port config all rss "
-   
"all|default|ip|tcp|udp|sctp|ether|port|vxlan|geneve|nvgre|none|",
+   
"all|default|ip|tcp|udp|sctp|ether|port|vxlan|geneve|nvgre|none|neg|noneg|",
.tokens = {
(void *)&cmd_config_rss_port,
(void *)&cmd_config_rss_keyword,
diff --git a/app/test-pmd/parameters.c b/app/test-pmd/parameters.c
index 394fa6d92..4758ec1d9 100644
--- a/app/test-pmd/parameters.c
+++ b/app/test-pmd/parameters.c
@@ -139,6 +139,7 @@ usage(char* progname)
printf("  --enable-hw-vlan-extend: enable hardware vlan extend.\n");
printf("  --enable-drop-en: enable per queue packet drop.\n");
printf("  --disable-rss: disable rss.\n");
+   printf("  --disable-rss-neg: disable rss negotiate with device 
capa.\n");
printf("  --port-topology=N: set port topology (N: paired (default) or "
   "chained).\n");
printf("  --forward-mode=N: set forwarding mode (N: %s).\n",
@@ -594,6 +595,7 @@ launch_args_parse(int argc, char** argv)
{ "enable-hw-vlan-extend",  0, 0, 0 },
{ "enable-drop-en",0, 0, 0 },
{ "disable-rss",0, 0, 0 },
+   { "disable-rss-neg",0, 0, 0 },
{ "port-topology",  1, 0, 0 },
{ "forward-mode",   1, 0, 0 },
{ "rss-ip", 0, 0, 0 },
@@ -909,6 +911,9 @@ launch_args_parse(int argc, char** argv)
 
if (!strcmp(lgopts[opt_idx].name, "disable-rss"))
rss_hf = 0;
+   if (!strcmp(lgopts[opt_idx].name, "disable-rss-neg")) {
+   rss_hf_noneg = 1;
+   }
if (!strcmp(lgopts[opt_idx].name, "port-topology")) {

Re: [dpdk-dev] [PATCH v2] baseband/turbo_sw: offload cost measurement test

2018-04-25 Thread De Lara Guarch, Pablo


> -Original Message-
> From: Mokhtar, Amr
> Sent: Tuesday, April 24, 2018 8:09 PM
> To: De Lara Guarch, Pablo ; Chalupnik, KamilX
> ; dev@dpdk.org
> Cc: Chalupnik, KamilX 
> Subject: RE: [dpdk-dev] [PATCH v2] baseband/turbo_sw: offload cost
> measurement test
> 
> 
> 
> > -Original Message-
> > From: De Lara Guarch, Pablo
> > Sent: Tuesday 24 April 2018 18:45
> > To: Chalupnik, KamilX ; dev@dpdk.org
> > Cc: Mokhtar, Amr ; Chalupnik, KamilX
> > 
> > Subject: RE: [dpdk-dev] [PATCH v2] baseband/turbo_sw: offload cost
> > measurement test
> >
> >
> >
> > > -Original Message-
> > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of KamilX
> > > Chalupnik
> > > Sent: Tuesday, April 17, 2018 3:27 PM
> > > To: dev@dpdk.org
> > > Cc: Mokhtar, Amr ; Chalupnik, KamilX
> > > 
> > > Subject: [dpdk-dev] [PATCH v2] baseband/turbo_sw: offload cost
> > measurement
> > > test
> > >
> > > New test created to measure offload cost.
> > > Changes were introduced in API, turbo software driver and test
> > application.
> > >
> >
> > Shouldn't this be generic to bbdev/baseband drivers in general and not
> > just turbo?
> >
> 
> Yes, it is generic.
> But the only driver we have right now is the turbo_sw driver. Future drivers 
> will
> have a similar support.

Right, then the title should be something like "bbdev: measure offload cost",
since this is affecting multiple components in bbdev.

> 
> > > Signed-off-by: KamilX Chalupnik 
> >
> > ...
> >
> > > --- a/lib/librte_bbdev/rte_bbdev.h
> > > +++ b/lib/librte_bbdev/rte_bbdev.h
> > > @@ -239,6 +239,10 @@ struct rte_bbdev_stats {
> > >   uint64_t enqueue_err_count;
> > >   /** Total error count on operations dequeued */
> > >   uint64_t dequeue_err_count;
> > > +#ifdef RTE_TEST_BBDEV
> > > + /** It stores turbo decoder/encoder working time. */
> > > + uint64_t turbo_perf_time;
> > > +#endif
> >
> > I don't think it is a good idea to use RTE_TEST_BBDEV here.
> > This macro is used to enable/disable the compilation of the bbdev test
> > app, so I think it should not be used in the API/PMDs.
> >
> > Also, this looks too specific for the Turbo SW PMD to be exposed as a
> > generic statistic.
> 
> Well, it should be generic. Probably 'turbo' is a bad comment and name.
> It's intention is to feedback execution time/cycles back to test app in order 
> to
> collect the offload cost of the bbdev driver.
> What is meant by the offload cost is the cycles consumed from the moment of
> enqueue till the moment the request is put on the acceleration engine inbound
> sw ring (software) or MMIO operation (hardware).

I understand. Then yes, probably you should find a better name, maybe 
offload_time?

Thanks,
Pablo



Re: [dpdk-dev] [PATCH v6 1/3] ether: support runtime queue setup

2018-04-25 Thread Thomas Monjalon
25/04/2018 07:33, Zhang, Qi Z:
> > 10/04/2018 15:59, Thomas Monjalon:
> > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > +#define DEV_RUNTIME_RX_QUEUE_SETUP 0x0001 /**< Deferred
> > setup
> > > > +rx queue */ #define DEV_RUNTIME_TX_QUEUE_SETUP 0x0002
> > /**<
> > > > +Deferred setup tx queue */
> > >
> > > Please use RTE_ETH_ prefix.
> 
> Actually I saw all the offload flag is started with DEV_, so
> Should I still need rename to RTE_ETH_DEV_CAPA_***? 

Yes
The flags starting with DEV_ will be managed later because it is hard
to rename an existing flag. It is in my roadmap.
The idea is to not add new flags with wrong namespace prefix.





[dpdk-dev] [PATCH] ethdev: fix naming for device capability

2018-04-25 Thread Qi Zhang
Rename all device capabilities to start with RTE_ETH_DEV_CAPA.

Fixes: 172364028db6 ("net/i40e: enable runtime queue setup")

Signed-off-by: Qi Zhang 
---
 doc/guides/nics/features.rst   | 4 ++--
 drivers/net/i40e/i40e_ethdev.c | 4 ++--
 lib/librte_ether/rte_ethdev.c  | 4 ++--
 lib/librte_ether/rte_ethdev.h  | 6 +++---
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index 67d459f80..14b8672bc 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -899,7 +899,7 @@ Runtime Rx queue setup
 
 Supports Rx queue setup after device started.
 
-* **[provides] rte_eth_dev_info**: 
``dev_capa:DEV_CAPA_RUNTIME_RX_QUEUE_SETUP``.
+* **[provides] rte_eth_dev_info**: 
``dev_capa:RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP``.
 * **[related]  API**: ``rte_eth_dev_info_get()``.
 
 .. _nic_features_runtime_tx_queue_setup:
@@ -909,7 +909,7 @@ Runtime Tx queue setup
 
 Supports Tx queue setup after device started.
 
-* **[provides] rte_eth_dev_info**: 
``dev_capa:DEV_CAPA_RUNTIME_TX_QUEUE_SETUP``.
+* **[provides] rte_eth_dev_info**: 
``dev_capa:RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP``.
 * **[related]  API**: ``rte_eth_dev_info_get()``.
 
 .. _nic_features_other:
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index e329042df..2fc98a7e7 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -3245,8 +3245,8 @@ i40e_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
DEV_TX_OFFLOAD_IPIP_TNL_TSO |
DEV_TX_OFFLOAD_GENEVE_TNL_TSO;
dev_info->dev_capa =
-   DEV_CAPA_RUNTIME_RX_QUEUE_SETUP |
-   DEV_CAPA_RUNTIME_TX_QUEUE_SETUP;
+   RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP |
+   RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP;
 
dev_info->hash_key_size = (I40E_PFQF_HKEY_MAX_INDEX + 1) *
sizeof(uint32_t);
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 0e503ab7e..5f1a1bf2b 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1478,7 +1478,7 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t 
rx_queue_id,
 
if (dev->data->dev_started &&
!(dev_info.dev_capa &
-   DEV_CAPA_RUNTIME_RX_QUEUE_SETUP))
+   RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP))
return -EBUSY;
 
if (dev->data->rx_queue_state[rx_queue_id] !=
@@ -1586,7 +1586,7 @@ rte_eth_tx_queue_setup(uint16_t port_id, uint16_t 
tx_queue_id,
 
if (dev->data->dev_started &&
!(dev_info.dev_capa &
-   DEV_CAPA_RUNTIME_TX_QUEUE_SETUP))
+   RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP))
return -EBUSY;
 
if (dev->data->tx_queue_state[tx_queue_id] !=
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 4096f688a..8da452cdf 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -993,9 +993,9 @@ struct rte_eth_conf {
  */
 #define DEV_TX_OFFLOAD_IP_TNL_TSO   0x0008
 
-#define DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x0001
+#define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x0001
 /**< Device supports Rx queue setup after device started*/
-#define DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x0002
+#define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x0002
 /**< Device supports Tx queue setup after device started*/
 
 /*
@@ -1071,7 +1071,7 @@ struct rte_eth_dev_info {
struct rte_eth_dev_portconf default_rxportconf;
/** Tx parameter recommendations */
struct rte_eth_dev_portconf default_txportconf;
-   /** Generic device capabilities */
+   /** Generic device capabilities (RTE_ETH_DEV_CAPA_). */
uint64_t dev_capa;
 };
 
-- 
2.13.6



Re: [dpdk-dev] [PATCH v4 1/5] ethdev: add access to eeprom

2018-04-25 Thread Zijie Pan
Hi Thomas,

> > +/**
> > + * Placeholder for accessing plugin module eeprom
> > + */
> > +struct rte_dev_module_info {
> > + uint32_t type; /**< Type of plugin module eeprom */
> > + uint32_t eeprom_len; /**< Length of plugin module eeprom */
> > +};
>
> I am not sure "plugin module" is descriptive enough.
Here is the description when "man ethtool":
-m --dump-module-eeprom --module-info
Retrieves  and if possible decodes the EEPROM from plugin modules, e.g 
SFP+, QSFP.
Is there any suggestion how to describe it?

> And I think the structure name should be rte_eth_dev_module_info
> to make clear that we are talking about NIC modules.
> Any better idea?
Yes, rte_eth_dev_module_info is better than rte_dev_module_info.
But the name of other structures should be also changed. And the file name
should be changed to rte_eth_dev_info.h.
Do we need to make a patch first to change the name of the file and structures?

> > +int __rte_experimental
> > +rte_eth_dev_get_module_eeprom(uint16_t port_id,
> > +   struct rte_dev_eeprom_info *info)
> > +{
> > + struct rte_eth_dev *dev;
> > +
> > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > +
> > + dev = &rte_eth_devices[port_id];
> > + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_module_eeprom, -ENOTSUP);
> > + return (*dev->dev_ops->get_module_eeprom)(dev, info);
> > +}
> > +
> >  int
> >  rte_eth_dev_get_dcb_info(uint16_t port_id,
>
> Please move this code after other EEPROM related functions.
I put the functions after rte_eth_dev_set_eeprom(). Please apply the patch and 
check it.

> > --- a/lib/librte_ether/rte_ethdev_version.map
> > +++ b/lib/librte_ether/rte_ethdev_version.map
> > @@ -229,5 +229,7 @@ EXPERIMENTAL {
> >  rte_mtr_policer_actions_update;
> >  rte_mtr_stats_read;
> >  rte_mtr_stats_update;
> > + rte_eth_dev_get_module_info;
> > + rte_eth_dev_get_module_eeprom;
>
> This must be inserted in alphabetical order.
I will update it in v5 patchset.

Thanks & Regards,
Zijie

Re: [dpdk-dev] [PATCH v4 1/2] mbuf: support attaching external buffer to mbuf

2018-04-25 Thread Olivier Matz
On Tue, Apr 24, 2018 at 02:53:41PM -0700, Yongseok Koh wrote:
> On Tue, Apr 24, 2018 at 10:22:45PM +0200, Thomas Monjalon wrote:
> > 24/04/2018 21:15, Olivier Matz:
> > > On Tue, Apr 24, 2018 at 09:21:00PM +0300, Andrew Rybchenko wrote:
> > > > On 04/24/2018 07:02 PM, Olivier Matz wrote:
> > > > > On Tue, Apr 24, 2018 at 03:28:33PM +0300, Andrew Rybchenko wrote:
> > > > > > On 04/24/2018 04:38 AM, Yongseok Koh wrote:
> > > > > > > + * Returns TRUE if given mbuf is cloned by mbuf indirection, or 
> > > > > > > FALSE
> > > > > > > + * otherwise.
> > > > > > > + *
> > > > > > > + * If a mbuf has its data in another mbuf and references it by 
> > > > > > > mbuf
> > > > > > > + * indirection, this mbuf can be defined as a cloned mbuf.
> > > > > > > + */
> > > > > > > +#define RTE_MBUF_CLONED(mb) ((mb)->ol_flags & 
> > > > > > > IND_ATTACHED_MBUF)
> > > > > > > +
> > > > > > > +/**
> > > > > > > * Returns TRUE if given mbuf is indirect, or FALSE otherwise.
> > > > > > > */
> > > > > > > -#define RTE_MBUF_INDIRECT(mb)   ((mb)->ol_flags & 
> > > > > > > IND_ATTACHED_MBUF)
> > > > > > > +#define RTE_MBUF_INDIRECT(mb)   RTE_MBUF_CLONED(mb)
> > > > > > It is still confusing that INDIRECT != !DIRECT.
> > > > > > May be we have no good options right now, but I'd suggest to at 
> > > > > > least
> > > > > > deprecate
> > > > > > RTE_MBUF_INDIRECT() and completely remove it in the next release.
> > > > > Agree. I may have missed something, but is my previous suggestion
> > > > > not doable?
> > > > > 
> > > > > - direct = embeds its own data  (and indirect = !direct)
> > > > > - clone (or another name) = data is another mbuf
> > > > > - extbuf = data is in an external buffer
> > > > 
> > > > I guess the problem that it changes INDIRECT semantics since EXTBUF
> > > > is added as well. I think strictly speaking it is an API change.
> > > > Is it OK to make it without announcement?
> > > 
> > > In any case, there will be an ABI change, because an application
> > > compiled for 18.02 will not be able to handle these new kind of
> > > mbuf.
> > > 
> > > So unfortunatly yes, I think this kind of changes should first be
> > > announced.
> > > 
> > > Thomas, what do you think?
> > 
> > What is the impact for the application developer?
> > Is there something to change in the application after this patch?
> 
> Let me address two concerns discussed here.
> 
> 1) API breakage of RTE_MBUF_DIRECT()
> Previously, direct == !indirect but now direct == !indirect && !extbuf. But to
> set the new flag (EXT_ATTACHED_MBUF), the new API, rte_pktmbuf_attach_extbuf()
> should be used and it is experimental. If application isn't compiled without
> allowing experimental API or application doesn't use the new API, it is always
> true that direct == !indirect. It looks logically okay to me. And FYI, it 
> passed
> the mbuf_autotest.
> 
> 2) ABI breakage of mlx5's new Multi-Packet RQ (a.k.a MPRQ) feature
> It's right that it could breadk ABI if the PMD delivers packets with external
> buffer attached. But, the MPRQ feature is disabled by default and it can be
> enabled only by the newly introduced PMD parameter (mprq_en). So, there's no
> possibility that 18.02-based application receives a mbuf having an external
> buffer. And, like Olivier mentioned, there's another ABI breakage by removing
> control mbuf anyway.

Stricly speaking, it is possible that a user pass this parameter through
the application (which just forwards it) to the new dpdk. So there is an ABI
change. In short, if a user wants to enable an optimization of 18.05 on an
application compiled for 18.02, it will fail.

But I agree the impact is very limited.

We are a bit lucky, because:
- the mbuf size is aligned to 128, so it stays the same, and the priv area
  is after the 2nd cache line (note: we are at 112 bytes over 128 on x86_64).
- previously, the area where shinfo is added was filled with garbage. It has
  no impact because it is only accessed when the EXT flag is set.
- the unused flags are 0 by default.

Knowing there is an ABI breakage this release, it could also make sense to
try to limitate them and avoid breaking it again in 18.08.

So in my opinion, reagarding API/ABI, this patchset could go in for 18.05.

Olivier


> 
> So, I don't think there's need for developers to change their application 
> after
> this patch unless they want to use the new feature.
> 
> 
> Thanks,
> Yongseok


Re: [dpdk-dev] [PATCH 5/5] malloc: fix potential negative return

2018-04-25 Thread Tan, Jianfeng



On 4/17/2018 11:48 PM, Anatoly Burakov wrote:

Return value from rte_socket_id_by_idx() may be negative, which would
result in negative index access.

Additionally, return value was of mismatched type (function returns
signed int, socket id was unsigned).

Coverity issue: 272571
Coverity issue: 272597

Fixes: 30bc6bf0d516 ("malloc: add function to dump heap contents")
Cc: anatoly.bura...@intel.com

Signed-off-by: Anatoly Burakov 
---
  lib/librte_eal/common/rte_malloc.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/rte_malloc.c 
b/lib/librte_eal/common/rte_malloc.c
index b51a6d1..f207ba2 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -169,7 +169,11 @@ rte_malloc_dump_heaps(FILE *f)
unsigned int idx;
  
  	for (idx = 0; idx < rte_socket_count(); idx++) {

-   unsigned int socket = rte_socket_id_by_idx(idx);
+   int socket = rte_socket_id_by_idx(idx);
+   if (socket < 0) {
+   RTE_LOG(ERR, EAL, "Invalid socket index: %u\n", idx);
+   continue;
+   }


For such check (and many others), we are clear that idx is guaranteed by 
rte_socket_count(), so rte_socket_id_by_idx() can never return -1. So 
why not just reporting this as false-positive?


Thanks,
Jianfeng


fprintf(f, "Heap on socket %i:\n", socket);
malloc_heap_dump(&mcfg->malloc_heaps[socket], f);
}




Re: [dpdk-dev] [PATCH v4 1/2] mbuf: support attaching external buffer to mbuf

2018-04-25 Thread Olivier Matz
Hi Yongseok,

On Tue, Apr 24, 2018 at 06:02:44PM +0200, Olivier Matz wrote:
> > > @@ -688,14 +704,33 @@ rte_mbuf_to_baddr(struct rte_mbuf *md)
> > >   }
> > >   /**
> > > + * Returns TRUE if given mbuf is cloned by mbuf indirection, or FALSE
> > > + * otherwise.
> > > + *
> > > + * If a mbuf has its data in another mbuf and references it by mbuf
> > > + * indirection, this mbuf can be defined as a cloned mbuf.
> > > + */
> > > +#define RTE_MBUF_CLONED(mb) ((mb)->ol_flags & IND_ATTACHED_MBUF)
> > > +
> > > +/**
> > >* Returns TRUE if given mbuf is indirect, or FALSE otherwise.
> > >*/
> > > -#define RTE_MBUF_INDIRECT(mb)   ((mb)->ol_flags & IND_ATTACHED_MBUF)
> > > +#define RTE_MBUF_INDIRECT(mb)   RTE_MBUF_CLONED(mb)
> > 
> > It is still confusing that INDIRECT != !DIRECT.
> > May be we have no good options right now, but I'd suggest to at least
> > deprecate
> > RTE_MBUF_INDIRECT() and completely remove it in the next release.
> 
> Agree. I may have missed something, but is my previous suggestion
> not doable?
> 
> - direct = embeds its own data  (and indirect = !direct)
> - clone (or another name) = data is another mbuf
> - extbuf = data is in an external buffer

Any comment about this option?


Re: [dpdk-dev] [PATCH v4 1/5] ethdev: add access to eeprom

2018-04-25 Thread Thomas Monjalon
25/04/2018 10:21, Zijie Pan:
> Hi Thomas,
> 
> > > +/**
> > > + * Placeholder for accessing plugin module eeprom
> > > + */
> > > +struct rte_dev_module_info {
> > > + uint32_t type; /**< Type of plugin module eeprom */
> > > + uint32_t eeprom_len; /**< Length of plugin module eeprom */
> > > +};
> >
> > I am not sure "plugin module" is descriptive enough.
> 
> Here is the description when "man ethtool":
> -m --dump-module-eeprom --module-info
> Retrieves  and if possible decodes the EEPROM from plugin modules, e.g 
> SFP+, QSFP.
> Is there any suggestion how to describe it?

No better idea.

> > And I think the structure name should be rte_eth_dev_module_info
> > to make clear that we are talking about NIC modules.
> > Any better idea?
> 
> Yes, rte_eth_dev_module_info is better than rte_dev_module_info.
> But the name of other structures should be also changed. And the file name
> should be changed to rte_eth_dev_info.h.
> Do we need to make a patch first to change the name of the file and 
> structures?

No you can just focus on your change.
If more clean-up is required in your opinion, you can do it later.


> > > +int __rte_experimental
> > > +rte_eth_dev_get_module_eeprom(uint16_t port_id,
> > > +   struct rte_dev_eeprom_info *info)
> > > +{
> > > + struct rte_eth_dev *dev;
> > > +
> > > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
> > > +
> > > + dev = &rte_eth_devices[port_id];
> > > + RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_module_eeprom, -ENOTSUP);
> > > + return (*dev->dev_ops->get_module_eeprom)(dev, info);
> > > +}
> > > +
> > >  int
> > >  rte_eth_dev_get_dcb_info(uint16_t port_id,
> >
> > Please move this code after other EEPROM related functions.
> I put the functions after rte_eth_dev_set_eeprom(). Please apply the patch 
> and check it.

OK

> > > --- a/lib/librte_ether/rte_ethdev_version.map
> > > +++ b/lib/librte_ether/rte_ethdev_version.map
> > > @@ -229,5 +229,7 @@ EXPERIMENTAL {
> > >  rte_mtr_policer_actions_update;
> > >  rte_mtr_stats_read;
> > >  rte_mtr_stats_update;
> > > + rte_eth_dev_get_module_info;
> > > + rte_eth_dev_get_module_eeprom;
> >
> > This must be inserted in alphabetical order.
> I will update it in v5 patchset.

Please send v5 ASAP, thanks.




Re: [dpdk-dev] [PATCH] ethdev: fix naming for device capability

2018-04-25 Thread Thomas Monjalon
25/04/2018 10:18, Qi Zhang:
> Rename all device capabilities to start with RTE_ETH_DEV_CAPA.
> 
> Fixes: 172364028db6 ("net/i40e: enable runtime queue setup")

Are you sure about the Fixes line?





Re: [dpdk-dev] [dpdk-web] [PATCH v2] update stable releases roadmap

2018-04-25 Thread Ferruh Yigit
On 4/20/2018 4:52 PM, Aaron Conole wrote:
> Kevin Traynor  writes:
> 
>> On 04/18/2018 02:28 PM, Thomas Monjalon wrote:
>>> 18/04/2018 14:28, Ferruh Yigit:
 On 4/18/2018 10:14 AM, Thomas Monjalon wrote:
> 18/04/2018 11:05, Ferruh Yigit:
>> On 4/11/2018 12:28 AM, Thomas Monjalon wrote:
>>> -   Typically a new stable release version follows a mainline 
>>> release
>>> -   by 1-2 weeks, depending on the test results.
>>> +   The first stable release (.1) of a branch should follow
>>> +   its mainline release (.0) by at least two months,
>>> +   after the first release candidate (-rc1) of the next branch.
>>
>> Hi Thomas,
>>
>> What this change suggest? To be able to backport patches from rc1?
>
> Yes, it is the proposal we discussed earlier.
> We can wait one week after RC1 to get some validation confirmation.
> Do you agree?

 This has been discussed in tech-board, what I remember the decision was to 
 wait
 the release to backport patches into stable tree.
>>>
>>
>> Any minutes? I couldn't find them
>>
>>> It was not so clear to me.
>>> I thought post-rc1 was acceptable. The idea is to speed-up stable releases
>>> pace, especially first release of a series.
>>>
>>>
>>
>> I think timing of stable releases and bugfix backports to the stable
>> branch are two separate items.
>>
>> I do think that bugfix backports to stable should happen on a regular
>> basis (e.g. every 2 weeks). Otherwise we are back to the situation where
>> if there's a bugfix after a DPDK release, a user like (surprise,
>> surprise) OVS may not be able to use that DPDK version for ~3 months.
>>
>> Someone who wants to get the latest bugfixes can just take the latest on
>> the stable branch and importantly, can have confidence that the
>> community has officially accepted those patches. If someone requires
>> stable to be validated, then they have to wait until the release.
> 
> +1 - this seems to make the most sense to me.  Keep the patches flowing,
> but don't label/tag it until validation.  That serves an additional
> function: developers know their CC's to stable are being processed.

Are stable trees verified?

> 
>> Kevin.



Re: [dpdk-dev] [PATCH] ethdev: fix naming for device capability

2018-04-25 Thread Ferruh Yigit
On 4/25/2018 9:33 AM, Thomas Monjalon wrote:
> 25/04/2018 10:18, Qi Zhang:
>> Rename all device capabilities to start with RTE_ETH_DEV_CAPA.
>>
>> Fixes: 172364028db6 ("net/i40e: enable runtime queue setup")
> 
> Are you sure about the Fixes line?

Original patch is still on next-net, I am for squashing this patch, so commit
log will be lost.

But for the record, I guess it should be:
Fixes: 8719ad91e577 ("ethdev: support runtime queue setup")


Re: [dpdk-dev] [PATCH] ethdev: fix naming for device capability

2018-04-25 Thread Zhang, Qi Z


> -Original Message-
> From: Yigit, Ferruh
> Sent: Wednesday, April 25, 2018 4:39 PM
> To: Thomas Monjalon ; Zhang, Qi Z
> 
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] ethdev: fix naming for device capability
> 
> On 4/25/2018 9:33 AM, Thomas Monjalon wrote:
> > 25/04/2018 10:18, Qi Zhang:
> >> Rename all device capabilities to start with RTE_ETH_DEV_CAPA.
> >>
> >> Fixes: 172364028db6 ("net/i40e: enable runtime queue setup")
> >
> > Are you sure about the Fixes line?
> 
> Original patch is still on next-net, I am for squashing this patch, so commit 
> log
> will be lost.
> 
> But for the record, I guess it should be:
> Fixes: 8719ad91e577 ("ethdev: support runtime queue setup")

Sorry, just send v2 to correct this.




[dpdk-dev] [PATCH v2] ethdev: fix naming for device capability

2018-04-25 Thread Qi Zhang
Rename all device capabilities to start with RTE_ETH_DEV_CAPA.

Fixes: 8719ad91e5772 ("ethdev: support runtime queue setup")

Signed-off-by: Qi Zhang 
---

v2:
- fix fix line

 doc/guides/nics/features.rst   | 4 ++--
 drivers/net/i40e/i40e_ethdev.c | 4 ++--
 lib/librte_ether/rte_ethdev.c  | 4 ++--
 lib/librte_ether/rte_ethdev.h  | 6 +++---
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index 67d459f80..14b8672bc 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -899,7 +899,7 @@ Runtime Rx queue setup
 
 Supports Rx queue setup after device started.
 
-* **[provides] rte_eth_dev_info**: 
``dev_capa:DEV_CAPA_RUNTIME_RX_QUEUE_SETUP``.
+* **[provides] rte_eth_dev_info**: 
``dev_capa:RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP``.
 * **[related]  API**: ``rte_eth_dev_info_get()``.
 
 .. _nic_features_runtime_tx_queue_setup:
@@ -909,7 +909,7 @@ Runtime Tx queue setup
 
 Supports Tx queue setup after device started.
 
-* **[provides] rte_eth_dev_info**: 
``dev_capa:DEV_CAPA_RUNTIME_TX_QUEUE_SETUP``.
+* **[provides] rte_eth_dev_info**: 
``dev_capa:RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP``.
 * **[related]  API**: ``rte_eth_dev_info_get()``.
 
 .. _nic_features_other:
diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index e329042df..2fc98a7e7 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -3245,8 +3245,8 @@ i40e_dev_info_get(struct rte_eth_dev *dev, struct 
rte_eth_dev_info *dev_info)
DEV_TX_OFFLOAD_IPIP_TNL_TSO |
DEV_TX_OFFLOAD_GENEVE_TNL_TSO;
dev_info->dev_capa =
-   DEV_CAPA_RUNTIME_RX_QUEUE_SETUP |
-   DEV_CAPA_RUNTIME_TX_QUEUE_SETUP;
+   RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP |
+   RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP;
 
dev_info->hash_key_size = (I40E_PFQF_HKEY_MAX_INDEX + 1) *
sizeof(uint32_t);
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 0e503ab7e..5f1a1bf2b 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1478,7 +1478,7 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t 
rx_queue_id,
 
if (dev->data->dev_started &&
!(dev_info.dev_capa &
-   DEV_CAPA_RUNTIME_RX_QUEUE_SETUP))
+   RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP))
return -EBUSY;
 
if (dev->data->rx_queue_state[rx_queue_id] !=
@@ -1586,7 +1586,7 @@ rte_eth_tx_queue_setup(uint16_t port_id, uint16_t 
tx_queue_id,
 
if (dev->data->dev_started &&
!(dev_info.dev_capa &
-   DEV_CAPA_RUNTIME_TX_QUEUE_SETUP))
+   RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP))
return -EBUSY;
 
if (dev->data->tx_queue_state[tx_queue_id] !=
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 4096f688a..8da452cdf 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -993,9 +993,9 @@ struct rte_eth_conf {
  */
 #define DEV_TX_OFFLOAD_IP_TNL_TSO   0x0008
 
-#define DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x0001
+#define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x0001
 /**< Device supports Rx queue setup after device started*/
-#define DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x0002
+#define RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP 0x0002
 /**< Device supports Tx queue setup after device started*/
 
 /*
@@ -1071,7 +1071,7 @@ struct rte_eth_dev_info {
struct rte_eth_dev_portconf default_rxportconf;
/** Tx parameter recommendations */
struct rte_eth_dev_portconf default_txportconf;
-   /** Generic device capabilities */
+   /** Generic device capabilities (RTE_ETH_DEV_CAPA_). */
uint64_t dev_capa;
 };
 
-- 
2.13.6



Re: [dpdk-dev] [PATCH v2 0/6] mempool: add bucket driver

2018-04-25 Thread Olivier Matz
Hi,

On Wed, Apr 25, 2018 at 01:00:23AM +0200, Thomas Monjalon wrote:
> Can we have this patchset in 18.05-rc1?
> Or is it candidate to rc2?

I realized I made my comments on v1 instead of v2, sorry.

https://dpdk.org/dev/patchwork/patch/36538/
https://dpdk.org/dev/patchwork/patch/36535/
https://dpdk.org/dev/patchwork/patch/36533/

All of them are minor issues, they could be adressed for rc2.
Will send acks for generic mempool part.

Olivier


> 
> 16/04/2018 15:33, Andrew Rybchenko:
> > The patch series adds bucket mempool driver which allows to allocate
> > (both physically and virtually) contiguous blocks of objects and adds
> > mempool API to do it. It is still capable to provide separate objects,
> > but it is definitely more heavy-weight than ring/stack drivers.
> > The driver will be used by the future Solarflare driver enhancements
> > which allow to utilize physical contiguous blocks in the NIC firmware.
> > 
> > The target usecase is dequeue in blocks and enqueue separate objects
> > back (which are collected in buckets to be dequeued). So, the memory
> > pool with bucket driver is created by an application and provided to
> > networking PMD receive queue. The choice of bucket driver is done using
> > rte_eth_dev_pool_ops_supported(). A PMD that relies upon contiguous
> > block allocation should report the bucket driver as the only supported
> > and preferred one.
> [...]
> > Andrew Rybchenko (1):
> >   doc: advertise bucket mempool driver
> > 
> > Artem V. Andreev (5):
> >   mempool/bucket: implement bucket mempool manager
> >   mempool: implement abstract mempool info API
> >   mempool: support block dequeue operation
> >   mempool/bucket: implement block dequeue operation
> >   mempool/bucket: do not allow one lcore to grab all buckets


Re: [dpdk-dev] [PATCH v2 2/6] mempool: implement abstract mempool info API

2018-04-25 Thread Olivier Matz
On Mon, Apr 16, 2018 at 02:33:26PM +0100, Andrew Rybchenko wrote:
> From: "Artem V. Andreev" 
> 
> Primarily, it is intended as a way for the mempool driver to provide
> additional information on how it lays up objects inside the mempool.
> 
> Signed-off-by: Artem V. Andreev 
> Signed-off-by: Andrew Rybchenko 

Acked-by: Olivier Matz 


Re: [dpdk-dev] [PATCH v2 3/6] mempool: support block dequeue operation

2018-04-25 Thread Olivier Matz
On Mon, Apr 16, 2018 at 02:33:27PM +0100, Andrew Rybchenko wrote:
> From: "Artem V. Andreev" 
> 
> If mempool manager supports object blocks (physically and virtual
> contiguous set of objects), it is sufficient to get the first
> object only and the function allows to avoid filling in of
> information about each block member.
> 
> Signed-off-by: Artem V. Andreev 
> Signed-off-by: Andrew Rybchenko 

Acked-by: Olivier Matz 


Re: [dpdk-dev] [PATCH v2] ethdev: fix naming for device capability

2018-04-25 Thread Ferruh Yigit
On 4/25/2018 9:39 AM, Qi Zhang wrote:
> Rename all device capabilities to start with RTE_ETH_DEV_CAPA.
> 
> Fixes: 8719ad91e5772 ("ethdev: support runtime queue setup")
> 
> Signed-off-by: Qi Zhang 

Squashed into relevant commit in next-net, thanks.


Re: [dpdk-dev] [PATCH 5/5] malloc: fix potential negative return

2018-04-25 Thread Burakov, Anatoly

On 25-Apr-18 9:24 AM, Tan, Jianfeng wrote:



On 4/17/2018 11:48 PM, Anatoly Burakov wrote:

Return value from rte_socket_id_by_idx() may be negative, which would
result in negative index access.

Additionally, return value was of mismatched type (function returns
signed int, socket id was unsigned).

Coverity issue: 272571
Coverity issue: 272597

Fixes: 30bc6bf0d516 ("malloc: add function to dump heap contents")
Cc: anatoly.bura...@intel.com

Signed-off-by: Anatoly Burakov 
---
  lib/librte_eal/common/rte_malloc.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/rte_malloc.c 
b/lib/librte_eal/common/rte_malloc.c

index b51a6d1..f207ba2 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -169,7 +169,11 @@ rte_malloc_dump_heaps(FILE *f)
  unsigned int idx;
  for (idx = 0; idx < rte_socket_count(); idx++) {
-    unsigned int socket = rte_socket_id_by_idx(idx);
+    int socket = rte_socket_id_by_idx(idx);
+    if (socket < 0) {
+    RTE_LOG(ERR, EAL, "Invalid socket index: %u\n", idx);
+    continue;
+    }


For such check (and many others), we are clear that idx is guaranteed by 
rte_socket_count(), so rte_socket_id_by_idx() can never return -1. So 
why not just reporting this as false-positive?


Well, technically, if someone were to corrupt rte_config, it would 
introduce a possibility of a negative return. However, i guess, at that 
point we've got bigger problems, so perhaps you're right and i should 
drop these fixes.




Thanks,
Jianfeng


  fprintf(f, "Heap on socket %i:\n", socket);
  malloc_heap_dump(&mcfg->malloc_heaps[socket], f);
  }






--
Thanks,
Anatoly


Re: [dpdk-dev] [PATCH 2/2] eal: fix not checking unlock result

2018-04-25 Thread Burakov, Anatoly

On 25-Apr-18 8:09 AM, Tan, Jianfeng wrote:



On 4/17/2018 11:42 PM, Anatoly Burakov wrote:

Coverity issue: 272607

Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Cc: anatoly.bura...@intel.com

Signed-off-by: Anatoly Burakov 
---
  lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 7 +--
  1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c 
b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c

index 009f963..01fee51 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -304,8 +304,11 @@ clear_hugedir(const char * hugedir)
  /* if lock succeeds, unlock and remove the file */
  if (lck_result != -1) {
  lck.l_type = F_UNLCK;
-    fcntl(fd, F_SETLK, &lck);
-    unlinkat(dir_fd, dirent->d_name, 0);
+    if (fcntl(fd, F_SETLK, &lck) < 0)
+    RTE_LOG(ERR, EAL, "Couldn't unlock %s: %s\n",
+    dirent->d_name, strerror(errno));


It seems that we shall return error if this nearly-impossible error 
happens, no?


I'm not sure if we should. In any case, lock will be dropped by close(), 
so the proper fix would've been just remove the fcntl() call altogether.


I'll respin.




+    else
+    unlinkat(dir_fd, dirent->d_name, 0);
  }
  close (fd);
  dirent = readdir(dir);






--
Thanks,
Anatoly


Re: [dpdk-dev] [PATCH v7 07/11] eal: replace rte_panic instances in hugepage_info

2018-04-25 Thread Burakov, Anatoly

On 24-Apr-18 11:16 PM, Arnon Warshavsky wrote:

replace panic calls with log and return value.

Signed-off-by: Arnon Warshavsky 
---


Acked-by: Anatoly Burakov 

--
Thanks,
Anatoly


Re: [dpdk-dev] [PATCH] eventdev: convert eth Rx adapter files to SPDX license tag

2018-04-25 Thread Jerin Jacob
-Original Message-
> Date: Wed, 25 Apr 2018 04:02:21 +0530
> From: Nikhil Rao 
> To: jerin.ja...@caviumnetworks.com
> CC: dev@dpdk.org, hemant.agra...@nxp.com, Nikhil Rao 
> Subject: [PATCH] eventdev: convert eth Rx adapter files to SPDX license tag
> X-Mailer: git-send-email 1.8.3.1
> 
> Signed-off-by: Nikhil Rao 

Acked-by: Jerin Jacob 



Re: [dpdk-dev] [PATCH v7 10/11] eal: replace rte_panic instances in init sequence

2018-04-25 Thread Burakov, Anatoly

On 24-Apr-18 11:16 PM, Arnon Warshavsky wrote:

Change some local functions return type from void to int.
This change does not break ABI as the functions are internal.
Panic thrown from threads was not handled in this patch

Signed-off-by: Arnon Warshavsky 
---


<...>

  
+	if (rte_config_init() != 0) {

+   rte_eal_init_alert("Failed to init configuration");
+   rte_errno = EFAULT;
+   return -1;
+   }
+
+   if (rte_mp_channel_init() < 0) {
+   rte_eal_init_alert("failed to init mp channel\n");
+   if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+   rte_errno = EFAULT;
+   return -1;
+   }
+   }


^^^ this change looks unintended. Rebase artifact?


+
/* in secondary processes, memory init may allocate additional fbarrays
 * not present in primary processes, so to avoid any potential issues,
 * initialize memzones first.
@@ -671,6 +712,7 @@ static void rte_eal_init_alert(const char *msg)
 */
if (pipe(lcore_config[i].pipe_master2slave) < 0)
rte_panic("Cannot create pipe\n");
+
if (pipe(lcore_config[i].pipe_slave2master) < 0)
rte_panic("Cannot create pipe\n");


^^^ this looks unintended as well.

  
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c

index 5b23bf0..54adaec 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -160,7 +160,7 @@ enum rte_iova_mode
   * We also don't lock the whole file, so that in future we can use read-locks
   * on other parts, e.g. memzones, to detect if there are running secondary
   * processes. */
-static void
+static int
  rte_eal_config_create(void)
  {
void *rte_mem_cfg_addr;
@@ -169,7 +169,7 @@ enum rte_iova_mode
const char *pathname = eal_runtime_config_path();
  
  	if (internal_config.no_shconf)

-   return;


<...>


}
  
  	rte_mem_cfg_addr = mmap(rte_mem_cfg_addr, sizeof(*rte_config.mem_config),

PROT_READ | PROT_WRITE, MAP_SHARED, mem_cfg_fd, 
0);
  
-	if (rte_mem_cfg_addr == MAP_FAILED){

-   rte_panic("Cannot mmap memory for rte_config\n");
+   if (rte_mem_cfg_addr == MAP_FAILED) {
+   RTE_LOG(CRIT, EAL, "%s(): Cannot mmap memory for rte_config\n",
+   __func__);
+   return -1;
}


I think you forgot to close mem_cfg_fd and set it to -1 in case of error 
here.



memcpy(rte_mem_cfg_addr, &early_mem_config, sizeof(early_mem_config));
rte_config.mem_config = rte_mem_cfg_addr;
@@ -211,10 +221,11 @@ enum rte_iova_mode
 * processes could later map the config into this exact location */
rte_config.mem_config->mem_cfg_addr = (uintptr_t) rte_mem_cfg_addr;
  
+	return 0;

  }
  


<...>

  
  	/* map it as read-only first */

mem_config = (struct rte_mem_config *) mmap(NULL, sizeof(*mem_config),
PROT_READ, MAP_SHARED, mem_cfg_fd, 0);
-   if (mem_config == MAP_FAILED)
-   rte_panic("Cannot mmap memory for rte_config! error %i (%s)\n",
- errno, strerror(errno));
+   if (mem_config == MAP_FAILED) {
+   mem_cfg_fd = -1;


Forgot close() here, i think.


+   RTE_LOG(CRIT, EAL, "%s(): Cannot mmap memory for rte_config! error 
%i (%s)\n",
+   __func__, errno, strerror(errno));
+   return -1;
+   }
  
  	rte_config.mem_config = mem_config;

+
+   return 0;
  }
  
  /* reattach the shared config at exact memory location primary process has it */


<...>


+   if (rte_config_init() != 0)
+   return -1;
+
if (rte_eal_log_init(logid, internal_config.syslog_facility) < 0) {
rte_eal_init_alert("Cannot init logging.");
rte_errno = ENOMEM;
@@ -914,6 +946,7 @@ static void rte_eal_init_alert(const char *msg)
 */
if (pipe(lcore_config[i].pipe_master2slave) < 0)
rte_panic("Cannot create pipe\n");
+
if (pipe(lcore_config[i].pipe_slave2master) < 0)


Again, looks like unintended whitespace change.


rte_panic("Cannot create pipe\n");
  




--
Thanks,
Anatoly


Re: [dpdk-dev] [PATCH v4 1/2] mbuf: support attaching external buffer to mbuf

2018-04-25 Thread Yongseok Koh


> On Apr 25, 2018, at 1:28 AM, Olivier Matz  wrote:
> 
> Hi Yongseok,
> 
> On Tue, Apr 24, 2018 at 06:02:44PM +0200, Olivier Matz wrote:
 @@ -688,14 +704,33 @@ rte_mbuf_to_baddr(struct rte_mbuf *md)
  }
  /**
 + * Returns TRUE if given mbuf is cloned by mbuf indirection, or FALSE
 + * otherwise.
 + *
 + * If a mbuf has its data in another mbuf and references it by mbuf
 + * indirection, this mbuf can be defined as a cloned mbuf.
 + */
 +#define RTE_MBUF_CLONED(mb) ((mb)->ol_flags & IND_ATTACHED_MBUF)
 +
 +/**
   * Returns TRUE if given mbuf is indirect, or FALSE otherwise.
   */
 -#define RTE_MBUF_INDIRECT(mb)   ((mb)->ol_flags & IND_ATTACHED_MBUF)
 +#define RTE_MBUF_INDIRECT(mb)   RTE_MBUF_CLONED(mb)
>>> 
>>> It is still confusing that INDIRECT != !DIRECT.
>>> May be we have no good options right now, but I'd suggest to at least
>>> deprecate
>>> RTE_MBUF_INDIRECT() and completely remove it in the next release.
>> 
>> Agree. I may have missed something, but is my previous suggestion
>> not doable?
>> 
>> - direct = embeds its own data  (and indirect = !direct)
>> - clone (or another name) = data is another mbuf
>> - extbuf = data is in an external buffer
> 
> Any comment about this option?

I liked your idea, so I defined RTE_MBUF_CLONED() and wanted to deprecate
RTE_MBUF_INDIRECT() in the coming release. But RTE_MBUF_DIRECT() can't be
(!RTE_MBUF_INDIRECT()) because it will logically include RTE_MBUF_HAS_EXTBUF().
I'm not sure I understand you correctly.

Can you please give me more guidelines so that I can take you idea?

Thanks,
Yongseok

[dpdk-dev] [PATCH v5 0/5] get the information and data of EEPROM

2018-04-25 Thread Zijie Pan
Add APIs to read information from the DPDK applictions.
It can be used to dump the EEPROM of plugin modules (SFP+, QSFP, etc.).

Two APIs are introduced to access eeprom:
- rte_eth_dev_get_module_info
- rte_eth_dev_get_module_eeprom

Applications based on DPDK can dump eeprom by calling those two APIs.

Then, each PMD has to implement these callbacks for e1000, ixgbe, i40e, etc.

Patch for example/ethtool is used to test this function. It can get the raw 
data of eeprom. See below how both DPDK applications (ethtool) and Linux
kernel are dumping the same eeprom of a same NIC.

- Start example/ethtool:
./examples/ethtool/ethtool-app/x86_64-native-linuxapp-gcc/ethtool -c 0xf -n 
4 --socket-mem 1024,0 -- -i
EthApp> drvinfo
Port 0 driver: net_ixgbe (ver: DPDK 18.05.0-rc0)
firmware-version: 0x18b30001
bus-info: :04:00.0

EthApp> module-eeprom
 [UINT16]: module-eeprom  
 Dump plug-in module EEPROM to file

EthApp> module-eeprom 0 my-module-eeprom.bin
Total plug-in module EEPROM length: 512 bytes

EthApp> quit

- HexDump of this eeprom file:
# xxd my-module-eeprom.bin
000: 0304 0710  0100  0006 6702   g...
010: 0803 001e 4f45 4d20 2020 2020 2020 2020  OEM
020: 2020 2020  1b21 5346 502d 3130 472d  ...!SFP-10G-
030: 5352 2d49 5420 2020 4120 2020 0352 0024  SR-IT   A   .R.$
040: 003a  5751 3136 3034 3132 4131 3135  .:..WQ160412A115
050: 2020 2020 3135 3136 3130 2020 68fa 033b  151610  h..;
060:          
070:          
080:          
090:          
0a0:          
0b0:          
0c0:          
0d0:          
0e0:          
0f0:          
100: 5000 fb00 4b00  8ca0 7530 88b8 7918  P...K.u0..y.
110: 1d4c 01f4 1b58 03e8 3de9 03e8 2710 04eb  .L...X..=...'...
120: 2710 0064 1f07 007e      '..d...~
130:          
140:   3f80    0100   ?...
150: 0100  0100  0100   002d  ...-
160: 2c59 810a 13c7 1752 0001   0200  ,Y.R
170: 0040  0040       .@...@..
180:          
190:          
1a0:  faff        
1b0:          
1c0:          
1d0:          
1e0:          
1f0:     0003 0100    

- Rerun same dump using Linux's kernel ethtool.

# ./install/sbin/dpdk-devbind --bind=ixgbe 04:00.0
# ethtool -m p2p1 raw on > meeprom-kernel.bin

# xxd meeprom-kernel.bin
000: 0304 0710  0100  0006 6702   g...
010: 0803 001e 4f45 4d20 2020 2020 2020 2020  OEM
020: 2020 2020  1b21 5346 502d 3130 472d  ...!SFP-10G-
030: 5352 2d49 5420 2020 4120 2020 0352 0024  SR-IT   A   .R.$
040: 003a  5751 3136 3034 3132 4131 3135  .:..WQ160412A115
050: 2020 2020 3135 3136 3130 2020 68fa 033b  151610  h..;
060:          
070:          
080:          
090:          
0a0:          
0b0:          
0c0:          
0d0:          
0e0:          
0f0:          
100: 5000 fb00 4b00  8ca0 7530 88b8 7918  P...K.u0..y.
110: 1d4c 01f4 1b58 03e8 3de9 03e8 2710 04eb  .L...X..=...'...
120: 2710 0064 1f07 007e      '..d...~
130:          
140:   3f80    0100   ?...
150: 0100  0100  0100   002d  ...-
160: 2899 8146 0058 0001 0001   8200  (..F.X..
170: 0540  0540       .@...@..
00

[dpdk-dev] [PATCH v5 2/5] examples/ethtool: add a new command module-eeprom

2018-04-25 Thread Zijie Pan
Add a new command "module-eeprom" to get the data of plugin
module EEPROM.

Signed-off-by: Zijie Pan 
Acked-by: Remy Horton 
---
Cc: remy.hor...@intel.com
Cc: or...@mellanox.com
Cc: bruce.richard...@intel.com
Cc: pablo.de.lara.gua...@intel.com
Cc: radu.nico...@intel.com
Cc: akhil.go...@nxp.com
Cc: tomasz.kante...@intel.com
Cc: john.mcnam...@intel.com
Cc: marko.kovace...@intel.com

 doc/guides/sample_app_ug/ethtool.rst  |3 ++
 examples/ethtool/ethtool-app/ethapp.c |   64 +
 examples/ethtool/lib/Makefile |1 +
 examples/ethtool/lib/rte_ethtool.c|   30 
 examples/ethtool/lib/rte_ethtool.h|   34 ++
 5 files changed, 132 insertions(+)

diff --git a/doc/guides/sample_app_ug/ethtool.rst 
b/doc/guides/sample_app_ug/ethtool.rst
index 1b79aca..47e09f6 100644
--- a/doc/guides/sample_app_ug/ethtool.rst
+++ b/doc/guides/sample_app_ug/ethtool.rst
@@ -44,6 +44,7 @@ they do as as follows:
 
 * ``drvinfo``: Print driver info
 * ``eeprom``: Dump EEPROM to file
+* ``module-eeprom``: Dump plugin module EEPROM to file
 * ``link``: Print port link states
 * ``macaddr``: Gets/sets MAC address
 * ``mtu``: Set NIC MTU
@@ -97,6 +98,8 @@ the following functions:
 - ``rte_ethtool_get_eeprom_len()``
 - ``rte_ethtool_get_eeprom()``
 - ``rte_ethtool_set_eeprom()``
+- ``rte_ethtool_get_module_info()``
+- ``rte_ethtool_get_module_eeprom()``
 - ``rte_ethtool_get_pauseparam()``
 - ``rte_ethtool_set_pauseparam()``
 - ``rte_ethtool_net_open()``
diff --git a/examples/ethtool/ethtool-app/ethapp.c 
b/examples/ethtool/ethtool-app/ethapp.c
index 4d62f4c..a4e64b3 100644
--- a/examples/ethtool/ethtool-app/ethapp.c
+++ b/examples/ethtool/ethtool-app/ethapp.c
@@ -75,6 +75,9 @@ struct pcmd_intintint_params {
 /* Commands taking port id and string */
 cmdline_parse_token_string_t pcmd_eeprom_token_cmd =
TOKEN_STRING_INITIALIZER(struct pcmd_intstr_params, cmd, "eeprom");
+cmdline_parse_token_string_t pcmd_module_eeprom_token_cmd =
+   TOKEN_STRING_INITIALIZER(struct pcmd_intstr_params, cmd,
+"module-eeprom");
 cmdline_parse_token_string_t pcmd_mtu_token_cmd =
TOKEN_STRING_INITIALIZER(struct pcmd_intstr_params, cmd, "mtu");
 cmdline_parse_token_string_t pcmd_regs_token_cmd =
@@ -298,6 +301,54 @@ struct pcmd_intintint_params {
 
 
 static void
+pcmd_module_eeprom_callback(void *ptr_params,
+   __rte_unused struct cmdline *ctx,
+   __rte_unused void *ptr_data)
+{
+   struct pcmd_intstr_params *params = ptr_params;
+   struct ethtool_eeprom info_eeprom;
+   uint32_t module_info[2];
+   int stat;
+   unsigned char bytes_eeprom[EEPROM_DUMP_CHUNKSIZE];
+   FILE *fp_eeprom;
+
+   if (!rte_eth_dev_is_valid_port(params->port)) {
+   printf("Error: Invalid port number %i\n", params->port);
+   return;
+   }
+
+   stat = rte_ethtool_get_module_info(params->port, module_info);
+   if (stat != 0) {
+   printf("Module EEPROM information read error %i\n", stat);
+   return;
+   }
+
+   info_eeprom.len = module_info[1];
+   info_eeprom.offset = 0;
+
+   stat = rte_ethtool_get_module_eeprom(params->port,
+&info_eeprom, bytes_eeprom);
+   if (stat != 0) {
+   printf("Module EEPROM read error %i\n", stat);
+   return;
+   }
+
+   fp_eeprom = fopen(params->opt, "wb");
+   if (fp_eeprom == NULL) {
+   printf("Error opening '%s' for writing\n", params->opt);
+   return;
+   }
+   printf("Total plugin module EEPROM length: %i bytes\n",
+  info_eeprom.len);
+   if (fwrite(bytes_eeprom, 1, info_eeprom.len,
+  fp_eeprom) != info_eeprom.len) {
+   printf("Error writing '%s'\n", params->opt);
+   }
+   fclose(fp_eeprom);
+}
+
+
+static void
 pcmd_pause_callback(void *ptr_params,
__rte_unused struct cmdline *ctx,
void *ptr_data)
@@ -664,6 +715,18 @@ static void pcmd_vlan_callback(__rte_unused void 
*ptr_params,
NULL
},
 };
+cmdline_parse_inst_t pcmd_module_eeprom = {
+   .f = pcmd_module_eeprom_callback,
+   .data = NULL,
+   .help_str = "module-eeprom  \n"
+   " Dump plugin module EEPROM to file",
+   .tokens = {
+   (void *)&pcmd_module_eeprom_token_cmd,
+   (void *)&pcmd_intstr_token_port,
+   (void *)&pcmd_intstr_token_opt,
+   NULL
+   },
+};
 cmdline_parse_inst_t pcmd_pause_noopt = {
.f = pcmd_pause_callback,
.data = (void *)0x01,
@@ -816,6 +879,7 @@ static void pcmd_vlan_callback(__rte_unused void 
*ptr_params,
 cmdline_parse_ctx_t list_prompt_commands[] = {
(cmdline_parse_inst_t *)&pcmd_drvinfo,
(cmdline_parse_inst_t *)&pcmd_eeprom,
+   (cmdline_parse_inst_t *)&pcmd_module_eeprom,
(cmdline_parse_inst

Re: [dpdk-dev] [PATCH v7 08/11] eal: replace rte_panic instances in interrupts thread

2018-04-25 Thread Burakov, Anatoly

On 24-Apr-18 11:16 PM, Arnon Warshavsky wrote:

replace panic calls with log and return value.
Thread function removes the noreturn attribute.

Signed-off-by: Arnon Warshavsky 
---


Just a general comment - i'm not too familiar with this code, but it 
looks like all of these failures will only happen on thread init. Can we 
make sure it starts?


You can use similar approach to Olivier's (recently merged) thread 
affinity patches, with pthread barriers etc. to ensure the thread has 
initialized properly, pthread_cancel() it if it didn't, and return -1 on 
thread init.


--
Thanks,
Anatoly


[dpdk-dev] [PATCH v5 3/5] net/ixgbe: add module EEPROM callbacks for ixgbe

2018-04-25 Thread Zijie Pan
Add new callbacks for eth_dev_ops of ixgbe to get the information
and data of plugin module eeprom.

Signed-off-by: Zijie Pan 
Acked-by: Remy Horton 
---
Cc: remy.hor...@intel.com
Cc: wenzhuo...@intel.com
Cc: konstantin.anan...@intel.com

 drivers/net/ixgbe/ixgbe_ethdev.c |   79 ++
 1 file changed, 79 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index a5e2fc0..55850df 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -327,6 +327,11 @@ static int ixgbe_get_eeprom(struct rte_eth_dev *dev,
 static int ixgbe_set_eeprom(struct rte_eth_dev *dev,
struct rte_dev_eeprom_info *eeprom);
 
+static int ixgbe_get_module_info(struct rte_eth_dev *dev,
+struct rte_dev_module_info *modinfo);
+static int ixgbe_get_module_eeprom(struct rte_eth_dev *dev,
+  struct rte_dev_eeprom_info *info);
+
 static int ixgbevf_get_reg_length(struct rte_eth_dev *dev);
 static int ixgbevf_get_regs(struct rte_eth_dev *dev,
struct rte_dev_reg_info *regs);
@@ -564,6 +569,8 @@ static int ixgbe_dev_udp_tunnel_port_del(struct rte_eth_dev 
*dev,
.get_eeprom_length= ixgbe_get_eeprom_length,
.get_eeprom   = ixgbe_get_eeprom,
.set_eeprom   = ixgbe_set_eeprom,
+   .get_module_info  = ixgbe_get_module_info,
+   .get_module_eeprom= ixgbe_get_module_eeprom,
.get_dcb_info = ixgbe_dev_get_dcb_info,
.timesync_adjust_time = ixgbe_timesync_adjust_time,
.timesync_read_time   = ixgbe_timesync_read_time,
@@ -7126,6 +7133,78 @@ static void ixgbevf_set_vfta_all(struct rte_eth_dev 
*dev, bool on)
return eeprom->ops.write_buffer(hw,  first, length, data);
 }
 
+static int
+ixgbe_get_module_info(struct rte_eth_dev *dev,
+ struct rte_dev_module_info *modinfo)
+{
+   struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   uint32_t status;
+   uint8_t sff8472_rev, addr_mode;
+   bool page_swap = false;
+
+   /* Check whether we support SFF-8472 or not */
+   status = hw->phy.ops.read_i2c_eeprom(hw,
+IXGBE_SFF_SFF_8472_COMP,
+&sff8472_rev);
+   if (status != 0)
+   return -EIO;
+
+   /* addressing mode is not supported */
+   status = hw->phy.ops.read_i2c_eeprom(hw,
+IXGBE_SFF_SFF_8472_SWAP,
+&addr_mode);
+   if (status != 0)
+   return -EIO;
+
+   if (addr_mode & IXGBE_SFF_ADDRESSING_MODE) {
+   PMD_DRV_LOG(ERR,
+   "Address change required to access page 0xA2, "
+   "but not supported. Please report the module "
+   "type to the driver maintainers.");
+   page_swap = true;
+   }
+
+   if (sff8472_rev == IXGBE_SFF_SFF_8472_UNSUP || page_swap) {
+   /* We have a SFP, but it does not support SFF-8472 */
+   modinfo->type = RTE_ETH_MODULE_SFF_8079;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8079_LEN;
+   } else {
+   /* We have a SFP which supports a revision of SFF-8472. */
+   modinfo->type = RTE_ETH_MODULE_SFF_8472;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8472_LEN;
+   }
+
+   return 0;
+}
+
+static int
+ixgbe_get_module_eeprom(struct rte_eth_dev *dev,
+   struct rte_dev_eeprom_info *info)
+{
+   struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   uint32_t status = IXGBE_ERR_PHY_ADDR_INVALID;
+   uint8_t databyte = 0xFF;
+   uint8_t *data = info->data;
+   uint32_t i = 0;
+
+   if (info->length == 0)
+   return -EINVAL;
+
+   for (i = info->offset; i < info->offset + info->length; i++) {
+   if (i < RTE_ETH_MODULE_SFF_8079_LEN)
+   status = hw->phy.ops.read_i2c_eeprom(hw, i, &databyte);
+   else
+   status = hw->phy.ops.read_i2c_sff8472(hw, i, &databyte);
+
+   if (status != 0)
+   return -EIO;
+
+   data[i - info->offset] = databyte;
+   }
+
+   return 0;
+}
+
 uint16_t
 ixgbe_reta_size_get(enum ixgbe_mac_type mac_type) {
switch (mac_type) {
-- 
1.7.10.4



[dpdk-dev] [PATCH v5 1/5] ethdev: add access to eeprom

2018-04-25 Thread Zijie Pan
add new APIs:
- rte_eth_dev_get_module_info
- rte_eth_dev_get_module_eeprom

Signed-off-by: Zijie Pan 
Acked-by: Remy Horton 
---
Cc: remy.hor...@intel.com
Cc: john.mcnam...@intel.com
Cc: marko.kovace...@intel.com
Cc: tho...@monjalon.net

 doc/guides/nics/features.rst|   11 
 lib/librte_ether/rte_dev_info.h |   18 +
 lib/librte_ether/rte_ethdev.c   |   26 +++
 lib/librte_ether/rte_ethdev.h   |   43 +++
 lib/librte_ether/rte_ethdev_core.h  |   12 +
 lib/librte_ether/rte_ethdev_version.map |2 ++
 6 files changed, 112 insertions(+)

diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index 1b4fb97..bb183e2 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -749,6 +749,17 @@ Supports getting/setting device eeprom data.
   ``rte_eth_dev_set_eeprom()``.
 
 
+.. _nic_features_module_eeprom_dump:
+
+Module EEPROM dump
+--
+
+Supports getting information and data of plugin module eeprom.
+
+* **[implements] eth_dev_ops**: ``get_module_info``, ``get_module_eeprom``.
+* **[related]API**: ``rte_eth_dev_get_module_info()``, 
``rte_eth_dev_get_module_eeprom()``.
+
+
 .. _nic_features_register_dump:
 
 Registers dump
diff --git a/lib/librte_ether/rte_dev_info.h b/lib/librte_ether/rte_dev_info.h
index 6b68584..02bdbed 100644
--- a/lib/librte_ether/rte_dev_info.h
+++ b/lib/librte_ether/rte_dev_info.h
@@ -28,4 +28,22 @@ struct rte_dev_eeprom_info {
uint32_t magic; /**< Device-specific key, such as device-id */
 };
 
+/**
+ * Placeholder for accessing plugin module eeprom
+ */
+struct rte_dev_module_info {
+   uint32_t type; /**< Type of plugin module eeprom */
+   uint32_t eeprom_len; /**< Length of plugin module eeprom */
+};
+
+/* EEPROM Standards for plug in modules */
+#define RTE_ETH_MODULE_SFF_8079 0x1
+#define RTE_ETH_MODULE_SFF_8079_LEN 256
+#define RTE_ETH_MODULE_SFF_8472 0x2
+#define RTE_ETH_MODULE_SFF_8472_LEN 512
+#define RTE_ETH_MODULE_SFF_8636 0x3
+#define RTE_ETH_MODULE_SFF_8636_LEN 256
+#define RTE_ETH_MODULE_SFF_8436 0x4
+#define RTE_ETH_MODULE_SFF_8436_LEN 256
+
 #endif /* _RTE_DEV_INFO_H_ */
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 7821a88..10bc19c 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3947,6 +3947,32 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, 
uint16_t queue_idx,
return eth_err(port_id, (*dev->dev_ops->set_eeprom)(dev, info));
 }
 
+int __rte_experimental
+rte_eth_dev_get_module_info(uint16_t port_id,
+   struct rte_dev_module_info *modinfo)
+{
+   struct rte_eth_dev *dev;
+
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+
+   dev = &rte_eth_devices[port_id];
+   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_module_info, -ENOTSUP);
+   return (*dev->dev_ops->get_module_info)(dev, modinfo);
+}
+
+int __rte_experimental
+rte_eth_dev_get_module_eeprom(uint16_t port_id,
+ struct rte_dev_eeprom_info *info)
+{
+   struct rte_eth_dev *dev;
+
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+
+   dev = &rte_eth_devices[port_id];
+   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_module_eeprom, -ENOTSUP);
+   return (*dev->dev_ops->get_module_eeprom)(dev, info);
+}
+
 int
 rte_eth_dev_get_dcb_info(uint16_t port_id,
 struct rte_eth_dcb_info *dcb_info)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 7e4e57b..b39ccc4 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -3311,6 +3311,49 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t 
queue_id,
 int rte_eth_dev_set_eeprom(uint16_t port_id, struct rte_dev_eeprom_info *info);
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve the type and size of plugin module EEPROM
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param modinfo
+ *   The type and size of plugin module EEPROM.
+ * @return
+ *   - (0) if successful.
+ *   - (-ENOTSUP) if hardware doesn't support.
+ *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
+ *   - others depends on the specific operations implementation.
+ */
+int __rte_experimental
+rte_eth_dev_get_module_info(uint16_t port_id,
+   struct rte_dev_module_info *modinfo);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve the data of plugin module EEPROM
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param info
+ *   The template includes the plugin module EEPROM attributes, and the
+ *   buffer for return plugin module EEPROM data.
+ * @return
+ *   - (0) if successfu

[dpdk-dev] [PATCH v5 4/5] net/e1000: add module EEPROM callbacks for e1000

2018-04-25 Thread Zijie Pan
Add new callbacks for eth_dev_ops of e1000 to get the information and
data of plugin module EEPROM.

Signed-off-by: Zijie Pan 
Acked-by: Remy Horton 
---
Cc: wenzhuo...@intel.com

 drivers/net/e1000/base/e1000_phy.h |8 
 drivers/net/e1000/igb_ethdev.c |   86 
 2 files changed, 94 insertions(+)

diff --git a/drivers/net/e1000/base/e1000_phy.h 
b/drivers/net/e1000/base/e1000_phy.h
index 3e45a9e..2cd0e14 100644
--- a/drivers/net/e1000/base/e1000_phy.h
+++ b/drivers/net/e1000/base/e1000_phy.h
@@ -330,4 +330,12 @@ struct sfp_e1000_flags {
 #define E1000_SFF_VENDOR_OUI_AVAGO 0x00176A00
 #define E1000_SFF_VENDOR_OUI_INTEL 0x001B2100
 
+/* EEPROM byte offsets */
+#define IGB_SFF_8472_SWAP  0x5C
+#define IGB_SFF_8472_COMP  0x5E
+
+/* Bitmasks */
+#define IGB_SFF_ADDRESSING_MODE0x4
+#define IGB_SFF_8472_UNSUP 0x00
+
 #endif
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 9b808a9..4b261cc 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -223,6 +223,10 @@ static int eth_igb_get_eeprom(struct rte_eth_dev *dev,
struct rte_dev_eeprom_info *eeprom);
 static int eth_igb_set_eeprom(struct rte_eth_dev *dev,
struct rte_dev_eeprom_info *eeprom);
+static int eth_igb_get_module_info(struct rte_eth_dev *dev,
+  struct rte_dev_module_info *modinfo);
+static int eth_igb_get_module_eeprom(struct rte_eth_dev *dev,
+struct rte_dev_eeprom_info *info);
 static int eth_igb_set_mc_addr_list(struct rte_eth_dev *dev,
struct ether_addr *mc_addr_set,
uint32_t nb_mc_addr);
@@ -402,6 +406,8 @@ static void eth_igb_write_ivar(struct e1000_hw *hw, uint8_t 
msix_vector,
.get_eeprom_length= eth_igb_get_eeprom_length,
.get_eeprom   = eth_igb_get_eeprom,
.set_eeprom   = eth_igb_set_eeprom,
+   .get_module_info  = eth_igb_get_module_info,
+   .get_module_eeprom= eth_igb_get_module_eeprom,
.timesync_adjust_time = igb_timesync_adjust_time,
.timesync_read_time   = igb_timesync_read_time,
.timesync_write_time  = igb_timesync_write_time,
@@ -5329,6 +5335,86 @@ static void igbvf_set_vfta_all(struct rte_eth_dev *dev, 
bool on)
 }
 
 static int
+eth_igb_get_module_info(struct rte_eth_dev *dev,
+   struct rte_dev_module_info *modinfo)
+{
+   struct e1000_hw *hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   uint32_t status = 0;
+   uint16_t sff8472_rev, addr_mode;
+   bool page_swap = false;
+
+   if (hw->phy.media_type == e1000_media_type_copper ||
+   hw->phy.media_type == e1000_media_type_unknown)
+   return -EOPNOTSUPP;
+
+   /* Check whether we support SFF-8472 or not */
+   status = e1000_read_phy_reg_i2c(hw, IGB_SFF_8472_COMP, &sff8472_rev);
+   if (status)
+   return -EIO;
+
+   /* addressing mode is not supported */
+   status = e1000_read_phy_reg_i2c(hw, IGB_SFF_8472_SWAP, &addr_mode);
+   if (status)
+   return -EIO;
+
+   /* addressing mode is not supported */
+   if ((addr_mode & 0xFF) & IGB_SFF_ADDRESSING_MODE) {
+   PMD_DRV_LOG(ERR,
+   "Address change required to access page 0xA2, "
+   "but not supported. Please report the module "
+   "type to the driver maintainers.\n");
+   page_swap = true;
+   }
+
+   if ((sff8472_rev & 0xFF) == IGB_SFF_8472_UNSUP || page_swap) {
+   /* We have an SFP, but it does not support SFF-8472 */
+   modinfo->type = RTE_ETH_MODULE_SFF_8079;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8079_LEN;
+   } else {
+   /* We have an SFP which supports a revision of SFF-8472 */
+   modinfo->type = RTE_ETH_MODULE_SFF_8472;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8472_LEN;
+   }
+
+   return 0;
+}
+
+static int
+eth_igb_get_module_eeprom(struct rte_eth_dev *dev,
+ struct rte_dev_eeprom_info *info)
+{
+   struct e1000_hw *hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   uint32_t status = 0;
+   uint16_t dataword[RTE_ETH_MODULE_SFF_8472_LEN / 2 + 1];
+   u16 first_word, last_word;
+   int i = 0;
+
+   if (info->length == 0)
+   return -EINVAL;
+
+   first_word = info->offset >> 1;
+   last_word = (info->offset + info->length - 1) >> 1;
+
+   /* Read EEPROM block, SFF-8079/SFF-8472, word at a time */
+   for (i = 0; i < last_word - first_word + 1; i++) {
+   status = e1000_read_phy_reg_i2c(hw, (first_word + i) * 2,
+   &dataword[i]);
+   if (stat

[dpdk-dev] [PATCH v5 5/5] net/i40e: add module EEPROM callbacks for i40e

2018-04-25 Thread Zijie Pan
Add new callbacks for eth_dev_ops of i40e to get the information
and data of plugin module eeprom.

Signed-off-by: Zijie Pan 
Acked-by: Remy Horton 
---
Cc: beilei.x...@intel.com
Cc: qi.z.zh...@intel.com

 drivers/net/i40e/i40e_ethdev.c |  147 
 1 file changed, 147 insertions(+)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 180ac74..1a24cd7 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -369,6 +369,11 @@ static int i40e_get_regs(struct rte_eth_dev *dev,
 static int i40e_get_eeprom(struct rte_eth_dev *dev,
   struct rte_dev_eeprom_info *eeprom);
 
+static int i40e_get_module_info(struct rte_eth_dev *dev,
+   struct rte_dev_module_info *modinfo);
+static int i40e_get_module_eeprom(struct rte_eth_dev *dev,
+ struct rte_dev_eeprom_info *info);
+
 static int i40e_set_default_mac_addr(struct rte_eth_dev *dev,
  struct ether_addr *mac_addr);
 
@@ -489,6 +494,8 @@ static int i40e_sw_tunnel_filter_insert(struct i40e_pf *pf,
.get_reg  = i40e_get_regs,
.get_eeprom_length= i40e_get_eeprom_length,
.get_eeprom   = i40e_get_eeprom,
+   .get_module_info  = i40e_get_module_info,
+   .get_module_eeprom= i40e_get_module_eeprom,
.mac_addr_set = i40e_set_default_mac_addr,
.mtu_set  = i40e_dev_mtu_set,
.tm_ops_get   = i40e_tm_ops_get,
@@ -11327,6 +11334,146 @@ static int i40e_get_eeprom(struct rte_eth_dev *dev,
return 0;
 }
 
+static int i40e_get_module_info(struct rte_eth_dev *dev,
+   struct rte_dev_module_info *modinfo)
+{
+   struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   uint32_t sff8472_comp = 0;
+   uint32_t sff8472_swap = 0;
+   uint32_t sff8636_rev = 0;
+   i40e_status status;
+   uint32_t type = 0;
+
+   /* Check if firmware supports reading module EEPROM. */
+   if (!(hw->flags & I40E_HW_FLAG_AQ_PHY_ACCESS_CAPABLE)) {
+   PMD_DRV_LOG(ERR,
+   "Module EEPROM memory read not supported. "
+   "Please update the NVM image.\n");
+   return -EINVAL;
+   }
+
+   status = i40e_update_link_info(hw);
+   if (status)
+   return -EIO;
+
+   if (hw->phy.link_info.phy_type == I40E_PHY_TYPE_EMPTY) {
+   PMD_DRV_LOG(ERR,
+   "Cannot read module EEPROM memory. "
+   "No module connected.\n");
+   return -EINVAL;
+   }
+
+   type = hw->phy.link_info.module_type[0];
+
+   switch (type) {
+   case I40E_MODULE_TYPE_SFP:
+   status = i40e_aq_get_phy_register(hw,
+   I40E_AQ_PHY_REG_ACCESS_EXTERNAL_MODULE,
+   I40E_I2C_EEPROM_DEV_ADDR,
+   I40E_MODULE_SFF_8472_COMP,
+   &sff8472_comp, NULL);
+   if (status)
+   return -EIO;
+
+   status = i40e_aq_get_phy_register(hw,
+   I40E_AQ_PHY_REG_ACCESS_EXTERNAL_MODULE,
+   I40E_I2C_EEPROM_DEV_ADDR,
+   I40E_MODULE_SFF_8472_SWAP,
+   &sff8472_swap, NULL);
+   if (status)
+   return -EIO;
+
+   /* Check if the module requires address swap to access
+* the other EEPROM memory page.
+*/
+   if (sff8472_swap & I40E_MODULE_SFF_ADDR_MODE) {
+   PMD_DRV_LOG(WARNING,
+   "Module address swap to access "
+   "page 0xA2 is not supported.\n");
+   modinfo->type = RTE_ETH_MODULE_SFF_8079;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8079_LEN;
+   } else if (sff8472_comp == 0x00) {
+   /* Module is not SFF-8472 compliant */
+   modinfo->type = RTE_ETH_MODULE_SFF_8079;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8079_LEN;
+   } else {
+   modinfo->type = RTE_ETH_MODULE_SFF_8472;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8472_LEN;
+   }
+   break;
+   case I40E_MODULE_TYPE_QSFP_PLUS:
+   /* Read from memory page 0. */
+   status = i40e_aq_get_phy_register(hw,
+   I40E_AQ_PHY_REG_ACCESS_EXTERNAL_MODULE,
+   0,
+   I40E_MODULE_REVISION_ADDR,
+   &sff8636_rev, NULL

Re: [dpdk-dev] [PATCH v3] net/tap: remove queue specific offload support

2018-04-25 Thread Ophir Munk
Hi Ferruh,

I should have mentioned earlier that TAP does support queue specific 
capabilities. 
Please look in tap_queue_setup() and note that each TAP queue is created with a 
distinct file descriptor (fd).
Then supporting an offload capability is just implementing it in SW (e.g. 
calculating IP checksum).

If the main assumption of this patch was that TAP does not support queue 
specific offloads - then please consider this patch again.

On the other hand there is no port specific capability supported by TAP. 
However, in order to support legacy applications, port capabilities are usually 
reported as the OR operation between queue & port capabilities. 
TAP currently clones the queue capabilities to port capabilities. We could 
optimize this cloning by always return queue capabilities when queried about 
queues or ports. In this case - tap_rx_offload_get_port_capa() and 
tap_tx_offload_get_port_capa() could be removed.

Please find more comments inline.

> -Original Message-
> From: Ferruh Yigit [mailto:ferruh.yi...@intel.com]
> Sent: Tuesday, April 24, 2018 8:54 PM
> To: Pascal Mazon 
> Cc: dev@dpdk.org; Ferruh Yigit ; Mordechay
> Haimovsky ; Ophir Munk 
> Subject: [PATCH v3] net/tap: remove queue specific offload support
> 
> It is not clear if tap PMD supports queue specific offloads, removing the
> related code.
> 
> Fixes: 95ae196ae10b ("net/tap: use new Rx offloads API")
> Fixes: 818fe14a9891 ("net/tap: use new Tx offloads API")
> Cc: mo...@mellanox.com
> 
> Signed-off-by: Ferruh Yigit 
> ---
> Cc: Ophir Munk 
> 
> v2:
> * rebased
> 
> v3:
> * txq->csum restored,
>   - ETH_TXQ_FLAGS_IGNORE check removed since ethdev layer takes care of
> it
>   - tx_conf != NULL check removed, this is internal api who calls this is
>   ethdev and it doesn't pass null tx_conf
> ---
>  drivers/net/tap/rte_eth_tap.c | 102 
> +-
>  1 file changed, 10 insertions(+), 92 deletions(-)
> 
> diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
> index ef33aace9..61b4b5df3 100644
> --- a/drivers/net/tap/rte_eth_tap.c
> +++ b/drivers/net/tap/rte_eth_tap.c
> @@ -278,31 +278,6 @@ tap_rx_offload_get_port_capa(void)
>  DEV_RX_OFFLOAD_CRC_STRIP;
>  }
> 
> -static uint64_t
> -tap_rx_offload_get_queue_capa(void)
> -{
> - return DEV_RX_OFFLOAD_SCATTER |
> -DEV_RX_OFFLOAD_IPV4_CKSUM |
> -DEV_RX_OFFLOAD_UDP_CKSUM |
> -DEV_RX_OFFLOAD_TCP_CKSUM |
> -DEV_RX_OFFLOAD_CRC_STRIP;
> -}
> -

TAP PMD supports all of these RX queue specific offloads. I suggest to leave 
this function in place.

> -static bool
> -tap_rxq_are_offloads_valid(struct rte_eth_dev *dev, uint64_t offloads) -{
> - uint64_t port_offloads = dev->data->dev_conf.rxmode.offloads;
> - uint64_t queue_supp_offloads = tap_rx_offload_get_queue_capa();
> - uint64_t port_supp_offloads = tap_rx_offload_get_port_capa();
> -
> - if ((offloads & (queue_supp_offloads | port_supp_offloads)) !=
> - offloads)
> - return false;
> - if ((port_offloads ^ offloads) & port_supp_offloads)
> - return false;
> - return true;
> -}
> -

Putting aside the fact that queue offloads equals port offloads (so could 
ignore "port_supp_offload" variable) - this function is essential to validate 
that the configured Rx offloads are supported by TAP. I suggest to leave this 
function in place.
Without it - testpmd falsely confirms non supported offloads.
For example before this patch: offloading "hw-vlan-filter" will fail as 
expected:

testpmd> port config all
testpmd> port config all hw-vlan-filter on
testpmd> port start all
Configuring Port 0 (socket 0)
PMD: net_tap0: 0x1209fc0: TX configured queues number: 1
PMD: net_tap0: 0x1209fc0: RX configured queues number: 1
PMD: 0x1209fc0: Rx queue offloads 0x120e don't match port offloads 0x120e or 
supported offloads 0x300e
Fail to configure port 0 rx queues

However, with this patch this configuration is falsely accepted.

>  /* Callback to handle the rx burst of packets to the correct interface and
>   * file descriptor(s) in a multi-queue setup.
>   */
> @@ -411,31 +386,6 @@ tap_tx_offload_get_port_capa(void)
>  DEV_TX_OFFLOAD_TCP_CKSUM;
>  }
> 
> -static uint64_t
> -tap_tx_offload_get_queue_capa(void)
> -{
> - return DEV_TX_OFFLOAD_MULTI_SEGS |
> -DEV_TX_OFFLOAD_IPV4_CKSUM |
> -DEV_TX_OFFLOAD_UDP_CKSUM |
> -DEV_TX_OFFLOAD_TCP_CKSUM;
> -}
> -

TAP PMD supports all of these TX queue specific offloads. I suggest to leave 
this function in place.

> -static bool
> -tap_txq_are_offloads_valid(struct rte_eth_dev *dev, uint64_t offloads) -{
> - uint64_t port_offloads = dev->data->dev_conf.txmode.offloads;
> - uint64_t queue_supp_offloads = tap_tx_offload_get_queue_capa();
> - uint64_t port_supp_offloads = tap_tx_offload_get_port_capa();
> -
> - if ((offloads & (queue_supp_offloads | port_supp_offloads)) !=
> - 

Re: [dpdk-dev] [PATCH v4 1/2] mbuf: support attaching external buffer to mbuf

2018-04-25 Thread Yongseok Koh
> On Apr 25, 2018, at 2:08 AM, Yongseok Koh  wrote:
>> On Apr 25, 2018, at 1:28 AM, Olivier Matz  wrote:
>> 
>> Hi Yongseok,
>> 
>> On Tue, Apr 24, 2018 at 06:02:44PM +0200, Olivier Matz wrote:
> @@ -688,14 +704,33 @@ rte_mbuf_to_baddr(struct rte_mbuf *md)
> }
> /**
> + * Returns TRUE if given mbuf is cloned by mbuf indirection, or FALSE
> + * otherwise.
> + *
> + * If a mbuf has its data in another mbuf and references it by mbuf
> + * indirection, this mbuf can be defined as a cloned mbuf.
> + */
> +#define RTE_MBUF_CLONED(mb) ((mb)->ol_flags & IND_ATTACHED_MBUF)
> +
> +/**
>  * Returns TRUE if given mbuf is indirect, or FALSE otherwise.
>  */
> -#define RTE_MBUF_INDIRECT(mb)   ((mb)->ol_flags & IND_ATTACHED_MBUF)
> +#define RTE_MBUF_INDIRECT(mb)   RTE_MBUF_CLONED(mb)
 
 It is still confusing that INDIRECT != !DIRECT.
 May be we have no good options right now, but I'd suggest to at least
 deprecate
 RTE_MBUF_INDIRECT() and completely remove it in the next release.
>>> 
>>> Agree. I may have missed something, but is my previous suggestion
>>> not doable?
>>> 
>>> - direct = embeds its own data  (and indirect = !direct)
>>> - clone (or another name) = data is another mbuf
>>> - extbuf = data is in an external buffer
>> 
>> Any comment about this option?
> 
> I liked your idea, so I defined RTE_MBUF_CLONED() and wanted to deprecate
> RTE_MBUF_INDIRECT() in the coming release. But RTE_MBUF_DIRECT() can't be
> (!RTE_MBUF_INDIRECT()) because it will logically include 
> RTE_MBUF_HAS_EXTBUF().
> I'm not sure I understand you correctly.
> 
> Can you please give me more guidelines so that I can take you idea?

Maybe, did you mean the following? Looks like doable but RTE_MBUF_DIRECT()
can't logically mean 'mbuf embeds its own data', right?

#define RTE_MBUF_INDIRECT(mb)   ((mb)->ol_flags & IND_ATTACHED_MBUF)
#define RTE_MBUF_DIRECT(mb) (!RTE_MBUF_INDIRECT(mb))

#define RTE_MBUF_HAS_EXTBUF(mb) ((mb)->ol_flags & EXT_ATTACHED_MBUF)
#define RTE_MBUF_CLONED(mb) ((mb)->ol_flags & IND_ATTACHED_MBUF)

[...]

@@ -1327,7 +1572,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
 
if (likely(rte_mbuf_refcnt_read(m) == 1)) {
 
-   if (RTE_MBUF_INDIRECT(m))
+   if (RTE_MBUF_INDIRECT(m) || RTE_MBUF_HAS_EXTBUF(m))
rte_pktmbuf_detach(m);
 
if (m->next != NULL) {
@@ -1339,7 +1584,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
 
} else if (__rte_mbuf_refcnt_update(m, -1) == 0) {
 
-   if (RTE_MBUF_INDIRECT(m))
+   if (RTE_MBUF_INDIRECT(m) || RTE_MBUF_HAS_EXTBUF(m))
rte_pktmbuf_detach(m);
 
if (m->next != NULL) {


Thanks,
Yongseok



Re: [dpdk-dev] [PATCH v5 0/5] get the information and data of EEPROM

2018-04-25 Thread Thomas Monjalon
25/04/2018 11:13, Zijie Pan:
> v5 changes:
>  - insert the new APIs in alphabetical order in rte_ethdev_version.map.

I think there is a misunderstanding.
I was asking to rename rte_dev_module_info as rte_eth_dev_module_info.





Re: [dpdk-dev] [PATCH] eventdev: convert eth Rx adapter files to SPDX license tag

2018-04-25 Thread Hemant Agrawal

Hi Nikhil,


On 4/25/2018 4:02 AM, Nikhil Rao wrote:

Signed-off-by: Nikhil Rao 
---
  lib/librte_eventdev/rte_event_eth_rx_adapter.h | 32 +++---
  lib/librte_eventdev/rte_event_eth_rx_adapter.c |  4 
  2 files changed, 7 insertions(+), 29 deletions(-)

diff --git a/lib/librte_eventdev/rte_event_eth_rx_adapter.h 
b/lib/librte_eventdev/rte_event_eth_rx_adapter.h
index fc9da14..e6a6435 100644
--- a/lib/librte_eventdev/rte_event_eth_rx_adapter.h
+++ b/lib/librte_eventdev/rte_event_eth_rx_adapter.h
@@ -1,32 +1,6 @@
-/*
- *   Copyright(c) 2017 Intel Corporation. All rights reserved.
- *   All rights reserved.
- *
- *   Redistribution and use in source and binary forms, with or without
- *   modification, are permitted provided that the following conditions
- *   are met:
- *
- * * Redistributions of source code must retain the above copyright
- *   notice, this list of conditions and the following disclaimer.
- * * Redistributions in binary form must reproduce the above copyright
- *   notice, this list of conditions and the following disclaimer in
- *   the documentation and/or other materials provided with the
- *   distribution.
- * * Neither the name of Intel Corporation nor the names of its
- *   contributors may be used to endorse or promote products derived
- *   from this software without specific prior written permission.
- *
- *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2017 Intel Corporation.
+ * All rights reserved.
   */
  

FYI, Intel copyrights is not using "All rights reserved." in most patches.

Acked-by: Hemant Agrawal 




Re: [dpdk-dev] [PATCH v5 0/5] get the information and data of EEPROM

2018-04-25 Thread Zijie Pan
Hi Thomas,
> > v5 changes:
> >  - insert the new APIs in alphabetical order in rte_ethdev_version.map.
>
> I think there is a misunderstanding.
> I was asking to rename rte_dev_module_info as rte_eth_dev_module_info.
I will update the patch right now.

Thanks & Regards,
Zijie

Re: [dpdk-dev] [PATCH v7 10/11] eal: replace rte_panic instances in init sequence

2018-04-25 Thread Arnon Warshavsky
<...>

>
>   + if (rte_config_init() != 0) {
>> +   rte_eal_init_alert("Failed to init configuration");
>> +   rte_errno = EFAULT;
>> +   return -1;
>> +   }
>> +
>> +   if (rte_mp_channel_init() < 0) {
>> +   rte_eal_init_alert("failed to init mp channel\n");
>> +   if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
>> +   rte_errno = EFAULT;
>> +   return -1;
>> +   }
>> +   }
>>
>
> ^^^ this change looks unintended. Rebase artifact?
>
> +
>> /* in secondary processes, memory init may allocate additional
>> fbarrays
>>  * not present in primary processes, so to avoid any potential
>> issues,
>>  * initialize memzones first.
>> @@ -671,6 +712,7 @@ static void rte_eal_init_alert(const char *msg)
>>  */
>> if (pipe(lcore_config[i].pipe_master2slave) < 0)
>> rte_panic("Cannot create pipe\n");
>> +
>> if (pipe(lcore_config[i].pipe_slave2master) < 0)
>> rte_panic("Cannot create pipe\n");
>>
>
> ^^^ this looks unintended as well.
>
>   diff --git a/lib/librte_eal/linuxapp/eal/eal.c
>> b/lib/librte_eal/linuxapp/eal/eal.c
>> index 5b23bf0..54adaec 100644
>> --- a/lib/librte_eal/linuxapp/eal/eal.c
>> +++ b/lib/librte_eal/linuxapp/eal/eal.c
>> @@ -160,7 +160,7 @@ enum rte_iova_mode
>>* We also don't lock the whole file, so that in future we can use
>> read-locks
>>* on other parts, e.g. memzones, to detect if there are running
>> secondary
>>* processes. */
>> -static void
>> +static int
>>   rte_eal_config_create(void)
>>   {
>> void *rte_mem_cfg_addr;
>> @@ -169,7 +169,7 @@ enum rte_iova_mode
>> const char *pathname = eal_runtime_config_path();
>> if (internal_config.no_shconf)
>> -   return;
>>
>
> <...>
>
> }
>> rte_mem_cfg_addr = mmap(rte_mem_cfg_addr,
>> sizeof(*rte_config.mem_config),
>> PROT_READ | PROT_WRITE, MAP_SHARED,
>> mem_cfg_fd, 0);
>>   - if (rte_mem_cfg_addr == MAP_FAILED){
>> -   rte_panic("Cannot mmap memory for rte_config\n");
>> +   if (rte_mem_cfg_addr == MAP_FAILED) {
>> +   RTE_LOG(CRIT, EAL, "%s(): Cannot mmap memory for
>> rte_config\n",
>> +   __func__);
>> +   return -1;
>> }
>>
>
> I think you forgot to close mem_cfg_fd and set it to -1 in case of error
> here.
>
> memcpy(rte_mem_cfg_addr, &early_mem_config,
>> sizeof(early_mem_config));
>> rte_config.mem_config = rte_mem_cfg_addr;
>> @@ -211,10 +221,11 @@ enum rte_iova_mode
>>  * processes could later map the config into this exact location
>> */
>> rte_config.mem_config->mem_cfg_addr = (uintptr_t)
>> rte_mem_cfg_addr;
>>   + return 0;
>>   }
>>
>>
>
> <...>
>
> /* map it as read-only first */
>> mem_config = (struct rte_mem_config *) mmap(NULL,
>> sizeof(*mem_config),
>> PROT_READ, MAP_SHARED, mem_cfg_fd, 0);
>> -   if (mem_config == MAP_FAILED)
>> -   rte_panic("Cannot mmap memory for rte_config! error %i
>> (%s)\n",
>> - errno, strerror(errno));
>> +   if (mem_config == MAP_FAILED) {
>> +   mem_cfg_fd = -1;
>>
>
> Forgot close() here, i think.
>
> +   RTE_LOG(CRIT, EAL, "%s(): Cannot mmap memory for
>> rte_config! error %i (%s)\n",
>> +   __func__, errno, strerror(errno));
>> +   return -1;
>> +   }
>> rte_config.mem_config = mem_config;
>> +
>> +   return 0;
>>   }
>> /* reattach the shared config at exact memory location primary
>> process has it */
>>
>
> <...>
>
> +   if (rte_config_init() != 0)
>> +   return -1;
>> +
>> if (rte_eal_log_init(logid, internal_config.syslog_facility) <
>> 0) {
>> rte_eal_init_alert("Cannot init logging.");
>> rte_errno = ENOMEM;
>> @@ -914,6 +946,7 @@ static void rte_eal_init_alert(const char *msg)
>>  */
>> if (pipe(lcore_config[i].pipe_master2slave) < 0)
>> rte_panic("Cannot create pipe\n");
>> +
>> if (pipe(lcore_config[i].pipe_slave2master) < 0)
>>
>
> Again, looks like unintended whitespace change.
>
> rte_panic("Cannot create pipe\n");
>>
>>
>
> Thanks Anatoly


Re: [dpdk-dev] [PATCH v7 08/11] eal: replace rte_panic instances in interrupts thread

2018-04-25 Thread Arnon Warshavsky
> Just a general comment - i'm not too familiar with this code, but it looks
> like all of these failures will only happen on thread init. Can we make
> sure it starts?
>
> You can use similar approach to Olivier's (recently merged) thread
> affinity patches, with pthread barriers etc. to ensure the thread has
> initialized properly, pthread_cancel() it if it didn't, and return -1 on
> thread init.


Thanks for catching this one. I am now reverting it as well from this set,
to be properly handled in a set dedicated to the threading issue


Re: [dpdk-dev] [PATCH v7 02/11] bond: replace rte_panic instances in bonding driver

2018-04-25 Thread Arnon Warshavsky
>
>
> You mixed dpaa2 with bonding in this set.
>

Indeed. Thanks


Re: [dpdk-dev] [PATCH v3 4/5] log: add ability to match dynamic log based on shell pattern

2018-04-25 Thread Thomas Monjalon
25/04/2018 05:17, Stephen Hemminger:
> --- a/lib/librte_eal/rte_eal_version.map
> +++ b/lib/librte_eal/rte_eal_version.map
> @@ -209,6 +209,13 @@ DPDK_18.02 {
>  
>  }  DPDK_17.11;
>  
> +DPDK_18.05 {
> + global:
> +
> + rte_log_set_level_pattern;
> +
> +} DPDK_18_02;

Must be DPDK_18.02 with a dot. Yes it is a trap.




Re: [dpdk-dev] [PATCH v7 6/9] ethdev: add common devargs parser

2018-04-25 Thread Remy Horton


On 24/04/2018 20:53, Thomas Monjalon wrote:
[..]

But I would like to review the devargs you are standardizing.
Unfortunately, I cannot find a documentation about it.
How users are supposed to use it?
Can you, at least, describe the syntax in the commit log, please?


The patch follows this pseudo-BNF:

cfg   := pair (',' pair)*
pair  := (key '=' value)
key   := 'port' | 'representor'
value := range | list
range := int ('-' int)?
int   := [0-9]+
list  := '[' range (',' range)* ']'


Re: [dpdk-dev] [PATCH v3] net/tap: remove queue specific offload support

2018-04-25 Thread Ferruh Yigit
On 4/25/2018 10:18 AM, Ophir Munk wrote:
> Hi Ferruh,
> 
> I should have mentioned earlier that TAP does support queue specific 
> capabilities. 
> Please look in tap_queue_setup() and note that each TAP queue is created with 
> a distinct file descriptor (fd).
> Then supporting an offload capability is just implementing it in SW (e.g. 
> calculating IP checksum).
> 
> If the main assumption of this patch was that TAP does not support queue 
> specific offloads - then please consider this patch again.

Yes that was the initial question, is tap supports queue specific offloads or
not. Thanks for the answer.

> 
> On the other hand there is no port specific capability supported by TAP. 

If so verify functions are wrong, that was the error I got.
It seems copy/paste of mlx one but the port_supp_offloads has different meaning
there.

> However, in order to support legacy applications, port capabilities are 
> usually reported as the OR operation between queue & port capabilities. 
> TAP currently clones the queue capabilities to port capabilities. We could 
> optimize this cloning by always return queue capabilities when queried about 
> queues or ports. In this case - tap_rx_offload_get_port_capa() and 
> tap_tx_offload_get_port_capa() could be removed.

Instead of removing the functions I think you can keep them but return correct
values, in this case return empty, this will make the exiting validation
functions correct.

Can you send a fix for that?
If no fix sent, I suggest going with this patch to remove queue level offload
support until it is fixed.

> 
> Please find more comments inline.
> 
>> -Original Message-
>> From: Ferruh Yigit [mailto:ferruh.yi...@intel.com]
>> Sent: Tuesday, April 24, 2018 8:54 PM
>> To: Pascal Mazon 
>> Cc: dev@dpdk.org; Ferruh Yigit ; Mordechay
>> Haimovsky ; Ophir Munk 
>> Subject: [PATCH v3] net/tap: remove queue specific offload support
>>
>> It is not clear if tap PMD supports queue specific offloads, removing the
>> related code.
>>
>> Fixes: 95ae196ae10b ("net/tap: use new Rx offloads API")
>> Fixes: 818fe14a9891 ("net/tap: use new Tx offloads API")
>> Cc: mo...@mellanox.com
>>
>> Signed-off-by: Ferruh Yigit 
>> ---
>> Cc: Ophir Munk 
>>
>> v2:
>> * rebased
>>
>> v3:
>> * txq->csum restored,
>>   - ETH_TXQ_FLAGS_IGNORE check removed since ethdev layer takes care of
>> it
>>   - tx_conf != NULL check removed, this is internal api who calls this is
>>   ethdev and it doesn't pass null tx_conf
>> ---
>>  drivers/net/tap/rte_eth_tap.c | 102 
>> +-
>>  1 file changed, 10 insertions(+), 92 deletions(-)
>>
>> diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
>> index ef33aace9..61b4b5df3 100644
>> --- a/drivers/net/tap/rte_eth_tap.c
>> +++ b/drivers/net/tap/rte_eth_tap.c
>> @@ -278,31 +278,6 @@ tap_rx_offload_get_port_capa(void)
>> DEV_RX_OFFLOAD_CRC_STRIP;
>>  }
>>
>> -static uint64_t
>> -tap_rx_offload_get_queue_capa(void)
>> -{
>> -return DEV_RX_OFFLOAD_SCATTER |
>> -   DEV_RX_OFFLOAD_IPV4_CKSUM |
>> -   DEV_RX_OFFLOAD_UDP_CKSUM |
>> -   DEV_RX_OFFLOAD_TCP_CKSUM |
>> -   DEV_RX_OFFLOAD_CRC_STRIP;
>> -}
>> -
> 
> TAP PMD supports all of these RX queue specific offloads. I suggest to leave 
> this function in place.
> 
>> -static bool
>> -tap_rxq_are_offloads_valid(struct rte_eth_dev *dev, uint64_t offloads) -{
>> -uint64_t port_offloads = dev->data->dev_conf.rxmode.offloads;
>> -uint64_t queue_supp_offloads = tap_rx_offload_get_queue_capa();
>> -uint64_t port_supp_offloads = tap_rx_offload_get_port_capa();
>> -
>> -if ((offloads & (queue_supp_offloads | port_supp_offloads)) !=
>> -offloads)
>> -return false;
>> -if ((port_offloads ^ offloads) & port_supp_offloads)
>> -return false;
>> -return true;
>> -}
>> -
> 
> Putting aside the fact that queue offloads equals port offloads (so could 
> ignore "port_supp_offload" variable) - this function is essential to validate 
> that the configured Rx offloads are supported by TAP. I suggest to leave this 
> function in place.
> Without it - testpmd falsely confirms non supported offloads.
> For example before this patch: offloading "hw-vlan-filter" will fail as 
> expected:
> 
> testpmd> port config all
> testpmd> port config all hw-vlan-filter on
> testpmd> port start all
> Configuring Port 0 (socket 0)
> PMD: net_tap0: 0x1209fc0: TX configured queues number: 1
> PMD: net_tap0: 0x1209fc0: RX configured queues number: 1
> PMD: 0x1209fc0: Rx queue offloads 0x120e don't match port offloads 0x120e or 
> supported offloads 0x300e
> Fail to configure port 0 rx queues
> 
> However, with this patch this configuration is falsely accepted.
> 
>>  /* Callback to handle the rx burst of packets to the correct interface and
>>   * file descriptor(s) in a multi-queue setup.
>>   */
>> @@ -411,31 +386,6 @@ tap_tx_offload_get_port_capa(void)
>> DEV_TX_OFFLOAD_TCP_

Re: [dpdk-dev] [PATCH v1 3/6] mempool: support block dequeue operation

2018-04-25 Thread Andrew Rybchenko

On 04/19/2018 07:41 PM, Olivier Matz wrote:

On Mon, Mar 26, 2018 at 05:12:56PM +0100, Andrew Rybchenko wrote:
[...]

@@ -1531,6 +1615,71 @@ rte_mempool_get(struct rte_mempool *mp, void **obj_p)
  }
  
  /**

+ * @internal Get contiguous blocks of objects from the pool. Used internally.
+ * @param mp
+ *   A pointer to the mempool structure.
+ * @param first_obj_table
+ *   A pointer to a pointer to the first object in each block.
+ * @param n
+ *   A number of blocks to get.
+ * @return
+ *   - >0: Success
+ *   - <0: Error

I guess it is 0 here, not >0.


Yes, thanks.


+ */
+static __rte_always_inline int
+__mempool_generic_get_contig_blocks(struct rte_mempool *mp,
+   void **first_obj_table, unsigned int n)
+{
+   int ret;
+
+   ret = rte_mempool_ops_dequeue_contig_blocks(mp, first_obj_table, n);
+   if (ret < 0)
+   __MEMPOOL_CONTIG_BLOCKS_STAT_ADD(mp, get_fail, n);
+   else
+   __MEMPOOL_CONTIG_BLOCKS_STAT_ADD(mp, get_success, n);
+
+   return ret;
+}
+

Is it worth having this function?


Just to follow the same code structure as usual dequeue.


I think it would be simple to include the code in
rte_mempool_get_contig_blocks() below... or am I missing something?


I agree. Will do in v3.

[...]


Re: [dpdk-dev] [PATCH] app/testpmd: add option to configure udp tunnel port

2018-04-25 Thread Ferruh Yigit
On 4/22/2018 9:59 PM, Rasesh Mody wrote:
> From: Shahed Shaikh 
> 
> This change adds a new option to "port config" command to
> add udp tunnel port for VXLAN and GENEVE tunneling protocols.
> This command can be extended for other tunneling protocols
> listed in "enum rte_eth_tunnel_type" as and when needed.
> 
> usage:
> port config  udp_tunnel_port add|rm vxlan|geneve 
> 
> Signed-off-by: Shahed Shaikh 
> Signed-off-by: Rasesh Mody 

Reviewed-by: Ferruh Yigit 


Re: [dpdk-dev] [PATCH] app/testpmd: add option to configure udp tunnel port

2018-04-25 Thread Ferruh Yigit
On 4/25/2018 10:52 AM, Ferruh Yigit wrote:
> On 4/22/2018 9:59 PM, Rasesh Mody wrote:
>> From: Shahed Shaikh 
>>
>> This change adds a new option to "port config" command to
>> add udp tunnel port for VXLAN and GENEVE tunneling protocols.
>> This command can be extended for other tunneling protocols
>> listed in "enum rte_eth_tunnel_type" as and when needed.
>>
>> usage:
>> port config  udp_tunnel_port add|rm vxlan|geneve 
>>
>> Signed-off-by: Shahed Shaikh 
>> Signed-off-by: Rasesh Mody 
> 
> Reviewed-by: Ferruh Yigit 
Applied to dpdk-next-net/master, thanks.


[dpdk-dev] [PATCH v3 1/9] mem: use strlcpy instead of snprintf

2018-04-25 Thread Anatoly Burakov
Signed-off-by: Anatoly Burakov 
---

Notes:
One other instance of using snprintf is left alone as
it is expected to be addressed by another patch [1].

[1] http://dpdk.org/dev/patchwork/patch/38301/

 lib/librte_eal/common/eal_common_memalloc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memalloc.c 
b/lib/librte_eal/common/eal_common_memalloc.c
index 49fd53c..e983688 100644
--- a/lib/librte_eal/common/eal_common_memalloc.c
+++ b/lib/librte_eal/common/eal_common_memalloc.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "eal_private.h"
@@ -179,7 +180,7 @@ eal_memalloc_mem_event_callback_register(const char *name,
 
/* callback successfully created and is valid, add it to the list */
entry->clb = clb;
-   snprintf(entry->name, RTE_MEM_EVENT_CALLBACK_NAME_LEN, "%s", name);
+   strlcpy(entry->name, name, RTE_MEM_EVENT_CALLBACK_NAME_LEN);
TAILQ_INSERT_TAIL(&mem_event_callback_list, entry, next);
 
ret = 0;
@@ -284,7 +285,7 @@ eal_memalloc_mem_alloc_validator_register(const char *name,
entry->clb = clb;
entry->socket_id = socket_id;
entry->limit = limit;
-   snprintf(entry->name, RTE_MEM_ALLOC_VALIDATOR_NAME_LEN, "%s", name);
+   strlcpy(entry->name, name, RTE_MEM_ALLOC_VALIDATOR_NAME_LEN);
TAILQ_INSERT_TAIL(&mem_alloc_validator_list, entry, next);
 
ret = 0;
-- 
2.7.4


[dpdk-dev] [PATCH v3 0/9] Coverity fixes for EAL memory

2018-04-25 Thread Anatoly Burakov
This patchset fixes a host of coverity issues in memory subsystem
introduced with recent DPDK memory hotplug patchset.

Coverity issues fixed:
- 272601 - leaking fd
- 272560 - double close fd
- 272568 - leaking fd
- 272570 - leaking fd
- 272589 - dereference before null check
- 272602 - freeing wrong pointer
- 272608 - expression does nothing
- 272584 - use after free

Coverity issues not fixed:
- 272577 - negative return not handled
- 272578 - negative return not handled
  - Proper usage of API guarantees no negative returns

Additionally, also replace all instances of snprintf with strlcpy.

v3:
- Drop fixes for 272577 and 272578 and mark them as false positives

v2:
- Rebase on top of latest master

Anatoly Burakov (9):
  mem: use strlcpy instead of snprintf
  mem: fix resource leak
  mem: fix potential double close
  mem: fix potential resource leak
  mem: fix potential resource leak
  mem: fix comparing pointer to value
  mem: fix potential bad unmap
  mem: fix statement having no effect
  mem: fix possible use-after-free

 lib/librte_eal/common/eal_common_memalloc.c |  5 +++--
 lib/librte_eal/common/eal_common_memory.c   |  6 +++---
 lib/librte_eal/linuxapp/eal/eal_memalloc.c  | 22 --
 lib/librte_eal/linuxapp/eal/eal_memory.c|  1 +
 4 files changed, 23 insertions(+), 11 deletions(-)

-- 
2.7.4


[dpdk-dev] [PATCH v3 7/9] mem: fix potential bad unmap

2018-04-25 Thread Anatoly Burakov
Previously, if mmap failed to map page address at requested
address, we were attempting to unmap the wrong address. Fix it
by unmapping our actual mapped address, and jump further to
avoid unmapping memory that is not allocated.

Coverity issue: 272602

Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
Cc: anatoly.bura...@intel.com

Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c 
b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 8420a26..6a75e5b 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -466,7 +466,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
}
if (va != addr) {
RTE_LOG(DEBUG, EAL, "%s(): wrong mmap() address\n", __func__);
-   goto mapped;
+   munmap(va, alloc_sz);
+   goto resized;
}
/* for non-single file segments, we can close fd here */
if (!internal_config.single_file_segments)
-- 
2.7.4


[dpdk-dev] [PATCH v3 2/9] mem: fix resource leak

2018-04-25 Thread Anatoly Burakov
Coverity issue: 272601

Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Cc: anatoly.bura...@intel.com

Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/linuxapp/eal/eal_memory.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c 
b/lib/librte_eal/linuxapp/eal/eal_memory.c
index fadc1de..9351e84 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -1750,6 +1750,7 @@ eal_legacy_hugepage_attach(void)
if (map_addr == MAP_FAILED) {
RTE_LOG(ERR, EAL, "Could not map %s: %s\n",
hf->filepath, strerror(errno));
+   close(fd);
goto error;
}
 
-- 
2.7.4


[dpdk-dev] [PATCH v3 5/9] mem: fix potential resource leak

2018-04-25 Thread Anatoly Burakov
Normally, tailq entry should have a valid fd by the time we attempt
to map the segment. However, in case it doesn't, we're leaking fd,
so fix it.

Coverity issue: 272570

Fixes: 2a04139f66b4 ("eal: add single file segments option")
Cc: anatoly.bura...@intel.com

Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c 
b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index fab5a98..b02e3a5 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -524,6 +524,8 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
if (te != NULL && te->fd >= 0) {
close(te->fd);
te->fd = -1;
+   } else {
+   close(fd);
}
/* ignore errors, can't make it any worse */
unlink(path);
-- 
2.7.4


[dpdk-dev] [PATCH v3 6/9] mem: fix comparing pointer to value

2018-04-25 Thread Anatoly Burakov
Previous code had an old rebase leftover from the time when
oldpolicy was an actual int, instead of a pointer. Fix it to
do comparison with dereferencing the pointer.

Coverity issue: 272589

Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
Cc: anatoly.bura...@intel.com

Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c 
b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index b02e3a5..8420a26 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -146,7 +146,7 @@ resotre_numa(int *oldpolicy, struct bitmask *oldmask)
 {
RTE_LOG(DEBUG, EAL,
"Restoring previous memory policy: %d\n", *oldpolicy);
-   if (oldpolicy == MPOL_DEFAULT) {
+   if (*oldpolicy == MPOL_DEFAULT) {
numa_set_localalloc();
} else if (set_mempolicy(*oldpolicy, oldmask->maskp,
 oldmask->size + 1) < 0) {
-- 
2.7.4


[dpdk-dev] [PATCH v3 4/9] mem: fix potential resource leak

2018-04-25 Thread Anatoly Burakov
We close fd if we managed to find it in the list of allocated
segment lists (which should always be the case under normal
conditions), but if we didn't, the fd was leaking. Close it if
we couldn't find it in the segment list. This is not an issue
as if the segment is zero length, we're getting rid of it
anyway, so there's no harm in not storing the fd anywhere.

Coverity issue: 272568

Fixes: 2a04139f66b4 ("eal: add single file segments option")
Cc: anatoly.bura...@intel.com

Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c 
b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 9156f8b..fab5a98 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -569,6 +569,8 @@ free_seg(struct rte_memseg *ms, struct hugepage_info *hi,
if (te != NULL && te->fd >= 0) {
close(te->fd);
te->fd = -1;
+   } else {
+   close(fd);
}
unlink(path);
}
-- 
2.7.4


[dpdk-dev] [PATCH v3 3/9] mem: fix potential double close

2018-04-25 Thread Anatoly Burakov
We were closing descriptor before checking if mapping has
failed, but if it did, we did a second close afterwards. Fix
it by moving closing descriptor to after we check if mmap has
succeeded.

Coverity issue: 272560

Fixes: 2a04139f66b4 ("eal: add single file segments option")
Cc: anatoly.bura...@intel.com

Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c 
b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 1f553dd..9156f8b 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -458,9 +458,6 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 */
void *va = mmap(addr, alloc_sz, PROT_READ | PROT_WRITE,
MAP_SHARED | MAP_POPULATE | MAP_FIXED, fd, map_offset);
-   /* for non-single file segments, we can close fd here */
-   if (!internal_config.single_file_segments)
-   close(fd);
 
if (va == MAP_FAILED) {
RTE_LOG(DEBUG, EAL, "%s(): mmap() failed: %s\n", __func__,
@@ -471,6 +468,9 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
RTE_LOG(DEBUG, EAL, "%s(): wrong mmap() address\n", __func__);
goto mapped;
}
+   /* for non-single file segments, we can close fd here */
+   if (!internal_config.single_file_segments)
+   close(fd);
 
rte_iova_t iova = rte_mem_virt2iova(addr);
if (iova == RTE_BAD_PHYS_ADDR) {
-- 
2.7.4


[dpdk-dev] [PATCH v3 8/9] mem: fix statement having no effect

2018-04-25 Thread Anatoly Burakov
Coverity reports these lines as having no effect. Technically, we do
want for those lines to have no effect, however they would've likely
been optimized out. Add volatile qualifiers to ensure the code has
effects.

Coverity issue: 272608

Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
Cc: anatoly.bura...@intel.com

Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c 
b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 6a75e5b..655c69e 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -503,7 +503,12 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
(unsigned int)(alloc_sz >> 20));
goto mapped;
}
-   *(int *)addr = *(int *)addr;
+   /* we need to trigger a write to the page to enforce page fault and
+* ensure that page is accessible to us, but we can't overwrite value
+* that is already there, so read the old value, and write itback.
+* kernel populates the page with zeroes initially.
+*/
+   *(volatile int *)addr = *(volatile int *)addr;
 
ms->addr = addr;
ms->hugepage_sz = alloc_sz;
-- 
2.7.4


[dpdk-dev] [PATCH v3 9/9] mem: fix possible use-after-free

2018-04-25 Thread Anatoly Burakov
If user has specified a flag to unmap the area right after mapping it,
we were passing an already-unmapped pointer to RTE_LOG. This is not an
issue since RTE_LOG doesn't actually dereference the pointer, but fix
it anyway by moving call to RTE_LOG to before unmap.

Coverity issue: 272584

Fixes: b7cc54187ea4 ("mem: move virtual area function in common directory")
Cc: anatoly.bura...@intel.com

Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/common/eal_common_memory.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_memory.c 
b/lib/librte_eal/common/eal_common_memory.c
index 24a9ed5..3e30c58 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -113,12 +113,12 @@ eal_get_virtual_area(void *requested_addr, size_t *size,
RTE_LOG(WARNING, EAL, "   This may cause issues with mapping 
memory into secondary processes\n");
}
 
-   if (unmap)
-   munmap(mapped_addr, map_sz);
-
RTE_LOG(DEBUG, EAL, "Virtual area found at %p (size = 0x%zx)\n",
aligned_addr, *size);
 
+   if (unmap)
+   munmap(mapped_addr, map_sz);
+
baseaddr_offset += *size;
 
return aligned_addr;
-- 
2.7.4


Re: [dpdk-dev] [PATCH v1 2/6] mempool: implement abstract mempool info API

2018-04-25 Thread Andrew Rybchenko

On 04/19/2018 07:42 PM, Olivier Matz wrote:

On Mon, Mar 26, 2018 at 05:12:55PM +0100, Andrew Rybchenko wrote:

From: "Artem V. Andreev" 

Primarily, it is intended as a way for the mempool driver to provide
additional information on how it lays up objects inside the mempool.

Signed-off-by: Artem V. Andreev 
Signed-off-by: Andrew Rybchenko 

I think it's a good idea to have a way to query mempool features
or parameters. The approach chosen in this patch looks similar
to what we have with ethdev devinfo, right?


Yes.


[...]


  /**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Additional information about the mempool
+ */
+struct rte_mempool_info;
+

[...]


+/* wrapper to get additional mempool info */
+int
+rte_mempool_ops_get_info(const struct rte_mempool *mp,
+struct rte_mempool_info *info)
+{
+   struct rte_mempool_ops *ops;
+
+   ops = rte_mempool_get_ops(mp->ops_index);
+
+   RTE_FUNC_PTR_OR_ERR_RET(ops->get_info, -ENOTSUP);
+   return ops->get_info(mp, info);
+}

Thinking in terms of ABI compatibility, it looks that each time we will
add or remove a field, it will break the ABI because the info structure
will change.

Well, it's maybe nitpicking, because most of the time adding a field in
info structure goes with adding a field in the mempool struct, which
will anyway break the ABI.

Another approach is to have a function
rte_mempool_info_contig_block_size(mp) whose ABI can be more easily
wrapped with VERSION_SYMBOL().

On my side I'm fine with your current approach, especially given how few
usages of VERSION_SYMBOL() we can find in DPDK. But in case you feel
it's better to have a function...


I'd prefer to keep current solution. Otherwise it could result in too many
different functions to get various information about mempool driver
features/characteristics. Also it could be not very convenient to get
many parameters.

May be we should align info structure size to cache line to avoid size
changes in many cases? Typically it will be used on slow path and
located on caller stack and adding some bytes more should not
be a problem.


Re: [dpdk-dev] [PATCH v2] baseband/turbo_sw: optimization of turbo software driver

2018-04-25 Thread De Lara Guarch, Pablo
Hi,

> -Original Message-
> From: Chalupnik, KamilX
> Sent: Wednesday, April 25, 2018 10:41 AM
> To: De Lara Guarch, Pablo ; dev@dpdk.org
> Cc: Mokhtar, Amr 
> Subject: RE: [dpdk-dev] [PATCH v2] baseband/turbo_sw: optimization of turbo
> software driver
> 
> 
> 
> > -Original Message-
> > From: De Lara Guarch, Pablo
> > Sent: Tuesday, April 24, 2018 7:54 PM
> > To: Chalupnik, KamilX ; dev@dpdk.org
> > Cc: Mokhtar, Amr ; Chalupnik, KamilX
> > 
> > Subject: RE: [dpdk-dev] [PATCH v2] baseband/turbo_sw: optimization of
> > turbo software driver
> >
> >
> >
> > > -Original Message-
> > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of KamilX
> > > Chalupnik
> > > Sent: Tuesday, April 17, 2018 3:34 PM
> > > To: dev@dpdk.org
> > > Cc: Mokhtar, Amr ; Chalupnik, KamilX
> > > 
> > > Subject: [dpdk-dev] [PATCH v2] baseband/turbo_sw: optimization of
> > > turbo software driver
> >
> > Optimization of the driver is not a good title, in my opinion.
> > Instead, call out what you are actually changing.
> > You should separate this patch into two patches, based on the two
> > bullet points below.
> >
> 
> Ok, I will separate it and change the title.
> 
> > >
> > > Optimization of Turbo Software driver:
> > > - resource-hungry piece of code removed or optimized
> > > - validation of decoder/encoder parameters put under debug flug
> > >
> >
> > ...
> >
> > >
> > > +#ifdef RTE_LIBRTE_BBDEV_DEBUG
> >
> > Where is this macro defined?
> > In DPDK we are avoiding the use of compile time options when possible,
> > so it is a better idea to use a runtime option to avoid the execution
> > of part of the code (passed through device configuration maybe?).
> > Otherwise, compilation checking gets more difficult.
> >
> 
> It seems that I missed one change. I thought it was added in previous release
> RTE_LIBRTE_BBDEV_DEBUG will be defined in config/common_base.
> Are you ok with that?

These kind of compilation flags are not accepted in DPDK anymore,
as they hide compilation issues.

If you want to have a different path for debugging, you should use run-time 
flags,
passed to the device configuration (I'd say through devargs).

Thanks,
Pablo



[dpdk-dev] [PATCH v6 0/5] get the information and data of EEPROM

2018-04-25 Thread Zijie Pan
Add APIs to read information from the DPDK applictions.
It can be used to dump the EEPROM of plugin modules (SFP+, QSFP, etc.).

Two APIs are introduced to access eeprom:
- rte_eth_dev_get_module_info
- rte_eth_dev_get_module_eeprom

Applications based on DPDK can dump eeprom by calling those two APIs.

Then, each PMD has to implement these callbacks for e1000, ixgbe, i40e, etc.

Patch for example/ethtool is used to test this function. It can get the raw 
data of eeprom. See below how both DPDK applications (ethtool) and Linux
kernel are dumping the same eeprom of a same NIC.

- Start example/ethtool:
./examples/ethtool/ethtool-app/x86_64-native-linuxapp-gcc/ethtool -c 0xf -n 
4 --socket-mem 1024,0 -- -i
EthApp> drvinfo
Port 0 driver: net_ixgbe (ver: DPDK 18.05.0-rc0)
firmware-version: 0x18b30001
bus-info: :04:00.0

EthApp> module-eeprom
 [UINT16]: module-eeprom  
 Dump plug-in module EEPROM to file

EthApp> module-eeprom 0 my-module-eeprom.bin
Total plug-in module EEPROM length: 512 bytes

EthApp> quit

- HexDump of this eeprom file:
# xxd my-module-eeprom.bin
000: 0304 0710  0100  0006 6702   g...
010: 0803 001e 4f45 4d20 2020 2020 2020 2020  OEM
020: 2020 2020  1b21 5346 502d 3130 472d  ...!SFP-10G-
030: 5352 2d49 5420 2020 4120 2020 0352 0024  SR-IT   A   .R.$
040: 003a  5751 3136 3034 3132 4131 3135  .:..WQ160412A115
050: 2020 2020 3135 3136 3130 2020 68fa 033b  151610  h..;
060:          
070:          
080:          
090:          
0a0:          
0b0:          
0c0:          
0d0:          
0e0:          
0f0:          
100: 5000 fb00 4b00  8ca0 7530 88b8 7918  P...K.u0..y.
110: 1d4c 01f4 1b58 03e8 3de9 03e8 2710 04eb  .L...X..=...'...
120: 2710 0064 1f07 007e      '..d...~
130:          
140:   3f80    0100   ?...
150: 0100  0100  0100   002d  ...-
160: 2c59 810a 13c7 1752 0001   0200  ,Y.R
170: 0040  0040       .@...@..
180:          
190:          
1a0:  faff        
1b0:          
1c0:          
1d0:          
1e0:          
1f0:     0003 0100    

- Rerun same dump using Linux's kernel ethtool.

# ./install/sbin/dpdk-devbind --bind=ixgbe 04:00.0
# ethtool -m p2p1 raw on > meeprom-kernel.bin

# xxd meeprom-kernel.bin
000: 0304 0710  0100  0006 6702   g...
010: 0803 001e 4f45 4d20 2020 2020 2020 2020  OEM
020: 2020 2020  1b21 5346 502d 3130 472d  ...!SFP-10G-
030: 5352 2d49 5420 2020 4120 2020 0352 0024  SR-IT   A   .R.$
040: 003a  5751 3136 3034 3132 4131 3135  .:..WQ160412A115
050: 2020 2020 3135 3136 3130 2020 68fa 033b  151610  h..;
060:          
070:          
080:          
090:          
0a0:          
0b0:          
0c0:          
0d0:          
0e0:          
0f0:          
100: 5000 fb00 4b00  8ca0 7530 88b8 7918  P...K.u0..y.
110: 1d4c 01f4 1b58 03e8 3de9 03e8 2710 04eb  .L...X..=...'...
120: 2710 0064 1f07 007e      '..d...~
130:          
140:   3f80    0100   ?...
150: 0100  0100  0100   002d  ...-
160: 2899 8146 0058 0001 0001   8200  (..F.X..
170: 0540  0540       .@...@..
00

[dpdk-dev] [PATCH v6 1/5] ethdev: add access to eeprom

2018-04-25 Thread Zijie Pan
add new APIs:
- rte_eth_dev_get_module_info
- rte_eth_dev_get_module_eeprom

Signed-off-by: Zijie Pan 
Acked-by: Remy Horton 
---
Cc: remy.hor...@intel.com
Cc: john.mcnam...@intel.com
Cc: marko.kovace...@intel.com
Cc: tho...@monjalon.net

 doc/guides/nics/features.rst|   11 
 lib/librte_ether/rte_dev_info.h |   18 +
 lib/librte_ether/rte_ethdev.c   |   26 +++
 lib/librte_ether/rte_ethdev.h   |   43 +++
 lib/librte_ether/rte_ethdev_core.h  |   12 +
 lib/librte_ether/rte_ethdev_version.map |2 ++
 6 files changed, 112 insertions(+)

diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index 1b4fb97..bb183e2 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -749,6 +749,17 @@ Supports getting/setting device eeprom data.
   ``rte_eth_dev_set_eeprom()``.
 
 
+.. _nic_features_module_eeprom_dump:
+
+Module EEPROM dump
+--
+
+Supports getting information and data of plugin module eeprom.
+
+* **[implements] eth_dev_ops**: ``get_module_info``, ``get_module_eeprom``.
+* **[related]API**: ``rte_eth_dev_get_module_info()``, 
``rte_eth_dev_get_module_eeprom()``.
+
+
 .. _nic_features_register_dump:
 
 Registers dump
diff --git a/lib/librte_ether/rte_dev_info.h b/lib/librte_ether/rte_dev_info.h
index 6b68584..fea5da8 100644
--- a/lib/librte_ether/rte_dev_info.h
+++ b/lib/librte_ether/rte_dev_info.h
@@ -28,4 +28,22 @@ struct rte_dev_eeprom_info {
uint32_t magic; /**< Device-specific key, such as device-id */
 };
 
+/**
+ * Placeholder for accessing plugin module eeprom
+ */
+struct rte_eth_dev_module_info {
+   uint32_t type; /**< Type of plugin module eeprom */
+   uint32_t eeprom_len; /**< Length of plugin module eeprom */
+};
+
+/* EEPROM Standards for plug in modules */
+#define RTE_ETH_MODULE_SFF_8079 0x1
+#define RTE_ETH_MODULE_SFF_8079_LEN 256
+#define RTE_ETH_MODULE_SFF_8472 0x2
+#define RTE_ETH_MODULE_SFF_8472_LEN 512
+#define RTE_ETH_MODULE_SFF_8636 0x3
+#define RTE_ETH_MODULE_SFF_8636_LEN 256
+#define RTE_ETH_MODULE_SFF_8436 0x4
+#define RTE_ETH_MODULE_SFF_8436_LEN 256
+
 #endif /* _RTE_DEV_INFO_H_ */
diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index 7821a88..100cbd8 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -3947,6 +3947,32 @@ int rte_eth_set_queue_rate_limit(uint16_t port_id, 
uint16_t queue_idx,
return eth_err(port_id, (*dev->dev_ops->set_eeprom)(dev, info));
 }
 
+int __rte_experimental
+rte_eth_dev_get_module_info(uint16_t port_id,
+   struct rte_eth_dev_module_info *modinfo)
+{
+   struct rte_eth_dev *dev;
+
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+
+   dev = &rte_eth_devices[port_id];
+   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_module_info, -ENOTSUP);
+   return (*dev->dev_ops->get_module_info)(dev, modinfo);
+}
+
+int __rte_experimental
+rte_eth_dev_get_module_eeprom(uint16_t port_id,
+ struct rte_dev_eeprom_info *info)
+{
+   struct rte_eth_dev *dev;
+
+   RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV);
+
+   dev = &rte_eth_devices[port_id];
+   RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->get_module_eeprom, -ENOTSUP);
+   return (*dev->dev_ops->get_module_eeprom)(dev, info);
+}
+
 int
 rte_eth_dev_get_dcb_info(uint16_t port_id,
 struct rte_eth_dcb_info *dcb_info)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index 7e4e57b..4852246 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -3311,6 +3311,49 @@ int rte_eth_tx_queue_info_get(uint16_t port_id, uint16_t 
queue_id,
 int rte_eth_dev_set_eeprom(uint16_t port_id, struct rte_dev_eeprom_info *info);
 
 /**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve the type and size of plugin module EEPROM
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param modinfo
+ *   The type and size of plugin module EEPROM.
+ * @return
+ *   - (0) if successful.
+ *   - (-ENOTSUP) if hardware doesn't support.
+ *   - (-ENODEV) if *port_id* invalid.
+ *   - (-EIO) if device is removed.
+ *   - others depends on the specific operations implementation.
+ */
+int __rte_experimental
+rte_eth_dev_get_module_info(uint16_t port_id,
+   struct rte_eth_dev_module_info *modinfo);
+
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change without prior notice.
+ *
+ * Retrieve the data of plugin module EEPROM
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param info
+ *   The template includes the plugin module EEPROM attributes, and the
+ *   buffer for return plugin module EEPROM data.
+ * @return
+ *   - (0) 

[dpdk-dev] [PATCH v6 2/5] examples/ethtool: add a new command module-eeprom

2018-04-25 Thread Zijie Pan
Add a new command "module-eeprom" to get the data of plugin
module EEPROM.

Signed-off-by: Zijie Pan 
Acked-by: Remy Horton 
---
Cc: remy.hor...@intel.com
Cc: or...@mellanox.com
Cc: bruce.richard...@intel.com
Cc: pablo.de.lara.gua...@intel.com
Cc: radu.nico...@intel.com
Cc: akhil.go...@nxp.com
Cc: tomasz.kante...@intel.com
Cc: john.mcnam...@intel.com
Cc: marko.kovace...@intel.com

 doc/guides/sample_app_ug/ethtool.rst  |3 ++
 examples/ethtool/ethtool-app/ethapp.c |   64 +
 examples/ethtool/lib/Makefile |1 +
 examples/ethtool/lib/rte_ethtool.c|   30 
 examples/ethtool/lib/rte_ethtool.h|   34 ++
 5 files changed, 132 insertions(+)

diff --git a/doc/guides/sample_app_ug/ethtool.rst 
b/doc/guides/sample_app_ug/ethtool.rst
index 1b79aca..47e09f6 100644
--- a/doc/guides/sample_app_ug/ethtool.rst
+++ b/doc/guides/sample_app_ug/ethtool.rst
@@ -44,6 +44,7 @@ they do as as follows:
 
 * ``drvinfo``: Print driver info
 * ``eeprom``: Dump EEPROM to file
+* ``module-eeprom``: Dump plugin module EEPROM to file
 * ``link``: Print port link states
 * ``macaddr``: Gets/sets MAC address
 * ``mtu``: Set NIC MTU
@@ -97,6 +98,8 @@ the following functions:
 - ``rte_ethtool_get_eeprom_len()``
 - ``rte_ethtool_get_eeprom()``
 - ``rte_ethtool_set_eeprom()``
+- ``rte_ethtool_get_module_info()``
+- ``rte_ethtool_get_module_eeprom()``
 - ``rte_ethtool_get_pauseparam()``
 - ``rte_ethtool_set_pauseparam()``
 - ``rte_ethtool_net_open()``
diff --git a/examples/ethtool/ethtool-app/ethapp.c 
b/examples/ethtool/ethtool-app/ethapp.c
index 4d62f4c..a4e64b3 100644
--- a/examples/ethtool/ethtool-app/ethapp.c
+++ b/examples/ethtool/ethtool-app/ethapp.c
@@ -75,6 +75,9 @@ struct pcmd_intintint_params {
 /* Commands taking port id and string */
 cmdline_parse_token_string_t pcmd_eeprom_token_cmd =
TOKEN_STRING_INITIALIZER(struct pcmd_intstr_params, cmd, "eeprom");
+cmdline_parse_token_string_t pcmd_module_eeprom_token_cmd =
+   TOKEN_STRING_INITIALIZER(struct pcmd_intstr_params, cmd,
+"module-eeprom");
 cmdline_parse_token_string_t pcmd_mtu_token_cmd =
TOKEN_STRING_INITIALIZER(struct pcmd_intstr_params, cmd, "mtu");
 cmdline_parse_token_string_t pcmd_regs_token_cmd =
@@ -298,6 +301,54 @@ struct pcmd_intintint_params {
 
 
 static void
+pcmd_module_eeprom_callback(void *ptr_params,
+   __rte_unused struct cmdline *ctx,
+   __rte_unused void *ptr_data)
+{
+   struct pcmd_intstr_params *params = ptr_params;
+   struct ethtool_eeprom info_eeprom;
+   uint32_t module_info[2];
+   int stat;
+   unsigned char bytes_eeprom[EEPROM_DUMP_CHUNKSIZE];
+   FILE *fp_eeprom;
+
+   if (!rte_eth_dev_is_valid_port(params->port)) {
+   printf("Error: Invalid port number %i\n", params->port);
+   return;
+   }
+
+   stat = rte_ethtool_get_module_info(params->port, module_info);
+   if (stat != 0) {
+   printf("Module EEPROM information read error %i\n", stat);
+   return;
+   }
+
+   info_eeprom.len = module_info[1];
+   info_eeprom.offset = 0;
+
+   stat = rte_ethtool_get_module_eeprom(params->port,
+&info_eeprom, bytes_eeprom);
+   if (stat != 0) {
+   printf("Module EEPROM read error %i\n", stat);
+   return;
+   }
+
+   fp_eeprom = fopen(params->opt, "wb");
+   if (fp_eeprom == NULL) {
+   printf("Error opening '%s' for writing\n", params->opt);
+   return;
+   }
+   printf("Total plugin module EEPROM length: %i bytes\n",
+  info_eeprom.len);
+   if (fwrite(bytes_eeprom, 1, info_eeprom.len,
+  fp_eeprom) != info_eeprom.len) {
+   printf("Error writing '%s'\n", params->opt);
+   }
+   fclose(fp_eeprom);
+}
+
+
+static void
 pcmd_pause_callback(void *ptr_params,
__rte_unused struct cmdline *ctx,
void *ptr_data)
@@ -664,6 +715,18 @@ static void pcmd_vlan_callback(__rte_unused void 
*ptr_params,
NULL
},
 };
+cmdline_parse_inst_t pcmd_module_eeprom = {
+   .f = pcmd_module_eeprom_callback,
+   .data = NULL,
+   .help_str = "module-eeprom  \n"
+   " Dump plugin module EEPROM to file",
+   .tokens = {
+   (void *)&pcmd_module_eeprom_token_cmd,
+   (void *)&pcmd_intstr_token_port,
+   (void *)&pcmd_intstr_token_opt,
+   NULL
+   },
+};
 cmdline_parse_inst_t pcmd_pause_noopt = {
.f = pcmd_pause_callback,
.data = (void *)0x01,
@@ -816,6 +879,7 @@ static void pcmd_vlan_callback(__rte_unused void 
*ptr_params,
 cmdline_parse_ctx_t list_prompt_commands[] = {
(cmdline_parse_inst_t *)&pcmd_drvinfo,
(cmdline_parse_inst_t *)&pcmd_eeprom,
+   (cmdline_parse_inst_t *)&pcmd_module_eeprom,
(cmdline_parse_inst

[dpdk-dev] [PATCH v6 3/5] net/ixgbe: add module EEPROM callbacks for ixgbe

2018-04-25 Thread Zijie Pan
Add new callbacks for eth_dev_ops of ixgbe to get the information
and data of plugin module eeprom.

Signed-off-by: Zijie Pan 
Acked-by: Remy Horton 
---
Cc: remy.hor...@intel.com
Cc: wenzhuo...@intel.com
Cc: konstantin.anan...@intel.com

 drivers/net/ixgbe/ixgbe_ethdev.c |   79 ++
 1 file changed, 79 insertions(+)

diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index a5e2fc0..228daaf 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -327,6 +327,11 @@ static int ixgbe_get_eeprom(struct rte_eth_dev *dev,
 static int ixgbe_set_eeprom(struct rte_eth_dev *dev,
struct rte_dev_eeprom_info *eeprom);
 
+static int ixgbe_get_module_info(struct rte_eth_dev *dev,
+struct rte_eth_dev_module_info *modinfo);
+static int ixgbe_get_module_eeprom(struct rte_eth_dev *dev,
+  struct rte_dev_eeprom_info *info);
+
 static int ixgbevf_get_reg_length(struct rte_eth_dev *dev);
 static int ixgbevf_get_regs(struct rte_eth_dev *dev,
struct rte_dev_reg_info *regs);
@@ -564,6 +569,8 @@ static int ixgbe_dev_udp_tunnel_port_del(struct rte_eth_dev 
*dev,
.get_eeprom_length= ixgbe_get_eeprom_length,
.get_eeprom   = ixgbe_get_eeprom,
.set_eeprom   = ixgbe_set_eeprom,
+   .get_module_info  = ixgbe_get_module_info,
+   .get_module_eeprom= ixgbe_get_module_eeprom,
.get_dcb_info = ixgbe_dev_get_dcb_info,
.timesync_adjust_time = ixgbe_timesync_adjust_time,
.timesync_read_time   = ixgbe_timesync_read_time,
@@ -7126,6 +7133,78 @@ static void ixgbevf_set_vfta_all(struct rte_eth_dev 
*dev, bool on)
return eeprom->ops.write_buffer(hw,  first, length, data);
 }
 
+static int
+ixgbe_get_module_info(struct rte_eth_dev *dev,
+ struct rte_eth_dev_module_info *modinfo)
+{
+   struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   uint32_t status;
+   uint8_t sff8472_rev, addr_mode;
+   bool page_swap = false;
+
+   /* Check whether we support SFF-8472 or not */
+   status = hw->phy.ops.read_i2c_eeprom(hw,
+IXGBE_SFF_SFF_8472_COMP,
+&sff8472_rev);
+   if (status != 0)
+   return -EIO;
+
+   /* addressing mode is not supported */
+   status = hw->phy.ops.read_i2c_eeprom(hw,
+IXGBE_SFF_SFF_8472_SWAP,
+&addr_mode);
+   if (status != 0)
+   return -EIO;
+
+   if (addr_mode & IXGBE_SFF_ADDRESSING_MODE) {
+   PMD_DRV_LOG(ERR,
+   "Address change required to access page 0xA2, "
+   "but not supported. Please report the module "
+   "type to the driver maintainers.");
+   page_swap = true;
+   }
+
+   if (sff8472_rev == IXGBE_SFF_SFF_8472_UNSUP || page_swap) {
+   /* We have a SFP, but it does not support SFF-8472 */
+   modinfo->type = RTE_ETH_MODULE_SFF_8079;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8079_LEN;
+   } else {
+   /* We have a SFP which supports a revision of SFF-8472. */
+   modinfo->type = RTE_ETH_MODULE_SFF_8472;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8472_LEN;
+   }
+
+   return 0;
+}
+
+static int
+ixgbe_get_module_eeprom(struct rte_eth_dev *dev,
+   struct rte_dev_eeprom_info *info)
+{
+   struct ixgbe_hw *hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   uint32_t status = IXGBE_ERR_PHY_ADDR_INVALID;
+   uint8_t databyte = 0xFF;
+   uint8_t *data = info->data;
+   uint32_t i = 0;
+
+   if (info->length == 0)
+   return -EINVAL;
+
+   for (i = info->offset; i < info->offset + info->length; i++) {
+   if (i < RTE_ETH_MODULE_SFF_8079_LEN)
+   status = hw->phy.ops.read_i2c_eeprom(hw, i, &databyte);
+   else
+   status = hw->phy.ops.read_i2c_sff8472(hw, i, &databyte);
+
+   if (status != 0)
+   return -EIO;
+
+   data[i - info->offset] = databyte;
+   }
+
+   return 0;
+}
+
 uint16_t
 ixgbe_reta_size_get(enum ixgbe_mac_type mac_type) {
switch (mac_type) {
-- 
1.7.10.4



[dpdk-dev] [PATCH v6 5/5] net/i40e: add module EEPROM callbacks for i40e

2018-04-25 Thread Zijie Pan
Add new callbacks for eth_dev_ops of i40e to get the information
and data of plugin module eeprom.

Signed-off-by: Zijie Pan 
Acked-by: Remy Horton 
---
Cc: beilei.x...@intel.com
Cc: qi.z.zh...@intel.com

 drivers/net/i40e/i40e_ethdev.c |  147 
 1 file changed, 147 insertions(+)

diff --git a/drivers/net/i40e/i40e_ethdev.c b/drivers/net/i40e/i40e_ethdev.c
index 180ac74..948ee89 100644
--- a/drivers/net/i40e/i40e_ethdev.c
+++ b/drivers/net/i40e/i40e_ethdev.c
@@ -369,6 +369,11 @@ static int i40e_get_regs(struct rte_eth_dev *dev,
 static int i40e_get_eeprom(struct rte_eth_dev *dev,
   struct rte_dev_eeprom_info *eeprom);
 
+static int i40e_get_module_info(struct rte_eth_dev *dev,
+   struct rte_eth_dev_module_info *modinfo);
+static int i40e_get_module_eeprom(struct rte_eth_dev *dev,
+ struct rte_dev_eeprom_info *info);
+
 static int i40e_set_default_mac_addr(struct rte_eth_dev *dev,
  struct ether_addr *mac_addr);
 
@@ -489,6 +494,8 @@ static int i40e_sw_tunnel_filter_insert(struct i40e_pf *pf,
.get_reg  = i40e_get_regs,
.get_eeprom_length= i40e_get_eeprom_length,
.get_eeprom   = i40e_get_eeprom,
+   .get_module_info  = i40e_get_module_info,
+   .get_module_eeprom= i40e_get_module_eeprom,
.mac_addr_set = i40e_set_default_mac_addr,
.mtu_set  = i40e_dev_mtu_set,
.tm_ops_get   = i40e_tm_ops_get,
@@ -11327,6 +11334,146 @@ static int i40e_get_eeprom(struct rte_eth_dev *dev,
return 0;
 }
 
+static int i40e_get_module_info(struct rte_eth_dev *dev,
+   struct rte_eth_dev_module_info *modinfo)
+{
+   struct i40e_hw *hw = I40E_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+   uint32_t sff8472_comp = 0;
+   uint32_t sff8472_swap = 0;
+   uint32_t sff8636_rev = 0;
+   i40e_status status;
+   uint32_t type = 0;
+
+   /* Check if firmware supports reading module EEPROM. */
+   if (!(hw->flags & I40E_HW_FLAG_AQ_PHY_ACCESS_CAPABLE)) {
+   PMD_DRV_LOG(ERR,
+   "Module EEPROM memory read not supported. "
+   "Please update the NVM image.\n");
+   return -EINVAL;
+   }
+
+   status = i40e_update_link_info(hw);
+   if (status)
+   return -EIO;
+
+   if (hw->phy.link_info.phy_type == I40E_PHY_TYPE_EMPTY) {
+   PMD_DRV_LOG(ERR,
+   "Cannot read module EEPROM memory. "
+   "No module connected.\n");
+   return -EINVAL;
+   }
+
+   type = hw->phy.link_info.module_type[0];
+
+   switch (type) {
+   case I40E_MODULE_TYPE_SFP:
+   status = i40e_aq_get_phy_register(hw,
+   I40E_AQ_PHY_REG_ACCESS_EXTERNAL_MODULE,
+   I40E_I2C_EEPROM_DEV_ADDR,
+   I40E_MODULE_SFF_8472_COMP,
+   &sff8472_comp, NULL);
+   if (status)
+   return -EIO;
+
+   status = i40e_aq_get_phy_register(hw,
+   I40E_AQ_PHY_REG_ACCESS_EXTERNAL_MODULE,
+   I40E_I2C_EEPROM_DEV_ADDR,
+   I40E_MODULE_SFF_8472_SWAP,
+   &sff8472_swap, NULL);
+   if (status)
+   return -EIO;
+
+   /* Check if the module requires address swap to access
+* the other EEPROM memory page.
+*/
+   if (sff8472_swap & I40E_MODULE_SFF_ADDR_MODE) {
+   PMD_DRV_LOG(WARNING,
+   "Module address swap to access "
+   "page 0xA2 is not supported.\n");
+   modinfo->type = RTE_ETH_MODULE_SFF_8079;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8079_LEN;
+   } else if (sff8472_comp == 0x00) {
+   /* Module is not SFF-8472 compliant */
+   modinfo->type = RTE_ETH_MODULE_SFF_8079;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8079_LEN;
+   } else {
+   modinfo->type = RTE_ETH_MODULE_SFF_8472;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8472_LEN;
+   }
+   break;
+   case I40E_MODULE_TYPE_QSFP_PLUS:
+   /* Read from memory page 0. */
+   status = i40e_aq_get_phy_register(hw,
+   I40E_AQ_PHY_REG_ACCESS_EXTERNAL_MODULE,
+   0,
+   I40E_MODULE_REVISION_ADDR,
+   &sff8636_r

[dpdk-dev] [PATCH v6 4/5] net/e1000: add module EEPROM callbacks for e1000

2018-04-25 Thread Zijie Pan
Add new callbacks for eth_dev_ops of e1000 to get the information and
data of plugin module EEPROM.

Signed-off-by: Zijie Pan 
Acked-by: Remy Horton 
---
Cc: wenzhuo...@intel.com

 drivers/net/e1000/base/e1000_phy.h |8 
 drivers/net/e1000/igb_ethdev.c |   86 
 2 files changed, 94 insertions(+)

diff --git a/drivers/net/e1000/base/e1000_phy.h 
b/drivers/net/e1000/base/e1000_phy.h
index 3e45a9e..2cd0e14 100644
--- a/drivers/net/e1000/base/e1000_phy.h
+++ b/drivers/net/e1000/base/e1000_phy.h
@@ -330,4 +330,12 @@ struct sfp_e1000_flags {
 #define E1000_SFF_VENDOR_OUI_AVAGO 0x00176A00
 #define E1000_SFF_VENDOR_OUI_INTEL 0x001B2100
 
+/* EEPROM byte offsets */
+#define IGB_SFF_8472_SWAP  0x5C
+#define IGB_SFF_8472_COMP  0x5E
+
+/* Bitmasks */
+#define IGB_SFF_ADDRESSING_MODE0x4
+#define IGB_SFF_8472_UNSUP 0x00
+
 #endif
diff --git a/drivers/net/e1000/igb_ethdev.c b/drivers/net/e1000/igb_ethdev.c
index 9b808a9..c35c935 100644
--- a/drivers/net/e1000/igb_ethdev.c
+++ b/drivers/net/e1000/igb_ethdev.c
@@ -223,6 +223,10 @@ static int eth_igb_get_eeprom(struct rte_eth_dev *dev,
struct rte_dev_eeprom_info *eeprom);
 static int eth_igb_set_eeprom(struct rte_eth_dev *dev,
struct rte_dev_eeprom_info *eeprom);
+static int eth_igb_get_module_info(struct rte_eth_dev *dev,
+  struct rte_eth_dev_module_info *modinfo);
+static int eth_igb_get_module_eeprom(struct rte_eth_dev *dev,
+struct rte_dev_eeprom_info *info);
 static int eth_igb_set_mc_addr_list(struct rte_eth_dev *dev,
struct ether_addr *mc_addr_set,
uint32_t nb_mc_addr);
@@ -402,6 +406,8 @@ static void eth_igb_write_ivar(struct e1000_hw *hw, uint8_t 
msix_vector,
.get_eeprom_length= eth_igb_get_eeprom_length,
.get_eeprom   = eth_igb_get_eeprom,
.set_eeprom   = eth_igb_set_eeprom,
+   .get_module_info  = eth_igb_get_module_info,
+   .get_module_eeprom= eth_igb_get_module_eeprom,
.timesync_adjust_time = igb_timesync_adjust_time,
.timesync_read_time   = igb_timesync_read_time,
.timesync_write_time  = igb_timesync_write_time,
@@ -5329,6 +5335,86 @@ static void igbvf_set_vfta_all(struct rte_eth_dev *dev, 
bool on)
 }
 
 static int
+eth_igb_get_module_info(struct rte_eth_dev *dev,
+   struct rte_eth_dev_module_info *modinfo)
+{
+   struct e1000_hw *hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   uint32_t status = 0;
+   uint16_t sff8472_rev, addr_mode;
+   bool page_swap = false;
+
+   if (hw->phy.media_type == e1000_media_type_copper ||
+   hw->phy.media_type == e1000_media_type_unknown)
+   return -EOPNOTSUPP;
+
+   /* Check whether we support SFF-8472 or not */
+   status = e1000_read_phy_reg_i2c(hw, IGB_SFF_8472_COMP, &sff8472_rev);
+   if (status)
+   return -EIO;
+
+   /* addressing mode is not supported */
+   status = e1000_read_phy_reg_i2c(hw, IGB_SFF_8472_SWAP, &addr_mode);
+   if (status)
+   return -EIO;
+
+   /* addressing mode is not supported */
+   if ((addr_mode & 0xFF) & IGB_SFF_ADDRESSING_MODE) {
+   PMD_DRV_LOG(ERR,
+   "Address change required to access page 0xA2, "
+   "but not supported. Please report the module "
+   "type to the driver maintainers.\n");
+   page_swap = true;
+   }
+
+   if ((sff8472_rev & 0xFF) == IGB_SFF_8472_UNSUP || page_swap) {
+   /* We have an SFP, but it does not support SFF-8472 */
+   modinfo->type = RTE_ETH_MODULE_SFF_8079;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8079_LEN;
+   } else {
+   /* We have an SFP which supports a revision of SFF-8472 */
+   modinfo->type = RTE_ETH_MODULE_SFF_8472;
+   modinfo->eeprom_len = RTE_ETH_MODULE_SFF_8472_LEN;
+   }
+
+   return 0;
+}
+
+static int
+eth_igb_get_module_eeprom(struct rte_eth_dev *dev,
+ struct rte_dev_eeprom_info *info)
+{
+   struct e1000_hw *hw = E1000_DEV_PRIVATE_TO_HW(dev->data->dev_private);
+
+   uint32_t status = 0;
+   uint16_t dataword[RTE_ETH_MODULE_SFF_8472_LEN / 2 + 1];
+   u16 first_word, last_word;
+   int i = 0;
+
+   if (info->length == 0)
+   return -EINVAL;
+
+   first_word = info->offset >> 1;
+   last_word = (info->offset + info->length - 1) >> 1;
+
+   /* Read EEPROM block, SFF-8079/SFF-8472, word at a time */
+   for (i = 0; i < last_word - first_word + 1; i++) {
+   status = e1000_read_phy_reg_i2c(hw, (first_word + i) * 2,
+   &dataword[i]);
+   

Re: [dpdk-dev] [dpdk-web] [PATCH v2] update stable releases roadmap

2018-04-25 Thread Luca Boccassi
On Wed, 2018-04-25 at 09:33 +0100, Ferruh Yigit wrote:
> On 4/20/2018 4:52 PM, Aaron Conole wrote:
> > Kevin Traynor  writes:
> > 
> > > On 04/18/2018 02:28 PM, Thomas Monjalon wrote:
> > > > 18/04/2018 14:28, Ferruh Yigit:
> > > > > On 4/18/2018 10:14 AM, Thomas Monjalon wrote:
> > > > > > 18/04/2018 11:05, Ferruh Yigit:
> > > > > > > On 4/11/2018 12:28 AM, Thomas Monjalon wrote:
> > > > > > > > -   Typically a new stable release version
> > > > > > > > follows a mainline release
> > > > > > > > -   by 1-2 weeks, depending on the test results.
> > > > > > > > +   The first stable release (.1) of a branch
> > > > > > > > should follow
> > > > > > > > +   its mainline release (.0) by at least two
> > > > > > > > months,
> > > > > > > > +   after the first release candidate (-rc1) of
> > > > > > > > the next branch.
> > > > > > > 
> > > > > > > Hi Thomas,
> > > > > > > 
> > > > > > > What this change suggest? To be able to backport patches
> > > > > > > from rc1?
> > > > > > 
> > > > > > Yes, it is the proposal we discussed earlier.
> > > > > > We can wait one week after RC1 to get some validation
> > > > > > confirmation.
> > > > > > Do you agree?
> > > > > 
> > > > > This has been discussed in tech-board, what I remember the
> > > > > decision was to wait
> > > > > the release to backport patches into stable tree.
> > > 
> > > Any minutes? I couldn't find them
> > > 
> > > > It was not so clear to me.
> > > > I thought post-rc1 was acceptable. The idea is to speed-up
> > > > stable releases
> > > > pace, especially first release of a series.
> > > > 
> > > > 
> > > 
> > > I think timing of stable releases and bugfix backports to the
> > > stable
> > > branch are two separate items.
> > > 
> > > I do think that bugfix backports to stable should happen on a
> > > regular
> > > basis (e.g. every 2 weeks). Otherwise we are back to the
> > > situation where
> > > if there's a bugfix after a DPDK release, a user like (surprise,
> > > surprise) OVS may not be able to use that DPDK version for ~3
> > > months.
> > > 
> > > Someone who wants to get the latest bugfixes can just take the
> > > latest on
> > > the stable branch and importantly, can have confidence that the
> > > community has officially accepted those patches. If someone
> > > requires
> > > stable to be validated, then they have to wait until the release.
> > 
> > +1 - this seems to make the most sense to me.  Keep the patches
> > flowing,
> > but don't label/tag it until validation.  That serves an additional
> > function: developers know their CC's to stable are being processed.
> 
> Are stable trees verified?

Verification is one issue - so far, Intel and ATT have provided time
and resources to do some regression tests, but only at release time
(before tagging). And it has been a manual process.
It would be great if more companies would step up to help - and even
better if regressions could be automated (nightly job?).

The other issue is deciding when a patch is "good to go" - until now,
the criteria has been "when it's merged into master".
So either that criteria needs to change, and another equally
"authoritative" is decided on, or patches should get reviewed and
merged in master more often and more quickly :-P

We also have not been looking directly at the the various -next trees,
as things are more "in-flux" there and could be reverted, or clash with
changes from other trees - hence why we merge from master.

-- 
Kind regards,
Luca Boccassi


Re: [dpdk-dev] [PATCH] drivers/net: fix link autoneg value for virtual PMDs

2018-04-25 Thread Ferruh Yigit
On 4/20/2018 1:26 AM, Thomas Monjalon wrote:
> 20/04/2018 02:01, Ferruh Yigit:
>> These drivers never attempt link speed negotiation. Change link_autoneg
>> value to ETH_LINK_FIXED to be more accurate and consistent between PMDs.
>>
>> Fixes: 1e3a958f40b3 ("ethdev: fix link autonegotiation value")
>> Cc: sta...@dpdk.org
>>
>> Signed-off-by: Ferruh Yigit 
> 
> Acked-by: Thomas Monjalon 

Applied to dpdk-next-net/master, thanks.


Re: [dpdk-dev] [PATCH v7 6/9] ethdev: add common devargs parser

2018-04-25 Thread Thomas Monjalon
25/04/2018 11:40, Remy Horton:
> 
> On 24/04/2018 20:53, Thomas Monjalon wrote:
> [..]
> > But I would like to review the devargs you are standardizing.
> > Unfortunately, I cannot find a documentation about it.
> > How users are supposed to use it?
> > Can you, at least, describe the syntax in the commit log, please?
> 
> The patch follows this pseudo-BNF:
> 
> cfg   := pair (',' pair)*
> pair  := (key '=' value)
> key   := 'port' | 'representor'
> value := range | list
> range := int ('-' int)?
> int   := [0-9]+
> list  := '[' range (',' range)* ']'

OK
Please can you add it as a comment in the parsing code?

We will need one or two examples in the commit message too.

Can you show a complete command line please?
How do you give ethdev properties without the new devargs syntax
(in progress by Gaetan)?





[dpdk-dev] [PATCH v2] eal: remove call to unlock

2018-04-25 Thread Anatoly Burakov
Coverity was complaining about not checking result of call to
fcntl() for unlocking the file. Disregarding the fact that error
value returned from fcntl() unlock call is highly unlikely in the
first place, we are subsequently calling close() on that same fd,
which will drop the lock, which makes call to fcntl() unnecessary.

Fix this by removing a call to fcntl() altogether.

Coverity issue: 272607

Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Cc: anatoly.bura...@intel.com

Signed-off-by: Anatoly Burakov 
---

Notes:
v2:
- Removed call to fcntl() instead of handling return value

 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c 
b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index db5aabd..485a89e 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -304,11 +304,8 @@ clear_hugedir(const char * hugedir)
lck_result = fcntl(fd, F_SETLK, &lck);
 
/* if lock succeeds, unlock and remove the file */
-   if (lck_result != -1) {
-   lck.l_type = F_UNLCK;
-   fcntl(fd, F_SETLK, &lck);
+   if (lck_result != -1)
unlinkat(dir_fd, dirent->d_name, 0);
-   }
close (fd);
dirent = readdir(dir);
}
-- 
2.7.4


[dpdk-dev] [PATCH v2] Coverity fixes for EAL

2018-04-25 Thread Anatoly Burakov
This patchset fixes a few Coverity fixes in EAL introduced
by recent DPDK memory hotplug patchset.

Coverity issues fixed:
- 272607 - error condition not handled

Coverity issues not fixed:
- 272600 - negative return not handled
  - Proper usage of API guarantees no negative return

One of existing coverity issues is not fixed by this patchset:
- 272585 - memory leak

due to being independently discovered and addressed in a
separate patch [1].

[1] http://dpdk.org/dev/patchwork/patch/38301/

v2:
- Dropped fix for 272600 and marked it as false positive

Anatoly Burakov (1):
  eal: remove call to unlock

 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

-- 
2.7.4


[dpdk-dev] [PATCH v2 0/3] Coverity fixes for malloc

2018-04-25 Thread Anatoly Burakov
This patchset fixes a few Coverity issues in malloc
introduced by recent DPDK memory hotplug patchset.

Coverity issues fixed:
- 272566 - possible null dereference
- 272574 - use value before verification

The following coverity issues were not fixed:
- 272573 - calling memset with size 0
  - This is intentional, size will not be 0 in malloc debug
- 272571 - negative return not handled
  - False positive, proper API usage ensures no negative returns
- 272590 - negative return not handled
  - Same as above
- 272597 - negative return not handled
  - Same as above

Also, replace all instaces of snprintf with strlcpy.

v2:
- Dropped fixes for 272571, 272590, 272597 as false positives

Anatoly Burakov (3):
  malloc: replace snprintf with strlcpy
  malloc: fix potential out-of-bounds array access
  malloc: fix potential dereferencing of NULL pointer

 lib/librte_eal/common/malloc_elem.c |  6 ++
 lib/librte_eal/common/malloc_heap.c |  3 ++-
 lib/librte_eal/common/malloc_mp.c   | 23 +++
 3 files changed, 19 insertions(+), 13 deletions(-)

-- 
2.7.4


[dpdk-dev] [PATCH v2 3/3] malloc: fix potential dereferencing of NULL pointer

2018-04-25 Thread Anatoly Burakov
Previous code checked for both first/last elements being NULL,
but if they weren't, the expectation was that they're both
non-NULL, which will be the case under normal conditions, but
may not be the case due to heap structure corruption.

Coverity issue: 272566

Fixes: bb372060dad4 ("malloc: make heap a doubly-linked list")
Cc: anatoly.bura...@intel.com

Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/common/malloc_elem.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/lib/librte_eal/common/malloc_elem.c 
b/lib/librte_eal/common/malloc_elem.c
index ee79dcd..af81961 100644
--- a/lib/librte_eal/common/malloc_elem.c
+++ b/lib/librte_eal/common/malloc_elem.c
@@ -49,6 +49,12 @@ malloc_elem_insert(struct malloc_elem *elem)
struct malloc_elem *prev_elem, *next_elem;
struct malloc_heap *heap = elem->heap;
 
+   /* first and last elements must be both NULL or both non-NULL */
+   if ((heap->first == NULL) != (heap->last == NULL)) {
+   RTE_LOG(ERR, EAL, "Heap is probably corrupt\n");
+   return;
+   }
+
if (heap->first == NULL && heap->last == NULL) {
/* if empty heap */
heap->first = elem;
-- 
2.7.4


[dpdk-dev] [PATCH v2 2/3] malloc: fix potential out-of-bounds array access

2018-04-25 Thread Anatoly Burakov
Technically, while the pointer would've been invalid if msl_idx
were invalid, we wouldn't have actually attempted to access the
pointer until verifying the index. Fix it by moving array access
to after we've verified validity of the index.

Coverity issue: 272574

Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Cc: anatoly.bura...@intel.com

Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/common/malloc_heap.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/malloc_heap.c 
b/lib/librte_eal/common/malloc_heap.c
index 590e9e3..5cf7231 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -99,11 +99,12 @@ malloc_add_seg(const struct rte_memseg_list *msl,
 
/* msl is const, so find it */
msl_idx = msl - mcfg->memsegs;
-   found_msl = &mcfg->memsegs[msl_idx];
 
if (msl_idx < 0 || msl_idx >= RTE_MAX_MEMSEG_LISTS)
return -1;
 
+   found_msl = &mcfg->memsegs[msl_idx];
+
malloc_heap_add_memory(heap, found_msl, ms->addr, len);
 
RTE_LOG(DEBUG, EAL, "Added %zuM to heap on socket %i\n", len >> 20,
-- 
2.7.4


[dpdk-dev] [PATCH v2 1/3] malloc: replace snprintf with strlcpy

2018-04-25 Thread Anatoly Burakov
Signed-off-by: Anatoly Burakov 
---
 lib/librte_eal/common/malloc_mp.c | 23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/lib/librte_eal/common/malloc_mp.c 
b/lib/librte_eal/common/malloc_mp.c
index 72b1f4c..931c14b 100644
--- a/lib/librte_eal/common/malloc_mp.c
+++ b/lib/librte_eal/common/malloc_mp.c
@@ -7,6 +7,7 @@
 
 #include 
 #include 
+#include 
 
 #include "eal_memalloc.h"
 
@@ -159,7 +160,7 @@ handle_sync(const struct rte_mp_msg *msg, const void *peer)
memset(&reply, 0, sizeof(reply));
 
reply.num_fds = 0;
-   snprintf(reply.name, sizeof(reply.name), "%s", msg->name);
+   strlcpy(reply.name, msg->name, sizeof(reply.name));
reply.len_param = sizeof(*resp);
 
ret = eal_memalloc_sync_with_primary();
@@ -274,8 +275,8 @@ handle_request(const struct rte_mp_msg *msg, const void 
*peer __rte_unused)
/* send failure message straight away */
resp_msg.num_fds = 0;
resp_msg.len_param = sizeof(*resp);
-   snprintf(resp_msg.name, sizeof(resp_msg.name), "%s",
-   MP_ACTION_RESPONSE);
+   strlcpy(resp_msg.name, MP_ACTION_RESPONSE,
+   sizeof(resp_msg.name));
 
resp->t = m->t;
resp->result = REQ_RESULT_FAIL;
@@ -298,8 +299,7 @@ handle_request(const struct rte_mp_msg *msg, const void 
*peer __rte_unused)
/* we can do something, so send sync request asynchronously */
sr_msg.num_fds = 0;
sr_msg.len_param = sizeof(*sr);
-   snprintf(sr_msg.name, sizeof(sr_msg.name), "%s",
-   MP_ACTION_SYNC);
+   strlcpy(sr_msg.name, MP_ACTION_SYNC, sizeof(sr_msg.name));
 
ts.tv_nsec = 0;
ts.tv_sec = MP_TIMEOUT_S;
@@ -393,7 +393,7 @@ handle_sync_response(const struct rte_mp_msg *request,
resp->id = entry->user_req.id;
msg.num_fds = 0;
msg.len_param = sizeof(*resp);
-   snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+   strlcpy(msg.name, MP_ACTION_RESPONSE, sizeof(msg.name));
 
if (rte_mp_sendmsg(&msg))
RTE_LOG(ERR, EAL, "Could not send message to secondary 
process\n");
@@ -417,7 +417,7 @@ handle_sync_response(const struct rte_mp_msg *request,
resp->id = entry->user_req.id;
msg.num_fds = 0;
msg.len_param = sizeof(*resp);
-   snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+   strlcpy(msg.name, MP_ACTION_RESPONSE, sizeof(msg.name));
 
if (rte_mp_sendmsg(&msg))
RTE_LOG(ERR, EAL, "Could not send message to secondary 
process\n");
@@ -444,8 +444,7 @@ handle_sync_response(const struct rte_mp_msg *request,
/* send rollback request */
rb_msg.num_fds = 0;
rb_msg.len_param = sizeof(*rb);
-   snprintf(rb_msg.name, sizeof(rb_msg.name), "%s",
-   MP_ACTION_ROLLBACK);
+   strlcpy(rb_msg.name, MP_ACTION_ROLLBACK, sizeof(rb_msg.name));
 
ts.tv_nsec = 0;
ts.tv_sec = MP_TIMEOUT_S;
@@ -515,7 +514,7 @@ handle_rollback_response(const struct rte_mp_msg *request,
resp->id = mpreq->id;
msg.num_fds = 0;
msg.len_param = sizeof(*resp);
-   snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_RESPONSE);
+   strlcpy(msg.name, MP_ACTION_RESPONSE, sizeof(msg.name));
 
if (rte_mp_sendmsg(&msg))
RTE_LOG(ERR, EAL, "Could not send message to secondary 
process\n");
@@ -577,7 +576,7 @@ request_sync(void)
 
msg.num_fds = 0;
msg.len_param = sizeof(*req);
-   snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_SYNC);
+   strlcpy(msg.name, MP_ACTION_SYNC, sizeof(msg.name));
 
/* sync request carries no data */
req->t = REQ_TYPE_SYNC;
@@ -668,7 +667,7 @@ request_to_primary(struct malloc_mp_req *user_req)
 
msg.num_fds = 0;
msg.len_param = sizeof(*msg_req);
-   snprintf(msg.name, sizeof(msg.name), "%s", MP_ACTION_REQUEST);
+   strlcpy(msg.name, MP_ACTION_REQUEST, sizeof(msg.name));
 
/* (attempt to) get a unique id */
user_req->id = get_unique_id();
-- 
2.7.4


Re: [dpdk-dev] [PATCH v3 0/5] log level control enhancements

2018-04-25 Thread Thomas Monjalon
> Stephen Hemminger (5):
>   eal: make syslog facility table const
>   eal: allow symbolic log levels
>   eal: make eal_log_level save private
>   log: add ability to match dynamic log based on shell pattern
>   doc: update guides for current preferrred log level syntax

Applied (with small fix in map file), thanks





Re: [dpdk-dev] [PATCH] ethdev: remove experimental flag of ports enumeration

2018-04-25 Thread Ferruh Yigit
On 4/24/2018 3:15 AM, Thomas Monjalon wrote:
> The basic operations for ports enumeration should not be
> considered as experimental in DPDK 18.05.
> 
> The iterator RTE_ETH_FOREACH_DEV was introduced in DPDK 17.05.
> It uses the function the rte_eth_find_next_owned_by() to get
> only ownerless ports. Its API can be considered stable.
> So the flag experimental is removed from rte_eth_find_next_owned_by().
> 
> The flag experimental is removed from rte_eth_dev_count_avail()
> which is the new name of the old function rte_eth_dev_count().
> 
> The flag experimental is set to rte_eth_dev_count_total()
> in the .c file for consistency with the declaration in the .h file.
> 
> A lot of internal applications are fixed to not allow experimental API.
> 
> Fixes: 8728ccf37615 ("fix ethdev ports enumeration")
> Fixes: d9a42a69febf ("ethdev: deprecate port count function")
> Fixes: e70e26861eaf ("net/mvpp2: fix build")
> 
> Signed-off-by: Thomas Monjalon 

Getting some build errors [1], it seems some samples are using some other
experimental APIs so that we can't remove the flag for them.


[1]
.../dpdk/examples/tep_termination/main.c: In function ‘main’:
.../dpdk/examples/tep_termination/main.c:1209:3: error: ‘rte_ctrl_thread_create’
is deprecated: Symbol is not yet part of stable ABI
[-Werror=deprecated-declarations]
   ret = rte_ctrl_thread_create(&tid, "print-stats", NULL,
   ^~~

.../dpdk/examples/vhost/main.c: In function ‘main’:
.../dpdk/examples/vhost/main.c:1497:3: error: ‘rte_ctrl_thread_create’ is
deprecated: Symbol is not yet part of stable ABI 
[-Werror=deprecated-declarations]
   ret = rte_ctrl_thread_create(&tid, "print-stats", NULL,
   ^~~


Re: [dpdk-dev] [dpdk-stable] [PATCH v1] net/vdev_netvsc: fix creating short name devices

2018-04-25 Thread Matan Azrad


From: Ferruh Yigit, Tuesday, April 24, 2018 5:20 PM
> On 4/10/2018 4:39 PM, Matan Azrad wrote:
> > It is OK for me.
> 
> Converting this to explicit ack:
> Acked-by: Matan Azrad 
> 

Ok, don't forget to remove the fixes line.

> > Thanks.
> >
> > From: Ophir Munk, Tuesday, April 10, 2018 6:36 PM
> >> Hi.
> >> Discussed with Thomas.
> >> Please consider the following commit message:
> >>
> >> net/vdev_netvsc: shorten devices names
> >>
> >> Prior to this commit the vdev_netvsc PMD was creating tap and
> >> failsafe devices with long names, such as "net_tap_net_vdev_netvsc0"
> >> or "net_failsafe_net_vdev_netvsc0".
> >> This commits creates tap and failsafe devices with short names such
> >> as "net_tap_netvsc0" or "net_failsafe_netvsc0".
> >>
> >>> -Original Message-
> >>> From: Matan Azrad
> >>> Sent: Tuesday, April 10, 2018 11:04 AM
> >>> To: Ophir Munk ; dev@dpdk.org
> >>> Cc: Thomas Monjalon ; Olga Shern
> >>> ; sta...@dpdk.org
> >>> Subject: RE: [PATCH v1] net/vdev_netvsc: fix creating short name
> >>> devices
> >>>
> >>> Hi Ophir
> >>>
> >>> From: Ophir Munk, Tuesday, April 10, 2018 10:20 AM
>  Prior to this commit the vdev_netvsc PMD was creating tap and
>  failsafe devices with long names, such as "net_tap_net_vdev_netvsc0"
>  or "net_failsafe_net_vdev_netvsc0".
>  Long names containing more than 32 characters may be rejected by
>  some APIs (e.g. membuf pool creation).
> >>>
> >>> Since EAL allows to use long names, I don't think it is a problem of
> >>> the netvsc device.
> >>> If a DPDK entity wants to use this name for some reason it needs to
> >>> adjust it to the usage.
> >>>
> >>> I agree that short names are better and may help for such like cases.
> >>>
> >>> I suggest the next title:
> >>> net/vdev_netvsc: use short device names
> >>>
>  This commits fixes this issue by creating tap and failsafe devices
>  with short names such as "tap_net_vsc0" or "net_failsafe_vsc0".
> 
>  Fixes: e7dc5d7becc5 ("net/vdev_netvsc: implement core
>  functionality")
>  Cc: sta...@dpdk.org
> 
>  Signed-off-by: Ophir Munk 
>  ---
>   drivers/net/vdev_netvsc/vdev_netvsc.c | 8 
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
>  diff --git a/drivers/net/vdev_netvsc/vdev_netvsc.c
>  b/drivers/net/vdev_netvsc/vdev_netvsc.c
>  index db0080a..bb2f78d 100644
>  --- a/drivers/net/vdev_netvsc/vdev_netvsc.c
>  +++ b/drivers/net/vdev_netvsc/vdev_netvsc.c
>  @@ -614,13 +614,13 @@ vdev_netvsc_netvsc_probe(const struct
>  if_nameindex *iface,
>  name, ctx->id);
>   if (ret == -1 || (size_t)ret >= sizeof(ctx->name))
>   ++i;
>  -ret = snprintf(ctx->devname, sizeof(ctx->devname),
>  "net_failsafe_%s",
>  -   ctx->name);
>  +ret = snprintf(ctx->devname, sizeof(ctx->devname),
>  "net_failsafe_vsc%u",
>  +   ctx->id);
>   if (ret == -1 || (size_t)ret >= sizeof(ctx->devname))
>   ++i;
>   ret = snprintf(ctx->devargs, sizeof(ctx->devargs),
>  -   "fd(%d),dev(net_tap_%s,remote=%s)",
>  -   ctx->pipe[0], ctx->name, ctx->if_name);
>  +   "fd(%d),dev(net_tap_vsc%u,remote=%s)",
>  +   ctx->pipe[0], ctx->id, ctx->if_name);
>   if (ret == -1 || (size_t)ret >= sizeof(ctx->devargs))
>   ++i;
>   if (i) {
>  --
>  2.7.4
> >



Re: [dpdk-dev] [PATCH v1 2/6] mempool: implement abstract mempool info API

2018-04-25 Thread Olivier Matz
Hi Andrew,

> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > + *
> > > + * Additional information about the mempool
> > > + */
> > > +struct rte_mempool_info;
> > > +
> > [...]
> > 
> > > +/* wrapper to get additional mempool info */
> > > +int
> > > +rte_mempool_ops_get_info(const struct rte_mempool *mp,
> > > +  struct rte_mempool_info *info)
> > > +{
> > > + struct rte_mempool_ops *ops;
> > > +
> > > + ops = rte_mempool_get_ops(mp->ops_index);
> > > +
> > > + RTE_FUNC_PTR_OR_ERR_RET(ops->get_info, -ENOTSUP);
> > > + return ops->get_info(mp, info);
> > > +}
> > Thinking in terms of ABI compatibility, it looks that each time we will
> > add or remove a field, it will break the ABI because the info structure
> > will change.
> > 
> > Well, it's maybe nitpicking, because most of the time adding a field in
> > info structure goes with adding a field in the mempool struct, which
> > will anyway break the ABI.
> > 
> > Another approach is to have a function
> > rte_mempool_info_contig_block_size(mp) whose ABI can be more easily
> > wrapped with VERSION_SYMBOL().
> > 
> > On my side I'm fine with your current approach, especially given how few
> > usages of VERSION_SYMBOL() we can find in DPDK. But in case you feel
> > it's better to have a function...
> 
> I'd prefer to keep current solution. Otherwise it could result in too many
> different functions to get various information about mempool driver
> features/characteristics. Also it could be not very convenient to get
> many parameters.
> 
> May be we should align info structure size to cache line to avoid size
> changes in many cases? Typically it will be used on slow path and
> located on caller stack and adding some bytes more should not
> be a problem.

Yes, that could be a good thing to do.

Thanks,
Olivier


Re: [dpdk-dev] [PATCH v6 1/5] ethdev: add access to eeprom

2018-04-25 Thread Thomas Monjalon
25/04/2018 12:01, Zijie Pan:
> add new APIs:
> - rte_eth_dev_get_module_info
> - rte_eth_dev_get_module_eeprom
> 
> Signed-off-by: Zijie Pan 
> Acked-by: Remy Horton 

Acked-by: Thomas Monjalon 




Re: [dpdk-dev] [PATCH] ethdev: remove experimental flag of ports enumeration

2018-04-25 Thread Thomas Monjalon
25/04/2018 12:21, Ferruh Yigit:
> On 4/24/2018 3:15 AM, Thomas Monjalon wrote:
> > The basic operations for ports enumeration should not be
> > considered as experimental in DPDK 18.05.
> > 
> > The iterator RTE_ETH_FOREACH_DEV was introduced in DPDK 17.05.
> > It uses the function the rte_eth_find_next_owned_by() to get
> > only ownerless ports. Its API can be considered stable.
> > So the flag experimental is removed from rte_eth_find_next_owned_by().
> > 
> > The flag experimental is removed from rte_eth_dev_count_avail()
> > which is the new name of the old function rte_eth_dev_count().
> > 
> > The flag experimental is set to rte_eth_dev_count_total()
> > in the .c file for consistency with the declaration in the .h file.
> > 
> > A lot of internal applications are fixed to not allow experimental API.
> > 
> > Fixes: 8728ccf37615 ("fix ethdev ports enumeration")
> > Fixes: d9a42a69febf ("ethdev: deprecate port count function")
> > Fixes: e70e26861eaf ("net/mvpp2: fix build")
> > 
> > Signed-off-by: Thomas Monjalon 
> 
> Getting some build errors [1], it seems some samples are using some other
> experimental APIs so that we can't remove the flag for them.
> 
> 
> [1]
> .../dpdk/examples/tep_termination/main.c: In function ‘main’:
> .../dpdk/examples/tep_termination/main.c:1209:3: error: 
> ‘rte_ctrl_thread_create’
> is deprecated: Symbol is not yet part of stable ABI
> [-Werror=deprecated-declarations]
>ret = rte_ctrl_thread_create(&tid, "print-stats", NULL,
>^~~
> 
> .../dpdk/examples/vhost/main.c: In function ‘main’:
> .../dpdk/examples/vhost/main.c:1497:3: error: ‘rte_ctrl_thread_create’ is
> deprecated: Symbol is not yet part of stable ABI 
> [-Werror=deprecated-declarations]
>ret = rte_ctrl_thread_create(&tid, "print-stats", NULL,
>^~~

Ah sorry, I think it is due to a change in next-net.
I have prepared this patch on master.

Please can you fix it when applying?





Re: [dpdk-dev] [dpdk-stable] [PATCH v1] net/vdev_netvsc: fix creating short name devices

2018-04-25 Thread Ferruh Yigit
On 4/25/2018 11:25 AM, Matan Azrad wrote:
> 
> 
> From: Ferruh Yigit, Tuesday, April 24, 2018 5:20 PM
>> On 4/10/2018 4:39 PM, Matan Azrad wrote:
>>> It is OK for me.
>>
>> Converting this to explicit ack:
>> Acked-by: Matan Azrad 
>>
> 
> Ok, don't forget to remove the fixes line.

Don't forget? Should fixes line be removed, why?

> 
>>> Thanks.
>>>
>>> From: Ophir Munk, Tuesday, April 10, 2018 6:36 PM
 Hi.
 Discussed with Thomas.
 Please consider the following commit message:

 net/vdev_netvsc: shorten devices names

 Prior to this commit the vdev_netvsc PMD was creating tap and
 failsafe devices with long names, such as "net_tap_net_vdev_netvsc0"
 or "net_failsafe_net_vdev_netvsc0".
 This commits creates tap and failsafe devices with short names such
 as "net_tap_netvsc0" or "net_failsafe_netvsc0".

> -Original Message-
> From: Matan Azrad
> Sent: Tuesday, April 10, 2018 11:04 AM
> To: Ophir Munk ; dev@dpdk.org
> Cc: Thomas Monjalon ; Olga Shern
> ; sta...@dpdk.org
> Subject: RE: [PATCH v1] net/vdev_netvsc: fix creating short name
> devices
>
> Hi Ophir
>
> From: Ophir Munk, Tuesday, April 10, 2018 10:20 AM
>> Prior to this commit the vdev_netvsc PMD was creating tap and
>> failsafe devices with long names, such as "net_tap_net_vdev_netvsc0"
>> or "net_failsafe_net_vdev_netvsc0".
>> Long names containing more than 32 characters may be rejected by
>> some APIs (e.g. membuf pool creation).
>
> Since EAL allows to use long names, I don't think it is a problem of
> the netvsc device.
> If a DPDK entity wants to use this name for some reason it needs to
> adjust it to the usage.
>
> I agree that short names are better and may help for such like cases.
>
> I suggest the next title:
> net/vdev_netvsc: use short device names
>
>> This commits fixes this issue by creating tap and failsafe devices
>> with short names such as "tap_net_vsc0" or "net_failsafe_vsc0".
>>
>> Fixes: e7dc5d7becc5 ("net/vdev_netvsc: implement core
>> functionality")
>> Cc: sta...@dpdk.org
>>
>> Signed-off-by: Ophir Munk 
>> ---
>>  drivers/net/vdev_netvsc/vdev_netvsc.c | 8 
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/net/vdev_netvsc/vdev_netvsc.c
>> b/drivers/net/vdev_netvsc/vdev_netvsc.c
>> index db0080a..bb2f78d 100644
>> --- a/drivers/net/vdev_netvsc/vdev_netvsc.c
>> +++ b/drivers/net/vdev_netvsc/vdev_netvsc.c
>> @@ -614,13 +614,13 @@ vdev_netvsc_netvsc_probe(const struct
>> if_nameindex *iface,
>> name, ctx->id);
>>  if (ret == -1 || (size_t)ret >= sizeof(ctx->name))
>>  ++i;
>> -ret = snprintf(ctx->devname, sizeof(ctx->devname),
>> "net_failsafe_%s",
>> -   ctx->name);
>> +ret = snprintf(ctx->devname, sizeof(ctx->devname),
>> "net_failsafe_vsc%u",
>> +   ctx->id);
>>  if (ret == -1 || (size_t)ret >= sizeof(ctx->devname))
>>  ++i;
>>  ret = snprintf(ctx->devargs, sizeof(ctx->devargs),
>> -   "fd(%d),dev(net_tap_%s,remote=%s)",
>> -   ctx->pipe[0], ctx->name, ctx->if_name);
>> +   "fd(%d),dev(net_tap_vsc%u,remote=%s)",
>> +   ctx->pipe[0], ctx->id, ctx->if_name);
>>  if (ret == -1 || (size_t)ret >= sizeof(ctx->devargs))
>>  ++i;
>>  if (i) {
>> --
>> 2.7.4
>>>
> 



Re: [dpdk-dev] [v2, 6/6] doc: add event crypto adapter documentation

2018-04-25 Thread Kovacevic, Marko
Few things below.

> Signed-off-by: Abhinandan Gujjar 
> ---
>  MAINTAINERS|   7 +
>  doc/api/doxy-api-index.md  |   1 +
>  doc/guides/prog_guide/event_crypto_adapter.rst | 236
>  doc/guides/prog_guide/index.rst|   1 +
>  doc/guides/rel_notes/release_18_05.rst |   6 +
>  5 files changed, 251 insertions(+)
>  create mode 100644 doc/guides/prog_guide/event_crypto_adapter.rst

<...>

> +Event Crypto Adapter Library
> +
> +
> +The DPDK Event device library
> +'`_

I'd suggest making a change to the link it looks messy the way it is at the 
moment 
can you link the file this way please, 

:doc:`DPDK Event device library `


> +provides event driven programming model with features to schedule
> events.
> +The cryptodev library
> +'`_

Same above:   :doc:`cryptodev library `


> +provides interface to crypto poll mode drivers which supports different
> crypto operations.
> +The Event Crypto Adapter is one of the event adapter which is intended
> +to bridge between event devices and cryptodev.

<...>

> +The ``rte_event_crypto_adapter_create_ext()`` function is passed as a
> +callback function. The callback function is invoked if the adapter
> +needs to use a service function and needs to create an event port for
> +it. The callback is expected to fill the ``struct
> +rte_event_crypto_adapter_conf`` structure passed to it.
> +
> +For ENQ-DEQ mode, the event port created by adapter can be retrived

Spelling retrived / retrieved 

> +using ``rte_event_crypto_adapter_event_port_get()`` API.
> +Application can use this event port to link with event queue on which
> +it enqueue events towards the crypto adapter.


<...>

> +
> +Adding queue pair to the adapter instance
> +-
> +
> +Cryptodev device id and queue pair are created using cryptodev APIs.
> +Refer '`_.

Change above:
And maybe instead of just Refer, maybe try "For more information click here" 
Just a suggestion

:doc:`here  `.

<...>

> +
> +Get adapter statistics
> +--
> +
> +The  rte_event_crypto_adapter_stats_get()`` function reports counters

Missing a `` at the beginning of this function above.


> +defined in struct ``rte_event_crypto_adapter_stats``. The received
> +packet and enqueued event counts are a sum of the counts from the
> +eventdev PMD callbacks if the callback is supported, and the counts
> +maintained by the service function, if one exists.

<...>

Also try and keep the code blocks with 4 to 8 spaces in some cases there is a 
lot of tabs.
Especially as you  have spaces in a few and not others.

Thanks

Marko K.


Re: [dpdk-dev] [PATCH v2] eal: remove call to unlock

2018-04-25 Thread Thomas Monjalon
25/04/2018 12:08, Anatoly Burakov:
> Coverity was complaining about not checking result of call to
> fcntl() for unlocking the file. Disregarding the fact that error
> value returned from fcntl() unlock call is highly unlikely in the
> first place, we are subsequently calling close() on that same fd,
> which will drop the lock, which makes call to fcntl() unnecessary.
> 
> Fix this by removing a call to fcntl() altogether.
> 
> Coverity issue: 272607
> 
> Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
> Cc: anatoly.bura...@intel.com
> 
> Signed-off-by: Anatoly Burakov 

Applied, thanks




Re: [dpdk-dev] [dpdk-stable] [PATCH v1] net/vdev_netvsc: fix creating short name devices

2018-04-25 Thread Matan Azrad


From: Ferruh Yigit, Wednesday, April 25, 2018 1:29 PM
> On 4/25/2018 11:25 AM, Matan Azrad wrote:
> >
> >
> > From: Ferruh Yigit, Tuesday, April 24, 2018 5:20 PM
> >> On 4/10/2018 4:39 PM, Matan Azrad wrote:
> >>> It is OK for me.
> >>
> >> Converting this to explicit ack:
> >> Acked-by: Matan Azrad 
> >>
> >
> > Ok, don't forget to remove the fixes line.
> 
> Don't forget? Should fixes line be removed, why?

This is not a fix.

> >>> Thanks.
> >>>
> >>> From: Ophir Munk, Tuesday, April 10, 2018 6:36 PM
>  Hi.
>  Discussed with Thomas.
>  Please consider the following commit message:
> 
>  net/vdev_netvsc: shorten devices names
> 
>  Prior to this commit the vdev_netvsc PMD was creating tap and
>  failsafe devices with long names, such as "net_tap_net_vdev_netvsc0"
>  or "net_failsafe_net_vdev_netvsc0".
>  This commits creates tap and failsafe devices with short names such
>  as "net_tap_netvsc0" or "net_failsafe_netvsc0".
> 
> > -Original Message-
> > From: Matan Azrad
> > Sent: Tuesday, April 10, 2018 11:04 AM
> > To: Ophir Munk ; dev@dpdk.org
> > Cc: Thomas Monjalon ; Olga Shern
> > ; sta...@dpdk.org
> > Subject: RE: [PATCH v1] net/vdev_netvsc: fix creating short name
> > devices
> >
> > Hi Ophir
> >
> > From: Ophir Munk, Tuesday, April 10, 2018 10:20 AM
> >> Prior to this commit the vdev_netvsc PMD was creating tap and
> >> failsafe devices with long names, such as
> "net_tap_net_vdev_netvsc0"
> >> or "net_failsafe_net_vdev_netvsc0".
> >> Long names containing more than 32 characters may be rejected by
> >> some APIs (e.g. membuf pool creation).
> >
> > Since EAL allows to use long names, I don't think it is a problem
> > of the netvsc device.
> > If a DPDK entity wants to use this name for some reason it needs
> > to adjust it to the usage.
> >
> > I agree that short names are better and may help for such like cases.
> >
> > I suggest the next title:
> > net/vdev_netvsc: use short device names
> >
> >> This commits fixes this issue by creating tap and failsafe
> >> devices with short names such as "tap_net_vsc0" or
> "net_failsafe_vsc0".
> >>
> >> Fixes: e7dc5d7becc5 ("net/vdev_netvsc: implement core
> >> functionality")
> >> Cc: sta...@dpdk.org
> >>
> >> Signed-off-by: Ophir Munk 
> >> ---
> >>  drivers/net/vdev_netvsc/vdev_netvsc.c | 8 
> >>  1 file changed, 4 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/drivers/net/vdev_netvsc/vdev_netvsc.c
> >> b/drivers/net/vdev_netvsc/vdev_netvsc.c
> >> index db0080a..bb2f78d 100644
> >> --- a/drivers/net/vdev_netvsc/vdev_netvsc.c
> >> +++ b/drivers/net/vdev_netvsc/vdev_netvsc.c
> >> @@ -614,13 +614,13 @@ vdev_netvsc_netvsc_probe(const struct
> >> if_nameindex *iface,
> >>   name, ctx->id);
> >>if (ret == -1 || (size_t)ret >= sizeof(ctx->name))
> >>++i;
> >> -  ret = snprintf(ctx->devname, sizeof(ctx->devname),
> >> "net_failsafe_%s",
> >> - ctx->name);
> >> +  ret = snprintf(ctx->devname, sizeof(ctx->devname),
> >> "net_failsafe_vsc%u",
> >> + ctx->id);
> >>if (ret == -1 || (size_t)ret >= sizeof(ctx->devname))
> >>++i;
> >>ret = snprintf(ctx->devargs, sizeof(ctx->devargs),
> >> - "fd(%d),dev(net_tap_%s,remote=%s)",
> >> - ctx->pipe[0], ctx->name, ctx->if_name);
> >> + "fd(%d),dev(net_tap_vsc%u,remote=%s)",
> >> + ctx->pipe[0], ctx->id, ctx->if_name);
> >>if (ret == -1 || (size_t)ret >= sizeof(ctx->devargs))
> >>++i;
> >>if (i) {
> >> --
> >> 2.7.4
> >>>
> >



[dpdk-dev] [PATCH v3 2/2] mem: revert to using flock() and add per-segment lockfiles

2018-04-25 Thread Anatoly Burakov
The original implementation used flock() locks, but was later
switched to using fcntl() locks for page locking, because
fcntl() locks allow locking parts of a file, which is useful
for single-file segments mode, where locking the entire file
isn't as useful because we still need to grow and shrink it.

However, according to fcntl()'s Ubuntu manpage [1], semantics of
fcntl() locks have a giant oversight:

  This interface follows the completely stupid semantics of System
  V and IEEE Std 1003.1-1988 (“POSIX.1”) that require that all
  locks associated with a file for a given process are removed
  when any file descriptor for that file is closed by that process.
  This semantic means that applications must be aware of any files
  that a subroutine library may access.

Basically, closing *any* fd with an fcntl() lock (which we do because
we don't want to leak fd's) will drop the lock completely.

So, in this commit, we will be reverting back to using flock() locks
everywhere. However, that still leaves the problem of locking parts
of a memseg list file in single file segments mode, and we will be
solving it with creating separate lock files per each page, and
tracking those with flock().

We will also be removing all of this tailq business and replacing it
with a simple array - saving a few bytes is not worth the extra
hassle of dealing with pointers and potential memory allocation
failures. Also, remove the tailq lock since it is not needed - these
fd lists are per-process, and within a given process, it is always
only one thread handling access to hugetlbfs.

So, first one to allocate a segment will create a lockfile, and put
a shared lock on it. When we're shrinking the page file, we will be
trying to take out a write lock on that lockfile, which would fail if
any other process is holding onto the lockfile as well. This way, we
can know if we can shrink the segment file. Also, if no other locks
are found in the lock list for a given memseg list, the memseg list
fd is automatically closed.

One other thing to note is, according to flock() Ubuntu manpage [2],
upgrading the lock from shared to exclusive is implemented by dropping
and reacquiring the lock, which is not atomic and thus would have
created race conditions. So, on attempting to perform operations in
hugetlbfs, we will take out a writelock on hugetlbfs directory, so
that only one process could perform hugetlbfs operations concurrently.

[1] http://manpages.ubuntu.com/manpages/artful/en/man2/fcntl.2freebsd.html
[2] http://manpages.ubuntu.com/manpages/bionic/en/man2/flock.2.html

Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
Fixes: a5ff05d60fc5 ("mem: support unmapping pages at runtime")
Fixes: 2a04139f66b4 ("eal: add single file segments option")
Cc: anatoly.bura...@intel.com

Signed-off-by: Anatoly Burakov 
Acked-by: Bruce Richardson 
---

Notes:
v3:
- Make lock naming consistent with hugetlbfs file naming

v2:
- Make lockfiles hidden
- renamed functions as per Bruce's comments

We could've used OFD locks [1] instead of flock(), but they are
only supported on kernel 3.15+ [2], so we're forced to use this
flock()-based contraption to support older kernels.

[1] 
https://www.gnu.org/software/libc/manual/html_node/Open-File-Description-Locks.html
[2] http://man7.org/linux/man-pages/man2/fcntl.2.html

 lib/librte_eal/common/eal_filesystem.h  |  17 +
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c |  28 +-
 lib/librte_eal/linuxapp/eal/eal_memalloc.c  | 539 +++-
 lib/librte_eal/linuxapp/eal/eal_memory.c|  22 +-
 4 files changed, 368 insertions(+), 238 deletions(-)

diff --git a/lib/librte_eal/common/eal_filesystem.h 
b/lib/librte_eal/common/eal_filesystem.h
index ad059ef..0a82f89 100644
--- a/lib/librte_eal/common/eal_filesystem.h
+++ b/lib/librte_eal/common/eal_filesystem.h
@@ -115,6 +115,23 @@ eal_get_hugefile_path(char *buffer, size_t buflen, const 
char *hugedir, int f_id
return buffer;
 }
 
+/** String format for hugepage map lock files. */
+#define HUGEFILE_LOCK_FMT "%s/.%smap_%d.lock"
+
+static inline const char *
+eal_get_hugefile_lock_path(char *buffer, size_t buflen, int f_id)
+{
+   const char *directory = default_config_dir;
+   const char *home_dir = getenv("HOME");
+
+   if (getuid() != 0 && home_dir != NULL)
+   directory = home_dir;
+   snprintf(buffer, buflen - 1, HUGEFILE_LOCK_FMT, directory,
+   internal_config.hugefile_prefix, f_id);
+   buffer[buflen - 1] = '\0';
+   return buffer;
+}
+
 /** define the default filename prefix for the %s values above */
 #define HUGEFILE_PREFIX_DEFAULT "rte"
 
diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c 
b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index db5aabd..7eca711 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/libr

[dpdk-dev] [PATCH v3 0/2] Fix file locking in EAL memory

2018-04-25 Thread Anatoly Burakov
This patchset replaces the fcntl()-based locking used in
the original DPDK memory hotplug patchset, to an flock()-
and lockfile-based implementation, due to numerous (well,
one, really) problems with how fcntl() locks work.

Long story short, fcntl() locks will be dropped if any
fd referring to locked file, is closed - even if it's not
the last fd, even if it wasn't even the fd that was used
to lock the file in the first place, even if it wasn't
you who closed that fd, but some other library.

This patchset corrects this saddening design defect in the
original implementation.

One of the ways to work around this was using OFD locks,
but they are only supported on kernels 3.15+, so we cannot
rely on them if we want to support old kernels. Hence, we
use per-segment lockfiles. The number of file descriptors
we open does not end up more than in non-single file
segments case - we still open the same amount of files (a
file per page), plus a file per memseg list.

Additionally, since flock() is not atomic, we also lock the
hugepage dir to prevent multiple processes from concurrently
performing operations on hugetlbfs mounts.

If you know of a more enlightened way of fixing this
limitation, you are certainly welcome to comment :)

v3:
- Made lockfile naming more consistent with hugetlbfs file naming

v2:
- Fixes as per review comments
- Make lockfiles hidden by default

Anatoly Burakov (2):
  mem: add memalloc init stage
  mem: revert to using flock() and add per-segment lockfiles

 lib/librte_eal/bsdapp/eal/eal_memalloc.c|   6 +
 lib/librte_eal/common/eal_common_memory.c   |   3 +
 lib/librte_eal/common/eal_filesystem.h  |  17 +
 lib/librte_eal/common/eal_memalloc.h|   3 +
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c |  28 +-
 lib/librte_eal/linuxapp/eal/eal_memalloc.c  | 605 +++-
 lib/librte_eal/linuxapp/eal/eal_memory.c|  22 +-
 7 files changed, 420 insertions(+), 264 deletions(-)

-- 
2.7.4


[dpdk-dev] [PATCH v3 1/2] mem: add memalloc init stage

2018-04-25 Thread Anatoly Burakov
Currently, memseg lists for secondary process are allocated on
sync (triggered by init), when they are accessed for the first
time. Move this initialization to a separate init stage for
memalloc.

Signed-off-by: Anatoly Burakov 
Acked-by: Bruce Richardson 
---
 lib/librte_eal/bsdapp/eal/eal_memalloc.c   |  6 +++
 lib/librte_eal/common/eal_common_memory.c  |  3 ++
 lib/librte_eal/common/eal_memalloc.h   |  3 ++
 lib/librte_eal/linuxapp/eal/eal_memalloc.c | 66 ++
 4 files changed, 52 insertions(+), 26 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal_memalloc.c 
b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
index 461732f..f7f07ab 100644
--- a/lib/librte_eal/bsdapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/bsdapp/eal/eal_memalloc.c
@@ -46,3 +46,9 @@ eal_memalloc_sync_with_primary(void)
RTE_LOG(ERR, EAL, "Memory hotplug not supported on FreeBSD\n");
return -1;
 }
+
+int
+eal_memalloc_init(void)
+{
+   return 0;
+}
diff --git a/lib/librte_eal/common/eal_common_memory.c 
b/lib/librte_eal/common/eal_common_memory.c
index 24a9ed5..dd9062d 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -864,6 +864,9 @@ rte_eal_memory_init(void)
if (retval < 0)
goto fail;
 
+   if (eal_memalloc_init() < 0)
+   goto fail;
+
retval = rte_eal_process_type() == RTE_PROC_PRIMARY ?
rte_eal_hugepage_init() :
rte_eal_hugepage_attach();
diff --git a/lib/librte_eal/common/eal_memalloc.h 
b/lib/librte_eal/common/eal_memalloc.h
index 6736fa3..662b3b5 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -76,4 +76,7 @@ eal_memalloc_mem_alloc_validator_unregister(const char *name, 
int socket_id);
 int
 eal_memalloc_mem_alloc_validate(int socket_id, size_t new_len);
 
+int
+eal_memalloc_init(void);
+
 #endif /* EAL_MEMALLOC_H */
diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c 
b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
index 1f553dd..162306a 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
@@ -1060,33 +1060,11 @@ sync_walk(const struct rte_memseg_list *msl, void *arg 
__rte_unused)
struct hugepage_info *hi = NULL;
unsigned int i;
int msl_idx;
-   bool new_msl = false;
 
msl_idx = msl - mcfg->memsegs;
primary_msl = &mcfg->memsegs[msl_idx];
local_msl = &local_memsegs[msl_idx];
 
-   /* check if secondary has this memseg list set up */
-   if (local_msl->base_va == NULL) {
-   char name[PATH_MAX];
-   int ret;
-   new_msl = true;
-
-   /* create distinct fbarrays for each secondary */
-   snprintf(name, RTE_FBARRAY_NAME_LEN, "%s_%i",
-   primary_msl->memseg_arr.name, getpid());
-
-   ret = rte_fbarray_init(&local_msl->memseg_arr, name,
-   primary_msl->memseg_arr.len,
-   primary_msl->memseg_arr.elt_sz);
-   if (ret < 0) {
-   RTE_LOG(ERR, EAL, "Cannot initialize local memory 
map\n");
-   return -1;
-   }
-
-   local_msl->base_va = primary_msl->base_va;
-   }
-
for (i = 0; i < RTE_DIM(internal_config.hugepage_info); i++) {
uint64_t cur_sz =
internal_config.hugepage_info[i].hugepage_sz;
@@ -1101,10 +1079,8 @@ sync_walk(const struct rte_memseg_list *msl, void *arg 
__rte_unused)
return -1;
}
 
-   /* if versions don't match or if we have just allocated a new
-* memseg list, synchronize everything
-*/
-   if ((new_msl || local_msl->version != primary_msl->version) &&
+   /* if versions don't match, synchronize everything */
+   if (local_msl->version != primary_msl->version &&
sync_existing(primary_msl, local_msl, hi, msl_idx))
return -1;
return 0;
@@ -1122,3 +1098,41 @@ eal_memalloc_sync_with_primary(void)
return -1;
return 0;
 }
+
+static int
+secondary_msl_create_walk(const struct rte_memseg_list *msl,
+   void *arg __rte_unused)
+{
+   struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+   struct rte_memseg_list *primary_msl, *local_msl;
+   char name[PATH_MAX];
+   int msl_idx, ret;
+
+   msl_idx = msl - mcfg->memsegs;
+   primary_msl = &mcfg->memsegs[msl_idx];
+   local_msl = &local_memsegs[msl_idx];
+
+   /* create distinct fbarrays for each secondary */
+   snprintf(name, RTE_FBARRAY_NAME_LEN, "%s_%i",
+   primary_msl->memseg_arr.name, getpid());
+
+   ret = rte_fbarray_init(&local_msl->memseg_arr, name,
+   primary_msl->memseg_arr.len,
+   primary_msl->memseg_arr.elt_sz);
+   if (re

Re: [dpdk-dev] [PATCH v6 1/5] ethdev: add access to eeprom

2018-04-25 Thread Ferruh Yigit
On 4/25/2018 11:01 AM, Zijie Pan wrote:
> add new APIs:
> - rte_eth_dev_get_module_info
> - rte_eth_dev_get_module_eeprom
> 
> Signed-off-by: Zijie Pan 
> Acked-by: Remy Horton 
> ---
> Cc: remy.hor...@intel.com
> Cc: john.mcnam...@intel.com
> Cc: marko.kovace...@intel.com
> Cc: tho...@monjalon.net
> 
>  doc/guides/nics/features.rst|   11 
>  lib/librte_ether/rte_dev_info.h |   18 +
>  lib/librte_ether/rte_ethdev.c   |   26 +++
>  lib/librte_ether/rte_ethdev.h   |   43 
> +++
>  lib/librte_ether/rte_ethdev_core.h  |   12 +
>  lib/librte_ether/rte_ethdev_version.map |2 ++
>  6 files changed, 112 insertions(+)
> 
> diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> index 1b4fb97..bb183e2 100644
> --- a/doc/guides/nics/features.rst
> +++ b/doc/guides/nics/features.rst
> @@ -749,6 +749,17 @@ Supports getting/setting device eeprom data.
>``rte_eth_dev_set_eeprom()``.
>  
>  
> +.. _nic_features_module_eeprom_dump:
> +
> +Module EEPROM dump
> +--
> +
> +Supports getting information and data of plugin module eeprom.
> +
> +* **[implements] eth_dev_ops**: ``get_module_info``, ``get_module_eeprom``.
> +* **[related]API**: ``rte_eth_dev_get_module_info()``, 
> ``rte_eth_dev_get_module_eeprom()``.
> +
> +

New feature added also needs to be added default.ini file, please replace this
feature somewhere close the related features in default.ini

And please keep Thomas' Ack for next version


Re: [dpdk-dev] [PATCH v6 3/5] net/ixgbe: add module EEPROM callbacks for ixgbe

2018-04-25 Thread Ferruh Yigit
On 4/25/2018 11:01 AM, Zijie Pan wrote:
> Add new callbacks for eth_dev_ops of ixgbe to get the information
> and data of plugin module eeprom.
> 
> Signed-off-by: Zijie Pan 
> Acked-by: Remy Horton 
> ---
> Cc: remy.hor...@intel.com
> Cc: wenzhuo...@intel.com
> Cc: konstantin.anan...@intel.com
> 
>  drivers/net/ixgbe/ixgbe_ethdev.c |   79 
> ++
>  1 file changed, 79 insertions(+)

For all three PMDs, please update their .ini file [1] (in their corresponding
patches) to announce this new feature they support. The order should be same
with default.ini file

[1]
doc/guides/nics/features/ixgbe.ini
doc/guides/nics/features/ixgbe_vec.ini
doc/guides/nics/features/ixgbe_vf.ini
doc/guides/nics/features/ixgbe_vf_vec.ini
doc/guides/nics/features/i40e.ini
doc/guides/nics/features/i40e_vec.ini
doc/guides/nics/features/i40e_vf.ini
doc/guides/nics/features/i40e_vf_vec.ini
doc/guides/nics/features/igb.ini
doc/guides/nics/features/igb_vf.ini
doc/guides/nics/features/e1000.ini


Re: [dpdk-dev] [PATCH v7 6/9] ethdev: add common devargs parser

2018-04-25 Thread Remy Horton


On 25/04/2018 11:06, Thomas Monjalon wrote:

25/04/2018 11:40, Remy Horton:


On 24/04/2018 20:53, Thomas Monjalon wrote:

[..]

OK
Please can you add it as a comment in the parsing code?

We will need one or two examples in the commit message too.


Docs are being updated, so it should be in the v8 patchset.


Re: [dpdk-dev] [dpdk-stable] [PATCH v1] net/vdev_netvsc: fix creating short name devices

2018-04-25 Thread Ferruh Yigit
On 4/25/2018 11:33 AM, Matan Azrad wrote:
> 
> 
> From: Ferruh Yigit, Wednesday, April 25, 2018 1:29 PM
>> On 4/25/2018 11:25 AM, Matan Azrad wrote:
>>>
>>>
>>> From: Ferruh Yigit, Tuesday, April 24, 2018 5:20 PM
 On 4/10/2018 4:39 PM, Matan Azrad wrote:
> It is OK for me.

 Converting this to explicit ack:
 Acked-by: Matan Azrad 

>>>
>>> Ok, don't forget to remove the fixes line.
>>
>> Don't forget? Should fixes line be removed, why?
> 
> This is not a fix.

Right, will update it in next-net.

> 
> Thanks.
>
> From: Ophir Munk, Tuesday, April 10, 2018 6:36 PM
>> Hi.
>> Discussed with Thomas.
>> Please consider the following commit message:
>>
>> net/vdev_netvsc: shorten devices names
>>
>> Prior to this commit the vdev_netvsc PMD was creating tap and
>> failsafe devices with long names, such as "net_tap_net_vdev_netvsc0"
>> or "net_failsafe_net_vdev_netvsc0".
>> This commits creates tap and failsafe devices with short names such
>> as "net_tap_netvsc0" or "net_failsafe_netvsc0".
>>
>>> -Original Message-
>>> From: Matan Azrad
>>> Sent: Tuesday, April 10, 2018 11:04 AM
>>> To: Ophir Munk ; dev@dpdk.org
>>> Cc: Thomas Monjalon ; Olga Shern
>>> ; sta...@dpdk.org
>>> Subject: RE: [PATCH v1] net/vdev_netvsc: fix creating short name
>>> devices
>>>
>>> Hi Ophir
>>>
>>> From: Ophir Munk, Tuesday, April 10, 2018 10:20 AM
 Prior to this commit the vdev_netvsc PMD was creating tap and
 failsafe devices with long names, such as
>> "net_tap_net_vdev_netvsc0"
 or "net_failsafe_net_vdev_netvsc0".
 Long names containing more than 32 characters may be rejected by
 some APIs (e.g. membuf pool creation).
>>>
>>> Since EAL allows to use long names, I don't think it is a problem
>>> of the netvsc device.
>>> If a DPDK entity wants to use this name for some reason it needs
>>> to adjust it to the usage.
>>>
>>> I agree that short names are better and may help for such like cases.
>>>
>>> I suggest the next title:
>>> net/vdev_netvsc: use short device names
>>>
 This commits fixes this issue by creating tap and failsafe
 devices with short names such as "tap_net_vsc0" or
>> "net_failsafe_vsc0".

 Fixes: e7dc5d7becc5 ("net/vdev_netvsc: implement core
 functionality")
 Cc: sta...@dpdk.org

 Signed-off-by: Ophir Munk 
 ---
  drivers/net/vdev_netvsc/vdev_netvsc.c | 8 
  1 file changed, 4 insertions(+), 4 deletions(-)

 diff --git a/drivers/net/vdev_netvsc/vdev_netvsc.c
 b/drivers/net/vdev_netvsc/vdev_netvsc.c
 index db0080a..bb2f78d 100644
 --- a/drivers/net/vdev_netvsc/vdev_netvsc.c
 +++ b/drivers/net/vdev_netvsc/vdev_netvsc.c
 @@ -614,13 +614,13 @@ vdev_netvsc_netvsc_probe(const struct
 if_nameindex *iface,
   name, ctx->id);
if (ret == -1 || (size_t)ret >= sizeof(ctx->name))
++i;
 -  ret = snprintf(ctx->devname, sizeof(ctx->devname),
 "net_failsafe_%s",
 - ctx->name);
 +  ret = snprintf(ctx->devname, sizeof(ctx->devname),
 "net_failsafe_vsc%u",
 + ctx->id);
if (ret == -1 || (size_t)ret >= sizeof(ctx->devname))
++i;
ret = snprintf(ctx->devargs, sizeof(ctx->devargs),
 - "fd(%d),dev(net_tap_%s,remote=%s)",
 - ctx->pipe[0], ctx->name, ctx->if_name);
 + "fd(%d),dev(net_tap_vsc%u,remote=%s)",
 + ctx->pipe[0], ctx->id, ctx->if_name);
if (ret == -1 || (size_t)ret >= sizeof(ctx->devargs))
++i;
if (i) {
 --
 2.7.4
>
>>>
> 



Re: [dpdk-dev] [PATCH] ethdev: remove experimental flag of ports enumeration

2018-04-25 Thread Ferruh Yigit
On 4/25/2018 11:29 AM, Thomas Monjalon wrote:
> 25/04/2018 12:21, Ferruh Yigit:
>> On 4/24/2018 3:15 AM, Thomas Monjalon wrote:
>>> The basic operations for ports enumeration should not be
>>> considered as experimental in DPDK 18.05.
>>>
>>> The iterator RTE_ETH_FOREACH_DEV was introduced in DPDK 17.05.
>>> It uses the function the rte_eth_find_next_owned_by() to get
>>> only ownerless ports. Its API can be considered stable.
>>> So the flag experimental is removed from rte_eth_find_next_owned_by().
>>>
>>> The flag experimental is removed from rte_eth_dev_count_avail()
>>> which is the new name of the old function rte_eth_dev_count().
>>>
>>> The flag experimental is set to rte_eth_dev_count_total()
>>> in the .c file for consistency with the declaration in the .h file.
>>>
>>> A lot of internal applications are fixed to not allow experimental API.
>>>
>>> Fixes: 8728ccf37615 ("fix ethdev ports enumeration")
>>> Fixes: d9a42a69febf ("ethdev: deprecate port count function")
>>> Fixes: e70e26861eaf ("net/mvpp2: fix build")
>>>
>>> Signed-off-by: Thomas Monjalon 
>>
>> Getting some build errors [1], it seems some samples are using some other
>> experimental APIs so that we can't remove the flag for them.
>>
>>
>> [1]
>> .../dpdk/examples/tep_termination/main.c: In function ‘main’:
>> .../dpdk/examples/tep_termination/main.c:1209:3: error: 
>> ‘rte_ctrl_thread_create’
>> is deprecated: Symbol is not yet part of stable ABI
>> [-Werror=deprecated-declarations]
>>ret = rte_ctrl_thread_create(&tid, "print-stats", NULL,
>>^~~
>>
>> .../dpdk/examples/vhost/main.c: In function ‘main’:
>> .../dpdk/examples/vhost/main.c:1497:3: error: ‘rte_ctrl_thread_create’ is
>> deprecated: Symbol is not yet part of stable ABI 
>> [-Werror=deprecated-declarations]
>>ret = rte_ctrl_thread_create(&tid, "print-stats", NULL,
>>^~~
> 
> Ah sorry, I think it is due to a change in next-net.
> I have prepared this patch on master.
> 
> Please can you fix it when applying?

OK, I will.


Re: [dpdk-dev] [PATCH] ethdev: remove experimental flag of ports enumeration

2018-04-25 Thread Ferruh Yigit
On 4/24/2018 12:59 PM, David Marchand wrote:
> On Tue, Apr 24, 2018 at 4:15 AM, Thomas Monjalon  wrote:
>> The basic operations for ports enumeration should not be
>> considered as experimental in DPDK 18.05.
>>
>> The iterator RTE_ETH_FOREACH_DEV was introduced in DPDK 17.05.
>> It uses the function the rte_eth_find_next_owned_by() to get
>> only ownerless ports. Its API can be considered stable.
>> So the flag experimental is removed from rte_eth_find_next_owned_by().
>>
>> The flag experimental is removed from rte_eth_dev_count_avail()
>> which is the new name of the old function rte_eth_dev_count().
>>
>> The flag experimental is set to rte_eth_dev_count_total()
>> in the .c file for consistency with the declaration in the .h file.
>>
>> A lot of internal applications are fixed to not allow experimental API.
>>
>> Fixes: 8728ccf37615 ("fix ethdev ports enumeration")
>> Fixes: d9a42a69febf ("ethdev: deprecate port count function")
>> Fixes: e70e26861eaf ("net/mvpp2: fix build")
>>
>> Signed-off-by: Thomas Monjalon 
> 
> Tested-by: David Marchand 

Applied to dpdk-next-net/master, thanks.

(Fixed tep_termination & vhost sample app build fix by adding
-DALLOW_EXPERIMENTAL_API back)

> 
>> It was a really bad idea to keep the iterator macro and function
>> as experimental.
>> And it was a real mistake of setting the new name of rte_eth_dev_count
>> function as experimental.
>>
>> I think this fix must be merged in 18.05-rc1, in order to avoid
>> troubles when testing coming RC1.
> 
> +1
> 
> 



[dpdk-dev] [PATCH] ethdev: check Rx/Tx offloads

2018-04-25 Thread Wei Dai
This patch check if a requested offloading is supported
in the device capability.
Any offloading is disabled by default if it is not set
in rte_eth_dev_configure( ) and rte_eth_[rt]x_queue_setup().
A per port offloading can only be enabled in
rte_eth_dev_configure(). If a per port offloading is
sent to rte_eth_[rt]x_queue_setup( ), return error.
Only per queue offloading can be sent to
rte_eth_dev_configure( ). A per queue offloading is
enabled only if it is enabled in rte_eth_dev_configure( ) OR
if it is enabled in rte_eth_[rt]x_queue_setup( ).
If a per queue offloading is enabled in rte_eth_dev_configure(),
it can't be disabled in rte_eth_[rt]x_queue_setup( ).
If a per queue offloading is disabled in rte_eth_dev_configure( ),
it can be enabled or disabled( ) in rte_eth_[rt]x_queue_setup( ).

This patch can make such checking in a common way in rte_ethdev
layer to avoid same checking in underlying PMD.

Signed-off-by: Wei Dai 

---
v3: rework according to dicision of offloading API in community

v2: add offloads checking in rte_eth_dev_configure( ).
check if a requested offloading is supported.
---
 lib/librte_ether/rte_ethdev.c | 76 +++
 1 file changed, 76 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index f0f53d4..70a7904 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1196,6 +1196,28 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t 
nb_rx_q, uint16_t nb_tx_q,
ETHER_MAX_LEN;
}
 
+   /* Any requested offload must be within its device capability */
+   if ((local_conf.rxmode.offloads & dev_info.rx_offload_capa) !=
+local_conf.rxmode.offloads) {
+   RTE_PMD_DEBUG_TRACE("ethdev port_id=%d requested Rx offloads "
+   "0x%" PRIx64 " doesn't match Rx offloads "
+   "capability 0x%" PRIx64 "\n",
+   port_id,
+   local_conf.rxmode.offloads,
+   dev_info.rx_offload_capa);
+   return -EINVAL;
+   }
+   if ((local_conf.txmode.offloads & dev_info.tx_offload_capa) !=
+local_conf.txmode.offloads) {
+   RTE_PMD_DEBUG_TRACE("ethdev port_id=%d requested Tx offloads "
+   "0x%" PRIx64 " doesn't match Tx offloads "
+   "capability 0x%" PRIx64 "\n",
+   port_id,
+   local_conf.txmode.offloads,
+   dev_info.tx_offload_capa);
+   return -EINVAL;
+   }
+
/*
 * Setup new number of RX/TX queues and reconfigure device.
 */
@@ -1547,6 +1569,33 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t 
rx_queue_id,
&local_conf.offloads);
}
 
+   /*
+* Only per-queue offload can be enabled from application.
+* If any pure per-port offload is sent to this function, return -EINVAL
+*/
+   if ((local_conf.offloads & dev_info.rx_queue_offload_capa) !=
+local_conf.offloads) {
+   RTE_PMD_DEBUG_TRACE("Ethdev port_id=%d rx_queue_id=%d "
+   "Requested offload 0x%" PRIx64 "doesn't "
+   "match per-queue capability 0x%" PRIx64
+   " in rte_eth_rx_queue_setup( )\n",
+   port_id,
+   rx_queue_id,
+   local_conf.offloads,
+   dev_info.rx_queue_offload_capa);
+   return -EINVAL;
+   }
+
+   /*
+* If a per-queue offload is enabled in rte_eth_dev_configure( ),
+* it is also enabled on all queues and can't be disabled here.
+* If it is diabled in rte_eth_dev_configure( ), it can be enabled
+* or disabled here.
+* If a per-port offload is enabled in rte_eth_dev_configure( ),
+* it is also enabled for all queues here.
+*/
+   local_conf.offloads |= dev->data->dev_conf.rxmode.offloads;
+
ret = (*dev->dev_ops->rx_queue_setup)(dev, rx_queue_id, nb_rx_desc,
  socket_id, &local_conf, mp);
if (!ret) {
@@ -1681,6 +1730,33 @@ rte_eth_tx_queue_setup(uint16_t port_id, uint16_t 
tx_queue_id,
  &local_conf.offloads);
}
 
+   /*
+* Only per-queue offload can be enabled from applcation.
+* If any pure per-port offload is sent to this function, return -EINVAL
+*/
+   if ((local_conf.offloads & dev_info.tx_queue_offload_capa) !=
+local_conf.offloads) {
+   RTE_PMD_DEBUG_TRACE("Ethdev

[dpdk-dev] [PATCH v3] ethdev: check Rx/Tx offloads

2018-04-25 Thread Wei Dai
This patch check if a requested offloading is supported
in the device capability.
Any offloading is disabled by default if it is not set
in rte_eth_dev_configure( ) and rte_eth_[rt]x_queue_setup().
A per port offloading can only be enabled in
rte_eth_dev_configure(). If a per port offloading is
sent to rte_eth_[rt]x_queue_setup( ), return error.
Only per queue offloading can be sent to
rte_eth_dev_configure( ). A per queue offloading is
enabled only if it is enabled in rte_eth_dev_configure( ) OR
if it is enabled in rte_eth_[rt]x_queue_setup( ).
If a per queue offloading is enabled in rte_eth_dev_configure(),
it can't be disabled in rte_eth_[rt]x_queue_setup( ).
If a per queue offloading is disabled in rte_eth_dev_configure( ),
it can be enabled or disabled( ) in rte_eth_[rt]x_queue_setup( ).

This patch can make such checking in a common way in rte_ethdev
layer to avoid same checking in underlying PMD.

Signed-off-by: Wei Dai 

---
v3: rework according to dicision of offloading API in community

v2: add offloads checking in rte_eth_dev_configure( ).
check if a requested offloading is supported.
---
 lib/librte_ether/rte_ethdev.c | 76 +++
 1 file changed, 76 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index f0f53d4..70a7904 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1196,6 +1196,28 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t 
nb_rx_q, uint16_t nb_tx_q,
ETHER_MAX_LEN;
}
 
+   /* Any requested offload must be within its device capability */
+   if ((local_conf.rxmode.offloads & dev_info.rx_offload_capa) !=
+local_conf.rxmode.offloads) {
+   RTE_PMD_DEBUG_TRACE("ethdev port_id=%d requested Rx offloads "
+   "0x%" PRIx64 " doesn't match Rx offloads "
+   "capability 0x%" PRIx64 "\n",
+   port_id,
+   local_conf.rxmode.offloads,
+   dev_info.rx_offload_capa);
+   return -EINVAL;
+   }
+   if ((local_conf.txmode.offloads & dev_info.tx_offload_capa) !=
+local_conf.txmode.offloads) {
+   RTE_PMD_DEBUG_TRACE("ethdev port_id=%d requested Tx offloads "
+   "0x%" PRIx64 " doesn't match Tx offloads "
+   "capability 0x%" PRIx64 "\n",
+   port_id,
+   local_conf.txmode.offloads,
+   dev_info.tx_offload_capa);
+   return -EINVAL;
+   }
+
/*
 * Setup new number of RX/TX queues and reconfigure device.
 */
@@ -1547,6 +1569,33 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t 
rx_queue_id,
&local_conf.offloads);
}
 
+   /*
+* Only per-queue offload can be enabled from application.
+* If any pure per-port offload is sent to this function, return -EINVAL
+*/
+   if ((local_conf.offloads & dev_info.rx_queue_offload_capa) !=
+local_conf.offloads) {
+   RTE_PMD_DEBUG_TRACE("Ethdev port_id=%d rx_queue_id=%d "
+   "Requested offload 0x%" PRIx64 "doesn't "
+   "match per-queue capability 0x%" PRIx64
+   " in rte_eth_rx_queue_setup( )\n",
+   port_id,
+   rx_queue_id,
+   local_conf.offloads,
+   dev_info.rx_queue_offload_capa);
+   return -EINVAL;
+   }
+
+   /*
+* If a per-queue offload is enabled in rte_eth_dev_configure( ),
+* it is also enabled on all queues and can't be disabled here.
+* If it is diabled in rte_eth_dev_configure( ), it can be enabled
+* or disabled here.
+* If a per-port offload is enabled in rte_eth_dev_configure( ),
+* it is also enabled for all queues here.
+*/
+   local_conf.offloads |= dev->data->dev_conf.rxmode.offloads;
+
ret = (*dev->dev_ops->rx_queue_setup)(dev, rx_queue_id, nb_rx_desc,
  socket_id, &local_conf, mp);
if (!ret) {
@@ -1681,6 +1730,33 @@ rte_eth_tx_queue_setup(uint16_t port_id, uint16_t 
tx_queue_id,
  &local_conf.offloads);
}
 
+   /*
+* Only per-queue offload can be enabled from applcation.
+* If any pure per-port offload is sent to this function, return -EINVAL
+*/
+   if ((local_conf.offloads & dev_info.tx_queue_offload_capa) !=
+local_conf.offloads) {
+   RTE_PMD_DEBUG_TRACE("Ethdev

[dpdk-dev] [PATCH] net/ifcvf: remove live migration

2018-04-25 Thread Xiao Wang
Live migration feature will be based on future development on ifc driver,
including port representor implementation and potential change in QEMU
and vhost user lib.

This patch removes live migration in ifcvf driver, if possible we could
squash this patch into 7c9f28370437 ("net/ifcvf: add ifcvf vdpa driver"),
and remove the live migration segment in that commit log.

Signed-off-by: Xiao Wang 
---
 doc/guides/nics/ifcvf.rst|  4 +---
 drivers/net/ifc/base/ifcvf.c | 31 
 drivers/net/ifc/base/ifcvf.h |  6 -
 drivers/net/ifc/ifcvf_vdpa.c | 57 +++-
 4 files changed, 4 insertions(+), 94 deletions(-)

diff --git a/doc/guides/nics/ifcvf.rst b/doc/guides/nics/ifcvf.rst
index d7e76353c..48f9adf1d 100644
--- a/doc/guides/nics/ifcvf.rst
+++ b/doc/guides/nics/ifcvf.rst
@@ -8,8 +8,7 @@ The IFCVF vDPA (vhost data path acceleration) driver provides 
support for the
 Intel FPGA 100G VF (IFCVF). IFCVF's datapath is virtio ring compatible, it
 works as a HW vhost backend which can send/receive packets to/from virtio
 directly by DMA. Besides, it supports dirty page logging and device state
-report/restore. This driver enables its vDPA functionality with live migration
-feature.
+report/restore, this driver enables its vDPA functionality.
 
 
 Pre-Installation Configuration
@@ -71,7 +70,6 @@ Features
 Features of the IFCVF driver are:
 
 - Compatibility with virtio 0.95 and 1.0.
-- Live migration.
 
 
 Prerequisites
diff --git a/drivers/net/ifc/base/ifcvf.c b/drivers/net/ifc/base/ifcvf.c
index bbeed30e8..4b22d9ed1 100644
--- a/drivers/net/ifc/base/ifcvf.c
+++ b/drivers/net/ifc/base/ifcvf.c
@@ -278,37 +278,6 @@ ifcvf_stop_hw(struct ifcvf_hw *hw)
ifcvf_reset(hw);
 }
 
-void
-ifcvf_enable_logging(struct ifcvf_hw *hw, u64 log_base, u64 log_size)
-{
-   u8 *lm_cfg;
-
-   lm_cfg = hw->lm_cfg;
-
-   *(u32 *)(lm_cfg + IFCVF_LM_BASE_ADDR_LOW) =
-   log_base & IFCVF_32_BIT_MASK;
-
-   *(u32 *)(lm_cfg + IFCVF_LM_BASE_ADDR_HIGH) =
-   (log_base >> 32) & IFCVF_32_BIT_MASK;
-
-   *(u32 *)(lm_cfg + IFCVF_LM_END_ADDR_LOW) =
-   (log_base + log_size) & IFCVF_32_BIT_MASK;
-
-   *(u32 *)(lm_cfg + IFCVF_LM_END_ADDR_HIGH) =
-   ((log_base + log_size) >> 32) & IFCVF_32_BIT_MASK;
-
-   *(u32 *)(lm_cfg + IFCVF_LM_LOGGING_CTRL) = IFCVF_LM_ENABLE_PF;
-}
-
-void
-ifcvf_disable_logging(struct ifcvf_hw *hw)
-{
-   u8 *lm_cfg;
-
-   lm_cfg = hw->lm_cfg;
-   *(u32 *)(lm_cfg + IFCVF_LM_LOGGING_CTRL) = IFCVF_LM_DISABLE;
-}
-
 void
 ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid)
 {
diff --git a/drivers/net/ifc/base/ifcvf.h b/drivers/net/ifc/base/ifcvf.h
index 77a2bfa83..badacb615 100644
--- a/drivers/net/ifc/base/ifcvf.h
+++ b/drivers/net/ifc/base/ifcvf.h
@@ -142,12 +142,6 @@ ifcvf_start_hw(struct ifcvf_hw *hw);
 void
 ifcvf_stop_hw(struct ifcvf_hw *hw);
 
-void
-ifcvf_enable_logging(struct ifcvf_hw *hw, u64 log_base, u64 log_size);
-
-void
-ifcvf_disable_logging(struct ifcvf_hw *hw);
-
 void
 ifcvf_notify_queue(struct ifcvf_hw *hw, u16 qid);
 
diff --git a/drivers/net/ifc/ifcvf_vdpa.c b/drivers/net/ifc/ifcvf_vdpa.c
index 26d70be18..c6627c23a 100644
--- a/drivers/net/ifc/ifcvf_vdpa.c
+++ b/drivers/net/ifc/ifcvf_vdpa.c
@@ -280,12 +280,8 @@ static void
 vdpa_ifcvf_stop(struct ifcvf_internal *internal)
 {
struct ifcvf_hw *hw = &internal->hw;
-   uint32_t i, j;
+   uint32_t i;
int vid;
-   uint64_t features, pfn;
-   uint64_t log_base, log_size;
-   uint32_t size;
-   uint8_t *log_buf;
 
vid = internal->vid;
ifcvf_stop_hw(hw);
@@ -293,24 +289,6 @@ vdpa_ifcvf_stop(struct ifcvf_internal *internal)
for (i = 0; i < hw->nr_vring; i++)
rte_vhost_set_vring_base(vid, i, hw->vring[i].last_avail_idx,
hw->vring[i].last_used_idx);
-
-   rte_vhost_get_negotiated_features(vid, &features);
-   if (RTE_VHOST_NEED_LOG(features)) {
-   ifcvf_disable_logging(hw);
-   rte_vhost_get_log_base(internal->vid, &log_base, &log_size);
-   /*
-* IFCVF marks dirty memory pages for only packet buffer,
-* SW helps to mark the used ring as dirty after device stops.
-*/
-   log_buf = (uint8_t *)(uintptr_t)log_base;
-   size = hw->vring[i].size * 8 + 4;
-   for (i = 0; i < hw->nr_vring; i++) {
-   pfn = hw->vring[i].used / PAGE_SIZE;
-   for (j = 0; j <= size / PAGE_SIZE; j++)
-   __sync_fetch_and_or_8(&log_buf[(pfn + j) / 8],
-1 << ((pfn + j) % 8));
-   }
-   }
 }
 
 #define MSIX_IRQ_SET_BUF_LEN (sizeof(struct vfio_irq_set) + \
@@ -570,34 +548,6 @@ ifcvf_dev_close(int vid)
return 0;
 }
 
-static int
-ifcvf_set_features(int vid)
-{
-   uint64_t features

Re: [dpdk-dev] [PATCH v3] net/tap: remove queue specific offload support

2018-04-25 Thread Ophir Munk


> -Original Message-
> From: Ferruh Yigit [mailto:ferruh.yi...@intel.com]
> Sent: Wednesday, April 25, 2018 12:48 PM
> To: Ophir Munk ; Pascal Mazon
> 
> Cc: dev@dpdk.org; Mordechay Haimovsky ; Olga
> Shern ; Thomas Monjalon ;
> Raslan Darawsheh ; Shahaf Shuler
> 
> Subject: Re: [PATCH v3] net/tap: remove queue specific offload support
> 
> On 4/25/2018 10:18 AM, Ophir Munk wrote:
> > Hi Ferruh,
> >
> > I should have mentioned earlier that TAP does support queue specific
> capabilities.
> > Please look in tap_queue_setup() and note that each TAP queue is created
> with a distinct file descriptor (fd).
> > Then supporting an offload capability is just implementing it in SW (e.g.
> calculating IP checksum).
> >
> > If the main assumption of this patch was that TAP does not support queue
> specific offloads - then please consider this patch again.
> 
> Yes that was the initial question, is tap supports queue specific offloads or
> not. Thanks for the answer.
> 
> >
> > On the other hand there is no port specific capability supported by TAP.
> 
> If so verify functions are wrong, that was the error I got.

Can you please specify the test you did what error you got? 
If I fix something I want to verify what I am fixing.

> It seems copy/paste of mlx one but the port_supp_offloads has different
> meaning there.
> 
> > However, in order to support legacy applications, port capabilities are
> usually reported as the OR operation between queue & port capabilities.
> > TAP currently clones the queue capabilities to port capabilities. We could
> optimize this cloning by always return queue capabilities when queried about
> queues or ports. In this case - tap_rx_offload_get_port_capa() and
> tap_tx_offload_get_port_capa() could be removed.
> 
> Instead of removing the functions I think you can keep them but return
> correct values, in this case return empty, this will make the exiting 
> validation
> functions correct.
> 
> Can you send a fix for that?
> If no fix sent, I suggest going with this patch to remove queue level offload
> support until it is fixed.
> 
> >
> > Please find more comments inline.
> >
> >> -Original Message-
> >> From: Ferruh Yigit [mailto:ferruh.yi...@intel.com]
> >> Sent: Tuesday, April 24, 2018 8:54 PM
> >> To: Pascal Mazon 
> >> Cc: dev@dpdk.org; Ferruh Yigit ; Mordechay
> >> Haimovsky ; Ophir Munk
> 
> >> Subject: [PATCH v3] net/tap: remove queue specific offload support
> >>
> >> It is not clear if tap PMD supports queue specific offloads, removing
> >> the related code.
> >>
> >> Fixes: 95ae196ae10b ("net/tap: use new Rx offloads API")
> >> Fixes: 818fe14a9891 ("net/tap: use new Tx offloads API")
> >> Cc: mo...@mellanox.com
> >>
> >> Signed-off-by: Ferruh Yigit 
> >> ---
> >> Cc: Ophir Munk 
> >>
> >> v2:
> >> * rebased
> >>
> >> v3:
> >> * txq->csum restored,
> >>   - ETH_TXQ_FLAGS_IGNORE check removed since ethdev layer takes care
> >> of it
> >>   - tx_conf != NULL check removed, this is internal api who calls this is
> >>   ethdev and it doesn't pass null tx_conf
> >> ---
> >>  drivers/net/tap/rte_eth_tap.c | 102
> >> +-
> >>  1 file changed, 10 insertions(+), 92 deletions(-)
> >>
> >> diff --git a/drivers/net/tap/rte_eth_tap.c
> >> b/drivers/net/tap/rte_eth_tap.c index ef33aace9..61b4b5df3 100644
> >> --- a/drivers/net/tap/rte_eth_tap.c
> >> +++ b/drivers/net/tap/rte_eth_tap.c
> >> @@ -278,31 +278,6 @@ tap_rx_offload_get_port_capa(void)
> >>   DEV_RX_OFFLOAD_CRC_STRIP;
> >>  }
> >>
> >> -static uint64_t
> >> -tap_rx_offload_get_queue_capa(void)
> >> -{
> >> -  return DEV_RX_OFFLOAD_SCATTER |
> >> - DEV_RX_OFFLOAD_IPV4_CKSUM |
> >> - DEV_RX_OFFLOAD_UDP_CKSUM |
> >> - DEV_RX_OFFLOAD_TCP_CKSUM |
> >> - DEV_RX_OFFLOAD_CRC_STRIP;
> >> -}
> >> -
> >
> > TAP PMD supports all of these RX queue specific offloads. I suggest to
> leave this function in place.
> >
> >> -static bool
> >> -tap_rxq_are_offloads_valid(struct rte_eth_dev *dev, uint64_t offloads) -
> {
> >> -  uint64_t port_offloads = dev->data->dev_conf.rxmode.offloads;
> >> -  uint64_t queue_supp_offloads = tap_rx_offload_get_queue_capa();
> >> -  uint64_t port_supp_offloads = tap_rx_offload_get_port_capa();
> >> -
> >> -  if ((offloads & (queue_supp_offloads | port_supp_offloads)) !=
> >> -  offloads)
> >> -  return false;
> >> -  if ((port_offloads ^ offloads) & port_supp_offloads)
> >> -  return false;
> >> -  return true;
> >> -}
> >> -
> >
> > Putting aside the fact that queue offloads equals port offloads (so could
> ignore "port_supp_offload" variable) - this function is essential to validate
> that the configured Rx offloads are supported by TAP. I suggest to leave this
> function in place.
> > Without it - testpmd falsely confirms non supported offloads.
> > For example before this patch: offloading "hw-vlan-filter" will fail as
> expected:
> >
> > testpmd> port config all
> > testpmd> port config all hw-vla

Re: [dpdk-dev] [PATCH v2] app/testpmd: fix testpmd failure due to RSS offload check

2018-04-25 Thread Ferruh Yigit
On 4/25/2018 8:43 AM, Qi Zhang wrote:
> After add RSS hash offload check, default rss_hf will fail on
> devices that not support all bits, the patch introduce RSS
> negotiate flag, when it is set, RSS configuration will negotiate
> with device capability. By default negotiate is turn on, so it
> will not break exist PMD. It can be disabled by "--disable-rss-neg"
> in start parameters or be enable/disable by testpmd command
> "port config all rss neg|noneg".

Hi Qi,

Thanks for the patch, but I feel new neg|noneg is extra complexity, I am
wondering if we can fix this without this complexity.

"rss_hf" is rss hash function configuration for *all* ports. Default value is
ETH_RSS_IP.

Up until recently it was more like default value, Adrien's commit start updating
it in "port config all rss ..." and it become more like config value now.

And if it is config value, the question is what to do if PMD doesn't support
requested hash function:
- Fail
- Convert request what PMD supports.

init_port_config() called from many places and it sets rss_conf.rss_hf via
"rss_hf" and currently this is the problem some pmds hit.

Previously, before Adrien's update, when "port config all rss ..." failed it was
printing log and keep continue, not recording the failed value as config value.


What do you think:
- In init_port_config() use "rss_hf & dev_info.flow_type_rss_offloads" as you
suggested.
- in cmd_config_rss_parsed() set "rss_hf" only if all ethdev successfully called
 rte_eth_dev_rss_hash_update(), this will mean the config has supported by all
ports so it will be ok to use is as "rss_hf & dev_info.flow_type_rss_offloads"
in init_port_config()
- I think there is a defect in "port config all rss default", perhaps because of
me during merged, it shouldn't update "rss_hf"

After this changes answer to above question,
by default config behavior is "Convert request what PMD supports" so testpmd
don't fail, and when configuration updated only values supported by ports 
accepted.



> 
> Fixes: 527624c663f8 ("ethdev: add supported hash function check")
> Signed-off-by: Qi Zhang 
> ---
> 
> v2:
> - fix command help string.
> 
>  app/test-pmd/cmdline.c  | 22 --
>  app/test-pmd/parameters.c   |  5 +
>  app/test-pmd/testpmd.c  | 12 +++-
>  app/test-pmd/testpmd.h  |  1 +
>  doc/guides/testpmd_app_ug/testpmd_funcs.rst |  4 +++-
>  5 files changed, 36 insertions(+), 8 deletions(-)
> 
> diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c
> index 333852194..9390be741 100644
> --- a/app/test-pmd/cmdline.c
> +++ b/app/test-pmd/cmdline.c
> @@ -822,7 +822,7 @@ static void cmd_help_long_parsed(void *parsed_result,
>   " for ports.\n\n"
>  
>   "port config all rss (all|default|ip|tcp|udp|sctp|"
> - "ether|port|vxlan|geneve|nvgre|none|)\n"
> + 
> "ether|port|vxlan|geneve|nvgre|none|neg|noneg|)\n"
>   "Set the RSS mode.\n\n"
>  
>   "port config port-id rss reta 
> (hash,queue)[,(hash,queue)]\n"
> @@ -2029,7 +2029,13 @@ cmd_config_rss_parsed(void *parsed_result,
>   else if (isdigit(res->value[0]) && atoi(res->value) > 0 &&
>   atoi(res->value) < 64)
>   rss_conf.rss_hf = 1ULL << atoi(res->value);
> - else {
> + else if (!strcmp(res->value, "neg")) {
> + rss_hf_noneg = 0;
> + return;
> + } else if (!strcmp(res->value, "noneg")) {
> + rss_hf_noneg = 1;
> + return;
> + } else {
>   printf("Unknown parameter\n");
>   return;
>   }
> @@ -2037,10 +2043,14 @@ cmd_config_rss_parsed(void *parsed_result,
>   /* Update global configuration for RSS types. */
>   rss_hf = rss_conf.rss_hf;
>   RTE_ETH_FOREACH_DEV(i) {
> - if (!strcmp(res->value, "default")) {
> - rte_eth_dev_info_get(i, &dev_info);
> + rte_eth_dev_info_get(i, &dev_info);
> + if (!strcmp(res->value, "default"))
>   rss_conf.rss_hf = dev_info.flow_type_rss_offloads;
> - }
> + else
> + rss_conf.rss_hf &= (rss_hf_noneg ?
> + rss_conf.rss_hf :
> + dev_info.flow_type_rss_offloads);
> +
>   diag = rte_eth_dev_rss_hash_update(i, &rss_conf);
>   if (diag < 0)
>   printf("Configuration of RSS hash at ethernet port %d "
> @@ -2064,7 +2074,7 @@ cmdline_parse_inst_t cmd_config_rss = {
>   .f = cmd_config_rss_parsed,
>   .data = NULL,
>   .help_str = "port config all rss "
> - 
> "all|default|ip|tcp|udp|sctp|ether|port|vxlan|geneve|nvgre|none|",
> + 
> "all|default|ip|tcp|udp|sctp|ether|port|vxlan|geneve|nvgre|none|neg|noneg|",
>   .tokens = {
>

[dpdk-dev] [PATCH v3] ethdev: check Rx/Tx offloads

2018-04-25 Thread Wei Dai
This patch check if a requested offloading is supported
in the device capability.
Any offloading is disabled by default if it is not set
in rte_eth_dev_configure( ) and rte_eth_[rt]x_queue_setup().
A per port offloading can only be enabled in
rte_eth_dev_configure(). If a per port offloading is
sent to rte_eth_[rt]x_queue_setup( ), return error.
Only per queue offloading can be sent to
rte_eth_dev_configure( ). A per queue offloading is
enabled only if it is enabled in rte_eth_dev_configure( ) OR
if it is enabled in rte_eth_[rt]x_queue_setup( ).
If a per queue offloading is enabled in rte_eth_dev_configure(),
it can't be disabled in rte_eth_[rt]x_queue_setup( ).
If a per queue offloading is disabled in rte_eth_dev_configure( ),
it can be enabled or disabled( ) in rte_eth_[rt]x_queue_setup( ).

This patch can make such checking in a common way in rte_ethdev
layer to avoid same checking in underlying PMD.

Signed-off-by: Wei Dai 

---
v3: rework according to dicision of offloading API in community

v2: add offloads checking in rte_eth_dev_configure( ).
check if a requested offloading is supported.
---
 lib/librte_ether/rte_ethdev.c | 76 +++
 1 file changed, 76 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index f0f53d4..70a7904 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1196,6 +1196,28 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t 
nb_rx_q, uint16_t nb_tx_q,
ETHER_MAX_LEN;
}
 
+   /* Any requested offload must be within its device capability */
+   if ((local_conf.rxmode.offloads & dev_info.rx_offload_capa) !=
+local_conf.rxmode.offloads) {
+   RTE_PMD_DEBUG_TRACE("ethdev port_id=%d requested Rx offloads "
+   "0x%" PRIx64 " doesn't match Rx offloads "
+   "capability 0x%" PRIx64 "\n",
+   port_id,
+   local_conf.rxmode.offloads,
+   dev_info.rx_offload_capa);
+   return -EINVAL;
+   }
+   if ((local_conf.txmode.offloads & dev_info.tx_offload_capa) !=
+local_conf.txmode.offloads) {
+   RTE_PMD_DEBUG_TRACE("ethdev port_id=%d requested Tx offloads "
+   "0x%" PRIx64 " doesn't match Tx offloads "
+   "capability 0x%" PRIx64 "\n",
+   port_id,
+   local_conf.txmode.offloads,
+   dev_info.tx_offload_capa);
+   return -EINVAL;
+   }
+
/*
 * Setup new number of RX/TX queues and reconfigure device.
 */
@@ -1547,6 +1569,33 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t 
rx_queue_id,
&local_conf.offloads);
}
 
+   /*
+* Only per-queue offload can be enabled from application.
+* If any pure per-port offload is sent to this function, return -EINVAL
+*/
+   if ((local_conf.offloads & dev_info.rx_queue_offload_capa) !=
+local_conf.offloads) {
+   RTE_PMD_DEBUG_TRACE("Ethdev port_id=%d rx_queue_id=%d "
+   "Requested offload 0x%" PRIx64 "doesn't "
+   "match per-queue capability 0x%" PRIx64
+   " in rte_eth_rx_queue_setup( )\n",
+   port_id,
+   rx_queue_id,
+   local_conf.offloads,
+   dev_info.rx_queue_offload_capa);
+   return -EINVAL;
+   }
+
+   /*
+* If a per-queue offload is enabled in rte_eth_dev_configure( ),
+* it is also enabled on all queues and can't be disabled here.
+* If it is diabled in rte_eth_dev_configure( ), it can be enabled
+* or disabled here.
+* If a per-port offload is enabled in rte_eth_dev_configure( ),
+* it is also enabled for all queues here.
+*/
+   local_conf.offloads |= dev->data->dev_conf.rxmode.offloads;
+
ret = (*dev->dev_ops->rx_queue_setup)(dev, rx_queue_id, nb_rx_desc,
  socket_id, &local_conf, mp);
if (!ret) {
@@ -1681,6 +1730,33 @@ rte_eth_tx_queue_setup(uint16_t port_id, uint16_t 
tx_queue_id,
  &local_conf.offloads);
}
 
+   /*
+* Only per-queue offload can be enabled from applcation.
+* If any pure per-port offload is sent to this function, return -EINVAL
+*/
+   if ((local_conf.offloads & dev_info.tx_queue_offload_capa) !=
+local_conf.offloads) {
+   RTE_PMD_DEBUG_TRACE("Ethdev

[dpdk-dev] [PATCH v4] ethdev: check Rx/Tx offloads

2018-04-25 Thread Wei Dai
This patch check if a requested offloading is supported
in the device capability.
Any offloading is disabled by default if it is not set
in rte_eth_dev_configure( ) and rte_eth_[rt]x_queue_setup().
A per port offloading can only be enabled in
rte_eth_dev_configure(). If a per port offloading is
sent to rte_eth_[rt]x_queue_setup( ), return error.
Only per queue offloading can be sent to
rte_eth_[rt]x_queue_setup( ). A per queue offloading is
enabled only if it is enabled in rte_eth_dev_configure( ) OR
if it is enabled in rte_eth_[rt]x_queue_setup( ).
If a per queue offloading is enabled in rte_eth_dev_configure(),
it can't be disabled in rte_eth_[rt]x_queue_setup( ).
If a per queue offloading is disabled in rte_eth_dev_configure( ),
it can be enabled or disabled( ) in rte_eth_[rt]x_queue_setup( ).

This patch can make such checking in a common way in rte_ethdev
layer to avoid same checking in underlying PMD.

Signed-off-by: Wei Dai 

---
v4: fix a wrong description in git log message.

v3: rework according to dicision of offloading API in community

v2: add offloads checking in rte_eth_dev_configure( ).
check if a requested offloading is supported.
---
 lib/librte_ether/rte_ethdev.c | 76 +++
 1 file changed, 76 insertions(+)

diff --git a/lib/librte_ether/rte_ethdev.c b/lib/librte_ether/rte_ethdev.c
index f0f53d4..70a7904 100644
--- a/lib/librte_ether/rte_ethdev.c
+++ b/lib/librte_ether/rte_ethdev.c
@@ -1196,6 +1196,28 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t 
nb_rx_q, uint16_t nb_tx_q,
ETHER_MAX_LEN;
}
 
+   /* Any requested offload must be within its device capability */
+   if ((local_conf.rxmode.offloads & dev_info.rx_offload_capa) !=
+local_conf.rxmode.offloads) {
+   RTE_PMD_DEBUG_TRACE("ethdev port_id=%d requested Rx offloads "
+   "0x%" PRIx64 " doesn't match Rx offloads "
+   "capability 0x%" PRIx64 "\n",
+   port_id,
+   local_conf.rxmode.offloads,
+   dev_info.rx_offload_capa);
+   return -EINVAL;
+   }
+   if ((local_conf.txmode.offloads & dev_info.tx_offload_capa) !=
+local_conf.txmode.offloads) {
+   RTE_PMD_DEBUG_TRACE("ethdev port_id=%d requested Tx offloads "
+   "0x%" PRIx64 " doesn't match Tx offloads "
+   "capability 0x%" PRIx64 "\n",
+   port_id,
+   local_conf.txmode.offloads,
+   dev_info.tx_offload_capa);
+   return -EINVAL;
+   }
+
/*
 * Setup new number of RX/TX queues and reconfigure device.
 */
@@ -1547,6 +1569,33 @@ rte_eth_rx_queue_setup(uint16_t port_id, uint16_t 
rx_queue_id,
&local_conf.offloads);
}
 
+   /*
+* Only per-queue offload can be enabled from application.
+* If any pure per-port offload is sent to this function, return -EINVAL
+*/
+   if ((local_conf.offloads & dev_info.rx_queue_offload_capa) !=
+local_conf.offloads) {
+   RTE_PMD_DEBUG_TRACE("Ethdev port_id=%d rx_queue_id=%d "
+   "Requested offload 0x%" PRIx64 "doesn't "
+   "match per-queue capability 0x%" PRIx64
+   " in rte_eth_rx_queue_setup( )\n",
+   port_id,
+   rx_queue_id,
+   local_conf.offloads,
+   dev_info.rx_queue_offload_capa);
+   return -EINVAL;
+   }
+
+   /*
+* If a per-queue offload is enabled in rte_eth_dev_configure( ),
+* it is also enabled on all queues and can't be disabled here.
+* If it is diabled in rte_eth_dev_configure( ), it can be enabled
+* or disabled here.
+* If a per-port offload is enabled in rte_eth_dev_configure( ),
+* it is also enabled for all queues here.
+*/
+   local_conf.offloads |= dev->data->dev_conf.rxmode.offloads;
+
ret = (*dev->dev_ops->rx_queue_setup)(dev, rx_queue_id, nb_rx_desc,
  socket_id, &local_conf, mp);
if (!ret) {
@@ -1681,6 +1730,33 @@ rte_eth_tx_queue_setup(uint16_t port_id, uint16_t 
tx_queue_id,
  &local_conf.offloads);
}
 
+   /*
+* Only per-queue offload can be enabled from applcation.
+* If any pure per-port offload is sent to this function, return -EINVAL
+*/
+   if ((local_conf.offloads & dev_info.tx_queue_offload_capa) !=
+local_conf.of

  1   2   3   >