dpdk.git
6 years agonet/enic: fix crash due to static max number of queues
Hyong Youb Kim [Tue, 23 Jan 2018 01:05:28 +0000 (17:05 -0800)]
net/enic: fix crash due to static max number of queues

ENIC_CQ_MAX, ENIC_WQ_MAX and others are arbitrary values that
prevent the app from using more queues when they are available on
hardware. Remove them and dynamically allocate vnic_cq and such
arrays to accommodate all available hardware queues.

As a side effect of removing ENIC_CQ_MAX, this commit fixes a segfault
that would happen when the app requests more than 16 CQs, because
enic_set_vnic_res() does not consider ENIC_CQ_MAX. For example, the
following command causes a crash.

testpmd -- --rxq=16 --txq=16
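
A minimal sketch of sizing queue arrays from the hardware count instead of
a compile-time maximum. This is not the enic driver code: the names
demo_dev, demo_cq, demo_alloc_queues and demo_free_queues are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical per-queue state; stands in for vnic_cq and similar. */
struct demo_cq {
	unsigned int index;
};

/* Hypothetical device context: the array is sized at probe time. */
struct demo_dev {
	unsigned int cq_count;  /* number of CQs the hardware reports */
	struct demo_cq *cq;     /* dynamically allocated, cq_count entries */
};

/* Allocate one entry per hardware queue; returns 0, or -1 on failure. */
static int demo_alloc_queues(struct demo_dev *dev, unsigned int hw_cq_count)
{
	dev->cq = calloc(hw_cq_count, sizeof(*dev->cq));
	if (dev->cq == NULL)
		return -1;
	dev->cq_count = hw_cq_count;
	for (unsigned int i = 0; i < hw_cq_count; i++)
		dev->cq[i].index = i;
	return 0;
}

/* Release the array and reset the context. */
static void demo_free_queues(struct demo_dev *dev)
{
	free(dev->cq);
	dev->cq = NULL;
	dev->cq_count = 0;
}
```

With 16 Rx and 16 Tx queues, a caller would request 32 entries, which a
fixed 16-entry array could not hold.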

Fixes: ce93d3c36db0 ("net/enic: fix resource check failures when bonding devices")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
6 years agonet/mrvl: switch to the new Tx offload API
Tomasz Duszynski [Tue, 23 Jan 2018 08:46:15 +0000 (09:46 +0100)]
net/mrvl: switch to the new Tx offload API

Since the old Tx offload API was deprecated,
update the driver to use the latest one.

Signed-off-by: Tomasz Duszynski <tdu@semihalf.com>
6 years agonet/mrvl: switch to the new Rx offload API
Tomasz Duszynski [Tue, 23 Jan 2018 08:46:14 +0000 (09:46 +0100)]
net/mrvl: switch to the new Rx offload API

Since the old Rx offload API is now deprecated,
update the driver to use the latest one.

Signed-off-by: Tomasz Duszynski <tdu@semihalf.com>
6 years agoethdev: move internal callback list definition
David Marchand [Mon, 22 Jan 2018 12:25:35 +0000 (13:25 +0100)]
ethdev: move internal callback list definition

This structure is not exposed through public APIs, so move it
to the core header.

Fixes: 331c447ad913 ("ethdev: separate internal structures into own header")

Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agonet/mlx5: fix allocation when no memory on device NUMA node
Olivier Matz [Mon, 22 Jan 2018 12:33:38 +0000 (13:33 +0100)]
net/mlx5: fix allocation when no memory on device NUMA node

When no memory is available on the same NUMA node as the device, the
initialization of the device fails. However, the use case where the
cores and memory are on a different socket than the device is valid,
even if not optimal.

To fix this issue, this commit introduces an infrastructure to select
the socket on which to allocate the verbs objects based on the ethdev
configuration and the object type, rather than the PCI numa node.

Fixes: 1e3a39f72d5d ("net/mlx5: allocate verbs object into shared memory")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years agonet/mlx5: fix return value of start operation
Olivier Matz [Mon, 22 Jan 2018 12:33:37 +0000 (13:33 +0100)]
net/mlx5: fix return value of start operation

On error, mlx5_dev_start() does not return a negative value
as it is supposed to do. The consequence is that the application
(e.g. testpmd) does not notice that the port is not started
and begins the rxtx on an uninitialized port, which crashes.

Fixes: e1016cb73383 ("net/mlx5: fix Rx interrupts management")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years agonet/mlx5: fix all multi verification code position
Nélio Laranjeiro [Thu, 11 Jan 2018 09:25:22 +0000 (10:25 +0100)]
net/mlx5: fix all multi verification code position

All multi code should not be handled in exit part of the code but in the
mainline of the function.

Fixes: 0a40a1363a4d ("net/mlx5: fix flow type for allmulti rules")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
6 years agodoc: add i40e queue region support to release notes
Wei Zhao [Mon, 22 Jan 2018 05:18:30 +0000 (13:18 +0800)]
doc: add i40e queue region support to release notes

This patch adds information about i40e queue region support to
the release notes.

Cc: stable@dpdk.org
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
6 years agodoc: add i40e tunnel support in release notes
Beilei Xing [Thu, 18 Jan 2018 02:24:03 +0000 (10:24 +0800)]
doc: add i40e tunnel support in release notes

Update release notes to declare MPLSoUDP/MPLSoGRE/GTP-U/GTP-C/PPPoE/
PPPoL2TP steering support in i40e driver.

Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
6 years agonet/ixgbe: check security enable bits
Radu Nicolau [Thu, 18 Jan 2018 12:46:40 +0000 (12:46 +0000)]
net/ixgbe: check security enable bits

Check if the security enable bits are not fused before setting
offload capabilities for security.

Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
6 years agomaintainers: update for cryptodev
Pablo de Lara [Wed, 24 Jan 2018 17:24:50 +0000 (17:24 +0000)]
maintainers: update for cryptodev

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agocrypto/qat: fix truncated response ring value
Fiona Trahe [Mon, 29 Jan 2018 18:33:40 +0000 (18:33 +0000)]
crypto/qat: fix truncated response ring value

Issue detected by Coverity. It could never actually cause a
problem, as the truncated value (0x7f7f7f7f->0x7f) is what's needed,
but fix it in the code for correctness.
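
A small sketch of why such a truncation can be harmless, assuming the value
is used as a byte fill pattern (memset() converts its int argument to
unsigned char). The macro and function names here are hypothetical and are
not taken from the QAT driver.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_EMPTY_SIG      0x7f7f7f7fU /* 32-bit pattern */
#define DEMO_EMPTY_SIG_BYTE 0x7f        /* the byte memset() actually writes */

/* Fill a 32-bit word byte by byte and return the result. */
static uint32_t demo_fill_word(int fill)
{
	uint32_t word = 0;

	/* memset() converts 'fill' to unsigned char before writing. */
	memset(&word, fill, sizeof(word));
	return word;
}
```

Both the 32-bit pattern and the single byte produce the same memory
contents, so passing the byte explicitly only makes the intent clear.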

Coverity issue: 194998
Fixes: 571365dd4c5e ("crypto/qat: enable Rx head writes coalescing")
Cc: stable@dpdk.org
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
6 years agotest/crypto: improve NULL authentication validation
Fiona Trahe [Thu, 25 Jan 2018 17:19:15 +0000 (17:19 +0000)]
test/crypto: improve NULL authentication validation

Add comparison to make sure memory pointed to by
digest pointer is not overwritten in NULL auth case.

Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
6 years agocrypto/qat: fix null auth algo overwrite
Fiona Trahe [Thu, 25 Jan 2018 17:19:14 +0000 (17:19 +0000)]
crypto/qat: fix null auth algo overwrite

If auth algorithm is RTE_CRYPTO_AUTH_NULL and digest_length is 0
in the xform and digest pointer is set in the op, then
the PMD may overwrite memory at the digest pointer.
With this patch the memory is not overwritten.

Fixes: db0e952a5c01 ("crypto/qat: add NULL capability")
Cc: stable@dpdk.org
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
6 years agocryptodev: fix session pointer cast
Zhiyong Yang [Tue, 23 Jan 2018 09:48:13 +0000 (17:48 +0800)]
cryptodev: fix session pointer cast

The wrong casts don't cause an actual error, but they should conform to
the C standard.

Fixes: c261d1431bd8 ("security: introduce security API and framework")
Fixes: b3bbd9e5f265 ("cryptodev: support device independent sessions")
Cc: stable@dpdk.org
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
6 years agoapp/crypto-perf: fix out-of-bounds array access
Fan Zhang [Tue, 23 Jan 2018 14:22:55 +0000 (14:22 +0000)]
app/crypto-perf: fix out-of-bounds array access

Fixes: 27c2e7471961 ("app/crypto-perf: support IMIX")

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
6 years agoexamples/ipsec_secgw: fix security session
Fan Zhang [Tue, 23 Jan 2018 12:32:11 +0000 (12:32 +0000)]
examples/ipsec_secgw: fix security session

Fixes: 3da37f682173 ("examples/ipsec_secgw: create session mempools for ethdevs")

Some NICs do not have the rte_security context; this patch fixes the
segmentation fault caused by this.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
6 years agocrypto/qat: fix parameter type
Tomasz Jozwiak [Mon, 22 Jan 2018 16:28:05 +0000 (17:28 +0100)]
crypto/qat: fix parameter type

This commit fixes the type used for the return value of the
qat_cipher_get_block_size function. This function can return -EFAULT
on error, so the value must be stored as int instead of uint8_t.
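
A minimal sketch of the truncation problem, using a hypothetical stand-in
(demo_get_block_size) rather than the real QAT function: a negative error
code stored in a uint8_t can no longer be detected as an error.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in: returns a block size, or -EFAULT on error. */
static int demo_get_block_size(int known_algo)
{
	return known_algo ? 16 : -EFAULT;
}

/* Buggy pattern: the negative error code is truncated to 0..255. */
static int demo_is_error_u8(int known_algo)
{
	uint8_t size = (uint8_t)demo_get_block_size(known_algo);

	return size < 0; /* always false: uint8_t cannot be negative */
}

/* Fixed pattern: keep the value in an int so the error stays visible. */
static int demo_is_error_int(int known_algo)
{
	int size = demo_get_block_size(known_algo);

	return size < 0;
}
```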

Fixes: d18ab45f7654 ("crypto/qat: support DOCSIS BPI mode")
Cc: stable@dpdk.org
Signed-off-by: Tomasz Jozwiak <tomaszx.jozwiak@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
6 years agocrypto/qat: fix typo in error message
Tomasz Jozwiak [Mon, 22 Jan 2018 16:28:04 +0000 (17:28 +0100)]
crypto/qat: fix typo in error message

This commit fixes a typo in the bpi_cipher_decrypt error message.

Fixes: d18ab45f7654 ("crypto/qat: support DOCSIS BPI mode")
Cc: stable@dpdk.org
Signed-off-by: Tomasz Jozwiak <tomaszx.jozwiak@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
6 years agocrypto/qat: fix out-of-bounds access
Tomasz Jozwiak [Mon, 22 Jan 2018 16:28:03 +0000 (17:28 +0100)]
crypto/qat: fix out-of-bounds access

This commit fixes
  - bpi_cipher_encrypt to prevent the 'array subscript is
    above array bounds' error
  - bpi_cipher_decrypt to prevent the 'array subscript is
    above array bounds' error

Fixes: d18ab45f7654 ("crypto/qat: support DOCSIS BPI mode")
Cc: stable@dpdk.org
Signed-off-by: Tomasz Jozwiak <tomaszx.jozwiak@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
6 years agocrypto/dpaa_sec: support scatter gather
Akhil Goyal [Mon, 22 Jan 2018 08:46:38 +0000 (14:16 +0530)]
crypto/dpaa_sec: support scatter gather

Signed-off-by: Alok Makhariya <alok.makhariya@nxp.com>
Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
6 years agocrypto/dpaa2_sec: support scatter gather
Akhil Goyal [Mon, 22 Jan 2018 08:46:37 +0000 (14:16 +0530)]
crypto/dpaa2_sec: support scatter gather

Signed-off-by: Alok Makhariya <alok.makhariya@nxp.com>
Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
6 years agodoc: update feature list for cryptodevs
Akhil Goyal [Mon, 22 Jan 2018 08:46:36 +0000 (14:16 +0530)]
doc: update feature list for cryptodevs

Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
6 years agocrypto/dpaa2_sec: fix build with GCC 7
Thomas Monjalon [Mon, 29 Jan 2018 22:20:40 +0000 (23:20 +0100)]
crypto/dpaa2_sec: fix build with GCC 7

Seen with GCC 7.2.0, a switch fall through is detected and
cannot be fixed with a fall-through comment or attribute:

drivers/crypto/dpaa2_sec/hw/rta/operation_cmd.h:89:6: error:
this statement may fall through [-Werror=implicit-fallthrough=]
   if (rta_sec_era < RTA_SEC_ERA_2)
      ^

The check is disabled in dpaa2_sec Makefile but not in dpaa_sec Makefile
which uses source code shared by dpaa2_sec.

The workaround is to disable the check at the beginning of the file.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
6 years agocrypto/mrvl: fix export map file name
Thomas Monjalon [Mon, 29 Jan 2018 21:52:56 +0000 (22:52 +0100)]
crypto/mrvl: fix export map file name

Fixes: 8a61c83af2fa ("crypto/mrvl: add mrvl crypto driver")

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
6 years agodoc: add ABI experimental tag in versioning guide
Neil Horman [Mon, 22 Jan 2018 01:48:07 +0000 (20:48 -0500)]
doc: add ABI experimental tag in versioning guide

Document the need to add the __experimental tag to appropriate functions

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agomk: add experimental tag check
Neil Horman [Mon, 22 Jan 2018 01:48:05 +0000 (20:48 -0500)]
mk: add experimental tag check

Add checks during build to ensure that all symbols in the EXPERIMENTAL
version map section have __experimental tags on their definitions, and
enable the warnings needed to announce their use.  Also add an
ALLOW_EXPERIMENTAL_APIS define to allow individual libraries and files
to declare the acceptability of experimental api usage

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agoadd experimental tag to appropriate functions
Neil Horman [Mon, 22 Jan 2018 01:48:06 +0000 (20:48 -0500)]
add experimental tag to appropriate functions

Append the __rte_experimental tag to api calls appearing in the
EXPERIMENTAL section of their libraries version map

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agocompat: add experimental tag macro
Neil Horman [Mon, 22 Jan 2018 01:48:04 +0000 (20:48 -0500)]
compat: add experimental tag macro

The __rte_experimental macro tags a given exported function as being part of
the EXPERIMENTAL api.  Use of this tag will cause any caller of the
function (that isn't removed by dead code elimination) to emit a warning
that the user is making use of an API whose stability isn't guaranteed.
It also places the function in the .text.experimental section, which is
used to validate the tag against the corresponding library version map
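
A rough sketch of how such a tag could be built with GCC or Clang on an ELF
target. Everything named demo_* or DEMO_* is hypothetical, and the use of
the "deprecated" attribute to produce the warning is an assumption, not
something stated in the commit message.

```c
/*
 * Hypothetical macro modelled on the description above. Unless
 * DEMO_ALLOW_EXPERIMENTAL is defined, callers get a "deprecated"
 * warning; the symbol is always placed in .text.experimental.
 */
#define DEMO_ALLOW_EXPERIMENTAL 1

#ifdef DEMO_ALLOW_EXPERIMENTAL
#define demo_experimental \
	__attribute__((section(".text.experimental")))
#else
#define demo_experimental \
	__attribute__((deprecated("symbol is not yet part of the stable ABI"), \
		       section(".text.experimental")))
#endif

/* An exported function tagged as experimental. */
demo_experimental int demo_new_api(int x);

int demo_new_api(int x)
{
	return x + 1;
}
```

Placing tagged functions in a dedicated section lets a build-time script
list them and compare the result with the version map.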

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
6 years agobuildtools: add script to check experimental API exports
Neil Horman [Mon, 22 Jan 2018 01:48:03 +0000 (20:48 -0500)]
buildtools: add script to check experimental API exports

This tool reads the given version map for a directory, and checks to
ensure that, for each symbol listed in the export list, the corresponding
definition is tagged as __rte_experimental, erroring out if it's not.  In this
way, we can ensure that the EXPERIMENTAL api is kept in sync with the tags

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
6 years agopmdinfogen: allow using stdin and stdout
Bruce Richardson [Thu, 25 Jan 2018 11:12:25 +0000 (11:12 +0000)]
pmdinfogen: allow using stdin and stdout

Rather than having to work off files all the time, allow stdin and stdout
to be used as the source and destination for pmdinfogen. This will allow
other possible usages from scripts, e.g. taking files from ar archive and
building a single .pmd.c file from all the .o files in it.

for f in `ar t librte_pmd_xyz.a` ; do
ar p librte_pmd_xyz.a $f | pmdinfogen - - >> xyz_info.c
done

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
6 years agoapp/procinfo: call EAL cleanup before exit
Harry van Haaren [Mon, 29 Jan 2018 16:37:32 +0000 (16:37 +0000)]
app/procinfo: call EAL cleanup before exit

This patch adds a call to the newly introduced cleanup()
function just before quitting the app.

Adding this function call before quitting from a secondary process
is important, as otherwise it will leak hugepage memory. For a secondary
process that is run multiple times, this could cause hugepage memory
to become depleted and stop a secondary process from starting.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Vipin Varghese <vipin.varghese@intel.com>
6 years agoapp/pdump: call EAL cleanup before exit
Harry van Haaren [Mon, 29 Jan 2018 16:37:31 +0000 (16:37 +0000)]
app/pdump: call EAL cleanup before exit

This patch adds a call to the newly introduced cleanup()
function just before quitting the pdump app.

Adding this function call before quitting from a secondary process
is important, as otherwise it will leak hugepage memory. For a secondary
process that is run multiple times, this could cause hugepage memory
to become depleted and stop a secondary process from starting.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Vipin Varghese <vipin.varghese@intel.com>
6 years agoeal: add function to release internal resources
Harry van Haaren [Mon, 29 Jan 2018 16:37:30 +0000 (16:37 +0000)]
eal: add function to release internal resources

This commit adds a new function rte_eal_cleanup().
The function serves as a hook to allow DPDK to release
internal resources (e.g.: hugepage allocations).

This function allows DPDK to become more like an ordinary
library, where the library context itself can be initialized
and cleaned up by the application.

The rte_exit() and rte_panic() functions must be considered,
particularly whether they should call rte_eal_cleanup() to release
resources. This patch adds the cleanup to rte_exit(),
but does not clean up on rte_panic(). The reason to not clean
up on panicking is that the developer may wish to inspect the
exact internal state of EAL and hugepages.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Vipin Varghese <vipin.varghese@intel.com>
6 years agoservice: restrict finalize to internal usage
Harry van Haaren [Mon, 29 Jan 2018 16:37:29 +0000 (16:37 +0000)]
service: restrict finalize to internal usage

This commit moves the rte_service_finalize() function
to be in the component header, and marks it as @internal.
The function is only called internally by rte_eal_finalize().

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Vipin Varghese <vipin.varghese@intel.com>
6 years agobus/fslmc: register platform HW mempool on runtime
Hemant Agrawal [Mon, 29 Jan 2018 08:10:49 +0000 (13:40 +0530)]
bus/fslmc: register platform HW mempool on runtime

Detect if the DPAA2 mempool objects are present and register
them as the platform default HW mempool.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
6 years agobus/dpaa: register platform HW mempool on runtime
Hemant Agrawal [Mon, 29 Jan 2018 08:10:48 +0000 (13:40 +0530)]
bus/dpaa: register platform HW mempool on runtime

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
6 years agoapp/testpmd: add log for preferred mempool ops
Pavan Nikhilesh [Mon, 29 Jan 2018 08:10:47 +0000 (13:40 +0530)]
app/testpmd: add log for preferred mempool ops

This patch adds the debug message to print the best selected
pktmbuf mempool ops name.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
6 years agombuf: add pool create helper for specific mempool ops
Hemant Agrawal [Mon, 29 Jan 2018 08:10:46 +0000 (13:40 +0530)]
mbuf: add pool create helper for specific mempool ops

Introduce a new helper for pktmbuf pool, which will allow
the application to optionally specify the mempool ops name
as well.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
6 years agombuf: add pool ops selection functions
Hemant Agrawal [Mon, 29 Jan 2018 08:10:45 +0000 (13:40 +0530)]
mbuf: add pool ops selection functions

This patch adds support for various mempool ops config helper APIs.

1. User defined mempool ops.
2. Platform detected HW mempool ops (active).
3. Best selection of mempool ops by looking into user defined,
   platform registered and compile time configured.
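
The "best selection" step can be sketched as a simple fallback chain. The
function name, the default ops name and the priority order (user defined,
then platform registered, then compile-time default) are assumptions taken
from the order of the list above, not from the patch itself.

```c
#include <stddef.h>

/* Hypothetical compile-time default ops name. */
#define DEMO_DEFAULT_MEMPOOL_OPS "ring_mp_mc"

/*
 * Pick the ops name in priority order: user defined, then platform
 * registered, then the compile-time default. NULL or "" means "not set".
 */
static const char *demo_best_mempool_ops(const char *user,
					 const char *platform)
{
	if (user != NULL && user[0] != '\0')
		return user;
	if (platform != NULL && platform[0] != '\0')
		return platform;
	return DEMO_DEFAULT_MEMPOOL_OPS;
}
```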

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
6 years agombuf: maintain user and compile time mempool ops name
Hemant Agrawal [Mon, 29 Jan 2018 08:10:44 +0000 (13:40 +0530)]
mbuf: maintain user and compile time mempool ops name

At present the user defined mempool ops name overwrites
the default mempool ops name variable in internal_config.

This patch changes the logic to maintain only the user defined
value in the internal config.

The pktmbuf_create_pool is updated to reflect the same, i.e.
use the user defined name if present, otherwise use the default.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
6 years agoeal: prefix mbuf pool ops name with user defined
Hemant Agrawal [Mon, 29 Jan 2018 08:10:43 +0000 (13:40 +0530)]
eal: prefix mbuf pool ops name with user defined

This patch prefixes the mbuf pool ops name with "user" to indicate
that it is user defined.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
6 years agombuf: fix VLAN flags documentation
Olivier Matz [Mon, 29 Jan 2018 09:36:06 +0000 (10:36 +0100)]
mbuf: fix VLAN flags documentation

Fix inconsistency between mbuf structure documentation and flags
documentation.

Fixes: 380a7aab1ae2 ("mbuf: rename deprecated VLAN flags")
Cc: stable@dpdk.org
Reported-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
6 years agombuf: rename Tx VLAN flags
Olivier Matz [Mon, 29 Jan 2018 09:37:07 +0000 (10:37 +0100)]
mbuf: rename Tx VLAN flags

For consistency with the Rx flags, the flags PKT_TX_VLAN_PKT and
PKT_TX_QINQ_PKT are respectively renamed as PKT_TX_VLAN and
PKT_TX_QINQ. The old defines are deprecated but will stay for some time
for compatibility.

Reported-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
6 years agombuf: fix NULL freeing when debug enabled
Olivier Matz [Mon, 29 Jan 2018 09:39:23 +0000 (10:39 +0100)]
mbuf: fix NULL freeing when debug enabled

Do not panic when calling rte_pktmbuf_free(NULL) with mbuf debug
enabled, it is a valid operation.

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Reported-by: Keith Wiles <keith.wiles@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
6 years agoeal/x86: use lock-prefixed instructions for SMP barrier
Konstantin Ananyev [Mon, 15 Jan 2018 15:09:31 +0000 (15:09 +0000)]
eal/x86: use lock-prefixed instructions for SMP barrier

On x86 it is possible to use lock-prefixed instructions to get
an effect similar to mfence.
As pointed out by Java guys, on most modern HW that gives better
performance than using mfence:
https://shipilev.net/blog/2014/on-the-fence-with-dependencies/
This patch adopts that technique for the rte_smp_mb() implementation.
On BDW 2.2 mb_autotest on single lcore reports 2X cycle reduction,
i.e. from ~110 to ~55 cycles per operation.
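
A sketch of the technique, assuming GCC or Clang inline assembly on x86-64.
The function name demo_smp_mb is hypothetical, and the stack offset is an
illustrative choice; other targets fall back to a compiler builtin.

```c
/*
 * Full memory barrier sketch. On x86-64, a lock-prefixed add of zero to
 * a stack location orders earlier loads and stores before later ones,
 * much like mfence does for ordinary memory, without changing any data.
 */
static inline void demo_smp_mb(void)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	__asm__ volatile("lock addl $0, -128(%%rsp)" ::: "memory", "cc");
#else
	__sync_synchronize();
#endif
}
```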

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
6 years agotest: introduce memory barrier test case
Konstantin Ananyev [Mon, 15 Jan 2018 15:04:39 +0000 (15:04 +0000)]
test: introduce memory barrier test case

Simple functional test for rte_smp_mb() implementations.
When executed on a single lcore, it can also be used as a rough
estimate of how many cycles a particular implementation of rte_smp_mb()
might take.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
6 years agoconfig/thunderx: disable C11 memory model ring
Jerin Jacob [Sun, 3 Dec 2017 12:37:30 +0000 (18:07 +0530)]
config/thunderx: disable C11 memory model ring

On thunderx and octeontx, the ring_perf_autotest and
ring_pmd_perf_autotest tests show better performance
when disabling CONFIG_RTE_RING_USE_C11_MEM_MODEL.
On the other hand, enabling CONFIG_RTE_RING_USE_C11_MEM_MODEL
shows better performance on thunderx2.
Since thunderx2 is using the default armv8 config,
no particular change is required.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
6 years agoring: introduce C11 memory model barrier option
Jia He [Mon, 22 Jan 2018 04:41:28 +0000 (20:41 -0800)]
ring: introduce C11 memory model barrier option

This patch is to support C11 memory model barrier in librte_ring.

There are 2 barrier implementation options in librte_ring (suggested
by Jerin).
1. use rte_smp_rmb
2. use load_acquire/store_release(refer to [1]).
Two options are provided because benchmark results differ
between arm machines; refer to [2].
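
To illustrate the load_acquire/store_release option, here is a
single-producer/single-consumer ring written with C11 atomics. It is a
simplified sketch with hypothetical names, not the librte_ring code, which
also supports multiple producers and consumers.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8u /* must be a power of two */

struct demo_ring {
	_Atomic uint32_t head;  /* next slot to write (producer) */
	_Atomic uint32_t tail;  /* next slot to read (consumer) */
	int slots[DEMO_RING_SIZE];
};

/* Single-producer enqueue; returns 0 on success, -1 if the ring is full. */
static int demo_enqueue(struct demo_ring *r, int v)
{
	uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
	/* Acquire pairs with the consumer's release store of tail. */
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

	if (head - tail == DEMO_RING_SIZE)
		return -1;
	r->slots[head & (DEMO_RING_SIZE - 1)] = v;
	/* Release publishes the slot write before the new head is visible. */
	atomic_store_explicit(&r->head, head + 1, memory_order_release);
	return 0;
}

/* Single-consumer dequeue; returns 0 on success, -1 if the ring is empty. */
static int demo_dequeue(struct demo_ring *r, int *v)
{
	uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
	/* Acquire pairs with the producer's release store of head. */
	uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

	if (head == tail)
		return -1;
	*v = r->slots[tail & (DEMO_RING_SIZE - 1)];
	/* Release lets the producer reuse the slot only after it was read. */
	atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
	return 0;
}
```

The acquire/release pairs replace explicit barrier calls: ordering is
attached to the index accesses themselves.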

CONFIG_RTE_RING_USE_C11_MEM_MODEL is provided; by default it is "n"
on all architectures except arm64, where it is "y" so far.

[1] https://github.com/freebsd/freebsd/blob/master/sys/sys/buf_ring.h#L170
[2] http://dpdk.org/ml/archives/dev/2017-October/080861.html

Suggested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jianbo Liu <jianbo.liu@arm.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
6 years agoring: move code in a new header file
Jia He [Mon, 22 Jan 2018 04:41:27 +0000 (20:41 -0800)]
ring: move code in a new header file

Move the common part of rte_ring.h into rte_ring_generic.h.
Move the memory barrier part into update_tail().

No functional changes here.

Suggested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
6 years agoeal/arm64: remove the braces in memory barrier macros
Jia He [Mon, 22 Jan 2018 04:41:26 +0000 (20:41 -0800)]
eal/arm64: remove the braces in memory barrier macros

For code such as the following:
    if (condition)
        rte_smp_rmb();
    else
        rte_smp_wmb();
Without this patch, the compiler reports this error:
    error: 'else' without a previous 'if'
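
The problem can be reproduced without any barrier instructions. In this
sketch the macros are hypothetical stand-ins that only bump a counter; the
braced variant is shown in a comment because it does not compile in an
if/else.

```c
static int demo_calls;

/*
 * Braced form (problematic): after expansion, the ';' that follows the
 * closing brace ends the 'if' statement, so a following 'else' has no
 * matching 'if'.
 *
 *   #define demo_rmb() { demo_calls += 1; }
 */

/* Brace-free forms expand to a single expression statement. */
#define demo_rmb() (demo_calls += 1)
#define demo_wmb() (demo_calls += 10)

static int demo_barrier(int is_read)
{
	if (is_read)
		demo_rmb();
	else
		demo_wmb();
	return demo_calls;
}
```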

Fixes: 84733fd0d75e ("eal/arm64: fix memory barrier definition")
Cc: stable@dpdk.org
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
6 years agonet/mlx5: fix synchronization on polling Rx completions
Yongseok Koh [Thu, 25 Jan 2018 21:02:50 +0000 (13:02 -0800)]
net/mlx5: fix synchronization on polling Rx completions

Polling a new packet basically means sensing the generation bit in a
completion entry. On processors without a strongly-ordered memory
model, there has to be a memory barrier between reading the generation bit
and the other fields of the entry in order to guarantee data is not stale.
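
A sketch of the ordering requirement, using a C11 acquire fence as a
stand-in for the driver's barrier. The structure layout and names are
hypothetical and do not reflect the real mlx5 completion entry.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical completion entry: the device writes 'len', then 'gen'. */
struct demo_cqe {
	uint32_t len;
	_Atomic uint8_t gen;  /* generation (ownership) bit */
};

/* Returns 1 and fills *len if the entry carries 'expected_gen', else 0. */
static int demo_poll(struct demo_cqe *cqe, uint8_t expected_gen,
		     uint32_t *len)
{
	uint8_t gen = atomic_load_explicit(&cqe->gen, memory_order_relaxed);

	if (gen != expected_gen)
		return 0;
	/* Barrier: the other fields must not be read before the bit. */
	atomic_thread_fence(memory_order_acquire);
	*len = cqe->len;
	return 1;
}
```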

Fixes: 570acdb1da8a ("net/mlx5: add vectorized Rx/Tx burst for ARM")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years agonet/mlx5: replace I/O memory barrier with coherent version
Yongseok Koh [Thu, 25 Jan 2018 21:02:49 +0000 (13:02 -0800)]
net/mlx5: replace I/O memory barrier with coherent version

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agonet/mlx5: remove unnecessary memory barrier
Yongseok Koh [Thu, 25 Jan 2018 21:02:48 +0000 (13:02 -0800)]
net/mlx5: remove unnecessary memory barrier

As rte_write64() has an IO barrier, there's no need to have a barrier
before the call.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agoeal/arm64: define coherent I/O memory barriers
Yongseok Koh [Thu, 25 Jan 2018 21:02:47 +0000 (13:02 -0800)]
eal/arm64: define coherent I/O memory barriers

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Thomas Speier <tspeier@qti.qualcomm.com>
Acked-by: Jianbo Liu <jianbo.liu@arm.com>
6 years agoeal/arm32: define coherent I/O memory barriers
Yongseok Koh [Thu, 25 Jan 2018 21:02:46 +0000 (13:02 -0800)]
eal/arm32: define coherent I/O memory barriers

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Jianbo Liu <jianbo.liu@arm.com>
6 years agoeal/ppc64: define coherent I/O memory barriers
Yongseok Koh [Thu, 25 Jan 2018 21:02:45 +0000 (13:02 -0800)]
eal/ppc64: define coherent I/O memory barriers

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agoeal/x86: define coherent I/O memory barriers
Yongseok Koh [Thu, 25 Jan 2018 21:02:44 +0000 (13:02 -0800)]
eal/x86: define coherent I/O memory barriers

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agoeal: introduce coherent I/O memory barriers
Yongseok Koh [Thu, 25 Jan 2018 21:02:43 +0000 (13:02 -0800)]
eal: introduce coherent I/O memory barriers

This commit introduces rte_cio_wmb() and rte_cio_rmb(), in order to
guarantee the ordering of coherent shared memory between the CPU and a DMA
capable device.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years agoeal: group memory barriers by type in doxygen
Yongseok Koh [Thu, 25 Jan 2018 21:02:42 +0000 (13:02 -0800)]
eal: group memory barriers by type in doxygen

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agotest: add reciprocal based division
Pavan Nikhilesh [Fri, 26 Jan 2018 05:04:51 +0000 (10:34 +0530)]
test: add reciprocal based division

This commit provides a set of tests for verifying the correctness and
performance of both unsigned 32 and 64bit reciprocal based division.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
6 years agoeal: add u64-bit variant for reciprocal divide
Pavan Nikhilesh [Fri, 26 Jan 2018 05:04:50 +0000 (10:34 +0530)]
eal: add u64-bit variant for reciprocal divide

Currently, rte_reciprocal only supports unsigned 32bit divisors. This
commit adds support for unsigned 64bit divisors.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
6 years agoeal: introduce integer divide through reciprocal
Pavan Nikhilesh [Fri, 26 Jan 2018 05:04:49 +0000 (10:34 +0530)]
eal: introduce integer divide through reciprocal

In some use cases of integer division, denominator remains constant and
numerator varies. It is possible to optimize division for such specific
scenarios.

librte_sched uses rte_reciprocal to optimize division, so moving it to
eal/common allows other libraries and applications to use it.
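
As an illustration of division by a constant through a precomputed
reciprocal, the sketch below follows the classic multiply-and-shift scheme
for unsigned 32-bit values. The names are hypothetical, and whether
rte_reciprocal matches this in every detail is an assumption.

```c
#include <stdint.h>

/* Precomputed multiplier and shifts for one divisor. */
struct demo_reciprocal {
	uint32_t m;
	uint8_t sh1, sh2;
};

/* Position of the highest set bit, 1-based; 0 for x == 0. */
static int demo_fls(uint32_t x)
{
	int n = 0;

	while (x != 0) {
		n++;
		x >>= 1;
	}
	return n;
}

/* Precompute the reciprocal of a constant divisor d (d must not be 0). */
static struct demo_reciprocal demo_reciprocal_value(uint32_t d)
{
	struct demo_reciprocal r;
	int l = demo_fls(d - 1);
	uint64_t m = ((1ULL << 32) * ((1ULL << l) - d)) / d + 1;

	r.m = (uint32_t)m;
	r.sh1 = (uint8_t)(l > 1 ? 1 : l);
	r.sh2 = (uint8_t)(l > 1 ? l - 1 : 0);
	return r;
}

/* Compute a / d with one multiplication and two shifts. */
static uint32_t demo_reciprocal_divide(uint32_t a, struct demo_reciprocal r)
{
	uint32_t t = (uint32_t)(((uint64_t)a * r.m) >> 32);

	return (t + ((a - t) >> r.sh1)) >> r.sh2;
}
```

The reciprocal is computed once per divisor, so the cost of the real
division is paid only when the denominator changes.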

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
6 years agoservice: fix memory leak with new function
Vipin Varghese [Fri, 26 Jan 2018 20:55:36 +0000 (02:25 +0530)]
service: fix memory leak with new function

The rte_service_finalize routine checks whether the service library is
initialized. If it is, the internal memory for services and lcore
states is freed. This routine is to be invoked at application
termination.

Fixes: 21698354c832 ("service: introduce service cores concept")
Cc: stable@dpdk.org
Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
6 years agolog: fix memory leak in regexp level set
Ivan Malov [Sun, 21 Jan 2018 17:05:10 +0000 (17:05 +0000)]
log: fix memory leak in regexp level set

Fixes: a5279180f510 ("eal: change several log levels matching a regexp")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
6 years agoflow_classify: fix memory leak in rule add
Jasvinder Singh [Mon, 22 Jan 2018 14:14:28 +0000 (14:14 +0000)]
flow_classify: fix memory leak in rule add

Free allocated memory of the rule if not added to the table.

Coverity issue: 257032
Fixes: 50bdac5916d9 ("flow_classify: remove table id parameter from API")

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agokeepalive: fix state alignment
Andriy Berestovskyy [Tue, 23 Jan 2018 15:43:16 +0000 (16:43 +0100)]
keepalive: fix state alignment

The __rte_cache_aligned was applied to the whole array,
not the array elements. This leads to false sharing between
the monitored cores.
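
The difference between aligning an array and aligning its elements can be
sketched with C11 alignas. The types are hypothetical and the 64-byte
cache line size is an assumption.

```c
#include <stdalign.h>

#define DEMO_CACHE_LINE 64 /* assumed cache line size */
#define DEMO_MAX_CORES 4

/* Aligning only the array: all elements stay packed in one cache line. */
struct demo_packed {
	alignas(DEMO_CACHE_LINE) int state[DEMO_MAX_CORES];
};

/* Aligning each element: every core's state gets its own cache line. */
struct demo_state {
	alignas(DEMO_CACHE_LINE) int value;
};

struct demo_padded {
	struct demo_state state[DEMO_MAX_CORES];
};
```

With the second layout, a core updating its own state no longer
invalidates the cache line holding its neighbours' states.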

Fixes: e70a61ad50ab ("keepalive: export states")
Cc: stable@dpdk.org
Signed-off-by: Andriy Berestovskyy <aber@semihalf.com>
Acked-by: Remy Horton <remy.horton@intel.com>
6 years agocmdline: avoid garbage in unused fields of parsed result
Xueming Li [Sat, 20 Jan 2018 03:26:31 +0000 (11:26 +0800)]
cmdline: avoid garbage in unused fields of parsed result

The result buffer was not initialized before parsing, inducing garbage
in unused fields or padding of the parsed structure.

Initialize the result buffer each time before parsing.

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
6 years agocmdline: fix dynamic tokens parsing
Xueming Li [Fri, 19 Jan 2018 18:16:10 +0000 (02:16 +0800)]
cmdline: fix dynamic tokens parsing

When using dynamic tokens, the result buffer contains pointers to some
location inside the result buffer. When the content of the temporary
buffer is copied in the final one, these pointers still point to the
temporary buffer.

This works as long as the temporary buffer is kept intact, but the next
commit introduces a memset() that breaks this assumption.

This commit keeps the successfully parsed buffers, and ensures that the
pointers point to a valid location, by using a temporary buffer for
subsequent parsing.

Fixes: 9b3fbb051d2e ("cmdline: fix parsing")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
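
The dangling-pointer problem described above can be reproduced in a few
lines. The sketch below is hypothetical (struct parsed, parse() and
copy_result() are not cmdline library names) and it rebases the interior
pointer on copy, which is a different remedy from the one the commit
describes; it only illustrates why a plain memcpy() is not enough.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical parsed result: 'arg' points into 'line' of the same struct. */
struct parsed {
	char line[32];
	char *arg;
};

/* Zero the whole result first so unused fields and padding hold no
 * garbage, then point 'arg' at the text after the first space. */
static void parse(struct parsed *res, const char *input)
{
	char *sp;

	assert(res != NULL && input != NULL);
	memset(res, 0, sizeof(*res));
	strncpy(res->line, input, sizeof(res->line) - 1);
	sp = strchr(res->line, ' ');
	res->arg = sp ? sp + 1 : res->line;
}

/* Copy a result and rebase the interior pointer onto the destination.
 * Without the second statement, dst->arg would still point into src. */
static void copy_result(struct parsed *dst, const struct parsed *src)
{
	memcpy(dst, src, sizeof(*dst));
	dst->arg = dst->line + (src->arg - src->line);
}
```

After copy_result(), clearing the source buffer no longer affects the
destination.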
6 years agoservice: fix possible mem leak on initialize
Harry van Haaren [Wed, 24 Jan 2018 17:02:47 +0000 (17:02 +0000)]
service: fix possible mem leak on initialize

This commit ensures that if we run out of memory during the
initialization of the service library, the first allocated
memory is correctly freed instead of leaked.

Fixes: 21698354c832 ("service: introduce service cores concept")
Cc: stable@dpdk.org
Reported-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
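
The cleanup pattern described above looks roughly like this. It is a
sketch under assumptions: struct svc_state, svc_init() and the injectable
allocator are hypothetical and do not mirror the service library's real
symbols.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical state with two separately allocated tables. */
struct svc_state {
	void *services;
	void *lcore_states;
};

typedef void *(*alloc_fn)(size_t n, size_t size);

/* Allocate both tables; if the second allocation fails, release the
 * first one so that a failed init leaks nothing. */
static int svc_init(struct svc_state *st, alloc_fn alloc)
{
	assert(st != NULL && alloc != NULL);
	st->lcore_states = NULL;
	st->services = alloc(8, 64);
	if (st->services == NULL)
		return -ENOMEM;

	st->lcore_states = alloc(8, 64);
	if (st->lcore_states == NULL) {
		free(st->services);
		st->services = NULL;
		return -ENOMEM;
	}
	return 0;
}

/* Test helper: behaves like calloc() but fails on the second call. */
static int alloc_calls;

static void *calloc_fail_second(size_t n, size_t size)
{
	return ++alloc_calls == 2 ? NULL : calloc(n, size);
}
```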
6 years agombuf: remove void pointer cast
Zhiyong Yang [Fri, 19 Jan 2018 10:18:13 +0000 (18:18 +0800)]
mbuf: remove void pointer cast

It is unnecessary to cast from void * to struct rte_mbuf *;
removing the cast makes the code clearer.

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
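
As a reminder of the language rule behind this cleanup: in C, a void *
converts implicitly to any object pointer type. The sketch uses a
hypothetical struct mbuf stand-in and callback name, not the real
rte_mbuf definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real mbuf structure. */
struct mbuf {
	unsigned int refcnt;
};

/* Mempool-style callback: the object arrives as void *. */
static void mbuf_init_cb(void *obj, unsigned int refcnt)
{
	struct mbuf *m = obj; /* implicit conversion, no cast needed in C */

	assert(m != NULL);
	m->refcnt = refcnt;
}
```

Note that this holds for C only; C++ would require an explicit cast.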
6 years agodoc: fix build of bbdev test guide
Marko Kovacevic [Wed, 24 Jan 2018 15:07:45 +0000 (15:07 +0000)]
doc: fix build of bbdev test guide

Fix build issue with pdf guides. Some indentations in the bbdev test
application doc were causing build failures. Latex Log message:
 
    doc.log:! LaTeX Error: Too deeply nested.
   
Fixes: f714a18885a6 ("app/testbbdev: add test application for bbdev")

Signed-off-by: Marko Kovacevic <marko.kovacevic@intel.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
6 years agoevent/opdl: fix icc build
Zhiyong Yang [Thu, 25 Jan 2018 07:03:50 +0000 (15:03 +0800)]
event/opdl: fix icc build

ICC reports the issue at compile time as follows.
error #592: variable "i" is used before its value is set
        RTE_SET_USED(i);

This patch fixes it. GCC and Clang have been tested as well.

Fixes: d548ef513cd7 ("event/opdl: add unit tests")

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Liang Ma <liang.j.ma@intel.com>
6 years agonet/vdev_netvsc: fix build without C11 and pedantic
Ophir Munk [Wed, 24 Jan 2018 14:12:13 +0000 (14:12 +0000)]
net/vdev_netvsc: fix build without C11 and pedantic

Remove CFLAGS -std=c11 and -pedantic in order to guarantee
a successful vdev_netvsc compilation on old Linux distributions.
Otherwise old GCC compilers may complain as follows:
cc1: error: unrecognized command line option -std=c11

Fixes: 6086ab3bb3d2 ("net/vdev_netvsc: introduce Hyper-V platform driver")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
6 years agonet/tap: use local eBPF definitions
Ophir Munk [Tue, 23 Jan 2018 21:54:09 +0000 (21:54 +0000)]
net/tap: use local eBPF definitions

eBPF has a graceful approach: it must successfully compile on all Linux
distributions. If a specific kernel cannot support eBPF it will gracefully
refuse the eBPF netlink message sent to it.
The kernel header file linux/bpf.h (if present) on different Linux
distributions may not include all definitions required for TAP
compilation.
In order to guarantee a successful eBPF compilation everywhere, all the
required definitions for TAP have been added locally instead of including
the file <linux/bpf.h>.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
6 years agodrivers/event: fix resource leak in selftest
Pavan Nikhilesh [Mon, 22 Jan 2018 17:46:01 +0000 (23:16 +0530)]
drivers/event: fix resource leak in selftest

Free leaked resources in eventdev selftests.

Coverity issue: 257044
Coverity issue: 257047
Coverity issue: 257009
Fixes: 9ef576176db0 ("test/eventdev: add octeontx multi queue and multi port")
Fixes: 3a17ff401f1e ("test/eventdev: add basic SW tests")
Fixes: 5e6eb5ccd788 ("event/sw: make test standalone")

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
6 years agoevent/opdl: rework loops to comply with dpdk style
Harry van Haaren [Mon, 22 Jan 2018 10:04:10 +0000 (10:04 +0000)]
event/opdl: rework loops to comply with dpdk style

This commit reworks the loop counter variable declarations
to be in line with the DPDK source code.

Fixes: 3c7f3dcfb099 ("event/opdl: add PMD main body and helper function")
Fixes: 8ca8e3b48eff ("event/opdl: add event queue config get/set")
Fixes: d548ef513cd7 ("event/opdl: add unit tests")

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Liang Ma <liang.j.ma@intel.com>
6 years agoevent/sw: fix debug logging config option
Jerin Jacob [Fri, 19 Jan 2018 06:38:29 +0000 (12:08 +0530)]
event/sw: fix debug logging config option

Align the config option name with config/common_base.

Fixes: aaa4a221da26 ("event/sw: add new software-only eventdev driver")
Cc: stable@dpdk.org
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
6 years agoversion: 18.02-rc1
Thomas Monjalon [Mon, 22 Jan 2018 00:50:25 +0000 (01:50 +0100)]
version: 18.02-rc1

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
6 years agoconfig: sort PMD config options
Ferruh Yigit [Sat, 20 Jan 2018 16:50:53 +0000 (16:50 +0000)]
config: sort PMD config options

No config option changed, added or removed.
Only reshuffle PMD config options, mostly to show new PMDs where to put
their new config options.

Ordered as physical, paravirtual and virtual groups. Alphabetical order
within a group.

Also tried to group vendor devices together which breaks alphabetical
order in some places.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agoethdev: rename function parameter for consistency
Ferruh Yigit [Mon, 22 Jan 2018 00:16:25 +0000 (00:16 +0000)]
ethdev: rename function parameter for consistency

Update "port" function argument variable to "port_id" in public
header to be consistent in all APIs.

No functional change.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agoethdev: reorder inline functions
Ferruh Yigit [Mon, 22 Jan 2018 00:16:24 +0000 (00:16 +0000)]
ethdev: reorder inline functions

Move all inline functions to the end of the ethdev.h header file and move
the ethdev_core.h include just before the inline functions.

Since inline functions need data structures in ethdev_core.h, this
reorder groups them and makes it clear where to put further inline
functions.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agoethdev: separate internal structures into own header
Ferruh Yigit [Mon, 22 Jan 2018 00:16:23 +0000 (00:16 +0000)]
ethdev: separate internal structures into own header

rte_ethdev_core.h is created and internal data structures are moved into it.

These structures are mostly intended to be used by drivers, but they
need to be in the public header file because of the inline functions
in the ethdev.h header, and those inline functions are kept for
performance reasons.

The accessibility of the data structures is not changed; they are only
logically grouped to show that they are not intended to be used by
applications.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agoethdev: separate driver APIs
Ferruh Yigit [Mon, 22 Jan 2018 00:16:22 +0000 (00:16 +0000)]
ethdev: separate driver APIs

Create a rte_ethdev_driver.h file and move PMD specific APIs here.
Drivers are updated to include this new header file.

The header content is not changed, and since ethdev.h is included by
ethdev_driver.h, nothing changes from the driver point of view; the APIs
are only grouped logically. From the application point of view, driver
specific APIs are no longer accessible, and they should not be.

More PMD specific data structures still remain in ethdev.h because
inline functions in the header use them. Those will be handled separately.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agonet/failsafe: fix removed device handling
Matan Azrad [Sat, 20 Jan 2018 21:12:24 +0000 (21:12 +0000)]
net/failsafe: fix removed device handling

There is a delay between the physical removal of the device and the
moment sub-device PMDs get an RMV interrupt. During this time DPDK PMDs
and applications do not yet know about the removal and may call a
sub-device control operation, which should return an error.

In the previous code this error was reported to the application, contrary
to the fail-safe principle that the app should not be aware of device
removal.

Add a removal check in each relevant control command error flow and
prevent an error report to the application when the sub-device is removed.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
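
The error flow described above can be sketched as a loop over
sub-devices that ignores failures coming from removed ones. All names
here (struct sub_dev, fs_apply(), the op_result field standing in for an
actual control call) are hypothetical and not taken from the fail-safe
PMD.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sub-device: 'op_result' stands in for the return value
 * of a control operation issued on that sub-device. */
struct sub_dev {
	bool removed;  /* true once the device is physically gone */
	int op_result;
};

/* Apply a control operation to every sub-device. A failure is only
 * reported when the failing sub-device is still present. */
static int fs_apply(const struct sub_dev *subs, size_t n)
{
	size_t i;

	assert(subs != NULL);
	for (i = 0; i < n; i++) {
		int ret = subs[i].op_result;

		if (ret != 0 && !subs[i].removed)
			return ret;
	}
	return 0;
}
```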
6 years agoethdev: adjust removal error report in flow API
Matan Azrad [Sat, 20 Jan 2018 21:12:23 +0000 (21:12 +0000)]
ethdev: adjust removal error report in flow API

rte_eth_dev_is_removed API was added to detect a device removal
synchronously.

When a device removal occurs during flow command execution, many
different errors can be reported to the user.

Adjust all flow APIs error reports to return -EIO in case of device
removal using rte_eth_dev_is_removed API.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agoethdev: adjust removal error report
Matan Azrad [Sat, 20 Jan 2018 21:12:22 +0000 (21:12 +0000)]
ethdev: adjust removal error report

rte_eth_dev_is_removed API was added to detect a device removal
synchronously.

When a device removal occurs during control command execution, many
different errors can be reported to the user.

Adjust all ethdev APIs error reports to return -EIO in case of device
removal using rte_eth_dev_is_removed API.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
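
The error adjustment described above amounts to a small filter applied
to the return value of each control operation. The sketch below uses
hypothetical stand-ins (struct eth_dev, dev_is_removed(), map_err())
rather than the real ethdev types; only the -EIO convention comes from
the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical device with an optional removal-check callback. */
struct eth_dev {
	int (*is_removed)(struct eth_dev *dev);
	int gone; /* test knob read by the callback below */
};

/* Synchronous removal check; devices without the callback are
 * treated as present. */
static int dev_is_removed(struct eth_dev *dev)
{
	assert(dev != NULL);
	return dev->is_removed != NULL && dev->is_removed(dev);
}

/* Pass success through, turn any error on a removed device into -EIO,
 * and leave other errors unchanged. */
static int map_err(struct eth_dev *dev, int ret)
{
	if (ret == 0)
		return 0;
	if (dev_is_removed(dev))
		return -EIO;
	return ret;
}

/* Example callback for the hypothetical device. */
static int probe_gone(struct eth_dev *dev)
{
	return dev->gone;
}
```

Checking removal only on the error path keeps the successful path free
of extra device accesses.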
6 years agonet/mlx5: support a device removal check operation
Matan Azrad [Sat, 20 Jan 2018 21:12:21 +0000 (21:12 +0000)]
net/mlx5: support a device removal check operation

Add support to get removal status of mlx5 device.
It is not supported in secondary process.

Signed-off-by: Matan Azrad <matan@mellanox.com>
6 years agonet/mlx4: support a device removal check operation
Matan Azrad [Sat, 20 Jan 2018 21:12:20 +0000 (21:12 +0000)]
net/mlx4: support a device removal check operation

Add support to get removal status of mlx4 device.

Signed-off-by: Matan Azrad <matan@mellanox.com>
6 years agoethdev: add devop to check removal status
Matan Azrad [Sat, 20 Jan 2018 21:12:19 +0000 (21:12 +0000)]
ethdev: add devop to check removal status

There is a delay between the physical removal of the device and the
moment PMDs get an RMV interrupt. During this time DPDK PMDs and
applications do not yet know about the removal.

Current removal detection is achieved only by registration to the device
RMV event, and the notification comes asynchronously. So, there is no
option to detect a device removal synchronously.
Applications and other DPDK entities may want to check a device removal
synchronously and to take an immediate decision accordingly.

Add a new dev op called is_removed to allow DPDK entities to check an
Ethernet device removal status immediately.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agodoc: add RSS in tap guide
Ophir Munk [Sat, 20 Jan 2018 21:11:37 +0000 (21:11 +0000)]
doc: add RSS in tap guide

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
6 years agonet/tap: implement RSS using eBPF
Ophir Munk [Sat, 20 Jan 2018 21:11:36 +0000 (21:11 +0000)]
net/tap: implement RSS using eBPF

TAP PMD is required to support RSS queue mapping based on rte_flow API. An
example usage for this requirement is failsafe transparent switching from a
PCI device to a TAP device while continuing to redirect packets to the same
RSS queues on both devices.

TAP RSS implementation is based on eBPF programs sent to Linux kernel
through BPF system calls and using netlink messages to reference the
programs as part of traffic control commands.

TC uses eBPF programs as classifiers and actions.
eBPF classification: packets marked with an RSS queue will be directed
to this queue using TC with "skbedit" action.
BPF classifiers are downloaded to the kernel once on TAP creation for
each TAP Rx queue.

eBPF action: calculate the Toeplitz RSS hash based on L3 addresses and
L4 ports. Mark the packet with the RSS queue according to the resulting
RSS hash, then reclassify the packet.
BPF actions are downloaded to the kernel for each new RSS rule.

TAP eBPF requires Linux version 4.9 configured with BPF. The TAP PMD will
successfully compile on systems with old or non-BPF configured kernels, but
RSS rule creation on TAP devices will fail.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
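
For reference, the Toeplitz hash mentioned above can be written in plain
C as below. This is a user-space sketch, not the BPF program from this
series: the function names are hypothetical, the key is the widely
published Microsoft RSS verification key (the key used by TAP may
differ), and taking the hash modulo the queue count is an assumed
queue-selection rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Well-known RSS verification key; an assumption for this sketch. */
static const uint8_t rss_key[40] = {
	0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
	0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
	0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
	0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
	0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
};

/* Toeplitz hash: for every set bit of the input (MSB first), XOR in
 * the 32-bit window of the key that starts at that bit position. */
static uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
			      const uint8_t *data, size_t len)
{
	uint32_t hash = 0;
	uint32_t window;
	size_t i;
	int bit;

	assert(key_len >= len + 4); /* key must cover input plus 32 bits */
	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
		 ((uint32_t)key[2] << 8) | (uint32_t)key[3];

	for (i = 0; i < len; i++) {
		for (bit = 7; bit >= 0; bit--) {
			if (data[i] & (1u << bit))
				hash ^= window;
			/* slide the window by one key bit */
			window <<= 1;
			if (key[i + 4] & (1u << bit))
				window |= 1u;
		}
	}
	return hash;
}

/* Assumed queue selection: hash modulo the number of Rx queues. */
static uint16_t rss_queue(uint32_t hash, uint16_t nb_queues)
{
	assert(nb_queues != 0);
	return (uint16_t)(hash % nb_queues);
}
```

The input is the concatenation of source address, destination address,
source port and destination port in network byte order; hashing only the
first 8 bytes gives the L3-only variant.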
6 years agonet/tap: add eBPF API
Ophir Munk [Sat, 20 Jan 2018 21:11:35 +0000 (21:11 +0000)]
net/tap: add eBPF API

This commit includes the BPF API to be used by TAP.

tap_flow_bpf_cls_q() - download to kernel BPF program that classifies
packets to their matching queues
tap_flow_bpf_calc_l3_l4_hash() - download to kernel BPF program that
calculates per packet layer 3 and layer 4 RSS hash
tap_flow_bpf_rss_map_create() - create BPF RSS map for storing RSS
parameters per RSS rule
tap_flow_bpf_update_rss_elem() - update BPF map entry with RSS rule
parameters

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
6 years agonet/tap: add eBPF bytes code
Ophir Munk [Sat, 20 Jan 2018 21:11:34 +0000 (21:11 +0000)]
net/tap: add eBPF bytes code

File tap_bpf_insns.h was added. It includes eBPF bytecode
which corresponds to source file tap_bpf_program.c
(see "net/tap: add eBPF program file").
The bytecode is in the format of C arrays of struct bpf_insn and
was generated from the C file tap_bpf_program.c
1. The C file was compiled via LLVM into an object file in ELF
format as:
   clang -O2 -emit-llvm -c tap_bpf_program.c -o - | llc -march=bpf \
   -filetype=obj -o <tap_bpf_program.o>

clang version must be 3.7 or above.
The C functions are under different ELF sections and are considered
different BPF programs to be downloaded to the kernel.

2. Using an external tool the ELF sections are parsed and the C arrays
of struct bpf_insn are generated. Each C array (corresponding to a
different function under an ELF section) is downloaded to the kernel
using a BPF system call. The external tool that generates the C arrays
will be added in separate commits.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
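
To give an idea of what "C arrays of struct bpf_insn" look like, here is
a minimal sketch. The struct is declared locally so the snippet builds
without <linux/bpf.h>; its layout follows the kernel UAPI definition as
commonly documented. The two-instruction program (return 0) and the
helper names are illustrative only and are not taken from
tap_bpf_insns.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the kernel's 8-byte eBPF instruction encoding. */
struct bpf_insn {
	uint8_t code;        /* opcode */
	uint8_t dst_reg : 4; /* destination register */
	uint8_t src_reg : 4; /* source register */
	int16_t off;         /* signed offset */
	int32_t imm;         /* signed immediate */
};

/* Illustrative program: "r0 = 0; exit". */
static const struct bpf_insn example_insns[] = {
	{ .code = 0xb7, .dst_reg = 0, .imm = 0 }, /* BPF_ALU64|BPF_MOV|BPF_K */
	{ .code = 0x95 },                         /* BPF_JMP|BPF_EXIT */
};

/* Number of instructions, as a program loader would need it. */
static size_t example_insn_count(void)
{
	return sizeof(example_insns) / sizeof(example_insns[0]);
}

/* Bounds-checked access to one instruction. */
static const struct bpf_insn *example_insn_at(size_t i)
{
	assert(i < example_insn_count());
	return &example_insns[i];
}
```

An array like this, together with its instruction count, is what a
loader would hand to the kernel when loading the program.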
6 years agonet/tap: add eBPF program file
Ophir Munk [Sat, 20 Jan 2018 21:11:33 +0000 (21:11 +0000)]
net/tap: add eBPF program file

File tap_bpf_program.c was added with two ELF sections
corresponding to two BPF programs and one BPF map.

Section cls_q - BPF classifier to classify packets to their
corresponding queue after an RSS hash was calculated on the packet
and saved in skb->cb[1]
Section l3_l4 - BPF action to calculate RSS hash on packet
layers 3 and 4
This file is not part of DPDK tree compilation.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
6 years agonet/tap: support actions for different classifiers
Ophir Munk [Sat, 20 Jan 2018 21:11:32 +0000 (21:11 +0000)]
net/tap: support actions for different classifiers

Add generic handling for the TC actions "mirred", "gact" and
"skbedit". This will be useful when introducing BPF actions,
which use TCA_BPF_ACT instead of TCA_FLOWER_ACT.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
6 years agonet/mlx5: fix memory region lookup
Yongseok Koh [Fri, 19 Jan 2018 07:52:55 +0000 (23:52 -0800)]
net/mlx5: fix memory region lookup

This patch reverts:
commit 3a6f2eb8c5c5 ("net/mlx5: fix Memory Region registration")

Although the granularity of chunks in a mempool is a cacheline, addresses
are extended to align to a page boundary, for performance reasons in the
device, when registering an MR (Memory Region). This could make some
regions overlap, which can then cause a Tx completion error due to an
incorrect LKEY search. If the error occurs, the Tx queue will get stuck.
This is because the buffer address is compared against the aligned
addresses of the Memory Region. Saving the original addresses of the
mempool for comparison doesn't create any overlap.

Fixes: b0b093845793 ("net/mlx5: use buffer address for LKEY search")
Fixes: 3a6f2eb8c5c5 ("net/mlx5: fix Memory Region registration")
Cc: stable@dpdk.org
Reported-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
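
The overlap problem can be shown with two small regions that share a
page. The sketch is self-contained and hypothetical: struct mr, PAGE_SZ
and both lookup functions are illustrative and do not reproduce the
mlx5 data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SZ 4096u /* assumed page size */

/* Hypothetical memory region: original mempool range [start, end). */
struct mr {
	uintptr_t start;
	uintptr_t end;
	uint32_t lkey;
};

static uintptr_t page_down(uintptr_t a)
{
	return a & ~(uintptr_t)(PAGE_SZ - 1);
}

static uintptr_t page_up(uintptr_t a)
{
	return (a + PAGE_SZ - 1) & ~(uintptr_t)(PAGE_SZ - 1);
}

/* Lookup against page-aligned bounds: regions sharing a page overlap,
 * so the first match may be the wrong region. */
static uint32_t lkey_by_aligned(const struct mr *mrs, size_t n, uintptr_t addr)
{
	size_t i;

	assert(mrs != NULL);
	for (i = 0; i < n; i++)
		if (addr >= page_down(mrs[i].start) && addr < page_up(mrs[i].end))
			return mrs[i].lkey;
	return UINT32_MAX;
}

/* Lookup against the original bounds: ranges never overlap. */
static uint32_t lkey_by_original(const struct mr *mrs, size_t n, uintptr_t addr)
{
	size_t i;

	assert(mrs != NULL);
	for (i = 0; i < n; i++)
		if (addr >= mrs[i].start && addr < mrs[i].end)
			return mrs[i].lkey;
	return UINT32_MAX;
}
```

With regions [0x10000, 0x10800) and [0x10800, 0x11000), a buffer at
0x10900 belongs to the second region, yet the aligned lookup matches the
first one because both expand to the same page.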
6 years agonet/i40e: ignore case of packet type
Beilei Xing [Fri, 19 Jan 2018 07:50:06 +0000 (15:50 +0800)]
net/i40e: ignore case of packet type

Replace strncmp with strncasecmp in i40e_update_customized_ptype
function for compatibility.

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
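
The effect of switching from strncmp to strncasecmp can be seen in a
short sketch. ptype_is() and the sample strings are hypothetical;
strncasecmp() itself is the POSIX function declared in <strings.h>.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <strings.h> /* strncasecmp (POSIX) */

/* Return true when 'name' starts with 'prefix', ignoring case. */
static bool ptype_is(const char *name, const char *prefix)
{
	assert(name != NULL && prefix != NULL);
	return strncasecmp(name, prefix, strlen(prefix)) == 0;
}
```

A profile that spells a packet type as "ipv4frag" now matches the same
entry as one that spells it "IPV4FRAG", which strncmp would treat as
different.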
6 years agonet/i40e: add parser for IPv4/v6 frag
Beilei Xing [Fri, 19 Jan 2018 07:50:05 +0000 (15:50 +0800)]
net/i40e: add parser for IPv4/v6 frag

There are new metadata, IPV4FRAG and IPV6FRAG, in the PPP
profile. This patch improves the ptype parser to support
IPV4FRAG and IPV6FRAG.

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
6 years agonet/i40e: fix fail to update packet type table
Beilei Xing [Fri, 19 Jan 2018 07:50:04 +0000 (15:50 +0800)]
net/i40e: fix fail to update packet type table

The SW ptype mapping table fails to be updated when loading
a PPP profile, even though the profile is loaded successfully.
This causes failures to parse the SW ptype when receiving
packets. This patch fixes this issue.

Fixes: 11556c915a08 ("net/i40e: improve packet type parser")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>