dpdk.git
6 years agopmdinfogen: allow using stdin and stdout
Bruce Richardson [Thu, 25 Jan 2018 11:12:25 +0000 (11:12 +0000)]
pmdinfogen: allow using stdin and stdout

Rather than having to work off files all the time, allow stdin and stdout
to be used as the source and destination for pmdinfogen. This will allow
other possible usages from scripts, e.g. taking files from ar archive and
building a single .pmd.c file from all the .o files in it.

for f in `ar t librte_pmd_xyz.a` ; do
ar p librte_pmd_xyz.a $f | pmdinfogen - - >> xyz_info.c
done

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
6 years agoapp/procinfo: call EAL cleanup before exit
Harry van Haaren [Mon, 29 Jan 2018 16:37:32 +0000 (16:37 +0000)]
app/procinfo: call EAL cleanup before exit

This patch adds a call to the newly introduced cleanup()
function just before quitting the app.

Adding this function call before quitting from a secondary processes
is important, as otherwise it will leak hugepage memory. For a secondary
process that is run multiple times, this could cause hugepage memory
to become depleted and stop a secondary process from starting.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Vipin Varghese <vipin.varghese@intel.com>
6 years agoapp/pdump: call EAL cleanup before exit
Harry van Haaren [Mon, 29 Jan 2018 16:37:31 +0000 (16:37 +0000)]
app/pdump: call EAL cleanup before exit

This patch adds a call to the newly introduced cleanup()
function just before quitting the pdump app.

Adding this function call before quitting from a secondary processes
is important, as otherwise it will leak hugepage memory. For a secondary
process that is run multiple times, this could cause hugepage memory
to become depleted and stop a secondary process from starting.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Vipin Varghese <vipin.varghese@intel.com>
6 years agoeal: add function to release internal resources
Harry van Haaren [Mon, 29 Jan 2018 16:37:30 +0000 (16:37 +0000)]
eal: add function to release internal resources

This commit adds a new function rte_eal_cleanup().
The function serves as a hook to allow DPDK to release
internal resources (e.g.: hugepage allocations).

This function allows DPDK to become more like an ordinary
library, where the library context itself can be initialized
and cleaned up by the application.

The rte_exit() and rte_panic() functions must be considered,
particularly if they should call rte_eal_cleanup() to release any
resources or not. This patch adds the cleanup to rte_exit(),
but does not clean up on rte_panic(). The reason to not clean
up on panicing is that the developer may wish to inspect the
exact internal state of EAL and hugepages.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Vipin Varghese <vipin.varghese@intel.com>
6 years agoservice: restrict finalize to internal usage
Harry van Haaren [Mon, 29 Jan 2018 16:37:29 +0000 (16:37 +0000)]
service: restrict finalize to internal usage

This commit moves the rte_service_finalize() function
to be in the component header, and marks it as @internal.
The function is only called internally by rte_eal_finalize().

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Vipin Varghese <vipin.varghese@intel.com>
6 years agobus/fslmc: register platform HW mempool on runtime
Hemant Agrawal [Mon, 29 Jan 2018 08:10:49 +0000 (13:40 +0530)]
bus/fslmc: register platform HW mempool on runtime

Detect if the DPAA2 mempool objects are present and register
it as platform default hw mempool

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
6 years agobus/dpaa: register platform HW mempool on runtime
Hemant Agrawal [Mon, 29 Jan 2018 08:10:48 +0000 (13:40 +0530)]
bus/dpaa: register platform HW mempool on runtime

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
6 years agoapp/testpmd: add log for preferred mempool ops
Pavan Nikhilesh [Mon, 29 Jan 2018 08:10:47 +0000 (13:40 +0530)]
app/testpmd: add log for preferred mempool ops

This patch adds the debug message to print the best selected
pktmbuf mempool ops name.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
6 years agombuf: add pool create helper for specific mempool ops
Hemant Agrawal [Mon, 29 Jan 2018 08:10:46 +0000 (13:40 +0530)]
mbuf: add pool create helper for specific mempool ops

Introduce a new helper for pktmbuf pool, which will allow
the application to optionally specify the mempool ops name
as well.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
6 years agombuf: add pool ops selection functions
Hemant Agrawal [Mon, 29 Jan 2018 08:10:45 +0000 (13:40 +0530)]
mbuf: add pool ops selection functions

This patch add support for various mempool ops config helper APIs.

1.User defined mempool ops
2.Platform detected HW mempool ops (active).
3.Best selection of mempool ops by looking into user defined,
  platform registered and compile time configured.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
6 years agombuf: maintain user and compile time mempool ops name
Hemant Agrawal [Mon, 29 Jan 2018 08:10:44 +0000 (13:40 +0530)]
mbuf: maintain user and compile time mempool ops name

At present the userdefined mempool ops name overwrites
the default mempool ops name variable in internal_config.

This patch change the logic to maintain the value of
user defined only in the internal config.

The pktmbuf_create_pool is updated to reflect the same ie.
use user defined. If not present than use the default.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
6 years agoeal: prefix mbuf pool ops name with user defined
Hemant Agrawal [Mon, 29 Jan 2018 08:10:43 +0000 (13:40 +0530)]
eal: prefix mbuf pool ops name with user defined

This patch prefix the mbuf pool ops name with "user" to indicate
that it is user defined.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
6 years agombuf: fix VLAN flags documentation
Olivier Matz [Mon, 29 Jan 2018 09:36:06 +0000 (10:36 +0100)]
mbuf: fix VLAN flags documentation

Fix inconsistency between mbuf structure documentation and flags
documentation.

Fixes: 380a7aab1ae2 ("mbuf: rename deprecated VLAN flags")
Cc: stable@dpdk.org
Reported-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
6 years agombuf: rename Tx VLAN flags
Olivier Matz [Mon, 29 Jan 2018 09:37:07 +0000 (10:37 +0100)]
mbuf: rename Tx VLAN flags

For consistency with the Rx flags, the flags PKT_TX_VLAN_PKT and
PKT_TX_QINQ_PKT are respectively renamed as PKT_TX_VLAN and
PKT_TX_QINQ. The old defines are deprecated but will stay for some time
for compatibility.

Reported-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
6 years agombuf: fix NULL freeing when debug enabled
Olivier Matz [Mon, 29 Jan 2018 09:39:23 +0000 (10:39 +0100)]
mbuf: fix NULL freeing when debug enabled

Do not panic when calling rte_pktmbuf_free(NULL) with mbuf debug
enabled, it is a valid operation.

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Reported-by: Keith Wiles <keith.wiles@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
6 years agoeal/x86: use lock-prefixed instructions for SMP barrier
Konstantin Ananyev [Mon, 15 Jan 2018 15:09:31 +0000 (15:09 +0000)]
eal/x86: use lock-prefixed instructions for SMP barrier

On x86 it is possible to use lock-prefixed instructions to get
the similar effect as mfence.
As pointed by Java guys, on most modern HW that gives a better
performance than using mfence:
https://shipilev.net/blog/2014/on-the-fence-with-dependencies/
That patch adopts that technique for rte_smp_mb() implementation.
On BDW 2.2 mb_autotest on single lcore reports 2X cycle reduction,
i.e. from ~110 to ~55 cycles per operation.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
6 years agotest: introduce memory barrier test case
Konstantin Ananyev [Mon, 15 Jan 2018 15:04:39 +0000 (15:04 +0000)]
test: introduce memory barrier test case

Simple functional test for rte_smp_mb() implementations.
Also when executed on a single lcore could be used as rough
estimation how many cycles particular implementation of rte_smp_mb()
might take.

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
6 years agoconfig/thunderx: disable C11 memory model ring
Jerin Jacob [Sun, 3 Dec 2017 12:37:30 +0000 (18:07 +0530)]
config/thunderx: disable C11 memory model ring

On thunderx and octeontx, ring_perf_autotest and
ring_pmd_perf_autotest test shows better performance
when disabling CONFIG_RTE_RING_USE_C11_MEM_MODEL.
On the other hand, Enabling CONFIG_RTE_RING_USE_C11_MEM_MODEL
shows better performance on thunderx2.
Since thunderx2 is using the default armv8 config,
no particular change is required.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
6 years agoring: introduce C11 memory model barrier option
Jia He [Mon, 22 Jan 2018 04:41:28 +0000 (20:41 -0800)]
ring: introduce C11 memory model barrier option

This patch is to support C11 memory model barrier in librte_ring.

There are 2 barrier implementation options in librte_ring (suggested
by Jerin).
1. use rte_smp_rmb
2. use load_acquire/store_release(refer to [1]).
The reason why providing 2 options is the performance benchmark
difference in different arm machines, refer to [2].

CONFIG_RTE_RING_USE_C11_MEM_MODEL is provided, and by default it is "n"
on any architectures and only "y" on arm64 so far.

[1] https://github.com/freebsd/freebsd/blob/master/sys/sys/buf_ring.h#L170
[2] http://dpdk.org/ml/archives/dev/2017-October/080861.html

Suggested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jianbo Liu <jianbo.liu@arm.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
6 years agoring: move code in a new header file
Jia He [Mon, 22 Jan 2018 04:41:27 +0000 (20:41 -0800)]
ring: move code in a new header file

Move the common part of rte_ring.h into rte_ring_generic.h.
Move the memory barrier part into update_tail().

No functional changes here.

Suggested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
6 years agoeal/arm64: remove the braces in memory barrier macros
Jia He [Mon, 22 Jan 2018 04:41:26 +0000 (20:41 -0800)]
eal/arm64: remove the braces in memory barrier macros

for the code as follows:
if (condition)
rte_smp_rmb();
else
rte_smp_wmb();
Without this patch, compiler will report this error:
error: 'else' without a previous 'if'

Fixes: 84733fd0d75e ("eal/arm64: fix memory barrier definition")
Cc: stable@dpdk.org
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
6 years agonet/mlx5: fix synchronization on polling Rx completions
Yongseok Koh [Thu, 25 Jan 2018 21:02:50 +0000 (13:02 -0800)]
net/mlx5: fix synchronization on polling Rx completions

Polling a new packet is basically sensing the generation bit in a
completion entry. For some processors not having strongly-ordered memory
model, there has to be a memory barrier between reading the generation bit
and other fields of the entry in order to guarantee data is not stale.

Fixes: 570acdb1da8a ("net/mlx5: add vectorized Rx/Tx burst for ARM")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years agonet/mlx5: replace I/O memory barrier with coherent version
Yongseok Koh [Thu, 25 Jan 2018 21:02:49 +0000 (13:02 -0800)]
net/mlx5: replace I/O memory barrier with coherent version

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agonet/mlx5: remove unnecessary memory barrier
Yongseok Koh [Thu, 25 Jan 2018 21:02:48 +0000 (13:02 -0800)]
net/mlx5: remove unnecessary memory barrier

As rte_write64() has an IO barrier, there's no need to have a barrier
before the call.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agoeal/arm64: define coherent I/O memory barriers
Yongseok Koh [Thu, 25 Jan 2018 21:02:47 +0000 (13:02 -0800)]
eal/arm64: define coherent I/O memory barriers

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Thomas Speier <tspeier@qti.qualcomm.com>
Acked-by: Jianbo Liu <jianbo.liu@arm.com>
6 years agoeal/arm32: define coherent I/O memory barriers
Yongseok Koh [Thu, 25 Jan 2018 21:02:46 +0000 (13:02 -0800)]
eal/arm32: define coherent I/O memory barriers

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Jianbo Liu <jianbo.liu@arm.com>
6 years agoeal/ppc64: define coherent I/O memory barriers
Yongseok Koh [Thu, 25 Jan 2018 21:02:45 +0000 (13:02 -0800)]
eal/ppc64: define coherent I/O memory barriers

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agoeal/x86: define coherent I/O memory barriers
Yongseok Koh [Thu, 25 Jan 2018 21:02:44 +0000 (13:02 -0800)]
eal/x86: define coherent I/O memory barriers

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agoeal: introduce coherent I/O memory barriers
Yongseok Koh [Thu, 25 Jan 2018 21:02:43 +0000 (13:02 -0800)]
eal: introduce coherent I/O memory barriers

This commit introduces rte_cio_wmb() and rte_cio_rmb(), in order to
guarantee the ordering of coherent shared memory between the CPU and a DMA
capable device.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years agoeal: group memory barriers by type in doxygen
Yongseok Koh [Thu, 25 Jan 2018 21:02:42 +0000 (13:02 -0800)]
eal: group memory barriers by type in doxygen

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agotest: add reciprocal based division
Pavan Nikhilesh [Fri, 26 Jan 2018 05:04:51 +0000 (10:34 +0530)]
test: add reciprocal based division

This commit provides a set of tests for verifying the correctness and
performance of both unsigned 32 and 64bit reciprocal based division.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
6 years agoeal: add u64-bit variant for reciprocal divide
Pavan Nikhilesh [Fri, 26 Jan 2018 05:04:50 +0000 (10:34 +0530)]
eal: add u64-bit variant for reciprocal divide

Currently, rte_reciprocal only supports unsigned 32bit divisors. This
commit adds support for unsigned 64bit divisors.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
6 years agoeal: introduce integer divide through reciprocal
Pavan Nikhilesh [Fri, 26 Jan 2018 05:04:49 +0000 (10:34 +0530)]
eal: introduce integer divide through reciprocal

In some use cases of integer division, denominator remains constant and
numerator varies. It is possible to optimize division for such specific
scenarios.

The librte_sched uses rte_reciprocal to optimize division so, moving it to
eal/common would allow other libraries and applications to use it.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
6 years agoservice: fix memory leak with new function
Vipin Varghese [Fri, 26 Jan 2018 20:55:36 +0000 (02:25 +0530)]
service: fix memory leak with new function

The rte_service_finalize routine checks if service is initialized
or not. If yes; releases internal memory for services and lcore
states are freed. This routine is to be invoked at end of application
termination.

Fixes: 21698354c832 ("service: introduce service cores concept")
Cc: stable@dpdk.org
Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
6 years agolog: fix memory leak in regexp level set
Ivan Malov [Sun, 21 Jan 2018 17:05:10 +0000 (17:05 +0000)]
log: fix memory leak in regexp level set

Fixes: a5279180f510 ("eal: change several log levels matching a regexp")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
6 years agoflow_classify: fix memory leak in rule add
Jasvinder Singh [Mon, 22 Jan 2018 14:14:28 +0000 (14:14 +0000)]
flow_classify: fix memory leak in rule add

Free allocated memory of the rule if not added to the table.

Coverity issue: 257032
Fixes: 50bdac5916d9 ("flow_classify: remove table id parameter from API")

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agokeepalive: fix state alignment
Andriy Berestovskyy [Tue, 23 Jan 2018 15:43:16 +0000 (16:43 +0100)]
keepalive: fix state alignment

The __rte_cache_aligned was applied to the whole array,
not the array elements. This leads to a false sharing between
the monitored cores.

Fixes: e70a61ad50ab ("keepalive: export states")
Cc: stable@dpdk.org
Signed-off-by: Andriy Berestovskyy <aber@semihalf.com>
Acked-by: Remy Horton <remy.horton@intel.com>
6 years agocmdline: avoid garbage in unused fields of parsed result
Xueming Li [Sat, 20 Jan 2018 03:26:31 +0000 (11:26 +0800)]
cmdline: avoid garbage in unused fields of parsed result

The result buffer was not initialized before parsing, inducing garbage
in unused fields or padding of the parsed structure.

Initialize the result buffer each time before parsing.

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
6 years agocmdline: fix dynamic tokens parsing
Xueming Li [Fri, 19 Jan 2018 18:16:10 +0000 (02:16 +0800)]
cmdline: fix dynamic tokens parsing

When using dynamic tokens, the result buffer contains pointers to some
location inside the result buffer. When the content of the temporary
buffer is copied in the final one, these pointers still point to the
temporary buffer.

This works until the temporary buffer is kept intact, but the next
commit introduces a memset() that breaks this assumption.

This commit keeps the successfully parsed buffers, and ensures that the
pointers point to the valid location, by using temp buffer for following
parsing.

Fixes: 9b3fbb051d2e ("cmdline: fix parsing")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
6 years agoservice: fix possible mem leak on initialize
Harry van Haaren [Wed, 24 Jan 2018 17:02:47 +0000 (17:02 +0000)]
service: fix possible mem leak on initialize

This commit ensures that if that if we run out of memory
during the initialization of the service library, that the
first allocated memory is correctly freed instead of leaked.

Fixes: 21698354c832 ("service: introduce service cores concept")
Cc: stable@dpdk.org
Reported-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
6 years agombuf: remove void pointer cast
Zhiyong Yang [Fri, 19 Jan 2018 10:18:13 +0000 (18:18 +0800)]
mbuf: remove void pointer cast

It is unnecessary to cast from void * to struct rte_mbuf *,
the change can make code clearer.

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
6 years agodoc: fix build of bbdev test guide
Marko Kovacevic [Wed, 24 Jan 2018 15:07:45 +0000 (15:07 +0000)]
doc: fix build of bbdev test guide

Fix build issue with pdf guides. Some indentations in the bbdev test
application doc were causing build failures. Latex Log message:
 
    doc.log:! LaTeX Error: Too deeply nested.
   
Fixes: f714a18885a6 ("app/testbbdev: add test application for bbdev")

Signed-off-by: Marko Kovacevic <marko.kovacevic@intel.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
6 years agoevent/opdl: fix icc build
Zhiyong Yang [Thu, 25 Jan 2018 07:03:50 +0000 (15:03 +0800)]
event/opdl: fix icc build

ICC reports the issue at compile time as follows.
error #592: variable "i" is used before its value is set
        RTE_SET_USED(i);

The patch is to fix it. GCC and CLANG has been tested as well.

Fixes: d548ef513cd7 ("event/opdl: add unit tests")

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Liang Ma <liang.j.ma@intel.com>
6 years agonet/vdev_netvsc: fix build without C11 and pedantic
Ophir Munk [Wed, 24 Jan 2018 14:12:13 +0000 (14:12 +0000)]
net/vdev_netvsc: fix build without C11 and pedantic

Remove CFLAGS -std=c11 and -pedantic in order to guarantee
a successful vdev_netvsc compilation on old Linux distributions.
Otherwise old GCC compilers may complain as follows:
cc1: error: unrecognized command line option -std=c11

Fixes: 6086ab3bb3d2 ("net/vdev_netvsc: introduce Hyper-V platform driver")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
6 years agonet/tap: use local eBPF definitions
Ophir Munk [Tue, 23 Jan 2018 21:54:09 +0000 (21:54 +0000)]
net/tap: use local eBPF definitions

eBPF has a graceful approach: it must successfully compile on all Linux
distributions. If a specific kernel cannot support eBPF it will gracefully
refuse the eBPF netlink message sent to it.
The kernel header file linux/bpf.h (if present) on different Linux
distributions may not include all definitions required for TAP
compilation.
In order to guarantee a successful eBPF compilation everywhere all the
required definitions for TAP have been locally added instead of including
file <linux/bpf.h>

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
6 years agodrivers/event: fix resource leak in selftest
Pavan Nikhilesh [Mon, 22 Jan 2018 17:46:01 +0000 (23:16 +0530)]
drivers/event: fix resource leak in selftest

Free resources leak in eventdev selftests.

Coverity issue: 257044
Coverity issue: 257047
Coverity issue: 257009
Fixes: 9ef576176db0 ("test/eventdev: add octeontx multi queue and multi port")
Fixes: 3a17ff401f1e ("test/eventdev: add basic SW tests")
Fixes: 5e6eb5ccd788 ("event/sw: make test standalone")

Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
6 years agoevent/opdl: rework loops to comply with dpdk style
Harry van Haaren [Mon, 22 Jan 2018 10:04:10 +0000 (10:04 +0000)]
event/opdl: rework loops to comply with dpdk style

This commit reworks the loop counter variable declarations
to be in line with the DPDK source code.

Fixes: 3c7f3dcfb099 ("event/opdl: add PMD main body and helper function")
Fixes: 8ca8e3b48eff ("event/opdl: add event queue config get/set")
Fixes: d548ef513cd7 ("event/opdl: add unit tests")

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Liang Ma <liang.j.ma@intel.com>
6 years agoevent/sw: fix debug logging config option
Jerin Jacob [Fri, 19 Jan 2018 06:38:29 +0000 (12:08 +0530)]
event/sw: fix debug logging config option

align the config option name with config/common_base

Fixes: aaa4a221da26 ("event/sw: add new software-only eventdev driver")
Cc: stable@dpdk.org
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
6 years agoversion: 18.02-rc1
Thomas Monjalon [Mon, 22 Jan 2018 00:50:25 +0000 (01:50 +0100)]
version: 18.02-rc1

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
6 years agoconfig: sort PMD config options
Ferruh Yigit [Sat, 20 Jan 2018 16:50:53 +0000 (16:50 +0000)]
config: sort PMD config options

No config option changed, added or removed.
Only reshuffle PMD config options mostly to help new PMDs where to put
their new config option.

Ordered as physical, paravirtual and virtual groups. Alphabetical order
within a group.

Also tried to group vendor devices together which breaks alphabetical
order in some places.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agoethdev: rename function parameter for consistency
Ferruh Yigit [Mon, 22 Jan 2018 00:16:25 +0000 (00:16 +0000)]
ethdev: rename function parameter for consistency

Update "port" function argument variable to "port_id" in public
header to be consistent in all APIs.

No functional change.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agoethdev: reorder inline functions
Ferruh Yigit [Mon, 22 Jan 2018 00:16:24 +0000 (00:16 +0000)]
ethdev: reorder inline functions

Move all inline function to the end of the ethdev.h header file and move
the ethdev_core.h just before inline functions.

Since inline functions need data structures in ethdev_core.h, this
reorder is to group them and make it clear where put further inline
functions.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agoethdev: separate internal structures into own header
Ferruh Yigit [Mon, 22 Jan 2018 00:16:23 +0000 (00:16 +0000)]
ethdev: separate internal structures into own header

rte_ethdev_core.h created. Internal data structures are moved here.

These structures are mostly intended to be used by drivers, but they
need to be in the public header file because of the inline functions
in the ethdev.h header, and those inline functions are preferred to
kept because of the performance concerns.

The accessibility of the data structures are not changed, only logically
grouped to show that they are not intended to be used by applications.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agoethdev: separate driver APIs
Ferruh Yigit [Mon, 22 Jan 2018 00:16:22 +0000 (00:16 +0000)]
ethdev: separate driver APIs

Create a rte_ethdev_driver.h file and move PMD specific APIs here.
Drivers updated to include this new header file.

There is no update in header content and since ethdev.h included by
ethdev_driver.h, nothing changed from driver point of view, only
logically grouping of APIs. From applications point of view they can't
access to driver specific APIs anymore and they shouldn't.

More PMD specific data structures still remain in ethdev.h because of
inline functions in header use them. Those will be handled separately.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agonet/failsafe: fix removed device handling
Matan Azrad [Sat, 20 Jan 2018 21:12:24 +0000 (21:12 +0000)]
net/failsafe: fix removed device handling

There is time between the physical removal of the device until
sub-device PMDs get a RMV interrupt. At this time DPDK PMDs and
applications still don't know about the removal and may call sub-device
control operation which should return an error.

In previous code this error is reported to the application contrary to
fail-safe principle that the app should not be aware of device removal.

Add an removal check in each relevant control command error flow and
prevent an error report to application when the sub-device is removed.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
6 years agoethdev: adjust removal error report in flow API
Matan Azrad [Sat, 20 Jan 2018 21:12:23 +0000 (21:12 +0000)]
ethdev: adjust removal error report in flow API

rte_eth_dev_is_removed API was added to detect a device removal
synchronously.

When a device removal occurs during flow command execution, many
different errors can be reported to the user.

Adjust all flow APIs error reports to return -EIO in case of device
removal using rte_eth_dev_is_removed API.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agoethdev: adjust removal error report
Matan Azrad [Sat, 20 Jan 2018 21:12:22 +0000 (21:12 +0000)]
ethdev: adjust removal error report

rte_eth_dev_is_removed API was added to detect a device removal
synchronously.

When a device removal occurs during control command execution, many
different errors can be reported to the user.

Adjust all ethdev APIs error reports to return -EIO in case of device
removal using rte_eth_dev_is_removed API.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agonet/mlx5: support a device removal check operation
Matan Azrad [Sat, 20 Jan 2018 21:12:21 +0000 (21:12 +0000)]
net/mlx5: support a device removal check operation

Add support to get removal status of mlx5 device.
It is not supported in secondary process.

Signed-off-by: Matan Azrad <matan@mellanox.com>
6 years agonet/mlx4: support a device removal check operation
Matan Azrad [Sat, 20 Jan 2018 21:12:20 +0000 (21:12 +0000)]
net/mlx4: support a device removal check operation

Add support to get removal status of mlx4 device.

Signed-off-by: Matan Azrad <matan@mellanox.com>
6 years agoethdev: add devop to check removal status
Matan Azrad [Sat, 20 Jan 2018 21:12:19 +0000 (21:12 +0000)]
ethdev: add devop to check removal status

There is time between the physical removal of the device until PMDs get
a RMV interrupt. At this time DPDK PMDs and applications still don't
know about the removal.

Current removal detection is achieved only by registration to device RMV
event and the notification comes asynchronously. So, there is no option
to detect a device removal synchronously.
Applications and other DPDK entities may want to check a device removal
synchronously and to take an immediate decision accordingly.

Add new dev op called is_removed to allow DPDK entities to check an
Ethernet device removal status immediately.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agodoc: add RSS in tap guide
Ophir Munk [Sat, 20 Jan 2018 21:11:37 +0000 (21:11 +0000)]
doc: add RSS in tap guide

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
6 years agonet/tap: implement RSS using eBPF
Ophir Munk [Sat, 20 Jan 2018 21:11:36 +0000 (21:11 +0000)]
net/tap: implement RSS using eBPF

TAP PMD is required to support RSS queue mapping based on rte_flow API. An
example usage for this requirement is failsafe transparent switching from a
PCI device to TAP device while keep redirecting packets to the same RSS
queues on both devices.

TAP RSS implementation is based on eBPF programs sent to Linux kernel
through BPF system calls and using netlink messages to reference the
programs as part of traffic control commands.

TC uses eBPF programs as classifiers and actions.
eBPF classification: packets marked with an RSS queue will be directed
to this queue using TC with "skbedit" action.
BPF classifiers are downloaded to the kernel once on TAP creation for
each TAP Rx queue.

eBPF action: calculate the Toeplitz RSS hash based on L3 addresses and
L4 ports. Mark the packet with the RSS queue according the resulting
RSS hash, then reclassify the packet.
BPF actions are downloaded to the kernel for each new RSS rule.

TAP eBPF requires Linux version 4.9 configured with BPF. TAP PMD will
successfully compile on systems with old or non-BPF configured kernels but
RSS rules creation on TAP devices will not be successful

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
6 years agonet/tap: add eBPF API
Ophir Munk [Sat, 20 Jan 2018 21:11:35 +0000 (21:11 +0000)]
net/tap: add eBPF API

This commit include BPF API to be used by TAP.

tap_flow_bpf_cls_q() - download to kernel BPF program that classifies
packets to their matching queues
tap_flow_bpf_calc_l3_l4_hash() - download to kernel BPF program that
calculates per packet layer 3 and layer 4 RSS hash
tap_flow_bpf_rss_map_create() - create BPF RSS map for storing RSS
parameters per RSS rule
tap_flow_bpf_update_rss_elem() - update BPF map entry with RSS rule
parameters

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
6 years agonet/tap: add eBPF bytes code
Ophir Munk [Sat, 20 Jan 2018 21:11:34 +0000 (21:11 +0000)]
net/tap: add eBPF bytes code

File tap_bpf_insns.h was added. It includes  eBPF bytes code
which corresponds to source file tap_bpf_program.c
(see "net/tap: add eBPF program file").
The bytes code is in the format of C arrays of struct bpf_insn and
was generated from the C file tap_bpf_program.c
1. The C file was compiled via LLVM into an object file in ELF
format as:
   clang -O2 -emit-llvm -c tap_bpf_program.c -o - | llc -march=bpf \
   -filetype=obj -o <tap_bpf_program.o>

clang version must be 3.7 and above
The C functions are under different ELF sections and are considered
different BPF programs to be downloaded to the kernel

2. Using an external tool the ELF sections are parsed and the C arrays
of struct bpf_insn are generated. Each C array (corresponding to a
different function under an ELF section) is downloaded to the kernel
using an BPF systm call. The external tool that generates the C arrays
will be added in separate commits.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
6 years agonet/tap: add eBPF program file
Ophir Munk [Sat, 20 Jan 2018 21:11:33 +0000 (21:11 +0000)]
net/tap: add eBPF program file

File tap_bpf_program.c was added with two ELF sections
corresponding to two BPF programs and one BPF map.

Section cls_q - BPF classifier to classify packets to their
corresponding queue after an RSS hash was calculated on the packet
and saved in skb->cb[1]
Section l3_l4 - BPF action to calculate RSS hash on packet
layers 3 and 4
This file is not part of DPDK tree compilation.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
6 years agonet/tap: support actions for different classifiers
Ophir Munk [Sat, 20 Jan 2018 21:11:32 +0000 (21:11 +0000)]
net/tap: support actions for different classifiers

Add a generic TC actions handling for TC actions: "mirred",
"gact", "skbedit". This will be useful when introducing
BPF actions, as it uses TCA_BPF_ACT instead of TCA_FLOWER_ACT

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
6 years agonet/mlx5: fix memory region lookup
Yongseok Koh [Fri, 19 Jan 2018 07:52:55 +0000 (23:52 -0800)]
net/mlx5: fix memory region lookup

This patch reverts:
commit 3a6f2eb8c5c5 ("net/mlx5: fix Memory Region registration")

Although granularity of chunks in a mempool is a cacheline, addresses are
extended to align to page boundary for performance reason in device when
registering a MR (Memory Region). This could make some regions overlap,
then can cause Tx completion error due to incorrect LKEY search. If the
error occurs, the Tx queue will get stuck. It is because buffer address is
compared against aligned addresses for Memory Region. Saving original
addresses of mempool for comparison doesn't create any overlap.

Fixes: b0b093845793 ("net/mlx5: use buffer address for LKEY search")
Fixes: 3a6f2eb8c5c5 ("net/mlx5: fix Memory Region registration")
Cc: stable@dpdk.org
Reported-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years agonet/i40e: ignore case of packet type
Beilei Xing [Fri, 19 Jan 2018 07:50:06 +0000 (15:50 +0800)]
net/i40e: ignore case of packet type

Replace strncmp with strncasecmp in i40e_update_customized_ptype
function for compatibility.

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
6 years agonet/i40e: add parser for IPv4/v6 frag
Beilei Xing [Fri, 19 Jan 2018 07:50:05 +0000 (15:50 +0800)]
net/i40e: add parser for IPv4/v6 frag

There're new metadata IPV4FRAG and IPV6FRAG in PPP
profile, this patch improves ptype parser to support
IPV4FRAG and IPV6FRAG.

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
6 years agonet/i40e: fix fail to update packet type table
Beilei Xing [Fri, 19 Jan 2018 07:50:04 +0000 (15:50 +0800)]
net/i40e: fix fail to update packet type table

Fail to update SW ptype mapping table when loading
PPP profile, though profile can be loaded successfully.
It will cause fail to parse SW ptype during receiving
packets. This patch fixes this issue.

Fixes: 11556c915a08 ("net/i40e: improve packet type parser")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
6 years agonet/i40e: fix flow director Rx resource defect
Beilei Xing [Fri, 19 Jan 2018 05:23:44 +0000 (13:23 +0800)]
net/i40e: fix flow director Rx resource defect

FDIR Rx ring isn't initialized and Rx queue HW tail isn't updated
when there's error detected during programming FDIR flow. There'll
be some potential risk.
This patch updates FDIR Rx resource.

Fixes: a778a1fa2e4e ("i40e: set up and initialize flow director")
Fixes: 05999aab4ca6 ("i40e: add or delete flow director")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
6 years agonet/vdev_netvsc: add automatic probing
Matan Azrad [Thu, 18 Jan 2018 13:51:46 +0000 (13:51 +0000)]
net/vdev_netvsc: add automatic probing

Using DPDK in Hyper-V VM systems requires vdev_netvsc driver to pair
the NetVSC netdev device with the same MAC address PCI device by
fail-safe PMD.

Add vdev_netvsc custom scan in vdev bus to allow automatic probing in
Hyper-V VM systems unless it was already specified by command line.

Add "ignore" parameter to disable this auto-detection.

Signed-off-by: Matan Azrad <matan@mellanox.com>
6 years agonet/vdev_netvsc: add force parameter
Matan Azrad [Thu, 18 Jan 2018 13:51:45 +0000 (13:51 +0000)]
net/vdev_netvsc: add force parameter

This parameter allows specifying any non-NetVSC interface or routed
NetVSC interfaces to use with tap sub-devices for development purposes.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
6 years agonet/vdev_netvsc: skip routed netvsc probing
Matan Azrad [Thu, 18 Jan 2018 13:51:44 +0000 (13:51 +0000)]
net/vdev_netvsc: skip routed netvsc probing

NetVSC netdevices which are already routed should not be probed because
they are used for management purposes by the HyperV.

prevent routed netvsc devices probing.

Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
6 years agonet/vdev_netvsc: implement core functionality
Matan Azrad [Thu, 18 Jan 2018 13:51:43 +0000 (13:51 +0000)]
net/vdev_netvsc: implement core functionality

As described in more details in the attached documentation (see patch
contents), this virtual device driver manages NetVSC interfaces in virtual
machines hosted by Hyper-V/Azure platforms.

This driver does not manage traffic nor Ethernet devices directly; it acts
as a thin configuration layer that automatically instantiates and controls
fail-safe PMD instances combining tap and PCI sub-devices, so that each
NetVSC interface is exposed as a single consolidated port to DPDK
applications.

PCI sub-devices being hot-pluggable (e.g. during VM migration),
applications automatically benefit from increased throughput when present
and automatic fallback on NetVSC otherwise without interruption thanks to
fail-safe's hot-plug handling.

Once initialized, the sole job of the vdev_netvsc driver is to regularly
scan for PCI devices to associate with NetVSC interfaces and feed their
addresses to corresponding fail-safe instances.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
6 years agonet/vdev_netvsc: introduce Hyper-V platform driver
Matan Azrad [Thu, 18 Jan 2018 13:51:42 +0000 (13:51 +0000)]
net/vdev_netvsc: introduce Hyper-V platform driver

This patch lays the groundwork for this driver (draft documentation,
copyright notices, code base skeleton and build system hooks). While it can
be successfully compiled and invoked, it's an empty shell at this stage.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
6 years agonet/failsafe: add probed device capture
Matan Azrad [Thu, 18 Jan 2018 13:51:41 +0000 (13:51 +0000)]
net/failsafe: add probed device capture

Previous fail-safe code didn't support probed sub-devices capture and
failed when it tried to probe them.

Skip fail-safe sub-device probing when it already was probed.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
6 years agonet/failsafe: add fd parameter
Matan Azrad [Thu, 18 Jan 2018 13:51:40 +0000 (13:51 +0000)]
net/failsafe: add fd parameter

This parameter enables applications to provide device definitions
through an arbitrary file descriptor number.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
6 years agonet/failsafe: fix invalid free
Adrien Mazarguil [Thu, 18 Jan 2018 13:51:39 +0000 (13:51 +0000)]
net/failsafe: fix invalid free

rte_free() is not supposed to work with pointers returned by calloc().

Fixes: a0194d828100 ("net/failsafe: add flexible device definition")
Cc: stable@dpdk.org
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
6 years agoethdev: add security context API documentation
Radu Nicolau [Fri, 19 Jan 2018 11:42:34 +0000 (11:42 +0000)]
ethdev: add security context API documentation

Added missing doxygen for rte_eth_dev_get_sec_ctx
and moved the declaration to the proper place.

Fixes: 4c270218aa26 ("ethdev: support security APIs")

Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agonet/sfc/base: fix unused argument warning
Andy Moreton [Fri, 19 Jan 2018 06:47:06 +0000 (06:47 +0000)]
net/sfc/base: fix unused argument warning

The type_data argument to ef10_rx_qcreate is only used
in builds with EFSYS_OPT_RX_PACKED_STREAM. note this as
an unused argument to avoid warnings in builds without
packed stream support.

Fixes: b749646dade4 ("net/sfc/base: add function to create packed stream RxQ")

Signed-off-by: Andy Moreton <amoreton@solarflare.com>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years agodoc: fix typo in link bonding guide
Zhiyong Yang [Fri, 19 Jan 2018 02:21:11 +0000 (10:21 +0800)]
doc: fix typo in link bonding guide

fix one typo and a grammatical mistake.

Fixes: b0152b1b40fe ("doc: update bonding")
Cc: stable@dpdk.org
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Marko Kovacevic <marko.kovacevic@intel.com>
6 years agonet/dpaa: fix potential memory leak
Yong Wang [Thu, 18 Jan 2018 11:48:56 +0000 (06:48 -0500)]
net/dpaa: fix potential memory leak

There are several func calls to rte_zmalloc() which don't do null
pointer check on the return value. And before return, the memory is not
freed. Fix it by adding null pointer check and rte_free().

Fixes: 37f9b54bd3cf ("net/dpaa: support Tx and Rx queue setup")
Fixes: 62f53995caaf ("net/dpaa: add frame count based tail drop with CGR")
Cc: stable@dpdk.org
Signed-off-by: Yong Wang <wang.yong19@zte.com.cn>
Reviewed-by: Shreyansh Jain <shreyansh.jain@nxp.com>
6 years agonet/avf: fix makefile typo
Wenzhuo Lu [Fri, 19 Jan 2018 08:41:48 +0000 (16:41 +0800)]
net/avf: fix makefile typo

A typo in makefile that makes the RX/TX vector code
not to be compiled.

Fixes: 319c421f3890 ("net/avf: enable SSE Rx Tx")

Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
6 years agonet/mlx5: fix handling link status event
Yongseok Koh [Wed, 17 Jan 2018 17:44:13 +0000 (09:44 -0800)]
net/mlx5: fix handling link status event

Even though link of a port gets down, device still can receive traffic.
That is the reason why mlx5_set_link_up/down() switches rx/tx_pkt_burst().
However, if link gets down by an external command (e.g. ifconfig), it isn't
effective. It is better to change burst functions when link status change
is detected.

Fixes: 62072098b54e ("mlx5: support setting link up or down")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years agovhost: protect active rings from async ring changes
Victor Kaplansky [Wed, 17 Jan 2018 13:49:25 +0000 (15:49 +0200)]
vhost: protect active rings from async ring changes

When performing live migration or memory hot-plugging,
the changes to the device and vrings made by message handler
done independently from vring usage by PMD threads.

This causes for example segfaults during live-migration
with MQ enable, but in general virtually any request
sent by qemu changing the state of device can cause
problems.

These patches fixes all above issues by adding a spinlock
to every vring and requiring message handler to start operation
only after ensuring that all PMD threads related to the device
are out of critical section accessing the vring data.

Each vring has its own lock in order to not create contention
between PMD threads of different vrings and to prevent
performance degradation by scaling queue pair number.

See https://bugzilla.redhat.com/show_bug.cgi?id=1450680

Cc: stable@dpdk.org
Signed-off-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
6 years agovhost: fix mbuf free
Junjie Chen [Wed, 17 Jan 2018 15:45:53 +0000 (10:45 -0500)]
vhost: fix mbuf free

dequeue zero copy change buf_addr and buf_iova of mbuf, and return
to mbuf pool without restore them, it breaks vm memory if others allocate
mbuf from same pool since mbuf reset doesn't reset buf_addr and buf_iova.

Fixes: b0a985d1f340 ("vhost: add dequeue zero copy")
Cc: stable@dpdk.org
Signed-off-by: Junjie Chen <junjie.j.chen@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
6 years agonet/virtio: support guest announce
Xiao Wang [Thu, 18 Jan 2018 02:20:38 +0000 (10:20 +0800)]
net/virtio: support guest announce

When live migration is done, for the backup VM, either the virtio
frontend or the vhost backend needs to send out gratuitous RARP packet
to announce its new network location.

This patch enables VIRTIO_NET_F_GUEST_ANNOUNCE feature to support live
migration scenario where the vhost backend doesn't have the ability to
generate RARP packet.

Brief introduction of the work flow:
1. QEMU finishes live migration, pokes the backup VM with an interrupt.
2. Virtio interrupt handler reads out the interrupt status value, and
   realizes it needs to send out RARP packet to announce its location.
3. Pause device to stop worker thread touching the queues.
4. Inject a RARP packet into a Tx Queue.
5. Ack the interrupt via control queue.
6. Resume device to continue packet processing.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
6 years agonet: fix RARP generation
Xiao Wang [Thu, 18 Jan 2018 02:32:24 +0000 (10:32 +0800)]
net: fix RARP generation

Due to a mistake operation from me, older version (v10) was merged to
master branch. It's the v11 should be applied. However, the master branch
is not rebase-able. Thus, this patch is made, from the diff between v10
and v11.

The diffs are:

- Add check for parameter and tailroom in rte_net_make_rarp_packet
- Allocate mbuf in rte_net_make_rarp_packet

Besides that, a link error is fixed when shared lib is enabled.

Fixes: 45ae05df824c ("net: add a helper for making RARP packet")
Fixes: c3ffdba0e88a ("vhost: use API to make RARP packet")

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Yuanhan Liu <yliu@fridaylinux.org>
6 years agovhost: do deep copy while reallocating queue
Junjie Chen [Mon, 15 Jan 2018 11:32:19 +0000 (06:32 -0500)]
vhost: do deep copy while reallocating queue

When vhost reallocate dev and vq for NUMA enabled case, it doesn't perform
deep copy, which lead to 1) zmbuf list not valid 2) remote memory access.
This patch is to re-initlize the zmbuf list and also do the deep copy.

Signed-off-by: Junjie Chen <junjie.j.chen@intel.com>
Reviewed-by: Zhiyong Yang <zhiyong.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
6 years agonet/thunderx: convert to new offload API
Maciej Czekaj [Thu, 18 Jan 2018 13:06:13 +0000 (14:06 +0100)]
net/thunderx: convert to new offload API

This patch removes all references to old-style offload API
replacing them with new offload flags.

Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
6 years agonet/mrvl: allow adding MAC address before port init
Tomasz Duszynski [Thu, 18 Jan 2018 10:57:37 +0000 (11:57 +0100)]
net/mrvl: allow adding MAC address before port init

Since DPDK restores ether address configuration after device
is started it is safe to add ether address to uninitialized port (ppio).

Fixes: c0511a8f741f ("net/mrvl: check if ppio is initialized")
Cc: stable@dpdk.org
Signed-off-by: Tomasz Duszynski <tdu@semihalf.com>
6 years agonet/mrvl: allow changing MTU before port init
Tomasz Duszynski [Thu, 18 Jan 2018 10:57:36 +0000 (11:57 +0100)]
net/mrvl: allow changing MTU before port init

DPDK updates MTU once mtu_set() callback returns success.
Since PMD changes port's MTU to dev->mtu every time device is
started it is safe to call mtu_set() before MUSDK ppio was initialized.

Fixes: c0511a8f741f ("net/mrvl: check if ppio is initialized")
Cc: stable@dpdk.org
Signed-off-by: Tomasz Duszynski <tdu@semihalf.com>
6 years agonet/sfc: convert to new Tx offload API
Ivan Malov [Thu, 18 Jan 2018 09:44:31 +0000 (09:44 +0000)]
net/sfc: convert to new Tx offload API

Ethdev Tx offloads API has changed since:
commit cba7f53b717d ("ethdev: introduce Tx queue offloads API")
This commit support the new Tx offloads API.

The code which fills in txq_flags in default_txconf is preserved
because rte_eth_dev_info_get() lacks conversion between offloads
and txq_flags fields which means that a legacy application which
relies on default_txconf will fail to configure Tx queues in the
case when some bits in txq_flags are mandatory.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agonet/sfc: factor out function to report Tx capabilities
Ivan Malov [Thu, 18 Jan 2018 09:44:30 +0000 (09:44 +0000)]
net/sfc: factor out function to report Tx capabilities

The patch adds a separate function to report supported
Tx capabilities because this function will be required
in more places across the code in the upcoming patches.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agonet/sfc: convert to new Rx offload API
Ivan Malov [Thu, 18 Jan 2018 09:44:29 +0000 (09:44 +0000)]
net/sfc: convert to new Rx offload API

Ethdev Rx offloads API has changed since:
commit ce17eddefc20 ("ethdev: introduce Rx queue offloads API")
This commit support the new Rx offloads API.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agonet/sfc: factor out function to report Rx capabilities
Ivan Malov [Thu, 18 Jan 2018 09:44:28 +0000 (09:44 +0000)]
net/sfc: factor out function to report Rx capabilities

The patch adds a separate function to report supported
Rx capabilities because this function will be required
in more places across the code in the upcoming patches.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agoethdev: add function to look up Tx offload names
Ivan Malov [Thu, 18 Jan 2018 09:44:27 +0000 (09:44 +0000)]
ethdev: add function to look up Tx offload names

Commonly, drivers converted to the new offload API
may need to log unsupported offloads as a response
to wrong settings. From this perspective, it would
be convenient to have generic functions to look up
offload names. The patch adds such a helper for Tx.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agoethdev: add function to look up Rx offload names
Ivan Malov [Thu, 18 Jan 2018 09:44:26 +0000 (09:44 +0000)]
ethdev: add function to look up Rx offload names

Commonly, drivers converted to the new offload API
may need to log unsupported offloads as a response
to wrong settings. From this perspective, it would
be convenient to have generic functions to look up
offload names. The patch adds such a helper for Rx.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agonet/sfc: fix flow RSS check in error handling
Roman Zhukov [Thu, 18 Jan 2018 07:32:56 +0000 (07:32 +0000)]
net/sfc: fix flow RSS check in error handling

RSS is a local variable with address which is never NULL.

Fixes: d77d07391d4d ("net/sfc: support flow API RSS action")
Cc: stable@dpdk.org
Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>