Akhil Goyal [Mon, 22 Jan 2018 08:46:36 +0000 (14:16 +0530)]
doc: update feature list for cryptodevs
Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Thomas Monjalon [Mon, 29 Jan 2018 22:20:40 +0000 (23:20 +0100)]
crypto/dpaa2_sec: fix build with GCC 7
Seen with GCC 7.2.0, a switch fall through is detected and
cannot be fixed with a fall-through comment or attribute:
drivers/crypto/dpaa2_sec/hw/rta/operation_cmd.h:89:6: error:
this statement may fall through [-Werror=implicit-fallthrough=]
if (rta_sec_era < RTA_SEC_ERA_2)
^
The check is disabled in dpaa2_sec Makefile but not in dpaa_sec Makefile
which uses source code shared by dpaa2_sec.
The workaround is to disable the check at the beginning of the file.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
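As a rough illustration of the workaround described in the dpaa2_sec fix
above, the warning can be switched off for a whole file with a diagnostic
pragma placed before the affected code. This is only a sketch:
demo_count_steps() is a hypothetical helper, and the exact lines added to
the shared RTA headers are not shown in this log.

```c
/*
 * Sketch only: disable GCC 7's implicit-fallthrough check for this file.
 * Clang reports __GNUC__ as 4, so the guard limits the pragma to GCC >= 7.
 */
#if defined(__GNUC__) && (__GNUC__ >= 7)
#pragma GCC diagnostic ignored "-Wimplicit-fallthrough"
#endif

/* Hypothetical helper: counts how many cases run when entering at `start`. */
int demo_count_steps(int start)
{
	int steps = 0;

	switch (start) {
	case 0:
		if (start < 1)
			steps++;
		/* deliberately no break here: execution continues in case 1 */
	case 1:
		steps++;
	case 2:
		steps++;
		break;
	default:
		break;
	}
	return steps;
}
```

With the pragma in place, the "if" directly before the next case label no
longer triggers -Werror=implicit-fallthrough= on GCC 7.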
Thomas Monjalon [Mon, 29 Jan 2018 21:52:56 +0000 (22:52 +0100)]
crypto/mrvl: fix export map file name
Fixes:
8a61c83af2fa ("crypto/mrvl: add mrvl crypto driver")
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Neil Horman [Mon, 22 Jan 2018 01:48:07 +0000 (20:48 -0500)]
doc: add ABI experimental tag in versioning guide
Document the need to add the __rte_experimental tag to appropriate functions.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Neil Horman [Mon, 22 Jan 2018 01:48:05 +0000 (20:48 -0500)]
mk: add experimental tag check
Add checks during build to ensure that all symbols in the EXPERIMENTAL
version map section have __rte_experimental tags on their definitions, and
enable the warnings needed to announce their use. Also add an
ALLOW_EXPERIMENTAL_APIS define to allow individual libraries and files
to declare the acceptability of experimental API usage.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Neil Horman [Mon, 22 Jan 2018 01:48:06 +0000 (20:48 -0500)]
add experimental tag to appropriate functions
Append the __rte_experimental tag to API calls appearing in the
EXPERIMENTAL section of their library's version map.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Neil Horman [Mon, 22 Jan 2018 01:48:04 +0000 (20:48 -0500)]
compat: add experimental tag macro
The __rte_experimental macro tags a given exported function as being part of
the EXPERIMENTAL API. Use of this tag will cause any caller of the
function (that isn't removed by dead code elimination) to emit a warning
that the user is making use of an API whose stability isn't guaranteed.
It also places the function in the .text.experimental section, which is
used to validate the tag against the corresponding library version map.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
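A tag with the two effects described above (a warning at every call site
and placement in a dedicated section) can be sketched with standard GCC
function attributes. The macro and function names below are hypothetical
stand-ins, and the warning text is an assumption; only the section name
.text.experimental comes from the commit message.

```c
/*
 * Hypothetical stand-in for an "experimental" tag.
 * - deprecated(...) makes every call site emit a compiler warning.
 * - section(...) moves the function into its own ELF section so a build
 *   tool can cross-check it against the version map.
 * The section attribute is only applied on ELF targets in this sketch.
 */
#if defined(__ELF__)
#define demo_experimental \
	__attribute__((deprecated("symbol is not yet part of a stable ABI"), \
		       section(".text.experimental")))
#else
#define demo_experimental \
	__attribute__((deprecated("symbol is not yet part of a stable ABI")))
#endif

/* The tag goes on the declaration of the exported function. */
demo_experimental int demo_new_api(int x);

/*
 * A file that accepts experimental APIs opts in by silencing the warning,
 * which is roughly what an "allow experimental" build define would arrange.
 */
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"

int demo_new_api(int x)
{
	return x * 2;
}

int demo_caller(void)
{
	return demo_new_api(21); /* would warn without the opt-in above */
}
```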
Neil Horman [Mon, 22 Jan 2018 01:48:03 +0000 (20:48 -0500)]
buildtools: add script to check experimental API exports
This tool reads the given version map for a directory, and checks to
ensure that, for each symbol listed in the export list, the corresponding
definition is tagged as __rte_experimental, erroring out if it's not. In this
way, we can ensure that the EXPERIMENTAL API is kept in sync with the tags.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Bruce Richardson [Thu, 25 Jan 2018 11:12:25 +0000 (11:12 +0000)]
pmdinfogen: allow using stdin and stdout
Rather than having to work off files all the time, allow stdin and stdout
to be used as the source and destination for pmdinfogen. This will allow
other possible usages from scripts, e.g. taking files from an ar archive and
building a single .pmd.c file from all the .o files in it.
for f in `ar t librte_pmd_xyz.a` ; do
ar p librte_pmd_xyz.a $f | pmdinfogen - - >> xyz_info.c
done
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Harry van Haaren [Mon, 29 Jan 2018 16:37:32 +0000 (16:37 +0000)]
app/procinfo: call EAL cleanup before exit
This patch adds a call to the newly introduced cleanup()
function just before quitting the app.
Adding this function call before quitting from a secondary process
is important, as otherwise it will leak hugepage memory. For a secondary
process that is run multiple times, this could cause hugepage memory
to become depleted and stop a secondary process from starting.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Vipin Varghese <vipin.varghese@intel.com>
Harry van Haaren [Mon, 29 Jan 2018 16:37:31 +0000 (16:37 +0000)]
app/pdump: call EAL cleanup before exit
This patch adds a call to the newly introduced cleanup()
function just before quitting the pdump app.
Adding this function call before quitting from a secondary process
is important, as otherwise it will leak hugepage memory. For a secondary
process that is run multiple times, this could cause hugepage memory
to become depleted and stop a secondary process from starting.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Vipin Varghese <vipin.varghese@intel.com>
Harry van Haaren [Mon, 29 Jan 2018 16:37:30 +0000 (16:37 +0000)]
eal: add function to release internal resources
This commit adds a new function rte_eal_cleanup().
The function serves as a hook to allow DPDK to release
internal resources (e.g.: hugepage allocations).
This function allows DPDK to become more like an ordinary
library, where the library context itself can be initialized
and cleaned up by the application.
The rte_exit() and rte_panic() functions must be considered,
particularly whether they should call rte_eal_cleanup() to release any
resources. This patch adds the cleanup to rte_exit(),
but does not clean up on rte_panic(). The reason not to clean
up when panicking is that the developer may wish to inspect the
exact internal state of EAL and hugepages.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Vipin Varghese <vipin.varghese@intel.com>
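The init/cleanup pairing and the different treatment of the exit and panic
paths can be modelled in a few lines. Everything below is a hypothetical
stand-in (a malloc'ed block plays the role of hugepage memory); it does not
call the EAL and only mirrors the behaviour described in the message above.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical library context standing in for EAL internal state. */
static void *demo_hugepages;

/* Acquire the resource; returns 0 on success, -1 on failure. */
int demo_init(void)
{
	demo_hugepages = malloc(4096); /* stand-in for a hugepage mapping */
	return demo_hugepages != NULL ? 0 : -1;
}

/* Counterpart of demo_init(): release whatever the library still holds. */
int demo_cleanup(void)
{
	free(demo_hugepages);
	demo_hugepages = NULL;
	return 0;
}

bool demo_holds_resources(void)
{
	return demo_hugepages != NULL;
}

/* Orderly exit path: resources are released before leaving. */
void demo_on_exit(void)
{
	demo_cleanup();
}

/* Panic path: state is left untouched so it can be inspected afterwards. */
void demo_on_panic(void)
{
	/* intentionally no cleanup */
}
```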
Harry van Haaren [Mon, 29 Jan 2018 16:37:29 +0000 (16:37 +0000)]
service: restrict finalize to internal usage
This commit moves the rte_service_finalize() function
to be in the component header, and marks it as @internal.
The function is only called internally by rte_eal_finalize().
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Vipin Varghese <vipin.varghese@intel.com>
Hemant Agrawal [Mon, 29 Jan 2018 08:10:49 +0000 (13:40 +0530)]
bus/fslmc: register platform HW mempool on runtime
Detect whether the DPAA2 mempool objects are present and register
them as the platform default HW mempool.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Hemant Agrawal [Mon, 29 Jan 2018 08:10:48 +0000 (13:40 +0530)]
bus/dpaa: register platform HW mempool on runtime
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Pavan Nikhilesh [Mon, 29 Jan 2018 08:10:47 +0000 (13:40 +0530)]
app/testpmd: add log for preferred mempool ops
This patch adds a debug message that prints the name of the best
selected pktmbuf mempool ops.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Hemant Agrawal [Mon, 29 Jan 2018 08:10:46 +0000 (13:40 +0530)]
mbuf: add pool create helper for specific mempool ops
Introduce a new helper for pktmbuf pool, which will allow
the application to optionally specify the mempool ops name
as well.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Hemant Agrawal [Mon, 29 Jan 2018 08:10:45 +0000 (13:40 +0530)]
mbuf: add pool ops selection functions
This patch adds support for various mempool ops config helper APIs:
1. User-defined mempool ops.
2. Platform-detected HW mempool ops (active).
3. Best selection of mempool ops by looking into user-defined,
platform-registered and compile-time configured ops.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
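The "best selection" in item 3 can be pictured as a simple priority chain.
The sketch below assumes the order user-defined, then platform-registered,
then compile-time default, which is inferred from the order of the list
above rather than stated by it. The function name and the default string
are hypothetical placeholders, not the library's API.

```c
#include <stddef.h>

/* Hypothetical compile-time default; a real build takes it from the config. */
#define DEMO_DEFAULT_MEMPOOL_OPS "ring_mp_mc"

/*
 * Return the ops name to use: the user's choice if set, otherwise the
 * platform-registered name if set, otherwise the compile-time default.
 * NULL or an empty string means "not set".
 */
const char *demo_best_mempool_ops(const char *user_ops,
				  const char *platform_ops)
{
	if (user_ops != NULL && user_ops[0] != '\0')
		return user_ops;
	if (platform_ops != NULL && platform_ops[0] != '\0')
		return platform_ops;
	return DEMO_DEFAULT_MEMPOOL_OPS;
}
```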
Hemant Agrawal [Mon, 29 Jan 2018 08:10:44 +0000 (13:40 +0530)]
mbuf: maintain user and compile time mempool ops name
At present the user-defined mempool ops name overwrites
the default mempool ops name variable in internal_config.
This patch changes the logic so that internal_config holds only
the user-defined value.
The pktmbuf_create_pool is updated to reflect the same, i.e.
use the user-defined name if present, otherwise use the default.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Hemant Agrawal [Mon, 29 Jan 2018 08:10:43 +0000 (13:40 +0530)]
eal: prefix mbuf pool ops name with user defined
This patch prefixes the mbuf pool ops name with "user" to indicate
that it is user defined.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Olivier Matz [Mon, 29 Jan 2018 09:36:06 +0000 (10:36 +0100)]
mbuf: fix VLAN flags documentation
Fix inconsistency between mbuf structure documentation and flags
documentation.
Fixes:
380a7aab1ae2 ("mbuf: rename deprecated VLAN flags")
Cc: stable@dpdk.org
Reported-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 29 Jan 2018 09:37:07 +0000 (10:37 +0100)]
mbuf: rename Tx VLAN flags
For consistency with the Rx flags, the flags PKT_TX_VLAN_PKT and
PKT_TX_QINQ_PKT are respectively renamed as PKT_TX_VLAN and
PKT_TX_QINQ. The old defines are deprecated but will stay for some time
for compatibility.
Reported-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 29 Jan 2018 09:39:23 +0000 (10:39 +0100)]
mbuf: fix NULL freeing when debug enabled
Do not panic when calling rte_pktmbuf_free(NULL) with mbuf debug
enabled; it is a valid operation.
Fixes:
af75078fece3 ("first public release")
Cc: stable@dpdk.org
Reported-by: Keith Wiles <keith.wiles@intel.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Konstantin Ananyev [Mon, 15 Jan 2018 15:09:31 +0000 (15:09 +0000)]
eal/x86: use lock-prefixed instructions for SMP barrier
On x86 it is possible to use lock-prefixed instructions to get
an effect similar to mfence.
As pointed out by the Java folks, on most modern HW that gives
better performance than using mfence:
https://shipilev.net/blog/2014/on-the-fence-with-dependencies/
This patch adopts that technique for the rte_smp_mb() implementation.
On BDW 2.2, mb_autotest on a single lcore reports a 2X cycle reduction,
i.e. from ~110 to ~55 cycles per operation.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
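A minimal sketch of the technique, assuming GCC or Clang inline assembly:
a locked add of zero to a dummy stack location serves as a full barrier on
x86. The function names are hypothetical, the -128 offset is an illustrative
choice, and other targets fall back to a compiler builtin fence. This is not
presented as the exact code of the patch.

```c
#include <stdint.h>

/* Full memory barrier between CPU cores (sketch, not the EAL macro). */
static inline void demo_smp_mb(void)
{
#if defined(__GNUC__) && defined(__x86_64__)
	/* Adding zero leaves the memory unchanged; the lock prefix orders it. */
	__asm__ volatile("lock addl $0, -128(%%rsp)" ::: "memory", "cc");
#elif defined(__GNUC__) && defined(__i386__)
	__asm__ volatile("lock addl $0, -128(%%esp)" ::: "memory", "cc");
#else
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
#endif
}

/*
 * Typical use: a store followed by a load of a different location, which
 * x86 may otherwise reorder. The barrier keeps the store ahead of the load.
 */
uint32_t demo_store_then_load(volatile uint32_t *mine,
			      const volatile uint32_t *theirs)
{
	*mine = 1;
	demo_smp_mb();
	return *theirs;
}
```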
Konstantin Ananyev [Mon, 15 Jan 2018 15:04:39 +0000 (15:04 +0000)]
test: introduce memory barrier test case
Simple functional test for rte_smp_mb() implementations.
When executed on a single lcore, it can also be used as a rough
estimate of how many cycles a particular implementation of rte_smp_mb()
might take.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Jerin Jacob [Sun, 3 Dec 2017 12:37:30 +0000 (18:07 +0530)]
config/thunderx: disable C11 memory model ring
On thunderx and octeontx, the ring_perf_autotest and
ring_pmd_perf_autotest tests show better performance
when CONFIG_RTE_RING_USE_C11_MEM_MODEL is disabled.
On the other hand, enabling CONFIG_RTE_RING_USE_C11_MEM_MODEL
shows better performance on thunderx2.
Since thunderx2 is using the default armv8 config,
no particular change is required.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Jia He [Mon, 22 Jan 2018 04:41:28 +0000 (20:41 -0800)]
ring: introduce C11 memory model barrier option
This patch adds support for a C11 memory model barrier in librte_ring.
There are 2 barrier implementation options in librte_ring (suggested
by Jerin):
1. use rte_smp_rmb
2. use load_acquire/store_release (refer to [1]).
The reason for providing 2 options is that benchmark results
differ between arm machines, refer to [2].
CONFIG_RTE_RING_USE_C11_MEM_MODEL is provided; by default it is "n"
on all architectures except arm64, where it is "y" so far.
[1] https://github.com/freebsd/freebsd/blob/master/sys/sys/buf_ring.h#L170
[2] http://dpdk.org/ml/archives/dev/2017-October/080861.html
Suggested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jianbo Liu <jianbo.liu@arm.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
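Option 2 above can be sketched with the GCC/Clang __atomic builtins on a
toy single-producer/single-consumer ring. The structure and function names
are hypothetical and far simpler than librte_ring; the point is only where
the acquire loads and release stores sit.

```c
#include <stdint.h>

#define DEMO_RING_SIZE 8u /* must be a power of two */

struct demo_ring {
	uint32_t head;              /* next slot to write, producer-owned */
	uint32_t tail;              /* next slot to read, consumer-owned */
	int slots[DEMO_RING_SIZE];
};

/* Producer side. Returns 0 on success, -1 if the ring is full. */
int demo_enqueue(struct demo_ring *r, int v)
{
	uint32_t head = __atomic_load_n(&r->head, __ATOMIC_RELAXED);
	/* Acquire pairs with the consumer's release store of tail. */
	uint32_t tail = __atomic_load_n(&r->tail, __ATOMIC_ACQUIRE);

	if (head - tail == DEMO_RING_SIZE)
		return -1;
	r->slots[head & (DEMO_RING_SIZE - 1)] = v;
	/* Release publishes the slot write before the new head is visible. */
	__atomic_store_n(&r->head, head + 1, __ATOMIC_RELEASE);
	return 0;
}

/* Consumer side. Returns 0 on success, -1 if the ring is empty. */
int demo_dequeue(struct demo_ring *r, int *v)
{
	uint32_t tail = __atomic_load_n(&r->tail, __ATOMIC_RELAXED);
	/* Acquire pairs with the producer's release store of head. */
	uint32_t head = __atomic_load_n(&r->head, __ATOMIC_ACQUIRE);

	if (head == tail)
		return -1;
	*v = r->slots[tail & (DEMO_RING_SIZE - 1)];
	/* Release hands the slot back only after it has been read. */
	__atomic_store_n(&r->tail, tail + 1, __ATOMIC_RELEASE);
	return 0;
}
```

With option 1, the acquire loads would instead be plain loads followed by
an explicit read barrier.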
Jia He [Mon, 22 Jan 2018 04:41:27 +0000 (20:41 -0800)]
ring: move code in a new header file
Move the common part of rte_ring.h into rte_ring_generic.h.
Move the memory barrier part into update_tail().
No functional changes here.
Suggested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Jia He [Mon, 22 Jan 2018 04:41:26 +0000 (20:41 -0800)]
eal/arm64: remove the braces in memory barrier macros
For code such as:
    if (condition)
        rte_smp_rmb();
    else
        rte_smp_wmb();
Without this patch, the compiler reports this error:
error: 'else' without a previous 'if'
Fixes:
84733fd0d75e ("eal/arm64: fix memory barrier definition")
Cc: stable@dpdk.org
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
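The failure mode can be reproduced with any brace-wrapped macro. In the
sketch below the braced form is kept in a comment so the file compiles; its
exact shape is an assumption based on the commit title. The demo_* macros
are hypothetical compiler-only barriers, not the arm64 instructions.

```c
/*
 * Problematic shape (shown as a comment only):
 *
 *   #define demo_barrier_braced() { __asm__ volatile("" ::: "memory"); }
 *
 * "if (c) demo_barrier_braced(); else ..." expands to "{ ... }; else ...".
 * The stray ';' after the closing brace terminates the if statement, so
 * the following 'else' has no 'if' to attach to.
 */

/* Without braces the macro plus the caller's ';' is a single statement. */
#define demo_rmb() __asm__ volatile("" ::: "memory")
#define demo_wmb() __asm__ volatile("" ::: "memory")

/* Compiles cleanly: each branch is exactly one statement. */
int demo_order(int is_read)
{
	if (is_read)
		demo_rmb();
	else
		demo_wmb();
	return is_read ? 1 : 2;
}
```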
Yongseok Koh [Thu, 25 Jan 2018 21:02:50 +0000 (13:02 -0800)]
net/mlx5: fix synchronization on polling Rx completions
Polling a new packet is basically sensing the generation bit in a
completion entry. On processors without a strongly-ordered memory
model, there has to be a memory barrier between reading the generation bit
and reading other fields of the entry in order to guarantee data is not stale.
Fixes:
570acdb1da8a ("net/mlx5: add vectorized Rx/Tx burst for ARM")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Yongseok Koh [Thu, 25 Jan 2018 21:02:49 +0000 (13:02 -0800)]
net/mlx5: replace I/O memory barrier with coherent version
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Yongseok Koh [Thu, 25 Jan 2018 21:02:48 +0000 (13:02 -0800)]
net/mlx5: remove unnecessary memory barrier
As rte_write64() has an IO barrier, there's no need to have a barrier
before the call.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Yongseok Koh [Thu, 25 Jan 2018 21:02:47 +0000 (13:02 -0800)]
eal/arm64: define coherent I/O memory barriers
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Thomas Speier <tspeier@qti.qualcomm.com>
Acked-by: Jianbo Liu <jianbo.liu@arm.com>
Yongseok Koh [Thu, 25 Jan 2018 21:02:46 +0000 (13:02 -0800)]
eal/arm32: define coherent I/O memory barriers
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Jianbo Liu <jianbo.liu@arm.com>
Yongseok Koh [Thu, 25 Jan 2018 21:02:45 +0000 (13:02 -0800)]
eal/ppc64: define coherent I/O memory barriers
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Yongseok Koh [Thu, 25 Jan 2018 21:02:44 +0000 (13:02 -0800)]
eal/x86: define coherent I/O memory barriers
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Yongseok Koh [Thu, 25 Jan 2018 21:02:43 +0000 (13:02 -0800)]
eal: introduce coherent I/O memory barriers
This commit introduces rte_cio_wmb() and rte_cio_rmb(), in order to
guarantee the ordering of coherent shared memory between the CPU and a DMA
capable device.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Yongseok Koh [Thu, 25 Jan 2018 21:02:42 +0000 (13:02 -0800)]
eal: group memory barriers by type in doxygen
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Pavan Nikhilesh [Fri, 26 Jan 2018 05:04:51 +0000 (10:34 +0530)]
test: add reciprocal based division
This commit provides a set of tests for verifying the correctness and
performance of both unsigned 32-bit and 64-bit reciprocal based division.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Pavan Nikhilesh [Fri, 26 Jan 2018 05:04:50 +0000 (10:34 +0530)]
eal: add u64-bit variant for reciprocal divide
Currently, rte_reciprocal only supports unsigned 32-bit divisors. This
commit adds support for unsigned 64-bit divisors.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Pavan Nikhilesh [Fri, 26 Jan 2018 05:04:49 +0000 (10:34 +0530)]
eal: introduce integer divide through reciprocal
In some use cases of integer division, the denominator remains constant
and the numerator varies. It is possible to optimize division for such
specific scenarios.
librte_sched uses rte_reciprocal to optimize division, so moving it to
eal/common allows other libraries and applications to use it.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
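The optimization for a constant denominator is usually done with the
well-known multiply-and-shift scheme: precompute a "magic" multiplier and
two shifts once per divisor, then replace each division by a multiply, an
add and shifts. The sketch below shows that scheme for unsigned 32-bit
values with hypothetical demo_* names; whether rte_reciprocal matches it
line for line is an assumption.

```c
#include <stdint.h>

/* Precomputed per-divisor constants. */
struct demo_reciprocal {
	uint32_t m;       /* magic multiplier */
	uint8_t sh1, sh2; /* post-multiply shifts */
};

/* Position of the highest set bit, counting from 1; 0 when x == 0. */
static int demo_fls_u32(uint32_t x)
{
	int n = 0;

	while (x != 0) {
		n++;
		x >>= 1;
	}
	return n;
}

/* One-time setup for divisor d. The divisor must not be zero. */
struct demo_reciprocal demo_reciprocal_value(uint32_t d)
{
	struct demo_reciprocal r;
	int l = demo_fls_u32(d - 1); /* ceil(log2(d)) */
	uint64_t m = ((1ULL << 32) * ((1ULL << l) - d)) / d + 1;

	r.m = (uint32_t)m;
	r.sh1 = (uint8_t)(l > 1 ? 1 : l);
	r.sh2 = (uint8_t)(l > 1 ? l - 1 : 0);
	return r;
}

/* a / d using only a multiply, an add and shifts. */
uint32_t demo_reciprocal_divide(uint32_t a, struct demo_reciprocal r)
{
	uint32_t t = (uint32_t)(((uint64_t)a * r.m) >> 32);

	return (t + ((a - t) >> r.sh1)) >> r.sh2;
}
```

The setup cost only pays off when the same divisor is reused for many
dividends, which is the scenario the commit message describes.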
Vipin Varghese [Fri, 26 Jan 2018 20:55:36 +0000 (02:25 +0530)]
service: fix memory leak with new function
The rte_service_finalize routine checks whether the service library is
initialized. If so, it frees the internal memory for services and lcore
states. This routine is to be invoked at application termination.
Fixes:
21698354c832 ("service: introduce service cores concept")
Cc: stable@dpdk.org
Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Ivan Malov [Sun, 21 Jan 2018 17:05:10 +0000 (17:05 +0000)]
log: fix memory leak in regexp level set
Fixes:
a5279180f510 ("eal: change several log levels matching a regexp")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Jasvinder Singh [Mon, 22 Jan 2018 14:14:28 +0000 (14:14 +0000)]
flow_classify: fix memory leak in rule add
Free allocated memory of the rule if not added to the table.
Coverity issue: 257032
Fixes:
50bdac5916d9 ("flow_classify: remove table id parameter from API")
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andriy Berestovskyy [Tue, 23 Jan 2018 15:43:16 +0000 (16:43 +0100)]
keepalive: fix state alignment
The __rte_cache_aligned was applied to the whole array,
not the array elements. This leads to false sharing between
the monitored cores.
Fixes:
e70a61ad50ab ("keepalive: export states")
Cc: stable@dpdk.org
Signed-off-by: Andriy Berestovskyy <aber@semihalf.com>
Acked-by: Remy Horton <remy.horton@intel.com>
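The difference between aligning an array and aligning its elements shows
up in the element stride. The sketch below uses the GCC aligned attribute
directly, assumes a 64-byte cache line, and uses hypothetical structure
names instead of the keepalive ones.

```c
#include <stddef.h>

#define DEMO_CACHE_LINE 64 /* assumed cache line size */
#define DEMO_MAX_CORES 4

/*
 * Attribute on the array: only the start of the array is aligned. The
 * elements remain sizeof(int) apart, so several cores share one line.
 */
struct demo_states_packed {
	int core_state[DEMO_MAX_CORES]
		__attribute__((aligned(DEMO_CACHE_LINE)));
};

/*
 * Attribute on the element type: each element is padded to a full cache
 * line, so every core gets a line of its own.
 */
struct demo_states_padded {
	struct {
		int core_state;
	} __attribute__((aligned(DEMO_CACHE_LINE))) live_data[DEMO_MAX_CORES];
};

/* Distance in bytes between two neighbouring per-core entries. */
size_t demo_stride_packed(void)
{
	return sizeof(((struct demo_states_packed *)0)->core_state[0]);
}

size_t demo_stride_padded(void)
{
	return sizeof(((struct demo_states_padded *)0)->live_data[0]);
}
```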
Xueming Li [Sat, 20 Jan 2018 03:26:31 +0000 (11:26 +0800)]
cmdline: avoid garbage in unused fields of parsed result
The result buffer was not initialized before parsing, inducing garbage
in unused fields or padding of the parsed structure.
Initialize the result buffer each time before parsing.
Fixes:
af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Xueming Li [Fri, 19 Jan 2018 18:16:10 +0000 (02:16 +0800)]
cmdline: fix dynamic tokens parsing
When using dynamic tokens, the result buffer contains pointers to some
location inside the result buffer. When the content of the temporary
buffer is copied into the final one, these pointers still point to the
temporary buffer.
This works as long as the temporary buffer is kept intact, but the next
commit introduces a memset() that breaks this assumption.
This commit keeps the successfully parsed buffers, and ensures that the
pointers point to a valid location, by using the temp buffer for subsequent
parsing.
Fixes:
9b3fbb051d2e ("cmdline: fix parsing")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Harry van Haaren [Wed, 24 Jan 2018 17:02:47 +0000 (17:02 +0000)]
service: fix possible mem leak on initialize
This commit ensures that if we run out of memory
during the initialization of the service library, the
first allocated memory is correctly freed instead of leaked.
Fixes:
21698354c832 ("service: introduce service cores concept")
Cc: stable@dpdk.org
Reported-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Zhiyong Yang [Fri, 19 Jan 2018 10:18:13 +0000 (18:18 +0800)]
mbuf: remove void pointer cast
It is unnecessary to cast from void * to struct rte_mbuf *;
removing the cast makes the code clearer.
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Marko Kovacevic [Wed, 24 Jan 2018 15:07:45 +0000 (15:07 +0000)]
doc: fix build of bbdev test guide
Fix build issue with pdf guides. Some indentations in the bbdev test
application doc were causing build failures. LaTeX log message:
doc.log:! LaTeX Error: Too deeply nested.
Fixes:
f714a18885a6 ("app/testbbdev: add test application for bbdev")
Signed-off-by: Marko Kovacevic <marko.kovacevic@intel.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Zhiyong Yang [Thu, 25 Jan 2018 07:03:50 +0000 (15:03 +0800)]
event/opdl: fix icc build
ICC reports the following issue at compile time:
error #592: variable "i" is used before its value is set
RTE_SET_USED(i);
This patch fixes it. GCC and Clang builds have been tested as well.
Fixes:
d548ef513cd7 ("event/opdl: add unit tests")
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Liang Ma <liang.j.ma@intel.com>
Ophir Munk [Wed, 24 Jan 2018 14:12:13 +0000 (14:12 +0000)]
net/vdev_netvsc: fix build without C11 and pedantic
Remove CFLAGS -std=c11 and -pedantic in order to guarantee
a successful vdev_netvsc compilation on old Linux distributions.
Otherwise old GCC compilers may complain as follows:
cc1: error: unrecognized command line option -std=c11
Fixes:
6086ab3bb3d2 ("net/vdev_netvsc: introduce Hyper-V platform driver")
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Ophir Munk [Tue, 23 Jan 2018 21:54:09 +0000 (21:54 +0000)]
net/tap: use local eBPF definitions
eBPF follows a graceful approach: the code must compile successfully on
all Linux distributions. If a specific kernel cannot support eBPF, it will
gracefully refuse the eBPF netlink message sent to it.
The kernel header file linux/bpf.h (if present) on different Linux
distributions may not include all definitions required for TAP
compilation.
In order to guarantee a successful eBPF compilation everywhere, all the
required definitions for TAP have been added locally instead of including
the file <linux/bpf.h>.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
Pavan Nikhilesh [Mon, 22 Jan 2018 17:46:01 +0000 (23:16 +0530)]
drivers/event: fix resource leak in selftest
Free leaked resources in eventdev selftests.
Coverity issue: 257044
Coverity issue: 257047
Coverity issue: 257009
Fixes:
9ef576176db0 ("test/eventdev: add octeontx multi queue and multi port")
Fixes:
3a17ff401f1e ("test/eventdev: add basic SW tests")
Fixes:
5e6eb5ccd788 ("event/sw: make test standalone")
Signed-off-by: Pavan Nikhilesh <pbhagavatula@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Harry van Haaren [Mon, 22 Jan 2018 10:04:10 +0000 (10:04 +0000)]
event/opdl: rework loops to comply with dpdk style
This commit reworks the loop counter variable declarations
to be in line with the DPDK source code.
Fixes:
3c7f3dcfb099 ("event/opdl: add PMD main body and helper function")
Fixes:
8ca8e3b48eff ("event/opdl: add event queue config get/set")
Fixes:
d548ef513cd7 ("event/opdl: add unit tests")
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Liang Ma <liang.j.ma@intel.com>
Jerin Jacob [Fri, 19 Jan 2018 06:38:29 +0000 (12:08 +0530)]
event/sw: fix debug logging config option
Align the config option name with config/common_base.
Fixes:
aaa4a221da26 ("event/sw: add new software-only eventdev driver")
Cc: stable@dpdk.org
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Thomas Monjalon [Mon, 22 Jan 2018 00:50:25 +0000 (01:50 +0100)]
version: 18.02-rc1
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Ferruh Yigit [Sat, 20 Jan 2018 16:50:53 +0000 (16:50 +0000)]
config: sort PMD config options
No config option changed, added or removed.
Only reshuffle PMD config options, mostly to show new PMDs where to put
their new config options.
Ordered as physical, paravirtual and virtual groups. Alphabetical order
within a group.
Also tried to group vendor devices together, which breaks alphabetical
order in some places.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 22 Jan 2018 00:16:25 +0000 (00:16 +0000)]
ethdev: rename function parameter for consistency
Update "port" function argument variable to "port_id" in public
header to be consistent in all APIs.
No functional change.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Ferruh Yigit [Mon, 22 Jan 2018 00:16:24 +0000 (00:16 +0000)]
ethdev: reorder inline functions
Move all inline functions to the end of the ethdev.h header file and move
the ethdev_core.h include just before the inline functions.
Since inline functions need data structures in ethdev_core.h, this
reorder groups them and makes it clear where to put further inline
functions.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Ferruh Yigit [Mon, 22 Jan 2018 00:16:23 +0000 (00:16 +0000)]
ethdev: separate internal structures into own header
rte_ethdev_core.h created. Internal data structures are moved here.
These structures are mostly intended to be used by drivers, but they
need to be in the public header file because of the inline functions
in the ethdev.h header, and those inline functions are preferably
kept because of performance concerns.
The accessibility of the data structures is not changed; they are only
logically grouped to show that they are not intended to be used by
applications.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Ferruh Yigit [Mon, 22 Jan 2018 00:16:22 +0000 (00:16 +0000)]
ethdev: separate driver APIs
Create a rte_ethdev_driver.h file and move PMD specific APIs here.
Drivers updated to include this new header file.
There is no update in header content, and since ethdev.h is included by
ethdev_driver.h, nothing changes from the driver point of view; it is only
a logical grouping of APIs. From the application point of view, driver
specific APIs can no longer be accessed, and they shouldn't be.
More PMD specific data structures still remain in ethdev.h because
inline functions in the header use them. Those will be handled separately.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Matan Azrad [Sat, 20 Jan 2018 21:12:24 +0000 (21:12 +0000)]
net/failsafe: fix removed device handling
There is a delay between the physical removal of the device and the moment
sub-device PMDs get an RMV interrupt. During this time DPDK PMDs and
applications still don't know about the removal and may call sub-device
control operations, which should return an error.
In the previous code this error is reported to the application, contrary to
the fail-safe principle that the app should not be aware of device removal.
Add a removal check in each relevant control command error flow and
prevent an error report to the application when the sub-device is removed.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Matan Azrad [Sat, 20 Jan 2018 21:12:23 +0000 (21:12 +0000)]
ethdev: adjust removal error report in flow API
rte_eth_dev_is_removed API was added to detect a device removal
synchronously.
When a device removal occurs during flow command execution, many
different errors can be reported to the user.
Adjust all flow APIs error reports to return -EIO in case of device
removal using rte_eth_dev_is_removed API.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Matan Azrad [Sat, 20 Jan 2018 21:12:22 +0000 (21:12 +0000)]
ethdev: adjust removal error report
rte_eth_dev_is_removed API was added to detect a device removal
synchronously.
When a device removal occurs during control command execution, many
different errors can be reported to the user.
Adjust all ethdev APIs error reports to return -EIO in case of device
removal using rte_eth_dev_is_removed API.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
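The adjustment described in the two ethdev commits above amounts to a small
wrapper around every return code. The sketch below is hypothetical:
struct demo_port and demo_is_removed() stand in for the real port state and
for rte_eth_dev_is_removed(), whose signature is not given in this log.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-port state used only for this illustration. */
struct demo_port {
	bool removed;
};

/* Stand-in for a synchronous removal check. */
static bool demo_is_removed(const struct demo_port *p)
{
	return p->removed;
}

/*
 * Normalize an error code: success passes through, any failure on a
 * removed device becomes -EIO, and other failures keep their own code.
 */
int demo_eth_err(const struct demo_port *p, int ret)
{
	if (ret == 0)
		return 0;
	if (demo_is_removed(p))
		return -EIO;
	return ret;
}
```

Checking for removal only on the error path keeps the cost of the extra
check off the success path.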
Matan Azrad [Sat, 20 Jan 2018 21:12:21 +0000 (21:12 +0000)]
net/mlx5: support a device removal check operation
Add support to get removal status of mlx5 device.
It is not supported in secondary process.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Matan Azrad [Sat, 20 Jan 2018 21:12:20 +0000 (21:12 +0000)]
net/mlx4: support a device removal check operation
Add support to get removal status of mlx4 device.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Matan Azrad [Sat, 20 Jan 2018 21:12:19 +0000 (21:12 +0000)]
ethdev: add devop to check removal status
There is a delay between the physical removal of the device and the moment
PMDs get an RMV interrupt. During this time DPDK PMDs and applications
still don't know about the removal.
Current removal detection is achieved only by registration to the device RMV
event and the notification comes asynchronously. So, there is no option
to detect a device removal synchronously.
Applications and other DPDK entities may want to check a device removal
synchronously and to make an immediate decision accordingly.
Add a new dev op called is_removed to allow DPDK entities to check an
Ethernet device removal status immediately.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Ophir Munk [Sat, 20 Jan 2018 21:11:37 +0000 (21:11 +0000)]
doc: add RSS in tap guide
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Ophir Munk [Sat, 20 Jan 2018 21:11:36 +0000 (21:11 +0000)]
net/tap: implement RSS using eBPF
TAP PMD is required to support RSS queue mapping based on rte_flow API. An
example usage for this requirement is failsafe transparent switching from a
PCI device to TAP device while continuing to redirect packets to the same RSS
queues on both devices.
TAP RSS implementation is based on eBPF programs sent to Linux kernel
through BPF system calls and using netlink messages to reference the
programs as part of traffic control commands.
TC uses eBPF programs as classifiers and actions.
eBPF classification: packets marked with an RSS queue will be directed
to this queue using TC with "skbedit" action.
BPF classifiers are downloaded to the kernel once on TAP creation for
each TAP Rx queue.
eBPF action: calculate the Toeplitz RSS hash based on L3 addresses and
L4 ports. Mark the packet with the RSS queue according to the resulting
RSS hash, then reclassify the packet.
BPF actions are downloaded to the kernel for each new RSS rule.
TAP eBPF requires Linux version 4.9 configured with BPF. TAP PMD will
successfully compile on systems with old or non-BPF configured kernels but
RSS rules creation on TAP devices will not be successful.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
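The Toeplitz hash mentioned in the eBPF action can be sketched in plain C.
This is the generic bit-serial form of the algorithm, written as a
user-space function with a hypothetical name; how the BPF program lays out
the key and the L3/L4 tuple is not shown in this log, so those details are
left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Toeplitz hash: for every set bit of the input (most significant bit
 * first), XOR in the 32-bit window of the key that starts at that bit
 * position. Returns 0 if the key is shorter than 4 bytes.
 */
uint32_t demo_toeplitz_hash(const uint8_t *key, size_t key_len,
			    const uint8_t *data, size_t data_len)
{
	uint32_t hash = 0;
	uint32_t window;
	size_t i;
	int bit;

	if (key_len < 4)
		return 0;
	/* The window starts as the first 32 bits of the key. */
	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
		 ((uint32_t)key[2] << 8) | (uint32_t)key[3];

	for (i = 0; i < data_len; i++) {
		for (bit = 7; bit >= 0; bit--) {
			if (data[i] & (1u << bit))
				hash ^= window;
			/* Slide the window one bit further along the key. */
			window <<= 1;
			if (i + 4 < key_len && (key[i + 4] & (1u << bit)))
				window |= 1u;
		}
	}
	return hash;
}
```

For an IPv4 flow the input would be the concatenated addresses and ports;
the queue is then typically derived from the hash, for example through a
modulo or an indirection table.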
Ophir Munk [Sat, 20 Jan 2018 21:11:35 +0000 (21:11 +0000)]
net/tap: add eBPF API
This commit includes the BPF API to be used by TAP.
tap_flow_bpf_cls_q() - download to kernel BPF program that classifies
packets to their matching queues
tap_flow_bpf_calc_l3_l4_hash() - download to kernel BPF program that
calculates per packet layer 3 and layer 4 RSS hash
tap_flow_bpf_rss_map_create() - create BPF RSS map for storing RSS
parameters per RSS rule
tap_flow_bpf_update_rss_elem() - update BPF map entry with RSS rule
parameters
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Ophir Munk [Sat, 20 Jan 2018 21:11:34 +0000 (21:11 +0000)]
net/tap: add eBPF bytes code
File tap_bpf_insns.h was added. It includes the eBPF bytecode
which corresponds to source file tap_bpf_program.c
(see "net/tap: add eBPF program file").
The bytecode is in the format of C arrays of struct bpf_insn and
was generated from the C file tap_bpf_program.c as follows:
1. The C file was compiled via LLVM into an object file in ELF
format as:
clang -O2 -emit-llvm -c tap_bpf_program.c -o - | llc -march=bpf \
-filetype=obj -o <tap_bpf_program.o>
The clang version must be 3.7 or above.
The C functions are under different ELF sections and are considered
different BPF programs to be downloaded to the kernel.
2. Using an external tool, the ELF sections are parsed and the C arrays
of struct bpf_insn are generated. Each C array (corresponding to a
different function under an ELF section) is downloaded to the kernel
using a BPF system call. The external tool that generates the C arrays
will be added in separate commits.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Ophir Munk [Sat, 20 Jan 2018 21:11:33 +0000 (21:11 +0000)]
net/tap: add eBPF program file
File tap_bpf_program.c was added with two ELF sections
corresponding to two BPF programs and one BPF map.
Section cls_q - BPF classifier to classify packets to their
corresponding queue after an RSS hash was calculated on the packet
and saved in skb->cb[1]
Section l3_l4 - BPF action to calculate RSS hash on packet
layers 3 and 4
This file is not part of DPDK tree compilation.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Ophir Munk [Sat, 20 Jan 2018 21:11:32 +0000 (21:11 +0000)]
net/tap: support actions for different classifiers
Add generic handling for the TC actions "mirred", "gact" and
"skbedit". This will be useful when introducing BPF actions,
since BPF uses TCA_BPF_ACT instead of TCA_FLOWER_ACT.
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Pascal Mazon <pascal.mazon@6wind.com>
Yongseok Koh [Fri, 19 Jan 2018 07:52:55 +0000 (23:52 -0800)]
net/mlx5: fix memory region lookup
This patch reverts:
commit
3a6f2eb8c5c5 ("net/mlx5: fix Memory Region registration")
Although the granularity of chunks in a mempool is a cacheline, addresses
are extended to align to a page boundary, for performance reasons in the
device, when registering an MR (Memory Region). This could make some regions
overlap, which can then cause a Tx completion error due to an incorrect LKEY
search. If the error occurs, the Tx queue gets stuck, because the buffer
address is compared against the aligned addresses of the Memory Region.
Saving the original mempool addresses for comparison doesn't create any
overlap.
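The overlap described above can be reproduced with a few lines of address
arithmetic. The sketch below is an illustration only and not the mlx5 code:
the mr_entry structure, the helper names and the 4096-byte page size are all
assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SZ ((uintptr_t)4096) /* assumed page size */

/* Hypothetical MR table entry keeping both address ranges. */
struct mr_entry {
    uintptr_t start;     /* original chunk start */
    uintptr_t end;       /* original chunk end (exclusive) */
    uintptr_t reg_start; /* page-aligned start used for registration */
    uintptr_t reg_end;   /* page-aligned end used for registration */
    uint32_t lkey;
};

void mr_entry_init(struct mr_entry *mr, uintptr_t start, uintptr_t end,
                   uint32_t lkey)
{
    mr->start = start;
    mr->end = end;
    mr->reg_start = start & ~(SKETCH_PAGE_SZ - 1);
    mr->reg_end = (end + SKETCH_PAGE_SZ - 1) & ~(SKETCH_PAGE_SZ - 1);
    mr->lkey = lkey;
}

/* Search on aligned ranges: overlapping entries can return the wrong key. */
uint32_t lkey_by_aligned(const struct mr_entry *tbl, size_t n, uintptr_t addr)
{
    for (size_t i = 0; i < n; i++)
        if (addr >= tbl[i].reg_start && addr < tbl[i].reg_end)
            return tbl[i].lkey;
    return UINT32_MAX; /* not found */
}

/* Search on original ranges: chunks never overlap, so the hit is unique. */
uint32_t lkey_by_original(const struct mr_entry *tbl, size_t n, uintptr_t addr)
{
    for (size_t i = 0; i < n; i++)
        if (addr >= tbl[i].start && addr < tbl[i].end)
            return tbl[i].lkey;
    return UINT32_MAX; /* not found */
}
```

With two chunks that share a page, for instance [0x10000, 0x10c00) and
[0x10c40, 0x12000), both aligned ranges start at 0x10000, so a buffer at
0x10d00 matches the first entry in the aligned search even though it belongs
to the second chunk.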
Fixes:
b0b093845793 ("net/mlx5: use buffer address for LKEY search")
Fixes:
3a6f2eb8c5c5 ("net/mlx5: fix Memory Region registration")
Cc: stable@dpdk.org
Reported-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Beilei Xing [Fri, 19 Jan 2018 07:50:06 +0000 (15:50 +0800)]
net/i40e: ignore case of packet type
Replace strncmp with strncasecmp in i40e_update_customized_ptype
function for compatibility.
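The difference between the two calls can be shown with a small standalone
comparison. The helper names below are hypothetical and the strings are
example inputs, not the names used by i40e_update_customized_ptype.

```c
#define _POSIX_C_SOURCE 200809L /* for strncasecmp() under strict ISO C */
#include <stdbool.h>
#include <string.h>
#include <strings.h> /* strncasecmp() is declared here on POSIX systems */

/* Case-sensitive prefix match: "ipv4" does not match "IPV4". */
bool prefix_match_exact(const char *name, const char *prefix)
{
    return strncmp(name, prefix, strlen(prefix)) == 0;
}

/* Case-insensitive prefix match: "ipv4", "IPv4" and "IPV4" all match. */
bool prefix_match_nocase(const char *name, const char *prefix)
{
    return strncasecmp(name, prefix, strlen(prefix)) == 0;
}
```

A profile that spells a protocol name in lower case would be rejected by the
first helper and accepted by the second.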
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Beilei Xing [Fri, 19 Jan 2018 07:50:05 +0000 (15:50 +0800)]
net/i40e: add parser for IPv4/v6 frag
There are new metadata IPV4FRAG and IPV6FRAG in the PPP
profile. This patch improves the ptype parser to support
IPV4FRAG and IPV6FRAG.
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Beilei Xing [Fri, 19 Jan 2018 07:50:04 +0000 (15:50 +0800)]
net/i40e: fix fail to update packet type table
The SW ptype mapping table fails to be updated when loading a
PPP profile, even though the profile itself is loaded successfully.
This causes SW ptype parsing to fail when receiving
packets. This patch fixes the issue.
Fixes:
11556c915a08 ("net/i40e: improve packet type parser")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Beilei Xing [Fri, 19 Jan 2018 05:23:44 +0000 (13:23 +0800)]
net/i40e: fix flow director Rx resource defect
The FDIR Rx ring isn't initialized and the Rx queue HW tail isn't updated
when an error is detected while programming an FDIR flow. This is a
potential risk.
This patch updates the FDIR Rx resource.
Fixes:
a778a1fa2e4e ("i40e: set up and initialize flow director")
Fixes:
05999aab4ca6 ("i40e: add or delete flow director")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Matan Azrad [Thu, 18 Jan 2018 13:51:46 +0000 (13:51 +0000)]
net/vdev_netvsc: add automatic probing
Using DPDK in Hyper-V VM systems requires the vdev_netvsc driver to pair
the NetVSC netdev device with the PCI device that has the same MAC address,
by means of the fail-safe PMD.
Add vdev_netvsc custom scan in vdev bus to allow automatic probing in
Hyper-V VM systems unless it was already specified by command line.
Add "ignore" parameter to disable this auto-detection.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Matan Azrad [Thu, 18 Jan 2018 13:51:45 +0000 (13:51 +0000)]
net/vdev_netvsc: add force parameter
This parameter allows specifying any non-NetVSC interface or routed
NetVSC interfaces to use with tap sub-devices for development purposes.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
Matan Azrad [Thu, 18 Jan 2018 13:51:44 +0000 (13:51 +0000)]
net/vdev_netvsc: skip routed netvsc probing
NetVSC netdevices which are already routed should not be probed because
they are used for management purposes by Hyper-V.
Prevent probing of routed NetVSC devices.
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
Matan Azrad [Thu, 18 Jan 2018 13:51:43 +0000 (13:51 +0000)]
net/vdev_netvsc: implement core functionality
As described in more detail in the attached documentation (see patch
contents), this virtual device driver manages NetVSC interfaces in virtual
machines hosted by Hyper-V/Azure platforms.
This driver does not manage traffic nor Ethernet devices directly; it acts
as a thin configuration layer that automatically instantiates and controls
fail-safe PMD instances combining tap and PCI sub-devices, so that each
NetVSC interface is exposed as a single consolidated port to DPDK
applications.
PCI sub-devices being hot-pluggable (e.g. during VM migration),
applications automatically benefit from increased throughput when present
and automatic fallback on NetVSC otherwise without interruption thanks to
fail-safe's hot-plug handling.
Once initialized, the sole job of the vdev_netvsc driver is to regularly
scan for PCI devices to associate with NetVSC interfaces and feed their
addresses to corresponding fail-safe instances.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
Matan Azrad [Thu, 18 Jan 2018 13:51:42 +0000 (13:51 +0000)]
net/vdev_netvsc: introduce Hyper-V platform driver
This patch lays the groundwork for this driver (draft documentation,
copyright notices, code base skeleton and build system hooks). While it can
be successfully compiled and invoked, it's an empty shell at this stage.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
Matan Azrad [Thu, 18 Jan 2018 13:51:41 +0000 (13:51 +0000)]
net/failsafe: add probed device capture
Previous fail-safe code didn't support capturing already probed sub-devices
and failed when it tried to probe them.
Skip fail-safe sub-device probing when the sub-device was already probed.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Matan Azrad [Thu, 18 Jan 2018 13:51:40 +0000 (13:51 +0000)]
net/failsafe: add fd parameter
This parameter enables applications to provide device definitions
through an arbitrary file descriptor number.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Adrien Mazarguil [Thu, 18 Jan 2018 13:51:39 +0000 (13:51 +0000)]
net/failsafe: fix invalid free
rte_free() is not supposed to work with pointers returned by calloc().
Fixes:
a0194d828100 ("net/failsafe: add flexible device definition")
Cc: stable@dpdk.org
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Radu Nicolau [Fri, 19 Jan 2018 11:42:34 +0000 (11:42 +0000)]
ethdev: add security context API documentation
Added missing doxygen for rte_eth_dev_get_sec_ctx
and moved the declaration to the proper place.
Fixes:
4c270218aa26 ("ethdev: support security APIs")
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andy Moreton [Fri, 19 Jan 2018 06:47:06 +0000 (06:47 +0000)]
net/sfc/base: fix unused argument warning
The type_data argument to ef10_rx_qcreate is only used
in builds with EFSYS_OPT_RX_PACKED_STREAM. Note this as
an unused argument to avoid warnings in builds without
packed stream support.
Fixes:
b749646dade4 ("net/sfc/base: add function to create packed stream RxQ")
Signed-off-by: Andy Moreton <amoreton@solarflare.com>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Zhiyong Yang [Fri, 19 Jan 2018 02:21:11 +0000 (10:21 +0800)]
doc: fix typo in link bonding guide
Fix one typo and a grammatical mistake.
Fixes:
b0152b1b40fe ("doc: update bonding")
Cc: stable@dpdk.org
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Marko Kovacevic <marko.kovacevic@intel.com>
Yong Wang [Thu, 18 Jan 2018 11:48:56 +0000 (06:48 -0500)]
net/dpaa: fix potential memory leak
There are several function calls to rte_zmalloc() which don't do a null
pointer check on the return value. Also, on the error return paths, the
memory is not freed. Fix it by adding null pointer checks and rte_free().
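The check-and-release pattern described above looks roughly like the
following. This is a generic sketch, not the dpaa code: calloc() and free()
from the C library stand in for rte_zmalloc() and rte_free(), and the
port_queues structure and function names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-port state holding two separately allocated arrays. */
struct port_queues {
    uint64_t *rx;
    uint64_t *tx;
};

/* Returns 0 on success, -1 on allocation failure with nothing leaked. */
int port_queues_alloc(struct port_queues *q, size_t nb_rx, size_t nb_tx)
{
    q->rx = calloc(nb_rx, sizeof(*q->rx));
    if (q->rx == NULL)
        return -1;

    q->tx = calloc(nb_tx, sizeof(*q->tx));
    if (q->tx == NULL) {
        /* Release what was already allocated before returning. */
        free(q->rx);
        q->rx = NULL;
        return -1;
    }
    return 0;
}

void port_queues_free(struct port_queues *q)
{
    free(q->tx);
    free(q->rx);
    q->tx = NULL;
    q->rx = NULL;
}
```

Resetting the pointers to NULL after freeing keeps a later cleanup call from
freeing the same memory twice.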
Fixes:
37f9b54bd3cf ("net/dpaa: support Tx and Rx queue setup")
Fixes:
62f53995caaf ("net/dpaa: add frame count based tail drop with CGR")
Cc: stable@dpdk.org
Signed-off-by: Yong Wang <wang.yong19@zte.com.cn>
Reviewed-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Wenzhuo Lu [Fri, 19 Jan 2018 08:41:48 +0000 (16:41 +0800)]
net/avf: fix makefile typo
A typo in the makefile prevents the Rx/Tx vector code
from being compiled.
Fixes:
319c421f3890 ("net/avf: enable SSE Rx Tx")
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Yongseok Koh [Wed, 17 Jan 2018 17:44:13 +0000 (09:44 -0800)]
net/mlx5: fix handling link status event
Even when the link of a port goes down, the device can still receive
traffic. That is why mlx5_set_link_up/down() switches rx/tx_pkt_burst().
However, if the link goes down because of an external command (e.g.
ifconfig), this isn't effective. It is better to change the burst functions
when a link status change is detected.
Fixes:
62072098b54e ("mlx5: support setting link up or down")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Victor Kaplansky [Wed, 17 Jan 2018 13:49:25 +0000 (15:49 +0200)]
vhost: protect active rings from async ring changes
When performing live migration or memory hot-plugging,
the changes to the device and vrings made by the message handler
are done independently from vring usage by PMD threads.
This causes, for example, segfaults during live migration
with MQ enabled, but in general virtually any request
sent by QEMU that changes the state of the device can cause
problems.
These patches fix all of the above issues by adding a spinlock
to every vring and requiring the message handler to start operation
only after ensuring that all PMD threads related to the device
are out of the critical section accessing the vring data.
Each vring has its own lock in order not to create contention
between PMD threads of different vrings and to prevent
performance degradation when scaling the number of queue pairs.
See https://bugzilla.redhat.com/show_bug.cgi?id=1450680
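The per-vring locking scheme can be sketched with C11 atomics. This is an
illustration under stated assumptions, not the vhost library code: the vring
structure, its fields and the function names are hypothetical, and a C11
atomic_flag spinlock stands in for the DPDK spinlock.

```c
#include <stdatomic.h>

/* Hypothetical ring: one lock per ring, as described above. */
struct vring {
    atomic_flag lock;
    unsigned int size; /* state the message handler may change */
    unsigned int used; /* entries consumed by the data path */
};

void vring_init(struct vring *vq, unsigned int size)
{
    atomic_flag_clear(&vq->lock);
    vq->size = size;
    vq->used = 0;
}

static void vring_lock(struct vring *vq)
{
    while (atomic_flag_test_and_set_explicit(&vq->lock,
                                             memory_order_acquire))
        ; /* spin until the current holder releases the lock */
}

static void vring_unlock(struct vring *vq)
{
    atomic_flag_clear_explicit(&vq->lock, memory_order_release);
}

/* Data path (PMD thread): consume up to n entries inside the lock. */
unsigned int vring_burst(struct vring *vq, unsigned int n)
{
    unsigned int avail;

    vring_lock(vq);
    avail = vq->size - vq->used;
    if (n > avail)
        n = avail;
    vq->used += n;
    vring_unlock(vq);
    return n;
}

/* Control path (message handler): change ring state inside the lock. */
void vring_resize(struct vring *vq, unsigned int new_size)
{
    vring_lock(vq);
    vq->size = new_size;
    vq->used = 0;
    vring_unlock(vq);
}
```

Because each ring carries its own flag, threads polling different rings never
spin on the same lock; only a control message for a given ring contends with
the thread using that ring.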
Cc: stable@dpdk.org
Signed-off-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Junjie Chen [Wed, 17 Jan 2018 15:45:53 +0000 (10:45 -0500)]
vhost: fix mbuf free
Dequeue zero copy changes buf_addr and buf_iova of the mbuf and returns it
to the mbuf pool without restoring them. This breaks VM memory if others
allocate an mbuf from the same pool, since mbuf reset doesn't reset buf_addr
and buf_iova.
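One way to picture the problem and a possible remedy is to save the pool's
own buffer fields before redirecting them and to put them back before the
mbuf is released. The sketch below is hypothetical throughout: sketch_mbuf is
a stand-in for struct rte_mbuf, and the patch itself may restore the fields
differently.

```c
#include <stdint.h>

/* Stand-in for an mbuf, reduced to the two fields named above. */
struct sketch_mbuf {
    void *buf_addr;         /* buffer currently used for packet data */
    uint64_t buf_iova;      /* its IO address */
    void *pool_buf_addr;    /* saved buffer owned by the mbuf pool */
    uint64_t pool_buf_iova; /* saved IO address owned by the mbuf pool */
};

/* Zero copy: point the mbuf at guest memory, remembering the original. */
void zcopy_attach(struct sketch_mbuf *m, void *guest_addr,
                  uint64_t guest_iova)
{
    m->pool_buf_addr = m->buf_addr;
    m->pool_buf_iova = m->buf_iova;
    m->buf_addr = guest_addr;
    m->buf_iova = guest_iova;
}

/* Before returning the mbuf to its pool, restore the pool-owned buffer. */
void zcopy_release(struct sketch_mbuf *m)
{
    m->buf_addr = m->pool_buf_addr;
    m->buf_iova = m->pool_buf_iova;
}
```

Without the release step, the next user of the mbuf would write packet data
through a pointer that still refers to guest memory.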
Fixes:
b0a985d1f340 ("vhost: add dequeue zero copy")
Cc: stable@dpdk.org
Signed-off-by: Junjie Chen <junjie.j.chen@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Xiao Wang [Thu, 18 Jan 2018 02:20:38 +0000 (10:20 +0800)]
net/virtio: support guest announce
When live migration is done, for the backup VM, either the virtio
frontend or the vhost backend needs to send out a gratuitous RARP packet
to announce its new network location.
This patch enables the VIRTIO_NET_F_GUEST_ANNOUNCE feature to support the
live migration scenario where the vhost backend doesn't have the ability to
generate a RARP packet.
Brief introduction of the work flow:
1. QEMU finishes live migration, pokes the backup VM with an interrupt.
2. Virtio interrupt handler reads out the interrupt status value, and
realizes it needs to send out RARP packet to announce its location.
3. Pause device to stop worker thread touching the queues.
4. Inject a RARP packet into a Tx Queue.
5. Ack the interrupt via control queue.
6. Resume device to continue packet processing.
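The packet injected in step 4 is a small fixed-size frame. The sketch below
builds one into a caller-supplied buffer following the RARP format (EtherType
0x8035, opcode 3 "request reverse"). It is an illustration, not the DPDK
helper: make_rarp_frame() is a hypothetical name, and the choice of a
broadcast destination with zeroed protocol addresses is an assumption of the
example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RARP_FRAME_LEN 42u /* 14-byte Ethernet header + 28-byte ARP body */

/*
 * Write a RARP request announcing `mac` into `buf`.
 * Returns the frame length, or 0 if an argument is invalid or the
 * buffer has too little room.
 */
size_t make_rarp_frame(uint8_t *buf, size_t room, const uint8_t mac[6])
{
    /* htype = 1 (Ethernet), ptype = 0x0800 (IPv4), hlen = 6, plen = 4,
     * opcode = 3 (request reverse). */
    static const uint8_t arp_fixed[8] = {
        0x00, 0x01, 0x08, 0x00, 6, 4, 0x00, 0x03
    };

    if (buf == NULL || mac == NULL || room < RARP_FRAME_LEN)
        return 0;

    memset(buf, 0, RARP_FRAME_LEN);
    memset(buf, 0xff, 6);       /* destination: broadcast */
    memcpy(buf + 6, mac, 6);    /* source: our MAC address */
    buf[12] = 0x80;             /* EtherType 0x8035 (RARP) */
    buf[13] = 0x35;
    memcpy(buf + 14, arp_fixed, sizeof(arp_fixed));
    memcpy(buf + 22, mac, 6);   /* sender hardware address */
    /* bytes 28..31: sender protocol address, left as zero */
    memcpy(buf + 32, mac, 6);   /* target hardware address */
    /* bytes 38..41: target protocol address, left as zero */
    return RARP_FRAME_LEN;
}
```

Checking the available room before writing mirrors the kind of parameter and
space validation a packet-building helper needs before it touches the buffer.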
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Xiao Wang [Thu, 18 Jan 2018 02:32:24 +0000 (10:32 +0800)]
net: fix RARP generation
Due to a mistake on my part, an older version (v10) was merged to the
master branch, whereas v11 is the version that should have been applied.
However, the master branch is not rebase-able. Thus, this patch is made from
the diff between v10 and v11.
The diffs are:
- Add check for parameter and tailroom in rte_net_make_rarp_packet
- Allocate mbuf in rte_net_make_rarp_packet
Besides that, a link error is fixed when shared lib is enabled.
Fixes:
45ae05df824c ("net: add a helper for making RARP packet")
Fixes:
c3ffdba0e88a ("vhost: use API to make RARP packet")
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Yuanhan Liu <yliu@fridaylinux.org>
Junjie Chen [Mon, 15 Jan 2018 11:32:19 +0000 (06:32 -0500)]
vhost: do deep copy while reallocating queue
When vhost reallocates dev and vq in the NUMA-enabled case, it doesn't
perform a deep copy, which leads to 1) an invalid zmbuf list and 2) remote
memory access.
This patch re-initializes the zmbuf list and also does the deep copy.
Signed-off-by: Junjie Chen <junjie.j.chen@intel.com>
Reviewed-by: Zhiyong Yang <zhiyong.yang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Maciej Czekaj [Thu, 18 Jan 2018 13:06:13 +0000 (14:06 +0100)]
net/thunderx: convert to new offload API
This patch removes all references to old-style offload API
replacing them with new offload flags.
Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Tomasz Duszynski [Thu, 18 Jan 2018 10:57:37 +0000 (11:57 +0100)]
net/mrvl: allow adding MAC address before port init
Since DPDK restores the ether address configuration after the device
is started, it is safe to add an ether address to an uninitialized port
(ppio).
Fixes:
c0511a8f741f ("net/mrvl: check if ppio is initialized")
Cc: stable@dpdk.org
Signed-off-by: Tomasz Duszynski <tdu@semihalf.com>