Jakub Grajciar [Thu, 6 Jun 2019 11:38:50 +0000 (13:38 +0200)]
net/memif: introduce memory interface PMD
Shared memory packet interface (memif) PMD allows for DPDK and any other
client using memif (DPDK, VPP, libmemif) to communicate using shared
memory. The created device transmits packets in a raw format. It can be
used with Ethernet mode, IP mode, or Punt/Inject. At this moment, only
Ethernet mode is supported in DPDK memif implementation. Memif is Linux
only.
Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Maxime Coquelin [Wed, 5 Jun 2019 10:00:38 +0000 (12:00 +0200)]
net/virtio: fix segment length in mergeable packed Rx
Head segment data_len field is wrongly summed with the length
of all the segments of the chain, whereas it should be the
length of the first segment only.
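The invariant behind this fix can be sketched as follows. This is a minimal illustration, not the driver code: `struct seg` is a hypothetical, simplified stand-in for an mbuf, and `chain_append` is an invented helper. Only the head's packet-wide fields grow when a segment is attached; its `data_len` keeps describing the first segment alone.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an mbuf segment. */
struct seg {
	struct seg *next;   /* next segment in the chain, or NULL */
	uint16_t data_len;  /* bytes held by this segment only */
	uint32_t pkt_len;   /* bytes in the whole packet (meaningful on head) */
	uint16_t nb_segs;   /* segments in the chain (meaningful on head) */
};

/* Attach 'tail' to the end of the chain starting at 'head'.
 * The head's pkt_len and nb_segs are updated; its data_len is not. */
static void
chain_append(struct seg *head, struct seg *tail)
{
	struct seg *last = head;

	while (last->next != NULL)
		last = last->next;
	last->next = tail;
	tail->next = NULL;

	head->pkt_len += tail->data_len;
	head->nb_segs++;
}
```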
Maxime Coquelin [Wed, 5 Jun 2019 10:00:37 +0000 (12:00 +0200)]
net/virtio: fix mergeable Rx with segmented packet
After having dequeued a burst of descriptors, there may be a
need to dequeue a few more if the last packet was segmented
and not complete. When it happens, the extra segments were
not properly attached to the mbuf chain, and so were lost.
Also, head segment data_len field is wrongly summed with
the length of all the segments of the chain.
This patch fixes both the mbuf chaining and head segment's
data_len field.
Fixes: bcac5aa207f8 ("net/virtio: improve batching in mergeable path") Cc: stable@dpdk.org Reported-by: Yaroslav Brustinov <ybrustin@cisco.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Maxime Coquelin [Wed, 5 Jun 2019 10:00:36 +0000 (12:00 +0200)]
net/virtio: fix in-order Rx with segmented packet
After having dequeued a burst of descriptors, there may be a
need to dequeue a few more if the last packet was segmented
and not complete. When it happens, the extra segments were
not properly attached to the mbuf chain, and so were lost.
Also, head segment data_len field is wrongly summed with
the length of all the segments of the chain.
This patch fixes both the mbuf chaining and head segment's
data_len field.
Fixes: e5f456a98d3c ("net/virtio: support in-order Rx and Tx") Cc: stable@dpdk.org Reported-by: Yaroslav Brustinov <ybrustin@cisco.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Maxime Coquelin [Wed, 29 May 2019 13:04:20 +0000 (15:04 +0200)]
eal/x86: force inlining of all memcpy and mov helpers
Some helpers in the header file are force-inlined while others are
only marked inline; this patch forces inlining for all of them.
This avoids the helpers being emitted as out-of-line functions when
they are called multiple times in the same object file. For example,
when packed ring support was added to the vhost-user library,
rte_memcpy_generic was no longer inlined.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
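The difference between a plain inline hint and forced inlining can be sketched as below. This assumes a GCC- or clang-compatible compiler; `SKETCH_ALWAYS_INLINE` and both helpers are names invented for the illustration and are not the rte_memcpy helpers themselves.

```c
#include <stddef.h>
#include <stdint.h>

/* Local macro for this sketch; assumes GCC/clang attribute syntax. */
#define SKETCH_ALWAYS_INLINE inline __attribute__((always_inline))

/* Plain 'inline' is only a hint: the compiler may still emit an
 * out-of-line copy when the helper is called from many places. */
static inline void
copy16_hint(uint8_t *dst, const uint8_t *src)
{
	for (size_t i = 0; i < 16; i++)
		dst[i] = src[i];
}

/* always_inline asks the compiler to expand the body at each call site. */
static SKETCH_ALWAYS_INLINE void
copy16_forced(uint8_t *dst, const uint8_t *src)
{
	for (size_t i = 0; i < 16; i++)
		dst[i] = src[i];
}
```

Both helpers behave identically; only the code generation differs, which is why the change is invisible to callers.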
Maxime Coquelin [Wed, 29 May 2019 13:04:18 +0000 (15:04 +0200)]
vhost: do not inline unlikely fragmented buffers code
Handling of fragmented virtio-net header and indirect descriptors
tables was implemented to fix CVE-2018-1059. It should never
happen with healthy guests and so is already considered as
unlikely code path.
This patch moves these bits into non-inline dedicated functions
to reduce the I-cache pressure.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
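A minimal sketch of the pattern, assuming GCC/clang attributes: the rare fragmented case is moved into a `noinline` function so that the inlined hot path stays small. The function names, the `SKETCH_UNLIKELY` macro and the two-buffer layout are hypothetical and much simpler than the vhost code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)

/* Rare path: the header is split across two buffers. Kept out of line
 * so that it does not bloat every caller of copy_hdr(). */
static __attribute__((noinline)) void
copy_hdr_fragmented(uint8_t *hdr, size_t hdr_len,
		    const uint8_t *buf0, size_t len0, const uint8_t *buf1)
{
	memcpy(hdr, buf0, len0);
	memcpy(hdr + len0, buf1, hdr_len - len0);
}

/* Hot path: the header fits in the first buffer. */
static inline void
copy_hdr(uint8_t *hdr, size_t hdr_len,
	 const uint8_t *buf0, size_t len0, const uint8_t *buf1)
{
	if (SKETCH_UNLIKELY(len0 < hdr_len)) {
		copy_hdr_fragmented(hdr, hdr_len, buf0, len0, buf1);
		return;
	}
	memcpy(hdr, buf0, hdr_len);
}
```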
David Harton [Thu, 16 May 2019 18:28:03 +0000 (14:28 -0400)]
net/i40e: eliminate weak symbols in data path
Use of weak symbols can hide makefile errors, especially when
custom makefiles are used. Remove the use of weak symbols
to avoid a stub function being linked in production code.
Signed-off-by: David Harton <dharton@cisco.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
William Tu [Fri, 31 May 2019 16:52:42 +0000 (09:52 -0700)]
net/af_xdp: fix remove path
When users call rte_eth_dev_close() and rte_dev_remove(), the af_xdp
pmd returns -1 (EPERM) due to eth_dev == NULL.
Since the af_xdp pmd driver advertises RTE_ETH_DEV_CLOSE_REMOVE, all
the resources are freed on rte_eth_dev_close(). rte_dev_remove() tries
to detach device and subsequently calls rte_pmd_af_xdp_remove() that
tries to free already freed resources and fails.
Fix it by returning success.
Fixes: f1debd77efaf ("net/af_xdp: introduce AF_XDP PMD") Cc: stable@dpdk.org
Reported-at: https://patchwork.ozlabs.org/patch/1106528/ Suggested-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
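The shape of the fix can be sketched like this. The types and the function are hypothetical stand-ins; the real remove callback looks the port up by name, whereas here the lookup result is passed in directly and NULL models a port already released by close.

```c
#include <stddef.h>

/* Hypothetical stand-in for the per-port state. */
struct sketch_dev {
	int resources_freed;
};

/* Remove handler: a NULL device means close() already released
 * everything, so there is nothing left to do and that is not an error. */
static int
sketch_remove(struct sketch_dev *eth_dev)
{
	if (eth_dev == NULL)
		return 0; /* previously this case would have returned -1 */

	eth_dev->resources_freed = 1;
	return 0;
}
```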
Ajit Khaparde [Wed, 29 May 2019 21:02:24 +0000 (17:02 -0400)]
net/bnxt: fix RSS RETA indirection table ops
We are trying to update the indirection table for all the VNICs.
We should update the table only for the default vnic0.
Fix the reta update function to only update table entries that are
selected by the update mask. Translate queue number to firmware
group ID when updating an entry.
Fix reta query op to only return table entries as identified by the
provided mask. Translate firmware group IDs to queue numbers.
Removed extraneous code from bnxt_reta_query_op().
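A minimal sketch of a mask-driven update, under stated assumptions: `struct sketch_reta_group` is a simplified model of one 64-entry group with a selection mask, and `queue_to_fw_grp` is a made-up translation (the real mapping comes from the driver's ring group data).

```c
#include <stdint.h>

#define SKETCH_GROUP_SIZE 64

/* Simplified model of one RETA group: bit i of 'mask' selects reta[i]. */
struct sketch_reta_group {
	uint64_t mask;
	uint16_t reta[SKETCH_GROUP_SIZE];
};

/* Hypothetical queue number -> firmware group ID translation. */
static uint16_t
queue_to_fw_grp(uint16_t queue)
{
	return (uint16_t)(0x100 + queue);
}

/* Write only the entries selected by the mask; leave the rest alone. */
static void
reta_update(uint16_t *hw_table, const struct sketch_reta_group *conf,
	    unsigned int nb_groups)
{
	for (unsigned int g = 0; g < nb_groups; g++) {
		for (unsigned int i = 0; i < SKETCH_GROUP_SIZE; i++) {
			if (!(conf[g].mask & (UINT64_C(1) << i)))
				continue;
			hw_table[g * SKETCH_GROUP_SIZE + i] =
				queue_to_fw_grp(conf[g].reta[i]);
		}
	}
}
```

A query would walk the same mask in the opposite direction, translating stored group IDs back to queue numbers.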
Ajit Khaparde [Wed, 29 May 2019 21:02:20 +0000 (17:02 -0400)]
net/bnxt: update release notes
Update release doc briefly describing updates to bnxt PMD done during
19.05 release, including transmit optimization changes in the commits
identified by the "Fixes:" tags below.
When a device is being closed and tries to unregister its interrupt
callback, there is a chance the handler is still active (called in the
context of the eal_intr_thread_main thread). If so,
rte_intr_callback_unregister returns -EAGAIN and keeps the handler
registered, causing a crash once the underlying resource is gone.
This race condition may happen if event handling in application takes
a long time. We should check the return code of unregistering routine
and try again to unregister the handler. The diagnostic messages are
shown once a second, while trying to unregister.
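The retry logic can be sketched as below. The unregister routine is abstracted behind a function pointer, `fake_unregister` is a test stand-in that reports "busy" a given number of times, and `max_tries` is an assumption added to keep the sketch bounded; none of these names come from the driver.

```c
#include <errno.h>

typedef int (*unregister_fn)(void *arg);

/* Retry while the handler is still running (-EAGAIN). Any other
 * return value, success or failure, ends the loop. */
static int
unregister_with_retry(unregister_fn fn, void *arg, unsigned int max_tries)
{
	int ret = -EAGAIN;

	for (unsigned int tries = 0; tries < max_tries; tries++) {
		ret = fn(arg);
		if (ret != -EAGAIN)
			break;
		/* A real caller would wait here and log periodically. */
	}
	return ret;
}

/* Stand-in: 'arg' points to the number of times to report busy.
 * Returns 1 (one callback removed) once the handler is idle. */
static int
fake_unregister(void *arg)
{
	int *busy = arg;

	if (*busy > 0) {
		(*busy)--;
		return -EAGAIN;
	}
	return 1;
}
```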
Dekel Peled [Wed, 15 May 2019 10:07:45 +0000 (13:07 +0300)]
net/mlx5: fix order of items in NEON scatter
Previous patch added handling of metadata for multi-segment packet.
Function txq_scatter_v in file mlx5_rxtx_vec_neon.h was updated
incorrectly, items were inserted into WQE in wrong order.
This patch fixes the issue, inserting items into WQE correctly.
Tom Barbette [Thu, 2 May 2019 12:11:35 +0000 (14:11 +0200)]
examples/rxtx_callbacks: support HW timestamp
Use rxtx callback to demonstrate a way to use rte_eth_read_clock to
convert the hardware timestamps to an amount of cycles.
This makes it possible to measure the time a packet has spent since it
entered the device, whereas the regular latency only measures the time
since the packet entered the software stack.
Signed-off-by: Tom Barbette <barbette@kth.se> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tom Barbette [Thu, 2 May 2019 12:11:34 +0000 (14:11 +0200)]
net/mlx5: support reading clock
Implement read_clock support for the mlx5 driver. mlx5 supports
hardware timestamp offload, setting the packet's timestamp field to the
device clock. rte_eth_read_clock reads the device's current clock
value, so that values can be compared on the same time base.
See rxtx_callbacks for an example.
Signed-off-by: Tom Barbette <barbette@kth.se> Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Tom Barbette [Thu, 2 May 2019 12:11:33 +0000 (14:11 +0200)]
ethdev: add API to read device clock
Add rte_eth_read_clock to read the raw clock of a device.
The main use is to get the device clock conversion co-efficients to be
able to translate the raw clock of the timestamp field of the pkt mbuf
to a local synced time value.
This function was missing; it lets users convert the Rx timestamp
field to real time without the complexity of the rte_timesync* facility.
The clock frequency can be derived by calling read_clock twice, which
then allows keeping a common time base.
Signed-off-by: Tom Barbette <barbette@kth.se> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
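The arithmetic behind "call read_clock twice" can be sketched as follows. The two samples are assumed to have been taken already, each pairing a raw device clock value with a host time in nanoseconds; the function names are hypothetical and the sketch makes no DPDK calls.

```c
#include <stdint.h>

/* Estimate the device clock frequency in Hz from two samples, each a
 * (raw device ticks, host time in ns) pair. Returns 0.0 if the host
 * time did not advance between the samples. */
static double
estimate_clock_hz(uint64_t ticks0, uint64_t ns0,
		  uint64_t ticks1, uint64_t ns1)
{
	if (ns1 == ns0)
		return 0.0;
	return (double)(ticks1 - ticks0) * 1e9 / (double)(ns1 - ns0);
}

/* Convert a raw timestamp to host nanoseconds, using the first
 * sample as the common time base. */
static uint64_t
ticks_to_ns(uint64_t ts, uint64_t ticks0, uint64_t ns0, double hz)
{
	return ns0 + (uint64_t)((double)(ts - ticks0) * 1e9 / hz);
}
```

A longer interval between the two samples gives a more accurate frequency estimate, at the cost of a slower start-up.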
Jerin Jacob [Tue, 11 Jun 2019 14:15:03 +0000 (19:45 +0530)]
acl: fix build with some arm64 compiler
Some compilers report the following error, even though the existing
code has no uninitialized variable.
To keep the compiler happy, initialize the int32x4_t variable
in one shot using vdupq_n_s32.
lib/librte_acl/acl_run_neon.h: In function 'search_neon_4'
lib/librte_acl/acl_run_neon.h:230:12: error:
'input' may be used uninitialized in this function
int32x4_t input;
Ruifeng Wang [Mon, 29 Apr 2019 10:02:07 +0000 (18:02 +0800)]
hash: simplify signature compare with NEON
Replace multiple NEON instructions with a single equivalent instruction.
This simplifies the code and slightly improves performance.
Hash bulk lookup had 0.1% ~ 3% performance gain in tests on ARM A72
platforms.
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Acked-by: Yipeng Wang <yipeng1.wang@intel.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Bruce Richardson [Fri, 12 Apr 2019 08:29:00 +0000 (09:29 +0100)]
build: generate Windows exports file
Rather than having a separate version.map file for linux/BSD and an
exports definition file for windows for each library, generate the
latter from the former automatically at build time.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
From clang 6.0 onwards, an external function call generates a
BPF_PSEUDO_CALL instruction:
call pseudo +-off -> call another bpf function.
More details about that change: https://lwn.net/Articles/741773/
DPDK BPF implementation right now doesn't support multiple BPF
functions per module.
To overcome that problem, and preserve existing functionality
(ability to call allowed by user external functions),
bpf_elf_load() clears EBPF_PSEUDO_CALL value.
For details on how to reproduce the issue:
https://bugs.dpdk.org/show_bug.cgi?id=259
Fixes: 5dba93ae5f2d ("bpf: add ability to load eBPF program from ELF object file") Cc: stable@dpdk.org Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
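What clearing the pseudo-call marker means at the instruction level can be sketched as below. The struct mirrors the standard 8-byte eBPF instruction layout, and the constants follow the Linux eBPF encoding (call opcode 0x85, marker value 1 in src_reg); the `sketch_` names are local to this illustration, and the real loader does this while processing ELF relocations rather than by scanning the whole program.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard eBPF instruction layout. */
struct sketch_insn {
	uint8_t code;
	uint8_t dst_reg:4;
	uint8_t src_reg:4;
	int16_t off;
	int32_t imm;
};

#define SKETCH_OP_CALL     0x85 /* BPF_JMP | BPF_CALL */
#define SKETCH_PSEUDO_CALL 1    /* src_reg value marking a bpf-to-bpf call */

/* Turn every pseudo call back into a plain external call by clearing
 * src_reg. Returns the number of instructions changed. */
static size_t
clear_pseudo_calls(struct sketch_insn *ins, size_t n)
{
	size_t changed = 0;

	for (size_t i = 0; i < n; i++) {
		if (ins[i].code == SKETCH_OP_CALL &&
		    ins[i].src_reg == SKETCH_PSEUDO_CALL) {
			ins[i].src_reg = 0;
			changed++;
		}
	}
	return changed;
}
```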
Bruce Richardson [Wed, 10 Apr 2019 13:45:17 +0000 (14:45 +0100)]
bpf: remove use of weak functions
Weak functions don't work well with static libraries and require the use of
"whole-archive" flag to ensure that the correct function is used when
linking. Since the weak function is only used as a placeholder within this
library alone, we can replace it with a non-weak version protected using
preprocessor ifdefs.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Bruce Richardson [Wed, 10 Apr 2019 13:45:16 +0000 (14:45 +0100)]
acl: remove use of weak functions
Weak functions don't work well with static libraries and require the use of
"whole-archive" flag to ensure that the correct function is used when
linking. Since the weak functions are only used as placeholders within
this library alone, we can replace them with non-weak functions using
preprocessor ifdefs.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
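Replacing a weak placeholder with an ifdef-guarded one can be sketched as follows. `SKETCH_HAVE_VECTOR` is a hypothetical build-time define and all function names are invented; the point is that exactly one strong definition exists in any build, so the result no longer depends on link order or on "whole-archive".

```c
#include <errno.h>

#ifdef SKETCH_HAVE_VECTOR
/* The real implementation is compiled from another source file. */
int sketch_sum_vector(const int *v, int n, int *out);
#else
/* Non-weak placeholder, compiled only when the vector path is absent. */
static int
sketch_sum_vector(const int *v, int n, int *out)
{
	(void)v;
	(void)n;
	(void)out;
	return -ENOTSUP;
}
#endif

static int
sketch_sum_scalar(const int *v, int n, int *out)
{
	int sum = 0;

	for (int i = 0; i < n; i++)
		sum += v[i];
	*out = sum;
	return 0;
}

/* Try the optimized path first and fall back to the scalar one. */
static int
sketch_sum(const int *v, int n, int *out)
{
	int ret = sketch_sum_vector(v, n, out);

	if (ret == -ENOTSUP)
		ret = sketch_sum_scalar(v, n, out);
	return ret;
}
```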
Each hash entry has a pointer to one uint32 memory location.
However, all the readers increment the same location causing
race conditions. Allocate memory for each thread so that each
thread will increment its own memory location.
Fixes: b87089b0bb19 ("test/rcu: add API and functional tests") Cc: stable@dpdk.org Reported-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com> Tested-by: David Marchand <david.marchand@redhat.com>
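The data layout change can be sketched as below: instead of one shared counter per entry, each entry owns an array with one slot per reader thread. All names are hypothetical, and the sketch shows only the layout; it does not start threads.

```c
#include <stdint.h>
#include <stdlib.h>

/* One counter slot per reader thread, so no two threads ever
 * increment the same memory location. */
struct sketch_entry {
	uint32_t *per_thread;
	unsigned int nb_threads;
};

static int
entry_init(struct sketch_entry *e, unsigned int nb_threads)
{
	e->per_thread = calloc(nb_threads, sizeof(*e->per_thread));
	if (e->per_thread == NULL)
		return -1;
	e->nb_threads = nb_threads;
	return 0;
}

/* Called by reader 'thread_id' only; touches that thread's slot. */
static void
entry_hit(struct sketch_entry *e, unsigned int thread_id)
{
	e->per_thread[thread_id]++;
}

/* Sum the slots once all readers have finished. */
static uint64_t
entry_total(const struct sketch_entry *e)
{
	uint64_t total = 0;

	for (unsigned int i = 0; i < e->nb_threads; i++)
		total += e->per_thread[i];
	return total;
}
```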
Compiling error:
.../aarch64-linux-gnu/bin/ld: lib/librte_eal.so.10.1: version node
not found for symbol numa_run_on_node_mask@@libnuma_1.2
.../aarch64-linux-gnu/bin/ld: failed to set dynamic section sizes:
Bad value
collect2: error: ld returned 1 exit status
[58/1370] Compiling C object 'lib/76b5a35@@rte_cmdline@sta/
librte_cmdline_cmdline_parse_string.c.o'.
ninja: build stopped: subcommand failed.
Fixes: 01add9da25cd ("doc: add cross compiling guide") Cc: stable@dpdk.org Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
Bruce Richardson [Wed, 29 May 2019 16:19:32 +0000 (17:19 +0100)]
doc: update quick start guide for meson
The build-sdk-meson.txt file is a little out of date, so update it with
information on the latest build requirements, and remove any content
no longer needed.
Since the cross-compilation file quoted in the document is now considerably
longer and more complex than before, replace the contents of the file
with a summary of it instead. This is shorter and more maintainable, and
the original file is available as part of the repo anyway if the user wants
to view it.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Currently, unregister will be attempted even if IPC wasn't
supported in the first place. It is harmless, but for
consistency reasons, update the unregister API call to
exit early when IPC is not supported.
Currently, IPC API will silently ignore unsupported IPC.
Fix the API call and its callers to explicitly handle
unsupported IPC cases.
For primary processes, it is OK to not have IPC because
there may not be any secondary processes in the first place,
and there are valid use cases that disable IPC support, so
all primary process usages are fixed up to ignore IPC
failures.
For secondary processes, IPC will be crucial, so leave all
of the error handling as is.
Xiaolong Ye [Thu, 16 May 2019 07:28:56 +0000 (15:28 +0800)]
ring: remove unnecessary forward declaration
As memzone.h is introduced by
commit 38c9817ee1d8 ("mempool: adjust name size in related data types"),
forward declaration for rte_memzone is no longer needed.
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>
Nicolas Dichtel [Thu, 23 May 2019 09:52:31 +0000 (11:52 +0200)]
mem: ease init in a docker container
move_pages() is only used to get the numa node id, but this function
is not allowed by default in docker (it needs CAP_SYS_NICE and an update of
the seccomp profile).
get_mempolicy() also requires CAP_SYS_NICE but doesn't need any change in
the default seccomp profile.
Note that the returned value of move_pages() was not checked, thus some
errors could be hidden (if the requested id was 0).
Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime") Cc: stable@dpdk.org Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com> Reviewed-by: Didier Pallard <didier.pallard@6wind.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Ali Alnubani [Thu, 23 May 2019 17:40:38 +0000 (17:40 +0000)]
examples: use child directory as name
This would allow correctly naming an application residing
in a subdirectory. For example, if the example is set to 'path/to/app',
then the name would be 'app'.
This doesn't affect the naming of an example that isn't in a subdirectory.
Signed-off-by: Ali Alnubani <alialnu@mellanox.com> Acked-by: Luca Boccassi <bluca@debian.org>
Liron Himi [Sat, 18 May 2019 21:10:54 +0000 (00:10 +0300)]
config: add Marvell ARMADA based on armv8-a
This patch introduces an armada target to address the differences
in the number of cores and the lack of NUMA support.
Signed-off-by: Liron Himi <lironh@marvell.com> Reviewed-by: Alan Winkowski <walan@marvell.com> Reviewed-by: Jerin Jacob <jerinj@marvell.com> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Gavin Hu <gavin.hu@arm.com>
Yongseok Koh [Tue, 7 May 2019 21:11:40 +0000 (14:11 -0700)]
config: disable armv8 crypto extension
For the armv8 crypto extension, the make build always enables it by default
as long as the compiler supports the feature, while the meson build enables
it only for the 'default' machine of the generic armv8 architecture.
It is known that not all the armv8 platforms have the crypto extension. For
example, Mellanox BlueField has a variant which doesn't have it. If crypto
enabled binary runs on such a platform, rte_eal_init() fails.
'+crypto' flag currently implies only '+aes' and '+sha2' and enabling it
will generate the crypto instructions only when crypto intrinsics are used.
For the devices supporting 8.2 crypto or newer, compiler could generate
such instructions beyond intrinsics or asm code. For example, compiler can
generate 3-way exclusive OR instructions if sha3 is supported. However, it
has to be enabled by adding '+sha3' as of today.
In DPDK, the armv8 cryptodev is the only component that requires crypto
support. Since it uses an external Marvell library, which is compiled outside
DPDK with crypto support, and there is a run-time check for the required
cpuflags, crypto support can be disabled in DPDK.
Yongseok Koh [Thu, 2 May 2019 09:07:53 +0000 (02:07 -0700)]
bus/pci: add Mellanox kernel driver type
When checking RTE_PCI_DRV_IOVA_AS_VA flag to determine IOVA mode,
pci_one_device_has_iova_va() returns true only if kernel driver of the
device is vfio. However, Mellanox mlx4/5 PMD doesn't need to be detached
from kernel driver and attached to VFIO/UIO. Control path still goes
through the existing kernel driver, which is mlx4_core/mlx5_core. In order
to make RTE_PCI_DRV_IOVA_AS_VA effective for mlx4/mlx5 PMD, a new kernel
driver type has to be introduced.
Bruce Richardson [Tue, 14 May 2019 13:37:02 +0000 (14:37 +0100)]
eal/x86: check rdrand and rdseed
The meson build never checked for the presence of rdrand and rdseed
instructions, while make build never checked for rdseed. Ensure builds
always have the appropriate checks - and therefore defines - for these
instructions. For runtime, we also add in rdseed to the list of known
bits returned from cpuid() instruction, so we can confirm its presence at
application init time.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Tested-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
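A run-time check of this kind can be sketched with the compiler's `<cpuid.h>` helpers, assuming GCC or clang. The bit positions are the architectural ones (RDRAND: leaf 1, ECX bit 30; RDSEED: leaf 7 subleaf 0, EBX bit 18); the function names are hypothetical, and on non-x86 targets the sketch simply reports the features as absent.

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#define SKETCH_X86 1
#endif

/* Returns 1 if the CPU reports RDRAND, 0 otherwise. */
static int
cpu_has_rdrand(void)
{
#ifdef SKETCH_X86
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return 0;
	return (ecx >> 30) & 1;
#else
	return 0;
#endif
}

/* Returns 1 if the CPU reports RDSEED, 0 otherwise. */
static int
cpu_has_rdseed(void)
{
#ifdef SKETCH_X86
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return 0;
	return (ebx >> 18) & 1;
#else
	return 0;
#endif
}
```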
Bruce Richardson [Wed, 15 May 2019 11:38:47 +0000 (12:38 +0100)]
build: warn on unused parameter
To improve code quality we want to turn on as many warnings as we can in
the DPDK code, so turn on the "unused-parameter" warning in meson builds to
match that of the make builds. To ensure correct compilation, disable the
warning selectively for driver base code that otherwise would have issues.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org>
Bruce Richardson [Tue, 28 May 2019 11:07:48 +0000 (12:07 +0100)]
build: add libatomic dependency for 32-bit clang
When compiling with clang on 32-bit platforms, we are missing copies
of 64-bit atomic functions. We can solve this by linking against
libatomic for the drivers and libs which need those atomic ops.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org>
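The kind of operation involved can be sketched as below, assuming the GCC/clang `__atomic` builtins. On 64-bit targets this compiles to inline instructions; on some 32-bit targets the compiler may instead emit a call to a helper routine, and libatomic is one provider of such routines. The function name is hypothetical.

```c
#include <stdint.h>

/* Atomically add 'n' to a 64-bit counter and return the new value. */
static uint64_t
counter_add(uint64_t *counter, uint64_t n)
{
	return __atomic_add_fetch(counter, n, __ATOMIC_SEQ_CST);
}
```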
Bruce Richardson [Tue, 28 May 2019 11:07:47 +0000 (12:07 +0100)]
mem: mark unused function in 32-bit builds
The get_socket_mem_size() function is only used in 64-bit builds,
causing clang to warn about it for 32-bit builds. Add the __rte_unused
attribute to the function to silence the warning.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org>
Bruce Richardson [Tue, 28 May 2019 11:07:46 +0000 (12:07 +0100)]
build: remove unnecessary large file support defines
Since we now always use _FILE_OFFSET_BITS=64 flag when building
DPDK, we can remove the Makefile and C-file #defines setting it
individually for parts of the build.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org>
Bruce Richardson [Tue, 28 May 2019 11:07:45 +0000 (12:07 +0100)]
build: enable large file support on 32-bit
By default on 32-bit systems, file offsets are given as 32-bit values
which prevents support for large files. While this is unlikely to be
a problem, enabling large file support globally makes "make" and
"meson" builds consistent, since meson always enables large file
support, and without this change, the size of "struct stat" fields
will be different between the two builds.
The only location where this appears to be significant is in the
dpaax common code, where a printf needs to be updated for 32-bit
builds.
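One portable way to print a file size whose width depends on the build is sketched below; it is not necessarily what the dpaax change does. The define must come before any system header to take effect, and `format_size` is a hypothetical helper. Casting to `intmax_t` and using `%jd` avoids matching the format string to the exact width of `off_t`.

```c
#define _FILE_OFFSET_BITS 64 /* must precede every system header */

#include <inttypes.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Format a file size so that the same code is correct whether off_t
 * is 32 or 64 bits wide. Returns the snprintf() result. */
static int
format_size(char *buf, size_t len, off_t size)
{
	return snprintf(buf, len, "%jd", (intmax_t)size);
}
```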
The fields of the internal EAL core configuration are currently
laid bare as part of the API. This is not good practice and limits
fixing issues with layout and sizes.
Make new accessor functions for the fields used by current drivers
and examples.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David Marchand <david.marchand@redhat.com>
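The accessor approach can be sketched as follows. The structure, table contents and function names are all hypothetical; the point is that callers go through functions, so the layout of the configuration can later change without breaking them.

```c
#include <stddef.h>

#define SKETCH_MAX_LCORE 4

/* Internal layout: free to change as long as the accessors keep working. */
struct sketch_lcore_config {
	unsigned int socket_id;
	int core_index; /* -1 if the lcore is not enabled */
};

static struct sketch_lcore_config sketch_lcores[SKETCH_MAX_LCORE] = {
	{ 0, 0 }, { 0, 1 }, { 1, 2 }, { 1, -1 },
};

/* Returns the index of the lcore, or -1 if out of range or disabled. */
static int
sketch_lcore_index(unsigned int lcore_id)
{
	if (lcore_id >= SKETCH_MAX_LCORE)
		return -1;
	return sketch_lcores[lcore_id].core_index;
}

/* Returns the socket of the lcore, or -1 if out of range. */
static int
sketch_lcore_socket(unsigned int lcore_id)
{
	if (lcore_id >= SKETCH_MAX_LCORE)
		return -1;
	return (int)sketch_lcores[lcore_id].socket_id;
}
```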
David Marchand [Wed, 22 May 2019 15:06:56 +0000 (17:06 +0200)]
test/hash: use existing lcore API
Prefer the existing APIs over direct access to the configuration
structure.
test_hash_multi_add_lookup() currently starts n readers and N writers
using rte_eal_remote_launch().
It then waits for the N writers to complete with a custom
multi_writer_done[] array to synchronise over.
Take the opportunity to use rte_eal_wait_lcore() so that the code is
more straightforward:
- we start n readers with rte_eal_remote_launch(),
- we start N writers with rte_eal_remote_launch(),
- we wait for N writers to join with rte_eal_wait_lcore(),
- we wait for n readers to join with rte_eal_wait_lcore(),