dpdk.git
6 years ago  mem/linux: fix hugedir write deadlock
Anatoly Burakov [Mon, 30 Apr 2018 10:38:19 +0000 (11:38 +0100)]
mem/linux: fix hugedir write deadlock

At hugepage info initialization, EAL takes out a write lock on
hugetlbfs directories, and drops it after the memory init is
finished. However, in non-legacy mode, if "-m" or "--socket-mem"
switches are passed, this leads to a deadlock because EAL tries
to allocate pages (and thus take out a write lock on hugedir)
while still holding a separate hugedir write lock in EAL.

Fix it by checking whether the write lock in hugepage info is still active,
and not trying to lock the directory if the hugedir fd is valid.

Fixes: 1a7dc2252f28 ("mem: revert to using flock and add per-segment lockfiles")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Shahaf Shuler <shahafs@mellanox.com>
Tested-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years ago  version: 18.05-rc1
Thomas Monjalon [Fri, 27 Apr 2018 22:26:04 +0000 (00:26 +0200)]
version: 18.05-rc1

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
6 years ago  doc: update memory option usage for FreeBSD
Reshma Pattan [Fri, 27 Apr 2018 12:59:22 +0000 (13:59 +0100)]
doc: update memory option usage for FreeBSD

The EAL option -m is supported on FreeBSD,
so move it from the unsupported heading
to the supported one.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
6 years ago  maintainers: call out subtree for bbdev and security
Pablo de Lara [Fri, 13 Apr 2018 08:14:39 +0000 (09:14 +0100)]
maintainers: call out subtree for bbdev and security

Commits for bbdev and security libraries are merged
into the Next Crypto subtree.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years ago  maintainers: add backup for next-crypto tree
Akhil Goyal [Thu, 12 Apr 2018 08:32:16 +0000 (14:02 +0530)]
maintainers: add backup for next-crypto tree

Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years ago  maintainers: claim EAL memory init
Anatoly Burakov [Mon, 23 Apr 2018 12:29:57 +0000 (13:29 +0100)]
maintainers: claim EAL memory init

Claim maintainership of all areas of EAL memory init, including
OS-specific parts of it.

Also, claim maintainership of fbarray: although it is not strictly
related to memory allocation, it is heavily used by it, its primary
purpose is to serve the memory allocation functions, and it will
therefore appear under the "memory allocation" banner.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years ago  mem: revert to using flock and add per-segment lockfiles
Anatoly Burakov [Thu, 19 Apr 2018 10:20:21 +0000 (11:20 +0100)]
mem: revert to using flock and add per-segment lockfiles

The original implementation used flock() locks, but was later
switched to using fcntl() locks for page locking, because
fcntl() locks allow locking parts of a file, which is useful
for single-file segments mode, where locking the entire file
isn't as useful because we still need to grow and shrink it.

However, according to fcntl()'s Ubuntu manpage [1], semantics of
fcntl() locks have a giant oversight:

  This interface follows the completely stupid semantics of System
  V and IEEE Std 1003.1-1988 (“POSIX.1”) that require that all
  locks associated with a file for a given process are removed
  when any file descriptor for that file is closed by that process.
  This semantic means that applications must be aware of any files
  that a subroutine library may access.

Basically, closing *any* fd with an fcntl() lock (which we do because
we don't want to leak fd's) will drop the lock completely.

So, in this commit, we revert back to using flock() locks everywhere.
However, that still leaves the problem of locking parts of a memseg
list file in single-file segments mode, and we solve it by creating
a separate lock file for each page and tracking those with flock().

We will also be removing all of this tailq business and replacing it
with a simple array - saving a few bytes is not worth the extra
hassle of dealing with pointers and potential memory allocation
failures. Also, remove the tailq lock since it is not needed - these
fd lists are per-process, and within a given process, it is always
only one thread handling access to hugetlbfs.

So, first one to allocate a segment will create a lockfile, and put
a shared lock on it. When we're shrinking the page file, we will be
trying to take out a write lock on that lockfile, which would fail if
any other process is holding onto the lockfile as well. This way, we
can know if we can shrink the segment file. Also, if no other locks
are found in the lock list for a given memseg list, the memseg list
fd is automatically closed.

One other thing to note is that, according to the flock() Ubuntu manpage [2],
upgrading the lock from shared to exclusive is implemented by dropping
and reacquiring the lock, which is not atomic and thus would have
created race conditions. So, when attempting to perform operations in
hugetlbfs, we take out a write lock on the hugetlbfs directory, so
that only one process can perform hugetlbfs operations concurrently.

[1] http://manpages.ubuntu.com/manpages/artful/en/man2/fcntl.2freebsd.html
[2] http://manpages.ubuntu.com/manpages/bionic/en/man2/flock.2.html

Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")
Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")
Fixes: a5ff05d60fc5 ("mem: support unmapping pages at runtime")
Fixes: 2a04139f66b4 ("eal: add single file segments option")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
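
As a side note, the per-page flock() scheme described above can be illustrated
with a small, hypothetical sketch (plain POSIX calls, not the actual EAL code;
the lockfile path handling is an assumption made for the example):

  #include <fcntl.h>
  #include <sys/file.h>
  #include <unistd.h>

  /* At allocation time: create/open the page's lockfile and keep a shared
   * lock on it for as long as this process uses the page. */
  static int lock_page_shared(const char *lockpath)
  {
          int fd = open(lockpath, O_CREAT | O_RDWR, 0600);

          if (fd < 0)
                  return -1;
          if (flock(fd, LOCK_SH | LOCK_NB) != 0) {
                  close(fd);
                  return -1;
          }
          return fd;
  }

  /* Before shrinking the segment file: an exclusive lock succeeds only if
   * no other process still holds a shared lock on the page's lockfile. */
  static int can_remove_page(int lock_fd)
  {
          return flock(lock_fd, LOCK_EX | LOCK_NB) == 0;
  }
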
6 years ago  mem: add memalloc init stage
Anatoly Burakov [Thu, 19 Apr 2018 09:40:48 +0000 (10:40 +0100)]
mem: add memalloc init stage

Currently, memseg lists for secondary processes are allocated on
sync (triggered by init), when they are accessed for the first
time. Move this initialization to a separate init stage for
memalloc.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
6 years ago  mem: improve autodetection of hugepage counts on 32-bit
Anatoly Burakov [Tue, 24 Apr 2018 10:19:24 +0000 (11:19 +0100)]
mem: improve autodetection of hugepage counts on 32-bit

For non-legacy mode, we are preallocating space for hugepages, so
we know in advance which pages we will be able to allocate, and
which we won't. However, the init procedure was using hugepage
counts gathered from sysfs and paid no attention to hugepage
sizes that were actually available for reservation, and failed
on attempts to reserve unavailable pages.

Fix this by limiting total page counts to the number of pages
actually preallocated.

Also, the VA preallocation procedure only looks at mountpoints that are
available, and expects pages to exist if a mountpoint exists. That
might not necessarily be the case, so also check whether there are
hugepages available for a particular page size on a particular
NUMA node.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Jananee Parthasarathy <jananeex.m.parthasarathy@intel.com>
6 years ago  mem: improve preallocation on 32-bit
Anatoly Burakov [Fri, 20 Apr 2018 15:25:26 +0000 (16:25 +0100)]
mem: improve preallocation on 32-bit

Previously, if we couldn't preallocate VA space on 32-bit for
one page size, we simply bailed out, even though we could've
tried allocating VA space with other page sizes.

For example, if the user had both 1G and 2M pages enabled and asked
DPDK to allocate memory on both sockets, DPDK would have tried to
allocate VA space for one 1G page on each socket, failed, and never
tried again, even though it could have allocated the same 1G of VA
space as 512 2M pages.

Fix this by retrying with different page sizes if VA space
reservation failed.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Jananee Parthasarathy <jananeex.m.parthasarathy@intel.com>
6 years ago  mem: fix 32-bit memory upper limit for non-legacy mode
Anatoly Burakov [Fri, 20 Apr 2018 15:25:25 +0000 (16:25 +0100)]
mem: fix 32-bit memory upper limit for non-legacy mode

32-bit mode has an upper limit on amount of VA space it can preallocate,
but the original implementation used the wrong constant, resulting in
failure to initialize due to integer overflow. Fix it by using the
correct constant.

Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Jananee Parthasarathy <jananeex.m.parthasarathy@intel.com>
6 years ago  malloc: check for heap corruption
Anatoly Burakov [Mon, 16 Apr 2018 15:04:27 +0000 (16:04 +0100)]
malloc: check for heap corruption

Previous code checked for both first/last elements being NULL,
but if they weren't, the expectation was that they're both
non-NULL, which will be the case under normal conditions, but
may not be the case due to heap structure corruption.

Coverity issue: 272566
Fixes: bb372060dad4 ("malloc: make heap a doubly-linked list")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
6 years ago  malloc: fix out-of-bounds segment array access
Anatoly Burakov [Mon, 16 Apr 2018 16:45:00 +0000 (17:45 +0100)]
malloc: fix out-of-bounds segment array access

Technically, while the pointer would've been invalid if msl_idx
were invalid, we wouldn't have actually attempted to access the
pointer until verifying the index. Fix it by moving array access
to after we've verified validity of the index.

Coverity issue: 272574
Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
6 years ago  malloc: replace snprintf with strlcpy
Anatoly Burakov [Tue, 17 Apr 2018 14:20:45 +0000 (15:20 +0100)]
malloc: replace snprintf with strlcpy

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
6 years ago  mem: log page address before unmapping
Anatoly Burakov [Tue, 17 Apr 2018 10:57:52 +0000 (11:57 +0100)]
mem: log page address before unmapping

If the user has specified a flag to unmap the area right after mapping it,
we were passing an already-unmapped pointer to RTE_LOG. This is not an
issue since RTE_LOG doesn't actually dereference the pointer, but fix
it anyway by moving the RTE_LOG call to before the unmap.

Coverity issue: 272584
Fixes: b7cc54187ea4 ("mem: move virtual area function in common directory")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
6 years ago  mem: fix page fault trigger
Anatoly Burakov [Fri, 27 Apr 2018 16:38:21 +0000 (17:38 +0100)]
mem: fix page fault trigger

Coverity reports these lines as having no effect. Technically, we do
want those lines to have no effect; however, they would likely have
been optimized out. Add volatile qualifiers to ensure the accesses
actually take place.

Coverity issue: 272608
Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
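
For context, the technique can be shown with a standalone snippet: without the
volatile qualifier the compiler may drop a read/write whose result is unused,
so the page would never actually be touched (illustrative code only, not the
EAL implementation):

  /* Force a page to be faulted in by actually touching it. Without volatile,
   * the dead read and write below could be optimized away. */
  static void force_page_fault(void *addr)
  {
          volatile int *ptr = addr;
          int tmp = *ptr;  /* the read faults the page in */

          *ptr = tmp;      /* write the same value back, contents unchanged */
  }
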
6 years ago  mem: fix potential bad unmap on map failure
Anatoly Burakov [Mon, 16 Apr 2018 14:37:30 +0000 (15:37 +0100)]
mem: fix potential bad unmap on map failure

Previously, if mmap failed to map the page at the requested address,
we were attempting to unmap the wrong address. Fix it by unmapping
the address we actually mapped, and jump further to avoid unmapping
memory that was never allocated.

Coverity issue: 272602
Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
6 years ago  mem: fix comparison of old policy
Anatoly Burakov [Mon, 16 Apr 2018 16:18:55 +0000 (17:18 +0100)]
mem: fix comparison of old policy

Previous code had an old rebase leftover from the time when
oldpolicy was an actual int instead of a pointer. Fix it to
do the comparison by dereferencing the pointer.

Coverity issue: 272589
Fixes: 582bed1e1d1d ("mem: support mapping hugepages at runtime")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
6 years ago  mem: fix potential resource leak on alloc
Anatoly Burakov [Mon, 16 Apr 2018 15:37:03 +0000 (16:37 +0100)]
mem: fix potential resource leak on alloc

Normally, the tailq entry should have a valid fd by the time we attempt
to map the segment. However, in case it doesn't, we were leaking the fd,
so fix it.

Coverity issue: 272570
Fixes: 2a04139f66b4 ("eal: add single file segments option")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
6 years ago  mem: fix potential resource leak on freeing
Anatoly Burakov [Mon, 16 Apr 2018 15:31:08 +0000 (16:31 +0100)]
mem: fix potential resource leak on freeing

We close fd if we managed to find it in the list of allocated
segment lists (which should always be the case under normal
conditions), but if we didn't, the fd was leaking. Close it if
we couldn't find it in the segment list. This is not an issue
as if the segment is zero length, we're getting rid of it
anyway, so there's no harm in not storing the fd anywhere.

Coverity issue: 272568
Fixes: 2a04139f66b4 ("eal: add single file segments option")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
6 years ago  mem: fix potential double close on map failure
Anatoly Burakov [Mon, 16 Apr 2018 15:11:55 +0000 (16:11 +0100)]
mem: fix potential double close on map failure

We were closing the descriptor before checking whether the mapping had
failed, but if it did, we performed a second close afterwards. Fix
it by closing the descriptor only after all error checks are
done.

Coverity issue: 272560
Fixes: 2a04139f66b4 ("eal: add single file segments option")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
6 years ago  mem: fix resource leak on map failure
Anatoly Burakov [Mon, 16 Apr 2018 14:40:02 +0000 (15:40 +0100)]
mem: fix resource leak on map failure

Coverity issue: 272601
Fixes: 66cc45e293ed ("mem: replace memseg with memseg lists")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
6 years ago  mem: use strlcpy instead of snprintf
Anatoly Burakov [Tue, 17 Apr 2018 13:42:27 +0000 (14:42 +0100)]
mem: use strlcpy instead of snprintf

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
6 years ago  mem: fix resize return handling for --single-file-segments
Jianfeng Tan [Thu, 26 Apr 2018 08:06:53 +0000 (08:06 +0000)]
mem: fix resize return handling for --single-file-segments

resize_hugefile() returns either 0 (which indicates success) or -1
(which indicates failure). We failed to check for success when the
--single-file-segments option is used.

Fixes: 2a04139f66b4 ("eal: add single file segments option")

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
6 years ago  eal: fix threads block on barrier
Jianfeng Tan [Fri, 27 Apr 2018 16:41:42 +0000 (16:41 +0000)]
eal: fix threads block on barrier

The commit below introduced a pthread barrier for synchronization.
However, the two IPC threads block on the barrier and never wake up.

  (gdb) bt
  #0  futex_wait (private=0, expected=0, futex_word=0x7fffffffcff4)
      at ../sysdeps/unix/sysv/linux/futex-internal.h:61
  #1  futex_wait_simple (private=0, expected=0, futex_word=0x7fffffffcff4)
      at ../sysdeps/nptl/futex-internal.h:135
  #2  __pthread_barrier_wait (barrier=0x7fffffffcff0) at pthread_barrier_wait.c:184
  #3  rte_thread_init (arg=0x7fffffffcfe0)
      at ../dpdk/lib/librte_eal/common/eal_common_thread.c:160
  #4  start_thread (arg=0x7ffff6ecf700) at pthread_create.c:333
  #5  clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Analysis shows that the barrier being defined on the stack could be the
root cause. This patch changes the barrier to use heap memory instead.

Fixes: d651ee4919cd ("eal: set affinity for control threads")

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
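
A simplified sketch of the approach described above (generic pthreads code,
not the exact EAL implementation): the barrier must outlive the launching
function's stack frame, so it is placed in heap memory together with the
thread parameters and released only once both sides have passed it.

  #include <pthread.h>
  #include <stdlib.h>

  struct thread_params {
          pthread_barrier_t configured; /* heap storage outlives the caller */
  };

  static void *worker(void *arg)
  {
          struct thread_params *params = arg;

          /* ... per-thread setup (affinity, registration, ...) ... */
          pthread_barrier_wait(&params->configured);
          return NULL;
  }

  static int launch_worker(pthread_t *tid)
  {
          struct thread_params *params = malloc(sizeof(*params));
          int ret;

          if (params == NULL)
                  return -1;
          pthread_barrier_init(&params->configured, NULL, 2);
          ret = pthread_create(tid, NULL, worker, params);
          if (ret == 0)
                  pthread_barrier_wait(&params->configured);
          /* both parties have passed the barrier (or creation failed) */
          pthread_barrier_destroy(&params->configured);
          free(params);
          return ret;
  }
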
6 years ago  bus/dpaa: optimize physical to virtual address search
Shreyansh Jain [Fri, 27 Apr 2018 17:20:58 +0000 (22:50 +0530)]
bus/dpaa: optimize physical to virtual address search

With memory hotplugging support, memsegs are now ordered as virtually
contiguous rather than physically contiguous. The DPAA bus and drivers
depend on PA to VA address conversion for I/O.

This patch creates a list of the blocks requested to be pinned to the
DPAA mempool. A physical address being searched is expected to belong
to this list (it comes from the hardware pool), so the lookup is less
expensive than a memseg walk. There is, however, a marginal drop in
performance compared to the legacy mode with physically contiguous
memsegs.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
6 years ago  bus/fslmc: optimize physical to virtual address search
Shreyansh Jain [Fri, 27 Apr 2018 17:20:57 +0000 (22:50 +0530)]
bus/fslmc: optimize physical to virtual address search

With memory hotplugging support, memsegs are now ordered as virtually
contiguous rather than physically contiguous. The FSLMC bus and dpaa2
drivers depend on PA to VA address conversion when in physical
addressing mode.

This patch creates a list of the blocks requested to be pinned to the
DPAA2 mempool. A physical address being searched is expected to belong
to this list (it comes from the hardware pool), so the lookup is less
expensive than a memseg walk. There is, however, a marginal impact on
performance compared to legacy mode with physically contiguous memsegs.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
6 years ago  crypto/dpaa_sec: remove ctx based offset for PA-VA conversion
Shreyansh Jain [Fri, 27 Apr 2018 17:20:56 +0000 (22:50 +0530)]
crypto/dpaa_sec: remove ctx based offset for PA-VA conversion

Crypto requires physical to virtual address conversion for
descriptors. Prior to memory hotplugging, this was based on memseg
iteration: assuming memsegs are all physically contiguous, fast
calculations could be done using a cached start address. This
assumption is no longer valid with memory hotplugging support.

In preparation for supporting the hotplugging changes to memory,
this patch removes the optimized PA-VA conversion based on the
physical address offset stored in the pool context.

This adversely affects performance, as complete memsegs now need
to be parsed, but a rework containing the necessary optimization
will be posted on top of this.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
6 years ago  app/testpmd: conserve offload flags of mbuf
Yongseok Koh [Fri, 27 Apr 2018 17:22:52 +0000 (10:22 -0700)]
app/testpmd: conserve offload flags of mbuf

This patch accommodates an experimental mbuf feature - external buffer
attachment. If an mbuf is attached to an external buffer, its ol_flags
will have EXT_ATTACHED_MBUF set. Without enabling/using the feature,
everything remains the same.

If a PMD delivers Rx packets with non-direct mbufs, ol_flags should not be
overwritten. For the mlx5 PMD, if Multi-Packet RQ is enabled, Rx packets
could be carried in mbufs with externally attached buffers.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years ago  mbuf: support attaching external buffer
Yongseok Koh [Fri, 27 Apr 2018 17:22:51 +0000 (10:22 -0700)]
mbuf: support attaching external buffer

This patch introduces a new way of attaching an external buffer to a mbuf.

Attaching an external buffer is quite similar to mbuf indirection in
that it replaces the buffer address and length of an mbuf, but with a
few differences:
  - When an indirect mbuf is attached, refcnt of the direct mbuf would be
    2 as long as the direct mbuf itself isn't freed after the attachment.
    In such cases, the buffer area of a direct mbuf must be read-only. But
    external buffer has its own refcnt and it starts from 1. Unless
    multiple mbufs are attached to a mbuf having an external buffer, the
    external buffer is writable.
  - There's no need to allocate buffer from a mempool. Any buffer can be
    attached with appropriate free callback.
  - Smaller metadata is required to maintain shared data such as refcnt.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
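
A rough usage sketch of the attach flow introduced here; the helper names
(rte_pktmbuf_ext_shinfo_init_helper, rte_pktmbuf_attach_extbuf) and their
parameters are quoted from this series as best recalled, so treat the exact
signatures as assumptions rather than a reference:

  #include <rte_mbuf.h>

  /* Called once the last mbuf referencing the external buffer is freed. */
  static void ext_buf_free_cb(void *addr, void *opaque)
  {
          (void)addr;
          (void)opaque;
          /* return the buffer to the application's own pool (not shown) */
  }

  static struct rte_mbuf *
  wrap_ext_buf(struct rte_mempool *mp, void *buf, rte_iova_t iova,
               uint16_t buf_len)
  {
          struct rte_mbuf_ext_shared_info *shinfo;
          struct rte_mbuf *m = rte_pktmbuf_alloc(mp);

          if (m == NULL)
                  return NULL;
          /* shared data (refcnt + free callback) is carved from the buffer */
          shinfo = rte_pktmbuf_ext_shinfo_init_helper(buf, &buf_len,
                                                      ext_buf_free_cb, buf);
          if (shinfo == NULL) {
                  rte_pktmbuf_free(m);
                  return NULL;
          }
          rte_pktmbuf_attach_extbuf(m, buf, iova, buf_len, shinfo);
          return m; /* ol_flags now has EXT_ATTACHED_MBUF set */
  }
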
6 years ago  vhost/crypto: fix checks while moving descriptors
Fan Zhang [Fri, 27 Apr 2018 14:06:08 +0000 (15:06 +0100)]
vhost/crypto: fix checks while moving descriptors

This patch fixes the final condition check while moving virtqueue
descriptors.

Fixes: 3bb595ecd682 ("vhost/crypto: add request handler")

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years ago  vhost/crypto: fix missing head correction
Fan Zhang [Fri, 27 Apr 2018 13:52:33 +0000 (14:52 +0100)]
vhost/crypto: fix missing head correction

This patch fixes the missing head descriptor correction for
indirect descriptors.

Fixes: 0aee2428419f ("vhost/crypto: move to safe GPA translation API")

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years ago  vhost: fix vDPA set features
Xiao Wang [Wed, 25 Apr 2018 02:18:27 +0000 (10:18 +0800)]
vhost: fix vDPA set features

We should call the set_features callback after setting features in the
virtio_net structure; otherwise the vDPA driver cannot get the right
features.

Fixes: 07718b4f87aa ("vhost: adapt library for selective datapath")

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Zhihong Wang <zhihong.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years ago  vhost: revert avoid concurrency when logging dirty pages
Maxime Coquelin [Fri, 20 Apr 2018 08:39:21 +0000 (10:39 +0200)]
vhost: revert avoid concurrency when logging dirty pages

This reverts commit 394313fff39d0f994325c47f7eab39daf5dc9e11.

While the patch did solve the concurrency issue, it induces more
page copies, as some clean pages are marked as dirty for
performance reasons. Moreover, as there is no more contention while
doing the logging, the rate of packets that can be processed is
higher, leading to even more pages being dirtied.

It has been reported that with more than one queue pair, and
with a relatively low packet rate (1 Mpps), the live migration
never converges until the flow is stopped.

Until a better solution is found, it is better to revert to the
old behaviour, i.e. using atomic operations for dirty page
logging.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years ago  doc: update doc and release notes for szedata2 driver
Matej Vido [Fri, 27 Apr 2018 08:57:05 +0000 (10:57 +0200)]
doc: update doc and release notes for szedata2 driver

A new version of the dependency packages for the szedata2
driver is needed due to the new API of the libsze2 library
used by the driver.
The documentation and the release notes are updated to contain
the information about the required versions.

Signed-off-by: Matej Vido <vido@cesnet.cz>
Acked-by: Jan Remes <remes@netcope.com>
6 years ago  net/i40e: fix checking offload
Yanglong Wu [Fri, 27 Apr 2018 08:14:07 +0000 (16:14 +0800)]
net/i40e: fix checking offload

A missing "return -ENOTSUP" always lets illegal offloads
pass through the offload check.

Fixes: 7497d3e2f777 ("net/i40e: convert to new Tx offloads API")

Signed-off-by: Yanglong Wu <yanglong.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
6 years ago  net/i40e: fix missing jumbo frame offload capability
Yanglong Wu [Wed, 18 Apr 2018 01:42:04 +0000 (09:42 +0800)]
net/i40e: fix missing jumbo frame offload capability

The JUMBO_FRAME offload capability should be exposed, since i40e
does support it.

Fixes: c3ac7c5b0b8a ("net/i40e: convert to new Rx offloads API")

Signed-off-by: Yanglong Wu <yanglong.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
6 years ago  ethdev: rename folder to library name
Ferruh Yigit [Thu, 26 Apr 2018 21:25:59 +0000 (22:25 +0100)]
ethdev: rename folder to library name

Library folder names and output library names are the same, with a few
exceptions, including librte_ether.

This library is the network device abstraction layer; the name "ethdev"
fits better than "ether", and the library and header files are already
named ethdev.

Also, there is an rte_ether.h in the net library, which can cause confusion.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
6 years ago  net/bonding: convert to dynamic logging
Stephen Hemminger [Wed, 25 Apr 2018 15:56:46 +0000 (08:56 -0700)]
net/bonding: convert to dynamic logging

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years ago  net/vhost: implement dynamic logging
Stephen Hemminger [Wed, 25 Apr 2018 15:56:45 +0000 (08:56 -0700)]
net/vhost: implement dynamic logging

Use dynamic log type (instead of PMD) in vhost.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years ago  net/pcap: support dynamic logging
Stephen Hemminger [Wed, 25 Apr 2018 15:56:44 +0000 (08:56 -0700)]
net/pcap: support dynamic logging

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years ago  net/kni: support dynamic logging
Stephen Hemminger [Wed, 25 Apr 2018 15:56:43 +0000 (08:56 -0700)]
net/kni: support dynamic logging

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years ago  net/failsafe: convert to dynamic logging
Stephen Hemminger [Wed, 25 Apr 2018 15:56:42 +0000 (08:56 -0700)]
net/failsafe: convert to dynamic logging

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years ago  net/softnic: convert to dynamic logging
Stephen Hemminger [Wed, 25 Apr 2018 15:56:41 +0000 (08:56 -0700)]
net/softnic: convert to dynamic logging

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years ago  net/ring: convert to dynamic logging
Stephen Hemminger [Wed, 25 Apr 2018 15:56:40 +0000 (08:56 -0700)]
net/ring: convert to dynamic logging

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years ago  net/null: convert to dynamic logging
Stephen Hemminger [Wed, 25 Apr 2018 15:56:39 +0000 (08:56 -0700)]
net/null: convert to dynamic logging

Convert null device to use dynamic logging.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years ago  net/af_packet: convert to dynamic log level
Stephen Hemminger [Wed, 25 Apr 2018 15:56:38 +0000 (08:56 -0700)]
net/af_packet: convert to dynamic log level

Convert this driver to use dynamic log level support.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years ago  net/tap: convert to dynamic logging
Stephen Hemminger [Wed, 25 Apr 2018 15:56:37 +0000 (08:56 -0700)]
net/tap: convert to dynamic logging

Use new logging macro to convert all calls to RTE_LOG() into
new dynamic log type.

Also fix whitespace.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
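
The dynamic logging conversions in this group of patches all follow roughly
the same pattern; a generic sketch is shown below (the driver name, logtype
string and macro name are placeholders, and a plain constructor attribute is
used instead of the EAL init macro):

  #include <rte_log.h>

  static int mydrv_logtype;

  #define MYDRV_LOG(level, fmt, args...) \
          rte_log(RTE_LOG_ ## level, mydrv_logtype, \
                  "%s(): " fmt "\n", __func__, ## args)

  __attribute__((constructor))
  static void mydrv_register_logtype(void)
  {
          mydrv_logtype = rte_log_register("pmd.net.mydrv");
          if (mydrv_logtype >= 0)
                  rte_log_set_level(mydrv_logtype, RTE_LOG_NOTICE);
  }

Calls such as RTE_LOG(INFO, PMD, ...) then become MYDRV_LOG(INFO, ...), so a
driver's verbosity can be controlled at runtime independently of other PMDs.
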
6 years ago  doc: advertise equal stride super-buffer Rx mode in net/sfc
Andrew Rybchenko [Thu, 19 Apr 2018 11:37:06 +0000 (12:37 +0100)]
doc: advertise equal stride super-buffer Rx mode in net/sfc

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years ago  net/sfc: support MARK and FLAG actions in flow API
Roman Zhukov [Thu, 19 Apr 2018 11:37:05 +0000 (12:37 +0100)]
net/sfc: support MARK and FLAG actions in flow API

Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years ago  net/sfc: make processing of flow rule actions more uniform
Roman Zhukov [Thu, 19 Apr 2018 11:37:04 +0000 (12:37 +0100)]
net/sfc: make processing of flow rule actions more uniform

Prepare the function that parses flow rule actions to support actions
which are not fate-deciding.

Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years ago  net/sfc/base: get max supported value for action MARK
Roman Zhukov [Thu, 19 Apr 2018 11:37:03 +0000 (12:37 +0100)]
net/sfc/base: get max supported value for action MARK

The mark value for MATCH_ACTION_MARK has a maximum value.
Requesting a value larger than the maximum will cause the
filter insertion to fail with EINVAL. This patch allows the
driver to check the value at the filter validation.

Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years ago  net/sfc/base: support MARK and FLAG actions in filters
Roman Zhukov [Thu, 19 Apr 2018 11:37:02 +0000 (12:37 +0100)]
net/sfc/base: support MARK and FLAG actions in filters

This patch adds support for DPDK rte_flow "MARK" and "FLAG" filter
actions to filters on EF10 family NICs.

Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years ago  net/sfc/base: get actions MARK and FLAG support
Roman Zhukov [Thu, 19 Apr 2018 11:37:01 +0000 (12:37 +0100)]
net/sfc/base: get actions MARK and FLAG support

Filter actions MARK and FLAG are supported on Medford2 by DPDK
firmware variant.

Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years ago  net/sfc: support flow marks in equal stride super-buffer Rx
Andrew Rybchenko [Thu, 19 Apr 2018 11:37:00 +0000 (12:37 +0100)]
net/sfc: support flow marks in equal stride super-buffer Rx

Equal stride super-buffer Rx mode allows marking packets in HW
using filters. Process the data on the datapath and advertise the
corresponding features to allow flow API support to implement it.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years ago  net/sfc: add Rx descriptor wait timeout
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:59 +0000 (12:36 +0100)]
net/sfc: add Rx descriptor wait timeout

Add a device argument to customize the Rx descriptor wait timeout, which
is supported only in the DPDK firmware variant and only in equal stride
super-buffer Rx mode.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
6 years ago  net/sfc: support DPDK firmware variant
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:58 +0000 (12:36 +0100)]
net/sfc: support DPDK firmware variant

The DPDK firmware variant supports equal stride super-buffer Rx mode,
which provides a higher packet rate and packet marks but requires a
dedicated mempool manager with contiguous object block allocation
(e.g. bucket).

The firmware also supports a subvariant without checksumming on Tx, which
allows reaching higher packet rates on transmit if checksumming is not
required.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
6 years ago  net/sfc: check mempool when equal stride super-buffer used
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:57 +0000 (12:36 +0100)]
net/sfc: check mempool when equal stride super-buffer used

Equal stride super-buffer requires a mempool with a contiguous object
block allocation mechanism. The bucket mempool is the only one which
provides it.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
6 years ago  net/sfc: support callback to check if mempool is supported
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:56 +0000 (12:36 +0100)]
net/sfc: support callback to check if mempool is supported

The callback is a dummy for now, since no Rx datapath provides its own
callback, so all pools are supported.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
6 years ago  net/sfc: support equal stride super-buffer Rx mode
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:55 +0000 (12:36 +0100)]
net/sfc: support equal stride super-buffer Rx mode

An HW Rx descriptor represents many contiguous packet buffers which
follow each other. The number of buffers, the stride and the maximum DMA
length are configurable at setup time per Rx queue, based on the provided
mempool. The mempool must support contiguous block allocation and the
get-info API to retrieve the number of objects in a block.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
6 years ago  net/sfc: allow to take mbuf pool into account when sizing
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:54 +0000 (12:36 +0100)]
net/sfc: allow to take mbuf pool into account when sizing

The new argument will be used by the equal stride super-buffer
Rx datapath.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
6 years ago  net/sfc: allow one Rx queue entry carry many packet buffers
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:53 +0000 (12:36 +0100)]
net/sfc: allow one Rx queue entry carry many packet buffers

One HW Rx descriptor has many packet buffers in the case of equal
stride super-buffer Rx modes. Each packet buffer is still treated
as separate SW Rx descriptor. rxq_entries is the size of HW Rx ring
whereas nb_rx_desc is the number of SW Rx descriptors.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
6 years ago  net/sfc: conditionally compile support for tunnel packets
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:52 +0000 (12:36 +0100)]
net/sfc: conditionally compile support for tunnel packets

The equal stride super-buffer Rx datapath does not support tunnels, so
code to parse tunnel packet types and inner checksum offload is not
required, and it is important to be able to compile it out at build time
to avoid extra CPU load.

Cutting out tunnel support relies on compiler optimizations being able to
drop extra checks and branches if tun_ptype is always 0.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
6 years ago  net/sfc: move EF10 Rx event parser to shared header
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:51 +0000 (12:36 +0100)]
net/sfc: move EF10 Rx event parser to shared header

Equal stride super-buffer Rx datapath will use it as well.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
6 years ago  net/sfc: prepare EF10 Rx event parser to be reused
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:50 +0000 (12:36 +0100)]
net/sfc: prepare EF10 Rx event parser to be reused

Equal stride super-buffer Rx mode will be handled by the dedicated
Rx datapath and the mode has almost the same Rx event structure as
single packet Rx mode.

Restructure the code to allow the common parts to be shared.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
6 years ago  net/sfc: factor out function to push Rx doorbell
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:49 +0000 (12:36 +0100)]
net/sfc: factor out function to push Rx doorbell

The function may be shared by different Rx datapath implementations.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
6 years ago  net/sfc/base: add equal stride super-buffer prefix layout
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:48 +0000 (12:36 +0100)]
net/sfc/base: add equal stride super-buffer prefix layout

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years ago  net/sfc/base: support equal stride super-buffer Rx mode
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:47 +0000 (12:36 +0100)]
net/sfc/base: support equal stride super-buffer Rx mode

Equal stride super-buffer Rx mode is supported by DPDK firmware
variant. One Rx descriptor provides many Rx buffers to firmware.
Rx buffers follow each other with specified stride.
Also it supports head of line blocking with timeout to address
drops when no Rx descriptors are available. So it gives extra time
to the driver to provide Rx descriptors before drop.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years ago  net/sfc/base: detect equal stride super-buffer support
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:46 +0000 (12:36 +0100)]
net/sfc/base: detect equal stride super-buffer support

Equal stride super-buffer Rx mode is supported on Medford2 by
DPDK firmware variant.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years ago  net/sfc/base: make RxQ type data an union
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:45 +0000 (12:36 +0100)]
net/sfc/base: make RxQ type data an union

The type is an internal interface. Single integer is insufficient
to carry RxQ type-specific information in the case of equal stride
super-buffer Rx mode (packet buffers per bucket, maximum DMA length,
packet stride, head of line block timeout).

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years ago  net/sfc/base: update autogenerated MCDI and TLV headers
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:44 +0000 (12:36 +0100)]
net/sfc/base: update autogenerated MCDI and TLV headers

Equal stride super-buffer is the new name for the deprecated equal stride
packed stream, chosen to avoid confusion with the previous packed stream.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years ago  net/nfp: use dynamic logging everywhere
Stephen Hemminger [Wed, 25 Apr 2018 15:45:51 +0000 (08:45 -0700)]
net/nfp: use dynamic logging everywhere

Drivers should only log with their assigned logtype, not with the
generic PMD log type.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Alejandro Lucero <alejandro.lucero@netronome.com>
6 years ago  net/nfp: add newline in PMD_RX/TX_LOG macros
Stephen Hemminger [Wed, 25 Apr 2018 15:45:50 +0000 (08:45 -0700)]
net/nfp: add newline in PMD_RX/TX_LOG macros

Be consistent with usage in other drivers.
No need for snowflake drivers.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Alejandro Lucero <alejandro.lucero@netronome.com>
6 years ago  net/nfp: fix double space in init log
Stephen Hemminger [Wed, 25 Apr 2018 15:45:49 +0000 (08:45 -0700)]
net/nfp: fix double space in init log

Shouldn't pass extra newline.

Fixes: 7dcb19d78f27 ("net/nfp: fix Rx interrupt when multiqueue")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Alejandro Lucero <alejandro.lucero@netronome.com>
6 years ago  net/nfp: add implied new line to log macro
Stephen Hemminger [Wed, 25 Apr 2018 15:45:48 +0000 (08:45 -0700)]
net/nfp: add implied new line to log macro

The PMD_INIT_LOG macro always adds a newline, and other drivers' versions
of PMD_DRV_LOG always add a newline. Therefore change the nfp driver
to be consistent with the others.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Alejandro Lucero <alejandro.lucero@netronome.com>
6 years ago  net/nfp: use correct logtype for init messages
Stephen Hemminger [Wed, 25 Apr 2018 15:45:47 +0000 (08:45 -0700)]
net/nfp: use correct logtype for init messages

The NFP driver init messages would come out under PMD not net.pmd.nfp.init.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Alejandro Lucero <alejandro.lucero@netronome.com>
6 years ago  ethdev: add shared counter to flow API
Declan Doherty [Thu, 26 Apr 2018 17:29:19 +0000 (18:29 +0100)]
ethdev: add shared counter to flow API

Add the rte_flow_action_count action data structure to enable shared
counters across multiple flows on a single port, or across multiple
flows on multiple ports within the same switch domain. This also enables
multiple count actions to be specified in a single flow rule's action list.

This patch also modifies the existing rte_flow_query API to take the
rte_flow_action structure as an input parameter instead of the
rte_flow_action_type enumeration to allow querying a specific action
from a flow rule when multiple actions of the same type are specified.

This patch also contains updates for the bonding, failsafe and mlx5 PMDs
and testpmd application which are affected by this API change.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
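
A hedged sketch of how a shared counter could be attached and then queried
through the modified API; the rte_flow_action_count field names and the new
rte_flow_query() action parameter follow this patch's description, but the
details below should be treated as assumptions rather than a reference:

  #include <inttypes.h>
  #include <stdio.h>
  #include <rte_flow.h>

  static int
  create_and_query_shared_counter(uint16_t port_id,
                                  const struct rte_flow_attr *attr,
                                  const struct rte_flow_item *pattern)
  {
          /* counter id 42, shared across flows referencing the same id */
          struct rte_flow_action_count count_conf = { .shared = 1, .id = 42 };
          struct rte_flow_action actions[] = {
                  /* a real rule would usually add a fate action (e.g. QUEUE) */
                  { .type = RTE_FLOW_ACTION_TYPE_COUNT, .conf = &count_conf },
                  { .type = RTE_FLOW_ACTION_TYPE_END },
          };
          struct rte_flow_query_count stats = { .reset = 0 };
          struct rte_flow_error error;
          struct rte_flow *flow;

          flow = rte_flow_create(port_id, attr, pattern, actions, &error);
          if (flow == NULL)
                  return -1;
          /* query by action pointer rather than by action type enumeration */
          if (rte_flow_query(port_id, flow, &actions[0], &stats, &error) != 0)
                  return -1;
          printf("hits=%" PRIu64 " bytes=%" PRIu64 "\n",
                 stats.hits, stats.bytes);
          return 0;
  }
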
6 years ago  ethdev: add mark flow item
Declan Doherty [Thu, 26 Apr 2018 17:29:18 +0000 (18:29 +0100)]
ethdev: add mark flow item

Introduces a new item type, RTE_FLOW_ITEM_TYPE_MARK, which enables
flow patterns to specify arbitrary integer values to match against
those set by the RTE_FLOW_ACTION_TYPE_MARK action in previously matched
flows.

Add support for specification of new MARK flow item in testpmd's cli.
Update testpmd documentation to describe new MARK flow item support.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
6 years ago  ethdev: add group jump action
Declan Doherty [Thu, 26 Apr 2018 17:29:17 +0000 (18:29 +0100)]
ethdev: add group jump action

Add a jump action type which allows a matched flow to be redirected to
the specified group. This allows physical and logical flow table/group
hierarchies to be defined through rte_flow.

This breaks ABI compatibility for the following public functions (as it
modifies the ordering of the rte_flow_action_type enumeration):

- rte_flow_copy()
- rte_flow_create()
- rte_flow_query()
- rte_flow_validate()

Add support for specification of new JUMP action to testpmd's flow
cli, and update the testpmd documentation to describe this new
action.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
6 years ago  ethdev: add tunnel encap/decap actions
Declan Doherty [Thu, 26 Apr 2018 17:29:16 +0000 (18:29 +0100)]
ethdev: add tunnel encap/decap actions

Add new flow action types and associated action data structures to
support the encapsulation and decapsulation of VXLAN and NVGRE tunnel
endpoints.

The RTE_FLOW_ACTION_TYPE_[VXLAN/NVGRE]_ENCAP action will cause the
matching flow to be encapsulated in the tunnel endpoint overlay
defined in the [vxlan/nvgre]_encap action data.

The RTE_FLOW_ACTION_TYPE_[VXLAN/NVGRE]_DECAP action will cause all
headers associated with the outermost tunnel endpoint of the specified
type to be removed from the matching flows.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
6 years ago  net/sfc: do not use RSS context if it is not required
Andrew Rybchenko [Thu, 26 Apr 2018 16:48:57 +0000 (17:48 +0100)]
net/sfc: do not use RSS context if it is not required

An RSS action with only one destination queue and no specific settings
for hash types and key does not require a dedicated RSS context and
may be simplified to a QUEUE action.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
6 years ago  net/tap: return empty port offload capabilities
Ophir Munk [Thu, 26 Apr 2018 11:13:02 +0000 (11:13 +0000)]
net/tap: return empty port offload capabilities

Fix the internal report of port-specific offload capabilities to be 0 (no
capabilities). Before this commit, port capabilities were a clone of queue
capabilities; however, the current TAP offload capabilities (e.g.
checksum calculation) are per queue and not specific per port.
This commit fixes an internal validation check for newly configured
queue offloads.
The port capability API keeps reporting all queue capabilities as port
capabilities.

Fixes: 95ae196ae10b ("net/tap: use new Rx offloads API")
Fixes: 818fe14a9891 ("net/tap: use new Tx offloads API")
Cc: stable@dpdk.org
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
6 years ago  net/sfc: ignore spec bits not covered by mask
Andrew Rybchenko [Wed, 25 Apr 2018 17:18:34 +0000 (18:18 +0100)]
net/sfc: ignore spec bits not covered by mask

mask is a simple bit-mask applied before interpreting the contents
of spec and last.

Fixes: a9825ccf5bb8 ("net/sfc: support flow API filters")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Reviewed-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
6 years ago  net/ixgbe: add support for representor ports
Declan Doherty [Thu, 26 Apr 2018 10:41:05 +0000 (11:41 +0100)]
net/ixgbe: add support for representor ports

Add support for virtual function representor ports to the ixgbe PF
driver. When SR-IOV virtual function devices are enabled, a
corresponding representor port for each VF can be enabled in the
process in which the ixgbe PMD is running, by specifying the
representor devargs with the list of VF ports for which representors
are to be created.

An example of the devargs which would create VF representor for virtual
functions 0,2,4,5,6 and 7 is:

-w DBDF,representor=[0,2,4-7]

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com>
Signed-off-by: Remy Horton <remy.horton@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years ago  net/i40e: add support for representor ports
Declan Doherty [Thu, 26 Apr 2018 10:41:04 +0000 (11:41 +0100)]
net/i40e: add support for representor ports

Add support for virtual function representor ports to the i40e PF
driver. When SR-IOV virtual function devices are enabled, a
corresponding representor port for each VF can be enabled in the
process in which the i40e PMD is running, by specifying the
representor devargs with the list of VF ports for which representors
are to be created.

An example of the devargs which would create VF representor for virtual
functions 0,2,4,5,6 and 7 is:

-w DBDF,representor=[0,2,4-7]

and to just specify a single representor on virtual function 3 (switch
port id):

-w DBDF,representor=3

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Signed-off-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com>
Signed-off-by: Remy Horton <remy.horton@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years ago  ethdev: add switch domain allocator
Declan Doherty [Thu, 26 Apr 2018 10:41:03 +0000 (11:41 +0100)]
ethdev: add switch domain allocator

Add switch domain allocate and free API to enable NET devices to
synchronise switch domain allocation.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years ago  ethdev: add common devargs parser
Remy Horton [Thu, 26 Apr 2018 10:41:02 +0000 (11:41 +0100)]
ethdev: add common devargs parser

Introduces a new structure, rte_eth_devargs, to support generic
ethdev arguments common across net PMDs, with a new API,
rte_eth_devargs_parse, to support PMDs parsing these arguments. The
patch adds support for a representor argument passed with the EAL -w
option. The representor parameter allows the user to specify
which representor ports to initialise on a device.

The argument supports passing a single representor port, a list of
port values or a range of port values.

-w BDF,representor=1  # create representor port 1 on pci device BDF
-w BDF,representor=[1,2,5,6,10] # create representor ports in list
-w BDF,representor=[0-31] # create representor ports in range

Signed-off-by: Remy Horton <remy.horton@intel.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
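
An illustrative sketch of how a PMD probe path might use the new parser to
find out which representor ports to create; the structure field names used
below are assumptions made for the example, not a reference:

  #include <stdio.h>
  #include <string.h>
  #include <rte_ethdev_driver.h>

  static int
  list_requested_representors(const char *devargs_str)
  {
          struct rte_eth_devargs eth_da;
          uint16_t i;
          int ret;

          memset(&eth_da, 0, sizeof(eth_da));
          ret = rte_eth_devargs_parse(devargs_str, &eth_da);
          if (ret != 0)
                  return ret;
          /* field names below are assumed for illustration */
          for (i = 0; i < eth_da.nb_representor_ports; i++)
                  printf("create representor for VF %u\n",
                         eth_da.representor_ports[i]);
          return 0;
  }
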
6 years ago  app/testpmd: add port name to device info
Declan Doherty [Thu, 26 Apr 2018 10:41:01 +0000 (11:41 +0100)]
app/testpmd: add port name to device info

Add the port name to information printed by show port info <port_id>

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years ago  ethdev: add port representor device flag
Declan Doherty [Thu, 26 Apr 2018 10:41:00 +0000 (11:41 +0100)]
ethdev: add port representor device flag

Add a new device flag to specify that an ethdev port is a port
representor. Extend the rte_eth_dev_info structure to expose device flags
to the user, which enables applications to discover whether a port is a
representor port.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years ago  ethdev: add generic create/destroy ethdev APIs
Declan Doherty [Thu, 26 Apr 2018 10:40:59 +0000 (11:40 +0100)]
ethdev: add generic create/destroy ethdev APIs

Add new generic ethdev create/destroy APIs which are bus independent
and provide hooks for bus-specific initialisation.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years ago  ethdev: add switch identifier parameter to port
Declan Doherty [Thu, 26 Apr 2018 10:40:58 +0000 (11:40 +0100)]
ethdev: add switch identifier parameter to port

Introduces a new attribute to ethdev ports which denotes the
switch domain a port belongs to. By default, all ports' switch
identifiers are set to RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID. Ports
which support the concept of switch domains can be configured with
the same switch domain id.

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years ago  doc: add switch representation documentation
Declan Doherty [Thu, 26 Apr 2018 10:40:57 +0000 (11:40 +0100)]
doc: add switch representation documentation

Add a document to describe the model for representing switching-capable
devices in DPDK, using a general ethdev port model and port
representors. This document also details the port model and the
rte_flow semantics required for flow programming, as well as listing
some example use cases.

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Reviewed-by: Marko Kovacevic <marko.kovacevic@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years ago  doc: update mlx5 guide on tunnel offloading
Xueming Li [Mon, 23 Apr 2018 12:33:10 +0000 (20:33 +0800)]
doc: update mlx5 guide on tunnel offloading

Remove tunnel limitations, add new hardware tunnel offload features.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years ago  net/mlx5: allow flow tunnel ID 0 with outer pattern
Xueming Li [Mon, 23 Apr 2018 12:33:09 +0000 (20:33 +0800)]
net/mlx5: allow flow tunnel ID 0 with outer pattern

A tunnel pattern without a tunnel ID could match any non-tunneled packet;
this patch allows a tunnel pattern without a tunnel ID after a proper
outer spec.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years ago  net/mlx5: introduce VXLAN-GPE tunnel type
Xueming Li [Mon, 23 Apr 2018 12:33:08 +0000 (20:33 +0800)]
net/mlx5: introduce VXLAN-GPE tunnel type

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years ago  net/mlx5: add hardware flow debug dump
Xueming Li [Mon, 23 Apr 2018 12:33:07 +0000 (20:33 +0800)]
net/mlx5: add hardware flow debug dump

Dump verbs flow details, including flow spec type and size, for debugging
purposes.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years ago  net/mlx5: support tunnel RSS level
Xueming Li [Mon, 23 Apr 2018 12:33:06 +0000 (20:33 +0800)]
net/mlx5: support tunnel RSS level

The tunnel RSS level of the flow RSS action offers the user a choice of
doing the RSS hash calculation on inner or outer fields. Testpmd flow
command examples:

GRE flow inner RSS:
  flow create 0 ingress pattern eth / ipv4 proto is 47 / gre / end
actions rss queues 1 2 end level 1 / end

GRE tunnel flow outer RSS:
  flow create 0 ingress pattern eth  / ipv4 proto is 47 / gre / end
actions rss queues 1 2 end level 0 / end

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years ago  net/mlx5: split flow RSS handling logic
Xueming Li [Mon, 23 Apr 2018 12:33:05 +0000 (20:33 +0800)]
net/mlx5: split flow RSS handling logic

This patch splits out the flow RSS hash field handling logic into a
dedicated function.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years ago  net/mlx5: cleanup tunnel checksum offloads
Xueming Li [Mon, 23 Apr 2018 12:33:04 +0000 (20:33 +0800)]
net/mlx5: cleanup tunnel checksum offloads

Once a tunnel packet type (RTE_PTYPE_TUNNEL_xxx) is identified,
PKT_RX_IP_CKSUM_XXX and PKT_RX_L4_CKSUM_XXX represent the checksum result
of the inner headers; the outer L3 and L4 header checksums are always
valid as soon as a tunnel is identified. If no tunnel is identified,
PKT_RX_IP_CKSUM_XXX and PKT_RX_L4_CKSUM_XXX represent the checksum result
of the outer L3 and L4 headers.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years ago  net/mlx5: support Rx tunnel type identification
Xueming Li [Mon, 23 Apr 2018 12:33:03 +0000 (20:33 +0800)]
net/mlx5: support Rx tunnel type identification

This patch introduces tunnel type identification based on flow rules.
If flows of multiple tunnel types are built on the same queue, no tunnel
type will be returned. A user application could use bits in the flow mark
as a tunnel type identifier.

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>