dpdk.git
6 years agonet/sfc: improve unsupported flow pattern message
Ivan Malov [Wed, 16 May 2018 13:24:03 +0000 (14:24 +0100)]
net/sfc: improve unsupported flow pattern message

Such a message could be generated by two places
in the code, and there is a difference in the
text albeit there is no difference in the meaning.
These two messages must be the same so that
automated error log comparison may be carried
out without confusion.

Fixes: a9825ccf5bb8 ("net/sfc: support flow API filters")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years agonet/sfc: fix minimum number of Rx descriptors in ESSB mode
Andrew Rybchenko [Wed, 16 May 2018 12:35:36 +0000 (13:35 +0100)]
net/sfc: fix minimum number of Rx descriptors in ESSB mode

Number of descriptors in equal stride super-buffer Rx mode defines
number of packet buffers to be used. Each HW Rx descriptor has
many packet buffers and the number depends on total size of mbuf
and CONFIG_RTE_DRIVER_MEMPOOL_BUCKET_SIZE_KB value.
Typically it makes a bit less than 32 buffers per descriptor.
Since HW Rx descriptors must be pushed by 8, it makes about 256
as required minimum. Double it in advertised minimum to allow for
at least 2 refill blocks.

Fixes: 390f9b8d82c9 ("net/sfc: support equal stride super-buffer Rx mode")

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years agomaintainers: fix responsibility of flow API bits
Adrien Mazarguil [Tue, 15 May 2018 16:23:03 +0000 (18:23 +0200)]
maintainers: fix responsibility of flow API bits

The following commits lack MAINTAINERS entries for this mess.

Fixes: 4d73b6fb9907 ("doc: add generic flow API guide")
Fixes: 19c90af6285c ("app/testpmd: add flow command")
Cc: stable@dpdk.org
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
6 years agonet/tap: fix isolation mode toggling
Ophir Munk [Mon, 14 May 2018 22:26:27 +0000 (22:26 +0000)]
net/tap: fix isolation mode toggling

Running testpmd command "flow isolae <port> 0" (i.e. disabling flow
isolation) followed by command "flow isolate <port> 1" (i.e. enabling
flow isolation) may result in a TAP error:
PMD: Kernel refused TC filter rule creation (17): File exists

Root cause analysis: when disabling flow isolation we keep the local
rule to redirect packets on TX (TAP_REMOTE_TX index) while we add it
again when enabling flow isolation. As a result this rule is added
two times in a row which results in "File exists" error.
The fix is to identify the "File exists" error and silently ignore it.

Another issue occurs when enabling isolation mode several times in a
row in which case the same tc rules are added consecutively and
rte_flow structs are added to a linked list before removing the
previous rte_flow structs.
The fix is to act upon isolation mode command only when there is a
change from "0" to "1" (or vice versa).

Fixes: f503d2694825 ("net/tap: support flow API isolated mode")
Cc: stable@dpdk.org
Reviewed-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
6 years agomaintainers: hand off ownership of tap PMD
Pascal Mazon [Thu, 17 May 2018 09:44:58 +0000 (11:44 +0200)]
maintainers: hand off ownership of tap PMD

I have unfortunately no longer time enough for maintaining Tap PMD.
Keith has kindly volunteered to take over maintainership. He's been at
the origin of this PMD and knows well how it works.

Signed-off-by: Pascal Mazon <pascal.mazon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
6 years agovhost: improve dirty pages logging performance
Maxime Coquelin [Thu, 17 May 2018 11:44:47 +0000 (13:44 +0200)]
vhost: improve dirty pages logging performance

This patch caches all dirty pages logging until the used ring index
is updated.

The goal of this optimization is to fix a performance regression
introduced when the vhost library started to use atomic operations
to set bits in the shared dirty log map. While the fix was valid
as previous implementation wasn't safe against concurrent accesses,
contention was induced.

With this patch, during migration, we have:
1. Less atomic operations as only a single atomic OR operation
per 32 or 64 (depending on CPU) pages.
2. Less atomic operations as during a burst, the same page will
be marked dirty only once.
3. Less write memory barriers.

Fixes: 897f13a1f726 ("vhost: make page logging atomic")
Cc: stable@dpdk.org
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
6 years agonet/mlx5: fix build without tunnel RSS support
Shahaf Shuler [Sun, 13 May 2018 08:07:46 +0000 (11:07 +0300)]
net/mlx5: fix build without tunnel RSS support

IBV_RX_HASH_INNER should be referenced only when having tunnel support
in the Verbs headers.

Fixes: 80f2d0ed7ff9 ("net/mlx5: add hardware flow debug dump")

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
6 years agonet/mlx5: support MPLS-in-GRE and MPLS-in-UDP
Matan Azrad [Tue, 15 May 2018 11:07:14 +0000 (11:07 +0000)]
net/mlx5: support MPLS-in-GRE and MPLS-in-UDP

Add support for MPLS over GRE and MPLS over UDP tunnel types as
described in the next RFCs:
1. https://tools.ietf.org/html/rfc4023
2. https://tools.ietf.org/html/rfc7510
3. https://tools.ietf.org/html/rfc4385

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years agonet/mlx5: add Bluefield device id
Shahaf Shuler [Tue, 15 May 2018 06:12:50 +0000 (09:12 +0300)]
net/mlx5: add Bluefield device id

Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years agonet/mlx5: fix flow director drop rule deletion crash
Shahaf Shuler [Tue, 15 May 2018 06:26:35 +0000 (09:26 +0300)]
net/mlx5: fix flow director drop rule deletion crash

Drop flow rules are created on the ETH queue even though the parser layer
matches the flow rule layer (L3/L4)

Fixes: 6f2f4948b236 ("net/mlx5: fix flow director rule deletion crash")
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
6 years agonet/virtio-user: fix device init in legacy-mem mode
Xiao Wang [Thu, 17 May 2018 07:35:25 +0000 (15:35 +0800)]
net/virtio-user: fix device init in legacy-mem mode

In legacy-mem mode, memory event callback registering is not supported,
we should not return error in dev_init on this case.

Fixes: 12ecb2f63b12 ("net/virtio-user: support memory hotplug")

Suggested-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years agonet/virtio-user: strip MAC feature when none specified
Tiwei Bie [Fri, 11 May 2018 10:55:42 +0000 (18:55 +0800)]
net/virtio-user: strip MAC feature when none specified

Currently VIRTIO_NET_F_MAC is set unconditionally when server
mode is used. It should be stripped when MAC isn't specified.

Fixes: bd8f50a45d0f ("net/virtio-user: support server mode")

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years agonet/vhost: do not clear offload flags in Rx
Tiwei Bie [Thu, 10 May 2018 07:04:14 +0000 (15:04 +0800)]
net/vhost: do not clear offload flags in Rx

The ol_flags of mbufs returned by rte_vhost_dequeue_burst()
contain necessary offload information. It can't be zeroed.

Fixes: f63d356ee993 ("net/vhost: insert/strip VLAN header in software")
Cc: stable@dpdk.org
Reported-by: Lei Yao <lei.a.yao@intel.com>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years agovhost/crypto: handle virtually non-contiguous buffers
Fan Zhang [Thu, 10 May 2018 15:41:22 +0000 (16:41 +0100)]
vhost/crypto: handle virtually non-contiguous buffers

This patch enables the handling of buffers non-contiguous in
virtual address space in the vhost_crypto. Instead of using
rte_vhost_va_from_guest_pa(), the host virtual address is
converted by vhost_iova_to_vva() for wider use cases.

For copy mode, the copy length is limited to the chunk size,
next chunks VAs being fetched afterward.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years agovhost/crypto: fix descriptor move
Fan Zhang [Wed, 9 May 2018 14:08:39 +0000 (15:08 +0100)]
vhost/crypto: fix descriptor move

This patch fixes the redundant descriptor move in the copy mode
of vhost crypto. Originally the redundant descriptor move will
cause the message parsing error.

Fixes: 3bb595ecd682 ("vhost/crypto: add request handler")

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years agonet/virtio-user: fix feature setting with vhost-net backend
Jiayu Hu [Fri, 11 May 2018 07:26:07 +0000 (15:26 +0800)]
net/virtio-user: fix feature setting with vhost-net backend

When the backend is vhost-net, virtio-user must work in client mode and
needs to request features from the backend in virtio_user_dev_init().
But currently, virtio-user is assigned to default features in this case.

This patch is to fix this inappropriate feature setting.

Fixes: bd8f50a45d0f ("net/virtio-user: support server mode")

Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Tested-by: Lei Yao <lei.a.yao@intel.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Zhiyong Yang <zhiyong.yang@intel.com>
6 years agoversion: 18.05-rc4
Thomas Monjalon [Tue, 15 May 2018 20:38:39 +0000 (22:38 +0200)]
version: 18.05-rc4

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
6 years agonet/qede: fix default Tx offload config
Rasesh Mody [Tue, 15 May 2018 19:43:59 +0000 (12:43 -0700)]
net/qede: fix default Tx offload config

Correct the default Tx offload config

Fixes: 946dfd18a4ec ("net/qede: convert to new Rx/Tx offloads API")

Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
6 years agonet/mlx5: fix uninitialized variable in probing
Andy Green [Mon, 14 May 2018 05:04:38 +0000 (13:04 +0800)]
net/mlx5: fix uninitialized variable in probing

Fixes: ccdcba53a3f4 ("net/mlx5: use Netlink to add/remove MAC addresses")

Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
6 years agonet/bnx2x: fix memzone name overrun
Andy Green [Mon, 14 May 2018 05:04:43 +0000 (13:04 +0800)]
net/bnx2x: fix memzone name overrun

Fixes: 540a211084a7 ("bnx2x: driver core")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Rasesh Mody <rasesh.mody@cavium.com>
6 years agonet/bnx2x: fix KR2 device check
Andy Green [Mon, 14 May 2018 05:04:33 +0000 (13:04 +0800)]
net/bnx2x: fix KR2 device check

In function ‘elink_check_kr2_wa’:
drivers/net/bnx2x/elink.c:12922:28:
error: bitwise comparison always evaluates to false
[-Werror=tautological-compare]
        ((next_page & 0xe0) == 0x2))));

This was fixed elsewhere in 2014

Fixes: b5bf7719221d ("bnx2x: driver support routines")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Rasesh Mody <rasesh.mody@cavium.com>
6 years agonet/bnx2x: do not cast function pointers as a policy
Andy Green [Mon, 14 May 2018 05:04:28 +0000 (13:04 +0800)]
net/bnx2x: do not cast function pointers as a policy

This is stopping the compiler telling you when you have
done something stupid... that is something none of us
can afford...

Now gcc 8.x can tell you did something stupid despite
trying to hide the evidence.

Remove all the "black magic" casts.

Fix the actual problems.

Fixes: b5bf7719221d ("bnx2x: driver support routines")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Rasesh Mody <rasesh.mody@cavium.com>
6 years agomempool: fix virtual address population
Anatoly Burakov [Mon, 14 May 2018 16:06:15 +0000 (17:06 +0100)]
mempool: fix virtual address population

Currently, populate_virt will check if mempool is already populated.
This will cause inability to reserve multi-chunk mempools if
contiguous memory is not a hard requirement, because if allocating
all-contiguous memory fails, mempool will retry with virtual addresses
and will call populate_virt. It seems that the original code never
anticipated more than one non-physically contiguous area.

Fix it by removing the check in populate virt. populate_anon() function
calls populate_virt() also, and it can be reasonably inferred that it is
expecting that virtual area is not already populated. Even though a
similar check is already in place there, also add the check that was
part of populate_virt() just in case.

Fixes: aab4f62d6c1c ("mempool: support no hugepage mode")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years agoeal: move runtime directory creation after args parsing
Anatoly Burakov [Tue, 15 May 2018 10:44:52 +0000 (11:44 +0100)]
eal: move runtime directory creation after args parsing

The intention of the original code was to create runtime data
directory as early as possible, however it was moved too early,
before the arguments were parsed, resulting in --file-prefix
option essentially not working.

Fix this by moving eal_create_runtime_dir() to after command
line arguments parsing.

Fixes: 56236363b481 ("eal: add directory for runtime data")

Reported-by: Andrew Rybchenko <arybchenko@solarflare.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years agobus/pci: fix size of driver name buffer
Andy Green [Tue, 15 May 2018 07:31:00 +0000 (15:31 +0800)]
bus/pci: fix size of driver name buffer

Variable dri_name is a pointer and it is incorrect to use its
size as the buffer size. Caller knows the buffer size and
it is safer to pass it explicitly.

Fixes: fe5f777b5383 ("bus/pci: replace strncpy by strlcpy")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years agoversion: 18.05-rc3
Ferruh Yigit [Mon, 14 May 2018 23:03:42 +0000 (00:03 +0100)]
version: 18.05-rc3

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agoeal: move runtime data into dedicated directory
Anatoly Burakov [Mon, 14 May 2018 16:27:42 +0000 (17:27 +0100)]
eal: move runtime data into dedicated directory

Fix all calls to functions in eal_filesystem to produce paths
residing inside dedicated DPDK runtime directory. Leaving DPDK
runtime config in place as 3rd-party applications within the
DPDK ecosystem might rely on this path to determine whether
DPDK is running, so moving that will be postponed to the next
release cycle.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
6 years agoeal: add directory for runtime data
Anatoly Burakov [Mon, 14 May 2018 16:27:41 +0000 (17:27 +0100)]
eal: add directory for runtime data

Currently, during runtime, DPDK will store a bunch of files here
and there (in /var/run, /tmp or in $HOME). Fix it by creating a
DPDK-specific runtime directory, under which all runtime data
will be placed. The template for creating this runtime directory
is the following:

  <base path>/dpdk/<DPDK prefix>/

Where <base path> is set to either "/var/run" if run as root, or
$XDG_RUNTIME_DIR if run as non-root, with a fallback to /tmp if
$XDG_RUNTIME_DIR is not defined. So, for example, if run as root,
by default all runtime data will be stored at /var/run/dpdk/rte/.

There is no equivalent of "mkdir -p", so we will be creating the
path step by step.

Nothing uses this new path yet, changes for that will come in
next commit.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Reshma Pattan <reshma.pattan@intel.com>
6 years agomem: rename function returning hugepage data path
Anatoly Burakov [Mon, 14 May 2018 16:27:40 +0000 (17:27 +0100)]
mem: rename function returning hugepage data path

The original name for this path was not too descriptive and
confusing. Rename it to a more appropriate and descriptive name:
it stores data about hugepages, so name it eal_hugepage_data_path().

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Reshma Pattan <reshma.pattan@intel.com>
6 years agoeal: remove unused path pattern
Anatoly Burakov [Mon, 14 May 2018 16:27:39 +0000 (17:27 +0100)]
eal: remove unused path pattern

The define was a leftover from IVSHMEM library.

Fixes: c711ccb30987 ("ivshmem: remove library and its EAL integration")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: David Marchand <david.marchand@6wind.com>
6 years agodevtools: provide more generic grep in git check
Andy Green [Mon, 14 May 2018 04:59:56 +0000 (12:59 +0800)]
devtools: provide more generic grep in git check

On Fedora 28, every patch is faulted for
"Wrong headline uppercase", because [A-Z] is not
always case sensitive.

Change to use [[:upper:]]

Signed-off-by: Andy Green <andy@warmcat.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agoapp/bbdev: use strcpy for allocated string
Andy Green [Mon, 14 May 2018 05:01:07 +0000 (13:01 +0800)]
app/bbdev: use strcpy for allocated string

app/test-bbdev/test_bbdev_vector.c:895:3:
  error: ‘strncpy’ output truncated before terminating nul copying as
  many bytes from a string as its length [-Werror=stringop-truncation]
   strncpy(entry, line, strlen(line));
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

app/test-bbdev/test_bbdev_vector.c:917:5:
  error: ‘strncat’ output truncated before terminating nul copying as
  many bytes from a string as its length [-Werror=stringop-truncation]
   strncat(entry, line, strlen(line));
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: f714a18885a6 ("app/testbbdev: add test application for bbdev")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agoapp/procinfo: fix sprintf overrun
Andy Green [Mon, 14 May 2018 05:01:02 +0000 (13:01 +0800)]
app/procinfo: fix sprintf overrun

app/proc-info/main.c: In function ‘nic_xstats_display’:
app/proc-info/main.c:495:45: error:
‘%s’ directive writing up to 255 bytes into a regioni of size between 165 and 232
[-Werror=format-overflow=]
    sprintf(buf, "PUTVAL %s/dpdkstat-port.%u/%s-%s N:%"
                                             ^~
     PRIu64"\n", host_id, port_id, counter_type,
                                   ~~~~~~~~~~~~
app/proc-info/main.c:495:4: note:
‘sprintf’ output between 31 and 435 bytes into a destination of size 256
    sprintf(buf, "PUTVAL %s/dpdkstat-port.%u/%s-%s N:%"
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     PRIu64"\n", host_id, port_id, counter_type,
     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
     xstats_names[i].name, values[i]);

Fixes: 2deb6b5246d7 ("app/procinfo: add collectd format and host id")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agonet/vdev_netvsc: replace strncpy by strlcpy
Andy Green [Mon, 14 May 2018 05:00:57 +0000 (13:00 +0800)]
net/vdev_netvsc: replace strncpy by strlcpy

Continue snprintf to strlcpy conversions started by commit
c022cb400e92 ("convert snprintf to strlcpy").

Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
6 years agonet/vdev_netvsc: readlink inputs cannot be aliased
Andy Green [Mon, 14 May 2018 05:00:52 +0000 (13:00 +0800)]
net/vdev_netvsc: readlink inputs cannot be aliased

drivers/net/vdev_netvsc/vdev_netvsc.c:335:2:error:
passing argument 2 to restrict-qualified parameter aliases with argument 1
  ret = readlink(buf, buf, size);
  ^~~

Fixes: e7dc5d7becc5 ("net/vdev_netvsc: implement core functionality")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
6 years agonet/sfc: make sure that stats name is nul-terminated
Andy Green [Mon, 14 May 2018 05:00:47 +0000 (13:00 +0800)]
net/sfc: make sure that stats name is nul-terminated

Fixes: 73280c1e4ff2 ("net/sfc: support xstats retrieval by ID")
Fixes: 7b9891769f4b ("net/sfc: support extended statistics")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years agonet/qede: fix strncpy
Andy Green [Mon, 14 May 2018 05:00:42 +0000 (13:00 +0800)]
net/qede: fix strncpy

drivers/net/qede/qede_main.c: In function ‘qed_slowpath_start’:
drivers/net/qede/qede_main.c:307:3: error:
‘strncpy’ output may be truncated copying 12 bytes from a string of length 127
[-Werror=stringop-truncation]
   strncpy((char *)drv_version.name, (const char *)params->name,
   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    MCP_DRV_VER_STR_SIZE - 4);
    ~~~~~~~~~~~~~~~~~~~~~~~~~

Fixes: 86a2265e59d7 ("qede: add SRIOV support")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agonet/qede: replace strncpy by strlcpy
Andy Green [Mon, 14 May 2018 05:00:37 +0000 (13:00 +0800)]
net/qede: replace strncpy by strlcpy

Fixes: 8427c6647964 ("net/qede/base: add attention formatting string")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agonet/nfp: fix memcpy out of source range
Andy Green [Mon, 14 May 2018 05:00:32 +0000 (13:00 +0800)]
net/nfp: fix memcpy out of source range

drivers/net/nfp/nfp_net.c:669:2: error:
‘memcpy’ forming offset [5, 6] is out of the bounds [0, 4]
of object ‘tmp’ with type ‘uint32_t’ {aka ‘unsigned int’}
[-Werror=array-bounds]
memcpy(&hw->mac_addr[0], &tmp, sizeof(struct ether_addr));

Fixes: e6decee38209 ("net/nfp: use random MAC address if not configured")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Tested-by: Alejandro Lucero <alejandro.lucero@netronome.com>
6 years agonet/nfp: fix off-by-one and no nul on strncpy use
Andy Green [Mon, 14 May 2018 05:00:27 +0000 (13:00 +0800)]
net/nfp: fix off-by-one and no nul on strncpy use

drivers/net/nfp/nfpcore/nfp_resource.c:76:2:error:
‘strncpy’ output may be truncated copying 8 bytes from a string of length 8
[-Werror=stringop-truncation]
  strncpy(name_pad, res->name, sizeof(name_pad));

Fixes: c7e9729da6b5 ("net/nfp: support CPP")

Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
6 years agonet/nfp: fix strncpy misuse
Andy Green [Mon, 14 May 2018 05:00:22 +0000 (13:00 +0800)]
net/nfp: fix strncpy misuse

Fixes: c7e9729da6b5 ("net/nfp: support CPP")

Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Tested-by: Alejandro Lucero <alejandro.lucero@netronome.com>
6 years agonet/nfp: fix buffer overflow of FW strings
Andy Green [Mon, 14 May 2018 05:00:01 +0000 (13:00 +0800)]
net/nfp: fix buffer overflow of FW strings

drivers/net/nfp/nfp_net.c: In function ‘nfp_pf_pci_probe’:
drivers/net/nfp/nfp_net.c:3160: 23: error:
‘%s’ directive writing up to 99 bytes into a region of size 76
[-Werror=format-overflow=]
  sprintf(fw_name, "%s/%s.nffw", DEFAULT_FW_PATH, serial);

Note fw_buf still has to increase somewhat even after
restricting serial[], since otherwise:

drivers/net/nfp/nfp_net.c: In function ‘nfp_pf_pci_probe’:
drivers/net/nfp/nfp_net.c:3176:23:
error: ‘%s’ directive writing up to 99 bytes into a region of size 76
[-Werror=format-overflow=]
  sprintf(fw_name, "%s/%s", DEFAULT_FW_PATH, card);
                       ^~
drivers/net/nfp/nfp_net.c:3262:32:
  err = nfp_fw_upload(dev, nsp, card_desc);
                                ~~~~~~~~~
drivers/net/nfp/nfp_net.c:3176:2:
note: ‘sprintf’ output between 25 and 124 bytes into a destination of size 100
  sprintf(fw_name, "%s/%s", DEFAULT_FW_PATH, card);

Fixes: 896c265ef954 ("net/nfp: use new CPP interface")

Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
6 years agonet/axgbe: fix EEPROM string comparison
Andy Green [Mon, 14 May 2018 05:00:17 +0000 (13:00 +0800)]
net/axgbe: fix EEPROM string comparison

drivers/net/axgbe/axgbe_phy_impl.c:576:6: error:
‘__builtin_memcmp_eq’ reading 16 bytes from a region of size 9
[-Werror=stringop-overflow=]
  if (memcmp(&sfp_eeprom->base[AXGBE_SFP_BASE_VENDOR_NAME],
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      AXGBE_BEL_FUSE_VENDOR, AXGBE_SFP_BASE_VENDOR_NAME_LEN))

Fixes: a5c7273771e8 ("net/axgbe: add phy programming APIs")

Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
6 years agobus/dpaa: fix inconsistent struct alignment
Andy Green [Mon, 14 May 2018 05:00:12 +0000 (13:00 +0800)]
bus/dpaa: fix inconsistent struct alignment

The actual descriptor for qm_mr_entry is 64-byte aligned.

But the original code plays a trick, and puts a u8 common
to the three descriptor subtypes in the union afterwards
outside their structure definitions.

Unfortunately since they compose a struct qm_fd with
alignment 8, this trick destroys the ability of the compiler
to understand what has happened, resulting in this kind of
problem:

drivers/bus/dpaa/include/fsl_qman.h:354:3: error:
alignment 1 of ‘struct <anonymous>’ is less than 8 [-Werror=packed-not-aligned]
   } __packed dcern;

on gcc 8 / Fedora 28 out of the box.

This patch moves the u8 verb into the structure definitions
composed into the union, so the alignment of the parent struct
containing the alignment 8 object can also be seen to be
alignment 8 by the compiler.  Uses of .verb are fixed up to use
.ern.verb (the same offset of +0 inside all the structs in
the union).

The final struct layout should be unchanged.

Fixes: c47ff048b99a ("bus/dpaa: add QMAN driver core routines")
Fixes: f6fadc3e6310 ("bus/dpaa: add QMAN interface driver")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Tested-by: Hemant Agrawal <hemant.agrawal@nxp.com>
6 years agobus/pci: replace strncpy by strlcpy
Andy Green [Mon, 14 May 2018 05:00:06 +0000 (13:00 +0800)]
bus/pci: replace strncpy by strlcpy

In function ‘pci_get_kernel_driver_by_path’,
    inlined from ‘pci_scan_one.isra.1’ at
drivers/bus/pci/linux/pci.c:317:8:
drivers/bus/pci/linux/pci.c:57:3: error:
‘strncpy’ specified bound depends on the length of the source argument
[-Werror=stringop-overflow=]
   strncpy(dri_name, name + 1, strlen(name + 1) + 1);

Fixes: d9a8cd9595f2 ("pci: add kernel driver type")
Cc: stable@dpdk.org
Signed-off-by: Andy Green <andy@warmcat.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
6 years agonet/tap: add default name to tun
Vipin Varghese [Sat, 12 May 2018 06:21:33 +0000 (11:51 +0530)]
net/tap: add default name to tun

The change adds default name to reflect TUN PMD instance. if option
name is not passed, the default dtun is taken.

Signed-off-by: Vipin Varghese <vipin.varghese@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agonet/enic: fix missing offload capabilities
Hyong Youb Kim [Mon, 14 May 2018 14:11:26 +0000 (07:11 -0700)]
net/enic: fix missing offload capabilities

Add the following missing flags to the advertised offloads.
- DEV_RX_OFFLOAD_CRC_STRIP
  CRC is always stripped.
- DEV_RX_OFFLOAD_JUMBO_FRAME
  Jumbo support is always enabled on the NIC.
- DEV_RX_OFFLOAD_SCATTER
  Scatter Rx is currently supported.
- DEV_TX_OFFLOAD_MULTI_SEGS
  Multiple-segment transmit has always been supported.

Fixes: 93fb21fdbe23 ("net/enic: enable overlay offload for VXLAN and GENEVE")

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
6 years agonet/bonding: fix slave activation simultaneously
Matan Azrad [Tue, 24 Apr 2018 11:29:30 +0000 (11:29 +0000)]
net/bonding: fix slave activation simultaneously

The bonding PMD decides to activate\deactivate its slaves according to
the slaves link statuses.
Thus, it registers to the LSC events of the slaves ports and
activates\deactivates them from its LSC callbacks called asynchronously
by the host thread when the slave link status is changed.

In addition, the bonding PMD uses the callback for slave activation
when it tries to start it, this operation is probably called by the
master thread.

Consequently, a slave may be activated in the same time by two
different threads and may cause a lot of optional errors, for example,
slave mempool recreation with the same name causes an error.

Synchronize the critical section in the LSC callback using a special
new spinlock.

Fixes: 414b202343ce ("bonding: fix initial link status of slave")
Fixes: a45b288ef21a ("bond: support link status polling")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
6 years agonet/avf: fix Rx interrupt mapping
Jingjing Wu [Fri, 11 May 2018 15:09:48 +0000 (23:09 +0800)]
net/avf: fix Rx interrupt mapping

Vector used for rx mapping is different if WB_ON_ITR
is supported. The mapping table need to be updated.

Fixes: d6bde6b5eae9 ("net/avf: enable Rx interrupt")

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Cc: stable@dpdk.org
6 years agonet/dpaa2: change VLAN strip value to offload flag
Shreyansh Jain [Mon, 14 May 2018 10:58:26 +0000 (16:28 +0530)]
net/dpaa2: change VLAN strip value to offload flag

Fixes: 0ebce6129bc6 ("net/dpaa2: support new ethdev offload APIs")

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
6 years agonet/cxgbe: free resources during uninit
Rahul Lakkireddy [Sat, 12 May 2018 11:50:00 +0000 (17:20 +0530)]
net/cxgbe: free resources during uninit

Move freeing up resources from dev_close() to dev_uninit(). This fixes
NULL pointer de-reference when accessing adapter context needed by
other ports under same PF, but had been freed up by the first port.
This can happen if only the first port is started up and the check
to free up all resources is still satisfied. When dev_close is
called for other ports, adapter context is NULL since it was freed
up by the first port.

Thus, by moving to dev_uninit() all the ports can be teared down
safely without need for extra checks.

Fixes: 2195df6d11bd ("net/cxgbe: rework ethdev device allocation")

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
6 years agonet/mlx4: advertise supported RSS hash functions
Ophir Munk [Mon, 14 May 2018 10:07:32 +0000 (10:07 +0000)]
net/mlx4: advertise supported RSS hash functions

Advertise mlx4 supported RSS functions as part of dev_infos_get
callback.
Previous to this commit RSS support was reported as none. Since the
introduction of [1] it is required that all RSS configurations will be
verified.

[1] commit 8863a1fbfc66 ("ethdev: add supported hash function check")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
6 years agonet/mlx4: avoid constant recreations in function
Ophir Munk [Mon, 14 May 2018 10:07:31 +0000 (10:07 +0000)]
net/mlx4: avoid constant recreations in function

Function mlx4_conv_rss_types() contains constant arrays variables
which are recreated with every call to the function. By changing the
arrays definitions from "const" to "static const" these recreations
can be saved.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
6 years agonet/mlx5: use correct field in a union structure
Yongseok Koh [Sat, 12 May 2018 01:35:45 +0000 (18:35 -0700)]
net/mlx5: use correct field in a union structure

This is not a bug but it is better to use semantically correct field.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agonet/mlx5: use coherent I/O memory barrier
Yongseok Koh [Sat, 12 May 2018 01:35:44 +0000 (18:35 -0700)]
net/mlx5: use coherent I/O memory barrier

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agonet/mlx5: fix inlining segmented TSO packet
Yongseok Koh [Fri, 11 May 2018 17:39:13 +0000 (10:39 -0700)]
net/mlx5: fix inlining segmented TSO packet

When a multi-segmented packet is inlined, data can be further inlined even
after the first segment. In case of TSO packet, extra inline data after TSO
header should be carried by an inline DSEG which has 4B inline header
recording the length of the inline data. If more than one segment is
inlined, the length doesn't count from the second segment. This will cause
a fault in HW and CQE will have an error, which is ignored by PMD.

Fixes: f895536be4fa ("net/mlx5: enable inlining data from multiple segments")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agoapp/testpmd: check if CRC strip offload supported
Ferruh Yigit [Wed, 9 May 2018 22:09:04 +0000 (23:09 +0100)]
app/testpmd: check if CRC strip offload supported

Testpmd set CRC_STRIP offload blindly, this is wrong according offload
API definition, and will cause error for the PMDs that doesn't support
CRC_STRIP like virtual PMDs.

Check if underlying device report this capability and don't set it if
not supported.

Fixes: 0074d02fca21 ("app/testpmd: convert to new Rx offloads API")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
6 years agonet/failsafe: add an RSS hash update callback
Ophir Munk [Thu, 10 May 2018 14:38:10 +0000 (14:38 +0000)]
net/failsafe: add an RSS hash update callback

Add an RSS hash update callback to eth_dev_ops.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
6 years agoethdev: improve doc for name by port ID API
Ivan Malov [Fri, 11 May 2018 14:05:51 +0000 (15:05 +0100)]
ethdev: improve doc for name by port ID API

Description of rte_eth_dev_get_name_by_port() calls
port ID argument a pointer, which is misleading.
Also, output buffer minimal size is not mentioned.
These points need to be improved.

Fixes: bde516d5a85a ("ethdev: get port by name")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agonet/ixgbe: fix missing port representor data-path
Remy Horton [Fri, 11 May 2018 12:28:45 +0000 (13:28 +0100)]
net/ixgbe: fix missing port representor data-path

This patch adds Rx and Tx burst functions to the ixgbe
Port Representors, so that the implementation within
ixgbe PMD can be tested using applications such as
testpmd which require data-path functionality.

Fixes: cf80ba6e2038 ("net/ixgbe: add support for representor ports")

Signed-off-by: Remy Horton <remy.horton@intel.com>
Acked-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com>
6 years agonet/i40e: fix missing port representor data-path
Remy Horton [Fri, 11 May 2018 12:28:44 +0000 (13:28 +0100)]
net/i40e: fix missing port representor data-path

This patch adds Rx and Tx burst functions to the i40e Port
Representors, so that the implementation within this PMD
can be tested using applications such as testpmd which
require data-path functionality.

Fixes: e0cb96204b71 ("net/i40e: add support for representor ports")

Signed-off-by: Remy Horton <remy.horton@intel.com>
Acked-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com>
6 years agoethdev: fix checking Rx/Tx queue status
Yanglong Wu [Fri, 11 May 2018 08:22:28 +0000 (16:22 +0800)]
ethdev: fix checking Rx/Tx queue status

Relax the check for queue setup, since some device
may not update queue states during dev_stop.

Fixes: cac923cfea47 ("ethdev: support runtime queue setup")

Signed-off-by: Yanglong Wu <yanglong.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
6 years agonet/i40e: print original value for global register change
Beilei Xing [Fri, 11 May 2018 18:17:07 +0000 (02:17 +0800)]
net/i40e: print original value for global register change

Currently, only new value is printed during global
register change. Add original value to help debugging
facility.

Fixes: bc66b9717c50 ("net/i40e: add debug logs when writing global registers")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
6 years agonet/virtio-user: fix multiple queues fail in server mode
Zhiyong Yang [Fri, 11 May 2018 03:31:37 +0000 (11:31 +0800)]
net/virtio-user: fix multiple queues fail in server mode

This patch fixes multiple queues failure when virtio-user works in
server mode.

This patch adds feature negotiation in the processing of virtio-user
connection and enables multiple-queue pairs.

Fixes: bd8f50a45d0f ("net/virtio-user: support server mode")
Cc: stable@dpdk.org
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
6 years agonet/i40e: fix missing VLAN offload capability
Yanglong Wu [Thu, 10 May 2018 06:50:38 +0000 (14:50 +0800)]
net/i40e: fix missing VLAN offload capability

VLAN offload capability should be exposed in VF
since i40e does support it.

Fixes: c3ac7c5b0b8a ("net/i40e: convert to new Rx offloads API")

Signed-off-by: Yanglong Wu <yanglong.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
6 years agonet/i40e: fix missing mbuf fast free offload
Qi Zhang [Fri, 11 May 2018 01:25:42 +0000 (09:25 +0800)]
net/i40e: fix missing mbuf fast free offload

Expose the missing mbuf fast free capability since i40 does
support it.

Fixes: 7497d3e2f777 ("net/i40e: convert to new Tx offloads API")

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
6 years agoethdev: fix port removal notification timing
Matan Azrad [Thu, 10 May 2018 23:58:36 +0000 (01:58 +0200)]
ethdev: fix port removal notification timing

When an ethdev port is released, a destroy event is triggered to notify
the users about the released port.

A bit before the destroy event is triggered, the port becomes invalid
by changing its state to UNUSED and cleaning its data. Therefore, the
port is invalid for the destroy event callback process and the users
may get a wrong information of the port.

Move the destroy event emitting to be called before the port
invalidation.

Fixes: 133b54779aa1 ("ethdev: fix port data reset timing")
Fixes: 29aa41e36de7 ("ethdev: add notifications for probing and removal")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agonet/failsafe: fix sub-device ownership race
Matan Azrad [Thu, 10 May 2018 23:58:35 +0000 (01:58 +0200)]
net/failsafe: fix sub-device ownership race

There is time between the sub-device port probing by the sub-device PMD
to the sub-device port ownership taking by a fail-safe port.

In this time, the port is available for the application usage. For
example, the port will be exposed to the applications which use
RTE_ETH_FOREACH_DEV iterator.

Thus, ownership unaware applications may manage the port in this time
what may cause a lot of problematic behaviors in the fail-safe
sub-device initialization.

Register to the ethdev NEW event to take the sub-device port ownership
before it becomes exposed to the application.

Fixes: a46f8d584eb8 ("net/failsafe: add fail-safe PMD")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoethdev: fix port probing notification
Thomas Monjalon [Thu, 10 May 2018 23:58:34 +0000 (01:58 +0200)]
ethdev: fix port probing notification

The new device was notified as soon as it was allocated.
It leads to use a device which is not yet initialized.

The notification must be published after the initialization is done
by the PMD, but before the state is changed, in order to let
notified entities taking ownership before general availability.

Fixes: 29aa41e36de7 ("ethdev: add notifications for probing and removal")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoethdev: fix port visibility before initialization
Thomas Monjalon [Thu, 10 May 2018 23:58:33 +0000 (01:58 +0200)]
ethdev: fix port visibility before initialization

The port was set to the state ATTACHED during allocation.
The consequence was to iterate over ports which are not initialized.

The state ATTACHED is now set as the last step of probing.

The uniqueness of port name is now checked before the availability
of a port id for allocation (order reversed).

As the state is not set on allocation anymore, it is also not checked
in the function telling whether a port is allocated or not.
The name of the port is set on allocation, so it is enough as a check.

Fixes: 5588909af21b ("ethdev: add device iterator")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoethdev: add lock to port allocation check
Matan Azrad [Thu, 10 May 2018 23:58:32 +0000 (01:58 +0200)]
ethdev: add lock to port allocation check

When comparing the port name, there can be a race condition with
a thread allocating a new port and writing the name at the same time.
It can lead to match with a partial name by error.

The check of the port is now considered as a critical section
protected with locks.

This fix will be even more required for multi-process when the
port availability will rely only on the name, in a following patch.

Fixes: 84934303a17c ("ethdev: synchronize port allocation")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoethdev: allow ownership operations on unused port
Matan Azrad [Thu, 10 May 2018 23:58:31 +0000 (01:58 +0200)]
ethdev: allow ownership operations on unused port

When the state will be updated later than in allocation,
we may need to update the ownership of a port which is
still in state unused.

It will be used to take ownership of a port before it is
declared as available for other entities.

Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoethdev: add probing finish function
Thomas Monjalon [Thu, 10 May 2018 23:58:30 +0000 (01:58 +0200)]
ethdev: add probing finish function

A new hook function is added and called inside the PMDs at the end
of the device probing:
- in primary process, after allocating, init and config
- in secondary process, after attaching and local init

This new function is almost empty for now.
It will be used later to add some post-initialization processing.

For the PMDs calling the helpers rte_eth_dev_create() or
rte_eth_dev_pci_generic_probe(), the hook rte_eth_dev_probing_finish()
is called from here, and not in the PMD itself.

Note that the helper rte_eth_dev_create() could be used more,
especially for vdevs, avoiding some code duplication in PMDs.

Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agodrivers/net: use higher level of probing helper for PCI
Thomas Monjalon [Thu, 10 May 2018 23:58:29 +0000 (01:58 +0200)]
drivers/net: use higher level of probing helper for PCI

The drivers avp, bnx2x and liquidio were using the helper function
rte_eth_dev_pci_allocate() and can be replaced by
rte_eth_dev_pci_generic_probe() which calls the former.

Fixes: dcd5c8112bc3 ("ethdev: add PCI driver helpers")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoethdev: add doxygen comments for each state
Thomas Monjalon [Thu, 10 May 2018 23:58:28 +0000 (01:58 +0200)]
ethdev: add doxygen comments for each state

The enum rte_eth_dev_state was not properly documented.
Its values did not appear in the doxygen output,
and may be misunderstood.

The state RTE_ETH_DEV_DEFERRED has no interest anymore
since the ownership mechanism brings a more flexible categorization.
This state could be removed later.

Fixes: d52268a8b24b ("ethdev: expose device states")
Fixes: cb894d99eceb ("ethdev: add deferred intermediate device state")
Fixes: 5b7ba31148a8 ("ethdev: add port ownership")
Fixes: 7106edc12380 ("ethdev: add devop to check removal status")

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agonet/failsafe: fix sub-device visibility
Thomas Monjalon [Thu, 10 May 2018 23:58:27 +0000 (01:58 +0200)]
net/failsafe: fix sub-device visibility

The iterator function rte_eth_find_next_owned_by(), used by the
iterator macro RTE_ETH_FOREACH_DEV_OWNED_BY, are ignoring the devices
which are neither ATTACHED nor REMOVED. Thus sub-devices, having
the state DEFERRED, cannot be seen with the ethdev iterator.
The state RTE_ETH_DEV_DEFERRED can be replaced by
RTE_ETH_DEV_ATTACHED + owner.

Fixes: dcd0c9c32b8d ("net/failsafe: use ownership mechanism for slaves")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoethdev: fix debug log of owner id
Thomas Monjalon [Thu, 10 May 2018 23:58:26 +0000 (01:58 +0200)]
ethdev: fix debug log of owner id

The owner id is 64-bit.
On 32-bit environment, it must be printed with PRIX64.

Fixes: 5b7ba31148a8 ("ethdev: add port ownership")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
6 years agoapp/testpmd: add commands to test new offload API
Wei Dai [Wed, 9 May 2018 12:13:50 +0000 (20:13 +0800)]
app/testpmd: add commands to test new offload API

Add following testpmd run-time commands to support test of
new Rx offload API:
show port <port_id> rx_offload capabilities
show port <port_id> rx_offload configuration
port config <port_id> rx_offload <offload> on|off
port <port_id> rxq <queue_id> rx_offload <offload> on|off
Above last 2 commands should be run when the port is stopped.
And <offload> can be one of "vlan_strip", "ipv4_cksum", ...

Add following testpmd run-time commands to support test of
new Tx offload API:
show port <port_id> tx_offload capabilities
show port <port_id> tx_offload configuration
port config <port_id> tx_offload <offload> on|off
port <port_id> txq <queue_id> tx_offload <offload> on|off
Above last 2 commands should be run when the port is stopped.
And <offload> can be one of "vlan_insert", "udp_cksum", ...

Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
6 years agonet/i40e: print global register change info
Beilei Xing [Thu, 10 May 2018 22:48:38 +0000 (06:48 +0800)]
net/i40e: print global register change info

Global register change info during enabling
flexible payload is not printed.
This patch changes macro to print the global
register change info.

Fixes: d2f9fe8ae309 ("net/i40e: turn off flexible payload on driver init")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
6 years agonet/sfc: fix inner TCP/UDP checksum offload control
Andrew Rybchenko [Thu, 10 May 2018 11:59:42 +0000 (12:59 +0100)]
net/sfc: fix inner TCP/UDP checksum offload control

If application uses Tx offload API and sets ETH_TXQ_FLAGS_IGNORE flag,
it still should have inner TCP/UDP checksum offload enabled if it is
supported and TCP/UDP checksum offload is requested.

Fixes: c78d280e88ef ("net/sfc: convert to new Tx offload API")
Cc: stable@dpdk.org
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
6 years agodoc: add XXV710 support in i40e guide
Qiming Yang [Thu, 10 May 2018 11:10:25 +0000 (19:10 +0800)]
doc: add XXV710 support in i40e guide

Signed-off-by: Qiming Yang <qiming.yang@intel.com>
6 years agonet/enic: fix flow drop action
Hyong Youb Kim [Thu, 10 May 2018 08:51:13 +0000 (01:51 -0700)]
net/enic: fix flow drop action

Drop is a fate-deciding action, so mark it as FATE. It was missing in
a previous commit.

Fixes: cc17feb90413 ("ethdev: alter behavior of flow API actions")

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
6 years agonet/e1000: report Tx multi segment offload
Wei Dai [Thu, 10 May 2018 03:56:50 +0000 (11:56 +0800)]
net/e1000: report Tx multi segment offload

This feature has been confirmed with testpmd:
testpmd> set fwd txonly
testpmd> port stop all
testpmd> port config all txd 1024
testpmd> set txsplit on
testpmd> set txpkts 70,80,90,100
testpmd> start
It can be observed at peer port that UDP packets
with UDP data length 298 bytes.

Signed-off-by: Wei Dai <wei.dai@intel.com>
6 years agonet/i40e: fix link status update
Beilei Xing [Thu, 10 May 2018 02:26:29 +0000 (10:26 +0800)]
net/i40e: fix link status update

Link status is not updated correctly, link speed is 0
when link is up and link speed is not 0 when link is
down. This patch fixes the issue.

Fixes: eef2daf2e199 ("net/i40e: fix link update no wait")
Cc: stable@dpdk.org
Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
6 years agoapp/testpmd: fix device configure with zero queue
Qi Zhang [Thu, 10 May 2018 02:22:02 +0000 (10:22 +0800)]
app/testpmd: fix device configure with zero queue

Setup number of Rx & Tx queues to 0 at rte_eth_dev_configure means
take driver's default queue number, so if during a re-configuration
previous queue number will be overwrite, this is not expected when
we configure dcb. The patch fix it by re-configure device with the
original queue number.

Fixes: 3be82f5cc5e ("ethdev: support PMD-tuned Tx/Rx parameters")

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
6 years agonet/mlx5: add Multi-Packet Rx support
Yongseok Koh [Wed, 9 May 2018 11:13:50 +0000 (04:13 -0700)]
net/mlx5: add Multi-Packet Rx support

Multi-Packet Rx Queue (MPRQ a.k.a Striding RQ) can further save PCIe
bandwidth by posting a single large buffer for multiple packets. Instead of
posting a buffer per a packet, one large buffer is posted in order to
receive multiple packets on the buffer. A MPRQ buffer consists of multiple
fixed-size strides and each stride receives one packet.

Rx packet is mem-copied to a user-provided mbuf if the size of Rx packet is
comparatively small, or PMD attaches the Rx packet to the mbuf by external
buffer attachment - rte_pktmbuf_attach_extbuf(). A mempool for external
buffers will be allocated and managed by PMD.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
6 years agonet/mlx5: add a function to rdma-core glue
Yongseok Koh [Wed, 9 May 2018 11:13:49 +0000 (04:13 -0700)]
net/mlx5: add a function to rdma-core glue

mlx5dv_create_wq() is added for the Multi-Packet RQ (a.k.a Striding RQ).

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
6 years agonet/mlx5: separate filling Rx flags
Yongseok Koh [Wed, 9 May 2018 11:13:48 +0000 (04:13 -0700)]
net/mlx5: separate filling Rx flags

Filling in fields of mbuf becomes a separate inline function so that this
can be reused.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
6 years agonet/mlx4: add new memory region support
Yongseok Koh [Wed, 9 May 2018 11:09:06 +0000 (04:09 -0700)]
net/mlx4: add new memory region support

This is the new design of Memory Region (MR) for mlx PMD, in order to:
- Accommodate the new memory hotplug model.
- Support non-contiguous Mempool.

There are multiple layers for MR search.

L0 is to look up the last-hit entry which is pointed by mr_ctrl->mru (Most
Recently Used). If L0 misses, L1 is to look up the address in a fixed-sized
array by linear search. L0/L1 is in an inline function -
mlx4_mr_lookup_cache().

If L1 misses, the bottom-half function is called to look up the address
from the bigger local cache of the queue. This is L2 - mlx4_mr_addr2mr_bh()
and it is not an inline function. Data structure for L2 is the Binary Tree.

If L2 misses, the search falls into the slowest path which takes locks in
order to access global device cache (priv->mr.cache) which is also a B-tree
and caches the original MR list (priv->mr.mr_list) of the device. Unless
the global cache is overflowed, it is all-inclusive of the MR list. This is
L3 - mlx4_mr_lookup_dev(). The size of the L3 cache table is limited and
can't be expanded on the fly due to deadlock. Refer to the comments in the
code for the details - mr_lookup_dev(). If L3 is overflowed, the list will
have to be searched directly bypassing the cache although it is slower.

If L3 misses, a new MR for the address should be created -
mlx4_mr_create(). When it creates a new MR, it tries to register adjacent
memsegs as much as possible which are virtually contiguous around the
address. This must take two locks - memory_hotplug_lock and
priv->mr.rwlock. Due to memory_hotplug_lock, there can't be any
allocation/free of memory inside.

In the free callback of the memory hotplug event, freed space is searched
from the MR list and corresponding bits are cleared from the bitmap of MRs.
This can fragment a MR and the MR will have multiple search entries in the
caches. Once there's a change by the event, the global cache must be
rebuilt and all the per-queue caches will be flushed as well. If memory is
frequently freed in run-time, that may cause jitter on dataplane processing
in the worst case by incurring MR cache flush and rebuild. But, it would be
the least probable scenario.

To guarantee the most optimal performance, it is highly recommended to use
an EAL option - '--socket-mem'. Then, the reserved memory will be pinned
and won't be freed dynamically. And it is also recommended to configure
per-lcore cache of Mempool. Even though there're many MRs for a device or
MRs are highly fragmented, the cache of Mempool will be much helpful to
reduce misses on per-queue caches anyway.

'--legacy-mem' is also supported.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agonet/mlx4: remove memory region support
Yongseok Koh [Wed, 9 May 2018 11:09:05 +0000 (04:09 -0700)]
net/mlx4: remove memory region support

This patch removes current support of Memory Region (MR) in order to
accommodate the dynamic memory hotplug patch. This patch can be compiled
but traffic can't flow and HW will raise faults. Subsequent patches will
add new MR support.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agonet/mlx5: add new memory region support
Yongseok Koh [Wed, 9 May 2018 11:09:04 +0000 (04:09 -0700)]
net/mlx5: add new memory region support

This is the new design of Memory Region (MR) for mlx PMD, in order to:
- Accommodate the new memory hotplug model.
- Support non-contiguous Mempool.

There are multiple layers for MR search.

L0 is to look up the last-hit entry which is pointed by mr_ctrl->mru (Most
Recently Used). If L0 misses, L1 is to look up the address in a fixed-sized
array by linear search. L0/L1 is in an inline function -
mlx5_mr_lookup_cache().

If L1 misses, the bottom-half function is called to look up the address
from the bigger local cache of the queue. This is L2 - mlx5_mr_addr2mr_bh()
and it is not an inline function. Data structure for L2 is the Binary Tree.

If L2 misses, the search falls into the slowest path which takes locks in
order to access global device cache (priv->mr.cache) which is also a B-tree
and caches the original MR list (priv->mr.mr_list) of the device. Unless
the global cache is overflowed, it is all-inclusive of the MR list. This is
L3 - mlx5_mr_lookup_dev(). The size of the L3 cache table is limited and
can't be expanded on the fly due to deadlock. Refer to the comments in the
code for the details - mr_lookup_dev(). If L3 is overflowed, the list will
have to be searched directly bypassing the cache although it is slower.

If L3 misses, a new MR for the address should be created -
mlx5_mr_create(). When it creates a new MR, it tries to register adjacent
memsegs as much as possible which are virtually contiguous around the
address. This must take two locks - memory_hotplug_lock and
priv->mr.rwlock. Due to memory_hotplug_lock, there can't be any
allocation/free of memory inside.

In the free callback of the memory hotplug event, freed space is searched
from the MR list and corresponding bits are cleared from the bitmap of MRs.
This can fragment a MR and the MR will have multiple search entries in the
caches. Once there's a change by the event, the global cache must be
rebuilt and all the per-queue caches will be flushed as well. If memory is
frequently freed in run-time, that may cause jitter on dataplane processing
in the worst case by incurring MR cache flush and rebuild. But, it would be
the least probable scenario.

To guarantee the most optimal performance, it is highly recommended to use
an EAL option - '--socket-mem'. Then, the reserved memory will be pinned
and won't be freed dynamically. And it is also recommended to configure
per-lcore cache of Mempool. Even though there're many MRs for a device or
MRs are highly fragmented, the cache of Mempool will be much helpful to
reduce misses on per-queue caches anyway.

'--legacy-mem' is also supported.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agonet/mlx5: remove memory region support
Yongseok Koh [Wed, 9 May 2018 11:09:03 +0000 (04:09 -0700)]
net/mlx5: remove memory region support

This patch removes current support of Memory Region (MR) in order to
accommodate the dynamic memory hotplug patch. This patch can be compiled
but traffic can't flow and HW will raise faults. Subsequent patches will
add new MR support.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agoethdev: new Rx/Tx offloads API
Wei Dai [Thu, 10 May 2018 11:56:55 +0000 (19:56 +0800)]
ethdev: new Rx/Tx offloads API

This patch check if a input requested offloading is valid or not.
Any reuqested offloading must be supported in the device capabilities.
Any offloading is disabled by default if it is not set in the parameter
dev_conf->[rt]xmode.offloads to rte_eth_dev_configure() and
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If any offloading is enabled in rte_eth_dev_configure() by application,
it is enabled on all queues no matter whether it is per-queue or
per-port type and no matter whether it is set or cleared in
[rt]x_conf->offloads to rte_eth_[rt]x_queue_setup().
If a per-queue offloading hasn't be enabled in rte_eth_dev_configure(),
it can be enabled or disabled for individual queue in
ret_eth_[rt]x_queue_setup().
A new added offloading is the one which hasn't been enabled in
rte_eth_dev_configure() and is reuqested to be enabled in
rte_eth_[rt]x_queue_setup(), it must be per-queue type,
otherwise trigger an error log.
The underlying PMD must be aware that the requested offloadings
to PMD specific queue_setup() function only carries those
new added offloadings of per-queue type.

This patch can make above such checking in a common way in rte_ethdev
layer to avoid same checking in underlying PMD.

This patch assumes that all PMDs in 18.05-rc2 have already
converted to offload API defined in 17.11 . It also assumes
that all PMDs can return correct offloading capabilities
in rte_eth_dev_infos_get().

In the beginning of [rt]x_queue_setup() of underlying PMD,
add offloads = [rt]xconf->offloads |
dev->data->dev_conf.[rt]xmode.offloads; to keep same as offload API
defined in 17.11 to avoid upper application broken due to offload
API change.
PMD can use the info that input [rt]xconf->offloads only carry
the new added per-queue offloads to do some optimization or some
code change on base of this patch.

Signed-off-by: Wei Dai <wei.dai@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
6 years agonet/mlx5: change device reference for secondary process
Yongseok Koh [Wed, 9 May 2018 11:04:50 +0000 (04:04 -0700)]
net/mlx5: change device reference for secondary process

rte_eth_devices[] is not shared between primary and secondary process, but
a static array to each process. The reverse pointer of device (priv->dev)
is invalid. Instead, priv has the pointer to shared data of the device,
  struct rte_eth_dev_data *dev_data;

Two macros are added,
  #define PORT_ID(priv) ((priv)->dev_data->port_id)
  #define ETH_DEV(priv) (&rte_eth_devices[PORT_ID(priv)])

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years agonet/mlx4: fix CRC stripping capability report
Ophir Munk [Tue, 8 May 2018 12:26:03 +0000 (12:26 +0000)]
net/mlx4: fix CRC stripping capability report

There are two capabilities related to CRC stripping:
1. mlx4 HW capability to perform CRC stripping on a received packet.
This capability is built in mlx4 HW. It should be returned by the API
call mlx4_get_rx_queue_offloads().
2. mlx4 driver capability to enable/disable HW CRC stripping. This
capability is dependent on the driver version.

Before this commit the second capability was falsely returned by
the mentioned API. This commit fixes it by returning the first
capability.
mlx4 HW performs CRC stripping by default and this capability is
always reported as "true".

The ability to enable/disable CRC stripping is supported since this
commit and requires OFED version 4.3-1.5.0.0 or rdma-core version v18.
CRC stripping will be done by default regardless of its configuration
when working with OFED or rdma-core versions earlier than those
previously specified or before this commit.

Fixes: de1df14e6e6ec ("net/mlx4: support CRC strip toggling")
Cc: stable@dpdk.org
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
6 years agoethdev: fix corrupted device info in configure
Ferruh Yigit [Wed, 9 May 2018 22:16:49 +0000 (23:16 +0100)]
ethdev: fix corrupted device info in configure

Calling dev_infos_get() devops directly in rte_eth_dev_configure cause
random values in uninitialized fields because devops doesn't reset the
dev_info structure.

Call rte_eth_dev_info_get() API instead which memset the struct.

Also remove duplicated dev_infos_get existence check.

Fixes: 3be82f5cc5e3 ("ethdev: support PMD-tuned Tx/Rx parameters")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agonet/failsafe: fix probe cleanup
Raslan Darawsheh [Wed, 9 May 2018 15:57:39 +0000 (18:57 +0300)]
net/failsafe: fix probe cleanup

The hot-plug alarm mechanism is responsible to practically execute both
plug in and out operations. It periodically tries to detect missed
sub-devices to be reconfigured and clean the resources of the removed
sub-devices.

The hot-plug alarm is started by the failsafe probe function, and it's
wrongly not stopped if failsafe instance got an error. for example
when starting failsafe with a MAC option, and giving it an invalid MAC
address this will lead to a NULL pointer for the dev private field. Then
when the hotplug alarm is called it will try to access this pointer,
which will lead to a segmentation fault.

Uninstall the hot-plug alarm in case of error in probe function.

Fixes: ebea83f8 ("net/failsafe: add plug-in support")
Cc: stable@dpdk.org
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
6 years agonet/failsafe: advertise supported RSS functions
Ophir Munk [Wed, 9 May 2018 13:46:41 +0000 (13:46 +0000)]
net/failsafe: advertise supported RSS functions

Advertise failsafe supported RSS functions as part of dev_infos_get
callback. Set failsafe default RSS hash functions to be:
ETH_RSS_IP, ETH_RSS_UDP, and ETH_RSS_TCP.
The result of failsafe RSS hash functions is the logical AND of the
RSS hash functions among all failsafe sub_devices and failsafe own
defaults.

Previous to this commit RSS support was reported as none. Since the
introduction of [1] it is required that all RSS configurations be
verified.

[1] commit 8863a1fbfc66 ("ethdev: add supported hash function check")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Gaetan Rivet <gaetan.rivet@6wind.com>
6 years agonet/dpaa: update optimal burst size in device info
Shreyansh Jain [Wed, 9 May 2018 09:49:44 +0000 (15:19 +0530)]
net/dpaa: update optimal burst size in device info

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
6 years agonet/dpaa: fix max push mode queue
Shreyansh Jain [Wed, 9 May 2018 09:49:43 +0000 (15:19 +0530)]
net/dpaa: fix max push mode queue

Split default and max push mode queues to 4 and 8, respectively.

Fixes: 0c504f6950b6 ("net/dpaa: support push mode")
Cc: stable@dpdk.org
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>