dpdk.git
Tiwei Bie [Tue, 19 Mar 2019 06:43:12 +0000 (14:43 +0800)]
net/virtio: improve batching in standard Rx path

This patch improves descriptor refill by using the same
batching strategy as the in-order and mergeable paths.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tiwei Bie [Tue, 19 Mar 2019 06:43:11 +0000 (14:43 +0800)]
net/virtio: add control queue helper for split ring

Add a helper for sending commands in split ring to make the
code consistent with the corresponding code in packed ring.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tiwei Bie [Tue, 19 Mar 2019 06:43:10 +0000 (14:43 +0800)]
net/virtio: add interrupt helper for split ring

Add a helper for disabling interrupts in split ring to make the
code consistent with the corresponding code in packed ring.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tiwei Bie [Tue, 19 Mar 2019 06:43:09 +0000 (14:43 +0800)]
net/virtio: drop unused field in Tx region structure

Drop the unused field tx_indir_pq from virtio_tx_region
structure.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tiwei Bie [Tue, 19 Mar 2019 06:43:08 +0000 (14:43 +0800)]
net/virtio: drop redundant suffix in packed ring structure

Drop redundant suffix (_packed and _event) from the fields in
packed ring structure.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tiwei Bie [Tue, 19 Mar 2019 06:43:07 +0000 (14:43 +0800)]
net/virtio: refactor virtqueue structure

Put split ring and packed ring specific fields into separate
sub-structures, and place them in a union, as they are never
in use at the same time.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
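
A minimal sketch of the layout idea described in this commit, not the driver's actual definition. All struct and field names here are hypothetical; they only illustrate ring-specific state grouped into two sub-structures that share storage through a union.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical split-ring-only state. */
struct demo_split_state {
    uint16_t avail_idx_shadow;
};

/* Hypothetical packed-ring-only state. */
struct demo_packed_state {
    uint16_t cached_flags;
    bool     used_wrap_counter;
};

/* A queue uses one ring layout for its whole lifetime,
 * so the two states can overlap in memory. */
struct demo_virtqueue {
    bool is_packed;              /* selects the valid union member */
    union {
        struct demo_split_state  split;
        struct demo_packed_state packed;
    } ring;
    uint16_t free_cnt;           /* state common to both layouts */
};

/* Initialise only the member that matches the ring layout. */
void demo_vq_init(struct demo_virtqueue *vq, bool is_packed, uint16_t size)
{
    vq->is_packed = is_packed;
    vq->free_cnt = size;
    if (is_packed) {
        vq->ring.packed.cached_flags = 0;
        vq->ring.packed.used_wrap_counter = true;
    } else {
        vq->ring.split.avail_idx_shadow = 0;
    }
}
```

Code that touches ring-specific state would check `is_packed` first and then read only the matching union member.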
Tiwei Bie [Tue, 19 Mar 2019 06:43:06 +0000 (14:43 +0800)]
net/virtio: optimize flags update for packed ring

Cache the AVAIL, USED and WRITE bits to avoid recalculating
them as much as possible. Note that the WRITE bit isn't
cached for the control queue.

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
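
The caching idea can be sketched as follows. This is an illustration under assumptions, not the driver's code: the struct and function names are hypothetical, and only the bit positions (WRITE = 0x2, AVAIL = bit 7, USED = bit 15) come from the virtio 1.1 packed ring layout. The flags word is computed once and only toggled when the ring wraps, instead of being rebuilt for every descriptor.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_DESC_F_WRITE 0x2        /* buffer is device-writable */
#define DEMO_DESC_F_AVAIL (1 << 7)   /* packed ring AVAIL bit */
#define DEMO_DESC_F_USED  (1 << 15)  /* packed ring USED bit */

/* Hypothetical per-queue cache of the descriptor flags. */
struct demo_flag_cache {
    uint16_t cached_flags;
    uint16_t avail_idx;
    uint16_t size;
};

void demo_flag_cache_init(struct demo_flag_cache *c, uint16_t size, bool is_rx)
{
    /* The wrap counter starts at 1: AVAIL set, USED clear. */
    c->cached_flags = DEMO_DESC_F_AVAIL;
    if (is_rx)
        c->cached_flags |= DEMO_DESC_F_WRITE; /* Rx buffers are written by the device */
    c->avail_idx = 0;
    c->size = size;
}

/* Return the flags for the next descriptor and advance the index. */
uint16_t demo_flag_cache_next(struct demo_flag_cache *c)
{
    uint16_t flags = c->cached_flags;

    if (++c->avail_idx >= c->size) {
        c->avail_idx = 0;
        /* Wrap: flip AVAIL and USED once, instead of per descriptor. */
        c->cached_flags ^= DEMO_DESC_F_AVAIL | DEMO_DESC_F_USED;
    }
    return flags;
}
```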
Tiwei Bie [Tue, 19 Mar 2019 06:43:05 +0000 (14:43 +0800)]
net/virtio: add barrier in interrupt enable

Typically, after enabling the Rx interrupt, a check should be done
to make sure that there are no new incoming packets before going
to sleep. So a barrier is needed to make sure that any following
check won't happen before the interrupt is actually enabled.

Fixes: c056be239db5 ("net/virtio: add Rx interrupt enable/disable functions")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
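
The enable-then-recheck pattern described in this commit might look roughly like the sketch below. It uses C11 atomics rather than DPDK's own barrier macros, and the queue structure and function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical Rx queue state shared with the device. */
struct demo_rxq {
    _Atomic uint16_t intr_enabled;  /* written by the driver, read by the device */
    _Atomic uint16_t used_idx;      /* written by the device */
    uint16_t last_seen_idx;         /* driver-private */
};

void demo_rxq_init(struct demo_rxq *q)
{
    atomic_init(&q->intr_enabled, 0);
    atomic_init(&q->used_idx, 0);
    q->last_seen_idx = 0;
}

/* Enable the interrupt, then re-check the ring.
 * Returns true if it is safe to sleep. */
bool demo_rx_can_sleep(struct demo_rxq *q)
{
    atomic_store_explicit(&q->intr_enabled, 1, memory_order_relaxed);

    /* Full barrier: the check below must not be performed
     * before the enable store is visible. */
    atomic_thread_fence(memory_order_seq_cst);

    uint16_t used = atomic_load_explicit(&q->used_idx, memory_order_relaxed);
    if (used != q->last_seen_idx) {
        /* Packets arrived in the meantime: go back to polling. */
        atomic_store_explicit(&q->intr_enabled, 0, memory_order_relaxed);
        return false;
    }
    return true;
}
```

Without the fence, the load of `used_idx` could be ordered before the enable store, and a packet arriving in between would neither be seen by the check nor raise an interrupt.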
Tiwei Bie [Tue, 19 Mar 2019 06:43:04 +0000 (14:43 +0800)]
net/virtio: fix interrupt helper for packed ring

When disabling interrupts, the shadow event flags should also be
updated accordingly. The unnecessary wmb is also dropped.

Fixes: e9f4feb7e622 ("net/virtio: add packed virtqueue helpers")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tiwei Bie [Tue, 19 Mar 2019 06:43:03 +0000 (14:43 +0800)]
net/virtio: fix typo in packed ring init

The pointer to event structure should be cast to uintptr_t first.

Fixes: f803734b0f2e ("net/virtio: vring init for packed queues")
Cc: stable@dpdk.org
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Jiayu Hu [Sun, 17 Mar 2019 06:38:32 +0000 (14:38 +0800)]
vhost: fix interrupt suppression for the split ring

The VIRTIO_RING_F_EVENT_IDX feature of split ring might
be broken, as the value of signalled_used is invalid
after live migration, start up and virtio driver reload.
This patch fixes it by using signalled_used_valid.

In addition, this patch makes the VIRTIO_RING_F_EVENT_IDX
implementation of split ring match the kernel backend, so that
more interrupts are suppressed.

Fixes: e37ff954405a ("vhost: support virtqueue interrupt/notification suppression")
Cc: stable@dpdk.org
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
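
A sketch of how a validity flag can guard the event-index comparison. The `demo_vring_need_event()` helper uses the standard comparison from the virtio specification; the struct and the `demo_should_notify()` function are hypothetical and only mirror the field names mentioned in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Standard virtio event-index test: has the ring moved past event_idx
 * since the last signal? All arithmetic is modulo 2^16. */
static inline bool demo_vring_need_event(uint16_t event_idx, uint16_t new_idx,
                                         uint16_t old_idx)
{
    return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old_idx);
}

/* Hypothetical subset of the backend's per-queue state. */
struct demo_vhost_vq {
    uint16_t signalled_used;        /* used index at the last notification check */
    bool     signalled_used_valid;  /* false after migration, start-up, driver reload */
};

/* Decide whether the guest must be interrupted for new_used. */
bool demo_should_notify(struct demo_vhost_vq *vq, uint16_t used_event,
                        uint16_t new_used)
{
    uint16_t old = vq->signalled_used;
    bool was_valid = vq->signalled_used_valid;

    vq->signalled_used = new_used;
    vq->signalled_used_valid = true;

    /* A stale signalled_used cannot be trusted: always notify. */
    if (!was_valid)
        return true;

    return demo_vring_need_event(used_event, new_used, old);
}
```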
Tiwei Bie [Tue, 12 Mar 2019 07:13:07 +0000 (15:13 +0800)]
net/virtio-user: fix multiqueue with vhost kernel

The multiqueue support in virtio-user with the vhost kernel backend
is broken when the tap name isn't specified explicitly by the user,
because the tap name returned by ioctl(TUNSETIFF) isn't saved
properly, so multiple tap interfaces are created in this
case. Fix this by saving the dynamically allocated tap name
before reusing the ifr structure. Also make it possible to
support a format string in a tap name (e.g. foo%d) specified
explicitly by the user.

Fixes: 791b43e08842 ("net/virtio-user: specify MAC of the tap")
Cc: stable@dpdk.org
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Maxime Coquelin [Thu, 28 Feb 2019 17:57:04 +0000 (18:57 +0100)]
vhost: prevent disabled rings to be processed with zero-copy

The vhost-user spec says that once the vring is disabled, the
client has to stop processing it. With dequeue zero-copy enabled,
however, processing can continue if outstanding descriptor buffers
are still being used by an external NIC or another guest.

The fix drains the zmbufs list to ensure that no descriptor
buffers remain outstanding.

Note that this fix only works when the REPLY_ACK protocol
feature is enabled, which is not the case by default for now
(it is only enabled when the IOMMU feature is enabled in the
vhost library).

Fixes: b0a985d1f340 ("vhost: add dequeue zero copy")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
Stephen Hemminger [Wed, 13 Mar 2019 21:58:15 +0000 (14:58 -0700)]
net/octeontx: fix vdev name

The octeontx driver is creating a vdev with the name "OCTEONTX_PMD",
which is an artifact of how RTE_PMD_REGISTER_VDEV arguments
work.

Change it to use the same convention as all the other network
drivers (i.e. "net_octeontx").

Fixes: f7be70e5130e ("net/octeontx: add net device probe and remove")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Hyong Youb Kim [Thu, 14 Mar 2019 11:05:32 +0000 (04:05 -0700)]
net/enic: fix max MTU calculation

The maximum packet length (max_pkt_len) from the firmware does not
include CRC, so do not subtract 4 when deriving the max MTU. This
change effectively increases the max MTU by 4B. Apps often assume max
MTU = max_rx_pkt_len - 14 (ethernet header), and attempt to set the
MTU to that value (i.e. set MTU to max HW value). This change
incidentally allows such apps to change MTU to max value successfully.

Fixes: bb34ffb848a0 ("net/enic: determine max egress packet size and max MTU")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
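
The arithmetic in this commit can be illustrated with a small sketch. The function names are hypothetical, and the "old" variant is a reading of the commit message (header length plus 4 bytes of CRC subtracted), not a copy of the driver code.

```c
#include <stdint.h>

#define DEMO_ETHER_HDR_LEN 14u  /* Ethernet header, as used in the commit message */
#define DEMO_ETHER_CRC_LEN 4u   /* frame check sequence */

/* Assumed old behaviour: CRC subtracted although max_pkt_len excludes it. */
uint32_t demo_max_mtu_old(uint32_t max_pkt_len)
{
    return max_pkt_len - (DEMO_ETHER_HDR_LEN + DEMO_ETHER_CRC_LEN);
}

/* Behaviour after the fix: only the Ethernet header is subtracted. */
uint32_t demo_max_mtu_new(uint32_t max_pkt_len)
{
    return max_pkt_len - DEMO_ETHER_HDR_LEN;
}
```

With a hypothetical firmware value of 9022 bytes, the old derivation gives a max MTU of 9004 and the new one gives 9008, so an application that sets the MTU to `max_rx_pkt_len - 14` now succeeds.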
Thomas Monjalon [Wed, 13 Mar 2019 10:09:09 +0000 (11:09 +0100)]
examples/ethtool: remove query of default config

The default config is used if the setup parameter is NULL.
No need to query the default config with rte_eth_dev_info_get().
The function call will be removed with another useless info.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
Igor Russkikh [Tue, 12 Mar 2019 15:25:10 +0000 (15:25 +0000)]
net/atlantic: fix xstats return

The maximum number of xstats items was returned instead of the
actual number of filled-in records.

Fixes: fbe059e87209 ("net/atlantic: implement device statistics")
Cc: stable@dpdk.org
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
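
A simplified sketch of the distinction this fix draws, with hypothetical names and a deliberately reduced interface (it does not reproduce the ethdev xstats callback signature): the function reports how many records it actually wrote, not the size of its internal table.

```c
#include <stdint.h>

/* Copy up to capacity counters into out and return the number of
 * records actually filled in, not the total that exists. */
int demo_xstats_get(uint64_t *out, unsigned int capacity,
                    const uint64_t *counters, unsigned int n_counters)
{
    unsigned int n = capacity < n_counters ? capacity : n_counters;
    unsigned int i;

    for (i = 0; i < n; i++)
        out[i] = counters[i];

    return (int)i;
}
```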
Igor Russkikh [Tue, 12 Mar 2019 15:25:07 +0000 (15:25 +0000)]
net/atlantic: fix missing VLAN filter offload

The original VLAN offload code declared callbacks, but did not
enable the feature offload bit.

Fixes: f7c2c2c8c558 ("net/atlantic: implement VLAN filters and offloads")
Cc: stable@dpdk.org
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Igor Russkikh [Tue, 12 Mar 2019 15:25:05 +0000 (15:25 +0000)]
net/atlantic: eliminate excessive log levels on Rx/Tx

Default rxtx logging used the ERR level, which caused the logger to
always trigger. That may cause performance degradation even if the
logger was compiled in but not enabled.

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Igor Russkikh [Tue, 12 Mar 2019 15:25:03 +0000 (15:25 +0000)]
net/atlantic: fix link configuration

If the link speed is reconfigured after port start, the driver does
not take the requested speed value, but instead just sets the full
autoneg mask.

Fixes: 7943ba05f67c ("net/atlantic: add link status and interrupt management")
Cc: stable@dpdk.org
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Pavel Belous [Tue, 12 Mar 2019 15:25:01 +0000 (15:25 +0000)]
net/atlantic: fix EEPROM get for small and uneven lengths

Fixes: ce4e8d418097 ("net/atlantic: implement EEPROM get/set")
Cc: stable@dpdk.org
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Pavel Belous [Tue, 12 Mar 2019 15:24:59 +0000 (15:24 +0000)]
net/atlantic: use EEPROM magic as a device address

Default dev addr is replaced with magic field from the request.
Length is allowed to be less than maximum.
SMBUS access bit definitions are also better organised now.

Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Pavel Belous [Tue, 12 Mar 2019 15:24:57 +0000 (15:24 +0000)]
net/atlantic: fix buffer overflow

Found by Coverity scan. This is a real memory corruption.
There is no need for extra RTE_ALIGN macros since the
request/result structures are 4-byte aligned by definition.

Coverity issue: 323518, 323520
Fixes: ce4e8d418097 ("net/atlantic: implement EEPROM get/set")
Cc: stable@dpdk.org
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: Pavel Belous <pavel.belous@aquantia.com>
Igor Russkikh [Tue, 12 Mar 2019 15:24:55 +0000 (15:24 +0000)]
net/atlantic: remove extra checks for error codes

Found by Coverity scan. The checks are useless
because err is always zero at these points in the code.

Fixes: 86d36773bd42 ("net/atlantic: implement firmware operations")
Cc: stable@dpdk.org
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Igor Russkikh [Tue, 12 Mar 2019 15:24:53 +0000 (15:24 +0000)]
net/atlantic: remove unused variable

Found by Coverity scan.

Coverity issue: 323512
Fixes: 7906661edac6 ("net/atlantic: add b0 hardware layer")
Cc: stable@dpdk.org
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Igor Russkikh [Tue, 12 Mar 2019 15:24:52 +0000 (15:24 +0000)]
net/atlantic: fix negative error codes

These only break the rte_errno convention;
they cause no real harm.

Fixes: 2b1472d7150c ("net/atlantic: implement Tx path")
Cc: stable@dpdk.org
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Shahed Shaikh [Tue, 12 Mar 2019 16:51:14 +0000 (09:51 -0700)]
net/qede: fix Rx packet drop

There is a corner case in which the driver won't post
receive buffers when it has processed all received packets
in a single loop (i.e. hw_consumer == sw_consumer), and then
the HW starts dropping packets since it does not see any new
receive buffers posted.

This corner case is seen when the size of the Rx ring is less than or
equal to the Rx packet burst count for dev->rx_pkt_burst().

Fixes: 8f2312474529 ("net/qede: fix performance bottleneck in Rx path")
Cc: stable@dpdk.org
Signed-off-by: Shahed Shaikh <shshaikh@marvell.com>
Acked-by: Rasesh Mody <rmody@marvell.com>
Rami Rosen [Tue, 12 Mar 2019 16:07:42 +0000 (18:07 +0200)]
ethdev: fix method name in doxygen comment

This patch fixes rte_ethdev header file to use the correct method name,
namely to use rte_eth_dev_info_get() instead of
rte_eth_dev_infos_get().

Fixes: a4996bd89c42 ("ethdev: new Rx/Tx offloads API")
Fixes: 4f5701f28bd4 ("examples: fix RSS hash function configuration")
Cc: stable@dpdk.org
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Rami Rosen [Tue, 12 Mar 2019 05:48:43 +0000 (07:48 +0200)]
app/testpmd: fix a typo in log message

This patch fixes a typo in test-pmd/cmdline.c,
succcessfully->successfully
Two C's are good enough for success...

Fixes: a09f3e4c5046 ("app/testpmd: add hash configuration")
Cc: stable@dpdk.org
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
David Marchand [Mon, 11 Mar 2019 15:35:19 +0000 (16:35 +0100)]
app/testpmd: remove unused field from port struct

Remove some leftover from a previous rework.

Fixes: c4bcc342c8ee ("app/testpmd: refactor ieee1588 forwarding")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
David Harton [Fri, 15 Mar 2019 16:08:32 +0000 (12:08 -0400)]
net/ixgbe: restore VLAN filter for VF

ixgbevf VLAN strip and extend capabilities were removed when
migrating to the bit flags implementation.

Restore the capability to enable the VLAN strip offload at
configuration time.

Fixes: ec3b1124d14d ("net/ixgbe: convert to new Rx offloads API")
Cc: stable@dpdk.org
Signed-off-by: David Harton <dharton@cisco.com>
Acked-by: Wei Zhao <wei.zhao1@intel.com>
Dekel Peled [Sun, 17 Mar 2019 06:23:03 +0000 (08:23 +0200)]
net/mlx5: support new representor naming format

Kernel update [1] introduces a new format for representor names.
This patch implements RFC [2], updating MLX5 PMD to support the new
format, while maintaining support of the existing format.

[1] https://github.com/torvalds/linux/commit/c12ecc2
[2] http://mails.dpdk.org/archives/dev/2019-March/125676.html

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Alejandro Lucero [Tue, 12 Mar 2019 10:19:27 +0000 (10:19 +0000)]
net/nfp: fix RSS query

The current code does not correctly report the RSS information
regarding the redirection table.

Fixes: 934e4c60fbff ("nfp: add RSS")
Cc: stable@dpdk.org
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Ruifeng Wang [Tue, 12 Mar 2019 05:35:27 +0000 (13:35 +0800)]
app/testpmd: optimize MAC swap for Arm

Improve MAC swap performance on the Arm platform.
The improvement is achieved by using NEON intrinsics
to save CPU cycles and by swapping four packets
at a time.
The optimization gives a 15% - 20% throughput boost
in testpmd MAC swap mode.

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Stephen Hemminger [Thu, 14 Mar 2019 21:32:14 +0000 (14:32 -0700)]
net/bnxt: suppress spurious error log

The driver's multiple rxq allocation logs a message at error level,
but it is really a debug message.

Fixes: 51fafb89a9a0 ("net/bnxt: get rid of ff pools and use VNIC info array")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Somnath Kotur <somnath.kotur@broadcom.com>
Stephen Hemminger [Thu, 14 Mar 2019 21:32:13 +0000 (14:32 -0700)]
net/bnxt: silence IOVA warnings

When using bnxt on bare-metal with vfio-pci, the driver logs an
unnecessary warning. Hardware works fine, message is not urgent.
Change it to INFO level.

Fixes: 62196f4e0941 ("mem: rename address mapping function to IOVA")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Stephen Hemminger [Mon, 11 Mar 2019 18:11:43 +0000 (11:11 -0700)]
net/bnxt: use notice as default log level

Make bnxt driver consistent with all other network drivers
by setting default to NOTICE for log level.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Stephen Hemminger [Mon, 11 Mar 2019 18:11:42 +0000 (11:11 -0700)]
net/bnxt: do not double space version message

The version message is double spaced in the log.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Stephen Hemminger [Mon, 11 Mar 2019 18:11:41 +0000 (11:11 -0700)]
net/bnxt: change PTP message to debug level

This message doesn't need to be at INFO level, it is a normal
situation and only useful for debugging.

Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Shahaf Shuler [Sun, 10 Mar 2019 08:14:10 +0000 (10:14 +0200)]
net/mlx5: fix packet inline on Tx queue wraparound

Inlining a packet into a WQE that crosses the WQ wraparound, i.e. a WQE
that starts at the end of the ring and ends at the beginning, is not
supported and is blocked by the data path logic.

However, in the case of TSO, an extra inline header is required before
inlining. This inline header is not taken into account when checking
whether there is enough room left for the required inline size.
In some corner cases where
(ring_tailroom - inline header) < inline size < ring_tailroom,
this can lead to the WQE being written outside of the ring buffer.

Fix it by always assuming the worst case, in which inlining the packet
requires the inline header.

Fixes: 3f13f8c23a7c ("net/mlx5: support hardware TSO")
Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
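
The room check described in this commit can be sketched as below. Function and parameter names are hypothetical, and the exact boundary condition (`<` versus `<=`) is an assumption; the point is only that the fixed check always reserves space for the inline header.

```c
#include <stdbool.h>
#include <stddef.h>

/* Check as described for the buggy case: the inline header is ignored. */
bool demo_inline_fits_old(size_t ring_tailroom, size_t inline_size)
{
    return inline_size < ring_tailroom;
}

/* Fixed check: assume the worst case, i.e. the inline header is always needed. */
bool demo_inline_fits_fixed(size_t ring_tailroom, size_t inline_size,
                            size_t inline_hdr)
{
    return inline_size + inline_hdr <= ring_tailroom;
}
```

For example, with 64 bytes of tailroom and a 4-byte inline header, a 62-byte inline falls in the problematic range: the old check accepts it, while the fixed check rejects it.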
Kevin Traynor [Fri, 8 Mar 2019 09:28:55 +0000 (09:28 +0000)]
net/qede: support IOVA VA mode

Set RTE_PCI_DRV_IOVA_AS_VA in drv_flags. This allows initializing the
qede PMD as non-root also on Linux v4.x, where /proc/self/pagemap can't
be accessed without CAP_SYS_ADMIN privileges.

The flag was introduced generically, but not in PMDs, in:
commit 815c7deaed2d ("pci: get IOMMU class on Linux")

Cc: stable@dpdk.org
Acked-by: Shahed Shaikh <shshaikh@marvell.com>
Acked-by: Rasesh Mody <rmody@marvell.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Stephen Hemminger [Thu, 28 Feb 2019 22:47:54 +0000 (14:47 -0800)]
ethdev: replace snprintf with strlcpy on init

There is no need to use snprintf for a simple name copy.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
Stephen Hemminger [Thu, 28 Feb 2019 22:47:53 +0000 (14:47 -0800)]
ethdev: replace snprintf with strlcpy for owner

The set_port_owner function was copying a string between structures of
the same type, so the name could never be truncated (unless the source
string was not null terminated). Use strlcpy, which does this better.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Qi Zhang [Mon, 11 Mar 2019 07:42:20 +0000 (15:42 +0800)]
net/i40e: fix time sync for 25G

The time sync increment value is not configured for 25G devices.

This patch fixes the issue by setting the same value as for 40G, which
aligns with the kernel driver's behaviour.

Fixes: 75d133dd3296 ("net/i40e: enable 25G device")
Cc: stable@dpdk.org
Reported-by: Michael Luo <michael.luo@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Tested-by: Michael Luo <michael.luo@intel.com>
Pablo Cascón [Fri, 8 Mar 2019 15:40:47 +0000 (15:40 +0000)]
net/nfp: fix setting MAC address

Some firmware, mostly for VFs, does not advertise the feature /
capability of changing the MAC address while the interface is up. With
such firmware, a request to change the MAC address that at the same
time also tries to enable the unavailable feature will be denied by
the firmware, resulting in an error message like:

nfp_net_reconfig(): Error nfp_net reconfig for ctrl: 80000000 update: 800

Fix set_mac_addr by not trying to enable a feature if it is not
advertised by the firmware.

Fixes: 2fe669f4bcd2 ("net/nfp: support MAC address change")
Cc: stable@dpdk.org
Signed-off-by: Pablo Cascón <pablo.cascon@netronome.com>
Acked-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Kevin Traynor [Tue, 5 Mar 2019 16:30:39 +0000 (16:30 +0000)]
net/i40e: update queue number check for rounding

Since rounding up the requested queue pairs to allow the VF to
request a non-aligned number was added, it may happen that the
requested number is less than the available number of queues but the
rounded up number is greater. In this case, it is not caught by
the usual checks but only later, when there is a reset and a failed setup.

By rounding earlier the checks can be done before a failed reset
occurs, and a rounded max amount of available queues can be returned
to the VF.

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
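
A sketch of the "round first, then check" ordering described in this commit. The names are hypothetical, and rounding up to the next power of two is an assumption made for illustration; the commit message only says that the requested number is rounded up.

```c
#include <stdint.h>

/* Round v up to the next power of two (assumed rounding rule). */
uint32_t demo_round_up_pow2(uint16_t v)
{
    uint32_t r = 1;

    while (r < v)
        r <<= 1;
    return r;
}

/* Validate a VF queue request against the rounded value, so that an
 * oversized request is rejected before any reset is attempted.
 * Returns 0 and stores the rounded count on success, -1 otherwise. */
int demo_check_queue_request(uint16_t requested, uint16_t available,
                             uint32_t *rounded_out)
{
    uint32_t rounded = demo_round_up_pow2(requested);

    if (rounded > available)
        return -1;

    *rounded_out = rounded;
    return 0;
}
```

A request for 5 queues with 6 available passes a check on the raw value but fails once rounded up to 8, which is the case the commit moves in front of the reset.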
Tomasz Jozwiak [Fri, 1 Mar 2019 08:46:16 +0000 (09:46 +0100)]
malloc: add NUMA-aware realloc function

Currently, rte_realloc will not respect original allocation's
NUMA node when memory cannot be resized, and there is no
NUMA-aware equivalent of rte_realloc. This patch adds such a function.

The new API will ensure that reallocated memory stays on
requested NUMA node, as well as allow moving allocated memory
to a different NUMA node.

Signed-off-by: Tomasz Jozwiak <tomaszx.jozwiak@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Pavan Nikhilesh [Tue, 12 Mar 2019 20:41:13 +0000 (20:41 +0000)]
doc: add notes about eventdev producer/consumer dependency

The EventDev, i.e. the consumer, needs to be started before the
event producers are started.
Update the documentation of EventDev and the EventDev adapters.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Reviewed-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
Reviewed-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
Pavan Nikhilesh [Tue, 12 Mar 2019 20:41:09 +0000 (20:41 +0000)]
examples/eventdev: start ethdev after adapter setup

Start ethdev after the Rx/Tx adapter setup is complete, as starting
it earlier might, on some architectures, lead to undefined behaviour
or events being dropped.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Reviewed-by: Nikhil Rao <nikhil.rao@intel.com>
Pavan Nikhilesh [Tue, 12 Mar 2019 20:41:05 +0000 (20:41 +0000)]
app/eventdev: start event producers after eventdev

Start event producers after the eventdev, i.e. the consumer, is
started, as starting them earlier might, on some architectures, lead
to undefined behaviour or events being dropped.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Pallantla Poornima [Mon, 4 Feb 2019 07:18:02 +0000 (07:18 +0000)]
event/opdl: replace sprintf with snprintf

The sprintf function is not secure, as it doesn't check the length of
the destination buffer. The more secure snprintf function is used instead.

Fixes: 3c7f3dcfb0 ("event/opdl: add PMD main body and helper function")
Cc: stable@dpdk.org
Signed-off-by: Pallantla Poornima <pallantlax.poornima@intel.com>
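
A minimal illustration of why the bounded variant is safer. The helper name and the name format are hypothetical; only the standard `snprintf()` behaviour is relied on: it never writes more than `len` bytes and returns the length the full string would have had.

```c
#include <stddef.h>
#include <stdio.h>

/* Format "<prefix>_<id>" into buf without ever overrunning it.
 * Returns 0 on success, -1 if the output was truncated or an error occurred. */
int demo_format_name(char *buf, size_t len, const char *prefix, int id)
{
    int n = snprintf(buf, len, "%s_%d", prefix, id);

    if (n < 0 || (size_t)n >= len)
        return -1;  /* buf still holds a null-terminated, truncated string */
    return 0;
}
```

With `sprintf()`, the same call with an over-long prefix would write past the end of `buf`; here the caller gets an error and a valid, truncated string.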
Pavan Nikhilesh [Mon, 11 Mar 2019 06:49:28 +0000 (06:49 +0000)]
app/eventdev: configure optimum timers per adapter

Previously, the total number of event timers per adapter was set to an
arbitrary value. Set it to the mempool size instead, as that defines
the maximum number of event timers that can be armed.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Pavan Nikhilesh [Fri, 1 Mar 2019 07:16:47 +0000 (07:16 +0000)]
app/eventdev: follow proper teardown sequence

Stop eventdev before closing it.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Pavan Nikhilesh [Fri, 1 Mar 2019 07:16:45 +0000 (07:16 +0000)]
examples/eventdev: follow proper teardown sequence

Stop eventdev before closing it.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Pavan Nikhilesh [Fri, 1 Mar 2019 07:16:42 +0000 (07:16 +0000)]
examples/eventdev: probe max events

Some eventdevs support configuring max events to be -1 (open system).
Check eventdev and event port configuration with eventdev info before
configuring them.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Bruce Richardson [Mon, 11 Mar 2019 10:57:32 +0000 (10:57 +0000)]
git: ignore build directories

test-meson-build.sh generates multiple build directories for various
targets. As these follow a known pattern, and since they don't need
to be tracked in git, we can add them to the gitignore file,
along with the default build directory "build".

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Rami Rosen <ramirose@gmail.com>
Bruce Richardson [Mon, 11 Mar 2019 10:57:31 +0000 (10:57 +0000)]
git: ignore hidden files

Generally hidden files are hidden for good reason and we don't want to
track them in git. They can always be manually added to git tracking
individually if needed.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Mon, 11 Mar 2019 10:57:30 +0000 (10:57 +0000)]
git: ignore python bytecode files

After you run a python script, a .pyc file is often left behind,
which we don't want to track in git.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Mon, 11 Mar 2019 10:57:29 +0000 (10:57 +0000)]
git: add comments for ignored files

Split the ignored file list into sections based on logical groups of files,
putting a comment at the top of each section clarifying what it is.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Tue, 12 Mar 2019 10:18:28 +0000 (10:18 +0000)]
devtools: fix meson build test to exit on failure

When piping the ninja command through cat, we lose the error value from
the call to ninja in the case of failure. This prevents the script from
exiting at the first broken build. Fix this by setting the "pipefail"
shell option.

Fixes: 4bcb9b768604 ("devtools: add verbose option to meson build test")

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Wed, 6 Mar 2019 16:22:42 +0000 (16:22 +0000)]
mk: use linux and freebsd in config names

Rather than using linuxapp and bsdapp everywhere, we can change things to
use the more readable terms "linux" and "freebsd" in our build configs.
Rather than renaming the configs we can just duplicate the existing ones
with the new names using symlinks, and use the new names exclusively
internally. ["make showconfigs" also only shows the new names to keep the
list short.] The result is that backward compatibility is fully kept, but
any new builds or development can be done using the newer names, i.e. both
"make config T=x86_64-native-linuxapp-gcc" and "T=x86_64-native-linux-gcc"
work.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Wed, 6 Mar 2019 16:22:41 +0000 (16:22 +0000)]
build: rename linuxapp to linux in meson cross files

Rename the cross files for meson compilation from having linuxapp
in the name to just linux in the name.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Wed, 6 Mar 2019 16:22:40 +0000 (16:22 +0000)]
build/freebsd: rename macro BSDAPP to FREEBSD

Rename the macro and all instances in DPDK code, but keep a copy of
the old macro defined for legacy code linking against DPDK

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Wed, 6 Mar 2019 16:22:39 +0000 (16:22 +0000)]
build/linux: rename macro from LINUXAPP to LINUX

Rename the macro to make things shorter and more comprehensible. For
both meson and make builds, keep the old macro around for backward
compatibility.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Wed, 6 Mar 2019 16:22:38 +0000 (16:22 +0000)]
eal/linux: rename linuxapp to linux

The term "linuxapp" is a legacy one; calling the subdirectory
"linux" is clearer for all concerned.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
5 years agoeal/bsd: rename bsdapp to freebsd
Bruce Richardson [Wed, 6 Mar 2019 16:22:37 +0000 (16:22 +0000)]
eal/bsd: rename bsdapp to freebsd

The term "bsdapp" is a legacy one; calling the subdirectory
"freebsd" is clearer for all concerned.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
5 years agotest/crypto: fix duplicate id used by CCP device
Hemant Agrawal [Wed, 6 Mar 2019 16:40:34 +0000 (22:10 +0530)]
test/crypto: fix duplicate id used by CCP device

The duplicate device ID causes incorrect mapping
for DPAA_SEC when test cases are executed on the basis of
capabilities.

Fixes: e155ca055e84 ("test/crypto: add tests for AMD CCP")
Cc: stable@dpdk.org
Reported-by: Anoob Joseph <anoobj@marvell.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
5 years agocryptodev: rework modexp and modinv comments
Arek Kusztal [Wed, 6 Feb 2019 10:34:26 +0000 (11:34 +0100)]
cryptodev: rework modexp and modinv comments

This patch changes the modular exponentiation and modular multiplicative
inverse API comments to make them more precise.

Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Shally Verma <shallyv@marvell.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
5 years agocrypto/openssl: fix modexp
Arek Kusztal [Tue, 5 Feb 2019 09:13:19 +0000 (10:13 +0100)]
crypto/openssl: fix modexp

Fix a bad reference to the modinv struct in the openssl PMD.

Fixes: 3e9d6bd447fb ("crypto/openssl: add RSA and mod asym operations")
Cc: stable@dpdk.org
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Shally Verma <shallyv@marvell.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
5 years agocrypto/openssl: fix big numbers after computations
Arek Kusztal [Thu, 7 Feb 2019 10:54:39 +0000 (11:54 +0100)]
crypto/openssl: fix big numbers after computations

After performing mod exp and mod inv, big numbers (BIGNUM) should
be cleared, as the data has already been copied into the op fields and
these BNs would very likely contain private information for an unspecified
amount of time (the duration of the session).

Fixes: 3e9d6bd447fb ("crypto/openssl: add RSA and mod asym operations")
Cc: stable@dpdk.org
Signed-off-by: Arek Kusztal <arkadiuszx.kusztal@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Shally Verma <shallyv@marvell.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
5 years agodoc: fix table of kernel drivers in qat guide
Fiona Trahe [Thu, 7 Feb 2019 18:46:27 +0000 (18:46 +0000)]
doc: fix table of kernel drivers in qat guide

Added missing line informing which kernel driver can
be used for device DH895xcc for compression service.
Moved service columns to start of table for better visibility
and to prepare for future asymmetric crypto service.

Fixes: e2e35849ea78 ("compress/qat: add compression on DH895x")
Cc: stable@dpdk.org
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>
Acked-by: Tomasz Jozwiak <tomaszx.jozwiak@intel.com>
5 years agocommon/cpt: fix null auth only
Anoob Joseph [Wed, 13 Feb 2019 09:22:50 +0000 (09:22 +0000)]
common/cpt: fix null auth only

Fixes: 351fbee21986 ("common/cpt: support hash")
Cc: stable@dpdk.org
Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Signed-off-by: Tejasree Kondoj <ktejasree@marvell.com>
5 years agonet/ixgbe: support VF promiscuous by PF driver
Wei Zhao [Fri, 8 Mar 2019 02:46:17 +0000 (10:46 +0800)]
net/ixgbe: support VF promiscuous by PF driver

The patch adds the PF counterpart changes to support VF promiscuous
mode by DPDK PF driver.

For ixgbe, in order to support VF VLAN promiscuous or unicast
promiscuous mode, the PF needs to set the UPE and VPE bits of register
PFVML2FLT. The patch aligns with the kernel driver's implementation.

Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
5 years agonet/ixgbevf: enable promiscuous mode
Wei Zhao [Fri, 8 Mar 2019 02:46:16 +0000 (10:46 +0800)]
net/ixgbevf: enable promiscuous mode

Add promiscuous mode support on VF.

Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
5 years agonet/mlx5: fix sync when handling Tx completions
Dekel Peled [Thu, 28 Feb 2019 15:20:30 +0000 (17:20 +0200)]
net/mlx5: fix sync when handling Tx completions

Function mlx5_tx_complete() reads completion entry information
from Tx queue.
On processors without a strongly ordered memory model, a memory
barrier is needed between reading the entry index and reading
the entry fields, in order to guarantee that the data is valid.
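
The ordering requirement can be sketched with C11 atomics. This is only an
illustration under stated assumptions, not the mlx5 code: struct cq_entry,
struct cq, cq_publish() and cq_poll() are hypothetical names, and
atomic_thread_fence(memory_order_acquire) stands in for whatever barrier
the driver actually uses.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical completion entry and queue; not the mlx5 layout. */
struct cq_entry {
	uint32_t wqe_counter;
	uint32_t status;
};

struct cq {
	struct cq_entry entries[8];
	_Atomic uint32_t producer_index; /* written last by the producer */
	uint32_t consumer_index;
};

/* Producer side: fill the entry, then publish the index with release order. */
static void cq_publish(struct cq *q, uint32_t wqe_counter, uint32_t status)
{
	uint32_t idx = atomic_load_explicit(&q->producer_index,
					    memory_order_relaxed);

	q->entries[idx % 8].wqe_counter = wqe_counter;
	q->entries[idx % 8].status = status;
	atomic_store_explicit(&q->producer_index, idx + 1,
			      memory_order_release);
}

/* Consumer side: read the index first, fence, then read the fields. */
static bool cq_poll(struct cq *q, struct cq_entry *out)
{
	uint32_t prod = atomic_load_explicit(&q->producer_index,
					     memory_order_relaxed);

	if (prod == q->consumer_index)
		return false; /* nothing new */
	/* Without this fence a weakly ordered CPU may load stale fields. */
	atomic_thread_fence(memory_order_acquire);
	*out = q->entries[q->consumer_index % 8];
	q->consumer_index++;
	return true;
}
```

The acquire fence after the index load pairs with the release store on the
producer side, so the field reads cannot be satisfied before the index read.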

Fixes: 54d3fe948dba ("net/mlx5: poll completion queue once per a call")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
5 years agonet/mlx5: fix hex dump of error completion
Dekel Peled [Thu, 28 Feb 2019 15:18:45 +0000 (17:18 +0200)]
net/mlx5: fix hex dump of error completion

struct mlx5_cqe is defined in MLX5 PMD code (mlx5_prm.h).
It includes 64 bytes padding in case of (RTE_CACHE_LINE_SIZE == 128).

struct mlx5_err_cqe is defined in kernel, and doesn't include padding.

When running in debug mode, in case an error CQE is detected
it is printed using rte_hexdump().

The size of data to print should be sizeof(*cqe) instead of
sizeof(*err_cqe), to handle the case of (RTE_CACHE_LINE_SIZE == 128),
and print the full data in any case.
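
The size mismatch can be illustrated with two stand-in types. This is a
sketch only: struct err_entry and struct padded_entry are hypothetical and
do not reproduce the real mlx5_cqe or mlx5_err_cqe fields, and the position
of the padding is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 128 /* assumed; mirrors RTE_CACHE_LINE_SIZE == 128 */

/* Hypothetical unpadded error entry: 64 bytes. */
struct err_entry {
	uint8_t data[64];
};

/* Hypothetical driver entry: padded up to the cache line size. */
struct padded_entry {
#if CACHE_LINE_SIZE == 128
	uint8_t pad[64];
#endif
	uint8_t data[64];
};

/* Bytes covered by a dump of *e when sized by the unpadded type. */
static size_t dump_len_short(const struct padded_entry *e)
{
	(void)e;
	return sizeof(struct err_entry);
}

/* Bytes covered when sized by the object actually being dumped. */
static size_t dump_len_full(const struct padded_entry *e)
{
	return sizeof(*e);
}
```

With the 128-byte setting, the first helper covers only half of the object,
which is the situation the fix avoids.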

Fixes: c7714992092f ("net/mlx5: extend debug logs verbosity")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
5 years agonet/mlx5: call generic strlcpy
Thomas Monjalon [Sun, 24 Feb 2019 22:42:40 +0000 (23:42 +0100)]
net/mlx5: call generic strlcpy

The call to strlcpy uses either libc, libbsd or internal rte_strlcpy.
No need to call the DPDK flavor explicitly.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Rami Rosen <ramirose@gmail.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
5 years agonet/mlx4: fix default flow rule create
Dekel Peled [Sun, 24 Feb 2019 09:41:09 +0000 (11:41 +0200)]
net/mlx4: fix default flow rule create

Original patch changed logic of function mlx4_flow_merge_eth().
The setting of flow->promisc was wrongly removed.
This patch adds the removed setting of flow->promisc, to restore
the required behavior.

Fixes: c0d239263156 ("net/mlx4: support flow w/o ETH spec and with VLAN")
Cc: stable@dpdk.org
Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
5 years agonet/mlx5: fix flow priorities probing error path
Viacheslav Ovsiienko [Thu, 21 Feb 2019 09:02:16 +0000 (09:02 +0000)]
net/mlx5: fix flow priorities probing error path

The mlx5 PMD probes the supported Verbs flow priorities with the
ibv_create_flow() function. If rdma-core or the kernel fails for
some reason, the returned error causes the drop queue not to be
destroyed, and the pd stays locked by the resource that was not freed.

Also, mlx5_flow_discover_priorities() returned a negative value
on error, and this code was reported "as is", without changing
the sign (eventually triggering assert(err > 0)).

Fixes: 2815702baea7 ("net/mlx5: replace verbs priorities by flow")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
5 years agonet/i40e: fix negative check on unsigned queue pairs
Kevin Traynor [Tue, 5 Mar 2019 16:30:38 +0000 (16:30 +0000)]
net/i40e: fix negative check on unsigned queue pairs

Fix the check and the associated log. Also, fix a typo in another log.

Fixes: 03d478e9609d ("net/i40e: support PF respond VF request more queues")
Cc: stable@dpdk.org
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
5 years agonet/ixgbe: fix crash on remove
Yunjian Wang [Wed, 13 Feb 2019 02:48:52 +0000 (10:48 +0800)]
net/ixgbe: fix crash on remove

The NIC's interrupt source may still have an active handler when the
port is removed. We should cancel the delayed handler before removing
the device, to prevent the delayed handler from executing.

Call Trace:
  #0  ixgbe_disable_intr (hw=0x0, hw=0x0)
      at /usr/src/debug/dpdk-18.11/drivers/net/ixgbe/ixgbe_ethdev.c:852
  #1  ixgbe_dev_interrupt_delayed_handler (param=0xadb9c0
      <rte_eth_devices@@DPDK_2.2+33024>)
      at /usr/src/debug/dpdk-18.11/drivers/net/ixgbe/ixgbe_ethdev.c:4386
  #2  0x00007f05782147af in eal_alarm_callback (arg=<optimized out>)
      at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
      eal_alarm.c:90
  #3  0x00007f057821320a in eal_intr_process_interrupts (nfds=1,
      events=0x7f056cbf3e88) at /usr/src/debug/dpdk-18.11/lib/
      librte_eal/linuxapp/eal/eal_interrupts.c:838
  #4  eal_intr_handle_interrupts (totalfds=<optimized out>, pfd=18)
      at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
      eal_interrupts.c:885
  #5  eal_intr_thread_main (arg=<optimized out>)
      at /usr/src/debug/dpdk-18.11/lib/librte_eal/linuxapp/eal/
      eal_interrupts.c:965
  #6  0x00007f05708a0e45 in start_thread () from /usr/lib64/libpthread.so.0
  #7  0x00007f056eb4ab5d in clone () from /usr/lib64/libc.so.6

Fixes: 2866c5f1b87e ("ixgbe: support port hotplug")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
5 years agodoc: fix tag for inner RSS feature
Rami Rosen [Fri, 1 Mar 2019 11:10:10 +0000 (13:10 +0200)]
doc: fix tag for inner RSS feature

This patch fixes a wrong tag in guides/nics/features.rst.
The features tags should be, according to the
"Features Overview" section in this doc, one of the following:
"uses", "implements", "provides", or "related".
Hence in Inner RSS section, it should be "uses"
instead of "users".

Fixes: d0a87d9aa8de ("doc: update mlx5 guide on tunnel offloading")
Cc: stable@dpdk.org
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
5 years agonet/enic: fix inner packet matching
Hyong Youb Kim [Sat, 2 Mar 2019 10:42:51 +0000 (02:42 -0800)]
net/enic: fix inner packet matching

Inner packet matching is currently buggy in many cases.

1. Mishandling null spec ("match any").
The copy_item functions do nothing if spec is null. This is incorrect,
as all patterns should be appended to the L5 pattern buffer even for
null spec (treated as all zeros).

2. Accessing null spec causing segfault.

3. Not setting protocol fields.
The NIC filter API currently has no flags for "match inner IPv4, IPv6,
UDP, TCP, and so on". So, the driver needs to explicitly set EtherType
and IP protocol fields in the L5 pattern buffer to avoid false
positives (e.g. reporting IPv6 as IPv4).

Instead of continuing to add "if inner, do something differently" cases to
the existing copy_item functions, introduce separate functions for
inner packet patterns and address the above issues in those
functions. The changes to the previous outer-packet copy_item
functions are mechanical, due to reduced indentation.

Fixes: 6ced137607d0 ("net/enic: flow API for NICs with advanced filters enabled")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
5 years agonet/enic: fix endianness in VLAN match
Hyong Youb Kim [Sat, 2 Mar 2019 10:42:50 +0000 (02:42 -0800)]
net/enic: fix endianness in VLAN match

The VLAN fields in the NIC filter use little endian. The VLAN item is
in big endian, so swap bytes.
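
The conversion can be sketched portably with byte accesses. This is an
illustration only; load_be16(), store_le16() and vlan_be_to_le() are
hypothetical helpers, not functions from the enic driver.

```c
#include <stdint.h>

/* Read a 16-bit big-endian (network order) value from two bytes. */
static uint16_t load_be16(const uint8_t b[2])
{
	return (uint16_t)((b[0] << 8) | b[1]);
}

/* Store a 16-bit value as two little-endian bytes. */
static void store_le16(uint8_t b[2], uint16_t v)
{
	b[0] = (uint8_t)(v & 0xff);
	b[1] = (uint8_t)(v >> 8);
}

/* Convert a big-endian VLAN field into the little-endian layout. */
static void vlan_be_to_le(const uint8_t be[2], uint8_t le[2])
{
	store_le16(le, load_be16(be));
}
```

Working on bytes keeps the sketch independent of the host's own byte order.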

Fixes: 6ced137607d0 ("net/enic: flow API for NICs with advanced filters enabled")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
5 years agonet/enic: fix VXLAN match
Hyong Youb Kim [Sat, 2 Mar 2019 10:42:49 +0000 (02:42 -0800)]
net/enic: fix VXLAN match

The filter API does not have flags for "match VXLAN". Explicitly set
the UDP destination port and mask in the L4 pattern. Otherwise, UDP
packets with non-VXLAN ports may be falsely reported as VXLAN.

1400 series VIC adapters have hardware VXLAN parsing. The L5 buffer on
the NIC starts with the inner Ethernet header, and the VXLAN header is
now in the L4 buffer following the UDP header. So the VXLAN spec/mask
needs to be in the L4 pattern, not L5. Older models still expect the
VXLAN spec/mask in the L5 pattern. Fix up the L4/L5 patterns
accordingly.

Fixes: 6ced137607d0 ("net/enic: flow API for NICs with advanced filters enabled")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
5 years agonet/enic: reset VXLAN port regardless of overlay offload
Hyong Youb Kim [Sat, 2 Mar 2019 10:42:48 +0000 (02:42 -0800)]
net/enic: reset VXLAN port regardless of overlay offload

Currently, the driver resets the vxlan port register only if overlay
offload is enabled. But, the register is actually tied to hardware
vxlan parsing, which is an independent feature and is always enabled
even if overlay offload is disabled. If left uninitialized, it can
affect flow rules that match vxlan. So always reset the port number
when HW vxlan parsing is available.

Fixes: 8a4efd17410c ("net/enic: add handlers to add/delete vxlan port number")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
5 years agonet/enic: enable limited support for raw flow item
Hyong Youb Kim [Sat, 2 Mar 2019 10:42:47 +0000 (02:42 -0800)]
net/enic: enable limited support for raw flow item

Some apps like VPP use a raw item to match UDP tunnel headers like
VXLAN or GENEVE. The NIC hardware supports such usage via L5 match,
which does pattern match on packet data immediately following the
outer L4 header. Accept raw items for these limited use cases.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
5 years agonet/enic: move arguments into struct
Hyong Youb Kim [Sat, 2 Mar 2019 10:42:46 +0000 (02:42 -0800)]
net/enic: move arguments into struct

There are many copy_item functions, all with the same arguments, which
makes it difficult to add/change arguments. Move the arguments into a
struct to help subsequent commits that will add/fix features. Also
remove self-explanatory verbose comments for these local functions.

These changes are purely mechanical and have no impact on
functionalities.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
5 years agonet/enic: enable limited passthru flow action
Hyong Youb Kim [Sat, 2 Mar 2019 10:42:45 +0000 (02:42 -0800)]
net/enic: enable limited passthru flow action

Some apps like VPP use PASSTHRU+MARK flow rules to offload packet
matching to the NIC. Just like MARK+RSS used by OVS-DPDK and others,
PASSTHRU+MARK is used to "mark and then receive normally". Recent VIC
adapters support such flow rules, so enable PASSTHRU for this limited
use case.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
5 years agonet/enic: enable limited RSS flow action
Hyong Youb Kim [Sat, 2 Mar 2019 10:42:44 +0000 (02:42 -0800)]
net/enic: enable limited RSS flow action

Some apps like OVS-DPDK use MARK+RSS flow rules in order to offload
packet matching to the NIC. The RSS action in such flow rules simply
indicates "receive packet normally", not trying to override the port
wide RSS. The action is included in the flow rules simply to terminate
them, as MARK is not a fate-deciding action. And, the RSS action has a
most basic config: default hash, level, types, null key, and identity
queue mapping.

Recent VIC adapters can support these "mark and receive" flow
rules. So, enable support for RSS action for this limited use case.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
5 years agonet/enic: check for unsupported flow item types
Hyong Youb Kim [Sat, 2 Mar 2019 10:42:43 +0000 (02:42 -0800)]
net/enic: check for unsupported flow item types

Currently a pattern with an unsupported item type causes a segfault,
because the flow handler is using the type as an array index without
checking bounds. Add an explicit check for unsupported item types and
avoid out-of-bound accesses.
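
A bounds-checked table lookup of this kind might look as follows. This is
a minimal sketch, not the driver's code: the enum values, the handler
functions and lookup_handler() are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical item types; real flow item enums are much larger. */
enum item_type { ITEM_END, ITEM_ETH, ITEM_IPV4, ITEM_UDP };

typedef int (*copy_item_fn)(void);

static int copy_eth(void)  { return 1; }
static int copy_ipv4(void) { return 2; }
static int copy_udp(void)  { return 3; }

/* Handler table indexed by item type; unlisted slots are NULL. */
static const copy_item_fn handlers[] = {
	[ITEM_ETH]  = copy_eth,
	[ITEM_IPV4] = copy_ipv4,
	[ITEM_UDP]  = copy_udp,
};

#define N_HANDLERS (sizeof(handlers) / sizeof(handlers[0]))

/* Return the handler for a type, or NULL if the type is unsupported. */
static copy_item_fn lookup_handler(unsigned int type)
{
	if (type >= N_HANDLERS || handlers[type] == NULL)
		return NULL;
	return handlers[type];
}
```

The caller can then reject the pattern when the lookup returns NULL instead
of indexing past the end of the table.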

Fixes: 6ced137607d0 ("net/enic: flow API for NICs with advanced filters enabled")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
5 years agonet/enic: allow flow mark ID 0
Hyong Youb Kim [Sat, 2 Mar 2019 10:42:42 +0000 (02:42 -0800)]
net/enic: allow flow mark ID 0

The driver currently accepts mark ID 0 but does not report it in
the matching packet's mbuf. For example, the following testpmd command
succeeds. But, the mbuf of a matching IPv4 UDP packet does not have
PKT_RX_FDIR_ID set.

flow create 0 ingress pattern ... actions mark id 0 / queue index 0 / end

The problem has to do with mapping mark IDs (32-bit) to NIC filter
IDs. Filter ID is currently 16-bit, so values greater than 0xffff are
rejected. The firmware reserves filter ID 0 for filters that do not
mark (e.g. steer w/o mark). And, the driver reserves 0xffff for the
flag action. This leaves 1...0xfffe for app use.

It is possible to simply reject mark ID 0 as unsupported. But, 0 is
commonly used (e.g. OVS-DPDK and VPP). So, when adding a filter, set
filter ID = mark ID + 1 to support mark ID 0. The receive handler
subtracts 1 from filter ID to get back the original mark ID.
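
The mapping described above can be sketched as two small helpers. The names
and the exact range check are assumptions derived from the numbers in this
message (filter IDs 1...0xfffe usable), not a copy of the driver code.

```c
#include <stdbool.h>
#include <stdint.h>

#define FILTER_ID_NO_MARK 0x0000u /* reserved by the firmware */
#define FILTER_ID_FLAG    0xffffu /* reserved by the driver for the flag action */

/* Map an application mark ID to a NIC filter ID; false if out of range. */
static bool mark_to_filter_id(uint32_t mark_id, uint16_t *filter_id)
{
	/* filter ID = mark ID + 1 must stay below the reserved 0xffff. */
	if (mark_id >= FILTER_ID_FLAG - 1)
		return false;
	*filter_id = (uint16_t)(mark_id + 1);
	return true;
}

/* Receive side: recover the original mark ID from a marking filter ID. */
static uint32_t filter_to_mark_id(uint16_t filter_id)
{
	return (uint32_t)filter_id - 1;
}
```

Under these assumptions the accepted mark IDs are 0...0xfffd, which map to
filter IDs 1...0xfffe.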

Fixes: dfbd6a9cb504 ("net/enic: extend flow director support for 1300 series")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
5 years agonet/enic: fix SCTP match for flow API
Hyong Youb Kim [Sat, 2 Mar 2019 10:42:41 +0000 (02:42 -0800)]
net/enic: fix SCTP match for flow API

The driver needs to explicitly set the protocol number (132) in the IP
header pattern, as the current firmware filter API lacks "match SCTP
packet" flag. Otherwise, the resulting NIC filter may lead to false
positives (i.e. NIC reporting non-SCTP packets as SCTP packets). The
flow director handler does the same (enic_clsf.c).

Fixes: 6ced137607d0 ("net/enic: flow API for NICs with advanced filters enabled")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
5 years agonet/enic: fix flow director SCTP matching
Hyong Youb Kim [Sat, 2 Mar 2019 10:42:40 +0000 (02:42 -0800)]
net/enic: fix flow director SCTP matching

The firmware filter API does not have flags indicating "match SCTP
packet". Instead, the driver needs to explicitly add an IP match and
set the protocol number (132 for SCTP) in the IP header.

The existing code (copy_fltr_v2) has two bugs.

1. It sets the protocol number (132) in the match value, but not the
mask. The mask remains 0, so the match becomes a wildcard match. The
NIC ends up matching all protocol numbers (i.e. thinks non-SCTP
packets are SCTP).

2. It modifies the input argument (rte_eth_fdir_input). The driver
tracks filters using rte_hash_{add,del}_key(input). So, adding
(RTE_ETH_FILTER_ADD) and deleting (RTE_ETH_FILTER_DELETE) must use the
same input argument for the same filter. But, overwriting the protocol
number while adding the filter breaks this assumption, and causes
delete operation to fail.

So, set the mask as well as protocol value. Do not modify the input
argument, and use const in function signatures to make the intention
clear. Also move a couple of function declarations to enic_clsf.c from
enic.h as they are strictly local.
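
The first bug follows directly from how a value/mask match works. The
helper below is a generic sketch, not the NIC filter API; field_matches()
and the macro names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PROTO_SCTP 132 /* SCTP protocol number, as stated above */
#define PROTO_UDP  17  /* UDP protocol number */

/* A field matches when the packet and the value agree on all mask bits. */
static bool field_matches(uint8_t pkt, uint8_t val, uint8_t mask)
{
	return (pkt & mask) == (val & mask);
}
```

With a zero mask, field_matches(PROTO_UDP, PROTO_SCTP, 0x00) is true, so a
UDP packet is treated as a match; with mask 0xff only protocol 132 matches.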

Fixes: dfbd6a9cb504 ("net/enic: extend flow director support for 1300 series")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
5 years agonet/enic: remove unused functions
Hyong Youb Kim [Sat, 2 Mar 2019 10:42:39 +0000 (02:42 -0800)]
net/enic: remove unused functions

Remove unused functions. Specifically, vnic_set_rss_key() is
obsolete, and enic_{add,del}_vlan() have never been supported in the
firmware. Also remove vnic_rss.c altogether, as it becomes empty. These
were discovered by cppcheck.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
5 years agoeal: fix core list validation with disabled cores
David Marchand [Wed, 13 Feb 2019 20:06:59 +0000 (21:06 +0100)]
eal: fix core list validation with disabled cores

-l and -c options are two ways to select the cores used by DPDK.
Their format differs, but the checks on the selected cores are the same.
Use an intermediate array to separate the specific parsing checks from
the common consistency checks.
The parsing functions now concentrate on validating the passed string
and do nothing more.

We can now report all invalid core indexes rather than only the first error.
In the error log message, reporting [0, cfg->lcore_count - 1] as the valid
range is wrong when the core list is not contiguous.

Example on my 8-CPU laptop with cores 2 and 6 disabled:
echo 0 > /sys/devices/system/cpu/cpu2/online
echo 0 > /sys/devices/system/cpu/cpu6/online

Before:
./master/app/testpmd -l 0-7 --no-huge -m 512 -- --total-num-mbufs 2048
EAL: Detected 6 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: invalid core list, please check core numbers are in [0, 5] range
...

After:
./master/app/testpmd -l 0-7 --no-huge -m 512 -- --total-num-mbufs 2048
EAL: Detected 6 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: lcore 2 unavailable
EAL: lcore 6 unavailable
EAL: invalid core list, please check specified cores are part of 0-1,3-5,7
...
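
A range list like the one in the "After" message can be produced by a
small formatter. This is a sketch with a hypothetical helper name and
signature, not the EAL implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Format the available cores as a compact range list, e.g. "0-1,3-5,7". */
static int format_core_ranges(const bool *avail, unsigned int n,
			      char *buf, size_t len)
{
	size_t off = 0;
	unsigned int i = 0;

	if (len == 0)
		return -1;
	buf[0] = '\0';
	while (i < n) {
		unsigned int start, end;
		int w;

		if (!avail[i]) {
			i++;
			continue;
		}
		/* Extend the run while the next core is also available. */
		start = i;
		while (i + 1 < n && avail[i + 1])
			i++;
		end = i++;
		if (start == end)
			w = snprintf(buf + off, len - off, "%s%u",
				     off ? "," : "", start);
		else
			w = snprintf(buf + off, len - off, "%s%u-%u",
				     off ? "," : "", start, end);
		if (w < 0 || (size_t)w >= len - off)
			return -1; /* output truncated */
		off += (size_t)w;
	}
	return 0;
}
```

For the laptop example (cores 2 and 6 offline out of 0-7), the helper
yields the string 0-1,3-5,7.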

Fixes: d888cb8b9613 ("eal: add core list input format")
Fixes: b38693b612b4 ("eal: fix core number validation")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
5 years agoeal: remove dead code in core list parsing
David Marchand [Wed, 13 Feb 2019 20:06:58 +0000 (21:06 +0100)]
eal: remove dead code in core list parsing

We don't need to look for trailing spaces.
This is a copy/paste block from eal_parse_coremask().
Remove it and the associated comment.

Fixes: d888cb8b9613 ("eal: add core list input format")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
5 years agoeal: restrict control threads to startup CPU affinity
David Marchand [Tue, 19 Feb 2019 20:41:11 +0000 (21:41 +0100)]
eal: restrict control threads to startup CPU affinity

Spawning the ctrl threads on anything that is not part of the eal
coremask is not that polite to the rest of the system, especially
when you took good care to pin your processes on cpu resources with
tools like taskset (linux) / cpuset (freebsd).

Rather than introduce yet another EAL option to control on which CPUs
those ctrl threads are created, let's take the startup CPU affinity
as a reference and remove the EAL coremask from it.
If no CPU is left, then we default to the master core.

The cpuset is computed once at init before the original cpu affinity
is lost.

Introduce an RTE_CPU_AND macro to abstract the differences between the
respective Linux and FreeBSD macros.

Examples in a 4-core FreeBSD VM:

$ ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \
 -- -i --total-num-mbufs=2048

$ procstat -S 1057
  PID    TID COMM                TDNAME              CPU CSID CPU MASK
 1057 100131 testpmd             -                     2    1 2
 1057 100140 testpmd             eal-intr-thread       1    1 0-1
 1057 100141 testpmd             rte_mp_handle         1    1 0-1
 1057 100142 testpmd             lcore-slave-3         3    1 3

$ cpuset -l 1,2,3 ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \
 -- -i --total-num-mbufs=2048

$ procstat -S 1061
  PID    TID COMM                TDNAME              CPU CSID CPU MASK
 1061 100131 testpmd             -                     2    2 2
 1061 100144 testpmd             eal-intr-thread       1    2 1
 1061 100145 testpmd             rte_mp_handle         1    2 1
 1061 100147 testpmd             lcore-slave-3         3    2 3

$ cpuset -l 2,3 ./build/app/testpmd -l 2,3 --no-huge --no-pci -m 512 \
 -- -i --total-num-mbufs=2048

$ procstat -S 1065
  PID    TID COMM                TDNAME              CPU CSID CPU MASK
 1065 100131 testpmd             -                     2    2 2
 1065 100148 testpmd             eal-intr-thread       2    2 2
 1065 100149 testpmd             rte_mp_handle         2    2 2
 1065 100150 testpmd             lcore-slave-3         3    2 3
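
The cpuset computation shown in the examples above can be sketched with a
plain bitmask. This is an illustration only: ctrl_thread_cpus() is a
hypothetical helper, and a 64-bit mask stands in for the platform cpuset
types.

```c
#include <stdint.h>

/*
 * Bit i set means CPU i is in the set. Returns the CPUs that control
 * threads may use: the startup affinity minus the EAL cores, or the
 * master core alone when nothing is left.
 */
static uint64_t ctrl_thread_cpus(uint64_t startup_affinity,
				 uint64_t eal_cores,
				 unsigned int master_core)
{
	uint64_t cpus = startup_affinity & ~eal_cores;

	if (cpus == 0)
		cpus = UINT64_C(1) << master_core;
	return cpus;
}
```

With EAL cores 2,3 (mask 0xc) and master core 2, a startup affinity of 0-3
gives CPUs 0-1, 1-3 gives CPU 1, and 2-3 falls back to CPU 2, which agrees
with the CPU MASK column of the three procstat listings.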

Fixes: d651ee4919cd ("eal: set affinity for control threads")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
5 years agoeal: fix control threads pinning
David Marchand [Tue, 19 Feb 2019 20:41:10 +0000 (21:41 +0100)]
eal: fix control threads pinning

pthread_setaffinity_np returns a >0 value on error.
We could end up leaving the ctrl threads on the current process CPU
affinity.
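
The return convention matters because pthread functions report failure
with a positive error number rather than -1 plus errno. The helpers below
are a sketch; the "ret < 0" form of the old check is an assumption made
for illustration, and all function names are hypothetical.

```c
#include <errno.h>

/*
 * Convert a pthread-style return code (0 on success, positive error
 * number on failure) to the 0 / negative-errno convention.
 */
static int pthread_ret_to_neg_errno(int ret)
{
	return ret == 0 ? 0 : -ret;
}

/* Assumed old style of check: never true for pthread-style codes. */
static int looks_failed_buggy(int ret)
{
	return ret < 0;
}

/* Correct check: any non-zero value is a failure. */
static int looks_failed_fixed(int ret)
{
	return ret != 0;
}
```

A failure such as EINVAL slips past the first check and is caught by the
second.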

Fixes: d651ee4919cd ("eal: set affinity for control threads")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
5 years agoeal: fix check when retrieving current CPU affinity
David Marchand [Tue, 19 Feb 2019 20:38:13 +0000 (21:38 +0100)]
eal: fix check when retrieving current CPU affinity

pthread_getaffinity_np returns a >0 value when failing.

This is mainly for the sake of correctness.
The only case where it could fail is when passing a cpuset size that is
incorrect with respect to the kernel.

Fixes: 2eba8d21f3c9 ("eal: restrict cores auto detection")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Rami Rosen <ramirose@gmail.com>