Jesse Brandeburg [Fri, 23 Oct 2020 20:21:59 +0000 (13:21 -0700)]
net/iavf: fix performance with writeback policy
The iavf driver was trying to use writeback on ITR, but was
never setting an ITR, so it didn't work. This caused performance
to be limited due to too much PCIe traffic and partial writes
during most benchmarking workloads.
Set the ITR during queue setup, which can be checked at runtime
by reading register 0x2800. Setting the value to 2us allows
for generally good streaming packet performance while keeping
latency down.
Fixes: d6bde6b5eae9 ("net/avf: enable Rx interrupt")
Cc: stable@dpdk.org
Reported-by: Brian Johnson <brian.johnson@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Wei Huang [Fri, 23 Oct 2020 08:59:59 +0000 (04:59 -0400)]
raw/ifpga/base: enhance driver reliability in multi-process
The current hardware protection is based on a pthread mutex, which only
works for multiple threads within one process. In a multi-process
environment, the hardware state machine can be corrupted by concurrent
access, so the original pthread mutex mechanism needs to be enhanced.
The major modifications in this patch are listed below:
1. Create a mutex for the adapter in shared memory named
"mutex.IFPGA:domain:bus:dev.func" when the device is probed.
2. Create a shared memory region named "IFPGA:domain:bus:dev.func" while
the opae adapter is initializing. The shared memory holds a reference
count and is destroyed once the reference count drops to zero.
3. Two mutexes are created in shared memory and initialized with the flag
PTHREAD_PROCESS_SHARED, one for SPI and the other for I2C. They are
subsequently passed to the SPI and I2C drivers.
4. DTB data in flash is cached in shared memory, so the MAX10 driver
can read the DTB from shared memory instead of flash. This avoids
conflicts between hardware and software accessing the flash concurrently.
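The process-shared mutex idea from item 3 can be sketched as follows. This is a minimal illustration, not the driver code: it uses an anonymous shared mapping and fork() for brevity, whereas the patch uses named shared memory, and the names 'shared_region', 'shared_region_create' and 'count_from_two_processes' are hypothetical.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on strict compilers */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical layout: one lock plus the data it protects. */
struct shared_region {
	pthread_mutex_t lock;
	long counter;
};

/* Map memory shared with child processes and put a process-shared mutex in it. */
struct shared_region *shared_region_create(void)
{
	pthread_mutexattr_t attr;
	struct shared_region *r;

	r = mmap(NULL, sizeof(*r), PROT_READ | PROT_WRITE,
		 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (r == MAP_FAILED)
		return NULL;

	if (pthread_mutexattr_init(&attr) != 0)
		goto fail;
	/* Without this flag the mutex is only valid inside one process. */
	if (pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) != 0 ||
	    pthread_mutex_init(&r->lock, &attr) != 0) {
		pthread_mutexattr_destroy(&attr);
		goto fail;
	}
	pthread_mutexattr_destroy(&attr);
	r->counter = 0;
	return r;
fail:
	munmap(r, sizeof(*r));
	return NULL;
}

/* Parent and child both increment the counter; returns the total or -1. */
long count_from_two_processes(long iterations)
{
	struct shared_region *r;
	long total, i;
	pid_t pid;

	assert(iterations >= 0);
	r = shared_region_create();
	if (r == NULL)
		return -1;

	pid = fork();
	if (pid < 0) {
		munmap(r, sizeof(*r));
		return -1;
	}
	for (i = 0; i < iterations; i++) {
		pthread_mutex_lock(&r->lock);
		r->counter++;
		pthread_mutex_unlock(&r->lock);
	}
	if (pid == 0)
		_exit(0); /* child is done */

	if (waitpid(pid, NULL, 0) < 0) {
		munmap(r, sizeof(*r));
		return -1;
	}
	total = r->counter;
	pthread_mutex_destroy(&r->lock);
	munmap(r, sizeof(*r));
	return total;
}
```

With a default (process-private) mutex the behaviour across processes would be undefined; the PTHREAD_PROCESS_SHARED attribute is what makes the lock meaningful to both processes.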
Wei Huang [Fri, 23 Oct 2020 08:59:58 +0000 (04:59 -0400)]
raw/ifpga/base: free resources when destroying device
Add two functions to complete the resource free work, one is
'ifpga_adapter_destroy()', the other is 'ifpga_bus_uinit()'.
Then call 'opae_adapter_destroy()' and 'opae_adapter_data_free()'
in 'ifpga_rawdev_close()' to free resources.
Also 'opae_adapter_free()' is removed from 'ifpga_rawdev_destroy()',
because the opae adapter is pointed to by the dev_private member of
raw_dev and will be freed in 'rte_rawdev_pmd_release()'.
The interrupt handle was copied by value into the local 'intr_handle'
variable before being passed to the IRQ functions.
As a result, the IRQ functions updated the local variable instead of
'ifpga_irq_handle'.
Use the local 'intr_handle' variable as a pointer to
'ifpga_irq_handle' instead, as intended.
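The difference between the copy and the pointer can be shown with a small stand-alone sketch. The struct layout, the table 'irq_handle_table' and the helper 'irq_enable()' are hypothetical stand-ins, not the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define NB_HANDLES 2

/* Hypothetical, reduced interrupt handle. */
struct intr_handle {
	int fd;
	uint32_t nb_efd;
};

/* Stand-in for the driver-global 'ifpga_irq_handle' array. */
static struct intr_handle irq_handle_table[NB_HANDLES];

/* Stand-in for an IRQ function that updates the handle it is given. */
void irq_enable(struct intr_handle *h, int fd)
{
	h->fd = fd;
	h->nb_efd = 1;
}

/* Buggy variant: the handle is copied, so only the copy is updated. */
void register_irq_by_value(int idx, int fd)
{
	assert(idx >= 0 && idx < NB_HANDLES);
	struct intr_handle intr_handle = irq_handle_table[idx];

	irq_enable(&intr_handle, fd);
}

/* Fixed variant: the local variable points at the global handle. */
void register_irq_by_pointer(int idx, int fd)
{
	assert(idx >= 0 && idx < NB_HANDLES);
	struct intr_handle *intr_handle = &irq_handle_table[idx];

	irq_enable(intr_handle, fd);
}

/* Read back what the global handle currently holds. */
int stored_fd(int idx)
{
	assert(idx >= 0 && idx < NB_HANDLES);
	return irq_handle_table[idx].fd;
}
```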
Lijun Ou [Wed, 21 Oct 2020 10:07:10 +0000 (18:07 +0800)]
app/testpmd: fix RSS key for flow API RSS rule
When a flow API RSS rule is issued in testpmd, the device RSS key is
unexpectedly changed to the testpmd default RSS key.
Consider the following usage with testpmd:
1. First, start testpmd:
testpmd> show port 0 rss-hash key
RSS functions: all ipv4-frag ipv4-other ipv6-frag ipv6-other ip
RSS key: 6D5A56DA255B0EC24167253D43A38FB0D0CA2BCBAE7B30B477CB2DA38030F20C6A42B73BBEAC01FA
2. Create an RSS rule:
testpmd> flow create 0 ingress pattern eth / ipv4 / udp / end \
actions rss types ipv4-udp end queues end / end
This is because testpmd always sends a key with the RSS rule. If the
user provides a key as part of the rule, that key is used. If the user
doesn't provide a key, the testpmd default key is sent to the PMDs, which
changes the RSS key programmed in the device.
There was a previous attempt to fix the same issue [1], but it was
reverted [2] because of a crash when 'key_len' is provided
without 'key'.
This patch follows the same approach as the initial fix [1] but also
addresses the crash.
After this change, the testpmd RSS key is 'NULL' by default. If the user
provides a key as part of the rule, it is used; if not, no key is sent to
the PMDs at all.
[1]
Commit a4391f8bae85 ("app/testpmd: set default RSS key as null")
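The key selection described above can be sketched as follows. This is an illustration under assumptions: 'rss_conf' and 'make_rss_conf()' are hypothetical names, and dropping a 'key_len' that arrives without a 'key' is only one possible way to avoid the crash, not necessarily what the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced RSS configuration: only the key and its length. */
struct rss_conf {
	const uint8_t *key;
	uint32_t key_len;
};

/*
 * Build the configuration sent with the rule. No default key is
 * substituted: without a user key the result carries key == NULL.
 */
struct rss_conf make_rss_conf(const uint8_t *user_key, uint32_t user_key_len)
{
	struct rss_conf conf = { NULL, 0 };

	if (user_key != NULL && user_key_len > 0) {
		conf.key = user_key;
		conf.key_len = user_key_len;
	}
	/* Assumption: a length without a key is never passed on. */
	assert(conf.key != NULL || conf.key_len == 0);
	return conf;
}
```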
David Marchand [Fri, 23 Oct 2020 08:43:51 +0000 (10:43 +0200)]
net/ena: remove unused macro
This assert macro is not called anymore.
This also fixes an invalid reference to RTE_LOGTYPE_ERR that does not
exist.
Fixes: 3adcba9a8987 ("net/ena: update HAL to the newer version")
Fixes: 6f1c9df9e9cc ("net/ena: use dynamic log type for debug logging")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
Cheng Jiang [Thu, 22 Oct 2020 08:59:07 +0000 (08:59 +0000)]
examples/vhost: support vhost async data path
Implement vhost DMA operation callbacks for the CBDMA PMD and add a
vhost async data path to the vhost sample. With the callback
implementation for CBDMA, vswitch can leverage IOAT to
accelerate the vhost async data path.
Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Cheng Jiang [Thu, 22 Oct 2020 08:59:06 +0000 (08:59 +0000)]
examples/vhost: add async vhost args parsing
Add the async vhost driver argument parsing function for CBDMA
channels, the DMA initialization function, and the argument descriptions.
The meson build file is changed to fix a dependency problem. With
these arguments, a vhost device can be set to use CBDMA or the CPU for
enqueue operations and can be bound to a specific CBDMA channel
to accelerate data copy.
Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Patrick Fu [Wed, 21 Oct 2020 05:44:25 +0000 (13:44 +0800)]
vhost: remove fallback in async enqueue API
By design, the async enqueue API should return directly if no async
device is registered. This patch removes the broken implementation of
the enqueue fallback from async mode to sync mode.
Fixes: cd6760da1076 ("vhost: introduce async enqueue for split ring")
Cc: stable@dpdk.org
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Maxime Coquelin [Mon, 19 Oct 2020 17:34:15 +0000 (19:34 +0200)]
vhost: check virtqueue metadata pointer
This patch checks whether the virtqueue metadata pointer
is valid before dereferencing it. It is not considered
a fix, as an earlier patch ensures there are no holes in the
array of virtqueue metadata pointers.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
The PMD API allows stats and xstats values to be cleared separately.
This is a problem for the vhost PMD since some of the xstats values are
derived from existing stats values. For example:
testpmd> show port xstats all
...
tx_unicast_packets: 17562959
...
testpmd> clear port stats all
...
testpmd> show port xstats all
...
tx_unicast_packets: 18446744073709551615
...
Modify the driver so that stats and xstats values are stored, updated,
and cleared separately.
Fixes: 4d6cf2ac93dc ("net/vhost: add extended statistics")
Cc: stable@dpdk.org
Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Jeff Guo [Fri, 16 Oct 2020 09:44:31 +0000 (17:44 +0800)]
net/iavf: fix vector Rx
Remove the burst size limitation in vector Rx, since it should
retrieve as many received packets as possible. The scattered
receive path should also use a wrapper function to maximize the
burst.
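A wrapper of this kind can be sketched as a loop around a fixed-size vector burst routine. Everything here is a simplified model: 'fake_rxq', 'VEC_MAX_BURST' and both function names are hypothetical, and packets are only counted, not received.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VEC_MAX_BURST 32 /* hypothetical per-call limit of the vector routine */

/* Hypothetical queue model: only tracks how many packets are waiting. */
struct fake_rxq {
	uint16_t available;
};

/* Stand-in for the vector routine: returns at most VEC_MAX_BURST packets. */
uint16_t recv_burst_vec(struct fake_rxq *q, uint16_t nb_pkts)
{
	uint16_t n = nb_pkts;

	if (n > VEC_MAX_BURST)
		n = VEC_MAX_BURST;
	if (n > q->available)
		n = q->available;
	q->available = (uint16_t)(q->available - n);
	return n;
}

/* Wrapper: keep calling the vector routine until the request is met
 * or the queue returns a short burst. */
uint16_t recv_pkts_vec(struct fake_rxq *q, uint16_t nb_pkts)
{
	uint16_t nb_rx = 0;

	assert(q != NULL);
	while (nb_pkts > VEC_MAX_BURST) {
		uint16_t n = recv_burst_vec(q, VEC_MAX_BURST);

		nb_rx = (uint16_t)(nb_rx + n);
		nb_pkts = (uint16_t)(nb_pkts - n);
		if (n < VEC_MAX_BURST)
			return nb_rx; /* queue drained */
	}
	return (uint16_t)(nb_rx + recv_burst_vec(q, nb_pkts));
}
```

With the wrapper, a caller asking for more packets than one vector burst can deliver is no longer capped at the per-call limit.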
Bugzilla ID: 516
Fixes: 319c421f3890 ("net/avf: enable SSE Rx Tx")
Fixes: 1162f5a0ef31 ("net/iavf: support flexible Rx descriptor in SSE path")
Fixes: 5b6e8859081d ("net/iavf: support flexible Rx descriptor in AVX path")
Cc: stable@dpdk.org
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Tested-by: Wei Ling <weix.ling@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Jeff Guo [Fri, 16 Oct 2020 09:44:29 +0000 (17:44 +0800)]
net/ice: fix vector Rx
Remove the burst size limitation in vector Rx, since it should
retrieve as many received packets as possible. The scattered
receive path should also use a wrapper function to maximize the
burst.
Bugzilla ID: 516
Fixes: c68a52b8b38c ("net/ice: support vector SSE in Rx")
Cc: stable@dpdk.org
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Tested-by: Yingya Han <yingyax.han@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Jeff Guo [Fri, 16 Oct 2020 09:44:28 +0000 (17:44 +0800)]
net/i40e: fix vector Rx
Remove the burst size limitation in vector Rx, since it should
retrieve as many received packets as possible. The scattered
receive path should also use a wrapper function to maximize the
burst.
Jeff Guo [Fri, 16 Oct 2020 09:44:27 +0000 (17:44 +0800)]
net/ixgbe: fix vector Rx
Remove the burst size limitation in vector Rx, since it should
retrieve as many received packets as possible. The scattered
receive path should also use a wrapper function to maximize the
burst.
Bugzilla ID: 516
Fixes: b20971b6cca0 ("net/ixgbe: implement vector driver for ARM")
Fixes: 0e51f9dc4860 ("net/ixgbe: rename x86 vector driver file")
Cc: stable@dpdk.org
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Tested-by: Feifei Wang <feifei.wang2@arm.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Padraig Connolly [Thu, 15 Oct 2020 09:28:58 +0000 (10:28 +0100)]
net/i40e: fix QinQ flow pattern to allow non full mask
A customer reported that only a full mask was allowed on the inner and
outer VLAN tags, which prevented using a mask that filters on the VLAN
ID only.
Remove the check that enforces that the inner and outer VLAN masks equal
I40E_TCI_MASK (full mask 0xffff).
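The effect of a non-full mask can be illustrated with a generic masked comparison on the 16-bit TCI, whose low 12 bits carry the VLAN ID. This sketch is not the i40e parser: 'tci_matches()' and 'VLAN_ID_MASK' are hypothetical names, and only the 0xffff full mask value comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TCI_FULL_MASK 0xffffu /* full mask, as I40E_TCI_MASK above */
#define VLAN_ID_MASK  0x0fffu /* low 12 bits of the TCI: VLAN ID only */

/* A TCI matches when it equals the spec in every bit selected by the mask. */
bool tci_matches(uint16_t tci, uint16_t spec, uint16_t mask)
{
	assert(mask != 0); /* an empty mask would match everything */
	return ((uint16_t)(tci ^ spec) & mask) == 0;
}
```

For example, a tag with priority 5 and VLAN ID 100 (TCI 0xa064) matches a spec of VLAN ID 100 under VLAN_ID_MASK, but not under the full mask, because the priority bits differ.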
Leyi Rong [Fri, 23 Oct 2020 04:14:06 +0000 (12:14 +0800)]
net/ice: add RSS hash parsing in AVX512 path
Support RSS hash parsing in the AVX512 data path. Because the default
RXDID is set to #22, the RSS hash field is located
in the second 16B of each Flex Rx descriptor.
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Conor Walsh [Tue, 20 Oct 2020 10:02:47 +0000 (10:02 +0000)]
net/ixgbe: check switch domain allocation result
The return value of rte_eth_switch_domain_alloc() was not being checked
within ixgbe_pf_host_init(), which caused a Coverity issue. If the call
fails, a warning is logged using PMD_INIT_LOG() and *vfinfo is freed.
ixgbe_pf_host_init() now has a return value, which is checked in
eth_ixgbe_dev_init().
Coverity issue: 362795
Fixes: cf80ba6e2038 ("net/ixgbe: add support for representor ports")
Cc: stable@dpdk.org
Signed-off-by: Conor Walsh <conor.walsh@intel.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Ajit Khaparde [Tue, 20 Oct 2020 23:24:28 +0000 (16:24 -0700)]
net/bnxt: fix resource leak
Fix a potential resource leak in case of errors during dev args
parsing during device probe.
Fixes: 6dc83230b43b ("net/bnxt: support port representor data path")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Yuying Zhang [Mon, 19 Oct 2020 02:20:25 +0000 (02:20 +0000)]
net/i40e: fix virtual channel conflict
i40evf_execute_vf_cmd() uses _atomic_set_cmd() to execute virtual
channel commands safely in multi-process and multi-thread modes.
However, it returns an error when a command from another process or
thread is still pending. Add rte_spinlock_trylock() to handle this
issue in concurrent scenarios.
Ting Xu [Thu, 22 Oct 2020 06:49:02 +0000 (14:49 +0800)]
net/iavf: add enable/disable queues for large VF
The current virtchnl structure for enabling/disabling queues supports
at most 32 queue pairs. Use a new opcode and structure to indicate up to
256 queue pairs, in order to enable/disable queues in the large VF case.
Ting Xu [Thu, 22 Oct 2020 06:49:01 +0000 (14:49 +0800)]
net/iavf: enable IRQ mapping configuration for large VF
The current IRQ mapping configuration supports at most 16 queues and
16 MSIX vectors. Change the queue vector mapping structure to indicate
up to 256 queues. A new opcode is used to handle the case with a large
number of queues. To stay within the adminq buffer size limit, the
virtchnl message can be sent multiple times if needed.
Ting Xu [Thu, 22 Oct 2020 06:49:00 +0000 (14:49 +0800)]
net/iavf: enable multiple queues configuration for large VF
Since the adminq buffer size is limited to 4K, the current virtchnl
command VIRTCHNL_OP_CONFIG_VSI_QUEUES cannot configure up to 256 queues
in a single message. In this patch, the message is sent multiple times
so that the buffer size stays below 4K each time.
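The splitting can be sketched as a simple chunk plan. The structure sizes below are invented for illustration and do not reflect the real virtchnl layouts; only the 4K buffer limit and the 256 queue figure come from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADMINQ_BUF_SIZE 4096u /* 4K limit from the commit message */

/* Hypothetical sizes: the real virtchnl structures differ. */
struct queue_pair_cfg {
	uint8_t raw[72];
};

struct config_msg_hdr {
	uint16_t vsi_id;
	uint16_t num_queue_pairs;
	uint32_t pad;
};

/* Largest number of queue pairs whose message still fits in the buffer. */
size_t max_pairs_per_msg(void)
{
	return (ADMINQ_BUF_SIZE - sizeof(struct config_msg_hdr)) /
	       sizeof(struct queue_pair_cfg);
}

/*
 * Split total_pairs into per-message chunk sizes. Returns the number of
 * messages needed, or 0 if nothing is to be sent or max_chunks is too small.
 */
size_t plan_config_msgs(size_t total_pairs, size_t *chunk_sizes,
			size_t max_chunks)
{
	size_t per_msg = max_pairs_per_msg();
	size_t n = 0;

	assert(chunk_sizes != NULL);
	while (total_pairs > 0 && n < max_chunks) {
		size_t chunk = total_pairs < per_msg ? total_pairs : per_msg;

		chunk_sizes[n++] = chunk;
		total_pairs -= chunk;
	}
	return total_pairs == 0 ? n : 0;
}
```

With these assumed sizes, 256 queue pairs need five messages, while the default 16 still fit in one.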
Ting Xu [Thu, 22 Oct 2020 06:48:59 +0000 (14:48 +0800)]
net/iavf: negotiate large VF and request more queues
Negotiate the large VF capability with the PF during VF initialization.
If large VF is supported and more than 16 queues are required, the VF
requests additional queues from the PF. Mark the state that large VF is
supported.
If the number of allocated queues is larger than 16, the max RSS queue
region can no longer be 16. Add a function to query the max RSS queue
region from the PF, and use it in the RSS initialization and future
filter configuration.
Ting Xu [Thu, 22 Oct 2020 06:48:58 +0000 (14:48 +0800)]
net/iavf: support requesting additional queues from PF
Add a new virtchnl function to request additional queues from the PF.
The current default number of queue pairs when creating a VF is 16. In
order to support up to 256 queue pairs per VF, enable this request
queues function.
When requesting queues succeeds, the PF returns an event message. If it
is handled by the interrupt first, the request queues command cannot
receive the correct PF response and waits until timeout. Therefore,
disable the interrupt before requesting queues in order to handle the
event message asynchronously.
Ting Xu [Thu, 22 Oct 2020 06:48:57 +0000 (14:48 +0800)]
net/iavf: handle virtchnl event message without interrupt
Currently, the VF can only handle virtchnl event messages in the
interrupt handler.
This does not work in two cases:
1. If the event message comes during VF initialization before the
interrupt is enabled, the message is not handled correctly.
2. Some virtchnl commands need to receive the event message and handle
it with the interrupt disabled.
To solve this issue, add virtchnl event message handling to the
process of reading virtchnl messages in the adminq from the PF.
MPRQ (Multi-Packet Rx Queue) processes one packet at a time using
simple scalar instructions. MPRQ works by posting a single large buffer
(consisting of multiple fixed-size strides) in order to receive multiple
packets at once on this buffer. An Rx packet is then either copied to a
user-provided mbuf, or the PMD attaches the Rx packet to the mbuf as an
external buffer.
There is an opportunity to speed up the packet receiving by processing
4 packets simultaneously using SIMD (single instruction, multiple data)
extensions. Allocate mbufs in batches for every MPRQ buffer and process
the packets in groups of 4 until all the strides are exhausted. Then
switch to another MPRQ buffer and repeat the process over again.
The vectorized MPRQ burst routine is engaged automatically in case
the mprq_en=1 devarg is specified and the vectorization is not disabled
explicitly by providing rx_vec_en=0 devarg. There is a limitation:
LRO is not supported and scalar MPRQ is selected if it is on.
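The selection rule stated above can be written down as a small predicate. This is a sketch of the rule only: 'rx_config' and 'use_vectorized_mprq()' are hypothetical names, and the real driver may check further conditions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of the relevant settings. */
struct rx_config {
	bool mprq_en;   /* mprq_en=1 devarg given */
	bool rx_vec_en; /* false only if rx_vec_en=0 was given explicitly */
	bool lro_en;    /* LRO enabled on the port */
};

/* Vectorized MPRQ is used when MPRQ is on, vectorization is not
 * disabled, and LRO is off; otherwise scalar MPRQ (or no MPRQ) applies. */
bool use_vectorized_mprq(const struct rx_config *cfg)
{
	assert(cfg != NULL);
	return cfg->mprq_en && cfg->rx_vec_en && !cfg->lro_en;
}
```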
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Move the main processing cycle into a separate function:
rxq_cq_process_v. Put the regular rxq_burst_v function
in a non-arch-specific file. Having all SIMD instructions
in a single reusable block is a first preparatory step to
implement a vectorized Rx burst for the MPRQ feature.
Pass a pointer to the storage of mbufs directly to
rxq_copy_mbuf_v instead of calculating the pointer inside
this function. This is needed for the future vectorized Rx
routine, which is going to pass a different pointer here.
Calculate the number of packets to replenish inside
mlx5_rx_replenish_bulk_mbuf. Containing this logic in one
place allows us to do the same for the MPRQ case.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Xueming Li [Wed, 21 Oct 2020 11:15:23 +0000 (11:15 +0000)]
net/mlx5: fix port shared data reference count
When probing a representor, the tag cache hash table and the
modification cache hash table were allocated for each port, overwriting
the existing caches in the shared context data.
This patch moves the reference check of the shared data before the hash
table allocation to avoid this issue.
Fixes: 6801116688fe ("net/mlx5: fix multiple flow table hash list")
Fixes: 1ef4cdef2682 ("net/mlx5: fix flow tag hash list conversion")
Cc: stable@dpdk.org
Acked-by: Matan Azrad <matan@nvidia.com>
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Shiri Kuzin [Mon, 19 Oct 2020 06:36:50 +0000 (09:36 +0300)]
net/mlx5: fix xstats reset reinitialization
The mlx5_xstats_reset clears the device extended statistics.
In this function the driver may reinitialize the structures
that are used to read device counters.
In case of reinitialization, the number of counters may
change, which wouldn't be taken into account by the
reset API callback and can cause a segmentation fault.
This issue is fixed by allocating the counters array after
the reinitialization, once the number of counters is known.
Suanming Mou [Tue, 20 Oct 2020 03:02:28 +0000 (11:02 +0800)]
net/mlx5: optimize counter extend memory
Counter extend memory was allocated for non-batch counters to save the
extra DevX object. Currently, for a non-batch counter that does not
support aging, the entry in the generic counter struct is used only while
the counter sits in the free list, and the bytes field is used only while
the counter is allocated.
In this case, the DevX object can be saved in the generic counter struct,
in a union with the entry memory while the counter is allocated and in a
union with bytes while the counter is free.
The pool type is also no longer needed: non-fallback mode has only generic
and aging counters, so a single bit indicating whether the pool is aged
is enough.
This eliminates the counter extend info struct and saves memory.
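The union trick can be sketched as below. This is a reduced, hypothetical layout and not the mlx5 structure: 'counter', 'devx_obj' and the helper names are invented, and the free list is modelled as a plain singly linked list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the DevX counter object. */
struct devx_obj {
	int id;
};

/* The object pointer lives in whichever field is unused in the current state. */
struct counter {
	union {
		struct counter *next_free;        /* used while free */
		struct devx_obj *dcs_when_active; /* used while allocated */
	};
	union {
		uint64_t bytes;                   /* used while allocated */
		struct devx_obj *dcs_when_free;   /* used while free */
	};
};

/* Free -> allocated: move the object pointer out of the bytes slot. */
void counter_activate(struct counter *c)
{
	assert(c != NULL);
	struct devx_obj *dcs = c->dcs_when_free;

	c->dcs_when_active = dcs; /* overwrites the free-list entry */
	c->bytes = 0;             /* slot now holds statistics */
}

/* Allocated -> free: move the object pointer out of the entry slot. */
void counter_release(struct counter *c, struct counter *free_head)
{
	assert(c != NULL);
	struct devx_obj *dcs = c->dcs_when_active;

	c->dcs_when_free = dcs;   /* overwrites the statistics */
	c->next_free = free_head; /* slot now links the free list */
}
```

No separate per-counter extension struct is needed to remember the object, since one of the two slots is always idle.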