Yongseok Koh [Wed, 10 Jan 2018 17:46:49 +0000 (09:46 -0800)]
net/mlx5: fix deadlock of link status alarm
If mlx5_dev_link_status_handler() is executed while the alarm is being
canceled, a deadlock can occur because rte_eal_alarm_cancel() waits for all
callbacks to finish execution and both calls are protected by priv->lock.
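As a general illustration of the hazard (a minimal sketch of the pattern, not
the exact mlx5 fix; struct and function names here are illustrative only),
the alarm must not be canceled while holding the lock that the alarm callback
itself takes:

    #include <pthread.h>
    #include <rte_alarm.h>

    struct priv {
        pthread_mutex_t lock;
        /* ... */
    };

    /* The alarm callback takes priv->lock. */
    static void
    link_status_alarm_cb(void *arg)
    {
        struct priv *priv = arg;

        pthread_mutex_lock(&priv->lock);
        /* ... handle the link status update ... */
        pthread_mutex_unlock(&priv->lock);
    }

    /* Never call rte_eal_alarm_cancel() while holding that same lock:
     * cancel blocks until any in-flight callback returns, and the
     * callback is waiting for the lock -> deadlock. */
    static void
    link_status_alarm_stop(struct priv *priv)
    {
        pthread_mutex_unlock(&priv->lock);
        rte_eal_alarm_cancel(link_status_alarm_cb, priv);
        pthread_mutex_lock(&priv->lock);
    }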
Fixes: 198a3c339a8f ("mlx5: handle link status interrupts")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Qi Zhang [Wed, 27 Dec 2017 20:22:30 +0000 (15:22 -0500)]
net/e1000: fix mailbox interrupt handler
The mailbox interrupt handler should only take care of the PF reset
notification; for other messages, mbx->ops.read should not be called
since it can break the foreground VF-to-PF communication.
Fixes: 316f4f1adc2e ("net/igb: support VF mailbox interrupt for link up/down")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Wei Dai <wei.dai@intel.com>
Qi Zhang [Wed, 27 Dec 2017 20:22:29 +0000 (15:22 -0500)]
net/ixgbe: fix mailbox interrupt handler
The mailbox interrupt handler should only take care of the PF reset
notification; for other messages, ixgbe_read_mbx should not be called
since it can break the foreground VF-to-PF communication.
This can easily be reproduced with 'testpmd> rx_vlan rm all 0'.
Fixes: 77234603fba0 ("net/ixgbe: support VF mailbox interrupt for link up/down")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Wei Dai <wei.dai@intel.com>
Chas Williams [Fri, 20 Oct 2017 03:23:39 +0000 (23:23 -0400)]
net/e1000: always enable receive and transmit
The transmit and receive controller state machines are only enabled after
an interrupt is received and the link status is valid. If an adapter
is used in conjunction with NC-SI (network controller sideband
interface), the adapter may never get a link state change interrupt,
since the adapter's PHY is always link up and never changes state.
To fix this, always enable and disable transmit and receive in
.dev_start and .dev_stop. This better matches what is typically done
by the other PMDs. Since we may never get an interrupt to check
the link state, we also poll once at the end of .dev_start to get the
current link status.
Signed-off-by: Chas Williams <chas3@att.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wei Zhao [Fri, 12 Jan 2018 06:59:19 +0000 (14:59 +0800)]
net/i40e: fix port segmentation fault when restart
The driver used to clear all queue-region-related configuration on
device stop, even when no queue region command had been issued
beforehand; this is a bug and may cause errors. Add a check for
existing queue region configuration before flushing it, and remove the
clearing from the device stop path. Queue region configuration is now
cleared only at device initialization or when the PMD receives a flush
command.
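A minimal sketch of the guard, using hypothetical structure and field names
(not the actual i40e internals):

    #include <stdint.h>

    /* Hypothetical names for illustration only. */
    struct queue_region_cfg {
        uint16_t region_count; /* number of configured queue regions */
        /* ... per-region parameters ... */
    };

    static void
    flush_queue_region_config(struct queue_region_cfg *cfg)
    {
        if (cfg->region_count == 0)
            return; /* nothing was ever configured, nothing to clear */
        /* ... issue the clear to the hardware ... */
        cfg->region_count = 0;
    }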
Fixes: 7cbecc2f7424 ("net/i40e: support queue region set and flush")
Cc: stable@dpdk.org
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Beilei Xing [Tue, 9 Jan 2018 10:37:44 +0000 (18:37 +0800)]
net/i40e: fix setting MAC address of VF
The new MAC address is copied to dev->data->mac_addrs[0] before the
set-MAC-address ops is called, so deleting dev->data->mac_addrs[0]
fails. Deleting hw->mac.addr instead fixes the issue.
Fixes: 943c2d899a0c ("net/i40e: set VF MAC from VF")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Yanglong Wu [Tue, 9 Jan 2018 06:32:05 +0000 (14:32 +0800)]
net/ixgbe: fix max queue number for VF
A VF cannot run in multi-queue mode if nb_q_per_pool is set to 1.
nb_q_per_pool is passed through to max_rx_q and max_tx_q in the VF,
so if nb_q_per_pool equals 1, max_rx_q and max_tx_q cannot be more
than 1 and VF multi-queue mode fails.
RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool is the maximum number of queues a
VF can use and is assigned as IXGBE_MAX_RX_QUEUE_NUM /
RTE_ETH_DEV_SRIOV(dev).active, so assigning nb_q_per_pool as 1 when
the PF is in ETH_MQ_RX_NONE, which restricts the VF to a single queue,
is not right.
Fixes: 27b609cbd1c6 ("ethdev: move the multi-queue mode check to specific drivers")
Cc: stable@dpdk.org
Signed-off-by: Yanglong Wu <yanglong.wu@intel.com>
Acked-by: Wei Dai <wei.dai@intel.com>
Wei Zhao [Wed, 10 Jan 2018 02:10:26 +0000 (10:10 +0800)]
net/i40e: move RSS to flow API
rte_flow was defined to include RSS; this patch moves the existing
i40e RSS support to rte_flow. The old RSS configuration is kept as it
was and can be deprecated in the future.
Beilei Xing [Mon, 8 Jan 2018 03:09:13 +0000 (11:09 +0800)]
net/i40e: support input set configuration
This patch supports getting/setting input set info for
RSS, FDIR, and FDIR flexible payload. It also adds some
helper functions for input set configuration.
Natalie Samsonov [Thu, 11 Jan 2018 15:35:43 +0000 (16:35 +0100)]
net/mrvl: keep shadow Txqs inside PMD Txq
Change shadow queue allocation from per-port/core to per-txq/core.
Use an array of shadow queues (one per lcore) for each Tx queue object
to avoid data corruption when several Tx queues are handled by one
lcore and buffers that have not been sent yet could be released and
reused for receive.
Natalie Samsonov [Thu, 11 Jan 2018 15:35:41 +0000 (16:35 +0100)]
net/mrvl: fix oversize bpool handling
Do not return the mbuf to the DPDK pool if getting a buffer from the
bpool failed.
Fix the maximum bpool size calculation to prevent unnecessary bpool
oversize cases when working with small Rx queues.
Natalie Samsonov [Thu, 11 Jan 2018 15:35:40 +0000 (16:35 +0100)]
net/mrvl: fix HIF objects allocation
1. Add a check for non-EAL threads.
2. Create HIF objects on first use, since at probe time not all lcores
may be initialized yet and more can be added later.
Previously the HIF objects for such late-added cores were not created,
which caused a system crash.
Natalie Samsonov [Thu, 11 Jan 2018 15:35:39 +0000 (16:35 +0100)]
net/mrvl: fix multiple probe
MUSDK library initialization and cleanup should be done only once.
This commit fixes that by doing the necessary initialization when the
first port is probed and the cleanup when the last port is removed.
Matan Azrad [Tue, 19 Dec 2017 17:14:29 +0000 (17:14 +0000)]
net/failsafe: improve Rx sub-devices iteration
Connecting the sub-devices to each other in a cyclic linked list helps
the Rx burst functions iterate over them, because there is no need to
check for sub-device ring wraparound.
Create the aforementioned linked list and change the Rx burst function
iteration accordingly.
Matan Azrad [Tue, 19 Dec 2017 17:14:28 +0000 (17:14 +0000)]
net/failsafe: mitigate data plane atomic operations
Fail-safe uses atomic operations to protect the sub-device close
operation, which is called by the host thread at removal time while the
removed sub-device's burst functions may still be running in
application threads.
Using "set" atomic operations is slightly more efficient than "add"
or "sub" atomic operations because "set" does not need to read the
value first and, in fact, needs no special atomic mechanism on x86
platforms.
Replace the "add 1" and "sub 1" atomic operations with "set 1" and
"set 0" atomic operations, as sketched below.
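A minimal sketch of the idea using DPDK's generic atomics (the variable and
helper names are illustrative, not the fail-safe PMD's internals):

    #include <rte_atomic.h>

    /* Illustrative flag marking a sub-device burst as in progress.
     * rte_atomic64_set() is a plain store on x86, with no lock-prefixed
     * read-modify-write. */
    static rte_atomic64_t in_burst;

    static inline void
    burst_enter(void)
    {
        /* was: rte_atomic64_add(&in_burst, 1); */
        rte_atomic64_set(&in_burst, 1);
    }

    static inline void
    burst_exit(void)
    {
        /* was: rte_atomic64_sub(&in_burst, 1); */
        rte_atomic64_set(&in_burst, 0);
    }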
Matan Azrad [Tue, 19 Dec 2017 17:14:27 +0000 (17:14 +0000)]
net/failsafe: fix Rx safe check compiler hint
The failsafe_rx_burst function is used when there are no sub-devices or
when at least one of them has been removed; when all the sub-devices
are present, the failsafe_rx_burst_fast function is used instead.
So it is actually expected that some of the sub-devices will be unsafe
for Rx burst during failsafe_rx_burst execution.
Remove the unlikely compiler hint from the fs_rx_unsafe call.
Sharmila Podury [Thu, 11 Jan 2018 19:12:44 +0000 (11:12 -0800)]
net/bonding: add ethdev ops function for MTU set
Set the MTU for the bonding device by calling .mtu_set on all the
slaves. Set the MTU only if all slaves support .mtu_set and no slave
returns an error.
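A minimal sketch of the approach using the public ethdev API (the helper and
parameter names are illustrative, not the bonding PMD's internals):

    #include <stdint.h>
    #include <rte_ethdev.h>

    /* Apply the MTU to every slave; fail the whole operation if any
     * slave rejects it (e.g. returns -ENOTSUP when .mtu_set is missing). */
    static int
    bond_slaves_set_mtu(const uint16_t *slave_ports, uint16_t nb_slaves,
                        uint16_t mtu)
    {
        uint16_t i;
        int ret;

        for (i = 0; i < nb_slaves; i++) {
            ret = rte_eth_dev_set_mtu(slave_ports[i], mtu);
            if (ret < 0)
                return ret;
        }
        return 0;
    }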
Hyong Youb Kim [Wed, 10 Jan 2018 09:17:08 +0000 (01:17 -0800)]
net/enic: refill only the address of the RQ descriptor
Once the RQ descriptors are initialized (enic_alloc_rx_queue_mbufs),
their length_type does not change during normal RX
operations. rx_pkt_burst only needs to reset their address field for
newly allocated mbufs.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Hyong Youb Kim [Wed, 10 Jan 2018 09:17:07 +0000 (01:17 -0800)]
net/enic: remove a couple unnecessary statements
No need to zero ol_flags as it is overwritten at the end of the
function. No need to check for EOP as the caller (enic_recv_pkts) has
already checked it.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Hyong Youb Kim [Wed, 10 Jan 2018 09:17:04 +0000 (01:17 -0800)]
net/enic: fix L4 Rx ptype comparison
For non-UDP/TCP packets, enic may wrongly set PKT_RX_L4_CKSUM_BAD in
ol_flags. The comparison that checks if a packet is UDP or TCP assumes
that RTE_PTYPE_L4 values are bit flags, but they are not. For example,
the following evaluates to true because NONFRAG is 0x600 and UDP is
0x200, and causes the current code to think the packet is UDP.
!!(RTE_PTYPE_L4_NONFRAG & RTE_PTYPE_L4_UDP)
So, fix this by comparing the packet type against UDP and TCP
individually.
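A minimal sketch of the corrected check (an illustrative helper, not the enic
code verbatim):

    #include <stdint.h>
    #include <rte_mbuf.h>

    /* RTE_PTYPE_L4_* values are an enumeration inside RTE_PTYPE_L4_MASK,
     * not independent bit flags, so test UDP and TCP individually. */
    static inline int
    ptype_is_udp_or_tcp(uint32_t packet_type)
    {
        uint32_t l4 = packet_type & RTE_PTYPE_L4_MASK;

        return l4 == RTE_PTYPE_L4_UDP || l4 == RTE_PTYPE_L4_TCP;
    }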
Fixes: 453d15059b58 ("net/enic: use new Rx checksum flags")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
For enic, the required changes are mechanical. Use the new 'offloads'
field in rxmode instead of the bit fields. No changes are required
with respect to txq_flags, as enic does not use it at all.
Per-queue Rx offload capabilities are not set, as all offloads are
per-port at the moment.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Thomas Monjalon [Thu, 4 Jan 2018 16:01:11 +0000 (17:01 +0100)]
ethdev: add notifications for probing and removal
When a PMD finishes probing, it creates the new port by calling
the function rte_eth_dev_allocate().
A notification of the new port is sent there to the upper layer.
When a PMD finishes removal of a port, it calls the function
rte_eth_dev_release_port().
A notification of the destroyed port is sent there to the upper layer.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Thomas Monjalon [Thu, 4 Jan 2018 16:01:08 +0000 (17:01 +0100)]
ethdev: remove useless parameter in callback process
The pointer to the user parameter given at callback registration is
automatically passed to the callback function.
There is no point in allowing a caller to change this user parameter.
That is why this parameter is always set to NULL by PMDs and set only
in the ethdev layer before calling the callback function.
The history is that the user parameter was initially used
by the callback implementation to pass some information
between the application and the driver: c1ceaf3ad056 ("ethdev: add an argument to internal callback function")
Then a new parameter was added so the user parameter could keep its
standard usage as the context given at registration: d6af1a13d7a1 ("ethdev: add return values to callback process API")
The NULL parameter in the internal callback processing function
is now removed. It makes clear that the callback parameter is user
managed and opaque from a DPDK point of view.
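Roughly, the internal helper's shape changes as follows (a sketch, not the
verbatim declarations from the patch):

    #include <rte_ethdev.h>

    /* Before: every PMD had to pass a NULL cb_arg.
     *
     * int _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
     *                                   enum rte_eth_event_type event,
     *                                   void *cb_arg, void *ret_param);
     */

    /* After: the user parameter stays the opaque value given at
     * registration time and is supplied by the ethdev layer itself. */
    int _rte_eth_dev_callback_process(struct rte_eth_dev *dev,
                                      enum rte_eth_event_type event,
                                      void *ret_param);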
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Thomas Monjalon [Fri, 5 Jan 2018 17:38:55 +0000 (18:38 +0100)]
ethdev: fix link autonegotiation value
There are 3 kinds of link data in ethdev:
- capabilities (rte_eth_dev_info)
- configuration (rte_eth_conf)
- status (rte_eth_link)
A bit-field is used for capabilities (rte_eth_dev_info.speed_capa) and
configuration (rte_eth_conf.link_speeds).
Bits are defined in ETH_LINK_SPEED_*.
Some numerical (ETH_SPEED_NUM_*) and boolean (ETH_LINK_*) values
are used for the link status (rte_eth_link.*).
There was a mistake in the comment of rte_eth_link.link_autoneg,
suggesting ETH_LINK_SPEED_[AUTONEG/FIXED] which are 0/1,
instead of ETH_LINK_[AUTONEG/FIXED] which are 1/0.
The drivers are fixed to use ETH_LINK_[AUTONEG/FIXED].
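A minimal sketch of a correctly filled rte_eth_link status (the helper is
illustrative; drivers fill this in their link_update callback):

    #include <rte_ethdev.h>

    static void
    report_link_up_10g(struct rte_eth_link *link, int autoneg_enabled)
    {
        link->link_speed = ETH_SPEED_NUM_10G;     /* numerical value */
        link->link_duplex = ETH_LINK_FULL_DUPLEX; /* boolean value */
        link->link_status = ETH_LINK_UP;          /* boolean value */
        /* Use ETH_LINK_AUTONEG (1) / ETH_LINK_FIXED (0) here, not the
         * ETH_LINK_SPEED_* bit-field macros. */
        link->link_autoneg = autoneg_enabled ? ETH_LINK_AUTONEG
                                             : ETH_LINK_FIXED;
    }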
Fixes: 82113036e4e5 ("ethdev: redesign link speed config")
Suggested-by: Andrew Rybchenko <arybchenko@solarflare.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Olivier Matz [Wed, 3 Jan 2018 10:32:25 +0000 (11:32 +0100)]
net/bnxt: fix headroom initialization
When allocating a new mbuf for Rx, the value of m->data_off should be
reset to its default value (RTE_PKTMBUF_HEADROOM) instead of reusing
the previous, undefined value, which could leave the packet with a too
small or too large headroom.
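A minimal sketch of the refill-time reset (an illustrative helper, not the
bnxt code verbatim):

    #include <rte_mbuf.h>

    /* Reset the headroom on a freshly allocated Rx mbuf rather than
     * inheriting whatever data_off it had in a previous life. */
    static inline void
    rx_mbuf_reset_headroom(struct rte_mbuf *m)
    {
        m->data_off = RTE_PKTMBUF_HEADROOM;
    }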
Chas Williams [Thu, 28 Dec 2017 02:12:31 +0000 (21:12 -0500)]
net/bonding: fix setting slave MAC addresses
Use rte_eth_dev_default_mac_addr_set() to change a slave MAC address.
mac_address_set() only updates the software copy and does nothing to
update the hardware.
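A minimal sketch of the call that programs the slave's hardware (the helper
name is illustrative):

    #include <stdint.h>
    #include <rte_ethdev.h>
    #include <rte_ether.h>

    /* Update both the hardware and ethdev's software copy of the
     * slave's default MAC address. */
    static int
    slave_set_default_mac(uint16_t slave_port_id, struct ether_addr *addr)
    {
        return rte_eth_dev_default_mac_addr_set(slave_port_id, addr);
    }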
Signed-off-by: Chas Williams <chas3@att.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Ajit Khaparde [Mon, 8 Jan 2018 20:24:37 +0000 (12:24 -0800)]
net/bnxt: check on-chip resources
Check the availability of on-chip resources - queue count, number of
stat contexts, number of ring groups - before inheriting and
initializing them per application requirements.
Also, before creating a Tx or Rx queue, make sure there are enough
resources to complete the request.
Somnath Kotur [Mon, 8 Jan 2018 20:24:36 +0000 (12:24 -0800)]
net/bnxt: free the aggregation ring
bnxt_free_all_hwrm_rings() was freeing all the Rx rings, including
zeroing out the memory for the aggregation rings, but was not issuing
the FW command to destroy the AGG ring(s) in HW. This manifested as
problems when port stop/port start were issued, since a HW ring leaked
every time port stop was issued.
Fixes: daef48efe5e5 ("net/bnxt: support set MTU")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Somnath Kotur [Mon, 8 Jan 2018 20:24:35 +0000 (12:24 -0800)]
net/bnxt: fix duplicate pattern for 5tuple filter
When the user re-issues the same 5-tuple filter pattern command with a
different destination queue, it is flagged as an existing match.
However, when deletion of this filter was attempted, it would crash
because the 'vnic' from which the filter was being removed would be
different. Fix this by updating the filter in the scenario where there
is a pattern match and only the destination queue varies.
If the attribute/pattern for a flow is the same, with only the
'action', i.e. the destination queue index, changing, allow it by
cleaning up the older ntuple filter and updating the existing flow
with the new filter rule having the new destination queue ID.
Also, clear the L2 filter during flow_destroy after destroying
the ntuple filter, otherwise the flow record is not completely purged
from the HW.
Ajit Khaparde [Mon, 8 Jan 2018 20:24:33 +0000 (12:24 -0800)]
net/bnxt: remove addition of a temporary filter
filter1, which is used only to get the L2 filter FW ID and is not used
later, was unnecessarily being inserted into a list and was not being
freed after its use was done.
Fix this by not doing the list insertion and by releasing the filter
back to the free filter pool.
Ajit Khaparde [Mon, 8 Jan 2018 20:24:32 +0000 (12:24 -0800)]
net/bnxt: fix check for ether type
As per the documentation, matching ether_types 0x0800 (IPv4) and
0x86DD (IPv6) with ethertype filters is invalid, but we were wrongly
allowing it. This patch fixes that.
Ajit Khaparde [Mon, 8 Jan 2018 20:24:31 +0000 (12:24 -0800)]
net/bnxt: check initialization before accessing stats
Maintain the state of PMD initialization and check it before accessing
stats. In certain cases, we might otherwise end up accessing stats
before the required HWRM commands have been processed by the FW.
Ajit Khaparde [Mon, 8 Jan 2018 20:24:30 +0000 (12:24 -0800)]
net/bnxt: add check for multi host PF per port
Certain NIC SKUs can support features like NPAR and multiple host PFs
per port. We need to check for such features in order to restrict
certain HWRM commands from being sent to the FW.
For the single PF per port model, allow commands like hwrm_port_phy_cfg
from the PF driver. In NPAR and MH environments with multiple PFs per
port, we should not allow HWRM commands like hwrm_port_phy_cfg to be
sent to the FW.
Ajit Khaparde [Mon, 8 Jan 2018 20:24:29 +0000 (12:24 -0800)]
net/bnxt: return proper error code
If the FW fails bnxt_hwrm_func_reset() with an error status, return
the more standard value -EIO instead of -1.
Similarly, the status returned by certain FW commands may sometimes
not be generic; return -EIO in that case as well.
Ajit Khaparde [Mon, 8 Jan 2018 20:24:25 +0000 (12:24 -0800)]
net/bnxt: check return values in init
The return values of functions like bnxt_hwrm_queue_qportcfg and
bnxt_hwrm_func_qcfg were not checked in bnxt_dev_init, preventing a
cleanup in case of a HWRM command failure.
This patch fixes that.
Hemant Agrawal [Wed, 10 Jan 2018 10:46:39 +0000 (16:16 +0530)]
bus/dpaa: support static queues
DPAA hardware supports two kinds of queues:
1. Pull mode queues - where one needs to regularly pull the packets.
2. Push mode queues - where the HW pushes the packets to the queue.
These are high-performance queues, but limited in number.
This patch adds driver support for push mode queues.
Signed-off-by: Sunil Kumar Kori <sunil.kori@nxp.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Hemant Agrawal [Wed, 10 Jan 2018 10:46:28 +0000 (16:16 +0530)]
net/dpaa: add frame count based tail drop with CGR
Replace the byte-based tail queue congestion support
with frame-count-based congestion groups.
Frame counts map easily to the number of Rx descriptors for a queue.
net/sfc: make Tx free threshold check datapath specific
EFX_TXQ_LIMIT is libefx-specific and should not be used
for other Tx datapath implementations (e.g. EF10 native).
The EF10 native Tx datapath has its own understanding of the maximum
TxQ fill level imposed by the EvQ clear strategy and the space reserved
for Tx error and flush events.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
net/sfc: make refill threshold check Rx datapath specific
EFX_RXQ_LIMIT is libefx-specific and should not be used
for other Rx datapath implementations (e.g. EF10 native).
The EF10 native Rx datapath has its own understanding of the maximum
RxQ fill level imposed by the EvQ clear strategy and the space reserved
for Rx error and flush events.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>