Xiao Wang [Mon, 1 Jan 2018 22:00:10 +0000 (14:00 -0800)]
igb_uio: allow multi-process access
In some case, one device are accessed by different processes via
different BARs, so one uio device may be opened by more than one
process, for this case we just need to enable interrupt once, and
pci_clear_master only when the last process closed.
Fixes: 5f6ff30dc507 ("igb_uio: fix interrupt enablement after FLR in VM") Cc: stable@dpdk.org Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
We cannot assume external applications will include rte_config.h on their
own, neither directly nor through a -include parameter like DPDK does
internally.
This not only causes obvious compilation failures that can be reproduced
with check-includes.sh such as:
[...]/rte_memory.h:88:43: error: ‘RTE_CACHE_LINE_SIZE’ was not declared in
this scope
#define __rte_cache_aligned __rte_aligned(RTE_CACHE_LINE_SIZE)
^
It also results in less visible issues, for instance rte_hash_crc.h relying
on RTE_ARCH_X86_64's presence to provide dedicated inline functions.
This patch partially reverts the commit below and adds missing include
lines to the remaining files.
Fixes: f1a7a5c5f404 ("remove include of generated config header") Cc: stable@dpdk.org Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>
Adrien Mazarguil [Thu, 21 Dec 2017 13:00:02 +0000 (14:00 +0100)]
member: fix ISO C in exported header
Reported by check-includes.sh:
[...]/rte_member.h:107:40: error: ISO C does not permit named variadic
macros [-Werror=variadic-macros]
#define RTE_MEMBER_LOG(level, fmt, args...) \
^
Adrien Mazarguil [Thu, 21 Dec 2017 12:59:59 +0000 (13:59 +0100)]
flow_classify: fix ISO C in exported header
Reported by check-includes.sh:
[...]/rte_flow_classify.h:85:47: error: ISO C does not permit named
variadic macros [-Werror=variadic-macros]
#define RTE_FLOW_CLASSIFY_LOG(level, fmt, args...) \
^
Jiayu Hu [Wed, 22 Nov 2017 03:19:42 +0000 (11:19 +0800)]
vhost: support Explicit Congestion Notification
In virtio, Explicit Congestion Notification (ECN) includes two parts:
guest ECN and host ECN. Guest ECN means the frontend can handle TSO
packets which have ECN set, and host ECN means the backend can handle
TSO packets which have ECN set.
The ECN features are rarely used. However, virtio-net enables them by
default, and vhost-net support both. To make live migration from
vhost-net to vhost-user possible, this patch announces to support
guest and host ECN in vhost-user.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Xiao Wang [Wed, 10 Jan 2018 01:23:54 +0000 (09:23 +0800)]
net: add a helper for making RARP packet
Suggested-by: Maxime Coquelin <maxime.coquelin@redhat.com> Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Xiao Wang [Wed, 10 Jan 2018 01:23:53 +0000 (09:23 +0800)]
net/virtio: add packet injection method
This patch adds dev_pause, dev_resume and inject_pkts APIs to allow
driver to pause the worker threads and inject special packets into
Tx queue. The next patch will be based on this.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Xiao Wang [Wed, 10 Jan 2018 01:23:52 +0000 (09:23 +0800)]
net/virtio: make control queue thread-safe
The virtio_send_command function may be called from app's configuration
routine, but also from an interrupt handler called when live migration
is done on the backup side. So this patch makes control queue
thread-safe first.
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Zhiyong Yang [Wed, 10 Jan 2018 06:01:01 +0000 (14:01 +0800)]
examples/vhost: fix startup check
For vhost sample, the operation if (dev_info.max_rx_queues >
MAX_QUEUES) in the function port_init causes startup failure
when using X710(i40e driver). X710 requires that MAX_QUEUES
should be defined no less than 320, however it is defined as
128 currently.
Such checking is overkill and Removal don't affect any
functionality (have already validated ixgbe and i40e).
The removal can avoid similar issue when introduing new physical NIC.
Fixes: 8bd6c395a568 ("examples/vhost: increase maximum queue number") Cc: stable@dpdk.org Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Junjie Chen [Tue, 9 Jan 2018 11:03:48 +0000 (06:03 -0500)]
vhost: support virtqueue interrupt/notification suppression
The driver can suppress interrupt when VIRTIO_F_EVENT_IDX feature bit is
negotiated. The driver set vring flags to 0, and MAY use used_event in
available ring to advise device interrupt util reach an index specified
by used_event. The device ignore the lower bit of vring flags, and send
an interrupt when index reach used_event.
The device can suppress notification in a manner analogous to the ways
driver suppress interrupt. The device manipulates flags or avail_event in
the used ring in the same way the driver manipulates flags or used_event in
available ring.
Signed-off-by: Junjie Chen <junjie.j.chen@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Tested-by: Lei Yao <lei.a.yao@intel.com> Acked-by: Yuanhan Liu <yliu@fridaylinux.org>
Yongseok Koh [Wed, 10 Jan 2018 17:46:49 +0000 (09:46 -0800)]
net/mlx5: fix deadlock of link status alarm
If mlx5_dev_link_status_handler() is executed while canceling the alarm,
deadlock can happen because rte_eal_alarm_cancel() waits for all callbackes
to finish execution and both calls are protected by priv->lock.
Fixes: 198a3c339a8f ("mlx5: handle link status interrupts") Cc: stable@dpdk.org Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Qi Zhang [Wed, 27 Dec 2017 20:22:30 +0000 (15:22 -0500)]
net/e1000: fix mailbox interrupt handler
Mailbox interrupt handler only takecare of the PF reset notification,
for other message mbx->ops.read should not be called since it gets
chance to break the foreground VF to PF communication.
Fixes: 316f4f1adc2e ("net/igb: support VF mailbox interrupt for link up/down") Cc: stable@dpdk.org Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Wei Dai <wei.dai@intel.com>
Qi Zhang [Wed, 27 Dec 2017 20:22:29 +0000 (15:22 -0500)]
net/ixgbe: fix mailbox interrupt handler
Mailbox interrupt handler only takes care of PF reset notification, for
other message ixgbe_read_mbx should not be called since it gets chance
to break the foreground VF to PF communication.
This can be simply repeated by 'testpmd>rx_vlan rm all 0'.
Fixes: 77234603fba0 ("net/ixgbe: support VF mailbox interrupt for link up/down") Cc: stable@dpdk.org Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Wei Dai <wei.dai@intel.com>
Chas Williams [Fri, 20 Oct 2017 03:23:39 +0000 (23:23 -0400)]
net/e1000: always enable receive and transmit
The transmit and receive controller state machines are only enabled after
receiving an interrupt and the link status is now valid. If an adapter
is being used in conjunction with NC-SI, network controller sideband
interface, the adapter may never get a link state change interrupt since
the adapter's PHY is always link up and never changes state.
To fix this, always enable and disable the transmit and receive with
.dev_start and .dev_stop. This is a better match for what is typically
done with the other PMD's. Since we may never get an interrupt to check
the link state, we also poll once at the end of .dev_start to get the
current link status.
Signed-off-by: Chas Williams <chas3@att.com> Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wei Zhao [Fri, 12 Jan 2018 06:59:19 +0000 (14:59 +0800)]
net/i40e: fix port segmentation fault when restart
This patch will go into the process of clear all queue region
related configuration when dev stop even if there is no queue region
command before, so this is a bug, it may cause error. So add code
to check if there is queue configuration exist when flush queue
region config and remove this process when device stop. Queue region
clear only do when device initialization or PMD get flush command.
Fixes: 7cbecc2f7424 ("net/i40e: support queue region set and flush") Cc: stable@dpdk.org Signed-off-by: Wei Zhao <wei.zhao1@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Beilei Xing [Tue, 9 Jan 2018 10:37:44 +0000 (18:37 +0800)]
net/i40e: fix setting MAC address of VF
New MAC address is copied to dev->data->mac_addrs[0] before calling
setting MAC address ops. So it will fail when deleting
dev->data->mac_addrs[0]. Deleting hw->mac.addr will fix the issue.
Fixes: 943c2d899a0c ("net/i40e: set VF MAC from VF") Cc: stable@dpdk.org Signed-off-by: Beilei Xing <beilei.xing@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Yanglong Wu [Tue, 9 Jan 2018 06:32:05 +0000 (14:32 +0800)]
net/ixgbe: fix max queue number for VF
VF can't run in multiple queue mode, if nb_q_per_pool is set to 1.
Nb_q_per_pool is passed through to max_rx_q and max_tx_q in VF.
So if nb_q_per_pool is equal to 1, max_rx_q and max_tx_q shouldn't
be more than 1, otherwise VF multiple queue mode will fail.
RTE_ETH_DEV_SRIOV(dev).nb_q_per_pool is the Max number of queue can be
used in VF, that would be assigned as IXGBE_MAX_RX_QUEUE_NUM /
RTE_ETH_DEV_SRIOV(dev).active, so that assigning nb_q_per_pool as 1
when PF is in ETH_MQ_RX_NONE, which will make VF can just use only 1
queue, is not right.
Fixes: 27b609cbd1c6 ("ethdev: move the multi-queue mode check to specific drivers") Cc: stable@dpdk.org Signed-off-by: Yanglong Wu <yanglong.wu@intel.com> Acked-by: Wei Dai <wei.dai@intel.com>
Wei Zhao [Wed, 10 Jan 2018 02:10:26 +0000 (10:10 +0800)]
net/i40e: move RSS to flow API
Rte_flow was defined to include RSS, this patch moves i40e
existing RSS to rte_flow. The old RSS configuration is kept
as it was, and can be deprecated in the future.
Beilei Xing [Mon, 8 Jan 2018 03:09:13 +0000 (11:09 +0800)]
net/i40e: support input set configuration
This patch supports getting/setting input set info for
RSS, FDIR, and FDIR flexible payload. It also adds some
helper functions for input set configuration.
Natalie Samsonov [Thu, 11 Jan 2018 15:35:43 +0000 (16:35 +0100)]
net/mrvl: keep shadow Txqs inside PMD Txq
Change shadow queues allocation from port/core to txq/core.
Use array of shadow queues (one per lcore) for each tx queue object to
avoid data corruption when few tx queues are handled by one lcore and
buffers that were not sent yet, can be released and used for receive.
Natalie Samsonov [Thu, 11 Jan 2018 15:35:41 +0000 (16:35 +0100)]
net/mrvl: fix oversize bpool handling
Don't return mbuf to dpdk pool if failed to get buffer from the bpool.
Fix maximum bpool size calculation to prevent unnecessary bpool
oversize cases when working with small rx queues.
Natalie Samsonov [Thu, 11 Jan 2018 15:35:40 +0000 (16:35 +0100)]
net/mrvl: fix HIF objects allocation
1. Add checking for non-EAL threads.
2. Create hif objects on first use since sometimes on probe not all
lcores are initialized and can be added later.
In this case the hif objects for later cores were not created and
this caused system crash.
Natalie Samsonov [Thu, 11 Jan 2018 15:35:39 +0000 (16:35 +0100)]
net/mrvl: fix multiple probe
MUSDK library initialization and cleanup should be done once.
This commit fixes that by doing necessary initialization once the first
port is probed and cleanup once the last port is removed.
Matan Azrad [Tue, 19 Dec 2017 17:14:29 +0000 (17:14 +0000)]
net/failsafe: improve Rx sub-devices iteration
Connecting the sub-devices each other by cyclic linked list can help to
iterate over them by Rx burst functions because there is no need to
check the sub-devices ring wraparound.
Create the aforementioned linked-list and change the Rx burst functions
iteration accordingly.
Matan Azrad [Tue, 19 Dec 2017 17:14:28 +0000 (17:14 +0000)]
net/failsafe: mitigate data plane atomic operations
Fail-safe uses atomic operations to protect sub-device close operation
calling by host thread in removal time while the removed sub-device
burst functions are still in process by application threads.
Using "set" atomic operations is a little bit more efficient than "add"
or "sub" atomic operations because "set" shouldn't read the value and
in fact, it does not need a special atomic mechanism in x86 platforms.
Replace "add 1" and "sub 1" atomic operations by "set 1" and "set 0"
atomic operations.
Matan Azrad [Tue, 19 Dec 2017 17:14:27 +0000 (17:14 +0000)]
net/failsafe: fix Rx safe check compiler hint
failsafe_rx_burst function is used when there are no sub-devices or at
least one of them has been removed, on the other hand, when all the
sub-devices are present, failsafe_rx_burst_fast function is used.
So it's really expected that some of the sub-devices will be unsafe for
Rx burst in failsafe_rx_burst execution.
Remove unlikely compiler hint from fs_rx_unsafe calling.
Sharmila Podury [Thu, 11 Jan 2018 19:12:44 +0000 (11:12 -0800)]
net/bonding: add ethdev ops function for MTU set
Set the MTU for bonding device by calling .mtu_set for all
the slaves. Set the MTU only if all slaves support .mtu_set,
and there is no error returned from any slave.
Hyong Youb Kim [Wed, 10 Jan 2018 09:17:08 +0000 (01:17 -0800)]
net/enic: refill only the address of the RQ descriptor
Once the RQ descriptors are initialized (enic_alloc_rx_queue_mbufs),
their length_type does not change during normal RX
operations. rx_pkt_burst only needs to reset their address field for
newly allocated mbufs.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com> Reviewed-by: John Daley <johndale@cisco.com>
Hyong Youb Kim [Wed, 10 Jan 2018 09:17:07 +0000 (01:17 -0800)]
net/enic: remove a couple unnecessary statements
No need to zero ol_flags as it is overwritten at the end of the
function. No need to check for EOP as the caller (enic_recv_pkts) has
already checked it.
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com> Reviewed-by: John Daley <johndale@cisco.com>
Hyong Youb Kim [Wed, 10 Jan 2018 09:17:04 +0000 (01:17 -0800)]
net/enic: fix L4 Rx ptype comparison
For non-UDP/TCP packets, enic may wrongly set PKT_RX_L4_CKSUM_BAD in
ol_flags. The comparison that checks if a packet is UDP or TCP assumes
that RTE_PTYPE_L4 values are bit flags, but they are not. For example,
the following evaluates to true because NONFRAG is 0x600 and UDP is
0x200, and causes the current code to think the packet is UDP.
!!(RTE_PTYPE_L4_NONFRAG & RTE_PTYPE_L4_UDP)
So, fix this by comparing the packet type against UDP and TCP
individually.
Fixes: 453d15059b58 ("net/enic: use new Rx checksum flags") Cc: stable@dpdk.org Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com> Reviewed-by: John Daley <johndale@cisco.com>