This patch allows flow stats to be displayed in extended stats.
To do this, DMA-able memory is registered with the FW during device
initialization. The driver then uses an alarm thread to query the
per-flow stats at regular intervals using the HWRM_CFA_COUNTER_QSTATS
HWRM command and stores them locally. The stored values are displayed
when the application queries the xstats.
The DMA-able memory is unregistered during driver cleanup.
This functionality can be enabled using the flow-xstat devarg and
will be disabled by default. The intention behind this is to allow
stats to be displayed for all the flows in one shot instead of
querying one at a time.
Ivan Dyukov [Mon, 30 Mar 2020 07:58:02 +0000 (10:58 +0300)]
net/virtio: support Virtio link speed feature
This patch adds support for the VIRTIO_NET_F_SPEED_DUPLEX feature
to the virtio driver.
There are two ways to specify the speed of the link:
- 'speed' devarg
- negotiate speed from qemu via VIRTIO_NET_F_SPEED_DUPLEX
The devarg has the highest priority. If the devarg is not specified,
the driver tries to negotiate the speed from qemu.
Signed-off-by: Ivan Dyukov <i.dyukov@samsung.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
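The priority rule of this commit can be sketched as follows. This is only an
illustration: resolve_link_speed, SPEED_UNKNOWN and both parameters are
hypothetical names, not the virtio PMD's actual code.

```c
#include <stdint.h>

/* Hypothetical sentinel meaning "no speed was provided". */
#define SPEED_UNKNOWN UINT32_MAX

/*
 * Pick the link speed: the 'speed' devarg wins; otherwise fall back to
 * the value negotiated through VIRTIO_NET_F_SPEED_DUPLEX.
 */
static uint32_t
resolve_link_speed(uint32_t devarg_speed, uint32_t negotiated_speed)
{
	if (devarg_speed != SPEED_UNKNOWN)
		return devarg_speed;
	return negotiated_speed; /* may itself be SPEED_UNKNOWN */
}
```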
Ivan Dyukov [Mon, 30 Mar 2020 07:58:01 +0000 (10:58 +0300)]
net/virtio-user: adding link speed parameter
The virtio driver already parses the speed devarg. virtio-user should add
it to its list of valid devargs and call the eth_virtio_dev_init function,
which initializes the speed value.
eth_virtio_dev_init is already called from the virtio_user_pmd_probe
function. The only change required to enable the speed devarg is
adding speed to the list of valid devargs.
Signed-off-by: Ivan Dyukov <i.dyukov@samsung.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Marvin Liu [Fri, 17 Apr 2020 01:16:09 +0000 (09:16 +0800)]
vhost: fix shadowed descriptors not flushed
When the ring size or the number of enqueued packets is not aligned with
the batch number, descs updates may still be kept in the shadowed used
structure during batched enqueue. Fix this issue by flushing the remaining
shadowed used descs before the batch flush.
Fixes: f41516c309d7 ("vhost: flush batched enqueue descs directly") Cc: stable@dpdk.org Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Marvin Liu [Fri, 17 Apr 2020 02:39:05 +0000 (10:39 +0800)]
vhost: fix shadow update
Deferring the shadow ring update introduces a functional issue, which is
described in Eugenio's fix patch:
The current implementation of vhost_net in packed vring tries to fill
the shadow vector before sending any actual changes to the guest. While
this can be beneficial for the throughput, it conflicts with some
bufferbloat-avoidance methods like the Linux kernel NAPI, which stops
transmitting packets if there are too many bytes/buffers in the
driver.
It also introduces a performance issue when the frontend runs much faster
than the backend. The frontend may not be able to collect available descs
when the shadow update is deferred. That will harm RFC2544 throughput.
The appropriate choice is to remove the deferred shadowed update method.
Now shadowed used descs are flushed at the end of the dequeue function.
Fixes: 31d6c6a5b820 ("vhost: optimize packed ring dequeue") Cc: stable@dpdk.org Signed-off-by: Marvin Liu <yong.liu@intel.com> Tested-by: Yinan Wang <yinan.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Matan Azrad [Tue, 24 Mar 2020 14:24:34 +0000 (14:24 +0000)]
vdpa/mlx5: move virtual doorbell alloc to probe
The configure and close operations may be called many times by the vhost
library according to the virtio connections in the guest.
VAR is the device memory space for the virtio queues doorbells.
Each VAR page can be shared by more than one queue, while its owner must
synchronize the writes to it.
The mlx5 driver allocates a single VAR page for all its queues.
Therefore, it is better to allocate it at the device probe level instead of
creating and destroying it per new connection.
Asaf Penso [Mon, 23 Mar 2020 17:50:13 +0000 (17:50 +0000)]
vdpa/mlx5: set default queue indices
The rte_vhost_get_vring_base function is being called to get the values
of last_avail_idx and last_used_idx.
These fields will not have the correct values in case the function
returns an error.
Add a check of the function return value and, in case of an
error, set the fields to zero and print a warning message.
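A minimal sketch of this fallback, under stated assumptions: struct
vq_indices, get_base_fn, load_vq_indices and failing_getter are hypothetical
stand-ins, and the real driver calls rte_vhost_get_vring_base rather than a
function pointer.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the indices kept per virtqueue. */
struct vq_indices {
	uint16_t last_avail_idx;
	uint16_t last_used_idx;
};

/* Hypothetical getter type: returns 0 on success, non-zero on error. */
typedef int (*get_base_fn)(uint16_t *avail, uint16_t *used);

/* Example getter that fails and leaves unreliable values behind. */
static int
failing_getter(uint16_t *avail, uint16_t *used)
{
	*avail = 0xdead;
	*used = 0xbeef;
	return -1;
}

/* Query the ring base; on failure, warn and default both indices to zero. */
static int
load_vq_indices(struct vq_indices *vq, get_base_fn get_base)
{
	int ret = get_base(&vq->last_avail_idx, &vq->last_used_idx);

	if (ret != 0) {
		fprintf(stderr, "Couldn't get vring base, using index 0\n");
		vq->last_avail_idx = 0;
		vq->last_used_idx = 0;
	}
	return ret;
}
```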
When a flow is offloaded with MARK action (RTE_FLOW_ACTION_TYPE_MARK),
each packet of that flow will have metadata set in its completion.
This metadata will be used to fetch an index into a mark table where
the actual MARK for that flow is stored. Fetch the MARK from the mark
table and inject it into packet’s mbuf.
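The lookup described above might look roughly like the sketch below. All
names (mark_entry, pkt, inject_mark, MARK_TBL_SIZE) are hypothetical, and
using the metadata directly as the table index is an assumption made for
illustration; the driver may derive the index differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define MARK_TBL_SIZE 8 /* hypothetical table size */

struct mark_entry {
	bool valid;
	uint32_t mark;
};

/* Hypothetical stand-in for the mbuf fields involved. */
struct pkt {
	bool has_mark;
	uint32_t mark;
};

/* Use the completion metadata as an index into the mark table. */
static void
inject_mark(const struct mark_entry *tbl, uint32_t meta, struct pkt *p)
{
	p->has_mark = false;
	if (meta >= MARK_TBL_SIZE || !tbl[meta].valid)
		return; /* no MARK stored for this flow */
	p->mark = tbl[meta].mark;
	p->has_mark = true;
}
```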
This patch does the following
1. Gets the ulp session information from eth_dev
2. Fetches the rte_flow table associated with this session
3. Iterates through all the flows in the flow table
4. Calls ulp_mapper_resources_free which releases the key & action
tables associated with each flow
This patch does the following
1. Gets the ulp session information from eth_dev
2. Fetches the flow associated with the flow id from the flow table
3. Calls ulp_mapper_resources_free which releases the key & action
tables associated with that flow
This patch does the following
1. Validates rte_flow_create arguments
2. Parses rte_flow_item types
3. Parses rte_flow_action types
4. Calls ulp_matcher_pattern_match to see if the flow is supported
5. If there is a match, returns success, otherwise failure
This patch does the following
1. Validates rte_flow_create arguments
2. Parses rte_flow_item types
3. Parses rte_flow_action types
4. Calls ulp_matcher_pattern_match to see if the flow is supported
5. If there is a match, calls ulp_mapper_flow_create to program
key & action tables
This patch does the following:
1. Registers a callback handler for each rte_flow_action type, if
it is supported
2. Iterates through each rte_flow_action until RTE_FLOW_ACTION_TYPE_END
3. Invokes the action callback handler
4. Each action callback handler populates the respective fields in
act_details & act_bitmap
1. Registers a callback handler for each rte_flow_item type, if it
is supported
2. Iterates through each rte_flow_item until RTE_FLOW_ITEM_TYPE_END
3. Invokes the header callback handler
4. Each header callback handler populates the respective fields
in hdr_field & hdr_bitmap
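The handler-table pattern in the steps above can be sketched as follows. The
enum, the handlers and parse_items are hypothetical simplifications; the real
code works on rte_flow_item arrays and fills hdr_field as well as hdr_bitmap.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical item types; ITEM_END terminates the list. */
enum item_type { ITEM_END, ITEM_ETH, ITEM_IPV4, ITEM_VXLAN, ITEM_MAX };

struct parse_state {
	uint64_t hdr_bitmap;
};

typedef int (*item_handler)(struct parse_state *st);

static int
handle_eth(struct parse_state *st)
{
	st->hdr_bitmap |= UINT64_C(1) << ITEM_ETH;
	return 0;
}

static int
handle_ipv4(struct parse_state *st)
{
	st->hdr_bitmap |= UINT64_C(1) << ITEM_IPV4;
	return 0;
}

/* One slot per item type; a NULL slot means "not supported". */
static const item_handler handlers[ITEM_MAX] = {
	[ITEM_ETH] = handle_eth,
	[ITEM_IPV4] = handle_ipv4,
};

/* Walk the items until ITEM_END, invoking the handler of each one. */
static int
parse_items(const enum item_type *items, struct parse_state *st)
{
	for (; *items != ITEM_END; items++) {
		unsigned int t = (unsigned int)*items;

		if (t >= ITEM_MAX || handlers[t] == NULL)
			return -1; /* unsupported item */
		if (handlers[t](st) != 0)
			return -1;
	}
	return 0;
}
```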
net/bnxt: match flow API actions with flow template actions
This patch does the following
1. Takes act_bitmap generated from the rte_flow_actions
2. Iterates through the static act_bitmap list
3. Returns success if a match is found, otherwise an error
net/bnxt: match flow API items with flow template patterns
This patch does the following
1. Takes hdr_bitmap generated from the rte_flow_items
2. Iterates through the static hdr_bitmap list
3. Returns success if a match is found, otherwise an error
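Both matching commits follow the same pattern: compare a generated bitmap
against a static list. A minimal sketch, assuming 64-bit bitmaps and an
exact-equality match; template_bitmaps and match_bitmap are hypothetical
names and the list contents are made up.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical static list: each entry is one supported bitmap. */
static const uint64_t template_bitmaps[] = { 0x3, 0x7, 0x15 };

/* Return the index of the matching template, or -1 if none matches. */
static int
match_bitmap(uint64_t bitmap)
{
	size_t n = sizeof(template_bitmaps) / sizeof(template_bitmaps[0]);

	for (size_t i = 0; i < n; i++) {
		if (template_bitmaps[i] == bitmap)
			return (int)i;
	}
	return -1;
}
```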
Mike Baucom [Wed, 15 Apr 2020 08:18:59 +0000 (13:48 +0530)]
net/bnxt: support alloc and program key and act tables
This patch does the following
1. Gets the action tables information from the action template id
2. Gets the class tables information from the class template id
3. Initializes the registry file
4. Allocates a flow id from the flow table
5. Processes the class & action tables
A ULP session will contain all the resources needed to support
rte flow offloads. A session is initialized as part of rte_eth_device
start. A DPDK application can have multiple interfaces which
means rte_eth_device start will be called for each of these devices.
The ULP session manager makes sure that a given ULP session is
initialized only once. Apart from this, it also initializes the MARK database,
EEM table & flow database. ULP session manager also manages a list of
all opened ULP sessions.
This patch adds support for cleaning up resources initialized for ULP
sessions.
A ULP session will contain all the resources needed to support
rte flow offloads. A session is initialized as part of rte_eth_device
start. A DPDK application can have multiple interfaces which
means rte_eth_device start will be called for each of these devices.
The ULP session manager makes sure that a given ULP session is
initialized only once. Apart from this, it also initializes the MARK database,
EEM table & flow database. ULP session manager also manages a list of
all opened ULP sessions.
A VNIC is needed for the driver to program the action record for Rx
flows. The VNIC determines which receive rings are used to place the
received packets. This patch introduces a routine that converts a given
DPDK port to a VNIC.
SVIF (source virtual interface) is used to represent a physical port,
physical function, or a virtual function. SVIF is compared during L2
context and exact match lookups in TX direction. SVIF is masked for
port information during L2 context and exact match lookup in RX direction.
Hence, the driver needs this SVIF information to program the L2 context and
exact match tables.
- Add TruFlow flow memory support
- Exact Match (EM) adds the capability to manage and manipulate
data flows using on chip memory.
- Extended Exact Match (EEM) behaves similarly to EM, but at a
vastly increased scale by using host DDR, with performance
trade-off due to the need to access off-chip memory.
Signed-off-by: Pete Spreadborough <peter.spreadborough@broadcom.com> Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Farah Smith [Wed, 15 Apr 2020 08:18:46 +0000 (13:48 +0530)]
net/bnxt: add TruFlow core identifier
- Add TruFlow Identifier resource support
- Add TruFlow public API for Identifier resources.
- Add support code and stack for Identifier resource allocation control.
Signed-off-by: Farah Smith <farah.smith@broadcom.com> Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Michael Wildt [Wed, 15 Apr 2020 08:18:45 +0000 (13:48 +0530)]
net/bnxt: add resource manager
- Add TruFlow RM functionality for resource handling
- Update the TruFlow Resource Manager (RM) with resource
support functions for debugging as well as resource cleanup.
- Add support for Internal and external pools.
Signed-off-by: Michael Wildt <michael.wildt@broadcom.com> Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Michael Wildt [Wed, 15 Apr 2020 08:18:43 +0000 (13:48 +0530)]
net/bnxt: add TruFlow core session SRAM
- Add TruFlow session resource support functionality
- Add TruFlow session hw flush capability as well as
sram support functions.
- Add resource definitions for session pools.
Signed-off-by: Michael Wildt <michael.wildt@broadcom.com> Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Michael Wildt [Wed, 15 Apr 2020 08:18:42 +0000 (13:48 +0530)]
net/bnxt: add initial TruFlow core session close
- Add TruFlow session and resource support functions
- Add Truflow session close API and related message support functions
for both session and hw resources
Signed-off-by: Michael Wildt <michael.wildt@broadcom.com> Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Yunjian Wang [Thu, 16 Apr 2020 13:50:52 +0000 (21:50 +0800)]
net/tap: fix unexpected link handler
The NIC's interrupt source has an active handler, which may call
tap_dev_intr_handler() to set the link handler. We should cancel the link
handler before closing the fd to prevent the link handler from executing.
Otherwise, it triggers a segfault.
Call Trace:
0x00007f15e08dad99 in __rte_panic (Error adding fd %d epoll_ctl, %s\n")
0x00007f15e08e9b87 in eal_intr_thread_main ()
0x00007f15e249be15 in start_thread ()
0x00007f15d5322f9d in clone ()
Fixes: c0bddd3a057f ("net/tap: add link status notification") Cc: stable@dpdk.org Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Yunjian Wang [Thu, 16 Apr 2020 03:04:56 +0000 (11:04 +0800)]
net/tap: fix fd leak on creation failure
When eth_dev_tap_create() fails, nlsk_fd and ka_fd are not closed,
which leads to an fd leak. Zero is a valid fd, which ultimately leads to
a valid fd being closed by mistake.
Fixes: bf7b7f437b49 ("net/tap: create netdevice during probing") Fixes: cb7e68da630a ("net/tap: fix cleanup on allocation failure") Cc: stable@dpdk.org Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org>
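One common way to avoid both problems is to mark unopened descriptors with -1
and close only real ones. The sketch below assumes that approach; tap_fds,
tap_fds_init and tap_fds_close are hypothetical names, not the tap PMD's own
structures.

```c
#include <unistd.h>

/* Hypothetical subset of the per-device descriptors. */
struct tap_fds {
	int nlsk_fd;
	int ka_fd;
};

/* Mark both descriptors as "not opened"; 0 would be a valid fd. */
static void
tap_fds_init(struct tap_fds *f)
{
	f->nlsk_fd = -1;
	f->ka_fd = -1;
}

/* Close only descriptors that were really opened; return how many. */
static int
tap_fds_close(struct tap_fds *f)
{
	int closed = 0;

	if (f->nlsk_fd != -1) {
		close(f->nlsk_fd);
		f->nlsk_fd = -1;
		closed++;
	}
	if (f->ka_fd != -1) {
		close(f->ka_fd);
		f->ka_fd = -1;
		closed++;
	}
	return closed;
}
```

Calling tap_fds_close on the error path of a creation routine then releases
whatever was opened so far without ever touching fd 0.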
Yunjian Wang [Thu, 16 Apr 2020 03:04:45 +0000 (11:04 +0800)]
net/tap: fix file close on remove
The internal structure is freed and set to NULL in
rte_eth_dev_release_port(), and zero is a valid fd. This ultimately
leads to a valid fd being closed by mistake.
Fixes: 3101191c63ab ("net/tap: fix device removal when no queue exist") Cc: stable@dpdk.org Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Yunjian Wang [Thu, 16 Apr 2020 03:04:25 +0000 (11:04 +0800)]
net/tap: fix mbuf and mem leak during queue release
For the tap PMD, we should release mbufs and iovecs from the Rx queue
when closing the device. In order to remove duplicated code,
rte_pmd_tap_remove() calls tap_dev_close().
Fixes: 0781f5762cfe ("net/tap: support segmented mbufs") Cc: stable@dpdk.org Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Yunjian Wang [Thu, 16 Apr 2020 03:04:07 +0000 (11:04 +0800)]
net/tap: fix mbuf double free when writev fails
When the tap_write_mbufs() function returns with a break, the mbuf is freed
without increasing num_packets, which could cause applications to free
the mbuf again. Also, the pmd_tx_burst() function should return the
number of original packets it actually sent, excluding TSO mbufs.
Fixes: 9396ad334672 ("net/tap: fix reported number of Tx packets") Cc: stable@dpdk.org Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Xiaoyu Min [Mon, 13 Apr 2020 03:32:56 +0000 (06:32 +0300)]
net/mlx5: fix validation of push VLAN without full mask
Due to a HW limitation, when the PMD creates a push VLAN action it needs to
know the exact value of VID/PCP.
The PMD tries to figure them out via:
- of_set_vlan_vid/pcp actions
- VLAN item in pattern
If none of the above is provided, the default value of zero is used.
However, a user may write a rule like [1], which matches on a range of VIDs
without an of_set_vlan_vid action, and expect the VID to be inherited from
the original packet. This is not supported by HW currently. The PMD sets
the VID to the default value of zero because it cannot figure out the exact
value of the VID from the VLAN item.
This can be misleading for users.
To avoid this, the PMD reports an error for rules like [1] to
force the user to provide an explicit VID/PCP for newly pushed VLAN headers.
[1]: testpmd> flow create 2 ingress transfer group 0 priority 3 pattern
eth / vlan vid spec 2859 vid prefix 4 / ipv4 / end
actions of_push_vlan ethertype 0x88A8 /
of_set_vlan_pcp vlan_pcp 6 / port_id id 0 / end
Fixes: 9aee7a8418d4 ("net/mlx5: support push flow action on VLAN header") Cc: stable@dpdk.org Signed-off-by: Xiaoyu Min <jackmin@mellanox.com> Reviewed-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Xiaoyu Min [Mon, 13 Apr 2020 03:29:03 +0000 (06:29 +0300)]
net/mlx5: fix push VLAN action to use item info
Currently, when the PMD creates a push VLAN action it needs to provide the
VID to HW, and the PMD gets the VID value from the VLAN item in the pattern
if no of_set_vlan_vid action follows.
When a user creates a rule like [1], which has an of_set_vlan_vid action
before of_push_vlan, the intention is to modify the VID of the existing VLAN
header and push a new VLAN header with the VID _inherited_ from the previous
of_set_vlan_vid.
Currently this case is not covered by the PMD: it always fetches the VLAN
information from the item for the of_push_vlan action.
Fix it by fetching the VLAN information from the item only when there is no
previous of_set_vlan_vid action.
[1]: testpmd> flow create 2 ingress transfer group 1 priority 3 pattern
eth / vlan vid is 2731 / ipv4 / end actions
of_set_vlan_vid vlan_vid 3209 / of_push_vlan ethertype
0x88A8 / port_id id 1 / end
Fixes: b8c0372bc5ac ("net/mlx5: fix set VLAN ID/PCP in new header") Cc: stable@dpdk.org Signed-off-by: Xiaoyu Min <jackmin@mellanox.com> Reviewed-by: Dekel Peled <dekelp@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
John Daley [Wed, 15 Apr 2020 01:06:40 +0000 (18:06 -0700)]
net/enic: support flow API RSS ranges on outer headers
Support rte_flow RSS action on outer headers (level 0). RSS ranges on
the non-default port are OK.
Restrictions:
- The RETA is ignored. The hash function is simply applied across
the RSS queue range.
- The queues used in the RSS group must be sequential.
- There is a performance hit if the number of queues is not a power
of 2.
Signed-off-by: John Daley <johndale@cisco.com> Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>
John Daley [Wed, 15 Apr 2020 01:06:39 +0000 (18:06 -0700)]
net/enic: change Rx queue ordering to enable RSS action
Each RTE RQ is represented on enic as a Start Of Packet (SOP) queue
and an overflow queue (DATA). They were arranged SOP0/DATA0, SOP1/DATA1,...
but need to be arranged SOP0, SOP1,..., DATA0, DATA1... so that
rte_flow RSS queue ranges work.
Signed-off-by: John Daley <johndale@cisco.com> Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>
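The two layouts can be compared with a small index-mapping sketch. The
function names are hypothetical, and placing the DATA block immediately after
nb_rq SOP queues is an assumption made for illustration.

```c
/* Old interleaved layout: SOP0, DATA0, SOP1, DATA1, ... */
static unsigned int
old_sop_idx(unsigned int rq)
{
	return rq * 2;
}

static unsigned int
old_data_idx(unsigned int rq)
{
	return rq * 2 + 1;
}

/* New layout: SOP0..SOP(n-1) first, then DATA0..DATA(n-1). */
static unsigned int
new_sop_idx(unsigned int rq)
{
	return rq;
}

static unsigned int
new_data_idx(unsigned int rq, unsigned int nb_rq)
{
	return nb_rq + rq;
}
```

With the new layout, RTE queues q..q+k map to the contiguous SOP range
q..q+k, which is what a queue range needs.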
John Daley [Wed, 15 Apr 2020 01:06:38 +0000 (18:06 -0700)]
net/enic: update flow manager API
Update the VIC Flow Manager API. The extensions will allow support for:
- Decap and strip VLAN
- Remove outer VLAN
- Set Egress port
- Set VLAN when replicating encapped packets
- RSS queue ranges on outer header
Signed-off-by: John Daley <johndale@cisco.com> Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>
Vu Pham [Mon, 13 Apr 2020 21:17:48 +0000 (14:17 -0700)]
common/mlx5: refactor memory management
Move the common memory btree and cache management to the common driver.
Replace some input parameters of the MR APIs with more common data
structures like PD, port_id, share_cache,... so that multiple PMDs can
use those MR APIs.
Modify the mlx5 net PMD to use the MR management APIs from the common driver.
Signed-off-by: Vu Pham <vuhuong@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Vu Pham [Mon, 13 Apr 2020 21:17:47 +0000 (14:17 -0700)]
common/mlx5: refactor IPC handling from net driver
Move the common multi-process handling code from the net PMD to the common
driver. Use the tuple mp_id{name, port_id} as the standard input parameter
for all multi-process IPC APIs instead of rte_eth_dev.
Modify the net PMD to use the multi-process APIs from the common driver.
Signed-off-by: Vu Pham <vuhuong@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Currently, when translating a jump action, the table reference is
increased every time. But the table resource reference is decreased
only when the jump action itself is released.
This means that for a jump action which is referenced more than once, the
table reference is increased each time but decreased only once, when the
jump action is released.
Add a table release when the jump action is not newly created.
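The reference-count imbalance and its fix can be illustrated with a small
sketch. It assumes one jump action per table; flow_tbl, jump_action_get and
jump_action_put are hypothetical names and do not mirror the mlx5 structures.

```c
struct flow_tbl {
	int refcnt;      /* references held on the table */
	int jump_refcnt; /* references on its jump action; 0 = not created */
};

/* Translate a jump action targeting this table. */
static void
jump_action_get(struct flow_tbl *tbl)
{
	tbl->refcnt++;              /* translation always takes a table ref */
	if (tbl->jump_refcnt++ > 0) /* jump action already existed */
		tbl->refcnt--;      /* the fix: give the extra ref back */
}

/* Release one user of the jump action. */
static void
jump_action_put(struct flow_tbl *tbl)
{
	if (--tbl->jump_refcnt == 0)
		tbl->refcnt--;      /* last user drops the single table ref */
}
```

Without the release in jump_action_get, two gets followed by two puts would
leave refcnt at 1 and the table would never be freed.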
Currently, the meter suffix table is created and saved in the mlx5
shared struct. As a result, the suffix table is never released,
even when there are no meter rules.
Move the suffix table to the meter domain struct so that the suffix table
is released when all the meter rules are destroyed.
The multi-stride operations now allow reducing the stride size
while supporting Jumbo frames. That means that it is possible
to have mbufs configured with a size smaller than the whole
packet received. It is not an issue during normal MPRQ operations
since we attach external buffers instead of copying the data
into the mbuf itself. But it is an issue in "emergency mode",
when we have to copy every packet because no more external
mbufs are available. To overcome this issue, assemble a multi-segment
packet if scatter mode is enabled, and drop the packet if not.
MPRQ feature should be updated to allow a packet to be received
into multiple strides in order to support the MTU exceeding 8KB.
Special care is needed to prevent the headroom corruption in the
multi-stride mode since the headroom space is borrowed by the PMD
from the tail of the preceding stride. Copy the whole packet into
a separate mbuf in this case or just the overlapping data if the
Rx scattering is supported by an application.
net/mlx5: add device parameter for MPRQ stride size
Define a device parameter to configure log 2 of a stride size for MPRQ
- mprq_log_stride_size. User is able to specify a stride size in a range
allowed by an underlying hardware. The default stride size is defined as
2048 bytes to encompass most commonly used packet sizes in the Internet
(MTU 1518 and less) and will be used in case a maximum configured packet
size cannot fit into the largest possible stride size. Otherwise a
stride size is set to a large enough value to encompass a whole packet.
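The selection rule described above can be sketched as follows. This is a
simplified reading of the text: pick_log_stride_size and its parameters are
hypothetical, headroom is ignored, and a user-supplied mprq_log_stride_size
is not modeled.

```c
#include <stdint.h>

#define DEFAULT_LOG_STRIDE_SIZE 11 /* 2^11 = 2048 bytes, per the text */

/*
 * Pick log2 of the stride size for a maximum packet length, given the
 * [min_log, max_log] range allowed by the hardware (max_log < 32).
 */
static unsigned int
pick_log_stride_size(uint32_t max_pkt_len, unsigned int min_log,
		     unsigned int max_log)
{
	unsigned int log = min_log;

	/* Packet cannot fit into any single stride: use the default. */
	if (max_pkt_len > (UINT32_C(1) << max_log))
		return DEFAULT_LOG_STRIDE_SIZE;
	/* Otherwise take the smallest stride that holds a whole packet. */
	while ((UINT32_C(1) << log) < max_pkt_len)
		log++;
	return log;
}
```

The range 6 to 13 used when trying this out is only an example; the real
limits come from the device capabilities.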
net/ice/base: force switch to use different recipe
When we use a profile rule as a switch rule to download, if
we download 2 different rules one by one, there will be a
rejection from function ice_aq_sw_rules(), for example:
"flow create 0 priority 0 ingress pattern eth / ipv6 / ah
/ end actions queue index 3 / end"
"flow create 0 priority 0 ingress pattern eth / ipv6 / esp
/ end actions queue index 2 / end"
That is because the 2 rules have the same s_rule input set
except for the action queue index, so the second one is rejected by
hardware. So we have to use different recipes for them.
Also, we need to add recipe_id to keep a record of the recipe
index, which is used on rule removal. Otherwise, there
will be an error when searching for the recipe in function
ice_rem_adv_rule() if we create 2 or more profile rules.
For example:
"flow create 0 priority 0 ingress pattern eth / ipv4 / udp
/ pfcp s_field is 1 / end actions queue index 4 / end"
"flow create 0 priority 0 ingress pattern eth / ipv4 / udp
/ pfcp s_field is 0 / end actions queue index 5 / end"
then,
"flow flush 0"
you will find that only the first rule is deleted,
because ice_find_recp() always returns the recipe
id of the first rule.
net/ice/base: add mask check to find switch recipe
In order to find the accurate recipe for a switch filter, we
need to add the mask as an element when searching for a recipe.
If we create different rules with the same input set, but
using different masks, then the proper recipes should use
those different masks.
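A minimal sketch of a search that takes the mask into account. The recipe
struct is reduced to a single field/mask pair and find_recipe is a
hypothetical name; the ice base code tracks several lookup words per recipe.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, single-word recipe: what is matched and with which mask. */
struct recipe {
	uint16_t field;
	uint16_t mask;
};

/* Return the index of the recipe with the same field AND mask, or -1. */
static int
find_recipe(const struct recipe *r, size_t n, uint16_t field, uint16_t mask)
{
	for (size_t i = 0; i < n; i++) {
		if (r[i].field == field && r[i].mask == mask)
			return (int)i;
	}
	return -1;
}
```

Comparing the field alone would return the first entry for both rules, which
is the ambiguity the commit removes.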
When we add a long switch rule, we need to check the
final number of recipes. If it is larger than
ICE_MAX_CHAIN_RECIPE, we should refuse this rule.
For example:
"flow create 0 ingress pattern eth / ipv6
src is CDCD:910A:2222:5498:8475:1111:3900:1536
dst is CDCD:910A:2222:5498:8475:1111:3900:2022
tc is 3 / udp dst is 45 / end actions queue index 2 / end"
This rule will consume 6 recipes. If it is not refused, the
following code will overwrite lkup_indx and mask.
Gavin Hu [Mon, 13 Apr 2020 16:40:25 +0000 (00:40 +0800)]
net/i40e: restrict pointer aliasing for NEON
Restrict pointer aliasing to optimize the code generated.
The patch showed ~3% performance uplift on Arm N1SDP platform, and no
degradation on ThunderX2. The test case is RFC2544 zero-loss L2
forwarding running testpmd.
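As a generic illustration of what restricting pointer aliasing means in C
(this is not the i40e NEON code; add_offset is a made-up example):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * With 'restrict' the compiler may assume dst and src never overlap, so
 * it does not have to reload src[] after each store to dst[] and can
 * vectorize the loop more freely. The caller must honor that promise.
 */
static void
add_offset(uint32_t *restrict dst, const uint32_t *restrict src, size_t n,
	   uint32_t off)
{
	for (size_t i = 0; i < n; i++)
		dst[i] = src[i] + off;
}
```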
Disable CQ_DISABLED error interrupt in NIX_LF_ERR_INT
to fix spurious interrupts in event dev mode. Also skip
configuring RSS when RQ count is '0' because
RSS table initialization is done incorrectly due to
divide-by-zero error and it is leading to RQ_OOR error
in NIX_LF_ERR_INT.
Fixes: 83ce2880e22e ("net/octeontx2: support RSS") Cc: stable@dpdk.org Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
In the case of the bnx2xvf PMD, Tx packets can carry a VLAN ID in 2 ways:
1. Setting the mbuf ol_flags=PKT_TX_VLAN_PKT and passing the
VLAN ID in mbuf->vlan_tci.
2. The Tx packet itself has the VLAN ID included in the packet.
The first case works as expected, but the second case, where
the VLAN ID is included in the Tx packet itself, was found not to
work as expected. To handle that, we need to properly set the
start_bd bitfield and the vlan_or_ethertype instead of setting it
to just the ethertype in case of VF.
Add support for the set_mc_addr_list device operation in the bnx2xvf PMD.
The configured addresses are stored in the device private area, so
they can be flushed before adding new ones.
Without this, v6 multicast packets were not properly forwarded to the
Guest VF.
fgets(3)/fread(3)/fscanf(3) etc. use mmap(2)/munmap(2) which leads
to TLB shootdown interrupts to all DPDK app cores including RX cores.
This can cause packet drops. Use read(2)/write(2) instead.
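A sketch of reading a small file with open(2)/read(2) only, into a
caller-provided buffer. read_file is a hypothetical helper, not the function
changed by this commit, and it does not retry on EINTR.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Read up to len - 1 bytes of 'path' into 'buf' and NUL-terminate it.
 * Returns the number of bytes read, or -1 on error.
 */
static ssize_t
read_file(const char *path, char *buf, size_t len)
{
	size_t total = 0;
	int fd;

	if (len == 0)
		return -1;
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;
	while (total < len - 1) {
		ssize_t n = read(fd, buf + total, len - 1 - total);

		if (n < 0) {
			close(fd);
			return -1;
		}
		if (n == 0) /* end of file */
			break;
		total += (size_t)n;
	}
	close(fd);
	buf[total] = '\0';
	return (ssize_t)total;
}
```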
When creating a flow, usually the creating routine is called in
serial. No parallel execution is supported right now. The same
function will be called only once for a single flow creation.
But there is a special case in which the creating routine is called in a
nested way. If the xmeta feature is enabled and there is FLAG / MARK in
the actions list, some metadata reg copy flow needs to be created
before the original flow is applied to the hardware.
In the flow non-cached mode, resources only for flow creation will
not be saved anymore. The memory space is pre-allocated and reused
for each flow. A global index for each device is used to indicate
the memory address of the resources. If the function is called in a
nested mode, then the index is reset and everything gets
corrupted.
To solve this, a nested index is introduced to save the position for
the original flow creation. Currently, only one level of nested calls
of the flow creating routine is supported.
Fixes: e7bfa3596a0a ("net/mlx5: separate the flow handle resource") Signed-off-by: Bing Zhao <bingz@mellanox.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
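The idea of saving the position for the outer flow can be sketched as below.
flow_ws, flow_begin, flow_alloc_slot and flow_end are hypothetical names and
the logic is a simplification: it only shows why a second index is needed
when one creation call is nested inside another.

```c
#include <stdbool.h>

struct flow_ws {
	int idx;        /* next free slot in the pre-allocated array */
	int nested_idx; /* where the nested flow's slots start; 0 = none */
};

/* Begin creating a flow; returns the first slot this flow may use. */
static int
flow_begin(struct flow_ws *ws, bool nested)
{
	if (nested)
		ws->nested_idx = ws->idx; /* keep the outer flow's slots */
	else
		ws->idx = 0;              /* top-level call: reuse from 0 */
	return ws->idx;
}

static int
flow_alloc_slot(struct flow_ws *ws)
{
	return ws->idx++;
}

/* Finish a flow; a nested flow hands its slots back to the outer one. */
static void
flow_end(struct flow_ws *ws, bool nested)
{
	if (nested) {
		ws->idx = ws->nested_idx;
		ws->nested_idx = 0;
	}
}
```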
Thomas Monjalon [Wed, 8 Apr 2020 00:09:00 +0000 (02:09 +0200)]
net/mlx4: fix build with -fno-common
The variable storages of the same name are merged together
if compiled with -fcommon. This is the default.
This default behaviour allows declaring a variable in a header file and
sharing the variable across all .o files thanks to the merge at link time.
In the case of dlopen linking of the glue library, the pointer mlx4_glue
is referencing the glue functions struct and is set after calling
dlopen.
If compiling with -fno-common (default in GCC 10), the variables must be
declared as extern to avoid multiple re-definitions.
In case the glue layer is split into a glue library, the variable mlx4_glue
needs to have its own storage for the rest of the PMD.
Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Matan Azrad <matan@mellanox.com>
Thomas Monjalon [Wed, 8 Apr 2020 00:08:59 +0000 (02:08 +0200)]
common/mlx5: fix build with -fno-common
The variable storages of the same name are merged together
if compiled with -fcommon. This is the default.
This default behaviour allows declaring a variable in a header file and
sharing the variable across all .o files thanks to the merge at link time.
In the case of dlopen linking of the glue library, the pointer mlx5_glue
is referencing the glue functions struct and is set after calling
dlopen.
If compiling with -fno-common (default in GCC 10), the variable must be
declared as extern to avoid multiple re-definitions.
In case the glue layer is split into a glue library, the variable mlx5_glue
needs to have its own storage for the rest of the PMD.
Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Matan Azrad <matan@mellanox.com>
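The declaration/definition split that -fno-common requires can be shown in a
few lines. Here struct glue_ops and the variable glue are hypothetical
stand-ins for the glue struct and the mlx5_glue pointer, and both parts are
shown in one file for brevity.

```c
#include <stddef.h>

/* Hypothetical glue function table. */
struct glue_ops {
	int (*version)(void);
};

/*
 * What the shared header should contain: a declaration only.
 * Without 'extern', every .c file including the header would define
 * its own 'glue', which fails to link under -fno-common.
 */
extern const struct glue_ops *glue;

/* What exactly one .c file should contain: the single definition. */
const struct glue_ops *glue;
```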
Thomas Monjalon [Wed, 8 Apr 2020 00:08:58 +0000 (02:08 +0200)]
common/mlx5: split glue initialization
The function mlx5_glue_init was doing three things:
- initialize logs
- load glue library if in dlopen mode
- initialize glue layer
They are split into three functions for clarity.
The config option RTE_IBVERBS_LINK_DLOPEN is not used anymore
outside of make and meson files. It is replaced with MLX5_GLUE,
which is defined in the same condition and is already used with dlopen.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Matan Azrad <matan@mellanox.com>