bus/fslmc: optimize physical to virtual address search
With Hotplugging memory support, the order of memseg has been changed
from physically contiguous to virtual contiguous. FSLMC bus and dpaa2
drivers depend on PA to VA address conversion when in Physical
addressing mode.
This patch creates a list of blocks requested to be pinned to the
DPAA2 mempool. For searching physical addresses, it is expected that
it would belong to this list (from hardware pool) and hence it is
less expensive than memseg walks. Though, this has marginal impact on
performance vis-a-vis legacy mode with physically contiguous memsegs.
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
crypto/dpaa_sec: remove ctx based offset for PA-VA conversion
Crypto requires physical to virtual address conversion for
descriptors. Prior to memory hotplugging this was based on memseg
iteration assuming memsegs are all physical contiguous and using
cached start address fast calculations can be done. This
assumption now stands invalid with memory hotplugging support.
In preparation for supporting hotplugging change to memory,
this patchset removes the optimized pool context stored physical
address offset based PA-VA conversion.
This adversely affects the performance as complete memsegs now need
to be parsed, but a rework containing necessary optimization would be
posted over this.
This patch is to accommodate an experimental feature of mbuf - external
buffer attachment. If mbuf is attached to an external buffer, its ol_flags
will have EXT_ATTACHED_MBUF set. Without enabling/using the feature,
everything remains same.
If PMD delivers Rx packets with non-direct mbuf, ol_flags should not be
overwritten. For mlx5 PMD, if Multi-Packet RQ is enabled, Rx packets could
be carried with externally attached mbufs.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
This patch introduces a new way of attaching an external buffer to a mbuf.
Attaching an external buffer is quite similar to mbuf indirection in
replacing buffer addresses and length of a mbuf, but a few differences:
- When an indirect mbuf is attached, refcnt of the direct mbuf would be
2 as long as the direct mbuf itself isn't freed after the attachment.
In such cases, the buffer area of a direct mbuf must be read-only. But
external buffer has its own refcnt and it starts from 1. Unless
multiple mbufs are attached to a mbuf having an external buffer, the
external buffer is writable.
- There's no need to allocate buffer from a mempool. Any buffer can be
attached with appropriate free callback.
- Smaller metadata is required to maintain shared data such as refcnt.
Signed-off-by: Yongseok Koh <yskoh@mellanox.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Xiao Wang [Wed, 25 Apr 2018 02:18:27 +0000 (10:18 +0800)]
vhost: fix vDPA set features
We should call set_features callback after setting features in virtio_net
structure, otherwise vDPA driver cannot get the right features.
Fixes: 07718b4f87aa ("vhost: adapt library for selective datapath") Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Acked-by: Zhihong Wang <zhihong.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
While the patch did solve concurrency issue, it induces more
pages copies as some clean pages are marked as dirty for
performance reasons. Moreover, as there is no more contention
doing the logging, the rate of packets than can be processed is
higher, leading to even more pages to be dirtied.
It has been reported that with more than one queue pair, and
with a relatively low packet rate (1Mpps), the live migration
never converges until the flow is stopped.
While a better solution is found, it is better to reset to the
old behaviour, i.e. using atomic operation for dirty pages
logging.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Matej Vido [Fri, 27 Apr 2018 08:57:05 +0000 (10:57 +0200)]
doc: update doc and release notes for szedata2 driver
New version of the packages with dependencies for the szedata2
driver is needed due to the new API of the libsze2 library which
is used in the driver.
The documentation and the release notes are updated to contain
the information about the required versions.
Signed-off-by: Matej Vido <vido@cesnet.cz> Acked-by: Jan Remes <remes@netcope.com>
Roman Zhukov [Thu, 19 Apr 2018 11:37:03 +0000 (12:37 +0100)]
net/sfc/base: get max supported value for action MARK
The mark value for MATCH_ACTION_MARK has a maximum value.
Requesting a value larger than the maximum will cause the
filter insertion to fail with EINVAL. This patch allows the
driver to check the value at the filter validation.
Signed-off-by: Roman Zhukov <roman.zhukov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andrew Rybchenko [Thu, 19 Apr 2018 11:37:00 +0000 (12:37 +0100)]
net/sfc: support flow marks in equal stride super-buffer Rx
Equal stride super-buffer Rx mode allows to mark packets in HW
using filters. Process the data on datapath and advertise
corresponding features to allow flow API support to implement it.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:59 +0000 (12:36 +0100)]
net/sfc: add Rx descriptor wait timeout
Add device argument to customize Rx descriptor wait timeout which
is supported in DPDK firmware variant only in equal stride super-buffer
Rx mode only.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:58 +0000 (12:36 +0100)]
net/sfc: support DPDK firmware variant
DPDK firmware variant supports equal stride super-buffer Rx mode which
provides higher packet rate and packet marks but requires dedicated
mempool manager with contiguous object block allocation (e.g. bucket).
Also the firmware supports subvariant without checksumming on Tx which
allows to reach higher packet rates on transmit if checksumming is not
required.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:55 +0000 (12:36 +0100)]
net/sfc: support equal stride super-buffer Rx mode
HW Rx descriptor represents many contiguous packet buffers which
follow each other. Number of buffers, stride and maximum DMA
length are setup-time configurable per Rx queue based on provided
mempool. The mempool must support contiguous block allocation and
get info API to retrieve number of objects in the block.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:53 +0000 (12:36 +0100)]
net/sfc: allow one Rx queue entry carry many packet buffers
One HW Rx descriptor has many packet buffers in the case of equal
stride super-buffer Rx modes. Each packet buffer is still treated
as separate SW Rx descriptor. rxq_entries is the size of HW Rx ring
whereas nb_rx_desc is the number of SW Rx descriptors.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:52 +0000 (12:36 +0100)]
net/sfc: conditionally compile support for tunnel packets
Equal stride super-buffer Rx datapath does not support tunnels, code to
parse tunnel packet types and inner checksum offload is not required and
it is important to be able to compile it out on build time to avoid
extra CPU load.
Cutting of tunnels support relies on compiler optimizaitons to
be able to drop extra checks and branches if tun_ptype is always 0.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:50 +0000 (12:36 +0100)]
net/sfc: prepare EF10 Rx event parser to be reused
Equal stride super-buffer Rx mode will be handled by the dedicated
Rx datapath and the mode has almost the same Rx event structure as
single packet Rx mode.
Restructure the code to allow the common parts to be shared.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:49 +0000 (12:36 +0100)]
net/sfc: factor out function to push Rx doorbell
The function may be shared by different Rx datapath implementations.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:47 +0000 (12:36 +0100)]
net/sfc/base: support equal stride super-buffer Rx mode
Equal stride super-buffer Rx mode is supported by DPDK firmware
variant. One Rx descriptor provides many Rx buffers to firmware.
Rx buffers follow each other with specified stride.
Also it supports head of line blocking with timeout to address
drops when no Rx descriptors are available. So it gives extra time
to the driver to provide Rx descriptors before drop.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andrew Rybchenko [Thu, 19 Apr 2018 11:36:45 +0000 (12:36 +0100)]
net/sfc/base: make RxQ type data an union
The type is an internal interface. Single integer is insufficient
to carry RxQ type-specific information in the case of equal stride
super-buffer Rx mode (packet buffers per bucket, maximum DMA length,
packet stride, head of line block timeout).
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
The PMD_INIT_LOG macro always adds a newline, and other drivers version
of PMD_DRV_LOG always adds a newline. Therefore change nfp driver
to be consitent with others.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Add rte_flow_action_count action data structure to enable shared
counters across multiple flows on a single port or across multiple
flows on multiple ports within the same switch domain. Also this enables
multiple count actions to be specified in a single flow action.
This patch also modifies the existing rte_flow_query API to take the
rte_flow_action structure as an input parameter instead of the
rte_flow_action_type enumeration to allow querying a specific action
from a flow rule when multiple actions of the same type are specified.
This patch also contains updates for the bonding, failsafe and mlx5 PMDs
and testpmd application which are affected by this API change.
Introduces a new action type RTE_FLOW_ITEM_TYPE_MARK which enables
flow patterns to specify arbitrary integer values to match aginst
set by the RTE_FLOW_ACTION_TYPE_MARK action in previously matched
flows.
Add support for specification of new MARK flow item in testpmd's cli.
Update testpmd documentation to describe new MARK flow item support.
Add jump action type which defines an action which allows a matched
flow to be redirect to the specified group. This allows physical and
logical flow table/group hierarchies to be defined through rte_flow.
This breaks ABI compatibility for the following public functions (as it
modifes the ordering of the rte_flow_action_type enumeration):
Add new flow action types and associated action data structures to
support the encapsulation and decapsulation of VXLAN and NVGRE tunnel
endpoints.
The RTE_FLOW_ACTION_TYPE_[VXLAN/NVGRE]_ENCAP action will cause the
matching flow to be encapsulated in the tunnel endpoint overlay
defined in the [vxlan/nvgre]_encap action data.
The RTE_FLOW_ACTION_TYPE_[VXLAN/NVGRE]_DECAP action will cause all
headers associated with the outer most tunnel endpoint of the specified
type for the matching flows.
Andrew Rybchenko [Thu, 26 Apr 2018 16:48:57 +0000 (17:48 +0100)]
net/sfc: do not use RSS context if it is not required
RSS action with only one destination queue and no specific settings
for hash types and key does not require dedicated RSS context and
may be simplified to QUEUE action.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Fix internal report on port specific offload capabilities to be 0 (no
capabilities). Before this commit port capabilities were a clone of queue
capabilities, however the current TAP offload capabilities (e.g.
checksum calculation) are per queue and are not specific per port.
This commit fixes an internal validation check for new configured
queue offloads.
The port capability API keeps reporting all queue capabilities as port
capabilities.
Fixes: 95ae196ae10b ("net/tap: use new Rx offloads API") Fixes: 818fe14a9891 ("net/tap: use new Tx offloads API") Cc: stable@dpdk.org Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Andrew Rybchenko [Wed, 25 Apr 2018 17:18:34 +0000 (18:18 +0100)]
net/sfc: ignore spec bits not covered by mask
mask is a simple bit-mask applied before interpreting the contents
of spec and last.
Fixes: a9825ccf5bb8 ("net/sfc: support flow API filters") Cc: stable@dpdk.org Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Andy Moreton <amoreton@solarflare.com> Reviewed-by: Roman Zhukov <roman.zhukov@oktetlabs.ru>
Add support for virtual function representor ports to the ixgbe PF
driver. When SR-IOV virtual functions devices are enabled a
corresponding representor port for each VF can be enabled in the
process in which the ixgbe PMD is running within, by specifying the
representor devargs with the list of VF ports that representors
are to be created for.
An example of the devargs which would create VF representor for virtual
functions 0,2,4,5,6 and 7 is:
-w DBDF,representor=[0,2,4-7]
Signed-off-by: Declan Doherty <declan.doherty@intel.com> Signed-off-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com> Signed-off-by: Remy Horton <remy.horton@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Add support for virtual function representor ports to the i40e PF
driver. When SR-IOV virtual functions devices are enabled a
corresponding representor port for each VF can be enabled, in the
process in which the i40e PMD is running, by specifying the
representor devargs with the list of VF ports that representors
are to be created for.
An example of the devargs which would create VF representor for virtual
functions 0,2,4,5,6 and 7 is:
-w DBDF,representor=[0,2,4-7]
and to just specify a single representor on virtual function 3 (switch
port id):
-w DBDF,representor=3
Signed-off-by: Declan Doherty <declan.doherty@intel.com> Signed-off-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com> Signed-off-by: Remy Horton <remy.horton@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Introduces a new structure, rte_eth_devargs, to support generic
ethdev arguments common across NET PMDs, with a new API
rte_eth_devargs_parse API to support PMD parsing these arguments. The
patch add support for a representor argument passed with passed with
the EAL -w option. The representor parameter allows the user to specify
which representor ports to initialise on a device.
The argument supports passing a single representor port, a list of
port values or a range of port values.
-w BDF,representor=1 # create representor port 1 on pci device BDF
-w BDF,representor=[1,2,5,6,10] # create representor ports in list
-w BDF,representor=[0-31] # create representor ports in range
Add new device flag to specify that an ethdev port is a port
representor. Extend rte_eth_dev_info structure to expose device flags
to the user which enables applications to discover if a port is a
representor port.
Introduces a new port attribute to ethdev port's which denotes the
switch domain a port belongs to. By default all port's switch
identifiers are set to RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID. Ports
which supported the concept of switch domains can be configured with
the same switch domain id.
Add document to describe the model for representing switching capable
devices in DPDK, using a general ethdev port model and through port
representors. This document also details the port model and the
rte_flow semantics required for flow programming, as well as listing
some example use cases.
Xueming Li [Mon, 23 Apr 2018 12:33:04 +0000 (20:33 +0800)]
net/mlx5: cleanup tunnel checksum offloads
Once tunnel packet type(RTE_PTYPE_TUNNEL_xxx) identified,
PKT_RX_IP_CKSUM_XXX and PKT_RX_L4_CKSUM_XXX represent checksum result of
inner headers, outer L3 and L4 header checksum are always valid as soon
as tunnel identified. If no tunnel identified, PKT_RX_IP_CKSUM_XXX and
PKT_RX_L4_CKSUM_XXX represent checksum result of outer L3 and L4
headers.
Signed-off-by: Xueming Li <xuemingl@mellanox.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Xueming Li [Mon, 23 Apr 2018 12:33:03 +0000 (20:33 +0800)]
net/mlx5: support Rx tunnel type identification
This patch introduced tunnel type identification based on flow rules.
If flows of multiple tunnel types built on same queue, no tunnel type
will be returned. User application could use bits in flow mark as tunnel
type identifier.
Signed-off-by: Xueming Li <xuemingl@mellanox.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Xueming Li [Mon, 23 Apr 2018 12:33:02 +0000 (20:33 +0800)]
net/mlx5: support L3 VXLAN flow
This patch support L3 VXLAN, no inner L2 header comparing to standard
VXLAN protocol. L3 VXLAN using specific overlay UDP destination port to
discriminate against standard VXLAN, device parameter and FW has to be
configured to support it:
sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_EN=1
sudo mlxconfig -d <device> -y s IP_OVER_VXLAN_PORT=<port>
Signed-off-by: Xueming Li <xuemingl@mellanox.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Xueming Li [Mon, 23 Apr 2018 12:33:00 +0000 (20:33 +0800)]
net/mlx5: support 16 hardware priorities
This patch supports new 16 Verbs flow priorities by trying to create a
simple flow of priority 15. If 16 priorities not available, fallback to
traditional 8 priorities.
Verb priority mapping:
8 priorities >=16 priorities
Control flow: 4-7 8-15
User normal flow: 1-3 4-7
User tunnel flow: 0-2 0-3
Signed-off-by: Xueming Li <xuemingl@mellanox.com> Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
After add RSS hash offload check, default rss_hf will fail on
devices that not support all bits, the patch take rss_hf as
a suggest value and only set bits that device supported base on
rte_eth_dev_get_info, also rss_hf will only be updated when new
rss offload is successfully updated on all ports by
"port config all rss [!default]" command.
Fixes: 8863a1fbfc66 ("ethdev: add supported hash function check") Fixes: d9aa619c60b6 ("app/testpmd: new parameter for port config all RSS command") Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com> Tested-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Ivan Malov [Wed, 25 Apr 2018 17:51:44 +0000 (18:51 +0100)]
net/sfc: convert to the advanced EFX RSS interface
The current code has the following drawbacks:
- It is assumed that TCP 4-tuple hash is
always supported, which is untrue in
the case of packed stream FW variant.
- The driver is unaware of UDP hash support
available with latest firmware.
In order to cope with the mentioned issues, this
patch implements the new approach to handle hash
settings using the advanced EFX RSS interface.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Ivan Malov [Wed, 25 Apr 2018 17:51:43 +0000 (18:51 +0100)]
net/sfc: factor out RSS fields from adapter info
RSS handling will need more sophisticated fields
in the adapter context storage in future patches.
This patch groups existing fields in a dedicated
structure and updates the rest of the code.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Ivan Malov [Wed, 25 Apr 2018 17:51:42 +0000 (18:51 +0100)]
net/sfc: remove conditional compilation for RSS
RSS is one of the most valuable features in the
driver, and one would hardly need to disable it
at build time. This patch withdraws unnecessary
conditionals for RSS snippets.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Ivan Malov [Wed, 25 Apr 2018 17:51:41 +0000 (18:51 +0100)]
net/sfc: process RSS settings on Rx configure step
One may submit advanced RSS settings as part of
rte_eth_conf to customise RSS configuration from
the very beginning. Currently the driver does not
check that piece of settings and proceeds with
default choices for RSS hash functions and RSS key.
This patch implements the required processing.
Fixes: 4ec1fc3ba881 ("net/sfc: add basic stubs for RSS support on driver attach") Cc: stable@dpdk.org Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Packed stream firmware variant on EF10 adapters has a
number of properties which must be taken into account:
- Only one exclusive RSS context is available per port.
- Only IP addresses can contribute to the hash value.
Huntington and Medford have one more limitation which
is important for the drivers capable of packed stream:
- Hash algorithm is non-standard (i.e. non-Toeplitz).
This implies XORing together source + destination
IP addresses (or last four bytes in the case of IPv6)
and using the result as the input to a Toeplitz hash.
This patch provides a number of improvements in order
to treat the mentioned limitations in the common code.
If the firmware variant is packed stream, the list of
supported hash tuples will include less variants, and
the maximum number of RSS contexts will be set to one.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Ivan Malov [Wed, 25 Apr 2018 17:51:39 +0000 (18:51 +0100)]
net/sfc/base: support more RSS hash configurations
Modern firmwares on EF10 adapters have support for
more traffic classes eligible for hash computation.
Also, it has become possible to adjust hashing per
individual class and select distinct packet fields
which will be able to contribute to the hash value.
This patch adds support for the mentioned features.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Ivan Malov [Wed, 25 Apr 2018 17:51:38 +0000 (18:51 +0100)]
net/sfc/base: add a new means to control RSS hash
Currently, libefx has no support for additional RSS modes
available with later controllers. In order to support this,
libefx should be able to list available hash configurations.
This patch provides basic infrastructure for the new interface.
The client drivers will be able to query the list of supported
hash configurations for a particular hash algorithm. Also, it
will be possible to configure hashing by means of new definitions.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andrew Rybchenko [Wed, 25 Apr 2018 17:51:37 +0000 (18:51 +0100)]
net/sfc/base: cope with clang warning on negative shift
clang 4.0.1-6 on Ubuntu generates false positive warning that shift
is negative. It is done regardless of the fact that the branch is
not taken because of previous check.
The warning is generate in EFX_INSERT_NATIVE32 used by
EFX_INSERT_FIELD_NATIVE32. All similar cases are fixed as well.
It is undesirable to suppress the warning completely.
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Xueming Li [Mon, 23 Apr 2018 12:16:32 +0000 (20:16 +0800)]
ethdev: introduce new tunnel VXLAN-GPE
VXLAN-GPE enables VXLAN for all protocols. Protocol link:
https://www.ietf.org/id/draft-ietf-nvo3-vxlan-gpe-05.txt
Signed-off-by: Xueming Li <xuemingl@mellanox.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>
RTE_FLOW_ACTION_TYPE_PORT_ID brings the ability to inject matching traffic
into a different device, as identified by its DPDK port ID.
This is normally only supported when the target port ID has some kind of
relationship with the port ID the flow rule is created against, such as
being exposed by a common physical device (e.g. a different port of an
Ethernet switch).
The converse pattern item, RTE_FLOW_ITEM_TYPE_PORT_ID, makes the resulting
flow rule match traffic whose origin is the specified port ID. Note that
specifying a port ID that differs from the one the flow rule is created
against is normally meaningless (if even accepted), but can make sense if
combined with the transfer attribute.
These must not be confused with their PHY_PORT counterparts, which refer to
physical ports using device-specific indices, but unlike PORT_ID are not
necessarily tied to DPDK port IDs.
This breaks ABI compatibility for the following public functions:
This patch adds the missing action counterpart to the PHY_PORT pattern
item, that is, the ability to directly inject matching traffic into a
physical port of the underlying device.
It breaks ABI compatibility for the following public functions:
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com>
While RTE_FLOW_ITEM_TYPE_PORT refers to physical ports of the underlying
device using specific identifiers, these are often confused with DPDK port
IDs exposed to applications in the global name space.
Since this pattern item is seldom used, rename it RTE_FLOW_ITEM_PHY_PORT
for better clarity.
No ABI impact.
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Contrary to all other pattern items, these are inconsistently documented as
affecting traffic instead of simply matching its origin, without provision
for the latter.
This commit clarifies documentation and updates PMDs since the original
behavior now has to be explicitly requested using the new transfer
attribute.
It breaks ABI compatibility for the following public functions:
- rte_flow_create()
- rte_flow_validate()
Impacted PMDs are bnxt and i40e, for which the VF pattern item is now only
supported when a transfer attribute is also present.
This new attribute enables applications to create flow rules that do not
simply match traffic whose origin is specified in the pattern (e.g. some
non-default physical port or VF), but actively affect it by applying the
flow rule at the lowest possible level in the underlying device.
It breaks ABI compatibility for the following public functions:
VLAN TCI is a 16-bit field broken down as PCP (3b), DEI (1b) and VID (12b).
The default mask used by PMDs for the VLAN pattern when one isn't provided
by the application comprises the entire TCI, which is problematic because
most devices only support VID matching.
This forces applications to always provide a mask limited to the VID part
in order to successfully apply a flow rule with a VLAN pattern item.
Moreover, applications rarely want to match PCP and DEI intentionally.
Given the above and since VID is what is commonly referred to when talking
about VLAN, this commit excludes PCP and DEI from the default mask.
Fixes: 6de5c0f1302c ("ethdev: define default item masks in flow API") Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
TPID handling in rte_flow VLAN and E_TAG pattern item definitions is not
consistent with the normal stacking order of pattern items, which is
confusing to applications.
Problem is that when followed by one of these layers, the EtherType field
of the preceding layer keeps its "inner" definition, and the "outer" TPID
is provided by the subsequent layer, the reverse of how a packet looks like
on the wire:
Wire: [ ETH TPID = A | VLAN EtherType = B | B DATA ]
rte_flow: [ ETH EtherType = B | VLAN TPID = A | B DATA ]
Worse, when QinQ is involved, the stacking order of VLAN layers is
unspecified. It is unclear whether it should be reversed (innermost to
outermost) as well given TPID applies to the previous layer:
Wire: [ ETH TPID = A | VLAN TPID = B | VLAN EtherType = C | C DATA ]
rte_flow 1: [ ETH EtherType = C | VLAN TPID = B | VLAN TPID = A | C DATA ]
rte_flow 2: [ ETH EtherType = C | VLAN TPID = A | VLAN TPID = B | C DATA ]
While specifying EtherType/TPID is hopefully rarely necessary, the stacking
order in case of QinQ and the lack of documentation remain an issue.
This patch replaces TPID in the VLAN pattern item with an inner
EtherType/TPID as is usually done everywhere else (e.g. struct vlan_hdr),
clarifies documentation and updates all relevant code.
It breaks ABI compatibility for the following public functions:
Summary of changes for PMDs that implement ETH, VLAN or E_TAG pattern
items:
- bnxt: EtherType matching is supported with and without VLAN, but TPID
matching is not and triggers an error.
- e1000: EtherType matching is only supported with the ETHERTYPE filter,
which does not support VLAN matching, therefore no impact.
- enic: same as bnxt.
- i40e: same as bnxt with existing FDIR limitations on allowed EtherType
values. The remaining filter types (VXLAN, NVGRE, QINQ) do not support
EtherType matching.
- ixgbe: same as e1000, with additional minor change to rely on the new
E-Tag macro definition.
- mlx4: EtherType/TPID matching is not supported, no impact.