Xuan Ding [Mon, 11 Oct 2021 07:59:41 +0000 (07:59 +0000)]
vfio: allow partially unmapping adjacent memory
Currently, if we map a memory area A, then map a separate memory area B
that happens to be adjacent to A, the current implementation merges
these two segments into one. If partial unmapping is not supported,
the merged segment can then only be unmapped in one go. In other words,
given adjacent segments A and B, it is currently not possible to map A,
then map B, then unmap A.
Fix this by adding a notion of "chunk size", which will allow
subdividing segments into equally sized segments whenever we are dealing
with an IOMMU that does not support partial unmapping. With this change,
we will still be able to merge adjacent segments, but only if they are
of the same size. If we keep with our above example, adjacent segments A
and B will be stored as separate segments if they are of different
sizes.
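
The merge rule described above can be sketched as follows. This is a simplified model, not the actual EAL code: `struct seg`, `can_merge()` and the `chunk` field are hypothetical names, and a chunk size of 0 is assumed here to mean that the IOMMU supports partial unmapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a mapped segment; not the real EAL structure. */
struct seg {
	uint64_t iova;  /* start address */
	uint64_t len;   /* total length */
	uint64_t chunk; /* chunk size; 0 assumed to mean "partial unmap supported" */
};

/* Two segments may merge only if they are adjacent and share a chunk size. */
static bool
can_merge(const struct seg *a, const struct seg *b)
{
	assert(a != NULL && b != NULL);
	if (a->iova + a->len != b->iova)
		return false; /* not adjacent */
	return a->chunk == b->chunk;
}
```

With this rule, a 4K segment followed by an adjacent 8K segment stays separate, so each can still be unmapped on its own.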
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Signed-off-by: Xuan Ding <xuan.ding@intel.com> Tested-by: Yvonne Yang <yvonnex.yang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Ivan Malov [Thu, 16 Sep 2021 18:49:55 +0000 (21:49 +0300)]
net/virtio: fix Tx checksum for tunnel packets
Tx prepare method calls rte_net_intel_cksum_prepare(), which
handles tunnel packets correctly, but Tx burst path does not
take tunnel presence into account when computing the offsets.
Fixes: 58169a9c8153 ("net/virtio: support Tx checksum offload") Cc: stable@dpdk.org Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Wenwu Ma [Fri, 24 Sep 2021 17:23:00 +0000 (17:23 +0000)]
examples/vhost: fix use after free on drain
When a vdev is removed in the destroy_device function,
the corresponding vhost TX buffer is also freed,
but the vhost TX buffer may still be used in the
drain_vhost function, which causes a
heap-use-after-free error. Therefore, before accessing
the vhost TX buffer, we need to check whether the vdev
has been removed and, if so, skip this vdev.
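
A minimal sketch of the check, assuming a hypothetical `struct vdev` with a `removed` flag (the example application's real structures and function names differ):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vhost device entry in the example app. */
struct vdev {
	bool removed; /* set when the device is destroyed */
	int flushed;  /* counts how often the Tx buffer was drained */
};

/* Drain Tx buffers, skipping devices that were already removed. */
static int
drain_all(struct vdev *devs, size_t n)
{
	int drained = 0;

	assert(devs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (devs[i].removed)
			continue; /* buffer is gone; touching it would be a use after free */
		devs[i].flushed++;
		drained++;
	}
	return drained;
}
```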
Fixes: a68ba8e0a6b6 ("examples/vhost: refactor vhost data path") Cc: stable@dpdk.org Signed-off-by: Wenwu Ma <wenwux.ma@intel.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Marvin Liu [Sun, 26 Sep 2021 09:28:42 +0000 (17:28 +0800)]
net/virtio: fix oversized packets in vectorized Rx
If the packed ring size is not a power of two, the remaining number of
descriptors can be less than one batch while the batch operation still
passes. This causes an incorrect calculation of the remaining number and
leads to receiving oversized packets. The patch fixes the issue by
adding a check of the remaining number before the batch operation.
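
The added guard can be illustrated with a small model. `BATCH_SIZE` and `split_rx()` are hypothetical names chosen for this sketch; the real vectorized path processes descriptors rather than counters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BATCH_SIZE 4 /* assumed batch width, for illustration only */

/* Count how many descriptors are handled in batches and how many singly. */
static void
split_rx(uint16_t remained, uint16_t *batched, uint16_t *single)
{
	assert(batched != NULL && single != NULL);
	*batched = 0;
	*single = 0;
	while (remained > 0) {
		if (remained >= BATCH_SIZE) { /* the kind of check the fix adds */
			*batched += BATCH_SIZE;
			remained -= BATCH_SIZE;
		} else {
			(*single)++;
			remained--;
		}
	}
}
```

Without the `remained >= BATCH_SIZE` test, a batch could run with fewer than `BATCH_SIZE` descriptors left and the remaining count would be wrong afterwards.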
Fixes: 77d66da83834 ("net/virtio: add vectorized packed ring Rx") Cc: stable@dpdk.org Signed-off-by: Marvin Liu <yong.liu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Xueming Li [Fri, 15 Oct 2021 15:05:45 +0000 (23:05 +0800)]
vdpa/mlx5: retry VAR allocation during vDPA restart
VAR is the device memory space for the virtio queue doorbells.
Qemu can mmap it directly to speed up doorbell pushes.
On a busy system, Qemu takes time to release VAR resources during driver
shutdown. If vDPA is restarted quickly, the VAR allocation fails with
error 28 since the VAR is a singleton resource per device.
This patch adds a retry mechanism for VAR allocation.
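
A retry loop of this kind might look like the sketch below. The callback type, `alloc_with_retry()` and the demo allocator are hypothetical; the sketch assumes that error 28 corresponds to ENOSPC (as it does on Linux) and omits the delay a real driver would insert between attempts.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical allocator callback: returns 0 or a negative errno value. */
typedef int (*var_alloc_fn)(void *ctx);

/* Retry while the resource is still held by the previous owner. */
static int
alloc_with_retry(var_alloc_fn alloc, void *ctx, int max_retries)
{
	int ret;
	int attempt = 0;

	assert(alloc != NULL);
	do {
		ret = alloc(ctx);
		if (ret != -ENOSPC)
			break; /* success or an unrelated error */
		/* A real driver would sleep here before trying again. */
	} while (++attempt <= max_retries);
	return ret;
}

/* Demo allocator: fails with -ENOSPC until the counter in ctx reaches 0. */
static int
busy_then_free(void *ctx)
{
	int *busy = ctx;

	if (*busy > 0) {
		(*busy)--;
		return -ENOSPC;
	}
	return 0;
}
```

The first call is not counted as a retry, so `max_retries = 3` allows up to four allocation attempts in total.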
Fixes: 4cae722c1b06 ("vdpa/mlx5: move virtual doorbell alloc to probe") Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: Matan Azrad <matan@nvidia.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Xueming Li [Fri, 15 Oct 2021 15:05:44 +0000 (23:05 +0800)]
vdpa/mlx5: workaround FW first completion in start
After a vDPA application restart, Qemu restores the VQ with the used and
available indexes, and a new incoming packet triggers the virtio driver
to handle buffers. Under heavy traffic, there is no available buffer for
the firmware to receive new packets, so no Rx interrupts are generated
and the driver is stuck waiting endlessly for an interrupt.
As a firmware workaround, this patch sends a notification after
VQ setup to ask the driver to handle buffers and fill new buffers.
Fixes: bff735011078 ("vdpa/mlx5: prepare virtio queues") Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: Matan Azrad <matan@nvidia.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Ting Xu [Thu, 21 Oct 2021 05:54:07 +0000 (05:54 +0000)]
net/ice: fix TM hierarchy commit flag reset
After DCF commits the TM hierarchy configuration, the commit flag is set
to avoid duplicated commits. But the flag is not reset after device stop,
which prevents updating the hierarchy configuration unless the device is
closed. This is not reasonable. This patch resets the commit flag
after device stop, so users can delete and add nodes to commit a new
TM hierarchy configuration.
Fixes: 3a6bfc37eaf4 ("net/ice: support QoS config VF bandwidth in DCF") Cc: stable@dpdk.org Signed-off-by: Ting Xu <ting.xu@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
William Tu [Wed, 20 Oct 2021 03:47:49 +0000 (20:47 -0700)]
net/e1000: build on Windows
This patch enables building the e1000 driver for Windows.
I tested using two Windows VMs on top of VMware Fusion,
creating two e1000 devices with device ID 0x10D3 (82574L),
verifying that rx/tx works correctly using dpdk-testpmd.exe
in rxonly and txonly modes.
Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Haiyue Wang <haiyue.wang@intel.com> Acked-by: Pallavi Kadam <pallavi.kadam@intel.com> Tested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Tested-by: Pallavi Kadam <pallavi.kadam@intel.com>
Tudor Cornea [Wed, 20 Oct 2021 18:13:46 +0000 (21:13 +0300)]
net/ixgbe: fix port initialization if MTU config fails
On a VMware ESXi 6.0 setup with an Intel 82599 NIC the ports don't
seem to initialize anymore, while running testpmd.
Configuring Port 0 (socket 0)
ixgbevf_dev_rx_init(): Set max packet length to 1518 failed.
ixgbevf_dev_start(): Unable to initialize RX hardware (-22)
Fail to start port 0: Invalid argument
Configuring Port 1 (socket 0)
ixgbevf_dev_rx_init(): Set max packet length to 1518 failed.
ixgbevf_dev_start(): Unable to initialize RX hardware (-22)
Fail to start port 1: Invalid argument
Please stop the ports first
If the call to ixgbevf_rlpml_set_vf fails and we return prematurely,
we will not be able to initialize the ports correctly.
The behavior seems to have changed since the following commit:
Fixes: c77866a16904 ("net/ixgbe: detect failed VF MTU set") Cc: stable@dpdk.org
We can make this particular use case work correctly if we don't
return an error, which seems to be consistent with the overall
kernel ixgbevf implementation.
Rongwei Liu [Thu, 21 Oct 2021 08:56:36 +0000 (11:56 +0300)]
net/mlx5: set Tx queue affinity in round-robin
Previously, we set txq affinity to 0 and let firmware
perform round-robin when bonding. Firmware uses a
global counter to assign txq affinity to different
physical ports according to the remainder after division.
There are three disadvantages:
1. The global counter is shared between kernel and dpdk.
2. After restarting pmd or port, the previous counter value
is reused, so the new affinity is unpredictable.
3. There is no way to get what affinity is set by firmware.
In this update, we will create several TISs up to the
number of bonding ports and bind each TIS to one PF port.
For each port, it will start to pick up TIS using its port
index. Upper layer application can quickly calculate each txq's
affinity without querying.
At DPDK layer, when creating txq with 2 bonding ports, the
affinity is set like:
port 0: 1-->2-->1-->2
port 1: 2-->1-->2-->1
port 2: 1-->2-->1-->2
Note: Only applicable to DevX api.
This affinity is subject to HW hash.
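
The table above is consistent with a simple modular formula. The sketch below is inferred from that table only and is not taken from the PMD source; `txq_affinity()` and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed formula, inferred from the table in the commit message:
 * affinities are 1-based and each port starts at an offset given by
 * its own port index.
 */
static uint8_t
txq_affinity(uint16_t port_idx, uint16_t txq_idx, uint8_t n_bond_ports)
{
	assert(n_bond_ports > 0);
	return (uint8_t)((port_idx + txq_idx) % n_bond_ports + 1);
}
```

For two bonding ports this gives 1, 2, 1, 2 for the queues of port 0 and 2, 1, 2, 1 for the queues of port 1, which an application can compute without querying the device.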
Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
Dmitry Kozlyuk [Thu, 14 Oct 2021 08:55:28 +0000 (11:55 +0300)]
net/mlx5: close tools socket with last device
MLX5 PMD exposes a socket for external tools to dump port state.
Socket events are listened using an interrupt source of EXT type.
The socket was closed and the interrupt callback was unregistered
at program exit, which is incorrect because DPDK could be already
shut down at this point. Move actions performed at program exit
to the moment the last MLX5 port is closed. The socket will be opened
again if later a new MLX5 device is plugged in and probed.
Also fix comments that were misleadingly talking
about secondary processes instead of external tools.
Fixes: e6cdc54cc0ef ("net/mlx5: add socket server for external tools") Cc: stable@dpdk.org Reported-by: Harman Kalra <hkalra@marvell.com> Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>
Dmitry Kozlyuk [Mon, 18 Oct 2021 17:24:56 +0000 (20:24 +0300)]
net/mlx5: fix Rx queue resource cleanup
mlx5_rxq_start() allocates rxq_ctrl->obj and frees it on failure,
but did not set it to NULL. Later mlx5_rxq_release() could not recognize
this object is already freed and attempted to release its resources,
resulting in a crash:
Configuring Port 0 (socket 0)
mlx5_common: Failed to create RQ using DevX
mlx5_common: Can't create DevX RQ object.
mlx5_net: Port 0 Rx queue 0 RQ creation failure.
Segmentation fault
Set rxq_ctrl->obj to NULL after it is freed to skip resource release.
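
The pattern behind the fix is shown below in a reduced form. The structure and functions are hypothetical stand-ins that only mirror the names in the commit message; they are not the PMD code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, reduced stand-in for the Rx queue control structure. */
struct rxq_ctrl {
	void *obj;
};

/* Allocate the object; hw_fails simulates a failed RQ creation. */
static int
rxq_start(struct rxq_ctrl *ctrl, int hw_fails)
{
	assert(ctrl != NULL);
	ctrl->obj = malloc(16);
	if (ctrl->obj == NULL)
		return -1;
	if (hw_fails) {
		free(ctrl->obj);
		ctrl->obj = NULL; /* the fix: do not leave a dangling pointer */
		return -1;
	}
	return 0;
}

/* Release only touches the object when it still exists. */
static void
rxq_release(struct rxq_ctrl *ctrl)
{
	assert(ctrl != NULL);
	if (ctrl->obj == NULL)
		return; /* nothing to release */
	free(ctrl->obj);
	ctrl->obj = NULL;
}
```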
Bing Zhao [Mon, 18 Oct 2021 13:53:05 +0000 (16:53 +0300)]
net/mlx5: fix meter yellow policy with RSS action
The RSS configuration in a policy action container was a pointer
inside a union, and the pointer area could be used by another fate
action. In the current implementation, the RSS of the green color
took precedence over that of the yellow color. There was a high
possibility that the pointer was treated as the RSS, resulting in an
erroneous flow expansion when only the yellow color had the RSS action.
The fate action type should also be checked to avoid this
misjudgment.
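
The underlying issue is the usual tagged-union rule: a union member is meaningful only when the tag says so. The types below are hypothetical and far simpler than the PMD's policy structures; they only illustrate the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fate action kinds and container; not the PMD's real types. */
enum fate { FATE_NONE, FATE_QUEUE, FATE_RSS };

struct policy_act {
	enum fate fate;
	union {
		const void *rss; /* valid only when fate == FATE_RSS */
		uintptr_t queue; /* shares storage with rss */
	} u;
};

/* Report RSS only when the fate type says the union holds an RSS pointer. */
static bool
has_rss(const struct policy_act *act)
{
	assert(act != NULL);
	return act->fate == FATE_RSS && act->u.rss != NULL;
}
```

Testing only `u.rss != NULL` would misread a green queue action as RSS, because the queue value occupies the same storage.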
Fixes: b38a12272b3a ("net/mlx5: split meter color policy handling") Cc: stable@dpdk.org Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
Xueming Li [Tue, 19 Oct 2021 10:35:00 +0000 (18:35 +0800)]
net/mlx5: enable DevX Tx queue creation
Verbs API does not support Infiniband device port numbers larger than
255 by design. To support more representors on a single Infiniband
device, DevX API should be engaged.
While creating Send Queue (SQ) object with Verbs API, the PMD assigned
IB device port attribute and kernel created the default miss flows in
FDB domain, to redirect egress traffic from the queue being created to
the appropriate representor peer (wire, HPF, VF or SF).
With DevX API there is no IB-device port attribute (it is merely kernel
one, DevX operates in PRM terms) and PMD must create default miss flows
in FDB explicitly. PMD did not provide this and using DevX API for
E-Switch configurations was disabled.
The default miss FDB flow matches E-Switch manager vport (to make sure
the source is some representor) and SQn (Send Queue number - device
internal queue index). The root flow table is managed by kernel/firmware
and does not support the vport redirect action, so we have to split the
default miss flow into two:
- flow with lowest priority in the root table that matches E-Switch
manager vport ID and jumps to group 1.
- flow in group 1 that matches E-Switch manager vport ID and SQn and
forwards the packet to the peer vport.
Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Xueming Li [Tue, 19 Oct 2021 10:34:59 +0000 (18:34 +0800)]
net/mlx5: fix internal root table flow priority
When creating an internal transfer flow on the root table with lowest
priority, the flow was created with UINT32_MAX priority. This is wrong
since the flow is created in kernel and the max priority supported is 16.
This patch fixes this by adding an internal flow check.
Xueming Li [Tue, 19 Oct 2021 10:34:57 +0000 (18:34 +0800)]
net/mlx5: support E-Switch manager egress traffic match
For egress packet on representor, the vport ID in transport domain
is E-Switch manager vport ID since representor shares resources of
E-Switch manager. E-Switch manager vport ID and Tx queue internal device
index are used to match representor egress packet.
This patch adds flow item port ID match on E-Switch manager.
E-Switch manager vport ID is 0xfffe on BlueField, 0 otherwise.
Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Xueming Li [Tue, 19 Oct 2021 10:34:56 +0000 (18:34 +0800)]
net/mlx5: improve Verbs flow priority discovery
To detect the number of Verbs flow priorities, the PMD tries to create
Verbs flows with different priorities. However, Verbs is not designed to
support ports larger than 255.
When DevX is supported by the kernel driver, 16 Verbs priorities must be
supported, so there is no need to create Verbs flows.
Signed-off-by: Xueming Li <xuemingl@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Jie Wang [Thu, 21 Oct 2021 10:49:22 +0000 (18:49 +0800)]
ethdev: support L2TPv2 and PPP protocol
Added flow pattern items and header formats of L2TPv2 and PPP.
Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com> Signed-off-by: Jie Wang <jie1x.wang@intel.com> Acked-by: Ori Kam <orika@nvidia.com> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Wed, 20 Oct 2021 12:47:24 +0000 (15:47 +0300)]
ethdev: fix ID spelling in comments and log messages
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Wed, 20 Oct 2021 12:47:23 +0000 (15:47 +0300)]
ethdev: fix VLAN spelling including VLAN ID case
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Wed, 20 Oct 2021 12:47:19 +0000 (15:47 +0300)]
ethdev: avoid documentation in next lines
Documentation in the next separate line is confusing. If documentation
requires its own line, it should be before, not after.
Move documentation to the previous line if documentation on the same
line makes it too long.
Fix a number of incorrect markups on the way.
When a line is touched by the patch anyway, make other cosmetic
changes to avoid changes in later patches.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Acked-by: Ori Kam <orika@nvidia.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
David Marchand [Sat, 23 Oct 2021 06:29:34 +0000 (08:29 +0200)]
dma/idxd: fix build on Windows
Windows compilation gives us a splat:
In file included from ../drivers/dma/idxd/idxd_pci.c:10:
In file included from ..\drivers\dma\idxd/idxd_internal.h:11:
..\drivers\dma\idxd/idxd_hw_defs.h:46:21: error: expected member name or
';' after declaration specifiers
uint16_t __reserved[13];
~~~~~~~~ ^
1 error generated.
Ironically, __reserved is probably a reserved token.
Some drivers that build fine on Windows have structs with a "reserved"
field, let's go with this.
Fixes: 82147042d062 ("dma/idxd: add datapath structures") Signed-off-by: David Marchand <david.marchand@redhat.com>
Dmitry Kozlyuk [Thu, 7 Oct 2021 22:10:28 +0000 (01:10 +0300)]
cmdline: make struct rdline opaque
Hide struct rdline definition and some RDLINE_* constants in order
to be able to change internal buffer sizes transparently to the user.
Add new functions:
* rdline_new(): allocate and initialize struct rdline.
This function replaces rdline_init() and takes an extra parameter:
opaque user data for the callbacks.
* rdline_free(): deallocate struct rdline.
* rdline_get_history_buffer_size(): for use in tests.
* rdline_get_opaque(): to obtain user data in callback functions.
Remove rdline_init() function from library headers and export list,
because using it requires the knowledge of sizeof(struct rdline).
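
The general opaque-handle pattern behind this change can be sketched as follows. The names `struct line`, `line_new()`, `line_get_opaque()` and `line_free()` are hypothetical and deliberately differ from the cmdline API, whose exact signatures are not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>

/* In a public header only this forward declaration would be visible. */
struct line;

/* In the library source the layout stays private and can change freely. */
struct line {
	char buf[64]; /* internal size hidden from users */
	void *opaque; /* user data handed back to callbacks */
};

/* Allocate and initialize; callers never need sizeof(struct line). */
static struct line *
line_new(void *opaque)
{
	struct line *l = calloc(1, sizeof(*l));

	if (l != NULL)
		l->opaque = opaque;
	return l;
}

/* Accessor used by callbacks to reach the user data. */
static void *
line_get_opaque(const struct line *l)
{
	assert(l != NULL);
	return l->opaque;
}

static void
line_free(struct line *l)
{
	free(l);
}
```

Because users only ever hold a pointer, the buffer size inside the structure can change without breaking their code.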
Bruce Richardson [Wed, 20 Oct 2021 11:25:54 +0000 (12:25 +0100)]
build/windows: remove separate list of libs
Rather than maintaining a separate list of libraries which are to be
built on Windows, use the standard library list and explicitly add to
each library that is not to be built a check for Windows and disable
the library at that per-lib level. As well as shortening the main
lib/meson.build file, this also leads to the build summary at the end of
the meson config run correctly listing the libraries which are not to be
built.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Wed, 20 Oct 2021 11:25:53 +0000 (12:25 +0100)]
dmadev: enable build on Windows
The dmadev library was not added to the list of libraries built on
Windows, meaning it was skipped in those builds and also that none of
the drivers were being considered for build. Adding dmadev to the list
fixes this, and also enables the skeleton dmadev driver to be built,
albeit with a small fix necessary.
Conor Walsh [Mon, 18 Oct 2021 12:38:27 +0000 (12:38 +0000)]
dma/ioat: add configuration
Add functions for device configuration. The info_get and close functions
are included here also. info_get can be useful for checking successful
configuration and close is used by the dmadev api when releasing a
configured device.
Signed-off-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
Conor Walsh [Mon, 18 Oct 2021 12:38:25 +0000 (12:38 +0000)]
dma/ioat: create dmadev instances on PCI probe
When a suitable device is found during the PCI probe, create a dmadev
instance for each channel. Internal structures and HW definitions required
for device creation are also included.
Signed-off-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
Thomas Monjalon [Mon, 18 Oct 2021 09:55:58 +0000 (11:55 +0200)]
devtools: fix letter case check in commit title
The prefix (before the colon) of the title is lowercase.
The check of uppercase/lowercase in the commit title
was supposed to apply after the colon,
but some greps were not limited to the exact word.
So in the case of "test/dma: add basic dmadev instance tests",
the lowercase word "dmadev" was wrongly suggested to be uppercase.
The words of the dictionary must be filtered as whole word
with the grep option -w.
Fixes: d448efa259e9 ("devtools: export dictionary for commit title check") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Kevin Laatz [Wed, 20 Oct 2021 16:30:13 +0000 (16:30 +0000)]
usertools/devbind: move idxd device ID to DMA class
The dmadev library is the preferred abstraction for using IDXD devices and
will replace the rawdev implementation in future. This patch moves the IDXD
device ID to the dmadev class.
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Conor Walsh [Wed, 20 Oct 2021 16:30:11 +0000 (16:30 +0000)]
dma/idxd: move config script from raw driver
Move the example script for configuring IDXD devices bound to the IDXD
kernel driver from raw to dma, and create a symlink to still allow use from
raw.
Signed-off-by: Conor Walsh <conor.walsh@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Kevin Laatz [Wed, 20 Oct 2021 16:30:10 +0000 (16:30 +0000)]
dma/idxd: add burst capacity
Add support for the burst capacity API. This API will provide the calling
application with the remaining capacity of the current burst (limited by
max HW batch size).
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com>
Kevin Laatz [Wed, 20 Oct 2021 16:30:09 +0000 (16:30 +0000)]
dma/idxd: add vchan status
When testing dmadev drivers, it is useful to have the HW device in a known
state. This patch adds the implementation of the function which will wait
for the device to be idle (all jobs completed) before proceeding.
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Kevin Laatz [Wed, 20 Oct 2021 16:30:05 +0000 (16:30 +0000)]
dma/idxd: add start and stop for PCI devices
Add device start/stop functions for DSA devices bound to vfio. For devices
bound to the IDXD kernel driver, these are not required since the IDXD
kernel driver takes care of this.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Kevin Laatz [Wed, 20 Oct 2021 16:30:01 +0000 (16:30 +0000)]
dma/idxd: create dmadev instances on bus probe
When a suitable device is found during the bus scan/probe, create a dmadev
instance for each HW queue. Internal structures required for device
creation are also added.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Kevin Laatz [Wed, 20 Oct 2021 16:30:00 +0000 (16:30 +0000)]
dma/idxd: add bus device probing
Add the basic device probing for DSA devices bound to the IDXD kernel
driver. These devices can be configured via sysfs and made available to
DPDK if they are found during bus scan. Relevant documentation is included.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Kevin Laatz [Wed, 20 Oct 2021 16:29:59 +0000 (16:29 +0000)]
dma/idxd: add skeleton for VFIO based DSA device
Add the basic device probe/remove skeleton code for DSA device bound to
the vfio pci driver. Relevant documentation and MAINTAINERS update also
included.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Bruce Richardson [Wed, 20 Oct 2021 16:29:58 +0000 (16:29 +0000)]
raw/ioat: build only if dmadev not present
Only build the rawdev IDXD/IOAT drivers if the dmadev drivers are not
present.
This change requires the dependencies to be reordered in
drivers/meson.build so that rawdev can use the "RTE_DMA_*" build macros to
check for the presence of the equivalent dmadev driver.
A note is also added to the documentation to inform users of this change.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com> Reviewed-by: Conor Walsh <conor.walsh@intel.com>
This is a new packet capture application to replace existing pdump.
The new application works like Wireshark dumpcap program and supports
the pdump API features.
It is not complete yet; some features such as filtering are not implemented.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This enhances the DPDK pdump library to support the new
pcapng format and filtering via BPF.
The internal client/server protocol is changed to support
two versions: the original pdump basic version and a
new pcapng version.
The internal version number (not part of exposed API or ABI)
is intentionally increased to cause any attempt to try
mismatched primary/secondary process to fail.
Add new API to allow filtering of captured packets with
DPDK BPF (eBPF) filter program. It keeps statistics
on packets captured, filtered, and missed (because ring was full).
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Reshma Pattan <reshma.pattan@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>
When debugging converted (and other) programs it is useful
to see disassembled eBPF output.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>
bpf: add function to convert classic BPF to DPDK BPF
The pcap library emits classic BPF (32 bit) and is useful for
creating filter programs. The DPDK BPF library only implements
extended BPF (eBPF). Add a function to convert from old to
new.
The rte_bpf_convert function uses rte_malloc to put the resulting
program in hugepage shared memory so it can be passed from a
secondary process to a primary process.
The code to convert was originally done as part of the Linux
kernel implementation then converted to a userspace program.
See https://github.com/tklauser/filter2xdp
Both authors have agreed that it is allowable to create a modified
version of this code and license it with BSD license used by DPDK.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>
Some BPF programs may use XOR of a register with itself
as a way to zero the register in one instruction.
The BPF filter converter generates this in the prolog
of the generated code.
The BPF validator would not allow this because the value of the
register was undefined. But after this operation it is always zero.
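
The special case can be modelled as below. This is a sketch of the idea only: `struct reg_state` and `eval_xor()` are hypothetical and much simpler than the validator in the DPDK BPF library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-register state tracked by a validator. */
struct reg_state {
	bool defined;
	uint64_t value; /* meaningful only when defined */
};

/* Evaluate "dst ^= src" for register indexes dst and src. */
static bool
eval_xor(struct reg_state *regs, unsigned int dst, unsigned int src)
{
	assert(regs != NULL);
	if (dst == src) {
		/* x ^ x is 0 for any x, so an undefined input is harmless. */
		regs[dst].defined = true;
		regs[dst].value = 0;
		return true;
	}
	if (!regs[dst].defined || !regs[src].defined)
		return false; /* result would depend on an undefined value */
	regs[dst].value ^= regs[src].value;
	return true;
}
```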
Fixes: 8021917293d0 ("bpf: add extra validation for input BPF program") Cc: stable@dpdk.org Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
This is a utility library for writing pcapng format files
used by the Wireshark family of utilities. Older tcpdump
also knows how to read (but not write) this format.
See
https://github.com/pcapng/pcapng/
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Reshma Pattan <reshma.pattan@intel.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>
The current version of the pdump library was building on
Windows, but it was useless since the pdump utility was not being
built and Windows does not have multi-process support.
The new version of pdump with filtering now has a dependency
on bpf, but the bpf library is not available on Windows.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Pravin Pathak [Thu, 14 Oct 2021 14:51:41 +0000 (14:51 +0000)]
event/dlb2: optimize credit allocations using port
This commit implements the changes required for using the suggested
port type hint feature. Each port uses a different credit quanta
based on the port type specified using port configuration flags.
Each port has a separate quanta defined in dlb2_priv.h.
Producer and consumer ports need a larger quanta value to reduce the
number of credit calls they make. Workers can use a small quanta as they
mostly work out of locally cached credits and don't request/return
credits often.
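
A minimal sketch of selecting a quanta from the port hints follows. The flag bits and quanta values are illustrative stand-ins, not the eventdev macros or the values in dlb2_priv.h, and the sketch assumes producers and consumers share one large value.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the eventdev hint flags; bit positions are illustrative. */
#define HINT_PRODUCER (1u << 0)
#define HINT_CONSUMER (1u << 1)
#define HINT_WORKER   (1u << 2)

/* Illustrative quanta; the real values live in dlb2_priv.h. */
#define QUANTA_LARGE   256
#define QUANTA_SMALL   32
#define QUANTA_DEFAULT 64

static_assert(QUANTA_LARGE > QUANTA_SMALL, "large quanta must exceed small");

/* Pick a credit quanta from the hint bits in the port configuration. */
static int
credit_quanta(uint32_t port_cfg)
{
	if (port_cfg & (HINT_PRODUCER | HINT_CONSUMER))
		return QUANTA_LARGE; /* fewer credit requests per event */
	if (port_cfg & HINT_WORKER)
		return QUANTA_SMALL; /* mostly uses locally cached credits */
	return QUANTA_DEFAULT;   /* no hint given */
}
```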
Harry van Haaren [Thu, 14 Oct 2021 14:51:39 +0000 (14:51 +0000)]
examples/eventdev_pipeline: use port config hints
This commit adds the per-port hints added to the eventdev API, indicating
which eventdev ports will be used for producing, forwarding, or consuming
events from the system.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
Harry van Haaren [Thu, 14 Oct 2021 14:51:38 +0000 (14:51 +0000)]
eventdev: add usage hints to port configure API
This commit introduces 3 flags to the port configuration flags.
These flags allow the application to indicate what type of work
is expected to be performed by an eventdev port.
The three new flags are
- RTE_EVENT_PORT_CFG_HINT_PRODUCER (mostly RTE_EVENT_OP_NEW events)
- RTE_EVENT_PORT_CFG_HINT_CONSUMER (mostly RTE_EVENT_OP_RELEASE events)
- RTE_EVENT_PORT_CFG_HINT_WORKER (mostly RTE_EVENT_OP_FORWARD events)
These flags are only hints, and the PMDs must operate under the
assumption that any port can enqueue an event with any type of op.
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
Added changes to receive packets as event vector. By default this is
disabled and can be enabled using the option --event-vector. Vector
size and timeout to form the vector can be configured using options
--event-vector-size and --event-vector-tmo.
When a poll queue is removed from a rx_adapter instance, the WRR poll
array is recomputed. The wrr array length is reduced in this case. The
next wrr position to poll is stored in wrr_pos variable of rx_adapter
instance. This wrr_pos can become invalid in some cases after wrr is
recomputed. Using this variable to get the next queue and device pair
may lead to wrr buffer overruns.
Resetting the wrr_pos to zero after recomputation of wrr array fixes
the buffer overrun issue.
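
The effect of the reset can be seen in a reduced model of the poll state. `struct wrr`, `wrr_replace()` and `wrr_next()` are hypothetical names; only the `wrr_pos` idea is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced view of the adapter's WRR state. */
struct wrr {
	const int *sched; /* poll schedule */
	size_t len;       /* number of valid entries in sched */
	size_t pos;       /* next entry to poll */
};

/* Install a recomputed schedule and restart polling from its beginning. */
static void
wrr_replace(struct wrr *w, const int *sched, size_t len)
{
	assert(w != NULL);
	w->sched = sched;
	w->len = len;
	w->pos = 0; /* the old position may be past the new, shorter array */
}

/* Return the next queue to poll, or -1 if the schedule is empty. */
static int
wrr_next(struct wrr *w)
{
	int q;

	assert(w != NULL);
	if (w->len == 0)
		return -1;
	q = w->sched[w->pos];
	w->pos = (w->pos + 1) % w->len;
	return q;
}
```

If `pos` were kept at 3 after the schedule shrinks to two entries, the next read would index past the end of the array.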
Fixes: 9c38b704d280 ("eventdev: add eth Rx adapter implementation") Cc: stable@dpdk.org Signed-off-by: Naga Harish K S V <s.v.naga.harish.k@intel.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Rashmi Shetty [Fri, 15 Oct 2021 15:18:53 +0000 (10:18 -0500)]
app/eventdev: support burst enqueue
Introduce a new command line option prod_enq_burst_sz
to set burst size for eventdev enqueue at producer in perf_queue
test. The newly added function perf_producer_burst is called when
prod_enq_burst_sz is greater than 1.
Harry van Haaren [Thu, 14 Oct 2021 09:54:44 +0000 (09:54 +0000)]
app/eventdev: fix terminal colour after control-c exit
Before this commit, a Control^C exit of the test-eventdev application
would print the worker packet percentages, and leave the terminal with
a green colour despite the colour reset being issued after the newline.
By moving the colour reset command before the \n the issue is fixed.
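
The ordering can be illustrated with a formatting helper. The macro names, message text and `format_stat()` are assumptions made for this sketch; only the placement of the reset sequence before the newline reflects the commit.

```c
#include <assert.h>
#include <stdio.h>

#define CLGRN "\x1b[32m" /* ANSI green */
#define CLNRM "\x1b[0m"  /* ANSI reset */

/* Format one stats line with the reset placed before the newline. */
static int
format_stat(char *out, size_t len, unsigned int worker, double pct)
{
	assert(out != NULL);
	return snprintf(out, len,
			CLGRN "Worker %u packets: %3.2f percent" CLNRM "\n",
			worker, pct);
}
```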
Fixes: 6b1a14a83a06 ("app/eventdev: add packet distribution logs") Cc: stable@dpdk.org Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Pavan Nikhilesh [Mon, 18 Oct 2021 23:36:09 +0000 (05:06 +0530)]
eventdev: mark trace variables as internal
Mark rte_trace global variables as internal i.e. remove them
from experimental section of version map.
Some of them are used in inline APIs, mark those as global.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>