Igor Romanov [Tue, 7 Jul 2020 10:45:25 +0000 (11:45 +0100)]
test/service: check active state on two lcores
The test checks that the rte_service_may_be_active() API works
when there are two cores: a non-service lcore and a service one.
The API notes to take care when checking the status of a running
service, but the test setup allows for a safe usage in that case.
Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
To call the rte_rawdev_info_get() function, the user currently has to know
the underlying type of the device in order to pass an appropriate structure
or buffer as the dev_private pointer in the info structure. By allowing a
NULL value for this field, we can skip getting the device-specific info and
just return the generic info - including the device name and driver, which
can be used to determine the device type - to the user.
This ensures that basic info can be obtained for all rawdevs, without knowing
the type, and even if the info driver API call has not been implemented for
the device.
Cc: stable@dpdk.org Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
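The NULL dev_private behaviour described above can be sketched as follows. This is a minimal standalone illustration, not the DPDK implementation: the demo_* structures, the callback signature and the error codes are all hypothetical stand-ins.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the rawdev structures; not the DPDK definitions. */
struct demo_rawdev_info {
	const char *driver_name;
	const char *device_name;
	void *dev_private;	/* NULL means "generic info only" */
};

struct demo_rawdev {
	const char *driver_name;
	const char *device_name;
	/* Optional driver callback that fills the device-specific buffer. */
	int (*info_get)(const struct demo_rawdev *dev, void *dev_private);
};

/* Always fill the generic fields; call the driver only if a buffer was given. */
int demo_rawdev_info_get(const struct demo_rawdev *dev,
			 struct demo_rawdev_info *info)
{
	int ret;

	if (dev == NULL || info == NULL)
		return -1;

	if (info->dev_private != NULL) {
		/* Device-specific info was requested. */
		if (dev->info_get == NULL)
			return -2;	/* driver does not implement it */
		ret = dev->info_get(dev, info->dev_private);
		if (ret != 0)
			return ret;
	}

	info->driver_name = dev->driver_name;
	info->device_name = dev->device_name;
	return 0;
}
```

A caller that does not know the device type would pass dev_private = NULL, read driver_name, and only then decide which device-specific structure to use in a second call.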
This commit fixes the setting of relative rpath on dpdk-test for
drivers ($libdir/dpdk/pmd-$abiver) to the correct absolute rpath
($prefix$libdir/dpdk/pmd-$abiver).
Fixes: b5dc795a8a55 ("test: build app with meson as dpdk-test") Cc: stable@dpdk.org Signed-off-by: Timothy Redaelli <tredaelli@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Haiyue Wang [Fri, 3 Jul 2020 14:57:18 +0000 (22:57 +0800)]
vfio: support VF token
The Linux kernel module vfio-pci introduces the VF token to enable
SR-IOV support since 5.7.
The VF token can be set by a vfio-pci based PF driver and must be known
by the vfio-pci based VF driver in order to gain access to the device.
Since the vfio-pci module uses the VF token as internal data to provide
the collaboration between SR-IOV PF and VFs, DPDK can use the same
VF token for all PF devices by specifying the related EAL option.
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Andrew Rybchenko <arybchenko@solarflare.com> Tested-by: Harman Kalra <hkalra@marvell.com>
Haiyue Wang [Fri, 3 Jul 2020 14:57:17 +0000 (22:57 +0800)]
eal: fix uuid header dependencies
Add the dependent header files explicitly, so that the user just needs
to include the 'rte_uuid.h' header file directly to avoid compile error:
(1). rte_uuid.h:97:55: error: unknown type name ‘size_t’
(2). rte_uuid.h:58:2: error: implicit declaration of function ‘memcpy’
Fixes: 6bc67c497a51 ("eal: add uuid API") Cc: stable@dpdk.org Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> Acked-by: David Marchand <david.marchand@redhat.com>
Yunjian Wang [Sat, 16 May 2020 07:58:39 +0000 (15:58 +0800)]
vfio: remove unused variable
The 'group_status' has never been used and can be removed.
Fixes: 94c0776b1bad ("vfio: support hotplug") Cc: stable@dpdk.org Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Reviewed-by: David Marchand <david.marchand@redhat.com>
If rte_lcore_index() is asked to give the index of the
current lcore (argument -1) and is called from a non-EAL thread,
then it would return an invalid result. The result would come from
lcore_config[-1].core_index, which is some other data in the
per-thread area.
The resolution is to return -1 which is what rte_lcore_index()
returns if handed an invalid lcore.
The same issue existed with rte_lcore_to_cpu_id().
Bugzilla ID: 446 Fixes: 26cc3bbe4dc0 ("eal: add lcore accessors") Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: David Marchand <david.marchand@redhat.com>
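The guard described above can be sketched in isolation. This is only an illustration under assumptions: the demo_* names, the table size and the thread-local variable are hypothetical, and the real EAL data structures differ.

```c
#define DEMO_MAX_LCORE 128
#define DEMO_LCORE_ID_ANY (-1)	/* hypothetical id seen by a non-EAL thread */

struct demo_lcore_config {
	int core_index;
};

static struct demo_lcore_config demo_lcore_config[DEMO_MAX_LCORE];

/* Id of the calling thread; stays at -1 for a thread EAL did not create. */
static _Thread_local int demo_current_lcore = DEMO_LCORE_ID_ANY;

void demo_register_lcore(int lcore_id, int core_index)
{
	if (lcore_id >= 0 && lcore_id < DEMO_MAX_LCORE)
		demo_lcore_config[lcore_id].core_index = core_index;
}

void demo_set_current_lcore(int lcore_id)
{
	demo_current_lcore = lcore_id;
}

/* Returns the index of the lcore, or -1 if the lcore id is invalid. */
int demo_lcore_index(int lcore_id)
{
	if (lcore_id >= DEMO_MAX_LCORE)
		return -1;

	if (lcore_id < 0) {
		/* A negative argument means "the current lcore". */
		lcore_id = demo_current_lcore;
		/* Non-EAL thread: never read demo_lcore_config[-1]. */
		if (lcore_id < 0 || lcore_id >= DEMO_MAX_LCORE)
			return -1;
	}

	return demo_lcore_config[lcore_id].core_index;
}
```

The point of the second range check is that the substituted id must be validated as well, since it may itself be -1.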
eal/armv8: fix timer frequency calibration with PMU
get_tsc_freq uses 'nanosleep' system call to calculate the CPU
frequency. However, 'nanosleep' results in the process getting
un-scheduled. The kernel saves and restores the PMU state. This
ensures that the PMU cycles are not counted towards a sleeping
process. When RTE_ARM_EAL_RDTSC_USE_PMU is defined, this results
in incorrect CPU frequency calculation. This logic is replaced
with a generic counter based loop.
Bugzilla ID: 450 Fixes: f91bcbb2d9a6 ("eal/armv8: use high-resolution cycle counter") Cc: stable@dpdk.org Signed-off-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Dharmik Thakkar <dharmik.thakkar@arm.com> Reviewed-by: Phil Yang <phil.yang@arm.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
David Marchand [Fri, 26 Jun 2020 08:16:36 +0000 (10:16 +0200)]
build: remove special versioning for non stable libraries
Having a special versioning for experimental/internal libraries puts an
additional maintenance cost while this status is already announced in
MAINTAINERS and the library headers/documentation.
Following discussions and vote at 05/20 TB meeting [1], use a single
versioning for all libraries in DPDK.
Note: for the ABI check, an exception [2] had been added when tweaking
this special versioning [3].
Prefer explicit libabigail rules (which will be dropped in 20.11).
Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Ray Kinsella <mdr@ashroe.eu> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Some EAL functions are used by the mempool lib but are not exported on Windows.
This patch exports those functions and adds mempool to the libraries
supported for Windows compilation.
Alan Dewar [Thu, 25 Jun 2020 09:59:30 +0000 (10:59 +0100)]
sched: fix port time rounding
The QoS scheduler works off port time that is computed from the number
of CPU cycles that have elapsed since the last time the port was
polled. It divides the number of elapsed cycles to calculate how
many bytes can be sent, however this division can generate rounding
errors, where some fraction of a byte sent may be lost.
Lose enough of these fractional bytes and the QoS scheduler
underperforms. The problem is worse with low bandwidths.
To compensate for this rounding error, this fix doesn't advance the
port's time_cpu_cycles by the number of cycles that have elapsed,
but by the computed number of bytes that can be sent
(which has been rounded down) multiplied by the number of cycles per byte.
This will mean that port's time_cpu_cycles will lag behind the CPU
cycles momentarily. At the next poll, the lag will be taken into
account.
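The effect of the fix can be shown with a small worked example. This is a simplified sketch, not the scheduler code: the demo_port structure is hypothetical and cycles_per_byte is an integer here purely to keep the arithmetic easy to follow.

```c
#include <stdint.h>

struct demo_port {
	uint64_t time_cpu_cycles;	/* CPU time the port has accounted for */
	uint64_t time_bytes;		/* port time expressed in bytes */
	uint64_t cycles_per_byte;	/* integer for simplicity in this sketch */
};

/* Old behaviour: jump to 'now', silently dropping the fractional byte. */
void demo_port_sync_lossy(struct demo_port *p, uint64_t now)
{
	uint64_t cycles = now - p->time_cpu_cycles;
	uint64_t bytes = cycles / p->cycles_per_byte;	/* rounds down */

	p->time_cpu_cycles = now;
	p->time_bytes += bytes;
}

/* Fixed behaviour: advance only by the cycles the whole bytes account for. */
void demo_port_sync_exact(struct demo_port *p, uint64_t now)
{
	uint64_t cycles = now - p->time_cpu_cycles;
	uint64_t bytes = cycles / p->cycles_per_byte;	/* rounds down */

	/* The remainder stays "unspent" and is picked up at the next poll. */
	p->time_cpu_cycles += bytes * p->cycles_per_byte;
	p->time_bytes += bytes;
}
```

With 10 cycles per byte and a poll every 15 cycles, four polls cover 60 cycles. The lossy version credits 1 byte per poll (4 bytes in total), while the exact version credits 1, 2, 1 and 2 bytes (6 bytes in total), which matches 60 / 10.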
Ori Kam [Mon, 6 Jul 2020 17:36:48 +0000 (17:36 +0000)]
regexdev: add core functions
This commit introduces the API that is needed by the RegEx devices in
order to work with the RegEx lib.
During the probe of a RegEx device, the device should configure itself,
and allocate the resources it requires.
On completion of the device init, it should call the
rte_regex_dev_register in order to register itself as a RegEx device.
Signed-off-by: Ori Kam <orika@mellanox.com> Signed-off-by: Parav Pandit <parav@mellanox.com> Acked-by: Guy Kaneti <guyk@marvell.com>
Jerin Jacob [Mon, 6 Jul 2020 17:36:46 +0000 (17:36 +0000)]
regexdev: introduce API
As RegEx usage becomes more common in DPDK applications, for example:
* Next Generation Firewalls (NGFW)
* Deep Packet and Flow Inspection (DPI)
* Intrusion Prevention Systems (IPS)
* DDoS Mitigation
* Network Monitoring
* Data Loss Prevention (DLP)
* Smart NICs
* Grammar based content processing
* URL, spam and adware filtering
* Advanced auditing and policing of user/application security policies
* Financial data mining - parsing of streamed financial feeds
* Application recognition.
* Memory introspection.
* Natural Language Processing (NLP)
* Sentiment Analysis.
* Big data database acceleration.
* Computational storage.
a number of PMD providers have started to work on HW implementations,
alongside SW implementations.
This lib adds support for these kinds of devices.
The RegEx Device API is composed of two parts:
- The application-oriented RegEx API that includes functions to setup
a RegEx device (configure it, setup its queue pairs and start it),
update the rule database and so on.
- The driver-oriented RegEx API that exports a function allowing
a RegEx Poll Mode Driver (PMD) to simultaneously register itself as
a RegEx device driver.
RegEx: A regular expression is a concise and flexible means for matching
strings of text, such as particular characters, words, or patterns of
characters. A common abbreviation for this is "RegEx".
RegEx device: A hardware or software-based implementation of RegEx
device API for PCRE based pattern matching syntax and semantics.
PCRE RegEx syntax and semantics specification:
http://regexkit.sourceforge.net/Documentation/pcre/pcrepattern.html
RegEx queue pair: Each RegEx device should have one or more queue pairs to
transmit a burst of pattern matching requests and receive a burst of
pattern matching responses. The pattern matching
request/response is embedded in the *rte_regex_ops* structure.
Rule: A pattern matching rule expressed in PCRE RegEx syntax along with
Match ID and Group ID to identify the rule upon the match.
Rule database: The RegEx device accepts regular expressions and converts
them into a compiled rule database that can then be used to scan data.
Compilation allows the device to analyze the given pattern(s) and
pre-determine how to scan for these patterns in an optimized fashion that
would be far too expensive to compute at run-time. A rule database
contains a set of rules that are compiled in device specific binary form.
Match ID or Rule ID: A unique identifier provided at the time of rule
creation for the application to identify the rule upon match.
Group ID: Rules can be grouped under one group ID to enable
rule isolation and effective pattern matching. A unique group identifier
is provided at the time of rule creation for the application to identify
the rule upon match.
Scan: A pattern matching request through *enqueue* API.
It is possible that a given RegEx device may not support all the
features of PCRE. The application may probe unsupported features through
struct rte_regexdev_info::pcre_unsup_flags.
By default, all the functions of the RegEx Device API exported by a PMD
are lock-free functions which are assumed not to be invoked in parallel on
different logical cores to work on the same target object. For instance,
the dequeue function of a PMD cannot be invoked in parallel on two logical
cores to operate on the same RegEx queue pair. Of course, this function
can be invoked in parallel by different logical cores on different queue
pairs. It is the responsibility of the upper level application to
enforce this rule.
In all functions of the RegEx API, the RegEx device is
designated by an integer >= 0 named the device identifier *dev_id*.
At the RegEx driver level, RegEx devices are represented by a generic
data structure of type *rte_regexdev*.
RegEx devices are dynamically registered during the PCI/SoC device
probing phase performed at EAL initialization time.
When a RegEx device is being probed, a *rte_regexdev* structure and
a new device identifier are allocated for that device. Then, the
regexdev_init() function supplied by the RegEx driver matching the
probed device is invoked to properly initialize the device.
The role of the device init function consists of resetting the hardware
or software RegEx driver implementations.
If the device init operation is successful, the correspondence between
the device identifier assigned to the new device and its associated
*rte_regexdev* structure is effectively registered.
Otherwise, both the *rte_regexdev* structure and the device identifier
are freed.
The functions exported by the application RegEx API to setup a device
designated by its device identifier must be invoked in the following
order:
- rte_regexdev_configure()
- rte_regexdev_queue_pair_setup()
- rte_regexdev_start()
Then, the application can invoke, in any order, the functions
exported by the RegEx API to enqueue pattern matching jobs, dequeue
pattern matching responses, get the stats, update the rule database,
get/set device attributes and so on.
If the application wants to change the configuration (i.e. call
rte_regexdev_configure() or rte_regexdev_queue_pair_setup()), it must
call rte_regexdev_stop() first to stop the device and then do the
reconfiguration before calling rte_regexdev_start() again. The enqueue and
dequeue functions should not be invoked when the device is stopped.
Finally, an application can close a RegEx device by invoking the
rte_regexdev_close() function.
Each function of the application RegEx API invokes a specific function
of the PMD that controls the target device designated by its device
identifier.
For this purpose, all device-specific functions of a RegEx driver are
supplied through a set of pointers contained in a generic structure of
type *regexdev_ops*.
The address of the *regexdev_ops* structure is stored in the
*rte_regexdev* structure by the device init function of the RegEx driver,
which is invoked during the PCI/SoC device probing phase, as explained
earlier.
In other words, each function of the RegEx API simply retrieves the
*rte_regexdev* structure associated with the device identifier and
performs an indirect invocation of the corresponding driver function
supplied in the *regexdev_ops* structure of the *rte_regexdev*
structure.
For performance reasons, the address of the fast-path functions of the
RegEx driver is not contained in the *regexdev_ops* structure.
Instead, they are directly stored at the beginning of the *rte_regexdev*
structure to avoid an extra indirect memory access during their
invocation.
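The split between slow-path operations reached through an ops structure and fast-path pointers stored directly in the device structure can be sketched as below. This is a standalone illustration under assumptions: every demo_* name, the burst signature and the toy driver are hypothetical and do not reproduce the actual *rte_regexdev* or *regexdev_ops* layouts.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_DEVS 4

struct demo_regexdev;

/* Fast-path function type (hypothetical signature). */
typedef uint16_t (*demo_burst_fn)(struct demo_regexdev *dev, uint16_t nb_ops);

/* Slow-path operations, reached through one extra pointer. */
struct demo_regexdev_ops {
	int (*dev_start)(struct demo_regexdev *dev);
	int (*dev_stop)(struct demo_regexdev *dev);
};

struct demo_regexdev {
	demo_burst_fn enqueue;	/* fast path: stored directly in the device */
	demo_burst_fn dequeue;
	const struct demo_regexdev_ops *dev_ops;	/* slow path */
	int started;
};

/* dev_id -> device structure, filled in at probe time. */
static struct demo_regexdev *demo_devs[DEMO_MAX_DEVS];

/* A toy driver, standing in for a real PMD. */
static int demo_drv_start(struct demo_regexdev *dev)
{
	dev->started = 1;
	return 0;
}

static int demo_drv_stop(struct demo_regexdev *dev)
{
	dev->started = 0;
	return 0;
}

static uint16_t demo_drv_burst(struct demo_regexdev *dev, uint16_t nb_ops)
{
	return dev->started ? nb_ops : 0;
}

static const struct demo_regexdev_ops demo_drv_ops = {
	.dev_start = demo_drv_start,
	.dev_stop = demo_drv_stop,
};

static struct demo_regexdev demo_drv_dev;

/* Device init: store the ops address and fast-path pointers, then register. */
int demo_driver_probe(uint8_t dev_id)
{
	if (dev_id >= DEMO_MAX_DEVS || demo_devs[dev_id] != NULL)
		return -1;
	demo_drv_dev.enqueue = demo_drv_burst;
	demo_drv_dev.dequeue = demo_drv_burst;
	demo_drv_dev.dev_ops = &demo_drv_ops;
	demo_drv_dev.started = 0;
	demo_devs[dev_id] = &demo_drv_dev;
	return 0;
}

/* Slow path: look up the device, then go through dev_ops. */
int demo_regexdev_start(uint8_t dev_id)
{
	struct demo_regexdev *dev;

	if (dev_id >= DEMO_MAX_DEVS || demo_devs[dev_id] == NULL)
		return -1;
	dev = demo_devs[dev_id];
	if (dev->dev_ops == NULL || dev->dev_ops->dev_start == NULL)
		return -2;
	return dev->dev_ops->dev_start(dev);
}

/* Fast path: no ops-table hop and no checks; dev_id must be valid. */
uint16_t demo_regexdev_enqueue_burst(uint8_t dev_id, uint16_t nb_ops)
{
	struct demo_regexdev *dev = demo_devs[dev_id];

	return dev->enqueue(dev, nb_ops);
}
```

The fast-path wrapper deliberately skips validation in this sketch, which mirrors the trade-off the text describes: one less memory access per burst at the cost of trusting the caller.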
RTE RegEx device drivers do not use interrupts for enqueue or dequeue
operation. Instead, RegEx drivers export Poll-Mode enqueue and dequeue
functions to applications.
The *enqueue* operation submits a burst of RegEx pattern matching
request to the RegEx device and the *dequeue* operation gets a burst of
pattern matching response for the ones submitted through *enqueue*
operation.
Typical application utilisation of the RegEx device API follows this
programming flow:
- rte_regexdev_configure()
- rte_regexdev_queue_pair_setup()
- rte_regexdev_rule_db_update() Needs to be invoked if a precompiled rule
database is not provided in rte_regexdev_config::rule_db for
rte_regexdev_configure() and/or the application needs to update the rule
database.
- rte_regexdev_rule_db_compile_activate() Needs to be invoked if the
rte_regexdev_rule_db_update function was used.
- Create or reuse an existing mempool for *rte_regex_ops* objects.
- rte_regexdev_start()
- rte_regexdev_enqueue_burst()
- rte_regexdev_dequeue_burst()
Signed-off-by: Jerin Jacob <jerinj@marvell.com> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Signed-off-by: Ori Kam <orika@mellanox.com>
David Marchand [Mon, 6 Jul 2020 08:00:22 +0000 (10:00 +0200)]
devtools: fix check of variable declaration inside for
An expression with a space is split by the awk script, resulting in
false positives for any patch matching either of the two parts of the
expression.
Fix this by using [[:space:]].
Pavan Nikhilesh [Mon, 29 Jun 2020 01:33:28 +0000 (07:03 +0530)]
event/octeontx2: improve datapath memory locality
When an event device transmits a packet on OCTEONTX2, it needs to access
the destination ethernet device's TXq data.
Currently, we get the TXq data through rte_eth_devices global array.
Instead save the TXq address inside event port memory.
Pavan Nikhilesh [Mon, 29 Jun 2020 01:33:27 +0000 (07:03 +0530)]
event/octeontx2: fix sub event type
In OCTEONTX2 event device we use sub_event_type to store the ethernet
port identifier when we receive work from OCTEONTX2 ethernet device.
This violates the event device spec as sub_event_type should be 0 in
the initial receive stage.
Set sub_event_type to 0 after copying the port id.
Harry van Haaren [Tue, 16 Jun 2020 16:56:03 +0000 (17:56 +0100)]
examples/eventdev: fix 32-bit coremask
This commit fixes a bug in 32-bit environments when a core mask wider
than 32 bits is requested. The fix is to convert the bitmask logic to
64 bits, aligning the 64-bit and 32-bit implementations.
Fixes: adb5d548 ("examples/eventdev_pipeline_sw_pmd: add sample app") Cc: stable@dpdk.org Reported-by: Jun W Zhou <junx.w.zhou@intel.com> Suggested-by: Mao Jiang <maox.jiang@intel.com> Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
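The kind of 64-bit mask handling this fix refers to can be sketched as follows. The helper names are hypothetical and this is not the example application's code; it only shows why the constant being shifted must be 64 bits wide.

```c
#include <stdint.h>

/* Set the bit for 'core' in a 64-bit core mask. */
uint64_t demo_mask_set_core(uint64_t mask, unsigned int core)
{
	if (core >= 64)
		return mask;
	/* UINT64_C(1) keeps the shift 64 bits wide even where long is 32 bits. */
	return mask | (UINT64_C(1) << core);
}

/* Returns 1 if 'core' is set in the mask, 0 otherwise. */
int demo_core_in_mask(uint64_t mask, unsigned int core)
{
	if (core >= 64)
		return 0;
	return (mask & (UINT64_C(1) << core)) != 0;
}
```

Writing the shift as 1UL << core instead would be undefined for core >= 32 on a target where unsigned long is 32 bits, which is the class of problem a 32-bit build can hit.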
Harman Kalra [Fri, 15 May 2020 11:21:24 +0000 (16:51 +0530)]
event/octeontx: fix memory corruption
Since the PMD enqueues a single event at a time, fix the issue by
passing 1 rather than nb_events to avoid any out-of-bounds access, as
reported by Coverity.
Bruce Richardson [Fri, 26 Jun 2020 14:59:57 +0000 (15:59 +0100)]
eal: restrict default plugin path to shared lib mode
When using statically linked DPDK binaries, the EAL checks the default PMD
path and tries to load any drivers there, despite the fact that all drivers
are normally linked into the binary. This behaviour can cause issues if
the PMD path and lib dir are configured to a non-standard location which is
not in the ld.so.conf paths, e.g. a build with prefix set to a home
directory location. In a case such as this, EAL will try and
(unnecessarily) load the .so driver files but that load will fail as their
dependent libraries, such as ethdev, for example, will not be found.
Because of this, it is better if statically linked DPDK apps do not load
drivers from the standard paths automatically. The user can always have
this behaviour by explicitly specifying the path using -d flag, if so
desired.
Not loading the libraries automatically can also prevent potential issues
with a user building and running a statically-linked DPDK binary based off
a private copy of DPDK, while there exists on the same machine a
system-wide installation of DPDK in the default locations. Without this
change, the system-installed drivers will be loaded to the binary alongside
the statically-linked drivers, which is not what the user would have
intended.
To detect whether we are in a statically or dynamically linked binary, we
can have EAL try to get a dlopen handle to its own shared library, by
calling dlopen with the RTLD_NOLOAD flag. This will return NULL if there is
no such shared lib loaded i.e. the code is executing from a static library,
or a handle to the lib if it is loaded.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Tested-by: Sunil Pai G <sunil.pai.g@intel.com>
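The RTLD_NOLOAD probe described above can be sketched as a small helper. The function name is hypothetical, and the library name a caller passes is up to the application; RTLD_NOLOAD itself is an extension provided by glibc and some other C libraries, and older glibc versions may need -ldl at link time.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Returns 1 if 'soname' is already mapped into this process, 0 otherwise. */
int demo_shared_lib_is_loaded(const char *soname)
{
	/* RTLD_NOLOAD: do not load anything, only return a handle if resident. */
	void *handle = dlopen(soname, RTLD_LAZY | RTLD_NOLOAD);

	if (handle == NULL)
		return 0;	/* not loaded, e.g. code came from a static lib */

	dlclose(handle);	/* drop the reference dlopen() just took */
	return 1;
}
```

A library that probes for its own shared object name this way gets 0 when it was linked statically, and could then skip scanning the default driver directory.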
When loading a directory of drivers, we check the same hierarchy multiple
times. If we just cache the last directory checked, this avoids repeated
checks of the same path, since all drivers in that path have been added to
the list consecutively.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Any paths on the system which are world-writable are insecure and should
not be used for loading drivers. Therefore, whenever an absolute or
relative driver path is passed to EAL, check for world-writability and
don't load any drivers from that path if it is insecure. Drivers loaded
from system locations i.e. those passed without any path info and found
automatically by the loader, are excluded from these checks as system paths
are assumed to be secure.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
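A world-writability test of the kind described above can be sketched with stat(). This is a simplified illustration with hypothetical function names: it inspects only the given path, whereas a complete check would presumably also look at the parent directories.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Returns 1 if the "other" write bit is set in the mode. */
int demo_mode_is_world_writable(mode_t mode)
{
	return (mode & S_IWOTH) != 0;
}

/*
 * Returns 1 if the path may be used, 0 if it is world-writable,
 * and -1 if it cannot be examined at all.
 */
int demo_path_is_secure(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	return demo_mode_is_world_writable(st.st_mode) ? 0 : 1;
}
```

Treating a failed stat() as a distinct result lets the caller decide whether an unreadable path should be rejected or merely reported.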
eal: load only shared libs from driver plugin directories
When we pass a "-d" flag to EAL pointing to a directory, we attempt to load
all files in that directory as driver plugins, irrespective of file type.
This precludes using e.g. the build/drivers directory as a driver source,
since it contains static libs and other files as well as the shared
objects.
By filtering out any files whose filename does not end in ".so", we can
improve usability by allowing other non-driver files to be present in the
driver directory.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
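The filename filter described above amounts to a suffix comparison. A minimal sketch, with a hypothetical helper name:

```c
#include <string.h>

/* Returns 1 if 'name' ends in ".so", 0 otherwise. */
int demo_is_shared_object(const char *name)
{
	static const char suffix[] = ".so";
	size_t nlen = strlen(name);
	size_t slen = sizeof(suffix) - 1;

	/* Compare only the tail of the name against the suffix. */
	return nlen >= slen && strcmp(name + nlen - slen, suffix) == 0;
}
```

Note that a versioned name such as "libfoo.so.20" does not end in ".so" and is therefore skipped by this sketch, as are static libraries and other build artefacts.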
eal: remove unnecessary null-termination in plugin path
Since strlcpy always null-terminates, and the buffer is zeroed before copy
anyway, there is no need to explicitly zero the end of the character
array, or to limit the bytes that strlcpy can write.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Rather than checking the binutils version number, which can lead to
unnecessary disabling of AVX512 if fixes have been backported to distro
versions, we can instead check the output of "as" from binutils to see if
it is correct.
The check in the script uses the minimal assembly reproduction code posted
to the public bug tracker for gcc/binutils for those issues [1]. If the
binutils bug is present, the instruction parameters - specifically the
displacement parameter - will be different in the disassembled output
compared to the input. Therefore the check involves assembling a single
instruction and disassembling it again, checking that the two match.
When building with meson, the default size of virtual address space
reserved for mapping pages was globally set at 512GB, which is too big for
use in 32-bit processes. To match the behaviour with "make", we configure
this to be 512GB for 64-bit and 2GB for 32-bit builds.
examples/l2fwd: add forwarding port mapping option
Current l2fwd application statically configures adjacent ports as
destination ports for forwarding the traffic.
Add a portmap option to pass the forwarding port pair mapping which allows
the user to configure forwarding port mapping.
If no portmap argument is specified, destination port map is not
changed and traffic gets forwarded with existing mapping.
To align port/queue configuration of each lcore with destination port
map, port/queue configuration of each lcore gets modified when portmap
option is specified.
With above portmap option, traffic received from portid = 0 gets forwarded
to port = 3 and vice versa, similarly traffic gets forwarded on other port
pairs (1,4) and (2,5)
Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Andrzej Ostruszka <aostruszka@marvell.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
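The pairing described above, (0,3), (1,4) and (2,5), can be turned into a destination table with a small parser. This is a sketch under assumptions: the textual format of the option argument, the function name and the table size are all hypothetical and are not taken from the l2fwd sources.

```c
#include <stdio.h>

#define DEMO_MAX_PORTS 32

/*
 * Parse a string of "(a,b)" pairs, optionally separated by commas, and
 * record each pair in both directions. Returns the number of pairs
 * parsed, or -1 on a malformed or out-of-range entry.
 */
int demo_parse_portmap(const char *arg, int dst[DEMO_MAX_PORTS])
{
	int pairs = 0;
	int a, b, n;

	while (*arg != '\0') {
		n = 0;
		/* %n stays 0 unless the closing ')' was matched. */
		if (sscanf(arg, " (%d,%d)%n", &a, &b, &n) != 2 || n == 0)
			return -1;
		if (a < 0 || b < 0 || a >= DEMO_MAX_PORTS || b >= DEMO_MAX_PORTS)
			return -1;
		dst[a] = b;	/* traffic received on a goes out on b */
		dst[b] = a;	/* and vice versa */
		pairs++;
		arg += n;
		if (*arg == ',')
			arg++;
	}
	return pairs;
}
```

Storing both directions at parse time keeps the forwarding lookup a single array access per packet.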
Thomas Monjalon [Wed, 1 Jul 2020 07:31:34 +0000 (09:31 +0200)]
build: remove special handling for node library
The node library had a need of being linked as a whole
to make some constructors effective.
Now that all libraries are linked with --whole-archive,
there is no need to have this library separate.
Fixes: e2db26f76673 ("build: always link whole DPDK static libraries") Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Tested-by: Jerin Jacob <jerinj@marvell.com>
Rework test code to reduce code complexity for the compiler and
bring down compilation time and memory consumption.
The current test_ring_enqueue/test_ring_dequeue functions contain
too many branches and it takes the compiler a lot of effort to resolve all
of them at compile time.
So the patch replaces these branchy function invocations
with an array of function pointers (test_enqdeq_impl[]).
That way the compiler knows straightaway which function to use
for each particular case.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
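The table-of-function-pointers approach can be illustrated with a generic sketch. The names and the three toy enqueue flavours below are hypothetical; they are not the contents of test_enqdeq_impl[], only an example of replacing a branchy wrapper with an indexed dispatch.

```c
#include <stddef.h>

enum demo_api {
	DEMO_API_SINGLE,
	DEMO_API_BULK,
	DEMO_API_BURST,
	DEMO_API_COUNT
};

/* Each flavour returns how many of 'n' objects fit into 'free_slots'. */
typedef unsigned int (*demo_enq_fn)(unsigned int free_slots, unsigned int n);

static unsigned int demo_enq_single(unsigned int free_slots, unsigned int n)
{
	(void)n;
	return free_slots > 0 ? 1 : 0;
}

static unsigned int demo_enq_bulk(unsigned int free_slots, unsigned int n)
{
	return n <= free_slots ? n : 0;	/* all or nothing */
}

static unsigned int demo_enq_burst(unsigned int free_slots, unsigned int n)
{
	return n <= free_slots ? n : free_slots;	/* as many as fit */
}

/* One entry per flavour: selection is an index, not a chain of branches. */
static const demo_enq_fn demo_enq_impl[DEMO_API_COUNT] = {
	[DEMO_API_SINGLE] = demo_enq_single,
	[DEMO_API_BULK] = demo_enq_bulk,
	[DEMO_API_BURST] = demo_enq_burst,
};

unsigned int demo_enqueue(enum demo_api api, unsigned int free_slots,
			  unsigned int n)
{
	if ((unsigned int)api >= DEMO_API_COUNT)
		return 0;
	return demo_enq_impl[api](free_slots, n);
}
```

Each small function is compiled once on its own, instead of the compiler having to specialise one large inlined wrapper for every combination of flags.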
Thomas Monjalon [Sun, 24 May 2020 17:43:41 +0000 (19:43 +0200)]
devtools: remove useless files from ABI reference
When building an ABI reference with meson, some static libraries
are built and linked in apps. They are useless and take a lot of space.
Those binaries, and other useless files (examples and doc files)
in the share/ directory, are removed after being installed.
In order to save time when building the ABI reference,
the examples (which are not installed anyway) are not compiled.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: David Marchand <david.marchand@redhat.com>
Thomas Monjalon [Sun, 24 May 2020 17:30:07 +0000 (19:30 +0200)]
devtools: forbid variable declaration inside for
Some compilers raise an error when declaring a variable
in the middle of a function. This is a C99 allowance.
Even if DPDK switches globally to C99 or C11 standard,
the coding rules are for declarations at the beginning
of a block:
http://doc.dpdk.org/guides/contributing/coding_style.html#local-variables
This coding style is enforced by adding a check of
the common patterns like "for (int i;"
The occurrences of the checked pattern are fixed:
'for *(\(char\|u\?int\|unsigned\|s\?size_t\)'
In the file dpaa2_sparser.c, the fix is to remove the unused macros.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: David Marchand <david.marchand@redhat.com>
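For illustration, the style the check enforces looks like the following. The function is a hypothetical example; the form that would be flagged is shown only in a comment.

```c
#include <stddef.h>

/*
 * Flagged by the check (declaration inside the for statement):
 *     for (size_t i = 0; i < n; i++)
 * Accepted: declare the loop variable at the beginning of the block.
 */
unsigned int demo_sum(const unsigned int *v, size_t n)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += v[i];
	return total;
}
```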
Bruce Richardson [Tue, 30 Jun 2020 14:14:33 +0000 (15:14 +0100)]
build/pkg-config: prevent overlinking
Add the --as-needed linker flag to the DPDK library list in the pkg-config
file so as to prevent overlinking. Without this flag, when linking
statically using flags from $(pkg-config --static --libs libdpdk), all DPDK
drivers and libs were statically linked in, but the binary was also
requiring all the shared versions be present to run.
The real root-cause of this issue is that the DPDK libraries need to be
duplicated in the linker command when doing static linking, due to the
behaviour of pkg-config, but since that behaviour cannot be easily changed,
this is a simple workaround to avoid problems.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org> Acked-by: Sunil Pai G <sunil.pai.g@intel.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Bruce Richardson [Tue, 30 Jun 2020 14:14:32 +0000 (15:14 +0100)]
build/pkg-config: improve static linking flags
Rather than setting -Bstatic in the linker flags when doing a static link,
and then having to explicitly set -Bdynamic again afterwards, we can update
the pkg-config file to use -l:libfoo.a syntax to explicitly refer to the
static library in question. Since this syntax is not supported by meson's
pkg-config module directly, we can post-process the .pc files instead to
adjust them.
Once done, we can simplify the examples' makefiles and the docs by removing
the explicit static flag.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org> Acked-by: Sunil Pai G <sunil.pai.g@intel.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Bruce Richardson [Tue, 30 Jun 2020 14:14:31 +0000 (15:14 +0100)]
build/pkg-config: output drivers first for static build
When calling pkg-config --static --libs, pkg-config will always output the
regular libs first, and then the extra libs from Libs.private field,
since the assumption is that those are additional dependencies for building
statically that the .a files depend upon.
However, for DPDK, we only link the driver files for static builds, and
those need to come *before* the regular libraries. To get this result, we
need two pkgconfig files for DPDK, one for the shared libs, and a second
for the static libs and drivers, which depends upon the first. Using a
dependency means that the shared libs are printed only after the
Libs.private field rather than before.
Without this patch, the linking works in DPDK because in all cases we
specify the libraries after the drivers in the Libs.private line, ensuring
that the references to the libs from the drivers can be resolved. The
current output is therefore of the form, "(shared)libs, drivers,
(static)libs", while after this patch the output is, "drivers,
(static)libs, (shared)libs". The former case will not work if we use the
--whole-archive flag on the static libs as it will lead to duplicate
definitions due to some references having been previously resolved from the
shared libraries. By ensuring the shared libraries come last in the link
line, this issue does not occur, as duplicate references when linking the
shared libs will be ignored.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org> Acked-by: Sunil Pai G <sunil.pai.g@intel.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Bruce Richardson [Tue, 30 Jun 2020 14:14:30 +0000 (15:14 +0100)]
build/pkg-config: move pkg-config file creation
Ahead of changes to rework the file, move the pkg-config file generation to
a new directory under buildtools. This allows the meson code to be
separated out from the main meson.build for simplicity, and also allows any
additional scripts for working with the pkg-config files to be placed there
too.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org> Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
Bruce Richardson [Tue, 30 Jun 2020 14:14:29 +0000 (15:14 +0100)]
devtools: test static linkage with pkg-config
The pkg-config file was tested by building some of the examples using make,
pulling the cflags and ldflags from the pkg-config file for DPDK. However,
this only tested the shared library linkage, and not the static, so this
patch updates it to test both.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org> Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
Bruce Richardson [Tue, 30 Jun 2020 14:14:28 +0000 (15:14 +0100)]
build: remove unnecessary variable
Since all libraries are explicitly linked as part of a build, we no longer
need to track ones that should be always included for linking against apps.
Previously telemetry was special-cased for linking as it was not directly
needed by the linker when linking the apps, since they never called into it
directly. This meant that it could be forgotten when specifying the app
dependencies, and so the telemetry support would not work. This
special-casing was never needed for make as it always linked in all
libraries, as meson does now.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org> Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
Bruce Richardson [Tue, 30 Jun 2020 14:14:27 +0000 (15:14 +0100)]
build: always link whole DPDK static libraries
To ensure all constructors are included in static build, we need to pass
the --whole-archive flag when linking, which is used with the
"link_whole" meson option. Since we use link_whole for all libs, we no
longer need to track the lib as part of the static dependency, just the
path to the headers for compiling.
After this patch is applied, all DPDK .a files are inside
--whole-archive/--no-whole-archive flags, but external dependencies and
shared libs being linked against remain outside.
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Tested-by: Andrzej Ostruszka <aostruszka@marvell.com> Acked-by: Luca Boccassi <bluca@debian.org> Acked-by: Sunil Pai G <sunil.pai.g@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>
Bruce Richardson [Wed, 27 May 2020 14:57:45 +0000 (15:57 +0100)]
test: fix build with ring PMD but no bond PMD
If the bonding PMD is disabled, all autotests associated with it should be
disabled. However, some of those tests also depended upon the ring PMD so
were placed in a block depending on that driver - and unfortunately that
driver alone. This caused build failures if the ring PMD was enabled but
the bonding PMD disabled, due to missing header files and driver libs.
This error can be reproduced by configuring DPDK using e.g.
meson configure -Ddisable_drivers=net/[!r]* build
(which will disable all drivers not starting with "r"), and then building
using ninja.
Fix this by moving all link bonding autotests to the one block and putting
a second conditional check within that block for those also requiring the
ring PMD.
Fixes: 7f6ef1664027 ("test/bonding: allow disabling driver") Fixes: 207b1c813f39 ("test: fix build without ring PMD") Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>
Feifei Wang [Mon, 8 Jun 2020 05:58:46 +0000 (13:58 +0800)]
test/ring: fix statistics in bulk enq/dequeue
In the size 32 bulk ring enq/dequeue performance test, the "Total count"
statistic is incorrect. For example, running the test on lcore 25 and
lcore 26, the output is as follows:
The test command:
$ sudo ./arm64-armv8a-linuxapp-gcc/app/test -l 25-26
RTE>>ring_perf_autotest
Fixes: 759cf9b5632c ("test/ring: enhance mp/mc coverage") Cc: stable@dpdk.org Signed-off-by: Feifei Wang <feifei.wang2@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Phil Yang <phil.yang@arm.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
The ring used in copy mode should be multi-producer multi-consumer
because enqueues and dequeues to the ring are performed on both the rx
and tx paths, which can be running on different threads.
Fixes: 489e0b5b3320 ("net/af_xdp: use single producer/consumer ring") Cc: stable@dpdk.org Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Ciara Loftus [Tue, 23 Jun 2020 14:29:25 +0000 (14:29 +0000)]
net/af_xdp: improve packet loss
This commit makes some changes to the AF_XDP PMD in an effort to improve
its packet loss characteristics.
1. In the case of failed transmission due to inability to reserve a tx
descriptor, the PMD now pulls from the completion ring, issues a
syscall in which the kernel attempts to complete outstanding tx
operations, then tries to reserve the tx descriptor again. Prior to
this we dropped the packet after the syscall and didn't try to
re-reserve.
2. During completion ring cleanup, always pull as many entries as
possible from the ring as opposed to the batch size or just how many
packets we're going to attempt to send. Keeping the completion ring
emptier should reduce failed transmissions in the kernel, as the
kernel requires space in the completion ring to successfully tx.
3. Size the fill ring as twice the receive ring size which may help
reduce allocation failures in the driver.
4. Emulate a tx_free_thresh - when the number of available entries in
the completion ring rises above this, we pull from it. The threshold
is set to 1k entries.
With these changes, a benchmark which measured the packet rate at which
0.01% packet loss could be reached improved from ~0.1 Gbps to ~3 Gbps.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
Matan Azrad [Mon, 29 Jun 2020 14:08:20 +0000 (14:08 +0000)]
vhost: notify virtq file descriptor update
When the call or kick file descriptor of a virtq is changed in the device
configuration while the queue is ready, the application and the vDPA
driver should be notified so that they can switch to the new file descriptor.
Report the queue state as disabled before the file descriptor update and
as enabled again after the update.
Matan Azrad [Mon, 29 Jun 2020 14:01:56 +0000 (14:01 +0000)]
vdpa/mlx5: control completion queue event mode
The CQ polling is necessary in order to manage guest notifications when
the guest doesn't work with poll mode (callfd != -1).
The CQ polling scheduling method can affect the host CPU utilization and
the traffic bandwidth.
Define 3 modes to control the CQ polling scheduling:
1. A timer thread which automatically adjusts its delays to the incoming
traffic rate.
2. A timer thread with fixed delay time.
3. Interrupts: Each CQE burst arms the CQ in order to get an interrupt
event in the next traffic burst.
When traffic stops, mode 3 is selected automatically.
Interrupt management takes a lot of CPU cycles but forwards traffic
events to the guest very quickly.
The timer thread saves the interrupt overhead but may delay the guest
notification.
Add device arguments to control the mode.
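The mode selection described above could be modelled as follows. The enum
and function names are hypothetical and do not reflect the driver's actual
identifiers or device argument names; the sketch only captures the rule that
interrupts take over once traffic stops.

```c
#include <stdbool.h>

/* The three CQ polling scheduling modes (hypothetical names). */
enum toy_event_mode {
	TOY_MODE_DYNAMIC_TIMER, /* 1. timer thread, delay follows traffic rate */
	TOY_MODE_FIXED_TIMER,   /* 2. timer thread, fixed delay */
	TOY_MODE_INTERRUPT,     /* 3. arm the CQ and wait for an interrupt */
};

/*
 * Return the mode actually in effect: the configured one while traffic
 * is running, interrupts once traffic has stopped.
 */
static enum toy_event_mode
toy_effective_mode(enum toy_event_mode configured, bool traffic_on)
{
	return traffic_on ? configured : TOY_MODE_INTERRUPT;
}
```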
Signed-off-by: Matan Azrad <matan@mellanox.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Matan Azrad [Mon, 29 Jun 2020 14:01:55 +0000 (14:01 +0000)]
vdpa/mlx5: optimize completion queue poll
The vDPA driver uses a CQ in order to know when the HW has completed
traffic work.
Each traffic burst completion adds a CQE to the CQ.
When the vDPA driver detects CQEs in the CQ, it triggers the guest
notification for the corresponding queue and consumes all of them.
There is a collapse feature in the HW that configures the HW to write all
the CQEs in the first entry of the CQ.
Using this feature, the vDPA driver can read only the first CQE,
validate that the completion counter inside the CQE has changed and, if
so, notify the guest.
Use the CQ collapse feature in order to improve the poll utilization.
Signed-off-by: Matan Azrad <matan@mellanox.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Matan Azrad [Mon, 29 Jun 2020 14:01:54 +0000 (14:01 +0000)]
vdpa/mlx5: optimize notification events
When the virtio guest driver doesn't work with poll mode, the driver
creates an event mechanism in order to schedule completion notifications
for each virtq traffic burst.
When traffic comes to a virtq, a CQE will be added to the virtq CQ by
the FW.
The driver requests an interrupt for the next CQE index, and when the
interrupt is triggered, the driver polls the CQ and notifies the guest by
writing to the virtq callfd.
With this method, an interrupt is triggered for each burst of traffic.
The burst size depends on interrupt latency.
Interrupt management takes a lot of CPU cycles, and using it for each
traffic burst takes a big portion of CPU capacity.
When traffic is on, using a timer for CQ poll scheduling instead of
interrupts saves a lot of CPU cycles.
Move CQ poll scheduling to be done by a timer while traffic is running.
Request interrupts only when traffic is off.
The timer scheduling is managed by a new dedicated thread that uses
usleep().
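The timer side of this scheme can be sketched as a loop that polls, sleeps
with usleep(), and gives up after a number of empty polls so that the caller
can fall back to requesting an interrupt. The idle limit, the callback type
and the toy_* names are assumptions for illustration; the driver's thread is
not shown here.

```c
#define _DEFAULT_SOURCE /* for usleep() under strict -std= modes */
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Poll callback: returns the number of completions it handled. */
typedef uint32_t (*toy_poll_fn)(void *ctx);

/*
 * Poll on a timer while traffic is running. Returns the total number
 * of completions once idle_limit consecutive polls came back empty;
 * at that point the caller would request an interrupt instead.
 */
static uint32_t
toy_timer_poll_until_idle(toy_poll_fn poll, void *ctx,
			  unsigned int delay_us, unsigned int idle_limit)
{
	uint32_t total = 0;
	unsigned int idle = 0;

	assert(poll != NULL);
	while (idle < idle_limit) {
		uint32_t n = poll(ctx);

		total += n;
		idle = (n != 0) ? 0 : idle + 1;
		if (idle < idle_limit)
			usleep(delay_us);
	}
	return total;
}

/* Toy traffic source: ctx points to the number of bursts still pending. */
static uint32_t
toy_poll(void *ctx)
{
	uint32_t *left = ctx;

	if (*left == 0)
		return 0;
	(*left)--;
	return 1;
}
```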
Signed-off-by: Matan Azrad <matan@mellanox.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Michael Baum [Wed, 24 Jun 2020 13:50:39 +0000 (13:50 +0000)]
net/mlx5: fix iterator type in Rx queue management
The mlx5_check_vec_rx_support function in the mlx5_rxtx_vec.c file
iterates over the Rx queues array in a loop. Similarly, the
mlx5_mprq_enabled function in the mlx5_rxq.c file iterates over the Rx
queues array in a loop.
In both cases, the loop iterator is called i and the variable
representing the array size is called rxqs_n.
The i variable is of uint16_t type while the rxqs_n variable is of
unsigned int type. The range of rxqs_n is much larger than the number of
iterations the type of i allows, so theoretically the value of rxqs_n
may be greater than can be represented by 16 bits, in which case the
loop will never end.
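The wrap-around can be demonstrated with a small stand-alone sketch. The
functions below are hypothetical and only count iterations; max_steps is a
safety cap added so that the non-terminating case can be observed.

```c
#include <stdint.h>

/*
 * Loop with a 16-bit iterator. If rxqs_n exceeds UINT16_MAX, i wraps
 * to 0 before it can reach rxqs_n, so only the cap stops the loop.
 */
static unsigned int
toy_count_iterations_u16(unsigned int rxqs_n, unsigned int max_steps)
{
	unsigned int steps = 0;
	uint16_t i;

	for (i = 0; i < rxqs_n && steps < max_steps; ++i)
		++steps;
	return steps;
}

/* Same loop with an iterator of the same type as the bound. */
static unsigned int
toy_count_iterations_uint(unsigned int rxqs_n, unsigned int max_steps)
{
	unsigned int steps = 0;
	unsigned int i;

	for (i = 0; i < rxqs_n && steps < max_steps; ++i)
		++steps;
	return steps;
}
```

For small bounds both loops behave identically; for a bound of 70000 the
16-bit version runs until the cap while the unsigned int version stops after
70000 iterations.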
Michael Baum [Wed, 24 Jun 2020 13:46:41 +0000 (13:46 +0000)]
net/mlx5: use anonymous Direct Verbs allocator argument
The mlx5_dev_spawn function defines a struct mlx5dv_ctx_allocators type
variable several hundred lines after it starts, and its only use is
being passed as a parameter to the mlx5_glue->dv_set_context_attr
function.
However, according to the DPDK Coding Style Guidelines, variables should
be declared at the start of a block of code rather than in the middle.
Therefore, to improve the coding style, the structure is passed directly
to the function without declaring a variable first.
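One way to pass a structure without a named variable is a C99 compound
literal. The sketch below uses toy types and functions, not the Direct Verbs
API; every toy_* name is hypothetical.

```c
#include <stddef.h>

struct toy_allocators {
	void *(*alloc)(size_t size, void *data);
	void (*free)(void *ptr, void *data);
	void *data;
};

static struct toy_allocators toy_saved;

/* Stand-in for a "set context attribute" call taking the allocators. */
static int
toy_set_context_attr(const struct toy_allocators *a)
{
	if (a == NULL || a->alloc == NULL || a->free == NULL)
		return -1;
	/* Copy: the literal only lives until its enclosing block ends. */
	toy_saved = *a;
	return 0;
}

static void *
toy_alloc(size_t size, void *data)
{
	(void)size;
	return data;
}

static void
toy_free(void *ptr, void *data)
{
	(void)ptr;
	(void)data;
}

static int
toy_spawn(void *priv)
{
	/* ... many lines of unrelated setup would precede this call ... */
	return toy_set_context_attr(&(struct toy_allocators){
		.alloc = toy_alloc,
		.free = toy_free,
		.data = priv,
	});
}
```

The literal has automatic storage tied to the enclosing block, so the callee
must not keep the pointer beyond the call; the toy callee copies the contents.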
Signed-off-by: Michael Baum <michaelba@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>
Michael Baum [Wed, 24 Jun 2020 13:44:27 +0000 (13:44 +0000)]
net/mlx4: remove useless assignment
The mlx4_ibv_device_to_pci_addr function defines a variable called ret
inside a loop and uses it.
During the loop, the function assigns a value to the variable and then
breaks from the loop, so the assignment has no effect and is
unnecessary.
Remove the unnecessary assignment.
Signed-off-by: Michael Baum <michaelba@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>
Michael Baum [Wed, 24 Jun 2020 13:33:50 +0000 (13:33 +0000)]
common/mlx5: remove useless assignment
The mlx5_dev_to_pci_addr function defines a variable called ret inside a
loop and uses it.
During the loop, the function assigns a value to the variable and then
breaks from the loop, so the assignment has no effect and is
unnecessary.
Remove the unnecessary assignment.
Signed-off-by: Michael Baum <michaelba@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>
Michael Baum [Wed, 24 Jun 2020 13:29:55 +0000 (13:29 +0000)]
net/mlx4: use anonymous Direct Verbs allocator argument
The mlx4_pci_probe function defines a struct mlx4dv_ctx_allocators type
variable several hundred lines after it starts, and its only use is
being passed as a parameter to the mlx4_glue->dv_set_context_attr
function.
However, according to the DPDK Coding Style Guidelines, variables should
be declared at the start of a block of code rather than in the middle.
Therefore, to improve the coding style, the structure is passed directly
to the function without declaring a variable first.
Signed-off-by: Michael Baum <michaelba@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>
Michael Baum [Wed, 24 Jun 2020 13:20:31 +0000 (13:20 +0000)]
common/mlx5: fix code arrangement in tag allocation
Flow tag action is supported only when the driver has DR or DV support.
The tag allocation is adapted to the DV or DR mode.
If neither DR nor DV is supported in the system, the driver uses static
code to report an error.
This error code was wrongly compiled when DV is supported, although in
that case it can never be reached.
Exclude the aforementioned static error code in the DV case by
rearranging the preprocessor directives.
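A rough sketch of such an arrangement, using hypothetical TOY_HAVE_DR and
TOY_HAVE_DV macros in place of the real build-time feature checks and plain
integers in place of the real tag objects:

```c
#include <errno.h>

/*
 * Exactly one branch is compiled. The error path exists only when
 * neither feature macro is defined, so it can never be dead code in
 * a DR or DV build.
 */
static int
toy_tag_create(void)
{
#if defined(TOY_HAVE_DR)
	return 1; /* stand-in for the DR allocation path */
#elif defined(TOY_HAVE_DV)
	return 2; /* stand-in for the DV allocation path */
#else
	return -ENOTSUP; /* static error report: no DR, no DV */
#endif
}
```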
Fixes: cbb66daa3c85 ("net/mlx5: prepare Direct Verbs for Direct Rule") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@mellanox.com> Acked-by: Matan Azrad <matan@mellanox.com>
Shiri Kuzin [Tue, 23 Jun 2020 08:41:07 +0000 (11:41 +0300)]
net/mlx5: add parameter for LACP packets control
The new devarg controls the steering of LACP traffic.
When dv_lacp_by_user = 0, LACP traffic is steered to the kernel and
managed there.
When dv_lacp_by_user = 1, LACP traffic is not steered and the user
needs to manage it.