net/virtio-user: fix multiple queue for vhost-kernel
The multiple queue support in vhost-kernel is broken because
the dev->vhostfd is only available for vhost-user. We should
always try to enable queue pairs when it's not in server mode.
Fixes: 201a41651715 ("net/virtio-user: fix multiple queues fail in server mode") Cc: stable@dpdk.org Signed-off-by: Tiwei Bie <tiwei.bie@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Nikolay Nikolaev [Mon, 24 Sep 2018 20:17:32 +0000 (23:17 +0300)]
vhost: rework message handling as a callback array
Introduce vhost_message_handlers, which maps the message request
type to the message handler. Then replace the switch construct
with a map and call.
Failing vhost_user_set_features is fatal and all processing should
stop immediately and propagate the error to the upper layers. Change
the code accordingly to reflect that.
Signed-off-by: Nikolay Nikolaev <nicknickolaev@gmail.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Nikolay Nikolaev [Mon, 24 Sep 2018 20:17:25 +0000 (23:17 +0300)]
vhost: unify message handling function signature
Each vhost-user message handling function will return an int result
which is described in the new enum vh_result: error, OK and reply.
All functions will now have two arguments, virtio_net double pointer
and VhostUserMsg pointer.
Signed-off-by: Nikolay Nikolaev <nicknickolaev@gmail.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
The patch introduces scatter/gather support on transmit path.
A separate Tx callback is added and set if the application
requests multisegment Tx offload. Multiple descriptors are
sent per one packet.
Tomasz Duszynski [Tue, 25 Sep 2018 07:05:05 +0000 (09:05 +0200)]
net/mvpp2: align with MUSDK 18.09
This patch introduces necessary changes required by MUSDK 18.09 library.
* As of MUSDK 18.09, pp2_cookie_t is no longer available. Now
RX descriptor cookie is defined as plain u64 so existing cast
is no longer valid.
* MUSDK 18.09 increased number of available bpools (buffer hw pools) by
introducing dma regions support. Update mvpp2 driver accordingly.
* replace MV_NET_IP4_F_TOS with MV_NET_IP4_F_DSCP
Before this patch, API allowed to configure a classification rule
according to IPv4 TOS, which was not supported in classifier. This patch
fixes this by using proper field.
* use 48 bit address mask
We cannot get pointers exceeding 48 bits thus using 48 bit
mask for extracting higher IOVA address bits is enough.
Functional change:
Open receive cls/qos related features, only if the
config file contains an rx_related configuration entry.
This allows to configure tx_related entries, w/o unintentionally
opening rx cls/qos.
Code:
'use_global_defaults' is by default set to '1'.
Only if an rx_related entry was configured, it is updated to '0'.
rx cls/qos is performed only if 'use_global_defaults' is '0'.
Default TC configuration is now only mandatory when
'use_global_defaults' is '0'.
The doxygen comment describing the rte_eth_dev_info structure
was separated from the structure itself so move the comment
back to be with the structure.
Fixes: 7238e63bce52 ("ethdev: add support for device offload capabilities") Cc: stable@dpdk.org Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>
Ivan Malov [Wed, 5 Sep 2018 09:13:38 +0000 (10:13 +0100)]
net/bonding: inherit descriptor limits from slaves
Descriptor limits are used by applications to take
optimal decisions on queue sizing.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Chas Williams <chas3@att.com>
Ivan Malov [Wed, 5 Sep 2018 09:13:37 +0000 (10:13 +0100)]
net/bonding: provide default Rx/Tx configuration
Default Rx/Tx configuration has become a helpful
resource for applications relying on the optimal
values to make rte_eth_rxconf and rte_eth_txconf
structures. These structures can then be tweaked.
Default configuration is also used internally by
rte_eth_rx_queue_setup or rte_eth_tx_queue_setup
API calls when NULL pointer is passed by callers
with the argument for custom queue configuration.
The use cases of bonding driver may also benefit
from exercising default settings in the same way.
Restructure the code to collect various settings
from slave ports and make it possible to combine
default Rx/Tx configuration of these devices and
report it to the callers of rte_eth_dev_info_get.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Acked-by: Chas Williams <chas3@att.com>
net/sfc/base: use transceiver ID when reading info
In efx_mcdi_phy_module_get_info() probe the
transceiver identification byte rather than assume
the module matches the fixed port type. This
supports scenarios such as a SFP mounted in a QSFP
port via a QSA module.
Signed-off-by: Richard Houldsworth <rhouldsworth@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Add a function which makes an MCDI GET_LINK request and
packages up the results. Currently, the get-link function
is triggered from several entry points which then pass
on or store selected parts of the data. When the driver
needs to obtain the current link state, it is more
efficient to do this in a single call.
Signed-off-by: Richard Houldsworth <rhouldsworth@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
net/sfc/base: support improvements to bandwidth calculations
Change the interface to ef10_nic_get_port_mode_bandwidth()
so more NIC information can be used to infer bandwidth
requirements. Huntington calculations separated out
completely.
Signed-off-by: Richard Houldsworth <rhouldsworth@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
The current code has the hardcoded seq for fq allocation.
It require multiple changes, when some of the interfaces
are assigned to kernel stack. Changing it on the mac
id basis provide the flexibility to assign any interface
to kernel.
Chas Williams [Thu, 20 Sep 2018 12:52:26 +0000 (08:52 -0400)]
net/bonding: fix Rx slave fairness
Some PMDs, especially ones with vector receives, require a minimum number
of receive buffers in order to receive any packets. If the first slave
read leaves less than this number available, a read from the next slave
may return 0 implying that the slave doesn't have any packets which
results in skipping over that slave as the next active slave.
To fix this, implement round robin for the slaves during receive that
is only advanced to the next slave at the end of each receive burst.
This is also done to provide some additional fairness in processing in
other bonding RX burst routines as well.
Fixes: 2efb58cbab6e ("bond: new link bonding library") Cc: stable@dpdk.org Signed-off-by: Chas Williams <chas3@att.com> Acked-by: Luca Boccassi <bluca@debian.org> Acked-by: Matan Azrad <matan@mellanox.com>
Bruce Richardson [Wed, 19 Sep 2018 10:04:16 +0000 (11:04 +0100)]
net/avf: fix missing compiler error flags
The AVF driver was missing $(WERROR_FLAGS) in it's cflags, which means
that a number of compilation errors were getting missed. This patch adds
in the flag and fixes most of the errors, just disabling the
strict-aliasing ones.
Bruce Richardson [Wed, 19 Sep 2018 10:04:15 +0000 (11:04 +0100)]
net/avf: fix unused variables and label
Compiling with all warnings turned on causes errors about unused variables
and an unused label. Remove these to allow building without having to
disable those warnings.
Add support of imissed and q_errors statistics, reported by PCIE_QPRDC
register (see datasheet, section 11.27.2.60), which exposes the number
of receive packets dropped for a queue.
Signed-off-by: Julien Meunier <julien.meunier@nokia.com> Acked-by: Xiao Wang <xiao.w.wang@intel.com>
In former API, ETH_TXQ_FLAGS_NOMULTSEGS was merely a hint indicating
that application will never send multisegmented packets, allowing
pmd to choose different tx methods accordingly.
In new API, DEV_TX_OFFLOAD_MULTI_SEGS became an offload capability
that is advertised by pmds, some of them do not advertise it and
expect to never receive fragmented packets (octeontx, axgbe)
So an ethdev that supports multisegmented packets should properly
advertise it.
Problem was spotted and tested on e1000, should be also present in
ixgbe_vf representor.
Fixes: cf80ba6e2038 ("net/ixgbe: add support for representor ports") Cc: stable@dpdk.org Signed-off-by: Didier Pallard <didier.pallard@6wind.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
In former API, ETH_TXQ_FLAGS_NOMULTSEGS was merely a hint indicating
that application will never send multisegmented packets, allowing
pmd to choose different tx methods accordingly.
In new API, DEV_TX_OFFLOAD_MULTI_SEGS became an offload capability
that is advertised by pmds, some of them do not advertise it and
expect to never receive fragmented packets (octeontx, axgbe)
So an ethdev that supports multisegmented packets should properly
advertise it.
Problem was spotted and tested on e1000, should be also present in
i40e_vf representor.
Fixes: e0cb96204b71 ("net/i40e: add support for representor ports") Cc: stable@dpdk.org Signed-off-by: Didier Pallard <didier.pallard@6wind.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
In former API, ETH_TXQ_FLAGS_NOMULTSEGS was merely a hint indicating
that application will never send multisegmented packets, allowing
pmd to choose different tx methods accordingly.
In new API, DEV_TX_OFFLOAD_MULTI_SEGS became an offload capability
that is advertised by pmds, some of them do not advertise it and
expect to never receive fragmented packets (octeontx, axgbe)
So an ethdev that supports multisegmented packets should properly
advertise it.
Problem was spotted and tested on e1000, should be also present in
fm10k.
Fixes: 30f3ce999e6a ("net/fm10k: convert to new Tx offloads API") Cc: stable@dpdk.org Signed-off-by: Didier Pallard <didier.pallard@6wind.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
In former API, ETH_TXQ_FLAGS_NOMULTSEGS was merely a hint indicating
that application will never send multisegmented packets, allowing
pmd to choose different tx methods accordingly.
In new API, DEV_TX_OFFLOAD_MULTI_SEGS became an offload capability
that is advertised by pmds, some of them do not advertise it and
expect to never receive fragmented packets (octeontx, axgbe)
So an ethdev that supports multisegmented packets should properly
advertise it.
Fixes: e5c05e6590ea ("net/e1000: convert to new Tx offloads API") Cc: stable@dpdk.org Signed-off-by: Didier Pallard <didier.pallard@6wind.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Igor Romanov [Fri, 14 Sep 2018 07:31:36 +0000 (08:31 +0100)]
net/sfc: fix a Tx queue double release possibility
There are two function that call sfc_tx_qfini():
sfc_tx_fini_queues() and sfc_tx_queue_release(). But only
sfc_tx_queue_release() sets tx_queues pointer of the device data to NULL.
It may lead to the scenario in which a queue is destroyed by
sfc_tx_fini_queues() and after the queue is attempted to be destroyed again
by sfc_tx_queue_release().
Move NULL assignment to sfc_tx_qfini().
Fixes: b1b7ad933b39 ("net/sfc: set up and release Tx queues") Cc: stable@dpdk.org Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Igor Romanov [Fri, 14 Sep 2018 07:31:35 +0000 (08:31 +0100)]
net/sfc: fix an Rx queue double release possibility
There are two function that call sfc_rx_qfini():
sfc_rx_fini_queues() and sfc_rx_queue_release(). But only
sfc_rx_queue_release() sets rx_queues pointer of the device data to NULL.
It may lead to the scenario in which a queue is destroyed by
sfc_rx_fini_queues() and after the queue is attempted to be destroyed again
by sfc_rx_queue_release().
Move NULL assignment to sfc_rx_qfini().
Fixes: ce35b05c635e ("net/sfc: implement Rx queue setup release operations") Cc: stable@dpdk.org Signed-off-by: Igor Romanov <igor.romanov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
net/sfc/base: modify phy caps to indicate FEC request
The capability bits to request FEC modes are implicitly valid
when the corresponding FEC mode is a supported capability.
Drivers expect that it is only valid to advertise those
capabilities explicitly marked as supported. The capabilities
reported by firmware is modified with the implicit capabilities
to present the explicit model to drivers.
Signed-off-by: Richard Houldsworth <rhouldsworth@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Ivan Malov [Mon, 10 Sep 2018 09:33:33 +0000 (10:33 +0100)]
net/sfc/base: improve handling of legacy RSS hash flags
Client drivers may use either legacy flags, for example,
EFX_RX_HASH_TCPIPV4, or generalised flags, for example,
EFX_RX_HASH(IPV4_TCP, 4TUPLE), to configure RSS hash.
The libefx is able to recognise what scheme is used.
Legacy flags may be consumed directly by a chip-specific handler to
configure the NIC, that is, on EF10, these flags can be used to fill
in legacy RSS mode field in MCDI request. Generalised flags can also
be directly used in EF10-specific handler as they are fully compatible
with additional fields of the same MCDI request.
Legacy flags undergo conversion to generalised flags before they
are consumed by a chip-specific handler. This conversion is used to
make sure that chip-specific handlers expect only generalised flags
in the input for the sake of clarity of the code.
Depending on firmware capabilities, a chip-specififc handler either
supplies the input to the NIC directly, for example,
EFX_RX_HASH(IPV4_TCP, 4TUPLE) flag will enable 4 bits in
RSS_CONTEXT_SET_FLAGS_IN_TCP_IPV4_RSS_MODE field on EF10, or takes
the opportunity to translate the input to enable bits which don't map
to the generic flag, like setting
RSS_CONTEXT_SET_FLAGS_IN_TOEPLITZ_TCPV4_EN on EF10 when the firmware
claims no support for additional modes.
However, this approach has introduced a severe problem which can be
reproduced with ultra-low-latency firmware variant. In order to enable
IP hash, EF10-specific handler requires the user to request 2-tuple
hash for IP-other, TCP and UDP traffic classes, unconditionally.
In example, IPv4 hash can be enabled using the following input:
EFX_RX_HASH(IPV4_TCP, 2TUPLE) | EFX_RX_HASH(IPV4_UDP, 2TUPLE) |
EFX_RX_HASH(IPV4, 2TUPLE).
At the same time, on ultra-low-latency firmware, the common code will
never report support for any UDP tuple to the client driver. That is,
in the same example, the driver will use EFX_RX_HASH(IPV4_TCP, 2TUPLE) |
EFX_RX_HASH(IPV4, 2TUPLE). This input will not be recognised by
EF10-specific handler, and RSS_CONTEXT_SET_FLAGS_IN_TOEPLITZ_IPV4_EN
bit will not be set in the MCDI request.
In order to solve the problem, the patch removes conversion code
from chip-specific handlers and adds appropriate code to convert
EFX_RX_HASH() flags to their legacy counterparts to the common scale
mode set function. If the firmware does not support additional modes,
the function will convert generalised flags to legacy flags correctly
without any demand for UDP flags and pass the result to a chip-specific
handler.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Ivan Malov [Mon, 10 Sep 2018 09:33:32 +0000 (10:33 +0100)]
net/sfc/base: simplify the code to parse RSS hash type
RSS mode bits can be accessed a lot easier in the hash
type value provided that the variable type is uint32_t.
The macro helper can be removed to enhance readability.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Ivan Malov [Mon, 10 Sep 2018 09:33:31 +0000 (10:33 +0100)]
net/sfc/base: check buffer size for hash flags
The efx_rx_scale_hash_flags_get interface is unsafe, as it does not
have an argument for the size of the output buffer used to return
the flags. While the only caller currently supplies a sufficiently
large buffer, this should be checked at runtime to avoid writing
past the end of the buffer.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Ivan Malov [Mon, 10 Sep 2018 09:33:30 +0000 (10:33 +0100)]
net/sfc/base: use simpler code to check hash algorithm type
The API which is used to list supported hash flags verifies
hash algorithm choice before writing the output. This check
is based on a switch() statement which has only two options
and no distinctive actions to be conducted for each of them.
Use simpler code instead of switch() to improve readability.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Ivan Malov [Mon, 10 Sep 2018 09:33:27 +0000 (10:33 +0100)]
net/sfc/base: fix name of the argument to store RSS flags
The function used to retrieve supported RSS flags has an
argument which should be named properly to indicate
that it's a pointer.
Fixes: 613cbe75ae99 ("net/sfc/base: add a new means to control RSS hash") Cc: stable@dpdk.org Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andy Moreton [Mon, 10 Sep 2018 09:33:24 +0000 (10:33 +0100)]
net/sfc/base: add API to inform libefx of hardware removal
The efx_nic_hw_unavailable() checks ensure that if the NIC hardware
has failed or has been physically removed then libefx will stop
further attempts to access the hardware.
Add an interface for libefx clients to force unavailability, so the
hardware is treated as dead or removed even if still physically present.
Signed-off-by: Andy Moreton <amoreton@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Mark Spender [Mon, 10 Sep 2018 09:33:21 +0000 (10:33 +0100)]
net/sfc/base: add information if TSO workaround is required
In SF bug 61297 it's been confirmed that the hardware does not always
calculate the TCP checksum correctly with TSO sends.
The value of the Total Length field (IPv4) or Payload Length field
(IPv6) is the critical factor. We're sufficiently confident that if
these fields are zero then the checksum will be calculated correctly.
The information may be used by the drivers to check if the workaround is
required when FATSOv2 is implemented.
Signed-off-by: Mark Spender <mspender@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
The SFN driver's PartitionControl WMI object requires an API to parse
and filter partition data in TLV format, particularly for the Dynamic
Config partition. The ef10_nvram_buffer functions provide this
functionality but are tied to use with license partition only.
Modify functions so they are applicable to all TLV partitions and add
functions to support in-place tag modification.
Signed-off-by: Richard Houldsworth <rhouldsworth@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Extend efx_mcdi_get_port_modes() to optionally pass on the default
port mode field. This provides a more direct way of handling the case
where the dynamic config does not specify the port mode than the
alternative of a lookup table indexed by MCFW subtype.
Signed-off-by: Richard Houldsworth <rhouldsworth@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
net/sfc/base: add buffer editing functions to boot config
Functions to process the DHCP option list format used by the expansion
ROM config buffers, to support extracting and updating of individual
options.
The initial use case is the driver presenting the global and per-PF
options as separate items, with the driver implementing the
synchronization of global options across the configuration buffers
for all PFs.
Signed-off-by: Richard Houldsworth <rhouldsworth@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Mark Spender [Mon, 10 Sep 2018 09:33:13 +0000 (10:33 +0100)]
net/sfc/base: remove probes when a Tx queue is too full
No need for probe messages when a TxQ is too full for a post to be done.
Existing drivers check if there is room in the queue before posting
descriptors, even though efx_tx_qdesc_post() does the check itself.
The new SFN Windows driver doesn't perform the check before calling
efx_tx_qdesc_post(), but that means these probes can get frequently
printed out. It's normal driver behaviour so there's no need to print
an error.
Signed-off-by: Mark Spender <mspender@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Martin Harvey [Mon, 10 Sep 2018 09:33:12 +0000 (10:33 +0100)]
net/sfc/base: refactor monitors support
Remove obsolete monitor types since Falcon SFN4000 series adapters
no longer supported by libefx.
Rename MCDI monitors to be consistent with YML.
The code may be simplified and generalized since only MCDI monitors
remain.
Signed-off-by: Martin Harvey <mharvey@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
net/sfc/base: add check for TUNNEL module in NIC reset API
Fixes: 17551f6dffcc ("net/sfc/base: add API to control UDP tunnel ports") Cc: stable@dpdk.org Signed-off-by: Vijay Srivastava <vijays@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Martin Harvey [Mon, 10 Sep 2018 09:33:10 +0000 (10:33 +0100)]
net/sfc/base: move empty efsys definitions to EFX headers
Move empty definitions for platform-specific annotations from efsys.h
to EFX headers.
Signed-off-by: Martin Harvey <mharvey@solarflare.com> Signed-off-by: Andrew Lee <alee@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andy Moreton [Mon, 10 Sep 2018 09:33:06 +0000 (10:33 +0100)]
net/sfc/base: add space after sizeof
Required by GLD cstyle.
Fixes: d4f4b8f9d260 ("net/sfc/base: make RxQ type data an union") Cc: stable@dpdk.org Signed-off-by: Andy Moreton <amoreton@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>