Andy Moreton [Tue, 20 Feb 2018 07:34:21 +0000 (07:34 +0000)]
net/sfc/base: clarify port mode names and masks
New port mode names are defined for Medford2 and later, and
the existing names are aliased to them. Add comments with the
numeric port mode to clarify the external port modes table.
Signed-off-by: Andy Moreton <amoreton@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andy Moreton [Tue, 20 Feb 2018 07:34:20 +0000 (07:34 +0000)]
net/sfc/base: support Medford2 event timer semantics
The event timer interface has changed for Medford2 - for
details see bug66418 comment 9. Update the common code to
use the new timer semantics for Medford2.
Signed-off-by: Andy Moreton <amoreton@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andy Moreton [Tue, 20 Feb 2018 07:33:58 +0000 (07:33 +0000)]
net/sfc/base: use MAC stats DMA buffer size when decoding
On Medford2 and later the MAC stats buffer has been enlarged.
Use the MAC stats DMA buffer size to ensure that the stats END
generation count is read from the correct location, and that
over-reading of the DMA buffer is prevented.
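A minimal sketch of the bounds check described above, assuming the
generation END count occupies the last 64-bit slot of the stats array.
The function name, its parameters and that slot position are
illustrative assumptions, not the libefx API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: read the generation END count from a MAC stats
 * DMA buffer holding nstats 64-bit counters. The END count is assumed
 * to live in the last slot. Returns false instead of reading past the
 * end of the buffer.
 */
bool
read_generation_end(const uint64_t *buf, size_t buf_size, size_t nstats,
		    uint64_t *out)
{
	/* Reject an empty table or one that would not fit in the buffer. */
	if (nstats == 0 || nstats > buf_size / sizeof(uint64_t))
		return false;

	*out = buf[nstats - 1];
	return true;
}
```

Passing the buffer size explicitly keeps the decoder independent of any
compile-time stats count.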
Signed-off-by: Andy Moreton <amoreton@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andy Moreton [Tue, 20 Feb 2018 07:33:57 +0000 (07:33 +0000)]
net/sfc/base: use MAC stats DMA buffer size from caps
For Medford2 the DMA buffer used for one-shot or periodic MAC stats
has been extended. Ensure the MAC stats DMA buffer size is large
enough to hold the number of stats counters supported by firmware.
Signed-off-by: Andy Moreton <amoreton@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andy Moreton [Tue, 20 Feb 2018 07:33:56 +0000 (07:33 +0000)]
net/sfc/base: improve robustness of MAC stats get via MCDI
Previously the code relied on the callers of efx_mcdi_mac_stats
to provide a DMA buffer or NULL depending on the action. Fix
this so that the DMA buffer is only passed in the request when
needed, and that an error is reported for a missing DMA buffer.
Signed-off-by: Andy Moreton <amoreton@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andy Moreton [Tue, 20 Feb 2018 07:33:55 +0000 (07:33 +0000)]
net/sfc/base: retrieve number of MAC stats from NIC
This reports the number of stats (and hence the DMA buffer size)
for MAC stats. If MC_GET_CAPABILITIES_V4 is not supported then
use the legacy Siena-compatible MC_CMD_MAC_NSTATS value.
Signed-off-by: Andy Moreton <amoreton@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andy Moreton [Tue, 20 Feb 2018 07:33:52 +0000 (07:33 +0000)]
net/sfc/base: add efsys macro to get memory region size
EFSYS_MEM_SIZE() reports the DMA mapped size of an efsys_mem_t
allocated region (the allocation size may be different due to
memory allocator and DMA alignment restrictions).
This ensures that common code internals have explicit knowledge
of the usable size of DMA mapped memory regions.
Signed-off-by: Andy Moreton <amoreton@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andy Moreton [Tue, 20 Feb 2018 07:33:48 +0000 (07:33 +0000)]
net/sfc/base: report memory BAR number
On Medford and earlier controllers the BAR layout is:
  PF BAR 0: (32bit I/O) I/O mapped registers
  PF BAR 2: (64bit Mem) Memory mapped registers (VI aperture)
  PF BAR 4: (64bit Mem) MSI-X tables
  VF BAR 0: (64bit Mem) Memory mapped registers (VI aperture)
  VF BAR 2: (64bit Mem) MSI-X tables
On Medford2, the layout is:
  PF/VF BAR 0: (64bit Mem) Memory mapped registers (VI aperture)
  PF/VF BAR 2: (64bit Mem) MSI-X tables
Make the VI aperture BAR number available for drivers that need it.
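The layouts above could be turned into a lookup along these lines. This
is only a sketch: the enum and function names are hypothetical and do
not reflect the interface added by the patch.

```c
#include <stdbool.h>

/* Hypothetical controller families, ordered oldest to newest. */
enum nic_family_sketch {
	NIC_FAMILY_SIENA,
	NIC_FAMILY_HUNTINGTON,
	NIC_FAMILY_MEDFORD,
	NIC_FAMILY_MEDFORD2,
};

/* Return the PCI BAR holding the memory mapped registers (VI aperture). */
unsigned int
vi_aperture_bar(enum nic_family_sketch family, bool is_vf)
{
	if (family >= NIC_FAMILY_MEDFORD2)
		return 0;	/* Medford2: PF and VF both use BAR 0 */

	/* Medford and earlier: VF uses BAR 0, PF uses BAR 2. */
	return is_vf ? 0 : 2;
}
```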
Remove EFX_MEM_BAR define as it is not correct on all platforms.
Signed-off-by: Andy Moreton <amoreton@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andy Moreton [Tue, 20 Feb 2018 07:33:44 +0000 (07:33 +0000)]
net/sfc/base: update hardware headers for Medford2
The changes to efx_regs_ef10.h are auto-generated and include:
- Updated event RX_L4_CLASS which is now 2 bits (was 3).
The encoding of TCP, UDP and UNKNOWN are unchanged so
the narrower Medford2 field definition is compatible with
all controllers.
- Fix definition of FATSOv2 option descriptors. These were
added manually and differ from the auto-generated values
in some fields (not yet used in common code). The field
definitions have been corrected to agree with the Linux net
driver headers and SF-108452-SW.
The remaining changes adapt the common code to use the updated
headers.
Signed-off-by: Andy Moreton <amoreton@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Andy Moreton [Tue, 20 Feb 2018 07:33:41 +0000 (07:33 +0000)]
net/sfc/base: support runtime VI window size
Medford2 uses a configurable VI window size, and requires
updates to register accesses to use a runtime VI window size
rather than the *_STEP register constants used for earlier
controllers.
Update the common code to query the VI window size via MCDI,
and add new EFX_BAR_VI_* accessor macros for per-VI registers.
The existing EFX_BAR_TBL_* macros can be used for non-VI
register tables (and for code that can never be called for
a Medford2 controller e.g. Siena-only code).
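As a rough illustration of the difference from a fixed *_STEP constant,
a per-VI register offset can be computed from a window size held in
runtime state. The struct and function below are hypothetical and are
not the EFX_BAR_VI_* macros themselves.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-NIC state: the VI window size is learned at runtime. */
struct nic_cfg_sketch {
	uint32_t vi_window_size;	/* bytes per VI, as reported by firmware */
};

/*
 * Offset of a per-VI register within the VI aperture:
 * VI index times the runtime window size, plus the register offset.
 */
size_t
vi_reg_offset(const struct nic_cfg_sketch *cfg, uint32_t vi_index,
	      uint32_t reg_ofst)
{
	return (size_t)vi_index * cfg->vi_window_size + reg_ofst;
}
```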
Signed-off-by: Andy Moreton <amoreton@solarflare.com> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Tomasz Kulasek [Fri, 9 Feb 2018 17:10:00 +0000 (18:10 +0100)]
vhost: fix device cleanup at stop
This prevents the user device from being destroyed and recreated while
a vring is in an "incomplete" state. virtio_is_ready() was returning
true for devices with vrings that did not have a valid callfd (their
VHOST_USER_SET_VRING_CALL had not arrived yet).
Fixes: 8f972312b8f4 ("vhost: support vhost-user") Cc: stable@dpdk.org Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com> Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com> Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Stefan Hajnoczi [Mon, 5 Feb 2018 12:16:00 +0000 (13:16 +0100)]
vhost: validate virtqueue size
Check the virtqueue size constraints so that invalid values don't cause
bugs later on in the code. For example, sometimes the virtqueue size is
stored as unsigned int and sometimes as uint16_t, so bad things happen
if it is ever larger than 65535.
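A sketch of such a check follows. The 32768 limit and the power-of-two
requirement are assumptions taken from the virtio specification for
split virtqueues, not from this commit message, and the names are
illustrative only.

```c
#include <stdbool.h>

/* Assumed limit from the virtio specification, not from this patch. */
#define VQ_MAX_SIZE_SKETCH 32768u

/* Hypothetical validator for a virtqueue size received from the peer. */
bool
vq_size_is_valid(unsigned int size)
{
	/* Zero and anything above the limit (so also > 65535) is rejected. */
	if (size == 0 || size > VQ_MAX_SIZE_SKETCH)
		return false;

	/* Assumed requirement: the size must be a power of two. */
	return (size & (size - 1)) == 0;
}
```

Rejecting the value once, where it is received, means later code can
store it in a uint16_t without truncation.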
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Stefan Hajnoczi [Mon, 5 Feb 2018 12:16:00 +0000 (13:16 +0100)]
vhost: fix message payload union in setting ring address
vhost_user_set_vring_addr() uses the msg->payload.addr union member, not
msg->payload.state. Luckily the offset of the 'index' field is
identical in both structs, so there was never any buggy behavior.
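The layout coincidence can be checked at compile time. The structs
below are simplified stand-ins with hypothetical names; only the
leading fields are reproduced, so this is an illustration rather than
the real vhost definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the two payload structs (names hypothetical). */
struct vring_state_sketch {
	unsigned int index;
	unsigned int num;
};

struct vring_addr_sketch {
	unsigned int index;
	unsigned int flags;
	uint64_t desc_user_addr;
};

union payload_sketch {
	struct vring_state_sketch state;
	struct vring_addr_sketch addr;
};

/* 'index' sits at the same offset in both structs. */
_Static_assert(offsetof(struct vring_state_sketch, index) ==
	       offsetof(struct vring_addr_sketch, index),
	       "index offsets differ");

/* Reading through the wrong member happens to return the same value. */
unsigned int
index_via_state(const union payload_sketch *p)
{
	return p->state.index;
}

/* Reading through the member that was actually written. */
unsigned int
index_via_addr(const union payload_sketch *p)
{
	return p->addr.index;
}
```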
Fixes: 5cd690e4fda9 ("vhost: fix vring addresses not translated") Cc: stable@dpdk.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Stefan Hajnoczi [Mon, 5 Feb 2018 12:16:00 +0000 (13:16 +0100)]
vhost: reject invalid log base mmap offset
If the log base mmap_offset is larger than mmap_size then it points
outside the mmap region. We must not write to memory outside the mmap
region, so validate mmap_offset in vhost_user_set_log_base().
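One way to express the check is sketched below. The helper name is
hypothetical, and rejecting an offset equal to mmap_size (which would
leave zero usable bytes) is an assumption that goes slightly beyond the
wording above.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: validate the log base offset against the mapping
 * size and report how many bytes remain usable after the offset.
 */
bool
log_base_usable_size(uint64_t mmap_size, uint64_t mmap_offset,
		     uint64_t *usable)
{
	/* An offset at or beyond the end points outside the mmap region. */
	if (mmap_offset >= mmap_size)
		return false;

	*usable = mmap_size - mmap_offset;	/* cannot underflow here */
	return true;
}
```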
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Stefan Hajnoczi [Mon, 5 Feb 2018 12:16:00 +0000 (13:16 +0100)]
vhost: clear out unused SCM_RIGHTS file descriptors
The number of file descriptors received is not stored by vhost_user.c.
vhost_user_set_mem_table() assumes that memory.nregions matches the
number of file descriptors received, but nothing guarantees this:
    for (i = 0; i < memory.nregions; i++)
        close(pmsg->fds[i]);
Another questionable code snippet is:
    case VHOST_USER_SET_LOG_FD:
        close(msg.fds[0]);
If not enough file descriptors were received then fds[] contains
uninitialized data from the stack (see read_fd_message()). This might
cause non-vhost file descriptors to be closed if the uninitialized data
happens to match.
Refactoring vhost_user.c to pass around and check the number of file
descriptors everywhere would make the code more complex. It is simpler
for read_fd_message() to set unused elements in fds[] to -1. This way
close(-1) is called and no harm is done.
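The idea can be sketched as a small helper run after the ancillary data
has been parsed. The function name and signature are illustrative
assumptions, not the actual read_fd_message() code.

```c
#include <stddef.h>

/*
 * Hypothetical helper: mark every slot past the descriptors actually
 * received as unused, so a later close() on it fails harmlessly with
 * EBADF instead of closing an unrelated descriptor.
 */
void
clear_unused_fds(int *fds, size_t max_fds, size_t received)
{
	for (size_t i = received; i < max_fds; i++)
		fds[i] = -1;
}
```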
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Stefan Hajnoczi [Mon, 5 Feb 2018 12:16:00 +0000 (13:16 +0100)]
vhost: validate untrusted memory regions number field
Check if memory.nregions is valid right away. This eliminates the
possibility of bugs when memory.nregions is used later on in
vhost_user_set_mem_table().
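A minimal sketch of an up-front check, assuming a fixed upper bound on
the number of regions. The constant name and its value of 8 are
assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed upper bound on memory regions; name and value are illustrative. */
#define MAX_NREGIONS_SKETCH 8u

/* Hypothetical validator applied before nregions is used as a loop bound. */
bool
nregions_is_valid(uint32_t nregions)
{
	return nregions <= MAX_NREGIONS_SKETCH;
}
```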
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Stefan Hajnoczi [Mon, 5 Feb 2018 12:16:00 +0000 (13:16 +0100)]
vhost: avoid enum fields in VhostUserMsg
The VhostUserMsg struct binary representation must match the vhost-user
protocol specification since this struct is read from and written to the
socket.
The VhostUserMsg.request union contains enum fields. Enum binary
representation is implementation-defined according to the C standard and
it is unportable to make assumptions about the representation:
  6.7.2.2 Enumeration specifiers
  ...
  Each enumerated type shall be compatible with char, a signed integer
  type, or an unsigned integer type. The choice of type is
  implementation-defined, but shall be capable of representing the
  values of all the members of the enumeration.
Additionally, librte_vhost relies on the enum type being unsigned when
validating untrusted inputs:
    if (ret <= 0 || msg.request.master >= VHOST_USER_MAX) {
If msg.request.master is signed then negative values pass this check!
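The effect of signedness on that upper-bound check can be seen in
isolation. In this sketch fixed-width integers stand in for the two
possible enum representations, and the limit of 20 is an arbitrary
placeholder rather than the real VHOST_USER_MAX.

```c
#include <stdbool.h>
#include <stdint.h>

/* Arbitrary placeholder for the request limit. */
#define REQ_MAX_SKETCH 20

/* Signed representation: a negative request is not >= the limit. */
bool
request_accepted_signed(int32_t req)
{
	return !(req >= REQ_MAX_SKETCH);
}

/* Unsigned representation: the same bit pattern is a huge value. */
bool
request_accepted_unsigned(uint32_t req)
{
	return !(req >= REQ_MAX_SKETCH);
}
```

With a signed type the value -1 is accepted, while the same bits read
as an unsigned 32-bit value are rejected.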
Even if we assume gcc on x86_64 (SysV amd64 ABI) and don't care about
portability, the actual enum constants still affect the final type. For
example, if we add a negative constant then its type changes to signed
int:
Stefan Hajnoczi [Mon, 5 Feb 2018 12:16:00 +0000 (13:16 +0100)]
vhost: add security model documentation
Input validation is not applied consistently in vhost_user.c. This
suggests that not everyone has the same security model in mind when
working on the code.
Make the security model explicit so that everyone can understand and
follow the same model when modifying the code.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: John McNamara <john.mcnamara@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Shahaf Shuler [Sun, 25 Feb 2018 07:28:37 +0000 (09:28 +0200)]
net/mlx5: fix tunnel offloads cap query
The query for the tunnel stateless offloads is wrongly implemented
because:
1. It was using the device id to query for the offloads.
2. It was using a compilation flag for Verbs which no longer exists.
The main reason was the lack of a proper API from Verbs.
Fix the query to use the rdma-core API. The capability returned from
rdma-core refers to both Tx and Rx sides.
Even though there are separate caps for GRE and VXLAN, the
implementation merges them into a single flag in order to simplify the
checks on the data path.
Nélio Laranjeiro [Wed, 14 Feb 2018 15:04:45 +0000 (16:04 +0100)]
net/mlx5: fix flow creation with a single target queue
Adding a pattern targeting a single queue wrongly behaves as if it were
an RSS request, ending up creating several Verbs flow rules to match the
RSS configuration.
Several control operations implemented by these PMDs affect netdevices
through sysfs, itself subject to file system permission checks enforced by
the kernel, which limits their use for most purposes to applications
running with root privileges.
Since performing the same operations through ioctl() requires fewer
capabilities (only CAP_NET_ADMIN) and given the remaining operations are
already implemented this way, this patch standardizes on ioctl() and gets
rid of redundant code.
Thomas Monjalon [Thu, 29 Mar 2018 15:28:26 +0000 (17:28 +0200)]
mk: fix kernel modules build dependency
Some kernel modules may need some header files to be "installed"
in the build directory.
When running multiple threads of make, kernel modules can try to
be compiled before the lib headers are ready:
    make -j3
    kernel/linux/kni/kni_misc.c:19:37: fatal error:
    exec-env/rte_kni_common.h: No such file or directory
This error appeared recently after moving kernel modules into their
own directory.