``/dev/path`` character device file will be created. For vhost-user server
mode, a Unix domain socket file ``path`` will be created.
- Currently two flags are supported (these are valid for vhost-user only):
+ Currently supported flags are (these are valid for vhost-user only):
- ``RTE_VHOST_USER_CLIENT``
This reconnect option is enabled by default. However, it can be turned off
by setting this flag.
+ - ``RTE_VHOST_USER_DEQUEUE_ZERO_COPY``
+
+ Dequeue zero copy will be enabled when this flag is set. It is disabled by
+ default.
+
+ There are some truths (including limitations) you might want to know while
+ setting this flag:
+
+ * zero copy is not good for small packets (typically for packet size below
+ 512).
+
+ * zero copy is really good for VM2VM case. For iperf between two VMs, the
+ boost could be above 70% (when TSO is enableld).
+
+ * for VM2NIC case, the ``nb_tx_desc`` has to be small enough: <= 64 if virtio
+ indirect feature is not enabled and <= 128 if it is enabled.
+
+ The is because when dequeue zero copy is enabled, guest Tx used vring will
+ be updated only when corresponding mbuf is freed. Thus, the nb_tx_desc
+ has to be small enough so that the PMD driver will run out of available
+ Tx descriptors and free mbufs timely. Otherwise, guest Tx vring would be
+ starved.
+
+ * Guest memory should be backended with huge pages to achieve better
+ performance. Using 1G page size is the best.
+
+ When dequeue zero copy is enabled, the guest phys address and host phys
+ address mapping has to be established. Using non-huge pages means far
+ more page segments. To make it simple, DPDK vhost does a linear search
+ of those segments, thus the fewer the segments, the quicker we will get
+ the mapping. NOTE: we may speed it by using tree searching in future.
+
* ``rte_vhost_driver_session_start()``
This function starts the vhost session loop to handle vhost messages. It
in an mbuf chain and retrieve its packet type by software.
* Added new functions ``rte_get_ptype_*()`` to dump a packet type as a string.
+* **Added vhost-user dequeue zero copy support**
+
+ The copy in dequeue path is saved, which is meant to improve the performance.
+ In the VM2VM case, the boost is quite impressive. The bigger the packet size,
+ the bigger performance boost you may get. However, for VM2NIC case, there
+ are some limitations, yet the boost is not that impressive as VM2VM case.
+ It may even drop quite a bit for small packets.
+
+ For such reason, this feature is disabled by default. It can be enabled when
+ ``RTE_VHOST_USER_DEQUEUE_ZERO_COPY`` flag is given. Check the vhost section
+ at programming guide for more information.
+
* **Added vhost-user indirect descriptors support.**
If indirect descriptor feature is negotiated, each packet sent by the guest
#define RTE_VHOST_USER_CLIENT (1ULL << 0)
#define RTE_VHOST_USER_NO_RECONNECT (1ULL << 1)
+#define RTE_VHOST_USER_DEQUEUE_ZERO_COPY (1ULL << 2)
/* Enum for virtqueue management. */
enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
int connfd;
bool is_server;
bool reconnect;
+ bool dequeue_zero_copy;
};
struct vhost_user_connection {
size = strnlen(vsocket->path, PATH_MAX);
vhost_set_ifname(vid, vsocket->path, size);
+ if (vsocket->dequeue_zero_copy)
+ vhost_enable_dequeue_zero_copy(vid);
+
RTE_LOG(INFO, VHOST_CONFIG, "new device, handle is %d\n", vid);
vsocket->connfd = fd;
memset(vsocket, 0, sizeof(struct vhost_user_socket));
vsocket->path = strdup(path);
vsocket->connfd = -1;
+ vsocket->dequeue_zero_copy = flags & RTE_VHOST_USER_DEQUEUE_ZERO_COPY;
if ((flags & RTE_VHOST_USER_CLIENT) != 0) {
vsocket->reconnect = !(flags & RTE_VHOST_USER_NO_RECONNECT);
dev->ifname[sizeof(dev->ifname) - 1] = '\0';
}
+void
+vhost_enable_dequeue_zero_copy(int vid)
+{
+ struct virtio_net *dev = get_device(vid);
+
+ if (dev == NULL)
+ return;
+
+ dev->dequeue_zero_copy = 1;
+}
int
rte_vhost_get_numa_node(int vid)
int alloc_vring_queue_pair(struct virtio_net *dev, uint32_t qp_idx);
void vhost_set_ifname(int, const char *if_name, unsigned int if_len);
+void vhost_enable_dequeue_zero_copy(int vid);
/*
* Backend-specific cleanup. Defined by vhost-cuse and vhost-user.