-.. BSD LICENSE
- Copyright(c) 2010-2016 Intel Corporation. All rights reserved.
- All rights reserved.
-
- Redistribution and use in source and binary forms, with or without
- modification, are permitted provided that the following conditions
- are met:
-
- * Redistributions of source code must retain the above copyright
- notice, this list of conditions and the following disclaimer.
- * Redistributions in binary form must reproduce the above copyright
- notice, this list of conditions and the following disclaimer in
- the documentation and/or other materials provided with the
- distribution.
- * Neither the name of Intel Corporation nor the names of its
- contributors may be used to endorse or promote products derived
- from this software without specific prior written permission.
-
- THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+.. SPDX-License-Identifier: BSD-3-Clause
+ Copyright(c) 2010-2016 Intel Corporation.
Vhost Library
=============
of those segments, thus the fewer the segments, the quicker we will get
the mapping. NOTE: we may speed it by using tree searching in future.
+ * Zero copy currently does not work when vfio-pci is used in IOMMU mode,
+ because no IOMMU DMA mapping is set up for guest memory. If you have to
+ use the vfio-pci driver, insert the vfio-pci kernel module in noiommu
+ mode.
+
+ - ``RTE_VHOST_USER_IOMMU_SUPPORT``
+
+ IOMMU support will be enabled when this flag is set. It is disabled by
+ default.
+
+ Enabling this flag makes it possible to use the guest vIOMMU to prevent
+ vhost from accessing memory the virtio device is not allowed to access,
+ provided the feature is negotiated and an IOMMU device is declared.
+
+ However, this feature enables vhost-user's reply-ack protocol feature,
+ whose implementation is buggy in QEMU v2.7.0-v2.9.0 when multiqueue is
+ used. Enabling this flag with these QEMU versions results in QEMU being
+ blocked when multiple queue pairs are declared.
+
* ``rte_vhost_driver_set_features(path, features)``
This function sets the feature bits the vhost-user driver supports. The
vhost-user driver could be vhost-user net, yet it could be something else,
say, vhost-user SCSI.
-* ``rte_vhost_driver_session_start()``
-
- This function starts the vhost session loop to handle vhost messages. It
- starts an infinite loop, therefore it should be called in a dedicated
- thread.
-
-* ``rte_vhost_driver_callback_register(virtio_net_device_ops)``
+* ``rte_vhost_driver_callback_register(path, vhost_device_ops)``
This function registers a set of callbacks, to let DPDK applications take
the appropriate action when some events happen. The following events are
* ``new_device(int vid)``
- This callback is invoked when a virtio net device becomes ready. ``vid``
- is the virtio net device ID.
+ This callback is invoked when a virtio device becomes ready. ``vid``
+ is the vhost device ID.
* ``destroy_device(int vid)``
- This callback is invoked when a virtio net device shuts down (or when the
- vhost connection is broken).
+ This callback is invoked when a virtio device is paused or shut down.
* ``vring_state_changed(int vid, uint16_t queue_id, int enable)``
This callback is invoked when a specific queue's state is changed, for
example to enabled or disabled.
-* ``rte_vhost_enqueue_burst(vid, queue_id, pkts, count)``
+ * ``features_changed(int vid, uint64_t features)``
- Transmits (enqueues) ``count`` packets from host to guest.
+ This callback is invoked when the features are changed. For example,
+ ``VHOST_F_LOG_ALL`` will be set/cleared at the start/end of live
+ migration, respectively.
-* ``rte_vhost_dequeue_burst(vid, queue_id, mbuf_pool, pkts, count)``
+ * ``new_connection(int vid)``
- Receives (dequeues) ``count`` packets from guest, and stored them at ``pkts``.
+ This callback is invoked on a new vhost-user socket connection. If DPDK
+ acts as the server, the device should not be deleted before the
+ ``destroy_connection`` callback is received.
+
+ * ``destroy_connection(int vid)``
+
+ This callback is invoked when the vhost-user socket connection is closed.
+ It indicates that the device with ID ``vid`` is no longer in use and can
+ be safely deleted.
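
The ordering implied by the connection and device callbacks above can be sketched as a small state tracker. This is an illustration only, not code from the vhost library: the ``demo_*`` names, the fixed-size table, and the callback return types are assumptions made for this example. In a real application, handlers of this shape would be placed in the ``vhost_device_ops`` structure passed to ``rte_vhost_driver_callback_register(path, vhost_device_ops)``.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_DEVICES 8 /* hypothetical limit, chosen for this sketch */

/* Hypothetical per-device bookkeeping kept by the application. */
struct demo_dev_state {
    bool connected; /* set between new_connection and destroy_connection */
    bool ready;     /* set between new_device and destroy_device */
};

static struct demo_dev_state demo_devs[DEMO_MAX_DEVICES];

/* Look up the state slot for a vhost device ID. */
static struct demo_dev_state *demo_get(int vid)
{
    assert(vid >= 0 && vid < DEMO_MAX_DEVICES);
    return &demo_devs[vid];
}

/* A new vhost-user socket connection was made. */
static int demo_new_connection(int vid)
{
    demo_get(vid)->connected = true;
    return 0;
}

/* The virtio device became ready; the data path may start. */
static int demo_new_device(int vid)
{
    demo_get(vid)->ready = true;
    return 0;
}

/* The virtio device was paused or shut down; the data path must stop. */
static void demo_destroy_device(int vid)
{
    demo_get(vid)->ready = false;
}

/* The socket connection was closed; the device is no longer in use. */
static void demo_destroy_connection(int vid)
{
    struct demo_dev_state *s = demo_get(vid);

    s->ready = false;
    s->connected = false;
}

/* Packets may be processed only while the device is ready. */
static bool demo_can_process(int vid)
{
    return demo_get(vid)->ready;
}

/* Application resources for vid may be released only after the
 * connection has been destroyed. */
static bool demo_can_delete(int vid)
{
    return !demo_get(vid)->connected;
}
```

Note that ``destroy_device`` alone does not make the device deletable in this sketch: a paused device keeps its connection, so deletion waits for ``destroy_connection``.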
* ``rte_vhost_driver_disable/enable_features(path, features))``
disable mergeable buffers and TSO features, which both are enabled by
default.
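
As a rough illustration of how a ``features`` argument is formed, the sketch below builds a bit mask for the mergeable buffers and TSO features and checks the ``VHOST_F_LOG_ALL`` bit. The bit positions are the ones defined by the virtio and vhost headers; the ``DEMO_*`` constants and ``demo_*`` helpers are names invented for this example and are not part of the vhost API. A real application would pass such a mask to the enable or disable function named above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Feature bit positions, redefined locally so the sketch is self-contained. */
#define DEMO_VIRTIO_NET_F_HOST_TSO4 11
#define DEMO_VIRTIO_NET_F_HOST_TSO6 12
#define DEMO_VIRTIO_NET_F_MRG_RXBUF 15
#define DEMO_VHOST_F_LOG_ALL        26

/* Convert a feature bit position into a 64-bit mask. */
static uint64_t demo_feature_mask(unsigned int bit)
{
    assert(bit < 64);
    return UINT64_C(1) << bit;
}

/* Mask covering mergeable buffers and both TSO variants. */
static uint64_t demo_mrg_and_tso_mask(void)
{
    return demo_feature_mask(DEMO_VIRTIO_NET_F_MRG_RXBUF) |
           demo_feature_mask(DEMO_VIRTIO_NET_F_HOST_TSO4) |
           demo_feature_mask(DEMO_VIRTIO_NET_F_HOST_TSO6);
}

/* Model of what disabling does to a supported-feature set. */
static uint64_t demo_disable(uint64_t supported, uint64_t mask)
{
    return supported & ~mask;
}

/* Model of what enabling does to a supported-feature set. */
static uint64_t demo_enable(uint64_t supported, uint64_t mask)
{
    return supported | mask;
}

/* True when dirty-page logging is requested, as during live migration. */
static bool demo_log_all_set(uint64_t features)
{
    return (features & demo_feature_mask(DEMO_VHOST_F_LOG_ALL)) != 0;
}
```

A check like ``demo_log_all_set()`` is the kind of test a ``features_changed`` handler might perform on the ``features`` value it receives.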
+* ``rte_vhost_driver_start(path)``
+
+ This function triggers the vhost-user negotiation. It should be invoked at
+ the end of initializing a vhost-user driver.
+
+* ``rte_vhost_enqueue_burst(vid, queue_id, pkts, count)``
+
+ Transmits (enqueues) ``count`` packets from host to guest.
+
+* ``rte_vhost_dequeue_burst(vid, queue_id, mbuf_pool, pkts, count)``
+
+ Receives (dequeues) ``count`` packets from guest, and stores them at ``pkts``.
+
+* ``rte_vhost_crypto_create(vid, cryptodev_id, sess_mempool, socket_id)``
+
+ As an extension of ``new_device()``, this function adds virtio-crypto
+ workload acceleration capability to the device. All crypto workload is
+ processed by the DPDK cryptodev with the device ID ``cryptodev_id``.
+
+* ``rte_vhost_crypto_free(vid)``
+
+ Frees the memory and vhost-user message handlers created in
+ ``rte_vhost_crypto_create()``.
+
+* ``rte_vhost_crypto_fetch_requests(vid, queue_id, ops, nb_ops)``
+
+ Receives (dequeues) ``nb_ops`` virtio-crypto requests from guest, parses
+ them to DPDK Crypto Operations, and fills the ``ops`` with parsing results.
+
+* ``rte_vhost_crypto_finalize_requests(queue_id, ops, nb_ops)``
+
+ After the ``ops`` are dequeued from Cryptodev, finalizes the jobs and
+ notifies the guest(s).
+
+* ``rte_vhost_crypto_set_zero_copy(vid, option)``
+
+ Enables or disables the zero copy feature of the vhost crypto backend.
Vhost-user Implementations
--------------------------
When the socket connection is closed, vhost will destroy the device.
+Guest memory requirement
+------------------------
+
+* Memory pre-allocation
+
+ For non-zerocopy, guest memory pre-allocation is not required, which can
+ help save memory. If users want the guest memory to be pre-allocated
+ (e.g., for performance reasons), the ``-mem-prealloc`` option can be added
+ when starting QEMU. Alternatively, all memory can be locked at the vhost
+ side, which forces memory to be allocated when it is mmapped there; the
+ ``--mlockall`` option in ovs-dpdk is one example.
+
+ For zerocopy, the vhost library forces the VM memory to be pre-allocated
+ when mapping the guest memory. The memory also needs to be locked to
+ prevent pages from being swapped out to disk.
+
+* Memory sharing
+
+ Make sure the ``share=on`` QEMU option is given. vhost-user will not work
+ with a QEMU version without shared memory mapping.
+
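Locking all memory at the vhost side, as mentioned under memory pre-allocation, can be sketched with the POSIX ``mlockall()`` call. This is a minimal illustration under the assumption that the vhost process locks its own address space before guest memory is mapped; ``demo_lock_all_memory()`` is a name invented for this example, and whether a given vSwitch option does exactly this is not stated here.

```c
#include <errno.h>
#include <sys/mman.h>

/*
 * Lock all current and future mappings of the calling process into RAM.
 * With MCL_FUTURE, memory mapped later (for example guest memory mapped
 * by the vhost side) is locked, and therefore allocated, at map time.
 *
 * Returns 0 on success or a negative errno value on failure. The call
 * can fail without sufficient privileges or with a low RLIMIT_MEMLOCK.
 */
static int demo_lock_all_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
        return -errno;
    return 0;
}
```

The function would be called once, early in process start-up, before any vhost-user connection is accepted.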
Vhost supported vSwitch reference
---------------------------------