The DPDK KNI Kernel Module
--------------------------
-The KNI kernel loadable module provides support for two types of devices:
+The KNI kernel loadable module ``rte_kni`` provides the kernel interface
+for DPDK applications.
-* A Miscellaneous device (/dev/kni) that:
+When the ``rte_kni`` module is loaded, it will create a device ``/dev/kni``
+that is used by the DPDK KNI API functions to control and communicate with
+the kernel module.
- * Creates net devices (via ioctl calls).
+The ``rte_kni`` kernel module contains several optional parameters which
+can be specified when the module is loaded to control its behavior:
- * Maintains a kernel thread context shared by all KNI instances
- (simulating the RX side of the net driver).
+.. code-block:: console
- * For single kernel thread mode, maintains a kernel thread context shared by all KNI instances
- (simulating the RX side of the net driver).
+ # modinfo rte_kni.ko
+ <snip>
+ parm: lo_mode: KNI loopback mode (default=lo_mode_none):
+ lo_mode_none Kernel loopback disabled
+ lo_mode_fifo Enable kernel loopback with fifo
+ lo_mode_fifo_skb Enable kernel loopback with fifo and skb buffer
+ (charp)
+ parm: kthread_mode: Kernel thread mode (default=single):
+ single Single kernel thread mode enabled.
+ multiple Multiple kernel thread mode enabled.
+ (charp)
+ parm: carrier: Default carrier state for KNI interface (default=off):
+ off Interfaces will be created with carrier state set to off.
+ on Interfaces will be created with carrier state set to on.
+ (charp)
- * For multiple kernel thread mode, maintains a kernel thread context for each KNI instance
- (simulating the RX side of the net driver).
+Loading the ``rte_kni`` kernel module without any optional parameters is
+the typical way a DPDK application gets packets into and out of the kernel
+network stack. Without any parameters, only one kernel thread is created
+for all KNI devices for packet receiving in kernel side, loopback mode is
+disabled, and the default carrier state of KNI interfaces is set to *off*.
-* Net device:
+.. code-block:: console
- * Net functionality provided by implementing several operations such as netdev_ops,
- header_ops, ethtool_ops that are defined by struct net_device,
- including support for DPDK mbufs and FIFOs.
+ # insmod kmod/rte_kni.ko
- * The interface name is provided from userspace.
+.. _kni_loopback_mode:
- * The MAC address can be the real NIC MAC address or random.
+Loopback Mode
+~~~~~~~~~~~~~
+
+For testing, the ``rte_kni`` kernel module can be loaded in loopback mode
+by specifying the ``lo_mode`` parameter:
+
+.. code-block:: console
+
+ # insmod kmod/rte_kni.ko lo_mode=lo_mode_fifo
+
+The ``lo_mode_fifo`` loopback option will loop back ring enqueue/dequeue
+operations in kernel space.
+
+.. code-block:: console
+
+ # insmod kmod/rte_kni.ko lo_mode=lo_mode_fifo_skb
+
+The ``lo_mode_fifo_skb`` loopback option will loop back ring enqueue/dequeue
+operations and sk buffer copies in kernel space.
+
+If the ``lo_mode`` parameter is not specified, loopback mode is disabled.
+
+.. _kni_kernel_thread_mode:
+
+Kernel Thread Mode
+~~~~~~~~~~~~~~~~~~
+
+To provide flexibility of performance, the ``rte_kni`` KNI kernel module
+can be loaded with the ``kthread_mode`` parameter. The ``rte_kni`` kernel
+module supports two options: "single kernel thread" mode and "multiple
+kernel thread" mode.
+
+Single kernel thread mode is enabled as follows:
+
+.. code-block:: console
+
+ # insmod kmod/rte_kni.ko kthread_mode=single
+
+This mode will create only one kernel thread for all KNI interfaces to
+receive data on the kernel side. By default, this kernel thread is not
+bound to any particular core, but the user can set the core affinity for
+this kernel thread by setting the ``core_id`` and ``force_bind`` parameters
+in ``struct rte_kni_conf`` when the first KNI interface is created:
+
+For optimum performance, the kernel thread should be bound to a core in
+on the same socket as the DPDK lcores used in the application.
+
+The KNI kernel module can also be configured to start a separate kernel
+thread for each KNI interface created by the DPDK application. Multiple
+kernel thread mode is enabled as follows:
+
+.. code-block:: console
+
+ # insmod kmod/rte_kni.ko kthread_mode=multiple
+
+This mode will create a separate kernel thread for each KNI interface to
+receive data on the kernel side. The core affinity of each ``kni_thread``
+kernel thread can be specified by setting the ``core_id`` and ``force_bind``
+parameters in ``struct rte_kni_conf`` when each KNI interface is created.
+
+Multiple kernel thread mode can provide scalable higher performance if
+sufficient unused cores are available on the host system.
+
+If the ``kthread_mode`` parameter is not specified, the "single kernel
+thread" mode is used.
+
+.. _kni_default_carrier_state:
+
+Default Carrier State
+~~~~~~~~~~~~~~~~~~~~~
+
+The default carrier state of KNI interfaces created by the ``rte_kni``
+kernel module is controlled via the ``carrier`` option when the module
+is loaded.
+
+If ``carrier=off`` is specified, the kernel module will leave the carrier
+state of the interface *down* when the interface is management enabled.
+The DPDK application can set the carrier state of the KNI interface using the
+``rte_kni_update_link()`` function. This is useful for DPDK applications
+which require that the carrier state of the KNI interface reflect the
+actual link state of the corresponding physical NIC port.
+
+If ``carrier=on`` is specified, the kernel module will automatically set
+the carrier state of the interface to *up* when the interface is management
+enabled. This is useful for DPDK applications which use the KNI interface as
+a purely virtual interface that does not correspond to any physical hardware
+and do not wish to explicitly set the carrier state of the interface with
+``rte_kni_update_link()``. It is also useful for testing in loopback mode
+where the NIC port may not be physically connected to anything.
+
+To set the default carrier state to *on*:
+
+.. code-block:: console
+
+ # insmod kmod/rte_kni.ko carrier=on
+
+To set the default carrier state to *off*:
+
+.. code-block:: console
+
+ # insmod kmod/rte_kni.ko carrier=off
+
+If the ``carrier`` parameter is not specified, the default carrier state
+of KNI interfaces will be set to *off*.
KNI Creation and Deletion
-------------------------
-The KNI interfaces are created by a DPDK application dynamically.
-The interface name and FIFO details are provided by the application through an ioctl call
-using the rte_kni_device_info struct which contains:
+Before any KNI interfaces can be created, the ``rte_kni`` kernel module must
+be loaded into the kernel and configured withe ``rte_kni_init()`` function.
+
+The KNI interfaces are created by a DPDK application dynamically via the
+``rte_kni_alloc()`` function.
+
+The ``struct rte_kni_conf`` structure contains fields which allow the
+user to specify the interface name, set the MTU size, set an explicit or
+random MAC address and control the affinity of the kernel Rx thread(s)
+(both single and multi-threaded modes).
+
+The ``struct rte_kni_ops`` structure contains pointers to functions to
+handle requests from the ``rte_kni`` kernel module. These functions
+allow DPDK applications to perform actions when the KNI interfaces are
+manipulated by control commands or functions external to the application.
+
+For example, the DPDK application may wish to enabled/disable a physical
+NIC port when a user enabled/disables a KNI interface with ``ip link set
+[up|down] dev <ifaceX>``. The DPDK application can register a callback for
+``config_network_if`` which will be called when the interface management
+state changes.
+
+There are currently four callbacks for which the user can register
+application functions:
-* The interface name.
+``config_network_if``:
-* Physical addresses of the corresponding memzones for the relevant FIFOs.
+ Called when the management state of the KNI interface changes.
+ For example, when the user runs ``ip link set [up|down] dev <ifaceX>``.
-* Mbuf mempool details, both physical and virtual (to calculate the offset for mbuf pointers).
+``change_mtu``:
-* PCI information.
+ Called when the user changes the MTU size of the KNI
+ interface. For example, when the user runs ``ip link set mtu <size>
+ dev <ifaceX>``.
-* Core affinity.
+``config_mac_address``:
-Refer to rte_kni_common.h in the DPDK source code for more details.
+ Called when the user changes the MAC address of the KNI interface.
+ For example, when the user runs ``ip link set address <MAC>
+ dev <ifaceX>``. If the user sets this callback function to NULL,
+ but sets the ``port_id`` field to a value other than -1, a default
+ callback handler in the rte_kni library ``kni_config_mac_address()``
+ will be called which calls ``rte_eth_dev_default_mac_addr_set()``
+ on the specified ``port_id``.
-The physical addresses will be re-mapped into the kernel address space and stored in separate KNI contexts.
+``config_promiscusity``:
-The affinity of kernel RX thread (both single and multi-threaded modes) is controlled by force_bind and
-core_id config parameters.
+ Called when the user changes the promiscusity state of the KNI
+ interface. For example, when the user runs ``ip link set promisc
+ [on|off] dev <ifaceX>``. If the user sets this callback function to
+ NULL, but sets the ``port_id`` field to a value other than -1, a default
+ callback handler in the rte_kni library ``kni_config_promiscusity()``
+ will be called which calls ``rte_eth_promiscuous_<enable|disable>()``
+ on the specified ``port_id``.
-The KNI interfaces can be deleted by a DPDK application dynamically after being created.
-Furthermore, all those KNI interfaces not deleted will be deleted on the release operation
-of the miscellaneous device (when the DPDK application is closed).
+In order to run these callbacks, the application must periodically call
+the ``rte_kni_handle_request()`` function. Any user callback function
+registered will be called directly from ``rte_kni_handle_request()`` so
+care must be taken to prevent deadlock and to not block any DPDK fastpath
+tasks. Typically DPDK applications which use these callbacks will need
+to create a separate thread or secondary process to periodically call
+``rte_kni_handle_request()``.
+
+The KNI interfaces can be deleted by a DPDK application with
+``rte_kni_release()``. All KNI interfaces not explicitly deleted will be
+deleted when the the ``/dev/kni`` device is closed, either explicitly with
+``rte_kni_close()`` or when the DPDK application is closed.
DPDK mbuf Flow
--------------
The mbuf is dequeued (without waiting due the cache) and filled with data from sk_buff.
The sk_buff is then freed and the mbuf sent in the tx_q FIFO.
-The DPDK TX thread dequeues the mbuf and sends it to the PMD (via rte_eth_tx_burst()).
+The DPDK TX thread dequeues the mbuf and sends it to the PMD via ``rte_eth_tx_burst()``.
It then puts the mbuf back in the cache.
Ethtool
where each net device must register its own callbacks for the supported operations.
The current implementation uses the igb/ixgbe modified Linux drivers for ethtool support.
Ethtool is not supported in i40e and VMs (VF or EM devices).
-
-Link state and MTU change
--------------------------
-
-Link state and MTU change are network interface specific operations usually done via ifconfig.
-The request is initiated from the kernel side (in the context of the ifconfig process)
-and handled by the user space DPDK application.
-The application polls the request, calls the application handler and returns the response back into the kernel space.
-
-The application handlers can be registered upon interface creation or explicitly registered/unregistered in runtime.
-This provides flexibility in multiprocess scenarios
-(where the KNI is created in the primary process but the callbacks are handled in the secondary one).
-The constraint is that a single process can register and handle the requests.
static char *kthread_mode;
static uint32_t multiple_kthread_on;
+/* Default carrier state for created KNI network interfaces */
+static char *carrier;
+uint32_t dflt_carrier;
+
#define KNI_DEV_IN_USE_BIT_NUM 0 /* Bit number for device in use */
static int kni_net_id;
return -ENODEV;
}
+ netif_carrier_off(net_dev);
+
ret = kni_run_thread(knet, kni, dev_info.force_bind);
if (ret != 0)
return ret;
return 0;
}
+static int __init
+kni_parse_carrier_state(void)
+{
+ if (!carrier) {
+ dflt_carrier = 0;
+ return 0;
+ }
+
+ if (strcmp(carrier, "off") == 0)
+ dflt_carrier = 0;
+ else if (strcmp(carrier, "on") == 0)
+ dflt_carrier = 1;
+ else
+ return -1;
+
+ return 0;
+}
+
static int __init
kni_init(void)
{
else
pr_debug("Multiple kernel thread mode enabled\n");
+ if (kni_parse_carrier_state() < 0) {
+ pr_err("Invalid parameter for carrier\n");
+ return -EINVAL;
+ }
+
+ if (dflt_carrier == 0)
+ pr_debug("Default carrier state set to off.\n");
+ else
+ pr_debug("Default carrier state set to on.\n");
+
#ifdef HAVE_SIMPLIFIED_PERNET_OPERATIONS
rc = register_pernet_subsys(&kni_net_ops);
#else
module_init(kni_init);
module_exit(kni_exit);
-module_param(lo_mode, charp, S_IRUGO | S_IWUSR);
+module_param(lo_mode, charp, 0644);
MODULE_PARM_DESC(lo_mode,
"KNI loopback mode (default=lo_mode_none):\n"
-" lo_mode_none Kernel loopback disabled\n"
-" lo_mode_fifo Enable kernel loopback with fifo\n"
-" lo_mode_fifo_skb Enable kernel loopback with fifo and skb buffer\n"
-"\n"
+"\t\tlo_mode_none Kernel loopback disabled\n"
+"\t\tlo_mode_fifo Enable kernel loopback with fifo\n"
+"\t\tlo_mode_fifo_skb Enable kernel loopback with fifo and skb buffer\n"
+"\t\t"
);
-module_param(kthread_mode, charp, S_IRUGO);
+module_param(kthread_mode, charp, 0644);
MODULE_PARM_DESC(kthread_mode,
"Kernel thread mode (default=single):\n"
-" single Single kernel thread mode enabled.\n"
-" multiple Multiple kernel thread mode enabled.\n"
-"\n"
+"\t\tsingle Single kernel thread mode enabled.\n"
+"\t\tmultiple Multiple kernel thread mode enabled.\n"
+"\t\t"
+);
+
+module_param(carrier, charp, 0644);
+MODULE_PARM_DESC(carrier,
+"Default carrier state for KNI interface (default=off):\n"
+"\t\toff Interfaces will be created with carrier state set to off.\n"
+"\t\ton Interfaces will be created with carrier state set to on.\n"
+"\t\t"
);