..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2010-2014 Intel Corporation.

Kernel NIC Interface Sample Application
=======================================

The Kernel NIC Interface (KNI) is a DPDK control plane solution that
allows userspace applications to exchange packets with the kernel networking stack.
To accomplish this, a DPDK userspace application uses an IOCTL call
to request the creation of a KNI virtual device in the Linux* kernel.
The IOCTL call provides interface information and the DPDK's physical address space,
which the KNI loadable kernel module re-maps into the kernel address space
and saves in a virtual device context.
For each device allocated, the DPDK creates FIFO queues that carry
ingress and egress packets to and from the kernel module.

The KNI loadable kernel module is a standard net driver which,
upon receiving the IOCTL call, accesses the DPDK's FIFO queues to
receive packets from and transmit packets to the DPDK userspace application.
The FIFO queues contain pointers to data packets in the DPDK. This:

* Provides a faster mechanism to interface with the kernel net stack and eliminates system calls

* Facilitates the DPDK using standard Linux* userspace net tools (tcpdump, ftp, and so on)

* Eliminates the copy_to_user and copy_from_user operations on packets.

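
As a rough illustration of why exchanging pointers avoids copying packet data,
the sketch below models a FIFO that stores only pointers.
It is a simplified, single-threaded model and not the KNI implementation:
the names ``ptr_fifo``, ``ptr_fifo_put`` and ``ptr_fifo_get`` and the capacity are hypothetical,
and a queue shared between user space and the kernel would also need memory barriers.

.. code-block:: c

    #include <stddef.h>

    #define FIFO_SIZE 8u /* hypothetical capacity; must be a power of two */

    /* Hypothetical FIFO of packet pointers (illustration only). */
    struct ptr_fifo {
        unsigned int write;     /* total number of pointers enqueued */
        unsigned int read;      /* total number of pointers dequeued */
        void *slots[FIFO_SIZE]; /* pointers only; packet data stays in place */
    };

    /* Enqueue up to n pointers; returns how many were accepted. */
    static unsigned int
    ptr_fifo_put(struct ptr_fifo *f, void **objs, unsigned int n)
    {
        unsigned int i;

        for (i = 0; i < n; i++) {
            if (f->write - f->read == FIFO_SIZE)
                break; /* FIFO is full */
            f->slots[f->write & (FIFO_SIZE - 1)] = objs[i];
            f->write++;
        }
        return i;
    }

    /* Dequeue up to n pointers; returns how many were retrieved. */
    static unsigned int
    ptr_fifo_get(struct ptr_fifo *f, void **objs, unsigned int n)
    {
        unsigned int i;

        for (i = 0; i < n; i++) {
            if (f->read == f->write)
                break; /* FIFO is empty */
            objs[i] = f->slots[f->read & (FIFO_SIZE - 1)];
            f->read++;
        }
        return i;
    }

Only the pointer values move through the queue; the consumer reads the packet
from the same memory the producer wrote it to.
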
The Kernel NIC Interface sample application is a simple example that demonstrates the use
of the DPDK to create a path for packets to go through the Linux* kernel.
This is done by creating one or more kernel net devices for each of the DPDK ports.
The application allows standard Linux* tools (ethtool, ifconfig, tcpdump) to be used with the DPDK ports
and also allows packets to be exchanged between the DPDK application and the Linux* kernel.

The Kernel NIC Interface sample application uses two threads in user space for each physical NIC port being used,
and, with the kernel module's support, allocates one or more KNI devices for each physical NIC port.
For a physical NIC port, one thread reads from the port and writes to the KNI devices,
and another thread reads from the KNI devices and writes the data unmodified to the physical NIC port.
It is recommended to configure one KNI device for each physical NIC port.
Configuring more than one KNI device for a physical NIC port is intended only for performance testing,
or for working together with VMDq support in the future.

The packet flow through the Kernel NIC Interface application is as shown in the following figure.

.. _figure_kernel_nic:

.. figure:: img/kernel_nic.*

   Kernel NIC Application Packet Flow

Compiling the Application
-------------------------

To compile the sample application see :doc:`compiling`.

The application is located in the ``kni`` sub-directory.

This application is intended for the linuxapp environment only.

Loading the Kernel Module
-------------------------

Loading the KNI kernel module without any parameters is the typical way a DPDK application
gets packets into and out of the kernel net stack.
In this mode, only one kernel thread is created to receive packets on the kernel side for all KNI devices:

.. code-block:: console

    #insmod rte_kni.ko

Pinning the kernel thread to a specific core can be done using a taskset command such as the following:

.. code-block:: console

    #taskset -p 100000 `pgrep --fl kni_thread | awk '{print $1}'`

This command pins the kni_thread kernel thread to lcore 20
(the mask 100000 is hexadecimal and has only bit 20 set; lcore numbering starts at 0),
so first check that this lcore is available on the board.
The command must be issued after the application has been launched, as insmod does not start the kni thread.

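
The mask given to taskset is a hexadecimal bitmask with one bit per lcore.
The following sketch shows the arithmetic only;
the helper name ``lcore_to_affinity_mask`` is hypothetical and is not part of the DPDK or of taskset.

.. code-block:: c

    #include <stdint.h>

    /*
     * Hypothetical helper: build an affinity mask with only the bit
     * for the given lcore set. Returns 0 for lcores that do not fit
     * in a 64-bit mask.
     */
    static uint64_t
    lcore_to_affinity_mask(unsigned int lcore)
    {
        return (lcore < 64) ? (UINT64_C(1) << lcore) : 0;
    }

For lcore 20 the helper returns 0x100000, which is the value 100000 passed to taskset above.
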
For optimum performance,
the lcore in the mask should be on the same socket as the lcores used in the KNI application.

To provide flexibility of performance, the KNI kernel module,
located in the kmod sub-directory of the DPDK target directory,
can be loaded with the kthread_mode parameter as follows:

* #insmod rte_kni.ko kthread_mode=single

  This mode creates only one kernel thread to receive packets on the kernel side for all KNI devices.
  Single kernel thread mode is the default.
  The core affinity of this kernel thread can be set using the Linux* taskset command.

* #insmod rte_kni.ko kthread_mode=multiple

  This mode creates one kernel thread per KNI device to receive packets on the kernel side.
  The core affinity of each kernel thread is set when the KNI device is created.
  The lcore ID for each kernel thread is provided on the command line used to launch the application.
  Multiple kernel thread mode can provide higher, more scalable performance.

To measure the throughput in a loopback mode, the KNI kernel module,
located in the kmod sub-directory of the DPDK target directory,
can be loaded with the lo_mode parameter as follows:

* #insmod rte_kni.ko lo_mode=lo_mode_fifo

  This loopback mode involves ring enqueue/dequeue operations in kernel space.

* #insmod rte_kni.ko lo_mode=lo_mode_fifo_skb

  This loopback mode involves ring enqueue/dequeue operations and sk buffer copies in kernel space.

Running the Application
-----------------------

The application requires a number of command line options:

.. code-block:: console

    kni [EAL options] -- -P -p PORTMASK --config="(port,lcore_rx,lcore_tx[,lcore_kthread,...])[,(port,lcore_rx,lcore_tx[,lcore_kthread,...])]"

Where:

* -P: Set all ports to promiscuous mode so that packets are accepted regardless of the packet's Ethernet MAC destination address.
  Without this option, only packets with the Ethernet MAC destination address set to the Ethernet address of the port are accepted.

* -p PORTMASK: Hexadecimal bitmask of ports to configure.

* --config="(port,lcore_rx,lcore_tx[,lcore_kthread,...])[,(port,lcore_rx,lcore_tx[,lcore_kthread,...])]":
  Determines which lcores for RX, TX and the kernel threads are mapped to which ports.

Refer to *DPDK Getting Started Guide* for general information on running applications and the Environment Abstraction Layer (EAL) options.

The -c coremask or -l corelist parameter of the EAL options should include the lcores indicated by lcore_rx and lcore_tx,
but does not need to include the lcores indicated by lcore_kthread, as those are only used to pin the kernel threads.
The -p PORTMASK parameter should include exactly the ports indicated by port in --config, neither more nor less.

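
The requirement that PORTMASK and --config name exactly the same ports can be expressed as a simple mask comparison.
The sketch below is illustrative only: the function name ``portmask_matches_config`` is hypothetical,
and the 32-bit mask width is an assumption of this example rather than a statement about the application.

.. code-block:: c

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Hypothetical check: returns 1 when portmask selects exactly the
     * ports listed in ports[], and 0 otherwise.
     */
    static int
    portmask_matches_config(uint32_t portmask, const uint16_t *ports,
                            size_t nb_ports)
    {
        uint32_t configured = 0;
        size_t i;

        for (i = 0; i < nb_ports; i++) {
            if (ports[i] >= 32)
                return 0; /* outside the 32-bit mask assumed here */
            configured |= UINT32_C(1) << ports[i];
        }
        return configured == portmask;
    }

With ports 0 and 1 in --config, a PORTMASK of 0x3 matches, while 0x1 (too few ports) and 0x7 (too many ports) do not.
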
The lcore_kthread field in --config can be given no, one or more lcore IDs.
In multiple kernel thread mode, if no lcore ID is given, one KNI device is allocated for each port
and no specific lcore affinity is set for its kernel thread.
If one or more lcore IDs are given, one KNI device is allocated per lcore ID for each port,
and each kernel thread is given the affinity of the corresponding lcore.
In single kernel thread mode, if no lcore ID is given, one KNI device is allocated for each port.
If one or more lcore IDs are given,
one KNI device is allocated per lcore ID for each port, but
no lcore affinity is set as there is only one kernel thread for all KNI devices.

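
The rules above reduce to two small decisions, sketched here under stated assumptions.
The enum and function names are hypothetical; the device count follows the ``nb_kni`` computation
shown in the allocation code of this guide, and the affinity rule restates the paragraph above.

.. code-block:: c

    /* Hypothetical representation of the kthread_mode module parameter. */
    enum kthread_mode { KTHREAD_SINGLE, KTHREAD_MULTIPLE };

    /* Number of KNI devices allocated for a port, given how many
     * lcore_kthread IDs were listed for it in --config. */
    static unsigned int
    nb_kni_for_port(unsigned int nb_lcore_k)
    {
        return nb_lcore_k ? nb_lcore_k : 1;
    }

    /* Whether the kernel threads of the port get a specific lcore affinity. */
    static int
    kthread_affinity_is_set(enum kthread_mode mode, unsigned int nb_lcore_k)
    {
        return mode == KTHREAD_MULTIPLE && nb_lcore_k > 0;
    }
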
For example, to run the application with two ports served by six lcores, with one lcore for RX, one lcore for TX,
and one lcore for the kernel thread of each port:

.. code-block:: console

    ./build/kni -l 4-7 -n 4 -- -P -p 0x3 --config="(0,4,6,8),(1,5,7,9)"

Once the KNI application is started, one can use different Linux* commands to manage the net interfaces.
If more than one KNI device is configured for a physical port,
only the first KNI device is paired to the physical device.
Operations on the other KNI devices do not affect the physical port handled by the user space application.

Assigning an IP address:

.. code-block:: console

    #ifconfig vEth0_0 192.168.0.1

Displaying the NIC registers:

.. code-block:: console

    #ethtool -d vEth0_0

Dumping the network traffic:

.. code-block:: console

    #tcpdump -i vEth0_0

Changing the MAC address:

.. code-block:: console

    #ifconfig vEth0_0 hw ether 0C:01:02:03:04:08

When the DPDK userspace application is closed, all the KNI devices are deleted from Linux*.

The following sections provide some explanation of code.

Setup of the mbuf pool, driver and queues is similar to the setup done in the :doc:`l2_forward_real_virtual`.
In addition, one or more kernel NIC interfaces are allocated for each
of the configured ports according to the command line parameters.

The code for allocating the kernel NIC interfaces for a specific port is as follows:


.. code-block:: c

    static int
    kni_alloc(uint16_t port_id)
    {
        uint8_t i;
        struct rte_kni *kni;
        struct rte_kni_conf conf;
        struct kni_port_params **params = kni_port_params_array;

        if (port_id >= RTE_MAX_ETHPORTS || !params[port_id])
            return -1;

        params[port_id]->nb_kni = params[port_id]->nb_lcore_k ? params[port_id]->nb_lcore_k : 1;

        for (i = 0; i < params[port_id]->nb_kni; i++) {
            /* Clear conf at first */
            memset(&conf, 0, sizeof(conf));
            if (params[port_id]->nb_lcore_k) {
                snprintf(conf.name, RTE_KNI_NAMESIZE, "vEth%u_%u", port_id, i);
                conf.core_id = params[port_id]->lcore_k[i];
                conf.force_bind = 1;
            } else {
                snprintf(conf.name, RTE_KNI_NAMESIZE, "vEth%u", port_id);
            }
            conf.group_id = (uint16_t)port_id;
            conf.mbuf_size = MAX_PACKET_SZ;

            /*
             * The first KNI device associated to a port
             * is the master, for multiple kernel thread
             * environment.
             */
            if (i == 0) {
                struct rte_kni_ops ops;
                struct rte_eth_dev_info dev_info;

                memset(&dev_info, 0, sizeof(dev_info));
                rte_eth_dev_info_get(port_id, &dev_info);

                conf.addr = dev_info.pci_dev->addr;
                conf.id = dev_info.pci_dev->id;

                /* Get the interface default mac address */
                rte_eth_macaddr_get(port_id, (struct ether_addr *)&conf.mac_addr);

                memset(&ops, 0, sizeof(ops));

                ops.port_id = port_id;
                ops.change_mtu = kni_change_mtu;
                ops.config_network_if = kni_config_network_interface;
                ops.config_mac_address = kni_config_mac_address;

                kni = rte_kni_alloc(pktmbuf_pool, &conf, &ops);
            } else {
                kni = rte_kni_alloc(pktmbuf_pool, &conf, NULL);
            }

            if (!kni)
                rte_exit(EXIT_FAILURE, "Fail to create kni for "
                         "port: %d\n", port_id);

            params[port_id]->kni[i] = kni;
        }

        return 0;
    }

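
The device names that result from the two ``snprintf()`` formats above can be checked in isolation.
This is a minimal sketch with a hypothetical helper, ``kni_device_name``, that uses only the C library
and mirrors the naming rule of the allocation code; it does not create any KNI device.

.. code-block:: c

    #include <stddef.h>
    #include <stdio.h>

    /*
     * Hypothetical helper mirroring the naming rule: when kernel thread
     * lcores were configured, the name carries the port ID and the index
     * of the KNI device; otherwise it carries the port ID only.
     */
    static int
    kni_device_name(char *buf, size_t len, unsigned int port_id,
                    unsigned int idx, unsigned int nb_lcore_k)
    {
        if (nb_lcore_k)
            return snprintf(buf, len, "vEth%u_%u", port_id, idx);
        return snprintf(buf, len, "vEth%u", port_id);
    }

For port 0 with two kernel thread lcores configured, the second device (index 1) is named ``vEth0_1``;
for port 3 with no kernel thread lcore configured, the single device is named ``vEth3``.
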
The other step in the initialization process that is unique to this sample application
is the association of each port with lcores for RX, TX and kernel threads.

* One lcore to read from the port and write to the associated one or more KNI devices

* Another lcore to read from one or more KNI devices and write to the port

* Other lcores on which to pin the kernel threads, one lcore per kernel thread

This is done by using the ``kni_port_params_array[]`` array, which is indexed by the port ID.
The code is as follows:


.. code-block:: c

    static int
    parse_config(const char *arg)
    {
        const char *p, *p0 = arg;
        char s[256], *end;
        unsigned size;
        enum fieldnames {
            FLD_PORT = 0,
            FLD_LCORE_RX,
            FLD_LCORE_TX,
            _NUM_FLD = KNI_MAX_KTHREAD + 3,
        };
        int i, j, nb_token;
        char *str_fld[_NUM_FLD];
        unsigned long int_fld[_NUM_FLD];
        uint16_t port_id, nb_kni_port_params = 0;

        memset(&kni_port_params_array, 0, sizeof(kni_port_params_array));

        while (((p = strchr(p0, '(')) != NULL) && nb_kni_port_params < RTE_MAX_ETHPORTS) {
            p++;
            if ((p0 = strchr(p, ')')) == NULL)
                goto fail;
            size = p0 - p;
            if (size >= sizeof(s)) {
                printf("Invalid config parameters\n");
                goto fail;
            }
            snprintf(s, sizeof(s), "%.*s", (int)size, p);
            nb_token = rte_strsplit(s, sizeof(s), str_fld, _NUM_FLD, ',');
            if (nb_token <= FLD_LCORE_TX) {
                printf("Invalid config parameters\n");
                goto fail;
            }
            for (i = 0; i < nb_token; i++) {
                errno = 0;
                int_fld[i] = strtoul(str_fld[i], &end, 0);
                if (errno != 0 || end == str_fld[i]) {
                    printf("Invalid config parameters\n");
                    goto fail;
                }
            }
            i = 0;
            /* port_id is 16 bits wide, so do not truncate it to 8 bits */
            port_id = (uint16_t)int_fld[i++];
            if (port_id >= RTE_MAX_ETHPORTS) {
                printf("Port ID %u could not exceed the maximum %u\n", port_id, RTE_MAX_ETHPORTS);
                goto fail;
            }
            if (kni_port_params_array[port_id]) {
                printf("Port %u has been configured\n", port_id);
                goto fail;
            }
            kni_port_params_array[port_id] = (struct kni_port_params*)rte_zmalloc("KNI_port_params", sizeof(struct kni_port_params), RTE_CACHE_LINE_SIZE);
            kni_port_params_array[port_id]->port_id = port_id;
            kni_port_params_array[port_id]->lcore_rx = (uint8_t)int_fld[i++];
            kni_port_params_array[port_id]->lcore_tx = (uint8_t)int_fld[i++];
            if (kni_port_params_array[port_id]->lcore_rx >= RTE_MAX_LCORE || kni_port_params_array[port_id]->lcore_tx >= RTE_MAX_LCORE) {
                printf("lcore_rx %u or lcore_tx %u ID could not "
                       "exceed the maximum %u\n",
                       kni_port_params_array[port_id]->lcore_rx, kni_port_params_array[port_id]->lcore_tx, RTE_MAX_LCORE);
                goto fail;
            }
            for (j = 0; i < nb_token && j < KNI_MAX_KTHREAD; i++, j++)
                kni_port_params_array[port_id]->lcore_k[j] = (uint8_t)int_fld[i];
            kni_port_params_array[port_id]->nb_lcore_k = j;
        }

        return 0;

    fail:
        for (i = 0; i < RTE_MAX_ETHPORTS; i++) {
            if (kni_port_params_array[i]) {
                rte_free(kni_port_params_array[i]);
                kni_port_params_array[i] = NULL;
            }
        }
        return -1;
    }

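
To experiment with the --config syntax outside the DPDK, the parsing of a single
``(port,lcore_rx,lcore_tx[,lcore_kthread,...])`` group can be sketched with the C library alone.
This is not the application's parser: ``struct port_cfg``, ``parse_group`` and ``MAX_KTHREAD`` are hypothetical
names, and the sketch performs no range checks against the DPDK limits.

.. code-block:: c

    #include <errno.h>
    #include <stdlib.h>
    #include <string.h>

    #define MAX_KTHREAD 32 /* hypothetical stand-in for KNI_MAX_KTHREAD */

    struct port_cfg {
        unsigned long port, lcore_rx, lcore_tx;
        unsigned long lcore_k[MAX_KTHREAD];
        unsigned int nb_lcore_k;
    };

    /* Parse the first "(...)" group in arg; returns 0 on success, -1 on error. */
    static int
    parse_group(const char *arg, struct port_cfg *cfg)
    {
        unsigned long fld[MAX_KTHREAD + 3];
        unsigned int n = 0, i;
        const char *p = strchr(arg, '(');
        char *end;

        if (p == NULL)
            return -1;
        p++;
        for (;;) {
            if (n == MAX_KTHREAD + 3)
                return -1; /* too many fields */
            errno = 0;
            fld[n] = strtoul(p, &end, 0);
            if (errno != 0 || end == p)
                return -1; /* not a number */
            n++;
            if (*end == ')')
                break;     /* end of the group */
            if (*end != ',')
                return -1; /* unexpected character or missing ')' */
            p = end + 1;
        }
        if (n < 3)
            return -1;     /* port, lcore_rx and lcore_tx are mandatory */

        cfg->port = fld[0];
        cfg->lcore_rx = fld[1];
        cfg->lcore_tx = fld[2];
        cfg->nb_lcore_k = n - 3;
        for (i = 0; i < cfg->nb_lcore_k; i++)
            cfg->lcore_k[i] = fld[3 + i];
        return 0;
    }

Parsing ``(0,4,6,8)`` yields port 0, lcore_rx 4, lcore_tx 6 and one kernel thread lcore, 8,
while ``(1,5)`` is rejected because lcore_tx is missing.
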
After the initialization steps are completed, the ``main_loop()`` function is run on each lcore.
This function first checks the lcore_id against the user provided lcore_rx and lcore_tx
to see if this lcore is reading from or writing to kernel NIC interfaces.

For the case that reads from a NIC port and writes to the kernel NIC interfaces,
the packet reception is the same as in the L2 Forwarding sample application
(see :ref:`l2_fwd_app_rx_tx_packets`).
The packet transmission is done by sending mbufs into the kernel NIC interfaces with ``rte_kni_tx_burst()``.
The KNI library automatically frees the mbufs after the kernel has successfully copied them.


.. code-block:: c

    /**
     * Interface to burst rx and enqueue mbufs into rx_q
     */
    static void
    kni_ingress(struct kni_port_params *p)
    {
        uint8_t i, nb_kni;
        uint16_t port_id;
        unsigned nb_rx, num;
        struct rte_mbuf *pkts_burst[PKT_BURST_SZ];

        if (p == NULL)
            return;

        nb_kni = p->nb_kni;
        port_id = p->port_id;
        for (i = 0; i < nb_kni; i++) {
            /* Burst rx from eth */
            nb_rx = rte_eth_rx_burst(port_id, 0, pkts_burst, PKT_BURST_SZ);
            if (unlikely(nb_rx > PKT_BURST_SZ)) {
                RTE_LOG(ERR, APP, "Error receiving from eth\n");
                return;
            }

            /* Burst tx to kni */
            num = rte_kni_tx_burst(p->kni[i], pkts_burst, nb_rx);
            kni_stats[port_id].rx_packets += num;
            rte_kni_handle_request(p->kni[i]);

            if (unlikely(num < nb_rx)) {
                /* Free mbufs not tx to kni interface */
                kni_burst_free_mbufs(&pkts_burst[num], nb_rx - num);
                kni_stats[port_id].rx_dropped += nb_rx - num;
            }
        }
    }

For the other case, which reads from the kernel NIC interfaces and writes to a physical NIC port,
packets are retrieved by reading mbufs from the kernel NIC interfaces with ``rte_kni_rx_burst()``.
The packet transmission is the same as in the L2 Forwarding sample application
(see :ref:`l2_fwd_app_rx_tx_packets`).


.. code-block:: c

    /**
     * Interface to dequeue mbufs from tx_q and burst tx
     */
    static void
    kni_egress(struct kni_port_params *p)
    {
        uint8_t i, nb_kni;
        uint16_t port_id;
        unsigned nb_tx, num;
        struct rte_mbuf *pkts_burst[PKT_BURST_SZ];

        if (p == NULL)
            return;
        nb_kni = p->nb_kni;
        port_id = p->port_id;
        for (i = 0; i < nb_kni; i++) {
            /* Burst rx from kni */
            num = rte_kni_rx_burst(p->kni[i], pkts_burst, PKT_BURST_SZ);
            if (unlikely(num > PKT_BURST_SZ)) {
                RTE_LOG(ERR, APP, "Error receiving from KNI\n");
                return;
            }

            /* Burst tx to eth */
            nb_tx = rte_eth_tx_burst(port_id, 0, pkts_burst, (uint16_t)num);
            kni_stats[port_id].tx_packets += nb_tx;
            if (unlikely(nb_tx < num)) {
                /* Free mbufs not tx to NIC */
                kni_burst_free_mbufs(&pkts_burst[nb_tx], num - nb_tx);
                kni_stats[port_id].tx_dropped += num - nb_tx;
            }
        }
    }

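
Both directions share the same pattern: transmit a burst, count what was accepted,
then free and count the unsent tail of the burst.
The sketch below isolates that pattern without any DPDK dependency.
All names in it (``forward_burst``, ``struct fwd_stats``, and the ``demo_`` stand-ins for the
transmit and free functions) are hypothetical and exist only for illustration.

.. code-block:: c

    #include <stddef.h>

    struct fwd_stats {
        unsigned long sent;
        unsigned long dropped;
    };

    typedef unsigned int (*tx_burst_fn)(void **pkts, unsigned int n);
    typedef void (*free_fn)(void *pkt);

    /* Transmit a burst; free and count every packet the transmit
     * function did not accept. tx() must return at most nb_rx. */
    static void
    forward_burst(void **pkts, unsigned int nb_rx, tx_burst_fn tx,
                  free_fn free_pkt, struct fwd_stats *st)
    {
        unsigned int num = tx(pkts, nb_rx);
        unsigned int i;

        st->sent += num;
        for (i = num; i < nb_rx; i++)
            free_pkt(pkts[i]);
        st->dropped += nb_rx - num;
    }

    /* Demo stand-ins: a transmit function that accepts at most two
     * packets, and a free function that only counts its calls. */
    static unsigned int demo_freed;

    static unsigned int
    demo_tx_accept_two(void **pkts, unsigned int n)
    {
        (void)pkts;
        return n < 2 ? n : 2;
    }

    static void
    demo_free(void *pkt)
    {
        (void)pkt;
        demo_freed++;
    }

Forwarding a burst of four packets through ``demo_tx_accept_two`` counts two packets as sent
and two as dropped, and ``demo_free`` is called once for each dropped packet.
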
Callbacks for Kernel Requests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To execute specific PMD operations in user space requested by some Linux* commands,
callbacks must be implemented and filled in the ``struct rte_kni_ops`` structure.
Currently, setting a new MTU, changing the MAC address, configuring promiscuous mode and
configuring the network interface (up/down) are supported.
Default implementations of the following callbacks are available in the rte_kni library,
so an application may choose not to implement them:

- ``config_mac_address``
- ``config_promiscusity``


.. code-block:: c

    static struct rte_kni_ops kni_ops = {
        .change_mtu = kni_change_mtu,
        .config_network_if = kni_config_network_interface,
        .config_mac_address = kni_config_mac_address,
        .config_promiscusity = kni_config_promiscusity,
    };

    /* Callback for request of changing MTU */
    static int
    kni_change_mtu(uint16_t port_id, unsigned new_mtu)
    {
        int ret;
        struct rte_eth_conf conf;

        RTE_LOG(INFO, APP, "Change MTU of port %d to %u\n", port_id, new_mtu);

        /* Stop specific port */
        rte_eth_dev_stop(port_id);

        memcpy(&conf, &port_conf, sizeof(conf));
        if (new_mtu > ETHER_MAX_LEN)
            conf.rxmode.jumbo_frame = 1;
        else
            conf.rxmode.jumbo_frame = 0;

        /* mtu + length of header + length of FCS = max pkt length */
        conf.rxmode.max_rx_pkt_len = new_mtu + KNI_ENET_HEADER_SIZE + KNI_ENET_FCS_SIZE;

        ret = rte_eth_dev_configure(port_id, 1, 1, &conf);
        if (ret < 0) {
            RTE_LOG(ERR, APP, "Fail to reconfigure port %d\n", port_id);
            return ret;
        }

        /* Restart specific port */
        ret = rte_eth_dev_start(port_id);
        if (ret < 0) {
            RTE_LOG(ERR, APP, "Fail to restart port %d\n", port_id);
            return ret;
        }
        return 0;
    }

    /* Callback for request of configuring network interface up/down */
    static int
    kni_config_network_interface(uint16_t port_id, uint8_t if_up)
    {
        int ret = 0;

        RTE_LOG(INFO, APP, "Configure network interface of %d %s\n",
                port_id, if_up ? "up" : "down");

        if (if_up != 0) {
            /* Configure network interface up */
            rte_eth_dev_stop(port_id);
            ret = rte_eth_dev_start(port_id);
        } else /* Configure network interface down */
            rte_eth_dev_stop(port_id);

        if (ret < 0)
            RTE_LOG(ERR, APP, "Failed to start port %d\n", port_id);
        return ret;
    }

    /* Callback for request of configuring device mac address (body not shown) */
    static int
    kni_config_mac_address(uint16_t port_id, uint8_t mac_addr[]);

    /* Callback for request of configuring promiscuous mode (body not shown) */
    static int
    kni_config_promiscusity(uint16_t port_id, uint8_t to_on);

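
The length arithmetic in ``kni_change_mtu()`` can be tried out on its own.
The sketch below is illustrative rather than authoritative: the constant values
(a 14-byte Ethernet header, a 4-byte FCS and a 1518-byte maximum standard frame)
are assumptions standing in for ``KNI_ENET_HEADER_SIZE``, ``KNI_ENET_FCS_SIZE`` and ``ETHER_MAX_LEN``,
and ``struct rx_len_cfg`` and ``rx_len_for_mtu`` are hypothetical names.

.. code-block:: c

    #include <stdint.h>

    #define ENET_HEADER_SIZE   14u   /* assumed: two MAC addresses + EtherType */
    #define ENET_FCS_SIZE      4u    /* assumed: frame check sequence */
    #define ENET_MAX_FRAME_LEN 1518u /* assumed value of ETHER_MAX_LEN */

    struct rx_len_cfg {
        uint32_t max_rx_pkt_len;
        int jumbo_frame;
    };

    /* Mirror of the callback's arithmetic: the jumbo flag is derived from
     * the MTU itself, and the maximum packet length adds the header and
     * FCS sizes to the MTU. */
    static struct rx_len_cfg
    rx_len_for_mtu(uint32_t new_mtu)
    {
        struct rx_len_cfg cfg;

        cfg.jumbo_frame = new_mtu > ENET_MAX_FRAME_LEN;
        cfg.max_rx_pkt_len = new_mtu + ENET_HEADER_SIZE + ENET_FCS_SIZE;
        return cfg;
    }

Under these assumptions, an MTU of 1500 gives a maximum packet length of 1518 with jumbo frames disabled,
and an MTU of 9000 gives 9018 with jumbo frames enabled.
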