..  SPDX-License-Identifier: BSD-3-Clause
    Copyright(c) 2010-2017 Intel Corporation.

Quota and Watermark Sample Application
======================================

The Quota and Watermark sample application is a simple example of packet
processing using the Data Plane Development Kit (DPDK). It showcases the use
of a quota as the maximum number of packets enqueued or dequeued at a time, and
of low and high thresholds, or watermarks, to signal low and high ring usage
respectively.

Additionally, it shows how the thresholds can be used to feed congestion notifications back to data producers by
temporarily stopping the processing of overloaded rings and sending Ethernet flow control frames.

This sample application is split into two parts:

* qw - The core quota and watermark sample application

* qwctl - A command line tool to alter quota and watermarks while qw is running

Overview
--------

The Quota and Watermark sample application performs forwarding for each packet that is received on a given port.
The destination port is the adjacent port from the enabled port mask, that is,
if the first four ports are enabled (port mask 0xf), ports 0 and 1 forward into each other,
and ports 2 and 3 forward into each other.
The MAC addresses of the forwarded Ethernet frames are not affected.

Internally, packets are pulled from the ports by the master logical core and put on a variable length processing pipeline,
whose stages are connected by rings, as shown in :numref:`figure_pipeline_overview`.

.. _figure_pipeline_overview:

.. figure:: img/pipeline_overview.*

   Pipeline Overview

An adjustable quota value controls how many packets are moved through the pipeline per enqueue and dequeue.
Adjustable threshold values associated with the rings control a back-off mechanism that
tries to prevent the pipeline from being overloaded by:

* Stopping enqueuing on rings for which the usage has crossed the high watermark threshold

* Sending Ethernet pause frames

* Only resuming enqueuing on a ring once its usage goes below a global low watermark threshold

This mechanism allows congestion notifications to go up the ring pipeline and
eventually lead to an Ethernet flow control frame being sent to the source.

On top of serving as an example of quota and watermark usage,
this application can be used to benchmark the performance of ring-based processing pipelines using a traffic generator,
as shown in :numref:`figure_ring_pipeline_perf_setup`.

.. _figure_ring_pipeline_perf_setup:

.. figure:: img/ring_pipeline_perf_setup.*

   Ring-based Processing Pipeline Performance Setup

Compiling the Application
-------------------------

To compile the sample application, see :doc:`compiling`.

The application is located in the ``quota_watermark`` sub-directory.

Running the Application
-----------------------

The core application, qw, has to be started first.

Once it is up and running, one can alter quota and watermarks while it runs using the control application, qwctl.

Running the Core Application
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The application requires a single command line option:

.. code-block:: console

    ./qw/build/qw [EAL options] -- -p PORTMASK

where:

``-p PORTMASK``: A hexadecimal bitmask of the ports to configure

To run the application in a linuxapp environment with four logical cores and ports 0 and 2,
issue the following command:

.. code-block:: console

    ./qw/build/qw -l 0-3 -n 4 -- -p 5

Refer to the *DPDK Getting Started Guide* for general information on running applications and
the Environment Abstraction Layer (EAL) options.

Running the Control Application
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The control application requires a number of command line options:

.. code-block:: console

    ./qwctl/build/qwctl [EAL options] --proc-type=secondary

The ``--proc-type=secondary`` option is necessary for the EAL to properly initialize the control application to
use the same huge pages as the core application and thus be able to access its rings.

To run the application in a linuxapp environment on logical core 0, issue the following command:

.. code-block:: console

    ./qwctl/build/qwctl -l 0 -n 4 --proc-type=secondary

Refer to the *DPDK Getting Started Guide* for general information on running applications and
the Environment Abstraction Layer (EAL) options.

qwctl is an interactive command line that lets the user change variables in a running instance of qw.
The ``help`` command gives a list of available commands.

Code Overview
-------------

The following sections provide a quick guide to the application's source code.

Core Application - qw
~~~~~~~~~~~~~~~~~~~~~

EAL and Drivers Setup
^^^^^^^^^^^^^^^^^^^^^

The EAL arguments are parsed at the beginning of the main() function:

.. code-block:: c

    ret = rte_eal_init(argc, argv);
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "Cannot initialize EAL\n");

Then, a call to init_dpdk(), defined in init.c, is made to initialize the poll mode drivers:

.. code-block:: c

    /* Bind the drivers to usable devices */

    ret = rte_pci_probe();
    if (ret < 0)
        rte_exit(EXIT_FAILURE, "rte_pci_probe(): error %d\n", ret);

    if (rte_eth_dev_count_avail() < 2)
        rte_exit(EXIT_FAILURE, "Not enough Ethernet port available\n");

To fully understand this code, it is recommended to study the chapters that relate to the *Poll Mode Driver*
in the *DPDK Getting Started Guide* and the *DPDK API Reference*.

Shared Variables Setup
^^^^^^^^^^^^^^^^^^^^^^

The quota and the high and low watermark shared variables are put into an rte_memzone using a call to setup_shared_variables():

.. code-block:: c

    void
    setup_shared_variables(void)
    {
        const struct rte_memzone *qw_memzone;

        qw_memzone = rte_memzone_reserve(QUOTA_WATERMARK_MEMZONE_NAME,
                        3 * sizeof(int), rte_socket_id(), 0);
        if (qw_memzone == NULL)
            rte_exit(EXIT_FAILURE, "%s\n", rte_strerror(rte_errno));

        quota = qw_memzone->addr;
        low_watermark = (unsigned int *) qw_memzone->addr + 1;
        high_watermark = (unsigned int *) qw_memzone->addr + 2;
    }

These three variables are initialized to a default value in main() and
can be changed while qw is running using the qwctl control program.
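
The layout of the shared block can be illustrated without DPDK. The following is a minimal sketch, assuming a plain static array stands in for the memory returned by the memzone; ``shared_block`` and ``map_shared_variables()`` are hypothetical names that are not part of the application. It shows that the cast binds before the addition, so ``+ 1`` and ``+ 2`` advance by whole ``unsigned int`` slots rather than by bytes:

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical stand-in for the three-slot area reserved in the memzone */
    static unsigned int shared_block[3];

    static int *quota;
    static unsigned int *low_watermark;
    static unsigned int *high_watermark;

    /* Point the three variables at consecutive slots of the shared area */
    static void
    map_shared_variables(void *addr)
    {
        assert(addr != NULL);

        quota = addr;                               /* slot 0 */
        low_watermark = (unsigned int *) addr + 1;  /* slot 1 */
        high_watermark = (unsigned int *) addr + 2; /* slot 2 */
    }

Any process that maps the same area and applies the same offsets sees the same three values, which is what allows a second process to change them at run time.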

Application Arguments
^^^^^^^^^^^^^^^^^^^^^

The qw application only takes one argument: a port mask that specifies which ports should be used by the application.
At least two ports are needed to run the application and there should be an even number of ports given in the port mask.

The port mask parsing is done in parse_qw_args(), defined in args.c.
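
The constraints on the port mask can be checked with a few lines of standard C. The following is a minimal sketch and not the code of parse_qw_args(); the helper names ``parse_portmask_sketch()``, ``count_ports()`` and ``portmask_is_usable()`` are hypothetical. It parses a hexadecimal mask and verifies that it enables an even number of ports, with a minimum of two:

.. code-block:: c

    #include <assert.h>
    #include <errno.h>
    #include <stdlib.h>

    /* Parse a hexadecimal port mask; returns 0 if the string is malformed */
    static unsigned long
    parse_portmask_sketch(const char *arg)
    {
        char *end = NULL;
        unsigned long mask;

        assert(arg != NULL);

        errno = 0;
        mask = strtoul(arg, &end, 16);
        if (errno != 0 || end == arg || *end != '\0')
            return 0;

        return mask;
    }

    /* Count the ports enabled in the mask, clearing one set bit per iteration */
    static unsigned int
    count_ports(unsigned long mask)
    {
        unsigned int n = 0;

        for (; mask != 0; mask &= mask - 1)
            n++;

        return n;
    }

    /* At least two ports, and an even number of them */
    static int
    portmask_is_usable(unsigned long mask)
    {
        unsigned int n = count_ports(mask);

        return n >= 2 && n % 2 == 0;
    }

With this sketch, the mask ``5`` used in the example command line (ports 0 and 2) is accepted, whereas ``7`` (three ports) is rejected.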

Mbuf Pool Initialization
^^^^^^^^^^^^^^^^^^^^^^^^

Once the application's arguments are parsed, an mbuf pool is created.
It contains a set of mbuf objects that are used by the driver and the application to store network packets:

.. code-block:: c

    /* Create a pool of mbuf to store packets */
    mbuf_pool = rte_pktmbuf_pool_create("mbuf_pool", MBUF_PER_POOL, 32, 0,
                                        MBUF_DATA_SIZE, rte_socket_id());

    if (mbuf_pool == NULL)
        rte_panic("%s\n", rte_strerror(rte_errno));

The rte_mempool is a generic structure used to handle pools of objects.
In this case, it is necessary to create a pool that will be used by the driver.

The number of allocated pkt mbufs is MBUF_PER_POOL, with a data room size
of MBUF_DATA_SIZE each.
A per-lcore cache of 32 mbufs is kept.
The memory is allocated on the master lcore's socket, but it is possible to extend this code to allocate one mbuf pool per socket.

The rte_pktmbuf_pool_create() function uses the default mbuf pool and mbuf
initializers, respectively rte_pktmbuf_pool_init() and rte_pktmbuf_init().
An advanced application may want to use the mempool API to create the
mbuf pool with more control.

Ports Configuration and Pairing
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Each port in the port mask is configured and a corresponding ring is created in the master lcore's array of rings.
This ring is the first in the pipeline and will hold the packets directly coming from the port.

.. code-block:: c

    for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++)
        if (is_bit_set(port_id, portmask)) {
            configure_eth_port(port_id);
            init_ring(master_lcore_id, port_id);
        }

The configure_eth_port() and init_ring() functions are used to configure a port and a ring respectively and are defined in init.c.
They make use of the DPDK APIs defined in rte_eth.h and rte_ring.h.

pair_ports() builds the port_pairs[] array so that its key-value pairs are a mapping between reception and transmission ports.
It is defined in init.c.
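
The pairing rule described in the overview (adjacent enabled ports forward into each other) can be sketched as follows. This is an illustration only and not the implementation found in init.c; ``pair_ports_sketch()`` and ``MAX_PORTS`` are hypothetical names, and the mask is assumed to fit in 32 bits:

.. code-block:: c

    #include <assert.h>
    #include <stdint.h>

    #define MAX_PORTS 32 /* hypothetical limit; the application uses RTE_MAX_ETHPORTS */

    /* Pair each enabled port with the next enabled port in the mask */
    static void
    pair_ports_sketch(uint32_t portmask, uint16_t pairs[MAX_PORTS])
    {
        int first = -1; /* first port of the pair being built, -1 if none */
        uint16_t port;

        assert(pairs != NULL);

        for (port = 0; port < MAX_PORTS; port++) {
            if (!(portmask & (UINT32_C(1) << port)))
                continue;

            if (first < 0) {
                first = port;
            } else {
                /* Both directions are recorded so each port forwards to its peer */
                pairs[first] = port;
                pairs[port] = (uint16_t) first;
                first = -1;
            }
        }
    }

With the port mask 0xf this yields the pairs (0, 1) and (2, 3); with the port mask 0x5 ports 0 and 2 are paired together.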

Logical Cores Assignment
^^^^^^^^^^^^^^^^^^^^^^^^

The application uses the master logical core to poll all the ports for new packets and enqueue them on a ring associated with the port.

Each logical core other than the master and the last one runs pipeline_stage() after a ring for each used port is initialized on that core.
pipeline_stage() on core X dequeues packets from core X-1's rings and enqueues them on its own rings. See :numref:`figure_threads_pipelines`.

.. code-block:: c

    /* Start pipeline_stage() on all the available slave lcore but the last */

    for (lcore_id = 0 ; lcore_id < last_lcore_id; lcore_id++) {
        if (rte_lcore_is_enabled(lcore_id) && lcore_id != master_lcore_id) {
            for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++)
                if (is_bit_set(port_id, portmask))
                    init_ring(lcore_id, port_id);

            rte_eal_remote_launch(pipeline_stage, NULL, lcore_id);
        }
    }

The last available logical core runs send_stage(),
which is the last stage of the pipeline, dequeuing packets from the last ring in the pipeline and
sending them out on the destination port set up by pair_ports().

.. code-block:: c

    /* Start send_stage() on the last slave core */

    rte_eal_remote_launch(send_stage, NULL, last_lcore_id);
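
The resulting assignment of stages to logical cores can be summarized with a small stand-alone sketch. It assumes that the enabled cores are described by a 64-bit mask and that the master core is not the highest enabled core; ``enum stage``, ``last_enabled_lcore()`` and ``stage_for_lcore()`` are hypothetical names used for illustration only:

.. code-block:: c

    #include <assert.h>
    #include <stdint.h>

    enum stage { STAGE_NONE, STAGE_RECEIVE, STAGE_PIPELINE, STAGE_SEND };

    /* Highest enabled lcore in the mask, or -1 if the mask is empty */
    static int
    last_enabled_lcore(uint64_t coremask)
    {
        int i;

        for (i = 63; i >= 0; i--)
            if (coremask & (UINT64_C(1) << i))
                return i;

        return -1;
    }

    /* Stage run by a given lcore: master receives, last sends, others forward */
    static enum stage
    stage_for_lcore(uint64_t coremask, int master, int lcore)
    {
        assert(lcore >= 0 && lcore < 64);

        if (!(coremask & (UINT64_C(1) << lcore)))
            return STAGE_NONE;
        if (lcore == master)
            return STAGE_RECEIVE;
        if (lcore == last_enabled_lcore(coremask))
            return STAGE_SEND;

        return STAGE_PIPELINE;
    }

For the example command line (``-l 0-3``), assuming lcore 0 is the master, lcore 0 receives, lcores 1 and 2 each run one pipeline stage and lcore 3 sends. Enabling more logical cores therefore lengthens the pipeline.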

Receive, Process and Transmit Packets
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. _figure_threads_pipelines:

.. figure:: img/threads_pipelines.*

   Threads and Pipelines

In the receive_stage() function running on the master logical core,
the main task is to read ingress packets from the RX ports and enqueue them
on the port's corresponding first ring in the pipeline.
This is done using the following code:

.. code-block:: c

    lcore_id = rte_lcore_id();

    /* Process each port round robin style */

    for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
        if (!is_bit_set(port_id, portmask))
            continue;

        ring = rings[lcore_id][port_id];

        if (ring_state[port_id] != RING_READY) {
            if (rte_ring_count(ring) > *low_watermark)
                continue;
            else
                ring_state[port_id] = RING_READY;
        }

        /* Enqueue received packets on the RX ring */
        nb_rx_pkts = rte_eth_rx_burst(port_id, 0, pkts,
                (uint16_t) *quota);
        ret = rte_ring_enqueue_bulk(ring, (void *) pkts,
                nb_rx_pkts, &free);
        if (RING_SIZE - free > *high_watermark) {
            ring_state[port_id] = RING_OVERLOADED;
            send_pause_frame(port_id, 1337);
        }

        if (ret == 0) {

            /*
             * Return mbufs to the pool,
             * effectively dropping packets
             */
            for (i = 0; i < nb_rx_pkts; i++)
                rte_pktmbuf_free(pkts[i]);
        }
    }

For each port in the port mask, the corresponding ring's pointer is fetched into ring and that ring's state is checked:

* If it is in the RING_READY state, up to \*quota packets are grabbed from the port and put on the ring.
  Should this operation make the ring's usage cross its high watermark,
  the ring is marked as overloaded and an Ethernet flow control frame is sent to the source.

* If it is not in the RING_READY state, this port is ignored until the ring's usage drops to the \*low_watermark value or below.
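
The two watermarks form a hysteresis: a ring stops accepting packets once its usage exceeds the high watermark, and only resumes once the usage has fallen back to the low watermark. The following stand-alone sketch models this behavior with a simple counter in place of an rte_ring. ``struct toy_ring``, ``toy_enqueue_bulk()`` and ``produce()`` are hypothetical names, the small ``RING_SIZE`` is chosen for readability, and the watermarks are assumed to be expressed as numbers of ring entries:

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>

    #define RING_SIZE 16 /* hypothetical capacity for this sketch */

    enum ring_state { RING_READY, RING_OVERLOADED };

    struct toy_ring {
        unsigned int count;    /* entries currently in the ring */
        enum ring_state state;
    };

    /* All-or-nothing enqueue: returns n on success, 0 if n entries do not fit */
    static unsigned int
    toy_enqueue_bulk(struct toy_ring *r, unsigned int n, unsigned int *free_space)
    {
        unsigned int enqueued = 0;

        if (n <= RING_SIZE - r->count) {
            r->count += n;
            enqueued = n;
        }
        *free_space = RING_SIZE - r->count;

        return enqueued;
    }

    /* One producer iteration; returns the number of packets put on the ring */
    static unsigned int
    produce(struct toy_ring *r, unsigned int quota,
            unsigned int low_wm, unsigned int high_wm)
    {
        unsigned int free_space, enqueued;

        assert(r != NULL && low_wm <= high_wm && high_wm <= RING_SIZE);

        if (r->state != RING_READY) {
            if (r->count > low_wm)
                return 0;          /* still backing off */
            r->state = RING_READY; /* usage fell to the low watermark */
        }

        enqueued = toy_enqueue_bulk(r, quota, &free_space);
        if (RING_SIZE - free_space > high_wm)
            r->state = RING_OVERLOADED;

        return enqueued;
    }

In the application, the transition to the overloaded state in receive_stage() is also the point where the pause frame is sent. The sketch leaves that side effect out.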

The pipeline_stage() function's task is to process and move packets from the preceding pipeline stage.
This thread runs on most of the logical cores to create an arbitrarily long pipeline.

.. code-block:: c

    lcore_id = rte_lcore_id();

    previous_lcore_id = get_previous_lcore_id(lcore_id);

    for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
        if (!is_bit_set(port_id, portmask))
            continue;

        tx = rings[lcore_id][port_id];
        rx = rings[previous_lcore_id][port_id];

        if (ring_state[port_id] != RING_READY) {
            if (rte_ring_count(tx) > *low_watermark)
                continue;
            else
                ring_state[port_id] = RING_READY;
        }

        /* Dequeue up to quota mbuf from rx */
        nb_dq_pkts = rte_ring_dequeue_burst(rx, pkts,
                *quota, NULL);
        if (unlikely(nb_dq_pkts < 0))
            continue;

        /* Enqueue them on tx */
        ret = rte_ring_enqueue_bulk(tx, pkts,
                nb_dq_pkts, &free);
        if (RING_SIZE - free > *high_watermark)
            ring_state[port_id] = RING_OVERLOADED;

        if (ret == 0) {

            /*
             * Return mbufs to the pool,
             * effectively dropping packets
             */
            for (i = 0; i < nb_dq_pkts; i++)
                rte_pktmbuf_free(pkts[i]);
        }
    }

The thread's logic works mostly like receive_stage(),
except that packets are moved from ring to ring instead of port to ring.

In this example, no actual processing is done on the packets,
but pipeline_stage() is an ideal place to perform any processing required by the application.

Finally, the send_stage() function's task is to read packets from the last ring in a pipeline and
send them on the destination port defined in the port_pairs[] array.
It runs on the last available logical core only.

.. code-block:: c

    lcore_id = rte_lcore_id();

    previous_lcore_id = get_previous_lcore_id(lcore_id);

    for (port_id = 0; port_id < RTE_MAX_ETHPORTS; port_id++) {
        if (!is_bit_set(port_id, portmask))
            continue;

        dest_port_id = port_pairs[port_id];
        tx = rings[previous_lcore_id][port_id];

        if (rte_ring_empty(tx))
            continue;

        /* Dequeue packets from tx and send them */

        nb_dq_pkts = rte_ring_dequeue_burst(tx, (void *) tx_pkts, *quota, NULL);
        nb_tx_pkts = rte_eth_tx_burst(dest_port_id, 0, tx_pkts, nb_dq_pkts);
    }

For each port in the port mask, up to \*quota packets are pulled from the last ring in its pipeline and
sent on the destination port paired with the current port.

Control Application - qwctl
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The qwctl application uses the rte_cmdline library to provide the user with an interactive command line that
can be used to modify and inspect parameters in a running qw application.
Those parameters are the global quota, low_watermark and high_watermark values.

The available commands are defined in commands.c.

It is advised to use the cmdline sample application user guide as a reference for everything related to the rte_cmdline library.

Accessing Shared Variables
^^^^^^^^^^^^^^^^^^^^^^^^^^

The setup_shared_variables() function retrieves the shared variables quota,
low_watermark and high_watermark from the rte_memzone previously created by qw.

.. code-block:: c

    void
    setup_shared_variables(void)
    {
        const struct rte_memzone *qw_memzone;

        qw_memzone = rte_memzone_lookup(QUOTA_WATERMARK_MEMZONE_NAME);
        if (qw_memzone == NULL)
            rte_exit(EXIT_FAILURE, "Couldn't find memzone\n");

        quota = qw_memzone->addr;

        low_watermark = (unsigned int *) qw_memzone->addr + 1;
        high_watermark = (unsigned int *) qw_memzone->addr + 2;
    }