l2_forward_job_stats.rst

   1 ..  BSD LICENSE
   2     Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
   3     All rights reserved.
   4
   5     Redistribution and use in source and binary forms, with or without
   6     modification, are permitted provided that the following conditions
   7     are met:
   8
   9     * Redistributions of source code must retain the above copyright
  10     notice, this list of conditions and the following disclaimer.
  11     * Redistributions in binary form must reproduce the above copyright
  12     notice, this list of conditions and the following disclaimer in
  13     the documentation and/or other materials provided with the
  14     distribution.
  15     * Neither the name of Intel Corporation nor the names of its
  16     contributors may be used to endorse or promote products derived
  17     from this software without specific prior written permission.
  18
  19     THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
  20     "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
  21     LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
  22     A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
  23     OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
  24     SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
  25     LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
  26     DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
  27     THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
  28     (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
  29     OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  30
  31 L2 Forwarding Sample Application (in Real and Virtualized Environments) with Core Load Statistics
  32 ==================================================================================================
  33
  34 The L2 Forwarding sample application is a simple example of packet processing using
  35 the Data Plane Development Kit (DPDK) which
  36 also takes advantage of Single Root I/O Virtualization (SR-IOV) features in a virtualized environment.
  37
  38 .. note::
  39
  40     This application is a variation of the L2 Forwarding sample application. It demonstrates one possible
  41     way of using the job stats library; therefore, some parts of this document are identical to the original
  42     L2 Forwarding application guide.
  43
  44 Overview
  45 --------
  46
  47 The L2 Forwarding sample application, which can operate in real and virtualized environments,
  48 performs L2 forwarding for each packet that is received.
  49 The destination port is the adjacent port from the enabled portmask, that is,
  50 if the first four ports are enabled (portmask 0xf),
  51 ports 1 and 2 forward into each other, and ports 3 and 4 forward into each other.
  52 Also, the MAC addresses are affected as follows:
  53
  54 *   The source MAC address is replaced by the TX port MAC address
  55
  56 *   The destination MAC address is replaced by 02:00:00:00:00:TX_PORT_ID
  57
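The pairing rule can be illustrated with a short, self-contained C sketch. It does not use the DPDK: MAX_PORTS and pair_ports() are hypothetical names chosen for this example, and returning -1 for an odd number of enabled ports is a choice made here. Ports are numbered from 0, so the pairs described above as ports 1/2 and 3/4 appear as 0/1 and 2/3.

.. code-block:: c

    #include <stdint.h>

    #define MAX_PORTS 32 /* hypothetical limit for this sketch */

    /*
     * Pair each enabled port with the adjacent enabled port in the mask.
     * dst[p] receives the destination port for source port p.
     * Returns the number of enabled ports, or -1 if that number is odd
     * (the last port would have no partner).
     */
    static int
    pair_ports(uint32_t portmask, unsigned dst[MAX_PORTS])
    {
        unsigned portid, last_port = 0;
        int nb_ports_in_mask = 0;

        for (portid = 0; portid < MAX_PORTS; portid++)
            dst[portid] = 0;

        for (portid = 0; portid < MAX_PORTS; portid++) {
            /* skip ports that are not enabled */
            if ((portmask & (UINT32_C(1) << portid)) == 0)
                continue;

            if (nb_ports_in_mask % 2) {
                /* second port of a pair: link both directions */
                dst[portid] = last_port;
                dst[last_port] = portid;
            } else {
                /* first port of a pair: remember it */
                last_port = portid;
            }
            nb_ports_in_mask++;
        }
        return (nb_ports_in_mask % 2) ? -1 : nb_ports_in_mask;
    }

With portmask 0xf, pair_ports() fills dst[] with 1, 0, 3, 2 for ports 0 to 3 and returns 4.
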
  58 This application can be used to benchmark performance using a traffic generator, as shown in :numref:`figure_l2_fwd_benchmark_setup_jobstats`.
  59
  60 The application can also be used in a virtualized environment as shown in :numref:`figure_l2_fwd_virtenv_benchmark_setup_jobstats`.
  61
  62 The L2 Forwarding application can also be used as a starting point for developing a new application based on the DPDK.
  63
  64 .. _figure_l2_fwd_benchmark_setup_jobstats:
  65
  66 .. figure:: img/l2_fwd_benchmark_setup.*
  67
  68    Performance Benchmark Setup (Basic Environment)
  69
  70 .. _figure_l2_fwd_virtenv_benchmark_setup_jobstats:
  71
  72 .. figure:: img/l2_fwd_virtenv_benchmark_setup.*
  73
  74    Performance Benchmark Setup (Virtualized Environment)
  75
  76
  77 Virtual Function Setup Instructions
  78 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  79
  80 This application can use the virtual function available in the system and
  81 therefore can be used in a virtual machine without passing through
  82 the whole Network Device into a guest machine in a virtualized scenario.
  83 The virtual functions can be enabled in the host machine or the hypervisor with the respective physical function driver.
  84
  85 For example, in a Linux* host machine, it is possible to enable a virtual function using the following command:
  86
  87 .. code-block:: console
  88
  89     modprobe ixgbe max_vfs=2,2
  90
  91 This command enables two Virtual Functions on each Physical Function of a NIC
  92 with two physical ports in the PCI configuration space.
  93 Note that Virtual Functions 0 and 2 belong to Physical Function 0
  94 and Virtual Functions 1 and 3 belong to Physical Function 1,
  95 which gives a total of four Virtual Functions.
  96
  97 Compiling the Application
  98 -------------------------
  99
 100 #.  Go to the example directory:
 101
 102     .. code-block:: console
 103
 104         export RTE_SDK=/path/to/rte_sdk && cd ${RTE_SDK}/examples/l2fwd-jobstats
 105
 106 #.  Set the target (a default target is used if not specified). For example:
 107
 108     .. code-block:: console
 109
 110         export RTE_TARGET=x86_64-native-linuxapp-gcc
 111
 112     See the *DPDK Getting Started Guide* for possible RTE_TARGET values.
 113
 114 #.  Build the application:
 115
 116     .. code-block:: console
 117
 118         make
 119
 120 Running the Application
 121 -----------------------
 122
 123 The application requires a number of command line options:
 124
 125 .. code-block:: console
 126
 127     ./build/l2fwd-jobstats [EAL options] -- -p PORTMASK [-q NQ] [-l]
 128
 129 where,
 130
 131 *   -p PORTMASK: A hexadecimal bitmask of the ports to configure
 132
 133 *   -q NQ: The number of queues (=ports) per lcore (default is 1)
 134
 135 *   -l: Use the locale thousands separator when formatting big numbers
 136
 137 To run the application in a linuxapp environment with 4 lcores, 16 ports, 8 RX queues per lcore and
 138 thousands separator printing, issue the command:
 139
 140 .. code-block:: console
 141
 142     $ ./build/l2fwd-jobstats -c f -n 4 -- -q 8 -p ffff -l
 143
 144 Refer to the *DPDK Getting Started Guide* for general information on running applications
 145 and the Environment Abstraction Layer (EAL) options.
 146
 147 Explanation
 148 -----------
 149
 150 The following sections provide some explanation of the code.
 151
 152 Command Line Arguments
 153 ~~~~~~~~~~~~~~~~~~~~~~
 154
 155 The L2 Forwarding sample application takes specific parameters,
 156 in addition to Environment Abstraction Layer (EAL) arguments (see "Running the Application" above).
 157 The preferred way to parse parameters is to use the getopt() function,
 158 since it is part of a well-defined and portable library.
 159
 160 The parsing of arguments is done in the l2fwd_parse_args() function.
 161 The method of argument parsing is not described here.
 162 Refer to the *glibc getopt(3)* man page for details.
 163
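As a rough illustration of getopt()-based parsing, the sketch below handles the three options listed in "Running the Application". It is not the application's l2fwd_parse_args() function: struct app_opts and parse_app_args() are hypothetical names, and the validation rules (for example, treating a missing -p as an error) are assumptions made for this example.

.. code-block:: c

    #define _POSIX_C_SOURCE 200809L /* for getopt() under strict C modes */
    #include <stdlib.h>
    #include <unistd.h>

    struct app_opts {                   /* hypothetical container */
        unsigned long portmask;         /* -p: hexadecimal port bitmask */
        unsigned long queues_per_lcore; /* -q: queues (=ports) per lcore */
        int locale_separator;           /* -l: thousands separator flag */
    };

    /* Returns 0 on success, -1 on a missing or malformed option. */
    static int
    parse_app_args(int argc, char **argv, struct app_opts *o)
    {
        char *end;
        int opt;

        o->portmask = 0;
        o->queues_per_lcore = 1; /* documented default */
        o->locale_separator = 0;

        optind = 1; /* start scanning at argv[1] */
        while ((opt = getopt(argc, argv, "p:q:l")) != -1) {
            switch (opt) {
            case 'p':
                o->portmask = strtoul(optarg, &end, 16);
                if (end == optarg || *end != '\0')
                    return -1;
                break;
            case 'q':
                o->queues_per_lcore = strtoul(optarg, &end, 10);
                if (end == optarg || *end != '\0' ||
                        o->queues_per_lcore == 0)
                    return -1;
                break;
            case 'l':
                o->locale_separator = 1;
                break;
            default:
                return -1; /* unknown option or missing argument */
            }
        }
        return o->portmask != 0 ? 0 : -1; /* -p is treated as mandatory */
    }

Called with the arguments "-q 8 -p ffff -l", this sketch yields a portmask of 0xffff, 8 queues per lcore and the separator flag set.
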
 164 EAL arguments are parsed first, then application-specific arguments.
 165 This is done at the beginning of the main() function:
 166
 167 .. code-block:: c
 168
 169     /* init EAL */
 170
 171     ret = rte_eal_init(argc, argv);
 172     if (ret < 0)
 173         rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
 174
 175     argc -= ret;
 176     argv += ret;
 177
 178     /* parse application arguments (after the EAL ones) */
 179
 180     ret = l2fwd_parse_args(argc, argv);
 181     if (ret < 0)
 182         rte_exit(EXIT_FAILURE, "Invalid L2FWD arguments\n");
 183
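The hand-over from EAL parsing to application parsing can be pictured without the DPDK. In the hypothetical sketch below, skip_eal_args() stands in for the value returned by rte_eal_init(): it returns how many leading arguments to skip so that the application part of the command line starts at the "--" separator. The function name and the explicit search for "--" are assumptions for illustration only.

.. code-block:: c

    #include <string.h>

    /* Return the index of the first "--" argument, or -1 if there is none. */
    static int
    skip_eal_args(int argc, char **argv)
    {
        int i;

        for (i = 1; i < argc; i++) {
            if (strcmp(argv[i], "--") == 0)
                return i;
        }
        return -1;
    }

After argc -= ret and argv += ret, the shifted argv[0] takes the place of the program name, so an application parser based on getopt() starts at the first application option.
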
 184 Mbuf Pool Initialization
 185 ~~~~~~~~~~~~~~~~~~~~~~~~
 186
 187 Once the arguments are parsed, the mbuf pool is created.
 188 The mbuf pool contains a set of mbuf objects that will be used by the driver
 189 and the application to store network packet data:
 190
 191 .. code-block:: c
 192
 193     /* create the mbuf pool */
 194     l2fwd_pktmbuf_pool =
 195         rte_mempool_create("mbuf_pool", NB_MBUF,
 196                    MBUF_SIZE, 32,
 197                    sizeof(struct rte_pktmbuf_pool_private),
 198                    rte_pktmbuf_pool_init, NULL,
 199                    rte_pktmbuf_init, NULL,
 200                    rte_socket_id(), 0);
 201
 202     if (l2fwd_pktmbuf_pool == NULL)
 203         rte_exit(EXIT_FAILURE, "Cannot init mbuf pool\n");
 204
 205 The rte_mempool is a generic structure used to handle pools of objects.
 206 In this case, it is necessary to create a pool that will be used by the driver,
 207 which expects to have some reserved space in the mempool structure,
 208 sizeof(struct rte_pktmbuf_pool_private) bytes.
 209 The number of allocated pkt mbufs is NB_MBUF, with a size of MBUF_SIZE each.
 210 A per-lcore cache of 32 mbufs is kept.
 211 The memory is allocated on the socket returned by rte_socket_id(),
 212 but it is possible to extend this code to allocate one mbuf pool per socket.
 213
 214 Two callback pointers are also given to the rte_mempool_create() function:
 215
 216 *   The first callback pointer is to rte_pktmbuf_pool_init() and is used
 217     to initialize the private data of the mempool, which is needed by the driver.
 218     This function is provided by the mbuf API, but can be copied and extended by the developer.
 219
 220 *   The second callback pointer given to rte_mempool_create() is the mbuf initializer.
 221     The default is used, that is, rte_pktmbuf_init(), which is provided in the rte_mbuf library.
 222     If a more complex application wants to extend the rte_pktmbuf structure for its own needs,
 223     a new function derived from rte_pktmbuf_init() can be created.
 224
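The role of the object-initializer callback can be shown with a toy fixed-size pool. This is a simplified sketch, not the rte_mempool implementation: it has no per-lcore cache, no private data area and no NUMA awareness, and toy_pool, toy_buf and the other names are hypothetical.

.. code-block:: c

    #include <stddef.h>
    #include <stdlib.h>

    struct toy_buf {            /* hypothetical packet buffer */
        unsigned refcnt;
        unsigned char data[64];
    };

    /* Object initializer, called once per object when the pool is created. */
    typedef void (*toy_obj_init_t)(void *obj, void *arg);

    struct toy_pool {
        unsigned n_free;        /* objects currently available */
        void **free_objs;       /* stack of free object pointers */
        unsigned char *mem;     /* backing storage for all objects */
    };

    static void
    toy_buf_init(void *obj, void *arg)
    {
        struct toy_buf *b = obj;

        (void)arg;              /* unused in this sketch */
        b->refcnt = 1;
    }

    /* elt_size must be a multiple of the object's alignment, e.g. sizeof(type). */
    static struct toy_pool *
    toy_pool_create(unsigned n, size_t elt_size, toy_obj_init_t init, void *arg)
    {
        struct toy_pool *p = malloc(sizeof(*p));
        unsigned i;

        if (p == NULL)
            return NULL;
        p->mem = malloc((size_t)n * elt_size);
        p->free_objs = malloc((size_t)n * sizeof(*p->free_objs));
        if (p->mem == NULL || p->free_objs == NULL) {
            free(p->mem);
            free(p->free_objs);
            free(p);
            return NULL;
        }
        for (i = 0; i < n; i++) {
            void *obj = p->mem + (size_t)i * elt_size;

            if (init != NULL)
                init(obj, arg); /* same idea as the mbuf initializer */
            p->free_objs[i] = obj;
        }
        p->n_free = n;
        return p;
    }

    static void *
    toy_pool_get(struct toy_pool *p)
    {
        return p->n_free > 0 ? p->free_objs[--p->n_free] : NULL;
    }

    static void
    toy_pool_put(struct toy_pool *p, void *obj)
    {
        p->free_objs[p->n_free++] = obj;
    }

    static void
    toy_pool_free(struct toy_pool *p)
    {
        free(p->free_objs);
        free(p->mem);
        free(p);
    }

Because every object is initialized once at creation time, toy_pool_get() can hand out ready-to-use buffers without any per-allocation setup, which is the reason for passing an initializer to the pool constructor.
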
 225 Driver Initialization
 226 ~~~~~~~~~~~~~~~~~~~~~
 227
 228 The main part of the code in the main() function relates to the initialization of the driver.
 229 To fully understand this code, it is recommended to study the chapters related to the Poll Mode Driver
 230 in the *DPDK Programmer's Guide* and the *DPDK API Reference*.
 231
 232 .. code-block:: c
 233
 234     nb_ports = rte_eth_dev_count();
 235
 236     if (nb_ports == 0)
 237         rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n");
 238
 239     if (nb_ports > RTE_MAX_ETHPORTS)
 240         nb_ports = RTE_MAX_ETHPORTS;
 241
 242     /* reset l2fwd_dst_ports */
 243
 244     for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++)
 245         l2fwd_dst_ports[portid] = 0;
 246
 247     last_port = 0;
 248
 249     /*
 250      * Each logical core is assigned a dedicated TX queue on each port.
 251      */
 252     for (portid = 0; portid < nb_ports; portid++) {
 253         /* skip ports that are not enabled */
 254         if ((l2fwd_enabled_port_mask & (1 << portid)) == 0)
 255            continue;
 256
 257         if (nb_ports_in_mask % 2) {
 258             l2fwd_dst_ports[portid] = last_port;
 259             l2fwd_dst_ports[last_port] = portid;
 260         }
 261         else
 262            last_port = portid;
 263
 264         nb_ports_in_mask++;
 265
 266         rte_eth_dev_info_get((uint8_t) portid, &dev_info);
 267     }
 268
 269 The next step is to configure the RX and TX queues.
 270 For each port, there is only one RX queue (only one lcore is able to poll a given port).
 271 The number of TX queues depends on the number of available lcores.
 272 The rte_eth_dev_configure() function is used to configure the number of queues for a port:
 273
 274 .. code-block:: c
 275
 276     ret = rte_eth_dev_configure((uint8_t)portid, 1, 1, &port_conf);
 277     if (ret < 0)
 278         rte_exit(EXIT_FAILURE, "Cannot configure device: "
 279             "err=%d, port=%u\n",
 280             ret, portid);
 281
 282 The global configuration is stored in a static structure:
 283
 284 .. code-block:: c
 285
 286     static const struct rte_eth_conf port_conf = {
 287         .rxmode = {
 288             .split_hdr_size = 0,
 289             .header_split = 0,   /**< Header Split disabled */
 290             .hw_ip_checksum = 0, /**< IP checksum offload disabled */
 291             .hw_vlan_filter = 0, /**< VLAN filtering disabled */
 292             .jumbo_frame = 0,    /**< Jumbo Frame Support disabled */
 293             .hw_strip_crc = 0,   /**< CRC stripping by hardware disabled */
 294         },
 295
 296         .txmode = {
 297             .mq_mode = ETH_DCB_NONE
 298         },
 299     };
 300
 301 RX Queue Initialization
 302 ~~~~~~~~~~~~~~~~~~~~~~~
 303
 304 The application uses one lcore to poll one or several ports, depending on the -q option,
 305 which specifies the number of queues per lcore.
 306
 307 For example, if the user specifies -q 4, the application is able to poll four ports with one lcore.
 308 If there are 16 ports on the target (and if the portmask argument is -p ffff),
 309 the application will need four lcores to poll all the ports.
 310
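The arithmetic behind this example is a ceiling division of the number of enabled ports by the -q value. The helpers below are a hypothetical sketch of that calculation and are not taken from the application source.

.. code-block:: c

    #include <stdint.h>

    /* Count the ports enabled in a portmask. */
    static unsigned
    count_enabled_ports(uint32_t portmask)
    {
        unsigned n = 0;

        while (portmask != 0) {
            n += portmask & 1;
            portmask >>= 1;
        }
        return n;
    }

    /*
     * Number of lcores needed when each lcore polls at most
     * ports_per_lcore ports (the -q value). Returns 0 for an invalid
     * ports_per_lcore of 0.
     */
    static unsigned
    lcores_needed(uint32_t portmask, unsigned ports_per_lcore)
    {
        unsigned nb_ports = count_enabled_ports(portmask);

        if (ports_per_lcore == 0)
            return 0;
        return (nb_ports + ports_per_lcore - 1) / ports_per_lcore;
    }

For -p ffff and -q 4, lcores_needed(0xffff, 4) evaluates to 4, matching the example above.
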
 311 .. code-block:: c
 312
 313     ret = rte_eth_rx_queue_setup(portid, 0, nb_rxd,
 314                 rte_eth_dev_socket_id(portid),
 315                 NULL,
 316                 l2fwd_pktmbuf_pool);
 317
 318     if (ret < 0)
 319         rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup:err=%d, port=%u\n",
 320                 ret, (unsigned) portid);
 321
 322 The list of queues that must be polled for a given lcore is stored in a private structure called struct lcore_queue_conf.
 323
 324 .. code-block:: c
 325
 326     struct lcore_queue_conf {
 327         unsigned n_rx_port;
 328         unsigned rx_port_list[MAX_RX_QUEUE_PER_LCORE];
 329         struct mbuf_table tx_mbufs[RTE_MAX_ETHPORTS];
 330
 331         struct rte_timer rx_timers[MAX_RX_QUEUE_PER_LCORE];
 332         struct rte_jobstats port_fwd_jobs[MAX_RX_QUEUE_PER_LCORE];
 333
 334         struct rte_timer flush_timer;
 335         struct rte_jobstats flush_job;
 336         struct rte_jobstats idle_job;
 337         struct rte_jobstats_context jobs_context;
 338
 339         rte_atomic16_t stats_read_pending;
 340         rte_spinlock_t lock;
 341     } __rte_cache_aligned;
 342
 343 Fields of struct lcore_queue_conf:
 344
 345 *   n_rx_port and rx_port_list[] are used in the main packet processing loop
 346     (see "Receive, Process and Transmit Packets" later in this chapter).
 347
 348 *   rx_timers schedule the per-port forwarding jobs, and flush_timer schedules the flush job that forces TX at low packet rates.
 349
 350 *   port_fwd_jobs, flush_job, idle_job and jobs_context are librte_jobstats objects used for managing l2fwd jobs.
 351
 352 *   stats_read_pending and lock are used during the job stats read phase.
 353
 354 TX Queue Initialization
 355 ~~~~~~~~~~~~~~~~~~~~~~~
 356
 357 Each lcore should be able to transmit on any port. For every port, a single TX queue is initialized.
 358
 359 .. code-block:: c
 360
 361     /* init one TX queue on each port */
 362
 363     fflush(stdout);
 364     ret = rte_eth_tx_queue_setup(portid, 0, nb_txd,
 365             rte_eth_dev_socket_id(portid),
 366             NULL);
 367     if (ret < 0)
 368         rte_exit(EXIT_FAILURE, "rte_eth_tx_queue_setup:err=%d, port=%u\n",
 369                 ret, (unsigned) portid);
 370
 371 Job Statistics Initialization
 372 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 373 There are several statistics objects available:
 374
 375 *   Flush job statistics
 376
 377 .. code-block:: c
 378
 379     rte_jobstats_init(&qconf->flush_job, "flush", drain_tsc, drain_tsc,
 380             drain_tsc, 0);
 381
 382     rte_timer_init(&qconf->flush_timer);
 383     ret = rte_timer_reset(&qconf->flush_timer, drain_tsc, PERIODICAL,
 384                 lcore_id, &l2fwd_flush_job, NULL);
 385
 386     if (ret < 0) {
 387         rte_exit(1, "Failed to reset flush job timer for lcore %u: %s",
 388                     lcore_id, rte_strerror(-ret));
 389     }
 390
 391 *   Statistics per RX port
 392
 393 .. code-block:: c
 394
 395     rte_jobstats_init(job, name, 0, drain_tsc, 0, MAX_PKT_BURST);
 396     rte_jobstats_set_update_period_function(job, l2fwd_job_update_cb);
 397
 398     rte_timer_init(&qconf->rx_timers[i]);
 399     ret = rte_timer_reset(&qconf->rx_timers[i], 0, PERIODICAL, lcore_id,
 400             l2fwd_fwd_job, (void *)(uintptr_t)i);
 401
 402     if (ret < 0) {
 403         rte_exit(1, "Failed to reset lcore %u port %u job timer: %s",
 404                     lcore_id, qconf->rx_port_list[i], rte_strerror(-ret));
 405     }
 406
 407 The following parameters are passed to rte_jobstats_init():
 408
 409 *   0 as the minimum poll period
 410
 411 *   drain_tsc as the maximum poll period
 412
 413 *   MAX_PKT_BURST as the desired target value (RX burst size)
 414
 415 Main Loop
 416 ~~~~~~~~~
 417
 418 The forwarding path is reworked compared to the original L2 Forwarding application.
 419 The l2fwd_main_loop() function contains three nested loops.
 420
 421 .. code-block:: c
 422
 423     for (;;) {
 424         rte_spinlock_lock(&qconf->lock);
 425
 426         do {
 427             rte_jobstats_context_start(&qconf->jobs_context);
 428
 429             /* Do the Idle job:
 430              * - Read stats_read_pending flag
 431              * - check if some real job needs to be executed
 432              */
 433             rte_jobstats_start(&qconf->jobs_context, &qconf->idle_job);
 434
 435             do {
 436                 uint8_t i;
 437                 uint64_t now = rte_get_timer_cycles();
 438
 439                 need_manage = qconf->flush_timer.expire < now;
 440                 /* Check if we were asked to give stats. */
 441                 stats_read_pending =
 442                         rte_atomic16_read(&qconf->stats_read_pending);
 443                 need_manage |= stats_read_pending;
 444
 445                 for (i = 0; i < qconf->n_rx_port && !need_manage; i++)
 446                     need_manage = qconf->rx_timers[i].expire < now;
 447
 448             } while (!need_manage);
 449             rte_jobstats_finish(&qconf->idle_job, qconf->idle_job.target);
 450
 451             rte_timer_manage();
 452             rte_jobstats_context_finish(&qconf->jobs_context);
 453         } while (likely(stats_read_pending == 0));
 454
 455         rte_spinlock_unlock(&qconf->lock);
 456         rte_pause();
 457     }
 458
 459 The outer infinite for loop minimizes the impact of stats reading. The lock is released and re-acquired only when a stats read is requested.
 460
 461 The second loop (the outer do-while) does the whole job management. When any job is ready, rte_timer_manage() is used to call the job handler.
 462 This is where the l2fwd_fwd_job() and l2fwd_flush_job() functions are called when needed.
 463 Then rte_jobstats_context_finish() is called to mark the end of the loop iteration, when no other jobs are ready to execute. At this point the stats are ready to be read,
 464 and if stats_read_pending is set, the loop breaks to allow the stats to be read.
 465
 466 The third loop (the inner do-while) is the idle job (idle stats counter). Its only purpose is to monitor whether any job is ready or a stats read is pending
 467 for this lcore. The time accounted to this part of the code is considered the headroom available for additional processing.
 468
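The decision taken inside the idle loop, and the way idle time translates into headroom, can be sketched in plain C. idle_should_stop() mirrors the checks of the innermost loop above using plain integers instead of rte_timer objects; headroom_percent() is a hypothetical helper that shows one way to express idle time as a share of the total time. Neither function is part of the application or of librte_jobstats.

.. code-block:: c

    #include <stdint.h>

    /*
     * Returns nonzero when the idle loop has to end: the flush timer or
     * one of the RX timers has expired, or a stats read was requested.
     */
    static int
    idle_should_stop(uint64_t now, uint64_t flush_expire,
            const uint64_t *rx_expire, unsigned n_rx_port,
            int stats_read_pending)
    {
        unsigned i;
        int need_manage = flush_expire < now;

        need_manage |= (stats_read_pending != 0);

        for (i = 0; i < n_rx_port && !need_manage; i++)
            need_manage = rx_expire[i] < now;

        return need_manage;
    }

    /* Share of the measured time spent in the idle job, in percent. */
    static double
    headroom_percent(uint64_t idle_cycles, uint64_t total_cycles)
    {
        if (total_cycles == 0)
            return 0.0;
        return 100.0 * (double)idle_cycles / (double)total_cycles;
    }

For example, an lcore that spent 750 of 1000 measured cycles in the idle job would report a headroom of 75 percent under this definition.
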
 469 Receive, Process and Transmit Packets
 470 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 471
 472 The main task of the l2fwd_fwd_job() function is to read ingress packets from the RX queue of a particular port and forward them.
 473 This is done using the following code:
 474
 475 .. code-block:: c
 476
 477     total_nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst,
 478             MAX_PKT_BURST);
 479
 480     for (j = 0; j < total_nb_rx; j++) {
 481         m = pkts_burst[j];
 482         rte_prefetch0(rte_pktmbuf_mtod(m, void *));
 483         l2fwd_simple_forward(m, portid);
 484     }
 485
 486 Packets are read in a burst of size MAX_PKT_BURST.
 487 Then, each mbuf in the table is processed by the l2fwd_simple_forward() function.
 488 The processing is very simple: determine the TX port from the RX port, then replace the source and destination MAC addresses.
 489
 490 The rte_eth_rx_burst() function writes the mbuf pointers in a local table and returns the number of available mbufs in the table.
 491
 492 After the first read, a second read is issued if the first one returned a full burst.
 493
 494 .. code-block:: c
 495
 496     if (total_nb_rx == MAX_PKT_BURST) {
 497         const uint16_t nb_rx = rte_eth_rx_burst((uint8_t) portid, 0, pkts_burst,
 498                 MAX_PKT_BURST);
 499
 500         total_nb_rx += nb_rx;
 501         for (j = 0; j < nb_rx; j++) {
 502             m = pkts_burst[j];
 503             rte_prefetch0(rte_pktmbuf_mtod(m, void *));
 504             l2fwd_simple_forward(m, portid);
 505         }
 506     }
 507
 508 This second read is important because it tells the job stats library how many packets were actually processed: without it, total_nb_rx could never exceed the MAX_PKT_BURST target.
 509
 510 .. code-block:: c
 511
 512     /* Adjust period time in which we are running here. */
 513     if (rte_jobstats_finish(job, total_nb_rx) != 0) {
 514         rte_timer_reset(&qconf->rx_timers[port_idx], job->period, PERIODICAL,
 515                 lcore_id, l2fwd_fwd_job, arg);
 516     }
 517
 518 To maximize performance, exactly MAX_PKT_BURST packets (the target value) are expected to be read in each l2fwd_fwd_job() call.
 519 If total_nb_rx is smaller than the target value, job->period is increased. If it is greater, the period is decreased.
 520
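The interplay between the double read and the period adjustment can be simulated without the DPDK. The sketch below uses a fake RX queue and a toy job object; all names, the burst size of 32 and the step sizes are assumptions made for this example, and toy_job_finish() only imitates the behaviour described above rather than reproducing rte_jobstats_finish() or the application's update callback.

.. code-block:: c

    #include <stdint.h>

    #define BURST     32u /* stands in for MAX_PKT_BURST (assumed value) */
    #define STEP_UP   1u  /* hypothetical period steps, in timer cycles */
    #define STEP_DOWN 32u

    struct toy_rxq {
        unsigned pending; /* packets waiting on the fake RX queue */
    };

    struct toy_job {
        uint64_t period, min_period, max_period; /* min <= period <= max */
        int64_t target;                          /* desired packets per call */
    };

    /* Take up to max packets from the fake queue. */
    static unsigned
    toy_rx_burst(struct toy_rxq *q, unsigned max)
    {
        unsigned n = q->pending < max ? q->pending : max;

        q->pending -= n;
        return n;
    }

    /* Read one burst and, if it was full, a second one. */
    static unsigned
    toy_poll(struct toy_rxq *q)
    {
        unsigned total = toy_rx_burst(q, BURST);

        if (total == BURST)
            total += toy_rx_burst(q, BURST);
        return total;
    }

    /*
     * Move the period towards the point where each call reads exactly
     * target packets. Returns 1 if the period changed (the caller would
     * then re-arm its timer), 0 otherwise.
     */
    static int
    toy_job_finish(struct toy_job *job, int64_t result)
    {
        uint64_t old = job->period;

        if (result < job->target) {
            /* too few packets: poll less often */
            if (job->max_period - job->period >= STEP_UP)
                job->period += STEP_UP;
            else
                job->period = job->max_period;
        } else if (result > job->target) {
            /* backlog building up: poll more often */
            if (job->period - job->min_period >= STEP_DOWN)
                job->period -= STEP_DOWN;
            else
                job->period = job->min_period;
        }
        return job->period != old;
    }

With a single read, toy_poll() could never return more than BURST, so the "poll more often" branch would never be taken. The larger step down than up is a design choice of this sketch: it reacts quickly to a backlog and relaxes slowly when the port is quiet.
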
 521 .. note::
 522
 523     In the following code, one line for getting the output port requires some explanation.
 524
 525 During the initialization process, a static array of destination ports (l2fwd_dst_ports[]) is filled such that for each source port,
 526 a destination port is assigned that is either the next or previous enabled port from the portmask.
 527 Naturally, the number of ports in the portmask must be even; otherwise, the application exits.
 528
 529 .. code-block:: c
 530
 531     static void
 532     l2fwd_simple_forward(struct rte_mbuf *m, unsigned portid)
 533     {
 534         struct ether_hdr *eth;
 535         void *tmp;
 536         unsigned dst_port;
 537
 538         dst_port = l2fwd_dst_ports[portid];
 539
 540         eth = rte_pktmbuf_mtod(m, struct ether_hdr *);
 541
 542         /* 02:00:00:00:00:xx */
 543
 544         tmp = &eth->d_addr.addr_bytes[0];
 545
 546         *((uint64_t *)tmp) = 0x000000000002 + ((uint64_t) dst_port << 40);
 547
 548         /* src addr */
 549
 550         ether_addr_copy(&l2fwd_ports_eth_addr[dst_port], &eth->s_addr);
 551
 552         l2fwd_send_packet(m, (uint8_t) dst_port);
 553     }
 554
 555 Then, the packet is sent using the l2fwd_send_packet(m, dst_port) function.
 556 For this test application, the processing is exactly the same for all packets arriving on the same RX port.
 557 Therefore, it would have been possible to call the l2fwd_send_burst() function directly from the main loop
 558 to send all the received packets on the same TX port,
 559 using the burst-oriented send function, which is more efficient.
 560
 561 However, in real-life applications (such as L3 routing),
 562 packet N is not necessarily forwarded on the same port as packet N-1.
 563 The application is implemented to illustrate that, so the same approach can be reused in a more complex application.
 564
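Returning to the MAC rewrite in l2fwd_simple_forward(): the single 64-bit store writes eight bytes, which on a little-endian CPU are 02:00:00:00:00:xx followed by two zero bytes that spill into the start of the source address; the source address is then overwritten by ether_addr_copy(). The sketch below produces the same header contents byte by byte, so it does not depend on endianness. struct toy_ether_hdr and rewrite_macs() are hypothetical stand-ins, not DPDK definitions.

.. code-block:: c

    #include <stdint.h>
    #include <string.h>

    struct toy_ether_hdr {   /* simplified stand-in for struct ether_hdr */
        uint8_t d_addr[6];   /* destination MAC address */
        uint8_t s_addr[6];   /* source MAC address */
        uint16_t ether_type;
    };

    static void
    rewrite_macs(struct toy_ether_hdr *eth, const uint8_t tx_port_mac[6],
            uint8_t dst_port)
    {
        static const uint8_t prefix[5] = { 0x02, 0x00, 0x00, 0x00, 0x00 };

        /* destination address: 02:00:00:00:00:<dst_port> */
        memcpy(eth->d_addr, prefix, sizeof(prefix));
        eth->d_addr[5] = dst_port;

        /* source address: MAC address of the TX port */
        memcpy(eth->s_addr, tx_port_mac, 6);
    }

The leading 0x02 sets the locally administered bit of the destination address, so the generated addresses do not collide with vendor-assigned ones.
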
 565 The l2fwd_send_packet() function stores the packet in a per-lcore and per-txport table.
 566 If the table is full, the whole packet table is transmitted using the l2fwd_send_burst() function:
 567
 568 .. code-block:: c
 569
 570     /* Send the packet on an output interface */
 571
 572     static int
 573     l2fwd_send_packet(struct rte_mbuf *m, uint8_t port)
 574     {
 575         unsigned lcore_id, len;
 576         struct lcore_queue_conf *qconf;
 577
 578         lcore_id = rte_lcore_id();
 579         qconf = &lcore_queue_conf[lcore_id];
 580         len = qconf->tx_mbufs[port].len;
 581         qconf->tx_mbufs[port].m_table[len] = m;
 582         len++;
 583
 584         /* enough pkts to be sent */
 585         if (unlikely(len == MAX_PKT_BURST)) {
 586             l2fwd_send_burst(qconf, MAX_PKT_BURST, port);
 587             len = 0;
 588         }
 589
 590         qconf->tx_mbufs[port].len = len;
 591         return 0;
 592     }
 593
 594 To ensure that no packets remain in the tables, a flush job exists. The l2fwd_flush_job() function
 595 is called periodically on each lcore to drain the TX queue of each port.
 596 This technique introduces some latency when there are not many packets to send;
 597 however, it improves performance:
 598
 599 .. code-block:: c
 600
 601     static void
 602     l2fwd_flush_job(__rte_unused struct rte_timer *timer, __rte_unused void *arg)
 603     {
 604         uint64_t now;
 605         unsigned lcore_id;
 606         struct lcore_queue_conf *qconf;
 607         struct mbuf_table *m_table;
 608         uint8_t portid;
 609
 610         lcore_id = rte_lcore_id();
 611         qconf = &lcore_queue_conf[lcore_id];
 612
 613         rte_jobstats_start(&qconf->jobs_context, &qconf->flush_job);
 614
 615         now = rte_get_timer_cycles();
 616         lcore_id = rte_lcore_id();
 617         qconf = &lcore_queue_conf[lcore_id];
 618         for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
 619             m_table = &qconf->tx_mbufs[portid];
 620             if (m_table->len == 0 || m_table->next_flush_time <= now)
 621                 continue;
 622
 623             l2fwd_send_burst(qconf, portid);
 624         }
 625
 626
 627         /* Pass target to indicate that this job is happy with the time interval
 628          * in which it was called. */
 629         rte_jobstats_finish(&qconf->flush_job, qconf->flush_job.target);
 630     }
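
The buffering scheme used by l2fwd_send_packet() and l2fwd_flush_job() can be reduced to a few lines of plain C. In this sketch, integers stand in for mbuf pointers, "transmitting" only updates counters, and the flush drains every non-empty table unconditionally instead of consulting next_flush_time. All names, the burst size of 32 and the port count are assumptions made for this example.

.. code-block:: c

    #define BURST    32 /* stands in for MAX_PKT_BURST (assumed value) */
    #define NB_PORTS 4  /* hypothetical number of TX ports */

    struct tx_table {
        unsigned len;    /* packets currently buffered */
        int pkts[BURST]; /* integers stand in for mbuf pointers */
    };

    struct tx_state {
        struct tx_table tbl[NB_PORTS];
        unsigned sent[NB_PORTS];   /* packets "transmitted" per port */
        unsigned bursts[NB_PORTS]; /* bursts issued per port */
    };

    /* "Transmit" everything buffered for a port and empty its table. */
    static void
    send_burst(struct tx_state *s, unsigned port)
    {
        s->sent[port] += s->tbl[port].len;
        s->bursts[port]++;
        s->tbl[port].len = 0;
    }

    /* Buffer one packet; transmit the table as soon as it is full. */
    static void
    send_packet(struct tx_state *s, unsigned port, int pkt)
    {
        struct tx_table *t = &s->tbl[port];

        t->pkts[t->len++] = pkt;
        if (t->len == BURST)
            send_burst(s, port);
    }

    /* Periodic flush: drain every table that still holds packets. */
    static void
    flush_all(struct tx_state *s)
    {
        unsigned port;

        for (port = 0; port < NB_PORTS; port++) {
            if (s->tbl[port].len != 0)
                send_burst(s, port);
        }
    }

After 40 calls to send_packet() for one port, one full burst of 32 packets has been sent and 8 packets remain buffered until flush_all() runs. This is the latency cost mentioned above, paid in exchange for fewer and larger transmit calls.
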