doc/guides/prog_guide/switch_representation.rst

   1 ..  SPDX-License-Identifier: BSD-3-Clause
   2     Copyright(c) 2018 6WIND S.A.
   3
   4 .. _switch_representation:
   5
   6 Switch Representation within DPDK Applications
   7 ==============================================
   8
   9 .. contents:: :local:
  10
  11 Introduction
  12 ------------
  13
  14 Network adapters with multiple physical ports and/or SR-IOV capabilities
  15 usually support the offload of traffic steering rules between their virtual
  16 functions (VFs), sub functions (SFs), physical functions (PFs) and ports.
  17
   18 As with standard Ethernet switches, this involves a combination of
  19 automatic MAC learning and manual configuration. For most purposes it is
  20 managed by the host system and fully transparent to users and applications.
  21
  22 On the other hand, applications typically found on hypervisors that process
  23 layer 2 (L2) traffic (such as OVS) need to steer traffic themselves
   24 according to their own criteria.
  25
  26 Without a standard software interface to manage traffic steering rules
  27 between VFs, SFs, PFs and the various physical ports of a given device,
  28 applications cannot take advantage of these offloads; software processing is
  29 mandatory even for traffic which ends up re-injected into the device it
  30 originates from.
  31
  32 This document describes how such steering rules can be configured through
  33 the DPDK flow API (**rte_flow**), with emphasis on the SR-IOV use case
   34 (PF/VF steering) using a single physical port for clarity. However, the same
  35 logic applies to any number of ports without necessarily involving SR-IOV.
  36
  37 Sub Function
  38 ------------
   39 Besides SR-IOV, a PCI device can expose sub functions (SFs). A sub function is
   40 a portion of the PCI device, and an SF netdev has its own dedicated queues
   41 (txq, rxq). An SF netdev supports E-Switch representation offload similar to
   42 existing PF and VF representors. An SF shares PCI-level resources with other
   43 SFs and/or with its parent PCI function.
   44
   45 Sub functions are created on demand and coexist with VFs. The number of SFs is
   46 limited by hardware resources.
  47
  48 Port Representors
  49 -----------------
  50
  51 In many cases, traffic steering rules cannot be determined in advance;
  52 applications usually have to process a bit of traffic in software before
  53 thinking about offloading specific flows to hardware.
  54
  55 Applications therefore need the ability to receive and inject traffic to
  56 various device endpoints (other VFs, SFs, PFs or physical ports) before
   57 connecting them together. Device drivers must provide means to hook the
   58 "other end" of these endpoints and to refer to them when configuring flow
  59 rules.
  60
  61 This role is left to so-called "port representors" (also known as "VF
  62 representors" in the specific context of VFs, "SF representors" in the
  63 specific context of SFs), which are to DPDK what the Ethernet switch
  64 device driver model (**switchdev**) [1]_ is to Linux, and which can be
   65 thought of as a software "patch panel" front-end for applications.
  66
  67 - DPDK port representors are implemented as additional virtual Ethernet
   68   device (**ethdev**) instances, spawned on an as-needed basis through
  69   configuration parameters passed to the driver of the underlying
  70   device using devargs.
  71
  72 ::
  73
  74    -a pci:dbdf,representor=vf0
  75    -a pci:dbdf,representor=vf[0-3]
  76    -a pci:dbdf,representor=vf[0,5-11]
  77    -a pci:dbdf,representor=sf1
  78    -a pci:dbdf,representor=sf[0-1023]
  79    -a pci:dbdf,representor=sf[0,2-1023]
  80
  81 - As virtual devices, they may be more limited than their physical
  82   counterparts, for instance by exposing only a subset of device
  83   configuration callbacks and/or by not necessarily having Rx/Tx capability.
  84
  85 - Among other things, they can be used to assign MAC addresses to the
  86   resource they represent.
  87
   88 - Applications can tell port representors apart from other physical or virtual
   89   ports by checking the ``dev_flags`` field within their device information
   90   structure for the ``RTE_ETH_DEV_REPRESENTOR`` bit.
  91
  92 .. code-block:: c
  93
  94   struct rte_eth_dev_info {
  95       ...
  96       uint32_t dev_flags; /**< Device flags */
  97       ...
  98   };
  99
  100 - The device or group relationship of ports can be discovered using the
  101   switch ``domain_id`` field within the device's switch information structure.
  102   By default, the switch ``domain_id`` of a port is
  103   ``RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID``, indicating that the port does not
  104   support the concept of a switch domain. Ports which do support it are
  105   allocated a switch ``domain_id`` unique to their switch domain; ports within
  106   the same switch domain share the same ``domain_id``. The switch ``port_id``
  107   identifies the port in terms of the switch, so in the case of SR-IOV devices
  108   the switch ``port_id`` would represent the virtual function identifier of the
  109   port.
 110
 111 .. code-block:: c
 112
 113    /**
 114     * Ethernet device associated switch information
 115     */
 116    struct rte_eth_switch_info {
 117        const char *name; /**< switch name */
 118        uint16_t domain_id; /**< switch domain id */
 119        uint16_t port_id; /**< switch port id */
 120    };
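
The two checks described in the list above (representor flag and switch
domain membership) can be sketched in C. This is only an illustration under
stated assumptions: the ``demo_*`` types and ``DEMO_*`` constants are local
stand-ins that mirror the fields shown in this section, and their numeric
values are chosen for the example. A real application would rely on the
definitions and device information provided by the ethdev API instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in values for illustration only (assumptions, not taken from
 * the ethdev header). */
#define DEMO_DEV_REPRESENTOR   0x0010u
#define DEMO_DOMAIN_ID_INVALID UINT16_MAX

/* Mirrors the switch information fields shown in this section. */
struct demo_switch_info {
    const char *name;   /* switch name */
    uint16_t domain_id; /* switch domain id */
    uint16_t port_id;   /* switch port id */
};

/* Mirrors the part of the device information used here. */
struct demo_dev_info {
    uint32_t dev_flags; /* device flags */
    struct demo_switch_info switch_info;
};

/* Return true when the representor bit is set in dev_flags. */
bool
demo_is_representor(const struct demo_dev_info *info)
{
    assert(info != NULL);
    return (info->dev_flags & DEMO_DEV_REPRESENTOR) != 0;
}

/* Return true when both ports report the same valid switch domain. */
bool
demo_same_switch_domain(const struct demo_dev_info *a,
                        const struct demo_dev_info *b)
{
    assert(a != NULL && b != NULL);
    /* A port without switch domain support never matches another port. */
    if (a->switch_info.domain_id == DEMO_DOMAIN_ID_INVALID ||
        b->switch_info.domain_id == DEMO_DOMAIN_ID_INVALID)
        return false;
    return a->switch_info.domain_id == b->switch_info.domain_id;
}
```

With these helpers, a PF and its VF representor would be expected to report
the same domain, while a port carrying the invalid ``domain_id`` is never
grouped with any other port.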
 121
 122
 123 .. [1] `Ethernet switch device driver model (switchdev)
 124        <https://www.kernel.org/doc/Documentation/networking/switchdev.txt>`_
 125
 126 Basic SR-IOV
 127 ------------
 128
 129 "Basic" in the sense that it is not managed by applications, which
 130 nonetheless expect traffic to flow between the various endpoints and the
  131 outside as if everything were linked by an Ethernet hub.
 132
 133 The following diagram pictures a setup involving a device with one PF, two
  134 VFs and one shared physical port:
 135
 136 ::
 137
 138        .-------------.                 .-------------. .-------------.
 139        | hypervisor  |                 |    VM 1     | |    VM 2     |
 140        | application |                 | application | | application |
 141        `--+----------'                 `----------+--' `--+----------'
 142           |                                       |       |
 143     .-----+-----.                                 |       |
 144     | port_id 3 |                                 |       |
 145     `-----+-----'                                 |       |
 146           |                                       |       |
 147         .-+--.                                .---+--. .--+---.
 148         | PF |                                | VF 1 | | VF 2 |
 149         `-+--'                                `---+--' `--+---'
 150           |                                       |       |
 151           `---------.     .-----------------------'       |
 152                     |     |     .-------------------------'
 153                     |     |     |
 154                  .--+-----+-----+--.
 155                  | interconnection |
 156                  `--------+--------'
 157                           |
 158                      .----+-----.
 159                      | physical |
 160                      |  port 0  |
 161                      `----------'
 162
 163 - A DPDK application running on the hypervisor owns the PF device, which is
 164   arbitrarily assigned port index 3.
 165
 166 - Both VFs are assigned to VMs and used by unknown applications; they may be
 167   DPDK-based or anything else.
 168
 169 - Interconnection is not necessarily done through a true Ethernet switch and
 170   may not even exist as a separate entity. The role of this block is to show
 171   that something brings PF, VFs and physical ports together and enables
 172   communication between them, with a number of built-in restrictions.
 173
 174 Subsequent sections in this document describe means for DPDK applications
 175 running on the hypervisor to freely assign specific flows between PF, VFs
 176 and physical ports based on traffic properties, by managing this
 177 interconnection.
 178
 179 Controlled SR-IOV
 180 -----------------
 181
 182 Initialization
 183 ~~~~~~~~~~~~~~
 184
 185 When a DPDK application gets assigned a PF device and is deliberately not
 186 started in `basic SR-IOV`_ mode, any traffic coming from physical ports is
 187 received by PF according to default rules, while VFs remain isolated.
 188
 189 ::
 190
 191        .-------------.                 .-------------. .-------------.
 192        | hypervisor  |                 |    VM 1     | |    VM 2     |
 193        | application |                 | application | | application |
 194        `--+----------'                 `----------+--' `--+----------'
 195           |                                       |       |
 196     .-----+-----.                                 |       |
 197     | port_id 3 |                                 |       |
 198     `-----+-----'                                 |       |
 199           |                                       |       |
 200         .-+--.                                .---+--. .--+---.
 201         | PF |                                | VF 1 | | VF 2 |
 202         `-+--'                                `------' `------'
 203           |
 204           `-----.
 205                 |
 206              .--+----------------------.
 207              | managed interconnection |
 208              `------------+------------'
 209                           |
 210                      .----+-----.
 211                      | physical |
 212                      |  port 0  |
 213                      `----------'
 214
 215 In this mode, interconnection must be configured by the application to
 216 enable VF communication, for instance by explicitly directing traffic with a
 217 given destination MAC address to VF 1 and allowing that with the same source
 218 MAC address to come out of it.
 219
 220 For this to work, hypervisor applications need a way to refer to either VF 1
 221 or VF 2 in addition to the PF. This is addressed by `VF representors`_.
 222
 223 VF Representors
 224 ~~~~~~~~~~~~~~~
 225
 226 VF representors are virtual but standard DPDK network devices (albeit with
 227 limited capabilities) created by PMDs when managing a PF device.
 228
 229 Since they represent VF instances used by other applications, configuring
 230 them (e.g. assigning a MAC address or setting up promiscuous mode) affects
 231 interconnection accordingly. If supported, they may also be used as two-way
  232 communication ports with VFs (assuming **switchdev** topology).
 233
 234
 235 ::
 236
 237        .-------------.                 .-------------. .-------------.
 238        | hypervisor  |                 |    VM 1     | |    VM 2     |
 239        | application |                 | application | | application |
 240        `--+---+---+--'                 `----------+--' `--+----------'
 241           |   |   |                               |       |
 242           |   |   `-------------------.           |       |
 243           |   `---------.             |           |       |
 244           |             |             |           |       |
 245     .-----+-----. .-----+-----. .-----+-----.     |       |
 246     | port_id 3 | | port_id 4 | | port_id 5 |     |       |
 247     `-----+-----' `-----+-----' `-----+-----'     |       |
 248           |             |             |           |       |
 249         .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
 250         | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
 251         `-+--'    `-----+-----' `-----+-----' `---+--' `--+---'
 252           |             |             |           |       |
 253           |             |   .---------'           |       |
 254           `-----.       |   |   .-----------------'       |
 255                 |       |   |   |   .---------------------'
 256                 |       |   |   |   |
 257              .--+-------+---+---+---+--.
 258              | managed interconnection |
 259              `------------+------------'
 260                           |
 261                      .----+-----.
 262                      | physical |
 263                      |  port 0  |
 264                      `----------'
 265
 266 - VF representors are assigned arbitrary port indices 4 and 5 in the
 267   hypervisor application and are respectively associated with VF 1 and VF 2.
 268
 269 - They can't be dissociated; even if VF 1 and VF 2 were not connected,
 270   representors could still be used for configuration.
 271
  272 - In this context, port index 3 can be thought of as a representor for physical
 273   port 0.
 274
 275 As previously described, the "interconnection" block represents a logical
 276 concept. Interconnection occurs when hardware configuration enables traffic
 277 flows from one place to another (e.g. physical port 0 to VF 1) according to
 278 some criteria.
 279
 280 This is discussed in more detail in `traffic steering`_.
 281
 282 Traffic Steering
 283 ~~~~~~~~~~~~~~~~
 284
 285 In the following diagram, each meaningful traffic origin or endpoint as seen
 286 by the hypervisor application is tagged with a unique letter from A to F.
 287
 288 ::
 289
 290        .-------------.                 .-------------. .-------------.
 291        | hypervisor  |                 |    VM 1     | |    VM 2     |
 292        | application |                 | application | | application |
 293        `--+---+---+--'                 `----------+--' `--+----------'
 294           |   |   |                               |       |
 295           |   |   `-------------------.           |       |
 296           |   `---------.             |           |       |
 297           |             |             |           |       |
 298     .----(A)----. .----(B)----. .----(C)----.     |       |
 299     | port_id 3 | | port_id 4 | | port_id 5 |     |       |
 300     `-----+-----' `-----+-----' `-----+-----'     |       |
 301           |             |             |           |       |
 302         .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
 303         | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
 304         `-+--'    `-----+-----' `-----+-----' `--(D)-' `-(E)--'
 305           |             |             |           |       |
 306           |             |   .---------'           |       |
 307           `-----.       |   |   .-----------------'       |
 308                 |       |   |   |   .---------------------'
 309                 |       |   |   |   |
 310              .--+-------+---+---+---+--.
 311              | managed interconnection |
 312              `------------+------------'
 313                           |
 314                      .---(F)----.
 315                      | physical |
 316                      |  port 0  |
 317                      `----------'
 318
 319 - **A**: PF device.
 320 - **B**: port representor for VF 1.
 321 - **C**: port representor for VF 2.
 322 - **D**: VF 1 proper.
 323 - **E**: VF 2 proper.
 324 - **F**: physical port.
 325
  326 Although uncommon, some devices do not enforce a one-to-one mapping between
 327 PF and physical ports. For instance, by default all ports of **mlx4**
 328 adapters are available to all their PF/VF instances, in which case
 329 additional ports appear next to **F** in the above diagram.
 330
 331 Assuming no interconnection is provided by default in this mode, setting up
 332 a `basic SR-IOV`_ configuration involving physical port 0 could be broken
 333 down as:
 334
 335 PF:
 336
 337 - **A to F**: let everything through.
 338 - **F to A**: PF MAC as destination.
 339
 340 VF 1:
 341
 342 - **A to D**, **E to D** and **F to D**: VF 1 MAC as destination.
 343 - **D to A**: VF 1 MAC as source and PF MAC as destination.
 344 - **D to E**: VF 1 MAC as source and VF 2 MAC as destination.
 345 - **D to F**: VF 1 MAC as source.
 346
 347 VF 2:
 348
 349 - **A to E**, **D to E** and **F to E**: VF 2 MAC as destination.
 350 - **E to A**: VF 2 MAC as source and PF MAC as destination.
 351 - **E to D**: VF 2 MAC as source and VF 1 MAC as destination.
 352 - **E to F**: VF 2 MAC as source.
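
The rule breakdown above can be read as a predicate over (origin,
destination, source MAC, destination MAC). The following C sketch models it
under stated assumptions: the ``demo_*`` names and the MAC addresses are
hypothetical and exist only for this illustration, and the function models
the listed rules rather than any particular device or driver behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Endpoints as lettered in the diagram: A = PF, D = VF 1, E = VF 2,
 * F = physical port. */
enum demo_endpoint { DEMO_A, DEMO_D, DEMO_E, DEMO_F };

struct demo_mac { uint8_t b[6]; };

/* Hypothetical addresses used only for this illustration. */
static const struct demo_mac demo_pf_mac  = {{0x02, 0, 0, 0, 0, 0x0a}};
static const struct demo_mac demo_vf1_mac = {{0x02, 0, 0, 0, 0, 0x01}};
static const struct demo_mac demo_vf2_mac = {{0x02, 0, 0, 0, 0, 0x02}};

static bool
demo_mac_eq(const struct demo_mac *x, const struct demo_mac *y)
{
    return memcmp(x->b, y->b, sizeof(x->b)) == 0;
}

/* MAC owned by an endpoint, or NULL for the physical port (F). */
static const struct demo_mac *
demo_owner_mac(enum demo_endpoint ep)
{
    switch (ep) {
    case DEMO_A: return &demo_pf_mac;
    case DEMO_D: return &demo_vf1_mac;
    case DEMO_E: return &demo_vf2_mac;
    default:     return NULL;
    }
}

/* Return true when the listed rules let a frame go from one endpoint
 * to another. */
bool
demo_allowed(enum demo_endpoint from, enum demo_endpoint to,
             const struct demo_mac *src, const struct demo_mac *dst)
{
    assert(src != NULL && dst != NULL);
    if (from == to)
        return false;
    /* VFs may only send frames carrying their own source MAC. */
    if ((from == DEMO_D || from == DEMO_E) &&
        !demo_mac_eq(src, demo_owner_mac(from)))
        return false;
    /* No destination check applies toward the physical port. */
    if (to == DEMO_F)
        return true;
    /* PF and VFs only receive frames addressed to their own MAC. */
    return demo_mac_eq(dst, demo_owner_mac(to));
}
```

For example, a frame arriving on **F** with the VF 1 MAC as destination is
accepted toward **D** but not toward **A**, and a frame leaving **D** with a
source MAC other than the VF 1 MAC is rejected everywhere.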
 353
 354 Devices may additionally support advanced matching criteria such as
 355 IPv4/IPv6 addresses or TCP/UDP ports.
 356
 357 The combination of matching criteria with target endpoints fits well with
 358 **rte_flow** [6]_, which expresses flow rules as combinations of patterns
 359 and actions.
 360
 361 Enhancing **rte_flow** with the ability to make flow rules match and target
 362 these endpoints provides a standard interface to manage their
  363 interconnection without introducing new concepts and a whole new API to
 364 implement them. This is described in `flow API (rte_flow)`_.
 365
 366 .. [6] :doc:`Generic flow API (rte_flow) <rte_flow>`
 367
 368 Flow API (rte_flow)
 369 -------------------
 370
 371 Extensions
 372 ~~~~~~~~~~
 373
 374 Compared to creating a brand new dedicated interface, **rte_flow** was
  375 deemed flexible enough to manage representor traffic with only minor
 376 extensions:
 377
 378 - Using physical ports, PF, SF, VF or port representors as targets.
 379
 380 - Affecting traffic that is not necessarily addressed to the DPDK port ID a
 381   flow rule is associated with (e.g. forcing VF traffic redirection to PF).
 382
 383 For advanced uses:
 384
 385 - Rule-based packet counters.
 386
 387 - The ability to combine several identical actions for traffic duplication
 388   (e.g. VF representor in addition to a physical port).
 389
 390 - Dedicated actions for traffic encapsulation / decapsulation before
 391   reaching an endpoint.
 392
 393 Traffic Direction
 394 ~~~~~~~~~~~~~~~~~
 395
 396 From an application standpoint, "ingress" and "egress" flow rule attributes
 397 apply to the DPDK port ID they are associated with. They select a traffic
 398 direction for matching patterns, but have no impact on actions.
 399
 400 When matching traffic coming from or going to a different place than the
 401 immediate port ID a flow rule is associated with, these attributes keep
 402 their meaning while applying to the chosen origin, as highlighted by the
  403 following diagram:
 404
 405 ::
 406
 407        .-------------.                 .-------------. .-------------.
 408        | hypervisor  |                 |    VM 1     | |    VM 2     |
 409        | application |                 | application | | application |
 410        `--+---+---+--'                 `----------+--' `--+----------'
 411           |   |   |                               |       |
 412           |   |   `-------------------.           |       |
 413           |   `---------.             |           |       |
 414           | ^           | ^           | ^         |       |
 415           | | ingress   | | ingress   | | ingress |       |
 416           | | egress    | | egress    | | egress  |       |
 417           | v           | v           | v         |       |
 418     .----(A)----. .----(B)----. .----(C)----.     |       |
 419     | port_id 3 | | port_id 4 | | port_id 5 |     |       |
 420     `-----+-----' `-----+-----' `-----+-----'     |       |
 421           |             |             |           |       |
 422         .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
 423         | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
 424         `-+--'    `-----+-----' `-----+-----' `--(D)-' `-(E)--'
 425           |             |             |         ^ |       | ^
 426           |             |             |  egress | |       | | egress
 427           |             |             | ingress | |       | | ingress
 428           |             |   .---------'         v |       | v
 429           `-----.       |   |   .-----------------'       |
 430                 |       |   |   |   .---------------------'
 431                 |       |   |   |   |
 432              .--+-------+---+---+---+--.
 433              | managed interconnection |
 434              `------------+------------'
 435                         ^ |
 436                 ingress | |
 437                  egress | |
 438                         v |
 439                      .---(F)----.
 440                      | physical |
 441                      |  port 0  |
 442                      `----------'
 443
 444 Ingress and egress are defined as relative to the application creating the
 445 flow rule.
 446
 447 For instance, matching traffic sent by VM 2 would be done through an ingress
 448 flow rule on VF 2 (**E**). Likewise for incoming traffic on physical port
 449 (**F**). This also applies to **C** and **A** respectively.
 450
 451 Transferring Traffic
 452 ~~~~~~~~~~~~~~~~~~~~
 453
 454 Without Port Representors
 455 ^^^^^^^^^^^^^^^^^^^^^^^^^
 456
 457 `Traffic direction`_ describes how an application could match traffic coming
 458 from or going to a specific place reachable from a DPDK port ID. This makes
 459 sense when the traffic in question is normally seen (i.e. sent or received)
 460 by the application creating the flow rule (e.g. as in "redirect all traffic
 461 coming from VF 1 to local queue 6").
 462
 463 However this does not force such traffic to take a specific route. Creating
 464 a flow rule on **A** matching traffic coming from **D** is only meaningful
 465 if it can be received by **A** in the first place, otherwise doing so simply
 466 has no effect.
 467
 468 A new flow rule attribute named "transfer" is necessary for that. Combining
 469 it with "ingress" or "egress" and a specific origin requests a flow rule to
  470 be applied at the lowest level:
 471
 472 ::
 473
 474              ingress only           :       ingress + transfer
 475                                     :
 476     .-------------. .-------------. : .-------------. .-------------.
 477     | hypervisor  | |    VM 1     | : | hypervisor  | |    VM 1     |
 478     | application | | application | : | application | | application |
 479     `------+------' `--+----------' : `------+------' `--+----------'
 480            |           | | traffic  :        |           | | traffic
 481      .----(A)----.     | v          :  .----(A)----.     | v
 482      | port_id 3 |     |            :  | port_id 3 |     |
 483      `-----+-----'     |            :  `-----+-----'     |
 484            |           |            :        | ^         |
 485            |           |            :        | | traffic |
 486          .-+--.    .---+--.         :      .-+--.    .---+--.
 487          | PF |    | VF 1 |         :      | PF |    | VF 1 |
 488          `-+--'    `--(D)-'         :      `-+--'    `--(D)-'
 489            |           | | traffic  :        | ^         | | traffic
 490            |           | v          :        | | traffic | v
 491         .--+-----------+--.         :     .--+-----------+--.
 492         | interconnection |         :     | interconnection |
 493         `--------+--------'         :     `--------+--------'
 494                  | | traffic        :              |
 495                  | v                :              |
 496             .---(F)----.            :         .---(F)----.
 497             | physical |            :         | physical |
 498             |  port 0  |            :         |  port 0  |
 499             `----------'            :         `----------'
 500
  501 With "ingress" only, traffic is matched on **A** and thus still goes to
  502 physical port **F** by default:
 503
 504
 505 ::
 506
 507    testpmd> flow create 3 ingress pattern vf id is 1 / end
 508               actions queue index 6 / end
 509
 510 With "ingress + transfer", traffic is matched on **D** and is therefore
  511 successfully assigned to queue 6 on **A**:
 512
 513
 514 ::
 515
 516     testpmd> flow create 3 ingress transfer pattern vf id is 1 / end
 517               actions queue index 6 / end
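
The difference between the two commands above can be restated as a small
decision function. This C sketch is a toy model of the diagram, not of any
driver: the ``demo_*`` names are hypothetical, and the only behavior it
encodes is that a rule on **A** naming **D** as origin takes effect on VF 1
traffic only when "transfer" is combined with "ingress".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Places lettered as in the diagram above. */
enum demo_place { DEMO_PORT_A, DEMO_VF1_D, DEMO_PHYS_F };

/* Attributes and origin of a flow rule created on port A. */
struct demo_rule {
    bool ingress;
    bool transfer;
    enum demo_place origin; /* origin named in the pattern */
};

/* Return where traffic sent by VM 1 through VF 1 (D) ends up. */
enum demo_place
demo_route_from_vf1(const struct demo_rule *rule)
{
    assert(rule != NULL);
    /* With "transfer", the rule is applied at the origin (D), so the
     * traffic is redirected to a queue on A. */
    if (rule->ingress && rule->transfer && rule->origin == DEMO_VF1_D)
        return DEMO_PORT_A;
    /* Otherwise the rule only matches what A already receives, and
     * VF 1 traffic keeps its default route to the physical port. */
    return DEMO_PHYS_F;
}
```

Calling ``demo_route_from_vf1()`` with ``ingress`` alone yields **F**, while
setting both ``ingress`` and ``transfer`` yields **A**, matching the two
halves of the diagram.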
 518
 519
 520 With Port Representors
 521 ^^^^^^^^^^^^^^^^^^^^^^
 522
 523 When port representors exist, implicit flow rules with the "transfer"
  524 attribute (described in `without port representors`_) are assumed to
 525 exist between them and their represented resources. These may be immutable.
 526
 527 In this case, traffic is received by default through the representor and
 528 neither the "transfer" attribute nor traffic origin in flow rule patterns
 529 are necessary. They simply have to be created on the representor port
 530 directly and may target a different representor as described in `PORT_ID
 531 action`_.
 532
  533 Implicit traffic flow with port representor:
 534
 535 ::
 536
 537        .-------------.   .-------------.
 538        | hypervisor  |   |    VM 1     |
 539        | application |   | application |
 540        `--+-------+--'   `----------+--'
 541           |       | ^               | | traffic
 542           |       | | traffic       | v
 543           |       `-----.           |
 544           |             |           |
 545     .----(A)----. .----(B)----.     |
 546     | port_id 3 | | port_id 4 |     |
 547     `-----+-----' `-----+-----'     |
 548           |             |           |
 549         .-+--.    .-----+-----. .---+--.
 550         | PF |    | VF 1 rep. | | VF 1 |
 551         `-+--'    `-----+-----' `--(D)-'
 552           |             |           |
 553        .--|-------------|-----------|--.
 554        |  |             |           |  |
 555        |  |             `-----------'  |
 556        |  |              <-- traffic   |
 557        `--|----------------------------'
 558           |
 559      .---(F)----.
 560      | physical |
 561      |  port 0  |
 562      `----------'
 563
 564 Pattern Items And Actions
 565 ~~~~~~~~~~~~~~~~~~~~~~~~~
 566
 567 PORT Pattern Item
 568 ^^^^^^^^^^^^^^^^^
 569
 570 Matches traffic originating from (ingress) or going to (egress) a physical
 571 port of the underlying device.
 572
 573 Using this pattern item without specifying a port index matches the physical
 574 port associated with the current DPDK port ID by default. As described in
  575 `traffic steering`_, specifying it should rarely be needed.
 576
 577 - Matches **F** in `traffic steering`_.
 578
 579 PORT Action
 580 ^^^^^^^^^^^
 581
 582 Directs matching traffic to a given physical port index.
 583
 584 - Targets **F** in `traffic steering`_.
 585
 586 PORT_ID Pattern Item
 587 ^^^^^^^^^^^^^^^^^^^^
 588
 589 Matches traffic originating from (ingress) or going to (egress) a given DPDK
 590 port ID.
 591
 592 Normally only supported if the port ID in question is known by the
 593 underlying PMD and related to the device the flow rule is created against.
 594
 595 This must not be confused with the `PORT pattern item`_ which refers to the
 596 physical port of a device. ``PORT_ID`` refers to a ``struct rte_eth_dev``
 597 object on the application side (also known as "port representor" depending
 598 on the kind of underlying device).
 599
 600 - Matches **A**, **B** or **C** in `traffic steering`_.
 601
 602 PORT_ID Action
 603 ^^^^^^^^^^^^^^
 604
 605 Directs matching traffic to a given DPDK port ID.
 606
 607 Same restrictions as `PORT_ID pattern item`_.
 608
 609 - Targets **A**, **B** or **C** in `traffic steering`_.
 610
 611 PF Pattern Item
 612 ^^^^^^^^^^^^^^^
 613
 614 Matches traffic originating from (ingress) or going to (egress) the physical
 615 function of the current device.
 616
 617 If supported, should work even if the physical function is not managed by
 618 the application and thus not associated with a DPDK port ID. Its behavior is
 619 otherwise similar to `PORT_ID pattern item`_ using PF port ID.
 620
 621 - Matches **A** in `traffic steering`_.
 622
 623 PF Action
 624 ^^^^^^^^^
 625
 626 Directs matching traffic to the physical function of the current device.
 627
 628 Same restrictions as `PF pattern item`_.
 629
 630 - Targets **A** in `traffic steering`_.
 631
 632 VF Pattern Item
 633 ^^^^^^^^^^^^^^^
 634
 635 Matches traffic originating from (ingress) or going to (egress) a given
 636 virtual function of the current device.
 637
 638 If supported, should work even if the virtual function is not managed by
 639 the application and thus not associated with a DPDK port ID. Its behavior is
 640 otherwise similar to `PORT_ID pattern item`_ using VF port ID.
 641
  642 Note that this pattern item does not match VF representor traffic which, as
 643 separate entities, should be addressed through their own port IDs.
 644
 645 - Matches **D** or **E** in `traffic steering`_.
 646
 647 VF Action
 648 ^^^^^^^^^
 649
 650 Directs matching traffic to a given virtual function of the current device.
 651
 652 Same restrictions as `VF pattern item`_.
 653
 654 - Targets **D** or **E** in `traffic steering`_.
 655
 656 \*_ENCAP actions
 657 ^^^^^^^^^^^^^^^^
 658
 659 These actions are named according to the protocol they encapsulate traffic
  660 with (e.g. ``VXLAN_ENCAP``) and take protocol-specific parameters (e.g. VNI for
 661 VXLAN).
 662
 663 While they modify traffic and can be used multiple times (order matters),
 664 unlike `PORT_ID action`_ and friends, they have no impact on steering.
 665
  666 As described in `actions order and repetition`_, this means they are useless
  667 if used alone in an action list: the resulting traffic gets dropped unless
 668 combined with either ``PASSTHRU`` or other endpoint-targeting actions.
 669
 670 \*_DECAP actions
 671 ^^^^^^^^^^^^^^^^
 672
 673 They perform the reverse of `\*_ENCAP actions`_ by popping protocol headers
 674 from traffic instead of pushing them. They can be used multiple times as
 675 well.
 676
 677 Note that using these actions on non-matching traffic results in undefined
 678 behavior. It is recommended to match the protocol headers to decapsulate on
 679 the pattern side of a flow rule in order to use these actions or otherwise
 680 make sure only matching traffic goes through.
 681
 682 Actions Order and Repetition
 683 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 684
 685 Flow rules are currently restricted to at most a single action of each
 686 supported type, performed in an unpredictable order (or all at once). To
 687 repeat actions in a predictable fashion, applications have to make rules
 688 pass-through and use priority levels.
 689
 690 It's now clear that PMD support for chaining multiple non-terminating flow
 691 rules of varying priority levels is prohibitively difficult to implement
 692 compared to simply allowing multiple identical actions performed in a
 693 defined order by a single flow rule.
 694
 695 - This change is required to support protocol encapsulation offloads and the
 696   ability to perform them multiple times (e.g. VLAN then VXLAN).
 697
 698 - It makes the ``DUP`` action redundant since multiple ``QUEUE`` actions can
 699   be combined for duplication.
 700
 701 - The (non-)terminating property of actions must be discarded. Instead, flow
 702   rules themselves must be considered terminating by default (i.e. dropping
 703   traffic if there is no specific target) unless a ``PASSTHRU`` action is
 704   also specified.
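
The consequences listed above (ordered, repeatable actions and rules that
terminate by default) can be illustrated with a toy evaluator. This C sketch
is an assumption-laden model, not an rte_flow implementation: the ``demo_*``
names are hypothetical, encapsulation is reduced to counting pushed headers
and remembering the last one, and each endpoint-targeting action simply
counts as one delivered copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A few action kinds named in this document, as stand-in constants. */
enum demo_action {
    DEMO_VLAN_ENCAP,
    DEMO_VXLAN_ENCAP,
    DEMO_PORT_ID,
    DEMO_QUEUE,
    DEMO_PASSTHRU
};

struct demo_result {
    unsigned int encap_layers; /* headers pushed, in list order */
    unsigned int copies;       /* endpoint-targeting actions seen */
    int outermost;             /* last encapsulation applied, or -1 */
    bool passthru;
    bool dropped;              /* no target and no PASSTHRU */
};

/* Walk the action list in order and summarize the outcome. */
struct demo_result
demo_apply(const enum demo_action *actions, size_t n)
{
    struct demo_result r = { 0, 0, -1, false, false };
    size_t i;

    assert(actions != NULL || n == 0);
    for (i = 0; i < n; i++) {
        switch (actions[i]) {
        case DEMO_VLAN_ENCAP:
        case DEMO_VXLAN_ENCAP:
            /* Order matters: the last header pushed is outermost. */
            r.encap_layers++;
            r.outermost = (int)actions[i];
            break;
        case DEMO_PORT_ID:
        case DEMO_QUEUE:
            /* Repeating a target action duplicates traffic. */
            r.copies++;
            break;
        case DEMO_PASSTHRU:
            r.passthru = true;
            break;
        }
    }
    /* Rules terminate by default: without a target or PASSTHRU the
     * traffic is dropped. */
    r.dropped = (r.copies == 0 && !r.passthru);
    return r;
}
```

In this model, a list holding only ``DEMO_VXLAN_ENCAP`` ends up dropped,
VLAN then VXLAN followed by a target delivers one copy with two headers, and
two ``DEMO_QUEUE`` entries deliver two copies, which is the duplication case
that makes ``DUP`` redundant.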
 705
 706 Switching Examples
 707 ------------------
 708
 709 This section provides practical examples based on the established testpmd
  710 flow command syntax [2]_, in the context described in `traffic steering`_:
 711
::

      .-------------.                 .-------------. .-------------.
      | hypervisor  |                 |    VM 1     | |    VM 2     |
      | application |                 | application | | application |
      `--+---+---+--'                 `----------+--' `--+----------'
         |   |   |                               |       |
         |   |   `-------------------.           |       |
         |   `---------.             |           |       |
         |             |             |           |       |
   .----(A)----. .----(B)----. .----(C)----.     |       |
   | port_id 3 | | port_id 4 | | port_id 5 |     |       |
   `-----+-----' `-----+-----' `-----+-----'     |       |
         |             |             |           |       |
       .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
       | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
       `-+--'    `-----+-----' `-----+-----' `--(D)-' `-(E)--'
         |             |             |           |       |
         |             |   .---------'           |       |
         `-----.       |   |   .-----------------'       |
               |       |   |   |   .---------------------'
               |       |   |   |   |
            .--|-------|---|---|---|--.
            |  |       |   `---|---'  |
            |  |       `-------'      |
            |  `---------.            |
            `------------|------------'
                         |
                    .---(F)----.
                    | physical |
                    |  port 0  |
                    `----------'

By default, PF (**A**) can communicate with the physical port it is
associated with (**F**), while VF 1 (**D**) and VF 2 (**E**) are isolated
and restricted to communicating with the hypervisor application through
their respective representors (**B** and **C**) if supported.

Examples in subsequent sections apply to hypervisor applications only and
are based on port representors **A**, **B** and **C**.

.. [2] :ref:`Flow syntax <testpmd_rte_flow>`

Associating VF 1 with Physical Port 0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Assign all port traffic (**F**) to VF 1 (**D**) indiscriminately through
their representors:

::

   flow create 3 ingress pattern / end actions port_id id 4 / end
   flow create 4 ingress pattern / end actions port_id id 3 / end

A more practical example with MAC address restrictions:

::

   flow create 3 ingress
      pattern eth dst is {VF 1 MAC} / end
      actions port_id id 4 / end

::

   flow create 4 ingress
      pattern eth src is {VF 1 MAC} / end
      actions port_id id 3 / end

Sharing Broadcasts
~~~~~~~~~~~~~~~~~~

From outside to PF and VFs:

::

   flow create 3 ingress
      pattern eth dst is ff:ff:ff:ff:ff:ff / end
      actions port_id id 3 / port_id id 4 / port_id id 5 / end

Note that ``port_id id 3`` is necessary; otherwise only VFs would receive
matching traffic.

From PF to outside and VFs:

::

   flow create 3 egress
      pattern eth dst is ff:ff:ff:ff:ff:ff / end
      actions port / port_id id 4 / port_id id 5 / end

From VFs to outside and PF:

::

   flow create 4 ingress
      pattern eth dst is ff:ff:ff:ff:ff:ff src is {VF 1 MAC} / end
      actions port_id id 3 / port_id id 5 / end

   flow create 5 ingress
      pattern eth dst is ff:ff:ff:ff:ff:ff src is {VF 2 MAC} / end
      actions port_id id 3 / port_id id 4 / end

Similar ``33:33:*`` rules based on known MAC addresses should be added for
IPv6 traffic.
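
The VF rules follow a regular pattern: broadcasts sent by a VF go to the PF
port and to every other VF representor, never back to the sender. With more
than two VFs it may be easier to generate the commands than to write them by
hand. The sketch below shows one possible way to do so; ``vf_broadcast_rules``
is a hypothetical helper that is not part of DPDK or testpmd, and the port IDs
and MAC placeholders are those used in this section.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def vf_broadcast_rules(pf_port, vf_macs):
    """Build one testpmd flow command per VF representor.

    Hypothetical helper, not part of DPDK or testpmd. `vf_macs` maps each
    VF representor port ID to the MAC address of its VF. Broadcasts sent by
    a VF are forwarded to the PF port and to every other VF representor.
    """
    rules = []
    for port, mac in sorted(vf_macs.items()):
        # Every port except the one the broadcast came from.
        targets = [pf_port] + [p for p in sorted(vf_macs) if p != port]
        actions = " / ".join(f"port_id id {t}" for t in targets)
        rules.append(
            f"flow create {port} ingress "
            f"pattern eth dst is {BROADCAST} src is {mac} / end "
            f"actions {actions} / end"
        )
    return rules


if __name__ == "__main__":
    # MAC addresses are left as placeholders, as in the examples above.
    for rule in vf_broadcast_rules(3, {4: "{VF 1 MAC}", 5: "{VF 2 MAC}"}):
        print(rule)
```

Each generated command is a single line; the multi-line layout used in this
document is only for readability.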

Encapsulating VF 2 Traffic in VXLAN
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Assuming pass-through flow rules are supported:

::

   flow create 5 ingress
      pattern eth / end
      actions vxlan_encap vni 42 / passthru / end

::

   flow create 5 egress
      pattern vxlan vni is 42 / end
      actions vxlan_decap / passthru / end

Here ``passthru`` is needed since, as described in
`actions order and repetition`_, flow rules are otherwise terminating; if
supported, a rule without a target endpoint will drop traffic.

Without pass-through support, ingress encapsulation on the destination
endpoint might not be supported, and the action list must provide a target
endpoint explicitly:

::

   flow create 5 ingress
      pattern eth src is {VF 2 MAC} / end
      actions vxlan_encap vni 42 / port_id id 3 / end

   flow create 3 ingress
      pattern vxlan vni is 42 / end
      actions vxlan_decap / port_id id 5 / end
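
For reference, the ``vni 42`` value used in these rules ends up in the VXLAN
header added by the encapsulation. The Python sketch below builds and parses
only that 8-byte header, following the RFC 7348 layout. The outer Ethernet, IP
and UDP headers that a real encapsulation also adds are omitted, and the
helper names are hypothetical, not DPDK functions.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag defined by RFC 7348


def vxlan_header(vni):
    """Return the 8-byte VXLAN header carrying `vni`."""
    if not 0 <= vni < 1 << 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags (8 bits) followed by 24 reserved bits.
    # Word 1: VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def vxlan_vni(header):
    """Extract the VNI from an 8-byte VXLAN header."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag is not set")
    return vni_word >> 8


if __name__ == "__main__":
    header = vxlan_header(42)
    print(header.hex())       # 0800000000002a00
    print(vxlan_vni(header))  # 42
```

A rule matching ``vxlan vni is 42`` conceptually compares against the same
24-bit field that ``vxlan_vni`` extracts here.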