net/virtio: fix performance regression due to TSO
authorYuanhan Liu <yuanhan.liu@linux.intel.com>
Wed, 11 Jan 2017 04:27:11 +0000 (12:27 +0800)
committerYuanhan Liu <yuanhan.liu@linux.intel.com>
Mon, 30 Jan 2017 13:33:04 +0000 (14:33 +0100)
TSO is now enabled, but it's not actually being used by default in a
simple L2 forward mode. In such case, we have to zero the virtio net
headers, to inform the vhost backend that no offload is being used:

    hdr->csum_start = 0;
    hdr->csum_offset = 0;
    hdr->flags = 0;

    hdr->gso_type = 0;
    hdr->gso_size = 0;
    hdr->hdr_len = 0;

Such writes could be very costly; it introduces severe cache issues:
The above operations introduce cache write for each packet, which
stalls the read operation from the vhost backend.

The fact that virtio net header is initiated to zero in PMD driver
init stage means that these costly writes are unnecessary and could
be avoided:

    if (hdr->csum_start != 0)
        hdr->csum_start = 0;

And that's what the macro ASSIGN_UNLESS_EQUAL does. With this, the
performance drop introduced by TSO enabling is recovered: it could
be up to 20% in micro benchmarking.

Fixes: 58169a9c8153 ("net/virtio: support Tx checksum offload")
Fixes: 696573046e9e ("net/virtio: support TSO")
Cc: stable@dpdk.org
Cc: Olivier Matz <olivier.matz@6wind.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
drivers/net/virtio/virtio_rxtx.c

index b29565e..2194837 100644 (file)
@@ -267,6 +267,12 @@ tx_offload_enabled(struct virtio_hw *hw)
                vtpci_with_feature(hw, VIRTIO_NET_F_HOST_TSO6);
 }
 
+/* avoid write operation when necessary, to lessen cache issues */
+#define ASSIGN_UNLESS_EQUAL(var, val) do {     \
+       if ((var) != (val))                     \
+               (var) = (val);                  \
+} while (0)
+
 static inline void
 virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct rte_mbuf *cookie,
                       uint16_t needed, int use_indirect, int can_push)
@@ -346,9 +352,9 @@ virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct rte_mbuf *cookie,
                        break;
 
                default:
-                       hdr->csum_start = 0;
-                       hdr->csum_offset = 0;
-                       hdr->flags = 0;
+                       ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
+                       ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
+                       ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
                        break;
                }
 
@@ -364,9 +370,9 @@ virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct rte_mbuf *cookie,
                                cookie->l3_len +
                                cookie->l4_len;
                } else {
-                       hdr->gso_type = 0;
-                       hdr->gso_size = 0;
-                       hdr->hdr_len = 0;
+                       ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
+                       ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
+                       ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0);
                }
        }