net/ice: improve performance of Rx timestamp offload
Previously, each time a burst of packets is received, SW reads HW
register and assembles it and the timestamp from descriptor together to
get the complete 64 bits timestamp.
This patch optimizes the algorithm. The SW only needs to check the
monotonicity of the low 32bits timestamp to avoid crossing borders.
Each time before SW receives a burst of packets, it should check the
time difference between current time and last update time to avoid
the low 32 bits timestamp cycling twice.
The patch proved a 50% ~ 70% single core performance improvement on a
main stream Xeon server, this fix the performance gap for some use cases.