The clock() function is not good practice to use for multiple
cores/threads, since it measures the CPU time used by the process
and not the wall clock time, while when running through multiple
cores/threads simultaneously, we can burn through CPU time much
faster.
As a result this commit will change the way of measurement to use
rd_tsc, and the results will be divided by the processor frequency.