On hashing performance

I just thought of a quick and dirty way to informally test the performance of data hashing.

Setting a baseline

dd if=/dev/zero bs=64k status=progress | cat >/dev/null
24847384576 bytes (25 GB, 23 GiB) copied, 4 s, 6.2 GB/s

So we get 6.2GB/s when doing nothing. Note that, in my experience, the throughput depends heavily on the chosen block size; 64k gave the best performance on my Core i5-3570. Also note that running just dd, without piping the output, gives roughly 3x the performance:

dd if=/dev/zero bs=64k status=progress of=/dev/null
78377648128 bytes (78 GB, 73 GiB) copied, 4 s, 19.6 GB/s
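Since the block size matters so much, it can be worth sweeping a few values before settling on one. The following is only a rough sketch: the function name sweep_block_sizes, the list of block sizes and the 256 MiB default are all assumptions of mine, not part of the original test. It uses count= so that each run moves a fixed amount of data and ends on its own, and it keeps only the summary line that dd prints on stderr.

```shell
#!/bin/sh
# Hypothetical helper: push a fixed amount of zeros through "dd | cat"
# with several block sizes and print dd's summary line for each.
# Usage: sweep_block_sizes [total_MiB]   (default 256, an arbitrary choice)
sweep_block_sizes() {
    mib=${1:-256}
    for bs_kib in 4 16 64 256 1024; do
        # Number of blocks needed to move $mib MiB at this block size.
        count=$((mib * 1024 / bs_kib))
        # dd reports on stderr; redirect it into the pipe and keep the last line.
        summary=$( { dd if=/dev/zero bs="${bs_kib}k" count="$count" | cat >/dev/null; } 2>&1 | tail -n 1)
        echo "bs=${bs_kib}k: $summary"
    done
}

sweep_block_sizes 256
```

The absolute numbers will differ from the ones above because the runs are much shorter, so treat them as a relative comparison between block sizes only.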

Testing sha1sum

Well, sha1sum gives roughly 740MB/s. That is plenty of speed to not be a bottleneck when hashing files from a hard disk (modern hard disks typically read ~200MB/s). With a decent SSD (~500MB/s) there is not much headroom left, and if you are hashing from one of these then you will be in trouble.

dd if=/dev/zero bs=64k status=progress | sha1sum
7405109248 bytes (7.4 GB, 6.9 GiB) copied, 10 s, 741 MB/s

For comparison, on my Atom D525 NAS the performance was much, much worse, at 92MB/s.

Testing md5sum

The same test using md5sum on the i5-3570 gives a roughly similar throughput of 681MB/s:

dd if=/dev/zero bs=64k status=progress | md5sum 
6807289856 bytes (6.8 GB, 6.3 GiB) copied, 10 s, 681 MB/s

What is surprising, though, is that the same test on the Atom D525 gives an unexpected and respectable 205MB/s. So md5sum might be of some use on this aging and slow hardware, as it is more than 2x faster than sha1sum on the same machine.
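To compare the two tools on the same machine under identical conditions, the runs can be wrapped in a small loop. This is a sketch under assumptions: the function name bench_hashes and the 512 MiB default are made up for illustration, and the data volume is fixed with count= so that each run stops by itself instead of needing Ctrl-C.

```shell
#!/bin/sh
# Hypothetical helper: feed the same amount of zeros to each hash tool
# and print dd's summary line, which includes the throughput.
# Usage: bench_hashes [total_MiB]   (default 512, an arbitrary choice)
bench_hashes() {
    mib=${1:-512}
    for tool in md5sum sha1sum; do
        # Skip tools that are not installed instead of failing.
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool: not installed, skipping"
            continue
        fi
        # 16 blocks of 64k make 1 MiB; the digest itself is discarded.
        summary=$( { dd if=/dev/zero bs=64k count=$((mib * 16)) | "$tool" >/dev/null; } 2>&1 | tail -n 1)
        echo "$tool: $summary"
    done
}

bench_hashes 512
```

Other coreutils hashers can be appended to the tool list in the for loop if they are of interest.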

The test is highly unscientific: it does not involve real data, the versions of GNU coreutils are slightly different on the two machines (8.29 on the i5 vs 8.26 on the Atom), using dd for the benchmark is questionable, etc. I was just looking for some ballpark numbers though, so there you go!
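For anyone who wants to repeat the measurement on real data, the same pipeline can read from a file instead of /dev/zero. Again, this is just a sketch: hash_file and the BENCH_FILE environment variable are hypothetical names, not something the tests above used. Keep in mind that a second run over the same file is likely to be served from the page cache and will then measure the hash rather than the disk.

```shell
#!/bin/sh
# Hypothetical helper: stream a real file through a hash tool and print
# dd's summary line (bytes copied, elapsed time, throughput).
# Usage: hash_file TOOL FILE
hash_file() {
    tool=$1
    file=$2
    # dd reports on stderr; the digest on stdout is discarded.
    { dd if="$file" bs=64k | "$tool" >/dev/null; } 2>&1 | tail -n 1
}

# Only runs when BENCH_FILE (an assumed variable name) points at a file,
# e.g. BENCH_FILE=/path/to/large.iso sh this_script.sh
if [ -n "${BENCH_FILE:-}" ]; then
    hash_file sha1sum "$BENCH_FILE"
fi
```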
