Hashmaps Benchmarks

栏目: IT技术 · 发布时间: 4年前

内容简介:I’ve spent a long time developing myThat’s why I have now spent considerable time to create a highly improved benchmarks, where I have tried to remedy, but I am still very pleased with the results.

Table of Contents

I’ve spent a long time developing my robin_hood::unordered_map , and after claiming that it is now the fastest hashmap I understandably got quite a few skeptic comments. Some of the comments were quite right, and my benchmarks were not as unbiased as they could be, I did not test as many unordered maps as I should have, my compiler options were not choosen well, and so on.

That’s why I have now spent considerable time to create a highly improved benchmarks, where I have tried to remedy

all most of the critique that I got. The results are not as flattering to my robin_hood::unordered_map

, but I am still very pleased with the results.

What is actually Benchmarked?

This benchmark has evalued 20 different unordered_map implementations, each with 5 different hashing implementations. So there are a total of 20*5 = 100 hashmap variants to benchmark. Each of this 100 hashmaps was evaluated in 10 different benchmarks, so in total 1000 benchmark evaluations. I ran each benchmark 9 times and show the median, to filter out any outliers. So in total I ran 9000 benchmarks, which took about 6 days on my Intel i7-8700 at 3200 MHz. To get highly accurate results, I’ve isolated a core to only benchmarking , and disabled all frequency scaling .

Hashmaps

  • Google’s Abseil ’s abseil::flat_hash_map , abseil::node_hash_map . They are brand new, and have just recently pushed the boundary on what’s possible to achieve for unordered_maps. It uses several interesting optimizations, described in CppCon 2017: Matt Kulukundis “Designing a Fast, Efficient, Cache-friendly Hash Table, Step by Step .
  • boost multiindex : boost::multi_index::hashed_unique . Boost.MultiIndex is a versatile container that is highly configurable, it’s main features is not speed but it’s versatility. It is not a straight forward std::unordered_map replacement, the implementation for the wrapper was thankfully provided by joaquintides .
  • Boost’s unordered map boost::unordered_map is very similar to std::unordered_map , just boosts (older) version before std::unordered_map was a thing. I’ve tested with boost version 1.65.1.
  • EASTL has eastl::hash_map . The Electronic Arts Standard Template Library, an STL implementation with emphasis on high performance. It seems to be a bit dated though.
  • Facebook’s folly : folly::F14ValueMap and folly::F14NodeMap . C++14 conform and high performance in mind. The maps are described in the F14 Hash Table document.
  • greg7mdp’s parallel hashmap : phmap::flat_hash_map and phmap::node_hash_map are closely based on Abseil’s map, but simpler to integrate since they are header only. phmap::parallel_flat_hash_map and phmap::parallel_node_hash_map use a novel improvement that makes the maps a tad slower but usable in parallel. Also, peak memory requirements are a bit lower. Read more in “ The Parallel Hashmap ”.
  • greg7mdp’s sparsepp : spp::sparse_hash_map tuned to be memory efficient.
  • ktprime’s HashMap : A rather unknown implementation emilib1::HashMap by /u/huangyuanbing . It might not be as stable and well tested as other implementations here, but the numbers look very promising.
  • martinus’s robin-hood-hashing : A single-file header-only implementation that contains robin_hood::unordered_flat_map and robin_hood::unordered_node_map . I am the author of the maps, so I might not be perfectly unbiased… The numbers won’t lie though, and I try to be as objective as possible.
  • Malte Skarupke’s Bytell Map After first claiming I Wrote The Fastest Hashtable and later A new fast hash table in response to Google’s new fast hash table , his maps ska::flat_hash_map and ska::bytell_hash_map are obvious choices for this benchmark.
  • std::unordered_map Of course, the standard implementation of std::unordered_map has to be included has well Since I am using g++ 8.2, this uses the libstdc++ implementation.
  • tessil’s maps : Tessil has done lots and lots of work on hashmaps, in all kinds of flavours. Here I am benchmarking tsl::hopscotch_map , tsl::robin_map , and tsl::sparse_map . They are all available on github .

Hashes

Some hashmap implementations come with their own hashing methods, each with different properties. In my benchmarks I have used either integral types like int or uint64_t , and std::string as the keys.

  • Abseil’s Hash absl:Hash : An extremely fast hash, that works very well in all situations I have tested.
  • FNV1a A very simple hash that is used by Microsoft in Visual Studio 2017. Interestingly, they even use this byte-wise hash for integral types. My benchmark has its own implementation, but in my experiments it has produced the same assembler code as the original Microsoft variant.
  • Folly’s Hash folly::hasher : Unfortunately I could not find any documentation. It seems to be well optimized and uses native crc instruction if available. Unfortunately the result is only a 32bit hash which can work badly for some hashmap variants.
  • libstdc++-v3 simply casts integral types to size_t and uses this as a hash function. It is obviously the fastest hash, but many hashmap implementations rely on a somewhat good avalanching hash quality so this seems to be a rather bad choice.
  • martinus’s robin-hood-hashing robin_hood::hash is based on abseil’s hash for integral types, with minor modifications.

How is benchmarked?

Build

  • I’ve used g++ 8.2.0 with -O3 -march=native :
    g++-8 (Ubuntu 8.2.0-1ubuntu2~18.04) 8.2.0
  • CMake build is done with Release mode
  • and I’ve set FOLLY_CXX_FLAGS to -march=native .
  • For the ktprime map benchmarks I had to add -fno-strict-aliasing .

System Configuration

  • All benchmarks were run on an Linux. uname -a output:
    Linux dualit 4.15.0-47-generic #50-Ubuntu SMP Wed Mar 13 10:44:52 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
  • Processor Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz, locked to 3200 MHz.
  • Isolated a core with it’s hyperthreading companion by editing /etc/default/grub and changing GRUB_CMDLINE_LINUX_DEFAULT so it looks like this:
    GRUB_CMDLINE_LINUX_DEFAULT="quiet splash isolcpus=5,11 rcu_nocbs=5,11"
  • Turbo boost and frequency scaling were disabled with the python tool perf with the command
    sudo python3 -m perf system tune

    This sets cores 5 and 11 to 3200 MHz, sets scaling governor to performance , disables Turbo Boost, sets irqbalances service to inactive, IRQ affinity to all CPUs except 5 and 11.

  • Each benchmarks is run in a separately started process.
  • Isolated cores are used with taskset -c 5,11
  • To get rid of any potential outliers and to average effects of ASLR , all benchmarks were run 9 times and I show only the median result.

Benchmarks

Enough talk, onwards to the benchmarks!


以上就是本文的全部内容,希望对大家的学习有所帮助,也希望大家多多支持 码农网

查看所有标签

猜你喜欢:

本站部分资源来源于网络,本站转载出于传递更多信息之目的,版权归原作者或者来源机构所有,如转载稿涉及版权问题,请联系我们

微商思维

微商思维

龚文祥、罗剑锋、触电会 / 金城出版社 / 2018-7 / 88.00元

微商不仅仅是一种继传统实体、电商之后的革命性新兴商业形态,更是一种能够写入中国商业史的思潮。龚文祥新著《微商思维》,从道的层面对广大微商人的商业实践智慧进行了高度浓缩与抽象总结,站在更高的视角解读微商背后的商业逻辑与本质。 本书前半部分,主要从本质、品牌、营销等几个方面,阐述了微商思维的内涵及应用场景,帮助读者了解并认识这种革命性的商业思维。 后半部分主要是触电会社群内部各位大咖的实操......一起来看看 《微商思维》 这本书的介绍吧!

RGB转16进制工具
RGB转16进制工具

RGB HEX 互转工具

MD5 加密
MD5 加密

MD5 加密工具

XML 在线格式化
XML 在线格式化

在线 XML 格式化压缩工具