Serge « sans paille » Guelton <serge.guelton@quarkslab.com>
CppCon 2018 -- USA / 23 -- 29 September 2018
under the influence of Chris Beck and Jérôme Dumesnil
with the great help of Austin McCartney's CMake knowledge
$ g++ --version g++ (Debian 7.3.0-11) 7.3.0
#include <array> extern const std::array<int, 4> uktabi = { 1, 3, 3, 7 };
$ objdump -S -Mintel -j .rodata uktabi.o ... Disassembly of section .rodata: 0000000000000000 <uktabi>: 0: 01 00 00 00 03 00 00 00 03 00 00 00 07 00 00 00 ................ $ objdump -S -Mintel -j .text.startup uktabi.o | wc -l objdump: section '.text.startup' mentioned in a -j option,\ but not found in any input file
#include <vector> extern const std::vector<int> shiv = { 1, 3, 3, 7 };
$ objdump -S -Mintel -j .rodata shiv.o ... Disassembly of section .rodata: 0000000000000000 <shiv>: 0: 01 00 00 00 03 00 00 00 03 00 00 00 07 00 00 00 ................ $ sobjdump -S -Mintel -j .text.startup shiv.o | wc -l 35
#include <set> const std::set<int> nekrataal = { 1, 8, 7 };
$ objdump -S -Mintel -j .rodata a.o ... 0: 01 00 00 00 08 00 00 00 07 00 00 00 $ objdump -S -Mintel -j .text.startup a.o | wc -l 74
std::vector | std::array |
std::set | ø |
std::map | ø |
std::unordered_map | ø |
std::unordered_set | ø |
std::vector | std::array |
std::set | frozen::set |
std::map | frozen::map |
std::unordered_map | frozen::unordered_set |
std::unordered_set | frozen::unordered_map |
template <class Key, std::size_t N, class Compare = std::less<Key>> class set;
Rational: a flat sorted std::array with binary-search.
Compare needs to be constexpr-compatible.
template <class Key, class Value, std::size_t N, class Compare = std::less<Key>> class map;
Same as frozen::map but stores key, value pairs.
With a constexpr constructor:
extern constexpr frozen::set<int, 3> nekrataal = { 1, 8, 7 };
$ objdump -S -Mintel -j .rodata nekrataal.o ... 0: 00 00 00 00 01 00 00 00 07 00 00 00 08 00 00 00 $ objdump -S -Mintel -j .text.startup nekrataal.o | wc -l objdump: section '.text.startup' mentioned in a -j option, \ but not found in any input file
extern constexpr frozen::map<int, bool, 2> suqata = { {7, true}, {8, false} };
$ objdump -S -Mintel -j .rodata suqata.o ... 10: 00 00 00 00 07 00 00 00 01 00 00 00 08 00 00 00 20: 00 00 00 00 $ objdump -S -Mintel -j .text.startup suqata.o | wc -l objdump: section '.text.startup' mentioned in a -j option, \ but not found in any input file
Sucessively look for all elements in a set of 32 integers:
------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------ BM_IntInFzSet 2174 ns 2163 ns 314909 BM_IntInStdSet 7401 ns 7371 ns 93767 BM_IntInStdArray 3594 ns 3581 ns 195584 BM_IntInStdArrayBinarySearch 3979 ns 3966 ns 176996
Sucessively look for elements not in a set of 32 integers:
------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------ BM_IntNotInFzSet 2168 ns 2162 ns 319424 BM_IntNotInStdSet 7668 ns 7649 ns 90889 BM_IntNotInStdArray 4811 ns 4800 ns 145907 BM_IntNotInStdArrayBinarySearch 4000 ns 3992 ns 175349
.o size: 6072 bytes
char const * enum_to_string(RELOC_i386 e) { switch(e) { case RELOC_i386::R_386_NONE: return "NONE"; case RELOC_i386::R_386_32: return "R32"; ...
.o size: 8496 bytes
const std::map<RELOC_i386, const char*> e2s = { { RELOC_i386::R_386_NONE, "NONE"}, { RELOC_i386::R_386_32, "R32"}, ... }; char const * enum_to_string(RELOC_i386 e) { return e2s.at(e); }
.o size: 4088 bytes
constexpr frozen::map<RELOC_i386, const char*, 41> e2s = { { RELOC_i386::R_386_NONE, "NONE"}, { RELOC_i386::R_386_32, "R32"}, ... }; char const * enum_to_string(RELOC_i386 e) { return e2s.at(e); }
no collision
memory usage in $mathcal{O}(n)$
gperf implements that.
gperf titan.in > titan.c
// titan.in %% Coeus Crius Cronus Hyperion ... Ophion Pallas Perses Prometheus Styx %%
constexpr frozen::unordered_set<frozen::string, 33> Titans = { "Coeus", "Crius", "Cronus", "Hyperion", "Iapetus", "Mnemosyne", "Oceanus", "Phoebe", "Rhea", "Tethys", "Theia", "Themis", "Asteria", "Astraeus", "Atlas", "Aura", "Clymene", "Dione", "Helios", "Selene", "Eos", "Epimetheus", "Eurybia", "Eurynome", "Lelantos", "Leto", "Menoetius", "Metis", "Ophion", "Pallas", "Perses", "Prometheus", "Styx", };
gperf generated code may be faster than frozen's ($times$ 2 or 3)
Customization point!
struct olaf { constexpr std::size_t operator()( frozen::string value, std::size_t seed) const { std::size_t d = seed ^ value.size(); d = (d + 0x01000193) ^ value[0]; d = (d + 0x01000193) ^ value[1]; return d; } }; constexpr frozen::unordered_set<frozen::string, 33, olaf> Titans = ...
template <class Key, std::size_t N, class Hash = elsa<Key>, class KeyEqual = std::equal_to<Key>> class unordered_set;
Rational: not a one-liner :-)
Hash and KeyEqual needs to be constexpr
template <class Key, class Value, std::size_t N, class Hash = anna<Key>, class KeyEqual = std::equal_to<Key>> class unordered_map;
Hash and KeyEqual needs to be constexpr
Sucessively look for elements in a unordered_set of 32 strings:
------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------ BM_StrInFzUnorderedSet 2583 ns 2583 ns 268092 BM_StrInStdUnorderedSet 4802 ns 4800 ns 145135 BM_StrInStdArray 4829 ns 4829 ns 146107
Sucessively look for elements not in an unordered_set of 32 strings:
------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------ BM_StrNotInFzUnorderedSet 2619 ns 2619 ns 266644 BM_StrNotInStdUnorderedSet 4204 ns 4204 ns 166675 BM_StrNotInStdArray 9392 ns 9392 ns 73318
#include <frozen/set.h> constexpr frozen::set<unsigned, 15> primes = { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47}; bool is_small_prime(int n) { return primes.count(n); }
Why is it faster?
$ clang a.c -O2 -std=c++14 -c $ objdump -S -Mintel -j .text a.o 0000000000000000 <_Z14is_small_primei>: 0: 83 ff 13 cmp $0x13,%edi 3: b8 00 00 00 00 mov $0x0,%eax 8: b9 00 00 00 00 mov $0x0,%ecx d: 48 0f 47 c8 cmova %rax,%rcx 11: 39 79 0c cmp %edi,0xc(%rcx) 14: 48 8d 41 10 lea 0x10(%rcx),%rax 18: 48 0f 43 c1 cmovae %rcx,%rax 1c: 39 78 04 cmp %edi,0x4(%rax) 1f: 48 8d 48 08 lea 0x8(%rax),%rcx 23: 48 0f 43 c8 cmovae %rax,%rcx 27: 39 39 cmp %edi,(%rcx) 29: 48 8d 41 04 lea 0x4(%rcx),%rax 2d: 48 0f 43 c1 cmovae %rcx,%rax 31: b9 00 00 00 00 mov $0x0,%ecx 36: 48 39 c8 cmp %rcx,%rax 39: 74 06 je 41 <_Z14is_small_primei+0x41> 3b: 39 38 cmp %edi,(%rax) 3d: 0f 96 c0 setbe %al 40: c3 retq 41: 31 c0 xor %eax,%eax 43: c3 retq
template <class ForwardIt, std::size_t N> constexpr ForwardIt doit_fast( ForwardIt first, std::integral_constant<std::size_t, N>) { auto constexpr step = N / 2; static_assert(N/2 == N - N / 2 - 1, "power of two minus 1"); auto it = first + step; auto next_it = compare_(*it, value_) ? it + 1 : first; return doit_fast(next_it, std::integral_constant<std::size_t, N / 2>{}); }
constexpr std::size_t operator()(string value, std::size_t seed) const { std::size_t d = seed; for (std::size_t i = 0; i < value.size(); ++i) d = (d * 0x01000193) ^ value[i]; return d; }
Sounds random? It is! But there's a paper that states it's in \(\mathcal{O}(n)\) :-)
image credits: http://stevehanov.ca/blog/index.php?id=119
How to design a container that can grow while staying constexpr?
template <class T, std::size_t N> class cvector { T data [N] = {}; std::size_t dsize = 0; ... };
Problems:
Problems:
→ Once the seed is fixed, a PRNG is 100% constexpr compatible.
std::map<unsigned, char> mesa = {{1, '1'}, {2, '2'}}; frozen::map<unsigned, char, 2> serra = {{1, '1'}, {2, '2'}};
Problem:
template<size_t N> constexpr void init(std::initializer_list<int> s) { // not ok: `s' is not a constant expression static_assert(N == s.size(), "size mismatch"); // ok with recent compilers assert(N == s.size() && "size mismatch"); }
Should a header-only library force a dependency on exceptions?
cf. specification of map::at
#if defined(FROZEN_NO_EXCEPTIONS) ||\ (defined(_MSC_VER) && !defined(_CPPUNWIND)) ||\ (defined(__cpp_exceptions) && __cpp_exceptions) #include <cstdlib> #define FROZEN_THROW_OR_ABORT(_) std::abort() #else #include <stdexcept> #define FROZEN_THROW_OR_ABORT(err) throw err #endif
constexpr frozen::map<frozen::string, c_keyword, 32> Keywords{ {"auto", KW_AUTO}, {"break", KW_BREAK}, ... {"volatile", KW_VOLATILE}, };
struct codes_t { uint32_t iCodepoint1; uint32_t iCodepoint2{0}; }; constexpr std::pair<frozen::string, codes_t> s_Entities[] { { "AElig", { 0xC6 }}, // +2000 entities here { "zwnj" , { 0x200C }}, }; constexpr auto s_NamedEntitiesHTML4 = frozen::make_unordered_map(s_Entities);
Date: Wed, 28 Mar 2018 15:21:51 +0000 From: Chris Beck <xxxxxxxxx> To: serge guelton <yyyyyyyy> Subject: Re: Frozen use at Tesla
Tesla autopilot uses shared memory segments (...).
At time of writing, we have an enum that describes all of the different shared memory segments used by different parts of the program, and a struct that describes the configuration of each segment – (...).
In the past we had a const std::map for this. Now we use a frozen::map, which simplifies the startup of the tasks.
template<class F, class T, T... Vs> struct memoized { auto operator()(T v) const { static constexpr frozen::map<T, decltype(F{}(v)), sizeof...(Vs)> cached = {{Vs, F{}(Vs)}...}; auto where = cached.find(v); if(where != cached.end()) { return where->second; } return F{}(v); } };
All methods calls are constexpr thus...
#include <frozen/set.h> static constexpr frozen::set<unsigned, 3> supported_sizes = { 1, 2, 4 }; static_assert(supported_sizes.count(sizeof(long)), "unsupported size");
Set the maximum nested evaluation depth for C++11 constexpr functions to N.
Set the maximum number of iterations for a loop in C++14 constexpr functions to N.
Maximum number of steps in constexpr function evaluation
Frozen containers actually have two characteristics:
The C++ compiler can make step 1. \(\mathcal{O}(1)\) and depending on the context, step 2. too.
Any other algorithms that work that way?
yes, algorithms with a precomputed table, like...
From c++17
std::search(in.begin(), in.end(), std::boyer_moore_searcher(needle.begin(), needle.end())
Make the init phase constexpr!
std::string haystack = "ABC ABCDAB ABCDABCDABDE"; auto index = frozen::search( haystack.begin(), haystack.end(), frozen::make_boyer_moore_searcher("ABCDABD") );
see also the regexp talk by Hana Dusíková
LIEF - Library to Instrument Executable Formats https://lief.quarkslab.com
Contributor to the project!
Uses frozen for:
Store a colormap:
constexpr frozen::map<char, std::array<char, 3>, 5> ColorMap{ {'R', {(char)0xFF, (char)0x00, (char)0x00}}, {'G', {(char)0x00, (char)0xFF, (char)0x00}}, {'B', {(char)0x00, (char)0x00, (char)0xFF}}, {'#', {(char)0x00, (char)0x00, (char)0x00}}, {' ', {(char)0xFF, (char)0xFF, (char)0xFF}}, }
And use it to turn ASCIIART into PPM at compile time
constexpr unsigned char bytes[] = " ###### " " ##GGGG ## " " # GGGG # " " # GGGGGG # " " # GG GG # " " #GGG GGGGG# " " #GGG GG G# " " # GG G # " " # GG GG # " " # GGGGGGGGG G# " " # GG########GGG# " " ### # # ### " " # # # # " " # # " " # # " " ######## " };
Thanks to Quarkslab for allowing me to spend time on that project.
Kudos to Chris Beck and Jérôme Dumesnil for the common work!
$ lynx https://github.com/serge-sans-paille/frozen
Space | Forward |
---|---|
Right, Down, Page Down | Next slide |
Left, Up, Page Up | Previous slide |
G | Go to slide number |
P | Open presenter console |
H | Toggle this help |