BBHash [cite/t:@bbhash] uses multiple layers to create a minimal perfect
hash function (MPHF) that hashes some input set of $n$ keys into $\{0, \dots, n-1\}$.
(See also my note on PTHash [cite:@pthash].)
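Simply said, it maps the keys into a bit array in each layer: a key that lands alone in its slot is fixed there, and keys that collide move on to the next layer.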
One bottleneck seems to be that a query may need lookups in multiple layers, each at a different position in memory. A possible idea:
- Using $h_0$, map all items into buckets with $\sim 100$ elements each.
- Hash each of those buckets using the original ‘recursive’ BBHash technique into a cacheline of $256$ bits.
This way, only two lookups should be needed (apart from the rank operation that follows).
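
A minimal sketch of this idea, to make the two steps concrete. Everything named here is an assumption for illustration: the seeded hash `h` (BLAKE2b as a stand-in), the fixed split `LAYER_SIZES` of the $256$-bit line into layers, the class name `TwoLevelMphf`, and the fallback dictionary for keys that still collide after the last layer. The rank is computed naively by counting bits; a real implementation would use a proper rank structure.

```python
import hashlib

# Assumed fixed split of one 256-bit line into layers (sums to 256). The
# sizes are an arbitrary illustrative choice, not taken from BBHash.
LAYER_SIZES = (96, 64, 40, 24, 16, 8, 8)


def h(key: bytes, seed: int) -> int:
    """Seeded 64-bit hash; BLAKE2b is only a convenient stand-in."""
    salt = seed.to_bytes(8, "little")
    digest = hashlib.blake2b(key, digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little")


def build_line(keys):
    """Layered placement of one bucket into a single bit line.

    Returns (line, leftover): `line` is an int used as a bitmask and
    `leftover` holds the keys that still collide after the last layer.
    """
    line, base, remaining = 0, 0, list(keys)
    for layer, size in enumerate(LAYER_SIZES):
        hits = {}
        for k in remaining:
            hits.setdefault(h(k, layer + 1) % size, []).append(k)
        # A slot is claimed only if exactly one remaining key maps to it.
        for slot, ks in hits.items():
            if len(ks) == 1:
                line |= 1 << (base + slot)
        remaining = [k for ks in hits.values() if len(ks) > 1 for k in ks]
        base += size
    return line, remaining


class TwoLevelMphf:
    def __init__(self, keys, bucket_size=100):
        keys = list(keys)  # assumed to be distinct byte strings
        self.num_buckets = max(1, len(keys) // bucket_size)
        buckets = [[] for _ in range(self.num_buckets)]
        for k in keys:
            buckets[h(k, 0) % self.num_buckets].append(k)  # h_0 picks the bucket
        self.lines, self.offsets, leftovers, placed = [], [], [], 0
        for bucket in buckets:
            line, rest = build_line(bucket)
            self.lines.append(line)
            self.offsets.append(placed)  # number of set bits in earlier lines
            placed += bin(line).count("1")
            leftovers += rest
        # Keys that did not fit in their line get the remaining values.
        self.fallback = {k: placed + i for i, k in enumerate(leftovers)}

    def lookup(self, key: bytes) -> int:
        b = h(key, 0) % self.num_buckets
        line, base = self.lines[b], 0  # all layers of this bucket share one line
        for layer, size in enumerate(LAYER_SIZES):
            pos = base + h(key, layer + 1) % size
            if line >> pos & 1:
                # Rank: set bits before `pos`, plus the bits of earlier lines.
                return self.offsets[b] + bin(line & ((1 << pos) - 1)).count("1")
            base += size
        return self.fallback[key]


keys = [f"key{i}".encode() for i in range(1000)]
mphf = TwoLevelMphf(keys)
assert sorted(mphf.lookup(k) for k in keys) == list(range(len(keys)))
```

Using the same layer layout for every bucket means no per-bucket metadata is needed besides the line itself and its rank offset, at the cost of pushing more keys into the fallback when a bucket is larger than average.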
I have no idea how the analysis works out though – maybe the recursive strategy works better on a larger scale, and when hashing so few items there isn’t much benefit to the BBHash approach in the first place.