carb::container::detail::RobinHood

Defined in carb/container/RobinHoodImpl.h

template<size_t LoadFactorMax100, class Key, class ValueType, class KeyFromValue, class Hasher, class Equals>
class RobinHood

Implements a “Robin Hood” open-addressing hash container that can either store keys alone or key/value pairs; this template is not meant to be used directly&#8212;instead use RHUnorderedSet, RHUnorderedMap, RHUnorderedMultimap, or RHUnorderedMultiset.

In an open-addressing (“OA”) hash table, the contained items are stored in the buckets directly. Contrast this with traditional hash tables that typically have a level of indirection: buckets point to the head of a linked-list that contains every item that hashes to that bucket. Open-addressing hash tables are great for using contiguous memory, whereas traditional hash tables have a separate allocation per node and fragment memory. However, OA hash tables have a couple downsides: if a collision occurs on insertion, probing must happen until an open spot is found where the item can be placed. For a find operation, probing must continue until an empty spot is reached to make sure that all keys have been checked. When erasing an item, a “deleted” marker must be put in its place so that probing past the key can continue. This system also gives advantage to earlier insertions and penalizes later collisions.

The Robin Hood algorithm for open-addressing hashing was first postulated by Pedro Celis in 1986: https://cs.uwaterloo.ca/research/tr/1986/CS-86-14.pdf. Simply put, it applies a level of fairness to locality of items within the OA hash table. This is done by tracking the distance from an items ideal insertion point. Similarly the distance-from-ideal can be easily computed for existing locations that are probed. Once a probed location for a new item will cause the new item to be worse off (farther from ideal insertion) than the existing item, the new item can “steal” the location from the existing item, which must then probe until it finds a location where it is worse off than the existing item, and so on. This balancing of locality has beneficial side effects for finding and erasing too: when searching for an item, once a location is reached where the item would be worse off than the existing item, probing can cease with the knowledge that the item is not contained.

OA hash tables cannot be direct drop-in replacements for closed-addressing hash containers such as std::unordered_map as nearly every modification to the table can potentially invalidate any other iterator.

Open-addressing hash tables may not be a good replacement for std unordered containers in cases where the key and/or value is very large (though this may be mitigated somewhat by using indirection through std::unique_ptr). Since OA hash tables must carry the size of each value_type, having a low load factor (or a high capacity() to size() ratio) wastes a lot of memory, especially if the key/value pair is very large.

It is important to keep OA hash tables as compact as possible, as operations like clear() and iterating over the hash table are O(n) over capacity(), not size(). You can always ensure that the hash table is as compact as possible by calling rehash(0).

Because of the nature of how elements are stored in this hash table, there are two iterator types: iterator and find_iterator (both with const versions). These types can be compared with each other, but incrementing these objects works differently. iterator and const_iterator traverse to the next item in the container, while find_iterator and const_find_iterator will only traverse to the next item with the same key. In multi-key containers, items with the same key may not necessarily be stored adjacently, so incrementing iterator may not encounter the next item with the same key as the previous. For unique-key containers, incrementing a find_iterator will always produce end() since keys are guaranteed to be unique.

Subclassed by carb::container::RHUnorderedMultiset< Key, Hasher, Equals, LoadFactorMax100 >, carb::container::RHUnorderedSet< Key, Hasher, Equals, LoadFactorMax100 >

Public Functions

inline const_iterator cbegin() const

Creates an iterator to the first element in the container.

Returns

a const_iterator to the first element in the container. If the container is empty() the iterator will be equal to cend().

inline iterator begin()

Creates an iterator to the first element in the container.

Returns

an iterator to the first element in the container. If the container is empty() the iterator will be equal to end().

inline const_iterator begin() const

Creates an iterator to the first element in the container.

Returns

a const_iterator to the first element in the container. If the container is empty() the iterator will be equal to end().

inline const_iterator cend() const

Creates an iterator to the past-the-end element in the container.

Returns

a const_iterator to the past-the-end element in the container. This iterator is a placeholder; attempting to access it results in undefined behavior.

inline iterator end()

Creates an iterator to the past-the-end element in the container.

Returns

an iterator to the past-the-end element in the container. This iterator is a placeholder; attempting to access it results in undefined behavior.

inline const_iterator end() const

Creates an iterator to the past-the-end element in the container.

Returns

a const_iterator to the past-the-end element in the container. This iterator is a placeholder; attempting to access it results in undefined behavior.

inline bool empty() const noexcept

Checks whether the container is empty.

Returns

true if the container contains no elements; false otherwise.

inline size_t size() const noexcept

Returns the number of elements contained.

O(1)

Returns

the number of elements contained.

inline constexpr size_t max_size() const noexcept

Returns the maximum possible number of elements.

O(1)

Returns

the maximum possible number of elements.

inline size_t capacity() const noexcept

Returns the number of elements that can be stored with the current memory usage.

This is based on the LoadFactorMax100 percentage and the current power-of-two memory allocation size. O(1)

See also

reserve()

Returns

the number of elements that can be stored with the current memory usage.

inline void clear()

Clears the contents.

O(n) over capacity()

Erases all elements from the container. After this call size() returns zero. Invalidates all iterators, pointers and references to contained elements.

Note

This does not free the memory used by the container; to free the hash table memory, use rehash(0) after this call.

inline void swap(RobinHood &other)

Swaps the contents of two containers.

O(1)

Exchanges the contents of *this with other. Will not invoke any move/copy/swap operations on the individual elements.

All iterators are invalidated for both containers. However, pointers and references to contained elements remain valid.

The Hasher and KeyEqual template parameters must be Swappable.

Parameters

other – The other container to swap with *this.

inline iterator erase(const_iterator pos)

Removes the given element.

References, pointers and iterators to the erased element are invalidated. All other iterators, pointers and references remain valid.

Parameters

pos – The iterator to the element to remove. This iterator must be valid and dereferenceable.

Returns

the iterator immediately following pos.

inline iterator erase(const_iterator first, const_iterator last)

Removes the elements in the given range.

The range [first, last) must be a valid range in *this. References, pointers and iterators to erased elements are invalidated. All other iterators, pointers and references to other elements remain valid.

Parameters
  • first – The start of the range of iterators to remove.

  • last – The past-the-end iterator for the range to remove.

Returns

last

inline find_iterator erase(const_find_iterator pos)

Removes the given element.

References, pointers and iterators to the erased element are invalidated. All other iterators, pointers and references remain valid.

Parameters

pos – The const_find_iterator to the element to remove. This iterator must be valid and dereferenceable.

Returns

the find_iterator immediately following pos.

inline find_iterator erase(const_find_iterator first, const_find_iterator last)

Removes the elements in the given range.

The range [first, last) must be a valid range in *this. References, pointers and iterators to erased elements are invalidated. All other iterators, pointers and references to other elements remain valid.

Parameters
  • first – The start of the range of iterators to remove.

  • last – The past-the-end iterator for the range to remove.

Returns

last

inline find_iterator find(const key_type &key)

Finds the first element with the specified key.

Note

find_iterator objects returned from this function will only iterate through elements with the same key; they cannot be used to iterate through the entire container.

Parameters

key – The key of the element(s) to search for.

Returns

a find_iterator to the first element matching key, or end() if no element was found matching key.

inline const_find_iterator find(const key_type &key) const

Finds the first element with the specified key.

Note

const_find_iterator objects returned from this function will only iterate through elements with the same key; they cannot be used to iterate through the entire container.

Parameters

key – The key of the element(s) to search for.

Returns

a const_find_iterator to the first element matching key, or cend() if no element was found matching key.

inline bool contains(const key_type &key) const

Returns whether there is at least one element matching a given key in the container.

Note

This function can be faster than count() for multimap and multiset containers.

Parameters

key – The key of the element to search for.

Returns

true if at least one element matching key exists in the container; false otherwise.

inline std::pair<find_iterator, find_iterator> equal_range(const key_type &key)

Returns a range containing all elements with the given key.

Note

find_iterator objects returned from this function will only iterate through elements with the same key; they cannot be used to iterate through the entire container.

Parameters

key – The key of the element(s) to search for.

Returns

A pair containing a pair of iterators for the desired range. If there are no such elements, both iterators will be end().

inline std::pair<const_find_iterator, const_find_iterator> equal_range(const key_type &key) const

Returns a range containing all elements with the given key.

Note

const_find_iterator objects returned from this function will only iterate through elements with the same key; they cannot be used to iterate through the entire container.

Parameters

key – The key of the element(s) to search for.

Returns

A pair containing a pair of iterators for the desired range. If there are no such elements, both iterators will be end().

inline void reserve(size_t n)

Reserves space for at least the specified number of elements and regenerates the hash table.

Sets capacity() of *this to a value greater-than-or-equal-to n. If capacity() already exceeds n, nothing happens.

If a rehash occurs, all iterators, pointers and references to existing elements are invalidated.

Parameters

n – The desired minimum capacity of *this.

inline void rehash(size_t n)

Sets the capacity of the container to the lowest valid value greater-than-or-equal-to the given value, and rehashes the container.

If n is less-than size(), size() is used instead.

The value is computed as if by cpp::bit_ceil(std::ceil(float(n * 100) / float(LoadFactorMax100))) with a minimum size of 8.

If the container is empty and n is zero, the memory for the container is freed.

After this function is called, all iterators, pointers and references to existing elements are invalidated.

Parameters

n – The minimum capacity for the container. The actual size of the container may be larger than this.