carb::container::RHUnorderedSet

Defined in carb/container/RHUnorderedSet.h

template<class Key, class Hasher = std::hash<Key>, class Equals = std::equal_to<Key>, size_t LoadFactorMax100 = 80>
class RHUnorderedSet : public carb::container::detail::RobinHood<80, Key, const Key, detail::Identity<Key, const Key>, std::hash<Key>, std::equal_to<Key>>

Implements an Unordered Set, that is: a container that contains a set of keys that all must be unique.

There is no defined order to the set of keys.

In an open-addressing (“OA”) hash table, the contained items are stored in the buckets directly. Contrast this with traditional hash tables that typically have a level of indirection: buckets point to the head of a linked-list that contains every item that hashes to that bucket. Open-addressing hash tables are great for using contiguous memory, whereas traditional hash tables have a separate allocation per node and fragment memory. However, OA hash tables have a couple downsides: if a collision occurs on insertion, probing must happen until an open spot is found where the item can be placed. For a find operation, probing must continue until an empty spot is reached to make sure that all keys have been checked. When erasing an item, a “deleted” marker must be put in its place so that probing past the key can continue. This system also gives advantage to earlier insertions and penalizes later collisions.

The Robin Hood algorithm for open-addressing hashing was first postulated by Pedro Celis in 1986: https://cs.uwaterloo.ca/research/tr/1986/CS-86-14.pdf. Simply put, it applies a level of fairness to locality of items within the OA hash table. This is done by tracking the distance from an items ideal insertion point. Similarly the distance-from-ideal can be easily computed for existing locations that are probed. Once a probed location for a new item will cause the new item to be worse off (farther from ideal insertion) than the existing item, the new item can “steal” the location from the existing item, which must then probe until it finds a location where it is worse off than the existing item, and so on. This balancing of locality has beneficial side effects for finding and erasing too: when searching for an item, once a location is reached where the item would be worse off than the existing item, probing can cease with the knowledge that the item is not contained.

OA hash tables cannot be direct drop-in replacements for closed-addressing hash containers such as std::unordered_map as nearly every modification to the table can potentially invalidate any other iterator.

Open-addressing hash tables may not be a good replacement for std unordered containers in cases where the key and/or value is very large (though this may be mitigated somewhat by using indirection through std::unique_ptr). Since OA hash tables must carry the size of each value_type, having a low load factor (or a high capacity() to size() ratio) wastes a lot of memory, especially if the key/value pair is very large.

It is important to keep OA hash tables as compact as possible, as operations like clear() and iterating over the hash table are O(n) over capacity(), not size(). You can always ensure that the hash table is as compact as possible by calling rehash(0).

Because of the nature of how elements are stored in this hash table, there are two iterator types: iterator and find_iterator (both with const versions). These types can be compared with each other, but incrementing these objects works differently. iterator and const_iterator traverse to the next item in the container, while find_iterator and const_find_iterator will only traverse to the next item with the same key. In multi-key containers, items with the same key may not necessarily be stored adjacently, so incrementing iterator may not encounter the next item with the same key as the previous. For unique-key containers, incrementing a find_iterator will always produce end() since keys are guaranteed to be unique.

Iterator/reference/pointer invalidation (note differences from std::unordered_set):

Operation

Invalidates

All read operations

Never

clear, rehash, reserve, operator=, insert, emplace

Always

erase

Only the element removed

swap

All iterators, no pointers/references

Warning

This container is similar to, but not a drop-in replacement for std::unordered_set due to differences in iterator invalidation and memory layout.

Template Parameters
  • Key – The key type

  • Hasher – A functor to use as a hashing function for Key

  • Equals – A functor to use to compare two Key values for equality

  • LoadFactorMax100 – The load factor to use for the table. This value must be in the range [10, 100] and represents the percentage of entries in the hash table that will be filled before resizing. Open-addressing hash maps with 100% usage have better memory usage but worse performance since they need “gaps” in the hash table to terminate runs.

Public Types

using key_type = typename Base::key_type

The key type.

using value_type = typename Base::value_type

The value type (effectively const key_type)

using size_type = typename Base::size_type

Unsigned integer type (typically size_t)

using difference_type = typename Base::difference_type

Signed integer type (typically ptrdiff_t)

using hasher = typename Base::hasher

The hash function.

using key_equal = typename Base::key_equal

The key-equals function.

using reference = typename Base::reference

value_type&

using const_reference = typename Base::const_reference

const value_type&

using pointer = typename Base::pointer

value_type*

using const_pointer = typename Base::const_pointer

const value_type*

using iterator = typename Base::iterator

A LegacyForwardIterator to value_type.

using const_iterator = typename Base::const_iterator

A LegacyForwardIterator to const value_type

using find_iterator = typename Base::find_iterator

A LegacyForwardIterator to value_type that proceeds to the next matching key when incremented.

using const_find_iterator = typename Base::const_find_iterator

A LegacyForwardIterator to const value_type that proceeds to the next matching key when incremented.

Public Functions

constexpr RHUnorderedSet() noexcept = default

Constructs empty container.

inline RHUnorderedSet(const RHUnorderedSet &other)

Copy constructor.

Copies elements from another container.

Note

*this may have a different capacity() than other.

Parameters

other – The other container to copy entries from.

inline RHUnorderedSet(RHUnorderedSet &&other)

Move constructor.

Moves elements from another container.

Note

No move constructors on contained elements are invoked. other will be empty() after this operation.

Parameters

other – The other container to move entries from.

~RHUnorderedSet() = default

Destructor.

Destroys all contained elements and frees memory.

inline RHUnorderedSet &operator=(const RHUnorderedSet &other)

Copy-assign operator.

Destroys all currently stored elements and copies elements from another container.

Parameters

other – The other container to copy entries from.

Returns

*this

inline RHUnorderedSet &operator=(RHUnorderedSet &&other)

Move-assign operator.

Effectively swaps with another container.

Parameters

other – The other container to copy entries from.

Returns

*this

inline std::pair<iterator, bool> insert(const value_type &value)

Inserts an element into the container.

If insertion is successful, all iterators, references and pointers are invalidated.

Parameters

value – The value to insert by copying.

Returns

A pair consisting of an iterator to the inserted element (or the existing element that prevented the insertion) and a bool that will be true if insertion took place or false if insertion did not take place.

inline std::pair<iterator, bool> insert(value_type &&value)

Inserts an element into the container.

If insertion is successful, all iterators, references and pointers are invalidated.

Parameters

value – The value to insert by moving.

Returns

A pair consisting of an iterator to the inserted element (or the existing element that prevented the insertion) and a bool that will be true if insertion took place or false if insertion did not take place.

template<class P>
inline std::pair<iterator, bool> insert(std::enable_if_t<std::is_constructible<value_type, P&&>::value, P&&> value)

Inserts an element into the container.

Only participates in overload resolution if std::is_constructible_v<value_type, P&&> is true.

If insertion is successful, all iterators, references and pointers are invalidated.

Parameters

value – The value to insert by constructing via std::forward<P>(value).

Returns

A pair consisting of an iterator to the inserted element (or the existing element that prevented the insertion) and a bool that will be true if insertion took place or false if insertion did not take place.

template<class ...Args>
inline std::pair<iterator, bool> emplace(Args&&... args)

Constructs an element in-place.

If insertion is successful, all iterators, references and pointers are invalidated.

Parameters

args – The arguments to pass to the value_type constructor.

Returns

A pair consisting of an iterator to the inserted element (or the existing element that prevented the insertion) and a bool that will be true if insertion took place or false if insertion did not take place.

inline size_type erase(const key_type &key)

Removes elements with the given key.

References, pointers and iterators to the erase element are invalidated. All other iterators, pointers and references remain valid.

Parameters

key – the key value of elements to remove

Returns

the number of elements removed (either 1 or 0).

inline size_t count(const key_type &key) const

Returns the number of elements matching the specified key.

Parameters

key – The key to check for.

Returns

The number of elements with the given key (either 1 or 0).

inline const_iterator cbegin() const

Creates an iterator to the first element in the container.

Returns

a const_iterator to the first element in the container. If the container is empty() the iterator will be equal to cend().

inline iterator begin()

Creates an iterator to the first element in the container.

Returns

an iterator to the first element in the container. If the container is empty() the iterator will be equal to end().

inline const_iterator begin() const

Creates an iterator to the first element in the container.

Returns

a const_iterator to the first element in the container. If the container is empty() the iterator will be equal to end().

inline const_iterator cend() const

Creates an iterator to the past-the-end element in the container.

Returns

a const_iterator to the past-the-end element in the container. This iterator is a placeholder; attempting to access it results in undefined behavior.

inline iterator end()

Creates an iterator to the past-the-end element in the container.

Returns

an iterator to the past-the-end element in the container. This iterator is a placeholder; attempting to access it results in undefined behavior.

inline const_iterator end() const

Creates an iterator to the past-the-end element in the container.

Returns

a const_iterator to the past-the-end element in the container. This iterator is a placeholder; attempting to access it results in undefined behavior.

inline bool empty() const noexcept

Checks whether the container is empty.

Returns

true if the container contains no elements; false otherwise.

inline size_t size() const noexcept

Returns the number of elements contained.

O(1)

Returns

the number of elements contained.

inline constexpr size_t max_size() const noexcept

Returns the maximum possible number of elements.

O(1)

Returns

the maximum possible number of elements.

inline size_t capacity() const noexcept

Returns the number of elements that can be stored with the current memory usage.

This is based on the LoadFactorMax100 percentage and the current power-of-two memory allocation size. O(1)

See also

reserve()

Returns

the number of elements that can be stored with the current memory usage.

inline void clear()

Clears the contents.

O(n) over capacity()

Erases all elements from the container. After this call size() returns zero. Invalidates all iterators, pointers and references to contained elements.

Note

This does not free the memory used by the container; to free the hash table memory, use rehash(0) after this call.

inline void swap(RobinHood &other)

Swaps the contents of two containers.

O(1)

Exchanges the contents of *this with other. Will not invoke any move/copy/swap operations on the individual elements.

All iterators are invalidated for both containers. However, pointers and references to contained elements remain valid.

The Hasher and KeyEqual template parameters must be Swappable.

Parameters

other – The other container to swap with *this.

inline iterator erase(const_iterator pos)

Removes the given element.

References, pointers and iterators to the erased element are invalidated. All other iterators, pointers and references remain valid.

Parameters

pos – The iterator to the element to remove. This iterator must be valid and dereferenceable.

Returns

the iterator immediately following pos.

inline iterator erase(const_iterator first, const_iterator last)

Removes the elements in the given range.

The range [first, last) must be a valid range in *this. References, pointers and iterators to erased elements are invalidated. All other iterators, pointers and references to other elements remain valid.

Parameters
  • first – The start of the range of iterators to remove.

  • last – The past-the-end iterator for the range to remove.

Returns

last

inline find_iterator erase(const_find_iterator pos)

Removes the given element.

References, pointers and iterators to the erased element are invalidated. All other iterators, pointers and references remain valid.

Parameters

pos – The const_find_iterator to the element to remove. This iterator must be valid and dereferenceable.

Returns

the find_iterator immediately following pos.

inline find_iterator erase(const_find_iterator first, const_find_iterator last)

Removes the elements in the given range.

The range [first, last) must be a valid range in *this. References, pointers and iterators to erased elements are invalidated. All other iterators, pointers and references to other elements remain valid.

Parameters
  • first – The start of the range of iterators to remove.

  • last – The past-the-end iterator for the range to remove.

Returns

last

inline find_iterator find(const key_type &key)

Finds the first element with the specified key.

Note

find_iterator objects returned from this function will only iterate through elements with the same key; they cannot be used to iterate through the entire container.

Parameters

key – The key of the element(s) to search for.

Returns

a find_iterator to the first element matching key, or end() if no element was found matching key.

inline const_find_iterator find(const key_type &key) const

Finds the first element with the specified key.

Note

const_find_iterator objects returned from this function will only iterate through elements with the same key; they cannot be used to iterate through the entire container.

Parameters

key – The key of the element(s) to search for.

Returns

a const_find_iterator to the first element matching key, or cend() if no element was found matching key.

inline bool contains(const key_type &key) const

Returns whether there is at least one element matching a given key in the container.

Note

This function can be faster than count() for multimap and multiset containers.

Parameters

key – The key of the element to search for.

Returns

true if at least one element matching key exists in the container; false otherwise.

inline std::pair<find_iterator, find_iterator> equal_range(const key_type &key)

Returns a range containing all elements with the given key.

Note

find_iterator objects returned from this function will only iterate through elements with the same key; they cannot be used to iterate through the entire container.

Parameters

key – The key of the element(s) to search for.

Returns

A pair containing a pair of iterators for the desired range. If there are no such elements, both iterators will be end().

inline std::pair<const_find_iterator, const_find_iterator> equal_range(const key_type &key) const

Returns a range containing all elements with the given key.

Note

const_find_iterator objects returned from this function will only iterate through elements with the same key; they cannot be used to iterate through the entire container.

Parameters

key – The key of the element(s) to search for.

Returns

A pair containing a pair of iterators for the desired range. If there are no such elements, both iterators will be end().

inline void reserve(size_t n)

Reserves space for at least the specified number of elements and regenerates the hash table.

Sets capacity() of *this to a value greater-than-or-equal-to n. If capacity() already exceeds n, nothing happens.

If a rehash occurs, all iterators, pointers and references to existing elements are invalidated.

Parameters

n – The desired minimum capacity of *this.

inline void rehash(size_t n)

Sets the capacity of the container to the lowest valid value greater-than-or-equal-to the given value, and rehashes the container.

If n is less-than size(), size() is used instead.

The value is computed as if by cpp::bit_ceil(std::ceil(float(n * 100) / float(LoadFactorMax100))) with a minimum size of 8.

If the container is empty and n is zero, the memory for the container is freed.

After this function is called, all iterators, pointers and references to existing elements are invalidated.

Parameters

n – The minimum capacity for the container. The actual size of the container may be larger than this.