Quadratic probing
Quadratic probing is an open addressing scheme in computer programming for resolving hash collisions in hash tables. Quadratic probing operates by taking the original hash index and adding successive values of an arbitrary quadratic polynomial until an open slot is found.
An example sequence using quadratic probing, where $H$ is the original hash index, is:
$H + 1^2, H + 2^2, H + 3^2, H + 4^2, \ldots, H + k^2$
Quadratic probing can be a more efficient algorithm in an open addressing table, since it better avoids the clustering problem that can occur with linear probing, although it is not immune. It also provides good memory caching because it preserves some locality of reference; however, linear probing has greater locality and, thus, better cache performance.
Quadratic function
Let $h(k)$ be a hash function that maps an element $k$ to an integer in $[0, m-1]$, where $m$ is the size of the table. Let the $i$th probe position for a value $k$ be given by the function
$$h(k, i) = \left(h(k) + c_1 i + c_2 i^2\right) \bmod m$$
where $c_2 \neq 0$ (if $c_2 = 0$, then $h(k,i)$ degrades to a linear probe). For a given hash table, the values of $c_1$ and $c_2$ remain constant.
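The probe function described above can be sketched in C as follows. This is a minimal illustration rather than a reference implementation: it assumes integer constants c1 and c2, uses key mod m as a stand-in for the hash function h(k), and the names quad_probe, quad_insert, and EMPTY are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EMPTY UINT64_MAX  /* hypothetical marker for an unused slot */

/* Position of the i-th probe: (h + c1*i + c2*i^2) mod m.
 * h is the initial index h(k), already reduced to [0, m-1].
 * Assumes the values are small enough that the sum does not overflow. */
uint64_t quad_probe(uint64_t h, uint64_t i, uint64_t c1, uint64_t c2, uint64_t m)
{
    assert(m > 0);
    return (h + c1 * i + c2 * i * i) % m;
}

/* Insert key into table[0..m-1], using key mod m as a stand-in hash.
 * Returns the slot used, or m if none of the first m probes is free. */
uint64_t quad_insert(uint64_t *table, uint64_t m, uint64_t key,
                     uint64_t c1, uint64_t c2)
{
    uint64_t h = key % m;
    for (uint64_t i = 0; i < m; i++) {
        uint64_t pos = quad_probe(h, i, c1, c2, m);
        if (table[pos] == EMPTY) {
            table[pos] = key;
            return pos;
        }
    }
    return m;  /* probe sequence exhausted without finding a free slot */
}
```

With m = 7, c1 = 0, and c2 = 1, the keys 10, 17, and 24 all hash to index 3 and end up in slots 3, 4, and 0 respectively.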
Examples:
- If $h(k,i) = (h(k) + i + i^2) \bmod m$, then the probe sequence will be $h(k), h(k)+2, h(k)+6, \ldots$
- For $m = 2^n$, a good choice for the constants is $c_1 = c_2 = 1/2$, as the values of $h(k,i)$ for $i$ in $[0, m-1]$ are all distinct (in fact, they form a permutation of $[0, m-1]$[1]). This leads to a probe sequence of $h(k), h(k)+1, h(k)+3, h(k)+6, \ldots$ (the triangular numbers), where the offsets increase by 1, 2, 3, ...
- For prime $m > 2$, most choices of $c_1$ and $c_2$ will make $h(k,i)$ distinct for $i$ in $[0, (m-1)/2]$. Such choices include $c_1 = c_2 = 1/2$, $c_1 = c_2 = 1$, and $c_1 = 0, c_2 = 1$. However, there are only about $m/2$ distinct probes for a given element, so other techniques are required to guarantee that insertions will succeed when the load factor exceeds 1/2.
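- For $m = n^p$, where $m$, $n$, and $p$ are integers greater than or equal to 2 (the scheme degrades to a linear probe when $p = 1$), $h(k,i) = (h(k) + i + n i^2) \bmod m$ gives a cycle of all distinct probes. It can be computed in a loop as $h(k,0) = h(k)$ and $h(k,i+1) = (h(k,i) + 2in + n + 1) \bmod m$.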
- For any $m$, a full cycle with quadratic probing can be achieved by rounding $m$ up to the closest power of 2, computing the probe index $h(k,i) = \left(h(k) + (i^2 + i)/2\right) \bmod \mathrm{roundUp2}(m)$, and skipping the iteration when $h(k,i) \geq m$. At most $\mathrm{roundUp2}(m) - m < m$ iterations are skipped, and these iterations do not access memory, so skipping them is cheap on most modern processors. Rounding up $m$ can be computed by:
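#include <stdint.h>

/* Round v up to the next power of 2 (v is returned unchanged if it already is one).
   Returns 0 for v = 0 and for v > 2^63, where the result does not fit in 64 bits. */
uint64_t roundUp2(uint64_t v){
    v--;            /* so that exact powers of 2 map to themselves */
    v |= v >> 1;    /* smear the highest set bit into every lower bit */
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    v |= v >> 32;
    v++;            /* all ones below the top bit, plus one, is a power of 2 */
    return v;
}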
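The last scheme in the list (any table size m, with skipped iterations) can be sketched as follows. This is an illustrative sketch under stated assumptions, not a definitive implementation: full_cycle_probes is a hypothetical helper that writes the whole probe order into a caller-supplied array, roundUp2 is repeated so that the block compiles on its own, and m is assumed to be at most 2^63. The triangular offsets are accumulated incrementally (+1, +2, +3, ...) instead of evaluating (i^2 + i)/2 at each step.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of 2 (repeated from the text). */
uint64_t roundUp2(uint64_t v)
{
    v--;
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    v |= v >> 32;
    v++;
    return v;
}

/* Write the probe order for initial index h (0 <= h < m) into out[0..m-1].
 * Returns the number of positions written, which is m when every slot
 * in [0, m-1] is reached exactly once. */
uint64_t full_cycle_probes(uint64_t h, uint64_t m, uint64_t *out)
{
    assert(m > 0 && h < m);
    uint64_t size = roundUp2(m);  /* power of two >= m */
    uint64_t mask = size - 1;     /* x & mask equals x mod size */
    uint64_t pos = h;
    uint64_t count = 0;
    for (uint64_t i = 0; i < size; i++) {
        if (pos < m)              /* positions >= m are skipped without touching the table */
            out[count++] = pos;
        pos = (pos + i + 1) & mask;  /* next triangular offset */
    }
    return count;
}
```

For m = 5 and h = 2, the rounded size is 8 and the resulting probe order is 2, 3, 0, 4, 1; the three positions 5, 7, and 6 are skipped.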
Limitations
Alternating signs
If the sign of the offset is alternated (e.g. +1, −4, +9, −16, etc.), and if the number of buckets is a prime number $p$ congruent to 3 modulo 4 (e.g. 3, 7, 11, 19, 23, 31, etc.), then the first $p$ offsets will be unique (modulo $p$). In other words, a permutation of 0 through $p - 1$ is obtained, and, consequently, a free bucket will always be found as long as at least one exists.
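The alternating-sign property can be checked empirically with a short C sketch. The names alt_probe, alt_is_permutation, and ALT_MAX_BUCKETS are hypothetical, the probe for i = 0 is taken to be the initial index itself, and the fixed-size scratch array limits the check to small tables; this is a verification aid under those assumptions, not a proof.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ALT_MAX_BUCKETS 1024  /* arbitrary cap for this sketch */

/* i-th probe with alternating signs: h, h+1, h-4, h+9, h-16, ... (mod p).
 * Assumes 0 <= h < p and i small enough that i*i does not overflow. */
uint64_t alt_probe(uint64_t h, uint64_t i, uint64_t p)
{
    assert(p > 0 && h < p);
    uint64_t sq = (i * i) % p;
    if (i % 2 == 1)
        return (h + sq) % p;      /* odd i: add i^2 */
    return (h + p - sq) % p;      /* even i: subtract i^2, kept non-negative */
}

/* Returns true if probes i = 0..p-1 starting at h visit every bucket once. */
bool alt_is_permutation(uint64_t h, uint64_t p)
{
    bool seen[ALT_MAX_BUCKETS] = { false };
    assert(p > 0 && p <= ALT_MAX_BUCKETS);
    for (uint64_t i = 0; i < p; i++) {
        uint64_t pos = alt_probe(h, i, p);
        if (seen[pos])
            return false;         /* a bucket was probed twice */
        seen[pos] = true;
    }
    return true;
}
```

For p = 7 and h = 0 the probe order is 0, 1, 3, 2, 5, 4, 6, which covers every bucket. For p = 5, which is congruent to 1 modulo 4, the offsets +1 and −4 coincide, so the check fails.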
References
- Knuth, Donald. The Art of Computer Programming, Volume 3: Sorting and Searching, Section 6.4, exercise 20.