Rabin Karp Hashing

Rabin-Karp hashing is a method of using linear time precomputation to allow constant time hashing of arbitrary substrings with a polynomial hash.

Polynomial Hash

For a string a_0,a_1,a_2,\ldots,a_n, a polynomial hash of it is (a_0+a_1\cdot p + a_2\cdot p^2 + \cdots + a_n\cdot p^n) % m for some modulus m and p is generally prime. The modulus is generally ignored in implementations and is taken care of by integer overflow.

Rabin-Karp

We create an array of the hashes for the prefixes. For example, in the array [3,1,4,1,5], we would get the array of hashes [3,\ 3+1\cdot p,\ 3+1\cdot p+4\cdot p^2,\ 3+1\cdot p+4\cdot p^2+1\cdot p^3,\ 3+1\cdot p + 4\cdot p^2 + 1\cdot p^3 + 5\cdot p^4].

Now suppose we wanted the hash of the substring [1,4,1]. We notice that if we take the hash of [3,1,4,1] and subtract the hash of [3], we get the hash of [1,4,1] but multiplied by a factor of p. To take care of this, when comparing hashes, we multiply by appropriate powers of p such that both sides start with the same power.

This gives linear time precomputation for constant time hashes.

Please add implementation

page_revision: 0, last_edited: 1199503563|%e %b %Y, %H:%M %Z (%O ago)
Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License