Rabin Karp Hashing

Rabin-Karp hashing is a method of using linear time precomputation to allow constant time hashing of arbitrary substrings with a polynomial hash.

Polynomial Hash

For a string $a_0,a_1,a_2,\ldots,a_n$, a polynomial hash of it is $(a_0+a_1\cdot p + a_2\cdot p^2 + \cdots + a_n\cdot p^n) % m$ for some modulus $m$ and $p$ is generally prime. The modulus is generally ignored in implementations and is taken care of by integer overflow.


We create an array of the hashes for the prefixes. For example, in the array $[3,1,4,1,5]$, we would get the array of hashes $[3,\ 3+1\cdot p,\ 3+1\cdot p+4\cdot p^2,\ 3+1\cdot p+4\cdot p^2+1\cdot p^3,\ 3+1\cdot p + 4\cdot p^2 + 1\cdot p^3 + 5\cdot p^4]$.

Now suppose we wanted the hash of the substring [1,4,1]. We notice that if we take the hash of [3,1,4,1] and subtract the hash of [3], we get the hash of [1,4,1] but multiplied by a factor of $p$. To take care of this, when comparing hashes, we multiply by appropriate powers of $p$ such that both sides start with the same power.

This gives linear time precomputation for constant time hashes.

Please add implementation

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License