Pollard’s $\\rho$

Pollard’s $\\rho$ method is an algorithm for factoring integers. For larger numbers it is faster than trial division.

Let $n$ be a composite number and $d$ a nontrivial factor of $n$. The actual values of the factors are unknown at this stage, and Pollard’s $\\rho$ provides a way to find them. We do know, however, that $d$ is smaller than $n$. In fact, at least one nontrivial factor must satisfy $d\\leq\\sqrt{n}$, so we assume this condition.

So, how does this help? If you start picking numbers at random (keeping your numbers greater than or equal to zero and strictly less than $n$), then the only time you will get $a\\equiv b\\mod n$ is when $a$ and $b$ are identical. However, since $d$ is smaller than $n$, there is a good chance that $a\\equiv b\\mod d$ sometimes when $a\\neq b$.

Well, if $a\\equiv b\\mod d$, that means that $(a-b)$ is a multiple of $d$. Since $n$ is also a multiple of $d$, the greatest common divisor of $(a-b)$ and $n$ is a positive integer multiple of $d$. And because $a$ and $b$ are distinct numbers below $n$, that greatest common divisor is also strictly smaller than $n$. We can keep picking numbers randomly until the greatest common divisor of $n$ and the difference of two of our random numbers is greater than one. Then, we can divide $n$ by whatever this greatest common divisor turned out to be. In doing so, we have broken down $n$ into two factors. If we suspect that the factors may be composite, we can continue trying to break them down further by running the algorithm again on each of them.

The amazing thing here is that through all of this, we just knew there had to be some divisor of $n$. We were able to use properties of that divisor to our advantage before we even knew what the divisor was!

This is at the heart of Pollard’s $\\rho$ method. Pick a random number $a$. Pick another random number $b$. See if the greatest common divisor of $(a-b)$ and $n$ is greater than one. If not, pick another random number $c$. Now, check the greatest common divisor of $(c-b)$ and $n$. If that is not greater than one, check the greatest common divisor of $(c-a)$ and $n$. If that doesn’t work, pick another random number $e$. Check $(e-c)$, $(e-b)$, and $(e-a)$. Continue in this way until you find a factor.

As you can see from the above paragraph, this could get quite cumbersome quite quickly. By the $k$-th iteration, you will have to do $(k-1)$ greatest common divisor checks. Fortunately, there is a way around that. By structuring the way in which you pick “random” numbers, you can avoid this buildup.
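The naive procedure described above, with its growing number of checks per pick, can be sketched as follows. This is only an illustration: the function name `naive_random_factor`, the `max_picks` cap, the `seed` parameter, and the test number $8051=83\\cdot 97$ are assumptions made for the example, not part of the method itself.

```python
import math
import random


def naive_random_factor(n, max_picks=10_000, seed=None):
    """Pick random numbers below n until two of them differ by a multiple
    of some nontrivial factor of n. Returns that factor, or None."""
    rng = random.Random(seed)
    picked = []
    for _ in range(max_picks):
        c = rng.randrange(n)              # 0 <= c < n
        # The k-th pick is compared against all k-1 earlier picks.
        for earlier in picked:
            g = math.gcd(c - earlier, n)
            if 1 < g < n:                 # g == n only if c == earlier
                return g
        picked.append(c)
    return None


factor = naive_random_factor(8051, seed=1)
print(factor)  # 83 or 97, depending on the random picks
```

The inner loop is exactly the buildup the text complains about: each new number costs one greatest common divisor computation per number already picked.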

We use an appropriate polynomial $f(x)$ to generate pseudorandom numbers. Because we’re only concerned with numbers from zero up to (but not including) $n$, we will take all of the values of $f(x)$ modulo $n$. We start with some $x_{1}$. We then pick our numbers by taking $x_{k+1}=(f(x_{k})\\mod n)$.
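As a small sketch of this recurrence, the generator below produces $x_{1},x_{2},\\ldots$ for a given polynomial. The name `pseudo_random_sequence`, the polynomial $x^{2}+1$, the start value $2$, and the modulus $8051$ are illustrative assumptions.

```python
from itertools import islice


def pseudo_random_sequence(f, x1, n):
    """Yield x_1, x_2, ... where x_{k+1} = f(x_k) mod n."""
    x = x1 % n
    while True:
        yield x
        x = f(x) % n


first_terms = list(islice(pseudo_random_sequence(lambda x: x * x + 1, 2, 8051), 7))
print(first_terms)  # [2, 5, 26, 677, 7474, 2839, 871]
```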

Now, say for example we get to some point $k$ where $x_{k}\\equiv x_{j}\\mod d$ with $j<k$. Then, because $f$ is a polynomial and $d$ divides $n$, $f(x_{k})$ will be congruent to $f(x_{j})$ modulo $d$. So, once we hit upon $x_{j}$ and $x_{k}$, each element in the sequence starting with $x_{k}$ will be congruent modulo $d$ to the corresponding element in the sequence starting at $x_{j}$. Thus, once the sequence gets to $x_{k}$ it has looped back upon itself to match up with $x_{j}$ (when considering them modulo $d$).

This looping is what gives the $\\rho$ method its name. If you go back through (once you determine $d$) and look at the sequence of random numbers that you used (looking at them modulo $d$), you will see that they start off just going along by themselves for a bit. Then, they start to come back upon themselves. They don’t typically loop the whole way back to the first number of your sequence. So, they have a bit of a tail and a loop, just like the Greek letter rho ($\\rho$).

Before we see why that looping helps, we will first speak to why it has to happen. When we consider a number modulo $d$, we are only considering the numbers greater than or equal to zero and strictly less than $d$. This is a finite set of numbers. Your random sequence cannot possibly go on for more than $d$ numbers without having some number repeat modulo $d$. And, if the function $f(x)$ is well-chosen, you can probably loop back a great deal sooner.
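To make the tail and the loop concrete, the following sketch records when each residue modulo $d$ first appears and stops at the first repeat. It assumes we already know $d$, which is only true after the fact. The name `rho_shape` and the example values ($n=8051=83\\cdot 97$, $f(x)=x^{2}+1$, $x_{1}=2$) are assumptions chosen for illustration.

```python
def rho_shape(f, x1, n, d):
    """Return (j, t): modulo d, the loop starts at x_j and has length t."""
    first_seen = {}          # residue modulo d -> index where it first appeared
    x = x1 % n
    k = 1
    while x % d not in first_seen:
        first_seen[x % d] = k
        x = f(x) % n
        k += 1
    j = first_seen[x % d]    # x_k repeats the residue of x_j
    return j, k - j


def f(x):
    return x * x + 1


print(rho_shape(f, 2, 8051, 97))  # (2, 3): a one-element tail, then a loop of length 3
print(rho_shape(f, 2, 8051, 83))  # (5, 5)
```

In both cases the repeat arrives long before $d$ numbers have been generated, in line with the pigeonhole bound above.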

The looping helps because it means that we can get away without accumulating the number of greatest common divisor steps we need to perform with each new random number. In fact, it makes it so that we only need to do one greatest common divisor check for every second random number that we pick.

Now, why is that? Let’s assume that the loop is of length $t$ and starts at the $j$-th random number. Say that we are on the $k$-th element of our random sequence. Furthermore, say that $k$ is greater than or equal to $j$ and $t$ divides $k$. Because $k$ is at least $j$, we know it is inside the looping part of the $\\rho$. We also know that if $t$ divides $k$, then $t$ also divides $2k$. What this means is that $x_{2k}$ and $x_{k}$ will be congruent modulo $d$ because they correspond to the same point on the loop. Because they are congruent modulo $d$, their difference is a multiple of $d$. So, if we check the greatest common divisor of $(x_{k}-x_{k/2})$ with $n$ every time we get to an even $k$, we will find some factor of $n$ without having to do $k-1$ greatest common divisor calculations every time we come up with a new random number. Instead, we only have to do one greatest common divisor calculation for every second random number.
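The check described above, comparing $x_{k}$ with $x_{2k}$, can be sketched as below. This is a minimal sketch, not a tuned implementation: the name `pollard_rho`, the `max_steps` cap, and the choice to return `None` when the greatest common divisor is $n$ itself are assumptions made for the example.

```python
import math


def pollard_rho(n, f, x1=1, max_steps=1_000_000):
    """Compare x_k with x_2k for k = 1, 2, ... and return the first
    nontrivial gcd(x_2k - x_k, n), or None if the attempt fails."""
    slow = x1 % n            # x_k, starting with k = 1
    fast = f(slow) % n       # x_2k, starting with x_2
    for _ in range(max_steps):
        g = math.gcd(fast - slow, n)
        if g == n:           # the sequences collided modulo n itself
            return None
        if g > 1:
            return g
        slow = f(slow) % n              # advance one step
        fast = f(f(fast) % n) % n       # advance two steps
    return None


print(pollard_rho(8051, lambda x: x * x + 1, x1=2))  # 97
```

For $n=8051$ the compared pairs are $(2,5)$, $(5,677)$, and $(26,2839)$. The last difference is $2813=29\\cdot 97$, so the third check returns $97$. Only two values are stored at any time, and each round costs one greatest common divisor computation.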

The only open question is what to use for a polynomial $f(x)$ to get some random numbers which don’t have too many choices modulo $d$. Since we don’t usually know much about $d$, we really can’t tailor the polynomial too much. A typical choice of polynomial is

$$f(x)=x^{2}+a$$

where $a$ is some constant which isn’t congruent to $0$ or $-2$ modulo $n$. If you don’t place those restrictions on $a$, then the sequence degenerates into a constant one as soon as you hit upon some $x$ which is congruent to either $1$ or $-1$ modulo $n$: it becomes $\\{1,1,1,1,...\\}$ for $a\\equiv 0$ and $\\{-1,-1,-1,-1,...\\}$ for $a\\equiv -2$.
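The two excluded constants are easy to check numerically. In the sketch below, the helper name `iterate` and the modulus $8051$ are assumptions for illustration. Note that $-1$ modulo $n$ shows up as $n-1$.

```python
def iterate(a, x1, n, count):
    """Return the first `count` terms of x_{k+1} = (x_k^2 + a) mod n."""
    terms = []
    x = x1 % n
    for _ in range(count):
        terms.append(x)
        x = (x * x + a) % n
    return terms


n = 8051
print(iterate(0, 1, n, 5))    # [1, 1, 1, 1, 1]
print(iterate(-2, 1, n, 5))   # [1, 8050, 8050, 8050, 8050], i.e. stuck at -1
```

Once the sequence is constant, every difference is zero and the greatest common divisor checks can only return $n$.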

Let’s use the algorithm now to factor the number $n=16843009$. We will use the sequence $x_{1}=1$ with $x_{k+1}=(1024x_{k}^{2}+32767\\mod n)$. [I also tried it with the very basic polynomial $f(x)=x^{2}+1$, but that one went 80 rounds before stopping, so I didn’t include the table here.]

The greatest common divisor checks eventually return $257$, and so we have discovered the factor $257$; indeed, $16843009=257\\cdot 65537$.

Let’s try to factor again with a different random number scheme. We will use the sequence $x_{1}=1$ with $x_{k+1}=(2048x_{k}^{2}+32767\\mod n)$.

Again, the factor $257$ shows up.
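Both runs can be reproduced with a short script that lists, for each $k$, the pair $x_{k}$, $x_{2k}$ and the resulting greatest common divisor. This is a sketch under assumptions: the name `rho_trace`, its parameters, and the row layout are invented here, and the number of rounds a run takes depends on exactly which pairs are compared, so it may differ from other bookkeeping conventions.

```python
import math


def rho_trace(n, multiplier, constant, x1=1, max_rows=100_000):
    """Yield rows (k, x_k, x_2k, gcd(x_2k - x_k, n)), stopping after the
    first row whose gcd exceeds one."""
    def f(x):
        return (multiplier * x * x + constant) % n

    slow = x1 % n        # x_k
    fast = f(slow)       # x_2k
    for k in range(1, max_rows + 1):
        g = math.gcd(fast - slow, n)
        yield k, slow, fast, g
        if g > 1:
            return
        slow = f(slow)
        fast = f(f(fast))


n = 16843009
for multiplier in (1024, 2048):
    rows = list(rho_trace(n, multiplier, 32767))
    k, x_k, x_2k, g = rows[-1]
    print(f"multiplier {multiplier}: stopped at k={k} with gcd {g}")
```

Printing every element of `rows` instead of only the last one gives a table of the whole run.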

Pollard’s $\\rho$ can also be applied in finite groups other than the integers, where it provides one of the best known methods for computing discrete logarithms in arbitrary groups.
