Extended Euclidean Algorithm
4th March 2025
In my last post, one area I skipped over was how to compute the multiplicative inverse in a finite field. For GF(7) I brute forced the answer for each element by testing each possibility. For GF(23) I had already calculated the result of every multiplication (mostly as an exercise) so I effectively brute forced that too.
In this post I'm going to explain how to use the Extended Euclidean Algorithm to do it more efficiently. I know that's the algorithm to use because multiple texts on the subject of cryptography said so. They didn't explain it further though, so I assume you are just supposed to know this stuff. I don't. So I best get studying.
Some time later
This is even older maths than abstract algebra. This comes from Euclid's Elements published in 300 BC. Euclid's Algorithm efficiently computes the greatest common divisor of two numbers. This is not what we want, but it helps and I'll explain why later, but first let's understand Euclid's Algorithm.
Take two numbers, for example 28 and 12, the greatest common divisor (GCD) is the biggest integer that divides into both numbers without leaving a reminder. You can work this out by factorising each number, comparing the two lists and selecting the biggest common number. The factors or divisors (it seems the terms are interchangeable) of 28 are 1, 2, 4, 7, 14 and 28. The divisors of 12 are 1, 2, 3, 4, 6 and 12. The largest common divisor is therefore 4. Or in maths speak .
This was possible to work out fairly quickly in my head. Turns out those times tables drummed into my head at school are still there decades later. But what if the numbers were bigger, say 973 and 301, I never learnt my times tables that high.
No worries, we can look them up. The divisors for 973 (1, 7, 139 and 973) and 301 (1, 7, 43 and 301) are listed on Wikipedia (https://en.wikipedia.org/wiki/Table_of_divisors). So . Using a list of precomputed divisors is not always practical, what if the numbers are really big, cryptography uses numbers 2000 bits long sometimes, what then?
This is also a tricky problem for computers. Factorising numbers is computationally expensive. It's slow. It's why people publish lists of them. Factorising is clearly not the solution.
Euclid's Algorithm provides an alternative method to work out the GCD without needing to factorise the numbers. It’s an iterative algorithm that starts by writing the larger value as a multiple of the smaller and working out the remainder. E.g.:
973 = q × 301 + r
In this case the quotient q is 3 as 301 goes into 973 3 times with a reminder r of 70.
973 = 3 × 301 + 70
The second iteration of the algorithm replaces the largest value with the smaller, and the smaller with the remainder from the previous iteration. Then repeat the previous step to work out a new quotient and remainder. In our example 70 goes into 301 4 times with a reminder of 21.
973 = 3 × 301 + 70
301 = 4 × 70 + 21
The algorithm keeps repeating this until the new remainder is 0.
973 = 3 × 301 + 70
301 = 4 × 70 + 21
70 = 3 × 21 + 7
21 = 3 × 7 + 0
The GCD is the previous remainder (or the last small value). In our example 7.
Let's test our first example too and make sure it computes the GSD(28, 12) as 4.
28 = 2 × 12 + 4
12 = 3 × 4 + 0
Well that was easier than factorising, even for small numbers. Nice.
To help turn this into code, and (spoilers) to understand the Extend Euclid Algorithm, it will help to label all the terms. The larger initial number is labelled r0, the smaller r1, its quotient is q1 and the new remainder is r2.
r0 = q1r1 + r2
The next iteration we move the terms about as previously described and calculate the new remainder.
r1 = q2r2 + r3
So in general each iteration is:
ri-2 = qi-1ri-1 + ri
The equivalent code is actually quite trivial. It's a simple recursive function. Recursion is magic.
int gcd(int a, int b) {
if (b == 0) return a;
return gcd(b, a % b);
}
So what is the extended version of the algorithm? Well, in addition to the GCD, it computes the coefficients of Bézout's identity. Clear? No. It's not for me either, so more research is required.
Research break.
Bézout's identity is a theorem that states for two integers (r0 and r1) with a common divisor, there exist integer coefficients (s and t) such that . This is going to be useful apparently, so let's push forward a bit longer and see how we can calculate s and t.
To understand the Extended Euclidean Algorithm we start by rearranging the equation at each stage of the original Euclidean Algorithm into the same signature as Bézout's identity.
Normal | Extended |
---|---|
973 = 3 × 301 + 70 | 1 × 973 + (-3) × 301 = 70 |
We don't completely simplify this equation. The final expression needs to be in terms of r0 and r1, in this example 973 and 301. We only simplify to the point we know how many r0 and r1 we need. In this first iteration we are already there so there is no further work required.
For the second iteration, the initial first term is a multiple of 301, which is r1, so nothing more needs to be done. The second term however is multiples of 70, which is not what we want. It is however the value of r2 and the previous iteration gave us an expression that represented 70 in terms of 973 and 301, or r2 in terms of r0 and r1. We can therefore substitute 70 and simplify the expression back to terms of r0 and r1.
Normal | Extended |
---|---|
973 = 3 × 301 + 70 | 1 × 973 + (-3) × 301 = 70 |
301 = 4 × 70 + 21 | 1 × 301 + (-4) × 70 = 21 301 - 4 × (973 - 3 × 301) = 21 -4 × 973 + 13 × 301 = 21 |
The third iteration, the initial expression is in terms of 70 and 21, or r2 and r3. We already know we can substitute the 70 with the expression from the first iteration and the 21 can be substituted with the second expression.
Normal | Extended |
---|---|
973 = 3 × 301 + 70 | 1 × 973 + (-3) × 301 = 70 |
301 = 4 × 70 + 21 | 1 × 301 + (-4) × 70 = 21 301 - 4 × (973 - 3 × 301) = 21 -4 × 973 + 13 × 301 = 21 |
70 = 3 × 21 + 7 | 70 - 3 × 21 = 7 973 - 3 × 301 - 3 × (-4 × 973 + 13 × 301) = 7 13 × 973 - 42 × 301 = 7 |
The expression for the fourth iteration has a result of 0. We can therefore stop as we know the previous result, 7, is the GCD and the quotients in that expression are the values for s and t. Therefore s is -4 and t is 13.
Normal | Extended |
---|---|
973 = 3 × 301 + 70 | 1 × 973 + (-3) × 301 = 70 |
301 = 4 × 70 + 21 | 1 × 301 + (-4) × 70 = 21 301 - 4 × (973 - 3 × 301) = 21 -4 × 973 + 13 × 301 = 21 |
70 = 3 × 21 + 7 | 70 - 3 × 21 = 7 973 - 3 × 301 - 3 × (-4 × 973 + 13 × 301) = 7 13 × 973 - 42 × 301 = 7 |
21 = 3 × 7 + 0 | 21 - 3 × 7 = 0 |
If we did keep going and rearrange the result of the fourth interaction to be in terms of r0 and r1 we spot a pattern. The first term is multiples of 21 which was the result from 2 iterations above and the second term is multiples of 7, which was the result from the previous expression. For other longer examples it turns out this pattern holds true and therefore we can devise a generic expression to compute the new s and t values based on the previous two.
si = si-2 - qi-1 × si-1
ti = ti-2 - qi-1 × ti-1
Which means we can turn this into code:
struct eea_terms {
int r;
int s;
int t;
};
struct eea_terms eea_recurse(struct eea_terms i2, struct eea_terms i1) {
if (i1.r==0) return i2;
int q = i2.r / i1.r;
return eea_recurse(i1, (struct eea_terms) {
.r = i2.r % i1.r,
.s = i2.s - q*i1.s,
.t = i2.t - q*i1.t
});
}
struct eea_terms eea(int r0, int r1) {
return eea_recurse((struct eea_terms) {r0, 1, 0}, (struct eea_terms) {r1, 0, 1});
}
The burning question remains, what has this got to do with computing the multiplicative inverse? As a reminder, when an element in a field is paired with its inverse it results in the identity element. For multiplication the identity element is 1, so the multiplicative inverse is whatever element that when multiplied by the original result in 1. Or in maths speak:
a` × a = 1
Let's consider prime fields, specifically GF(7). This consists of the set of elements {0,1, …, 6}, and if we were to compute the GCD of each element with 7 the result would be 1. Because 7 is a prime, we can guarantee the GCD with positive integers smaller than it will be 1, that’s kind of the definition of prime numbers. We can then combine this with Bézout identity to assert that:
s×7 + t×a = GCD(7, a) = 1
In other words, there exists a quotient (s) for the prime 7 and a quotient (t) for any element (a) of the set where their sum will be 1.
Remembering that this is a finite field we can reduce the expression using modular arithmetic to ensure the result and all the terms are elements of the set. Specifically the first term s×7 is always going to result in a 0 as any multiple of 7 modulo 7 is 0.
s×7 ≡ 0 (mod 7)
So the first term will always reduce to 0 and can be cancelled out, the whole expression is then reduce to:
t×a ≡ 1 (mod 7)
This expression looks remarkably similar to our expression defining a multiplicative inverse, it is in fact the same signature. This makes t (mod 7) the multiplicative inverse of a. Which eventually answers the question we started with, how we can use the extended euclidean algorithm to compute the multiplicative inverse? Pipe in the modulus as r0 and the element as r1, the GCD will always be 1, s can be ignored and t will be the inverse.
int inverse(int a, int mod) {
struct eea_terms result = eea(mod, a);
return result.t<0 ? result.t+mod : result.t;
}
To test this works, I used the following script.
int max = 7;
for (int i=1; i<max; i++) {
struct eea_terms eear = eea(max, i);
printf("%d: r =%2d, s =%2d, t =%2d, inverse =%2d\n", i, eear.r, eear.s, eear.t, inverse(i, max));
}
Which generated this output
1: r = 1, s = 0, t = 1, inverse = 1 2: r = 1, s = 1, t =-3, inverse = 4 3: r = 1, s = 1, t =-2, inverse = 5 4: r = 1, s =-1, t = 2, inverse = 2 5: r = 1, s =-2, t = 3, inverse = 3 6: r = 1, s = 1, t =-1, inverse = 6
Excellent, job done, time to move on? Well not quite, we have those pesky polynomials to deal with.
The good news is it’s exactly the same process as with integers, we just need to make sure instead of calling the native addition, multiplication, division and modulo operators, we create and call functions that will perform correct operations with polynomials stored as bit arrays.
struct eea_terms gf2_eea_recurse(struct eea_terms i2, struct eea_terms i1) {
if (i1.r==0) return i2;
unsigned int r;
unsigned int q = gf2_div(i2.r, i1.r, &r);
return gf2_eea_recurse(i1, (struct eea_terms) {
.r = r,
.s = i2.s ^ gf2_mul(q, i1.s),
.t = i2.t ^ gf2_mul(q, i1.t)
});
}
struct eea_terms gf2_eea(unsigned int r0, unsigned int r1) {
return gf2_eea_recurse((struct eea_terms) {r0, 1, 0}, (struct eea_terms) {r1, 0, 1});
}
unsigned int gf2_inverse(unsigned int a, unsigned int mod) {
struct eea_terms result = gf2_eea(mod, a);
return result.t;
}
The tricky part was writing a division function, we got to skip this part in the last post as we took a shortcut when performing division, this time no such luck. The function works by looking for the position of the highest bit set in the dividend (a) and comparing that to the position of highest bit set in the divisor (b), the difference between them is the power of the first term in the result. Subtract this from the dividend and repeat recursively. It might sound complicated but it’s just long division in base 2. As a bonus, it also computes the remainder, in other words we can perform the division and modulo operation in one move.
unsigned int gf2_div(unsigned int a, unsigned int b, unsigned int* r) {
unsigned int mask = 1 << ((sizeof(unsigned int)*8) - 1);
while (mask && !(mask & a) && !(mask & b)) {
mask >>= 1;
}
if (!(mask & a)) {
*r = a;
return 0;
}
unsigned int mag = 0;
while (mask && !(mask & b)) {
mask >>= 1;
mag++;
}
return (1 << mag) ^ gf2_div(a ^ (b << mag), b, r);
}
With that now I can compute the multiplicative inverse for prime fields and extensions fields 2n. It's more than I strictly need to implement AES, but it's all useful learning. My math skills (for this tiny part of mathematics) are way more developed now than they have ever been. I really am having fun.
TC