Skip to main content

Reed-Solomon

The Problem

A satellite sends 223 bytes to Earth. Some bytes get corrupted by noise on the way down. We want the receiver to detect and fix the corruption automatically, without retransmission.

Reed-Solomon (RS) coding adds 32 extra parity bytes to the 223 data bytes, making 255 total. If up to 16 of those 255 bytes are corrupted in transit, the receiver can figure out which bytes are wrong and what they should have been.

Why We Need Arithmetic

A simple check like XOR-ing all bytes together can tell you whether something changed, but not which byte changed or what it changed to. To recover both the position and the value of a corrupted byte, you need more information.

The idea is to build multiple independent equations that relate the parity bytes to the data bytes. Each equation multiplies every data byte by a different coefficient and sums the results. For example, one parity byte might be:

p0=d0α2+d1α+d2p_0 = d_0 \cdot \alpha^2 + d_1 \cdot \alpha + d_2

and another:

p1=d0α4+d1α2+d2p_1 = d_0 \cdot \alpha^4 + d_1 \cdot \alpha^2 + d_2

If a single byte d1d_1 gets corrupted, both p0p_0 and p1p_1 will be wrong by amounts that depend on d1d_1's position (because each equation weighted d1d_1 differently). The receiver can solve these two equations to find both where and by how much.

More errors need more equations, which means more parity bytes. Correcting up to 16 errors requires 32 parity bytes (2 per error: one for position, one for magnitude).

This is why arithmetic — multiplication and addition — is unavoidable.

A Special Number System

Since we're doing arithmetic on bytes, we need every operation to produce a result that is also a byte. In normal arithmetic this fails: 200×200=40000200 \times 200 = 40000, which doesn't fit. If intermediate results grow beyond one byte, the parity symbols become a different size than the data symbols, and the fixed-size structure (255 symbols ×\times 1 byte each) falls apart.

So we use a number system called GF(282^8) — a Galois field with 256 elements (0--255) — where every operation on two bytes always produces exactly one byte:

  • Addition is XOR: 200+200=0200 + 200 = 0 (everything cancels itself)
  • Multiplication uses special rules that always stay within 0--255

The field has a special element called α\alpha (alpha), defined as α=2\alpha = 2. Its powers generate every non-zero value in the field:

α0=1,α1=2,α2=4,,α254=142,α255=1 (wraps around)\alpha^0 = 1, \quad \alpha^1 = 2, \quad \alpha^2 = 4, \quad \ldots, \quad \alpha^{254} = 142, \quad \alpha^{255} = 1 \text{ (wraps around)}

So every non-zero byte can be written as αk\alpha^k for some kk. This is useful because it turns multiplication into addition of exponents: αa×αb=αa+b\alpha^a \times \alpha^b = \alpha^{a+b}.

We store these powers in a lookup table called EXP (and the reverse mapping in LOG), so a multiplication becomes two table lookups and one addition — effectively free on any CPU.

Bytes as a Polynomial

Take the 223 data bytes, say [d0,d1,d2,][d_0, d_1, d_2, \ldots]. We treat them as coefficients of a polynomial:

d0x222+d1x221+d2x220++d222d_0 \cdot x^{222} + d_1 \cdot x^{221} + d_2 \cdot x^{220} + \cdots + d_{222}

This isn't something we "solve for x." It's a way of representing the byte sequence so we can do algebra on it. Each byte is a coefficient; its position in the array determines the power of xx.

The Generator Polynomial

We want a polynomial g(x)g(x) that is zero at 32 specific points. Start with the simplest polynomial that is zero at one point — α112=2112\alpha^{112} = 2^{112}:

(x2112)(x - 2^{112})

This is zero when x=2112x = 2^{112} and non-zero everywhere else. To also be zero at 21132^{113}, multiply by another factor:

(x2112)(x2113)(x - 2^{112})(x - 2^{113})

Continue for all 32 points:

g(x)=(x2112)(x2113)(x2143)g(x) = (x - 2^{112})(x - 2^{113}) \cdots (x - 2^{143})

Multiplying these 32 factors out (in GF(282^8) arithmetic, where subtraction is the same as addition/XOR) gives a degree-32 polynomial with 33 concrete byte-valued coefficients. It is fixed — the same for every message, computed once.

The key property: g(2112)=0g(2^{112}) = 0, g(2113)=0g(2^{113}) = 0, ..., g(2143)=0g(2^{143}) = 0, by construction.

Encoding

We have the data polynomial D(x)D(x) (223 bytes) and the generator g(x)g(x) (degree 32). The goal is to find 32 parity bytes such that the complete 255-byte codeword is divisible by g(x)g(x).

First, shift D(x)D(x) up by 32 positions to make room for parity:

D(x)x32D(x) \cdot x^{32}

This is like writing the 223 data bytes followed by 32 zeros. Now divide by g(x)g(x):

D(x)x32=Q(x)g(x)+R(x)D(x) \cdot x^{32} = Q(x) \cdot g(x) + R(x)

where R(x)R(x) is the remainder (degree < 32, so exactly 32 bytes). The codeword is:

C(x)=D(x)x32+R(x)C(x) = D(x) \cdot x^{32} + R(x)

In GF(282^8), addition is XOR, so adding R(x)R(x) replaces those 32 trailing zeros with the parity bytes. Now check — is C(x)C(x) divisible by g(x)g(x)?

C(x)=D(x)x32+R(x)=Q(x)g(x)+R(x)+R(x)=Q(x)g(x)C(x) = D(x) \cdot x^{32} + R(x) = Q(x) \cdot g(x) + R(x) + R(x) = Q(x) \cdot g(x)

Yes — because R(x)+R(x)=0R(x) + R(x) = 0 (XOR with itself). So C(x)C(x) is exactly divisible by g(x)g(x), which means evaluating C(x)C(x) at any root of g(x)g(x) gives zero:

C(2112)=Q(2112)g(2112)=0=0C(2^{112}) = Q(2^{112}) \cdot \underbrace{g(2^{112})}_{= 0} = 0

The 223 data bytes sit at the front, unchanged. Only 32 parity bytes are appended. This is called systematic encoding.

Decoding

Syndrome Check

The receiver evaluates the received 255-byte polynomial at the same 32 special values. If the data arrived intact, all 32 results are zero (because a valid codeword is divisible by g(x)g(x)).

If any byte was corrupted, at least some results will be non-zero. These 32 numbers are the syndromes — a fingerprint of the damage.

Finding Error Positions: Berlekamp-Massey

For a single error, the ratio S1/S0S_1 / S_0 directly reveals the error position (as shown in the example). For multiple errors, we need to find all error positions simultaneously.

The idea is to build an error locator polynomial σ(x)\sigma(x) whose roots correspond to the error positions. If there are tt errors at positions p1,p2,,ptp_1, p_2, \ldots, p_t, define Xk=2254pkX_k = 2^{254 - p_k} for each error and:

σ(x)=(1X1x)(1X2x)(1Xtx)=1+σ1x+σ2x2++σtxt\sigma(x) = (1 - X_1 x)(1 - X_2 x) \cdots (1 - X_t x) = 1 + \sigma_1 x + \sigma_2 x^2 + \cdots + \sigma_t x^t

Each Xk1X_k^{-1} is a root of σ(x)\sigma(x), and from XkX_k we can recover pkp_k.

The Berlekamp-Massey algorithm finds σ(x)\sigma(x) from the syndromes. It works iteratively, processing one syndrome at a time:

  1. Start with σ(x)=1\sigma(x) = 1 (no errors assumed).
  2. For each syndrome SnS_n (n=0,1,,31n = 0, 1, \ldots, 31), compute a discrepancy: Δn=Sn+σ1Sn1+σ2Sn2++σlSnl\Delta_n = S_n + \sigma_1 \cdot S_{n-1} + \sigma_2 \cdot S_{n-2} + \cdots + \sigma_l \cdot S_{n-l} where ll is the current number of estimated errors.
  3. If Δn=0\Delta_n = 0, the current σ\sigma already predicts SnS_n correctly — no update needed.
  4. If Δn0\Delta_n \neq 0, the current σ\sigma is wrong. Update it using a correction term scaled by Δn\Delta_n. If this increases the estimated error count (2ln2l \leq n), also update ll.
  5. After all 32 syndromes, ll is the number of errors and σ(x)\sigma(x) has degree ll. If l>16l > 16, the codeword has too many errors to correct.

The algorithm is efficient because it reuses a saved "old" version of σ\sigma as the correction term, so each step is just a few multiplications.

Berlekamp-Massey gives us σ(x)\sigma(x), but we need the actual roots. The Chien search simply evaluates σ(x)\sigma(x) at every possible position:

  1. For each m=0,1,,254m = 0, 1, \ldots, 254, compute σ(2m)\sigma(2^m) using the GF(282^8) lookup tables.
  2. If σ(2m)=0\sigma(2^m) = 0, then X=2mX = 2^m is a root, meaning there is an error at position p=(m+254)mod255p = (m + 254) \bmod 255.
  3. Collect all positions where σ\sigma evaluates to zero.

If the number of roots found does not match ll from Berlekamp-Massey, the codeword is uncorrectable.

This is a brute-force search, but with only 255 positions and each evaluation being a few table lookups, it is fast.

Finding Error Magnitudes: Forney Algorithm

Now we know where the errors are. The Forney algorithm computes how much each corrupted byte is off by.

First, build the error evaluator polynomial:

Ω(x)=S(x)σ(x)modx32\Omega(x) = S(x) \cdot \sigma(x) \bmod x^{32}

where S(x)=S0+S1x+S2x2++S31x31S(x) = S_0 + S_1 x + S_2 x^2 + \cdots + S_{31} x^{31} is the syndrome polynomial. This multiplication and truncation combines the syndrome information with the error locations.

Next, compute the formal derivative of σ(x)\sigma(x). In GF(282^8), the derivative has a simplification: since 2=02 = 0 in GF(22), all even-power terms vanish. Only the odd-index coefficients survive:

σ(x)=σ1+σ3x2+σ5x4+\sigma'(x) = \sigma_1 + \sigma_3 x^2 + \sigma_5 x^4 + \cdots

The error magnitude at position kk is then:

ek=Xk1112Ω(Xk1)σ(Xk1)e_k = X_k^{1 - 112} \cdot \frac{\Omega(X_k^{-1})}{\sigma'(X_k^{-1})}

where Xk=2254pkX_k = 2^{254 - p_k} is the error locator value for position pkp_k, and the Xk1112X_k^{1-112} factor adjusts for the first consecutive root being α112\alpha^{112} instead of α0\alpha^0.

Correction

XOR each corrupted byte with its computed error magnitude. The original data is restored. As a final check, recompute the 32 syndromes — if they are all zero, the correction succeeded.

When Correction Fails

If more than 16 bytes are corrupted, the decoder detects the failure at one of three points:

  1. Berlekamp-Massey finds l>16l > 16: the syndrome equations imply more errors than the code can handle.
  2. Chien search finds fewer roots than ll: the error locator polynomial has no valid solution within the 255 positions.
  3. Verification after correction: the recomputed syndromes are still non-zero, meaning the applied corrections were wrong.

In all three cases, the decoder reports failure. The data remains corrupted and must be recovered by retransmission or a higher-level protocol.

Why These Numbers

Why 32 parity bytes for 16 errors? Each error has two unknowns: where it is and what changed. That's 2 unknowns per error, and each syndrome gives 1 equation. 32 syndromes -> at most 16 solvable errors.

Why 255 total? In GF(282^8) there are 255 non-zero values. Each one maps to a position in the codeword. You can't have more positions than field elements, so 255 is the maximum codeword length for byte-sized symbols.

Why 223 data bytes? 25532=223255 - 32 = 223. It's simply whatever room is left after reserving space for 32 parity bytes.

Interleaving

Burst errors (e.g. a brief signal dropout) corrupt consecutive bytes. With interleaving depth II, the transmitter shuffles II independent codewords together so that consecutive bytes belong to different codewords.

A burst of 16I16 \cdot I corrupted bytes gets spread across II codewords, each seeing at most 16 errors — exactly within the correction capability. CCSDS supports I=1I = 1 through 55.

CCSDS Parameters

The specific parameters used in this implementation follow CCSDS 131.0-B-5:

ParameterValue
Field polynomialx8+x7+x2+x+1x^8 + x^7 + x^2 + x + 1 (0x187)
Primitive elementα=2\alpha = 2
Codeword length255 symbols
Data length223 symbols
Parity symbols32
Error correctionUp to 16 symbol errors
First consecutive rootα112\alpha^{112}
Interleave depthI=1I = 1 to 55

End-to-End Example

We encode the message "HELLO" (bytes 72, 69, 76, 76, 79), corrupt one byte in transit, and recover the original.

1. Encoding

The 5-byte message is padded with zeros to fill all 223 data positions:

Data (223 bytes):
[ 72, 69, 76, 76, 79, 0, 0, 0, ... , 0 ]
H E L L O

The data bytes become a polynomial (each byte is a coefficient, highest power first):

D(x)=72x222+69x221+76x220+76x219+79x218D(x) = 72 x^{222} + 69 x^{221} + 76 x^{220} + 76 x^{219} + 79 x^{218}

(The remaining 218 coefficients are zero.)

Shift it up by 32 to make room for parity:

D(x)x32=72x254+69x253+76x252+76x251+79x250D(x) \cdot x^{32} = 72 x^{254} + 69 x^{253} + 76 x^{252} + 76 x^{251} + 79 x^{250}

The generator g(x)=(x2112)(x2113)(x2143)g(x) = (x - 2^{112})(x - 2^{113}) \cdots (x - 2^{143}) is a fixed degree-32 polynomial.

Divide D(x)x32D(x) \cdot x^{32} by g(x)g(x). The remainder R(x)R(x) has 32 coefficients — these become the parity bytes [243, 147, 197, 58, ...]. XOR them into the trailing zeros:

Codeword (255 bytes):
[ 72, 69, 76, 76, 79, 0, ..., 0, 243, 147, 197, 58, 154, 156, 250, 218, ... ]
|--- 223 data bytes ----------| |-------- 32 parity bytes --------------|

The data is unchanged at the front. Only parity was added.

2. Corruption

During transmission, byte 2 gets corrupted — the L (76) becomes 255:

Received:
[ 72, 69, 255, 76, 79, 0, ..., 0, 243, 147, 197, ... ]
^^^
was 76, now 255

3. Syndrome Check

The received 255 bytes [r0,r1,,r254][r_0, r_1, \ldots, r_{254}] form a polynomial:

R(x)=r0x254+r1x253++r254R(x) = r_0 \cdot x^{254} + r_1 \cdot x^{253} + \cdots + r_{254}

The receiver computes each syndrome by replacing xx with one of the 32 special values. Since α=2\alpha = 2, the first syndrome S0S_0 uses x=α112=2112x = \alpha^{112} = 2^{112}. In GF(282^8), exponents wrap mod 255, so 21122^{112} is just a single byte (looked up from the EXP table):

S0=r0(2112)254+r1(2112)253+r2(2112)252++r254S_0 = r_0 \cdot (2^{112})^{254} + r_1 \cdot (2^{112})^{253} + r_2 \cdot (2^{112})^{252} + \cdots + r_{254}

Each term like (2112)254(2^{112})^{254} simplifies to 2(112254)mod255=2932^{(112 \cdot 254) \bmod 255} = 2^{93}, which is just another byte from the EXP table. Plugging in the received bytes (r0=72r_0 = 72, r1=69r_1 = 69, r2=255r_2 = 255, ...):

S0=72293+692236+2552124+=219S_0 = 72 \cdot 2^{93} + 69 \cdot 2^{236} + 255 \cdot 2^{124} + \cdots = 219

All arithmetic is in GF(282^8): every multiplication uses the lookup tables, every addition is XOR, and the result is always a single byte.

In general, each syndrome SjS_j uses a different power of 2:

Sj=r02(112+j)254mod255+r12(112+j)253mod255++r254S_j = r_0 \cdot 2^{(112+j) \cdot 254 \bmod 255} + r_1 \cdot 2^{(112+j) \cdot 253 \bmod 255} + \cdots + r_{254}

for j=0,1,,31j = 0, 1, \ldots, 31. The same bytes, but different coefficients each time — that's what gives the receiver 32 independent equations to work with.

For the original (uncorrupted) codeword, every syndrome is zero. Recall from the encoding step that the codeword is C(x)=Q(x)g(x)C(x) = Q(x) \cdot g(x), where g(x)=(x2112)(x2113)(x2143)g(x) = (x - 2^{112})(x - 2^{113}) \cdots (x - 2^{143}). To compute S0S_0, we evaluate C(x)C(x) at x=2112x = 2^{112}:

S0=C(2112)=Q(2112)g(2112)S_0 = C(2^{112}) = Q(2^{112}) \cdot g(2^{112})

Expanding g(2112)g(2^{112}):

g(2112)=(21122112)(21122113)(21122143)g(2^{112}) = (2^{112} - 2^{112})(2^{112} - 2^{113}) \cdots (2^{112} - 2^{143})

The first factor is 21122112=02^{112} - 2^{112} = 0 (any value XOR'd with itself is zero). A product with a zero factor is zero, so g(2112)=0g(2^{112}) = 0 and:

S0=Q(2112)0=0S_0 = Q(2^{112}) \cdot 0 = 0

The same applies to S1S_1 through S31S_{31}: each evaluates at x=2113x = 2^{113} through x=2143x = 2^{143}, and each is a root of g(x)g(x) by construction, so every factor chain has a zero term:

S0=0,S1=0,S2=0,,S31=0S_0 = 0, \quad S_1 = 0, \quad S_2 = 0, \quad \ldots, \quad S_{31} = 0

After the corruption at byte 2, the syndromes become non-zero:

S0=219,S1=232,S2=29,S3=145,S4=67,S_0 = 219, \quad S_1 = 232, \quad S_2 = 29, \quad S_3 = 145, \quad S_4 = 67, \quad \ldots

These 32 values encode both where the error is and how large it is. The decoder's job is to extract that information.

4. Error Location (Berlekamp-Massey)

The decoder builds an error locator polynomial σ(x)\sigma(x) whose roots reveal the corrupted positions. It processes the syndromes one at a time, updating σ(x)\sigma(x) whenever the current guess fails to predict the next syndrome.

Start with σ(x)=1\sigma(x) = 1 and l=0l = 0 (no errors assumed).

Iteration n=0n = 0: Compute the discrepancy Δ0=S0=219\Delta_0 = S_0 = 219. This is non-zero, so the current σ\sigma is wrong. Update: σ(x)1+219x\sigma(x) \leftarrow 1 + 219 x, l1l \leftarrow 1.

Iteration n=1n = 1: Δ1=S1σ1S0=232mul(219,219)=232101=141\Delta_1 = S_1 \oplus \sigma_1 \cdot S_0 = 232 \oplus \text{mul}(219, 219) = 232 \oplus 101 = 141. Non-zero, so update σ\sigma. Since ll doesn't increase (21>12 \cdot 1 > 1), only the coefficients change: σ(x)1+81x\sigma(x) \leftarrow 1 + 81 x.

Iteration n=2n = 2: Δ2=S2σ1S1=29mul(81,232)=2929=0\Delta_2 = S_2 \oplus \sigma_1 \cdot S_1 = 29 \oplus \text{mul}(81, 232) = 29 \oplus 29 = 0. Zero — σ\sigma already predicts S2S_2 correctly. No update.

Iterations n=3n = 3 through 3131: All discrepancies are zero. The polynomial has converged.

Result: σ(x)=1+81x\sigma(x) = 1 + 81 x with l=1l = 1 (one error). Since 81=α25281 = \alpha^{252}, the error locator coefficient directly encodes the error position.

The Chien search finds the roots of σ(x)=1+81x\sigma(x) = 1 + 81 x by evaluating it at every possible value x=αmx = \alpha^m:

σ(α0)=1mul(81,1)=181=80\sigma(\alpha^0) = 1 \oplus \text{mul}(81, 1) = 1 \oplus 81 = 80

σ(α1)=1mul(81,2)=1162=163\sigma(\alpha^1) = 1 \oplus \text{mul}(81, 2) = 1 \oplus 162 = 163

σ(α2)=1mul(81,4)=1195=194\sigma(\alpha^2) = 1 \oplus \text{mul}(81, 4) = 1 \oplus 195 = 194

σ(α3)=1mul(81,8)=11=0root!\sigma(\alpha^3) = 1 \oplus \text{mul}(81, 8) = 1 \oplus 1 = 0 \quad \leftarrow \text{root!}

σ(α3)=0\sigma(\alpha^3) = 0, so α3\alpha^3 is a root. The error position is (3+254)mod255=2(3 + 254) \bmod 255 = 2. That is byte 2 — exactly where the L was corrupted.

5. Error Magnitude (Forney)

The Forney algorithm computes the error evaluator polynomial Ω(x)=S(x)σ(x)modx32\Omega(x) = S(x) \cdot \sigma(x) \bmod x^{32}, which starts:

Ω0=S0σ0=2191=219\Omega_0 = S_0 \cdot \sigma_0 = 219 \cdot 1 = 219 Ω1=S1σ0S0σ1=232mul(219,81)=232232=0\Omega_1 = S_1 \cdot \sigma_0 \oplus S_0 \cdot \sigma_1 = 232 \oplus \text{mul}(219, 81) = 232 \oplus 232 = 0

So Ω(x)=219\Omega(x) = 219 (a constant). The formal derivative of σ\sigma is also a constant: σ(x)=σ1=81\sigma'(x) = \sigma_1 = 81.

The error magnitude uses X=α254p=α252=81X = \alpha^{254-p} = \alpha^{252} = 81 and X1=α3=8X^{-1} = \alpha^3 = 8:

e=X1112Ω(X1)σ(X1)=α7821981=19196=179e = X^{1 - 112} \cdot \frac{\Omega(X^{-1})}{\sigma'(X^{-1})} = \alpha^{78} \cdot \frac{219}{81} = 19 \cdot 196 = 179

where X1112=α252144mod255=α78=19X^{1 - 112} = \alpha^{252 \cdot 144 \bmod 255} = \alpha^{78} = 19 adjusts for the first consecutive root, and 219/81=196219 / 81 = 196 in GF(282^8).

Correction is XOR:

255179=76(the original L!)255 \oplus 179 = 76 \quad (\text{the original } \texttt{L}\text{!})

6. Result

The corrected codeword matches the original:

Corrected:
[ 72, 69, 76, 76, 79, 0, ..., 0, 243, 147, 197, ... ]
H E L L O (parity intact)

Errors corrected: 1