# Notes on Error-Correcting Codes

David K. Zhang

Definition: Alphabet, Letter, Symbol

An alphabet is a finite set. We call the elements of an alphabet letters or symbols.

Definition: Word, Length, Kleene Star, Kleene Closure, $\Sigma^*$

A word over an alphabet $\Sigma$ of length $n \in \N$ is an element of $\Sigma^n$ (the $n$-fold Cartesian product of $\Sigma$ with itself). The set of all words of any length over $\Sigma$ is called the Kleene star or Kleene closure of $\Sigma$, and is denoted by $\Sigma^*$.

In other words, $\Sigma^* \coloneqq \bigcup_{n \in \N} \Sigma^n$, and length defines a function $\Sigma^* \to \N$ that maps each element of $\Sigma^n$ to $n$.

Definition: Code, Block Length, Codeword

A code over an alphabet $\Sigma$ of block length $n \in \N$ is a subset of $\Sigma^n$. The elements of a code are called its codewords.

Definition: Systematic Encoding Map

Let $\Sigma$ be an alphabet, and let $n, k \in \N$ with $k \le n$. A systematic encoding map is a function $f: \Sigma^k \to \Sigma^n$ that has the following property: for any $x \in \Sigma^k$, the first $k$ symbols of $f(x)$ coincide with $x$.

Definition: Indicator Function, $1[P]$

The indicator function of a logical proposition $P$ is the quantity

$1[P] \coloneqq \begin{cases} 1 & \text{if } P. \\ 0 & \text{if } \neg P. \end{cases}$

Definition: Hamming Distance, $\Delta(x, y)$, Relative Hamming Distance, $\delta(x, y)$

Let $\Sigma$ be an alphabet, and let $n \in \N$. The Hamming distance between two words $x, y \in \Sigma^n$ is the quantity

$\Delta(x, y) \coloneqq \sum_{i=1}^n \mathbb{1}[x_i \ne y_i].$

Similarly, the relative Hamming distance between $x$ and $y$ is the quantity

$\delta(x, y) \coloneqq \frac{\Delta(x, y)}{n} = \frac{1}{n} \sum_{i=1}^n \mathbb{1}[x_i \ne y_i].$

The Hamming distance $\Delta$ and relative Hamming distance $\delta$ are metrics on $\Sigma^n$. In particular, they satisfy the triangle inequality.

Definition: Message Length, Dimension, Rate, Minimum Distance, Distance, $(n, k, d)_{\abs{\Sigma}}$-Code

Let $\Sigma$ be an alphabet, and let $n \in \N$.

• The message length or dimension of a code $C \subseteq \Sigma^n$ is the quantity $\log_{\abs{\Sigma}} \abs{C}$.
• The rate of a code $C \subseteq \Sigma^n$ is the quantity $(\log_{\abs{\Sigma}} \abs{C}) / n$.
• The minimum distance (or simply distance) of a code $C \subseteq \Sigma^n$ is the minimum Hamming distance between its distinct codewords, i.e., $\min\,\{ \Delta(c, c') : c, c' \in C \text{ and } c \ne c' \}$.

A code $C \subseteq \Sigma^n$ with message length $k$ and minimum distance $d$ is called an $(n, k, d)_{\abs{\Sigma}}$-code.

Definition: Hamming Ball, $B_{\Sigma^n}(w, r)$, Volume, $\Vol_{\abs{\Sigma}}(r, n)$

Let $\Sigma$ be an alphabet, and let $n \in \N$. The Hamming ball in $\Sigma^n$ of radius $r \in \N$ about a word $w \in \Sigma^n$ is the set

$B_{\Sigma^n}(w, r) \coloneqq \{ x \in \Sigma^n : \Delta(w, x) \le r \}.$

The volume of a Hamming ball is simply its cardinality, and is denoted by

$\Vol_{\abs{\Sigma}}(r, n) \coloneqq \abs{B_{\Sigma^n}(w, r)}.$

Note that this quantity is independent of the choice of word $w \in \Sigma^n$.

$\Vol_q(r, n) = 1 + \binom{n}{1} (q - 1) + \binom{n}{2} (q - 1)^2 + \cdots + \binom{n}{r} (q - 1)^r$

Definition: Erasure, Error

TODO

Minimum Distance is a Proxy for Robustness

Theorem: Let $\Sigma$ be an alphabet, and let $n \in \N$. If a code $C \subseteq \Sigma^n$ has minimum distance $d$, then:

• $C$ can correct $\le d - 1$ erasures.
• $C$ can detect $\le d - 1$ errors.
• $C$ can correct $\le \lfloor \frac{d-1}{2} \rfloor$ errors.

Proof: TODO

Hamming Bound

Theorem: Let $\Sigma$ be an alphabet, and let $n \in \N$. If a code $C \subseteq \Sigma^n$ has minimum distance $d$, then its rate $R$ is bounded above by

$R \coloneqq \frac{\log_{\abs{\Sigma}} \abs{C}}{n} \le 1 - \frac{\log_{\abs{\Sigma}} \! \left( \Vol_{\abs{\Sigma}} \! \left( \left\lfloor \frac{d-1}{2} \right\rfloor\!, n \right) \right)}{n}$

Proof: Let $\mathcal{B} \coloneqq \{ B_{\Sigma^n}(c, \lfloor\!\frac{d-1}{2}\!\rfloor) : c \in C \}$ denote the collection of Hamming balls of radius $\lfloor\!\frac{d-1}{2}\!\rfloor$ about each codeword in $C$. Observe that the members of $\mathcal{B}$ are pairwise disjoint. Indeed, if there were two distinct codewords $c, c' \in C$ whose Hamming balls intersect, say at $\tilde{c} \in B_{\Sigma^n}(c, \lfloor\!\frac{d-1}{2}\!\rfloor) \cap B_{\Sigma^n}(c', \lfloor\!\frac{d-1}{2}\!\rfloor)$, then by definition, we would have $\Delta(c, \tilde{c}) \le \lfloor\!\frac{d-1}{2}\!\rfloor$ and $\Delta(\tilde{c}, c') \le \lfloor\!\frac{d-1}{2}\!\rfloor$. The triangle inequality would then imply

$\Delta(c, c') \le \Delta(c, \tilde{c}) + \Delta(\tilde{c}, c') \le \left\lfloor \frac{d-1}{2} \right\rfloor + \left\lfloor \frac{d-1}{2} \right\rfloor \le d - 1,$

contradicting the hypothesis that $C$ has minimum distance $d$. Now, the members of $\mathcal{B}$ are all subsets of $\Sigma^n$, so it follows that the total cardinality of all of the balls in $\mathcal{B}$ is bounded above by $\abs{\Sigma}^n$.

$\abs{C} \Vol_{\abs{\Sigma}} \! \left( \left\lfloor \frac{d-1}{2} \right\rfloor\!, n \right) \le \abs{\Sigma}^n$

Taking the base-$\abs{\Sigma}$ logarithm of both sides, we obtain:

$\log_{\abs{\Sigma}} \abs{C} + \log_{\abs{\Sigma}} \! \left( \Vol_{\abs{\Sigma}} \! \left( \left\lfloor \frac{d-1}{2} \right\rfloor\!, n \right) \right) \le n$

Dividing both sides by $n$ yields the desired result.

Definition: Perfect Code

Let $\Sigma$ be an alphabet, and let $n \in \N$. A code $C \subseteq \Sigma^n$ is perfect if the collection of Hamming balls $\{ B_{\Sigma^n}(c, \lfloor\!\frac{d-1}{2}\!\rfloor) : c \in C \}$ covers $\Sigma^n$, where $d$ denotes the minimum distance of $C$.

In other words, a code is perfect if it saturates the Hamming bound.

Definition: Linear Code

Let $F$ be a finite field. A linear code of dimension $k \in \N$ and block length $n \in \N$ over $F$ is a $k$-dimensional linear subspace of $F^n$.

Note that the linear-algebraic notion of dimension coincides with our previous definition of the dimension of a code. If $C$ is a $k$-dimensional linear subspace of $F^n$, then $\abs{C} = \abs{F}^k$, which implies $k = \log_{\abs{F}} \abs{C}$.

Definition: Generator Matrix, Parity-Check Matrix, Dual, $C^\perp$

Let $F$ be a finite field, $n, k \in \N$, and let $C \subseteq F^n$ be a linear code of dimension $k$.

• A generator matrix for $C$ is a matrix $G \in F^{n \times k}$ whose column space is $C$.
• A parity-check matrix for $C$ is a matrix $H \in F^{(n - k) \times k}$ whose kernel is $C$.
• The dual of $C$ is the linear code $C^\perp \coloneqq \{ \vd \in F^n : \langle \vc, \vd \rangle = 0_F \text{ for all } \vc \in C \}$.

Here, $\langle \cdot, \cdot \rangle$ denotes the standard inner product on $F^n$, i.e., $\langle \vv, \vw \rangle \coloneqq \sum_{i=1}^n v_i w_i$.

Because every vector space has a basis, every linear code $C \subseteq F^n$ admits a generator matrix (whose columns are the basis vectors for $C$). In fact, up to permutations of symbol positions within codewords, every linear code admits a generator matrix whose first $k$ rows coincide with the $k \times k$ identity matrix. Such a generator matrix defines a systematic encoding map.

Gilbert–Varshamov Bound

Theorem: Let $q \in \N$ be a prime power, and let $n, d \in \N$ with $d \le n$. There exists a linear code $C \subseteq (\F_q)^n$ of minimum distance $d$ and rate

$R \ge 1 - \frac{\lfloor \log_q(\Vol_q(d - 1, n)) \rfloor + 1}{n}.$

Proof: Let $C$ be a uniformly random linear subspace of $(\F_q)^n$ of dimension $k \coloneqq n - \lfloor \log_q(\Vol_q(d-1, n)) \rfloor - 1$. Note that there are only finitely many such subspaces, so this uniform distribution makes sense. We will show that $C$ has minimum distance $\ge d$ with probability $> 0$.

Let $G \in (\F_q)^{n \times k}$ be a uniformly random generator matrix for $C$. Observe that, for any fixed $\vx \in (\F_q)^n \setminus \{\vo\}$, the vector $G\vx$ is uniformly distributed in $(\F_q)^n \setminus \{\vo\}$. Indeed, all of the random choices we have made so far are invariant under a linear isomorphism of $(\F_q)^n$, and there exist isomorphisms of $(\F_q)^n$ that send any nonzero vector to any other nonzero vector, so this random process cannot favor any nonzero vector.

TODO

Remark: The Gilbert-Varshamov bound actually holds for any $q \in \N$, not just prime powers. A proof of this stronger result requires a different construction, and the resulting codes are not linear codes.