From 7dc99f1ef5b10d74035c00f6161dfea84517a2b3 Mon Sep 17 00:00:00 2001 From: Sungchan Yi Date: Wed, 17 Jan 2024 18:11:34 +0900 Subject: [PATCH] [PUBLISHER] upload files #133 --- .../2023-10-05-number-theory.md | 257 ++++++++++++++++++ 1 file changed, 257 insertions(+) create mode 100644 _posts/Lecture Notes/Modern Cryptography/2023-10-05-number-theory.md diff --git a/_posts/Lecture Notes/Modern Cryptography/2023-10-05-number-theory.md b/_posts/Lecture Notes/Modern Cryptography/2023-10-05-number-theory.md new file mode 100644 index 0000000..8dc8186 --- /dev/null +++ b/_posts/Lecture Notes/Modern Cryptography/2023-10-05-number-theory.md @@ -0,0 +1,257 @@ +--- +share: true +toc: true +math: true +categories: + - Lecture Notes + - Modern Cryptography +tags: + - lecture-note + - cryptography + - number-theory + - security +title: 8. Number Theory +date: 2023-10-05 +github_title: 2023-10-05-number-theory +--- + + +## Background + +### Number Theory + +Let $n$ be a positive integer and let $p$ be prime. + +> **Notation.** Let $\mathbb{Z}$ denote the set of integers. We will write $\mathbb{Z}_n = \left\lbrace 0, 1, \dots, n - 1 \right\rbrace$. + +> **Definition.** Let $x, y \in \mathbb{Z}$. $\gcd(x, y)$ is the **greatest common divisor** of $x, y$. $x$ and $y$ are relatively prime if $\gcd(x, y) = 1$. + +> **Definition.** The **multiplicative inverse** of $x \in \mathbb{Z}_n$ is an element $y \in \mathbb{Z}_n$ such that $xy = 1$ in $\mathbb{Z}_n$. + +> **Lemma.** $x \in \mathbb{Z}_n$ has a multiplicative inverse if and only if $\gcd(x, n) = 1$. + +> **Definition.** $\mathbb{Z}_n^\ast$ is the set of invertible elements in $\mathbb{Z}_n$. i.e, $\mathbb{Z}_n^\ast = \left\lbrace x \in \mathbb{Z}_n : \gcd(x, n) = 1 \right\rbrace$. + +> **Lemma.** (Extended Euclidean Algorithm) For $x, y \in \mathbb{Z}$, there exists $a, b \in \mathbb{Z}$ such that $ax + by = \gcd(x, y)$. + +### Group Theory + +> **Definition.** A **group** is a set $G$ with a binary operation $* : G \times G \rightarrow G$, satisfying the following properties. +> +> - $(\mathsf{G1})$ (Associative) $(a * b) * c = a * (b * c)$ for all $a, b, c \in G$. +> - $(\mathsf{G2})$ (Identity) $\exists e \in G$ such that for all $a\in G$, $e * a = a * e = a$. +> - $(\mathsf{G3})$ (Inverse) For each $a \in G$, $\exists x \in G$ such that $a * x = x * a = e$. In this case, $x = a^{-1}$. + +> **Definition.** A group is **commutative** if $a * b = b * a$ for all $a, b \in G$. + +> **Definition.** The **order** of a group is the number of elements in $G$, denoted as $\left\lvert G \right\lvert$. + +> **Definition.** A set $H \subseteq G$ is a **subgroup** of $G$ if $H$ is itself a group under the operation of $G$. We write $H \leq G$. + +> **Theorem.** (Lagrange) Let $G$ be a finite group and $H \leq G$. Then $\left\lvert H \right\lvert \mid \left\lvert G \right\lvert$. + +*Proof*. All left cosets of $H$ have the same number of elements. A bijection between any two coset can be constructed. Cosets partition $G$, so $\left\lvert G \right\lvert$ is equal to the number of left cosets multiplied by $\left\lvert H \right\lvert$. + +Let $G$ be a group. + +> **Definition.** Let $g \in G$. The set $\left\langle g \right\rangle = \left\lbrace g^n : n \in \mathbb{Z} \right\rbrace$ is called the **cyclic subgroup generated by $g$**. The **order** of $g$ is the number of elements in $\left\langle g \right\rangle$, denoted as $\left\lvert g \right\lvert$. + +> **Definition.** $G$ is **cyclic** if there exists $g \in G$ such that $G = \left\langle g \right\rangle$. + +> **Theorem.** $\mathbb{Z}_p^\ast$ is cyclic. + +*Proof*. $\mathbb{Z}_p$ is a finite field, so $\mathbb{Z}_p^\ast = \mathbb{Z}_p \setminus \left\lbrace 0 \right\rbrace$ is cyclic. + +> **Theorem.** If $G$ is a finite group, then $g^{\left\lvert G \right\lvert} = 1$ for all $g \in G$. i.e, $\left\lvert g \right\lvert \mid \left\lvert G \right\lvert$. + +*Proof*. Consider $\left\langle g \right\rangle \leq G$, then the result follows from Lagrange's theorem. + +> **Corollary.** (Fermat's Little Theorem) If $x \in \mathbb{Z}_p^\ast$, $x^{p-1} = 1$. + +*Proof*. $\mathbb{Z}_p^\ast$ has $p-1$ elements. + +> **Corollary.** (Euler's Generalization) If $x \in \mathbb{Z}_n^\ast$, $x^{\phi(n)} = 1$. + +*Proof*. $\mathbb{Z}_n^\ast$ has $\phi(n)$ elements, where $\phi(n)$ is the Euler's totient function. + +--- + +Schemes such as Diffie-Hellman rely on the hardness of the DLP. So, *how hard is it*? How does one compute the discrete logarithm? + +There are group-specific algorithms that exploit the algebraic features of the group, but we only cover generic algorithms, that works on any cyclic group. A trivial example would be the exhaustive search, where if $\left\lvert G \right\lvert = n$ and given a generator $g \in G$, find the discrete logarithm of $h \in G$ by computing $g^i$ for all $i = 1, \dots, n - 1$. Obviously, it has running time $\mathcal{O}(n)$. We can do better than this. + +## Baby Step Giant Step Method (BSGS) + +Let $G = \left\langle g \right\rangle$, where $g \in G$ has order $q$. $q$ need not be prime for this method. We are given $u = g^\alpha$, $g$, and $q$. Our task is to find $\alpha \in \mathbb{Z}_q$. + +Set $m = \left\lceil \sqrt{q} \right\rceil$. $\alpha$ is currently unknown, but by the division algorithm, there exists integers $i,j$ such that $\alpha = i \cdot m + j$ and $0\leq i, j < m$. Then $u = g^\alpha = g^{i\cdot m + j} = g^{im} \cdot g^j$. Therefore, + +$$ +u(g^{-m})^i = g^j. +$$ + +Now, we compute the values of $g^j$ for $j = 0, 1,\dots, m - 1$ and keep a table of $(j, g^j)$ pairs. Next, compute $g^{-m}$ and for each $i$, compute $u(g^{-m})^{i}$ and check if this value is in the table. If a value is found, then we found $(i, j)$ such that $i \cdot m + j = \alpha$. + +We see that this algorithm takes $2\sqrt{q}$ group operations on $G$ in the worst case, so the time complexity is $\mathcal{O}(\sqrt{q})$. However, to store the values of $(j, g^j)$ pairs, a lot of memory is required. The table must be large enough to contain $\sqrt{q}$ group elements, so the space complexity is also $\mathcal{O}(\sqrt{q})$. + +To get around this, we can build a smaller table by choosing a smaller $m$. But then $0 \leq j < m$ but $i$ must be checked for around $q/m$ values. + +There is actually an algorithm using constant space. **Pollard's Rho** algorithm takes $\mathcal{O}(\sqrt{q})$ times and $\mathcal{O}(1)$ space. + +## Groups of Composite Order + +In Diffie-Hellman, we only used large primes. There is a reason for using groups with prime order. We study what would happen if we used composite numbers. + +Let $G$ be a cyclic group of composite order $n$. First, we start with a simple case. + +### Prime Power Case: Order $n = q^e$ + +Let $G = \left\langle g \right\rangle$ be a cyclic group of order $q^e$.[^1] ($q > 1$, $e \geq 1$) We are given $g,q, e$ and $u = g^\alpha$ and we will find $\alpha$. ($0 \leq \alpha < q^e)$ + +For each $f = 0, \dots, e$, define $g_f = g^{(q^f)}$. Then + +$$ +(g_f)^{(q^{e-f})} = g^{(q^f) \cdot (q^{e-f})} = g^{(q^e)} = 1. +$$ + +So $g_f$ generates a cyclic subgroup of order $q^{e-f}$. In particular, $g_{e-1}$ generates a cyclic subgroup of order $q$. Using this fact, we will reduce the given problem into a discrete logarithm problem on a group having smaller order $q$. + +We proceed with recursion on $e$. If $e = 1$, then $\alpha \in \mathbb{Z}_q$, so we have nothing to do. Suppose $e > 1$. Choose $f$ so that $1 \leq f \leq e-1$. We can write $\alpha = i\cdot q^f + j$, where $0 \leq i < q^{e-f}$ and $0 \leq j < g^f$. Then + +$$ +u = g^\alpha = g^{i \cdot q^f + j} = (g_f)^i \cdot g^j. +$$ + +Since $g_f$ has order $q^{e-f}$, exponentiate both sides by $q^{e-f}$ to get + +$$ +u^{(q^{e-f})} = (g_f)^{q^{e-f} \cdot i} \cdot g^{q^{e-f} \cdot j} = (g_{e-f})^j. +$$ + +Now the problem has been reduced to a discrete logarithm problem with base $g_{e-f}$, which has order $q^f$. We can compute $j$ using algorithms for discrete logarithms. + +After finding $j$, we have + +$$ +u/g^j = (g_f)^i +$$ + +which is also a discrete logarithm problem with base $g_f$, which has order $q^{e-f}$. We can compute $i$ that satisfies this equation. Finally, we can compute $\alpha = i \cdot q^f + j$. We have reduced a discrete logarithm problem into two smaller discrete logarithm problems. + +To get the best running time, choose $f \approx e/2$. Let $T(e)$ be the running time, then + +$$ +T(e) = 2T\left( \frac{e}{2} \right) + \mathcal{O}(e\log q). +$$ + +The $\mathcal{O}(e\log q)$ term comes from exponentiating both sides by $q^{e-f}$. Solving this recurrence gives + +$$ +T(e) = \mathcal{O}(e \cdot T_{\mathrm{base}} + e\log e \log q), +$$ + +where $T_\mathrm{base}$ is the complexity of the algorithm for the base case $e = 1$. $T_\mathrm{base}$ is usually the dominant term, since the best known algorithm takes $\mathcal{O}(\sqrt{q})$. + +Thus, computing the discrete logarithm in $G$ is only as hard as computing it in the subgroup of prime order. + +### General Case: Pohlig-Hellman Algorithm + +Let $G = \left\langle g \right\rangle$ be a cyclic group of order $n = q_1^{e_1}\cdots q_r^{e_r}$, where the factorization of $n$ into distinct primes $q_i$ is given. We want to find $\alpha$ such that $g^\alpha = u$. + +For $i = 1, \dots, r$, define $q_i^\ast = n / q_i^{e_i}$. Then $u^{q_i^\ast} = (g^{q_i^\ast})^\alpha$, where $g^{q_i^\ast}$ will have order $q_i^{e_i}$ in $G$. Now compute $\alpha_i$ using the algorithm for the prime power case. + +Then for all $i$, we have $\alpha \equiv \alpha_i \pmod{q_i^{e_i}}$. We can now use the Chinese remainder theorem to recover $\alpha$. Let $q_r$ be the largest prime, then the running time is bounded by + +$$ +\sum_{i=1}^r \mathcal{O}(e_i T(q_i) + e_i \log e_i \log q_i) = \mathcal{O}(T(q_r) \log n + \log n \log \log n) +$$ + +group operations. Thus, we can conclude the following. + +> The difficulty of computing discrete logarithms in a cyclic group of order $n$ is determined by the size of the largest prime factor. + +### Consequences + +- For a group with order $n = 2^k$, the Pohlig-Hellman algorithm will easily compute the discrete logarithm, since the largest prime factor is $2$. The DL assumption is false for this group. +- For primes of the form $p = 2^k + 1$, the group $\mathbb{Z}_p^\ast$ has order $2^k$, so the DL assumption is also false for these primes. +- In general, $G$ must have at least one large prime factor for the DL assumption to be true. +- By the Pohlig-Hellman algorithm, discrete logarithms in groups of composite order is a little harder than groups of prime order. So we often use a prime order group. + +## Information Leakage in Groups of Composite Order + +Let $G = \left\langle g \right\rangle$ be a cyclic group of composite order $n$. We suppose that $n = n_1n_2$, where $n_1$ is a small prime factor. + +By the Pohlig-Hellman algorithm, the adversary can compute $\alpha_1 \equiv \alpha \pmod {n_1}$ by computing the discrete logarithm of $u^{n_2}$ with base $g^{n_2}$. + +Consider $n_1 = 2$. Then the adversary knows whether $\alpha$ is even or not. + +> **Lemma.** $\alpha$ is even if and only if $u^{n/2} = 1$. + +*Proof*. If $\alpha$ is even, then $u^{n/2} = g^{\alpha n/2} = (g^{\alpha/2})^n = 1$, since the group has order $n$. Conversely, if $u^{n/2} = g^{\alpha n/2} = 1$, then the order of $g$ must divide $\alpha n/2$, so $n \mid (\alpha n /2)$ and $\alpha$ is even. + +This lemma can be used to break the DDH assumption. + +> **Lemma.** Given $u = g^\alpha$ and $v = g^\beta$, $\alpha\beta \in \mathbb{Z}_n$ is even if and only if $u^{n/2} = 1$ or $v^{n/2} = 1$. + +*Proof*. $\alpha\beta$ is even if and only if either $\alpha$ or $\beta$ is even. By the above lemma, this is equivalent to $u^{n/2} = 1$ or $v^{n/2} = 1$. + +Now we describe an attack for the DDH problem. + +> 1. The adversary is given $(g^\alpha, g^\beta, g^\gamma)$. +> 2. The adversary computes the parity of $\gamma$ and $\alpha\beta$ and compares them. +> 3. The adversary outputs $\texttt{accept}$ if the parities match, otherwise output $\texttt{{reject}}$. + +If $\gamma$ was chosen uniformly, then the adversary wins with probability $1/2$. But if $\gamma = \alpha\beta$, the adversary always wins, so the adversary has DDH advantage $1/2$. + +The above process can be generalized to any groups with small prime factor. See Exercise 16.2[^2] Thus, this is another reason we use groups of prime order. + +- DDH assumption does not hold in $\mathbb{Z}_p^\ast$, since its order $p-1$ is always even. +- Instead, we use a prime order subgroup of $\mathbb{Z}_p^\ast$ or prime order elliptic curve group. + +## Summary of Discrete Logarithm Algorithms + +|Name|Time Complexity|Space Complexity| +|:-:|:-:|:-:| +|BSGS|$\mathcal{O}(\sqrt{q})$|$\mathcal{O}(\sqrt{q})$| +|Pohlig-Hellman|$\mathcal{O}(\sqrt{q_\mathrm{max}}$|$\mathcal{O}(1)$| +|Pollard's Rho|$\mathcal{O}(\sqrt{q})$|$\mathcal{O}(1)$| + +- In generic groups, solving the DLP requires $\Omega(\sqrt{q})$ operations. + - By *generic groups*, we mean that only group operations and equality checks are allowed. Algebraic properties are not used. +- Thus, we use a large prime $q$ such that $\sqrt{q}$ is large enough. + +## Candidates of Discrete Logarithm Groups + +We need groups of order prime, and we cannot use $\mathbb{Z}_p^\ast$ as itself. We have two candidates. + +- Use a subgroup of $\mathbb{Z}_p^\ast$ having prime order $q$ such that $q \mid (p-1)$ as in Diffie-Hellman. +- Elliptic curve group modulo $p$. + +### Reduced Residue Class $\mathbb{Z}_p^\ast$ + +There are many specific algorithms for discrete logarithms on $\mathbb{Z}_p^\ast$. + +- Index-calculus +- Elliptic-curve method +- Special number-field sieve (SNFS) +- **General number-field sieve** (GNFS) + +GNFS running time is dominated by the term $\exp(\sqrt[3]{\ln p})$. If we let $p$ to be an $n$-bit prime, then the complexity is $\exp(\sqrt[3]{n})$. Suppose that GNFS runs in time $T$ for prime $p$. Since $\sqrt[3]{2} \approx 1.26$, doubling the number of bits will increase the running time of GNFS to $T^{1,26}$. + +Compare this with symmetric ciphers such as AES, where doubling the key size squares the amount of work required.[^3] NIST and Lenstra recommends the size of primes that gives a similar level of security to that of symmetric ciphers. + +|Symmetric key length|Size of prime (NIST)|Size of prime (Lenstra)| +|:-:|:-:|:-:| +|80|1024|1329| +|128|3072|4440| +|256|15360|26268| + +All sizes are in bits. Thus we need a very large prime, for example $p > 2^{2048}$, for security these days. + +### Elliptic Curve Group over $\mathbb{Z}_p$ + +Currently, the best-known attacks are generic attacks, so we can use much smaller parameters than $\mathbb{Z}_p^\ast$. Often the groups have sizes about $2^{256}$, $2^{384}$, $2^{512}$. + +[^1]: We didn't require $q$ to be prime! +[^2]: A Graduate Course in Applied Cryptography +[^3]: Recall that the best known attack was only 4 times faster than brute-force search.