mirror of
https://github.com/calofmijuck/blog.git
synced 2025-12-06 22:53:51 +00:00
Compare commits: `main` ... `a8457d15c8` (16 commits)

| SHA1 |
|---|
| a8457d15c8 |
| 30a1a142d4 |
| 2a859e647b |
| 298b1ba1aa |
| b90039687a |
| 13755ba204 |
| dd3e21344e |
| 0d6ba31ba6 |
| 0f1120b2f6 |
| 871ca66457 |
| 33846b79a1 |
| 66f0e0a50e |
| 922844d638 |
| 47872c6bef |
| 4d68a99404 |
| da098b4126 |
@@ -36,7 +36,7 @@ attachment:
|
|||||||
In this course, we are mainly interested in system/network security!
|
In this course, we are mainly interested in system/network security!
|
||||||
|
|
||||||
There are two categories in **IT Security** (though the boundary is blurry):
|
There are two categories in **IT Security** (though the boundary is blurry):
|
||||||
- **Computer** (system) **security** uses automated tools and mechanisms to protect **data in a computer**, against hackers, malware, etc.
|
- **Computer** (system) **security** uses automated tools and mechanisms to protect the **data in a computer**, against hackers, malware, etc.
|
||||||
- **Internet** (network) **security** prevents, detects, and corrects security violations that involve the **transmission of information** in a network.
|
- **Internet** (network) **security** prevents, detects, and corrects security violations that involve the **transmission of information** in a network.
|
||||||
|
|
||||||
In internet security, we assume that:
|
In internet security, we assume that:
|
||||||
@@ -52,7 +52,7 @@ In internet security, we assume that:
|
|||||||
- inserting, modifying, deleting, replaying messages
|
- inserting, modifying, deleting, replaying messages
|
||||||
- poisoning data
|
- poisoning data
|
||||||
- impersonating and pretending to be someone else
|
- impersonating and pretending to be someone else
|
||||||
- Conventionally, we use the terms:
|
- Conventionally, we use the following names:
|
||||||
- Alice and Bob for the two parties participating in the communication.
|
- Alice and Bob for the two parties participating in the communication.
|
||||||
- Eve (or Mallory, Oscar) for the adversary.
|
- Eve (or Mallory, Oscar) for the adversary.
|
||||||
|
|
||||||
@@ -94,9 +94,9 @@ This is only an overview, so the attacks are introduced briefly.
|
|||||||
There are two types of security attacks
|
There are two types of security attacks
|
||||||
- **Active attacks**: modify the content of messages
|
- **Active attacks**: modify the content of messages
|
||||||
- Ex. (D)DoS, MITM, poisoning, smurf attack, system attacks.
|
- Ex. (D)DoS, MITM, poisoning, smurf attack, system attacks.
|
||||||
- *Prevention* is important since the active attacks are a danger to *data integrity* and *availability*.
|
- *Prevention* is important since the active attacks concern *data integrity* and *availability*.
|
||||||
- **Passive attacks**: do not modify information, but observe the content or copy it.
|
- **Passive attacks**: do not modify information, but observe the content or copy it.
|
||||||
- Ex. eavesdropping, port scanning (idle scan secretly scanns).
|
- Ex. eavesdropping, port scanning (idle scan secretly scans).
|
||||||
- *Detection* is important since passive attacks are a danger to *confidentiality*.
|
- *Detection* is important since passive attacks are a danger to *confidentiality*.
|
||||||
|
|
||||||
## Security Services and Mechanisms
|
## Security Services and Mechanisms
|
||||||
@@ -112,7 +112,7 @@ What kind of security services do we want? The basic network security services m
|
|||||||
Additionally, we also need:
|
Additionally, we also need:
|
||||||
- **Authentication**: a way to authenticate users (ID, passwords)
|
- **Authentication**: a way to authenticate users (ID, passwords)
|
||||||
- **Non-repudiation**: ensure that no party can deny that it sent or received a message or approved some information
|
- **Non-repudiation**: ensure that no party can deny that it sent or received a message or approved some information
|
||||||
- Assurance that someone cannot deny the validity of something
|
- Assurance that someone cannot deny the validity of a message or information
|
||||||
|
|
||||||
### Attacks Against CIA Triad
|
### Attacks Against CIA Triad
|
||||||
|
|
||||||
@@ -142,10 +142,10 @@ There are many ways of achieving security.
|
|||||||
- It may be desirable to not leak *any* information, so one might add padding to the traffic, so the traffic is indistinguishable by the adversary (prevents side-channel attacks)
|
- It may be desirable to not leak *any* information, so one might add padding to the traffic, so the traffic is indistinguishable by the adversary (prevents side-channel attacks)
|
||||||
- **Digital signatures**: provides authenticity of digital messages or documents
|
- **Digital signatures**: provides authenticity of digital messages or documents
|
||||||
- **Trusted Third Party** (TTP): a safe third-party that we can trust
|
- **Trusted Third Party** (TTP): a safe third-party that we can trust
|
||||||
- If we have a TTP, a lot of problems go away. We can always ask the TTP for the truth
|
- If we have a TTP, a lot of problems go away. We can always ask the TTP for the truth.
|
||||||
- But TTP can become a *single point of failure* (SPOF), and security architectures may become too dependent on the TTP
|
- But TTP can become a *single point of failure* (SPOF), and security architectures may become too dependent on the TTP.
|
||||||
- **Append-only server**: keeps track of all modifications, good for auditing
|
- **Append-only server**: keeps track of all modifications, good for auditing
|
||||||
- Blockchain is a kind of append-only data structure
|
- Blockchain is a kind of append-only data structure.
|
||||||
|
|
||||||
## Cryptography
|
## Cryptography
|
||||||
|
|
||||||
@@ -155,7 +155,7 @@ There are many ways of achieving security.
|
|||||||
|
|
||||||
### Basics of a Cryptosystem
|
### Basics of a Cryptosystem
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
- A **message** in *plaintext* is given to an **encryption algorithm**.
|
- A **message** in *plaintext* is given to an **encryption algorithm**.
|
||||||
- The encryption algorithm uses an **encryption key** to create a *ciphertext*.
|
- The encryption algorithm uses an **encryption key** to create a *ciphertext*.
|
||||||
@@ -168,7 +168,7 @@ There are many ways of achieving security.
|
|||||||
There are two criteria for classifying cryptosystems.
|
There are two criteria for classifying cryptosystems.
|
||||||
|
|
||||||
- How are the keys used?
|
- How are the keys used?
|
||||||
- **Symmetric** cryptography uses a single key for both encryption and decryption
|
- **Symmetric** cryptography uses a single key for both encryption and decryption.
|
||||||
- **Public key** cryptography uses different keys for encryption and decryption, respectively.
|
- **Public key** cryptography uses different keys for encryption and decryption, respectively.
|
||||||
- How are plaintexts processed?
|
- How are plaintexts processed?
|
||||||
- **Block cipher**
|
- **Block cipher**
|
||||||
@@ -232,7 +232,7 @@ In a smartphone, assets (things of value) would be
|
|||||||
For example,
|
For example,
|
||||||
|
|
||||||
|Attacker|Abilities|Goals|
|
|Attacker|Abilities|Goals|
|
||||||
|-|-|-|
|
|:-:|-|-|
|
||||||
|Thief|Steal the phone|Take the device|
|
|Thief|Steal the phone|Take the device|
|
||||||
|FBI|Lot of things...|Obtain evidence from the device|
|
|FBI|Lot of things...|Obtain evidence from the device|
|
||||||
|Eavesdropper|Observe network traffic|Steal information|
|
|Eavesdropper|Observe network traffic|Steal information|
|
||||||
|
|||||||
@@ -24,7 +24,7 @@ github_title: 2023-09-11-symmetric-key-cryptography-1
|
|||||||
- A strong encryption algorithm, which is known to the public.
|
- A strong encryption algorithm, which is known to the public.
|
||||||
- Kerckhoffs's principle!
|
- Kerckhoffs's principle!
|
||||||
- A secret key known only to sender and receiver.
|
- A secret key known only to sender and receiver.
|
||||||
- We assume the **existence of a secure channel for distributing the key**.
|
- We assume the **existence of a secure channel for distributing the key**.[^1]
|
||||||
- **Correctness requirement**
|
- **Correctness requirement**
|
||||||
- Let $m$, $k$ denote the message and the key.
|
- Let $m$, $k$ denote the message and the key.
|
||||||
- For encryption/decryption algorithm $E$ and $D$,
|
- For encryption/decryption algorithm $E$ and $D$,
|
||||||
@@ -32,7 +32,7 @@ github_title: 2023-09-11-symmetric-key-cryptography-1
|
|||||||
|
|
||||||
## Cryptographic Attacks
|
## Cryptographic Attacks
|
||||||
|
|
||||||
In increasing order of increasing power of the attacker,
|
In increasing order of the power of the attacker,
|
||||||
|
|
||||||
- **Ciphertext only attacks**: the attacker has ciphertexts, and tries to obtain information.
|
- **Ciphertext only attacks**: the attacker has ciphertexts, and tries to obtain information.
|
||||||
- **Known plaintext attack**: the attacker has a collection of plaintext/ciphertext pairs.
|
- **Known plaintext attack**: the attacker has a collection of plaintext/ciphertext pairs.
|
||||||
@@ -44,8 +44,10 @@ In increasing order of increasing power of the attacker,
|
|||||||
The following two properties should hold for a secure cipher.
|
The following two properties should hold for a secure cipher.
|
||||||
- **Diffusion** hides the relationship between the ciphertext and the plaintext.
|
- **Diffusion** hides the relationship between the ciphertext and the plaintext.
|
||||||
- It should be hard to obtain the plaintext from the ciphertext.
|
- It should be hard to obtain the plaintext from the ciphertext.
|
||||||
|
- Changing a single bit of the plaintext affects several bits of the ciphertext, and vice versa.
|
||||||
- **Confusion** hides the relationship between the ciphertext and the key.
|
- **Confusion** hides the relationship between the ciphertext and the key.
|
||||||
- It should be hard to obtain the key from the ciphertext.
|
- It should be hard to obtain the key from the ciphertext.
|
||||||
|
- Each bit of the ciphertext should depend on several parts of the key.
|
||||||
|
|
||||||
## Primitives
|
## Primitives
|
||||||
|
|
||||||
@@ -66,8 +68,9 @@ In **substitution cipher**, encryption is done by replacing units of plaintext w
|
|||||||
- In Caesar cipher, $a = 1$ and $b = 3$.
|
- In Caesar cipher, $a = 1$ and $b = 3$.
|
||||||
- Encryption: $E(x) = ax + b \pmod m$.
|
- Encryption: $E(x) = ax + b \pmod m$.
|
||||||
- Decryption: $D(x) = a^{-1}(x - b) \pmod m$.
|
- Decryption: $D(x) = a^{-1}(x - b) \pmod m$.
|
||||||
- There are $12$ possible values for $a$, and $26$ possible values for $b$.
|
- If we use the $26$-letter alphabet, there are $12$ possible values for $a$, and $26$ possible values for $b$.
|
||||||
- $a^{-1}$ (modulo $m$) does not exist for every $a$.
|
- $a^{-1}$ (modulo $m$) does not exist for every $a$.
|
||||||
|
- We need that $\gcd(a, m) = 1$. The number of possible $a$ values is $\phi(m)$.
|
||||||
- This scheme is not secure either, since we can try all possibilities and check if the message makes sense.
|
- This scheme is not secure either, since we can try all possibilities and check if the message makes sense.
|
||||||
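The affine cipher above can be sketched in a few lines of Python. This is a minimal illustration, assuming an uppercase `A`-`Z` alphabet ($m = 26$); the function names `affine_encrypt` and `affine_decrypt` are made up for this example and do not come from the notes.

```python
from math import gcd

M = 26  # alphabet size (assumption: uppercase A-Z only)

def affine_encrypt(text, a, b, m=M):
    """E(x) = a*x + b (mod m), applied letter by letter."""
    if gcd(a, m) != 1:
        raise ValueError("a must be coprime to m, otherwise a^{-1} does not exist")
    return "".join(chr((a * (ord(ch) - 65) + b) % m + 65) for ch in text)

def affine_decrypt(text, a, b, m=M):
    """D(x) = a^{-1} * (x - b) (mod m)."""
    a_inv = pow(a, -1, m)  # modular inverse, available since Python 3.8
    return "".join(chr(a_inv * ((ord(ch) - 65) - b) % m + 65) for ch in text)

# Values of a that have an inverse modulo 26; there are phi(26) = 12 of them.
valid_a = [a for a in range(1, M) if gcd(a, M) == 1]

print(affine_encrypt("HELLO", 1, 3))  # Caesar cipher: KHOOR
print(len(valid_a) * M)               # size of the key space: 312
```

With only $12 \cdot 26 = 312$ keys, looping over every pair $(a, b)$ and inspecting the candidate plaintexts is immediate, which is the brute-force attack described above.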
|
|
||||||
#### Monoalphabetic Substitution Cipher
|
#### Monoalphabetic Substitution Cipher
|
||||||
@@ -79,17 +82,17 @@ In **substitution cipher**, encryption is done by replacing units of plaintext w
|
|||||||
- Decryption is done by replacing each letter $x$ by $\pi^{-1}(x)$.
|
- Decryption is done by replacing each letter $x$ by $\pi^{-1}(x)$.
|
||||||
- This scheme is still not secure, since we can try all possibilities on a *modern* computer.
|
- This scheme is still not secure, since we can try all possibilities on a *modern* computer.
|
||||||
|
|
||||||
To attack this scheme, we use frequency analysis. Calculate the frequency of each letter and compare it with the actual distribution of English letters. Also, we could use bigrams (2-letters)
|
To attack this scheme, we use frequency analysis. Calculate the frequency of each letter and compare it with the actual distribution of English letters. We could also use *bigrams* (2-letters) for calculating the frequency.
|
||||||
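As a rough sketch of the frequency analysis step, the following Python counts single letters and bigrams in a ciphertext. The helper names are hypothetical, and comparing the result against a reference table of English frequencies is left to the reader.

```python
from collections import Counter

def letter_frequencies(text):
    """Relative frequency of each letter, most common first."""
    letters = [ch for ch in text.upper() if ch.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return {ch: n / total for ch, n in counts.most_common()}

def bigram_counts(text):
    """Counts of overlapping 2-letter sequences, ignoring non-letters."""
    letters = "".join(ch for ch in text.upper() if ch.isalpha())
    return Counter(letters[i:i + 2] for i in range(len(letters) - 1))

ciphertext = "WKLV LV D VHFUHW PHVVDJH"  # illustrative input only
print(letter_frequencies(ciphertext))
print(bigram_counts(ciphertext).most_common(3))
```

The most frequent ciphertext letters are then matched against the most frequent English letters as a first guess for $\pi^{-1}$.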
|
|
||||||
#### Vigenère Cipher
|
#### Vigenère Cipher
|
||||||
|
|
||||||
- A polyalphabetic substitution
|
- A polyalphabetic substitution
|
||||||
- Given a key length $m$, take key $k = (k_1, k_2, \dots, k_m)$.
|
- Given a key length $m$, take key $k = (k_1, k_2, \dots, k_m)$.
|
||||||
- For the $i$-th letter $x$, set $j = i \pmod m$.
|
- For the $i$-th letter $x$, set $j = i \bmod m$.
|
||||||
- Encryption is done by replacing $x$ by $x + k_{j}$.
|
- Encryption is done by replacing $x$ by $x + k_{j}$.
|
||||||
- Decryption is done by replacing $x$ by $x - k_j$.
|
- Decryption is done by replacing $x$ by $x - k_j$.
|
||||||
|
|
||||||
To attack this scheme, find the key length by *index of coincidence*. Then use frequency analysis.
|
To attack this scheme, find the key length by [*index of coincidence*](https://en.wikipedia.org/wiki/Index_of_coincidence). Then use frequency analysis.
|
||||||
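A minimal Python sketch of the Vigenère cipher and of the index of coincidence follows. It assumes uppercase letters only and a 0-based key index (the notes index the key from $1$); `vigenere` and `index_of_coincidence` are names chosen for this example.

```python
from collections import Counter

def vigenere(text, key, decrypt=False):
    """Shift the i-th letter by key[i mod len(key)]; subtract when decrypting."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text):
        k = ord(key[i % len(key)]) - 65
        out.append(chr((ord(ch) - 65 + sign * k) % 26 + 65))
    return "".join(out)

def index_of_coincidence(text):
    """Probability that two letters drawn without replacement are equal."""
    counts = Counter(text)
    n = len(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

ct = vigenere("ATTACKATDAWN", "LEMON")
print(ct)                                  # LXFOPVEFRNHR
print(vigenere(ct, "LEMON", decrypt=True)) # ATTACKATDAWN
```

To guess the key length, one would compute the index of coincidence of every $m$-th letter for several candidate $m$ and pick the value for which it looks closest to that of English text.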
|
|
||||||
#### Hill Cipher
|
#### Hill Cipher
|
||||||
|
|
||||||
@@ -113,6 +116,48 @@ This scheme is vulnerable to known plaintext attack, since the equation can be s
|
|||||||
- To encrypt, reorder the columns by the chosen permutation.
|
- To encrypt, reorder the columns by the chosen permutation.
|
||||||
- The ciphertext is then obtained by reading the letters in column-major order.
|
- The ciphertext is then obtained by reading the letters in column-major order.
|
||||||
|
|
||||||
|
##### Example
|
||||||
|
|
||||||
|
Suppose we encrypt the following text:
|
||||||
|
|
||||||
|
$$
|
||||||
|
\texttt{CRYPTOGRAPHY INTERNET SECURITY}
|
||||||
|
$$
|
||||||
|
|
||||||
|
Choose a key $\sigma = (1, 4, 5, 2, 3, 6)$. Then
|
||||||
|
|
||||||
|
$$
|
||||||
|
\begin{matrix} \\
|
||||||
|
4 & 3 & 6 & 5 & 2 & 1 \\ \hline
|
||||||
|
\texttt{C} & \texttt{R} & \texttt{Y} & \texttt{P} & \texttt{T} & \texttt{O} \\
|
||||||
|
\texttt{G} & \texttt{R} & \texttt{A} & \texttt{P} & \texttt{H} & \texttt{Y} \\
|
||||||
|
\texttt{I} & \texttt{N} & \texttt{T} & \texttt{E} & \texttt{R} & \texttt{N} \\
|
||||||
|
\texttt{E} & \texttt{T} & \texttt{S} & \texttt{E} & \texttt{C} & \texttt{U} \\
|
||||||
|
\texttt{R} & \texttt{I} & \texttt{T} & \texttt{Y}
|
||||||
|
\end{matrix}
|
||||||
|
$$
|
||||||
|
|
||||||
|
Now reorder the columns,
|
||||||
|
|
||||||
|
$$
|
||||||
|
\begin{matrix} \\
|
||||||
|
1 & 2 & 3 & 4 & 5 & 6 \\ \hline
|
||||||
|
\texttt{O} & \texttt{T} & \texttt{R} & \texttt{C} & \texttt{P} & \texttt{Y} \\
|
||||||
|
\texttt{Y} & \texttt{H} & \texttt{R} & \texttt{G} & \texttt{P} & \texttt{A} \\
|
||||||
|
\texttt{N} & \texttt{R} & \texttt{N} & \texttt{I} & \texttt{E} & \texttt{T} \\
|
||||||
|
\texttt{U} & \texttt{C} & \texttt{T} & \texttt{E} & \texttt{E} & \texttt{S} \\
|
||||||
|
&& \texttt{I} & \texttt{R} & \texttt{Y} & \texttt{T}
|
||||||
|
\end{matrix}
|
||||||
|
$$
|
||||||
|
|
||||||
|
The ciphertext is
|
||||||
|
|
||||||
|
$$
|
||||||
|
\texttt{OYNU THRC RRNTI CGIER PPEEY YATST}.
|
||||||
|
$$
|
||||||
|
|
||||||
|
The decryption process is the reverse of this operation. It seems to be breakable by inspecting the $i$-th letter of each block and reordering the letters to check if any reordering makes sense.
|
||||||
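The example above can be reproduced with a short Python sketch. It assumes the space-free plaintext and the column labels $4, 3, 6, 5, 2, 1$ from the table; the function names are illustrative.

```python
def transpose_encrypt(plaintext, labels):
    """Write the text row by row, then read columns in order of their labels."""
    n = len(labels)
    columns = ["" for _ in range(n)]
    for i, ch in enumerate(plaintext):
        columns[i % n] += ch
    order = sorted(range(n), key=lambda c: labels[c])
    return [columns[c] for c in order]

def transpose_decrypt(ciphertext, labels):
    """Cut the ciphertext back into columns and read the grid row by row."""
    n = len(labels)
    total = len(ciphertext)
    rows, extra = divmod(total, n)  # the first `extra` columns hold one more letter
    order = sorted(range(n), key=lambda c: labels[c])
    columns = [""] * n
    pos = 0
    for c in order:
        length = rows + (1 if c < extra else 0)
        columns[c] = ciphertext[pos:pos + length]
        pos += length
    return "".join(columns[i % n][i // n] for i in range(total))

labels = [4, 3, 6, 5, 2, 1]
blocks = transpose_encrypt("CRYPTOGRAPHYINTERNETSECURITY", labels)
print(" ".join(blocks))  # OYNU THRC RRNTI CGIER PPEEY YATST
print(transpose_decrypt("".join(blocks), labels))
```

Decryption needs the column lengths, which follow from the message length and the number of columns.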
|
|
||||||
### Exclusive OR (XOR)
|
### Exclusive OR (XOR)
|
||||||
|
|
||||||
- A bitwise operation $x \oplus y = x + y \pmod 2$.
|
- A bitwise operation $x \oplus y = x + y \pmod 2$.
|
||||||
@@ -130,8 +175,8 @@ This scheme is vulnerable to known plaintext attack, since the equation can be s
|
|||||||
|
|
||||||
$$
|
$$
|
||||||
\begin{align*}
|
\begin{align*}
|
||||||
\mathrm{Pr}[C = 0] &= \mathrm{Pr}[M = 0 \land K = 0] + \mathrm{Pr}[M = 1 \land K = 1] \\ &= \mathrm{Pr}[M = 0] \cdot \mathrm{Pr}[K = 0] + \mathrm{Pr}[M = 1] \cdot \mathrm{Pr}[K = 1] \\
|
\Pr[C = 0] &= \Pr[M = 0 \land K = 0] + \Pr[M = 1 \land K = 1] \\ &= \Pr[M = 0] \cdot \Pr[K = 0] + \Pr[M = 1] \cdot \Pr[K = 1] \\
|
||||||
&= \frac{1}{2}\left(\mathrm{Pr}[M = 0] + \mathrm{Pr}[M = 1]\right) \\
|
&= \frac{1}{2}\left(\Pr[M = 0] + \Pr[M = 1]\right) \\
|
||||||
&= \frac{1}{2}.
|
&= \frac{1}{2}.
|
||||||
\end{align*}
|
\end{align*}
|
||||||
$$
|
$$
|
||||||
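The computation can be checked with exact arithmetic. The sketch below assumes, as in the derivation, that $K$ is uniform and independent of $M$; `prob_c_zero` is a name introduced only for this check.

```python
from fractions import Fraction

def prob_c_zero(p_m0):
    """Pr[C = 0] for C = M xor K, with K a uniform bit independent of M."""
    p_k0 = p_k1 = Fraction(1, 2)
    p_m1 = 1 - p_m0
    # C = 0 exactly when M = K.
    return p_m0 * p_k0 + p_m1 * p_k1

for p in (Fraction(0), Fraction(1, 10), Fraction(1, 2), Fraction(9, 10), Fraction(1)):
    print(p, prob_c_zero(p))
```

Whatever the bias of $M$, the result is $1/2$, so the ciphertext bit alone says nothing about the message bit.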
@@ -140,20 +185,20 @@ The case for $C = 1$ is similar.
|
|||||||
|
|
||||||
### One-Time Pad (OTP)
|
### One-Time Pad (OTP)
|
||||||
|
|
||||||
Omitted.
|
[1. OTP, Stream Ciphers and PRGs > One-Time Pad (OTP)](../../modern-cryptography/2023-09-07-otp-stream-cipher-prgs#one-time-pad-otp)
|
||||||
|
|
||||||
## Perfect Secrecy
|
## Perfect Secrecy
|
||||||
|
|
||||||
> **Definition.** Let $(E, D)$ be a cipher defined over $(\mathcal{K}, \mathcal{M}, \mathcal{C})$. We assume that $\lvert \mathcal{K} \rvert = \lvert \mathcal{M} \rvert = \lvert \mathcal{C} \rvert$. The cipher is **perfectly secure** if for all $m \in \mathcal{M}$ and $c \in \mathcal{C}$,
|
> **Definition.** Let $(E, D)$ be a cipher defined over $(\mathcal{K}, \mathcal{M}, \mathcal{C})$. We assume that $\lvert \mathcal{K} \rvert = \lvert \mathcal{M} \rvert = \lvert \mathcal{C} \rvert$. The cipher is **perfectly secure** if for all $m \in \mathcal{M}$ and $c \in \mathcal{C}$,
|
||||||
>
|
>
|
||||||
> $$
|
> $$
|
||||||
> \mathrm{Pr}[\mathcal{M} = m \mid \mathcal{C} = c] = \mathrm{Pr}[\mathcal{M} = m].
|
> \Pr[\mathcal{M} = m \mid \mathcal{C} = c] = \Pr[\mathcal{M} = m].
|
||||||
> $$
|
> $$
|
||||||
>
|
>
|
||||||
> Or equivalently, for all $m_0, m_1 \in \mathcal{M}$, $c \in \mathcal{C}$,
|
> Or equivalently, for all $m_0, m_1 \in \mathcal{M}$, $c \in \mathcal{C}$,
|
||||||
>
|
>
|
||||||
> $$
|
> $$
|
||||||
> \mathrm{Pr}[E(k, m _ 0) = c] = \mathrm{Pr}[E(k, m _ 1) = c]
|
> \Pr[E(k, m _ 0) = c] = \Pr[E(k, m _ 1) = c]
|
||||||
> $$
|
> $$
|
||||||
>
|
>
|
||||||
> where $k$ is chosen uniformly in $\mathcal{K}$.
|
> where $k$ is chosen uniformly in $\mathcal{K}$.
|
||||||
@@ -163,7 +208,7 @@ In other words, the adversary learns nothing from the ciphertext.
|
|||||||
With this definition, we can show that **OTP is perfectly secure**. For all $m \in \mathcal{M}$ and $c \in \mathcal{C}$,
|
With this definition, we can show that **OTP is perfectly secure**. For all $m \in \mathcal{M}$ and $c \in \mathcal{C}$,
|
||||||
|
|
||||||
$$
|
$$
|
||||||
\mathrm{Pr}[E(k, m) = c] = \frac{1}{\lvert \mathcal{K} \rvert}
|
\Pr[E(k, m) = c] = \frac{1}{\lvert \mathcal{K} \rvert}
|
||||||
$$
|
$$
|
||||||
|
|
||||||
since for each $m$ and $c$, $k$ is determined uniquely.
|
since for each $m$ and $c$, $k$ is determined uniquely.
|
||||||
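For a small message space this can be verified exhaustively. The sketch below assumes a $3$-bit OTP with $E(k, m) = k \oplus m$; `N_BITS` and `prob_ciphertext` are names introduced for this illustration.

```python
from fractions import Fraction

N_BITS = 3
space = range(2 ** N_BITS)  # K = M = C = {0, ..., 7}

def prob_ciphertext(m, c):
    """Pr[E(k, m) = c] over a uniformly chosen key k."""
    hits = sum(1 for k in space if m ^ k == c)
    return Fraction(hits, len(space))

# Exactly one key maps each m to each c, so every probability is 1/|K|.
table = {(m, c): prob_ciphertext(m, c) for m in space for c in space}
print(set(table.values()))
```

Since the probability does not depend on $m$, the second form of the definition holds for every pair $m_0, m_1$.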
@@ -278,3 +323,5 @@ Given a bit string (defined in the specification), the sender performs long divi
|
|||||||
- $c \oplus (x \parallel \mathrm{CRC}(x)) = k_s \oplus (m\oplus x \parallel \mathrm{CRC}(m\oplus x))$
|
- $c \oplus (x \parallel \mathrm{CRC}(x)) = k_s \oplus (m\oplus x \parallel \mathrm{CRC}(m\oplus x))$
|
||||||
- The receiver will decrypt and get $(m\oplus x \parallel \mathrm{CRC}(m\oplus x))$.
|
- The receiver will decrypt and get $(m\oplus x \parallel \mathrm{CRC}(m\oplus x))$.
|
||||||
- CRC check by the receiver will succeed.
|
- CRC check by the receiver will succeed.
|
||||||
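The bit-flipping attack can be sketched as follows. This is a toy setting, not the real protocol: it assumes an $8$-bit CRC with zero initial value and no final XOR (so that $\mathrm{CRC}(m \oplus x) = \mathrm{CRC}(m) \oplus \mathrm{CRC}(x)$ holds exactly for equal-length inputs), a random keystream standing in for $k_s$, and an attacker who knows or guesses $m$ in order to choose $x$.

```python
import secrets

def crc8(data, poly=0x07):
    """Toy CRC-8 with zero init and no final XOR, hence linear over GF(2)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFF if crc & 0x80 else (crc << 1) & 0xFF
    return crc

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

# Sender: c = k_s xor (m || CRC(m))
m = b"PAY 100 TO BOB"
frame = m + bytes([crc8(m)])
ks = secrets.token_bytes(len(frame))  # stand-in for the stream cipher keystream
c = xor(frame, ks)

# Attacker: XOR (x || CRC(x)) into the ciphertext without knowing k_s.
target = b"PAY 900 TO EVE"
x = xor(m, target)
forged = xor(c, x + bytes([crc8(x)]))

# Receiver: decrypts and checks the CRC.
received = xor(forged, ks)
body, tag = received[:-1], received[-1]
print(body, crc8(body) == tag)
```

The receiver ends up with the attacker's message and a matching checksum, which is why a linear checksum under a stream cipher gives no integrity.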
|
|
||||||
|
[^1]: This assumption will be removed when we learn public key cryptography.
|
||||||
|
|||||||
@@ -28,8 +28,8 @@ attachment:
|
|||||||
### Modules
|
### Modules
|
||||||
|
|
||||||
- **S-box**: a substitution module
|
- **S-box**: a substitution module
|
||||||
- Usually for confusion
|
- Usually for confusion, also gives diffusion
|
||||||
- $m \times n$ lookup box is needed, since it should be invertible
|
- $m \times n$ lookup box is used for implementation
|
||||||
- **P-box**: a permutation module
|
- **P-box**: a permutation module
|
||||||
- Usually for diffusion
|
- Usually for diffusion
|
||||||
- Compared to the number of input bits,
|
- Compared to the number of input bits,
|
||||||
@@ -42,28 +42,28 @@ attachment:
|
|||||||
- Standardized in 1977.
|
- Standardized in 1977.
|
||||||
- Block size is $64$ bits ($8$ bytes)
|
- Block size is $64$ bits ($8$ bytes)
|
||||||
- $64$ bits input $\rightarrow$ $64$ bits output
|
- $64$ bits input $\rightarrow$ $64$ bits output
|
||||||
- Key is $56$ bits, but there are $8$ bits representing parity, so total of $64$ bits
|
- Key is $56$ bits, and every $8$th bit is a parity bit.
|
||||||
- Every $8$th bit is a parity bit
|
- Thus $64$ bits in total
|
||||||
|
|
||||||
### Encryption
|
### Encryption
|
||||||
|
|
||||||
1. From the $56$-bit key, generate $16$ different $48$ bit keys $k_1, \dots, k_{16}$.
|
1. From the $56$-bit key, generate $16$ different $48$ bit keys $k_1, \dots, k_{16}$.
|
||||||
2. The plaintext message goes through the P-box.
|
2. The plaintext message goes through an initial permutation.
|
||||||
3. The output goes through $16$ rounds, and in the round $i$, key $k_i$ is used.
|
3. The output goes through $16$ rounds, and key $k_i$ is used in round $i$.
|
||||||
4. After $16$ rounds, split the output into two $32$ bit halves and swap them.
|
4. After $16$ rounds, split the output into two $32$ bit halves and swap them.
|
||||||
5. The output goes through the inverse of the P-box from Step 1.
|
5. The output goes through the inverse of the initial permutation from Step 2.
|
||||||
|
|
||||||
Let $L_{i-1} \parallel R_{i-1}$ be the output of round $i-1$, where $L_{i-1}$ and $R_{i-1}$ are $32$ bit halves. Also let $f$ be the Feistel function.
|
Let $L_{i-1} \parallel R_{i-1}$ be the output of round $i-1$, where $L_{i-1}$ and $R_{i-1}$ are $32$ bit halves. Also let $f$ be the Feistel function.[^1]
|
||||||
|
|
||||||
In each round $i$,
|
In each round $i$, the following operation is performed:
|
||||||
|
|
||||||
$$
|
$$
|
||||||
L_i = R_{i - 1}, \qquad R_i = L_{i-1} \oplus f(k_i, R_{i-1})
|
L_i = R_{i - 1}, \qquad R_i = L_{i-1} \oplus f(k_i, R_{i-1}).
|
||||||
$$
|
$$
|
||||||
|
|
||||||
#### The Feistel Function
|
#### The Feistel Function
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
The Feistel function takes $32$ bit data and divides it into eight $4$ bit chunks. Each chunk is expanded to $6$ bits using a P-box. Now, we have $48$ bits of data, so apply XOR with the key for this round. Next, each $6$-bit block is compressed back to $4$ bits using an S-box. Finally, there is a (straight) permutation at the end, resulting in $32$ bit data.
|
The Feistel function takes $32$ bit data and divides it into eight $4$ bit chunks. Each chunk is expanded to $6$ bits using a P-box. Now, we have $48$ bits of data, so apply XOR with the key for this round. Next, each $6$-bit block is compressed back to $4$ bits using an S-box. Finally, there is a (straight) permutation at the end, resulting in $32$ bit data.
|
||||||
|
|
||||||
@@ -108,10 +108,10 @@ Thus $F$ and $G$ are inverses of each other, thus $f$ doesn't have to be inverti
|
|||||||
Also, note that
|
Also, note that
|
||||||
|
|
||||||
$$
|
$$
|
||||||
G(L_i \parallel R_i) = F(L_i \oplus f(R_i) \parallel R_i),
|
G(L_i \parallel R_i) = F(L_i \oplus f(R_i) \parallel R_i).
|
||||||
$$
|
$$
|
||||||
|
|
||||||
so evaluating the decryption round is actually equivalent to running the encryption round with upper/lower $32$ bit halves swapped. Hence the reason for swapping each $32$ bit halves.
|
Notice that evaluating $G$ is equivalent to evaluating $F$ on an encrypted block with its upper/lower $32$ bit halves swapped. We get $L_i \oplus f(R_i) \parallel R_i$ exactly when we swap the halves of $F(L_i \parallel R_i)$. Thus, we can use the same hardware for encryption and decryption, which is the reason for swapping the two $32$ bit halves.
|
||||||
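The following toy Feistel network illustrates the point. It is a sketch under loud assumptions: the round function `f` is a truncated SHA-256 (deliberately non-invertible), the round keys are arbitrary integers, and the initial/final permutations and key schedule of DES are omitted. None of this is actual DES.

```python
import hashlib

MASK32 = 0xFFFFFFFF

def f(key, half):
    """Stand-in round function (assumption): 32 bits of a hash, not invertible."""
    data = key.to_bytes(8, "big") + half.to_bytes(4, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")

def feistel(block, round_keys):
    """Run the rounds L_i = R_{i-1}, R_i = L_{i-1} xor f(k_i, R_{i-1}), then swap."""
    left, right = block >> 32, block & MASK32
    for k in round_keys:
        left, right = right, left ^ f(k, right)
    return (right << 32) | left  # final swap of the two 32-bit halves

keys = list(range(1, 17))            # 16 hypothetical round keys
plaintext = 0x0123456789ABCDEF
ciphertext = feistel(plaintext, keys)

# Decryption is the same routine with the round keys in reverse order.
recovered = feistel(ciphertext, keys[::-1])
print(hex(ciphertext), recovered == plaintext)
```

Because of the final swap, one routine serves both directions; only the order of the round keys changes.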
|
|
||||||
## Advanced Encryption Standard (AES)
|
## Advanced Encryption Standard (AES)
|
||||||
|
|
||||||
@@ -130,7 +130,7 @@ Each round consists of the following:
|
|||||||
- **AddRoundKey**: XOR with round key
|
- **AddRoundKey**: XOR with round key
|
||||||
|
|
||||||
The first and last rounds are a little different.
|
The first and last rounds are a little different.
|
||||||
- Before the first round, AddRoundKey is done.
|
- AddRoundKey is done before the first round.
|
||||||
- The last round does not have MixColumns.
|
- The last round does not have MixColumns.
|
||||||
|
|
||||||
The objectives of AES:
|
The objectives of AES:
|
||||||
@@ -138,7 +138,7 @@ The objectives of AES:
|
|||||||
- Code must be compact, and should run fast on many CPUs
|
- Code must be compact, and should run fast on many CPUs
|
||||||
- Design must be simple
|
- Design must be simple
|
||||||
|
|
||||||
### Modules
|
### Layers
|
||||||
|
|
||||||
#### SubBytes
|
#### SubBytes
|
||||||
|
|
||||||
@@ -157,7 +157,7 @@ The objectives of AES:
|
|||||||
- For each column, each byte is replaced by a value
|
- For each column, each byte is replaced by a value
|
||||||
- The value depends on all 4 bytes of the column
|
- The value depends on all 4 bytes of the column
|
||||||
- Each column is processed separately
|
- Each column is processed separately
|
||||||
- Thus effectively, it is a matrix multiplication (Hill cipher)
|
- Thus effectively, it is a matrix multiplication (Hill cipher).[^2]
|
||||||
|
|
||||||
#### AddRoundKey
|
#### AddRoundKey
|
||||||
|
|
||||||
@@ -171,7 +171,7 @@ These 4 modules are all invertible!
|
|||||||
- Why is there an AddRoundKey at the beginning?
|
- Why is there an AddRoundKey at the beginning?
|
||||||
- Why is the last round different?
|
- Why is the last round different?
|
||||||
|
|
||||||
Both are for engineering purposes, to make the encryption and decryption process the same. (Check!)
|
Both are for engineering purposes, to make the encryption and decryption process the same.[^3]
|
||||||
|
|
||||||
## Modes of Operations
|
## Modes of Operations
|
||||||
|
|
||||||
@@ -179,14 +179,14 @@ AES, DES use fixed block size for encryption. How do we encrypt longer messages?
|
|||||||
|
|
||||||
### Electronic Codebook Mode (ECB)
|
### Electronic Codebook Mode (ECB)
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
- Codebook is a mapping table.
|
- Codebook is a mapping table.
|
||||||
- For the $i$-th plaintext block, we use key $k$ to encrypt and obtain the $i$-th ciphertext block.
|
- For the $i$-th plaintext block, we use key $k$ to encrypt and obtain the $i$-th ciphertext block.
|
||||||
- **Uses the same key for all blocks**
|
- **Uses the same key for all blocks**
|
||||||
- Adjacent blocks are independent of each other.
|
- Adjacent blocks are independent of each other.
|
||||||
- Advantages
|
- Advantages
|
||||||
- Good when run in parallel
|
- Fast when run in parallel
|
||||||
- Limitations
|
- Limitations
|
||||||
- Repetitions in messages (if aligned with the block) may lead to repetitions in the ciphertext
|
- Repetitions in messages (if aligned with the block) may lead to repetitions in the ciphertext
|
||||||
- Susceptible to *cut-and-paste attacks*
|
- Susceptible to *cut-and-paste attacks*
|
||||||
@@ -198,7 +198,7 @@ Since the same key is used for all blocks, once a mapping from plaintext to ciph
|
|||||||
|
|
||||||
### Cipher Block Chaining Mode (CBC)
|
### Cipher Block Chaining Mode (CBC)
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
- Two identical messages produce different ciphertexts.
|
- Two identical messages produce different ciphertexts.
|
||||||
- This prevents chosen plaintext attacks
|
- This prevents chosen plaintext attacks
|
||||||
@@ -234,8 +234,8 @@ Since the same key is used for all blocks, once a mapping from plaintext to ciph
|
|||||||
- If the IV is the same, then the encryption of the same plaintext is the same.
|
- If the IV is the same, then the encryption of the same plaintext is the same.
|
||||||
- Thus IVs should be random.
|
- Thus IVs should be random.
|
||||||
- IVs are not required to be secret, but
|
- IVs are not required to be secret, but
|
||||||
- No IVs should be reused under the same key
|
- **No IVs should be reused under the same key**
|
||||||
- IV changes should be unpredictable
|
- **IV changes should be unpredictable**
|
||||||
- On IV reuse, same message will generate the same ciphertext if key isn't changed
|
- On IV reuse, same message will generate the same ciphertext if key isn't changed
|
||||||
- If IV is predictable, CBC is vulnerable to chosen plaintext attacks.
|
- If IV is predictable, CBC is vulnerable to chosen plaintext attacks.
|
||||||
- Define Eve's new message $m' = \mathrm{IV} _ {\mathrm{E}} \oplus \mathrm{IV} _ {\mathrm{A}} \oplus g$, where
|
- Define Eve's new message $m' = \mathrm{IV} _ {\mathrm{E}} \oplus \mathrm{IV} _ {\mathrm{A}} \oplus g$, where
|
||||||
@@ -248,12 +248,12 @@ Since the same key is used for all blocks, once a mapping from plaintext to ciph
|
|||||||
|
|
||||||
### Cipher Feedback Mode (CFB)
|
### Cipher Feedback Mode (CFB)
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
- The message is treated as a stream of bits; similar to stream cipher
|
- The message is treated as a stream of bits; similar to stream cipher
|
||||||
- **Result of the encryption is fed to the next stage.**
|
- **Result of the encryption is fed to the next stage.**
|
||||||
- Standard allows any number of bits to be fed to the next stage
|
- Standard allows any number of bits to be fed to the next stage
|
||||||
- It is most efficient to use all $64$ bits (CFB-64)
|
- It is most efficient to use all bits.
|
||||||
- Initialization vector is used.
|
- Initialization vector is used.
|
||||||
- Same requirements on the IV as CBC mode.
|
- Same requirements on the IV as CBC mode.
|
||||||
- Should be randomized, and should not be predictable.
|
- Should be randomized, and should not be predictable.
|
||||||
@@ -277,13 +277,13 @@ Since the same key is used for all blocks, once a mapping from plaintext to ciph
|
|||||||
- CFB mode is self-recovering.
|
- CFB mode is self-recovering.
|
||||||
- 1 bit error in the ciphertext corrupts some number of blocks.
|
- 1 bit error in the ciphertext corrupts some number of blocks.
|
||||||
- Bit errors in the ciphertext will cause bit errors at the same position.
|
- Bit errors in the ciphertext will cause bit errors at the same position.
|
||||||
- Since this ciphertext is fed to the next block, the error is propagated
|
- Since this ciphertext is fed to the next block, the error is propagated.
|
||||||
- Some implementations (like CFB-8) use shift registers, so errors will be propagated as long as the erroneous bit is in the shift register.
|
- Some implementations (like CFB-8) use shift registers, so errors will be propagated as long as the erroneous bit is in the shift register.
|
||||||
- If the error is removed from the shift register, it automatically recovers.
|
- If the error is removed from the shift register, it automatically recovers.
|
||||||
|
|
||||||
### Output Feedback Mode (OFB)
|
### Output Feedback Mode (OFB)
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
- Very similar to stream cipher.
|
- Very similar to stream cipher.
|
||||||
- Initialization vector is used as a seed to generate the key stream.
|
- Initialization vector is used as a seed to generate the key stream.
|
||||||
@@ -316,14 +316,39 @@ Since the same key is used for all blocks, once a mapping from plaintext to ciph
|
|||||||
|
|
||||||
### Counter Mode (CTR)
|
### Counter Mode (CTR)
|
||||||
|
|
||||||

|

|
||||||
|
|
||||||
- Without chaining, we use a counter (typically incremented by $1$).
|
- Without chaining, we use a counter (typically incremented by $1$).
|
||||||
- Counter starts from the initialization vector.
|
- Counter starts from the initialization vector.
|
||||||
- Highly parallelizable.
|
- Highly parallelizable.
|
||||||
- Can decrypt from any arbitrary position.
|
- Can decrypt from any arbitrary position.
|
||||||
- Counter should not be repeated for the same key.
|
- Counter should not be repeated for the same key.
|
||||||
|
- Suppose that the same counter $ctr$ is used for encrypting $m_0$ and $m_1$.
|
||||||
|
- Encryption results are: $(ctr, E(k, ctr) \oplus m_0), (ctr, E(k, ctr) \oplus m_1)$.
|
||||||
|
- Then the attacker can obtain $m_0 \oplus m_1$.
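The counter-reuse leak above can be sketched in a few lines. This is only an illustration: the `keystream` bytes stand in for $E(k, ctr)$ and are made up, and `xor_bytes` is a hypothetical helper, not part of any library.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// XORs two byte strings of equal length (assumed: a.size() == b.size()).
// In CTR mode, encryption is exactly this XOR with the keystream E(k, ctr).
Bytes xor_bytes(const Bytes& a, const Bytes& b) {
    Bytes out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = static_cast<std::uint8_t>(a[i] ^ b[i]);
    }
    return out;
}
```

If `c0 = xor_bytes(m0, keystream)` and `c1 = xor_bytes(m1, keystream)` share the same keystream, then `xor_bytes(c0, c1)` equals `xor_bytes(m0, m1)`: the keystream cancels and the attacker never needs the key.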
|
||||||
|
|
||||||
|
## Modes of Operations Summary
|
||||||
|
|
||||||
|
|Criteria\Modes|ECB|CBC|CFB|OFB|CTR|
|
||||||
|
|:-:|:-:|:-:|:-:|:-:|:-:|
|
||||||
|
|IV|-|Yes|Yes|Yes|Counter|
|
||||||
|
|Encryption Parallelizable|Yes|No|No|Yes\*|Yes|
|
||||||
|
|Decryption Parallelizable|Yes|Yes|Yes|Yes\*|Yes|
|
||||||
|
|Random Read Access|Yes|Yes|Yes|No|Yes|
|
||||||
|
|Self-Recovering|-|Yes|Yes|-|-|
|
||||||
|
|
||||||
|
- OFB is parallelizable only if the keystream is generated in advance.
|
||||||
|
- We don't have to consider self-recovery if the ciphertext is not fed into the encryption of the next block.
|
||||||
|
- Errors in the ciphertext are not propagated to other blocks for ECB, OFB and CTR.
|
||||||
|
- **Random read access**
|
||||||
|
- Suppose that a part of the plaintext changes.
|
||||||
|
- In OFB, the *whole* keystream must be recalculated to fix the ciphertext.
|
||||||
|
  - But for other modes, only a part of the ciphertext needs to be changed, using the information from the previous block if necessary.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
Images are from [Wikipedia](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation).
|
Images are from [Wikipedia](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation).
|
||||||
|
|
||||||
|
[^1]: Some people call this function the *mangler* function.
|
||||||
|
[^2]: Over the finite field $\mathrm{GF}(2^8)$.
|
||||||
|
[^3]: See also a helpful [question](https://crypto.stackexchange.com/questions/1346/why-is-mixcolumns-omitted-from-the-last-round-of-aes) on cryptography SE.
|
||||||
|
|||||||
@@ -0,0 +1,277 @@
|
|||||||
|
---
|
||||||
|
share: true
|
||||||
|
toc: true
|
||||||
|
math: true
|
||||||
|
categories:
|
||||||
|
- Lecture Notes
|
||||||
|
- Internet Security
|
||||||
|
tags:
|
||||||
|
- lecture-note
|
||||||
|
- security
|
||||||
|
- cryptography
|
||||||
|
- number-theory
|
||||||
|
title: 05. Modular Arithmetic (2)
|
||||||
|
date: 2023-10-04
|
||||||
|
github_title: 2023-10-04-modular-arithmetic-2
|
||||||
|
---
|
||||||
|
|
||||||
|
## Exponentiation by Squaring
|
||||||
|
|
||||||
|
Suppose we want to calculate $a^n$ where $n$ is very large, like $n \approx 2^{1000}$. A naive approach would take $\mathcal{O}(n)$ multiplications. We will ignore integer overflow for simplicity.
|
||||||
|
|
||||||
|
```c
|
||||||
|
int naive_exponentiation(int a, int n) {
|
||||||
|
int result = 1;
|
||||||
|
for (int i = 0; i < n; ++i) {
|
||||||
|
result *= a;
|
||||||
|
}
|
||||||
|
return result;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
Using the above implementation, computing $3^{2^{63} - 1}$ takes almost forever...
|
||||||
|
|
||||||
|
Instead, we use the **exponentiation by squaring** method. Notice the following.
|
||||||
|
|
||||||
|
$$
|
||||||
|
a^n = \begin{cases}
|
||||||
|
(a^2)^{\frac{n}{2}} & (n \text{ is even})\\
|
||||||
|
a \cdot (a^2)^{\frac{n-1}{2}} & (n \text{ is odd})
|
||||||
|
\end{cases}.
|
||||||
|
$$
|
||||||
|
|
||||||
|
Therefore, the exponent is halved at every step, at the cost of at most two multiplications. Here is the implementation. The base cases are handled separately.
|
||||||
|
|
||||||
|
```c
|
||||||
|
int exponentiation_by_squaring(int a, int n) {
|
||||||
|
if (n == 0) {
|
||||||
|
return 1;
|
||||||
|
} else if (n == 1) {
|
||||||
|
return a;
|
||||||
|
}
|
||||||
|
|
||||||
|
int result = 1;
|
||||||
|
if (n % 2 == 0) {
|
||||||
|
return exponentiation_by_squaring(a * a, n / 2);
|
||||||
|
} else {
|
||||||
|
return a * exponentiation_by_squaring(a * a, (n - 1) / 2);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The above code executes about $\mathcal{O}(\log n)$ multiplications. Now we can actually get an answer for $3^{2^{63} - 1}$.
|
||||||
|
|
||||||
|
Alternatively, here is an iterative version of the above for those who want to save some memory.
|
||||||
|
|
||||||
|
```c
|
||||||
|
int exponentiation_by_squaring_iterative(int a, int n) {
|
||||||
|
int result = 1;
|
||||||
|
int base = a, exponent = n;
|
||||||
|
while (exponent > 0) {
|
||||||
|
        if (exponent % 2 == 1) {
|
||||||
|
result *= base;
|
||||||
|
}
|
||||||
|
|
||||||
|
base *= base;
|
||||||
|
exponent /= 2;
|
||||||
|
}
|
||||||
|
return result;
|
||||||
|
}
|
||||||
|
```
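The snippets above ignore overflow. In cryptography the result is almost always wanted modulo some $m$, so a modular variant may be a more practical sketch. The name `mod_pow` and the restriction to moduli below $2^{32}$ are assumptions made here so that every product fits in 64 bits.

```cpp
#include <cstdint>

// Computes base^exponent mod modulus by repeated squaring.
// Assumption: 1 <= modulus < 2^32, so no intermediate product overflows.
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exponent, std::uint64_t modulus) {
    std::uint64_t result = 1 % modulus;  // equals 0 when modulus == 1
    base %= modulus;
    while (exponent > 0) {
        if (exponent % 2 == 1) {
            result = result * base % modulus;  // use the current bit of the exponent
        }
        base = base * base % modulus;  // square for the next bit
        exponent /= 2;
    }
    return result;
}
```

With this version, a call such as `mod_pow(3, (1ULL << 63) - 1, 1000000007)` needs only 63 loop iterations, one per bit of the exponent.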
|
||||||
|
|
||||||
|
For even better (maybe faster) results, we need the help of elementary number theory.
|
||||||
|
|
||||||
|
## Fermat's Little Theorem
|
||||||
|
|
||||||
|
> **Theorem.** Let $p$ be prime. For $a \in \mathbb{Z}$ such that $\gcd(a, p) = 1$,
|
||||||
|
>
|
||||||
|
> $$
|
||||||
|
> a^{p-1} \equiv 1 \pmod p.
|
||||||
|
> $$
|
||||||
|
|
||||||
|
*Proof*. (Using group theory) The statement can be rewritten as follows. For $a \neq 0$ in $\mathbb{Z}_p$, $a^{p-1} = 1$ in $\mathbb{Z}_p$. Since $\mathbb{Z}_p^*$ is a (multiplicative) group of order $p-1$, the order of $a$ should divide $p-1$. Therefore, $a^{p-1} = 1$ in $\mathbb{Z}_p$.
|
||||||
|
|
||||||
|
Here is an elementary proof not using group theory.
|
||||||
|
|
||||||
|
*Proof*. (Elementary) Let $S = \left\lbrace 1, 2, \dots, p-1 \right\rbrace$. Consider a map $f : S \rightarrow S$ defined as $x \mapsto ax \bmod p$. This is well defined: since $p$ divides neither $a$ nor $x$, we have $ax \bmod p \neq 0$.
|
||||||
|
|
||||||
|
We will show that $f$ is injective. Suppose that $ax \equiv ay \pmod p$ for some $x, y \in S$. Since $\gcd(a, p) = 1$, $a$ has a multiplicative inverse modulo $p$, thus $x \equiv y \pmod p$. Since the elements of $S$ are pairwise distinct modulo $p$, we get $x = y$.
|
||||||
|
|
||||||
|
By injectivity, $f(i)$ are distinct for all $i \in S$, so $f$ is a permutation on $S$. Therefore, the product of all elements of $S$ must be equal to the product of all $f(i)$ for $i \in S$.
|
||||||
|
|
||||||
|
$$
|
||||||
|
(p-1)! \equiv f(1)f(2)\cdots f(p-1) \equiv a^{p-1} \cdot (p-1)!\pmod p.
|
||||||
|
$$
|
||||||
|
|
||||||
|
Since $\gcd(i, p) = 1$ for all $i \in S$, we can multiply the multiplicative inverse for all $i \in S$ and we get $a^{p-1} \equiv 1 \pmod p$.
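As a concrete instance of the permutation argument (the values are chosen here purely for illustration), take $p = 7$ and $a = 3$. The map $x \mapsto 3x \bmod 7$ sends $1, 2, 3, 4, 5, 6$ to $3, 6, 2, 5, 1, 4$, which is a rearrangement of the same six residues. Multiplying both lists gives

$$
6! \equiv 3^6 \cdot 6! \pmod 7,
$$

and cancelling $6!$ leaves $3^6 \equiv 1 \pmod 7$. Indeed, $3^6 = 729 = 104 \cdot 7 + 1$.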
|
||||||
|
|
||||||
|
## Euler's Totient Function
|
||||||
|
|
||||||
|
For composite modulus, we have Euler's generalization. Before proving the theorem, we first need to define Euler's totient function.
|
||||||
|
|
||||||
|
> **Definition.** Let $n \in \mathbb{N}$. Define $\phi(n)$ as the number of positive integers $k \leq n$ such that $\gcd(n, k) = 1$.
|
||||||
|
|
||||||
|
For direct calculation, we use the following formula.
|
||||||
|
|
||||||
|
> **Lemma.** For $n \in \mathbb{N}$, the following holds.
|
||||||
|
>
|
||||||
|
> $$
|
||||||
|
> \phi(n) = n \cdot \prod_{p \mid n} \left( 1 - \frac{1}{p} \right)
|
||||||
|
> $$
|
||||||
|
>
|
||||||
|
> where $p$ is a prime number dividing $n$.
|
||||||
|
|
||||||
|
So to calculate $\phi(n)$, we need to **factorize** $n$. From the formula above, we have some corollaries.
|
||||||
|
|
||||||
|
> **Corollary.** For prime numbers $p, q$ and $k \in \mathbb{N}$, the following hold.
|
||||||
|
> 1. $\phi(p) = p - 1$.
|
||||||
|
> 2. $\phi(pq) = (p-1)(q-1)$.
|
||||||
|
> 3. $\phi(p^k) = p^{k-1}(p-1)$.
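The product formula translates directly into code once $n$ is factored. Below is a minimal sketch using trial division, which is only reasonable for small $n$; the function name `euler_phi` is an assumption of this example.

```cpp
#include <cstdint>

// Computes Euler's totient via phi(n) = n * prod_{p | n} (1 - 1/p),
// finding the prime factors of n by trial division (fine for small n only).
std::uint64_t euler_phi(std::uint64_t n) {
    std::uint64_t result = n;
    for (std::uint64_t p = 2; p * p <= n; ++p) {
        if (n % p == 0) {
            while (n % p == 0) {
                n /= p;  // strip every copy of the prime p
            }
            result -= result / p;  // multiply result by (1 - 1/p)
        }
    }
    if (n > 1) {
        result -= result / n;  // one prime factor larger than sqrt(n) remains
    }
    return result;
}
```

For example, `euler_phi(3233)` returns $3120 = 60 \cdot 52$, matching $\phi(pq) = (p-1)(q-1)$ for $p = 61$, $q = 53$.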
|
||||||
|
|
||||||
|
### Reduced Set of Residues
|
||||||
|
|
||||||
|
Let $n \in \mathbb{N}$. The **complete set of residues** was denoted $\mathbb{Z}_n$ and
|
||||||
|
|
||||||
|
$$
|
||||||
|
\mathbb{Z}_n = \left\lbrace 0, 1, \dots, n-1 \right\rbrace.
|
||||||
|
$$
|
||||||
|
|
||||||
|
We also often use the **reduced set of residues**.
|
||||||
|
|
||||||
|
> **Definition.** The **reduced set of residues** is the set of residues that are relatively prime to $n$. We denote this set as $\mathbb{Z}_n^*$.
|
||||||
|
>
|
||||||
|
> $$
|
||||||
|
> \mathbb{Z}_n^* = \left\lbrace a \in \mathbb{Z}_n \setminus \left\lbrace 0 \right\rbrace : \gcd(a, n) = 1 \right\rbrace.
|
||||||
|
> $$
|
||||||
|
|
||||||
|
Then by definition, we have the following result.
|
||||||
|
|
||||||
|
> **Lemma.** $\left\lvert \mathbb{Z}_n^* \right\rvert = \phi(n)$.
|
||||||
|
|
||||||
|
We can also show that $\mathbb{Z}_n^*$ is a multiplicative group.
|
||||||
|
|
||||||
|
> **Lemma.** $\mathbb{Z}_n^*$ is a multiplicative group.
|
||||||
|
|
||||||
|
*Proof*. Let $a, b \in \mathbb{Z}_n^{ * }$. We must check if $ab \in \mathbb{Z}_n^{ * }$. Since $\gcd(a, n) = \gcd(b, n) = 1$, $\gcd(ab, n) = 1$. This is because if $d = \gcd(ab, n) > 1$, then a prime factor $p$ of $d$ must divide $a$ or $b$ and also $n$. Then $\gcd(a, n) \geq p$ or $\gcd(b, n) \geq p$, which is a contradiction. Thus $ab \in \mathbb{Z}_n^{ * }$.
|
||||||
|
|
||||||
|
Associativity holds trivially, as a subset of $\mathbb{Z}_n$. We also have an identity element $1$, and inverse of $a \in \mathbb{Z}_n^*$ exists since $\gcd(a, n) = 1$.
|
||||||
|
|
||||||
|
Now we can prove Euler's generalization.
|
||||||
|
|
||||||
|
## Euler's Generalization
|
||||||
|
|
||||||
|
> **Theorem.** Let $a \in \mathbb{Z}$ such that $\gcd(a, n) = 1$. Then
|
||||||
|
>
|
||||||
|
> $$
|
||||||
|
> a^{\phi(n)} \equiv 1 \pmod n.
|
||||||
|
> $$
|
||||||
|
|
||||||
|
*Proof*. Since $\gcd(a, n) = 1$, $a \in \mathbb{Z}_n^{ * }$. Then $a^{\left\lvert \mathbb{Z}_n^{ * } \right\rvert} = 1$ in $\mathbb{Z}_n$. By the above lemma, we have the desired result.
|
||||||
|
|
||||||
|
*Proof*. (Elementary) Set $f : \mathbb{Z}_n^* \rightarrow \mathbb{Z}_n^*$ as $x \mapsto ax \bmod n$, then the rest of the reasoning follows similarly as in the proof of Fermat's little theorem.
|
||||||
|
|
||||||
|
Using the above result, we remark an important result that will be used in RSA.
|
||||||
|
|
||||||
|
> **Lemma.** Let $n \in \mathbb{N}$. For $a, b \in \mathbb{Z}$ and $x \in \mathbb{Z}_n^*$, if $a \equiv b \pmod{\phi(n)}$, then $x^a \equiv x^b \pmod n$.
|
||||||
|
|
||||||
|
*Proof*. $a = b + k\phi(n)$ for some $k \in \mathbb{Z}$. Then
|
||||||
|
|
||||||
|
$$
|
||||||
|
x^a \equiv x^{b + k\phi(n)} = (x^{\phi(n)})^k \cdot x^b \equiv x^b \pmod n
|
||||||
|
$$
|
||||||
|
|
||||||
|
by Euler's generalization.
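For a small illustration (numbers chosen here only as an example), take $n = 10$, so $\phi(10) = 4$, and $x = 3$. Since $13 \equiv 1 \pmod 4$, the lemma predicts $3^{13} \equiv 3^{1} \pmod{10}$. Indeed,

$$
3^{13} = 3 \cdot (3^4)^3 = 3 \cdot 81^3 \equiv 3 \cdot 1^3 \equiv 3 \pmod{10}.
$$

In other words, exponents can be reduced modulo $\phi(n)$ before exponentiating.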
|
||||||
|
|
||||||
|
## Groups Based on Modular Arithmetic
|
||||||
|
|
||||||
|
> **Definition.** A **group** is a set $G$ with a binary operation $* : G \times G \rightarrow G$, satisfying the following properties.
|
||||||
|
>
|
||||||
|
> - $(\mathsf{G1})$ The binary operation $*$ is **closed**.
|
||||||
|
> - $(\mathsf{G2})$ The binary operation $*$ is **associative**, so $(a * b) * c = a * (b * c)$ for all $a, b, c \in G$.
|
||||||
|
> - $(\mathsf{G3})$ $G$ has an **identity** element $e$ such that $e * a = a * e = a$ for all $a \in G$.
|
||||||
|
> - $(\mathsf{G4})$ There is an **inverse** for every element of $G$. For each $a \in G$, there exists $x \in G$ such that $a * x = x * a = e$. We write $x = a^{-1}$ in this case.
|
||||||
|
|
||||||
|
$\mathbb{Z}_n$ is an additive group, and $\mathbb{Z}_n^*$ is a multiplicative group.
|
||||||
|
|
||||||
|
## Chinese Remainder Theorem (CRT)
|
||||||
|
|
||||||
|
> **Theorem.** Let $n_1, \dots, n_k$ be integers greater than $1$, and let $N = n_1n_2\cdots n_k$. If $n_i$ are pairwise relatively prime, then the system of equations $x \equiv a_i \pmod {n_i}$ has a unique solution modulo $N$.
|
||||||
|
>
|
||||||
|
> *(Abstract Algebra)* The map
|
||||||
|
>
|
||||||
|
> $$
|
||||||
|
> x \bmod N \mapsto (x \bmod n_1, \dots, x \bmod n_k)
|
||||||
|
> $$
|
||||||
|
>
|
||||||
|
> defines a ring isomorphism
|
||||||
|
>
|
||||||
|
> $$
|
||||||
|
> \mathbb{Z}_N \simeq \mathbb{Z}_{n_1} \times \mathbb{Z}_{n_2} \times \cdots \times \mathbb{Z}_{n_k}.
|
||||||
|
> $$
|
||||||
|
|
||||||
|
*Proof*. (**Existence**) Let $N_i = N/n_i$. Then $\gcd(N_i, n_i) = 1$. By the extended Euclidean algorithm, there exist integers $M_i, m_i$ such that $M_iN_i + m_in_i= 1$. Now set
|
||||||
|
|
||||||
|
$$
|
||||||
|
x = \sum_{i=1}^k a_i M_i N_i.
|
||||||
|
$$
|
||||||
|
|
||||||
|
Then, since $n_i \mid N_j$ for every $j \neq i$, we have $x \equiv a_iM_iN_i \equiv a_i(1 - m_in_i) \equiv a_i \pmod {n_i}$ for all $i = 1, \dots, k$.
|
||||||
|
|
||||||
|
(**Uniqueness**) Suppose that $x$ and $y$ are both solutions. Then $x \equiv a_i \equiv y \pmod {n_i}$, so $n_i \mid (x - y)$ for all $i$. Therefore we have
|
||||||
|
|
||||||
|
$$
|
||||||
|
\mathrm{lcm}(n_1, \dots, n_k) \mid (x - y).
|
||||||
|
$$
|
||||||
|
|
||||||
|
But $n_i$ are pairwise relatively prime, so $\mathrm{lcm}(n_1, \dots, n_k) = N$ and $N \mid (x-y)$. Hence $x \equiv y \pmod N$.
|
||||||
|
|
||||||
|
*Proof*. (**Abstract Algebra**) The above uniqueness proof shows that the map
|
||||||
|
|
||||||
|
$$
|
||||||
|
x \bmod N \mapsto (x \bmod n_1, \dots, x \bmod n_k)
|
||||||
|
$$
|
||||||
|
|
||||||
|
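is injective. Since $\mathbb{Z}_N$ and $\mathbb{Z}_{n_1} \times \cdots \times \mathbb{Z}_{n_k}$ both have $N$ elements, the pigeonhole principle implies that this map is also surjective. The map is also a ring homomorphism, by the properties of modular arithmetic. Hence it is a ring isomorphism.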
|
||||||
|
|
||||||
|
### Notes on the Proof of the Chinese Remainder Theorem
|
||||||
|
|
||||||
|
The elementary proof given above gives a *direct construction* of the solution. It is clear and easy to understand, and tells us how to find the actual solution.
|
||||||
|
|
||||||
|
But when the above proof is used in actual computation, it involves computations of very large numbers. The following is an implementation.
|
||||||
|
|
||||||
|
```cpp
|
||||||
|
// remainder holds the a_i values
|
||||||
|
// modulus holds the n_i values
|
||||||
|
int chinese_remainder_theorem(vector<int>& remainder, vector<int>& modulus) {
|
||||||
|
int product = 1;
|
||||||
|
for (int m : modulus) {
|
||||||
|
product *= m;
|
||||||
|
}
|
||||||
|
|
||||||
|
int result = 0;
|
||||||
|
for (int i = 0; i < (int) modulus.size(); ++i) {
|
||||||
|
int N_i = product / modulus[i];
|
||||||
|
result += remainder[i] * modular_inverse(N_i, modulus[i]) * N_i;
|
||||||
|
result %= product;
|
||||||
|
}
|
||||||
|
|
||||||
|
return result;
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
The `modular_inverse` function uses the extended Euclidean algorithm to find $M_i$ in the proof. For large moduli and many equations, $N_i = N / n_i$ results in a very large number, which is hard to handle (if your language has integer overflow) and takes longer to compute.
|
||||||
|
|
||||||
|
A better way is to construct the solution **inductively**. Find a solution for the first two equations,
|
||||||
|
|
||||||
|
$$
|
||||||
|
\begin{array}{c}
|
||||||
|
x \equiv a_1 \pmod{n_1} \\
|
||||||
|
x \equiv a_2 \pmod{n_2}
|
||||||
|
\end{array} \implies x \equiv a_{1, 2} \pmod{n_1n_2}
|
||||||
|
$$
|
||||||
|
|
||||||
|
and using the result, add the next equation $x \equiv a_3 \pmod{n_3}$ and find a solution.[^1]
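The inductive construction can be sketched as follows. This is not the linked implementation: the names `extended_gcd`, `crt_merge` and `crt_inductive` are assumptions of this example, and it assumes non-negative remainders, pairwise coprime moduli below $2^{31}$, and a total product that fits in a signed 64-bit integer.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Returns gcd(a, b) and sets x, y so that a*x + b*y = gcd(a, b).
std::int64_t extended_gcd(std::int64_t a, std::int64_t b, std::int64_t& x, std::int64_t& y) {
    if (b == 0) {
        x = 1;
        y = 0;
        return a;
    }
    std::int64_t x1 = 0, y1 = 0;
    std::int64_t g = extended_gcd(b, a % b, x1, y1);
    x = y1;
    y = x1 - (a / b) * y1;
    return g;
}

// Merges x = a1 (mod n1) and x = a2 (mod n2) into x = a (mod n1*n2).
// Assumes gcd(n1, n2) = 1 and 0 <= a1 < n1. Returns the pair (a, n1*n2).
std::pair<std::int64_t, std::int64_t> crt_merge(std::int64_t a1, std::int64_t n1,
                                                std::int64_t a2, std::int64_t n2) {
    std::int64_t u = 0, v = 0;
    extended_gcd(n1, n2, u, v);  // u*n1 + v*n2 = 1, so u is the inverse of n1 mod n2
    // Write x = a1 + n1*t; then x = a2 (mod n2) forces t = (a2 - a1)*u (mod n2).
    std::int64_t t = ((a2 - a1) % n2) * (u % n2) % n2;
    if (t < 0) {
        t += n2;
    }
    return {a1 + n1 * t, n1 * n2};
}

// Folds the congruences one at a time, as described in the text.
std::int64_t crt_inductive(const std::vector<std::int64_t>& remainder,
                           const std::vector<std::int64_t>& modulus) {
    std::int64_t a = remainder[0] % modulus[0];
    std::int64_t n = modulus[0];
    for (std::size_t i = 1; i < modulus.size(); ++i) {
        std::pair<std::int64_t, std::int64_t> merged = crt_merge(a, n, remainder[i], modulus[i]);
        a = merged.first;
        n = merged.second;
    }
    return a;
}
```

For the system $x \equiv 2 \pmod 3$, $x \equiv 3 \pmod 5$, $x \equiv 2 \pmod 7$, the first merge gives $x \equiv 8 \pmod{15}$ and the second gives $x \equiv 23 \pmod{105}$. Each step only multiplies numbers bounded by the current modulus and the next $n_i$, instead of forming every $N_i$ up front.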
|
||||||
|
|
||||||
|
Lastly, the ring isomorphism actually tells us a lot and is quite effective for computation. Since the two rings are *isomorphic*, operations in $\mathbb{Z} _ N$ can be done independently in each $\mathbb{Z} _ {n_i}$ and then merged back to $\mathbb{Z} _ N$. $N$ was a large number, so computations can be much faster in $\mathbb{Z} _ {n _ i}$. Specifically, we will see how this fact is used for computations in RSA.
|
||||||
|
|
||||||
|
[^1]: I have an implementation in my repository. [Link](https://github.com/calofmijuck/BOJ/blob/4b29e0c7f487aac3186661176d2795f85f0ab21b/Codes/23000/23062.cpp#L38).
|
||||||
179
_posts/Lecture Notes/Internet Security/2023-10-04-rsa-elgamal.md
Normal file
@@ -0,0 +1,179 @@
|
|||||||
|
---
|
||||||
|
share: true
|
||||||
|
toc: true
|
||||||
|
math: true
|
||||||
|
categories:
|
||||||
|
- Lecture Notes
|
||||||
|
- Internet Security
|
||||||
|
tags:
|
||||||
|
- lecture-note
|
||||||
|
- security
|
||||||
|
- cryptography
|
||||||
|
- number-theory
|
||||||
|
title: 06. RSA and ElGamal Encryption
|
||||||
|
date: 2023-10-04
|
||||||
|
github_title: 2023-10-04-rsa-elgamal
|
||||||
|
---
|
||||||
|
|
||||||
|
## Exponential Inverses
|
||||||
|
|
||||||
|
Suppose we are given integers $a$ and $N$ with $\gcd(a, \phi(N)) = 1$, and let $x$ be any integer relatively prime to $N$. We choose $b$ so that
|
||||||
|
|
||||||
|
$$
|
||||||
|
|
||||||
|
\tag{$*$}
|
||||||
|
ab \equiv 1 \pmod{\phi(N)}.
|
||||||
|
$$
|
||||||
|
|
||||||
|
Then we have
|
||||||
|
|
||||||
|
$$
|
||||||
|
x^{ab} \equiv x^{1 + k\phi(N)} \equiv x \pmod N
|
||||||
|
$$
|
||||||
|
|
||||||
|
by Euler's generalization.
|
||||||
|
|
||||||
|
> **Definition.** The integer $b$ satisfying $(\ast)$ is called the **exponential inverse of $a$ modulo $N$**.
|
||||||
|
|
||||||
|
Using exponential inverses will be a key idea in the RSA cryptosystem.
|
||||||
|
|
||||||
|
## RSA Cryptosystem
|
||||||
|
|
||||||
|
This is an explanation of the *textbook* RSA encryption scheme.
|
||||||
|
|
||||||
|
### Key Generation
|
||||||
|
|
||||||
|
- We pick two large primes $p, q$ and set $N = pq$.
|
||||||
|
- Select $(e, d)$ so that $ed \equiv 1 \pmod{\phi(N)}$.
|
||||||
|
- Set $(N, e)$ as the **public key** and make it public.
|
||||||
|
- Set $d$ as the **private key** and keep it secret.
|
||||||
|
|
||||||
|
### RSA Encryption and Decryption
|
||||||
|
|
||||||
|
Suppose we want to encrypt a message $m \in \mathbb{Z}_N$.
|
||||||
|
|
||||||
|
- **Encryption**
|
||||||
|
- Using the public key $(N, e)$, compute the ciphertext $c = m^e \bmod N$.
|
||||||
|
- **Decryption**
|
||||||
|
- Recover the original message by computing $c^d \bmod N$.
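Key generation, encryption and decryption can be tried out with toy numbers. The sketch below uses $p = 61$, $q = 53$, $e = 17$, which are far too small for real use; all function names are assumptions of this example, and it is textbook RSA without padding.

```cpp
#include <cstdint>

// base^exponent mod modulus by repeated squaring (assumes modulus < 2^32).
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exponent, std::uint64_t modulus) {
    std::uint64_t result = 1 % modulus;
    base %= modulus;
    while (exponent > 0) {
        if (exponent % 2 == 1) result = result * base % modulus;
        base = base * base % modulus;
        exponent /= 2;
    }
    return result;
}

// Inverse of a modulo m via the extended Euclidean algorithm (assumes gcd(a, m) = 1).
std::int64_t mod_inverse(std::int64_t a, std::int64_t m) {
    std::int64_t old_r = a, r = m, old_s = 1, s = 0;
    while (r != 0) {
        std::int64_t quotient = old_r / r;
        std::int64_t next_r = old_r - quotient * r;
        old_r = r;
        r = next_r;
        std::int64_t next_s = old_s - quotient * s;
        old_s = s;
        s = next_s;
    }
    return ((old_s % m) + m) % m;  // old_s * a = 1 (mod m)
}

// Key generation: d is the inverse of e modulo phi(N) = (p-1)(q-1).
std::uint64_t rsa_private_exponent(std::uint64_t e, std::uint64_t p, std::uint64_t q) {
    std::int64_t phi = static_cast<std::int64_t>((p - 1) * (q - 1));
    return static_cast<std::uint64_t>(mod_inverse(static_cast<std::int64_t>(e), phi));
}

// Encryption: c = m^e mod N.
std::uint64_t rsa_encrypt(std::uint64_t m, std::uint64_t e, std::uint64_t N) {
    return mod_pow(m, e, N);
}

// Decryption: m = c^d mod N.
std::uint64_t rsa_decrypt(std::uint64_t c, std::uint64_t d, std::uint64_t N) {
    return mod_pow(c, d, N);
}
```

With these toy parameters, $N = 3233$, $\phi(N) = 3120$ and $d = 2753$ (since $17 \cdot 2753 = 46801 = 15 \cdot 3120 + 1$). Encrypting $m = 65$ gives $c = 2790$, and decrypting $2790$ returns $65$.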
|
||||||
|
|
||||||
|
### Correctness of RSA?
|
||||||
|
|
||||||
|
Since $ed \equiv 1 \pmod{\phi(N)}$, we have
|
||||||
|
|
||||||
|
$$
|
||||||
|
c^d \equiv m^{ed} \equiv m \pmod N
|
||||||
|
$$
|
||||||
|
|
||||||
|
by the properties of exponential inverses.
|
||||||
|
|
||||||
|
Wait, but the property requires that $\gcd(m, N) = 1$. So it seems like we can't use some values of $m$. Furthermore, it should be computationally infeasible to recover $d$ using $e$ and $N$.
|
||||||
|
|
||||||
|
### Regarding the Choice of $N$
|
||||||
|
|
||||||
|
If $N$ is prime, it is very easy to find $d$. Since the relation $ed \equiv 1 \pmod {(N-1)}$ holds, we directly see that $d$ can be computed efficiently using the extended Euclidean algorithm.
|
||||||
|
|
||||||
|
The next simplest case would be setting $N = pq$ for two large primes $p$ and $q$. We expose $N$ to the public but hide primes $p$ and $q$. Now suppose the attacker wants to compute $d$ using $(N, e)$. The attacker knows that $ed \equiv 1 \pmod {\phi(N)}$, and $\phi(N) = (p-1)(q-1)$. So to calculate $d$, the attacker must know $\phi(N)$, which requires the **factorization of $N$**.
|
||||||
|
|
||||||
|
If the factorization $N = pq$ is known, finding $d$ is easy. But factoring large integers (especially a product of two primes of similar size) is believed to be very difficult.[^1] No one has formally proven this, but we believe and assume that it is hard.[^2]
|
||||||
|
|
||||||
|
## Chinese Remainder Theorem in RSA
|
||||||
|
|
||||||
|
Assume that the message $m$ is divisible by neither $p$ nor $q$. By Fermat's little theorem, we have $m^{p-1} \equiv 1 \pmod p$ and $m^{q-1} \equiv 1 \pmod q$.
|
||||||
|
|
||||||
|
Therefore, for decryption in RSA, the following holds. Note that $N = pq$.
|
||||||
|
|
||||||
|
$$
|
||||||
|
c^d \equiv m^{ed} \equiv m^{1 + k\phi(N)} \equiv m \cdot (m^{p-1})^{k(q-1)} \equiv m \cdot 1^{k(q-1)} \equiv m \pmod p.
|
||||||
|
$$
|
||||||
|
|
||||||
|
A similar result holds for modulus $q$. This does not exactly recover the message yet, since $m$ could have been chosen to be larger than $p$. The above equation is true, but during actual computation, one may get a result that is less than $p$. *This may not be equal to the original message*.[^3]
|
||||||
|
|
||||||
|
Since $N = pq$, we use the Chinese remainder theorem. Instead of computing $c^d \bmod N$ directly, we can compute
|
||||||
|
|
||||||
|
$$
|
||||||
|
c^d \equiv m \pmod p, \qquad c^d \equiv m \pmod q
|
||||||
|
$$
|
||||||
|
|
||||||
|
independently and solve the system of equations to recover the message.
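A minimal sketch of this CRT-based decryption is given below. Two choices are assumptions of the example rather than part of the notes: the exponent is reduced to $d \bmod (p-1)$ and $d \bmod (q-1)$ (justified by Fermat's little theorem), and the two residues are recombined with the inverse of $q$ modulo $p$, computed as $q^{p-2} \bmod p$.

```cpp
#include <cstdint>

// base^exponent mod modulus by repeated squaring (assumes modulus < 2^32).
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exponent, std::uint64_t modulus) {
    std::uint64_t result = 1 % modulus;
    base %= modulus;
    while (exponent > 0) {
        if (exponent % 2 == 1) result = result * base % modulus;
        base = base * base % modulus;
        exponent /= 2;
    }
    return result;
}

// Decrypts c by working modulo p and modulo q separately, then merging.
// Assumes p and q are distinct primes and d is a valid RSA private exponent.
std::uint64_t rsa_decrypt_crt(std::uint64_t c, std::uint64_t d, std::uint64_t p, std::uint64_t q) {
    std::uint64_t m_p = mod_pow(c, d % (p - 1), p);  // c^d mod p
    std::uint64_t m_q = mod_pow(c, d % (q - 1), q);  // c^d mod q
    std::uint64_t q_inv = mod_pow(q, p - 2, p);      // inverse of q modulo p (Fermat)
    // Find h with m_q + h*q = m_p (mod p); adding p keeps the difference non-negative.
    std::uint64_t h = q_inv * ((m_p + p - m_q % p) % p) % p;
    return m_q + h * q;  // the unique solution in [0, p*q)
}
```

With the toy key $p = 61$, $q = 53$, $d = 2753$, calling `rsa_decrypt_crt(2790, 2753, 61, 53)` returns $65$. The two exponentiations use moduli and exponents of roughly half the bit length of $N$, which is where the speed-up comes from.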
|
||||||
|
|
||||||
|
## Can I Encrypt $p$ with RSA?
|
||||||
|
|
||||||
|
Now we return to the problem where $\gcd(m, N) \neq 1$. For $m$ chosen uniformly from $\mathbb{Z}_N$, the probability of $\gcd(m, N) \neq 1$ is actually $\frac{1}{p} + \frac{1}{q} - \frac{1}{pq}$, so if we take large primes $p, q \approx 2^{1000}$ as in RSA2048, the probability of this occurring is roughly $2^{-999}$, which is negligible. But for completeness, we also prove correctness for this case.
|
||||||
|
|
||||||
|
$e, d$ are still chosen to satisfy $ed \equiv 1 \pmod {\phi(N)}$. Suppose we want to decrypt $c \equiv m^e \pmod N$.
|
||||||
|
|
||||||
|
We will also use the Chinese remainder theorem here.
|
||||||
|
|
||||||
|
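Since $\gcd(m, N) \neq 1$ and $N = pq$, at least one of $p$ and $q$ divides $m$. The case $m = 0$ is trivial, so assume $m \neq 0$ and, without loss of generality, $p \mid m$. So if we compute in $\mathbb{Z}_p$, we will get $0$,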
|
||||||
|
|
||||||
|
$$
|
||||||
|
c^d \equiv m^{ed} \equiv 0^{ed} \equiv 0 \pmod p.
|
||||||
|
$$
|
||||||
|
|
||||||
|
We also do the computation in $\mathbb{Z}_q$ and get
|
||||||
|
|
||||||
|
$$
|
||||||
|
c^d \equiv m^{ed} \equiv m^{1 + k\phi(N)} \equiv m\cdot (m^{q-1})^{k(p-1)} \equiv m \cdot 1^{k(p-1)} \equiv m \pmod q.
|
||||||
|
$$
|
||||||
|
|
||||||
|
Here, we used the fact that $m^{q-1} \equiv 1 \pmod q$. This holds because if $p \mid m$, $m$ is a multiple of $p$ that is less than $N$, so $m = pm'$ for some $m'$ such that $1 \leq m' < q$. Then $\gcd(m, q) = \gcd(pm', q) = 1$ since $q$ does not divide $p$ and $m'$ is less than $q$.
|
||||||
|
|
||||||
|
Therefore, from $c^d \equiv 0 \pmod p$ and $c^d \equiv (m \bmod q) \pmod q$, we can recover a unique solution $c^d \equiv m \pmod N$.
|
||||||
|
|
||||||
|
Now we must argue that the recovered solution is actually equal to the original $m$. But what we did above was showing that $m^{ed}$ and $m$ in $\mathbb{Z}_N$ are mapped to the same element $(0, m \bmod q)$ in $\mathbb{Z}_p \times \mathbb{Z}_q$. Since the Chinese remainder theorem tells us that this mapping is an isomorphism, $m^{ed}$ and $m$ must have been the same elements of $\mathbb{Z}_N$ in the first place.
|
||||||
|
|
||||||
|
Notice that we did not require $m$ to be relatively prime to $N$. Thus the RSA encryption scheme is correct for any $m \in \mathbb{Z}_N$.
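For a toy modulus this claim can be checked exhaustively, including the messages that share a factor with $N$. The sketch below is only a sanity check under small parameters, not a proof; the function name is an assumption of this example.

```cpp
#include <cstdint>

// base^exponent mod modulus by repeated squaring (assumes modulus < 2^32).
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exponent, std::uint64_t modulus) {
    std::uint64_t result = 1 % modulus;
    base %= modulus;
    while (exponent > 0) {
        if (exponent % 2 == 1) result = result * base % modulus;
        base = base * base % modulus;
        exponent /= 2;
    }
    return result;
}

// Returns true if (m^e)^d = m (mod N) for every m in Z_N, where N = p*q.
// This includes m = 0 and all multiples of p or q.
bool rsa_round_trip_holds_for_all(std::uint64_t p, std::uint64_t q,
                                  std::uint64_t e, std::uint64_t d) {
    const std::uint64_t N = p * q;
    for (std::uint64_t m = 0; m < N; ++m) {
        if (mod_pow(mod_pow(m, e, N), d, N) != m) {
            return false;
        }
    }
    return true;
}
```

For $p = 61$, $q = 53$, $e = 17$, $d = 2753$ the check passes for all $3233$ messages. With a wrong exponent, for instance $p = 3$, $q = 11$, $e = 3$, $d = 5$, it fails already at $m = 2$.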
|
||||||
|
|
||||||
|
## Correctness of RSA with Fermat's Little Theorem
|
||||||
|
|
||||||
|
Actually, the above argument can be proven only with Fermat's little theorem. In the above proof, the Chinese remainder theorem was used to transform the operation, but for $N = pq$, the situation is simple enough that this theorem is not necessarily required.
|
||||||
|
|
||||||
|
Let $M = m^{ed} - m$. We have shown above only using Fermat's little theorem that $p \mid M$ and $q \mid M$, for any choice of $m \in \mathbb{Z}_N$. Then since $N = pq = \mathrm{lcm}(p, q)$, we have $N \mid M$, so $m^{ed} \equiv m \pmod N$. Hence the RSA scheme is correct.
|
||||||
|
|
||||||
|
So we don't actually need Euler's generalization for proving the correctness of RSA...?! In fact, the proof given in the original paper of RSA used Fermat's little theorem.
|
||||||
|
|
||||||
|
## Discrete Logarithms
|
||||||
|
|
||||||
|
This is an inverse problem of exponentiation. The inverse of exponentials is logarithms, so we consider the **discrete logarithm of a number modulo $p$**.
|
||||||
|
|
||||||
|
Given $y \equiv g^x \pmod p$ for some prime $p$, we want to find $x = \log_g y$. We set $g$ to be a generator of the multiplicative group $\mathbb{Z}_p^*$, since then a solution exists for every $y \in \mathbb{Z}_p^*$.
|
||||||
|
|
||||||
|
Read more in [discrete logarithm problem (Modern Cryptography)](../../modern-cryptography/2023-10-03-key-exchange#discrete-logarithm-problem-dl).
|
||||||
|
|
||||||
|
## ElGamal Encryption
|
||||||
|
|
||||||
|
This is an encryption scheme built upon the hardness of the DLP.
|
||||||
|
|
||||||
|
> 1. Let $p$ be a large prime.
|
||||||
|
> 2. Select a generator $g \in \mathbb{Z}_p^*$.
|
||||||
|
> 3. Choose a private key $x \in \mathbb{Z}_p^*$.
|
||||||
|
> 4. Compute the public key $y = g^x \pmod p$.
|
||||||
|
> - $p, g, y$ will be publicly known.
|
||||||
|
> - $x$ is kept secret.
|
||||||
|
|
||||||
|
### ElGamal Encryption and Decryption
|
||||||
|
|
||||||
|
Suppose we encrypt a message $m \in \mathbb{Z}_p^*$.
|
||||||
|
|
||||||
|
> 1. The sender chooses a random $k \in \mathbb{Z}_p^*$, called *ephemeral key*.
|
||||||
|
> 2. Compute $c_1 = g^k \pmod p$ and $c_2 = my^k \pmod p$.
|
||||||
|
> 3. $c_1, c_2$ are sent to the receiver.
|
||||||
|
> 4. The receiver calculates $c_1^x \equiv g^{xk} \equiv y^k \pmod p$, and finds the inverse $y^{-k} \in \mathbb{Z}_p^*$.
|
||||||
|
> 5. Then $c_2y^{-k} \equiv m \pmod p$, recovering the message.
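The steps above can be sketched with toy parameters. Everything concrete here is an assumption for illustration: $p = 23$, $g = 5$, the fixed ephemeral key passed in by the caller (a real sender would draw $k$ at random), and computing the inverse of $y^k$ as $(y^k)^{p-2} \bmod p$ via Fermat's little theorem.

```cpp
#include <cstdint>

// base^exponent mod modulus by repeated squaring (assumes modulus < 2^32).
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exponent, std::uint64_t modulus) {
    std::uint64_t result = 1 % modulus;
    base %= modulus;
    while (exponent > 0) {
        if (exponent % 2 == 1) result = result * base % modulus;
        base = base * base % modulus;
        exponent /= 2;
    }
    return result;
}

struct ElGamalCiphertext {
    std::uint64_t c1;  // g^k mod p
    std::uint64_t c2;  // m * y^k mod p
};

// Encrypts m under public key (p, g, y) with ephemeral key k.
ElGamalCiphertext elgamal_encrypt(std::uint64_t m, std::uint64_t k, std::uint64_t g,
                                  std::uint64_t y, std::uint64_t p) {
    return {mod_pow(g, k, p), (m % p) * mod_pow(y, k, p) % p};
}

// Decrypts with private key x: recompute y^k as c1^x, invert it, multiply into c2.
std::uint64_t elgamal_decrypt(const ElGamalCiphertext& ct, std::uint64_t x, std::uint64_t p) {
    std::uint64_t shared = mod_pow(ct.c1, x, p);           // equals y^k mod p
    std::uint64_t shared_inv = mod_pow(shared, p - 2, p);  // inverse by Fermat's little theorem
    return ct.c2 * shared_inv % p;
}
```

With $x = 6$ the public key is $y = 5^6 \bmod 23 = 8$. Encrypting $m = 13$ with $k = 3$ gives $(c_1, c_2) = (10, 9)$, and decryption returns $13$.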
|
||||||
|
|
||||||
|
The attacker will see $g^k$. By the hardness of DLP, the attacker is unable to recover $k$ even if he knows $g$.
|
||||||
|
|
||||||
|
#### Ephemeral Key Should Be Distinct
|
||||||
|
|
||||||
|
If the same $k$ is used twice, the encryption is not secure. Suppose we encrypt two different messages $m_1, m_2 \in \mathbb{Z} _ p^{ * }$. The attacker will see $(g^k, m_1y^k)$ and $(g^k, m_2 y^k)$. Then since we are in a multiplicative group $\mathbb{Z} _ p^{ * }$, inverses exist. So
|
||||||
|
|
||||||
|
$$
|
||||||
|
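m_1y^k \cdot (m_2 y^k)^{-1} \equiv m_1m_2^{-1} \pmod p
|
||||||
|
$$
|
||||||
|
|
||||||
|
so the attacker learns the ratio $m_1 m_2^{-1} \bmod p$ of the two messages. If one of the messages is known or guessed, the other can be recovered immediately, so information about the plaintexts is leaked.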
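To see how damaging a repeated $k$ is, here is a sketch of what an attacker could compute from the second components of two ciphertexts once a single plaintext is known. The function name and the toy values are assumptions of this example.

```cpp
#include <cstdint>

// base^exponent mod modulus by repeated squaring (assumes modulus < 2^32).
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exponent, std::uint64_t modulus) {
    std::uint64_t result = 1 % modulus;
    base %= modulus;
    while (exponent > 0) {
        if (exponent % 2 == 1) result = result * base % modulus;
        base = base * base % modulus;
        exponent /= 2;
    }
    return result;
}

// Given c2_first = m1*y^k and c2_second = m2*y^k (same k) and a known m1,
// returns m2. The common factor y^k cancels in the quotient, so no key is needed.
std::uint64_t recover_second_message(std::uint64_t c2_first, std::uint64_t c2_second,
                                     std::uint64_t m1, std::uint64_t p) {
    std::uint64_t ratio = c2_second * mod_pow(c2_first, p - 2, p) % p;  // m2 * m1^{-1} mod p
    return ratio * (m1 % p) % p;
}
```

For instance, with $p = 23$ and $y^k = 6$, the messages $m_1 = 13$ and $m_2 = 7$ produce second components $9$ and $19$. Knowing $m_1$, `recover_second_message(9, 19, 13, 23)` returns $7$.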
|
||||||
|
|
||||||
|
[^1]: If one of the primes is small, factoring is easy. Therefore we require that $p, q$ both be large primes.
|
||||||
|
[^2]: There is a quantum polynomial time (BQP) algorithm for integer factorization. See [Shor's algorithm](https://en.wikipedia.org/wiki/Shor%27s_algorithm).
|
||||||
|
[^3]: This part of the explanation is not necessary if we use abstract algebra!
|
||||||
@@ -0,0 +1,138 @@
|
|||||||
|
---
|
||||||
|
share: true
|
||||||
|
toc: true
|
||||||
|
math: true
|
||||||
|
categories:
|
||||||
|
- Lecture Notes
|
||||||
|
- Internet Security
|
||||||
|
tags:
|
||||||
|
- lecture-note
|
||||||
|
- security
|
||||||
|
- cryptography
|
||||||
|
title: 07. Public Key Cryptography
|
||||||
|
date: 2023-10-09
|
||||||
|
github_title: 2023-10-09-public-key-cryptography
|
||||||
|
---
|
||||||
|
|
||||||
|
In symmetric key cryptography, we have a problem with key sharing and management. More info in the first few paragraphs of [Key Exchange (Modern Cryptography)](../../modern-cryptography/2023-10-03-key-exchange).
|
||||||
|
|
||||||
|
## Public Key Cryptography
|
||||||
|
|
||||||
|
We use **two** keys for public key cryptography. The keys are called *public key* and *private key*. These two keys are related to each other, but it is almost impossible to calculate the private key from the public key.
|
||||||
|
|
||||||
|
- **Public key** is *public*, and anyone can use it to encrypt messages or verify signatures.
|
||||||
|
- **Private key** (or secret key) is only kept by the owner. It is used to decrypt messages or create signatures.
|
||||||
|
|
||||||
|
We will denote public keys as $pk$ and private keys as $sk$.
|
||||||
|
|
||||||
|
These keys are created to be used in **trapdoor one-way functions**.
|
||||||
|
|
||||||
|
### One-way Function
|
||||||
|
|
||||||
|
A **one-way function** is a function that is easy to compute, but for which it is hard to find a preimage of a given output. Here are some common examples.
|
||||||
|
|
||||||
|
- *Cryptographic hash functions*: [Hash Functions (Modern Cryptography)](../../modern-cryptography/2023-09-28-hash-functions#collision-resistance).
|
||||||
|
- *Factoring a large integer*: It is easy to multiply two integers even if they're large, but factoring is very hard.
|
||||||
|
- *Discrete logarithm problem*: It is easy to exponentiate a number, but it is hard to find the discrete logarithm.
|
||||||
|
|
||||||
|
But a one-way function is not enough. Suppose that $f$ is a one-way function with a public key $pk$. It will be easy to encrypt a message $m$ as $f(pk, m)$, but recovering $m$ is hard even for the intended recipient.
|
||||||
|
|
||||||
|
### Trapdoor One-way Function
|
||||||
|
|
||||||
|
A **trapdoor one-way function** has a *trapdoor*. It is computationally difficult to find the preimage, but with the trapdoor, inverting is easy.
|
||||||
|
|
||||||
|
In public key cryptography, the trapdoor is the *private key* that makes it easy to invert the one-way function $f$. So the recipient can efficiently invert $f$ and recover the message $m$.
|
||||||
|
|
||||||
|
### Encryption and Decryption
|
||||||
|
|
||||||
|
In public key cryptography, encryption and decryption are done as follows.
|
||||||
|
|
||||||
|
Suppose that Alice wants to send a secret message to Bob. Alice must encrypt the message using **Bob's public key**, so that only Bob can decrypt the message.
|
||||||
|
|
||||||
|
> 1. Alice takes a plaintext and encrypts it using Bob's public key.
|
||||||
|
> 2. The ciphertext is sent to Bob.
|
||||||
|
> 3. Bob uses his private key to decrypt the ciphertext.
|
||||||
|
|
||||||
|
Mathematically, let $pk, sk$ be Bob's public key and private key.
|
||||||
|
|
||||||
|
> 1. Alice computes the ciphertext $c = f(pk, m)$ of the message $m$.
|
||||||
|
> 2. $c$ is sent to Bob.
|
||||||
|
> 3. Bob computes $m = f^{-1}(sk, c)$ and recovers $m$.
|
||||||
|
|
||||||
|
### Authentication
|
||||||
|
|
||||||
|
Public key cryptography can also be used for **authentication**. If some ciphertext can be decrypted with Alice's public key, it must have been produced with Alice's private key, so we can verify that the message was from Alice.
|
||||||
|
|
||||||
|
We will learn more about this when we learn digital signatures.
|
||||||
|
|
||||||
|
### Applications of Public Key Cryptography
|
||||||
|
|
||||||
|
- **Encryption and decryption**: for private communication.
|
||||||
|
- **Digital signatures**: authentication, as explained above.
|
||||||
|
  - This was not possible with symmetric cryptography since both parties have the key, so it does not provide non-repudiation.
|
||||||
|
- **Key exchange**
|
||||||
|
- We assumed that in symmetric cryptography, there was a secure channel to share the secret key.
|
||||||
|
- We use public key cryptography to exchange and agree on the secret key for the symmetric cipher.
|
||||||
|
  - Public key cryptography takes longer to calculate, so it is preferable to use symmetric ciphers for the actual data.
|
||||||
|
|
||||||
|
But a problem still remains. How does one verify that this key is indeed from that identity? In the example above, how does Alice know that this public key is from Bob and not someone else's? This problem will be solved using **public key infrastructure**.
|
||||||
|
|
||||||
|
## Diffie-Hellman Key Exchange
|
||||||
|
|
||||||
|
Choose a large prime $p$ and a generator $g$ of $\mathbb{Z}_p^{ * }$. The description of $g$ and $p$ will be known to the public.
|
||||||
|
|
||||||
|
> 1. Alice chooses some $x \in \mathbb{Z}_p^{ * }$ and sends $g^x \bmod p$ to Bob.
|
||||||
|
> 2. Bob chooses some $y \in \mathbb{Z}_p^{ * }$ and sends $g^y \bmod p$ to Alice.
|
||||||
|
> 3. Alice and Bob calculate $g^{xy} \bmod p$ separately.
|
||||||
|
> 4. Eve can see $g^x \bmod p$, $g^y \bmod p$ but cannot calculate $g^{xy} \bmod p$.
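The exchange can be sketched with toy numbers. The parameters $p = 23$, $g = 5$ and the function names are assumptions of this example; real deployments use much larger groups and randomly chosen secrets.

```cpp
#include <cstdint>

// base^exponent mod modulus by repeated squaring (assumes modulus < 2^32).
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exponent, std::uint64_t modulus) {
    std::uint64_t result = 1 % modulus;
    base %= modulus;
    while (exponent > 0) {
        if (exponent % 2 == 1) result = result * base % modulus;
        base = base * base % modulus;
        exponent /= 2;
    }
    return result;
}

// The value a party sends over the public channel: g^secret mod p.
std::uint64_t dh_public_value(std::uint64_t g, std::uint64_t secret, std::uint64_t p) {
    return mod_pow(g, secret, p);
}

// The key a party derives from the other side's public value and its own secret.
std::uint64_t dh_shared_key(std::uint64_t other_public, std::uint64_t secret, std::uint64_t p) {
    return mod_pow(other_public, secret, p);
}
```

With Alice's secret $x = 6$ and Bob's secret $y = 15$, the public values are $g^x \bmod p = 8$ and $g^y \bmod p = 19$. Both `dh_shared_key(19, 6, 23)` and `dh_shared_key(8, 15, 23)` return $2$, which is $g^{xy} \bmod p$.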
|
||||||
|
|
||||||
|
Refer to [Diffie-Hellman Key Exchange (Modern Cryptography)](../../modern-cryptography/2023-10-03-key-exchange#diffie-hellman-key-exchange-dhke).
|
||||||
|
|
||||||
|
## Message Integrity
|
||||||
|
|
||||||
|
A function $H$ takes a message of arbitrary length as input and outputs a fixed-length string. The output is called **message digest**, *tag*, *fingerprint* or **hash**.
|
||||||
|
|
||||||
|
Here, the $H$ is called a **hash function**. This function is many-to-one, but it is usually computationally infeasible to find a collision.
|
||||||
|
|
||||||
|
**Desirable Properties of $H$**.
|
||||||
|
|
||||||
|
- $H$ should be easy to calculate.
|
||||||
|
- It should be hard to recover $m$ from $H(m)$. (one-wayness)
|
||||||
|
- It should be computationally difficult to find a collision. (collision resistance)
|
||||||
|
- The output should seem random.
|
||||||
|
|
||||||
|
Using this function, we can check whether the message was tampered with during transmission.
|
||||||
|
|
||||||
|
### Message Authentication Code (MAC)
|
||||||
|
|
||||||
|
We assume that Alice and Bob already share a secret $k$. Alice wants to send a message $m$ to Bob.
|
||||||
|
|
||||||
|
> 1. Alice calculates the tag $t = H(k, m)$ from the message using the key.
|
||||||
|
> 2. Alice sends the message and tag.
|
||||||
|
> 3. Bob calculates the tag $t'$ from the received message using the same key. If $t'$ does not match $t$, Bob detects that the message was modified.
|
||||||
|
|
||||||
|
We only care about message integrity in MACs, so the message is not encrypted.
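
As a minimal sketch of the steps above, the keyed function $H(k, m)$ can be instantiated with HMAC-SHA256 from Python's standard library. The choice of HMAC-SHA256 and the names `mac_tag` and `mac_verify` are assumptions for this example, not something fixed by the notes.

```python
import hashlib
import hmac

def mac_tag(key: bytes, message: bytes) -> bytes:
    """Compute the tag t = H(k, m), here instantiated with HMAC-SHA256."""
    return hmac.new(key, message, hashlib.sha256).digest()

def mac_verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(mac_tag(key, message), tag)

key = b"shared secret k"          # known to Alice and Bob only
message = b"pay Bob 10 dollars"   # sent in the clear
tag = mac_tag(key, message)       # Alice sends (message, tag)
```

Bob accepts only if `mac_verify(key, message, tag)` is true. Any change to the message in transit makes the recomputed tag differ.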
|
||||||
|
|
||||||
|
### Properties of MAC
|
||||||
|
|
||||||
|
- MACs are based on symmetric keys, so communicating parties must share a key.
|
||||||
|
- MACs should be able to accept messages of arbitrary length.
|
||||||
|
- MACs should output a fixed-length string.
|
||||||
|
- MACs should provide message integrity. Any manipulation in transit will be detected, and the receiving party is assured of the origin of the message.
|
||||||
|
- MACs **do not** support non-repudiation.
|
||||||
|
- Since both parties have the secret, either party could have created the message and its tag.
|
||||||
|
|
||||||
|
## Digital Signatures
|
||||||
|
|
||||||
|
**Digital signatures** achieve *integrity*, *non-repudiation* and *authentication*. We leverage public key cryptography.
|
||||||
|
|
||||||
|
Suppose Alice wants to **sign** a message $m$. Alice has public key $pk$ and private key $sk$.
|
||||||
|
|
||||||
|
> 1. Alice calculates $\sigma = D(sk, m)$ and sends $m \parallel \sigma$.
|
||||||
|
> 2. Bob receives it and calculates $E(pk, \sigma)$ and compares it with $m$.
|
||||||
|
> - The key $pk$ here is Alice's public key.
|
||||||
|
|
||||||
|
- Since the signature can be decrypted using Alice's public key, it must have been signed using Alice's private key.
|
||||||
|
- Thus the message must have been from Alice.
|
||||||
|
- Verification is done using Alice's public key, so anyone can verify the message.
|
||||||
|
- Messages are usually long, so we take a hash function $H$ to shorten it, and sign $H(m)$ instead.
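
The hash-then-sign idea can be sketched with textbook RSA. The key $(N, e, d) = (3233, 17, 2753)$ is a classroom toy key assumed for this example, and reducing the hash modulo $N$ is only there to make the toy key work. Real signatures use large keys and a padding scheme such as RSA-PSS.

```python
import hashlib

# Toy RSA key (assumed for illustration): N = 61 * 53, and e * d = 1 mod phi(N).
N, e, d = 3233, 17, 2753

def digest(message: bytes) -> int:
    """H(m), reduced mod N so that it fits the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """sigma = D(sk, H(m)): only the holder of d can compute this."""
    return pow(digest(message), d, N)

def verify(message: bytes, sigma: int) -> bool:
    """Check E(pk, sigma) == H(m): anyone with (N, e) can do this."""
    return pow(sigma, e, N) == digest(message)

message = b"I owe Bob 10 dollars"
sigma = sign(message)
```

Here `verify(message, sigma)` succeeds, and any other signature value for the same message is rejected.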
|
||||||
204
_posts/Lecture Notes/Internet Security/2023-10-16-pki.md
Normal file
@@ -0,0 +1,204 @@
|
|||||||
|
---
|
||||||
|
share: true
|
||||||
|
toc: true
|
||||||
|
math: true
|
||||||
|
categories:
|
||||||
|
- Lecture Notes
|
||||||
|
- Internet Security
|
||||||
|
tags:
|
||||||
|
- lecture-note
|
||||||
|
- security
|
||||||
|
title: 08. Public Key Infrastructure
|
||||||
|
date: 2023-10-16
|
||||||
|
github_title: 2023-10-16-pki
|
||||||
|
attachment:
|
||||||
|
folder: assets/img/posts/Lecture Notes/Internet Security
|
||||||
|
---
|
||||||
|
|
||||||
|
Suppose that we're using RSA, and Alice has public key $(N, e)$ and private key $d$. Anyone can send messages to Alice using $(N, e)$. But because anyone can generate such a key pair, we cannot be sure that the key $(N, e)$ is *really* Alice's key. We might run into a situation where $(N, e)$ is actually some other person's key. *How do we check whose key this is?*
|
||||||
|
|
||||||
|
**Public key infrastructure** (PKI) solves this problem by using **certificates**.
|
||||||
|
|
||||||
|
## Cryptographic Certificates
|
||||||
|
|
||||||
|
We focus on **cryptographic certificates**.
|
||||||
|
|
||||||
|
> A **certificate** is an electronic document that binds a *public key* to its *owner's identity*. This binding is assured by the *digital signature* of an issuer, which we call a **certificate authority** (CA).
|
||||||
|
|
||||||
|
- A certificate authority is a *trusted third party* (TTP).
|
||||||
|
- If we don't trust the CA, then we cannot trust the signature of the certificate.
|
||||||
|
- Commercial CAs charge to issue certificates.
|
||||||
|
- Kinds of certificates
|
||||||
|
- Digital certificate
|
||||||
|
- Public key certificate
|
||||||
|
- SSL (or TLS) certificate
|
||||||
|
- X.509 certificates (ITU-T)
|
||||||
|
- Web server certificates
|
||||||
|
|
||||||
|
## Components of Public Key Infrastructure
|
||||||
|
|
||||||
|
- **Registration Authority** (RA)
|
||||||
|
- Checks the individual or entity that requests the certificate.
|
||||||
|
- It will check if the requester has a matching private key, etc.
|
||||||
|
- This result will be sent to the certification authority.
|
||||||
|
- **Certification Authority** (CA)
|
||||||
|
- Issues certificates. Binds a public key to an identity.
|
||||||
|
- User identity must be unique within each CA domain.
|
||||||
|
- Relying parties (browsers, etc.) need a copy of CA's public key.
|
||||||
|
- Directory Service
|
||||||
|
- A directory of public keys and certificates, so that anyone can access it.
|
||||||
|
- Revocation Service
|
||||||
|
- A mechanism to check if a certificate is revoked or not.
|
||||||
|
- Certificate revocation list (CRL), online certificate status protocol (OCSP).
|
||||||
|
|
||||||
|
Note that the certification authority can offload some of its work to the registration authority.
|
||||||
|
|
||||||
|
## Contents of a Certificate (X.509)
|
||||||
|
|
||||||
|
- **Serial Number**: a unique identifier for the certificate within the CA.
|
||||||
|
- **Subject**: the identified person or entity. Also known as *distinguished name* (DN).
|
||||||
|
- Common Name
|
||||||
|
- Organization
|
||||||
|
- Organizational Unit
|
||||||
|
- Locality
|
||||||
|
- State or Province
|
||||||
|
- Country/Region
|
||||||
|
- **Signature Algorithm**: the algorithm used to create the signature.
|
||||||
|
- **Issuer**: the entity that verified the subject and issued the certificate.
|
||||||
|
- **Valid-From**
|
||||||
|
- **Valid-To**: certificate expiration date.
|
||||||
|
- **Key-Usage**: purpose of this public key
|
||||||
|
- Encryption, signature, etc.
|
||||||
|
- **Public Key**: the public key of the subject.
|
||||||
|
- **Signature**: the digital signature signed by the issuer.
|
||||||
|
- For verifying that this signature is from the issuer, and can be trusted.
|
||||||
|
|
||||||
|
The hash of the entire certificate is called a **thumbprint**, but this is not included in the certificate.
|
||||||
|
|
||||||
|
## Certificate Validation Process
|
||||||
|
|
||||||
|
### Hierarchy of CAs
|
||||||
|
|
||||||
|
We have a root CA at the top. Then there are issuing CAs below. We usually request certificates to the issuing CA. Note that the issuing CAs also have their own certificate, which is signed by the next higher-level CA.
|
||||||
|
|
||||||
|
### Certificate Validation
|
||||||
|
|
||||||
|
[^1]
|
||||||
|
|
||||||
|
Since we have a hierarchy of CAs, certificate validation must also follow the hierarchy. When we receive a certificate, it is highly likely to be signed by a non-root CA.
|
||||||
|
|
||||||
|
Thus we validate certificates by the following process. Suppose we received a certificate $A$.
|
||||||
|
|
||||||
|
> 1. In $A$, check the issuer's DN and request the issuer's certificate $B$.
|
||||||
|
> 2. Verify the signature of $A$ using the public key of the issuer in $B$.
|
||||||
|
> 3. Recursively validate $B$ following the above steps.
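
The three steps above can be sketched as a loop that walks up the chain. Everything here is a simplified assumption made for illustration: the `Cert` class, the toy RSA keys, the `store` lookup by issuer name, and the `trusted_roots` table. Real X.509 validation also checks validity periods, key usage, and revocation.

```python
import hashlib
from dataclasses import dataclass

@dataclass
class Cert:
    subject: str
    issuer: str
    public_key: tuple   # toy RSA public key (N, e)
    signature: int = 0  # toy RSA signature made by the issuer

def digest(cert: Cert, modulus: int) -> int:
    """Hash the signed fields, reduced into the issuer's toy modulus."""
    data = f"{cert.subject}|{cert.issuer}|{cert.public_key}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus

def issue(cert: Cert, issuer_n: int, issuer_d: int) -> Cert:
    """The issuer signs the certificate with its private exponent."""
    cert.signature = pow(digest(cert, issuer_n), issuer_d, issuer_n)
    return cert

def validate(cert: Cert, store: dict, trusted_roots: dict, max_depth: int = 8) -> bool:
    for _ in range(max_depth):
        issuer_cert = store.get(cert.issuer)          # step 1: fetch certificate B
        if issuer_cert is None:
            return False
        n, e = issuer_cert.public_key
        if pow(cert.signature, e, n) != digest(cert, n):  # step 2: verify signature of A
            return False
        if cert.subject == cert.issuer:               # self-signed: the chain ends here
            return trusted_roots.get(cert.subject) == cert.public_key
        cert = issuer_cert                            # step 3: validate B the same way
    return False

# Toy keys (N, e, d): root (3233, 17, 2753), issuing CA (2773, 17, 157).
root = issue(Cert("Root CA", "Root CA", (3233, 17)), 3233, 2753)
issuing = issue(Cert("Issuing CA", "Root CA", (2773, 17)), 3233, 2753)
leaf = issue(Cert("example.com", "Issuing CA", (187, 3)), 2773, 157)
store = {c.subject: c for c in (root, issuing)}
trusted_roots = {"Root CA": (3233, 17)}
```

Note that the loop only accepts a self-signed certificate if its key matches an entry in `trusted_roots`. A self-signed certificate that merely claims the name of a root CA fails this check.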
|
||||||
|
|
||||||
|
We will request the certificate of a root CA at the end. If everything went well, all the intermediate certificates will have been verified. Now we must verify the certificate of a root CA, but a root CA does not have any higher level CAs.
|
||||||
|
|
||||||
|
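Root CAs are chosen by the root programs of browser and operating system vendors, following requirements such as those published by the [CA/Browser forum](https://cabforum.org/). Thus they are acknowledged by the public community, and we agree that root CAs can be trusted. Since there is no higher-level CA to sign for them, root CAs sign their own certificates.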
|
||||||
|
|
||||||
|
In many web browsers, root CAs are whitelisted so that they are always trusted.
|
||||||
|
|
||||||
|
### Self-Signed Certificates
|
||||||
|
|
||||||
|
As in the example of root CAs, there are certificates that are signed by the same identity as the subject, i.e., the issuer and the subject are the same. We call these **self-signed certificates**.
|
||||||
|
|
||||||
|
We generally don't trust self-signed certificates, since they can be created easily. Anyone can generate a keypair, create a certificate and sign it by oneself.
|
||||||
|
|
||||||
|
But there are some places where self-signed certificates are handy. The first example is the case of root CAs. Also, since issuing a certificate from a CA requires money, using a self-signed certificate for test servers saves money and time.
|
||||||
|
|
||||||
|
There are also some problems with self-signed certificates. These certificates are self-created and self-signed, so the certificate can contain arbitrary values.[^2] For example, the certificate can be valid for a thousand years, which is usually not possible for CA-issued certificates. Lastly, self-signed certificates are hard to revoke by nature, since they are not issued by CAs.
|
||||||
|
|
||||||
|
## Certificate Revocation
|
||||||
|
|
||||||
|
### Key Pair Lifecycle
|
||||||
|
|
||||||
|
A key is generated, and a certificate is issued with the key. If the key has expired, we revoke the certificate.
|
||||||
|
|
||||||
|
- Keys should be generated by the owner, for non-repudiation.
|
||||||
|
- Dual key pair model
|
||||||
|
- Separate key pairs for encryption/decryption and signature.
|
||||||
|
|
||||||
|
### Certificate Revocation
|
||||||
|
|
||||||
|
There are some cases where certificates must be revoked.
|
||||||
|
|
||||||
|
- Certificate is mis-issued (not the right identity).
|
||||||
|
- Key can be compromised by attackers.
|
||||||
|
- One may forget the passphrase for the certificate.
|
||||||
|
- The private key may get lost.
|
||||||
|
|
||||||
|
*PKI is only as secure as its revocation mechanism*, and revocation is hard to handle in practice.
|
||||||
|
|
||||||
|
- The CA revokes the certificates.
|
||||||
|
- The relying party checks the revocation status using **certificate revocation lists** and **online certificate status protocol**.
|
||||||
|
- The certificate tells us where to get the revocation information.
|
||||||
|
|
||||||
|
### Certificate Revocation Lists (CRL)
|
||||||
|
|
||||||
|
The **certificate revocation list** (CRL) contains information about itself and revoked certificates.
|
||||||
|
|
||||||
|
- For each revoked certificate, the serial number and the revocation date are recorded.
|
||||||
|
- Also contains next update date.
|
||||||
|
- This list should be publicly available, so that anyone can check if the certificate is revoked.
|
||||||
|
- The verifier will look at the CRL distribution URL in the certificate and receive the CRLs.
|
||||||
|
|
||||||
|
CRL checking is done in the following way.
|
||||||
|
|
||||||
|
> 1. A client connects to a website and receives the certificate of the server.
|
||||||
|
> 2. The client queries the certificate revocation server and downloads CRLs.
|
||||||
|
> 3. The client checks whether the certificate is revoked or not.
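
A minimal sketch of the client-side check follows. The dictionary layout of `crl`, the serial numbers, and the function name `is_revoked` are assumptions for this example. A real CRL is a signed ASN.1 structure whose signature must be verified first.

```python
from datetime import datetime, timedelta, timezone

now = datetime(2023, 10, 16, tzinfo=timezone.utc)

# Hypothetical, already-downloaded CRL: metadata plus serial -> revocation date.
crl = {
    "issuer": "Issuing CA",
    "this_update": now - timedelta(days=1),
    "next_update": now + timedelta(days=6),
    "revoked": {0x1A2B: now - timedelta(days=30)},
}

def is_revoked(serial: int, crl: dict, at: datetime) -> bool:
    """Look up a serial number, refusing to use a CRL past its next update date."""
    if at >= crl["next_update"]:
        raise ValueError("CRL is stale; download a fresh one")
    return serial in crl["revoked"]
```

The stale-CRL check reflects the update period: a certificate revoked after `this_update` is invisible to the client until it fetches the next list.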
|
||||||
|
|
||||||
|
But distributing CRLs in real time is not possible. Furthermore, CRL lifecycles and update periods vary between CAs, so attacks are possible between CRL updates. Also, CRLs keep growing over time, so they get harder to download and manage.
|
||||||
|
|
||||||
|
### Online Certificate Status Protocol (OCSP)
|
||||||
|
|
||||||
|
The **online certificate status protocol** (OCSP) is another way to handle certificate revocation. Basically, the client queries an OCSP server for revocation information.
|
||||||
|
|
||||||
|
There is an **OCSP server** that runs 24/7, responding to queries. This server can be run by the CAs or may be delegated to some other entities. The address of the OCSP server is specified in the certificate.
|
||||||
|
|
||||||
|
Using OCSP, revocation check is done in the following way.
|
||||||
|
|
||||||
|
> 1. A client connects to a website and receives the certificate of the server.
|
||||||
|
> 2. Using the OCSP server address in the certificate, the client queries the OCSP server with the serial number of the certificate.
|
||||||
|
> 3. The server checks the database, and sends a signed response containing revocation information.
|
||||||
|
|
||||||
|
This method has a privacy problem. The client queries with the serial number, so the OCSP server can track which websites the client has visited. Also, if the OCSP server is unavailable or overloaded, the response may be missing or slow. In that case, browsers soft-fail and assume that the certificate has no problem.[^3]
|
||||||
|
|
||||||
|
#### OCSP Stapling
|
||||||
|
|
||||||
|
The privacy issue can be solved using **OCSP stapling**. When the client connects to the web server, the *web server queries the OCSP server* and gives the response to the client. It staples the OCSP response with the certificate, hence the name.
|
||||||
|
|
||||||
|
Thus the client does not have to query the OCSP server about where it is visiting.
|
||||||
|
|
||||||
|
## Problems with PKI
|
||||||
|
|
||||||
|
- If a CA is compromised, it can issue a certificate for any name.
|
||||||
|
- Or CA may not check every detail but still issue it.
|
||||||
|
- Fraudulent certificates look perfectly valid.
|
||||||
|
- PKI is only as secure as the weakest CA.
|
||||||
|
- As revoked certificates increase, it is hard to manage them.[^4]
|
||||||
|
- Certificate verification depends on the implementations of the browser.
|
||||||
|
- Users often ignore certificate warnings.
|
||||||
|
|
||||||
|
### Solving PKI Issues
|
||||||
|
|
||||||
|
We use different kinds of certificates. Stronger validations are done by the CA.
|
||||||
|
|
||||||
|
- **Domain Validation** (DV) certificate
|
||||||
|
- CA issues this certificate to anyone listed in the contact in the public record associated with a domain name.
|
||||||
|
- CA exchanges confirmation emails with an address listed in the domain's WHOIS record.
|
||||||
|
- **Organization Validation** (OV) certificate
|
||||||
|
- CA carefully examines the organization or the individual.
|
||||||
|
- **Extended Validation** (EV) certificate
|
||||||
|
- Most rigorous identity check is done on the organization or individual.
|
||||||
|
- Online finance companies use this.
|
||||||
|
|
||||||
|
But if a CA is compromised or the private key of the CA is leaked, certificates may be fake. We need more evidence that a given certificate is valid. Thus, we add other independent sources that can be used to validate the certificate.
|
||||||
|
|
||||||
|
The answer to this is **certificate transparency**. When a certificate is issued, it is recorded in a public log, and the logs are monitored for suspicious certificates. The issuance of certificates thus becomes transparent to the public. Read more from [certificate transparency (Wikipedia)](https://en.wikipedia.org/wiki/Certificate_Transparency).
|
||||||
|
|
||||||
|
[^1]: Image from [Wikipedia](https://en.wikipedia.org/wiki/File:Chain_Of_Trust.svg).
|
||||||
|
[^2]: Can someone pretend to be a root CA by creating a fake certificate of a root CA?
|
||||||
|
[^3]: Is this okay?
|
||||||
|
[^4]: Is there a reason to keep the list of revoked certificates? The CA can just return *invalid* on non-existing serial numbers...?
|
||||||
333
_posts/Lecture Notes/Internet Security/2023-10-18-tls.md
Normal file
@@ -0,0 +1,333 @@
|
|||||||
|
---
|
||||||
|
share: true
|
||||||
|
toc: true
|
||||||
|
math: true
|
||||||
|
categories:
|
||||||
|
- Lecture Notes
|
||||||
|
- Internet Security
|
||||||
|
tags:
|
||||||
|
- lecture-note
|
||||||
|
- security
|
||||||
|
title: 09. Transport Layer Security
|
||||||
|
date: 2023-10-18
|
||||||
|
github_title: 2023-10-18-tls
|
||||||
|
image:
|
||||||
|
path: /assets/img/posts/Lecture Notes/Internet Security/is-09-tls-handshake.png
|
||||||
|
attachment:
|
||||||
|
folder: assets/img/posts/Lecture Notes/Internet Security
|
||||||
|
---
|
||||||
|
|
||||||
|
This is a brief comparison of HTTP and HTTPS.
|
||||||
|
|
||||||
|
- HTTP: HyperText Transfer Protocol
|
||||||
|
- HTTPS: HyperText Transfer Protocol **Secure**
|
||||||
|
- Uses certificates, encryption, TLS.
|
||||||
|
- Used for privacy.
|
||||||
|
|
||||||
|
## TLS Overview
|
||||||
|
|
||||||
|
**Transport Layer Security** (TLS) **protocol** provides privacy (confidentiality) and data integrity between two communicating parties.
|
||||||
|
|
||||||
|
- TLS is built on top of TCP/IP.
|
||||||
|
- TLS is based on **secure socket layer** (SSL), but now SSL is deprecated.
|
||||||
|
|
||||||
|
You can check if TLS is used on your browser. The address should begin with `https` and there should be a green lock icon.
|
||||||
|
|
||||||
|
### TLS History
|
||||||
|
|
||||||
|
- 1994: SSL 1.0
|
||||||
|
- Netscape (browser company) used it internally for their browser.
|
||||||
|
- Not publicly released.
|
||||||
|
- 1995: SSL 2.0
|
||||||
|
- Published by Netscape.
|
||||||
|
- There were several weaknesses.
|
||||||
|
- 1996: SSL 3.0
|
||||||
|
- Designed by Netscape and Paul Kocher.
|
||||||
|
- 1999: TLS 1.0
|
||||||
|
- IETF makes RFC 2246 based on SSL 3.0.
|
||||||
|
- Not inter-operable with SSL 3.0
|
||||||
|
- TLS uses HMAC instead of MAC.
|
||||||
|
- TLS can run on any port.
|
||||||
|
- 2006: TLS 1.1
|
||||||
|
- RFC 4346
|
||||||
|
- Protection against CBC padding attacks.
|
||||||
|
- 2008: TLS 1.2
|
||||||
|
- RFC 5246
|
||||||
|
- More options in cipher suites like SHA256, AES.
|
||||||
|
- 2018: TLS 1.3
|
||||||
|
- RFC 8446
|
||||||
|
- Insecure ciphers such as RC4, DES were removed.
|
||||||
|
- Streamline RTT handshakes. (0-RTT mode)
|
||||||
|
|
||||||
|
## CBC Padding Oracle Attack
|
||||||
|
|
||||||
|
Recall [CBC Mode (Internet Security)](../2023-09-18-symmetric-key-cryptography-2#cipher-block-chaining-mode-cbc).
|
||||||
|
|
||||||
|
Suppose that each block has $8$ bytes. If the message size is not a multiple of the block size, we pad the message. If we need to pad $b$ bytes, we append $b$ bytes, each with value $b$.
|
||||||
|
|
||||||
|
If the padding is not valid, the decryption algorithm outputs a *padding error* during the decryption process. The attacker can observe if a padding error has occurred, and use this information to recover the plaintext.
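
The padding rule and the padding error can be sketched as follows. This assumes PKCS#7-style padding, where a message that is already a multiple of the block size gets a full extra block of padding. The names `pad` and `unpad` are chosen for this example.

```python
BLOCK = 8  # block size in bytes, as assumed in the text

def pad(message: bytes) -> bytes:
    """Append b bytes of value b so the length becomes a multiple of BLOCK."""
    b = BLOCK - len(message) % BLOCK  # 1..BLOCK; a full block if already aligned
    return message + bytes([b]) * b

def unpad(padded: bytes) -> bytes:
    """Strip the padding, raising an error when it is malformed."""
    if not padded or len(padded) % BLOCK != 0:
        raise ValueError("padding error")
    b = padded[-1]
    if not 1 <= b <= BLOCK or padded[-b:] != bytes([b]) * b:
        raise ValueError("padding error")
    return padded[:-b]
```

The `ValueError` raised by `unpad` is exactly the signal an attacker tries to observe: it reveals whether the decrypted last bytes form valid padding.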
|
||||||
|
|
||||||
|
To defend against this attack, we can use [encrypt-then-MAC (Modern Cryptography)](../../modern-cryptography/2023-09-26-cca-security-authenticated-encryption#encrypt-then-mac-etm), or hide the padding error.
|
||||||
|
|
||||||
|
### Attack in Detail
|
||||||
|
|
||||||
|
We will perform a **chosen ciphertext attack** to fully recover the plaintext.
|
||||||
|
|
||||||
|
Suppose that we obtain a ciphertext $(\mathrm{IV}, c_1, c_2)$, which is an encryption of two blocks $m = m_0 \parallel m_1$, including the padding. By the CBC encryption algorithm we know that
|
||||||
|
|
||||||
|
$$
|
||||||
|
c_1 = E_k(m_0 \oplus \mathrm{IV}), \qquad c_2 = E_k(m_1 \oplus c_1).
|
||||||
|
$$
|
||||||
|
|
||||||
|
We don't know exactly how many padding bytes there were, but it doesn't matter. We brute force by **changing the last byte of $c_1$** and requesting the decryption of the modified ciphertext $(\mathrm{IV}, c_1', c_2)$.
|
||||||
|
|
||||||
|
The decryption process of the last block is $c_1 \oplus D_k(c_2)$, so by changing the last byte of $c_1$, we hope to get a decryption result that ends with $\texttt{0x01}$. Then the last byte $\texttt{0x01}$ will be treated as a padding and padding errors will not occur. So we keep trying until we don't get a padding error.
|
||||||
|
|
||||||
|
Now, suppose that we successfully changed the last byte of $c_1$ to $b$, so that the last byte of $(c_1[0\dots6] \parallel b) \oplus D_k(c_2)$ is $\texttt{0x01}$. Next, we replace the last byte with $b \oplus \texttt{0x01} \oplus \texttt{0x02}$, so that the last byte of the decryption result becomes $\texttt{0x02}$. Then we change the second-to-last byte $c_1[6]$, request the decryption, and hope to get an output that ends with $\texttt{0x0202}$. The last two bytes will then be treated as padding and we won't get a padding error.
|
||||||
|
|
||||||
|
We repeat the above process until we get a modified ciphertext $(\mathrm{IV}, c_1', c_2)$ whose decryption result ends with $8$ bytes of $\texttt{0x08}$. Now we know that
|
||||||
|
|
||||||
|
$$
|
||||||
|
c_1' \oplus D_k(c_2) = \texttt{0x08}^8.
|
||||||
|
$$
|
||||||
|
|
||||||
|
Then we can recover $D_k(c_2) = c_1' \oplus \texttt{0x08}^8$, and then since $m_1 = c_1 \oplus D_k(c_2)$,
|
||||||
|
|
||||||
|
$$
|
||||||
|
m_1 = c_1 \oplus D_k(c_2) = c_1 \oplus c_1' \oplus \texttt{0x08}^8,
|
||||||
|
$$
|
||||||
|
|
||||||
|
allowing us to recover the whole message $m_1$.
|
||||||
|
|
||||||
|
Now to recover $m_0$, we modify the $\mathrm{IV}$ using the same method as above. This time, we do not use $c_2$ and request a decryption of $(\mathrm{IV}', c_1)$ only. If some $\mathrm{IV}'$ gives a decryption result that ends with $8$ bytes of $\texttt{0x08}$, we have that
|
||||||
|
|
||||||
|
$$
|
||||||
|
\mathrm{IV}' \oplus D_k(c_1) = \texttt{0x08}^8.
|
||||||
|
$$
|
||||||
|
|
||||||
|
Similarly, we recover $m_0$ by
|
||||||
|
|
||||||
|
$$
|
||||||
|
m_0 = \mathrm{IV} \oplus D_k(c_1) = \mathrm{IV} \oplus \mathrm{IV}' \oplus \texttt{0x08}^8.
|
||||||
|
$$
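
The whole attack can be simulated end to end. The sketch below makes several assumptions for illustration: the block cipher `E`/`D` is a toy 4-round Feistel network built from SHA-256 (a stand-in for a real cipher), the oracle is a local function instead of a remote server, and the forged block starts from all zeros instead of from $c_1$, which recovers the same value $D_k(c_2)$.

```python
import hashlib
import os

BLOCK = 8

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _f(key: bytes, i: int, half: bytes) -> bytes:
    """Round function of the toy Feistel cipher (assumed, not a real cipher)."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:4]

def E(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in range(4):
        left, right = right, xor(left, _f(key, i, right))
    return left + right

def D(key: bytes, block: bytes) -> bytes:
    left, right = block[:4], block[4:]
    for i in reversed(range(4)):
        left, right = xor(right, _f(key, i, left)), left
    return left + right

def cbc_encrypt(key: bytes, iv: bytes, padded: bytes) -> list:
    prev, blocks = iv, []
    for i in range(0, len(padded), BLOCK):
        prev = E(key, xor(padded[i:i + BLOCK], prev))
        blocks.append(prev)
    return blocks

def padding_oracle(key: bytes, prev: bytes, block: bytes) -> bool:
    """True iff prev XOR D_k(block) ends with valid padding. Leaks nothing else."""
    p = xor(prev, D(key, block))
    b = p[-1]
    return 1 <= b <= BLOCK and p[-b:] == bytes([b]) * b

def recover_intermediate(oracle, block: bytes) -> bytes:
    """Recover D_k(block) byte by byte, from the last byte to the first."""
    inter = bytearray(BLOCK)
    for pad_len in range(1, BLOCK + 1):
        pos = BLOCK - pad_len
        forged = bytearray(BLOCK)
        for j in range(pos + 1, BLOCK):        # known bytes must decrypt to pad_len
            forged[j] = inter[j] ^ pad_len
        for guess in range(256):
            forged[pos] = guess
            if not oracle(bytes(forged), block):
                continue
            if pad_len == 1:                   # rule out endings like 0x02 0x02
                forged[pos - 1] ^= 0xFF
                still_valid = oracle(bytes(forged), block)
                forged[pos - 1] ^= 0xFF
                if not still_valid:
                    continue
            inter[pos] = guess ^ pad_len
            break
    return bytes(inter)

key, iv = os.urandom(16), os.urandom(BLOCK)
secret = b"attack at da" + bytes([4]) * 4      # 12 message bytes + 4 padding bytes
c1, c2 = cbc_encrypt(key, iv, secret)
oracle = lambda prev, block: padding_oracle(key, prev, block)  # attacker never sees key

m1 = xor(c1, recover_intermediate(oracle, c2))  # m_1 = c_1 XOR D_k(c_2)
m0 = xor(iv, recover_intermediate(oracle, c1))  # m_0 = IV XOR D_k(c_1)
recovered = m0 + m1
```

The attacker needs at most $256$ oracle queries per byte (plus one extra check for the last byte), so a block is recovered in roughly $8 \cdot 256$ queries instead of $256^8$ guesses.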
|
||||||
|
|
||||||
|
## Hashed MAC (HMAC)
|
||||||
|
|
||||||
|
Let $H$ be a hash function. We defined a MAC as $H(k \parallel m)$ where $k$ is a key and $m$ is a message. This MAC is insecure if $H$ uses the [Merkle-Damgård construction](../../modern-cryptography/2023-09-28-hash-functions#merkle-damg%C3%A5rd-transform), since it is vulnerable to length extension attacks. See [prepending the key in MAC is insecure (Modern Cryptography)](../../modern-cryptography/2023-09-28-hash-functions#prepending-the-key).
|
||||||
|
|
||||||
|
Choose a key $k \leftarrow \mathcal{K}$, and set
|
||||||
|
|
||||||
|
$$
|
||||||
|
k_1 = k \oplus \texttt{ipad}, \quad k_2 = k\oplus \texttt{opad}
|
||||||
|
$$
|
||||||
|
|
||||||
|
where $\texttt{ipad} = \texttt{0x363636}...$ and $\texttt{opad} = \texttt{0x5C5C5C}...$. Then
|
||||||
|
|
||||||
|
$$
|
||||||
|
\mathrm{HMAC}(k, m) = H(k_2 \parallel H(k_1 \parallel m)).
|
||||||
|
$$
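
The construction can be written out directly and compared against Python's `hmac` module. This sketch assumes $H$ is SHA-256, whose block size is 64 bytes. The step that hashes or zero-pads the key to one block comes from the HMAC specification (RFC 2104) and is not spelled out in the formula above.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # block size of SHA-256 in bytes

def hmac_sha256(key: bytes, message: bytes) -> bytes:
    """HMAC(k, m) = H(k2 || H(k1 || m)) with k1 = k XOR ipad, k2 = k XOR opad."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha256(key).digest()   # long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")     # then zero-padded to one block
    k1 = bytes(b ^ 0x36 for b in key)        # k XOR ipad
    k2 = bytes(b ^ 0x5C for b in key)        # k XOR opad
    inner = hashlib.sha256(k1 + message).digest()
    return hashlib.sha256(k2 + inner).digest()

tag = hmac_sha256(b"key", b"message")
reference = hmac.new(b"key", b"message", hashlib.sha256).digest()
```

The outer hash is what blocks the length extension attack: an attacker who extends the inner hash still cannot compute the outer one without the key.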
|
||||||
|
|
||||||
|
## TLS Details
|
||||||
|
|
||||||
|
- TLS consists of two main protocols (4 in total)
|
||||||
|
- **Handshake protocol** is the most important.
|
||||||
|
- Uses public key cryptography to share a secret key between the client and the server.
|
||||||
|
- Record protocol
|
||||||
|
- Use the shared key from the handshake to protect communication.
|
||||||
|
|
||||||
|
### TLS Handshake Protocol
|
||||||
|
|
||||||
|
Here's how the client and the server establish a connection using the TLS handshake protocol. (TLS 1.2, RFC 5246)
|
||||||
|
|
||||||
|
> 1. Two parties agree on the following.
|
||||||
|
> - The version of the protocol.
|
||||||
|
> - The set of cipher suites (cryptographic algorithms) to be used.
|
||||||
|
> 2. The client uses digital certificates to authenticate the server.
|
||||||
|
> 3. Use the server's public key to share a secret.
|
||||||
|
> 4. Both parties generate a symmetric key from the shared secret.
|
||||||
|
|
||||||
|
[^1]
|
||||||
|
|
||||||
|
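- `ServerKeyExchange` is optional. It is sent only when the certificate does not carry enough information for the key exchange, for example when ephemeral Diffie-Hellman is used. `ClientKeyExchange` is always sent, but its contents depend on the key exchange method.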
|
||||||
|
- The actual messages and process differ for each protocol and ciphers used.
|
||||||
|
- **All messages after `ChangeCipherSpec` are encrypted.**
|
||||||
|
|
||||||
|
#### ClientHello
|
||||||
|
|
||||||
|
- Client sends the TLS protocol version and cipher suites that it supports.
|
||||||
|
- The version is the highest version supported by the client.
|
||||||
|
- A random number $N_c$ for generating the secret is sent.
|
||||||
|
- A session ID may be sent if the client wants to resume an old session.
|
||||||
|
|
||||||
|
#### ServerHello
|
||||||
|
|
||||||
|
- Server sends the TLS version and cipher suite to use.
|
||||||
|
- The TLS version will be the highest version supported by both parties.
|
||||||
|
- The server will pick the strongest cryptographic algorithm offered by the client.
|
||||||
|
- The server also sends a random number $N_s$.
|
||||||
|
|
||||||
|
#### Certificate/ServerKeyExchange
|
||||||
|
|
||||||
|
- The server sends its public key certificate.
|
||||||
|
- The actual data depends on the cipher suite used.
|
||||||
|
- For example, it can contain RSA public key, or Diffie-Hellman public key.
|
||||||
|
- The server will send `ServerHelloDone`.
|
||||||
|
- The client will verify the server's certificate.
|
||||||
|
|
||||||
|
#### ClientKeyExchange
|
||||||
|
|
||||||
|
- Client sends *premaster secret* (PMS) $secret_c$.
|
||||||
|
- This is encrypted with server's public key.
|
||||||
|
- This secret key material will be used to generate the secret key.
|
||||||
|
- Both parties derive a shared **session key** from $N_c$, $N_s$, $secret_c$.
|
||||||
|
- If the protocol is correct, the same key should be generated.
|
||||||
|
|
||||||
|
#### Finished
|
||||||
|
|
||||||
|
Both parties now switch to encrypted communication. Both parties exchange a `Finished` message.
|
||||||
|
|
||||||
|
- This message is the first message protected by the previously agreed cipher.
|
||||||
|
- The message contains the **hash** of all sent and received messages during the handshake.
|
||||||
|
- This is to ensure that the key exchange and authentication process were successful and the messages were not tampered with.
|
||||||
|
- The hash must be verified by the other side.
|
||||||
|
|
||||||
|
More information in [Section 7.4.9 (RFC 5246)](https://www.rfc-editor.org/rfc/rfc5246#section-7.4.9).
|
||||||
|
|
||||||
|
### Generating Master and Secret Keys
|
||||||
|
|
||||||
|
This is from the RFC 5246 document. PRF is a pseudorandom function.
|
||||||
|
|
||||||
|
```
|
||||||
|
master_secret = PRF(pre_master_secret, "master secret",
|
||||||
|
ClientHello.random + ServerHello.random)
|
||||||
|
[0..47];
|
||||||
|
|
||||||
|
key_block = PRF(SecurityParameters.master_secret,
|
||||||
|
"key expansion",
|
||||||
|
SecurityParameters.server_random +
|
||||||
|
SecurityParameters.client_random);
|
||||||
|
```
|
||||||
|
|
||||||
|
- Why do we use `pre_master_secret` and `master_secret`?
|
||||||
|
- To provide greater consistency between TLS cipher suites.[^2]
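
The two derivations above can be sketched with the TLS 1.2 PRF, which RFC 5246 builds from HMAC-SHA256 for the default cipher suites. The function names below and the random inputs are assumptions for this example. A real implementation also splits `key_block` into MAC keys, encryption keys, and IVs.

```python
import hashlib
import hmac
import os

def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """P_hash from RFC 5246: expand secret and seed into `length` bytes."""
    out, a = b"", seed                                   # A(0) = seed
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()             # A(i)
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()   # HMAC(secret, A(i) + seed)
    return out[:length]

def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """PRF(secret, label, seed) = P_hash(secret, label + seed)."""
    return p_sha256(secret, label + seed, length)

def derive_master_secret(pre_master: bytes, client_random: bytes, server_random: bytes) -> bytes:
    return prf(pre_master, b"master secret", client_random + server_random, 48)

def derive_key_block(master: bytes, client_random: bytes, server_random: bytes, length: int) -> bytes:
    # Note the swapped order of the randoms compared to the master secret.
    return prf(master, b"key expansion", server_random + client_random, length)

pre_master = os.urandom(48)
n_c, n_s = os.urandom(32), os.urandom(32)
master = derive_master_secret(pre_master, n_c, n_s)
keys = derive_key_block(master, n_c, n_s, 104)
```

Both sides run the same computation on the same inputs, so they end up with the same `master` and `keys` without ever sending them over the network.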
|
||||||
|
|
||||||
|
## Version Rollback Attack (SSL)
|
||||||
|
|
||||||
|
- Client sends SSL version 3.0, but the attacker in the middle modifies it and sends version 2.0.
|
||||||
|
- Server thinks that the client only supports SSL 2.0.
|
||||||
|
- Then the client and the server communicate using SSL 2.0, which may be insecure.
|
||||||
|
|
||||||
|
### Chosen Protocol Attacks
|
||||||
|
|
||||||
|
- Also known as **Downgrade Attacks**.
|
||||||
|
- Dangerous since old versions have security vulnerabilities.
|
||||||
|
- Attackers can perform attacks using the old, broken version.
|
||||||
|
- Weak protocols, weak cryptographic algorithms, etc.
|
||||||
|
- Newer versions (patched) must be *backward-compatible*, since not all people upgrade right away.
|
||||||
|
- Methods to prevent this attack.
|
||||||
|
- Drop backward compatibility.
|
||||||
|
- Authenticate the version number.
|
||||||
|
|
||||||
|
#### Version Checking in SSL 3.0
|
||||||
|
|
||||||
|
- When sending the premaster secret, also send the protocol version.
|
||||||
|
- This data is encrypted with the server's public key, so it cannot be tampered with.
|
||||||
|
- The server will decrypt and check if the protocol version matches the version in the `ClientHello` message.
|
||||||
|
|
||||||
|
## Other TLS/SSL Protocols
|
||||||
|
|
||||||
|
These two protocols run over the record protocol.
|
||||||
|
|
||||||
|
- **Alert protocol**
|
||||||
|
- Handling of sessions, warnings and errors.
|
||||||
|
- **Change cipher spec protocol**
|
||||||
|
- Not a part of handshake protocol.
|
||||||
|
- Indicates that the parties are changing to the agreed cipher suite.
|
||||||
|
|
||||||
|
## Issues in TLS 1.2
|
||||||
|
|
||||||
|
- **Forward secrecy** is not guaranteed. It is only provided when an ephemeral Diffie-Hellman cipher suite is negotiated.
|
||||||
|
- It used weak cryptographic algorithms.
|
||||||
|
- Compression is not as efficient.
|
||||||
|
|
||||||
|
### Forward Secrecy
|
||||||
|
|
||||||
|
> **Forward secrecy** is a feature of key agreement protocols that session keys will not be compromised even if long-term secrets used in the session key exchange are compromised.[^3]
|
||||||
|
|
||||||
|
> An encryption system has the property of **forward secrecy** if plaintext (decrypted) inspection of the data exchange that occurs during key agreement phase of session initiation does not reveal the key that was used to encrypt the remainder of the session.[^3]
|
||||||
|
|
||||||
|
- Forward secrecy prevents an **NSA-style attack**.
|
||||||
|
- Save all TLS traffic starting from TLS handshake.
|
||||||
|
- Obtain the server's private key later with a court order or hacking in.
|
||||||
|
- Decrypt all the stored traffic.
|
||||||
|
- If an ephemeral session key is used for each new message, forward secrecy is provided.
|
||||||
|
- Other messages cannot be decrypted even if a session key is compromised.
|
||||||
|
- From TLS 1.3, fixed by only using Diffie-Hellman ephemeral (`DHE-*`) ciphers.
|
||||||
|
|
||||||
|
### DH-RSA and DHE-RSA
|
||||||
|
|
||||||
|
Actual secret sharing is done using Diffie-Hellman key exchange, and RSA is used for digital signatures.
|
||||||
|
|
||||||
|
#### DH-RSA
|
||||||
|
|
||||||
|
- Server's **permanent** key pair is a Diffie-Hellman key pair.
|
||||||
|
- Certificate has the server's public key $g^s \bmod p$.
|
||||||
|
- Certificate is signed by the CA using RSA.
|
||||||
|
- The client sends $g^c \bmod p$.
|
||||||
|
|
||||||
|
#### DHE-RSA
|
||||||
|
|
||||||
|
- **For each TLS session**, the server generates a random number $s$.
|
||||||
|
- The server's public key is $g^s \bmod p$.
|
||||||
|
- The server signs $g^s \bmod p$ with its long-term RSA key, and the client verifies the signature using the server's certificate.
|
||||||
|
- The client sends $g^c \bmod p$.
|
||||||
|
- The server and client agree on a secret $g^{sc} \bmod p$.
|
||||||
|
- The ephemeral keys $s$ and $g^{sc} \bmod p$ are discarded after the session.
|
||||||
|
|
||||||
|
## TLS 1.3
|
||||||
|
|
||||||
|
- TLS 1.3 does not support RSA key exchange, nor other vulnerable cipher suites.
|
||||||
|
- Shortened TLS handshake, thus faster and more secure.
|
||||||
|
- Reduced RTTs (round trip time).
|
||||||
|
- Achieves forward secrecy.
|
||||||
|
|
||||||
|
### TLS 1.3 Handshake
|
||||||
|
|
||||||
|
We previously had 2 round trips, but now we have one. The main difference is the client hello part.
|
||||||
|
|
||||||
|
- **Client hello**
|
||||||
|
- Protocol version, client random, cipher suites are sent.
|
||||||
|
- **Parameters for calculating the premaster secret are also sent.**[^4]
|
||||||
|
- **Server generates master secret**
|
||||||
|
- Server has client random, parameters and cipher suites.
|
||||||
|
- Using the server random, generate the master secret.
|
||||||
|
- **Server hello** and **Finished**
|
||||||
|
- Server's certificate, digital signature, server random, and chosen cipher suite are sent.
|
||||||
|
- Master secret has been generated, so `Finished` is sent.
|
||||||
|
- **Client Finished**
|
||||||
|
- Client verifies the certificate, generates master secret, sends `Finished`.
|
||||||
|
|
||||||
|
### 0-RTT for Session Resumption
|
||||||
|
|
||||||
|
TLS 1.3 also supports an even faster handshake that doesn't require any round trips.
|
||||||
|
|
||||||
|
- Works only if the user has visited the website before.
|
||||||
|
- Both parties can derive another shared secret from the first session.
|
||||||
|
- **Resumption main secret**, **pre-shared key** (PSK)
|
||||||
|
- The server sends a **session ticket** during the first session.
|
||||||
|
- The client sends this ticket along with the first encrypted message of the new session.
|
||||||
|
|
||||||
|
#### Replay Attacks in 0-RTT
|
||||||
|
|
||||||
|
0-RTT is susceptible to **replay attacks**.
|
||||||
|
|
||||||
|
- Different servers (of the same domain) cannot catch replay attacks.
|
||||||
|
- Initial data in TLS resumption should be handled carefully.
|
||||||
|
- For example, only allow methods that do not change state (like HTTP GET) without any parameters.
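
The rule in the example above can be written as a small server-side check. This is a minimal sketch under assumptions: `allow_in_early_data` and `SAFE_METHODS` are hypothetical names, and the policy (only `GET` requests with an empty query string) is just the one mentioned in the list, not a complete defense.

```python
from urllib.parse import urlsplit

# Assumption: only GET is treated as safe to process from 0-RTT data.
SAFE_METHODS = {"GET"}


def allow_in_early_data(method: str, target: str) -> bool:
    """Return True if a request may be processed from 0-RTT early data.

    A replayed request is harmless only if handling it twice changes nothing,
    so state-changing methods and parameterized requests are rejected.
    """
    has_parameters = urlsplit(target).query != ""
    return method.upper() in SAFE_METHODS and not has_parameters
```

A request that fails this check is not necessarily dropped; a server could instead defer it until the full handshake has completed.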
|
||||||
|
|
||||||
|
Read more in [Introducing 0-RTT (Cloudflare Blog)](https://blog.cloudflare.com/introducing-0-rtt/).
|
||||||
|
|
||||||
|
[^1]: Source: [The SSL Store](https://www.thesslstore.com/blog/explaining-ssl-handshake/).
|
||||||
|
[^2]: Source: [Cryptography SE](https://crypto.stackexchange.com/questions/24780/what-is-the-purpose-of-pre-master-secret-in-ssl-tls).
|
||||||
|
[^3]: Source: [Forward secrecy (Wikipedia)](https://en.wikipedia.org/wiki/Forward_secrecy).
|
||||||
|
[^4]: The client assumes that it knows the server's preferred key exchange method. This guess is feasible because many insecure cipher suites have been removed, so far fewer options remain.
|
||||||
@@ -171,6 +171,8 @@ Since the adversary can see the ciphertext, this kind of relation leaks some inf
|
|||||||
|
|
||||||
Also, the key is (at least) as long as the message. This is why OTP is rarely used today. When sending a long message, two parties must communicate a very long key that is as long as the message, *every single time*! This makes it hard to manage the key.
|
||||||
|
|
||||||
|
## Shannon's Theorem
|
||||||
|
|
||||||
So is there a way to reduce the key size without losing perfect secrecy? Sadly, no. In fact, the key space must be at least as large as the message space. This is a requirement for perfectly secret schemes.
|
||||||
|
|
||||||
> **Theorem**. If $(G, E, D)$ is a perfectly secret encryption scheme, then $\lvert \mathcal{K} \rvert \geq \lvert \mathcal{M} \rvert$.
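
To see why a small key space breaks perfect secrecy, consider a toy cipher with $\lvert \mathcal{K} \rvert = 2$ and $\lvert \mathcal{M} \rvert = 4$. The cipher below is a hypothetical example made up for illustration (a 1-bit key XORed onto both bits of a 2-bit message). Given any ciphertext, at most $\lvert \mathcal{K} \rvert$ messages can decrypt from it, so the adversary can rule out the rest.

```python
KEYS = [0, 1]                          # |K| = 2
MESSAGES = [0b00, 0b01, 0b10, 0b11]    # |M| = 4


def encrypt(k: int, m: int) -> int:
    """Toy cipher: XOR the 1-bit key onto both bits of the message."""
    return m ^ (0b11 * k)


def decrypt(k: int, c: int) -> int:
    """XOR is its own inverse, so decryption mirrors encryption."""
    return c ^ (0b11 * k)


def possible_messages(c: int) -> set:
    """All messages that could have produced ciphertext c under some key."""
    return {decrypt(k, c) for k in KEYS}


c = encrypt(1, 0b01)
candidates = possible_messages(c)
ruled_out = set(MESSAGES) - candidates
```

Here `candidates` contains only two of the four messages, so observing `c` tells the adversary that the messages in `ruled_out` were definitely not sent. A perfectly secret scheme cannot leak this.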
|
||||||
@@ -290,7 +292,7 @@ We can deduce that if a PRG is predictable, then it is insecure.
|
|||||||
|
|
||||||
*Proof*. Let $\mathcal{A}$ be an efficient adversary (next bit predictor) that predicts $G$. Suppose that $i$ is the index chosen by $\mathcal{A}$. With $\mathcal{A}$, we construct a statistical test $\mathcal{B}$ such that $\mathrm{Adv}_\mathrm{PRG}[\mathcal{B}, G]$ is non-negligible.
|
||||||
|
|
||||||

|
||||||
|
|
||||||
1. The challenger PRG will send a bit string $x$ to $\mathcal{B}$.
|
||||||
- In experiment $0$, PRG gives pseudorandom string $G(k)$.
|
||||||
@@ -316,7 +318,7 @@ The theorem implies that if next bit predictors cannot distinguish $G$ from true
|
|||||||
|
|
||||||
To motivate the definition of semantic security, we consider a **security game framework** (attack game) between a **challenger** (e.g. the creator of some cryptographic scheme) and an **adversary** $\mathcal{A}$ (e.g. an attacker of the scheme).
|
||||||
|
|
||||||

|
||||||
|
|
||||||
> **Definition.** Let $\mathcal{E} = (G, E, D)$ be a cipher defined over $(\mathcal{K}, \mathcal{M}, \mathcal{C})$. For a given adversary $\mathcal{A}$, we define two experiments $0$ and $1$. For $b \in \lbrace 0, 1 \rbrace$, define experiment $b$ as follows:
|
||||||
>
|
||||||
|
|||||||
Binary file not shown.
|
After Width: | Height: | Size: 270 KiB |
Binary file not shown.
|
After Width: | Height: | Size: 47 KiB |