Compare commits

...

11 Commits

Author SHA1 Message Date
13755ba204 PUSH NOTE : 07. Public Key Cryptography.md (#124) 2023-10-28 00:26:23 +09:00
dd3e21344e [PUBLISHER] upload files #123 2023-10-28 00:25:10 +09:00
0d6ba31ba6 [PUBLISHER] upload files #122 2023-10-27 21:14:00 +09:00
0f1120b2f6 [PUBLISHER] upload files #121 2023-10-27 21:11:58 +09:00
871ca66457 [PUBLISHER] upload files #120 2023-10-27 21:09:42 +09:00
33846b79a1 [PUBLISHER] upload files #119 2023-10-27 21:06:55 +09:00
66f0e0a50e [PUBLISHER] upload files #118 2023-10-27 21:05:43 +09:00
922844d638 [PUBLISHER] upload files #117
* PUSH NOTE : 01. Security Introduction.md

* PUSH ATTACHMENT : is-01-cryptosystem.png
2023-10-27 14:37:23 +09:00
47872c6bef [PUBLISHER] upload files #116
* PUSH NOTE : 03. Symmetric Key Cryptography (2).md

* PUSH ATTACHMENT : is-03-feistel-function.png

* PUSH ATTACHMENT : is-03-ecb-encryption.png

* PUSH ATTACHMENT : is-03-cbc-encryption.png

* PUSH ATTACHMENT : is-03-cfb-encryption.png

* PUSH ATTACHMENT : is-03-ofb-encryption.png

* PUSH ATTACHMENT : is-03-ctr-encryption.png
2023-10-27 14:36:55 +09:00
4d68a99404 [PUBLISHER] upload files #115
* PUSH NOTE : 1. OTP, Stream Ciphers and PRGs.md

* PUSH ATTACHMENT : mc-01-prg-game.png

* PUSH ATTACHMENT : mc-01-ss.png
2023-10-27 11:17:24 +09:00
da098b4126 [PUBLISHER] upload files #114 2023-10-27 11:06:42 +09:00
7 changed files with 720 additions and 52 deletions


@@ -36,7 +36,7 @@ attachment:
In this course, we are mainly interested in system/network security!
There are two categories in **IT Security**, (though the boundary is blurry)
- **Computer** (system) **security** uses automated tools and mechanisms to protect **data in a computer**, against hackers, malware, etc.
- **Computer** (system) **security** uses automated tools and mechanisms to protect the **data in a computer**, against hackers, malware, etc.
- **Internet** (network) **security** prevents, detects, and corrects security violations that involve the **transmission of information** in a network.
In internet security, we assume that:
@@ -52,7 +52,7 @@ In internet security, we assume that:
- inserting, modifying, deleting, replaying messages
- poisoning data
- impersonate and pretend to be someone else
- Conventionally, we use the terms:
- Conventionally, we use the following names:
- Alice and Bob for the two parties participating in the communication.
- Eve (or Mallory, Oscar) for the adversary.
@@ -94,9 +94,9 @@ This is only an overview, so the attacks are introduced briefly.
There are two types of security attacks:
- **Active attacks**: modify the content of messages
- Ex. (D)DoS, MITM, poisoning, smurf attack, system attacks.
- *Prevention* is important since the active attacks are a danger to *data integrity* and *availability*.
- *Prevention* is important since the active attacks concern *data integrity* and *availability*.
- **Passive attacks**: do not modify information, but observe the content or copy it.
- Ex. eavesdropping, port scanning (idle scan secretly scanns).
- Ex. eavesdropping, port scanning (idle scan secretly scans).
- *Detection* is important since passive attacks are a danger to *confidentiality*.
## Security Services and Mechanisms
@@ -112,7 +112,7 @@ What kind of security services do we want? The basic network security services m
Additionally, we also need:
- **Authentication**: a way to authenticate users (ID, passwords)
- **Non-repudiation**: ensure that no party can deny that it sent or received a message or approved some information
- Assurance that someone cannot deny the validity of something
- Assurance that someone cannot deny the validity of a message or information
### Attacks Against CIA Triad
@@ -142,10 +142,10 @@ There are many ways of achieving security.
- It may be desirable to not leak *any* information, so one might add padding to the traffic, so the traffic is indistinguishable by the adversary (prevents side-channel attacks)
- **Digital signatures**: provides authenticity of digital messages or documents
- **Trusted Third Party** (TTP): a safe third-party that we can trust
- If we have a TTP, a lot of problems go away. We can always ask the TTP for the truth
- But TTP can become a *single point of failure* (SPOF), and security architectures may become too dependent on the TTP
- If we have a TTP, a lot of problems go away. We can always ask the TTP for the truth.
- But TTP can become a *single point of failure* (SPOF), and security architectures may become too dependent on the TTP.
- **Append-only server**: keeps track of all modifications, good for auditing
- Blockchain is a kind of append-only data structure
- Blockchain is a kind of append-only data structure.
## Cryptography
@@ -155,7 +155,7 @@ There are many ways of achieving security.
### Basics of a Cryptosystem
![is-01-cryptosystem.png](../../../assets/img/posts/Lecture%20Notes/Internet%20Security/is-01-cryptosystem.png)
![is-01-cryptosystem.png](../../../assets/img/posts/Lecture%20Notes/Internet%20Security/is-01-cryptosystem.png#)
- A **message** in *plaintext* is given to an **encryption algorithm**.
- The encryption algorithm uses an **encryption key** to create a *ciphertext*.
@@ -168,7 +168,7 @@ There are many ways of achieving security.
There are two criteria for classifying cryptosystems.
- How are the keys used?
- **Symmetric** cryptography uses a single key for both encryption and decryption
- **Symmetric** cryptography uses a single key for both encryption and decryption.
- **Public key** cryptography uses different keys for encryption and decryption, respectively.
- How are plaintexts processed?
- **Block cipher**
@@ -232,7 +232,7 @@ In a smartphone, assets (things of value) would be
For example,
|Attacker|Abilities|Goals|
|-|-|-|
|:-:|-|-|
|Thief|Steal the phone|Take the device|
|FBI|Lot of things...|Obtain evidence from the device|
|Eavesdropper|Observe network traffic|Steal information|


@@ -24,7 +24,7 @@ github_title: 2023-09-11-symmetric-key-cryptography-1
- A strong encryption algorithm, which is known to the public.
- Kerckhoff's principle!
- A secret key known only to sender and receiver.
- We assume the **existence of a secure channel for distributing the key**.
- We assume the **existence of a secure channel for distributing the key**.[^1]
- **Correctness requirement**
- Let $m$, $k$ denote the message and the key.
- For encryption/decryption algorithm $E$ and $D$,
@@ -32,7 +32,7 @@ github_title: 2023-09-11-symmetric-key-cryptography-1
## Cryptographic Attacks
In increasing order of increasing power of the attacker,
In increasing order of the power of the attacker,
- **Ciphertext only attacks**: the attacker has ciphertexts, and tries to obtain information.
- **Known plaintext attack**: the attacker has a collection of plaintext/ciphertext pairs.
@@ -44,8 +44,10 @@ In increasing order of increasing power of the attacker,
The following two properties should hold for a secure cipher.
- **Diffusion** hides the relationship between the ciphertext and the plaintext.
- It should be hard to obtain the plaintext from the ciphertext.
- Changing a single bit of the plaintext affects several bits of the ciphertext, and vice versa.
- **Confusion** hides the relationship between the ciphertext and the key.
- It should be hard to obtain the key from the ciphertext.
- Each bit of the ciphertext should depend on several parts of the key.
## Primitives
@@ -66,8 +68,9 @@ In **substitution cipher**, encryption is done by replacing units of plaintext w
- In Caesar cipher, $a = 1$ and $b = 3$.
- Encryption: $E(x) = ax + b \pmod m$.
- Decryption: $D(x) = a^{-1}(x - b) \pmod m$.
- There are $12$ possible values for $a$, and $26$ possible values for $b$.
- If we use the $26$-letter alphabet ($m = 26$), there are $12$ possible values for $a$, and $26$ possible values for $b$.
- $a^{-1}$ does not exist for every choice of $a$ and $m$.
- We need that $\gcd(a, m) = 1$. The number of possible $a$ values is $\phi(m)$.
- This scheme is not secure either, since we can try all possibilities and check if the message makes sense.
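The encryption and decryption formulas above can be sketched in C as follows. This is a minimal illustration, assuming uppercase ASCII letters, $m = 26$, $1 \leq a < 26$ and $0 \leq b < 26$; the function names `affine_encrypt`, `affine_decrypt` and the helpers are hypothetical and not part of the notes.

```c
#include <stddef.h>

/* Greatest common divisor, used to check that a is invertible modulo 26. */
static int gcd(int a, int b) {
    while (b != 0) {
        int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Brute-force search for a^{-1} mod m. Returns -1 if no inverse exists. */
static int inverse_mod(int a, int m) {
    for (int x = 1; x < m; ++x) {
        if ((a * x) % m == 1) {
            return x;
        }
    }
    return -1;
}

/* Applies E(x) = ax + b mod 26 to every uppercase letter, in place.
 * Returns 0 on success, -1 if gcd(a, 26) != 1 (text is left unchanged). */
int affine_encrypt(char *text, int a, int b) {
    if (gcd(a, 26) != 1) {
        return -1;
    }
    for (size_t i = 0; text[i] != '\0'; ++i) {
        if (text[i] >= 'A' && text[i] <= 'Z') {
            int x = text[i] - 'A';
            text[i] = (char)('A' + (a * x + b) % 26);
        }
    }
    return 0;
}

/* Applies D(y) = a^{-1}(y - b) mod 26 to every uppercase letter, in place.
 * Returns 0 on success, -1 if a has no inverse modulo 26. */
int affine_decrypt(char *text, int a, int b) {
    int a_inv = inverse_mod(a, 26);
    if (a_inv < 0) {
        return -1;
    }
    for (size_t i = 0; text[i] != '\0'; ++i) {
        if (text[i] >= 'A' && text[i] <= 'Z') {
            int y = text[i] - 'A';
            /* Add 26 so the value stays non-negative before reducing. */
            text[i] = (char)('A' + (a_inv * ((y - b) % 26 + 26)) % 26);
        }
    }
    return 0;
}
```

With $a = 5$ and $b = 8$, the text `HELLO` encrypts to `RCLLA`, and decrypting with the same parameters recovers `HELLO`. Choosing $a = 13$ is rejected because $\gcd(13, 26) \neq 1$.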
#### Monoalphabetic Substitution Cipher
@@ -79,17 +82,17 @@ In **substitution cipher**, encryption is done by replacing units of plaintext w
- Decryption is done by replacing each letter $x$ by $\pi^{-1}(x)$.
- This scheme is still not secure, since we can try all possibilities on a *modern* computer.
To attack this scheme, we use frequency analysis. Calculate the frequency of each letter and compare it with the actual distribution of English letters. Also, we could use bigrams (2-letters)
To attack this scheme, we use frequency analysis. Calculate the frequency of each letter and compare it with the actual distribution of English letters. We could also use *bigrams* (2-letters) for calculating the frequency.
#### Vigenère Cipher
- A polyalphabetic substitution
- Given a key length $m$, take key $k = (k_1, k_2, \dots, k_m)$.
- For the $i$-th letter $x$, set $j = i \pmod m$.
- For the $i$-th letter $x$, set $j = i \bmod m$.
- Encryption is done by replacing $x$ by $x + k_{j}$.
- Decryption is done by replacing $x$ by $x - k_j$.
To attack this scheme, find the key length by *index of coincidence*. Then use frequency analysis.
To attack this scheme, find the key length by [*index of coincidence*](https://en.wikipedia.org/wiki/Index_of_coincidence). Then use frequency analysis.
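The Vigenère rule described above (replace $x$ by $x \pm k_j$) can be sketched as below. This is an illustration only, assuming uppercase ASCII letters, a non-empty uppercase key, and a $0$-indexed key position; the names `vigenere_encrypt` and `vigenere_decrypt` are hypothetical.

```c
#include <stddef.h>

/* Shifts each uppercase letter of text by the matching key letter, in place.
 * sign = +1 encrypts (x + k_j), sign = -1 decrypts (x - k_j).
 * Characters outside 'A'..'Z' are copied unchanged and do not consume key letters. */
static void vigenere_apply(char *text, const char *key, int sign) {
    size_t m = 0;
    while (key[m] != '\0') {
        ++m; /* key length m */
    }
    if (m == 0) {
        return; /* an empty key leaves the text unchanged */
    }
    size_t i = 0; /* index counted over letters only */
    for (size_t pos = 0; text[pos] != '\0'; ++pos) {
        if (text[pos] < 'A' || text[pos] > 'Z') {
            continue;
        }
        int x = text[pos] - 'A';
        int k = key[i % m] - 'A';
        /* Add 26 so the value stays non-negative when decrypting. */
        text[pos] = (char)('A' + (x + sign * k + 26) % 26);
        ++i;
    }
}

void vigenere_encrypt(char *text, const char *key) {
    vigenere_apply(text, key, +1);
}

void vigenere_decrypt(char *text, const char *key) {
    vigenere_apply(text, key, -1);
}
```

For example, encrypting `ATTACKATDAWN` with the key `LEMON` gives `LXFOPVEFRNHR`.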
#### Hill Cipher
@@ -113,6 +116,48 @@ This scheme is vulnerable to known plaintext attack, since the equation can be s
- To encrypt, reorder the columns by the chosen permutation.
- Then the ciphertext is taken by taking letters in column major order.
##### Example
Suppose we encrypt the following text:
$$
\texttt{CRYPTOGRAPHY INTERNET SECURITY}
$$
Choose a key $\sigma = (1, 4, 5, 2, 3, 6)$. Then
$$
\begin{matrix} \\
4 & 3 & 6 & 5 & 2 & 1 \\ \hline
\texttt{C} & \texttt{R} & \texttt{Y} & \texttt{P} & \texttt{T} & \texttt{O} \\
\texttt{G} & \texttt{R} & \texttt{A} & \texttt{P} & \texttt{H} & \texttt{Y} \\
\texttt{I} & \texttt{N} & \texttt{T} & \texttt{E} & \texttt{R} & \texttt{N} \\
\texttt{E} & \texttt{T} & \texttt{S} & \texttt{E} & \texttt{C} & \texttt{U} \\
\texttt{R} & \texttt{I} & \texttt{T} & \texttt{Y}
\end{matrix}
$$
Now reorder the columns,
$$
\begin{matrix} \\
1 & 2 & 3 & 4 & 5 & 6 \\ \hline
\texttt{O} & \texttt{T} & \texttt{R} & \texttt{C} & \texttt{P} & \texttt{Y} \\
\texttt{Y} & \texttt{H} & \texttt{R} & \texttt{G} & \texttt{P} & \texttt{A} \\
\texttt{N} & \texttt{R} & \texttt{N} & \texttt{I} & \texttt{E} & \texttt{T} \\
\texttt{U} & \texttt{C} & \texttt{T} & \texttt{E} & \texttt{E} & \texttt{S} \\
&& \texttt{I} & \texttt{R} & \texttt{Y} & \texttt{T}
\end{matrix}
$$
The ciphertext is
$$
\texttt{OYNU THRC RRNTI CGIER PPEEY YATST}.
$$
The decryption process is the reverse of this operation. It seems to be breakable by inspecting the $i$-th letter of each block and reordering the letters to check if any reordering makes sense.
### Exclusive OR (XOR)
- A bitwise operation $x \oplus y = x + y \pmod 2$.
@@ -130,8 +175,8 @@ This scheme is vulnerable to known plaintext attack, since the equation can be s
$$
\begin{align*}
\mathrm{Pr}[C = 0] &= \mathrm{Pr}[M = 0 \land K = 0] + \mathrm{Pr}[M = 1 \land K = 1] \\ &= \mathrm{Pr}[M = 0] \cdot \mathrm{Pr}[K = 0] + \mathrm{Pr}[M = 1] \cdot \mathrm{Pr}[K = 1] \\
&= \frac{1}{2}\left(\mathrm{Pr}[M = 0] + \mathrm{Pr}[M = 1]\right) \\
\Pr[C = 0] &= \Pr[M = 0 \land K = 0] + \Pr[M = 1 \land K = 1] \\ &= \Pr[M = 0] \cdot \Pr[K = 0] + \Pr[M = 1] \cdot \Pr[K = 1] \\
&= \frac{1}{2}\left(\Pr[M = 0] + \Pr[M = 1]\right) \\
&= \frac{1}{2}.
\end{align*}
$$
@@ -140,20 +185,20 @@ The case for $C = 1$ is similar.
### One-Time Pad (OTP)
Omitted.
![1. OTP, Stream Ciphers and PRGs > One-Time Pad (OTP)](2023-09-07-otp-stream-cipher-prgs.md#one-time-pad-otp)
## Perfect Secrecy
> **Definition.** Let $(E, D)$ be a cipher defined over $(\mathcal{K}, \mathcal{M}, \mathcal{C})$. We assume that $\lvert \mathcal{K} \rvert = \lvert \mathcal{M} \rvert = \lvert \mathcal{C} \rvert$. The cipher is **perfectly secure** if for all $m \in \mathcal{M}$ and $c \in \mathcal{C}$,
>
> $$
> \mathrm{Pr}[\mathcal{M} = m \mid \mathcal{C} = c] = \mathrm{Pr}[\mathcal{M} = m].
> \Pr[\mathcal{M} = m \mid \mathcal{C} = c] = \Pr[\mathcal{M} = m].
> $$
>
> Or equivalently, for all $m_0, m_1 \in \mathcal{M}$, $c \in \mathcal{C}$,
>
> $$
> \mathrm{Pr}[E(k, m _ 0) = c] = \mathrm{Pr}[E(k, m _ 1) = c]
> \Pr[E(k, m _ 0) = c] = \Pr[E(k, m _ 1) = c]
> $$
>
> where $k$ is chosen uniformly in $\mathcal{K}$.
@@ -163,7 +208,7 @@ In other words, the adversary learns nothing from the ciphertext.
With this definition, we can show that **OTP is perfectly secure**. For all $m \in \mathcal{M}$ and $c \in \mathcal{C}$,
$$
\mathrm{Pr}[E(k, m) = c] = \frac{1}{\lvert \mathcal{K} \rvert}
\Pr[E(k, m) = c] = \frac{1}{\lvert \mathcal{K} \rvert}
$$
since for each $m$ and $c$, $k$ is determined uniquely.
@@ -278,3 +323,5 @@ Given a bit string (defined in the specification), the sender performs long divi
- $c \oplus (x \parallel \mathrm{CRC}(x)) = k_s \oplus (m\oplus x \parallel \mathrm{CRC}(m\oplus x))$
- The receiver will decrypt and get $(m\oplus x \parallel \mathrm{CRC}(m\oplus x))$.
- CRC check by the receiver will succeed.
[^1]: This assumption will be removed when we learn public key cryptography.


@@ -28,8 +28,8 @@ attachment:
### Modules
- **S-box**: a substitution module
- Usually for confusion
- $m \times n$ lookup box is needed, since it should be invertible
- Usually for confusion, also gives diffusion
- $m \times n$ lookup box is used for implementation
- **P-box**: a permutation module
- Usually for diffusion
- Compared to the number of input bits,
@@ -42,28 +42,28 @@ attachment:
- Standardized in 1979.
- Block size is $64$ bits ($8$ bytes)
- $64$ bits input $\rightarrow$ $64$ bits output
- Key is $56$ bits, but there are $8$ bits representing parity, so total of $64$ bits
- Every $8$th bit is a parity bit
- Key is $56$ bits, and every $8$th bit is a parity bit.
- Thus $64$ bits in total
### Encryption
1. From the $56$-bit key, generate $16$ different $48$ bit keys $k_1, \dots, k_{16}$.
2. The plaintext message goes through the P-box.
3. The output goes through $16$ rounds, and in the round $i$, key $k_i$ is used.
2. The plaintext message goes through an initial permutation.
3. The output goes through $16$ rounds, and key $k_i$ is used in round $i$.
4. After $16$ rounds, split the output into two $32$ bit halves and swap them.
5. The output goes through the inverse of the P-box from Step 1.
5. The output goes through the inverse of the initial permutation from Step 2.
Let $L_{i-1} \parallel R_{i-1}$ be the output of round $i-1$, where $L_{i-1}$ and $R_{i-1}$ are $32$ bit halves. Also let $f$ be the Feistel function.
Let $L_{i-1} \parallel R_{i-1}$ be the output of round $i-1$, where $L_{i-1}$ and $R_{i-1}$ are $32$ bit halves. Also let $f$ be the Feistel function.[^1]
In each round $i$,
In each round $i$, the following operation is performed:
$$
L_i = R_{i - 1}, \qquad R_i = L_{i-1} \oplus f(k_i, R_{i-1})
L_i = R_{i - 1}, \qquad R_i = L_{i-1} \oplus f(k_i, R_{i-1}).
$$
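The round operation above can be sketched in C. The round function `toy_f` below is a hypothetical placeholder and is **not** the DES Feistel function; the point of the sketch is only that the round can be undone without inverting $f$.

```c
#include <stdint.h>

/* A 64-bit block split into two 32-bit halves. */
typedef struct {
    uint32_t left;
    uint32_t right;
} block_t;

/* Placeholder round function (an assumption for illustration, not DES). */
static uint32_t toy_f(uint32_t key, uint32_t half) {
    uint32_t x = half ^ key;
    x = (x << 7) | (x >> 25); /* rotate left by 7 bits */
    return x * 2654435761u + key;
}

/* One Feistel round: L_i = R_{i-1}, R_i = L_{i-1} XOR f(k_i, R_{i-1}). */
block_t feistel_round(block_t in, uint32_t key) {
    block_t out;
    out.left = in.right;
    out.right = in.left ^ toy_f(key, in.right);
    return out;
}

/* Inverse round: R_{i-1} = L_i, L_{i-1} = R_i XOR f(k_i, L_i). */
block_t feistel_round_inverse(block_t out, uint32_t key) {
    block_t in;
    in.right = out.left;
    in.left = out.right ^ toy_f(key, out.left);
    return in;
}
```

Note that the inverse only evaluates `toy_f` in the forward direction, so the round function itself does not need to be invertible.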
#### The Feistel Function
![is-03-feistel-function.png](../../../assets/img/posts/Lecture%20Notes/Internet%20Security/is-03-feistel-function.png)
![is-03-feistel-function.png](../../../assets/img/posts/Lecture%20Notes/Internet%20Security/is-03-feistel-function.png#)
The Feistel function takes $32$ bit data and divides it into eight $4$ bit chunks. Each chunk is expanded to $6$ bits using a P-box. Now, we have $48$ bits of data, so apply XOR with the key for this round. Next, each $6$-bit block is compressed back to $4$ bits using an S-box. Finally, there is a (straight) permutation at the end, resulting in $32$ bit data.
@@ -108,10 +108,10 @@ Thus $F$ and $G$ are inverses of each other, thus $f$ doesn't have to be inverti
Also, note that
$$
G(L_i \parallel R_i) = F(L_i \oplus f(R_i) \parallel R_i),
G(L_i \parallel R_i) = F(L_i \oplus f(R_i) \parallel R_i).
$$
so evaluating the decryption round is actually equivalent to running the encryption round with upper/lower $32$ bit halves swapped. Hence the reason for swapping each $32$ bit halves.
Notice that evaluating $G$ is equivalent to evaluating $F$ on an encrypted block with its upper/lower $32$ bit halves swapped. We get $L_i \oplus f(R_i) \parallel R_i$ exactly when we swap the halves of $F(L_i \parallel R_i)$. Thus, we can use the same hardware for encryption and decryption, which is the reason for swapping the $32$ bit halves.
## Advanced Encryption Standard (AES)
@@ -130,7 +130,7 @@ Each round consists of the following:
- **AddRoundKey**: XOR with round key
The first and last rounds are a little different.
- Before the first round, AddRoundKey is done.
- AddRoundKey is done before the first round.
- The last round does not have MixColumns.
The objectives of AES:
@@ -138,7 +138,7 @@ The objectives of AES:
- Code must be compact, and should run fast on many CPUs
- Design must be simple
### Modules
### Layers
#### SubBytes
@@ -157,7 +157,7 @@ The objectives of AES:
- For each column, each byte is replaced by a value
- The value depends on all 4 bytes of the column
- Each column is processed separately
- Thus effectively, it is a matrix multiplication (Hill cipher)
- Thus effectively, it is a matrix multiplication (Hill cipher).[^2]
#### AddRoundKey
@@ -171,7 +171,7 @@ These 4 modules are all invertible!
- Why is there an AddRoundKey at the beginning?
- Why is the last round different?
Both are for engineering purposes, to make the encryption and decryption process the same. (Check!)
Both are for engineering purposes, to make the encryption and decryption process the same.[^3]
## Modes of Operations
@@ -179,14 +179,14 @@ AES, DES use fixed block size for encryption. How do we encrypt longer messages?
### Electronic Codebook Mode (ECB)
![is-03-ecb-encryption.png](../../../assets/img/posts/Lecture%20Notes/Internet%20Security/is-03-ecb-encryption.png)
![is-03-ecb-encryption.png](../../../assets/img/posts/Lecture%20Notes/Internet%20Security/is-03-ecb-encryption.png#)
- Codebook is a mapping table.
- For the $i$-th plaintext block, we use key $k$ to encrypt and obtain the $i$-th ciphertext block.
- **Uses the same key for all blocks**
- Adjacent blocks are independent of each other.
- Advantages
- Good when run in parallel
- Fast when run in parallel
- Limitations
- Repetitions in messages (if aligned with the block) may lead to repetitions in the ciphertext
- Susceptible to *cut-and-paste attacks*
@@ -198,7 +198,7 @@ Since the same key is used for all blocks, once a mapping from plaintext to ciph
### Cipher Block Chaining Mode (CBC)
![is-03-cbc-encryption.png](../../../assets/img/posts/Lecture%20Notes/Internet%20Security/is-03-cbc-encryption.png)
![is-03-cbc-encryption.png](../../../assets/img/posts/Lecture%20Notes/Internet%20Security/is-03-cbc-encryption.png#)
- Two identical messages produce different ciphertexts.
- This prevents chosen plaintext attacks
@@ -234,8 +234,8 @@ Since the same key is used for all blocks, once a mapping from plaintext to ciph
- If the IV is the same, then the encryption of the same plaintext is the same.
- Thus IVs should be random.
- IV are not required to be secret, but
- No IVs should be reused under the same key
- IV changes should be unpredictable
- **No IVs should be reused under the same key**
- **IV changes should be unpredictable**
- On IV reuse, same message will generate the same ciphertext if key isn't changed
- If IV is predictable, CBC is vulnerable to chosen plaintext attacks.
- Define Eve's new message $m' = \mathrm{IV} _ {\mathrm{E}} \oplus \mathrm{IV} _ {\mathrm{A}} \oplus g$, where
@@ -248,12 +248,12 @@ Since the same key is used for all blocks, once a mapping from plaintext to ciph
### Cipher Feedback Mode (CFB)
![is-03-cfb-encryption.png](../../../assets/img/posts/Lecture%20Notes/Internet%20Security/is-03-cfb-encryption.png)
![is-03-cfb-encryption.png](../../../assets/img/posts/Lecture%20Notes/Internet%20Security/is-03-cfb-encryption.png#)
- The message is treated as a stream of bits; similar to stream cipher
- **Result of the encryption is fed to the next stage.**
- Standard allows any number of bits to be fed to the next stage
- It is most efficient to use all $64$ bits (CFB-64)
- It is most efficient to use all bits.
- Initialization vector is used.
- Same requirements on the IV as CBC mode.
- Should be randomized, and should not be predictable.
@@ -277,13 +277,13 @@ Since the same key is used for all blocks, once a mapping from plaintext to ciph
- CFB mode is self-recovering.
- 1 bit error in the ciphertext corrupts some number of blocks.
- Bit errors in the ciphertext will cause bit errors at the same position.
- Since this ciphertext is fed to the next block, the error is propagated
- Since this ciphertext is fed to the next block, the error is propagated.
- Some implementations (like CFB-8) use shift registers, so errors will be propagated as long as the erroneous bit is in the shift register.
- If the error is removed from the shift register, it automatically recovers.
### Output Feedback Mode (OFB)
![is-03-ofb-encryption.png](../../../assets/img/posts/Lecture%20Notes/Internet%20Security/is-03-ofb-encryption.png)
![is-03-ofb-encryption.png](../../../assets/img/posts/Lecture%20Notes/Internet%20Security/is-03-ofb-encryption.png#)
- Very similar to stream cipher.
- Initialization vector is used as a seed to generate the key stream.
@@ -316,14 +316,39 @@ Since the same key is used for all blocks, once a mapping from plaintext to ciph
### Counter Mode (CTR)
![is-03-ctr-encryption.png](../../../assets/img/posts/Lecture%20Notes/Internet%20Security/is-03-ctr-encryption.png)
![is-03-ctr-encryption.png](../../../assets/img/posts/Lecture%20Notes/Internet%20Security/is-03-ctr-encryption.png#)
- Without chaining, we use a counter (typically incremented by $1$).
- Counter starts from the initialization vector.
- Highly parallelizable.
- Can decrypt from any arbitrary position.
- Counter should not be repeated for the same key.
- Suppose that the same counter $ctr$ is used for encrypting $m_0$ and $m_1$.
- Encryption results are: $(ctr, E(k, ctr) \oplus m_0), (ctr, E(k, ctr) \oplus m_1)$.
- Then the attacker can obtain $m_0 \oplus m_1$.
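The counter reuse problem above can be checked with a small sketch. The function `toy_block_encrypt` is a hypothetical stand-in for $E(k, ctr)$ and is not a real block cipher; any deterministic function shows the same effect, because the keystream block cancels out.

```c
#include <stdint.h>

/* Hypothetical stand-in for E(k, ctr). NOT a secure block cipher. */
static uint64_t toy_block_encrypt(uint64_t key, uint64_t ctr) {
    uint64_t x = ctr ^ key;
    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    return x ^ key;
}

/* CTR mode on a single 64-bit block: c = E(k, ctr) XOR m.
 * Decryption is the same operation applied to the ciphertext. */
uint64_t ctr_encrypt_block(uint64_t key, uint64_t ctr, uint64_t message) {
    return toy_block_encrypt(key, ctr) ^ message;
}

/* What the attacker computes from two ciphertexts that share (key, ctr):
 * (E(k, ctr) XOR m0) XOR (E(k, ctr) XOR m1) = m0 XOR m1. */
uint64_t xor_of_plaintexts(uint64_t c0, uint64_t c1) {
    return c0 ^ c1;
}
```

The attacker never needs the key: `xor_of_plaintexts` works on the two ciphertexts alone, which is why a counter value must not repeat under the same key.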
## Modes of Operations Summary
|Criteria\Modes|ECB|CBC|CFB|OFB|CTR|
|:-:|:-:|:-:|:-:|:-:|:-:|
|IV|-|Yes|Yes|Yes|Counter|
|Encryption Parallelizable|Yes|No|No|Yes\*|Yes|
|Decryption Parallelizable|Yes|Yes|Yes|Yes\*|Yes|
|Random Read Access|Yes|Yes|Yes|No|Yes|
|Self-Recovering|-|Yes|Yes|-|-|
- OFB is parallelizable only if the keystream is generated in advance.
- We don't have to consider self-recovery if the ciphertext is not fed into the encryption of the next block.
- Errors in the ciphertext are not propagated for ECB, OFB and CTR.
- **Random read access**
- Suppose that a part of the plaintext changes.
- In OFB, the *whole* keystream must be recalculated to fix the ciphertext.
- But for other modes, only a part of the ciphertext needs to be changed, using the information from the previous block if necessary.
---
Images are from [Wikipedia](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation).
[^1]: Some people call this function the *mangler* function.
[^2]: Over the finite field $\mathrm{GF}(2^8)$.
[^3]: See also a helpful [question](https://crypto.stackexchange.com/questions/1346/why-is-mixcolumns-omitted-from-the-last-round-of-aes) on cryptography SE.


@@ -0,0 +1,277 @@
---
share: true
toc: true
math: true
categories:
- Lecture Notes
- Internet Security
tags:
- lecture-note
- security
- cryptography
- number-theory
title: 05. Modular Arithmetic (2)
date: 2023-10-04
github_title: 2023-10-04-modular-arithmetic-2
---
## Exponentiation by Squaring
Suppose we want to calculate $a^n$ where $n$ is very large, like $n \approx 2^{1000}$. The naive approach takes $\mathcal{O}(n)$ multiplications. We will ignore integer overflow for simplicity.
```c
int naive_exponentiation(int a, int n) {
    int result = 1;
    for (int i = 0; i < n; ++i) {
        result *= a;
    }
    return result;
}
```
Using the above implementation, computing $3^{2^{63} - 1}$ takes almost forever...
Instead, we use the **exponentiation by squaring** method. Notice the following:
$$
a^n = \begin{cases}
(a^2)^{\frac{n}{2}} & (n \text{ is even})\\
a \cdot (a^2)^{\frac{n-1}{2}} & (n \text{ is odd})
\end{cases}.
$$
Therefore, the exponent is reduced by half for every multiplication. Here is the implementation. The base cases are to be handled separately.
```c
int exponentiation_by_squaring(int a, int n) {
    // Base cases
    if (n == 0) {
        return 1;
    } else if (n == 1) {
        return a;
    }

    if (n % 2 == 0) {
        // a^n = (a^2)^(n/2)
        return exponentiation_by_squaring(a * a, n / 2);
    } else {
        // a^n = a * (a^2)^((n-1)/2)
        return a * exponentiation_by_squaring(a * a, (n - 1) / 2);
    }
}
```
The above code executes about $\mathcal{O}(\log n)$ multiplications. Now we can actually get an answer for $3^{2^{63} - 1}$.
Alternatively, here is an iterative version of the above for those who want to save some memory.
```c
int exponentiation_by_squaring_iterative(int a, int n) {
    int result = 1;
    int base = a, exponent = n;
    while (exponent > 0) {
        // Check the lowest bit of the remaining exponent, not of n
        if (exponent % 2 == 1) {
            result *= base;
        }
        base *= base;
        exponent /= 2;
    }
    return result;
}
```
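The implementations above ignore integer overflow. In cryptography we usually want $a^n \bmod m$, and then every intermediate product can be reduced modulo $m$, which keeps the numbers small. Below is a minimal sketch of this variant, assuming $1 \leq m \leq 2^{32}$ so that products of two reduced values fit in $64$ bits; the name `modular_exponentiation` is hypothetical.

```c
#include <stdint.h>

/* Computes a^n mod m by repeated squaring.
 * Assumes 1 <= m <= 2^32, so (m - 1)^2 fits in a uint64_t. */
uint64_t modular_exponentiation(uint64_t a, uint64_t n, uint64_t m) {
    if (m == 1) {
        return 0; /* everything is congruent to 0 modulo 1 */
    }
    uint64_t result = 1;
    uint64_t base = a % m;
    while (n > 0) {
        if (n % 2 == 1) {
            result = (result * base) % m;
        }
        base = (base * base) % m;
        n /= 2;
    }
    return result;
}
```

For instance, `modular_exponentiation(2, 10, 1000)` evaluates $1024 \bmod 1000 = 24$, and the loop runs once per bit of $n$, so even $n = 2^{63} - 1$ needs only $63$ iterations.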
For even better (maybe faster) results, we need the help of elementary number theory.
## Fermat's Little Theorem
> **Theorem.** Let $p$ be prime. For $a \in \mathbb{Z}$ such that $\gcd(a, p) = 1$,
>
> $$
> a^{p-1} \equiv 1 \pmod p.
> $$
*Proof*. (Using group theory) The statement can be rewritten as follows. For $a \neq 0$ in $\mathbb{Z}_p$, $a^{p-1} = 1$ in $\mathbb{Z}_p$. Since $\mathbb{Z}_p^*$ is a (multiplicative) group of order $p-1$, the order of $a$ should divide $p-1$. Therefore, $a^{p-1} = 1$ in $\mathbb{Z}_p$.
Here is an elementary proof not using group theory.
*Proof*. (Elementary) Let $S = \left\lbrace 1, 2, \dots, p-1 \right\rbrace$ and fix $a$ with $\gcd(a, p) = 1$. Consider a map $f : S \rightarrow S$ defined as $x \mapsto ax \bmod p$. This map is well-defined, since $p \nmid ax$ for $x \in S$.
We will show that $f$ is injective. Suppose that $ax \equiv ay \pmod p$ for $x, y \in S$. Since $\gcd(a, p) = 1$, $a$ has a multiplicative inverse modulo $p$, thus $x \equiv y \pmod p$. The elements of $S$ are pairwise incongruent modulo $p$, so $x = y$.
By injectivity, $f(i)$ are distinct for all $i \in S$, so $f$ is a permutation on $S$. Therefore, the product of all elements of $S$ must be equal to the product of all $f(i)$ for $i \in S$.
$$
(p-1)! \equiv f(1)f(2)\cdots f(p-1) \equiv a^{p-1} \cdot (p-1)!\pmod p.
$$
Since $\gcd(i, p) = 1$ for all $i = 1, \dots, p-1$, each such $i$ has a multiplicative inverse modulo $p$. Multiplying both sides by these inverses cancels $(p-1)!$, and we get $a^{p-1} \equiv 1 \pmod p$.
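As a small worked instance of the elementary argument (the values $p = 7$ and $a = 3$ are chosen here only for illustration), multiplication by $3$ permutes the nonzero residues modulo $7$:

$$
\begin{array}{c|cccccc}
x & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline
3x \bmod 7 & 3 & 6 & 2 & 5 & 1 & 4
\end{array}
$$

Taking the product of each row gives $6! \equiv 3^6 \cdot 6! \pmod 7$, and cancelling $6!$ leaves $3^6 \equiv 1 \pmod 7$. Indeed, $3^6 = 729 = 7 \cdot 104 + 1$.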
## Euler's Totient Function
For composite modulus, we have Euler's generalization. Before proving the theorem, we first need to define Euler's totient function.
> **Definition.** Let $n \in \mathbb{N}$. Define $\phi(n)$ as the number of positive integers $k \leq n$ such that $\gcd(n, k) = 1$.
For direct calculation, we use the following formula.
> **Lemma.** For $n \in \mathbb{N}$, the following holds.
>
> $$
> \phi(n) = n \cdot \prod_{p \mid n} \left( 1 - \frac{1}{p} \right)
> $$
>
> where $p$ is a prime number dividing $n$.
So to calculate $\phi(n)$, we need to **factorize** $n$. From the formula above, we have some corollaries.
> **Corollary.** For prime numbers $p, q$ and $k \in \mathbb{N}$, the following hold.
> 1. $\phi(p) = p - 1$.
> 2. $\phi(pq) = (p-1)(q-1)$.
> 3. $\phi(p^k) = p^{k-1}(p-1)$.
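The product formula can be evaluated directly once the prime factors of $n$ are known. Below is a minimal sketch that finds them by trial division, which is only practical for small $n$; the name `euler_phi` is hypothetical, and $n$ is assumed to be small enough that $p \cdot p$ does not overflow $64$ bits.

```c
#include <stdint.h>

/* Computes phi(n) = n * prod_{p | n} (1 - 1/p) using trial division. */
uint64_t euler_phi(uint64_t n) {
    uint64_t result = n;
    for (uint64_t p = 2; p * p <= n; ++p) {
        if (n % p == 0) {
            /* p is a prime factor: strip every copy of it from n. */
            while (n % p == 0) {
                n /= p;
            }
            /* result * (1 - 1/p), computed exactly since p divides result. */
            result -= result / p;
        }
    }
    if (n > 1) {
        /* Whatever remains is a single prime factor larger than sqrt(n). */
        result -= result / n;
    }
    return result;
}
```

This agrees with the corollary: `euler_phi(7)` is $6$, `euler_phi(15)` is $(3-1)(5-1) = 8$, and `euler_phi(27)` is $3^2 \cdot 2 = 18$.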
### Reduced Set of Residues
Let $n \in \mathbb{N}$. The **complete set of residues** was denoted $\mathbb{Z}_n$ and
$$
\mathbb{Z}_n = \left\lbrace 0, 1, \dots, n-1 \right\rbrace.
$$
We also often use the **reduced set of residues**.
> **Definition.** The **reduced set of residues** is the set of residues that are relatively prime to $n$. We denote this set as $\mathbb{Z}_n^*$.
>
> $$
> \mathbb{Z}_n^* = \left\lbrace a \in \mathbb{Z}_n \setminus \left\lbrace 0 \right\rbrace : \gcd(a, n) = 1 \right\rbrace.
> $$
Then by definition, we have the following result.
> **Lemma.** $\left\lvert \mathbb{Z}_n^* \right\rvert = \phi(n)$.
We can also show that $\mathbb{Z}_n^*$ is a multiplicative group.
> **Lemma.** $\mathbb{Z}_n^*$ is a multiplicative group.
*Proof*. Let $a, b \in \mathbb{Z}_n^{ * }$. We must check if $ab \in \mathbb{Z}_n^{ * }$. Since $\gcd(a, n) = \gcd(b, n) = 1$, $\gcd(ab, n) = 1$. This is because if $d = \gcd(ab, n) > 1$, then a prime factor $p$ of $d$ must divide $a$ or $b$ and also $n$. Then $\gcd(a, n) \geq p$ or $\gcd(b, n) \geq p$, which is a contradiction. Thus $ab \in \mathbb{Z}_n^{ * }$.
Associativity is inherited from multiplication in $\mathbb{Z}_n$. We also have an identity element $1$, and the inverse of $a \in \mathbb{Z}_n^*$ exists since $\gcd(a, n) = 1$.
Now we can prove Euler's generalization.
## Euler's Generalization
> **Theorem.** Let $a \in \mathbb{Z}$ such that $\gcd(a, n) = 1$. Then
>
> $$
> a^{\phi(n)} \equiv 1 \pmod n.
> $$
*Proof*. Since $\gcd(a, n) = 1$, $a \bmod n \in \mathbb{Z}_n^{ * }$. The order of an element divides the order of the group, so $a^{\left\lvert \mathbb{Z}_n^{ * } \right\rvert} = 1$ in $\mathbb{Z}_n$. By the above lemma, we have the desired result.
*Proof*. (Elementary) Set $f : \mathbb{Z}_n^* \rightarrow \mathbb{Z}_n^*$ as $x \mapsto ax \bmod n$, then the rest of the reasoning follows similarly as in the proof of Fermat's little theorem.
Using the above result, we remark an important result that will be used in RSA.
> **Lemma.** Let $n \in \mathbb{N}$. For $a, b \in \mathbb{Z}$ and $x \in \mathbb{Z}_n^*$, if $a \equiv b \pmod{\phi(n)}$, then $x^a \equiv x^b \pmod n$.
*Proof*. $a = b + k\phi(n)$ for some $k \in \mathbb{Z}$. Then
$$
x^a = x^{b + k\phi(n)} = (x^{\phi(n)})^k \cdot x^b \equiv x^b \pmod n
$$
by Euler's generalization.
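As a small illustration of this lemma (the numbers are chosen here only as an example), take $n = 10$, so $\phi(10) = 4$, and $x = 3 \in \mathbb{Z}_{10}^*$. Since $102 \equiv 2 \pmod 4$,

$$
3^{102} = (3^4)^{25} \cdot 3^2 \equiv 1^{25} \cdot 9 \equiv 9 \pmod{10}.
$$

In other words, exponents can be reduced modulo $\phi(n)$ before the exponentiation is carried out.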
## Groups Based on Modular Arithmetic
> **Definition.** A **group** is a set $G$ with a binary operation $* : G \times G \rightarrow G$, satisfying the following properties.
>
> - $(\mathsf{G1})$ The binary operation $*$ is **closed**.
> - $(\mathsf{G2})$ The binary operation $*$ is **associative**, so $(a * b) * c = a * (b * c)$ for all $a, b, c \in G$.
> - $(\mathsf{G3})$ $G$ has an **identity** element $e$ such that $e * a = a * e = a$ for all $a \in G$.
> - $(\mathsf{G4})$ There is an **inverse** for every element of $G$. For each $a \in G$, there exists $x \in G$ such that $a * x = x * a = e$. We write $x = a^{-1}$ in this case.
$\mathbb{Z}_n$ is an additive group, and $\mathbb{Z}_n^*$ is a multiplicative group.
## Chinese Remainder Theorem (CRT)
> **Theorem.** Let $n_1, \dots, n_k$ be integers greater than $1$, and let $N = n_1n_2\cdots n_k$. If $n_i$ are pairwise relatively prime, then the system of equations $x \equiv a_i \pmod {n_i}$ ($i = 1, \dots, k$) has a unique solution modulo $N$.
>
> *(Abstract Algebra)* The map
>
> $$
> x \bmod N \mapsto (x \bmod n_1, \dots, x \bmod n_k)
> $$
>
> defines a ring isomorphism
>
> $$
> \mathbb{Z}_N \simeq \mathbb{Z}_{n_1} \times \mathbb{Z}_{n_2} \times \cdots \times \mathbb{Z}_{n_k}.
> $$
*Proof*. (**Existence**) Let $N_i = N/n_i$. Then $\gcd(N_i, n_i) = 1$. By the extended Euclidean algorithm, there exist integers $M_i, m_i$ such that $M_iN_i + m_in_i= 1$. Now set
$$
x = \sum_{i=1}^k a_i M_i N_i.
$$
Since $n_i \mid N_j$ for every $j \neq i$, all terms other than the $i$-th vanish modulo $n_i$. Then $x \equiv a_iM_iN_i \equiv a_i(1 - m_in_i) \equiv a_i \pmod {n_i}$ for all $i = 1, \dots, k$.
(**Uniqueness**) Suppose that $x, y$ are two solutions. Both satisfy $x \equiv a_i \pmod {n_i}$ and $y \equiv a_i \pmod {n_i}$, so $n_i \mid (x - y)$ for all $i$. Therefore we have
$$
\mathrm{lcm}(n_1, \dots, n_k) \mid (x - y).
$$
But $n_i$ are pairwise relatively prime, so $\mathrm{lcm}(n_1, \dots, n_k) = N$ and $N \mid (x-y)$. Hence $x \equiv y \pmod N$.
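To see the construction in the existence proof at work, consider the following small system (chosen here only as an example): $x \equiv 2 \pmod 3$, $x \equiv 3 \pmod 5$, $x \equiv 2 \pmod 7$. Then $N = 105$ and

$$
\begin{array}{c|c|c|c}
i & n_i & N_i = N / n_i & M_i \equiv N_i^{-1} \pmod{n_i} \\ \hline
1 & 3 & 35 & 2 \\
2 & 5 & 21 & 1 \\
3 & 7 & 15 & 1
\end{array}
$$

so that

$$
x = 2 \cdot 2 \cdot 35 + 3 \cdot 1 \cdot 21 + 2 \cdot 1 \cdot 15 = 233 \equiv 23 \pmod{105}.
$$

Indeed, $23 = 3 \cdot 7 + 2 = 5 \cdot 4 + 3 = 7 \cdot 3 + 2$.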
*Proof*. (**Abstract Algebra**) The above uniqueness proof shows that the map
$$
x \bmod N \mapsto (x \bmod n_1, \dots, x \bmod n_k)
$$
is injective. Both rings have $N$ elements, so by the pigeonhole principle, this map must also be surjective. This map is also a ring homomorphism, by the properties of modular arithmetic. Therefore it is a ring isomorphism.
### Notes on the Proof of the Chinese Remainder Theorem
The elementary proof given above gives a *direct construction* of the solution. It is clear and easy to understand, and tells us how to find the actual solution.
But when this construction is used in actual computation, it involves very large numbers. The following is a direct implementation.
```cpp
#include <vector>
using std::vector;

// Returns the inverse of a modulo m (extended Euclidean algorithm); defined elsewhere.
int modular_inverse(int a, int m);

// remainder holds the a_i values
// modulus holds the n_i values
int chinese_remainder_theorem(vector<int>& remainder, vector<int>& modulus) {
    int product = 1;
    for (int m : modulus) {
        product *= m;
    }

    int result = 0;
    for (int i = 0; i < (int) modulus.size(); ++i) {
        int N_i = product / modulus[i];
        result += remainder[i] * modular_inverse(N_i, modulus[i]) * N_i;
        result %= product;
    }
    return result;
}
```
The `modular_inverse` function uses the extended Euclidean algorithm to find $M_i$ in the proof. For large moduli and many equations, $N_i = N / n_i$ results in a very large number, which is hard to handle (if your language has integer overflow) and takes longer to compute.
A better way is to construct the solution **inductively**. Find a solution for the first two equations,
$$
\begin{array}{c}
x \equiv a_1 \pmod{n_1} \\
x \equiv a_2 \pmod{n_2}
\end{array} \implies x \equiv a_{1, 2} \pmod{n_1n_2}
$$
and using the result, add the next equation $x \equiv a_3 \pmod{n_3}$ and find a solution.[^1]
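The inductive construction might look like the sketch below. It is not the implementation linked in the footnote; the names `extended_gcd`, `merge_congruences` and `chinese_remainder_inductive` are made up for illustration. It assumes pairwise relatively prime moduli, each below $2^{31}$, whose product fits in a signed 64-bit integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Extended Euclidean algorithm: returns g = gcd(a, b) and sets x, y
// so that a * x + b * y = g.
std::int64_t extended_gcd(std::int64_t a, std::int64_t b,
                          std::int64_t& x, std::int64_t& y) {
    if (b == 0) {
        x = 1;
        y = 0;
        return a;
    }
    std::int64_t x1 = 0, y1 = 0;
    std::int64_t g = extended_gcd(b, a % b, x1, y1);
    x = y1;
    y = x1 - (a / b) * y1;
    return g;
}

// Merges x == a1 (mod n1) and x == a2 (mod n2) into x == a (mod n1 * n2).
// Assumes gcd(n1, n2) = 1 and 0 <= a1 < n1.
std::pair<std::int64_t, std::int64_t> merge_congruences(
        std::int64_t a1, std::int64_t n1, std::int64_t a2, std::int64_t n2) {
    std::int64_t u = 0, v = 0;
    std::int64_t g = extended_gcd(n1, n2, u, v);  // n1 * u + n2 * v = 1
    assert(g == 1);
    // Write x = a1 + n1 * t and solve for t modulo n2:
    // t == (a2 - a1) * u (mod n2), because n1 * u == 1 (mod n2).
    std::int64_t t = ((a2 - a1) % n2) * (u % n2) % n2;
    if (t < 0) {
        t += n2;
    }
    return {a1 + n1 * t, n1 * n2};
}

// Folds the equations in one at a time, starting from x == 0 (mod 1).
std::int64_t chinese_remainder_inductive(const std::vector<std::int64_t>& remainder,
                                         const std::vector<std::int64_t>& modulus) {
    assert(remainder.size() == modulus.size());
    std::int64_t a = 0, n = 1;
    for (std::size_t i = 0; i < modulus.size(); ++i) {
        std::pair<std::int64_t, std::int64_t> merged =
            merge_congruences(a, n, remainder[i], modulus[i]);
        a = merged.first;
        n = merged.second;
    }
    return a;
}
```

Each step only multiplies numbers bounded by the running modulus, so no intermediate value exceeds the final product $N$.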
Lastly, the ring isomorphism actually tells us a lot and is quite effective for computation. Since the two rings are *isomorphic*, operations in $\mathbb{Z} _ N$ can be done independently in each $\mathbb{Z} _ {n_i}$ and then merged back to $\mathbb{Z} _ N$. $N$ was a large number, so computations can be much faster in $\mathbb{Z} _ {n _ i}$. Specifically, we will see how this fact is used for computations in RSA.
[^1]: I have an implementation in my repository. [Link](https://github.com/calofmijuck/BOJ/blob/4b29e0c7f487aac3186661176d2795f85f0ab21b/Codes/23000/23062.cpp#L38).


@@ -0,0 +1,179 @@
---
share: true
toc: true
math: true
categories:
- Lecture Notes
- Internet Security
tags:
- lecture-note
- security
- cryptography
- number-theory
title: 06. RSA and ElGamal Encryption
date: 2023-10-04
github_title: 2023-10-04-rsa-elgamal
---
## Exponential Inverses
Suppose we are given integers $a$ and $N$ with $\gcd(a, \phi(N)) = 1$, and let $x$ be any integer that is relatively prime to $N$. We choose $b$ so that
$$
\tag{$*$}
ab \equiv 1 \pmod{\phi(N)}.
$$
Then we have
$$
x^{ab} \equiv x^{1 + k\phi(N)} \equiv x \pmod N
$$
by Euler's generalization of Fermat's little theorem, where $k$ is the integer with $ab = 1 + k\phi(N)$.
> **Definition.** The integer $b$ satisfying $(\ast)$ is called the **exponential inverse of $a$ modulo $N$**.
Using exponential inverses will be a key idea in the RSA cryptosystem.
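For a concrete feel, here is a small example with numbers chosen only for illustration. Take $N = 15$, so $\phi(N) = 8$, and $a = 3$. Since $3 \cdot 3 = 9 \equiv 1 \pmod 8$, the exponential inverse is $b = 3$. For $x = 2$, which is relatively prime to $15$,
$$
x^{ab} = 2^{9} = 512 = 34 \cdot 15 + 2 \equiv 2 \pmod{15}.
$$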
## RSA Cryptosystem
This is an explanation of the *textbook* RSA encryption scheme.
### Key Generation
- We pick two large primes $p, q$ and set $N = pq$.
- Select $(e, d)$ so that $ed \equiv 1 \pmod{\phi(N)}$.
- Set $(N, e)$ as the **public key** and make it public.
- Set $d$ as the **private key** and keep it secret.
### RSA Encryption and Decryption
Suppose we want to encrypt a message $m \in \mathbb{Z}_N$.
- **Encryption**
    - Using the public key $(N, e)$, compute the ciphertext $c = m^e \bmod N$.
- **Decryption**
    - Recover the original message by computing $c^d \bmod N$.
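The two operations can be tried out with toy parameters. The sketch below assumes the tiny primes $p = 61$, $q = 53$ (so $N = 3233$, $\phi(N) = 3120$) with $e = 17$ and $d = 2753$. These values are for illustration only and are far too small to be secure. The helper names `pow_mod`, `rsa_encrypt` and `rsa_decrypt` are made up here.

```cpp
#include <cassert>
#include <cstdint>

// Computes base^exp mod `mod` by square-and-multiply.
// Assumes mod < 2^32 so that every product fits in 64 bits.
std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp, std::uint64_t mod) {
    assert(mod > 0 && mod < (1ULL << 32));
    std::uint64_t result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) {
            result = result * base % mod;
        }
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

// Textbook RSA encryption: c = m^e mod N.
std::uint64_t rsa_encrypt(std::uint64_t m, std::uint64_t e, std::uint64_t N) {
    assert(m < N);
    return pow_mod(m, e, N);
}

// Textbook RSA decryption: m = c^d mod N.
std::uint64_t rsa_decrypt(std::uint64_t c, std::uint64_t d, std::uint64_t N) {
    assert(c < N);
    return pow_mod(c, d, N);
}
```

With these parameters, `rsa_encrypt(65, 17, 3233)` gives `2790`, and `rsa_decrypt(2790, 2753, 3233)` gives back `65`.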
### Correctness of RSA?
Since $ed \equiv 1 \pmod{\phi(N)}$, we have
$$
c^d \equiv m^{ed} \equiv m \pmod N
$$
by the properties of exponential inverses.
Wait, but this property requires that $\gcd(m, N) = 1$. So it seems like we can't use some values of $m$. Furthermore, for the scheme to be secure, it should be computationally infeasible to recover $d$ using $e$ and $N$.
### Regarding the Choice of $N$
If $N$ is prime, it is very easy to find $d$. Since the relation $ed \equiv 1 \pmod {(N-1)}$ holds, we directly see that $d$ can be computed efficiently using the extended Euclidean algorithm.
The next simplest case would be setting $N = pq$ for two large primes $p$ and $q$. We expose $N$ to the public but hide primes $p$ and $q$. Now suppose the attacker wants to compute $d$ using $(N, e)$. The attacker knows that $ed \equiv 1 \pmod {\phi(N)}$, and $\phi(N) = (p-1)(q-1)$. So to calculate $d$, the attacker must know $\phi(N)$, which requires the **factorization of $N$**.
If the factorization $N = pq$ is known, finding $d$ is easy. But factoring large integers (especially a product of two primes of similar size) is known to be very difficult.[^1] No one has formally proven this, but we believe and assume that it is hard.[^2]
## Chinese Remainder Theorem in RSA
Assume that the message $m$ is divisible by neither $p$ nor $q$. By Fermat's little theorem, we have $m^{p-1} \equiv 1 \pmod p$ and $m^{q-1} \equiv 1 \pmod q$.
Therefore, for decryption in RSA, the following holds. Note that $N = pq$.
$$
c^d \equiv m^{ed} \equiv m^{1 + k\phi(N)} \equiv m \cdot (m^{p-1})^{k(q-1)} \equiv m \cdot 1^{k(q-1)} \equiv m \pmod p.
$$
A similar result holds for modulus $q$. This does not exactly recover the message yet, since $m$ could have been chosen to be larger than $p$. The above equation is true, but during actual computation, one may get a result that is less than $p$. *This may not be equal to the original message*.[^3]
Since $N = pq$, we use the Chinese remainder theorem. Instead of computing $c^d \bmod N$ directly, we can compute
$$
c^d \equiv m \pmod p, \qquad c^d \equiv m \pmod q
$$
independently and solve the system of equations to recover the message.
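A minimal sketch of this idea follows, using the toy values $p = 61$, $q = 53$, $d = 2753$, which are chosen here only for illustration. Two details go beyond the text and are common practice rather than something stated in these notes: the exponent is reduced modulo $p - 1$ and $q - 1$ (valid by Fermat's little theorem), and the two residues are recombined with the two-equation case of the CRT. The names `pow_mod` and `rsa_decrypt_crt` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Computes base^exp mod `mod` by square-and-multiply.
// Assumes mod < 2^32 so that every product fits in 64 bits.
std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp, std::uint64_t mod) {
    assert(mod > 0 && mod < (1ULL << 32));
    std::uint64_t result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) {
            result = result * base % mod;
        }
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

// Decrypts c by working modulo p and modulo q separately and recombining.
// Assumes p and q are distinct odd primes below 2^16, so that p * q < 2^32.
std::uint64_t rsa_decrypt_crt(std::uint64_t c, std::uint64_t d,
                              std::uint64_t p, std::uint64_t q) {
    assert(p != q && p > 2 && q > 2);
    // c^d mod p and c^d mod q, with the exponent reduced by Fermat's little theorem.
    std::uint64_t m_p = pow_mod(c, d % (p - 1), p);
    std::uint64_t m_q = pow_mod(c, d % (q - 1), q);
    // q_inv = q^{-1} mod p, computed as q^{p-2} mod p because p is prime.
    std::uint64_t q_inv = pow_mod(q, p - 2, p);
    // Find x = m_q + q * h with x == m_p (mod p).
    std::uint64_t h = q_inv * ((m_p + p - m_q % p) % p) % p;
    return m_q + q * h;  // lies in [0, p * q)
}
```

Both exponentiations use moduli and exponents of roughly half the bit length of $N$ and $d$, which is where the speedup comes from.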
## Can I Encrypt $p$ with RSA?
Now we return to the problem where $\gcd(m, N) \neq 1$. For $m$ chosen uniformly from $\mathbb{Z}_N$, the probability of $\gcd(m, N) \neq 1$ is $\frac{1}{p} + \frac{1}{q} - \frac{1}{pq}$, so if we take large primes $p, q \approx 2^{1024}$ as in RSA2048, the probability of this occurring is roughly $2^{-1023}$, which is negligible. But for completeness, we also prove correctness in this case.
$e, d$ are still chosen to satisfy $ed \equiv 1 \pmod {\phi(N)}$. Suppose we want to decrypt $c \equiv m^e \pmod N$.
We will also use the Chinese remainder theorem here.
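Since $\gcd(m, N) \neq 1$ and $N = pq$, at least one of $p, q$ divides $m$. The case $m = 0$ is trivial, so without loss of generality assume $m \neq 0$ and $p \mid m$. So if we compute in $\mathbb{Z}_p$, we will get $0$,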
$$
c^d \equiv m^{ed} \equiv 0^{ed} \equiv 0 \pmod p.
$$
We also do the computation in $\mathbb{Z}_q$ and get
$$
c^d \equiv m^{ed} \equiv m^{1 + k\phi(N)} \equiv m\cdot (m^{q-1})^{k(p-1)} \equiv m \cdot 1^{k(p-1)} \equiv m \pmod q.
$$
Here, we used the fact that $m^{q-1} \equiv 1 \pmod q$. This holds because if $p \mid m$, $m$ is a multiple of $p$ that is less than $N$, so $m = pm'$ for some $m'$ such that $1 \leq m' < q$. Then $\gcd(m, q) = \gcd(pm', q) = 1$ since $q$ does not divide $p$ and $m'$ is less than $q$.
Therefore, from $c^d \equiv 0 \pmod p$ and $c^d \equiv (m \bmod q) \pmod q$, we can recover a unique solution $c^d \equiv m \pmod N$.
Now we must argue that the recovered solution is actually equal to the original $m$. But what we did above was showing that $m^{ed}$ and $m$ in $\mathbb{Z}_N$ are mapped to the same element $(0, m \bmod q)$ in $\mathbb{Z}_p \times \mathbb{Z}_q$. Since the Chinese remainder theorem tells us that this mapping is an isomorphism, $m^{ed}$ and $m$ must have been the same elements of $\mathbb{Z}_N$ in the first place.
Notice that we did not require $m$ to be relatively prime to $N$. Thus the RSA encryption scheme is correct for any $m \in \mathbb{Z}_N$.
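A tiny example of this case, with numbers chosen only for illustration: let $p = 3$, $q = 11$, $N = 33$, $\phi(N) = 20$, $e = 3$, $d = 7$ (so $ed = 21 \equiv 1 \pmod{20}$), and encrypt $m = p = 3$. Then $c = 3^3 = 27$, and
$$
c^d = 27^7 \equiv 0 \pmod 3, \qquad c^d = 27^7 \equiv 5^7 \equiv 3 \pmod{11}.
$$
The only element of $\mathbb{Z}_{33}$ that maps to $(0, 3)$ in $\mathbb{Z}_3 \times \mathbb{Z}_{11}$ is $3$, so $c^d \equiv 3 \equiv m \pmod{33}$ even though $\gcd(m, N) = 3 \neq 1$.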
## Correctness of RSA with Fermat's Little Theorem
Actually, the correctness shown above can be proven using only Fermat's little theorem. In the above proof, the Chinese remainder theorem was used to transform the operation, but for $N = pq$, the situation is simple enough that this theorem is not necessarily required.
Let $M = m^{ed} - m$. We have shown above only using Fermat's little theorem that $p \mid M$ and $q \mid M$, for any choice of $m \in \mathbb{Z}_N$. Then since $N = pq = \mathrm{lcm}(p, q)$, we have $N \mid M$, so $m^{ed} \equiv m \pmod N$. Hence the RSA scheme is correct.
So we don't actually need Euler's generalization for proving the correctness of RSA...?! In fact, the proof given in the original paper of RSA used Fermat's little theorem.
## Discrete Logarithms
This is an inverse problem of exponentiation. The inverse of exponentials is logarithms, so we consider the **discrete logarithm of a number modulo $p$**.
Given $y \equiv g^x \pmod p$ for some prime $p$, we want to find $x = \log_g y$. We set $g$ to be a generator of the group $\mathbb{Z}_p^*$, since then a solution exists for every $y \in \mathbb{Z}_p^*$.
Read more in [discrete logarithm problem (Modern Cryptography)](2023-10-03-key-exchange.md#discrete-logarithm-problem-dl).
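To see what "finding $x$" means computationally, here is a brute-force sketch. It simply tries every exponent, so its running time grows linearly in $p$, which is hopeless for cryptographic sizes. The function name `discrete_log_bruteforce` and the toy values $p = 23$, $g = 5$ are assumptions for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Returns the smallest x in [0, p - 2] with g^x == y (mod p), or -1 if none exists.
// Assumes p is a prime below 2^32 so that products fit in 64 bits.
std::int64_t discrete_log_bruteforce(std::uint64_t g, std::uint64_t y, std::uint64_t p) {
    assert(p > 2 && p < (1ULL << 32));
    g %= p;
    y %= p;
    std::uint64_t power = 1;  // holds g^x mod p, starting from x = 0
    for (std::uint64_t x = 0; x + 1 < p; ++x) {
        if (power == y) {
            return static_cast<std::int64_t>(x);
        }
        power = power * g % p;
    }
    return -1;
}
```

For example, $5$ generates $\mathbb{Z}_{23}^*$ and $5^6 \equiv 8 \pmod{23}$, so `discrete_log_bruteforce(5, 8, 23)` returns `6`.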
## ElGamal Encryption
This is an encryption scheme built upon the hardness of the DLP.
> 1. Let $p$ be a large prime.
> 2. Select a generator $g \in \mathbb{Z}_p^*$.
> 3. Choose a private key $x \in \mathbb{Z}_p^*$.
> 4. Compute the public key $y = g^x \pmod p$.
> - $p, g, y$ will be publicly known.
> - $x$ is kept secret.
### ElGamal Encryption and Decryption
Suppose we encrypt a message $m \in \mathbb{Z}_p^*$.
> 1. The sender chooses a random $k \in \mathbb{Z}_p^*$, called the *ephemeral key*.
> 2. Compute $c_1 = g^k \pmod p$ and $c_2 = my^k \pmod p$.
> 3. $c_1, c_2$ are sent to the receiver.
> 4. The receiver calculates $c_1^x \equiv g^{xk} \equiv y^k \pmod p$, and finds the inverse $y^{-k} \in \mathbb{Z}_p^*$.
> 5. Then $c_2y^{-k} \equiv m \pmod p$, recovering the message.
The attacker will see $g^k$. By the hardness of DLP, the attacker is unable to recover $k$ even if he knows $g$.
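The scheme can be sketched with toy parameters. The values $p = 23$, $g = 5$, $x = 6$ below are assumptions for illustration and are far too small to be secure. The ephemeral key $k$ is passed in explicitly so the example is reproducible, whereas a real sender must draw it at random for every message. The inverse $y^{-k}$ is computed as $(y^k)^{p-2}$ by Fermat's little theorem, which is one possible choice. All names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Computes base^exp mod `mod` by square-and-multiply.
// Assumes mod < 2^32 so that every product fits in 64 bits.
std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp, std::uint64_t mod) {
    assert(mod > 0 && mod < (1ULL << 32));
    std::uint64_t result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) {
            result = result * base % mod;
        }
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

struct ElGamalCiphertext {
    std::uint64_t c1;  // g^k mod p
    std::uint64_t c2;  // m * y^k mod p
};

// Encrypts m under the public key (p, g, y) with ephemeral key k.
ElGamalCiphertext elgamal_encrypt(std::uint64_t m, std::uint64_t k,
                                  std::uint64_t p, std::uint64_t g, std::uint64_t y) {
    assert(m > 0 && m < p && k > 0 && k < p);
    return {pow_mod(g, k, p), m * pow_mod(y, k, p) % p};
}

// Decrypts with the private key x: m = c2 * (c1^x)^{-1} mod p.
std::uint64_t elgamal_decrypt(ElGamalCiphertext c, std::uint64_t x, std::uint64_t p) {
    std::uint64_t shared = pow_mod(c.c1, x, p);            // equals y^k
    std::uint64_t shared_inv = pow_mod(shared, p - 2, p);  // inverse, since p is prime
    return c.c2 * shared_inv % p;
}
```

With $y = 5^6 \bmod 23 = 8$, encrypting $m = 7$ with $k = 3$ yields $(c_1, c_2) = (10, 19)$, and decrypting with $x = 6$ returns $7$.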
#### Ephemeral Key Should Be Distinct
If the same $k$ is used twice, the encryption is not secure. Suppose we encrypt two different messages $m_1, m_2 \in \mathbb{Z} _ p^{ * }$ with the same $k$. The attacker will see $(g^k, m_1y^k)$ and $(g^k, m_2 y^k)$. Since we are in the multiplicative group $\mathbb{Z} _ p^{ * }$, inverses exist, so the attacker can compute
$$
m_1y^k \cdot (m_2 y^k)^{-1} \equiv m_1m_2^{-1} \pmod p
$$
without knowing $k$ or $x$. This ratio leaks information about the messages: the attacker learns whether $m_1 \equiv m_2 \pmod p$ (the ratio is $1$ exactly in that case), and if one of the two messages is known, the other can be recovered.
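A small numeric illustration, with toy values chosen here: let $p = 23$, $g = 5$, $y = 8$, and suppose $k = 3$ is reused for $m_1 = 7$ and $m_2 = 9$. Since $y^k \equiv 6 \pmod{23}$, the attacker sees the second components $7 \cdot 6 \equiv 19$ and $9 \cdot 6 \equiv 8$, and computes
$$
19 \cdot 8^{-1} \equiv 19 \cdot 3 \equiv 11 \equiv m_1 m_2^{-1} \pmod{23}.
$$
If the attacker also happens to know $m_1 = 7$, then
$$
m_2 \equiv m_1 \cdot 11^{-1} \equiv 7 \cdot 21 \equiv 9 \pmod{23},
$$
so the second message is recovered without ever learning $k$ or the private key.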
[^1]: If one of the primes is small, factoring is easy. Therefore we require that $p, q$ both be large primes.
[^2]: There is a quantum polynomial time (BQP) algorithm for integer factorization. See [Shor's algorithm](https://en.wikipedia.org/wiki/Shor%27s_algorithm).
[^3]: This part of the explanation is not necessary if we use abstract algebra!


@@ -0,0 +1,138 @@
---
share: true
toc: true
math: true
categories:
- Lecture Notes
- Internet Security
tags:
- lecture-note
- security
- cryptography
title: 07. Public Key Cryptography
date: 2023-10-09
github_title: 2023-10-09-public-key-cryptography
---
In symmetric key cryptography, we have a problem with key sharing and management. More info in the first few paragraphs of [Key Exchange (Modern Cryptography)](2023-10-03-key-exchange.md#).
## Public Key Cryptography
We use **two** keys for public key cryptography. The keys are called the *public key* and the *private key*. These two keys are related to each other, but it is computationally infeasible to calculate the private key from the public key.
- **Public key** is *public*, and anyone can use it to encrypt messages or verify signatures.
- **Private key** (or secret key) is only kept by the owner. It is used to decrypt messages or create signatures.
We will denote public keys as $pk$ and private keys as $sk$.
These keys are created to be used in **trapdoor one-way functions**.
### One-way Function
A **one-way function** is a function that is easy to compute, but for which it is hard to find a pre-image of a given output. Here are some common examples.
- *Cryptographic hash functions*: [Hash Functions (Modern Cryptography)](2023-09-28-hash-functions.md#collision-resistance).
- *Factoring a large integer*: It is easy to multiply two integers even if they're large, but factoring is very hard.
- *Discrete logarithm problem*: It is easy to exponentiate a number, but it is hard to find the discrete logarithm.
But a one-way function is not enough. Suppose that $f$ is a one-way function with a public key $pk$. It will be easy to encrypt a message $m$ as $f(pk, m)$, but recovering $m$ is hard even for the intended recipient.
### Trapdoor One-way Function
A **trapdoor one-way function** has a *trapdoor*. It is computationally difficult to find the preimage, but with the trapdoor, inverting is easy.
In public key cryptography, the trapdoor is the *private key* that makes it easy to invert the one-way function $f$. So the recipient can efficiently invert $f$ and recover the message $m$.
### Encryption and Decryption
In public key cryptography, encryption and decryption are done as follows.
Suppose that Alice wants to send a secret message to Bob. Alice must encrypt the message using **Bob's public key**, so that only Bob can decrypt the message.
> 1. Alice takes a plaintext and encrypts it using Bob's public key.
> 2. The ciphertext is sent to Bob.
> 3. Bob uses his private key to decrypt the ciphertext.
Mathematically, let $pk, sk$ be Bob's public key and private key.
> 1. Alice computes the ciphertext $c = f(pk, m)$ of the message $m$.
> 2. $c$ is sent to Bob.
> 3. Bob computes $m = f^{-1}(sk, c)$ and recovers $m$.
### Authentication
Public key cryptography can also be used for **authentication**. If some ciphertext can be decrypted with Alice's public key, it must have been produced with Alice's private key, so we can verify that the message was from Alice.
We will learn more about this when we learn digital signatures.
### Applications of Public Key Cryptography
- **Encryption and decryption**: for private communication.
- **Digital signatures**: authentication, as explained above.
    - This was not possible with symmetric cryptography: both parties hold the same key, so non-repudiation is not satisfied.
- **Key exchange**
    - We assumed that in symmetric cryptography, there was a secure channel to share the secret key.
    - We use public key cryptography to exchange and agree on the secret key for the symmetric cipher.
    - Public key cryptography takes longer to calculate, so it is preferable to use symmetric ciphers for the actual data.
But a problem still remains. How does one verify that a public key indeed belongs to the claimed identity? In the example above, how does Alice know that this public key belongs to Bob and not to someone else? This problem will be solved using **public key infrastructure**.
## Diffie-Hellman Key Exchange
Choose a large prime $p$ and a generator $g$ of $\mathbb{Z}_p^{ * }$. The description of $g$ and $p$ will be known to the public.
> 1. Alice chooses some $x \in \mathbb{Z}_p^{ * }$ and sends $g^x \bmod p$ to Bob.
> 2. Bob chooses some $y \in \mathbb{Z}_p^{ * }$ and sends $g^y \bmod p$ to Alice.
> 3. Alice and Bob calculate $g^{xy} \bmod p$ separately.
> 4. Eve can see $g^x \bmod p$, $g^y \bmod p$ but cannot calculate $g^{xy} \bmod p$.
Refer to [Diffie-Hellman Key Exchange (Modern Cryptography)](2023-10-03-key-exchange.md#diffie-hellman-key-exchange-dhke).
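The exchange can be sketched with toy numbers. The values $p = 23$, $g = 5$ and the secrets $x = 6$, $y = 15$ are assumptions for illustration only, and the names `pow_mod`, `dh_public` and `dh_shared` are made up here. Real parameters are thousands of bits long and the secrets are chosen at random.

```cpp
#include <cassert>
#include <cstdint>

// Computes base^exp mod `mod` by square-and-multiply.
// Assumes mod < 2^32 so that every product fits in 64 bits.
std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp, std::uint64_t mod) {
    assert(mod > 0 && mod < (1ULL << 32));
    std::uint64_t result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) {
            result = result * base % mod;
        }
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

// The value a party sends over the public channel: g^secret mod p.
std::uint64_t dh_public(std::uint64_t g, std::uint64_t secret, std::uint64_t p) {
    return pow_mod(g, secret, p);
}

// The shared key a party derives from the peer's public value: peer^secret mod p.
std::uint64_t dh_shared(std::uint64_t peer_public, std::uint64_t secret, std::uint64_t p) {
    return pow_mod(peer_public, secret, p);
}
```

Here Alice sends $5^6 \bmod 23 = 8$ and Bob sends $5^{15} \bmod 23 = 19$. Both then arrive at the same key, $19^6 \equiv 8^{15} \equiv 2 \pmod{23}$.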
## Message Integrity
A function $H$ takes a message of arbitrary length as input and outputs a fixed-length string. The output is called a **message digest**, *tag*, *fingerprint* or **hash**.
Here, $H$ is called a **hash function**. This function is many-to-one, so collisions exist, but it should be computationally infeasible to find one.
**Desirable Properties of $H$**.
- $H$ should be easy to calculate.
- It should be hard to recover $m$ from $H(m)$. (one-wayness)
- It should be computationally difficult to find a collision. (collision resistance)
- The output should seem random.
Using this function, we can check whether the message was tampered with during transmission.
### Message Authentication Code (MAC)
We assume that Alice and Bob already share a secret $k$. Alice wants to send a message $m$ to Bob.
> 1. Alice calculates the tag $t = H(k, m)$ of the message using the key.
> 2. Alice sends the message and the tag.
> 3. Bob calculates the tag $t' = H(k, m')$ from the received message $m'$. If $t'$ does not match $t$, Bob detects that the message was modified.
We only care about message integrity in MACs, so the message is not encrypted.
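The tag-and-verify flow can be sketched as below. Note the loud assumption: `toy_hash` is FNV-1a, a **non-cryptographic** hash used purely as a stand-in for $H$, and $H(k, m)$ is modeled as hashing the key followed by the message. This only shows the shape of the protocol. Real systems use a dedicated construction such as HMAC over a cryptographic hash. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// FNV-1a (64-bit). NOT cryptographically secure; placeholder for H only.
std::uint64_t toy_hash(const std::string& data) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis
    for (unsigned char byte : data) {
        h ^= byte;
        h *= 1099511628211ULL;  // FNV prime
    }
    return h;
}

// Alice's side: t = H(k, m), modeled here as hashing key followed by message.
std::uint64_t mac_tag(const std::string& key, const std::string& message) {
    assert(!key.empty());
    return toy_hash(key + message);
}

// Bob's side: recompute the tag from the received message and compare.
bool mac_verify(const std::string& key, const std::string& message, std::uint64_t tag) {
    return mac_tag(key, message) == tag;
}
```

If the message is altered in transit, for example from `"pay 10"` to `"pay 11"`, the recomputed tag no longer matches and `mac_verify` returns `false`.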
### Properties of MAC
- MACs are based on symmetric keys, so communicating parties must share a key.
- MACs should be able to accept messages of arbitrary length.
- MACs should output a fixed-length string.
- MACs should provide message integrity. Any manipulation in transit will be detected, and the receiving party is assured of the origin of the message.
- MACs **do not** support non-repudiation.
    - Since both parties have the secret, either party could have created the message and its tag.
## Digital Signatures
**Digital signatures** achieve *integrity*, *non-repudiation* and *authentication*. We leverage public key cryptography.
Suppose Alice wants to **sign** a message $m$. Alice has public key $pk$ and private key $sk$.
> 1. Alice calculates $\sigma = D(sk, m)$ and sends $m \parallel \sigma$.
> 2. Bob receives it and calculates $E(pk, \sigma)$ and compares it with $m$.
> - The key $pk$ here is Alice's public key.
- Since the signature can be decrypted using Alice's public key, it must have been signed using Alice's private key.
- Thus the message must have been from Alice.
- Verification is done using Alice's public key, so anyone can verify the message.
- Messages are usually long, so we take a hash function $H$ to shorten it, and sign $H(m)$ instead.
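As a sketch of the sign-then-verify flow, the example below instantiates $D$ and $E$ with textbook RSA, using the toy key $N = 3233$, $e = 17$, $d = 2753$. This instantiation is an assumption made for illustration, since the notes leave $D$ and $E$ abstract. The message is taken to be a number in $\mathbb{Z}_N$ and the hashing step is omitted. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Computes base^exp mod `mod` by square-and-multiply.
// Assumes mod < 2^32 so that every product fits in 64 bits.
std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp, std::uint64_t mod) {
    assert(mod > 0 && mod < (1ULL << 32));
    std::uint64_t result = 1 % mod;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) {
            result = result * base % mod;
        }
        base = base * base % mod;
        exp >>= 1;
    }
    return result;
}

// Alice signs with her private key: sigma = D(sk, m) = m^d mod N.
std::uint64_t rsa_sign(std::uint64_t m, std::uint64_t d, std::uint64_t N) {
    assert(m < N);
    return pow_mod(m, d, N);
}

// Anyone verifies with Alice's public key: accept iff E(pk, sigma) = sigma^e mod N equals m.
bool rsa_verify(std::uint64_t m, std::uint64_t sigma, std::uint64_t e, std::uint64_t N) {
    return pow_mod(sigma, e, N) == m % N;
}
```

A signature on $m = 65$ verifies against $65$ but not against a tampered message such as $66$, since $\sigma^e \equiv 65 \pmod N$.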


@@ -171,6 +171,8 @@ Since the adversary can see the ciphertext, this kind of relation leaks some inf
Also, the key is (at least) as long as the message. This is why OTP is rarely used today. When sending a long message, two parties must communicate a very long key that is as long as the message, *every single time*! This makes it hard to manage the key.
## Shannon's Theorem
So is there a way to reduce the key size without losing perfect secrecy? Sadly, no. In fact, the key space must be at least as large as the message space. This is a requirement for perfectly secret schemes.
> **Theorem**. If $(G, E, D)$ is a perfectly secret encryption scheme, then $\lvert \mathcal{K} \rvert \geq \lvert \mathcal{M} \rvert$.
@@ -290,7 +292,7 @@ We can deduce that if a PRG is predictable, then it is insecure.
*Proof*. Let $\mathcal{A}$ be an efficient adversary (next bit predictor) that predicts $G$. Suppose that $i$ is the index chosen by $\mathcal{A}$. With $\mathcal{A}$, we construct a statistical test $\mathcal{B}$ such that $\mathrm{Adv}_\mathrm{PRG}[\mathcal{B}, G]$ is non-negligible.
![mc-01-prg-game.png](../../../assets/img/posts/Lecture%20Notes/Modern%20Cryptography/mc-01-prg-game.png)
![mc-01-prg-game.png](../../../assets/img/posts/Lecture%20Notes/Modern%20Cryptography/mc-01-prg-game.png#)
1. The challenger PRG will send a bit string $x$ to $\mathcal{B}$.
- In experiment $0$, PRG gives pseudorandom string $G(k)$.
@@ -316,7 +318,7 @@ The theorem implies that if next bit predictors cannot distinguish $G$ from true
To motivate the definition of semantic security, we consider a **security game framework** (attack game) between a **challenger** (ex. the creator of some cryptographic scheme) and an **adversary** $\mathcal{A}$ (ex. attacker of the scheme).
![mc-01-ss.png](../../../assets/img/posts/Lecture%20Notes/Modern%20Cryptography/mc-01-ss.png)
![mc-01-ss.png](../../../assets/img/posts/Lecture%20Notes/Modern%20Cryptography/mc-01-ss.png#)
> **Definition.** Let $\mathcal{E} = (G, E, D)$ be a cipher defined over $(\mathcal{K}, \mathcal{M}, \mathcal{C})$. For a given adversary $\mathcal{A}$, we define two experiments $0$ and $1$. For $b \in \lbrace 0, 1 \rbrace$, define experiment $b$ as follows:
>