mirror of
https://github.com/calofmijuck/blog.git
synced 2025-12-06 22:53:51 +00:00
[PUBLISHER] upload files #159
* PUSH NOTE : 09. Transport Layer Security.md * PUSH ATTACHMENT : is-09-tls-handshake.png * PUSH NOTE : 08. Public Key Infrastructure.md * PUSH ATTACHMENT : is-08-certificate-validation.png * PUSH NOTE : 07. Public Key Cryptography.md * PUSH NOTE : 06. RSA and ElGamal Encryption.md * PUSH NOTE : 05. Modular Arithmetic (2).md * PUSH NOTE : 04. Modular Arithmetic (1).md * PUSH NOTE : 03. Symmetric Key Cryptography (2).md * PUSH ATTACHMENT : is-03-feistel-function.png * PUSH ATTACHMENT : is-03-ecb-encryption.png * PUSH ATTACHMENT : is-03-cbc-encryption.png * PUSH ATTACHMENT : is-03-cfb-encryption.png * PUSH ATTACHMENT : is-03-ofb-encryption.png * PUSH ATTACHMENT : is-03-ctr-encryption.png * PUSH NOTE : 02. Symmetric Key Cryptography (1).md * PUSH NOTE : 01. Security Introduction.md * PUSH ATTACHMENT : is-01-cryptosystem.png * PUSH NOTE : 9. Public Key Encryption.md * PUSH ATTACHMENT : mc-09-ss-pke.png * PUSH NOTE : 7. Key Exchange.md * PUSH ATTACHMENT : mc-07-dhke.png * PUSH ATTACHMENT : mc-07-dhke-mitm.png * PUSH ATTACHMENT : mc-07-merkle-puzzles.png * PUSH NOTE : 6. Hash Functions.md * PUSH ATTACHMENT : mc-06-merkle-damgard.png * PUSH ATTACHMENT : mc-06-davies-meyer.png * PUSH ATTACHMENT : mc-06-hmac.png * PUSH NOTE : 5. CCA-Security and Authenticated Encryption.md * PUSH ATTACHMENT : mc-05-ci.png * PUSH ATTACHMENT : mc-05-etm-mte.png * PUSH NOTE : 4. Message Authentication Codes.md * PUSH ATTACHMENT : mc-04-mac.png * PUSH ATTACHMENT : mc-04-mac-security.png * PUSH ATTACHMENT : mc-04-cbc-mac.png * PUSH ATTACHMENT : mc-04-ecbc-mac.png * PUSH NOTE : 2. PRFs, PRPs and Block Ciphers.md * PUSH ATTACHMENT : mc-02-block-cipher.png * PUSH ATTACHMENT : mc-02-feistel-network.png * PUSH ATTACHMENT : mc-02-des-round.png * PUSH ATTACHMENT : mc-02-DES.png * PUSH ATTACHMENT : mc-02-aes-128.png * PUSH ATTACHMENT : mc-02-2des-mitm.png * PUSH NOTE : 16. The GMW Protocol.md * PUSH ATTACHMENT : mc-16-beaver-triple.png * PUSH NOTE : 13. 
Sigma Protocols.md * PUSH ATTACHMENT : mc-13-sigma-protocol.png * PUSH ATTACHMENT : mc-10-schnorr-identification.png * PUSH ATTACHMENT : mc-13-okamoto.png * PUSH ATTACHMENT : mc-13-chaum-pedersen.png * PUSH ATTACHMENT : mc-13-gq-protocol.png * PUSH NOTE : 12. Zero-Knowledge Proofs (Introduction).md * PUSH ATTACHMENT : mc-12-id-protocol.png * PUSH NOTE : 10. Digital Signatures.md * PUSH ATTACHMENT : mc-10-dsig-security.png * PUSH NOTE : 1. OTP, Stream Ciphers and PRGs.md * PUSH ATTACHMENT : mc-01-prg-game.png * PUSH ATTACHMENT : mc-01-ss.png * DELETE FILE : _posts/Lecture Notes/Internet Security/2023-09-10-security-intro.md * DELETE FILE : _posts/Lecture Notes/Internet Security/2023-09-11-symmetric-key-cryptography-1.md * DELETE FILE : _posts/Lecture Notes/Internet Security/2023-09-18-symmetric-key-cryptography-2.md * DELETE FILE : _posts/Lecture Notes/Internet Security/2023-09-25-modular-arithmetic-1.md * DELETE FILE : _posts/Lecture Notes/Internet Security/2023-10-04-modular-arithmetic-2.md * DELETE FILE : _posts/Lecture Notes/Internet Security/2023-10-04-rsa-elgamal.md * DELETE FILE : _posts/Lecture Notes/Internet Security/2023-10-09-public-key-cryptography.md * DELETE FILE : _posts/Lecture Notes/Internet Security/2023-10-16-pki.md * DELETE FILE : _posts/Lecture Notes/Internet Security/2023-10-18-tls.md * DELETE FILE : _posts/lecture-notes/internet-security/2023-10-19-public-key-encryption.md * DELETE FILE : assets/img/posts/Lecture Notes/Internet Security/is-01-cryptosystem.png * DELETE FILE : assets/img/posts/Lecture Notes/Internet Security/is-03-cbc-encryption.png * DELETE FILE : assets/img/posts/Lecture Notes/Internet Security/is-03-cfb-encryption.png * DELETE FILE : assets/img/posts/Lecture Notes/Internet Security/is-03-ctr-encryption.png * DELETE FILE : assets/img/posts/Lecture Notes/Internet Security/is-03-ecb-encryption.png * DELETE FILE : assets/img/posts/Lecture Notes/Internet Security/is-03-feistel-function.png * DELETE FILE : assets/img/posts/Lecture 
Notes/Internet Security/is-03-ofb-encryption.png * DELETE FILE : assets/img/posts/Lecture Notes/Internet Security/is-08-certificate-validation.png * DELETE FILE : assets/img/posts/Lecture Notes/Internet Security/is-09-tls-handshake.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-01-prg-game.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-01-ss.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-02-2des-mitm.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-02-DES.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-02-aes-128.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-02-block-cipher.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-02-des-round.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-02-feistel-network.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-04-cbc-mac.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-04-ecbc-mac.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-04-mac-security.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-04-mac.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-05-ci.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-05-etm-mte.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-06-davies-meyer.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-06-hmac.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-06-merkle-damgard.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-07-dhke-mitm.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-07-dhke.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-07-merkle-puzzles.png * DELETE FILE : assets/img/posts/Lecture 
Notes/Modern Cryptography/mc-09-ss-pke.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-10-dsig-security.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-10-schnorr-identification.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-12-id-protocol.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-13-chaum-pedersen.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-13-gq-protocol.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-13-okamoto.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-13-sigma-protocol.png * DELETE FILE : assets/img/posts/Lecture Notes/Modern Cryptography/mc-16-beaver-triple.png
@@ -0,0 +1,261 @@
|
||||
---
|
||||
share: true
|
||||
toc: true
|
||||
math: false
|
||||
categories:
|
||||
- Lecture Notes
|
||||
- Internet Security
|
||||
path: _posts/lecture-notes/internet-security
|
||||
tags:
|
||||
- network
|
||||
- security
|
||||
- lecture-note
|
||||
title: 01. Security Introduction
|
||||
date: 2023-09-10
|
||||
github_title: 2023-09-10-security-intro
|
||||
image:
|
||||
path: /assets/img/posts/lecture-notes/internet-security/is-01-cryptosystem.png
|
||||
attachment:
|
||||
folder: assets/img/posts/lecture-notes/internet-security
|
||||
---
|
||||
|
||||
> Every program has at least two purposes: the one for which it was written, and another for which it wasn't. - Alan J. Perlis
|
||||
|
||||
## Security Overview
|
||||
|
||||
### Security
|
||||
|
||||
**Security** may mean different things.
|
||||
- Emotional security
|
||||
- Physical security: physical separation of assets
|
||||
- Resource exhaustion: mitigating DoS attacks
|
||||
- **System security**
|
||||
- **Network security**
|
||||
- Cryptography
|
||||
- Social Engineering: email pranksters (impersonating)
|
||||
|
||||
In this course, we are mainly interested in system/network security!
|
||||
|
||||
**IT Security** has two categories (though the boundary is blurry):
|
||||
- **Computer** (system) **security** uses automated tools and mechanisms to protect the **data in a computer**, against hackers, malware, etc.
|
||||
- **Internet** (network) **security** prevents, detects, and corrects security violations that involve the **transmission of information** in a network.
|
||||
|
||||
In internet security, we assume that:
|
||||
- Everything on the network can be an attack target.
|
||||
- Every transmitted bit can be tapped (eavesdropped).
|
||||
|
||||
### Modeling in Network Security
|
||||
|
||||
- Basically, we have a sender and a receiver, and they communicate through the internet.
|
||||
- **Sender and receiver want to communicate *securely***.
|
||||
- But the adversary can attack the communication channel. For instance,
|
||||
- tapping, eavesdropping, snooping messages
|
||||
- inserting, modifying, deleting, replaying messages
|
||||
- poisoning data
|
||||
- impersonating someone else
|
||||
- Conventionally, we use the following names:
|
||||
- Alice and Bob for the two parties participating in the communication.
|
||||
- Eve (or Mallory, Oscar) for the adversary.
|
||||
|
||||
## Security Attacks
|
||||
|
||||
This is only an overview, so the attacks are introduced briefly.
|
||||
|
||||
### Computer/Network Attacks
|
||||
|
||||
- Malware: malicious software
|
||||
- virus, worm, Trojan, spyware, ransomware
|
||||
- Bots that automate malicious tasks
|
||||
- [Buffer Overflow](https://en.wikipedia.org/wiki/Buffer_overflow) (BOF)
|
||||
- Denial of Service (DoS)
|
||||
- Distributed DoS (DDoS) if numerous hosts are used
|
||||
- Network-based attacks (upcoming)
|
||||
- Physical Attacks
|
||||
- [Van Eck phreaking](https://en.wikipedia.org/wiki/Van_Eck_phreaking)
|
||||
- Energy weapons (electromagnetic waves)
|
||||
- Password Attacks
|
||||
- Password guessing, dictionary attacks, brute force attacks
|
||||
- Information gathering attacks
|
||||
- through phone, web, SNS (watch what you post)
|
||||
- Phishing with cloned websites
|
||||
- the information you enter will be sent to the attacker
|
||||
- (Port) Scanning: searching for open ports on a server
|
||||
- [Side Channel Attacks](https://en.wikipedia.org/wiki/Side-channel_attack): attacks based on extra information rather than the flaws in the design of the protocol or algorithm itself.
|
||||
- Timing information, power consumption can be used
|
||||
- Data remanence: reading sensitive data after they have been deleted
|
||||
|
||||
### Network-based Attacks
|
||||
|
||||
- Cryptographic attacks: decrypting ciphertext, finding the key
|
||||
- Spoofing: ARP, DNS, cache poisoning
|
||||
- Session hijacking
|
||||
- Impersonation, man-in-the-middle (MITM) attacks
|
||||
- Network domain specific attacks: wireless, web, mobile, IoT etc.
|
||||
|
||||
Security attacks fall into two types:
|
||||
- **Active attacks**: modify the content of messages
|
||||
- Ex. (D)DoS, MITM, poisoning, smurf attack, system attacks.
|
||||
- *Detection* and recovery are important, since active attacks threaten *data integrity* and *availability* and are hard to prevent entirely.
|
||||
- **Passive attacks**: do not modify information, but observe or copy its content.
|
||||
- Ex. eavesdropping, port scanning (idle scan secretly scans).
|
||||
- *Prevention* (e.g. encryption) is important, since passive attacks are a danger to *confidentiality* and are hard to detect.
|
||||
|
||||
## Security Services and Mechanisms
|
||||
|
||||
### CIA Triad
|
||||
|
||||
What kind of security services do we want? The basic network security services must support the following. These are also known as the **CIA triad**.
|
||||
|
||||
- **Confidentiality**: the data must be kept secret (privacy)
|
||||
- **Integrity**: the data must not be modified during transmission (consistency, accuracy, trustworthiness)
|
||||
- **Availability**: information should be consistently and readily accessible
|
||||
|
||||
Additionally, we also need:
|
||||
- **Authentication**: a way to authenticate users (ID, passwords)
|
||||
- **Non-repudiation**: ensure that no party can deny that it sent or received a message or approved some information
|
||||
- Assurance that someone cannot deny the validity of message or information
|
||||
|
||||
### Attacks Against CIA Triad
|
||||
|
||||
- Confidentiality: snooping, traffic analysis
|
||||
- Integrity: modification, masquerading, replaying, repudiation
|
||||
- Availability: denial of service
|
||||
|
||||
### More Security Services
|
||||
|
||||
- **Access control**: controlling privileges to access assets
|
||||
- identification, authentication (credential validation), authorization
|
||||
- **Anonymity**: name or identification is hidden
|
||||
- **Accountability**: any actions of an entity can be traced uniquely to that entity
|
||||
- similar to responsibility of an entity to some event or incident
|
||||
- **Security audit**: assessment or evaluation of an organization's security systems
|
||||
- **Privacy**: keeping data safe in transit and in storage
|
||||
- **Digital forensics**: recovering data from digital devices
|
||||
|
||||
### Security Mechanisms
|
||||
|
||||
There are many ways of achieving security.
|
||||
|
||||
- **Cryptography**: encryption/decryption of data
|
||||
- **Credential**: ID, password, certificates
|
||||
- **Message digest**: usage of hash functions and message authentication codes (MAC)
|
||||
- **Traffic padding**: to keep traffic size equal
|
||||
- It may be desirable to not leak *any* information, so one might add padding to the traffic, so the traffic is indistinguishable by the adversary (prevents side-channel attacks)
|
||||
- **Digital signatures**: provides authenticity of digital messages or documents
|
||||
- **Trusted Third Party** (TTP): a safe third-party that we can trust
|
||||
- If we have a TTP, a lot of problems go away. We can always ask the TTP for the truth.
|
||||
- But TTP can become a *single point of failure* (SPOF), and security architectures may become too dependent on the TTP.
|
||||
- **Append-only server**: keeps track of all modifications, good for auditing
|
||||
- Blockchain is a kind of append-only data structure.
|
||||
|
||||
## Cryptography
|
||||
|
||||
> **Cryptography** is the study of mathematical techniques for securing digital information, systems, and distributed computations against adversarial attacks.[^1]
|
||||
|
||||
**Cryptanalysis** is the study of methods for obtaining the meaning of encrypted information without access to the key.
|
||||
|
||||
### Basics of a Cryptosystem
|
||||
|
||||

|
||||
|
||||
- A **message** in *plaintext* is given to an **encryption algorithm**.
|
||||
- The encryption algorithm uses an **encryption key** to create a *ciphertext*.
|
||||
- The ciphertext is given to a **decryption algorithm**.
|
||||
- The decryption algorithm uses a **decryption key** to recover the original plaintext.
|
||||
- The encryption/decryption keys are only known to the sender/receiver.
|
||||
|
||||
### Classification of Cryptosystems
|
||||
|
||||
There are two criteria for classifying cryptosystems.
|
||||
|
||||
- How are the keys used?
|
||||
- **Symmetric** cryptography uses a single key for both encryption and decryption.
|
||||
- **Public key** cryptography uses different keys for encryption and decryption, respectively.
|
||||
- How are plaintexts processed?
|
||||
- **Block cipher**
|
||||
- **Stream cipher**
|
||||
|
||||
### Kerckhoffs' Principle
|
||||
|
||||
There are two choices to achieve the security of a cryptosystem.
|
||||
|
||||
1. Keep the encryption/decryption scheme secret. (security through obscurity)
|
||||
2. Keep the key secret.
|
||||
|
||||
But in real life, we use the second method and keep the key secret.
|
||||
|
||||
> The cipher method must not be required to be secret, and it must be able to fall into the hands of the enemy without inconvenience.[^1]
|
||||
|
||||
**Kerckhoffs' principle** demands that *security rely solely on the secrecy of the key*, even if everything about the system except the key is publicly known.
|
||||
|
||||
Why? Here are some of the arguments in favor of Kerckhoffs' principle.
|
||||
|
||||
1. It is significantly easier to maintain the secrecy of a short key than to keep an encryption scheme secret.
|
||||
- Information about the scheme might be leaked or reverse engineered.
|
||||
2. In case the secret information is exposed, it is much easier to replace the key, than to replace the encryption scheme.
|
||||
- Generating a new random key is relatively easy, but generating a new, *secure* encryption scheme is non-trivial.
|
||||
3. The public can review an encryption scheme to check for vulnerabilities.
|
||||
- *Standardization* of schemes is possible, supporting compatibility between different users.
|
||||
- It is beneficial to use strong schemes that have gone through public scrutiny.
|
||||
|
||||
## Threat Modeling
|
||||
|
||||
What should we consider when we are designing secure systems? We should consider what attacks are possible. **Threat modeling** is the process of systematically identifying the threats faced by a system.
|
||||
|
||||
1. Identify the values of assets.
|
||||
2. Enumerate the *attack surfaces*.
|
||||
3. Hypothesize attackers.
|
||||
- What kinds of assets would they want?
|
||||
- Are they able to attack through vulnerable surfaces?
|
||||
4. Survey mitigations.
|
||||
5. Balance costs vs. risks.
|
||||
|
||||
We consider the case of a smartphone.
|
||||
|
||||
### Identifying Assets
|
||||
|
||||
In a smartphone, assets (things of value) would be
|
||||
- Saved credentials such as passwords
|
||||
- *Personally identifiable information* (PII) such as social security number
|
||||
- Contacts, pictures, sensitive documents, credit card data
|
||||
- Access to sensors such as camera, microphone, network traffic or location
|
||||
- The device itself
|
||||
|
||||
### Attack Surfaces
|
||||
|
||||
- Physically stealing the device
|
||||
- Tricking the user into installing malicious applications
|
||||
- Passive eavesdropping on the network
|
||||
- Backdoors in the OS
|
||||
|
||||
### Hypothetical Attackers
|
||||
|
||||
For example,
|
||||
|
||||
|Attacker|Abilities|Goals|
|
||||
|:-:|-|-|
|
||||
|Thief|Steal the phone|Take the device|
|
||||
|FBI|Lot of things...|Obtain evidence from the device|
|
||||
|Eavesdropper|Observe network traffic|Steal information|
|
||||
|
||||
### Surveying Mitigations
|
||||
|
||||
Next, we survey how to mitigate the attacks.
|
||||
|
||||
Suppose we are mitigating theft. One could:
|
||||
- Apply strong authentication using passwords or biometrics
|
||||
- But this is annoying to the user
|
||||
- Use full device encryption
|
||||
- Use remote device tracking and format the device
|
||||
- May not work if the device is disconnected from the internet
|
||||
|
||||
For blocking eavesdroppers, one could apply HTTPS everywhere or use a VPN. But it's hard to check whether apps are actually using HTTPS, and VPNs may slow down the connection.
|
||||
|
||||
### Cost vs. Risk Analysis
|
||||
|
||||
- How costly is the mitigation?
|
||||
- Applying a strong password is not very costly.
|
||||
- How likely is the attack?
|
||||
- Attacks from the FBI are very unlikely for an average person.
|
||||
|
||||
[^1]: J. Katz and Y. Lindell, *Introduction to Modern Cryptography*
|
||||
@@ -0,0 +1,338 @@
|
||||
---
|
||||
share: true
|
||||
toc: true
|
||||
math: true
|
||||
categories:
|
||||
- Lecture Notes
|
||||
- Internet Security
|
||||
path: _posts/lecture-notes/internet-security
|
||||
tags:
|
||||
- security
|
||||
- lecture-note
|
||||
- cryptography
|
||||
title: 02. Symmetric Key Cryptography (1)
|
||||
date: 2023-09-11
|
||||
github_title: 2023-09-11-symmetric-key-cryptography-1
|
||||
---
|
||||
|
||||
## Symmetric Encryption
|
||||
|
||||
- Alice and Bob use the same key for encryption and decryption.
|
||||
- This was the only type of encryption before the invention of public-key cryptography.
|
||||
|
||||
### Requirements
|
||||
|
||||
- A strong encryption algorithm, which is known to the public.
|
||||
- Kerckhoffs' principle!
|
||||
- A secret key known only to sender and receiver.
|
||||
- We assume the **existence of a secure channel for distributing the key**.[^1]
|
||||
- **Correctness requirement**
|
||||
- Let $m$, $k$ denote the message and the key.
|
||||
- For encryption/decryption algorithm $E$ and $D$,
|
||||
- $D(k, E(k, m)) = m$.
|
||||
|
||||
## Cryptographic Attacks
|
||||
|
||||
In increasing order of the power of the attacker,
|
||||
|
||||
- **Ciphertext only attacks**: the attacker has ciphertexts, and tries to obtain information.
|
||||
- **Known plaintext attack**: the attacker has a collection of plaintext/ciphertext pairs.
|
||||
- **Chosen plaintext attack**: the attacker can obtain the ciphertext *for any plaintext chosen by the attacker*.
|
||||
- **Chosen ciphertext attack**: the attacker can obtain the plaintext *for any ciphertext chosen by the attacker*.
|
||||
|
||||
## Requirements for a Secure Cipher
|
||||
|
||||
The following two properties should hold for a secure cipher.
|
||||
- **Diffusion** hides the relationship between the ciphertext and the plaintext.
|
||||
- It should be hard to obtain the plaintext from the ciphertext.
|
||||
- Changing a single bit of the plaintext affects several bits of the ciphertext, and vice versa.
|
||||
- **Confusion** hides the relationship between the ciphertext and the key.
|
||||
- It should be hard to obtain the key from the ciphertext.
|
||||
- Each bit of the ciphertext should depend on several parts of the key.
|
||||
|
||||
## Primitives
|
||||
|
||||
### Substitution Cipher
|
||||
|
||||
In **substitution cipher**, encryption is done by replacing units of plaintext with ciphertext, with a fixed algorithm.
|
||||
|
||||
#### Caesar Cipher
|
||||
|
||||
- Encryption is done by $E(x) = x + 3 \pmod{26}$.
|
||||
- $E(\texttt{A}) = \texttt{D}$, $E(\texttt{Z}) = \texttt{C}$, etc.
|
||||
- Decryption can be done by $D(x) = x - 3 \pmod{26}$.
|
||||
- This scheme is not secure, since we can try all $26$ possibilities.
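
The shift cipher above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this note, and the input is assumed to consist of uppercase letters `A` to `Z` only.

```python
import string

ALPHABET = string.ascii_uppercase  # 'A'..'Z', so A = 0, ..., Z = 25


def shift_encrypt(plaintext: str, shift: int = 3) -> str:
    """Shift every letter by `shift` positions modulo 26 (shift = 3 is the Caesar cipher)."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + shift) % 26] for ch in plaintext)


def shift_decrypt(ciphertext: str, shift: int = 3) -> str:
    """Undo the shift by shifting in the opposite direction."""
    return shift_encrypt(ciphertext, -shift)


def brute_force(ciphertext: str) -> list[str]:
    """Return the candidate plaintext for each of the 26 possible shifts."""
    return [shift_decrypt(ciphertext, s) for s in range(26)]
```

For example, `shift_encrypt("AZ")` wraps around the end of the alphabet, and `brute_force` returns a list that always contains the original plaintext, which an attacker can spot by reading the 26 candidates.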
|
||||
|
||||
#### Affine Cipher
|
||||
|
||||
- Set two integers $a, b$ for the key.
|
||||
- In Caesar cipher, $a = 1$ and $b = 3$.
|
||||
- Encryption: $E(x) = ax + b \pmod m$.
|
||||
- Decryption: $D(x) = a^{-1}(x - b) \pmod m$.
|
||||
- If we use the $26$-letter alphabet ($m = 26$), there are $12$ possible values for $a$, and $26$ possible values for $b$.
|
||||
- $a^{-1} \bmod m$ does not exist for every $a$.
|
||||
- We need that $\gcd(a, m) = 1$. The number of possible $a$ values is $\phi(m)$.
|
||||
- This scheme is not secure either, since we can try all possibilities and check if the message makes sense.
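
A minimal sketch of the affine cipher on letter indices $0, \dots, 25$, assuming Python 3.8 or later (where `pow(a, -1, m)` computes a modular inverse). The function names are chosen for illustration.

```python
from math import gcd

M = 26  # alphabet size


def affine_encrypt(x: int, a: int, b: int, m: int = M) -> int:
    """E(x) = a*x + b mod m."""
    return (a * x + b) % m


def affine_decrypt(y: int, a: int, b: int, m: int = M) -> int:
    """D(y) = a^{-1} * (y - b) mod m; raises ValueError if gcd(a, m) != 1."""
    a_inv = pow(a, -1, m)
    return (a_inv * (y - b)) % m


# Values of a that have an inverse modulo 26, i.e. gcd(a, 26) = 1.
valid_a = [a for a in range(M) if gcd(a, M) == 1]

# Every valid (a, b) pair is one key, so the key space is small enough to enumerate.
all_keys = [(a, b) for a in valid_a for b in range(M)]
```

With $m = 26$ this gives $12 \cdot 26 = 312$ keys, which is why exhaustive search is trivial.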
|
||||
|
||||
#### Monoalphabetic Substitution Cipher
|
||||
|
||||
- The key is any permutation $\pi$ defined on the set $\Sigma = \lbrace \texttt{A}, \texttt{B}, \dots, \texttt{Z} \rbrace$.
|
||||
- There are $26!$ possible keys.
|
||||
- Note that permutations are bijections.
|
||||
- Encryption is done by replacing each letter $x$ by $\pi(x)$.
|
||||
- Decryption is done by replacing each letter $x$ by $\pi^{-1}(x)$.
|
||||
- This scheme is still not secure. Trying all $26! \approx 2^{88}$ keys is impractical, but the substitution preserves the frequency of each letter.
|
||||
|
||||
To attack this scheme, we use frequency analysis. Calculate the frequency of each letter and compare it with the actual distribution of English letters. We could also use *bigrams* (2-letters) for calculating the frequency.
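
As a rough sketch of why frequency analysis works, the snippet below encrypts a short text under a random permutation and compares letter frequencies. The sample text, the fixed random seed, and all function names are assumptions made for this illustration.

```python
import random
import string
from collections import Counter


def letter_frequencies(text: str) -> dict[str, float]:
    """Relative frequency of each letter A-Z, most frequent first."""
    letters = [ch for ch in text.upper() if ch in string.ascii_uppercase]
    counts = Counter(letters)
    return {ch: n / len(letters) for ch, n in counts.most_common()}


def random_substitution_key(rng: random.Random) -> dict[str, str]:
    """A random permutation of the alphabet, stored as a letter-to-letter map."""
    shuffled = list(string.ascii_uppercase)
    rng.shuffle(shuffled)
    return dict(zip(string.ascii_uppercase, shuffled))


def substitute(text: str, sub_key: dict[str, str]) -> str:
    """Replace each letter x by pi(x); other characters are left unchanged."""
    return "".join(sub_key.get(ch, ch) for ch in text.upper())


plaintext = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG AND THE OTHER DOG"
sub_key = random_substitution_key(random.Random(0))
ciphertext = substitute(plaintext, sub_key)

plain_freq = letter_frequencies(plaintext)
cipher_freq = letter_frequencies(ciphertext)
```

The two frequency tables contain exactly the same values, only attached to different letters, so matching the most frequent ciphertext letters against the most frequent English letters reveals parts of the permutation.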
|
||||
|
||||
#### Vigenère Cipher
|
||||
|
||||
- A polyalphabetic substitution
|
||||
- Given a key length $m$, take key $k = (k_1, k_2, \dots, k_m)$.
|
||||
- For the $i$-th letter $x$ (counting from $i = 1$), set $j = ((i - 1) \bmod m) + 1$.
|
||||
- Encryption is done by replacing $x$ by $x + k_{j} \pmod{26}$.
|
||||
- Decryption is done by replacing $x$ by $x - k_j \pmod{26}$.
|
||||
|
||||
To attack this scheme, find the key length by [*index of coincidence*](https://en.wikipedia.org/wiki/Index_of_coincidence). Then use frequency analysis.
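
A small sketch of the Vigenère cipher and of the index of coincidence, assuming uppercase input without spaces. The helper names are made up for this example, and the code indexes letters from $0$ rather than $1$.

```python
from collections import Counter


def _shift(ch: str, k: int) -> str:
    """Shift one uppercase letter by k positions modulo 26."""
    return chr((ord(ch) - ord("A") + k) % 26 + ord("A"))


def vigenere_encrypt(text: str, key: str) -> str:
    """Add the key letter at position i mod len(key) to the i-th plaintext letter."""
    return "".join(_shift(ch, ord(key[i % len(key)]) - ord("A")) for i, ch in enumerate(text))


def vigenere_decrypt(text: str, key: str) -> str:
    """Subtract the key letter at position i mod len(key) from the i-th ciphertext letter."""
    return "".join(_shift(ch, -(ord(key[i % len(key)]) - ord("A"))) for i, ch in enumerate(text))


def index_of_coincidence(text: str) -> float:
    """Probability that two letters drawn without replacement from `text` are equal."""
    n = len(text)
    counts = Counter(text)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
```

To guess the key length $m$, one would compute the index of coincidence of every $m$-th ciphertext letter for several candidate values of $m$; the correct length tends to give values close to that of ordinary English text.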
|
||||
|
||||
#### Hill Cipher
|
||||
|
||||
- A polygraphic substitution (blocks of $m$ letters are encrypted together)
|
||||
- A key is an *invertible* matrix $K = (k _ {ij}) _ {m \times m}$ where $k _ {ij} \in \mathbb{Z} _ {26}$.
|
||||
- Encryption/decryption is done by multiplying each block of $m$ letters, viewed as a vector over $\mathbb{Z} _ {26}$, by $K$ or $K^{-1}$.
|
||||
|
||||
This scheme is vulnerable to known plaintext attack, since the equation can be solved for $K$.
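
The following sketch illustrates the known plaintext attack for block size $m = 2$. The key matrix and the plaintext `HELP` are example values chosen for this note, and the helper names are hypothetical. If the plaintext blocks form the columns of a matrix $P$ and the ciphertext blocks form the columns of $C$, then $C = KP$, so $K = CP^{-1}$ whenever $P$ is invertible modulo $26$.

```python
def to_nums(text: str) -> list[int]:
    return [ord(ch) - ord("A") for ch in text]


def to_text(nums: list[int]) -> str:
    return "".join(chr(n + ord("A")) for n in nums)


def mat_mul_mod(A, B, m=26):
    """Matrix product A * B with entries reduced modulo m."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B))) % m
             for c in range(len(B[0]))] for r in range(len(A))]


def inverse_2x2_mod(K, m=26):
    """Inverse of a 2x2 matrix modulo m; raises ValueError if it is not invertible."""
    (a, b), (c, d) = K
    det_inv = pow((a * d - b * c) % m, -1, m)
    return [[(d * det_inv) % m, (-b * det_inv) % m],
            [(-c * det_inv) % m, (a * det_inv) % m]]


def blocks_as_columns(text: str):
    """Arrange 4 letters as a 2x2 matrix whose columns are the two 2-letter blocks."""
    n = to_nums(text)
    return [[n[0], n[2]], [n[1], n[3]]]


def hill_apply(text: str, K) -> str:
    """Multiply each 2-letter block of a 4-letter text by K."""
    R = mat_mul_mod(K, blocks_as_columns(text))
    return to_text([R[0][0], R[1][0], R[0][1], R[1][1]])


K = [[3, 3], [2, 5]]                      # example key, invertible modulo 26
ciphertext = hill_apply("HELP", K)        # encryption
decrypted = hill_apply(ciphertext, inverse_2x2_mod(K))

# Known plaintext attack: recover K from one plaintext/ciphertext pair.
P = blocks_as_columns("HELP")
C = blocks_as_columns(ciphertext)
recovered_key = mat_mul_mod(C, inverse_2x2_mod(P))
```

For larger $m$ the attacker needs $m$ plaintext blocks that form an invertible matrix, but the idea is the same.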
|
||||
|
||||
### Transposition Cipher
|
||||
|
||||
- Positions held by units of plaintext are shifted using some permutation.
|
||||
- Also known as permutation cipher.
|
||||
|
||||
#### Columnar Cipher
|
||||
|
||||
- Set the number of columns $n$.
|
||||
- For the key, use a permutation defined on a set of $n$ elements.
|
||||
- There are $n!$ possible keys.
|
||||
- Write the plaintext message in row major order.
|
||||
- To encrypt, reorder the columns by the chosen permutation.
|
||||
- Then the ciphertext is obtained by reading the letters in column major order.
|
||||
|
||||
##### Example
|
||||
|
||||
Suppose we encrypt the following text:
|
||||
|
||||
$$
|
||||
\texttt{CRYPTOGRAPHY INTERNET SECURITY}
|
||||
$$
|
||||
|
||||
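Choose a key $\sigma = (1, 4, 5, 2, 3, 6)$, written in cycle notation, so that $\sigma(1) = 4$, $\sigma(2) = 3$, $\sigma(3) = 6$, $\sigma(4) = 5$, $\sigma(5) = 2$, $\sigma(6) = 1$. Label the $i$-th column with $\sigma(i)$. Then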
|
||||
|
||||
$$
|
||||
\begin{matrix} \\
|
||||
4 & 3 & 6 & 5 & 2 & 1 \\ \hline
|
||||
\texttt{C} & \texttt{R} & \texttt{Y} & \texttt{P} & \texttt{T} & \texttt{O} \\
|
||||
\texttt{G} & \texttt{R} & \texttt{A} & \texttt{P} & \texttt{H} & \texttt{Y} \\
|
||||
\texttt{I} & \texttt{N} & \texttt{T} & \texttt{E} & \texttt{R} & \texttt{N} \\
|
||||
\texttt{E} & \texttt{T} & \texttt{S} & \texttt{E} & \texttt{C} & \texttt{U} \\
|
||||
\texttt{R} & \texttt{I} & \texttt{T} & \texttt{Y}
|
||||
\end{matrix}
|
||||
$$
|
||||
|
||||
Now reorder the columns,
|
||||
|
||||
$$
|
||||
\begin{matrix} \\
|
||||
1 & 2 & 3 & 4 & 5 & 6 \\ \hline
|
||||
\texttt{O} & \texttt{T} & \texttt{R} & \texttt{C} & \texttt{P} & \texttt{Y} \\
|
||||
\texttt{Y} & \texttt{H} & \texttt{R} & \texttt{G} & \texttt{P} & \texttt{A} \\
|
||||
\texttt{N} & \texttt{R} & \texttt{N} & \texttt{I} & \texttt{E} & \texttt{T} \\
|
||||
\texttt{U} & \texttt{C} & \texttt{T} & \texttt{E} & \texttt{E} & \texttt{S} \\
|
||||
&& \texttt{I} & \texttt{R} & \texttt{Y} & \texttt{T}
|
||||
\end{matrix}
|
||||
$$
|
||||
|
||||
The ciphertext is
|
||||
|
||||
$$
|
||||
\texttt{OYNU THRC RRNTI CGIER PPEEY YATST}.
|
||||
$$
|
||||
|
||||
The decryption process is the reverse of this operation. The scheme can be broken by taking the $i$-th letter of each block (these letters form the $i$-th row) and trying reorderings until the rows make sense.
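
The example above can be reproduced with a short sketch. Here the key is passed as the list of column labels $(4, 3, 6, 5, 2, 1)$ from the first table; the function names are assumptions for this illustration.

```python
def columnar_encrypt(plaintext: str, labels: list[int]) -> str:
    """Write the text row by row under `labels`, then read columns in label order."""
    n = len(labels)
    text = plaintext.replace(" ", "")
    # The column at position pos holds every n-th letter starting at index pos.
    columns = {label: text[pos::n] for pos, label in enumerate(labels)}
    return " ".join(columns[label] for label in sorted(columns))


def columnar_decrypt(ciphertext: str, labels: list[int]) -> str:
    """Cut the ciphertext back into columns and read the grid row by row."""
    n = len(labels)
    text = ciphertext.replace(" ", "")
    rows, extra = divmod(len(text), n)
    columns = {}
    start = 0
    for label in sorted(labels):
        pos = labels.index(label)
        # The first `extra` positions hold one more letter (incomplete last row).
        length = rows + (1 if pos < extra else 0)
        columns[pos] = text[start:start + length]
        start += length
    return "".join(columns[pos][r]
                   for r in range(rows + 1)
                   for pos in range(n)
                   if r < len(columns[pos]))


labels = [4, 3, 6, 5, 2, 1]
ciphertext = columnar_encrypt("CRYPTOGRAPHY INTERNET SECURITY", labels)
```

Decryption returns the plaintext without spaces, since the spaces are dropped before the grid is filled.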
|
||||
|
||||
### Exclusive OR (XOR)
|
||||
|
||||
- A bitwise operation $x \oplus y = x + y \pmod 2$.
|
||||
- For the message $m$, key $k \in \lbrace 0, 1 \rbrace^n$,
|
||||
- Encryption is done by $E(k, x) = x \oplus k$.
|
||||
- Decryption is done by $D(k, y) = y \oplus k$.
|
||||
- Correctness: $D(k, E(k, x)) = (x \oplus k) \oplus k = x$.
|
||||
- Vulnerable to known plaintext attack, since if $c = m \oplus k$, then $k = c \oplus m$.
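
A minimal sketch of XOR encryption on byte strings, including the known plaintext attack from the last bullet. The message and the helper name `xor_bytes` are illustrative choices.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


message = b"ATTACK AT DAWN"
key = secrets.token_bytes(len(message))   # random key as long as the message

ciphertext = xor_bytes(message, key)      # E(k, m) = m XOR k
decrypted = xor_bytes(ciphertext, key)    # D(k, c) = c XOR k

# Known plaintext attack: one (m, c) pair reveals the key.
recovered_key = xor_bytes(ciphertext, message)
```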
|
||||
|
||||
#### A Crucial Property of XOR
|
||||
|
||||
> **Theorem.** Suppose that the message $M$ has an arbitrary distribution over $\lbrace 0, 1 \rbrace^n$. If the key $K$ is independently uniformly distributed over $\lbrace 0, 1 \rbrace^n$, then $C = M \oplus K$ is also uniformly distributed.
|
||||
|
||||
*Proof*. We show the case $n = 1$; the general case is analogous.
|
||||
|
||||
$$
|
||||
\begin{align*}
|
||||
\Pr[C = 0] &= \Pr[M = 0 \land K = 0] + \Pr[M = 1 \land K = 1] \\ &= \Pr[M = 0] \cdot \Pr[K = 0] + \Pr[M = 1] \cdot \Pr[K = 1] \\
|
||||
&= \frac{1}{2}\left(\Pr[M = 0] + \Pr[M = 1]\right) \\
|
||||
&= \frac{1}{2}.
|
||||
\end{align*}
|
||||
$$
|
||||
|
||||
The case for $C = 1$ is similar.
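
For general $n$, the same computation can be written as a sum over all messages. This is a sketch that uses only the independence of $M$ and $K$ and the uniformity of $K$. For any $c \in \lbrace 0, 1 \rbrace^n$,

$$
\begin{align*}
\Pr[C = c] &= \sum_{m \in \lbrace 0, 1 \rbrace^n} \Pr[M = m \land K = m \oplus c] \\
&= \sum_{m \in \lbrace 0, 1 \rbrace^n} \Pr[M = m] \cdot \Pr[K = m \oplus c] \\
&= \frac{1}{2^n} \sum_{m \in \lbrace 0, 1 \rbrace^n} \Pr[M = m] \\
&= \frac{1}{2^n}.
\end{align*}
$$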
|
||||
|
||||
### One-Time Pad (OTP)
|
||||
|
||||
Let $m \in \left\lbrace 0, 1 \right\rbrace^n$ be the message to encrypt. Then choose a *random* key $k \in \left\lbrace 0, 1 \right\rbrace^n$, and XOR $k$ and $m$.
|
||||
|
||||
- Encryption: $E(k, m) = k \oplus m$.
|
||||
- Decryption: $D(k, c) = k \oplus c$.
|
||||
|
||||
This scheme is **provably secure**. See also [one-time pad (Modern Cryptography)](../modern-cryptography/2023-09-07-otp-stream-cipher-prgs.md#one-time-pad-(otp)).
|
||||
|
||||
## Perfect Secrecy
|
||||
|
||||
> **Definition.** Let $(E, D)$ be a cipher defined over $(\mathcal{K}, \mathcal{M}, \mathcal{C})$. We assume that $\lvert \mathcal{K} \rvert = \lvert \mathcal{M} \rvert = \lvert \mathcal{C} \rvert$. The cipher is **perfectly secure** if for all $m \in \mathcal{M}$ and $c \in \mathcal{C}$,
|
||||
>
|
||||
> $$
|
||||
> \Pr[\mathcal{M} = m \mid \mathcal{C} = c] = \Pr[\mathcal{M} = m].
|
||||
> $$
|
||||
>
|
||||
> Or equivalently, for all $m_0, m_1 \in \mathcal{M}$, $c \in \mathcal{C}$,
|
||||
>
|
||||
> $$
|
||||
> \Pr[E(k, m _ 0) = c] = \Pr[E(k, m _ 1) = c]
|
||||
> $$
|
||||
>
|
||||
> where $k$ is chosen uniformly in $\mathcal{K}$.
|
||||
|
||||
In other words, the adversary learns nothing from the ciphertext.
|
||||
|
||||
With this definition, we can show that **OTP is perfectly secure**. For all $m \in \mathcal{M}$ and $c \in \mathcal{C}$,
|
||||
|
||||
$$
|
||||
\Pr[E(k, m) = c] = \frac{1}{\lvert \mathcal{K} \rvert}
|
||||
$$
|
||||
|
||||
since for each $m$ and $c$, $k$ is determined uniquely.
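
For small $n$, this claim can be checked exhaustively. The sketch below uses $n = 3$ (an arbitrary choice) and represents messages, keys, and ciphertexts as integers in $[0, 2^n)$.

```python
from fractions import Fraction
from itertools import product

n = 3
space = range(2 ** n)  # messages, keys and ciphertexts are all n-bit strings


def encryption_probability(m: int, c: int) -> Fraction:
    """Pr[E(k, m) = c] over a uniformly random key k, where E(k, m) = k XOR m."""
    matching_keys = sum(1 for k in space if k ^ m == c)
    return Fraction(matching_keys, len(space))


probabilities = {(m, c): encryption_probability(m, c)
                 for m, c in product(space, repeat=2)}
```

Every entry equals $1/\lvert \mathcal{K} \rvert = 1/8$, independent of $m$, which is exactly the second form of the definition.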
|
||||
|
||||
### Conditions for Perfect Secrecy
|
||||
|
||||
> **Theorem.** If $(E, D)$ is perfectly secure, $\lvert \mathcal{K} \rvert \geq \lvert \mathcal{M} \rvert$.
|
||||
|
||||
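*Proof*. Assume $\lvert \mathcal{K} \rvert < \lvert \mathcal{M} \rvert$, and fix a ciphertext $c \in \mathcal{C}$ that occurs with nonzero probability. Since the decryption algorithm $D$ is deterministic, decrypting $c$ under every key yields at most $\lvert \mathcal{K} \rvert < \lvert \mathcal{M} \rvert$ distinct messages, so there is some message $m_0 \in \mathcal{M}$ that is not a decryption of $c$. Then $\Pr[E(k, m_0) = c] = 0$, which contradicts perfect secrecy.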
|
||||
|
||||
For the proof in detail, check [Shannon's Theorem (Modern Cryptography)](../modern-cryptography/2023-09-07-otp-stream-cipher-prgs.md#shannon's-theorem).
|
||||
|
||||
### Two-Time Pad is Insecure
|
||||
|
||||
It is not secure to use the same key twice. If the same key $k$ is used to encrypt two messages $m_1$, $m_2$ into $c_1$, $c_2$, then
|
||||
|
||||
$$
|
||||
c_1 \oplus c_2 = (k \oplus m_1) \oplus (k \oplus m_2) = m_1 \oplus m_2.
|
||||
$$
|
||||
|
||||
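So some information is leaked: the attacker learns $m_1 \oplus m_2$. The equation alone does not determine $m_i$, but if the messages are redundant (e.g. natural language) or one of them is known, the other can often be recovered.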
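
A short sketch of the leak, with two made-up messages of equal length. It also shows what happens if the attacker already knows one of the messages.

```python
import secrets


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


m1 = b"SEND 100 DOLLARS"
m2 = b"MEET ME AT NOON!"
key = secrets.token_bytes(len(m1))  # the same key is (wrongly) used twice

c1 = xor_bytes(key, m1)
c2 = xor_bytes(key, m2)

# The key cancels out: the attacker sees m1 XOR m2 without knowing the key.
leak = xor_bytes(c1, c2)

# If m1 is known (or guessed), m2 follows immediately.
recovered_m2 = xor_bytes(leak, m1)
```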
|
||||
|
||||
## Two Types of Symmetric Ciphers
|
||||
|
||||
- **Stream cipher**: encrypt one bit/byte at a time
|
||||
- Generating a random key is difficult.
|
||||
- No message integrity or authentication.
|
||||
- Ex. RC4
|
||||
- **Block cipher**: encrypt a block of bits at a time
|
||||
- Can provide integrity or authentication.
|
||||
- Block ciphers usually have feedback between blocks, so errors during transmission will be propagated during the decryption process.
|
||||
- Ex. DES, AES
|
||||
|
||||
### Stream Cipher
|
||||
|
||||
We start with a secret key called the **seed** of size $s$, and generate a random-looking stream using a **pseudorandom generator** (PRG). The PRG is a function $\mathsf{Gen}: \lbrace 0, 1 \rbrace^s \rightarrow \lbrace 0, 1 \rbrace^n$ with $n > s$, and we use $\mathsf{Gen}(k)$ as the key for the one-time pad.
|
||||
|
||||
A stream cipher does not have perfect secrecy, since the key is shorter than the message. The security of a stream cipher depends on the security of its PRG.
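
The construction can be sketched as follows. As a stand-in for $\mathsf{Gen}$, this example uses SHAKE-128 from Python's `hashlib`, which stretches a short input to any requested length. This choice is an assumption made only for illustration; the notes do not prescribe a particular PRG.

```python
import hashlib


def gen(seed: bytes, n: int) -> bytes:
    """Stand-in PRG: expand a short seed into n pseudorandom bytes."""
    return hashlib.shake_128(seed).digest(n)


def stream_encrypt(seed: bytes, message: bytes) -> bytes:
    """XOR the message with the keystream Gen(seed), as in the one-time pad."""
    keystream = gen(seed, len(message))
    return bytes(m ^ k for m, k in zip(message, keystream))


# Decryption is the same operation, since XOR with the same keystream cancels out.
stream_decrypt = stream_encrypt

seed = b"16-byte-seed-key"  # short secret key
message = b"a message that is much longer than the sixteen-byte seed"
ciphertext = stream_encrypt(seed, message)
```

Because the keystream is a deterministic function of the seed, reusing a seed for two messages has the same effect as reusing a one-time pad key.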
|
||||
|
||||
### Linear Feedback Shift Register (LFSR)
|
||||
|
||||
The seed can be used in a **linear feedback shift register** (LFSR) to generate the actual key for the stream cipher. There are $n$ stages (or states) and the generated key stream is periodic with maximal period $2^n - 1$.
|
||||
|
||||
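The links (feedback taps) between stages may be different. But in general, given $2n$ consecutive output bits of an $n$-stage LFSR, one can solve a system of linear equations for the feedback taps and thus predict the rest of the key stream.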
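
A small sketch of a $4$-stage LFSR. The recurrence $s_{t+4} = s_t \oplus s_{t+1}$ and the seed are example choices; with these taps the output reaches the maximal period $2^4 - 1 = 15$ for any nonzero seed.

```python
def lfsr(seed_bits: list[int], taps: tuple[int, ...], count: int) -> list[int]:
    """Generate `count` output bits from an LFSR.

    The state is a list of bits; each step outputs state[0], computes the
    feedback bit as the XOR of the tapped positions, and shifts it in.
    """
    state = list(seed_bits)
    output = []
    for _ in range(count):
        output.append(state[0])
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = state[1:] + [feedback]
    return output


# 4 stages, taps at positions 0 and 1: s[t+4] = s[t] XOR s[t+1].
stream = lfsr([1, 0, 0, 0], taps=(0, 1), count=30)
first_period = stream[:15]
```

The feedback is a linear function of the state, which is what makes the system of equations mentioned above solvable from a short stretch of output.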
|
||||
|
||||
To alleviate this problem, we can combine multiple LFSRs with a $k$-input binary boolean function, so that we have high non-linearity, long period, and low correlation with the input bits.
|
||||
|
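As a small illustration of the maximal period $2^n - 1$, here is a sketch of a $4$-stage LFSR. The tap positions (stages $0$ and $1$) and the function names are assumptions chosen for this example. Real designs use much longer registers.

```c
#include <stdint.h>

/* One step of a 4-stage LFSR. The feedback bit is the XOR of stages 0 and 1
 * (tap choice assumed for illustration). The register shifts right by one
 * and the feedback bit enters at stage 3. */
uint8_t lfsr4_step(uint8_t state) {
    uint8_t feedback = (uint8_t)((state ^ (state >> 1)) & 1u);
    return (uint8_t)(((state >> 1) | (feedback << 3)) & 0x0Fu);
}

/* Number of steps until the register returns to its starting state.
 * Only the low 4 bits of `seed` are used. */
int lfsr4_period(uint8_t seed) {
    seed &= 0x0Fu;
    uint8_t s = lfsr4_step(seed);
    int steps = 1;
    while (s != seed) {
        s = lfsr4_step(s);
        ++steps;
    }
    return steps;
}
```

With these taps every non-zero seed cycles through all $15 = 2^4 - 1$ non-zero states, while the all-zero state is stuck at zero, which is why the maximal period is $2^n - 1$ and not $2^n$.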
||||
## Case Study: Wi-Fi WEP
|
||||
|
||||
- Wi-Fi 802.11b WEP (Wired Equivalent Privacy)
|
||||
- Encryption in the link layer
|
||||
- **Misuse of the stream cipher RC4.**
|
||||
|
||||
### WEP
|
||||
|
||||
#### Encryption Overall
|
||||
|
||||
- Plaintext: Message + CRC
|
||||
- CRC is appended to verify the integrity of the message.
|
||||
- CRC is $32$ bits
|
||||
- Not designed against attacks, but for detecting transmission errors
|
||||
- Initialization vector (IV): $24$ bit
|
||||
- Key: $104$ bit number to build the keystream
|
||||
- IV and the key are used to build the keystream $k_s$
|
||||
- IV + Key is $128$ bits
|
||||
- Encryption: $c = k_s \oplus (m \parallel \mathrm{CRC}(m))$
|
||||
|
||||
#### Encryption Process
|
||||
|
||||
1. Compute CRC for the message
|
||||
- CRC-32 polynomial is used
|
||||
2. Compute the keystream from IV and the key
|
||||
- IV is concatenated with the key.
|
||||
- $128$ bit input is given to the key generation algorithm.
|
||||
3. Now encrypt the plaintext with XOR.
|
||||
- The IV is prepended to the ciphertext, since the receiver needs it to decrypt.
|
||||
|
||||
#### Decryption Process
|
||||
|
||||
1. Compute the keystream from IV and the key
|
||||
- Extract the IV from the incoming frame
|
||||
2. Decrypt the ciphertext with XOR
|
||||
3. Verify the extracted message with the CRC
|
||||
|
||||
### Initialization Vector
|
||||
|
||||
- The IV is not encrypted, and is carried in plaintext.
|
||||
- IV is only $24$ bits, so there are only about $16$ million possible IVs.
|
||||
- **IV must be different for every message transmitted.**
|
||||
- 802.11 standard doesn't specify how IV is calculated.
|
||||
- Usually increment by $1$ for each frame.
|
||||
- No restrictions on reusing the IV.
|
||||
|
||||
#### IV Collision
|
||||
|
||||
- The key is fixed, and the period of IV is $2^{24}$.
|
||||
- Same IV leads to same key stream.
|
||||
- So an adversary who captures two frames with the same IV can obtain the XOR of the two plaintext messages.
|
||||
- $c_1 \oplus c_2 = (p_1 \oplus k_s) \oplus (p_2 \oplus k_s) = p_1 \oplus p_2$
|
||||
- Since network traffic contents are predictable, messages can be recovered.
|
||||
- We are in the link layer, so HTTP, IP, TCP headers will be contained in the encrypted payload.
|
||||
- The header formats are usually known.
|
||||
|
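The collision can be sketched in code. The RC4 routine below follows the standard key scheduling and output loop. `wep_keystream` is a hypothetical helper that assumes the per-frame RC4 seed is the $3$-byte IV followed by the $13$-byte ($104$-bit) key, matching the $128$-bit input described above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* RC4: key scheduling, then `n` bytes of keystream written to `out`. */
void rc4_keystream(const uint8_t *key, size_t keylen, uint8_t *out, size_t n) {
    uint8_t S[256];
    for (int i = 0; i < 256; ++i) {
        S[i] = (uint8_t)i;
    }
    uint8_t j = 0;
    for (int i = 0; i < 256; ++i) {
        j = (uint8_t)(j + S[i] + key[i % keylen]);
        uint8_t t = S[i]; S[i] = S[j]; S[j] = t;
    }
    uint8_t a = 0, b = 0;
    for (size_t k = 0; k < n; ++k) {
        a = (uint8_t)(a + 1);
        b = (uint8_t)(b + S[a]);
        uint8_t t = S[a]; S[a] = S[b]; S[b] = t;
        out[k] = S[(uint8_t)(S[a] + S[b])];
    }
}

/* Per-frame keystream, assuming the RC4 seed is IV || key (128 bits). */
void wep_keystream(const uint8_t iv[3], const uint8_t key[13],
                   uint8_t *out, size_t n) {
    uint8_t seed[16];
    memcpy(seed, iv, 3);
    memcpy(seed + 3, key, 13);
    rc4_keystream(seed, sizeof seed, out, n);
}
```

Since the key is fixed, the keystream is a function of the IV alone. Two frames with the same IV are therefore encrypted with the same $k_s$, and XORing their ciphertexts cancels it.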
||||
#### CRC Algorithm
|
||||
|
||||
Given a generator bit string (defined in the specification), the sender performs binary long division of the data by the generator, using XOR in place of subtraction. The remainder is the result of the CRC, which is appended to the data. The receiver repeats the long division on the received data together with the CRC, and the remainder should be $0$ if there were no bit errors during transmission.
|
||||
|
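The long division can be sketched bit by bit. To keep the example short, it uses a toy $8$-bit CRC with generator $x^8 + x^2 + x + 1$ and a zero initial value. This is an assumption for illustration, since WEP uses the CRC-32 polynomial.

```c
#include <stddef.h>
#include <stdint.h>

/* Remainder of data(x) * x^8 divided by x^8 + x^2 + x + 1 over GF(2).
 * Bits are processed most significant first and the register starts at 0. */
uint8_t crc8(const uint8_t *data, size_t len) {
    uint8_t rem = 0;
    for (size_t i = 0; i < len; ++i) {
        rem ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            if (rem & 0x80) {
                rem = (uint8_t)((rem << 1) ^ 0x07); /* subtract (XOR) the generator */
            } else {
                rem = (uint8_t)(rem << 1);
            }
        }
    }
    return rem;
}

/* Receiver check: a frame consisting of the data followed by its CRC byte
 * must leave remainder 0. */
int crc8_frame_ok(const uint8_t *frame, size_t len) {
    return crc8(frame, len) == 0;
}
```

A sender would compute `crc8` over the message and append the result. A single flipped bit in transit makes `crc8_frame_ok` fail.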
||||
### Message Modification
|
||||
|
||||
- CRC is actually a linear function.
|
||||
- $\mathrm{CRC}(x \oplus y) = \mathrm{CRC}(x) \oplus \mathrm{CRC}(y)$.
|
||||
- The remainder of $x \oplus y$ is equal to the sum of the remainders of $x$ and $y$, since $\oplus$ is effectively an addition over $\mathbb{Z}_2$.
|
||||
- CRC function doesn't have a key, so it is forgeable.
|
||||
- **RC4 is transparent to XOR**, and messages can be modified.
|
||||
- Let $c = k_s \oplus (m \parallel \mathrm{CRC}(m))$.
|
||||
- Suppose the attacker XORs $(x \parallel \mathrm{CRC}(x))$ into $c$, where $x$ is some malicious modification.
|
||||
- $c \oplus (x \parallel \mathrm{CRC}(x)) = k_s \oplus (m\oplus x \parallel \mathrm{CRC}(m\oplus x))$.
|
||||
- The receiver will decrypt and get $(m\oplus x \parallel \mathrm{CRC}(m\oplus x))$.
|
||||
- CRC check by the receiver will succeed.
|
||||
|
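The modification can be sketched as follows, again with a toy $8$-bit CRC (generator $x^8 + x^2 + x + 1$, zero initial value) standing in for CRC-32. The function name `flip_bits` is hypothetical. Note that the attacker never touches the keystream.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy CRC-8 with zero initial value, so it is linear over XOR for
 * equal-length inputs (assumed here in place of WEP's CRC-32). */
uint8_t crc8(const uint8_t *data, size_t len) {
    uint8_t rem = 0;
    for (size_t i = 0; i < len; ++i) {
        rem ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            rem = (rem & 0x80) ? (uint8_t)((rem << 1) ^ 0x07)
                               : (uint8_t)(rem << 1);
        }
    }
    return rem;
}

/* `frame` is a captured ciphertext: msg_len encrypted message bytes followed
 * by one encrypted CRC byte. XORs (delta || CRC(delta)) into it in place. */
void flip_bits(uint8_t *frame, const uint8_t *delta, size_t msg_len) {
    for (size_t i = 0; i < msg_len; ++i) {
        frame[i] ^= delta[i];
    }
    frame[msg_len] ^= crc8(delta, msg_len);
}
```

After decryption the receiver sees $m \oplus x$ together with a CRC that matches it, so the tampering goes unnoticed.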
||||
[^1]: This assumption will be removed when we learn public key cryptography.
|
||||
@@ -0,0 +1,356 @@
|
||||
---
|
||||
share: true
|
||||
toc: true
|
||||
math: true
|
||||
categories:
|
||||
- Lecture Notes
|
||||
- Internet Security
|
||||
path: _posts/lecture-notes/internet-security
|
||||
tags:
|
||||
- lecture-note
|
||||
- security
|
||||
- cryptography
|
||||
title: 03. Symmetric Key Cryptography (2)
|
||||
date: 2023-09-18
|
||||
github_title: 2023-09-18-symmetric-key-cryptography-2
|
||||
image:
|
||||
path: /assets/img/posts/lecture-notes/internet-security/is-03-feistel-function.png
|
||||
attachment:
|
||||
folder: assets/img/posts/lecture-notes/internet-security
|
||||
---
|
||||
|
||||
## Block Cipher Overview
|
||||
|
||||
- We need confusion and diffusion
|
||||
- Confusion: relationship between ciphertext and key is complex
|
||||
- Diffusion: relationship between message and ciphertext is complex
|
||||
- Series of **substitutions** and **permutations** can achieve confusion and diffusion
|
||||
|
||||
### Modules
|
||||
|
||||
- **S-box**: a substitution module
|
||||
- Usually for confusion, also gives diffusion
|
||||
- $m \times n$ lookup box is used for implementation
|
||||
- **P-box**: a permutation module
|
||||
- Usually for diffusion
|
||||
- Compared to the number of input bits,
|
||||
- *Expansion* if the number of output bits is larger
|
||||
- *Compression* if the number of output bits is smaller
|
||||
- *Straight* if the number of output bits is equal
|
||||
|
||||
## Data Encryption Standard (DES)
|
||||
|
||||
- Standardized in 1977 (FIPS PUB 46).
|
||||
- Block size is $64$ bits ($8$ bytes)
|
||||
- $64$ bits input $\rightarrow$ $64$ bits output
|
||||
- Effective key length is $56$ bits; every $8$th bit of the stored key is a parity bit.
|
||||
- Thus $64$ bits in total
|
||||
|
||||
### Encryption
|
||||
|
||||
1. From the $56$-bit key, generate $16$ different $48$ bit keys $k_1, \dots, k_{16}$.
|
||||
2. The plaintext message goes through an initial permutation.
|
||||
3. The output goes through $16$ rounds, and key $k_i$ is used in round $i$.
|
||||
4. After $16$ rounds, split the output into two $32$ bit halves and swap them.
|
||||
5. The output goes through the inverse of the permutation from Step 2.
|
||||
|
||||
Let $L_{i-1} \parallel R_{i-1}$ be the output of round $i-1$, where $L_{i-1}$ and $R_{i-1}$ are $32$ bit halves. Also let $f$ be the Feistel function.[^1]
|
||||
|
||||
In each round $i$, the following operation is performed:
|
||||
|
||||
$$
|
||||
L_i = R_{i - 1}, \qquad R_i = L_{i-1} \oplus f(k_i, R_{i-1}).
|
||||
$$
|
||||
|
||||
#### The Feistel Function
|
||||
|
||||

|
||||
|
||||
The Feistel function takes $32$-bit data and divides it into eight $4$-bit chunks. Each chunk is expanded to $6$ bits using an expansion P-box. Now we have $48$ bits of data, which are XORed with the key for this round. Next, each $6$-bit block is compressed back to $4$ bits using an S-box. Finally, a (straight) permutation is applied at the end, resulting in $32$-bit data.
|
||||
|
||||
The Feistel function is **not invertible.**
|
||||
|
||||
### Questions
|
||||
|
||||
- Why does the input go through the initial permutation at the beginning and its inverse at the end?
|
||||
- Not for security, but for efficient hardware design.
|
||||
- Why do we swap the two $32$-bit halves?
|
||||
- Not for security, but for engineering purposes, see below.
|
||||
- Is DES invertible?
|
||||
- Yes, message should be decrypted.
|
||||
- But the Feistel function is not invertible, since it sends $4$ bits to $6$ bits during the evaluation process. Then how is decryption possible?
|
||||
|
||||
### Decryption
|
||||
|
||||
Let $f$ be the Feistel function. We can define each round as a function $F$,
|
||||
|
||||
$$
|
||||
F(L_i \parallel R_i) = R_i \parallel L_i \oplus f(R_i).
|
||||
$$
|
||||
|
||||
Consider a function $G$, defined as
|
||||
|
||||
$$
|
||||
G(L_i \parallel R_i) = R_i \oplus f(L_i) \parallel L_i.
|
||||
$$
|
||||
|
||||
Then, we see that
|
||||
|
||||
$$
|
||||
\begin{align*}
|
||||
G(F(L_i \parallel R_i)) &= G(R_i \parallel L_i \oplus f(R_i)) \\
|
||||
&= (L_i \oplus f(R_i)) \oplus f(R_i) \parallel R_i \\
|
||||
&= L_i \parallel R_i.
|
||||
\end{align*}
|
||||
$$
|
||||
|
||||
Thus $F$ and $G$ are inverses of each other, so $f$ doesn't have to be invertible. This construction is called the **Feistel cipher**.
|
||||
|
||||
Also, note that for any $32$-bit halves $A$ and $B$,

$$
G(A \parallel B) = B \oplus f(A) \parallel A = \mathrm{swap}(F(B \parallel A)) = \mathrm{swap}(F(\mathrm{swap}(A \parallel B))),
$$

where $\mathrm{swap}$ exchanges the upper and lower $32$-bit halves of a block. In other words, evaluating $G$ is equivalent to evaluating $F$ on the block with its halves swapped, and then swapping the halves of the result. Thus, we can use the same hardware for encryption and decryption (with the round keys applied in reverse order), which is the reason for swapping the $32$-bit halves at the end.
|
||||
|
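To see that invertibility of $f$ is not needed, here is a sketch of a Feistel network with a deliberately lossy round function. `toy_f` and the single `feistel` routine are assumptions for illustration and have nothing to do with the real DES round function or key schedule.

```c
#include <stdint.h>

/* Toy round function (NOT the DES one). The right shift discards bits,
 * so this function is not invertible. */
static uint32_t toy_f(uint32_t round_key, uint32_t half) {
    uint32_t x = (half ^ round_key) * 2654435761u;
    return x >> 5;
}

/* Runs `rounds` Feistel rounds, then swaps the two halves.
 * With reverse == 0 the round keys are used in order (encryption);
 * with reverse != 0 they are used in reverse order (decryption). */
uint64_t feistel(uint64_t block, const uint32_t *keys, int rounds, int reverse) {
    uint32_t left = (uint32_t)(block >> 32);
    uint32_t right = (uint32_t)block;
    for (int i = 0; i < rounds; ++i) {
        uint32_t k = keys[reverse ? rounds - 1 - i : i];
        uint32_t next_right = left ^ toy_f(k, right);
        left = right;
        right = next_right;
    }
    return ((uint64_t)right << 32) | left; /* final swap of the halves */
}
```

The same routine decrypts what it encrypted. Only the order of the round keys changes, which is the point of the final swap.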
||||
## Advanced Encryption Standard (AES)
|
||||
|
||||
- The DES key has only $56$ bits, so DES was broken by brute-force key search in the late 1990s
|
||||
- NIST standardized AES in 2001, based on Rijndael cipher
|
||||
- AES has $3$ different key lengths: $128$, $192$, $256$
|
||||
- Different number of rounds for different key lengths
|
||||
- $10$, $12$, $14$ rounds respectively
|
||||
- Input data block is $128$ bits, so viewed as $4\times 4$ table of bytes
|
||||
- This table is called the **current state**
|
||||
|
||||
Each round consists of the following:
|
||||
- **SubBytes**: byte substitution, the same S-box is applied to every byte
|
||||
- **ShiftRows**: cyclically shifts the bytes of each row, which moves bytes between columns
|
||||
- **MixColumns**: mix columns by using matrix multiplication in $\mathrm{GF}(2^8)$.
|
||||
- **AddRoundKey**: XOR with round key
|
||||
|
||||
The first and last rounds are a little different.
|
||||
- AddRoundKey is done before the first round.
|
||||
- The last round does not have MixColumns.
|
||||
|
||||
The objectives of AES:
|
||||
- Build resistance against known attacks
|
||||
- Code must be compact, and should run fast on many CPUs
|
||||
- Design must be simple
|
||||
|
||||
### Layers
|
||||
|
||||
#### SubBytes
|
||||
|
||||
- A simple substitution of each byte using $16 \times 16$ lookup table.
|
||||
- Each byte is split into two $4$ bit *nibbles*
|
||||
- Left half is used as row index
|
||||
- Right half is used as column index
|
||||
|
||||
#### ShiftRows
|
||||
|
||||
- A circular byte shift for each row, so it is a permutation
|
||||
- The $i$-th row is shifted left by $i$ bytes. ($i = 0, 1, 2, 3$)
|
||||
|
||||
#### MixColumns
|
||||
|
||||
- For each column, each byte is replaced by a value
|
||||
- The value depends on all 4 bytes of the column
|
||||
- Each column is processed separately
|
||||
- Thus effectively, it is a matrix multiplication (Hill cipher).[^2]
|
||||
|
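The column mixing can be sketched as follows, assuming the standard AES reduction polynomial $x^8 + x^4 + x^3 + x + 1$ and the standard MixColumns matrix with rows that are rotations of $(2, 3, 1, 1)$. The function names are chosen for this example.

```c
#include <stdint.h>

/* Multiply by x in GF(2^8): shift left, then reduce by the AES polynomial
 * (0x1B is its low byte) if the top bit was set. */
uint8_t xtime(uint8_t a) {
    return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1B : 0x00));
}

/* Multiply a and b in GF(2^8) using shift-and-add. */
uint8_t gmul(uint8_t a, uint8_t b) {
    uint8_t product = 0;
    while (b) {
        if (b & 1) {
            product ^= a;
        }
        a = xtime(a);
        b >>= 1;
    }
    return product;
}

/* Replace one 4-byte column by the matrix product in place.
 * Every output byte depends on all four input bytes. */
void mix_column(uint8_t col[4]) {
    uint8_t a0 = col[0], a1 = col[1], a2 = col[2], a3 = col[3];
    col[0] = (uint8_t)(gmul(a0, 2) ^ gmul(a1, 3) ^ a2 ^ a3);
    col[1] = (uint8_t)(a0 ^ gmul(a1, 2) ^ gmul(a2, 3) ^ a3);
    col[2] = (uint8_t)(a0 ^ a1 ^ gmul(a2, 2) ^ gmul(a3, 3));
    col[3] = (uint8_t)(gmul(a0, 3) ^ a1 ^ a2 ^ gmul(a3, 2));
}
```

Addition in $\mathrm{GF}(2^8)$ is XOR, which is why the matrix product is written with `^`.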
||||
#### AddRoundKey
|
||||
|
||||
- XOR the input with $128$ bits of the round key
|
||||
- The round key is different for each round
|
||||
|
||||
These 4 modules are all invertible!
|
||||
|
||||
### Questions
|
||||
|
||||
- Why is there an AddRoundKey at the beginning?
|
||||
- Why is the last round different?
|
||||
|
||||
Both are for engineering purposes, to make the encryption and decryption process the same.[^3]
|
||||
|
||||
## Modes of Operations
|
||||
|
||||
AES and DES encrypt blocks of a fixed size. How do we encrypt longer messages? For long messages, there are many different ways to process each block of the message. Such a way is called a **mode of operation**. We will look at 5 different modes of operation.
|
||||
|
||||
### Electronic Codebook Mode (ECB)
|
||||
|
||||

|
||||
|
||||
- Codebook is a mapping table.
|
||||
- For the $i$-th plaintext block, we use key $k$ to encrypt and obtain the $i$-th ciphertext block.
|
||||
- **Uses the same key for all blocks**
|
||||
- Adjacent blocks are independent of each other.
|
||||
- Advantages
|
||||
- Fast when run in parallel
|
||||
- Limitations
|
||||
- Repetitions in messages (if aligned with the block) may lead to repetitions in the ciphertext
|
||||
- Susceptible to *cut-and-paste attacks*
|
||||
- Mainly used to send a few blocks of data
|
||||
|
||||
#### Cut-and-Paste Attack
|
||||
|
||||
Since the same key is used for all blocks, once a mapping from plaintext to ciphertext is known, a sequence of ciphertext blocks can be easily manipulated. The assumption here is that the encryption keys do not change frequently. So the attacker can *cut* some block from a ciphertext and *paste* it to manipulate the data. This is a chosen ciphertext attack.
|
||||
|
||||
### Cipher Block Chaining Mode (CBC)
|
||||
|
||||

|
||||
|
||||
- With different IVs, two identical messages produce two different ciphertexts.
- This helps against chosen plaintext attacks
|
||||
- Blocks are linked together in the encryption process
|
||||
- **Each previous cipher block is chained with current block**
|
||||
- Initialization vector is used
|
||||
- Encryption
|
||||
- Let $c_0$ be the initialization vector.
|
||||
- $c_i = E(k, p_i \oplus c_{i - 1})$, where $p_i$ is the $i$-th plaintext block.
|
||||
- The ciphertext is $(c_0, c_1, \dots)$.
|
||||
- Decryption
|
||||
- The first block $c_0$ contains the initialization vector.
|
||||
- $p_i = c_{i - 1} \oplus D(k, c_i)$.
|
||||
- The plaintext is $(p_1, p_2, \dots)$.
|
||||
- Used for bulk data encryption, authentication
|
||||
- Advantages
|
||||
- Parallelism in decryption.
|
||||
- Chosen plaintext attacks can be mitigated through randomized IV.
|
||||
- Limitations
|
||||
- Encryption is not parallelizable. Each ciphertext block depends on *all* previous blocks.
|
||||
- Side note: CBC can be used to check message integrity. (MAC)
|
||||
|
||||
#### Error Propagation in CBC
|
||||
|
||||
- If there is a 1-bit error in the *plaintext*, then that error will affect that block and all the other blocks afterwards.
|
||||
- This error doesn't occur frequently since we are in the same system.
|
||||
- If there is a 1-bit error in the *ciphertext*, then that error will affect only two blocks.
|
||||
- This error can happen in transit through the network.
|
||||
- CBC mode is self-recovering
|
||||
|
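CBC encryption, decryption and the limited error propagation can be sketched with a toy invertible block cipher. `toy_encrypt` and `toy_decrypt` are assumptions standing in for $E$ and $D$ (they are not secure), and blocks are $32$ bits to keep the example small.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned r) { return (x << r) | (x >> (32 - r)); }
static uint32_t rotr32(uint32_t x, unsigned r) { return (x >> r) | (x << (32 - r)); }

/* Toy invertible 32-bit "block cipher" (assumed for illustration, NOT secure). */
uint32_t toy_encrypt(uint32_t k, uint32_t x) { return rotl32(x ^ k, 7) + k; }
uint32_t toy_decrypt(uint32_t k, uint32_t y) { return rotr32(y - k, 7) ^ k; }

/* c[0] holds the IV and c[i + 1] = E(k, p[i] XOR c[i]).
 * `p` has n blocks, `c` has n + 1 blocks (0-based plaintext index). */
void cbc_encrypt(uint32_t k, uint32_t iv, const uint32_t *p, uint32_t *c, size_t n) {
    c[0] = iv;
    for (size_t i = 0; i < n; ++i) {
        c[i + 1] = toy_encrypt(k, p[i] ^ c[i]);
    }
}

/* p[i] = c[i] XOR D(k, c[i + 1]). Each plaintext block needs only two
 * ciphertext blocks, so the loop iterations are independent. */
void cbc_decrypt(uint32_t k, const uint32_t *c, uint32_t *p, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        p[i] = c[i] ^ toy_decrypt(k, c[i + 1]);
    }
}
```

Flipping one bit of a ciphertext block garbles the plaintext block decrypted from it and flips the same bit in the next plaintext block. All later blocks decrypt correctly, which is the self-recovering property.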
||||
#### Initialization Vector in CBC
|
||||
|
||||
- If the IV is the same, then the encryption of the same plaintext is the same.
|
||||
- Thus IVs should be random.
|
||||
- IVs are not required to be secret, but
|
||||
- **No IVs should be reused under the same key**
|
||||
- **IV changes should be unpredictable**
|
||||
- On IV reuse, same message will generate the same ciphertext if key isn't changed
|
||||
- If IV is predictable, CBC is vulnerable to chosen plaintext attacks.
|
||||
- Suppose Eve obtains $(\mathrm{IV}_1, E_k(\mathrm{IV}_1 \oplus m))$.
|
||||
- Define Eve's new message $m' = \mathrm{IV} _ {2} \oplus \mathrm{IV} _ {1} \oplus g$, where
|
||||
- $\mathrm{IV} _ 2$ is the guess of the next IV, and
|
||||
- $g$ is a guess of Alice's original message $m$.
|
||||
- Eve requests an encryption of $m'$
|
||||
- $c' = E _ k(\mathrm{IV} _ 2 \oplus m') = E _ k(\mathrm{IV} _ \mathrm{1} \oplus g)$.
|
||||
- Then Eve can compare $c'$ and the original $c = E _ k(\mathrm{IV} _ \mathrm{1} \oplus m)$ to check whether her guess $g$ equals $m$.
|
||||
- Useful when there are not many cases for $m$ (or most of the message is already known).
|
||||
|
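Eve's test can be sketched for a one-block message. `toy_encrypt` is a made-up invertible cipher standing in for $E_k$, and `cbc_one_block` plays the role of the encryption oracle. Both are assumptions for illustration.

```c
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned r) { return (x << r) | (x >> (32 - r)); }

/* Toy invertible 32-bit "block cipher" (assumed for illustration, NOT secure). */
uint32_t toy_encrypt(uint32_t k, uint32_t x) { return rotl32(x ^ k, 7) + k; }

/* CBC encryption of a one-block message: returns E(k, iv XOR m). */
uint32_t cbc_one_block(uint32_t k, uint32_t iv, uint32_t m) {
    return toy_encrypt(k, iv ^ m);
}

/* Eve's chosen plaintext m' = IV2 XOR IV1 XOR g, built from the observed
 * IV1, the predicted next IV2 and her guess g. */
uint32_t eve_query(uint32_t iv1, uint32_t iv2, uint32_t guess) {
    return iv2 ^ iv1 ^ guess;
}
```

Encrypting `eve_query(iv1, iv2, g)` under `iv2` yields the original ciphertext exactly when `g` equals Alice's message, because $E_k$ is a permutation.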
||||
### Cipher Feedback Mode (CFB)
|
||||
|
||||

|
||||
|
||||
- The message is treated as a stream of bits; similar to stream cipher
|
||||
- **Result of the encryption is fed to the next stage.**
|
||||
- Standard allows any number of bits to be fed to the next stage
|
||||
- It is most efficient to use all bits.
|
||||
- Initialization vector is used.
|
||||
- Same requirements on the IV as CBC mode.
|
||||
- Should be randomized, and should not be predictable.
|
||||
- Encryption
|
||||
- Let $c_0$ be the initialization vector.
|
||||
- $c_i = p_i \oplus E(k, c_{i - 1})$, where $p_i$ is the $i$-th plaintext block.
|
||||
- The ciphertext is $(c_0, c_1, \dots)$.
|
||||
- Decryption
|
||||
- The first block $c_0$ contains the initialization vector.
|
||||
- $p_i = c_i \oplus E(k, c_{i - 1})$. The same module is used for decryption!
|
||||
- The plaintext is $(p_1, p_2, \dots)$.
|
||||
- Advantages
|
||||
- Appropriate when data arrives in bits/bytes (similar to stream cipher)
|
||||
- Only encryption module is needed.
|
||||
- Decryption can be run in parallel.
|
||||
- Limitations
|
||||
- Encryption is not parallelizable.
|
||||
|
||||
#### Error Propagation in CFB
|
||||
|
||||
- CFB mode is self-recovering.
|
||||
- 1 bit error in the ciphertext corrupts some number of blocks.
|
||||
- Bit errors in the ciphertext will cause bit errors at the same position in the plaintext.
|
||||
- Since this ciphertext is fed to the next block, the error is propagated.
|
||||
- Some implementations (like CFB-8) use shift registers, so errors will be propagated as long as the erroneous bit is in the shift register.
|
||||
- If the error is removed from the shift register, it automatically recovers.
|
||||
|
||||
### Output Feedback Mode (OFB)
|
||||
|
||||

|
||||
|
||||
- Very similar to stream cipher.
|
||||
- Initialization vector is used as a seed to generate the key stream.
|
||||
- Actual encryption and decryption only consists of XOR, so it is fast.
|
||||
- Blocks are independent of each other
|
||||
- Encryption/decryption are both parallelizable after key stream is calculated.
|
||||
- Key stream generation cannot be parallelized.
|
||||
- Encryption
|
||||
- Let $s_0$ be the initialization vector.
|
||||
- $s_i = E(k, s_{i - 1})$ where $s_i$ is the $i$-th key stream block.
|
||||
- $c_i = p_i \oplus s_i$.
|
||||
- The ciphertext is $(s_0, c_1, \dots)$.
|
||||
- Decryption
|
||||
- The first block $s_0$ contains the initialization vector.
|
||||
- $s_i = E(k, s_{i - 1})$. The same module is used for decryption.
|
||||
- $p_i = c_i \oplus s_i$.
|
||||
- The plaintext is $(p_1, p_2, \dots)$.
|
||||
- Note: IV and successive encryptions act as an OTP generator.
|
||||
- Advantages
|
||||
- There is no error propagation. $1$ bit error in ciphertext only affects $1$ bit in the plaintext.
|
||||
- Key streams can be generated in advance.
|
||||
- Fast when parallelized.
|
||||
- Only encryption module is needed.
|
||||
- Limitations
|
||||
- Key streams should not have repetitions.
|
||||
- We would have $c_i \oplus c_j = p_i \oplus p_j$.
|
||||
- Size of each $s_i$ should be large enough.
|
||||
- If attacker knows the plaintext and ciphertext, plaintext can be modified.
|
||||
- Same as in OTP.
|
||||
|
||||
### Counter Mode (CTR)
|
||||
|
||||

|
||||
|
||||
- Without chaining, we use a counter (typically incremented by $1$).
|
||||
- Counter starts from the initialization vector.
|
||||
- Highly parallelizable.
|
||||
- Can decrypt from any arbitrary position.
|
||||
- Counter should not be repeated for the same key.
|
||||
- Suppose that the same counter $ctr$ is used for encrypting $m_0$ and $m_1$.
|
||||
- Encryption results are: $(ctr, E(k, ctr) \oplus m_0), (ctr, E(k, ctr) \oplus m_1)$.
|
||||
- Then the attacker can obtain $m_0 \oplus m_1$.
|
||||
|
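Counter mode can be sketched in a few lines. As before, `toy_encrypt` is a made-up $32$-bit cipher standing in for $E$, and `ctr_xcrypt` is a hypothetical name. The counter for block $i$ is assumed to be the initial counter plus $i$.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned r) { return (x << r) | (x >> (32 - r)); }

/* Toy 32-bit "block cipher" (assumed for illustration, NOT secure). */
uint32_t toy_encrypt(uint32_t k, uint32_t x) { return rotl32(x ^ k, 7) + k; }

/* CTR mode: block i is XORed with E(k, ctr + i). The same routine encrypts
 * and decrypts, and every block can be processed independently. */
void ctr_xcrypt(uint32_t k, uint32_t ctr, const uint32_t *in, uint32_t *out, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        out[i] = in[i] ^ toy_encrypt(k, ctr + (uint32_t)i);
    }
}
```

To decrypt only block $j$, call the routine with counter `ctr + j` on that single block. Encrypting two messages with the same key and counter reuses the key stream, so their ciphertexts XOR to $m_0 \oplus m_1$.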
||||
## Modes of Operations Summary
|
||||
|
||||
|Criteria\Modes|ECB|CBC|CFB|OFB|CTR|
|:-:|:-:|:-:|:-:|:-:|:-:|
|IV|-|Yes|Yes|Yes|Counter|
|Encryption Parallelizable|Yes|No|No|Yes\*|Yes|
|Decryption Parallelizable|Yes|Yes|Yes|Yes\*|Yes|
|Random Read Access|Yes|Yes|Yes|No|Yes|
|Self-Recovering|-|Yes|Yes|-|-|
|
||||
|
||||
- (\*) OFB is parallelizable only if the keystream is generated in advance.
|
||||
- We don't have to consider self-recovery if the ciphertext is not fed into the encryption of the next block.
|
||||
- Errors in the ciphertext are not propagated for ECB, OFB and CTR.
|
||||
- **Random read access**
|
||||
- Suppose that a part of the plaintext changes.
|
||||
- In OFB, the *whole* keystream must be recalculated to fix the ciphertext.
|
||||
- But for other modes, only a part of the ciphertext needs to be changed, using the information from the previous block if necessary.
|
||||
|
||||
---
|
||||
|
||||
Images are from [Wikipedia](https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation).
|
||||
|
||||
[^1]: Some people call this function the *mangler* function.
|
||||
[^2]: Over the finite field $\mathrm{GF}(2^8)$.
|
||||
[^3]: See also a helpful [question](https://crypto.stackexchange.com/questions/1346/why-is-mixcolumns-omitted-from-the-last-round-of-aes) on cryptography SE.
|
||||
@@ -0,0 +1,237 @@
|
||||
---
|
||||
share: true
|
||||
toc: true
|
||||
math: true
|
||||
categories:
|
||||
- Lecture Notes
|
||||
- Internet Security
|
||||
path: _posts/lecture-notes/internet-security
|
||||
tags:
|
||||
- lecture-note
|
||||
- security
|
||||
- cryptography
|
||||
- number-theory
|
||||
title: 04. Modular Arithmetic (1)
|
||||
date: 2023-09-25
|
||||
github_title: 2023-09-25-modular-arithmetic-1
|
||||
---
|
||||
|
||||
**Number theory** is a branch of mathematics devoted primarily to the study of the integers. **Modular arithmetic** is heavily used in cryptography.
|
||||
|
||||
## Divisibility
|
||||
|
||||
> **Definition.** Let $a, b, c \in \mathbb{Z}$ such that $a = bc$. Then,
|
||||
>
|
||||
> 1. $b$ and $c$ are said to **divide** $a$, and are called **factors** of $a$.
|
||||
> 2. $a$ is said to be a **multiple** of $b$ and $c$.
|
||||
|
||||
> **Notation.** For $a, b \in \mathbb{Z}$, we write $a \mid b$ if $a$ divides $b$. If not, we write $a \nmid b$.
|
||||
|
||||
These are simple lemmas for checking divisibility.
|
||||
|
||||
> **Lemma.** Let $a, b, c \in \mathbb{Z}$.
|
||||
>
|
||||
> 1. If $a \mid b$ and $a \mid c$, then $a \mid (b + c)$.
|
||||
> 2. If $a \mid b$, then $a \mid bc$.
|
||||
> 3. If $a \mid b$ and $b \mid c$, then $a \mid c$.
|
||||
|
||||
## Prime Numbers
|
||||
|
||||
> **Definition.** Integer $n \geq 2$ is **prime** if it is only divisible by $1$ and itself. If it is not prime, then it is **composite**.
|
||||
|
||||
Note that $1$ is neither prime nor composite.
|
||||
|
||||
### Primality Tests
|
||||
|
||||
Testing whether a given number is prime can be done efficiently, but *factoring* a large composite number is believed to be hard. Many encryption schemes rely heavily on the latter.
|
||||
|
||||
The following is a simple algorithm to check if a given integer is prime.
|
||||
|
||||
```c
#include <stdbool.h>

bool naive_prime_test(int n) {
    if (n < 2) {
        return false;
    }

    // Trial division by every i with i * i <= n.
    // Writing the bound as i <= n / i avoids overflow and floating point.
    for (int i = 2; i <= n / i; ++i) {
        if (n % i == 0) {
            return false;
        }
    }
    return true;
}
```
|
||||
|
||||
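However, this algorithm performs $\mathcal{O}(\sqrt{n})$ divisions, which is exponential in the bit length of $n$ and far too slow for numbers of cryptographic size. We have faster primality tests like Fermat's test and the Miller-Rabin test. (Pollard's rho algorithm, often mentioned alongside them, is a factoring algorithm rather than a primality test.) These are not covered in this lecture.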
|
||||
|
||||
## Division Algorithm
|
||||
|
||||
> **Theorem.** (Euclidean Division) For $a, b \in \mathbb{Z}$ with $b \neq 0$, there exist unique integers $q, r$ with $0 \leq r < \left\lvert b \right\rvert$ such that $a = bq + r$.
|
||||
|
||||
*Proof.* [By induction](https://en.wikipedia.org/wiki/Euclidean_division#Proof).
|
||||
|
||||
Other proofs use the well-ordering principle.
|
||||
|
||||
## Modulo Operation
|
||||
|
||||
There are two ways to think about 'mod': as a function, and as a congruence.
|
||||
|
||||
### Modulo as a Function
|
||||
|
||||
As a function, $a \bmod b$ returns the remainder of $a$ divided by $b$. This operation is commonly denoted `%` in many programming languages.[^1]
|
||||
|
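In C, `%` can return a negative value for a negative dividend, so it does not always agree with the remainder $0 \leq r < n$ of Euclidean division. A small helper fixes this. The name `mod` is chosen for this sketch, and $n > 0$ is assumed.

```c
/* Remainder of Euclidean division of a by n, always in 0..n-1.
 * Assumes n > 0. C's % truncates toward zero, so a % n may be negative. */
int mod(int a, int n) {
    int r = a % n;
    return r < 0 ? r + n : r;
}
```

For example, `-7 % 3` evaluates to `-1` in C, while `mod(-7, 3)` gives `2`, which matches $-7 = 3 \cdot (-3) + 2$.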
||||
### Modulo as a Congruence
|
||||
|
||||
As a congruence, it means that $a, b$ are in the same *equivalence class*.[^2]
|
||||
|
||||
> **Definition.** For $a, b, n \in \mathbb{Z}$ and $n \neq 0$, $a \equiv b \pmod n$ if and only if $n \mid (a - b)$.
|
||||
|
||||
Properties of modulo operation.
|
||||
|
||||
> **Lemma.** Suppose that $a \equiv b \pmod n$ and $c \equiv d \pmod n$. Then, the following hold.
|
||||
>
|
||||
> 1. $a + c \equiv (b + d) \pmod n$.
|
||||
> 2. $ac \equiv bd \pmod n$.
|
||||
> 3. $a^k \equiv b^k \pmod n$.
|
||||
> 4. $a \equiv (a \bmod n) \pmod n$.
|
||||
|
||||
*Proof.* Trivial. :)
|
||||
|
||||
The last one is very useful in computing. For example, if $a, b$ are very large integers, using the identity
|
||||
|
||||
$$
|
||||
(a + b)^k \equiv ((a + b) \bmod n)^k \pmod n
|
||||
$$
|
||||
|
||||
allows us to reduce the size of the numbers before exponentiation.
|
||||
|
||||
## Modular Arithmetic
|
||||
|
||||
For modulus $n$, **modular arithmetic** is arithmetic on $\mathbb{Z}_n$.
|
||||
|
||||
### Residue Classes
|
||||
|
||||
For each positive integer $n$, we can partition $\mathbb{Z}$ into $n$ cells according to whether the remainder is $0, 1, 2, \dots, n - 1$ when the integer is divided by $n$. These cells are the **residue classes modulo $n$ in $\mathbb{Z}$**.
|
||||
|
||||
We write each residue class as follows.
|
||||
|
||||
$$
|
||||
\overline{k} = [k] = \left\lbrace m \in \mathbb{Z} : m \bmod n = k\right\rbrace
|
||||
$$
|
||||
|
||||
Consider the relation
|
||||
|
||||
$$
|
||||
R = \left\lbrace (a, b) : a \equiv b \pmod n \right\rbrace \subset \mathbb{Z} \times \mathbb{Z}
|
||||
$$
|
||||
|
||||
then $R$ has the following properties.
|
||||
|
||||
- **Reflexive**: $\forall a \in \mathbb{Z}$, $(a, a) \in R$.
|
||||
- **Symmetric**: $\forall a, b \in \mathbb{Z}$, if $(a, b) \in R$, then $(b, a) \in R$.
|
||||
- **Transitive**: $\forall a, b, c \in \mathbb{Z}$, if $(a, b), (b, c) \in R$ then $(a, c) \in R$.
|
||||
|
||||
Thus, $R$ is an **equivalence relation** and each residue class $[k]$ is an **equivalence class**.
|
||||
|
||||
We write the set of residue classes modulo $n$ as
|
||||
|
||||
$$
|
||||
\mathbb{Z}_n = \left\lbrace \overline{0}, \overline{1}, \overline{2}, \dots, \overline{n-1} \right\rbrace.
|
||||
$$
|
||||
|
||||
Note that $\mathbb{Z}_n$ is closed under addition and multiplication.
|
||||
|
||||
### Identity
|
||||
|
||||
> **Definition.** For a binary operation $\ast$ defined on a set $S$, $e$ is the **identity** if
|
||||
>
|
||||
> $$
|
||||
> \forall a \in S,\, a * e = e * a = a.
|
||||
> $$
|
||||
|
||||
In $\mathbb{Z}_n$, the additive identity is $0$, the multiplicative identity is $1$.
|
||||
|
||||
### Inverse
|
||||
|
||||
> **Definition.** For a binary operation $\ast$ defined on a set $S$, let $e$ be the identity. $x$ is the **inverse of $a$** if
|
||||
>
|
||||
> $$
|
||||
> x * a = a * x = e.
|
||||
> $$
|
||||
>
|
||||
> We write $x = a^{-1}$.
|
||||
|
||||
In the language of modular arithmetic, $x$ is the inverse of $a$ if
|
||||
|
||||
$$
|
||||
ax \equiv 1 \pmod n.
|
||||
$$
|
||||
|
||||
The inverse exists if and only if $\gcd(a, n) = 1$.
|
||||
|
||||
> **Lemma**. For $n \geq 2$ and $a \in \mathbb{Z}$, its inverse $a^{-1} \in \mathbb{Z}_n$ exists if and only if $\gcd(a, n) = 1$.
|
||||
|
||||
*Proof*. We use the extended Euclidean algorithm. There exists $u, v \in \mathbb{Z}$ such that
|
||||
|
||||
$$
|
||||
au + nv = \gcd(a, n).
|
||||
$$
|
||||
|
||||
($\impliedby$) If $\gcd(a, n) = 1$, then $au + nv = 1$, so $au = 1 - nv \equiv 1 \pmod n$. Thus $a^{-1} = u$.
|
||||
|
||||
($\implies$) Suppose that $x = a^{-1}$ exists. Then $ax \equiv 1 \pmod n$, so $ax = 1 + kn$ for some $k \in \mathbb{Z}$. Then $ax - nk = 1$. $\gcd(a, n)$ must divide the LHS, so it divides $1$, and therefore $\gcd(a, n) = 1$.
|
||||
|
||||
## Euclidean Algorithm
|
||||
|
||||
### Greatest Common Divisor
|
||||
|
||||
> **Definition.** Let $a, b \in \mathbb{Z} \setminus \left\lbrace 0 \right\rbrace$ . The **greatest common divisor** of $a$ and $b$ is the largest integer $d$ such that $d \mid a$ and $d \mid b$. We write $d = \gcd(a, b)$.
|
||||
|
||||
> **Definition.** If $\gcd(a, b) = 1$, we say that $a$ and $b$ are **relatively prime**.
|
||||
|
||||
### Euclidean Algorithm
|
||||
|
||||
The Euclidean algorithm is an efficient way to find $\gcd(a, b)$. It relies on the following lemma.
|
||||
|
||||
> **Lemma.** For $a, b \in \mathbb{Z}$ and $b \neq 0$, $\gcd(a, b) = \gcd(b, a \bmod b)$.
|
||||
|
||||
*Proof*. By the division algorithm, there exist $q, r \in \mathbb{Z}$ such that $a = bq + r$. Here, $r = a \bmod b$.
|
||||
|
||||
Let $d = \gcd(a, b)$. Then $d \mid a$ and $d \mid b$, so $d \mid (a - bq)$. Thus $d \leq \gcd(b, r)$. Conversely, let $d' = \gcd(b, r)$. Then $d' \mid b$ and $d' \mid (a - bq)$, so $d' \mid a$. Thus $d' \leq \gcd(a, b)$. Thus $d = d'$.
|
||||
|
||||
The following code computes the greatest common divisor.
|
||||
|
||||
```c
int gcd(int a, int b) {
    if (b == 0) {
        return a;
    } else {
        return gcd(b, a % b);
    }
}
```
|
||||
|
||||
### Extended Euclidean Algorithm
|
||||
|
||||
We can extend the Euclidean algorithm to compute $u, v \in \mathbb{Z}$ such that
|
||||
|
||||
$$
|
||||
ua + vb = \gcd(a, b).
|
||||
$$
|
||||
|
||||
Basically, we use the Euclidean algorithm and solve for the remainder (which is the $\gcd$).
|
||||
|
||||
#### Calculating Modular Multiplicative Inverse
|
||||
|
||||
We can use the extended Euclidean algorithm to find modular inverses. Suppose we want to calculate $a^{-1}$ in $\mathbb{Z}_n$. We assume that the inverse exists, so $\gcd(a, n) = 1$.
|
||||
|
||||
Therefore, we use the extended Euclidean algorithm and find $x, y \in \mathbb{Z}$ such that
|
||||
|
||||
$$
|
||||
ax + ny = 1.
|
||||
$$
|
||||
|
||||
Then $ax \equiv 1 - ny \equiv 1 \pmod n$, thus $x$ is the inverse of $a$ in $\mathbb{Z}_n$.
|
||||
|
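The procedure above can be sketched recursively. The function names and the use of `long long` are choices made for this example, and overflow is ignored as elsewhere in these notes.

```c
/* Extended Euclidean algorithm: returns gcd(a, b) and stores u, v with
 * u * a + v * b == gcd(a, b). Intended for non-negative a and b. */
long long extended_gcd(long long a, long long b, long long *u, long long *v) {
    if (b == 0) {
        *u = 1;
        *v = 0;
        return a;
    }
    long long u1, v1;
    long long g = extended_gcd(b, a % b, &u1, &v1);
    /* g = u1 * b + v1 * (a - (a / b) * b) = v1 * a + (u1 - (a / b) * v1) * b */
    *u = v1;
    *v = u1 - (a / b) * v1;
    return g;
}

/* Inverse of a modulo n (n >= 2), or -1 if gcd(a, n) != 1. */
long long mod_inverse(long long a, long long n) {
    long long u, v;
    a %= n;
    if (a < 0) {
        a += n;
    }
    if (extended_gcd(a, n, &u, &v) != 1) {
        return -1;
    }
    return ((u % n) + n) % n; /* bring the coefficient into 0..n-1 */
}
```

For instance, `mod_inverse(3, 7)` returns `5`, since $3 \cdot 5 = 15 \equiv 1 \pmod 7$.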
||||
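[^1]: Note that the C standard guarantees `(a / b) * b + (a % b) == a` whenever `a / b` is representable. Since division truncates toward zero (C99 and later), `a % b` is negative or zero when `a` is negative.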
|
||||
[^2]: $a$ and $b$ are in the same coset of $n\mathbb{Z}$ in $\mathbb{Z}$, i.e., they represent the same element of $\mathbb{Z}/n\mathbb{Z}$.
|
||||
@@ -0,0 +1,278 @@
|
||||
---
|
||||
share: true
|
||||
toc: true
|
||||
math: true
|
||||
categories:
|
||||
- Lecture Notes
|
||||
- Internet Security
|
||||
path: _posts/lecture-notes/internet-security
|
||||
tags:
|
||||
- lecture-note
|
||||
- security
|
||||
- cryptography
|
||||
- number-theory
|
||||
title: 05. Modular Arithmetic (2)
|
||||
date: 2023-10-04
|
||||
github_title: 2023-10-04-modular-arithmetic-2
|
||||
---
|
||||
|
||||
## Exponentiation by Squaring
|
||||
|
||||
Suppose we want to calculate $a^n$ where $n$ is very large, like $n \approx 2^{1000}$. A naive multiplication would take $\mathcal{O}(n)$ multiplications. We will ignore integer overflow for simplicity.
|
||||
|
||||
```c
int naive_exponentiation(int a, int n) {
    int result = 1;
    for (int i = 0; i < n; ++i) {
        result *= a;
    }
    return result;
}
```
|
||||
|
||||
Using the above implementation, computing $3^{2^{63} - 1}$ takes almost forever...
|
||||
|
||||
Instead, we use the **exponentiation by squaring** method. Notice that
|
||||
|
||||
$$
|
||||
a^n = \begin{cases}
|
||||
(a^2)^{\frac{n}{2}} & (n \text{ is even})\\
|
||||
a \cdot (a^2)^{\frac{n-1}{2}} & (n \text{ is odd})
|
||||
\end{cases}.
|
||||
$$
|
||||
|
||||
Therefore, the exponent is halved at every step, at the cost of at most two multiplications. Here is the implementation. The base cases $n = 0$ and $n = 1$ are handled separately.
|
||||
|
||||
```c
int exponentiation_by_squaring(int a, int n) {
    if (n == 0) {
        return 1;
    } else if (n == 1) {
        return a;
    }

    if (n % 2 == 0) {
        // a^n = (a^2)^(n/2)
        return exponentiation_by_squaring(a * a, n / 2);
    } else {
        // a^n = a * (a^2)^((n-1)/2)
        return a * exponentiation_by_squaring(a * a, (n - 1) / 2);
    }
}
```
|
||||
|
||||
The above code executes about $\mathcal{O}(\log n)$ multiplications. Now we can actually get an answer for $3^{2^{63} - 1}$.
|
||||
|
||||
Alternatively, here is an iterative version of the above for those who want to save some memory.
|
||||
|
||||
```c
int exponentiation_by_squaring_iterative(int a, int n) {
    int result = 1;
    int base = a, exponent = n;
    while (exponent > 0) {
        // Multiply in the current power of the base when the lowest bit
        // of the remaining exponent is set.
        if (exponent % 2 == 1) {
            result *= base;
        }

        base *= base;
        exponent /= 2;
    }
    return result;
}
```
|
||||
|
||||
For even better (maybe faster) results, we need the help of elementary number theory.
|
||||
|
||||
## Fermat's Little Theorem
|
||||
|
||||
> **Theorem.** Let $p$ be prime. For $a \in \mathbb{Z}$ such that $\gcd(a, p) = 1$,
|
||||
>
|
||||
> $$
|
||||
> a^{p-1} \equiv 1 \pmod p.
|
||||
> $$
|
||||
|
||||
*Proof*. (Using group theory) The statement can be rewritten as follows. For $a \neq 0$ in $\mathbb{Z}_p$, $a^{p-1} = 1$ in $\mathbb{Z}_p$. Since $\mathbb{Z}_p^*$ is a (multiplicative) group of order $p-1$, the order of $a$ should divide $p-1$. Therefore, $a^{p-1} = 1$ in $\mathbb{Z}_p$.
|
||||
|
||||
Here is an elementary proof not using group theory.
|
||||
|
||||
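*Proof*. (Elementary) Let $S = \left\lbrace 1, 2, \dots, p-1 \right\rbrace$. Consider a map $f : S \rightarrow S$ defined as $x \mapsto ax \bmod p$ ($a \not\equiv 0 \pmod p$). This map is well defined, since $p$ divides neither $a$ nor $x$, so $ax \bmod p \neq 0$.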
|
||||
|
||||
We will show that $f$ is injective. Suppose that $ax \equiv ay \pmod p$ for some $x, y \in S$. Since $\gcd(a, p) = 1$, $a$ has a multiplicative inverse modulo $p$, thus $x \equiv y \pmod p$. Since both $x$ and $y$ lie in $S$, they must be the same element of $S$.
|
||||
|
||||
By injectivity, $f(i)$ are distinct for all $i \in S$, so $f$ is a permutation on $S$. Therefore, the product of all elements of $S$ must be equal to the product of all $f(i)$ for $i \in S$.
|
||||
|
||||
$$
|
||||
(p-1)! \equiv f(1)f(2)\cdots f(p-1) \equiv a^{p-1} \cdot (p-1)!\pmod p.
|
||||
$$
|
||||
|
||||
Since $\gcd(i, p) = 1$ for all $i \in S$, every $i \in S$ has a multiplicative inverse modulo $p$. Multiplying both sides by these inverses cancels $(p-1)!$, and we get $a^{p-1} \equiv 1 \pmod p$.
|
||||
|
||||
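As a small worked instance of the elementary proof (the values $p = 7$ and $a = 3$ are chosen here only for illustration), the map $x \mapsto 3x \bmod 7$ permutes the nonzero residues:

$$
(3 \cdot 1, 3 \cdot 2, 3 \cdot 3, 3 \cdot 4, 3 \cdot 5, 3 \cdot 6) \equiv (3, 6, 2, 5, 1, 4) \pmod 7.
$$

Taking the product of both sides gives $3^6 \cdot 6! \equiv 6! \pmod 7$, and cancelling $6!$ yields $3^6 \equiv 1 \pmod 7$. Indeed, $3^6 = 729 = 7 \cdot 104 + 1$.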
## Euler's Totient Function
|
||||
|
||||
For a composite modulus, we have Euler's generalization. Before proving the theorem, we first need to define Euler's totient function.
|
||||
|
||||
> **Definition.** Let $n \in \mathbb{N}$. Define $\phi(n)$ as the number of positive integers $k \leq n$ such that $\gcd(n, k) = 1$.
|
||||
|
||||
For direct calculation, we use the following formula.
|
||||
|
||||
> **Lemma.** For $n \in \mathbb{N}$, the following holds.
|
||||
>
|
||||
> $$
|
||||
> \phi(n) = n \cdot \prod_{p \mid n} \left( 1 - \frac{1}{p} \right)
|
||||
> $$
|
||||
>
|
||||
> where the product runs over the distinct prime numbers $p$ dividing $n$.
|
||||
|
||||
So to calculate $\phi(n)$, we need to **factorize** $n$. From the formula above, we have some corollaries.
|
||||
|
||||
> **Corollary.** For distinct prime numbers $p, q$ and $k \in \mathbb{N}$, the following hold.
|
||||
> 1. $\phi(p) = p - 1$.
|
||||
> 2. $\phi(pq) = (p-1)(q-1)$.
|
||||
> 3. $\phi(p^k) = p^{k-1}(p-1)$.
|
||||
|
||||
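The product formula translates directly into code once $n$ is factorized. The sketch below uses trial division, which is only practical for small $n$; the function name `euler_phi` is an assumption made for this example.

```cpp
#include <cstdint>

// Computes Euler's totient of n (n >= 1) using
// phi(n) = n * prod_{p | n} (1 - 1/p), finding the primes by trial division.
std::uint64_t euler_phi(std::uint64_t n) {
    std::uint64_t result = n;
    for (std::uint64_t p = 2; p * p <= n; ++p) {
        if (n % p == 0) {
            // p is a prime factor: strip it from n and apply the factor (1 - 1/p).
            while (n % p == 0) {
                n /= p;
            }
            result -= result / p;
        }
    }
    if (n > 1) {
        // Whatever remains is a single prime factor larger than sqrt of the original n.
        result -= result / n;
    }
    return result;
}
```

The running time is dominated by the factorization, which is exactly why $\phi(N)$ is hard to obtain for an RSA-sized modulus.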
### Reduced Set of Residues
|
||||
|
||||
Let $n \in \mathbb{N}$. The **complete set of residues** was denoted $\mathbb{Z}_n$ and
|
||||
|
||||
$$
|
||||
\mathbb{Z}_n = \left\lbrace 0, 1, \dots, n-1 \right\rbrace.
|
||||
$$
|
||||
|
||||
We also often use the **reduced set of residues**.
|
||||
|
||||
> **Definition.** The **reduced set of residues** is the set of residues that are relatively prime to $n$. We denote this set as $\mathbb{Z}_n^*$.
|
||||
>
|
||||
> $$
|
||||
> \mathbb{Z}_n^* = \left\lbrace a \in \mathbb{Z}_n \setminus \left\lbrace 0 \right\rbrace : \gcd(a, n) = 1 \right\rbrace.
|
||||
> $$
|
||||
|
||||
Then by definition, we have the following result.
|
||||
|
||||
> **Lemma.** $\left\lvert \mathbb{Z}_n^* \right\rvert = \phi(n)$.
|
||||
|
||||
We can also show that $\mathbb{Z}_n^*$ is a multiplicative group.
|
||||
|
||||
> **Lemma.** $\mathbb{Z}_n^*$ is a multiplicative group.
|
||||
|
||||
*Proof*. Let $a, b \in \mathbb{Z}_n^{ * }$. We must check if $ab \in \mathbb{Z}_n^{ * }$. Since $\gcd(a, n) = \gcd(b, n) = 1$, $\gcd(ab, n) = 1$. This is because if $d = \gcd(ab, n) > 1$, then a prime factor $p$ of $d$ must divide $a$ or $b$ and also $n$. Then $\gcd(a, n) \geq p$ or $\gcd(b, n) \geq p$, which is a contradiction. Thus $ab \in \mathbb{Z}_n^{ * }$.
|
||||
|
||||
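Associativity is inherited from multiplication in $\mathbb{Z}_n$. The identity element is $1$. An inverse of $a \in \mathbb{Z}_n^*$ exists since $\gcd(a, n) = 1$: by Bézout's identity there are integers $u, v$ with $au + nv = 1$, so $au \equiv 1 \pmod n$, and $u \bmod n$ is again relatively prime to $n$.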
|
||||
|
||||
Now we can prove Euler's generalization.
|
||||
|
||||
## Euler's Generalization
|
||||
|
||||
> **Theorem.** Let $a \in \mathbb{Z}$ such that $\gcd(a, n) = 1$. Then
|
||||
>
|
||||
> $$
|
||||
> a^{\phi(n)} \equiv 1 \pmod n.
|
||||
> $$
|
||||
|
||||
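*Proof*. Since $\gcd(a, n) = 1$, $a \bmod n \in \mathbb{Z}_n^{ * }$. The order of an element divides the order of the group, so $a^{\left\lvert \mathbb{Z}_n^{ * } \right\rvert} = 1$ in $\mathbb{Z}_n$. By the above lemma, $\left\lvert \mathbb{Z}_n^{ * } \right\rvert = \phi(n)$ and we have the desired result.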
|
||||
|
||||
*Proof*. (Elementary) Set $f : \mathbb{Z}_n^* \rightarrow \mathbb{Z}_n^*$ as $x \mapsto ax \bmod n$, then the rest of the reasoning follows similarly as in the proof of Fermat's little theorem.
|
||||
|
||||
Using the above result, we note an important lemma that will be used in RSA.
|
||||
|
||||
> **Lemma.** Let $n \in \mathbb{N}$. For $a, b \in \mathbb{Z}$ and $x \in \mathbb{Z}_n^*$, if $a \equiv b \pmod{\phi(n)}$, then $x^a \equiv x^b \pmod n$.
|
||||
|
||||
*Proof*. $a = b + k\phi(n)$ for some $k \in \mathbb{Z}$. Then
|
||||
|
||||
$$
|
||||
x^a \equiv x^{b + k\phi(n)} = (x^{\phi(n)})^k \cdot x^b \equiv x^b \pmod n
|
||||
$$
|
||||
|
||||
by Euler's generalization.
|
||||
|
||||
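As a quick worked example of this lemma (the numbers are chosen here for illustration), take $n = 10$ and $x = 7$. Then $\phi(10) = 4$ and $222 \equiv 2 \pmod 4$, so

$$
7^{222} \equiv 7^{2} = 49 \equiv 9 \pmod{10}.
$$

The exponent can thus be reduced modulo $\phi(n)$ before any exponentiation is carried out.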
## Groups Based on Modular Arithmetic
|
||||
|
||||
> **Definition.** A **group** is a set $G$ with a binary operation $* : G \times G \rightarrow G$, satisfying the following properties.
|
||||
>
|
||||
> - $(\mathsf{G1})$ The binary operation $*$ is **closed**.
|
||||
> - $(\mathsf{G2})$ The binary operation $*$ is **associative**, so $(a * b) * c = a * (b * c)$ for all $a, b, c \in G$.
|
||||
> - $(\mathsf{G3})$ $G$ has an **identity** element $e$ such that $e * a = a * e = a$ for all $a \in G$.
|
||||
> - $(\mathsf{G4})$ There is an **inverse** for every element of $G$. For each $a \in G$, there exists $x \in G$ such that $a * x = x * a = e$. We write $x = a^{-1}$ in this case.
|
||||
|
||||
$\mathbb{Z}_n$ is an additive group, and $\mathbb{Z}_n^*$ is a multiplicative group.
|
||||
|
||||
## Chinese Remainder Theorem (CRT)
|
||||
|
||||
> **Theorem.** Let $n_1, \dots, n_k$ be integers greater than $1$, and let $N = n_1n_2\cdots n_k$. If the $n_i$ are pairwise relatively prime, then for any integers $a_1, \dots, a_k$ the system of congruences $x \equiv a_i \pmod {n_i}$ ($i = 1, \dots, k$) has a unique solution modulo $N$.
|
||||
>
|
||||
> *(Abstract Algebra)* The map
|
||||
>
|
||||
> $$
|
||||
> x \bmod N \mapsto (x \bmod n_1, \dots, x \bmod n_k)
|
||||
> $$
|
||||
>
|
||||
> defines a ring isomorphism
|
||||
>
|
||||
> $$
|
||||
> \mathbb{Z}_N \simeq \mathbb{Z}_{n_1} \times \mathbb{Z}_{n_2} \times \cdots \times \mathbb{Z}_{n_k}.
|
||||
> $$
|
||||
|
||||
*Proof*. (**Existence**) Let $N_i = N/n_i$. Then $\gcd(N_i, n_i) = 1$. By the extended Euclidean algorithm, there exist integers $M_i, m_i$ such that $M_iN_i + m_in_i= 1$. Now set
|
||||
|
||||
$$
|
||||
x = \sum_{i=1}^k a_i M_i N_i.
|
||||
$$
|
||||
|
||||
Since $n_i \mid N_j$ for every $j \neq i$, all other terms of the sum vanish modulo $n_i$. Then $x \equiv a_iM_iN_i \equiv a_i(1 - m_in_i) \equiv a_i \pmod {n_i}$ for all $i = 1, \dots, k$.
|
||||
|
||||
(**Uniqueness**) Suppose that $x$ and $y$ are both solutions. Then $x \equiv a_i \equiv y \pmod {n_i}$, so $n_i \mid (x - y)$ for all $i$. Therefore we have
|
||||
|
||||
$$
|
||||
\mathrm{lcm}(n_1, \dots, n_k) \mid (x - y).
|
||||
$$
|
||||
|
||||
But $n_i$ are pairwise relatively prime, so $\mathrm{lcm}(n_1, \dots, n_k) = N$ and $N \mid (x-y)$. Hence $x \equiv y \pmod N$.
|
||||
|
||||
*Proof*. (**Abstract Algebra**) The above uniqueness proof shows that the map
|
||||
|
||||
$$
|
||||
x \bmod N \mapsto (x \bmod n_1, \dots, x \bmod n_k)
|
||||
$$
|
||||
|
||||
is injective. Both sides have exactly $N$ elements, so by the pigeonhole principle this map must also be surjective. This map is also a ring homomorphism, by the properties of modular arithmetic. Hence we have a ring isomorphism.
|
||||
|
||||
### Notes on the Proof of the Chinese Remainder Theorem
|
||||
|
||||
The elementary proof given above gives a *direct construction* of the solution. It is clear and easy to understand, and tells us how to find the actual solution.
|
||||
|
||||
But when the above proof is used in actual computation, it involves computations of very large numbers. The following is an implementation.
|
||||
|
||||
```cpp
#include <vector>
using std::vector;

// Assumed to be defined elsewhere: returns the inverse of a modulo m.
int modular_inverse(int a, int m);

// remainder holds the a_i values
// modulus holds the n_i values
int chinese_remainder_theorem(vector<int>& remainder, vector<int>& modulus) {
    int product = 1;  // N = n_1 * ... * n_k, may overflow for large moduli
    for (int m : modulus) {
        product *= m;
    }

    int result = 0;
    for (int i = 0; i < (int) modulus.size(); ++i) {
        int N_i = product / modulus[i];
        // Add a_i * M_i * N_i, where M_i is the inverse of N_i modulo n_i.
        result += remainder[i] * modular_inverse(N_i, modulus[i]) * N_i;
        result %= product;
    }

    return result;
}
```
|
||||
|
||||
The `modular_inverse` function uses the extended Euclidean algorithm to find $M_i$ in the proof. For large moduli and many equations, $N_i = N / n_i$ results in a very large number, which is hard to handle (if your language has integer overflow) and takes longer to compute.
|
||||
|
||||
A better way is to construct the solution **inductively**. Find a solution for the first two equations,
|
||||
|
||||
$$
|
||||
\begin{array}{c}
|
||||
x \equiv a_1 \pmod{n_1} \\
|
||||
x \equiv a_2 \pmod{n_2}
|
||||
\end{array} \implies x \equiv a_{1, 2} \pmod{n_1n_2}
|
||||
$$
|
||||
|
||||
and using the result, add the next equation $x \equiv a_3 \pmod{n_3}$ and find a solution.[^1]
|
||||
|
||||
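One inductive step can be sketched as follows. This is not the implementation linked in the footnote, only a minimal illustration; the names `extended_gcd` and `crt_merge` are made up for this example, and it assumes $0 \leq a_1 < n_1$, $\gcd(n_1, n_2) = 1$, and moduli small enough that $n_1 n_2$ and $n_2^2$ fit in a signed 64-bit integer.

```cpp
#include <cstdint>
#include <utility>

// Returns g = gcd(a, b) and sets x, y so that a*x + b*y = g.
std::int64_t extended_gcd(std::int64_t a, std::int64_t b, std::int64_t& x, std::int64_t& y) {
    if (b == 0) {
        x = 1;
        y = 0;
        return a;
    }
    std::int64_t x1, y1;
    std::int64_t g = extended_gcd(b, a % b, x1, y1);
    x = y1;
    y = x1 - (a / b) * y1;
    return g;
}

// Merges x = a1 (mod n1) and x = a2 (mod n2) into x = a (mod n1 * n2).
// Returns the pair (a, n1 * n2) with 0 <= a < n1 * n2.
std::pair<std::int64_t, std::int64_t> crt_merge(std::int64_t a1, std::int64_t n1,
                                                std::int64_t a2, std::int64_t n2) {
    std::int64_t u, v;
    extended_gcd(n1, n2, u, v);  // n1*u + n2*v = 1, so u is the inverse of n1 modulo n2
    // Look for x = a1 + n1 * t. We need n1 * t = a2 - a1 (mod n2), i.e. t = (a2 - a1) * u.
    std::int64_t t = ((a2 - a1) % n2) * (u % n2) % n2;
    if (t < 0) {
        t += n2;
    }
    return {a1 + n1 * t, n1 * n2};
}
```

Folding `crt_merge` over the list of congruences keeps every intermediate value below the running product of the moduli. For example, merging $x \equiv 2 \pmod 3$ with $x \equiv 3 \pmod 5$ gives $x \equiv 8 \pmod{15}$, and merging that with $x \equiv 2 \pmod 7$ gives $x \equiv 23 \pmod{105}$.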
Lastly, the ring isomorphism actually tells us a lot and is quite effective for computation. Since the two rings are *isomorphic*, operations in $\mathbb{Z} _ N$ can be done independently in each $\mathbb{Z} _ {n_i}$ and then merged back to $\mathbb{Z} _ N$. Since $N$ is large compared to each $n_i$, computations can be much faster in $\mathbb{Z} _ {n _ i}$. Specifically, we will see how this fact is used for computations in RSA.
|
||||
|
||||
[^1]: I have an implementation in my repository. [Link](https://github.com/calofmijuck/BOJ/blob/4b29e0c7f487aac3186661176d2795f85f0ab21b/Codes/23000/23062.cpp#L38).
|
||||
180
_posts/lecture-notes/internet-security/2023-10-04-rsa-elgamal.md
Normal file
@@ -0,0 +1,180 @@
|
||||
---
|
||||
share: true
|
||||
toc: true
|
||||
math: true
|
||||
categories:
|
||||
- Lecture Notes
|
||||
- Internet Security
|
||||
path: _posts/lecture-notes/internet-security
|
||||
tags:
|
||||
- lecture-note
|
||||
- security
|
||||
- cryptography
|
||||
- number-theory
|
||||
title: 06. RSA and ElGamal Encryption
|
||||
date: 2023-10-04
|
||||
github_title: 2023-10-04-rsa-elgamal
|
||||
---
|
||||
|
||||
## Exponential Inverses
|
||||
|
||||
Suppose we are given integers $a$ and $N$ with $\gcd(a, \phi(N)) = 1$, so that $a$ is invertible modulo $\phi(N)$. For any integer $x$ that is relatively prime to $N$, we choose $b$ so that
|
||||
|
||||
$$
|
||||
|
||||
\tag{$*$}
|
||||
ab \equiv 1 \pmod{\phi(N)}.
|
||||
$$
|
||||
|
||||
Then, writing $ab = 1 + k\phi(N)$ for some integer $k$, we have
|
||||
|
||||
$$
|
||||
x^{ab} \equiv x^{1 + k\phi(N)} \equiv x \pmod N
|
||||
$$
|
||||
|
||||
by Euler's generalization.
|
||||
|
||||
> **Definition.** The integer $b$ satisfying $(\ast)$ is called the **exponential inverse of $a$ modulo $N$**.
|
||||
|
||||
Using exponential inverses will be a key idea in the RSA cryptosystem.
|
||||
|
||||
## RSA Cryptosystem
|
||||
|
||||
This is an explanation of the *textbook* RSA encryption scheme.
|
||||
|
||||
### Key Generation
|
||||
|
||||
- We pick two distinct large primes $p, q$ and set $N = pq$.
|
||||
- Select $(e, d)$ so that $ed \equiv 1 \pmod{\phi(N)}$.
|
||||
- Set $(N, e)$ as the **public key** and make it public.
|
||||
- Set $d$ as the **private key** and keep it secret.
|
||||
|
||||
### RSA Encryption and Decryption
|
||||
|
||||
Suppose we want to encrypt a message $m \in \mathbb{Z}_N$.
|
||||
|
||||
- **Encryption**
|
||||
- Using the public key $(N, e)$, compute the ciphertext $c = m^e \bmod N$.
|
||||
- **Decryption**
|
||||
- Recover the original message by computing $c^d \bmod N$.
|
||||
|
||||
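The key generation, encryption and decryption steps can be sketched with toy numbers as follows. This is only an illustration of textbook RSA under stated assumptions: the primes $p = 61$, $q = 53$ and the exponent $e = 17$ are chosen here for readability and are far too small to be secure, and the helper names `mod_pow` and `mod_inverse` are made up for this example.

```cpp
#include <cstdint>

// (base^exponent) mod modulus by repeated squaring; assumes 1 < modulus < 2^32.
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exponent, std::uint64_t modulus) {
    std::uint64_t result = 1;
    base %= modulus;
    while (exponent > 0) {
        if (exponent & 1) result = result * base % modulus;
        base = base * base % modulus;
        exponent >>= 1;
    }
    return result;
}

// Inverse of a modulo m by the extended Euclidean algorithm, or -1 if gcd(a, m) != 1.
std::int64_t mod_inverse(std::int64_t a, std::int64_t m) {
    std::int64_t old_r = a, r = m, old_s = 1, s = 0;
    while (r != 0) {
        std::int64_t q = old_r / r;
        std::int64_t tmp = old_r - q * r;
        old_r = r;
        r = tmp;
        tmp = old_s - q * s;
        old_s = s;
        s = tmp;
    }
    if (old_r != 1) return -1;     // a is not invertible modulo m
    return ((old_s % m) + m) % m;  // normalize into [0, m)
}

// Textbook RSA: the private exponent d satisfies e*d = 1 (mod (p-1)(q-1)).
std::int64_t rsa_private_exponent(std::int64_t e, std::int64_t p, std::int64_t q) {
    return mod_inverse(e, (p - 1) * (q - 1));
}
```

With $p = 61$ and $q = 53$ we get $N = 3233$ and $\phi(N) = 3120$, and `rsa_private_exponent(17, 61, 53)` gives $d = 2753$. Encrypting $m = 65$ as `mod_pow(65, 17, 3233)` gives $c = 2790$, and `mod_pow(2790, 2753, 3233)` recovers $65$.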
### Correctness of RSA?
|
||||
|
||||
Since $ed \equiv 1 \pmod{\phi(N)}$, we have
|
||||
|
||||
$$
|
||||
c^d \equiv m^{ed} \equiv m \pmod N
|
||||
$$
|
||||
|
||||
by the properties of exponential inverses.
|
||||
|
||||
Wait, but this property requires that $\gcd(m, N) = 1$. So it seems like we can't use some values of $m$. Furthermore, it should be computationally infeasible to recover $d$ using $e$ and $N$.
|
||||
|
||||
### Regarding the Choice of $N$
|
||||
|
||||
If $N$ is prime, it is very easy to find $d$. Since the relation $ed \equiv 1 \pmod {(N-1)}$ holds, we directly see that $d$ can be computed efficiently using the extended Euclidean algorithm.
|
||||
|
||||
The next simplest case would be setting $N = pq$ for two large primes $p$ and $q$. We expose $N$ to the public but hide primes $p$ and $q$. Now suppose the attacker wants to compute $d$ using $(N, e)$. The attacker knows that $ed \equiv 1 \pmod {\phi(N)}$, and $\phi(N) = (p-1)(q-1)$. So to calculate $d$, the attacker must know $\phi(N)$, which requires the **factorization of $N$**.
|
||||
|
||||
If the factorization $N = pq$ is known, finding $d$ is easy. But factoring large composite numbers (especially a product of two primes of similar size) is believed to be very difficult.[^1] No one has formally proven this, but we believe and assume that it is hard.[^2]
|
||||
|
||||
## Chinese Remainder Theorem in RSA
|
||||
|
||||
Assume that the message $m$ is divisible by neither $p$ nor $q$. By Fermat's little theorem, we have $m^{p-1} \equiv 1 \pmod p$ and $m^{q-1} \equiv 1 \pmod q$.
|
||||
|
||||
Therefore, for decryption in RSA, the following holds. Note that $N = pq$.
|
||||
|
||||
$$
|
||||
c^d \equiv m^{ed} \equiv m^{1 + k\phi(N)} \equiv m \cdot (m^{p-1})^{k(q-1)} \equiv m \cdot 1^{k(q-1)} \equiv m \pmod p.
|
||||
$$
|
||||
|
||||
A similar result holds for modulus $q$. This does not exactly recover the message yet, since $m$ could have been chosen to be larger than $p$. The above equation is true, but during actual computation, one may get a result that is less than $p$. *This may not be equal to the original message*.[^3]
|
||||
|
||||
Since $N = pq$, we use the Chinese remainder theorem. Instead of computing $c^d \bmod N$ directly, we can compute
|
||||
|
||||
$$
|
||||
c^d \equiv m \pmod p, \qquad c^d \equiv m \pmod q
|
||||
$$
|
||||
|
||||
independently and solve the system of equations to recover the message.
|
||||
|
||||
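A minimal sketch of this CRT-based decryption is given below, again with toy parameters. Two details are additions that the text above does not spell out and should be read as assumptions of this example: the exponent is reduced modulo $p - 1$ and $q - 1$ (justified by Fermat's little theorem for odd primes), and the two residues are recombined with a single inverse $q^{-1} \bmod p$, itself computed by Fermat's little theorem. The names `mod_pow` and `rsa_decrypt_crt` are made up here.

```cpp
#include <cstdint>

// (base^exponent) mod modulus by repeated squaring; assumes 1 < modulus < 2^32.
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exponent, std::uint64_t modulus) {
    std::uint64_t result = 1;
    base %= modulus;
    while (exponent > 0) {
        if (exponent & 1) result = result * base % modulus;
        base = base * base % modulus;
        exponent >>= 1;
    }
    return result;
}

// Decrypts c modulo N = p*q by working modulo p and q separately.
// Assumes p and q are distinct odd primes and e*d = 1 (mod (p-1)(q-1)).
std::uint64_t rsa_decrypt_crt(std::uint64_t c, std::uint64_t d, std::uint64_t p, std::uint64_t q) {
    std::uint64_t m_p = mod_pow(c % p, d % (p - 1), p);  // c^d mod p
    std::uint64_t m_q = mod_pow(c % q, d % (q - 1), q);  // c^d mod q
    std::uint64_t q_inv = mod_pow(q % p, p - 2, p);      // q^{-1} mod p
    // Find m = m_q + h*q with m = m_p (mod p), i.e. h = q^{-1} * (m_p - m_q) mod p.
    std::uint64_t h = q_inv * ((m_p + p - m_q % p) % p) % p;
    return m_q + h * q;
}
```

With $p = 61$, $q = 53$, $d = 2753$ and $c = 2790$, the two residues are $4$ and $12$, and the recombination returns $12 + 1 \cdot 53 = 65$. The two exponentiations use moduli and exponents of half the size, which is where the speedup comes from.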
## Can I Encrypt $p$ with RSA?
|
||||
|
||||
Now we return to the problem where $\gcd(m, N) \neq 1$. For $m$ chosen uniformly from $\mathbb{Z}_N$, the probability of $\gcd(m, N) \neq 1$ is $\frac{1}{p} + \frac{1}{q} - \frac{1}{pq}$, so if we take large primes $p, q \approx 2^{1024}$ as in RSA-2048, the probability of this occurring is roughly $2^{-1023}$, which is negligible. But for completeness, we also prove correctness in this case.
|
||||
|
||||
$e, d$ are still chosen to satisfy $ed \equiv 1 \pmod {\phi(N)}$. Suppose we want to decrypt $c \equiv m^e \pmod N$.
|
||||
|
||||
We will also use the Chinese remainder theorem here.
|
||||
|
||||
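Since $\gcd(m, N) \neq 1$ and $N = pq$, at least one of $p, q$ divides $m$. The case $m = 0$ is trivial, so assume $m \neq 0$ and, without loss of generality, $p \mid m$ (the case $q \mid m$ is symmetric). So if we compute in $\mathbb{Z}_p$, we will get $0$,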
|
||||
|
||||
$$
|
||||
c^d \equiv m^{ed} \equiv 0^{ed} \equiv 0 \pmod p.
|
||||
$$
|
||||
|
||||
We also do the computation in $\mathbb{Z}_q$ and get
|
||||
|
||||
$$
|
||||
c^d \equiv m^{ed} \equiv m^{1 + k\phi(N)} \equiv m\cdot (m^{q-1})^{k(p-1)} \equiv m \cdot 1^{k(p-1)} \equiv m \pmod q.
|
||||
$$
|
||||
|
||||
Here, we used the fact that $m^{q-1} \equiv 1 \pmod q$. This holds because if $p \mid m$, $m$ is a multiple of $p$ that is less than $N$, so $m = pm'$ for some $m'$ such that $1 \leq m' < q$. Then $\gcd(m, q) = \gcd(pm', q) = 1$ since $q$ does not divide $p$ and $m'$ is less than $q$.
|
||||
|
||||
Therefore, from $c^d \equiv 0 \pmod p$ and $c^d \equiv (m \bmod q) \pmod q$, we can recover a unique solution $c^d \equiv m \pmod N$.
|
||||
|
||||
Now we must argue that the recovered solution is actually equal to the original $m$. But what we did above was showing that $m^{ed}$ and $m$ in $\mathbb{Z}_N$ are mapped to the same element $(0, m \bmod q)$ in $\mathbb{Z}_p \times \mathbb{Z}_q$. Since the Chinese remainder theorem tells us that this mapping is an isomorphism, $m^{ed}$ and $m$ must have been the same elements of $\mathbb{Z}_N$ in the first place.
|
||||
|
||||
Notice that we did not require $m$ to be relatively prime to $N$. Thus the RSA encryption scheme is correct for any $m \in \mathbb{Z}_N$.
|
||||
|
||||
## Correctness of RSA with Fermat's Little Theorem
|
||||
|
||||
Actually, the above result can be proven using only Fermat's little theorem. In the above proof, the Chinese remainder theorem was used to transform the operation, but for $N = pq$, the situation is simple enough that this theorem is not necessarily required.
|
||||
|
||||
Let $M = m^{ed} - m$. We have shown above only using Fermat's little theorem that $p \mid M$ and $q \mid M$, for any choice of $m \in \mathbb{Z}_N$. Then since $N = pq = \mathrm{lcm}(p, q)$, we have $N \mid M$, so $m^{ed} \equiv m \pmod N$. Hence the RSA scheme is correct.
|
||||
|
||||
So we don't actually need Euler's generalization for proving the correctness of RSA...?! In fact, the proof given in the original paper of RSA used Fermat's little theorem.
|
||||
|
||||
## Discrete Logarithms
|
||||
|
||||
This is an inverse problem of exponentiation. The inverse of exponentials is logarithms, so we consider the **discrete logarithm of a number modulo $p$**.
|
||||
|
||||
Given $y \equiv g^x \pmod p$ for some prime $p$, we want to find $x = \log_g y$. We set $g$ to be a generator of the multiplicative group $\mathbb{Z}_p^*$, since if $g$ is a generator, a solution exists for every $y \in \mathbb{Z}_p^*$.
|
||||
|
||||
Read more in [discrete logarithm problem (Modern Cryptography)](../modern-cryptography/2023-10-03-key-exchange.md#discrete-logarithm-problem-(dl)).
|
||||
|
||||
## ElGamal Encryption
|
||||
|
||||
This is an encryption scheme built upon the hardness of the DLP.
|
||||
|
||||
> 1. Let $p$ be a large prime.
|
||||
> 2. Select a generator $g \in \mathbb{Z}_p^*$.
|
||||
> 3. Choose a private key $x \in \mathbb{Z}_p^*$.
|
||||
> 4. Compute the public key $y = g^x \pmod p$.
|
||||
> - $p, g, y$ will be publicly known.
|
||||
> - $x$ is kept secret.
|
||||
|
||||
### ElGamal Encryption and Decryption
|
||||
|
||||
Suppose we encrypt a message $m \in \mathbb{Z}_p^*$.
|
||||
|
||||
> 1. The sender chooses a random $k \in \mathbb{Z}_p^*$, called the *ephemeral key*.
|
||||
> 2. Compute $c_1 = g^k \pmod p$ and $c_2 = my^k \pmod p$.
|
||||
> 3. $c_1, c_2$ are sent to the receiver.
|
||||
> 4. The receiver calculates $c_1^x \equiv g^{xk} \equiv y^k \pmod p$, and finds the inverse $y^{-k} \in \mathbb{Z}_p^*$.
|
||||
> 5. Then $c_2y^{-k} \equiv m \pmod p$, recovering the message.
|
||||
|
||||
The attacker will see $g^k$. By the hardness of DLP, the attacker is unable to recover $k$ even if he knows $g$.
|
||||
|
||||
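The scheme can be sketched with toy parameters as follows. All concrete values used with it ($p = 467$, $g = 2$, the private key and the ephemeral key) are illustrative assumptions, as are the function names; in a real system $k$ would be drawn fresh at random for every message, and the inverse in decryption is computed here via Fermat's little theorem.

```cpp
#include <cstdint>
#include <utility>

// (base^exponent) mod modulus by repeated squaring; assumes 1 < modulus < 2^32.
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exponent, std::uint64_t modulus) {
    std::uint64_t result = 1;
    base %= modulus;
    while (exponent > 0) {
        if (exponent & 1) result = result * base % modulus;
        base = base * base % modulus;
        exponent >>= 1;
    }
    return result;
}

// Returns (c1, c2) = (g^k mod p, m * y^k mod p) for the ephemeral key k.
std::pair<std::uint64_t, std::uint64_t> elgamal_encrypt(std::uint64_t m, std::uint64_t k,
                                                        std::uint64_t g, std::uint64_t y,
                                                        std::uint64_t p) {
    return {mod_pow(g, k, p), m % p * mod_pow(y, k, p) % p};
}

// Recovers m = c2 * (c1^x)^{-1} mod p using the private key x.
std::uint64_t elgamal_decrypt(std::pair<std::uint64_t, std::uint64_t> c, std::uint64_t x,
                              std::uint64_t p) {
    std::uint64_t shared = mod_pow(c.first, x, p);         // c1^x = g^{xk} = y^k
    std::uint64_t shared_inv = mod_pow(shared, p - 2, p);  // inverse by Fermat's little theorem
    return c.second * shared_inv % p;
}
```

For instance, with private key $x = 127$ and public key $y = 2^{127} \bmod 467$, encrypting $m = 100$ with $k = 213$ and then decrypting returns $100$ again.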
#### Ephemeral Key Should Be Distinct
|
||||
|
||||
If the same $k$ is used twice, the encryption is not secure. Suppose we encrypt two different messages $m_1, m_2 \in \mathbb{Z} _ p^{ * }$ with the same $k$. The attacker will see $(g^k, m_1y^k)$ and $(g^k, m_2 y^k)$. Since we are in a multiplicative group $\mathbb{Z} _ p^{ * }$, inverses exist, so the attacker can compute

$$
m_1y^k \cdot (m_2 y^k)^{-1} \equiv m_1m_2^{-1} \pmod p,
$$

which reveals the ratio $m_1 m_2^{-1}$ of the two messages. If the attacker knows (or guesses) one of the messages, the other one is recovered immediately, so information is leaked.
|
||||
|
||||
[^1]: If one of the primes is small, factoring is easy. Therefore we require that $p, q$ both be large primes.
|
||||
[^2]: There is a quantum polynomial time (BQP) algorithm for integer factorization. See [Shor's algorithm](https://en.wikipedia.org/wiki/Shor%27s_algorithm).
|
||||
[^3]: This part of the explanation is not necessary if we use abstract algebra!
|
||||
@@ -0,0 +1,139 @@
|
||||
---
|
||||
share: true
|
||||
toc: true
|
||||
math: true
|
||||
categories:
|
||||
- Lecture Notes
|
||||
- Internet Security
|
||||
path: _posts/lecture-notes/internet-security
|
||||
tags:
|
||||
- lecture-note
|
||||
- security
|
||||
- cryptography
|
||||
title: 07. Public Key Cryptography
|
||||
date: 2023-10-09
|
||||
github_title: 2023-10-09-public-key-cryptography
|
||||
---
|
||||
|
||||
In symmetric key cryptography, we have a problem with key sharing and management. More info in the first few paragraphs of [Key Exchange (Modern Cryptography)](../modern-cryptography/2023-10-03-key-exchange.md).
|
||||
|
||||
## Public Key Cryptography
|
||||
|
||||
We use **two** keys for public key cryptography. The keys are called the *public key* and the *private key*. These two keys are related to each other, but it should be computationally infeasible to calculate the private key from the public key.
|
||||
|
||||
- **Public key** is *public*, and anyone can use it to encrypt messages or verify signatures.
|
||||
- **Private key** (or secret key) is only kept by the owner. It is used to decrypt messages or create signatures.
|
||||
|
||||
We will denote public keys as $pk$ and private keys as $sk$.
|
||||
|
||||
These keys are created to be used in **trapdoor one-way functions**.
|
||||
|
||||
### One-way Function
|
||||
|
||||
A **one-way function** is a function that is easy to compute, but for which it is hard to find a preimage of a given output. Here are some common examples.
|
||||
|
||||
- *Cryptographic hash functions*: [Hash Functions (Modern Cryptography)](../modern-cryptography/2023-09-28-hash-functions.md#collision-resistance).
|
||||
- *Factoring a large integer*: It is easy to multiply two integers even if they're large, but factoring the product is very hard.
|
||||
- *Discrete logarithm problem*: It is easy to exponentiate a number, but it is hard to find the discrete logarithm.
|
||||
|
||||
But a one-way function is not enough. Suppose that $f$ is a one-way function with a public key $pk$. It will be easy to encrypt a message $m$ as $f(pk, m)$, but recovering $m$ is hard even for the intended recipient.
|
||||
|
||||
### Trapdoor One-way Function
|
||||
|
||||
A **trapdoor one-way function** has a *trapdoor*. It is computationally difficult to find the preimage, but with the trapdoor, inverting is easy.
|
||||
|
||||
In public key cryptography, the trapdoor is the *private key* that makes it easy to invert the one-way function $f$. So the recipient can efficiently invert $f$ and recover the message $m$.
|
||||
|
||||
### Encryption and Decryption
|
||||
|
||||
In public key cryptography, encryption and decryption are done as follows.
|
||||
|
||||
Suppose that Alice wants to send a secret message to Bob. Alice must encrypt the message using **Bob's public key**, so that only Bob can decrypt the message.
|
||||
|
||||
> 1. Alice takes a plaintext and encrypts it using Bob's public key.
|
||||
> 2. The ciphertext is sent to Bob.
|
||||
> 3. Bob uses his private key to decrypt the ciphertext.
|
||||
|
||||
Mathematically, let $pk, sk$ be Bob's public key and private key.
|
||||
|
||||
> 1. Alice computes the ciphertext $c = f(pk, m)$ of the message $m$.
|
||||
> 2. $c$ is sent to Bob.
|
||||
> 3. Bob computes $m = f^{-1}(sk, c)$ and recovers $m$.
|
||||
|
||||
### Authentication
|
||||
|
||||
Public key cryptography can also be used for **authentication**. If some ciphertext can be decrypted with Alice's public key, it must have been created with Alice's private key, so we can verify that the message was from Alice.
|
||||
|
||||
We will learn more about this when we learn digital signatures.
|
||||
|
||||
### Applications of Public Key Cryptography
|
||||
|
||||
- **Encryption and decryption**: for private communication.
|
||||
- **Digital signatures**: authentication, as explained above.
|
||||
- This was not possible with symmetric cryptography: both parties hold the same key, so it does not satisfy non-repudiation.
|
||||
- **Key exchange**
|
||||
- We assumed that in symmetric cryptography, there was a secure channel to share the secret key.
|
||||
- We use public key cryptography to exchange and agree on the secret key for the symmetric cipher.
|
||||
- Public key cryptography takes longer to calculate, so it is preferable to use symmetric ciphers for the actual data.
|
||||
|
||||
But a problem still remains. How does one verify that this key is indeed from that identity? In the example above, how does Alice know that this public key is from Bob and not someone else's? This problem will be solved using **public key infrastructure**.
|
||||
|
||||
## Diffie-Hellman Key Exchange
|
||||
|
||||
Choose a large prime $p$ and a generator $g$ of $\mathbb{Z}_p^{ * }$. The description of $g$ and $p$ will be known to the public.
|
||||
|
||||
> 1. Alice chooses some $x \in \mathbb{Z}_p^{ * }$ and sends $g^x \bmod p$ to Bob.
|
||||
> 2. Bob chooses some $y \in \mathbb{Z}_p^{ * }$ and sends $g^y \bmod p$ to Alice.
|
||||
> 3. Alice and Bob calculate $g^{xy} \bmod p$ separately.
|
||||
> 4. Eve can see $g^x \bmod p$, $g^y \bmod p$ but cannot feasibly calculate $g^{xy} \bmod p$, assuming that this (computational Diffie-Hellman) problem is hard.
|
||||
|
||||
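The exchange above can be sketched with toy numbers as follows. The parameters $p = 467$, $g = 2$ and the secrets used with it are assumptions chosen for illustration and are far too small for real use; the function names are made up for this example.

```cpp
#include <cstdint>

// (base^exponent) mod modulus by repeated squaring; assumes 1 < modulus < 2^32.
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exponent, std::uint64_t modulus) {
    std::uint64_t result = 1;
    base %= modulus;
    while (exponent > 0) {
        if (exponent & 1) result = result * base % modulus;
        base = base * base % modulus;
        exponent >>= 1;
    }
    return result;
}

// The value a party sends over the public channel: g^secret mod p.
std::uint64_t dh_public_value(std::uint64_t g, std::uint64_t secret, std::uint64_t p) {
    return mod_pow(g, secret, p);
}

// The shared key a party derives from the other side's public value.
std::uint64_t dh_shared_key(std::uint64_t other_public, std::uint64_t secret, std::uint64_t p) {
    return mod_pow(other_public, secret, p);
}
```

If Alice uses the secret $x = 153$ and Bob uses $y = 197$, then `dh_shared_key(dh_public_value(2, 197, 467), 153, 467)` and `dh_shared_key(dh_public_value(2, 153, 467), 197, 467)` evaluate to the same value $g^{xy} \bmod p$, even though neither secret was ever transmitted.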
Refer to [Diffie-Hellman Key Exchange (Modern Cryptography)](../modern-cryptography/2023-10-03-key-exchange.md#diffie-hellman-key-exchange-(dhke)).
|
||||
|
||||
## Message Integrity
|
||||
|
||||
A function $H$ takes a message of arbitrary length as input and outputs a fixed-length string. The output is called a **message digest**, *tag*, *fingerprint* or **hash**.
|
||||
|
||||
Here, $H$ is called a **hash function**. This function is many-to-one, but it should be computationally infeasible to find a collision.
|
||||
|
||||
**Desirable Properties of $H$**.
|
||||
|
||||
- $H$ should be easy to calculate.
|
||||
- It should be hard to recover $m$ from $H(m)$. (one-wayness)
|
||||
- It should be computationally difficult to find a collision. (collision resistance)
|
||||
- The output should seem random.
|
||||
|
||||
Using this function, we can check whether the message was tampered with during transmission.
|
||||
|
||||
### Message Authentication Code (MAC)
|
||||
|
||||
We assume that Alice and Bob already share a secret $k$. Alice wants to send a message $m$ to Bob.
|
||||
|
||||
> 1. Alice authenticates the message using the shared key by calculating the tag $t = H(k, m)$.
|
||||
> 2. Alice sends the message and tag.
|
||||
> 3. Bob calculates the tag $t'$ from the received message using the same key $k$. If $t'$ does not match $t$, Bob detects that the message was modified.
|
||||
|
||||
We only care about message integrity in MACs, so the message is not encrypted.
|
||||
|
||||
### Properties of MAC
|
||||
|
||||
- MACs are based on symmetric keys, so communicating parties must share a key.
|
||||
- MACs should be able to accept messages of arbitrary length.
|
||||
- MACs should output a fixed-length string.
|
||||
- MACs should provide message integrity. Any manipulation in transit will be detected, and the receiving party is assured of the origin of the message.
|
||||
- MACs **do not** support non-repudiation.
|
||||
- Since both parties have the secret, either party could have created the message and its tag.
|
||||
|
||||
## Digital Signatures
|
||||
|
||||
**Digital signatures** achieve *integrity*, *non-repudiation* and *authentication*. We leverage public key cryptography.
|
||||
|
||||
Suppose Alice wants to **sign** a message $m$. Alice has public key $pk$ and private key $sk$.
|
||||
|
||||
> 1. Alice calculates $\sigma = D(sk, m)$ and sends $m \parallel \sigma$.
|
||||
> 2. Bob receives it and calculates $E(pk, \sigma)$ and compares it with $m$.
|
||||
> - The key $pk$ here is Alice's public key.
|
||||
|
||||
- Since the signature can be verified using Alice's public key, it must have been signed using Alice's private key.
|
||||
- Thus the message must have been from Alice.
|
||||
- Verification is done using Alice's public key, so anyone can verify the message.
|
||||
- Messages are usually long, so we take a hash function $H$ to shorten it, and sign $H(m)$ instead.
|
||||
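With textbook RSA, $D(sk, \cdot)$ and $E(pk, \cdot)$ are simply exponentiation with $d$ and $e$. A minimal sketch under toy assumptions ($N = 3233$, $e = 17$, $d = 2753$, and a message that is already a number below $N$, standing in for $H(m)$) could look like this; the function names are made up for this example.

```cpp
#include <cstdint>

// (base^exponent) mod modulus by repeated squaring; assumes 1 < modulus < 2^32.
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exponent, std::uint64_t modulus) {
    std::uint64_t result = 1;
    base %= modulus;
    while (exponent > 0) {
        if (exponent & 1) result = result * base % modulus;
        base = base * base % modulus;
        exponent >>= 1;
    }
    return result;
}

// Alice signs with her private exponent d: sigma = m^d mod N.
std::uint64_t rsa_sign(std::uint64_t m, std::uint64_t d, std::uint64_t n) {
    return mod_pow(m, d, n);
}

// Anyone verifies with Alice's public key (N, e): accept iff sigma^e mod N equals m.
bool rsa_verify(std::uint64_t m, std::uint64_t sigma, std::uint64_t e, std::uint64_t n) {
    return mod_pow(sigma, e, n) == m % n;
}
```

Here `rsa_verify(65, rsa_sign(65, 2753, 3233), 17, 3233)` accepts, while the same signature presented with the message $66$ is rejected. Unpadded signatures like this are malleable, which is one more reason to sign a hash of the message rather than the message itself.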
205
_posts/lecture-notes/internet-security/2023-10-16-pki.md
Normal file
@@ -0,0 +1,205 @@
|
||||
---
|
||||
share: true
|
||||
toc: true
|
||||
math: true
|
||||
categories:
|
||||
- Lecture Notes
|
||||
- Internet Security
|
||||
path: _posts/lecture-notes/internet-security
|
||||
tags:
|
||||
- lecture-note
|
||||
- security
|
||||
title: 08. Public Key Infrastructure
|
||||
date: 2023-10-16
|
||||
github_title: 2023-10-16-pki
|
||||
attachment:
|
||||
folder: assets/img/posts/lecture-notes/internet-security
|
||||
---
|
||||
|
||||
Suppose that we're using RSA, and Alice has public key $(N, e)$ and private key $d$. Anyone can send messages to Alice using $(N, e)$. But because anyone can generate a key pair and claim that it belongs to Alice, we are not sure whether the key $(N, e)$ is *really* Alice's key. We might run into a situation where $(N, e)$ was actually some other person's key. *How do we check whose key this is?*
|
||||
|
||||
**Public key infrastructure** (PKI) solves this problem by using **certificates**.
|
||||
|
||||
## Cryptographic Certificates
|
||||
|
||||
We focus on **cryptographic certificates**.
|
||||
|
||||
> A **certificate** is an electronic document that binds a *public key* to its *owner's identity*. This binding is assured by the *digital signature* of an issuer, which we call a **certificate authority** (CA).
|
||||
|
||||
- A certificate authority is a *trusted third party* (TTP).
|
||||
- If we don't trust the CA, then we cannot trust the signature of the certificate.
|
||||
- Commercial CAs charge to issue certificates.
|
||||
- Kinds of certificates
|
||||
- Digital certificate
|
||||
- Public key certificate
|
||||
- SSL (or TLS) certificate
|
||||
- X.509 certificates (ITU-T)
|
||||
- Web server certificates
|
||||
|
||||
## Components of Public Key Infrastructure
|
||||
|
||||
- **Registration Authority** (RA)
|
||||
- Checks the individual or entity that requests the certificate.
|
||||
- It will check if the requester has a matching private key, etc.
|
||||
- This result will be sent to the certification authority.
|
||||
- **Certification Authority** (CA)
|
||||
- Issues certificates. Binds a public key to an identity.
|
||||
- User identity must be unique within each CA domain.
|
||||
- Relying parties (browsers, etc.) need a copy of CA's public key.
|
||||
- Directory Service
|
||||
- A directory of public keys and certificates, so that anyone can access it.
|
||||
- Revocation Service
|
||||
- A mechanism to check if a certificate is revoked or not.
|
||||
- Certificate revocation list (CRL), online certificate status protocol (OCSP).
|
||||
|
||||
Note that the certification authority can offload some of its work to the registration authority.
|
||||
|
||||
## Contents of a Certificate (X.509)
|
||||
|
||||
- **Serial Number**: a unique identifier for the certificate within the CA.
|
||||
- **Subject**: the identified person or entity. Also known as *distinguished name* (DN).
|
||||
- Common Name
|
||||
- Organization
|
||||
- Organizational Unit
|
||||
- Locality
|
||||
- State or Province
|
||||
- Country/Region
|
||||
- **Signature Algorithm**: the algorithm used to create the signature.
|
||||
- **Issuer**: the entity that verified the subject and issued the certificate.
|
||||
- **Valid-From**
|
||||
- **Valid-To**: certificate expiration date.
|
||||
- **Key-Usage**: purpose of this public key
|
||||
- Encryption, signature, etc.
|
||||
- **Public Key**: the public key of the subject.
|
||||
- **Signature**: the digital signature signed by the issuer.
|
||||
- For verifying that this signature is from the issuer, and can be trusted.
|
||||
|
||||
The hash of the entire certificate is called a **thumbprint**, but this is not included in the certificate.
|
||||
|
||||
## Certificate Validation Process
|
||||
|
||||
### Hierarchy of CAs
|
||||
|
||||
We have a root CA at the top. Then there are issuing CAs below. We usually request certificates from an issuing CA. Note that the issuing CAs also have their own certificates, each signed by the next higher-level CA.
|
||||
|
||||
### Certificate Validation
|
||||
|
||||
[^1]
|
||||
|
||||
Since we have a hierarchy of CAs, certificate validation must also follow the hierarchy. When we receive a certificate, it is highly likely to be signed by a non-root CA.
|
||||
|
||||
Thus we validate certificates by the following process. Suppose we received a certificate $A$.
|
||||
|
||||
> 1. In $A$, check the issuer's DN and request the issuer's certificate $B$.
|
||||
> 2. Verify the signature of $A$ using the public key of the issuer in $B$.
|
||||
> 3. Recursively validate $B$ following the above steps.
|
||||
|
||||
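The recursive validation can be modeled with a deliberately simplified sketch. Everything in it is a hypothetical stand-in rather than a real X.509 implementation: the `Certificate` struct keeps only a few fields, `toy_digest` is not a cryptographic hash, signatures are textbook RSA on tiny numbers, and the `directory` map plays the role of "requesting the issuer's certificate". Expiry and revocation checks are left out.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>

// (base^exponent) mod modulus by repeated squaring; assumes 1 < modulus < 2^32.
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exponent, std::uint64_t modulus) {
    std::uint64_t result = 1;
    base %= modulus;
    while (exponent > 0) {
        if (exponent & 1) result = result * base % modulus;
        base = base * base % modulus;
        exponent >>= 1;
    }
    return result;
}

// Hypothetical, heavily reduced certificate.
struct Certificate {
    std::string subject;
    std::string issuer;
    std::uint64_t n, e;       // subject's toy RSA public key
    std::uint64_t signature;  // issuer's signature over toy_digest(*this)
};

// Stand-in for hashing the certificate body (everything except the signature).
std::uint64_t toy_digest(const Certificate& c) {
    std::uint64_t h = 7;
    for (unsigned char ch : c.subject + "|" + c.issuer) h = (h * 31 + ch) % 1000003;
    return (h + c.n + c.e) % 1000003;
}

// Step 2: verify the signature of cert with the public key found in issuer_cert.
bool signature_ok(const Certificate& cert, const Certificate& issuer_cert) {
    return mod_pow(cert.signature, issuer_cert.e, issuer_cert.n)
           == toy_digest(cert) % issuer_cert.n;
}

// Walks up the chain until a self-signed certificate is reached,
// then accepts only if that root is in the trusted set.
bool validate_chain(Certificate cert, const std::map<std::string, Certificate>& directory,
                    const std::set<std::string>& trusted_roots) {
    for (int depth = 0; depth < 10; ++depth) {     // bound the chain length
        auto it = directory.find(cert.issuer);     // step 1: fetch the issuer's certificate
        if (it == directory.end()) return false;
        if (!signature_ok(cert, it->second)) return false;
        if (cert.subject == cert.issuer) return trusted_roots.count(cert.subject) > 0;
        cert = it->second;                         // step 3: validate the issuer in turn
    }
    return false;
}
```

The loop is the iterative form of the recursion in the steps above: each round checks one link, and the walk ends at a self-signed certificate, which is accepted only if its subject appears in the trusted set.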
We will request the certificate of a root CA at the end. If everything went well, all the intermediate certificates will have been verified. Now we must verify the certificate of a root CA, but a root CA does not have any higher level CAs.
|
||||
|
||||
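The requirements that publicly trusted CAs must follow are set by the [CA/Browser forum](https://cabforum.org/), and browser and operating system vendors decide which root CAs to include in their trust stores. Root CAs are thus acknowledged by the public community, and we agree that they can be trusted. Since no higher-level CA exists, root CAs sign their own certificates.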
|
||||
|
||||
In many web browsers, root CA certificates are on a built-in list of trusted certificates, so they are always trusted.
|
||||
|
||||
### Self-Signed Certificates
|
||||
|
||||
As in the example of root CAs, some certificates are signed by the same entity as the subject, i.e., the issuer and the subject are the same. We call these **self-signed certificates**.
|
||||
|
||||
We generally don't trust self-signed certificates, since they can be created easily. Anyone can generate a key pair, create a certificate, and sign it themselves.
|
||||
|
||||
But there are some places where self-signed certificates are handy. The first example is root CAs. Also, since obtaining a certificate from a CA can cost money and time, a self-signed certificate is a convenient choice for test servers.
|
||||
|
||||
There are also some problems with self-signed certificates. These certificates are self-created and self-signed, so the certificate can contain arbitrary values.[^2] For example, the certificate can be valid for a thousand years, which is usually not possible for CA-issued certificates. Lastly, self-signed certificates are hard to revoke by nature, since they are not issued by a CA.
|
||||
|
||||
## Certificate Revocation
|
||||
|
||||
### Key Pair Lifecycle
|
||||
|
||||
A key is generated, and a certificate is issued for the key. When the key reaches the end of its life, the certificate expires or is revoked.
|
||||
|
||||
- Keys should be generated by the owner, for non-repudiation.
|
||||
- Dual key pair model
|
||||
- Separate key pairs for encryption/decryption and signature.
|
||||
|
||||
### Certificate Revocation
|
||||
|
||||
There are some cases where certificates must be revoked.
|
||||
|
||||
- Certificate is mis-issued (not the right identity).
|
||||
- The private key is compromised by attackers.
|
||||
- The owner forgets the passphrase that protects the private key.
|
||||
- The private key is lost.
|
||||
|
||||
*PKI is only as secure as its revocation mechanism*, and revocation is hard to handle well.
|
||||
|
||||
- The CA revokes the certificates.
|
||||
- The relying party checks the revocation status using **certificate revocation lists** or the **online certificate status protocol**.
|
||||
- The certificate tells us where to get the revocation information.
|
||||
|
||||
### Certificate Revocation Lists (CRL)
|
||||
|
||||
The **certificate revocation list** (CRL) contains information about itself and revoked certificates.
|
||||
|
||||
- For each revoked certificate, the serial number and the revocation date are recorded.
|
||||
- It also contains the date of the next update.
|
||||
- This list should be publicly available, so that anyone can check if the certificate is revoked.
|
||||
- The verifier will look at the CRL distribution URL in the certificate and receive the CRLs.
|
||||
|
||||
CRL checking is done in the following way.
|
||||
|
||||
> 1. A client connects to a website and receives the certificate of the server.
|
||||
> 2. The client queries the certificate revocation server and downloads CRLs.
|
||||
> 3. The client checks whether the certificate is revoked or not.
|
||||
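As a rough sketch of step 3, the check itself is a lookup of the serial number in the downloaded list. The CRL below is a hypothetical in-memory structure with made-up values, not a parsed X.509 CRL, and the `"stale"` result is our own addition to show why the next update date matters.

```python
from datetime import datetime, timezone

# Hypothetical CRL: information about the list itself plus the revoked entries.
crl = {
    "issuer": "Issuing CA",
    "this_update": datetime(2023, 10, 1, tzinfo=timezone.utc),
    "next_update": datetime(2023, 10, 8, tzinfo=timezone.utc),
    "revoked": {  # serial number -> revocation date
        0x1A2B: datetime(2023, 9, 20, tzinfo=timezone.utc),
    },
}


def check_crl(serial, crl, now):
    """Return 'revoked', 'stale' (list is past its next update) or 'good'."""
    if serial in crl["revoked"]:
        return "revoked"
    if now > crl["next_update"]:
        return "stale"
    return "good"


now = datetime(2023, 10, 3, tzinfo=timezone.utc)
print(check_crl(0x1A2B, crl, now))
print(check_crl(0x9999, crl, now))
```

A certificate revoked after `this_update` will not appear in the list until the CA publishes the next one.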
|
||||
But CRLs cannot be distributed in real time. Furthermore, CRL lifetimes and update periods vary between CAs. Thus a revoked certificate can still be accepted in the window between CRL updates. Also, CRLs keep growing over time, so they get harder to download and manage.
|
||||
|
||||
### Online Certificate Status Protocol (OCSP)
|
||||
|
||||
The **online certificate status protocol** (OCSP) is another way to handle certificate revocation. Basically, the client queries an OCSP server for revocation information.
|
||||
|
||||
There is an **OCSP server** that runs 24/7, responding to queries. This server can be run by the CAs or may be delegated to some other entities. The address of the OCSP server is specified in the certificate.
|
||||
|
||||
Using OCSP, the revocation check is done in the following way.
|
||||
|
||||
> 1. A client connects to a website and receives the certificate of the server.
|
||||
> 2. Using the OCSP server address in the certificate, the client queries the OCSP server with the serial number of the certificate.
|
||||
> 3. The server checks the database, and sends a signed response containing revocation information.
|
||||
|
||||
This method has a privacy problem. The client queries with the serial number, so the OCSP server can track which websites the client has visited. Also, if the OCSP server is unavailable or overloaded, the response may be slow or may never arrive. In that case browsers typically soft-fail and assume that the certificate has no problem.[^3]
|
||||
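The difference between soft-fail and hard-fail can be sketched as follows. This is a simplified model, not a real OCSP client: `query_responder` is a hypothetical callable standing in for the network request, and the responder database is made up.

```python
def ocsp_check(serial, query_responder, hard_fail=False):
    """Return True if the certificate with this serial number should be accepted.

    query_responder returns "good" or "revoked", or raises ConnectionError
    when the responder cannot be reached.
    """
    try:
        status = query_responder(serial)
    except ConnectionError:
        # No answer from the responder: soft-fail accepts, hard-fail rejects.
        return not hard_fail
    return status == "good"


REVOKED_SERIALS = {0x1A2B}  # toy responder database (assumption)


def responder_up(serial):
    return "revoked" if serial in REVOKED_SERIALS else "good"


def responder_down(serial):
    raise ConnectionError("OCSP responder unreachable")


print(ocsp_check(0x1A2B, responder_up))                    # revoked certificate is rejected
print(ocsp_check(0x1A2B, responder_down))                  # soft-fail: accepted anyway
print(ocsp_check(0x1A2B, responder_down, hard_fail=True))  # hard-fail: rejected
```

With soft-fail, an attacker who can block traffic to the OCSP server can make a revoked certificate pass the check.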
|
||||
#### OCSP Stapling
|
||||
|
||||
The privacy issue can be solved using **OCSP stapling**. When the client connects to the web server, the *web server queries the OCSP server* and passes the response on to the client. The server staples the OCSP response to the certificate, hence the name.
|
||||
|
||||
Thus the client does not have to tell the OCSP server which site it is visiting.
|
||||
|
||||
## Problems with PKI
|
||||
|
||||
- If a CA is compromised, it can issue a certificate for any name.
|
||||
- Or a CA may issue a certificate without checking every detail.
|
||||
- Fraudulent certificates look perfectly valid.
|
||||
- PKI is only as secure as the weakest CA.
|
||||
- As revoked certificates increase, it is hard to manage them.[^4]
|
||||
- Certificate verification depends on the implementations of the browser.
|
||||
- Users often ignore certificate warnings.
|
||||
|
||||
### Solving PKI Issues
|
||||
|
||||
CAs offer different kinds of certificates, which differ in how strongly the CA validates the subject.
|
||||
|
||||
- **Domain Validation** (DV) certificate
|
||||
- The CA issues this certificate to anyone listed as a contact in the public record associated with a domain name.
|
||||
- The CA exchanges confirmation emails with an address listed in the domain's WHOIS record.
|
||||
- **Organization Validation** (OV) certificate
|
||||
- The CA carefully examines the organization or the individual.
|
||||
- **Extended Validation** (EV) certificate
|
||||
- The most rigorous identity check is performed on the organization or individual.
|
||||
- Often used by online financial services.
|
||||
|
||||
But if a CA is compromised or its private key is leaked, certificates may be fraudulent. We need more evidence that a given certificate is valid. Thus, we add other independent sources that can be used to validate the certificate.
|
||||
|
||||
One answer to this is **certificate transparency**. When a certificate is issued, it is recorded in a public log, and the log can be monitored by anyone. This makes certificate issuance transparent to the public. Read more from [certificate transparency (Wikipedia)](https://en.wikipedia.org/wiki/Certificate_Transparency).
|
||||
|
||||
[^1]: Image from [Wikipedia](https://en.wikipedia.org/wiki/File:Chain_Of_Trust.svg).
|
||||
[^2]: Can someone pretend to be a root CA by creating a fake certificate of a root CA?
|
||||
[^3]: Is this okay?
|
||||
[^4]: Is there a reason to keep the list of revoked certificates? The CA can just return *invalid* on non-existing serial numbers...?
|
||||
@@ -14,9 +14,9 @@ title: 1. One-Time Pad, Stream Ciphers and PRGs
|
||||
date: 2023-09-07
|
||||
github_title: 2023-09-07-otp-stream-cipher-prgs
|
||||
image:
|
||||
path: assets/img/posts/Lecture Notes/Modern Cryptography/mc-01-ss.png
|
||||
path: assets/img/posts/lecture-notes/modern-cryptography/mc-01-ss.png
|
||||
attachment:
|
||||
folder: assets/img/posts/Lecture Notes/Modern Cryptography
|
||||
folder: assets/img/posts/lecture-notes/modern-cryptography
|
||||
---
|
||||
|
||||
## Assumptions and Notations
|
||||
@@ -293,7 +293,7 @@ We can deduce that if a PRG is predictable, then it is insecure.
|
||||
|
||||
*Proof*. Let $\mathcal{A}$ be an efficient adversary (next bit predictor) that predicts $G$. Suppose that $i$ is the index chosen by $\mathcal{A}$. With $\mathcal{A}$, we construct a statistical test $\mathcal{B}$ such that $\mathrm{Adv}_\mathrm{PRG}[\mathcal{B}, G]$ is non-negligible.
|
||||
|
||||

|
||||

|
||||
|
||||
1. The challenger PRG will send a bit string $x$ to $\mathcal{B}$.
|
||||
- In experiment $0$, PRG gives pseudorandom string $G(k)$.
|
||||
@@ -319,7 +319,7 @@ The theorem implies that if next bit predictors cannot distinguish $G$ from true
|
||||
|
||||
To motivate the definition of semantic security, we consider a **security game framework** (attack game) between a **challenger** (e.g., the creator of some cryptographic scheme) and an **adversary** $\mathcal{A}$ (e.g., an attacker of the scheme).
|
||||
|
||||

|
||||

|
||||
|
||||
> **Definition.** Let $\mathcal{E} = (G, E, D)$ be a cipher defined over $(\mathcal{K}, \mathcal{M}, \mathcal{C})$. For a given adversary $\mathcal{A}$, we define two experiments $0$ and $1$. For $b \in \lbrace 0, 1 \rbrace$, define experiment $b$ as follows:
|
||||
>
|
||||
|
||||
@@ -14,9 +14,9 @@ title: 2. PRFs, PRPs and Block Ciphers
|
||||
date: 2023-09-12
|
||||
github_title: 2023-09-12-prfs-prps-block-ciphers
|
||||
image:
|
||||
path: assets/img/posts/Lecture Notes/Modern Cryptography/mc-02-block-cipher.png
|
||||
path: assets/img/posts/lecture-notes/modern-cryptography/mc-02-block-cipher.png
|
||||
attachment:
|
||||
folder: assets/img/posts/Lecture Notes/Modern Cryptography
|
||||
folder: assets/img/posts/lecture-notes/modern-cryptography
|
||||
---
|
||||
|
||||
## Pseudorandom Functions (PRF)
|
||||
@@ -119,7 +119,7 @@ This is a matter of *collisions* of $f(x_i)$, so we use the facts from the birth
|
||||
|
||||
A **block cipher** is really just another name for a PRP. Since a PRP $E$ is a keyed function, applying $E(k, x)$ is in fact encryption, and applying its inverse is decryption.
|
||||
|
||||

|
||||

|
||||
|
||||
Block ciphers commonly have the following form.
|
||||
- A key $k$ is chosen uniformly from $\left\lbrace 0, 1 \right\rbrace^s$.
|
||||
@@ -141,7 +141,7 @@ Block ciphers commonly have the following form.
|
||||
|
||||
Since block ciphers are PRPs, we have to build an invertible function. Suppose we are given **any** functions $F_1, \dots, F_d : \left\lbrace 0, 1 \right\rbrace^n \rightarrow \left\lbrace 0, 1 \right\rbrace^n$. Can we build an **invertible** function $F : \left\lbrace 0, 1 \right\rbrace^{2n} \rightarrow \left\lbrace 0, 1 \right\rbrace^{2n}$?
|
||||
|
||||

|
||||

|
||||
|
||||
It turns out the answer is yes. Given a $2n$-bit input, let $L_0$ and $R_0$ denote the left and right halves ($n$ bits each) of the input, respectively. Define
|
||||
|
||||
@@ -161,7 +161,7 @@ Note that we did not require $F_i$ to be invertible. We can build invertible fun
|
||||
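To make this concrete, here is a minimal sketch of a Feistel network using the standard round $L_i = R_{i-1}$, $R_i = L_{i-1} \oplus F_i(R_{i-1})$. The round functions below are arbitrary and deliberately non-invertible, the half size $n = 16$ is a toy choice, and the names `feistel_encrypt` and `feistel_decrypt` are ours.

```python
MASK = 0xFFFF  # n = 16 bits per half (toy size)

# Arbitrary round functions on n-bit values; none of them is invertible.
round_fns = [
    lambda x: (x * x + 7) & MASK,
    lambda x: 0x1234,              # a constant function
    lambda x: (x >> 3) & MASK,
]


def feistel_encrypt(left, right, fns):
    for f in fns:
        # L_i = R_{i-1},  R_i = L_{i-1} XOR F_i(R_{i-1})
        left, right = right, left ^ f(right)
    return left, right


def feistel_decrypt(left, right, fns):
    for f in reversed(fns):
        # R_{i-1} = L_i,  L_{i-1} = R_i XOR F_i(L_i)
        left, right = right ^ f(left), left
    return left, right


ct = feistel_encrypt(0xBEEF, 0x1234, round_fns)
print(feistel_decrypt(*ct, round_fns) == (0xBEEF, 0x1234))
```

Decryption only ever evaluates each $F_i$ in the forward direction, which is why invertibility of $F_i$ is never needed.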
|
||||
In DES, the function $F_i$ is the DES round function.
|
||||
|
||||

|
||||

|
||||
|
||||
The Feistel function takes $32$-bit data and divides it into eight $4$-bit chunks. Each chunk is expanded to $6$ bits using $E$. Now we have $48$ bits of data, which are XORed with the key for this round. Next, each $6$-bit block is compressed back to $4$ bits using an S-box. Finally, a permutation $P$ is applied at the end, resulting in $32$-bit data.
|
||||
|
||||
@@ -169,7 +169,7 @@ The Feistel function takes $32$ bit data and divides it into eight $4$ bit chunk
|
||||
|
||||
DES uses a $56$-bit key, from which $16$ round keys are derived. The diagram below shows that DES is a $16$-round Feistel network.
|
||||
|
||||

|
||||

|
||||
|
||||
The input goes through an initial and a final permutation, which are inverses of each other. These have no cryptographic significance and exist only for engineering reasons.
|
||||
|
||||
@@ -177,7 +177,7 @@ The input goes through initial/final permutation, which are inverses of each oth
|
||||
|
||||
DES is not secure, since its key space and block length are too small. Thankfully, we have a replacement called the **advanced encryption standard** (AES).
|
||||
|
||||

|
||||

|
||||
|
||||
- DES key only had $56$ bits, so DES was broken in the 1990s
|
||||
- NIST standardized AES in 2001, based on Rijndael cipher
|
||||
@@ -255,7 +255,7 @@ Then the key space has increased (exponentially). As for 2DES, the key space is
|
||||
|
||||
Unfortunately, 2DES is only about as secure as DES, because of an attack strategy called **meet in the middle**. The idea is that if $c = E(k_1, E(k_2, m))$, then $D(k_1, c) = E(k_2, m)$.
|
||||
|
||||

|
||||

|
||||
|
||||
Since we have the plaintext and the ciphertext, we first build a table of $(k_2, E(k_2, m))$ over $k_2 \in \mathcal{K}$ and sort by $E(k_2, m)$. Next, for each $k_1 \in \mathcal{K}$, we check whether $D(k_1, c)$ is in the table.
|
||||
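The attack can be tried out on a toy cipher. Everything in the sketch below is an assumption made for illustration: the cipher `E` is a made-up $8$-bit permutation (not DES), a hash table replaces the sorted table, and extra plaintext/ciphertext pairs are used to filter out false matches.

```python
def E(k, m):
    """Toy 8-bit block cipher with an 8-bit key (illustration only)."""
    return (((m ^ k) * 5) + k) % 256


def D(k, c):
    """Inverse of E; 205 is the inverse of 5 modulo 256."""
    return (((c - k) * 205) % 256) ^ k


def meet_in_the_middle(pairs):
    (m0, c0), rest = pairs[0], pairs[1:]
    # Forward table: middle value E(k2, m0) -> all k2 producing it.
    table = {}
    for k2 in range(256):
        table.setdefault(E(k2, m0), []).append(k2)
    # Backward pass: look up D(k1, c0) among the middle values.
    candidates = []
    for k1 in range(256):
        for k2 in table.get(D(k1, c0), []):
            # Filter false matches with the remaining known pairs.
            if all(E(k1, E(k2, m)) == c for m, c in rest):
                candidates.append((k1, k2))
    return candidates


k1, k2 = 0x3C, 0xA7
pairs = [(m, E(k1, E(k2, m))) for m in (0x00, 0x5A, 0xFF)]
found = meet_in_the_middle(pairs)
print((k1, k2) in found, len(found))
```

The two passes cost about $2 \cdot 2^8$ cipher evaluations plus the table, instead of the $2^{16}$ evaluations of a brute-force search over both keys.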
|
||||
|
||||
@@ -14,9 +14,9 @@ title: 4. Message Authentication Codes
|
||||
date: 2023-09-21
|
||||
github_title: 2023-09-21-macs
|
||||
image:
|
||||
path: assets/img/posts/Lecture Notes/Modern Cryptography/mc-04-mac-security.png
|
||||
path: assets/img/posts/lecture-notes/modern-cryptography/mc-04-mac-security.png
|
||||
attachment:
|
||||
folder: assets/img/posts/Lecture Notes/Modern Cryptography
|
||||
folder: assets/img/posts/lecture-notes/modern-cryptography
|
||||
---
|
||||
|
||||
Message authentication codes (MAC) were designed to provide message integrity. Bob receives a message from Alice and wants to know whether the message was modified during transmission. For MACs, the message itself does not have to be secret. For example, when we download a file, the file itself does not have to be protected, but we need a way to verify that it was not modified.
|
||||
@@ -27,7 +27,7 @@ On the other hand, MAC fixes data that is tampered in purpose. We will also requ
|
||||
|
||||
## Message Authentication Code
|
||||
|
||||

|
||||

|
||||
|
||||
> **Definition.** A **MAC** system $\Pi = (S, V)$ defined over $(\mathcal{K}, \mathcal{M}, \mathcal{T})$ is a pair of efficient algorithms $S$ and $V$ where $S$ is a **signing algorithm** and $V$ is a **verification algorithm**.
|
||||
>
|
||||
@@ -59,7 +59,7 @@ In the security definition of MACs, we allow the attacker to request tags for ar
|
||||
|
||||
For strong MACs, the attacker only has to change the tag for the attack to succeed.
|
||||
|
||||

|
||||

|
||||
|
||||
> **Definition.** Let $\Pi = (S, V)$ be a MAC system defined over $(\mathcal{K}, \mathcal{M}, \mathcal{T})$. Given an adversary $\mathcal{A}$, the security game goes as follows.
|
||||
>
|
||||
@@ -124,7 +124,7 @@ The above construction uses a PRF, so it is restricted to messages of fixed size
|
||||
|
||||
### CBC-MAC
|
||||
|
||||

|
||||

|
||||
|
||||
> **Definition.** For any message $m = (m_0, m_1, \dots, m_{l-1}) \in \left\lbrace 0, 1 \right\rbrace^{nl}$, let $F_k := F(k, \cdot)$.
|
||||
>
|
||||
@@ -212,7 +212,7 @@ Since CBC-MAC is vulnerable to extension attacks, we encrypt the last block agai
|
||||
|
||||
ECBC-MAC doesn't require us to know the message length in advance, but it is relatively expensive in practice, since a block cipher has to be initialized with a new key.
|
||||
|
||||

|
||||

|
||||
|
||||
> **Theorem.** Let $F : \mathcal{K} \times X \rightarrow X$ be a secure PRF. Then for any $l \geq 0$, $F_\mathrm{ECBC} : \mathcal{K}^2 \times X^{\leq l} \rightarrow X$ is a secure PRF.
|
||||
>
|
||||
|
||||
@@ -14,9 +14,9 @@ title: 5. CCA-Security and Authenticated Encryption
|
||||
date: 2023-09-26
|
||||
github_title: 2023-09-26-cca-security-authenticated-encryption
|
||||
image:
|
||||
path: assets/img/posts/Lecture Notes/Modern Cryptography/mc-05-ci.png
|
||||
path: assets/img/posts/lecture-notes/modern-cryptography/mc-05-ci.png
|
||||
attachment:
|
||||
folder: assets/img/posts/Lecture Notes/Modern Cryptography
|
||||
folder: assets/img/posts/lecture-notes/modern-cryptography
|
||||
---
|
||||
|
||||
Previously, we focused on semantic security against **passive adversaries**, who only eavesdrop on the ciphertext. But in the real world, there are **active adversaries** that interfere with the communication, or even modify the messages.
|
||||
@@ -84,7 +84,7 @@ The attacker shouldn't be able to create a new ciphertext that decrypts properly
|
||||
|
||||
In this case, we fix the decryption algorithm so that $D : \mathcal{K} \times \mathcal{C} \rightarrow \mathcal{M} \cup \left\lbrace \bot \right\rbrace$, where $\bot$ means that the ciphertext was rejected.
|
||||
|
||||

|
||||

|
||||
|
||||
> **Definition.** Let $\mathcal{E} = (E, D)$ be a cipher defined over $(\mathcal{K}, \mathcal{M}, \mathcal{C})$. Given an adversary $\mathcal{A}$, the security game goes as follows.
|
||||
>
|
||||
@@ -139,7 +139,7 @@ Most natural constructions of CCA secure schemes satisfy AE, so we don't need to
|
||||
|
||||
We want to combine a CPA secure scheme and a strongly secure MAC to get AE. Rather than focusing on the internal structure of the scheme, we want a general method to compose these two secure schemes so that we get an AE secure scheme. We will see 3 examples.
|
||||
|
||||

|
||||

|
||||
|
||||
### Encrypt-and-MAC (E&M)
|
||||
|
||||
|
||||
@@ -14,9 +14,9 @@ title: 6. Hash Functions
|
||||
date: 2023-09-28
|
||||
github_title: 2023-09-28-hash-functions
|
||||
image:
|
||||
path: assets/img/posts/Lecture Notes/Modern Cryptography/mc-06-merkle-damgard.png
|
||||
path: assets/img/posts/lecture-notes/modern-cryptography/mc-06-merkle-damgard.png
|
||||
attachment:
|
||||
folder: assets/img/posts/Lecture Notes/Modern Cryptography
|
||||
folder: assets/img/posts/lecture-notes/modern-cryptography
|
||||
---
|
||||
|
||||
Hash functions are functions that take an input and compress it to produce an output of fixed size, usually just called a *hash* or *digest*. A desired property of a hash function is **collision resistance**.
|
||||
@@ -107,7 +107,7 @@ Now we want to construct collision resistant hash functions that work for arbitr
|
||||
|
||||
The Merkle-Damgård transform gives us a way to extend the input domain of the hash function by iterating the function.
|
||||
|
||||

|
||||

|
||||
|
||||
> **Definition.** Let $h : \left\lbrace 0, 1 \right\rbrace^n \times \left\lbrace 0, 1 \right\rbrace^l \rightarrow \left\lbrace 0, 1 \right\rbrace^n$ be a hash function. The **Merkle-Damgård function derived from $h$** is a function $H$ that works as follows.
|
||||
>
|
||||
@@ -152,7 +152,7 @@ Now we only have to build a collision resistant compression function. We can bui
|
||||
|
||||
Number theoretic primitives will be shown after we learn some number theory.[^3] An example is shown in [collision resistance using DL problem (Modern Cryptography)](./2023-10-03-key-exchange.md#collision-resistance-based-on-dl-problem).
|
||||
|
||||

|
||||

|
||||
|
||||
> **Definition.** Let $\mathcal{E} = (E, D)$ be a block cipher over $(\mathcal{K}, X, X)$ where $X = \left\lbrace 0, 1 \right\rbrace^n$. The **Davies-Meyer compression function derived from $E$** maps inputs in $X \times \mathcal{K}$ to outputs in $X$, defined as follows.
|
||||
>
|
||||
@@ -217,7 +217,7 @@ This can be thought of as blocking the length extension attack from prepending t
|
||||
|
||||
### HMAC Definition
|
||||
|
||||

|
||||

|
||||
|
||||
This is a variant of the two-key nest, but the difference is that the keys $k_1', k_2'$ are not independent. Choose a key $k \leftarrow \mathcal{K}$, and set
|
||||
|
||||
|
||||
@@ -14,9 +14,9 @@ title: 7. Key Exchange
|
||||
date: 2023-10-03
|
||||
github_title: 2023-10-03-key-exchange
|
||||
image:
|
||||
path: assets/img/posts/Lecture Notes/Modern Cryptography/mc-07-dhke.png
|
||||
path: assets/img/posts/lecture-notes/modern-cryptography/mc-07-dhke.png
|
||||
attachment:
|
||||
folder: assets/img/posts/Lecture Notes/Modern Cryptography
|
||||
folder: assets/img/posts/lecture-notes/modern-cryptography
|
||||
---
|
||||
|
||||
In symmetric key encryption, we assumed that the two parties already share the same key. We will see how this can be done.
|
||||
@@ -75,7 +75,7 @@ $$
|
||||
|
||||
We assume that the descriptions of $p$, $q$ and $g$ are generated at the setup and shared by all parties. Now the actual protocol goes like this.
|
||||
|
||||

|
||||

|
||||
|
||||
> 1. Alice chooses $\alpha \leftarrow \mathbb{Z}_q$ and computes $g^\alpha$.
|
||||
> 2. Bob chooses $\beta \leftarrow \mathbb{Z}_q$ and computes $g^\beta$.
|
||||
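The exchange can be sketched with toy parameters. The values $p = 23$, $q = 11$, $g = 4$ are our own choice for illustration and are far too small to be secure. The steps after the first two (sending $g^\alpha$ and $g^\beta$ and deriving the shared value) are the standard completion of Diffie-Hellman, since the remaining steps are cut off in this excerpt.

```python
import secrets

# Toy parameters (assumption): g = 4 generates a subgroup of prime order q = 11 in Z_23^*.
p, q, g = 23, 11, 4

alpha = secrets.randbelow(q)   # Alice's secret exponent
beta = secrets.randbelow(q)    # Bob's secret exponent

u = pow(g, alpha, p)           # Alice computes g^alpha and sends it to Bob
v = pow(g, beta, p)            # Bob computes g^beta and sends it to Alice

k_alice = pow(v, alpha, p)     # (g^beta)^alpha
k_bob = pow(u, beta, p)        # (g^alpha)^beta

print(k_alice == k_bob)
```

Both parties end up with $g^{\alpha\beta}$, while an eavesdropper only sees $g^\alpha$ and $g^\beta$.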
@@ -190,7 +190,7 @@ Taking $\mathcal{O}(N)$ steps is impractical in the real world, due to many comm
|
||||
|
||||
We assumed that the adversary only eavesdrops, but if the adversary carries out active attacks, then DHKE is not enough. The major problem is the lack of **authentication**. Alice and Bob are exchanging keys, but neither can be sure that they are in fact communicating with the other. An attacker can intercept messages and impersonate Alice or Bob. This attack is called a **man in the middle attack**, and it works on any key exchange protocol that lacks authentication.
|
||||
|
||||

|
||||

|
||||
|
||||
The adversary will impersonate Bob when communicating with Alice, and will do the same for Bob by pretending to be Alice. The values of $\alpha, \beta$ that Alice and Bob chose are not leaked, but the adversary can decrypt anything in the middle and obtain the plaintext.
|
||||
|
||||
@@ -212,7 +212,7 @@ Before Diffie-Hellman, Merkle proposed an idea for secure key exchange protocol
|
||||
|
||||
The idea was to use *puzzles*, which are problems that can be solved with some effort.
|
||||
|
||||

|
||||

|
||||
|
||||
> Let $\mathcal{E} = (E, D)$ be a block cipher defined over $(\mathcal{K}, \mathcal{M})$.
|
||||
> 1. Alice chooses random pairs $(k_i, s_i) \leftarrow \mathcal{K} \times \mathcal{M}$ for $i = 1, \dots, L$.
|
||||
|
||||
@@ -14,9 +14,9 @@ title: 10. Digital Signatures
|
||||
date: 2023-10-26
|
||||
github_title: 2023-10-26-digital-signatures
|
||||
image:
|
||||
path: assets/img/posts/Lecture Notes/Modern Cryptography/mc-10-dsig-security.png
|
||||
path: assets/img/posts/lecture-notes/modern-cryptography/mc-10-dsig-security.png
|
||||
attachment:
|
||||
folder: assets/img/posts/Lecture Notes/Modern Cryptography
|
||||
folder: assets/img/posts/lecture-notes/modern-cryptography
|
||||
---
|
||||
|
||||
## Digital Signatures
|
||||
@@ -57,7 +57,7 @@ $$
|
||||
|
||||
The definition is similar to the [secure MAC](./2023-09-21-macs.md#secure-mac-unforgeability). The adversary can perform a **chosen message attack**, but cannot create an **existential forgery**.
|
||||
|
||||

|
||||

|
||||
|
||||
> **Definition.** Let $\mc{S} = (G, S, V)$ be a signature scheme defined over $(\mc{M}, \Sigma)$. Given an adversary $\mc{A}$, the game goes as follows.
|
||||
>
|
||||
@@ -184,7 +184,7 @@ This scheme is originally from the **Schnorr identification protocol**.
|
||||
|
||||
Let $G = \left\langle g \right\rangle$ be a cyclic group of prime order $q$. We consider an interaction between two parties, a prover $P$ and a verifier $V$. The prover has a secret $\alpha \in \Z_q$ and the verification key is $u = g^\alpha$. **$P$ wants to convince $V$ that he knows $\alpha$, but does not want to reveal $\alpha$**.
|
||||
|
||||

|
||||

|
||||
|
||||
The protocol $\mc{I}_\rm{sch} = (G, P, V)$ works as follows.
|
||||
|
||||
@@ -239,7 +239,7 @@ Schnorr's scheme was protected by a patent, so NIST opted for a ad-hoc signature
|
||||
|
||||
How would you trust public keys? We introduce **digital certificates** for this.
|
||||
|
||||
Read in [public key infrastructure (Internet Security)](../../Lecture%20Notes/Internet%20Security/2023-10-16-pki.md).
|
||||
Read in [public key infrastructure (Internet Security)](../internet-security/2023-10-16-pki.md).
|
||||
|
||||
[^1]: A Graduate Course in Applied Cryptography
|
||||
[^2]: By using the [Fiat-Shamir transform](./2023-11-07-sigma-protocols.md#the-fiat-shamir-transform).
|
||||
|
||||
@@ -14,9 +14,9 @@ title: 12. Zero-Knowledge Proof (Introduction)
|
||||
date: 2023-11-02
|
||||
github_title: 2023-11-02-zkp-intro
|
||||
image:
|
||||
path: assets/img/posts/Lecture Notes/Modern Cryptography/mc-12-id-protocol.png
|
||||
path: assets/img/posts/lecture-notes/modern-cryptography/mc-12-id-protocol.png
|
||||
attachment:
|
||||
folder: assets/img/posts/Lecture Notes/Modern Cryptography
|
||||
folder: assets/img/posts/lecture-notes/modern-cryptography
|
||||
---
|
||||
|
||||
- In the 1980s, the notion of *zero knowledge* was proposed by Shafi Goldwasser, Silvio Micali and Charles Rackoff.
|
||||
@@ -28,7 +28,7 @@ attachment:
|
||||
|
||||
## Identification Protocol
|
||||
|
||||

|
||||

|
||||
|
||||
> **Definition.** An **identification protocol** is a triple of algorithms $\mc{I} = (G, P, V)$ satisfying the following.
|
||||
>
|
||||
|
||||
@@ -14,9 +14,9 @@ title: 13. Sigma Protocols
|
||||
date: 2023-11-07
|
||||
github_title: 2023-11-07-sigma-protocols
|
||||
image:
|
||||
path: assets/img/posts/Lecture Notes/Modern Cryptography/mc-13-sigma-protocol.png
|
||||
path: assets/img/posts/lecture-notes/modern-cryptography/mc-13-sigma-protocol.png
|
||||
attachment:
|
||||
folder: assets/img/posts/Lecture Notes/Modern Cryptography
|
||||
folder: assets/img/posts/lecture-notes/modern-cryptography
|
||||
---
|
||||
|
||||
The previous [3-coloring example](./2023-11-02-zkp-intro.md#example-3-coloring) certainly works as a zero knowledge proof, but it is quite slow and requires a lot of interaction. There are more efficient protocols for interactive proofs; here we study sigma protocols.
|
||||
@@ -27,7 +27,7 @@ The previous [3-coloring example](./2023-11-02-zkp-intro.md#example-3-coloring)
|
||||
|
||||
> **Definition.** An **effective relation** is a binary relation $\mc{R} \subset \mc{X} \times \mc{Y}$, where $\mc{X}$, $\mc{Y}$, $\mc{R}$ are efficiently recognizable finite sets. Elements of $\mc{Y}$ are called **statements**. If $(x, y) \in \mc{R}$, then $x$ is called a **witness for** $y$.
|
||||
|
||||

|
||||

|
||||
|
||||
> **Definition.** Let $\mc{R} \subset \mc{X} \times \mc{Y}$ be an effective relation. A **sigma protocol** for $\mc{R}$ is a pair of algorithms $(P, V)$ satisfying the following.
|
||||
>
|
||||
@@ -107,7 +107,7 @@ Also note that **the simulator is free to generate the messages in any convenien
|
||||
|
||||
The Schnorr identification protocol is actually a sigma protocol. Refer to [Schnorr identification protocol (Modern Cryptography)](./2023-10-26-digital-signatures.md#the-schnorr-identification-protocol) for the full description.
|
||||
|
||||

|
||||

|
||||
|
||||
> The pair $(P, V)$ is a sigma protocol for the relation $\mc{R} \subset \mc{X} \times \mc{Y}$ where
|
||||
>
|
||||
@@ -165,7 +165,7 @@ $$
|
||||
|
||||
goes as follows.
|
||||
|
||||

|
||||

|
||||
|
||||
> 1. $P$ computes random $\alpha_t, \beta_t \la \bb{Z}_q$ and sends commitment $u_t \la g^{\alpha_t}h^{\beta_t}$ to $V$.
|
||||
> 2. $V$ computes challenge $c \la \mc{C}$ and sends it to $P$.
|
||||
@@ -192,7 +192,7 @@ $$
|
||||
|
||||
goes as follows.
|
||||
|
||||

|
||||

|
||||
|
||||
> 1. $P$ computes random $\beta_t \la \bb{Z}_q$ and sends commitment $v_t \la g^{\beta_t}$, $w_t \la u^{\beta_t}$ to $V$.
|
||||
> 2. $V$ computes challenge $c \la \mc{C}$ and sends it to $P$.
|
||||
@@ -223,7 +223,7 @@ $$
|
||||
|
||||
goes as follows.
|
||||
|
||||

|
||||

|
||||
|
||||
> 1. $P$ computes random $x_t \la \bb{Z}_n^{\ast}$ and sends commitment $y_t \la x_t^e$ to $V$.
|
||||
> 2. $V$ computes challenge $c \la \mc{C}$ and sends it to $P$.
|
||||
|
||||
@@ -14,9 +14,9 @@ title: 16. The GMW Protocol
|
||||
date: 2023-11-16
|
||||
github_title: 2023-11-16-gmw-protocol
|
||||
image:
|
||||
path: assets/img/posts/Lecture Notes/Modern Cryptography/mc-16-beaver-triple.png
|
||||
path: assets/img/posts/lecture-notes/modern-cryptography/mc-16-beaver-triple.png
|
||||
attachment:
|
||||
folder: assets/img/posts/Lecture Notes/Modern Cryptography
|
||||
folder: assets/img/posts/lecture-notes/modern-cryptography
|
||||
---
|
||||
|
||||
There are two types of MPC protocols, **generic** and **specific**. Generic protocols can compute arbitrary functions. [Garbled circuits](./2023-11-14-garbled-circuits.md#garbled-circuits) are a generic protocol, since they can be used to compute any boolean circuit. In contrast, the [summation protocol](./2023-11-09-secure-mpc.md#example-secure-summation) is a specific protocol that can only be used to compute a specific function. Note that generic protocols are not necessarily better, since specific protocols are often much more efficient.
|
||||
@@ -148,7 +148,7 @@ Indeed, $z_1, z_2$ are shares of $z$.[^2] See also Exercise 23.5.[^3]
|
||||
|
||||
Now, in the actual computation of AND gates, proceed as follows.
|
||||
|
||||

|
||||

|
||||
|
||||
> Each $P_i$ has a share of inputs $a_i, b_i$ and a Beaver triple $(x_i, y_i, z_i)$.
|
||||
> 1. Each $P_i$ computes $u_i = a_i + x_i$, $v_i = b_i + y_i$.
|
||||
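A two-party version of this AND computation over $\mathbb{Z}_2$ can be sketched as follows. Only step 1 appears in this excerpt; the remaining steps in the code are the standard Beaver-triple completion (open $u$ and $v$, then combine locally) and are an assumption about the rest of the protocol. The trusted dealer and all function names are hypothetical.

```python
import secrets


def share(bit):
    """Split a bit into two XOR shares."""
    r = secrets.randbits(1)
    return r, bit ^ r


def deal_triple():
    """Trusted dealer (assumption): shares of random x, y and z = x AND y."""
    x, y = secrets.randbits(1), secrets.randbits(1)
    return share(x), share(y), share(x & y)


def beaver_and(a_sh, b_sh, triple):
    x_sh, y_sh, z_sh = triple
    # Step 1: each party masks its input shares with its triple shares.
    u_sh = [a ^ x for a, x in zip(a_sh, x_sh)]
    v_sh = [b ^ y for b, y in zip(b_sh, y_sh)]
    # The parties open u = a + x and v = b + y; these reveal nothing about a, b.
    u = u_sh[0] ^ u_sh[1]
    v = v_sh[0] ^ v_sh[1]
    # Local computation: ab = uv + u*y + v*x + z over Z_2.
    c_sh = [(u & y) ^ (v & x) ^ z for x, y, z in zip(x_sh, y_sh, z_sh)]
    c_sh[0] ^= u & v  # the public term uv is added by one party only
    return tuple(c_sh)


a, b = 1, 1
c_sh = beaver_and(share(a), share(b), deal_triple())
print(c_sh[0] ^ c_sh[1] == (a & b))
```

The identity behind the last step is $(u + x)(v + y) = uv + uy + vx + xy$ with $a = u + x$, $b = v + y$ and $z = xy$ in $\mathbb{Z}_2$.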
|
||||