feat: breaking change (unstable) (#198)

* [PUBLISHER] upload files #175

* PUSH NOTE : 3. Symmetric Key Encryption.md

* PUSH NOTE : 03. Symmetric Key Cryptography (2).md

* DELETE FILE : _posts/lecture-notes/modern-cryptography/2023-09-18-symmetric-key-cryptography-2.md

* DELETE FILE : _posts/lecture-notes/modern-cryptography/2023-09-19-symmetric-key-encryption.md

* [PUBLISHER] upload files #177

* PUSH NOTE : 3. Symmetric Key Encryption.md

* PUSH NOTE : 03. Symmetric Key Cryptography (2).md

* DELETE FILE : _posts/lecture-notes/modern-cryptography/2023-09-18-symmetric-key-cryptography-2.md

* DELETE FILE : _posts/lecture-notes/modern-cryptography/2023-09-19-symmetric-key-encryptio.md

* [PUBLISHER] upload files #178

* PUSH NOTE : 3. Symmetric Key Encryption.md

* PUSH NOTE : 03. Symmetric Key Cryptography (2).md

* DELETE FILE : _posts/lecture-notes/modern-cryptography/2023-09-18-symmetric-key-cryptography-2.md

* [PUBLISHER] upload files #179

* PUSH NOTE : 3. Symmetric Key Encryption.md

* PUSH NOTE : 03. Symmetric Key Cryptography (2).md

* DELETE FILE : _posts/lecture-notes/modern-cryptography/2023-09-18-symmetric-key-cryptography-2.md

* [PUBLISHER] upload files #180

* PUSH NOTE : 3. Symmetric Key Encryption.md

* PUSH NOTE : 03. Symmetric Key Cryptography (2).md

* DELETE FILE : _posts/lecture-notes/modern-cryptography/2023-09-18-symmetric-key-cryptography-2.md

* [PUBLISHER] upload files #181

* PUSH NOTE : 3. Symmetric Key Encryption.md

* PUSH NOTE : 03. Symmetric Key Cryptography (2).md

* DELETE FILE : _posts/lecture-notes/modern-cryptography/2023-09-18-symmetric-key-cryptography-2.md

* [PUBLISHER] upload files #182

* PUSH NOTE : 3. Symmetric Key Encryption.md

* PUSH NOTE : 03. Symmetric Key Cryptography (2).md

* [PUBLISHER] upload files #183

* PUSH NOTE : 3. Symmetric Key Encryption.md

* PUSH NOTE : 03. Symmetric Key Cryptography (2).md

* DELETE FILE : _posts/lecture-notes/modern-cryptography/2023-09-18-symmetric-key-cryptography-2.md

* [PUBLISHER] upload files #184

* PUSH NOTE : 3. Symmetric Key Encryption.md

* PUSH NOTE : 03. Symmetric Key Cryptography (2).md

* DELETE FILE : _posts/lecture-notes/modern-cryptography/2023-09-18-symmetric-key-cryptography-2.md

* [PUBLISHER] upload files #185

* PUSH NOTE : 3. Symmetric Key Encryption.md

* PUSH NOTE : 03. Symmetric Key Cryptography (2).md

* DELETE FILE : _posts/lecture-notes/modern-cryptography/2023-09-18-symmetric-key-cryptography-2.md

* [PUBLISHER] upload files #186

* PUSH NOTE : 3. Symmetric Key Encryption.md

* PUSH NOTE : 03. Symmetric Key Cryptography (2).md

* [PUBLISHER] upload files #187

* PUSH NOTE : 3. Symmetric Key Encryption.md

* PUSH NOTE : 14. Secure Multiparty Computation.md

* DELETE FILE : _posts/Lecture Notes/Modern Cryptography/2023-09-19-symmetric-key-encryption.md

* DELETE FILE : _posts/lecture-notes/modern-cryptography/2023-09-18-symmetric-key-cryptography-2.md

* [PUBLISHER] upload files #188

* PUSH NOTE : 3. Symmetric Key Encryption.md

* PUSH NOTE : 14. Secure Multiparty Computation.md

* DELETE FILE : _posts/Lecture Notes/Modern Cryptography/2023-09-19-symmetric-key-encryption.md

* chore: remove files

* [PUBLISHER] upload files #197

* PUSH NOTE : 수학 공부에 대한 고찰.md

* PUSH NOTE : 09. Lp Functions.md

* PUSH ATTACHMENT : mt-09.png

* PUSH NOTE : 08. Comparison with the Riemann Integral.md

* PUSH ATTACHMENT : mt-08.png

* PUSH NOTE : 04. Measurable Functions.md

* PUSH ATTACHMENT : mt-04.png

* PUSH NOTE : 06. Convergence Theorems.md

* PUSH ATTACHMENT : mt-06.png

* PUSH NOTE : 07. Dominated Convergence Theorem.md

* PUSH ATTACHMENT : mt-07.png

* PUSH NOTE : 05. Lebesgue Integration.md

* PUSH ATTACHMENT : mt-05.png

* PUSH NOTE : 03. Measure Spaces.md

* PUSH ATTACHMENT : mt-03.png

* PUSH NOTE : 02. Construction of Measure.md

* PUSH ATTACHMENT : mt-02.png

* PUSH NOTE : 01. Algebra of Sets and Set Functions.md

* PUSH ATTACHMENT : mt-01.png

* PUSH NOTE : Rules of Inference with Coq.md

* PUSH NOTE : 블로그 이주 이야기.md

* PUSH NOTE : Secure IAM on AWS with Multi-Account Strategy.md

* PUSH ATTACHMENT : separation-by-product.png

* PUSH NOTE : You and Your Research, Richard Hamming.md

* PUSH NOTE : 10. Digital Signatures.md

* PUSH ATTACHMENT : mc-10-dsig-security.png

* PUSH ATTACHMENT : mc-10-schnorr-identification.png

* PUSH NOTE : 9. Public Key Encryption.md

* PUSH ATTACHMENT : mc-09-ss-pke.png

* PUSH NOTE : 8. Number Theory.md

* PUSH NOTE : 7. Key Exchange.md

* PUSH ATTACHMENT : mc-07-dhke.png

* PUSH ATTACHMENT : mc-07-dhke-mitm.png

* PUSH ATTACHMENT : mc-07-merkle-puzzles.png

* PUSH NOTE : 6. Hash Functions.md

* PUSH ATTACHMENT : mc-06-merkle-damgard.png

* PUSH ATTACHMENT : mc-06-davies-meyer.png

* PUSH ATTACHMENT : mc-06-hmac.png

* PUSH NOTE : 5. CCA-Security and Authenticated Encryption.md

* PUSH ATTACHMENT : mc-05-ci.png

* PUSH ATTACHMENT : mc-05-etm-mte.png

* PUSH NOTE : 1. OTP, Stream Ciphers and PRGs.md

* PUSH ATTACHMENT : mc-01-prg-game.png

* PUSH ATTACHMENT : mc-01-ss.png

* PUSH NOTE : 4. Message Authentication Codes.md

* PUSH ATTACHMENT : mc-04-mac.png

* PUSH ATTACHMENT : mc-04-mac-security.png

* PUSH ATTACHMENT : mc-04-cbc-mac.png

* PUSH ATTACHMENT : mc-04-ecbc-mac.png

* PUSH NOTE : 3. Symmetric Key Encryption.md

* PUSH ATTACHMENT : is-03-ecb-encryption.png

* PUSH ATTACHMENT : is-03-cbc-encryption.png

* PUSH ATTACHMENT : is-03-ctr-encryption.png

* PUSH NOTE : 2. PRFs, PRPs and Block Ciphers.md

* PUSH ATTACHMENT : mc-02-block-cipher.png

* PUSH ATTACHMENT : mc-02-feistel-network.png

* PUSH ATTACHMENT : mc-02-des-round.png

* PUSH ATTACHMENT : mc-02-DES.png

* PUSH ATTACHMENT : mc-02-aes-128.png

* PUSH ATTACHMENT : mc-02-2des-mitm.png

* PUSH NOTE : 18. Bootstrapping & CKKS.md

* PUSH NOTE : 17. BGV Scheme.md

* PUSH NOTE : 16. The GMW Protocol.md

* PUSH ATTACHMENT : mc-16-beaver-triple.png

* PUSH NOTE : 15. Garbled Circuits.md

* PUSH NOTE : 14. Secure Multiparty Computation.md

* PUSH NOTE : 13. Sigma Protocols.md

* PUSH ATTACHMENT : mc-13-sigma-protocol.png

* PUSH ATTACHMENT : mc-13-okamoto.png

* PUSH ATTACHMENT : mc-13-chaum-pedersen.png

* PUSH ATTACHMENT : mc-13-gq-protocol.png

* PUSH NOTE : 12. Zero-Knowledge Proofs (Introduction).md

* PUSH ATTACHMENT : mc-12-id-protocol.png

* PUSH NOTE : 11. Advanced Topics.md

* PUSH NOTE : 0. Introduction.md

* PUSH NOTE : 02. Symmetric Key Cryptography (1).md

* PUSH NOTE : 09. Transport Layer Security.md

* PUSH ATTACHMENT : is-09-tls-handshake.png

* PUSH NOTE : 08. Public Key Infrastructure.md

* PUSH ATTACHMENT : is-08-certificate-validation.png

* PUSH NOTE : 07. Public Key Cryptography.md

* PUSH NOTE : 06. RSA and ElGamal Encryption.md

* PUSH NOTE : 05. Modular Arithmetic (2).md

* PUSH NOTE : 03. Symmetric Key Cryptography (2).md

* PUSH ATTACHMENT : is-03-feistel-function.png

* PUSH ATTACHMENT : is-03-cfb-encryption.png

* PUSH ATTACHMENT : is-03-ofb-encryption.png

* PUSH NOTE : 04. Modular Arithmetic (1).md

* PUSH NOTE : 01. Security Introduction.md

* PUSH ATTACHMENT : is-01-cryptosystem.png

* PUSH NOTE : Search Time in Hash Tables.md

* PUSH NOTE : 랜덤 PS일지 (1).md

* chore: rearrange articles

* feat: fix paths

* feat: fix all broken links

* feat: title font to palatino
This commit is contained in:
2024-11-13 14:28:45 +09:00
committed by GitHub
parent c9f7af5f3d
commit 23aeb29ad8
78 changed files with 2105 additions and 2030 deletions
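The bulk of the diff below is a mechanical path-casing fix (`feat: fix paths`): asset directories such as `assets/img/posts/Development/Kubernetes` become `assets/img/posts/development/kubernetes` in front matter and post bodies. A minimal sketch of such a rewrite, assuming a hypothetical helper (the function name and regex are illustrative, not the publisher's actual code):

```python
import re

# Match an asset path after the (optionally slash-prefixed) common prefix,
# stopping at whitespace, quotes, ')' or ']' so markdown links stay intact.
ASSET_RE = re.compile(r"(/?assets/img/posts/)([^\s\")\]]+)")

def lowercase_asset_dirs(text: str) -> str:
    """Lowercase directory components of asset paths; keep filenames as-is."""
    def repl(m: re.Match) -> str:
        prefix, rest = m.groups()
        # Components containing a dot are treated as filenames and left alone.
        fixed = "/".join(p if "." in p else p.lower() for p in rest.split("/"))
        return prefix + fixed
    return ASSET_RE.sub(repl, text)

if __name__ == "__main__":
    print(lowercase_asset_dirs(
        "path: /assets/img/posts/Development/Kubernetes/k8s-01.jpeg"
    ))
```

Run over every file under `_posts/`, this produces exactly the kind of one-line substitutions seen in the hunks below.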

View File

@@ -3,18 +3,19 @@ share: true
 pin: true
 categories:
 - Development
+path: _posts/development
 tags:
 - AWS
 - dev
 title: Secure IAM on AWS with Multi-Account Strategy
 date: 2024-02-26
 github_title: 2024-02-26-secure-iam
-image: /assets/img/posts/Development/separation-by-product.png
+image: /assets/img/posts/development/separation-by-product.png
 attachment:
-  folder: assets/img/posts/Development
+  folder: assets/img/posts/development
 ---
-![separation-by-product.png](../../assets/img/posts/Development/separation-by-product.png)
+![separation-by-product.png](../../assets/img/posts/development/separation-by-product.png)
 2024\. 2. B.S. Graduation Paper, Received Best Paper Award!

View File

@@ -2,17 +2,18 @@
 share: true
 toc: true
 categories: [Development, Kubernetes]
+path: "_posts/development/kubernetes"
 tags: [kubernetes, sre, devops, docker]
 title: "01. Introducing Kubernetes"
 date: "2021-02-28"
 github_title: "2021-02-28-01-introducing-k8s"
 image:
-  path: /assets/img/posts/Development/Kubernetes/k8s-01.jpeg
+  path: /assets/img/posts/development/kubernetes/k8s-01.jpeg
 attachment:
-  folder: assets/img/posts/Development/Kubernetes
+  folder: assets/img/posts/development/kubernetes
 ---
-![k8s-01.jpeg](/assets/img/posts/Development/Kubernetes/k8s-01.jpeg) _Overview of Kubernetes Architecture (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-1)_
+![k8s-01.jpeg](/assets/img/posts/development/kubernetes/k8s-01.jpeg) _Overview of Kubernetes Architecture (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-1)_
 Software used to ship as one big monolith, but these days it is being split into small, independently running **microservices**. Because they operate independently, they can be developed, deployed, and scaled separately, which suits the rapidly changing requirements of modern software.
@@ -81,7 +82,7 @@ attachment:
 Because a VM runs its own operating system, it needs system processes, which consume additional resources. (Imagine how many Windows VMs you could run if each one gets 4 GB of RAM...)
 A container, by contrast, is a process running on the host machine, so no extra system processes are needed and only the resources the application requires are consumed. Containers are far lighter than VMs, so a single machine can run many containers.
 With VMs, the hypervisor divides the hardware into virtual resources that the OS inside each VM can use. An application running inside a VM makes system calls to the VM's OS, and the VM's kernel executes instructions on the host CPU through the hypervisor.
@@ -165,7 +166,7 @@ Since a VM has its own OS, using a VM means
 - **Scheduler**: assigns services to worker nodes when applications are deployed.
 - **Controller Manager**: handles cluster-level functions such as replicating components, managing the number of worker nodes, and handling node failures.
 - **etcd**: a persistent distributed data store that holds the cluster configuration.

View File

@@ -2,17 +2,18 @@
 share: true
 toc: true
 categories: [Development, Kubernetes]
+path: "_posts/development/kubernetes"
 tags: [kubernetes, sre, devops, docker]
 title: "02. First Steps with Docker and Kubernetes"
 date: "2021-03-07"
 github_title: "2021-03-07-02-first-steps"
 image:
-  path: /assets/img/posts/Development/Kubernetes/k8s-02.jpeg
+  path: /assets/img/posts/development/kubernetes/k8s-02.jpeg
 attachment:
-  folder: assets/img/posts/Development/Kubernetes
+  folder: assets/img/posts/development/kubernetes
 ---
-![k8s-02.jpeg](/assets/img/posts/Development/Kubernetes/k8s-02.jpeg) _Running a container image in Kubernetes (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-2)_
+![k8s-02.jpeg](/assets/img/posts/development/kubernetes/k8s-02.jpeg) _Running a container image in Kubernetes (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-2)_
 Let's deploy a simple application with Docker and Kubernetes!

View File

@@ -2,17 +2,18 @@
 share: true
 toc: true
 categories: [Development, Kubernetes]
+path: "_posts/development/kubernetes"
 tags: [kubernetes, sre, devops]
 title: "03. Pods: Running Containers in Kubernetes"
 date: "2021-03-17"
 github_title: "2021-03-17-03-pods"
 image:
-  path: /assets/img/posts/Development/Kubernetes/k8s-03.jpeg
+  path: /assets/img/posts/development/kubernetes/k8s-03.jpeg
 attachment:
-  folder: assets/img/posts/Development/Kubernetes
+  folder: assets/img/posts/development/kubernetes
 ---
-![k8s-03.jpeg](/assets/img/posts/Development/Kubernetes/k8s-03.jpeg) _A container shouldn't run multiple processes. (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-3)_
+![k8s-03.jpeg](/assets/img/posts/development/kubernetes/k8s-03.jpeg) _A container shouldn't run multiple processes. (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-3)_
 This chapter surveys the various Kubernetes objects (resources), starting with the most fundamental one, the Pod. Everything else either manages pods, exposes pods, or is used by pods.

View File

@@ -2,17 +2,18 @@
 share: true
 toc: true
 categories: [Development, Kubernetes]
+path: "_posts/development/kubernetes"
 tags: [kubernetes, sre, devops]
 title: "04. Replication and Other Controllers: Deploying Managed Pods"
 date: "2021-03-21"
 github_title: "2021-03-21-04-replication-and-controllers"
 image:
-  path: /assets/img/posts/Development/Kubernetes/k8s-04.jpeg
+  path: /assets/img/posts/development/kubernetes/k8s-04.jpeg
 attachment:
-  folder: assets/img/posts/Development/Kubernetes
+  folder: assets/img/posts/development/kubernetes
 ---
-![k8s-04.jpeg](/assets/img/posts/Development/Kubernetes/k8s-04.jpeg) _ReplicationController recreating pods. (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-4)_
+![k8s-04.jpeg](/assets/img/posts/development/kubernetes/k8s-04.jpeg) _ReplicationController recreating pods. (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-4)_
 Chapter 3 covered managing pods directly, but in practice we want pod management to be automatic. ReplicationControllers and Deployments serve that purpose.

View File

@@ -2,17 +2,18 @@
 share: true
 toc: true
 categories: [Development, Kubernetes]
+path: "_posts/development/kubernetes"
 tags: [kubernetes, sre, devops]
 title: "05. Services: Enabling Clients to Discover and Talk to Pods"
 date: "2021-04-07"
 github_title: "2021-04-07-05-services"
 image:
-  path: /assets/img/posts/Development/Kubernetes/k8s-05.jpeg
+  path: /assets/img/posts/development/kubernetes/k8s-05.jpeg
 attachment:
-  folder: assets/img/posts/Development/Kubernetes
+  folder: assets/img/posts/development/kubernetes
 ---
-![k8s-05.jpeg](/assets/img/posts/Development/Kubernetes/k8s-05.jpeg) _Using `kubectl exec` to test out a connection to the service by running curl in one of the pods. (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-5)_
+![k8s-05.jpeg](/assets/img/posts/development/kubernetes/k8s-05.jpeg) _Using `kubectl exec` to test out a connection to the service by running curl in one of the pods. (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-5)_
 Many apps serve incoming requests, and sending a request requires an IP address. With Kubernetes that means knowing pod IP addresses, but pods are highly dynamic, so we need a way to discover their addresses.

View File

@@ -2,17 +2,18 @@
 share: true
 toc: true
 categories: [Development, Kubernetes]
+path: "_posts/development/kubernetes"
 tags: [kubernetes, sre, devops]
 title: "06. Volumes: Attaching Disk Storage to Containers"
 date: "2021-04-07"
 github_title: "2021-04-07-06-volumes"
 image:
-  path: /assets/img/posts/Development/Kubernetes/k8s-06.jpeg
+  path: /assets/img/posts/development/kubernetes/k8s-06.jpeg
 attachment:
-  folder: assets/img/posts/Development/Kubernetes
+  folder: assets/img/posts/development/kubernetes
 ---
-![k8s-06.jpeg](/assets/img/posts/Development/Kubernetes/k8s-06.jpeg) _The complete picture of dynamic provisioning of PersistentVolumes. (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-6)_
+![k8s-06.jpeg](/assets/img/posts/development/kubernetes/k8s-06.jpeg) _The complete picture of dynamic provisioning of PersistentVolumes. (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-6)_
 When a container restarts, all of its previous work can be lost, so a volume provides storage that preserves a container's work and is shared with the other containers in the same pod.

View File

@@ -2,17 +2,18 @@
 share: true
 toc: true
 categories: [Development, Kubernetes]
+path: "_posts/development/kubernetes"
 tags: [kubernetes, sre, devops]
 title: "07. ConfigMaps and Secrets: Configuring Applications"
 date: "2021-04-18"
 github_title: "2021-04-18-07-configmaps-and-secrets"
 image:
-  path: /assets/img/posts/Development/Kubernetes/k8s-07.jpeg
+  path: /assets/img/posts/development/kubernetes/k8s-07.jpeg
 attachment:
-  folder: assets/img/posts/Development/Kubernetes
+  folder: assets/img/posts/development/kubernetes
 ---
-![k8s-07.jpeg](/assets/img/posts/Development/Kubernetes/k8s-07.jpeg) _Combining a ConfigMap and a Secret to run your fortune-https pod (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-7)_
+![k8s-07.jpeg](/assets/img/posts/development/kubernetes/k8s-07.jpeg) _Combining a ConfigMap and a Secret to run your fortune-https pod (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-7)_
 Almost every app needs configuration. Settings may differ between development and production servers (for example, which DB server to connect to), and an app may need an access key for a cloud service or an encryption key for its data. Baking these values into the Docker image itself is insecure, and every settings change would require rebuilding the image.

View File

@@ -2,17 +2,18 @@
 share: true
 toc: true
 categories: [Development, Kubernetes]
+path: "_posts/development/kubernetes"
 tags: [kubernetes, sre, devops]
 title: "08. Accessing Pod Metadata and Other Resources from Applications"
 date: "2021-04-18"
 github_title: "2021-04-18-08-accessing-pod-metadata"
 image:
-  path: /assets/img/posts/Development/Kubernetes/k8s-08.jpeg
+  path: /assets/img/posts/development/kubernetes/k8s-08.jpeg
 attachment:
-  folder: assets/img/posts/Development/Kubernetes
+  folder: assets/img/posts/development/kubernetes
 ---
-![k8s-08.jpeg](/assets/img/posts/Development/Kubernetes/k8s-08.jpeg) _Using the files from the default-token Secret to talk to the API server (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-8)_
+![k8s-08.jpeg](/assets/img/posts/development/kubernetes/k8s-08.jpeg) _Using the files from the default-token Secret to talk to the API server (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-8)_
 ### Key points
@@ -138,7 +139,7 @@ spec:
   volumes:
   - name: downward
     downwardAPI:
       items: # the requested values are written to files at the configured paths
      - path: "podName"
         fieldRef:
           fieldPath: metadata.name
@@ -217,7 +218,7 @@ The API server uses HTTPS, so requests cannot be sent directly without authentication
 `kubectl proxy` accepts HTTP requests locally and forwards them to the Kubernetes API server, handling authentication for you. It also verifies the server's certificate on every request, blocking MITM attacks and ensuring you talk to the real server.
 ```
 $ kubectl proxy
 Starting to serve on 127.0.0.1:8001
 ```
@@ -380,13 +381,13 @@ root@curl:/# curl https://kubernetes -k
   "kind": "Status",
   "apiVersion": "v1",
   "metadata": {
   },
   "status": "Failure",
   "message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
   "reason": "Forbidden",
   "details": {
   },
   "code": 403
 }
@@ -404,13 +405,13 @@ root@curl:/# curl --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
   "kind": "Status",
   "apiVersion": "v1",
   "metadata": {
   },
   "status": "Failure",
   "message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
   "reason": "Forbidden",
   "details": {
   },
   "code": 403
 }
@@ -434,13 +435,13 @@ $ curl -H "Authorization: Bearer $TOKEN" https://kubernetes
 ```
 > Role-based access control ([RBAC](https://kubernetes.io/docs/reference/access-authn-authz/rbac/)) makes the `curl` above fail on clusters where it is enabled. Just for testing, you can temporarily run
 >
 > ```bash
 > kubectl create clusterrolebinding permissive-binding \
 > --clusterrole=cluster-admin \
 > --group=system:serviceaccounts
 > ```
 >
 > to give admin permissions to all serviceaccounts haha;
 #### Getting the current pod's namespace

View File

@@ -2,17 +2,18 @@
 share: true
 toc: true
 categories: [Development, Kubernetes]
+path: "_posts/development/kubernetes"
 tags: [kubernetes, sre, devops]
 title: "09. Deployments: Updating Applications Declaratively"
 date: "2021-04-30"
 github_title: "2021-04-30-09-deployments"
 image:
-  path: /assets/img/posts/Development/Kubernetes/k8s-09.jpeg
+  path: /assets/img/posts/development/kubernetes/k8s-09.jpeg
 attachment:
-  folder: assets/img/posts/Development/Kubernetes
+  folder: assets/img/posts/development/kubernetes
 ---
-![k8s-09.jpeg](/assets/img/posts/Development/Kubernetes/k8s-09.jpeg) _Rolling update of Deployments (source: livebook.manning.com/book/kubernetes-in-action/chapter-9)_
+![k8s-09.jpeg](/assets/img/posts/development/kubernetes/k8s-09.jpeg) _Rolling update of Deployments (source: livebook.manning.com/book/kubernetes-in-action/chapter-9)_
 ### Key points

View File

@@ -2,17 +2,18 @@
 share: true
 toc: true
 categories: [Development, Kubernetes]
+path: "_posts/development/kubernetes"
 tags: [kubernetes, sre, devops]
 title: "10. StatefulSets: Deploying Replicated Stateful Applications"
 date: "2021-05-17"
 github_title: "2021-05-17-10-statefulsets"
 image:
-  path: /assets/img/posts/Development/Kubernetes/k8s-10.jpeg
+  path: /assets/img/posts/development/kubernetes/k8s-10.jpeg
 attachment:
-  folder: assets/img/posts/Development/Kubernetes
+  folder: assets/img/posts/development/kubernetes
 ---
-![k8s-10.jpeg](/assets/img/posts/Development/Kubernetes/k8s-10.jpeg) _A stateful pod may be rescheduled to a different node, but it retains the name, hostname, and storage. (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-10)_
+![k8s-10.jpeg](/assets/img/posts/development/kubernetes/k8s-10.jpeg) _A stateful pod may be rescheduled to a different node, but it retains the name, hostname, and storage. (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-10)_
 ### Key points

View File

@@ -2,17 +2,18 @@
 share: true
 toc: true
 categories: [Development, Kubernetes]
+path: "_posts/development/kubernetes"
 tags: [kubernetes, sre, devops]
 title: "11. Understanding Kubernetes Internals"
 date: "2021-05-30"
 github_title: "2021-05-30-11-k8s-internals"
 image:
-  path: /assets/img/posts/Development/Kubernetes/k8s-11.jpeg
+  path: /assets/img/posts/development/kubernetes/k8s-11.jpeg
 attachment:
-  folder: assets/img/posts/Development/Kubernetes
+  folder: assets/img/posts/development/kubernetes
 ---
-![k8s-11.jpeg](/assets/img/posts/Development/Kubernetes/k8s-11.jpeg) _The chain of events that unfolds when a Deployment resource is posted to the API server (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-11)_
+![k8s-11.jpeg](/assets/img/posts/development/kubernetes/k8s-11.jpeg) _The chain of events that unfolds when a Deployment resource is posted to the API server (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-11)_
 ### Key points

View File

@@ -2,17 +2,18 @@
 share: true
 toc: true
 categories: [Development, Kubernetes]
+path: "_posts/development/kubernetes"
 tags: [kubernetes, sre, devops]
 title: "12. Securing the Kubernetes API Server"
 date: "2021-06-06"
 github_title: "2021-06-06-12-securing-k8s-api-server"
 image:
-  path: /assets/img/posts/Development/Kubernetes/k8s-12.jpeg
+  path: /assets/img/posts/development/kubernetes/k8s-12.jpeg
 attachment:
-  folder: assets/img/posts/Development/Kubernetes
+  folder: assets/img/posts/development/kubernetes
 ---
-![k8s-12.jpeg](/assets/img/posts/Development/Kubernetes/k8s-12.jpeg) _Roles grant permissions, whereas RoleBindings bind Roles to subjects (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-12)_
+![k8s-12.jpeg](/assets/img/posts/development/kubernetes/k8s-12.jpeg) _Roles grant permissions, whereas RoleBindings bind Roles to subjects (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-12)_
 ### Key points

View File

@@ -2,17 +2,18 @@
 share: true
 toc: true
 categories: [Development, Kubernetes]
+path: "_posts/development/kubernetes"
 tags: [kubernetes, sre, devops]
 title: "13. Securing Cluster Nodes and the Network"
 date: "2021-06-29"
 github_title: "2021-06-29-13-securing-nodes-and-network"
 image:
-  path: /assets/img/posts/Development/Kubernetes/k8s-13.jpeg
+  path: /assets/img/posts/development/kubernetes/k8s-13.jpeg
 attachment:
-  folder: assets/img/posts/Development/Kubernetes
+  folder: assets/img/posts/development/kubernetes
 ---
-![k8s-13.jpeg](/assets/img/posts/Development/Kubernetes/k8s-13.jpeg) _A pod with hostNetwork: true uses the node's network interfaces instead of its own. (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-13)_
+![k8s-13.jpeg](/assets/img/posts/development/kubernetes/k8s-13.jpeg) _A pod with hostNetwork: true uses the node's network interfaces instead of its own. (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-13)_
 ### Key points

View File

@@ -2,17 +2,18 @@
 share: true
 toc: true
 categories: [Development, Kubernetes]
+path: "_posts/development/kubernetes"
 tags: [kubernetes, sre, devops]
 title: "14. Managing Pods' Computational Resources"
 date: "2021-07-11"
 github_title: "2021-07-11-14-managing-computation-resources"
 image:
-  path: /assets/img/posts/Development/Kubernetes/k8s-14.jpeg
+  path: /assets/img/posts/development/kubernetes/k8s-14.jpeg
 attachment:
-  folder: assets/img/posts/Development/Kubernetes
+  folder: assets/img/posts/development/kubernetes
 ---
-![k8s-14.jpeg](/assets/img/posts/Development/Kubernetes/k8s-14.jpeg) _The Scheduler only cares about requests, not actual usage. (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-14)_
+![k8s-14.jpeg](/assets/img/posts/development/kubernetes/k8s-14.jpeg) _The Scheduler only cares about requests, not actual usage. (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-14)_
 ### Key points

View File

@@ -2,17 +2,18 @@
 share: true
 toc: true
 categories: [Development, Kubernetes]
+path: "_posts/development/kubernetes"
 tags: [kubernetes, sre, devops]
 title: "15. Automatic Scaling of Pods and Cluster Nodes"
 date: "2021-07-18"
 github_title: "2021-07-18-15-autoscaling"
 image:
-  path: /assets/img/posts/Development/Kubernetes/k8s-15.jpeg
+  path: /assets/img/posts/development/kubernetes/k8s-15.jpeg
 attachment:
-  folder: assets/img/posts/Development/Kubernetes
+  folder: assets/img/posts/development/kubernetes
 ---
-![k8s-15.jpeg](/assets/img/posts/Development/Kubernetes/k8s-15.jpeg) _How the autoscaler obtains metrics and rescales the target deployment (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-15)_
+![k8s-15.jpeg](/assets/img/posts/development/kubernetes/k8s-15.jpeg) _How the autoscaler obtains metrics and rescales the target deployment (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-15)_
 ### Key points

View File

@@ -2,17 +2,18 @@
 share: true
 toc: true
 categories: [Development, Kubernetes]
+path: "_posts/development/kubernetes"
 tags: [kubernetes, sre, devops]
 title: "16. Advanced Scheduling"
 date: "2021-08-15"
 github_title: "2021-08-15-16-advanced-scheduling"
 image:
-  path: /assets/img/posts/Development/Kubernetes/k8s-16.jpeg
+  path: /assets/img/posts/development/kubernetes/k8s-16.jpeg
 attachment:
-  folder: assets/img/posts/Development/Kubernetes
+  folder: assets/img/posts/development/kubernetes
 ---
-![k8s-16.jpeg](/assets/img/posts/Development/Kubernetes/k8s-16.jpeg) _A pod is only scheduled to a node if it tolerates the node's taints. (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-16)_
+![k8s-16.jpeg](/assets/img/posts/development/kubernetes/k8s-16.jpeg) _A pod is only scheduled to a node if it tolerates the node's taints. (source: https://livebook.manning.com/book/kubernetes-in-action/chapter-16)_
 ### Key points

View File

@@ -2,17 +2,18 @@
share: true
toc: true
categories: [Development, Kubernetes]
path: "_posts/development/kubernetes"
tags: [kubernetes, sre, devops]
title: "17. Best Practices for Developing Apps"
date: "2021-08-15"
github_title: "2021-08-15-17-best-practices"
image:
path: /assets/img/posts/Development/Kubernetes/k8s-17.jpeg path: /assets/img/posts/development/kubernetes/k8s-17.jpeg
attachment:
folder: assets/img/posts/Development/Kubernetes folder: assets/img/posts/development/kubernetes
---
![k8s-17.jpeg](/assets/img/posts/Development/Kubernetes/k8s-17.jpeg) _Resources in a typical application (Source: https://livebook.manning.com/book/kubernetes-in-action/chapter-17)_ ![k8s-17.jpeg](/assets/img/posts/development/kubernetes/k8s-17.jpeg) _Resources in a typical application (Source: https://livebook.manning.com/book/kubernetes-in-action/chapter-17)_
### Key Points

View File

@@ -2,17 +2,18 @@
share: true
toc: true
categories: [Development, Kubernetes]
path: "_posts/development/kubernetes"
tags: [kubernetes, sre, devops]
title: "18. Extending Kubernetes"
date: "2021-09-04"
github_title: "2021-09-04-18-extending-k8s"
image:
path: /assets/img/posts/Development/Kubernetes/k8s-18.jpeg path: /assets/img/posts/development/kubernetes/k8s-18.jpeg
attachment:
folder: assets/img/posts/Development/Kubernetes folder: assets/img/posts/development/kubernetes
---
![k8s-18.jpeg](/assets/img/posts/Development/Kubernetes/k8s-18.jpeg) _API Server Aggregation (Source: https://livebook.manning.com/book/kubernetes-in-action/chapter-18)_ ![k8s-18.jpeg](/assets/img/posts/development/kubernetes/k8s-18.jpeg) _API Server Aggregation (Source: https://livebook.manning.com/book/kubernetes-in-action/chapter-18)_
### Key Points

View File

@@ -1,16 +1,21 @@
---
share: true
toc: true
categories: [Development, Web] categories:
  - Development
  - Web
path: _posts/development/web
tags: [development, web] tags:
  - development
  - web
title: "블로그 이주 이야기" title: 블로그 이주 이야기
date: "2023-06-25" date: 2023-06-25
github_title: "2023-06-25-blog-moving" github_title: 2023-06-25-blog-moving
image:
path: /assets/img/posts/blog-logo.png
---
![blog-logo.png](/assets/img/posts/blog-logo.png) _New blog logo_ ![blog-logo.png](../../../assets/img/posts/blog-logo.png) _New blog logo_
A long time ago, I left GitHub Pages for Tistory because it felt inconvenient.
But somehow, in the end, I came back.
@@ -57,15 +62,15 @@ image:
Finally, the Graph View feels almost overpowered. On Reddit I saw a few Graph View screenshots from people who take lecture notes in Obsidian, and the countless links between documents gave the feeling of knowledge genuinely being interconnected. There was actually something I wanted to try when I first encountered Obsidian: organize the material of a course I am taking, link it all together, and build a map of what I learned in that course. If I get the chance, I would love to try it and see the result in Graph View.
In any case, putting it all together, I judged that Obsidian suited me better than Notion and boldly made the move.
I should disclose that I made this decision during exam season.[^1]
## Obsidian with Github Publisher Plugin
To connect Obsidian with GitHub, you can use the [Obsidian Github Publisher](https://github.com/ObsidianPublisher/obsidian-github-publisher) plugin.
![github-publisher.png](/assets/img/posts/github-publisher.png){: .shadow } _Plugin settings screen: you can configure which folder files are uploaded to and under what name._ ![github-publisher.png](../../../assets/img/posts/github-publisher.png){: .shadow } _Plugin settings screen: you can configure which folder files are uploaded to and under what name._
With this plugin, Obsidian documents marked with `share: true` can be saved to the repo. So if you write a blog post in Obsidian and push it to the repo via the plugin, the build and deploy run automatically and the post appears on the blog.

View File

@@ -2,11 +2,15 @@
share: true
toc: true
math: true
categories: [Mathematics] categories:
  - Mathematics
path: _posts/mathematics
tags: [math, study] tags:
  - math
  - study
title: "수학 공부에 대한 고찰" title: 수학 공부에 대한 고찰
date: "2022-02-03" date: 2022-02-03
github_title: "2022-04-08-thoughts-on-studying-math" github_title: 2022-04-08-thoughts-on-studying-math
---
I had to choose a new textbook for my tutee. While deliberating over textbooks, a perfect example came up for explaining how I think math should be studied, so I am writing it down here.
@@ -19,13 +23,13 @@ github_title: "2022-04-08-thoughts-on-studying-math"
There didn't seem to be anything special about it, just explanations of the basic concepts, so I asked what was special about it. As an example he showed me one part, which introduced how to integrate a function symmetric about $x = a$ and how to integrate a function with point symmetry.
1. When a continuous function $f(x)$ defined on $\mathbb{R}$ has a graph symmetric about $x = m$,
$$\int_{m-a}^{m+a} f(x)\,dx = 2 \int_{m}^{m+a} f(x)\,dx.\quad (a \in \mathbb{R})$$ $$\int _ {m-a}^{m+a} f(x)\,dx = 2 \int _ {m}^{m+a} f(x)\,dx.\quad (a \in \mathbb{R})$$
2. When a continuous function $f(x)$ defined on $\mathbb{R}$ has a graph symmetric about the point $(m, n)$,
$$\int_{m-a}^{m+a} f(x)\,dx = 2an. \quad (a \in \mathbb{R})$$ $$\int _ {m-a}^{m+a} f(x)\,dx = 2an. \quad (a \in \mathbb{R})$$
As soon as I saw it, I realized this was nothing terribly special, and I told him that this kind of content can be figured out without anyone summarizing it for you, as long as you don't study superficially and think enough. That absolutely does not mean it is strange not to know it.
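The two symmetry identities above are easy to sanity-check numerically. Below is a small sketch; the sample functions and the values of $m$, $a$, $n$ are my own choices, not from the book being discussed.

```python
def integral(f, a, b, n=100_000):
    # midpoint Riemann sum; accurate enough for a sanity check
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

# f symmetric about x = m: f(x) = (x - m)^2, with m = 2, a = 1.5
m, a = 2.0, 1.5
f = lambda x: (x - m) ** 2
# identity 1: integral over [m-a, m+a] equals twice the integral over [m, m+a]
assert abs(integral(f, m - a, m + a) - 2 * integral(f, m, m + a)) < 1e-6

# g point-symmetric about (m, n): g(x) = (x - m)^3 + n, with n = 5
n = 5.0
g = lambda x: (x - m) ** 3 + n
# identity 2: integral over [m-a, m+a] equals 2an
assert abs(integral(g, m - a, m + a) - 2 * a * n) < 1e-4
print("both symmetry identities check out numerically")
```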
@@ -37,13 +41,13 @@ github_title: "2022-04-08-thoughts-on-studying-math"
In the integration part, \[integrals of even and odd functions\] is assumed to be a basic concept that most books teach.
1. When a continuous function $f(x)$ defined on $\mathbb{R}$ is even,
$$\int_{-a}^{a} f(x)\,dx = 2 \int_{0}^{a} f(x)\,dx. \quad (a \in \mathbb{R})$$ $$\int _ {-a}^{a} f(x)\,dx = 2 \int _ {0}^{a} f(x)\,dx. \quad (a \in \mathbb{R})$$
2. When a continuous function $f(x)$ defined on $\mathbb{R}$ is odd,
$$\int_{-a}^{a} f(x)\,dx = 0. \quad (a \in \mathbb{R})$$ $$\int _ {-a}^{a} f(x)\,dx = 0. \quad (a \in \mathbb{R})$$
### Trying to generalize on your own while studying concepts
@@ -83,10 +87,10 @@ github_title: "2022-04-08-thoughts-on-studying-math"
I immediately brought up Fourier series.
- Every periodic function can be written as an infinite sum of trigonometric functions!
- The proof uses the idea of integrating $\sin, \cos$ (periodic functions that are odd and even, respectively)
- It is used heavily in signal processing, and the various Fourier transforms built on it - DFT, FFT, QFT, STFT, and so on - are already widely used across engineering.
I study simply because I enjoy it, so I don't usually think much about whether a concept is practically useful. I had been thinking this concept existed merely for computational convenience when integrating functions with symmetry. Thanks to my brother's question, I too came to see how it is used in the real world.
The thrill you feel when pieces of knowledge connect and produce an insight is the best! Learned something new today! 👍

View File

@@ -1,266 +0,0 @@
---
share: true
toc: true
math: true
categories:
- Mathematics
- Measure Theory
tags:
- math
- analysis
- measure-theory
title: 02. Construction of Measure
date: 2023-01-23
github_title: 2023-01-23-construction-of-measure
image:
path: /assets/img/posts/Mathematics/Measure Theory/mt-02.png
attachment:
folder: assets/img/posts/Mathematics/Measure Theory
---
![mt-02.png](../../../assets/img/posts/Mathematics/Measure%20Theory/mt-02.png)
Now let us start measuring sets in earnest, beginning with the sets we already know how to measure. We will work in $\mathbb{R}^p$, and from this point on, an interval of $\mathbb{R}$ covers all combinations of open and closed endpoints: that is, an interval of $\mathbb{R}$ means any of the four cases $[a, b], (a, b), [a, b), (a, b]$.
## Elementary Sets
**Definition.** (Interval in $\mathbb{R}^p$) Let $a_i, b_i \in \mathbb{R}$ with $a_i \leq b_i$. If each $I_i$ is an interval of $\mathbb{R}$, an interval of $\mathbb{R}^p$ is defined as
$$\prod_ {i=1}^p I_i = I_1 \times \cdots \times I_p.$$
For example, an interval of $\mathbb{R}^2$ is a rectangular region, and an interval of $\mathbb{R}^3$ is a rectangular box; note that the boundary need not be included.
Collecting the sets obtained as finite unions of such intervals gives the elementary sets.
**Definition.** (Elementary Set) A set is called an **elementary set** if it can be expressed as a finite union of intervals. The collection of elementary sets of $\mathbb{R}^p$ is denoted by $\Sigma$.
Every interval is bounded, so a finite union of intervals is bounded as well.
**Remark.** Every elementary set is bounded.
Set operations can be carried out within the collection of elementary sets, and it is straightforward to check that $\Sigma$ is a ring.
**Proposition.** $\Sigma$ is a ring. However, it is not a $\sigma$-ring, since it does not contain the whole space $\mathbb{R}^p$.
We know very well how to measure the length of an interval, and an elementary set, being a finite union of intervals, is just as easy to measure. We now define the length function $m: \Sigma \rightarrow[0, \infty)$; it is not yet a measure.
**Definition.** Let $a_i, b_i \in \mathbb{R}$ be the endpoints of the interval $I_i$. For an interval $I = \displaystyle\prod_ {i=1}^p I_i$ of $\mathbb{R}^p$, define
$$m(I) = \prod_ {i=1}^p (b_i - a_i).$$
**Definition.** Let the $I_i$ be pairwise disjoint intervals of $\mathbb{R}^p$. For $A = \displaystyle\bigcup_ {i=1}^n I_i$, define
$$m(A) = \sum_ {i=1}^n m(I_i).$$
Thinking of $\mathbb{R}, \mathbb{R}^2, \mathbb{R}^3$, we see that $m$ corresponds to length, area, and volume. On a union of pairwise disjoint intervals, $m$ is defined as the sum of its values on the individual intervals: if a set can be split into non-overlapping intervals, it is natural for the 'length' of the set to be the sum of the 'lengths' of those intervals.
This definition is well defined: even when $A \in \Sigma$ can be written as a disjoint finite union of intervals in more than one way, the value of $m$ is the same.
**Remark.** $m$ is additive on $\Sigma$, so $m : \Sigma \rightarrow[0, \infty)$ is an additive set function.
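As a toy illustration of the definition (my own sketch, with hand-picked boxes assumed pairwise disjoint), $m$ on an elementary set in $\mathbb{R}^2$ is just the sum of the box areas:

```python
def box_measure(box):
    # box = ((a1, b1), (a2, b2), ...): m(I) = product of side lengths (b_i - a_i)
    prod = 1.0
    for a, b in box:
        prod *= b - a
    return prod

def m(boxes):
    # boxes are assumed pairwise disjoint; m(A) = sum of m(I_i)
    return sum(box_measure(b) for b in boxes)

# A = [0,1]x[0,1] ∪ [2,4]x[0,3], disjoint: area 1 + 6 = 7
A = [((0, 1), (0, 1)), ((2, 4), (0, 3))]
print(m(A))  # → 7.0
```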
We would additionally like a regularity condition to hold.
**Definition.** (Regularity) Let the set function $\mu: \Sigma \rightarrow[0, \infty]$ be additive. If for every $A \in \Sigma$ and $\epsilon > 0$,
> there exist a closed set $F \in \Sigma$ and an open set $G \in \Sigma$ such that $F \subseteq A \subseteq G$ and $\mu(G) - \epsilon \leq \mu(A) \leq \mu(F) + \epsilon$,
then $\mu$ is said to be **regular** on $\Sigma$.
It is easy to check that the $m$ defined above is regular.
From now on, assume that the set function $\mu: \Sigma \rightarrow[0, \infty)$ is finite, regular, and additive.
**Definition.** (Outer Measure) The **outer measure** $\mu^\ast: \mathcal{P}(\mathbb{R}^p) \rightarrow[0, \infty]$ of $E \in \mathcal{P}(\mathbb{R}^p)$ is defined as
$$\mu^\ast(E) = \inf \left\lbrace \sum_ {n=1}^\infty \mu(A_n) : E \subseteq\bigcup_ {n=1}^\infty A_n \text{ for open sets } A_n \in \Sigma\right\rbrace.$$
It is called the outer measure because it approximates $E$ by measuring from the outside. Since the outer measure is defined on the entire power set, it would be nice if we could use it to measure every set. To be a measure, however, it must be countably additive, and that is the hardest condition to satisfy; indeed, countable additivity fails in general.
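The infimum over open covers can be seen concretely in $\mathbb{R}^1$. A sketch with a hand-picked family of covers (my own example): covering $E = [0,1] \cup [2,3]$ by open intervals $(a-\epsilon, b+\epsilon)$ gives cover sums that decrease toward $2$, the outer measure of $E$, as $\epsilon \to 0$:

```python
E = [(0, 1), (2, 3)]  # a disjoint union of closed intervals

def cover_sum(eps):
    # total length of the open cover {(a - eps, b + eps) : (a, b) in E}
    return sum((b + eps) - (a - eps) for a, b in E)

for eps in (1e-1, 1e-3, 1e-6):
    print(cover_sum(eps))  # 2 + 4*eps, decreasing toward 2
```

This only exhibits one shrinking family of covers; the definition takes the infimum over all countable open covers.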
**Remark.**
- $\mu^\ast \geq 0$.
- If $E_1 \subseteq E_2$, then $\mu^\ast(E_1) \leq \mu^\ast(E_2)$. (Monotonicity)
**Theorem.**
1. If $A \in \Sigma$, then $\mu^\ast(A) = \mu(A)$.[^1]
2. Countable subadditivity holds:
$$\mu^\ast\left( \bigcup_ {n=1}^\infty E_n \right) \leq \sum_ {n=1}^\infty \mu^\ast(E_n), \quad (\forall E_n \in \mathcal{P}(\mathbb{R}^p))$$
**Proof.**
(1) Let $A \in \Sigma$ and $\epsilon > 0$. By the regularity of $\mu$, there exists an open set $G \in \Sigma$ with $A \subseteq G$ such that
$$\mu^\ast(A) \leq \mu(G) \leq \mu(A) + \epsilon.$$
By the definition of $\mu^\ast$, there exist open sets $A_n \in \Sigma$ with $A \subseteq\displaystyle\bigcup_ {n=1}^\infty A_n$ such that
$$\sum_ {n=1}^\infty \mu(A_n) \leq \mu^\ast(A) + \epsilon.$$
Again by regularity, there exists a closed set $F \in \Sigma$ with $F\subseteq A$ and $\mu(A) \leq \mu(F) + \epsilon$. Since $F \subseteq\mathbb{R}^p$ is closed and bounded, it is compact, so we may extract a finite subcover: for some $N \in \mathbb{N}$, $F \subseteq\displaystyle\bigcup_ {i=1}^N A_ {i}$.
Therefore
$$\mu(A) \leq \mu(F) + \epsilon \leq \sum_ {i=1}^N \mu(A_i) + \epsilon \leq \sum_ {n=1}^\infty \mu(A_n) + \epsilon \leq \mu^\ast(A) + 2\epsilon.$$
Letting $\epsilon \rightarrow 0$ gives $\mu(A) = \mu^\ast(A)$.
\(2\) If both sides of the inequality are $\infty$, there is nothing to prove, so assume the right-hand side is finite, i.e. $\mu^\ast(E_n) < \infty$ for every $n\in \mathbb{N}$. Fix $\epsilon > 0$. For each $n \in \mathbb{N}$ there exist open sets $A_ {n, k} \in \Sigma$ with $E_n \subseteq\displaystyle\bigcup_ {k=1}^\infty A_ {n, k}$ and $\displaystyle\sum_ {k=1}^\infty \mu(A_ {n,k}) \leq \mu^\ast(E_n) + 2^{-n}\epsilon$.
Since $\mu^\ast$ is defined as an infimum,
$$\mu^\ast\left( \bigcup_ {n=1}^\infty E_n \right) \leq \sum_ {n=1}^\infty \sum_ {k=1}^\infty \mu(A_ {n,k}) \leq \sum_ {n=1}^\infty \mu^\ast(E_n) + \epsilon,$$
and letting $\epsilon \rightarrow 0$ yields the inequality.
## $\mu$-measurable Sets
We will construct a measure by collecting only the sets on which countable additivity holds. The material below is preparation for that.
**Notation.** (Symmetric difference) $A \mathop{\mathrm{\triangle}}B = (A\setminus B) \cup (B \setminus A)$.
**Definition.**
- Define $d(A, B) = \mu^\ast(A \mathop{\mathrm{\triangle}}B)$.
- For a sequence of sets $A_n$, define $A_n \rightarrow A$ to mean $d(A_n, A) \rightarrow 0$.
**Remark.**
- For $A, B, C \subseteq \mathbb{R}^p$, $d(A, B) \leq d(A, C) + d(C, B)$.
- For $A_1, A_2, B_1, B_2 \subseteq \mathbb{R}^p$, the following holds.
$$\left.\begin{array}{c}d(A_1 \cup A_2, B_1 \cup B_2) \\d(A_1 \cap A_2, B_1 \cap B_2) \\d(A_1 \setminus A_2, B_1 \setminus B_2)\end{array}\right\rbrace\leq d(A_1, B_1) + d(A_2, B_2).$$
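These inequalities can be spot-checked on a toy model. Below, the counting measure on finite sets stands in for $\mu^\ast$ (an analogy of my own, not the construction in this post), so $d(A, B) = \lvert A \mathbin{\triangle} B \rvert$:

```python
def d(A, B):
    # symmetric-difference "distance" under counting measure (^ is set XOR)
    return len(A ^ B)

# triangle inequality
A, B, C = {1, 2, 3}, {3, 4}, {2, 3, 5}
assert d(A, B) <= d(A, C) + d(C, B)

# stability under union / intersection / difference, as in the remark
A1, A2, B1, B2 = {1, 2}, {3}, {1, 2, 9}, {3, 4}
assert d(A1 | A2, B1 | B2) <= d(A1, B1) + d(A2, B2)
assert d(A1 & A2, B1 & B2) <= d(A1, B1) + d(A2, B2)
assert d(A1 - A2, B1 - B2) <= d(A1, B1) + d(A2, B2)
print("triangle and stability inequalities hold on this sample")
```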
**Definition.** (Finitely $\mu$-measurable) If there exist sets $A_n \in \Sigma$ with $A_n \rightarrow A$, then $A$ is called **finitely $\mu$-measurable**. The collection of finitely $\mu$-measurable sets is denoted by $\mathfrak{M}_F(\mu)$.
This definition says that there exist elementary sets $A_n$ with $\mu^\ast (A_n \mathop{\mathrm{\triangle}}A) \rightarrow 0$ under the set function $\mu$.
**Definition.** ($\mu$-measurable) If $A = \displaystyle\bigcup_ {n=1}^\infty A_n$ for some $A_n \in \mathfrak{M}_F(\mu)$, then $A$ is called **$\mu$-measurable**. The collection of $\mu$-measurable sets is denoted by $\mathfrak{M}(\mu)$.
**Remark.** $\mu^\ast(A) = d(A, \varnothing) \leq d(A, B) + \mu^\ast(B)$.
**Proposition.** If $\mu^\ast(A)$ or $\mu^\ast(B)$ is finite, then the following holds.
$$\lvert \mu^\ast(A) - \mu^\ast(B) \rvert \leq d(A, B).$$
**Corollary.** If $A \in \mathfrak{M}_F(\mu)$, then $\mu^\ast(A) < \infty$.
**Proof.** There exist $A_n \in \Sigma$ with $A_n \rightarrow A$, so there exists $N \in \mathbb{N}$ with
$$\mu^\ast(A) \leq d(A_N, A) + \mu^\ast(A_N) \leq 1 + \mu^\ast(A_N) < \infty.$$
**Corollary.** If $A_n \rightarrow A$ and $A_n, A \in \mathfrak{M}_F(\mu)$, then $\mu^\ast(A_n)\rightarrow\mu^\ast(A) < \infty$.
**Proof.** Since $\mu^\ast(A)$ and $\mu^\ast(A_n)$ are finite, $\lvert \mu^\ast(A_n) - \mu^\ast(A) \rvert \leq d(A_n, A) \rightarrow 0$ as $n \rightarrow\infty$.
## Construction of Measure
The preparation is complete, so let us construct the measure! It does not work on all of $\mathcal{P}(\mathbb{R}^p)$, but narrowing the domain slightly to $\mathfrak{M}(\mu)$ yields a measure.
**Theorem.** $\mathfrak{M}(\mu)$ is a $\sigma$-algebra, and $\mu^\ast$ is a measure on $\mathfrak{M}(\mu)$.
**Proof.** It suffices to show that $\mathfrak{M}(\mu)$ is a $\sigma$-algebra and that $\mu^\ast$ is countably additive on $\mathfrak{M}(\mu)$.
**(Step 0)** *$\mathfrak{M}_F(\mu)$ is a ring.*
Let $A, B \in \mathfrak{M}_F(\mu)$. Then there exist $A_n, B_n \in \Sigma$ with $A_n \rightarrow A$ and $B_n \rightarrow B$. Since
$$\left.\begin{array}{c}d(A_n \cup B_n, A \cup B) \\ d(A_n \cap B_n, A \cap B) \\ d(A_n \setminus B_n, A \setminus B)\end{array}\right\rbrace\leq d(A_n, A) + d(B_n, B) \rightarrow 0,$$
we get $A_n \cup B_n \rightarrow A \cup B$ and $A_n \setminus B_n \rightarrow A\setminus B$, so $\mathfrak{M}_F(\mu)$ is a ring.
**(Step 1)** *$\mu^\ast$ is additive on $\mathfrak{M}_F(\mu)$*.
Since $\mu = \mu^\ast$ on $\Sigma$, the corollary above shows that
$$\begin{matrix} \mu(A_n) \rightarrow\mu^\ast(A), & \mu(A_n\cup B_n) \rightarrow\mu^\ast(A\cup B), \\ \mu(B_n) \rightarrow\mu^\ast(B), & \mu(A_n\cap B_n) \rightarrow\mu^\ast(A\cap B). \end{matrix}$$
In general $\mu(A_n) + \mu(B_n) = \mu(A_n \cup B_n) + \mu(A_n \cap B_n)$, so letting $n \rightarrow\infty$ gives
$$\mu^\ast(A) + \mu^\ast(B) = \mu^\ast(A\cup B) + \mu^\ast(A \cap B).$$
Adding the condition $A \cap B = \varnothing$ shows that $\mu^\ast$ is additive.
**(Step 2)** *$\mathfrak{M}_F(\mu) = \lbrace A \in \mathfrak{M}(\mu) : \mu^\ast(A) < \infty\rbrace$.*[^2]
**Claim**. Every $A \in \mathfrak{M}(\mu)$ can be expressed as a union of pairwise disjoint elements of $\mathfrak{M}_F(\mu)$.
**Proof.** Write $A = \bigcup A_n'$ with $A_n' \in \mathfrak{M}_F(\mu)$. Defining
> $A_1 = A_1'$, and $A_n = A_n' \setminus(A_1'\cup \cdots \cup A_ {n-1}')$ for $n \geq 2$,
the $A_n$ are pairwise disjoint and $A_n \in \mathfrak{M}_F(\mu)$.
Using this fact, write $A = \displaystyle\bigcup_ {n=1}^\infty A_n$ with pairwise disjoint $A_n \in \mathfrak{M}_F(\mu)$.
1. By countable subadditivity, $\displaystyle\mu^\ast(A) \leq \sum_ {n=1}^{\infty} \mu^\ast (A_n)$.
2. By Step 1, since $\displaystyle\bigcup_ {n=1}^k A_n \subseteq A$, we have $\displaystyle\sum_ {n=1}^{k} \mu^\ast(A_n) \leq \mu^\ast(A)$. Letting $k \rightarrow\infty$ gives $\displaystyle\mu^\ast(A) \geq \sum_ {n=1}^\infty \mu^\ast(A_n)$.
Therefore $\displaystyle\mu^\ast(A) = \sum_ {n=1}^\infty \mu^\ast(A_n)$.[^3] [^4]
Now set $B_n =\displaystyle\bigcup_ {k=1}^n A_k$. If we assume $\mu^\ast(A) < \infty$, the convergence of $\displaystyle\sum_ {n=1}^\infty \mu^\ast(A_n)$ gives
$$\displaystyle d(A, B_n) = \mu^\ast\left( \bigcup_ {k=n+1}^\infty A_k \right) = \sum_ {k=n+1}^{\infty} \mu^\ast(A_k) \rightarrow 0 \text{ as } n \rightarrow\infty.$$
Since $B_n \in \mathfrak{M}_F(\mu)$, for each $n \in \mathbb{N}$ we can choose $C_n \in \Sigma$ making $d(B_n, C_n)$ arbitrarily small. Then $d(A, C_n) \leq d(A, B_n) + d(B_n, C_n)$, so $d(A, C_n)$ can be made arbitrarily small for sufficiently large $n$. Hence $C_n \rightarrow A$, and we conclude that $A \in \mathfrak{M}_F(\mu)$.
**(Step 3)** *$\mu^\ast$ is countably additive on $\mathfrak{M}(\mu)$.*
Let the sets $A_n \in \mathfrak{M}(\mu)$ form a partition of $A \in \mathfrak{M}(\mu)$. If $\mu^\ast(A_m) = \infty$ for some $m \in \mathbb{N}$, then
$$\mu^\ast\left( \bigcup_ {n=1}^\infty A_n \right) \geq \mu^\ast(A_m) = \infty = \sum_ {n=1}^\infty \mu^\ast(A_n),$$
so countable additivity holds.
If instead $\mu^\ast(A_n) < \infty$ for every $n\in \mathbb{N}$, then $A_n \in \mathfrak{M}_F(\mu)$ by Step 2, and
$$\mu^\ast(A) = \mu^\ast\left( \bigcup_ {n=1}^\infty A_n \right) = \sum_ {n=1}^\infty \mu^\ast(A_n)$$
holds.
**(Step 4)** *$\mathfrak{M}(\mu)$ is a $\sigma$-ring.*
If $A_n \in \mathfrak{M}(\mu)$, there exist $B_ {n, k} \in \mathfrak{M}_F(\mu)$ with $\displaystyle A_n = \bigcup_k B_ {n,k}$. Then
$$\bigcup_n A_n = \bigcup_ {n, k} B_ {n, k} \in \mathfrak{M}(\mu).$$
If $A, B \in \mathfrak{M}(\mu)$, then $\displaystyle A = \bigcup A_n$ and $\displaystyle B = \bigcup B_n$ for some $A_n, B_n \in \mathfrak{M}_F(\mu)$, so
$$A \setminus B = \bigcup_ {n=1}^\infty \left( A_n \setminus B \right) = \bigcup_ {n=1}^\infty (A_n\setminus(A_n\cap B)).$$
It therefore suffices to show that $A_n \cap B \in \mathfrak{M}_F(\mu)$. By definition,
$$A_n \cap B = \bigcup_ {k=1}^\infty (A_n \cap B_k) \in \mathfrak{M}(\mu),$$
and $\mu^\ast(A_n \cap B) \leq \mu^\ast(A_n) < \infty$, so $A_n\cap B \in \mathfrak{M}_F(\mu)$. Thus $A \setminus B$ is a countable union of elements of $\mathfrak{M}_F(\mu)$, and therefore $A\setminus B \in \mathfrak{M}(\mu)$.
Hence $\mathfrak{M}(\mu)$ is a $\sigma$-ring, and indeed a $\sigma$-algebra.
---
We now extend the definition of $\mu$ from $\Sigma$ to the $\sigma$-algebra $\mathfrak{M}(\mu)$ by setting $\mu = \mu^\ast$ on $\mathfrak{M}(\mu)$. When $\mu = m$ on $\Sigma$, the extension of $m$ to $\mathfrak{M}(m)$ is called the **Lebesgue measure** on $\mathbb{R}^p$, and each $A \in \mathfrak{M}(m)$ is called a Lebesgue measurable set.
[^1]: This is not a trivial statement when $A$ is not open.
[^2]: If $A$ is $\mu$-measurable and $\mu^\ast(A) < \infty$, then $A$ is finitely $\mu$-measurable.
[^3]: Since $A$ is a countable union of sets in $\mathfrak{M}_F(\mu)$, its $\mu^\ast$ equals the sum of the $\mu^\ast$ of those sets.
[^4]: The proof is not finished yet: the $A_n$ here are elements of $\mathfrak{M}_F(\mu)$, not of $\mathfrak{M}(\mu)$.

View File

@@ -1,200 +0,0 @@
---
share: true
toc: true
math: true
categories: [Mathematics, Measure Theory]
tags: [math, analysis, measure-theory]
title: "06. Convergence Theorems"
date: "2023-03-25"
github_title: "2023-03-25-convergence-theorems"
image:
path: /assets/img/posts/Mathematics/Measure Theory/mt-06.png
attachment:
folder: assets/img/posts/Mathematics/Measure Theory
---
We now treat the convergence theorems that come up constantly in Lebesgue integration theory. With these theorems, very useful results can be obtained with little effort.
## Monotone Convergence Theorem
First, the monotone convergence theorem (MCT). The hypothesis $f_n \geq 0$ is essential in this theorem.
![mt-06.png](/assets/img/posts/Mathematics/Measure%20Theory/mt-06.png)
**Theorem.** (Monotone Convergence Theorem) Let $f_n: X \rightarrow[0, \infty]$ be measurable with $f_n(x) \leq f_ {n+1}(x)$ for all $x \in X$. Setting
$$\lim_ {n\rightarrow\infty} f_n(x) = \sup_ {n} f_n(x) = f(x),$$
we have
$$\int f \,d{\mu} = \lim_ {n\rightarrow\infty} \int f_n \,d{\mu} = \sup_ {n \in \mathbb{N}} \int f_n \,d{\mu}.$$
**Proof.**
($\geq$) Since $f_n(x) \leq f(x)$, monotonicity gives $\displaystyle\int f_n \,d{\mu} \leq \displaystyle\int f \,d{\mu}$ for every $n \in \mathbb{N}$. Therefore the following holds.
$$\sup_n \int f_n \,d{\mu} \leq \int f \,d{\mu}.$$
($\leq$) Fix a real number $c \in (0, 1)$; at the end we will let $c \nearrow 1$. Let $s$ be a measurable simple function with $0 \leq s \leq f$. Then $c \cdot s(x) \leq s(x) \leq f(x)$ for all $x \in X$, with strict inequality $cs(x) < f(x)$ wherever $s(x) > 0$.
Now set
$$E_n = \lbrace x \in X : f_n(x) \geq cs(x)\rbrace.$$
Since $f_n(x) - cs(x)$ is a measurable function, each $E_n$ is also measurable. Because the $f_n$ increase, $E_n\subseteq E_ {n+1} \subseteq\cdots$, and since $f_n \rightarrow f$ we have $\bigcup_ {n=1}^\infty E_n = X$.
For each $x$, then, $x \in E_n$ for all sufficiently large $n$. Since $f_n \geq f_n \chi_ {E_n} \geq cs \chi_ {E_n}$,
$$\tag{\(\star\)} \int f_n \,d{\mu} \geq \int f_n \chi_ {E_n} \,d{\mu} \geq c\int s \chi_ {E_n} \,d{\mu},$$
where $s$ and $\chi_ {E_n}$ are simple functions. Writing $s = \sum_ {k=0}^m y_k \chi_ {A_k}$, we get
$$s\chi_ {E_n} = \sum_ {k=0}^m y_k \chi_ {A_k\cap E_n} \implies \int s \chi_ {E_n} \,d{\mu} = \sum_ {k=0}^m y_k \mu(A_k\cap E_n).$$
As $n\rightarrow\infty$, $A_k\cap E_n \nearrow A_k$, so continuity of measure gives $\mu(A_k \cap E_n) \nearrow \mu(A_k)$, and hence
$$\lim_ {n\rightarrow\infty} \int s \chi_ {E_n}\,d{\mu} = \int s \,d{\mu}.$$
Now using ($\star$),
$$\lim_ {n\rightarrow\infty} \int f_n \,d{\mu} \geq c\int s \,d{\mu},$$
so letting $c \nearrow 1$ and taking the supremum over $0\leq s\leq f$ gives
$$\lim_ {n\rightarrow\infty} \int f_n \,d{\mu} \geq \sup_ {0\leq s\leq f} \int s \,d{\mu} = \int f \,d{\mu},$$
which is the desired result.
**Remark.** If the inequality $0 \leq f_n \leq f_ {n+1}$ holds only on a subset $E$ of the domain rather than on all of it, we can argue as follows:
$$0 \leq f_n \chi_E \leq f_ {n+1} \chi_E \nearrow f \chi_E.$$
Hence the monotone convergence theorem also holds on $E$.
> If $0\leq f_n \leq f_ {n+1} \nearrow f$ on $E$, then $\displaystyle\lim_ {n\rightarrow\infty} \int_E f_n \,d{\mu} = \int_E f \,d{\mu}$.
**Remark.** The theorem holds only for increasing sequences. For decreasing sequences, consider the counterexample $f_n = \chi_ {[n, \infty)}$: as $n \rightarrow\infty$, $\chi_ {[n, \infty)} \searrow 0$.
Then for the Lebesgue measure $m$,
$$\infty = \int \chi_ {[n, \infty)} \,d{m} \neq \int 0 \,d{m} = 0,$$
so the monotone convergence theorem fails.
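A quick numeric illustration of the increasing case (my own example, not from the lecture): $f_n = e^{-x}\chi_ {[0,n]}$ increases pointwise to $f(x) = e^{-x}$ on $[0, \infty)$, and $\int f_n\,dm = 1 - e^{-n}$ increases to $1 = \int f\,dm$:

```python
import math

def integral_fn(n, steps=200_000):
    # midpoint Riemann sum of e^{-x} on [0, n]  (f_n vanishes beyond n)
    h = n / steps
    return sum(math.exp(-(i + 0.5) * h) for i in range(steps)) * h

for n in (1, 5, 10):
    print(n, integral_fn(n))  # ≈ 1 - e^{-n}, increasing toward 1

assert abs(integral_fn(10) - (1 - math.exp(-10))) < 1e-6
```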
---
Last time we showed that if $f \geq 0$ is measurable, there exists an increasing sequence of measurable simple functions $s_n$, and computing their integrals we obtained
$$\int_E s_n \,d{\mu} = \sum_ {i=1}^{n2^n} \frac{i - 1}{2^n}\mu\left( \left\lbrace x \in E : \frac{i-1}{2^n} \leq f(x) \leq \frac{i}{2^n}\right\rbrace \right) + n\mu(\lbrace x \in E : f(x)\geq n\rbrace).$$
Since
$$f(x) = \displaystyle\lim_ {n\rightarrow\infty} s_n(x),$$
the monotone convergence theorem gives
$$\int_E f \,d{\mu} = \lim_ {n\rightarrow\infty} \int_E s_n \,d{\mu},$$
which is exactly the result we hoped for. As explained last time, this means the Lebesgue integral can be understood as computing area by slicing the range finely.
---
The following is an example where the monotone convergence theorem easily yields a useful result.
**Remark.** For measurable functions $f, g \geq 0$ and $\alpha, \beta \in [0, \infty)$, the following holds.
$$\int_E \left( \alpha f + \beta g \right) \,d{\mu} = \alpha \int_E f \,d{\mu} + \beta \int_E g\,d{\mu}.$$
**Proof.** A measurable function can be approximated by measurable simple functions, and since $f, g \geq 0$, the approximating sequences can be chosen to increase monotonically. So take measurable simple functions $f_n$, $g_n$ with $0 \leq f_n \leq f_ {n+1} \nearrow f$ and $0 \leq g_n \leq g_ {n+1} \nearrow g$.
Then $\alpha f_n + \beta g_n \nearrow \alpha f + \beta g$, and $\alpha f_n + \beta g_n$ is an increasing sequence of measurable simple functions. Therefore, by the monotone convergence theorem,
$$\int_E \left( \alpha f_n + \beta g_n \right) \,d{\mu} = \alpha \int_E f_n \,d{\mu} + \beta \int_E g_n \,d{\mu} \rightarrow\alpha \int_E f \,d{\mu} + \beta \int_E g\,d{\mu}.$$
A similar method applies to series.
**Theorem.** For measurable functions $f_n: X \rightarrow[0, \infty]$, the sum $\sum_ {n=1}^\infty f_n$ is measurable, and by the monotone convergence theorem the following holds.
$$\int_E \sum_ {n=1}^\infty f_n \,d{\mu} = \sum_ {n=1}^\infty \int_E f_n \,d{\mu}.$$
**Proof.** $\sum_ {n=1}^\infty f_n$ is a limit of measurable functions, hence measurable. Viewing the infinite series as the limit of its partial sums, the partial sums increase because $f_n \geq 0$. Applying the monotone convergence theorem gives the conclusion.
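A concrete instance of the series theorem (my own example): take $f_n(x) = x^n/n!$ on $[0,1]$, so that $\sum_n f_n = e^x$. Term-by-term, $\int_0^1 x^n/n!\,dx = 1/(n+1)!$, and both sides of the identity equal $e - 1$:

```python
import math

# right-hand side: sum of term-by-term integrals, sum over n of 1/(n+1)!
rhs = sum(1 / math.factorial(n + 1) for n in range(50))

# left-hand side: integral of e^x over [0, 1] via midpoint Riemann sum
steps = 100_000
h = 1 / steps
lhs = sum(math.exp((i + 0.5) * h) for i in range(steps)) * h

assert abs(rhs - (math.e - 1)) < 1e-12
assert abs(lhs - (math.e - 1)) < 1e-8
print("both sides equal e - 1:", lhs, rhs)
```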
## Fatou's Lemma
We introduce one more convergence theorem, equivalent to the monotone convergence theorem, known as Fatou's lemma.
**Theorem.** (Fatou) Let $f_n \geq 0$ be measurable and let $E$ be measurable. Then the following holds.
$$\int_E \liminf_ {n\rightarrow\infty} f_n \,d{\mu} \leq \liminf_ {n\rightarrow\infty} \int_E f_n \,d{\mu}.$$
**Proof.** Set $g_n = \displaystyle\inf_ {k \geq n} f_k$, so that $\displaystyle\lim_ {n \rightarrow\infty} g_n = \liminf_ {n\rightarrow\infty} f_n$. It is easy to check that the $g_n$ increase, and $g_n \geq 0$. From the definition of $g_n$, $g_n \leq f_k$ for all $k \geq n$, so
$$\int_E g_n \,d{\mu} \leq \inf_ {k\geq n} \int_E f_k \,d{\mu}.$$
Letting $n \rightarrow\infty$,
$$\int_E \liminf_ {n\rightarrow\infty} f_n \,d{\mu} = \lim_ {n \rightarrow\infty} \int_E g_n \,d{\mu} \leq \lim_ {n \rightarrow\infty} \inf_ {k \geq n}\int_E f_k \,d{\mu} = \liminf_ {n \rightarrow\infty} \int_E f_n \,d{\mu},$$
where the first equality holds by the monotone convergence theorem.
**Remark.** The proof above used the monotone convergence theorem. Conversely, assuming this theorem, one can prove the monotone convergence theorem, so the two are equivalent. We omit the proof.
**Remark.** One might expect a similar conclusion to hold for $\limsup$; specifically,
$$\int_E \limsup_ {n \rightarrow\infty} f_n \,d{\mu} \geq \limsup_ {n \rightarrow\infty} \int_E f_n \,d{\mu}.$$
Unfortunately, this fails. As a counterexample, we can bring back the $\chi_ {[n, \infty)}$ introduced earlier: the left-hand side evaluates to $0$, but the right-hand side evaluates to $\infty$. As we will see later, the inequality does hold when there exists a function $g \in \mathcal{L}^{1}$ with $\lvert f_n \rvert \leq g$.
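The inequality in Fatou's lemma can also be strict. With the standard moving-bump example $f_n = \chi_ {[n, n+1]}$ (checked exactly below, since indicators of intervals integrate by hand): the pointwise $\liminf$ is $0$, yet $\int f_n\,dm = 1$ for every $n$, so $0 = \int \liminf f_n \le \liminf \int f_n = 1$:

```python
def f(n, x):
    # f_n = indicator function of [n, n+1]
    return 1.0 if n <= x <= n + 1 else 0.0

# pointwise liminf is 0: any fixed x is eventually outside [n, n+1]
x = 7.3
assert max(f(n, x) for n in range(10, 100)) == 0.0

# but each integral is exactly the interval length, 1
integrals = [float((n + 1) - n) for n in range(100)]
assert min(integrals) == 1.0
# Fatou:  integral of liminf f_n = 0  <=  liminf of integrals = 1  (strict)
```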
## Properties of the Lebesgue Integral
We close by presenting a few properties of the Lebesgue integral.
1. If $f$ is measurable and bounded on $E$ with $\mu(E) < \infty$, then $\lvert f \rvert \leq M$ for some real $M > 0$, so
$$\int_E \lvert f \rvert \,d{\mu} \leq \int_E M \,d{\mu} = M\mu(E) < \infty.$$
Hence $f \in \mathcal{L}^{1}(E, \mu)$: under the assumption that $E$ has finite measure, every bounded measurable function is Lebesgue integrable.
2. If $f, g \in \mathcal{L}^{1}(E, \mu)$ and $f \leq g$ on $E$, we want to show monotonicity. Earlier we proved monotonicity only in the case $0 \leq f \leq g$; we would like to extend it to functions that take negative values. Splitting into positive and negative parts, we can write
$$\chi_E (x) f^+(x) \leq \chi_E(x) g^+(x), \qquad \chi_E(x) g^-(x) \leq \chi_E (x) f^-(x).$$
From this we obtain
$$\int_E f^+ \,d{\mu} \leq \int_E g^+ \,d{\mu} < \infty, \qquad \int_E g^- \,d{\mu} \leq \int_E f^- \,d{\mu} < \infty.$$
Therefore
$$\int_E f\,d{\mu} \leq \int_E g \,d{\mu}$$
holds, and monotonicity holds even when the functions take negative values.
3. If $f \in \mathcal{L}^{1}(E, \mu)$ and $c \in \mathbb{R}$, then $cf \in \mathcal{L}^{1}(E, \mu)$, because
$$\int_E \lvert c \rvert\lvert f \rvert \,d{\mu} = \lvert c \rvert \int_E \lvert f \rvert\,d{\mu} < \infty.$$
Since the integral exists, we would like linearity to hold when actually computing it. Earlier we proved this only for nonnegative reals, and we extend it likewise; it suffices to treat the case $c < 0$. Then $(cf)^+ = -cf^-$ and $(cf)^- = -cf^+$, so the following holds.
$$\int_E cf \,d{\mu} = \int_E (cf)^+ \,d{\mu} - \int_E (cf)^- \,d{\mu} = -c \int_E f^- \,d{\mu} - (-c) \int_E f^+ \,d{\mu} = c\int_E f\,d{\mu}.$$
4. For a measurable function $f$ with $a \leq f(x) \leq b$ on $E$ and $\mu(E) < \infty$, the following holds.
$$\int_E a \chi_E \,d{\mu} \leq \int_E f\chi_E \,d{\mu} \leq \int_E b \chi_E \,d{\mu} \implies a \mu(E) \leq \int_E f \,d{\mu} \leq b \mu(E).$$
The fact that $f$ is Lebesgue integrable here uses the fact that $f$ is bounded.
5. Given $f \in \mathcal{L}^{1}(E, \mu)$ and a measurable set $A \subseteq E$, the function $f$ is also Lebesgue integrable on the subset $A$ of $E$. This can be seen from the following inequality.
$$\int_A \lvert f \rvert \,d{\mu} \leq \int_E \lvert f \rvert\,d{\mu} < \infty.$$
6. What happens if we integrate over a set of measure zero? Let $\mu(E) = 0$ and integrate a measurable function $f$. We use the fact that $\min\lbrace \lvert f \rvert, n\rbrace\chi_E$ is measurable and $\min\lbrace \lvert f \rvert, n\rbrace\chi_E \nearrow \lvert f \rvert\chi_E$ as $n \rightarrow\infty$. Applying the monotone convergence theorem,
$$\begin{aligned} \int_E \lvert f \rvert \,d{\mu} &= \lim_ {n \rightarrow\infty} \int_E \min\lbrace \lvert f \rvert, n\rbrace \,d{\mu} \\ &\leq \lim_ {n \rightarrow\infty} \int_E n \,d{\mu} = \lim_ {n \rightarrow\infty} n\mu(E) = 0. \end{aligned}$$
Therefore $f \in \mathcal{L}^{1}(E, \mu)$ and $\displaystyle\int_E f \,d{\mu} = 0$: integrating over a set of measure zero always gives $0$.[^1]
[^1]: Since we defined $0\cdot\infty = 0$ for convenience, this holds even when $f \equiv \infty$.

View File

@@ -1,130 +0,0 @@
---
share: true
toc: true
math: true
categories: [Mathematics, Measure Theory]
tags: [math, analysis, measure-theory]
title: "08. Comparison with the Riemann Integral"
date: "2023-06-20"
github_title: "2023-06-20-comparison-with-riemann-integral"
image:
path: /assets/img/posts/Mathematics/Measure Theory/mt-08.png
attachment:
folder: assets/img/posts/Mathematics/Measure Theory
---
![mt-08.png](/assets/img/posts/Mathematics/Measure%20Theory/mt-08.png)
## Comparison with the Riemann Integral
First, to avoid confusion, for the Lebesgue measure $m$ we write the Lebesgue integral as
$$\int_ {[a, b]} f \,d{m} = \int_ {[a, b]} f \,d{x} = \int_a^b f \,d{x},$$
and the Riemann integral as
$$\mathcal{R}\int_a^b f\,d{x}.$$
**Theorem.** Let $a, b \in \mathbb{R}$ with $a < b$, and let $f$ be bounded.
1. If $f \in \mathcal{R}[a, b]$, then $f \in \mathcal{L}^{1}[a, b]$ and $\displaystyle\int_a^b f\,d{x} = \mathcal{R}\int_a^b f \,d{x}$.
2. $f \in \mathcal{R}[a, b]$ $\iff$ $f$ is continuous $m$-a.e. on $[a, b]$.
In plain terms, (1) says that if $f$ is Riemann integrable on $[a, b]$, then it is also Lebesgue integrable, with the same value; the Lebesgue integral is thus more powerful than the Riemann integral.
Also, (2) gives an equivalent condition for Riemann integrability. Because of the almost-everywhere qualifier, if we consider $\mathcal{L}^1$ equivalence classes, it effectively says that only continuous functions are Riemann integrable.
**Proof.** For each $k \in \mathbb{N}$, choose a partition $P_k = \lbrace a = x_0^k < x_1^k < \cdots < x_ {n_k}^k = b\rbrace$ of $[a, b]$ such that $P_k \subseteq P_ {k+1}$ (refinement) and $\lvert x_ {i}^k - x_ {i-1}^k \rvert < \frac{1}{k}$.
Then, from the definition of the Riemann integral,
$$\lim_ {k \rightarrow\infty} L(P_k, f) = \mathcal{R}\underline{\int_ {a}^{b}} f\,d{x}, \quad \lim_ {k \rightarrow\infty} U(P_k, f) = \mathcal{R} \overline{\int_ {a}^{b}} f \,d{x}.$$
Now define measurable simple functions $U_k, L_k$ as follows.
$$U_k = \sum_ {i=1}^{n_k} \sup_ {x_ {i-1}^k \leq y \leq x_ {i}^k} f(y) \chi_ {(x_ {i-1}^k, x_i^k]}, \quad L_k = \sum_ {i=1}^{n_k} \inf_ {x_ {i-1}^k \leq y \leq x_ {i}^k} f(y) \chi_ {(x_ {i-1}^k, x_i^k]}.$$
Clearly $L_k \leq f \leq U_k$ on $[a, b]$, and since these are Lebesgue integrable,
$$\int_a^b L_k \,d{x} = L(P_k, f), \quad \int_a^b U_k \,d{x} = U(P_k, f).$$
Because the partitions were chosen with $P_k \subseteq P_ {k + 1}$, the $L_k$ form an increasing sequence and the $U_k$ a decreasing sequence.
그러므로
$$L(x) = \lim_ {k \rightarrow\infty} L_k(x), \quad U(x) = \lim_ {k \rightarrow\infty} U_k(x)$$
then both limits exist by monotonicity. Since $f, L_k, U_k$ are all bounded, the dominated convergence theorem gives
$$\int_a^b L \,d{x} = \lim_ {k \rightarrow\infty} \int_a^b L_k \,d{x} = \lim_ {k \rightarrow\infty} L(P_k, f) = \mathcal{R}\underline{\int_ {a}^{b}} f\,d{x} < \infty,$$
$$\int_a^b U\,d{x} = \lim_ {k \rightarrow\infty} \int_a^b U_k \,d{x} = \lim_ {k \rightarrow\infty} U(P_k, f) = \mathcal{R} \overline{\int_ {a}^{b}} f \,d{x} < \infty$$
so $L, U \in \mathcal{L}^{1}[a, b]$.
Combining these facts, if $f \in \mathcal{R}[a, b]$, then
$$\mathcal{R}\underline{\int_ {a}^{b}} f\,d{x} = \mathcal{R}\overline{\int_ {a}^{b}} f\,d{x}$$
so
$$\int_a^b (U - L)\,d{x} = 0$$
and hence $U = L$ $m$-a.e. on $[a, b]$. Conversely, reading the argument backwards shows that if $U = L$ $m$-a.e. on $[a, b]$, then $f \in \mathcal{R}[a, b]$.
(1) By the discussion above, if $f \in \mathcal{R}[a, b]$, then $f = U = L$ a.e. on $[a, b]$, so $f$ is measurable. Moreover,
$$\int_a^b f \,d{x} = \mathcal{R}\int_a^b f\,d{x} < \infty \implies f \in \mathcal{L}^{1}[a, b].$$
(2) Suppose $x \notin \bigcup_ {k=1}^{\infty} P_k$. Given $\epsilon > 0$, for sufficiently large $n \in \mathbb{N}$ there exists $j_0 \in \mathbb{N}$ such that $x \in (x_ {j_0-1}^n, x_ {j_0}^n)$ and
$$\lvert L_n(x) - L(x) \rvert + \lvert U_n(x) - U(x) \rvert < \epsilon$$
Then for $y \in (x_ {j_0-1}^n, x_ {j_0}^n)$, writing $M_ {j_0}^n$ and $m_ {j_0}^n$ for the supremum and infimum of $f$ on this interval,
$$\begin{aligned} \lvert f(x) - f(y) \rvert & \leq M_ {j_0}^n - m_ {j_0}^n = M_ {j_0}^n - U(x) + U(x) - L(x) + L(x) - m_ {j_0}^n \\ & \leq U(x) - L(x) + \epsilon \end{aligned}$$
since $M_ {j_0}^n = U_n(x)$ and $m_ {j_0}^n = L_n(x)$.
By this inequality, if $x \in \lbrace x : U(x) = L(x)\rbrace \setminus\bigcup_ {k=1}^{\infty} P_k$, then $f$ is continuous at $x$.
Therefore, writing $C_f$ for the set of points at which $f$ is continuous, we get
$$\lbrace x : U(x) = L(x)\rbrace \setminus\bigcup_ {k=1}^{\infty} P_k \subseteq C_f \subseteq\lbrace x : U(x) = L(x)\rbrace$$
Since $\bigcup_ {k=1}^{\infty} P_k$ is countable, it has measure zero, so $U = L$ $m$-a.e. is equivalent to $f$ being continuous $m$-a.e. Combining this with the earlier discussion, $f \in \mathcal{R}[a, b]$ if and only if $f$ is continuous $m$-a.e. on $[a, b]$.
The following are byproducts of the proof.
**Remark.**
1. If $x \notin \bigcup_ {k=1}^\infty P_k$, then $f$ is continuous at $x$ $\iff f(x) = U(x) = L(x)$.
2. $L(x) \leq f(x) \leq U(x)$, and $L, U$ are measurable, being pointwise limits of measurable functions.
3. Since $f$ is bounded, it suffices to consider the case $f \geq 0$: if $\lvert f \rvert \leq M$, we may work with $f + M$ in place of $f$.
Now we can borrow useful properties of the Riemann integral.
1. If $f \geq 0$ is measurable, define $f_n = f\chi_ {[0, n]}$. By the monotone convergence theorem,
$$\int_0^\infty f \,d{x} = \lim_ {n \rightarrow\infty} \int_0^\infty f_n \,d{x} = \lim_ {n \rightarrow\infty} \int_0^n f \,d{x}$$
and the last integral can be computed as a Riemann integral.
2. If $f \in \mathcal{R}(I)$ for every closed bounded interval $I \subseteq (0, \infty)$, then $f \in \mathcal{L}^{1}(I)$. Setting $f_n = f\chi_ {[0, n]}$, we have $\lvert f_n \rvert \leq \lvert f \rvert$, so if additionally $f \in \mathcal{L}^{1}(0, \infty)$, the dominated convergence theorem gives
$$\int_0^\infty f \,d{x} = \lim_ {n \rightarrow\infty} \int_0^\infty f_n \,d{x} = \lim_ {n \rightarrow\infty} \int_0^n f \,d{x} = \lim_ {n \rightarrow\infty} \mathcal{R} \int_0^n f \,d{x}$$
holds.
Similarly, taking $f_n = f\chi_ {(1/n, 1)}$, the dominated convergence theorem gives
$$\int_0^1 f\,d{x} = \lim_ {n \rightarrow\infty} \int_ {0}^1 f_n \,d{x} = \lim_ {n \rightarrow\infty}\int_ {1/n}^1 f \,d{x} = \lim_ {n \rightarrow\infty} \mathcal{R}\int_ {1/n}^1 f \,d{x}$$
as well.
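As a purely numerical illustration of this last point (the integrand $f(x) = x^{-1/2}$ and the midpoint-sum routine below are assumptions for the demo, not part of the notes), the proper Riemann integrals over $[1/n, 1]$ increase to the Lebesgue integral over $(0, 1]$, which equals $2$:

```python
# Illustrative: f(x) = x**-0.5 is unbounded near 0, so it is not Riemann
# integrable on [0, 1], but the proper Riemann integrals over [1/n, 1]
# increase to the Lebesgue integral over (0, 1], which is 2.

def riemann_integral(f, a, b, steps=100_000):
    """Midpoint Riemann sum of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

f = lambda x: x ** -0.5

# The exact value over [1/n, 1] is 2 - 2/sqrt(n), increasing to 2.
for n in (4, 100, 10_000):
    print(n, riemann_integral(f, 1 / n, 1.0))
```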

View File

@@ -1,210 +0,0 @@
---
share: true
toc: true
math: true
categories: [Mathematics, Measure Theory]
tags: [math, analysis, measure-theory]
title: "09. $\\mathcal{L}^p$ Functions"
date: "2023-07-31"
github_title: "2023-07-31-Lp-functions"
image:
path: /assets/img/posts/Mathematics/Measure Theory/mt-09.png
attachment:
folder: assets/img/posts/Mathematics/Measure Theory
---
![mt-09.png](/assets/img/posts/Mathematics/Measure%20Theory/mt-09.png){: .w-50}
## Integration on Complex Valued Function
Let $(X, \mathscr{F}, \mu)$ be a measure space, and $E \in \mathscr{F}$.
**Definition.**
1. A complex-valued function $f = u + iv$ (where $u, v$ are real functions) is measurable if $u$ and $v$ are both measurable.
2. For a complex function $f$,
$$f \in \mathcal{L}^{1}(E, \mu) \iff \int_E \left\lvert f \right\rvert \,d{\mu} < \infty \iff u, v \in \mathcal{L}^{1}(E, \mu).$$
3. If $f = u + iv \in \mathcal{L}^{1}(E, \mu)$, we define
$$\int_E f \,d{\mu} = \int_E u \,d{\mu} + i\int_E v \,d{\mu}.$$
**Remark.**
1. Linearity also holds for complex valued functions. For $f_1, f_2 \in \mathcal{L}^{1}(\mu)$ and $\alpha \in \mathbb{C}$,
$$\int_E \left( f_1 + \alpha f_2 \right) \,d{\mu} = \int_E f_1 \,d{\mu} + \alpha \int_E f_2 \,d{\mu}.$$
2. Choose $c \in \mathbb{C}$ with $\left\lvert c \right\rvert = 1$ such that $\displaystyle c \int_E f \,d{\mu} \geq 0$. This is possible since multiplying by $c$ is equivalent to a rotation.
Now write $cf = u + iv$ where $u, v$ are real functions; since $c\int_E f \,d{\mu}$ is real, the integral of $v$ over $E$ is $0$. Then,
$$\begin{aligned} \left\lvert \int_E f \,d{\mu} \right\rvert & = c \int_E f\,d{\mu} = \int_E u \,d{\mu} \\ & \leq \int_E (u^2+v^2)^{1/2} \,d{\mu} \\ & = \int_E \left\lvert cf \right\rvert \,d{\mu} = \int_E \left\lvert f \right\rvert \,d{\mu}. \end{aligned}$$
## Functions of Class $\mathcal{L}^{p}$
### $\mathcal{L}^p$ Space
Assume that $(X, \mathscr{F}, \mu)$ is given and $X = E$.
**Definition.** ($\mathcal{L}^{p}$) A complex function $f$ is in $\mathcal{L}^{p}(\mu)$ if $f$ is measurable and $\displaystyle\int_E \left\lvert f \right\rvert ^p \,d{\mu} < \infty$.
**Definition.** ($\mathcal{L}^{p}$-norm) The **$\mathcal{L}^{p}$-norm** of $f$ is defined as
$$\left\lVert f \right\rVert_p = \left[\int_E \left\lvert f \right\rvert ^p \,d{\mu} \right]^{1/p}.$$
### Inequalities
**Theorem.** (Young Inequality) For $a, b \geq 0$, if $p > 1$ and $1/p + 1/q = 1$, then
$$ab \leq \frac{a^p}{p} + \frac{b^q}{q}.$$
**Proof.** From $1/p + 1/q = 1$, $p - 1 = \frac{1}{q - 1}$, so the graph of $y = x^{p - 1}$ coincides with that of $x = y^{q - 1}$. Sketch the graph on the $xy$-plane and consider the area bounded by $x = 0$, $x = a$, $y = 0$, $y = b$. Then we directly see that
$$\int_0^a x^{p-1} \,d{x} + \int_0^b y^{q-1} \,d{y} \geq ab,$$
with equality when $a^p = b^q$. Evaluating the integral gives the desired inequality.
**Remark.** For $\mathscr{F}$-measurable $f, g$ on $X$,
$$\left\lvert fg \right\rvert \leq \frac{\left\lvert f \right\rvert ^p}{p} + \frac{\left\lvert g \right\rvert ^q}{q} \implies \left\lVert fg \right\rVert_1 \leq \frac{\left\lVert f \right\rVert_p^p}{p} + \frac{\left\lVert g \right\rVert_q^q}{q}$$
by Young inequality. In particular, if $\left\lVert f \right\rVert_p = \left\lVert g \right\rVert_q = 1$, then $\left\lVert fg \right\rVert_1 \leq 1$.
**Theorem.** (Hölder Inequality) Let $1 < p < \infty$ and $\displaystyle\frac{1}{p} + \frac{1}{q} = 1$. If $f, g$ are measurable,
$$\left\lVert fg \right\rVert_1 \leq \left\lVert f \right\rVert_p \left\lVert g \right\rVert_q.$$
So if $f \in \mathcal{L}^{p}(\mu)$ and $g \in \mathcal{L}^{q}(\mu)$, then $fg \in \mathcal{L}^{1}(\mu)$.
**Proof.** If $\left\lVert f \right\rVert_p = 0$ or $\left\lVert g \right\rVert_q = 0$ then $f = 0$ a.e. or $g = 0$ a.e. So $fg = 0$ a.e. and $\left\lVert fg \right\rVert_1 = 0$.
Now suppose that $\left\lVert f \right\rVert_p > 0$ and $\left\lVert g \right\rVert_q > 0$. By the remark above, the result directly follows from
$$\left\lVert \frac{f}{\left\lVert f \right\rVert_p} \cdot \frac{g}{\left\lVert g \right\rVert_q} \right\rVert_1 \leq 1.$$
**Theorem.** (Minkowski Inequality) For $1 \leq p < \infty$, if $f, g$ are measurable, then
$$\left\lVert f + g \right\rVert_p \leq \left\lVert f \right\rVert_p + \left\lVert g \right\rVert_p.$$
**Proof.** If $f \notin \mathcal{L}^{p}$ or $g \notin \mathcal{L}^{p}$, the right-hand side is $\infty$ and we are done. For $p = 1$, the inequality is just the triangle inequality. Also, if $\left\lVert f + g \right\rVert_p = 0$, the inequality holds trivially. So suppose that $p > 1$, $f, g \in \mathcal{L}^p$, and $\left\lVert f+g \right\rVert_p > 0$.
Let $q = \frac{p}{p-1}$. Since
$$\begin{aligned} \left\lvert f + g \right\rvert ^p & = \left\lvert f + g \right\rvert \cdot \left\lvert f + g \right\rvert ^{p - 1} \\ & \leq \bigl(\left\lvert f \right\rvert + \left\lvert g \right\rvert \bigr) \left\lvert f + g \right\rvert ^{p-1}, \end{aligned}$$
we have
$$\begin{aligned} \int \left\lvert f+g \right\rvert ^p & \leq \int \left\lvert f \right\rvert \cdot \left\lvert f+g \right\rvert ^{p-1} + \int \left\lvert g \right\rvert \cdot \left\lvert f+g \right\rvert ^{p-1} \\ & \leq \left( \int \left\lvert f \right\rvert ^p \right)^{1/p}\left( \int \left\lvert f+g \right\rvert ^{(p-1)q} \right)^{1/q} \\ & \quad + \left( \int \left\lvert g \right\rvert ^p \right)^{1/p}\left( \int \left\lvert f+g \right\rvert ^{(p-1)q} \right)^{1/q} \\ & = \left( \left\lVert f \right\rVert_p + \left\lVert g \right\rVert_p \right) \left( \int \left\lvert f+g \right\rvert ^p \right)^{1/q}. \end{aligned}$$
Since $\left\lVert f + g \right\rVert_p^p > 0$, we have
$$\begin{aligned} \left\lVert f + g \right\rVert_p & = \left( \int \left\lvert f+g \right\rvert ^p \right)^{1/p} \\ & = \left( \int \left\lvert f+g \right\rvert ^p \right)^{1 - \frac{1}{q}} \\ & \leq \left\lVert f \right\rVert_p + \left\lVert g \right\rVert_p. \end{aligned}$$
**Definition.** Define $f \sim g \iff f = g$ $\mu$-a.e. and let
$$[f] = \left\lbrace g : f \sim g\right\rbrace.$$
We treat $[f]$ as an element in $\mathcal{L}^{p}(X, \mu)$, and write $f = [f]$.
**Remark.**
1. We write $\left\lVert f \right\rVert_p = 0 \iff f = [0] = 0$ in the sense that $f = 0$ $\mu$-a.e.
2. Now $\lVert \cdot \rVert_p$ is a **norm** in $\mathcal{L}^{p}(X, \mu)$ so $d(f, g) = \left\lVert f - g \right\rVert_p$ is a **metric** in $\mathcal{L}^{p}(X, \mu)$.
## Completeness of $\mathcal{L}^p$
Now we have a *function space*, so we are interested in its *completeness*.
**Definition.** (Convergence in $\mathcal{L}^p$) Let $f, f_n \in \mathcal{L}^{p}(\mu)$.
1. $f_n \rightarrow f$ in $\mathcal{L}^p(\mu) \iff \left\lVert f_n-f \right\rVert_p \rightarrow 0$ as $n \rightarrow\infty$.
2. $\left( f_n \right)_{n=1}^\infty$ is a Cauchy sequence in $\mathcal{L}^{p}(\mu)$ if and only if
> $\forall \epsilon > 0$, $\exists\,N > 0$ such that $n, m \geq N \implies \left\lVert f_n-f_m \right\rVert_p < \epsilon$.
**Lemma.** Let $\left( g_n \right)$ be a sequence of measurable functions. Then,
$$\left\lVert \sum_{n=1}^{\infty} \left\lvert g_n \right\rvert \right\rVert_p \leq \sum_{n=1}^{\infty} \left\lVert g_n \right\rVert_p.$$
Thus, if $\displaystyle\sum_{n=1}^{\infty} \left\lVert g_n \right\rVert_p < \infty$, then $\displaystyle\sum_{n=1}^{\infty} \left\lvert g_n \right\rvert < \infty$ $\mu$-a.e., so $\displaystyle\sum_{n=1}^{\infty} g_n$ converges $\mu$-a.e.
**Proof.** By the monotone convergence theorem and the Minkowski inequality,
$$\begin{aligned} \left\lVert \sum_{n=1}^{\infty} \left\lvert g_n \right\rvert \right\rVert_p & = \lim_{m \rightarrow\infty} \left\lVert \sum_{n=1}^{m} \left\lvert g_n \right\rvert \right\rVert_p \\ & \leq \lim_{m \rightarrow\infty} \sum_{n=1}^{m} \left\lVert g_n \right\rVert_p \\ & = \sum_{n=1}^{\infty} \left\lVert g_n \right\rVert_p < \infty. \end{aligned}$$
Thus $\displaystyle\sum_{n=1}^{\infty} \left\lvert g_n \right\rvert < \infty$ $\mu$-a.e., and $\displaystyle\sum_{n=1}^{\infty} g_n$ converges $\mu$-a.e. by absolute convergence.
**Theorem.** (Fischer) Suppose $\left( f_n \right)$ is a Cauchy sequence in $\mathcal{L}^{p}(\mu)$. Then there exists $f \in \mathcal{L}^{p}(\mu)$ such that $f_n \rightarrow f$ in $\mathcal{L}^{p}(\mu)$.
**Proof.** We construct $\left( n_k \right)$ by the following procedure.
$\exists\,n_1 \in \mathbb{N}$ such that $\left\lVert f_m - f_{n_1} \right\rVert_p < \frac{1}{2}$ for all $m \geq n_1$.
$\exists\,n_2 \in \mathbb{N}$ such that $\left\lVert f_m - f_{n_2} \right\rVert_p < \frac{1}{2^2}$ for all $m \geq n_2$.
Continuing in this way, we obtain $1 \leq n_1 < n_2 < \cdots$ such that $\left\lVert f_m - f_{n_k} \right\rVert_p < \frac{1}{2^k}$ for all $m \geq n_k$.
Since $\displaystyle\left\lVert f_{n_{k+1}} - f_{n_k} \right\rVert_p < \frac{1}{2^k}$, we have
$$\sum_{k=1}^{\infty} \left\lVert f_{n_{k+1}} - f_{n_k} \right\rVert_p < \infty.$$
By the above lemma, $\sum \left\lvert f_{n_{k+1}} - f_{n_k} \right\rvert$ converges $\mu$-a.e., hence so does $\sum (f_{n_{k+1}} - f_{n_k})$. Let $f_{n_0} \equiv 0$. Then as $m \rightarrow\infty$,
$$f_{n_{m+1}} = \sum_{k=0}^{m} \left( f_{n_{k+1}} - f_{n_k} \right)$$
converges $\mu$-a.e. Take $N \in \mathscr{F}$ with $\mu(N) = 0$ such that $f_{n_k}$ converges on $X \setminus N$. Let
$$f(x) = \begin{cases} \displaystyle\lim_{k \rightarrow\infty} f_{n_k} (x) & (x \in X \setminus N) \\ 0 & (x\in N) \end{cases}$$
then $f$ is measurable. Using the convergence,
$$\begin{aligned} \left\lVert f - f_{n_m} \right\rVert_p & = \left\lVert \sum_{k=m}^{\infty} \left( f_{n_{k+1}} (x) - f_{n_k}(x) \right) \right\rVert_p \\ & \leq \left\lVert \sum_{k=m}^{\infty} \left\lvert f_{n_{k+1}} (x) - f_{n_k}(x) \right\rvert \right\rVert_p \\ & \leq \sum_{k=m}^{\infty} \left\lVert f_{n_{k+1}} - f_{n_k} \right\rVert_p \leq 2^{-m} \end{aligned}$$
by the choice of $f_{n_k}$. So $f_{n_k} \rightarrow f$ in $\mathcal{L}^{p}(\mu)$. Also, $f = (f - f_{n_k}) + f_{n_k} \in \mathcal{L}^{p}(\mu)$.
Let $\epsilon > 0$ be given. Since $\left( f_n \right)$ is a Cauchy sequence in $\mathcal{L}^{p}$, $\exists\,N \in \mathbb{N}$ such that for all $n, m \geq N$, $\left\lVert f_n - f_m \right\rVert_p < \frac{\epsilon}{2}$. Note that $n_k \geq k$, so $n_k \geq N$ if $k \geq N$. Choose $N_1 \geq N$ such that for $k \geq N_1$, $\left\lVert f - f_{n_k} \right\rVert_p < \frac{\epsilon}{2}$. Then for all $k \geq N_1$,
$$\left\lVert f - f_k \right\rVert_p \leq \left\lVert f - f_{n_k} \right\rVert_p + \left\lVert f_{n_k} - f_k \right\rVert_p < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$
**Remark.** $\mathcal{L}^{p}$ is a complete normed vector space, that is, a **Banach space**.
**Theorem.** $C[a, b]$ is a dense subset of $\mathcal{L}^{p}[a, b]$. That is, for every $f \in \mathcal{L}^{p}[a, b]$ and $\epsilon > 0$, $\exists\,g \in C[a, b]$ such that $\left\lVert f - g \right\rVert_p < \epsilon$.
**Proof.** Let $A$ be a closed subset of $[a, b]$, and consider the distance function
$$d(x, A) = \inf_{y\in A} \left\lvert x - y \right\rvert , \quad x \in [a, b].$$
Since $d(x, A) \leq \left\lvert x - z \right\rvert \leq \left\lvert x - y \right\rvert + \left\lvert y - z \right\rvert$ for all $z \in A$, taking infimum over $z \in A$ gives $d(x, A) \leq \left\lvert x - y \right\rvert + d(y, A)$. So
$$\left\lvert d(x, A) - d(y, A) \right\rvert \leq \left\lvert x - y \right\rvert ,$$
and $d(x, A)$ is continuous. If $d(x, A) = 0$, $\exists\,x_n \in A$ such that $\left\lvert x_n - x \right\rvert \rightarrow d(x, A) = 0$. Since $A$ is closed, $x \in A$. We know that $x \in A \iff d(x, A) = 0$.
Let
$$g_n(x) = \frac{1}{1 + n d(x, A)}.$$
Each $g_n$ is continuous, and $g_n(x) = 1$ if and only if $x \in A$. Also, for every $x \in [a, b] \setminus A$, $g_n(x) \rightarrow 0$ as $n \rightarrow\infty$. By Lebesgue's dominated convergence theorem,
$$\begin{aligned} \left\lVert g_n - \chi_A \right\rVert_p^p & = \int_A \left\lvert g_n - \chi_A \right\rvert ^p \,d{x} + \int_{[a, b]\setminus A} \left\lvert g_n - \chi_A \right\rvert ^p \,d{x} \\ & = 0 + \int_{[a, b]\setminus A} \left\lvert g_n \right\rvert ^p \,d{x} \rightarrow 0 \end{aligned}$$
since $\left\lvert g_n \right\rvert ^p \leq 1$. We have shown that characteristic functions of closed sets can be approximated by continuous functions in $\mathcal{L}^{p}[a, b]$.
For every $A \in \mathfrak{M}(m)$ and $\epsilon > 0$, there exists a closed set $F \subseteq A$ such that $m(A \setminus F) < \epsilon$. Since $\chi_A - \chi_F = \chi_{A \setminus F}$,
$$\begin{aligned} \int \left\lvert \chi_A-\chi_F \right\rvert ^p \,d{x} & = \int \left\lvert \chi_{A\setminus F} \right\rvert ^p \,d{x} \\ & = \int_{A\setminus F} \,d{x} = m(A \setminus F) < \epsilon. \end{aligned}$$
Therefore, for every $A \in \mathfrak{M}$, $\exists\,g_n \in C[a, b]$ such that $\left\lVert g_n - \chi_A \right\rVert_p \rightarrow 0$ as $n \rightarrow\infty$. So characteristic functions of any measurable set can be approximated by continuous functions in $\mathcal{L}^{p}[a, b]$.
Next, for any measurable simple function $f = \sum_{k=1}^{m}a_k \chi_{A_k}$, we can find $g_n^k \in C[a, b]$ so that
$$\left\lVert f - \sum_{k=1}^{m} a_k g_n^k \right\rVert_p = \left\lVert \sum_{k=1}^{m}a_k \left( \chi_{A_k} - g_n^k \right) \right\rVert_p \rightarrow 0.$$
Next, for $f \in \mathcal{L}^{p}$ with $f \geq 0$, there exist simple functions $f_n \geq 0$ such that $f_n \nearrow f$ pointwise, and then $f_n \rightarrow f$ in $\mathcal{L}^{p}$ by the dominated convergence theorem. Finally, any real $f \in \mathcal{L}^{p}$ can be written as $f = f^+ - f^-$, which completes the proof.
Viewed this way, the extension is quite routine: we extend in the order $\chi_F$ for closed $F$ $\rightarrow$ $\chi_A$ for measurable $A$ $\rightarrow$ measurable simple $f$ $\rightarrow$ $0\leq f \in \mathcal{L}^{p} \rightarrow$ $f \in \mathcal{L}^{p}$.

View File

@@ -5,6 +5,7 @@ math: true
categories:
- Mathematics
- Coq
path: _posts/mathematics/coq
tags:
- math
- coq

View File

@@ -0,0 +1,216 @@
---
share: true
toc: true
math: true
categories:
- Mathematics
- Measure Theory
path: _posts/mathematics/measure-theory
tags:
- math
- analysis
- measure-theory
title: 09. $\mathcal{L}^p$ Functions
date: 2023-07-31
github_title: 2023-07-31-Lp-functions
image:
path: /assets/img/posts/mathematics/measure-theory/mt-09.png
attachment:
folder: assets/img/posts/mathematics/measure-theory
---
![mt-09.png](../../../assets/img/posts/mathematics/measure-theory/mt-09.png){: .w-50}
## Integration on Complex Valued Function
Let $(X, \mathscr{F}, \mu)$ be a measure space, and $E \in \mathscr{F}$.
**Definition.**
1. A complex-valued function $f = u + iv$ (where $u, v$ are real functions) is measurable if $u$ and $v$ are both measurable.
2. For a complex function $f$,
$$f \in \mathcal{L}^{1}(E, \mu) \iff \int _ E \left\lvert f \right\rvert \,d{\mu} < \infty \iff u, v \in \mathcal{L}^{1}(E, \mu).$$
3. If $f = u + iv \in \mathcal{L}^{1}(E, \mu)$, we define
$$\int _ E f \,d{\mu} = \int _ E u \,d{\mu} + i\int _ E v \,d{\mu}.$$
**Remark.**
1. Linearity also holds for complex valued functions. For $f _ 1, f _ 2 \in \mathcal{L}^{1}(\mu)$ and $\alpha \in \mathbb{C}$,
$$\int _ E \left( f _ 1 + \alpha f _ 2 \right) \,d{\mu} = \int _ E f _ 1 \,d{\mu} + \alpha \int _ E f _ 2 \,d{\mu}.$$
2. Choose $c \in \mathbb{C}$ with $\left\lvert c \right\rvert = 1$ such that $\displaystyle c \int _ E f \,d{\mu} \geq 0$. This is possible since multiplying by $c$ is equivalent to a rotation.
Now write $cf = u + iv$ where $u, v$ are real functions; since $c\int _ E f \,d{\mu}$ is real, the integral of $v$ over $E$ is $0$. Then,
$$\begin{aligned} \left\lvert \int _ E f \,d{\mu} \right\rvert & = c \int _ E f\,d{\mu} = \int _ E u \,d{\mu} \\ & \leq \int _ E (u^2+v^2)^{1/2} \,d{\mu} \\ & = \int _ E \left\lvert cf \right\rvert \,d{\mu} = \int _ E \left\lvert f \right\rvert \,d{\mu}. \end{aligned}$$
## Functions of Class $\mathcal{L}^{p}$
### $\mathcal{L}^p$ Space
Assume that $(X, \mathscr{F}, \mu)$ is given and $X = E$.
**Definition.** ($\mathcal{L}^{p}$) A complex function $f$ is in $\mathcal{L}^{p}(\mu)$ if $f$ is measurable and $\displaystyle\int _ E \left\lvert f \right\rvert ^p \,d{\mu} < \infty$.
**Definition.** ($\mathcal{L}^{p}$-norm) The **$\mathcal{L}^{p}$-norm** of $f$ is defined as
$$\left\lVert f \right\rVert _ p = \left[\int _ E \left\lvert f \right\rvert ^p \,d{\mu} \right]^{1/p}.$$
### Inequalities
**Theorem.** (Young Inequality) For $a, b \geq 0$, if $p > 1$ and $1/p + 1/q = 1$, then
$$ab \leq \frac{a^p}{p} + \frac{b^q}{q}.$$
**Proof.** From $1/p + 1/q = 1$, $p - 1 = \frac{1}{q - 1}$, so the graph of $y = x^{p - 1}$ coincides with that of $x = y^{q - 1}$. Sketch the graph on the $xy$-plane and consider the area bounded by $x = 0$, $x = a$, $y = 0$, $y = b$. Then we directly see that
$$\int _ 0^a x^{p-1} \,d{x} + \int _ 0^b y^{q-1} \,d{y} \geq ab,$$
with equality when $a^p = b^q$. Evaluating the integral gives the desired inequality.
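A quick numerical sanity check of this inequality, in the spirit of a sketch (the grid of test values is an arbitrary choice for the demo):

```python
# Check ab <= a^p/p + b^q/q for conjugate exponents 1/p + 1/q = 1,
# with equality exactly when a^p == b^q.

def young_gap(a, b, p):
    q = p / (p - 1)  # conjugate exponent
    return a ** p / p + b ** q / q - a * b

for p in (1.5, 2.0, 3.0):
    for a in (0.0, 0.5, 1.0, 2.0, 5.0):
        for b in (0.0, 0.5, 1.0, 2.0, 5.0):
            assert young_gap(a, b, p) >= -1e-9
    # equality case: b = a**(p-1) makes a^p equal b^q
    assert abs(young_gap(2.0, 2.0 ** (p - 1), p)) < 1e-9
```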
**Remark.** For $\mathscr{F}$-measurable $f, g$ on $X$,
$$\left\lvert fg \right\rvert \leq \frac{\left\lvert f \right\rvert ^p}{p} + \frac{\left\lvert g \right\rvert ^q}{q} \implies \left\lVert fg \right\rVert _ 1 \leq \frac{\left\lVert f \right\rVert _ p^p}{p} + \frac{\left\lVert g \right\rVert _ q^q}{q}$$
by Young inequality. In particular, if $\left\lVert f \right\rVert _ p = \left\lVert g \right\rVert _ q = 1$, then $\left\lVert fg \right\rVert _ 1 \leq 1$.
**Theorem.** (Hölder Inequality) Let $1 < p < \infty$ and $\displaystyle\frac{1}{p} + \frac{1}{q} = 1$. If $f, g$ are measurable,
$$\left\lVert fg \right\rVert _ 1 \leq \left\lVert f \right\rVert _ p \left\lVert g \right\rVert _ q.$$
So if $f \in \mathcal{L}^{p}(\mu)$ and $g \in \mathcal{L}^{q}(\mu)$, then $fg \in \mathcal{L}^{1}(\mu)$.
**Proof.** If $\left\lVert f \right\rVert _ p = 0$ or $\left\lVert g \right\rVert _ q = 0$ then $f = 0$ a.e. or $g = 0$ a.e. So $fg = 0$ a.e. and $\left\lVert fg \right\rVert _ 1 = 0$.
Now suppose that $\left\lVert f \right\rVert _ p > 0$ and $\left\lVert g \right\rVert _ q > 0$. By the remark above, the result directly follows from
$$\left\lVert \frac{f}{\left\lVert f \right\rVert _ p} \cdot \frac{g}{\left\lVert g \right\rVert _ q} \right\rVert _ 1 \leq 1.$$
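For the counting measure on a finite set, the inequality specializes to finite sums; a small randomized check (illustrative only; the vector length, seed, and exponent are arbitrary choices):

```python
# Hölder for the counting measure on {1, ..., n}:
#   sum |f_i g_i| <= (sum |f_i|^p)^(1/p) * (sum |g_i|^q)^(1/q)
import random

def lp_norm(v, p):
    return sum(abs(x) ** p for x in v) ** (1 / p)

random.seed(0)
p = 3.0
q = p / (p - 1)  # conjugate exponent
f = [random.uniform(-1, 1) for _ in range(100)]
g = [random.uniform(-1, 1) for _ in range(100)]
lhs = sum(abs(a * b) for a, b in zip(f, g))
assert lhs <= lp_norm(f, p) * lp_norm(g, q) + 1e-12
```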
**Theorem.** (Minkowski Inequality) For $1 \leq p < \infty$, if $f, g$ are measurable, then
$$\left\lVert f + g \right\rVert _ p \leq \left\lVert f \right\rVert _ p + \left\lVert g \right\rVert _ p.$$
**Proof.** If $f \notin \mathcal{L}^{p}$ or $g \notin \mathcal{L}^{p}$, the right-hand side is $\infty$ and we are done. For $p = 1$, the inequality is just the triangle inequality. Also, if $\left\lVert f + g \right\rVert _ p = 0$, the inequality holds trivially. So suppose that $p > 1$, $f, g \in \mathcal{L}^p$, and $\left\lVert f+g \right\rVert _ p > 0$.
Let $q = \frac{p}{p-1}$. Since
$$\begin{aligned} \left\lvert f + g \right\rvert ^p & = \left\lvert f + g \right\rvert \cdot \left\lvert f + g \right\rvert ^{p - 1} \\ & \leq \bigl(\left\lvert f \right\rvert + \left\lvert g \right\rvert \bigr) \left\lvert f + g \right\rvert ^{p-1}, \end{aligned}$$
we have
$$\begin{aligned} \int \left\lvert f+g \right\rvert ^p & \leq \int \left\lvert f \right\rvert \cdot \left\lvert f+g \right\rvert ^{p-1} + \int \left\lvert g \right\rvert \cdot \left\lvert f+g \right\rvert ^{p-1} \\ & \leq \left( \int \left\lvert f \right\rvert ^p \right)^{1/p}\left( \int \left\lvert f+g \right\rvert ^{(p-1)q} \right)^{1/q} \\ & \quad + \left( \int \left\lvert g \right\rvert ^p \right)^{1/p}\left( \int \left\lvert f+g \right\rvert ^{(p-1)q} \right)^{1/q} \\ & = \left( \left\lVert f \right\rVert _ p + \left\lVert g \right\rVert _ p \right) \left( \int \left\lvert f+g \right\rvert ^p \right)^{1/q}. \end{aligned}$$
Since $\left\lVert f + g \right\rVert _ p^p > 0$, we have
$$\begin{aligned} \left\lVert f + g \right\rVert _ p & = \left( \int \left\lvert f+g \right\rvert ^p \right)^{1/p} \\ & = \left( \int \left\lvert f+g \right\rvert ^p \right)^{1 - \frac{1}{q}} \\ & \leq \left\lVert f \right\rVert _ p + \left\lVert g \right\rVert _ p. \end{aligned}$$
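Specializing to the counting measure again, a small randomized check of the triangle inequality for $\lVert \cdot \rVert _ p$ (the test data is an arbitrary choice for the demo):

```python
# Minkowski for the counting measure: ||f + g||_p <= ||f||_p + ||g||_p.
import random

def lp_norm(v, p):
    return sum(abs(x) ** p for x in v) ** (1 / p)

random.seed(1)
for p in (1.0, 1.5, 2.0, 4.0):
    f = [random.gauss(0, 1) for _ in range(200)]
    g = [random.gauss(0, 1) for _ in range(200)]
    s = [a + b for a, b in zip(f, g)]
    assert lp_norm(s, p) <= lp_norm(f, p) + lp_norm(g, p) + 1e-12
```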
**Definition.** Define $f \sim g \iff f = g$ $\mu$-a.e. and let
$$[f] = \left\lbrace g : f \sim g\right\rbrace.$$
We treat $[f]$ as an element in $\mathcal{L}^{p}(X, \mu)$, and write $f = [f]$.
**Remark.**
1. We write $\left\lVert f \right\rVert _ p = 0 \iff f = [0] = 0$ in the sense that $f = 0$ $\mu$-a.e.
2. Now $\lVert \cdot \rVert _ p$ is a **norm** in $\mathcal{L}^{p}(X, \mu)$ so $d(f, g) = \left\lVert f - g \right\rVert _ p$ is a **metric** in $\mathcal{L}^{p}(X, \mu)$.
## Completeness of $\mathcal{L}^p$
Now we have a *function space*, so we are interested in its *completeness*.
**Definition.** (Convergence in $\mathcal{L}^p$) Let $f, f _ n \in \mathcal{L}^{p}(\mu)$.
1. $f _ n \rightarrow f$ in $\mathcal{L}^p(\mu) \iff \left\lVert f _ n-f \right\rVert _ p \rightarrow 0$ as $n \rightarrow\infty$.
2. $\left( f _ n \right) _ {n=1}^\infty$ is a Cauchy sequence in $\mathcal{L}^{p}(\mu)$ if and only if
> $\forall \epsilon > 0$, $\exists\,N > 0$ such that $n, m \geq N \implies \left\lVert f _ n-f _ m \right\rVert _ p < \epsilon$.
**Lemma.** Let $\left( g _ n \right)$ be a sequence of measurable functions. Then,
$$\left\lVert \sum _ {n=1}^{\infty} \left\lvert g _ n \right\rvert \right\rVert _ p \leq \sum _ {n=1}^{\infty} \left\lVert g _ n \right\rVert _ p.$$
Thus, if $\displaystyle\sum _ {n=1}^{\infty} \left\lVert g _ n \right\rVert _ p < \infty$, then $\displaystyle\sum _ {n=1}^{\infty} \left\lvert g _ n \right\rvert < \infty$ $\mu$-a.e., so $\displaystyle\sum _ {n=1}^{\infty} g _ n$ converges $\mu$-a.e.
**Proof.** By the monotone convergence theorem and the Minkowski inequality,
$$\begin{aligned} \left\lVert \sum _ {n=1}^{\infty} \left\lvert g _ n \right\rvert \right\rVert _ p & = \lim _ {m \rightarrow\infty} \left\lVert \sum _ {n=1}^{m} \left\lvert g _ n \right\rvert \right\rVert _ p \\ & \leq \lim _ {m \rightarrow\infty} \sum _ {n=1}^{m} \left\lVert g _ n \right\rVert _ p \\ & = \sum _ {n=1}^{\infty} \left\lVert g _ n \right\rVert _ p < \infty. \end{aligned}$$
Thus $\displaystyle\sum _ {n=1}^{\infty} \left\lvert g _ n \right\rvert < \infty$ $\mu$-a.e., and $\displaystyle\sum _ {n=1}^{\infty} g _ n$ converges $\mu$-a.e. by absolute convergence.
**Theorem.** (Fischer) Suppose $\left( f _ n \right)$ is a Cauchy sequence in $\mathcal{L}^{p}(\mu)$. Then there exists $f \in \mathcal{L}^{p}(\mu)$ such that $f _ n \rightarrow f$ in $\mathcal{L}^{p}(\mu)$.
**Proof.** We construct $\left( n _ k \right)$ by the following procedure.
$\exists\,n _ 1 \in \mathbb{N}$ such that $\left\lVert f _ m - f _ {n _ 1} \right\rVert _ p < \frac{1}{2}$ for all $m \geq n _ 1$.
$\exists\,n _ 2 \in \mathbb{N}$ such that $\left\lVert f _ m - f _ {n _ 2} \right\rVert _ p < \frac{1}{2^2}$ for all $m \geq n _ 2$.
Continuing in this way, we obtain $1 \leq n _ 1 < n _ 2 < \cdots$ such that $\left\lVert f _ m - f _ {n _ k} \right\rVert _ p < \frac{1}{2^k}$ for all $m \geq n _ k$.
Since $\displaystyle\left\lVert f _ {n _ {k+1}} - f _ {n _ k} \right\rVert _ p < \frac{1}{2^k}$, we have
$$\sum _ {k=1}^{\infty} \left\lVert f _ {n _ {k+1}} - f _ {n _ k} \right\rVert _ p < \infty.$$
By the above lemma, $\sum \left\lvert f _ {n _ {k+1}} - f _ {n _ k} \right\rvert$ converges $\mu$-a.e., hence so does $\sum (f _ {n _ {k+1}} - f _ {n _ k})$. Let $f _ {n _ 0} \equiv 0$. Then as $m \rightarrow\infty$,
$$f _ {n _ {m+1}} = \sum _ {k=0}^{m} \left( f _ {n _ {k+1}} - f _ {n _ k} \right)$$
converges $\mu$-a.e. Take $N \in \mathscr{F}$ with $\mu(N) = 0$ such that $f _ {n _ k}$ converges on $X \setminus N$. Let
$$f(x) = \begin{cases} \displaystyle\lim _ {k \rightarrow\infty} f _ {n _ k} (x) & (x \in X \setminus N) \\ 0 & (x\in N) \end{cases}$$
then $f$ is measurable. Using the convergence,
$$\begin{aligned} \left\lVert f - f _ {n _ m} \right\rVert _ p & = \left\lVert \sum _ {k=m}^{\infty} \left( f _ {n _ {k+1}} (x) - f _ {n _ k}(x) \right) \right\rVert _ p \\ & \leq \left\lVert \sum _ {k=m}^{\infty} \left\lvert f _ {n _ {k+1}} (x) - f _ {n _ k}(x) \right\rvert \right\rVert _ p \\ & \leq \sum _ {k=m}^{\infty} \left\lVert f _ {n _ {k+1}} - f _ {n _ k} \right\rVert _ p \leq 2^{-m} \end{aligned}$$
by the choice of $f _ {n _ k}$. So $f _ {n _ k} \rightarrow f$ in $\mathcal{L}^{p}(\mu)$. Also, $f = (f - f _ {n _ k}) + f _ {n _ k} \in \mathcal{L}^{p}(\mu)$.
Let $\epsilon > 0$ be given. Since $\left( f _ n \right)$ is a Cauchy sequence in $\mathcal{L}^{p}$, $\exists\,N \in \mathbb{N}$ such that for all $n, m \geq N$, $\left\lVert f _ n - f _ m \right\rVert _ p < \frac{\epsilon}{2}$. Note that $n _ k \geq k$, so $n _ k \geq N$ if $k \geq N$. Choose $N _ 1 \geq N$ such that for $k \geq N _ 1$, $\left\lVert f - f _ {n _ k} \right\rVert _ p < \frac{\epsilon}{2}$. Then for all $k \geq N _ 1$,
$$\left\lVert f - f _ k \right\rVert _ p \leq \left\lVert f - f _ {n _ k} \right\rVert _ p + \left\lVert f _ {n _ k} - f _ k \right\rVert _ p < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon.$$
**Remark.** $\mathcal{L}^{p}$ is a complete normed vector space, that is, a **Banach space**.
**Theorem.** $C[a, b]$ is a dense subset of $\mathcal{L}^{p}[a, b]$. That is, for every $f \in \mathcal{L}^{p}[a, b]$ and $\epsilon > 0$, $\exists\,g \in C[a, b]$ such that $\left\lVert f - g \right\rVert _ p < \epsilon$.
**Proof.** Let $A$ be a closed subset of $[a, b]$, and consider the distance function
$$d(x, A) = \inf _ {y\in A} \left\lvert x - y \right\rvert , \quad x \in [a, b].$$
Since $d(x, A) \leq \left\lvert x - z \right\rvert \leq \left\lvert x - y \right\rvert + \left\lvert y - z \right\rvert$ for all $z \in A$, taking infimum over $z \in A$ gives $d(x, A) \leq \left\lvert x - y \right\rvert + d(y, A)$. So
$$\left\lvert d(x, A) - d(y, A) \right\rvert \leq \left\lvert x - y \right\rvert ,$$
and $d(x, A)$ is continuous. If $d(x, A) = 0$, $\exists\,x _ n \in A$ such that $\left\lvert x _ n - x \right\rvert \rightarrow d(x, A) = 0$. Since $A$ is closed, $x \in A$. We know that $x \in A \iff d(x, A) = 0$.
Let
$$g _ n(x) = \frac{1}{1 + n d(x, A)}.$$
Each $g _ n$ is continuous, and $g _ n(x) = 1$ if and only if $x \in A$. Also, for every $x \in [a, b] \setminus A$, $g _ n(x) \rightarrow 0$ as $n \rightarrow\infty$. By Lebesgue's dominated convergence theorem,
$$\begin{aligned} \left\lVert g _ n - \chi _ A \right\rVert _ p^p & = \int _ A \left\lvert g _ n - \chi _ A \right\rvert ^p \,d{x} + \int _ {[a, b]\setminus A} \left\lvert g _ n - \chi _ A \right\rvert ^p \,d{x} \\ & = 0 + \int _ {[a, b]\setminus A} \left\lvert g _ n \right\rvert ^p \,d{x} \rightarrow 0 \end{aligned}$$
since $\left\lvert g _ n \right\rvert ^p \leq 1$. We have shown that characteristic functions of closed sets can be approximated by continuous functions in $\mathcal{L}^{p}[a, b]$.
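To see the construction concretely, here is an illustrative computation (the choices $[a, b] = [0, 1]$, $A = [0.3, 0.6]$, and $p = 2$ are assumptions for the demo) of the $\mathcal{L}^p$ distance between $g _ n$ and $\chi _ A$ on a grid:

```python
# g_n(x) = 1 / (1 + n*d(x, A)) is continuous, equals 1 on A, and tends to 0
# off A, so ||g_n - chi_A||_p -> 0; we approximate the integral on a grid.

def dist_to_A(x, lo=0.3, hi=0.6):
    """Distance from x to the closed interval A = [lo, hi]."""
    return max(lo - x, 0.0, x - hi)

def lp_error(n, p=2, steps=100_000):
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        g = 1.0 / (1.0 + n * dist_to_A(x))
        chi = 1.0 if 0.3 <= x <= 0.6 else 0.0
        total += abs(g - chi) ** p * h
    return total ** (1 / p)

print([round(lp_error(n), 4) for n in (1, 10, 100, 1000)])
```

The printed errors decrease toward $0$ as $n$ grows, as the theorem predicts.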
For every $A \in \mathfrak{M}(m)$ and $\epsilon > 0$, there exists a closed set $F \subseteq A$ such that $m(A \setminus F) < \epsilon$. Since $\chi _ A - \chi _ F = \chi _ {A \setminus F}$,
$$\begin{aligned} \int \left\lvert \chi _ A-\chi _ F \right\rvert ^p \,d{x} & = \int \left\lvert \chi _ {A\setminus F} \right\rvert ^p \,d{x} \\ & = \int _ {A\setminus F} \,d{x} = m(A \setminus F) < \epsilon. \end{aligned}$$
Therefore, for every $A \in \mathfrak{M}$, $\exists\,g _ n \in C[a, b]$ such that $\left\lVert g _ n - \chi _ A \right\rVert _ p \rightarrow 0$ as $n \rightarrow\infty$. So characteristic functions of any measurable set can be approximated by continuous functions in $\mathcal{L}^{p}[a, b]$.
Next, for any measurable simple function $f = \sum _ {k=1}^{m}a _ k \chi _ {A _ k}$, we can find $g _ n^k \in C[a, b]$ so that
$$\left\lVert f - \sum _ {k=1}^{m} a _ k g _ n^k \right\rVert _ p = \left\lVert \sum _ {k=1}^{m}a _ k \left( \chi _ {A _ k} - g _ n^k \right) \right\rVert _ p \rightarrow 0.$$
Next, for $f \in \mathcal{L}^{p}$ with $f \geq 0$, there exist simple functions $f _ n \geq 0$ such that $f _ n \nearrow f$ pointwise, and then $f _ n \rightarrow f$ in $\mathcal{L}^{p}$ by the dominated convergence theorem. Finally, any real $f \in \mathcal{L}^{p}$ can be written as $f = f^+ - f^-$, which completes the proof.
Viewed this way, the extension is quite routine: we extend in the order $\chi _ F$ for closed $F$ $\rightarrow$ $\chi _ A$ for measurable $A$ $\rightarrow$ measurable simple $f$ $\rightarrow$ $0\leq f \in \mathcal{L}^{p} \rightarrow$ $f \in \mathcal{L}^{p}$.

View File

@@ -5,6 +5,7 @@ math: true
categories:
- Algorithms
- BOJ
path: _posts/algorithms/boj
tags:
- algorithms
- boj
@@ -54,7 +55,7 @@ $$
> **Theorem.** For $\alpha = 3 + \sqrt{5}$ and $\beta = 3 - \sqrt{5}$, $\alpha^n + \beta^n \in \mathbb{N}$ for all $n \in \mathbb{N}$.[^2]
The key is to use the fact that $0 < \beta < 1$: the integer part of $\alpha^n$ is $\alpha^n + \beta^n - 1$, so it suffices to compute $\alpha^n + \beta^n$. By Vieta's formulas, the sequence $s _ n = \alpha^n + \beta^n$ satisfies a linear recurrence.
Since $n$ is large, the computation should be done with fast matrix exponentiation.
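The step above can be sketched as follows (a hypothetical sketch: the recurrence $s _ n = 6 s _ {n-1} - 4 s _ {n-2}$ with $s _ 0 = 2$, $s _ 1 = 6$ follows from Vieta's formulas for $x^2 - 6x + 4 = 0$, whose roots are $\alpha, \beta$; the modulus is a placeholder, not from the problem statement):

```python
# Fast matrix exponentiation for s_n = alpha^n + beta^n, where alpha, beta
# are the roots of x^2 - 6x + 4 (so s_n = 6*s_{n-1} - 4*s_{n-2}, s_0 = 2,
# s_1 = 6). Since 0 < beta < 1, the integer part of alpha^n is s_n - 1.

def mat_mul(A, B, mod):
    """Product of two 2x2 matrices, entries reduced mod `mod`."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) % mod
             for j in range(2)] for i in range(2)]

def mat_pow(A, e, mod):
    """A**e by binary exponentiation."""
    R = [[1, 0], [0, 1]]  # identity
    while e:
        if e & 1:
            R = mat_mul(R, A, mod)
        A = mat_mul(A, A, mod)
        e >>= 1
    return R

def s(n, mod=10**9 + 7):  # the modulus is a placeholder assumption
    if n == 0:
        return 2 % mod
    M = mat_pow([[6, -4], [1, 0]], n - 1, mod)
    return (M[0][0] * 6 + M[0][1] * 2) % mod
```

For example, $s _ 2 = 28$ and $s _ 3 = 144$, so the integer parts of $\alpha^2$ and $\alpha^3$ are $27$ and $143$.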
@@ -64,26 +65,26 @@ Since $n$ is large, the computation should be done with fast matrix exponentiation
Within the given bounds, at most $29$ multiplications are possible (since $2^{30} > 10^9$), so there is enough time to examine every $m$.
Suppose $g$ multiplications are used, and let $h _ i$ be the number of additions in the $i$-th block, for $i = 0, \dots, g$. Then, for input $x$, the final value of the generated program is
$$ $$
x \cdot m^g + a \sum_{i=0}^g m^{g-i} h_i x \cdot m^g + a \sum _ {i=0}^g m^{g-i} h _ i
$$ $$
가 된다. 입력 범위 $[p, q]$를 $[r, s]$로 보내야 하므로, 부등식을 세워보면 가 된다. 입력 범위 $[p, q]$를 $[r, s]$로 보내야 하므로, 부등식을 세워보면
$$ $$
\tag{1} \tag{1}
a\sum_{i=0}^g m^{g-i} h_i \in \left[ r - p \cdot m^g, s - q\cdot m^g \right] a\sum _ {i=0}^g m^{g-i} h _ i \in \left[ r - p \cdot m^g, s - q\cdot m^g \right]
$$ $$
여야 함을 알 수 있다. 그러면 이제 프로그램의 길이를 최소화 해야하는데, 길이가 $\sum_{i=0}^g h_i + g$ 이므로 $h_i$의 합을 최소화하는 방향으로 생각해본다. 좌변의 식을 조금 풀어서 써보면 여야 함을 알 수 있다. 그러면 이제 프로그램의 길이를 최소화 해야하는데, 길이가 $\sum _ {i=0}^g h _ i + g$ 이므로 $h _ i$의 합을 최소화하는 방향으로 생각해본다. 좌변의 식을 조금 풀어서 써보면
$$ $$
a\sum_{i=0}^g m^{g-i} h_i =(am^g)h_0 + (am^{g-1})h_1 + \cdots + (a)h_g a\sum _ {i=0}^g m^{g-i} h _ i =(am^g)h _ 0 + (am^{g-1})h _ 1 + \cdots + (a)h _ g
$$ $$
이므로 $\sum h_i$를 최소화 하려면 $h_0$부터 $h_g$까지 순서대로 잡아주면 된다. 이 때 $(1)$의 범위를 만족해야 하므로, 범위를 만족하는 최소의 $h_0$가 잡히면 바로 끝. 만약 불가능하다면 $r - p\cdot m^g$를 넘지 않는 최대의 $h_0$를 잡고 이번에는 구간 양 끝에서 $am^gh_0$를 뺀 뒤 같은 방법으로 $h_1$을 잡으면 된다. 이므로 $\sum h _ i$를 최소화 하려면 $h _ 0$부터 $h _ g$까지 순서대로 잡아주면 된다. 이 때 $(1)$의 범위를 만족해야 하므로, 범위를 만족하는 최소의 $h _ 0$가 잡히면 바로 끝. 만약 불가능하다면 $r - p\cdot m^g$를 넘지 않는 최대의 $h _ 0$를 잡고 이번에는 구간 양 끝에서 $am^gh _ 0$를 뺀 뒤 같은 방법으로 $h _ 1$을 잡으면 된다.
모든 가능한 프로그램의 후보를 얻었다면, 가장 짧은 것을 찾고 사전 순으로 제일 먼저 오는 것을 찾으면 된다. 사전 순 정렬의 경우 귀납적으로 생각하면 쉽게 구현할 수 있다. 앞에서부터 연산의 종류와 횟수를 비교하면 된다. 모든 가능한 프로그램의 후보를 얻었다면, 가장 짧은 것을 찾고 사전 순으로 제일 먼저 오는 것을 찾으면 된다. 사전 순 정렬의 경우 귀납적으로 생각하면 쉽게 구현할 수 있다. 앞에서부터 연산의 종류와 횟수를 비교하면 된다.
@@ -96,7 +97,7 @@ $$
This is an application of [Catalan's triangle](https://en.wikipedia.org/wiki/Catalan%27s_triangle). We form a bracket string of length $n$ out of $i$ `(`s and $n-i$ `)`s and color it with $k$ colors, so the answer is
$$
\sum _ {i=\lceil n/2\rceil}^n C(i, n-i)\cdot k^i
$$
The number of colorings is $k^i$ because each `)` must have the same color as its matching `(`, so only the colors of the `(`s need to be chosen.
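The sum can be evaluated directly with the standard closed form $C(n, k) = \frac{n-k+1}{n+1}\binom{n+k}{k}$ for Catalan's triangle (a Python sketch; the function names are mine):

```python
from math import comb

def catalan_triangle(n, k):
    # number of strings of n '(' and k ')' (k <= n) in which
    # every prefix has at least as many '(' as ')'
    return comb(n + k, k) * (n - k + 1) // (n + 1)

def count_colored_strings(n, k):
    # sum over i = ceil(n/2) .. n of C(i, n-i) * k^i
    return sum(catalan_triangle(i, n - i) * k ** i
               for i in range((n + 1) // 2, n + 1))
```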

View File

@@ -5,6 +5,7 @@ math: true
categories:
- Algorithms
- Data Structures
path: _posts/algorithms/data-structures
tags:
- algorithms
- data-structures
@@ -41,14 +42,14 @@ These results imply that the `search` operation takes almost constant time.
The probability of collision is $\frac{1}{m}$, since the hash function is uniform by assumption. If $x$ was inserted as the $i$-th element, the number of elements to search equals
$$
1 + \sum _ {j = i + 1}^n \frac{1}{m}.
$$
The additional $1$ comes from searching $x$ itself. Averaging over all $i$ gives the final result.
$$
\begin{aligned}
\frac{1}{n}\sum _ {i=1}^n \paren{1 + \sum _ {j=i+1}^n \frac{1}{m}} &= 1 + \frac{1}{mn} \sum _ {i=1}^n \sum _ {j=i+1}^n 1 \\
&= 1 + \frac{1}{mn}\paren{n^2 - \frac{n(n+1)}{2}} \\
&= 1 + \frac{n(n-1)}{2mn} \\
&= 1+ \frac{(n-1)\alpha}{2n} = 1+ \frac{\alpha}{2} - \frac{\alpha}{2n}.
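As a quick numerical sanity check of the averaging above (a sketch; the values of `n` and `m` are arbitrary):

```python
def avg_successful_probes(n, m):
    # (1/n) * sum_{i=1..n} (1 + sum_{j=i+1..n} 1/m)
    #   = (1/n) * sum_{i=1..n} (1 + (n - i)/m)
    return sum(1 + (n - i) / m for i in range(1, n + 1)) / n

n, m = 100, 80
alpha = n / m  # load factor
closed_form = 1 + alpha / 2 - alpha / (2 * n)
assert abs(avg_successful_probes(n, m) - closed_form) < 1e-12
```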
@@ -66,7 +67,7 @@ For open addressing, we first assume that $\alpha < 1$. The case $\alpha = 1$ wi
*Proof*. Let the random variable $X$ be the number of probes made in an unsuccessful search. We want to find $\bf{E}[X]$, so we use the identity
$$
\bf{E}[X] = \sum _ {i \geq 1} \Pr[X \geq i].
$$
We want to find a bound for $\Pr[X \geq i]$. For $X \geq i$ to happen, $i - 1$ probes must fail, i.e., each must probe an occupied bucket. On the $j$-th probe, there are $m - j + 1$ buckets left to be probed, and $n - j + 1$ elements not probed yet. Thus the $j$-th probe fails with probability $\frac{n - j + 1}{m - j + 1} < \frac{n}{m}$. Therefore,
@@ -81,7 +82,7 @@ $$
Now we have
$$
\bf{E}[X] = \sum _ {i \geq 1} \Pr[X \geq i] \leq \sum _ {i\geq 1} \alpha^{i-1} = \frac{1}{1 - \alpha}.
$$
### Successful Search
@@ -90,12 +91,12 @@ $$
*Proof*. On a successful search, the sequence of probes is exactly the same as the sequence of probes when that element was inserted.
Suppose that an element $x$ was the $i$-th inserted element. At the moment of insertion, the load factor is ${} \alpha _ i = (i-1)/m {}$. By the above theorem, the expected number of probes must have been ${} 1/(1 -\alpha _ i) = \frac{m}{m-(i-1)} {}$. Averaging this over all $i$ gives
$$
\begin{aligned}
\frac{1}{n} \sum _ {i=1}^n \frac{m}{m - (i - 1)} &= \frac{m}{n} \sum _ {i=0}^{n-1} \frac{1}{m - i} \\
&\leq \frac{1}{\alpha} \int _ {m-n}^m \frac{1}{x}\,dx \\
&= \frac{1}{\alpha} \log \frac{1}{1-\alpha}.
\end{aligned}
$$
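The integral bound above can also be checked numerically (a sketch; the parameters are arbitrary):

```python
import math

def successful_search_cost(n, m):
    # (1/n) * sum_{i=1..n} m / (m - (i - 1))
    return sum(m / (m - (i - 1)) for i in range(1, n + 1)) / n

n, m = 50, 100
alpha = n / m
bound = (1 / alpha) * math.log(1 / (1 - alpha))
assert successful_search_cost(n, m) <= bound
```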
@@ -107,7 +108,7 @@ First of all, on an unsuccessful search, all $m$ buckets should be probed.
On a successful search, set $m = n$ in the above argument; then the average number of probes is
$$
\frac{1}{m} \sum _ {i=1}^m \frac{m}{m - (i - 1)} = \sum _ {i=1}^m \frac{1}{i} = H _ m,
$$
where $H _ m$ is the $m$-th harmonic number.

View File

@@ -2,6 +2,7 @@
share: true
categories:
- Articles
path: _posts/articles
tags:
- research
- career

View File

@@ -88,17 +88,17 @@ To attack this scheme, we use frequency analysis. Calculate the frequency of eac
#### Vigenère Cipher
- A polyalphabetic substitution
- Given a key length $m$, take key $k = (k _ 1, k _ 2, \dots, k _ m)$.
- For the $i$-th letter $x$, set $j = i \bmod m$.
- Encryption is done by replacing $x$ by $x + k _ {j}$.
- Decryption is done by replacing $x$ by $x - k _ j$.
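The scheme above fits in a few lines (a Python sketch; a lowercase `a`–`z` alphabet and $0$-indexed positions are assumptions):

```python
def vigenere(text, key, decrypt=False):
    # shift the i-th letter by +/- k_{i mod m} over a 26-letter alphabet
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text):
        k = ord(key[i % len(key)]) - ord('a')
        out.append(chr((ord(ch) - ord('a') + sign * k) % 26 + ord('a')))
    return ''.join(out)
```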
To attack this scheme, find the key length by [*index of coincidence*](https://en.wikipedia.org/wiki/Index_of_coincidence). Then use frequency analysis.
#### Hill Cipher
- A polyalphabetic substitution
- A key is an *invertible* matrix $K = (k _ {ij}) _ {m \times m}$ where $k _ {ij} \in \mathbb{Z} _ {26}$.
- Encryption/decryption is done by multiplying $K$ or $K^{-1}$.
This scheme is vulnerable to known plaintext attacks, since the equation can be solved for $K$.
@@ -191,7 +191,7 @@ Let $m \in \left\lbrace 0, 1 \right\rbrace^n$ be the message to encrypt. Then ch
- Encryption: $E(k, m) = k \oplus m$.
- Decryption: $D(k, c) = k \oplus c$.
This scheme is **provably secure**. See also [one-time pad (Modern Cryptography)](../../modern-cryptography/2023-09-07-otp-stream-cipher-prgs/#one-time-pad-(otp)).
## Perfect Secrecy
@@ -201,10 +201,10 @@ This scheme is **provably secure**. See also [one-time pad (Modern Cryptography)
> \Pr[\mathcal{M} = m \mid \mathcal{C} = c] = \Pr[\mathcal{M} = m].
> $$
>
> Or equivalently, for all $m _ 0, m _ 1 \in \mathcal{M}$, $c \in \mathcal{C}$,
>
> $$
> \Pr[E(k, m _ 0) = c] = \Pr[E(k, m _ 1) = c]
> $$
>
> where $k$ is chosen uniformly in $\mathcal{K}$.
@@ -223,19 +223,19 @@ since for each $m$ and $c$, $k$ is determined uniquely.
> **Theorem.** If $(E, D)$ is perfectly secure, $\lvert \mathcal{K} \rvert \geq \lvert \mathcal{M} \rvert$.
*Proof*. Assume not; then for some $c \in \mathcal{C}$, we can find a message $m _ 0 \in \mathcal{M}$ that is not a decryption of $c$ under any key. This is because the decryption algorithm $D$ is deterministic and $\lvert \mathcal{K} \rvert < \lvert \mathcal{M} \rvert$.
For the proof in detail, check [Shannon's Theorem (Modern Cryptography)](../../modern-cryptography/2023-09-07-otp-stream-cipher-prgs/#shannon's-theorem).
### Two-Time Pad is Insecure
It is not secure to use the same key twice. For a key $k$ and two messages $m _ 1$, $m _ 2$,
$$
c _ 1 \oplus c _ 2 = (k \oplus m _ 1) \oplus (k \oplus m _ 2) = m _ 1 \oplus m _ 2.
$$
So some information is leaked, even though we cannot actually recover $m _ i$ from the above equation.
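The leak can be demonstrated directly (a Python sketch; the messages are illustrative):

```python
import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key = os.urandom(8)            # one-time pad key, (insecurely) reused below
m1, m2 = b"attack!!", b"defend!!"
c1, c2 = xor_bytes(key, m1), xor_bytes(key, m2)
# the key cancels out: c1 XOR c2 = m1 XOR m2
assert xor_bytes(c1, c2) == xor_bytes(m1, m2)
```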
## Two Types of Symmetric Ciphers
@@ -278,9 +278,9 @@ To alleviate this problem, we can combine multiple LFSRs with a $k$-input binary
- Not for attacks, but for error correction
- Initialization vector (IV): $24$ bit
- Key: $104$ bit number to build the keystream
- IV and the key are used to build the keystream $k _ s$
- IV + Key is $128$ bits
- Encryption: $c = k _ s \oplus (m \parallel \mathrm{CRC}(m))$
#### Encryption Process
@@ -313,7 +313,7 @@ To alleviate this problem, we can combine multiple LFSRs with a $k$-input binary
- The key is fixed, and the period of IV is $2^{24}$.
- Same IV leads to the same key stream.
- So the adversary can take two frames with the same IV to obtain the XOR of two plaintext messages.
- $c _ 1 \oplus c _ 2 = (p _ 1 \oplus k _ s) \oplus (p _ 2 \oplus k _ s) = p _ 1 \oplus p _ 2$
- Since network traffic contents are predictable, messages can be recovered.
- We are in the link layer, so HTTP, IP, TCP headers will be contained in the encrypted payload.
- The header formats are usually known.
@@ -326,12 +326,12 @@ Given a bit string (defined in the specification), the sender performs long divi
- CRC is actually a linear function.
- $\mathrm{CRC}(x \oplus y) = \mathrm{CRC}(x) \oplus \mathrm{CRC}(y)$.
- The remainder of $x \oplus y$ is equal to the sum of the remainders of $x$ and $y$, since $\oplus$ is effectively an addition over $\mathbb{Z} _ 2$.
- The CRC function doesn't have a key, so it is forgeable.
- **RC4 is transparent to XOR**, and messages can be modified.
- Let $c = k _ s \oplus (m \parallel \mathrm{CRC}(m))$.
- XOR $(x \parallel \mathrm{CRC}(x))$ into the ciphertext, where $x$ is some malicious message.
- $c \oplus (x \parallel \mathrm{CRC}(x)) = k _ s \oplus (m\oplus x \parallel \mathrm{CRC}(m\oplus x))$.
- The receiver will decrypt and get $(m\oplus x \parallel \mathrm{CRC}(m\oplus x))$.
- The CRC check by the receiver will succeed.
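The linearity claim can be checked with a toy CRC, i.e., plain polynomial long division over $\mathbb{Z} _ 2$ (a Python sketch; the degree-$4$ polynomial $x^4 + x + 1$ is an arbitrary choice, and real-world CRC-32 adds init/final XOR steps that this sketch omits):

```python
def crc(bits, poly=0b10011, width=4):
    # remainder of bits(x) * x^width divided by poly(x) over GF(2);
    # with no init/final XOR, the map bits -> remainder is linear
    work = list(bits) + [0] * width
    for i in range(len(bits)):
        if work[i]:
            for j in range(width + 1):
                work[i + j] ^= (poly >> (width - j)) & 1
    return work[-width:]

x = [1, 0, 1, 1, 0, 1, 0, 0]
y = [0, 1, 1, 0, 1, 1, 1, 0]
x_xor_y = [a ^ b for a, b in zip(x, y)]
# CRC(x XOR y) == CRC(x) XOR CRC(y)
assert crc(x_xor_y) == [a ^ b for a, b in zip(crc(x), crc(y))]
```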

View File

@@ -48,18 +48,18 @@ attachment:
### Encryption
1. From the $56$-bit key, generate $16$ different $48$ bit keys $k _ 1, \dots, k _ {16}$.
2. The plaintext message goes through an initial permutation.
3. The output goes through $16$ rounds, and key $k _ i$ is used in round $i$.
4. After $16$ rounds, split the output into two $32$ bit halves and swap them.
5. The output goes through the inverse of the initial permutation from Step 2.
Let $L _ {i-1} \parallel R _ {i-1}$ be the output of round $i-1$, where $L _ {i-1}$ and $R _ {i-1}$ are $32$ bit halves. Also let $f$ be the Feistel function.[^1]
In each round $i$, the following operation is performed:
$$
L _ i = R _ {i - 1}, \qquad R _ i = L _ {i-1} \oplus f(k _ i, R _ {i-1}).
$$
#### The Feistel Function
@@ -85,22 +85,22 @@ The Feistel function is **not invertible.**
Let $f$ be the Feistel function. We can define each round as a function $F$,
$$
F(L _ i \parallel R _ i) = R _ i \parallel L _ i \oplus f(R _ i).
$$
Consider a function $G$, defined as
$$
G(L _ i \parallel R _ i) = R _ i \oplus f(L _ i) \parallel L _ i.
$$
Then, we see that
$$
\begin{align*}
G(F(L _ i \parallel R _ i)) &= G(R _ i \parallel L _ i \oplus f(R _ i)) \\
&= (L _ i \oplus f(R _ i)) \oplus f(R _ i) \parallel R _ i \\
&= L _ i \parallel R _ i.
\end{align*}
$$
@@ -109,10 +109,10 @@ Thus $F$ and $G$ are inverses of each other, thus $f$ doesn't have to be inverti
Also, note that
$$
G(L _ i \parallel R _ i) = F(L _ i \oplus f(R _ i) \parallel R _ i).
$$
Notice that evaluating $G$ is equivalent to evaluating $F$ on an encrypted block with its upper/lower $32$ bit halves swapped. We get $L _ i \oplus f(R _ i) \parallel R _ i$ exactly when we swap the halves of $F(L _ i \parallel R _ i)$. Thus, we can use the same hardware for encryption and decryption, which is the reason for swapping the $32$ bit halves.
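The identity $G \circ F = \mathrm{id}$ holds for any $f$, which a few lines verify (a Python sketch; the particular $f$ is an arbitrary non-invertible function chosen for illustration):

```python
def F(L, R, f):
    # one Feistel round: F(L || R) = R || (L XOR f(R))
    return R, L ^ f(R)

def G(L, R, f):
    # candidate inverse round: G(L || R) = (R XOR f(L)) || L
    return R ^ f(L), L

f = lambda x: (x * x + 7) & 0xFFFFFFFF  # non-invertible, yet rounds invert
L, R = 0x12345678, 0x9ABCDEF0
assert G(*F(L, R, f), f) == (L, R)
assert F(*G(L, R, f), f) == (L, R)
```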
## Advanced Encryption Standard (AES)
@@ -207,13 +207,13 @@ Since the same key is used for all blocks, once a mapping from plaintext to ciph
- **Each previous cipher block is chained with the current block**
- Initialization vector is used
- Encryption
- Let $c _ 0$ be the initialization vector.
- $c _ i = E(k, p _ i \oplus c _ {i - 1})$, where $p _ i$ is the $i$-th plaintext block.
- The ciphertext is $(c _ 0, c _ 1, \dots)$.
- Decryption
- The first block $c _ 0$ contains the initialization vector.
- $p _ i = c _ {i - 1} \oplus D(k, c _ i)$.
- The plaintext is $(p _ 1, p _ 2, \dots)$.
- Used for bulk data encryption, authentication
- Advantages
- Parallelism in decryption.
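The chaining equations can be sketched with a toy block cipher standing in for $E$ and $D$ (a Python sketch; the byte-wise `toy_E`/`toy_D` are insecure placeholders, not a real cipher):

```python
def toy_E(k, b):
    # toy stand-in for a block cipher on one byte (NOT secure)
    return (b + k) % 256

def toy_D(k, b):
    return (b - k) % 256

def cbc_encrypt(k, iv, blocks):
    c = [iv]  # c_0 is the initialization vector
    for p in blocks:
        c.append(toy_E(k, p ^ c[-1]))  # c_i = E(k, p_i XOR c_{i-1})
    return c

def cbc_decrypt(k, c):
    # p_i = c_{i-1} XOR D(k, c_i)
    return [c[i - 1] ^ toy_D(k, c[i]) for i in range(1, len(c))]

msg = [0x41, 0x42, 0x43]
assert cbc_decrypt(7, cbc_encrypt(7, 0x55, msg)) == msg
```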
@@ -239,13 +239,13 @@ Since the same key is used for all blocks, once a mapping from plaintext to ciph
- **IV changes should be unpredictable**
- On IV reuse, the same message will generate the same ciphertext if the key isn't changed
- If IV is predictable, CBC is vulnerable to chosen plaintext attacks.
- Suppose Eve obtains $(\mathrm{IV} _ 1, E _ k(\mathrm{IV} _ 1 \oplus m))$.
- Define Eve's new message $m' = \mathrm{IV} _ {2} \oplus \mathrm{IV} _ {1} \oplus g$, where
- $\mathrm{IV} _ 2$ is the guess of the next IV, and
- $g$ is a guess of Alice's original message $m$.
- Eve requests an encryption of $m'$
- $c' = E _ k(\mathrm{IV} _ 2 \oplus m') = E _ k(\mathrm{IV} _ 1 \oplus g)$.
- Then Eve can compare $c'$ and the original $c = E _ k(\mathrm{IV} _ 1 \oplus m)$ to recover $m$.
- Useful when there are not many cases for $m$ (or most of the message is already known).
### Cipher Feedback Mode (CFB)
@@ -260,13 +260,13 @@ Since the same key is used for all blocks, once a mapping from plaintext to ciph
- Same requirements on the IV as CBC mode.
- Should be randomized, and should not be predictable.
- Encryption
- Let $c _ 0$ be the initialization vector.
- $c _ i = p _ i \oplus E(k, c _ {i - 1})$, where $p _ i$ is the $i$-th plaintext block.
- The ciphertext is $(c _ 0, c _ 1, \dots)$.
- Decryption
- The first block $c _ 0$ contains the initialization vector.
- $p _ i = c _ i \oplus E(k, c _ {i - 1})$. The same module is used for decryption!
- The plaintext is $(p _ 1, p _ 2, \dots)$.
- Advantages
- Appropriate when data arrives in bits/bytes (similar to stream cipher)
- Only encryption module is needed.
@@ -294,15 +294,15 @@ Since the same key is used for all blocks, once a mapping from plaintext to ciph
- Encryption/decryption are both parallelizable after key stream is calculated.
- Key stream generation cannot be parallelized.
- Encryption
- Let $s _ 0$ be the initialization vector.
- $s _ i = E(k, s _ {i - 1})$ where $s _ i$ is the $i$-th key stream.
- $c _ i = p _ i \oplus s _ i$.
- The ciphertext is $(s _ 0, c _ 1, \dots)$.
- Decryption
- The first block $s _ 0$ contains the initialization vector.
- $s _ i = E(k, s _ {i - 1})$. The same module is used for decryption.
- $p _ i = c _ i \oplus s _ i$.
- The plaintext is $(p _ 1, p _ 2, \dots)$.
- Note: IV and successive encryptions act as an OTP generator.
- Advantages
- There is no error propagation. $1$ bit error in ciphertext only affects $1$ bit in the plaintext.
@@ -311,8 +311,8 @@ Since the same key is used for all blocks, once a mapping from plaintext to ciph
- Only encryption module is needed.
- Limitations
- Key streams should not have repetitions.
- We would have $c _ i \oplus c _ j = p _ i \oplus p _ j$.
- Size of each $s _ i$ should be large enough.
- If attacker knows the plaintext and ciphertext, plaintext can be modified.
- Same as in OTP.
@@ -325,9 +325,9 @@ Since the same key is used for all blocks, once a mapping from plaintext to ciph
- Highly parallelizable.
- Can decrypt from any arbitrary position.
- Counter should not be repeated for the same key.
- Suppose that the same counter $ctr$ is used for encrypting $m _ 0$ and $m _ 1$.
- Encryption results are: $(ctr, E(k, ctr) \oplus m _ 0), (ctr, E(k, ctr) \oplus m _ 1)$.
- Then the attacker can obtain $m _ 0 \oplus m _ 1$.
## Modes of Operations Summary

View File

@@ -107,7 +107,7 @@ allows us to reduce the size of the numbers before exponentiation.
## Modular Arithmetic
For modulus $n$, **modular arithmetic** consists of operations on $\mathbb{Z} _ n$.
### Residue Classes
@@ -136,10 +136,10 @@ Thus, $R$ is an **equivalence relation** and each residue class $[k]$ is an **eq
We write the set of residue classes modulo $n$ as
$$
\mathbb{Z} _ n = \left\lbrace \overline{0}, \overline{1}, \overline{2}, \dots, \overline{n-1} \right\rbrace.
$$
Note that $\mathbb{Z} _ n$ is closed under addition and multiplication.
### Identity
@@ -149,7 +149,7 @@ Note that $\mathbb{Z}_n$ is closed under addition and multiplication.
> \forall a \in S,\, a * e = e * a = a.
> $$
In $\mathbb{Z} _ n$, the additive identity is $0$ and the multiplicative identity is $1$.
### Inverse
@@ -169,7 +169,7 @@ $$
The inverse exists if and only if $\gcd(a, n) = 1$.
> **Lemma**. For $n \geq 2$ and $a \in \mathbb{Z}$, its inverse $a^{-1} \in \mathbb{Z} _ n$ exists if and only if $\gcd(a, n) = 1$.
*Proof*. We use the extended Euclidean algorithm. There exist $u, v \in \mathbb{Z}$ such that
@@ -223,7 +223,7 @@ Basically, we use the Euclidean algorithm and solve for the remainder (which is
#### Calculating Modular Multiplicative Inverse
We can use the extended Euclidean algorithm to find modular inverses. Suppose we want to calculate $a^{-1}$ in $\mathbb{Z} _ n$. We assume that the inverse exists, so $\gcd(a, n) = 1$.
Therefore, we use the extended Euclidean algorithm and find $x, y \in \mathbb{Z}$ such that
@@ -231,7 +231,7 @@ $$
ax + ny = 1.
$$
Then $ax \equiv 1 - ny \equiv 1 \pmod n$, thus $x$ is the inverse of $a$ in $\mathbb{Z} _ n$.
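This procedure translates directly to code (a Python sketch; the function names are mine):

```python
def ext_gcd(a, b):
    # returns (g, x, y) with a*x + b*y = g = gcd(a, b)
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

def mod_inverse(a, n):
    g, x, _ = ext_gcd(a % n, n)
    if g != 1:
        raise ValueError("inverse does not exist")
    return x % n
```

For example, `mod_inverse(3, 7)` gives $5$, since $3 \cdot 5 = 15 \equiv 1 \pmod 7$.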
[^1]: Note that in C standards, `(a / b) * b + (a % b) == a`.
[^2]: $a$ and $b$ are in the same coset of $\mathbb{Z}/n\mathbb{Z}$.

View File

@@ -90,7 +90,7 @@ For even better (maybe faster) results, we need the help of elementary number th
> a^{p-1} \equiv 1 \pmod p.
> $$
*Proof*. (Using group theory) The statement can be rewritten as follows. For $a \neq 0$ in $\mathbb{Z} _ p$, $a^{p-1} = 1$ in $\mathbb{Z} _ p$. Since $\mathbb{Z} _ p^\ast$ is a (multiplicative) group of order $p-1$, the order of $a$ should divide $p-1$. Therefore, $a^{p-1} = 1$ in $\mathbb{Z} _ p$.
Here is an elementary proof not using group theory.
@@ -117,7 +117,7 @@ For direct calculation, we use the following formula.
> **Lemma.** For $n \in \mathbb{N}$, the following holds.
>
> $$
> \phi(n) = n \cdot \prod _ {p \mid n} \left( 1 - \frac{1}{p} \right)
> $$
>
> where $p$ is a prime number dividing $n$.
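The formula translates directly to code via trial-division factorization (a Python sketch):

```python
def phi(n):
    # Euler's totient via n * prod_{p | n} (1 - 1/p),
    # folding each prime factor in as result -= result // p
    result, m = n, n
    p = 2
    while p * p <= m:
        if m % p == 0:
            result -= result // p
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:  # leftover prime factor
        result -= result // m
    return result
```

Euler's generalization below can then be spot-checked with `pow(a, phi(n), n) == 1` for $\gcd(a, n) = 1$.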
@@ -131,31 +131,31 @@ So to calculate $\phi(n)$, we need to **factorize** $n$. From the formula above,
### Reduced Set of Residues ### Reduced Set of Residues
Let $n \in \mathbb{N}$. The **complete set of residues** was denoted $\mathbb{Z}_n$ and Let $n \in \mathbb{N}$. The **complete set of residues** was denoted $\mathbb{Z} _ n$ and
$$ $$
\mathbb{Z}_n = \left\lbrace 0, 1, \dots, n-1 \right\rbrace. \mathbb{Z} _ n = \left\lbrace 0, 1, \dots, n-1 \right\rbrace.
$$ $$
We also often use the **reduced set of residues**. We also often use the **reduced set of residues**.
> **Definition.** The **reduced set of residues** is the set of residues that are relatively prime to $n$. We denote this set as $\mathbb{Z}_n^\ast$. > **Definition.** The **reduced set of residues** is the set of residues that are relatively prime to $n$. We denote this set as $\mathbb{Z} _ n^\ast$.
> >
> $$ > $$
> \mathbb{Z}_n^\ast = \left\lbrace a \in \mathbb{Z}_n \setminus \left\lbrace 0 \right\rbrace : \gcd(a, n) = 1 \right\rbrace. > \mathbb{Z} _ n^\ast = \left\lbrace a \in \mathbb{Z} _ n \setminus \left\lbrace 0 \right\rbrace : \gcd(a, n) = 1 \right\rbrace.
> $$ > $$
Then by definition, we have the following result. Then by definition, we have the following result.
> **Lemma.** $\left\lvert \mathbb{Z}_n^\ast \right\lvert = \phi(n)$. > **Lemma.** $\left\lvert \mathbb{Z} _ n^\ast \right\lvert = \phi(n)$.
We can also show that $\mathbb{Z}_n^\ast$ is a multiplicative group. We can also show that $\mathbb{Z} _ n^\ast$ is a multiplicative group.
> **Lemma.** $\mathbb{Z}_n^\ast$ is a multiplicative group. > **Lemma.** $\mathbb{Z} _ n^\ast$ is a multiplicative group.
*Proof*. Let $a, b \in \mathbb{Z}_n^\ast$. We must check that $ab \in \mathbb{Z}_n^\ast$. Since $\gcd(a, n) = \gcd(b, n) = 1$, we have $\gcd(ab, n) = 1$. Indeed, if $d = \gcd(ab, n) > 1$, then a prime factor $p$ of $d$ must divide $a$ or $b$, and also $n$. Then $\gcd(a, n) \geq p$ or $\gcd(b, n) \geq p$, a contradiction. Thus $ab \in \mathbb{Z}_n^\ast$.

Associativity holds trivially, since $\mathbb{Z}_n^\ast$ is a subset of $\mathbb{Z}_n$. We also have an identity element $1$, and the inverse of each $a \in \mathbb{Z}_n^\ast$ exists since $\gcd(a, n) = 1$.
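The existence claim in the last sentence is constructive: from $\gcd(a, n) = 1$, the extended Euclidean algorithm yields $x, y$ with $ax + ny = 1$, so $x \bmod n$ is the inverse of $a$. A minimal sketch (the names `ext_gcd` and `mod_inverse` are my own):

```cpp
#include <cstdint>
#include <tuple>

// Extended Euclidean algorithm: returns (g, x, y) with a*x + b*y = g = gcd(a, b).
std::tuple<int64_t, int64_t, int64_t> ext_gcd(int64_t a, int64_t b) {
    if (b == 0) return {a, 1, 0};
    auto [g, x, y] = ext_gcd(b, a % b);
    return {g, y, x - (a / b) * y};
}

// Inverse of a in Z_n^*; only valid when gcd(a, n) = 1,
// since then a*x + n*y = 1 gives a*x ≡ 1 (mod n).
int64_t mod_inverse(int64_t a, int64_t n) {
    auto [g, x, y] = ext_gcd(((a % n) + n) % n, n);
    return ((x % n) + n) % n;
}
```

For instance, `mod_inverse(3, 7)` is `5`, since $3 \cdot 5 = 15 \equiv 1 \pmod 7$.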
Now we can prove Euler's generalization.
> **Theorem (Euler).** Let $n \in \mathbb{N}$ and $a \in \mathbb{Z}$ with $\gcd(a, n) = 1$. Then
>
> $$
> a^{\phi(n)} \equiv 1 \pmod n.
> $$
*Proof*. Since $\gcd(a, n) = 1$, $a \in \mathbb{Z}_n^\ast$. Then $a^{\left\lvert \mathbb{Z}_n^\ast \right\rvert} = 1$ in $\mathbb{Z}_n$. By the above lemma, we have the desired result.

*Proof*. (Elementary) Set $f : \mathbb{Z}_n^\ast \rightarrow \mathbb{Z}_n^\ast$ as $x \mapsto ax \bmod n$; the rest of the reasoning follows similarly to the proof of Fermat's little theorem.
Using the above result, we remark an important result that will be used in RSA.

> **Lemma.** Let $n \in \mathbb{N}$. For $a, b \in \mathbb{Z}$ and $x \in \mathbb{Z}_n^\ast$, if $a \equiv b \pmod{\phi(n)}$, then $x^a \equiv x^b \pmod n$.
*Proof*. $a = b + k\phi(n)$ for some $k \in \mathbb{Z}$. Then

$$
x^a = x^{b + k\phi(n)} = x^b \cdot \left( x^{\phi(n)} \right)^k \equiv x^b \cdot 1^k \equiv x^b \pmod n
$$

by Euler's generalization.
> - $(\mathsf{G3})$ $G$ has an **identity** element $e$ such that $e * a = a * e = a$ for all $a \in G$.
> - $(\mathsf{G4})$ There is an **inverse** for every element of $G$. For each $a \in G$, there exists $x \in G$ such that $a * x = x * a = e$. We write $x = a^{-1}$ in this case.

$\mathbb{Z}_n$ is an additive group, and $\mathbb{Z}_n^\ast$ is a multiplicative group.
## Chinese Remainder Theorem (CRT)

> **Theorem.** Let $n_1, \dots, n_k$ be integers greater than $1$, and let $N = n_1 n_2 \cdots n_k$. If the $n_i$ are pairwise relatively prime, then the system of equations $x \equiv a_i \pmod {n_i}$ has a unique solution modulo $N$.
>
> *(Abstract Algebra)* The map
>
> $$
> x \bmod N \mapsto (x \bmod n_1, \dots, x \bmod n_k)
> $$
>
> defines a ring isomorphism
>
> $$
> \mathbb{Z}_N \simeq \mathbb{Z}_{n_1} \times \mathbb{Z}_{n_2} \times \cdots \times \mathbb{Z}_{n_k}.
> $$
*Proof*. (**Existence**) Let $N_i = N/n_i$. Then $\gcd(N_i, n_i) = 1$. By the extended Euclidean algorithm, there exist integers $M_i, m_i$ such that $M_i N_i + m_i n_i = 1$. Now set

$$
x = \sum_{i=1}^k a_i M_i N_i.
$$

Then $x \equiv a_i M_i N_i \equiv a_i (1 - m_i n_i) \equiv a_i \pmod {n_i}$ for all $i = 1, \dots, k$.

(**Uniqueness**) Suppose that we have two solutions $x, y$ modulo $N$. Both satisfy $x \equiv a_i \pmod {n_i}$, so $n_i \mid (x - y)$ for all $i$. Therefore we have

$$
\mathrm{lcm}(n_1, \dots, n_k) \mid (x - y).
$$

But the $n_i$ are pairwise relatively prime, so $\mathrm{lcm}(n_1, \dots, n_k) = N$ and $N \mid (x-y)$. Hence $x \equiv y \pmod N$.
*Proof*. (**Abstract Algebra**) The above uniqueness proof shows that the map

$$
x \bmod N \mapsto (x \bmod n_1, \dots, x \bmod n_k)
$$

is injective. Since both sides are finite sets of the same size, the map must also be surjective by the pigeonhole principle. This map is also a ring homomorphism, by the properties of modular arithmetic. Hence we have a ring isomorphism.
```
int chinese_remainder_theorem(vector<int>& remainder, vector<int>& modulus) {
    // ...
}
```
The `modular_inverse` function uses the extended Euclidean algorithm to find $M_i$ in the proof. For large moduli and many equations, $N_i = N / n_i$ results in a very large number, which is hard to handle (if your language has integer overflow) and takes longer to compute.
A better way is to construct the solution **inductively**. Find a solution for the first two equations,

$$
\begin{array}{c}
x \equiv a_1 \pmod{n_1} \\
x \equiv a_2 \pmod{n_2}
\end{array} \implies x \equiv a_{1, 2} \pmod{n_1 n_2}
$$

and using the result, add the next equation $x \equiv a_3 \pmod{n_3}$ and find a solution.[^1]
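A sketch of this inductive merge (my own code, not the footnoted implementation): to combine $x \equiv a_1 \pmod{n_1}$ with $x \equiv a_2 \pmod{n_2}$, write $x = a_1 + n_1 t$ and solve $n_1 t \equiv a_2 - a_1 \pmod{n_2}$ with a modular inverse.

```cpp
#include <cstdint>
#include <tuple>
#include <vector>

// Extended Euclid: a*x + b*y = gcd(a, b).
std::tuple<int64_t, int64_t, int64_t> ext_gcd(int64_t a, int64_t b) {
    if (b == 0) return {a, 1, 0};
    auto [g, x, y] = ext_gcd(b, a % b);
    return {g, y, x - (a / b) * y};
}

// Solve x ≡ a[i] (mod n[i]) for pairwise coprime moduli, merging two
// congruences at a time; intermediate values stay below prod(n[i]).
int64_t crt(const std::vector<int64_t>& a, const std::vector<int64_t>& n) {
    int64_t x = a[0] % n[0], mod = n[0];
    for (std::size_t i = 1; i < a.size(); ++i) {
        auto [g, inv, y] = ext_gcd(mod % n[i], n[i]);  // inverse of mod in Z_{n[i]}
        inv = ((inv % n[i]) + n[i]) % n[i];
        int64_t t = ((a[i] - x) % n[i] + n[i]) % n[i] * inv % n[i];
        x += mod * t;      // x now satisfies the first i+1 congruences
        mod *= n[i];
        x %= mod;
    }
    return x;
}
```

For example, `crt({2, 3, 2}, {3, 5, 7})` returns `23`, the classic Sunzi example.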
Lastly, the ring isomorphism actually tells us a lot and is quite effective for computation. Since the two rings are *isomorphic*, operations in $\mathbb{Z}_N$ can be done independently in each $\mathbb{Z}_{n_i}$ and then merged back to $\mathbb{Z}_N$. Since $N$ is a large number, computations can be much faster in the smaller rings $\mathbb{Z}_{n_i}$. Specifically, we will see how this fact is used for computations in RSA.
[^1]: I have an implementation in my repository. [Link](https://github.com/calofmijuck/BOJ/blob/4b29e0c7f487aac3186661176d2795f85f0ab21b/Codes/23000/23062.cpp#L38).
This is an explanation of the *textbook* RSA encryption scheme.
### RSA Encryption and Decryption

Suppose we want to encrypt a message $m \in \mathbb{Z}_N$.

- **Encryption**
    - Using the public key $(N, e)$, compute the ciphertext $c = m^e \bmod N$.
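As a toy illustration (the small parameters below are a common textbook example; real keys use primes of hundreds of digits), encryption and decryption are each a single modular exponentiation, computed by square-and-multiply:

```cpp
#include <cstdint>

// Square-and-multiply: computes b^e mod n in O(log e) multiplications.
uint64_t pow_mod(uint64_t b, uint64_t e, uint64_t n) {
    uint64_t result = 1 % n;
    b %= n;
    while (e > 0) {
        if (e & 1) result = result * b % n;
        b = b * b % n;
        e >>= 1;
    }
    return result;
}

// Toy parameters: p = 61, q = 53, N = 3233, phi(N) = 60 * 52 = 3120,
// e = 17, d = 2753 (indeed 17 * 2753 = 46801 ≡ 1 mod 3120).
uint64_t rsa_encrypt(uint64_t m) { return pow_mod(m, 17, 3233); }
uint64_t rsa_decrypt(uint64_t c) { return pow_mod(c, 2753, 3233); }
```

Here `rsa_decrypt(rsa_encrypt(m))` returns `m` for any $m \in \mathbb{Z}_{3233}$, matching the correctness property discussed in these notes.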
$e, d$ are still chosen to satisfy $ed \equiv 1 \pmod {\phi(N)}$. Suppose we want to show that decryption is still correct when $\gcd(m, N) \neq 1$.
We will also use the Chinese remainder theorem here.

Since $\gcd(m, N) \neq 1$ and $N = pq$, $m$ is divisible by $p$ or $q$; without loss of generality, say $p \mid m$. So if we compute in $\mathbb{Z}_p$, we will get $0$,

$$
c^d \equiv m^{ed} \equiv 0^{ed} \equiv 0 \pmod p.
$$

We also do the computation in $\mathbb{Z}_q$ and get

$$
c^d \equiv m^{ed} \equiv m^{1 + k\phi(N)} \equiv m \cdot (m^{q-1})^{k(p-1)} \equiv m \cdot 1^{k(p-1)} \equiv m \pmod q.
$$
Here, we used the fact that $m^{q-1} \equiv 1 \pmod q$. This holds because if $q$ divided $m$ as well, then $N = pq$ would divide $m$, so $m \equiv 0 \pmod N$ and correctness is trivial; otherwise $\gcd(m, q) = 1$ and Fermat's little theorem applies.
Therefore, from $c^d \equiv 0 \pmod p$ and $c^d \equiv (m \bmod q) \pmod q$, we can recover a unique solution $c^d \equiv m \pmod N$.

Now we must argue that the recovered solution is actually equal to the original $m$. But what we did above was show that $m^{ed}$ and $m$ in $\mathbb{Z}_N$ are mapped to the same element $(0, m \bmod q)$ in $\mathbb{Z}_p \times \mathbb{Z}_q$. Since the Chinese remainder theorem tells us that this mapping is an isomorphism, $m^{ed}$ and $m$ must have been the same element of $\mathbb{Z}_N$ in the first place.

Notice that we did not require $m$ to be relatively prime to $N$. Thus the RSA encryption scheme is correct for any $m \in \mathbb{Z}_N$.

## Correctness of RSA with Fermat's Little Theorem
Actually, the above result can be proven with only Fermat's little theorem. In the above proof, the Chinese remainder theorem was used to transfer the computation, but for $N = pq$, the situation is simple enough that the theorem is not strictly required.

Let $M = m^{ed} - m$. We have shown above, using only Fermat's little theorem, that $p \mid M$ and $q \mid M$ for any choice of $m \in \mathbb{Z}_N$. Then since $N = pq = \mathrm{lcm}(p, q)$, we have $N \mid M$, so $m^{ed} \equiv m \pmod N$. Hence the RSA scheme is correct.

So we don't actually need Euler's generalization for proving the correctness of RSA...?! In fact, the proof given in the original paper of RSA used Fermat's little theorem.
This is an inverse problem of exponentiation. The inverse of exponentiation is the logarithm, so we consider the **discrete logarithm of a number modulo $p$**.

Given $y \equiv g^x \pmod p$ for some prime $p$, we want to find $x = \log_g y$. We take $g$ to be a generator of the group $\mathbb{Z}_p^\ast$, since if $g$ is a generator, a solution always exists.
Read more in [discrete logarithm problem (Modern Cryptography)](../../modern-cryptography/2023-10-03-key-exchange/#discrete-logarithm-problem-(dl)).
## ElGamal Encryption

This is an encryption scheme built upon the hardness of the DLP.

> 1. Let $p$ be a large prime.
> 2. Select a generator $g \in \mathbb{Z}_p^\ast$.
> 3. Choose a private key $x \in \mathbb{Z}_p^\ast$.
> 4. Compute the public key $y = g^x \bmod p$.
>
> - $p, g, y$ will be publicly known.
> - $x$ is kept secret.

### ElGamal Encryption and Decryption

Suppose we encrypt a message $m \in \mathbb{Z}_p^\ast$.
> 1. The sender chooses a random $k \in \mathbb{Z}_p^\ast$, called the *ephemeral key*.
> 2. Compute $c_1 = g^k \bmod p$ and $c_2 = my^k \bmod p$.
> 3. $c_1, c_2$ are sent to the receiver.
> 4. The receiver calculates $c_1^x \equiv g^{xk} \equiv y^k \pmod p$, and finds the inverse $y^{-k} \in \mathbb{Z}_p^\ast$.
> 5. Then $c_2 y^{-k} \equiv m \pmod p$, recovering the message.
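The five steps above can be sketched with toy parameters (my own choices: $p = 23$, $g = 5$, private key $x = 6$, so $y = 5^6 \bmod 23 = 8$; real deployments use large primes):

```cpp
#include <cstdint>
#include <utility>

// Square-and-multiply: b^e mod n.
uint64_t pow_mod(uint64_t b, uint64_t e, uint64_t n) {
    uint64_t r = 1 % n;
    b %= n;
    for (; e > 0; e >>= 1) {
        if (e & 1) r = r * b % n;
        b = b * b % n;
    }
    return r;
}

const uint64_t P = 23, G = 5;  // public group parameters
const uint64_t X = 6;          // private key
const uint64_t Y = 8;          // public key y = g^x mod p = 5^6 mod 23

// Encrypt m under ephemeral key k: (c1, c2) = (g^k, m * y^k) mod p.
std::pair<uint64_t, uint64_t> elgamal_encrypt(uint64_t m, uint64_t k) {
    return {pow_mod(G, k, P), m * pow_mod(Y, k, P) % P};
}

// Decrypt: shared value s = c1^x = y^k; then m = c2 * s^{-1} mod p,
// with the inverse via Fermat: s^{-1} = s^{p-2} mod p.
uint64_t elgamal_decrypt(uint64_t c1, uint64_t c2) {
    uint64_t s = pow_mod(c1, X, P);
    return c2 * pow_mod(s, P - 2, P) % P;
}
```

Encrypting $m = 20$ with $k = 3$ gives $(c_1, c_2) = (10, 5)$, and decryption recovers $20$.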
The attacker will see $g^k$. By the hardness of DLP, the attacker is unable to recover $k$ even if he knows $g$.

#### Ephemeral Key Should Be Distinct
If the same $k$ is used twice, the encryption is not secure. Suppose we encrypt two messages $m_1, m_2 \in \mathbb{Z}_p^\ast$ with the same $k$. The attacker will see $(g^k, m_1 y^k)$ and $(g^k, m_2 y^k)$. Since we are in the multiplicative group $\mathbb{Z}_p^\ast$, inverses exist, so the attacker can compute

$$
m_1 y^k \cdot (m_2 y^k)^{-1} \equiv m_1 m_2^{-1} \pmod p,
$$

which leaks the ratio of the two messages: if either message is ever revealed, the other follows immediately, and checking whether the ratio is $1$ tells the attacker whether $m_1 \equiv m_2 \pmod p$.
[^1]: If one of the primes is small, factoring is easy. Therefore we require that $p, q$ both be large primes.

[^2]: There is a quantum polynomial time (BQP) algorithm for integer factorization. See [Shor's algorithm](https://en.wikipedia.org/wiki/Shor%27s_algorithm).
date: 2023-10-09
github_title: 2023-10-09-public-key-cryptography
---
In symmetric key cryptography, we have a problem with key sharing and management. More info in the first few paragraphs of [Key Exchange (Modern Cryptography)](../../modern-cryptography/2023-10-03-key-exchange/).

## Public Key Cryptography
These keys are created to be used in **trapdoor one-way functions**.
A **one-way function** is a function that is easy to compute, but hard to compute the pre-image of any output. Here are some common examples.

- *Cryptographic hash functions*: [Hash Functions (Modern Cryptography)](../../modern-cryptography/2023-09-28-hash-functions/#collision-resistance).
- *Factoring a large integer*: It is easy to multiply two integers even if they're large, but factoring is very hard.
- *Discrete logarithm problem*: It is easy to exponentiate a number, but it is hard to find the discrete logarithm.
But a problem still remains. How does one verify that this key is indeed from the claimed owner?
## Diffie-Hellman Key Exchange

Choose a large prime $p$ and a generator $g$ of $\mathbb{Z}_p^\ast$. The description of $g$ and $p$ will be known to the public.

> 1. Alice chooses some $x \in \mathbb{Z}_p^\ast$ and sends $g^x \bmod p$ to Bob.
> 2. Bob chooses some $y \in \mathbb{Z}_p^\ast$ and sends $g^y \bmod p$ to Alice.
> 3. Alice and Bob calculate $g^{xy} \bmod p$ separately.
> 4. Eve can see $g^x \bmod p$, $g^y \bmod p$ but cannot calculate $g^{xy} \bmod p$.
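A sketch of the exchange with toy numbers ($p = 23$ and $g = 5$ are my own example values): each party raises the other's public value to its own secret, and both arrive at $g^{xy} \bmod p$.

```cpp
#include <cstdint>

// Square-and-multiply: b^e mod n.
uint64_t pow_mod(uint64_t b, uint64_t e, uint64_t n) {
    uint64_t r = 1 % n;
    b %= n;
    for (; e > 0; e >>= 1) {
        if (e & 1) r = r * b % n;
        b = b * b % n;
    }
    return r;
}

const uint64_t P = 23, G = 5;  // public parameters

// What a party publishes: g^secret mod p.
uint64_t dh_public(uint64_t secret) { return pow_mod(G, secret, P); }

// What a party derives: (other's public)^my_secret = g^(xy) mod p.
uint64_t dh_shared(uint64_t their_public, uint64_t my_secret) {
    return pow_mod(their_public, my_secret, P);
}
```

With $x = 6$ and $y = 15$, both sides derive the same shared value.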
Refer to [Diffie-Hellman Key Exchange (Modern Cryptography)](../../modern-cryptography/2023-10-03-key-exchange/#diffie-hellman-key-exchange-(dhke)).

## Message Integrity
To defend against this attack, we can use encrypt-then-MAC.
We will perform a **chosen ciphertext attack** to fully recover the plaintext.

Suppose that we obtain a ciphertext $(\mathrm{IV}, c_1, c_2)$, which is an encryption of two blocks $m = m_0 \parallel m_1$, including the padding. By the CBC encryption algorithm we know that

$$
c_1 = E_k(m_0 \oplus \mathrm{IV}), \qquad c_2 = E_k(m_1 \oplus c_1).
$$

We don't know exactly how many padding bytes there were, but it doesn't matter. We brute force by **changing the last byte of $c_1$** and requesting the decryption of the modified ciphertext $(\mathrm{IV}, c_1', c_2)$.

The decryption process of the last block is $c_1 \oplus D_k(c_2)$, so by changing the last byte of $c_1$, we hope to get a decryption result that ends with $\texttt{0x01}$. Then the last byte $\texttt{0x01}$ will be treated as padding and padding errors will not occur. So we keep trying until we don't get a padding error.[^1]
Now, suppose that we successfully changed the last byte of $c_1$ to $b$, so that the last byte of $(c_1[0\dots6] \parallel b) \oplus D_k(c_2)$ is $\texttt{0x01}$. Next, we change the second-to-last byte $c_1[6]$, request the decryption, and hope to get an output that ends with $\texttt{0x0202}$. The last two bytes will then be treated as padding and we won't get a padding error.

We repeat the above process until we get a modified ciphertext $c_1' \parallel c_2$, where the decryption result ends with $8$ bytes of $\texttt{0x08}$. Then we know that
$$
c_1' \oplus D_k(c_2) = \texttt{0x08}^8.
$$

Then we can recover $D_k(c_2) = c_1' \oplus \texttt{0x08}^8$, and then since $m_1 = c_1 \oplus D_k(c_2)$,

$$
m_1 = c_1 \oplus D_k(c_2) = c_1 \oplus c_1' \oplus \texttt{0x08}^8,
$$

allowing us to recover the whole message $m_1$.
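The final recovery step is pure XOR arithmetic, which we can check on made-up byte values (the blocks below are arbitrary illustrations, not output of a real cipher): if $c_1' \oplus D_k(c_2) = \texttt{0x08}^8$, then $c_1 \oplus c_1' \oplus \texttt{0x08}^8$ equals $m_1$.

```cpp
#include <array>
#include <cstdint>

using Block = std::array<uint8_t, 8>;

Block xor_blocks(const Block& a, const Block& b) {
    Block r{};
    for (int i = 0; i < 8; ++i) r[i] = a[i] ^ b[i];
    return r;
}

// Given the original c1 and the c1' found by the padding oracle
// (i.e., c1' ^ D_k(c2) = 0x08^8), recover m1 = c1 ^ c1' ^ 0x08^8.
Block recover_m1(const Block& c1, const Block& c1_prime) {
    Block pad;
    pad.fill(0x08);
    return xor_blocks(xor_blocks(c1, c1_prime), pad);
}
```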
Now to recover $m_0$, we modify the $\mathrm{IV}$ using the same method as above. This time, we do not use $c_2$ and request a decryption of $(\mathrm{IV}', c_1)$ only. If some $\mathrm{IV}'$ gives a decryption result that ends with $8$ bytes of $\texttt{0x08}$, we have that

$$
\mathrm{IV}' \oplus D_k(c_1) = \texttt{0x08}^8.
$$

Similarly, we recover $m_0$ by

$$
m_0 = \mathrm{IV} \oplus D_k(c_1) = \mathrm{IV} \oplus \mathrm{IV}' \oplus \texttt{0x08}^8.
$$
## Hashed MAC (HMAC)
Let $H$ be a hash function. We defined the MAC as $H(k \parallel m)$ where $k$ is a key.
Choose a key $k \leftarrow \mathcal{K}$, and set

$$
k_1 = k \oplus \texttt{ipad}, \quad k_2 = k \oplus \texttt{opad}
$$

where $\texttt{ipad} = \texttt{0x363636}\ldots$ and $\texttt{opad} = \texttt{0x5C5C5C}\ldots$. Then

$$
\mathrm{HMAC}(k, m) = H(k_2 \parallel H(k_1 \parallel m)).
$$
## TLS Details
Here's how the client and the server establish a connection using the TLS handshake.

#### ClientHello
- Client sends the TLS protocol version and cipher suites that it supports.
    - The version is the highest version supported by the client.
- A random number $N_c$ for generating the secret is sent.
- A session ID may be sent if the client wants to resume an old session.

#### ServerHello
- Server sends the TLS version and cipher suite to use.
    - The TLS version will be the highest version supported by both parties.
    - The server will pick the strongest cryptographic algorithm offered by the client.
- The server also sends a random number $N_s$.

#### Certificate/ServerKeyExchange
#### ClientKeyExchange

- Client sends the *premaster secret* (PMS) $secret_c$.
    - This is encrypted with the server's public key.
    - This secret key material will be used to generate the secret key.
- Both parties derive a shared **session key** from $N_c$, $N_s$, $secret_c$.
    - If the protocol is correct, the same key should be generated.

#### Finished
Some notation first:
- $\mathcal{M}$ is the plaintext space (messages)
- $\mathcal{K}$ is the key space
- $\mathcal{C}$ is the ciphertext space
- $E: \mathcal{K} \times \mathcal{M} \rightarrow \mathcal{C}$ is the encryption algorithm (sometimes $\mathsf{Enc}_k$)
- $D: \mathcal{K} \times \mathcal{C} \rightarrow \mathcal{M}$ is the decryption algorithm (sometimes $\mathsf{Dec}_k$)

In a **symmetric cipher**, a key $k \in \mathcal{K}$ is used both for encryption and decryption, giving the following correctness requirement.
$$\forall k \in \mathcal{K},\, \forall m \in \mathcal{M},\, D(k, E(k, m)) = m$$
Let $\Sigma$ be the set of lowercase English letters.

- $\mathcal{M} = \mathcal{C} = \Sigma^\ast$, where $\Sigma^\ast$ is the [Kleene star](https://en.wikipedia.org/wiki/Kleene_star).
- $\mathcal{K} = \mathbb{Z}_{26}$ (Caesar used $3 \in \mathcal{K}$)
- $E(k, x) = x + k \pmod{26}$ for each letter $x$ of $m \in \mathcal{M}$.
- $D(k, y) = y - k \pmod{26}$ for each letter $y$ of $c \in \mathcal{C}$.
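The scheme above is a few lines of code (a sketch; the function names are mine):

```cpp
#include <string>

// Shift cipher: E(k, x) = x + k (mod 26), applied letter by letter.
std::string shift_encrypt(const std::string& m, int k) {
    std::string c = m;
    for (char& ch : c)
        if (ch >= 'a' && ch <= 'z')
            ch = static_cast<char>('a' + (ch - 'a' + k) % 26);
    return c;
}

// D(k, y) = y - k (mod 26), i.e., encryption with the key 26 - k.
std::string shift_decrypt(const std::string& c, int k) {
    return shift_encrypt(c, 26 - k % 26);
}
```

With Caesar's key $k = 3$, `shift_encrypt("attack", 3)` gives `"dwwdfn"`.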
This scheme is not safe, since we can try all $26$ keys and find the sentence that makes sense.
Guessing the key and checking the plaintext is hard to automate, since computers don't know what sentences make sense.[^1] In some cases, the message may be invalid in normal English, while the plaintext characters follow the same distribution. Guessing the key and checking the plaintext is hard to automate, since computers don't know what sentences make sense.[^1] In some cases, the message may be invalid in normal English, while the plaintext characters follow the same distribution.
Let $p_i \in [0, 1]$ be the frequency of the $i$-th letter in normal English text. Then it is known that
$$ \sum_{i=0}^{25} p_i^2 \approx 0.065 $$
Now, let $q_i \in [0, 1]$ be the frequency of the $i$-th letter in the given ciphertext. For the key $k \in \mathcal{K}$, $p_i \approx q_{i+k}$ for all $i$, since the $i$-th letter is mapped to the $(i + k)$-th letter. (Addition is done modulo $26$.) So for each index $j \in [0, 25]$, we compute
$$ \sum_{i=0}^{25} p_i q_{i+j} $$
and choose the $j$ that gives the value closest to $0.065$.
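The scoring method above can be sketched in a few lines of Python. The `ENGLISH_FREQ` table is an assumed approximation of the frequencies $p_i$, and the sample message is made up for illustration; longer, English-like ciphertexts make the recovery more reliable.

```python
# Frequency-analysis attack on the shift cipher (a sketch).
# ENGLISH_FREQ approximates the English letter frequencies p_i (assumed values).
ENGLISH_FREQ = [
    0.082, 0.015, 0.028, 0.043, 0.127, 0.022, 0.020, 0.061, 0.070,
    0.002, 0.008, 0.040, 0.024, 0.067, 0.075, 0.019, 0.001, 0.060,
    0.063, 0.091, 0.028, 0.010, 0.024, 0.002, 0.020, 0.001,
]

def encrypt(k: int, m: str) -> str:
    """Shift each lowercase letter of m forward by k (mod 26)."""
    return "".join(chr((ord(c) - 97 + k) % 26 + 97) for c in m)

def recover_key(c: str) -> int:
    """Pick the shift j whose score sum_i p_i * q_{(i+j) mod 26} is closest to 0.065."""
    q = [c.count(chr(97 + i)) / len(c) for i in range(26)]
    score = lambda j: sum(ENGLISH_FREQ[i] * q[(i + j) % 26] for i in range(26))
    return min(range(26), key=lambda j: abs(score(j) - 0.065))

m = ("frequencyanalysisrecoverstheshiftkeybecauseenglishletter"
     "frequenciessurvivetheshiftandlongertextsworkevenbetter")
print(recover_key(encrypt(3, m)))  # 3
```

Note that shifting every letter by $k$ permutes the frequency table but does not change it, which is exactly why the score at the correct shift approximates $\sum_i p_i^2$.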


The above definition is equivalent to the following.
> **Definition**. An encryption scheme is **perfectly secret** if for any $m_1, m_2 \in \mathcal{M}$ and $c \in \mathcal{C}$, we have
>
> $$
> \Pr[E(k, m_1) = c] = \Pr[E(k, m_2) = c]
> $$
>
> where the probability is taken over the random choice $k \leftarrow \mathcal{K}$.
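For intuition, this equality can be checked exhaustively for a tiny scheme: the shift cipher restricted to single-letter messages, which (unlike the multi-letter version) *is* perfectly secret. A minimal sketch:

```python
# Exhaustive check of the equivalent definition for the shift cipher
# restricted to single-letter messages (letters encoded as 0..25).
from fractions import Fraction

def E(k: int, m: int) -> int:
    return (m + k) % 26

def pr_encrypts_to(m: int, c: int) -> Fraction:
    """Pr[E(k, m) = c] over a uniform key k <- Z_26."""
    return Fraction(sum(1 for k in range(26) if E(k, m) == c), 26)

# For every pair of messages and every ciphertext the probabilities agree,
# and each individual probability is exactly 1/26.
assert all(
    pr_encrypts_to(m1, c) == pr_encrypts_to(m2, c) == Fraction(1, 26)
    for m1 in range(26) for m2 in range(26) for c in range(26)
)
```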
$$
\begin{align*}
\Pr[C = c] &= \sum_{m' \in \mathcal{M}} \Pr[C = c \mid M = m'] \cdot \Pr[M = m'] \\
&= \sum_{m' \in \mathcal{M}} \Pr[E(k, m') = c] \cdot \Pr[M = m'] \\
&= \sum_{m' \in \mathcal{M}} \Pr[E(k, m) = c] \cdot \Pr[M = m'] \\
&= \Pr[E(k, m) = c] = \Pr[C = c \mid M = m].
\end{align*}
$$
Given any distribution on $\mathcal{M}$, we can compute
$$
\begin{align*}
\Pr[C = c] &= \sum_{m \in \mathcal{M}} \Pr[C = c \mid M = m] \cdot \Pr[M = m] \\
&= 2^{-n} \cdot \sum_{m \in \mathcal{M}} \Pr[M = m] = 2^{-n}.
\end{align*}
$$
Thus OTP satisfies the definition of perfect secrecy.
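The key observation, that for a fixed message $m$ the map $k \mapsto m \oplus k$ is a bijection, can be checked exhaustively for a small $n$; a sketch with $n = 4$:

```python
# For a fixed message m, the map k -> m XOR k is a bijection on {0,1}^n,
# so a uniform key makes the ciphertext uniform: Pr[C = c] = 2^(-n).
from collections import Counter

n = 4
for m in (0b0000, 0b1011):                  # the argument is the same for any m
    counts = Counter(m ^ k for k in range(2 ** n))
    # every ciphertext value arises from exactly one key
    assert all(counts[c] == 1 for c in range(2 ** n))
```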
The OTP is perfectly secure, but there are some drawbacks to using the OTP in practice.
First of all, OTP is perfectly secure *only for one message*. Suppose that we reuse the key $k$ for two messages $m_1$, $m_2$. Then since $c_1 = m_1 \oplus k$ and $c_2 = m_2 \oplus k$, we have the following relation:
$$
c_1 \oplus c_2 = (m_1 \oplus k) \oplus (m_2 \oplus k) = m_1 \oplus m_2.
$$
Since the adversary can see the ciphertexts, this relation leaks information about $m_1$ and $m_2$. For example, the adversary learns exactly where the two messages differ. So if the key is reused, the scheme cannot be perfectly secret.
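A quick sketch of this key-reuse leak, with two made-up messages:

```python
# Key reuse in action: the XOR of the two ciphertexts equals m1 XOR m2,
# so the key cancels and the differing positions leak.
import secrets

m1 = b"attack at dawn!"
m2 = b"attack at dusk!"
k = secrets.token_bytes(len(m1))

c1 = bytes(a ^ b for a, b in zip(m1, k))
c2 = bytes(a ^ b for a, b in zip(m2, k))

leak = bytes(a ^ b for a, b in zip(c1, c2))   # = m1 XOR m2, key cancels out
diff = [i for i, b in enumerate(leak) if b != 0]
print(diff)  # [11, 12, 13]: exactly where the messages differ
```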
Also, the key is (at least) as long as the message. This is why OTP is rarely used today: when sending a long message, the two parties must first share a key as long as the message, *every single time*! This makes the key hard to manage.
The following is evident from the definition.
> **Lemma.** A function $f : \mathbb{N} \rightarrow \mathbb{R}$ is negligible if and only if for all $c > 0$,
>
> $$
> \lim_{n \rightarrow \infty} f(n) n^c = 0.
> $$
In practice, a probability of about $2^{-30}$ is non-negligible, since such an event is likely to occur over $1$ GB of data. Meanwhile, $2^{-80}$ or $2^{-128}$ is negligible, since such an event is very unlikely to occur over the lifetime of a key.
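A back-of-the-envelope version of this rule of thumb, assuming (purely for illustration) one chance for the bad event per byte processed:

```python
# How many bad events do we expect over 1 GiB of data, if each byte gives
# the event one chance to occur with probability p? (an illustrative model)
bytes_per_gb = 2 ** 30

def expected_events(p: float) -> float:
    return p * bytes_per_gb

print(expected_events(2 ** -30))   # 1.0: expected to occur within 1 GB
print(expected_events(2 ** -80))   # ~8.9e-16: essentially never occurs
```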
Let $G : \lbrace 0, 1 \rbrace^s \rightarrow \lbrace 0, 1 \rbrace^n$ be a PRG and let $\mathcal{A}$ be an efficient statistical test.
> **Definition.** The **PRG advantage** is defined as
>
> $$
> \mathrm{Adv}_\mathrm{PRG}[\mathcal{A}, G] = \left\lvert \Pr_{k \leftarrow \left\lbrace 0, 1 \right\rbrace^s}[\mathcal{A}(G(k)) = 1] - \Pr_{r \leftarrow \left\lbrace 0, 1 \right\rbrace^n}[\mathcal{A}(r) = 1] \right\rvert.
> $$
Intuitively, the **advantage** measures how well $\mathcal{A}$ distinguishes $G(k)$ from truly random bit strings. Recall that $\mathcal{A}$ outputs $1$ if it thinks the given bit string is random. The first probability term is the case where $\mathcal{A}$ is given a pseudorandom string but decides that it is random (incorrect); the second is the case where $\mathcal{A}$ is given a random string and decides that it is indeed random (correct).
Now we can define the security of PRGs.
> **Definition.** $G : \lbrace 0, 1 \rbrace^s \rightarrow \lbrace 0, 1 \rbrace^n$ is a **secure PRG** if for any efficient statistical test $\mathcal{A}$, $\mathrm{Adv}_\mathrm{PRG}[\mathcal{A}, G]$ is negligible.
There are no provably secure PRGs, but we have heuristic candidates, meaning that no such efficient $\mathcal{A}$ has been found.
We can deduce that if a PRG is predictable, then it is insecure.
> **Theorem.** Let $G$ be a PRG. If $G$ is predictable, then it is insecure.
*Proof*. Let $\mathcal{A}$ be an efficient adversary (next-bit predictor) that predicts $G$. Suppose that $i$ is the index chosen by $\mathcal{A}$. With $\mathcal{A}$, we construct a statistical test $\mathcal{B}$ such that $\mathrm{Adv}_\mathrm{PRG}[\mathcal{B}, G]$ is non-negligible.
![mc-01-prg-game.png](../../../assets/img/posts/lecture-notes/modern-cryptography/mc-01-prg-game.png)
1. The challenger gives $\mathcal{B}$ a bit string $x$: in experiment $0$, $x = G(k)$ for $k \leftarrow \lbrace 0, 1 \rbrace^s$; in experiment $1$, $x$ is truly random.
2. $\mathcal{B}$ gives $x[0..i-1]$ to $\mathcal{A}$, then $\mathcal{A}$ will do some calculation and return $y$.
3. $\mathcal{B}$ compares $x[i]$ with $y$, and returns $1$ if $x[i] = y$, $0$ otherwise.
Let $W_b$ be the event that $\mathcal{B}$ outputs $1$ in experiment $b$. For $b = 0$, $\mathcal{B}$ outputs $1$ if $\mathcal{A}$ correctly guesses $x[i]$, which happens with probability $\frac{1}{2} + \epsilon$ for non-negligible $\epsilon$. As for $b = 1$, the received string is truly random. Then the values of $x[i]$ and $y$ are independent, so $\Pr[W_1] = \frac{1}{2}$. Therefore,
$$
\mathrm{Adv}_\mathrm{PRG}[\mathcal{B}, G] = \lvert \Pr[W_0] - \Pr[W_1] \rvert = \left\lvert \frac{1}{2} + \epsilon - \frac{1}{2} \right\rvert = \epsilon,
$$
and the advantage is non-negligible.
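The construction in this proof can be run end to end on a toy example. The sketch below uses a made-up, deliberately insecure generator $G(k) = k \parallel \mathrm{parity}(k)$, whose last bit a predictor guesses with probability $1$, so here $\epsilon = \frac{1}{2}$:

```python
# Building the statistical test B from a next-bit predictor A, for the
# made-up predictable PRG G(k) = k || parity(k).
from itertools import product

s = 4
parity = lambda bits: sum(bits) % 2

def G(k):                       # toy, deliberately insecure PRG
    return k + (parity(k),)

def B(x):                       # statistical test built from the predictor
    y = parity(x[:s])           # A's prediction for bit i, here i = s
    return 1 if x[s] == y else 0

# Exact probabilities by enumerating all seeds / all random strings.
pr_w0 = sum(B(G(k)) for k in product((0, 1), repeat=s)) / 2 ** s
pr_w1 = sum(B(r) for r in product((0, 1), repeat=s + 1)) / 2 ** (s + 1)
print(abs(pr_w0 - pr_w1))       # 0.5: epsilon = 1/2 for this toy G
```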
To motivate the definition of semantic security, we consider a **security game for encryption**.
> **Definition.** Let $\mathcal{E} = (G, E, D)$ be a cipher defined over $(\mathcal{K}, \mathcal{M}, \mathcal{C})$. For a given adversary $\mathcal{A}$, we define two experiments $0$ and $1$. For $b \in \lbrace 0, 1 \rbrace$, define experiment $b$ as follows:
>
> **Experiment** $b$.
> 1. The adversary computes $m_0, m_1 \in \mathcal{M}$ and sends them to the challenger.
> 2. The challenger computes $k \leftarrow \mathcal{K}$, $c \leftarrow E(k, m_b)$ and sends $c$ to the adversary.
> 3. The adversary outputs a bit $b' \in \lbrace 0, 1 \rbrace$.
>
> Let $W_b$ be the event that $\mathcal{A}$ outputs $1$ in experiment $b$, i.e., the event that $\mathcal{A}(\mathrm{EXP}(b)) = 1$. Now define the **semantic security advantage** of $\mathcal{A}$ with respect to $\mathcal{E}$ as
>
> $$
> \mathrm{Adv}_\mathrm{SS}[\mathcal{A}, \mathcal{E}] = \lvert \Pr[W_0] - \Pr[W_1] \rvert.
> $$
The semantic security advantage can be understood in the same way as the PRG advantage: the closer the advantage is to $1$, the better the adversary distinguishes the two experiments.
In the same way, we can define a security game for distinguishing two distributions.
> **Definition.** Let $P_0$, $P_1$ be two distributions over a set $\mathcal{S}$. For any given efficient adversary $\mathcal{A}$, define experiments $0$ and $1$.
>
> **Experiment $b$.**
> 1. The challenger computes $x \leftarrow P_b$ and sends $x$ to the adversary.
> 2. The adversary computes and outputs a bit $b' \in \lbrace 0, 1 \rbrace$.
>
> Let $W_b$ be the event that $\mathcal{A}$ outputs $1$ in experiment $b$. Then the advantage is defined as
>
> $$
> \mathrm{Adv}[\mathcal{A}] = \lvert \Pr[W_0] - \Pr[W_1] \rvert
> $$
>
> If the advantage is negligible, we say that $P_0$ and $P_1$ are **computationally indistinguishable**, and write $P_0 \approx_c P_1$.
As an example, a PRG $G$ is secure if the distribution of $G(k)$ over $k \leftarrow \lbrace 0, 1 \rbrace^s$ and the uniform distribution of $r \leftarrow \lbrace 0, 1 \rbrace^n$ are *computationally indistinguishable*.
So now we can define a semantically secure encryption scheme.
> **Definition.** An encryption scheme $\mathcal{E}$ is **semantically secure** if for any efficient adversary $\mathcal{A}$, its advantage $\mathrm{Adv}_\mathrm{SS}[\mathcal{A}, \mathcal{E}]$ is negligible.
This means that the adversary cannot tell whether the received ciphertext is an encryption of $m_0$ or of $m_1$, even though it knows what the original messages were in the first place!
Using the notion of computational indistinguishability, $\mathcal{E}$ is semantically secure if for any $m_0, m_1 \in \mathcal{M}$, the distributions of ciphertexts of $m_0$ and $m_1$ with respect to $k \leftarrow \mathcal{K}$ are computationally indistinguishable:
$$
E(K, m_0) \approx_c E(K, m_1)
$$
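To see the game in action, the sketch below plays it against a made-up, deliberately broken toy cipher that sends the first message bit in the clear; the adversary picks $m_0, m_1$ differing in that bit and wins with advantage $1$:

```python
# The semantic security game against a leaky toy cipher: the first message
# bit is transmitted in the clear, the rest is one-time padded.
import secrets

n = 8

def E(k, m):                   # m is a tuple of n bits, k a tuple of n-1 bits
    return (m[0],) + tuple(a ^ b for a, b in zip(m[1:], k))

def adversary(c):              # guesses b' = the leaked first bit
    return c[0]

def experiment(b, m0, m1):
    k = tuple(secrets.randbelow(2) for _ in range(n - 1))
    return adversary(E(k, (m0, m1)[b]))

m0 = (0,) * n                  # chosen so that m0[0] != m1[0]
m1 = (1,) + (0,) * (n - 1)
w0 = sum(experiment(0, m0, m1) for _ in range(100)) / 100
w1 = sum(experiment(1, m0, m1) for _ in range(100)) / 100
print(abs(w0 - w1))            # 1.0: the advantage is maximal
```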
## Semantic Security of the Stream Cipher
> **Theorem.** If $G$ is a secure PRG, then the stream cipher $\mathcal{E}$ built from $G$ is semantically secure.
*Proof*. Let $\mathcal{A}$ be an efficient adversary that breaks the semantic security of $\mathcal{E}$. We will use $\mathcal{A}$ to construct a statistical test $\mathcal{B}$ that breaks the security of the PRG.
Since $\mathcal{A}$ can break the semantic security of the stream cipher, for $m_0, m_1 \in \mathcal{M}$ chosen by $\mathcal{A}$, it can distinguish $m_0 \oplus G(k)$ and $m_1 \oplus G(k)$. Let $W_b$ be the event that $\mathcal{A}$ returns $1$ for $m_b \oplus G(k)$. The advantage $\mathrm{Adv}_\mathrm{SS}[\mathcal{A}, \mathcal{E}] = \lvert \Pr[W_0] - \Pr[W_1] \rvert$ is non-negligible.
Define two new experiments as follows:
1. The adversary $\mathcal{A}$ gives two messages $m_0, m_1 \in \mathcal{M}$.
2. The challenger draws a random string $r \leftarrow \lbrace 0, 1 \rbrace^n$.
3. In experiment $b$, return $m_b \oplus r$.
4. $\mathcal{A}$ will return $b' \in \lbrace 0, 1 \rbrace$.
Let $W'_b$ be the event that $\mathcal{A}$ returns $1$ for $m_b \oplus r$. Then, by the triangle inequality,
$$
\tag{2}
\begin{align*}
\mathrm{Adv}_\mathrm{SS}[\mathcal{A}, \mathcal{E}] &= \lvert \Pr[W_0] - \Pr[W_1] \rvert \\
&\leq \lvert \Pr[W_0] - \Pr[W_0'] \rvert + \lvert \Pr[W_0'] - \Pr[W_1'] \rvert \\
&\qquad + \lvert \Pr[W_1'] - \Pr[W_1] \rvert \\
&= \lvert \Pr[W_0] - \Pr[W_0'] \rvert + \lvert \Pr[W_1'] - \Pr[W_1] \rvert.
\end{align*}
$$
The last equality holds since OTP is perfectly secure and thus $\lvert \Pr[W_0'] - \Pr[W_1'] \rvert = 0$. Since $\mathrm{Adv}_\mathrm{SS}[\mathcal{A}, \mathcal{E}]$ is non-negligible, at least one of the terms on the right-hand side should also be non-negligible.
Without loss of generality, assume that $\lvert \Pr[W_0] - \Pr[W_0'] \rvert$ is non-negligible. This implies that $\mathcal{A}$ can distinguish $m_0 \oplus G(k)$ from $m_0 \oplus r$. Using this fact, we can construct a statistical test $\mathcal{B}$ for the PRG as follows.
1. The PRG challenger gives a bit string $x$.
   - In experiment $0$, the challenger gives the pseudorandom string $G(k)$.
   - In experiment $1$, the challenger gives a truly random string $r$.
2. Invoke $\mathcal{A}$, then $\mathcal{A}$ will send two messages $m_0, m_1 \in \mathcal{M}$.
3. Compute $c = m_0 \oplus x$ and return $c$ to $\mathcal{A}$.
4. $\mathcal{A}$ will return $b'$; return $b'$ directly to the PRG challenger.
Let $Y_b$ be the event that $\mathcal{B}$ returns $1$ in experiment $b$. Then, we directly see that
$$
\Pr[Y_0] = \Pr[W_0], \qquad \Pr[Y_1] = \Pr[W_0'].
$$
Therefore, the PRG advantage of $\mathcal{B}$ is
$$
\mathrm{Adv}_\mathrm{PRG}[\mathcal{B}, G] = \lvert \Pr[Y_0] - \Pr[Y_1] \rvert = \lvert \Pr[W_0] - \Pr[W_0'] \rvert,
$$
which is non-negligible, so it breaks the security of the PRG.
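The stream cipher itself is a one-liner once a keystream generator is fixed. In the sketch below, SHAKE-128 is merely an illustrative stand-in for the PRG $G$; nothing in the proof depends on that particular choice:

```python
# Stream cipher E(k, m) = m XOR G(k), with SHAKE-128 standing in for the
# keystream generator G (an arbitrary stand-in for a secure PRG).
import hashlib

def G(k: bytes, n: int) -> bytes:
    return hashlib.shake_128(k).digest(n)      # n keystream bytes from seed k

def E(k: bytes, m: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(m, G(k, len(m))))

D = E  # XOR-ing with the same keystream decrypts

k = b"a 16-byte secret"
m = b"semantic security from a secure PRG"
c = E(k, m)
assert D(k, c) == m and c != m
```

Note the key can now be short and fixed-length, which is exactly the practical gain over the OTP, at the cost of downgrading perfect secrecy to semantic security.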
> **Corollary.** For any adversary $\mathcal{A}$ for the stream cipher $\mathcal{E}$, there exists an adversary $\mathcal{B}$ for a PRG $G$ such that
>
> $$
> \mathrm{Adv}_\mathrm{SS}[\mathcal{A}, \mathcal{E}] \leq 2 \cdot \mathrm{Adv}_\mathrm{PRG}[\mathcal{B}, G].
> $$
*Proof*. Use equation $(2)$ in the above proof: each of the two terms on the right-hand side is the PRG advantage of an adversary constructed like $\mathcal{B}$, and their sum is at most twice the larger of the two.


> **Definition.** A **pseudorandom function** $F$ over $(\mathcal{K}, X, Y)$ is an efficiently computable algorithm $F : \mathcal{K} \times X \rightarrow Y$.
We consider a *keyed function* $F : \mathcal{K} \times X \rightarrow Y$ where $\mathcal{K}$ denotes the key space. For $k \in \mathcal{K}$, $F_k(x) := F(k, x)$ is a function from $X$ to $Y$. Thus each key $k$ induces a distribution on functions $X \rightarrow Y$.
Note that $\lvert \mathcal{F}[X, Y] \rvert = \lvert Y \rvert^{\lvert X \rvert}$, but the number of functions of the form $F_k$ is at most $\lvert \mathcal{K} \rvert$. In practice, $\lvert \mathcal{K} \rvert$ is very small compared to $\lvert Y \rvert^{\lvert X \rvert}$. So PRFs are chosen from a much smaller space, but they should behave in the same way a (truly) random function does.
Let $\mathcal{F}[X, Y]$ denote the set of all functions from $X$ to $Y$. A PRF $F$ is **secure** if it is **indistinguishable from a random function** chosen from $\mathcal{F}[X, Y]$, in the sense of the following game.
>
> **Experiment $b$**.
> 1. The challenger selects $f \in \mathcal{F}[X, Y]$ as follows
>    - If $b = 0$, choose $k \leftarrow \mathcal{K}$ and set $f = F_k = F(k, \cdot)$.
>    - If $b = 1$, choose $f \leftarrow \mathcal{F}[X, Y]$.
> 2. The adversary sends a sequence of queries to the challenger.
>    - For $i = 1, \dots, q$, send $x_i \in X$ and receive $y_i = f(x_i) \in Y$.
> 3. The adversary computes and outputs a bit $b' \in \left\lbrace 0, 1 \right\rbrace$.
>
> Let $W_b$ be the event that $\mathcal{A}$ outputs $1$ in experiment $b$. Then the **PRF-advantage** of $\mathcal{A}$ with respect to $F$ is defined as
>
> $$
> \mathrm{Adv}_{\mathrm{PRF}}^q[\mathcal{A}, F] = \left\lvert \Pr[W_0] - \Pr[W_1] \right\rvert.
> $$
>
> A PRF $F$ is **secure** if $\mathrm{Adv}_{\mathrm{PRF}}^q[\mathcal{A}, F]$ is negligible for any efficient $\mathcal{A}$.
In experiment $1$ above, the challenger returns $y_i = f(x_i)$ for a truly random function. To answer the query $x_i$, the challenger would have to keep a lookup table for a random function $f \in \mathcal{F}[X, Y]$. Since $X$ and $Y$ are very large in practice, it is nearly impossible to create and manage such lookup tables.[^1] As a workaround, we can choose $y_i$ uniformly on each query $x_i$, assuming that $x_i$ wasn't queried before. This is possible since for any distinct inputs $x_i, x_j \in X$, $f(x_i)$ and $f(x_j)$ are random and independent.
Also, there are two ways that the adversary can query the challenger. The first is the **adaptive** method, where the adversary queries each $x_i$ one by one; it can *adaptively* choose the next query $x_{i+1}$ after receiving the result for $x_i$.
The second method is querying all $x_i$ at once. We will consider the adaptive method, since it grants the adversary greater attack power.
### OTP as a PRF
As an example, consider the one-time pad function $F(k, x) = k \oplus x$. This function satisfies the definition of a PRF, but it is not a secure PRF. Consider an adversary $\mathcal{A}$ that queries two distinct points $x_1, x_2$ and outputs $1$ if and only if $y_1 \oplus y_2 = x_1 \oplus x_2$. In experiment $0$, $\mathcal{A}$ will always output $1$, but in experiment $1$, $\mathcal{A}$ will output $1$ with probability $2^{-n}$. Thus the advantage is $1 - 2^{-n}$, which is not negligible.
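This two-query attack is easy to run. In the sketch below, a lazily sampled dictionary plays the role of a truly random function, as described above:

```python
# The two-query distinguisher against F(k, x) = k XOR x.
import secrets

n = 16
x1, x2 = 3, 200                          # two fixed distinct queries

def attack(f) -> int:
    """Output 1 iff f(x1) XOR f(x2) == x1 XOR x2."""
    return 1 if (f(x1) ^ f(x2)) == (x1 ^ x2) else 0

k = secrets.randbelow(2 ** n)
print(attack(lambda x: k ^ x))           # 1: always detected in experiment 0

table = {}                               # lazily sampled truly random function
def random_f(x: int) -> int:
    return table.setdefault(x, secrets.randbelow(2 ** n))
print(attack(random_f))                  # 1 only with probability 2^(-n)
```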
### PRFs and PRGs
It is easy to construct a PRG from a PRF: simply evaluate $F$ on some distinct inputs. For example, given a seed $s$, we evaluate $F_s$ and obtain
$$
G(s) = F_s(1) \parallel F_s(2) \parallel \cdots \parallel F_s(n)
$$
for any $n \in \mathbb{N}$.[^2] In fact, we can show that $G$ is a secure PRG if $F$ is a secure PRF.
> **Theorem.** If $F$ is a secure length-preserving PRF, then $G$ in the above definition is a secure PRG.
*Proof*. Suppose that $\mathcal{A}$ is an efficient PRG adversary against $G$. We construct an efficient $n$-query PRF adversary $\mathcal{B}$ that queries the challenger at $1, \dots, n$ and receives $f(1), \dots, f(n)$. $\mathcal{B}$ passes $f(1) \parallel \cdots \parallel f(n)$ to $\mathcal{A}$, and outputs whatever $\mathcal{A}$ outputs. Then we see that $\mathrm{Adv}_{\mathrm{PRG}}[\mathcal{A}, G] = \mathrm{Adv}_{\mathrm{PRF}}[\mathcal{B}, F]$. So if $F$ is secure, then $G$ is secure.
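A sketch of the construction, with HMAC-SHA256 standing in for the secure PRF $F$ (an assumption for illustration; any secure PRF works here):

```python
# G(s) = F_s(1) || F_s(2) || ... || F_s(n), with HMAC-SHA256 standing in
# for the PRF F (an illustrative assumption, not part of the theorem).
import hashlib
import hmac

def F(k: bytes, x: int) -> bytes:
    return hmac.new(k, x.to_bytes(8, "big"), hashlib.sha256).digest()

def G(s: bytes, n: int) -> bytes:
    return b"".join(F(s, i) for i in range(1, n + 1))

out = G(b"seed", 4)
assert len(out) == 4 * 32               # four 32-byte PRF outputs
assert out[:64] == G(b"seed", 2)        # deterministic in the seed s
```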
As for the converse, a PRG $G$ gives a PRF $F$ with small input length. If $G : \left\lbrace 0, 1 \right\rbrace^n \rightarrow \left\lbrace 0, 1 \right\rbrace^{n 2^m}$, we can define a PRF $F : \left\lbrace 0, 1 \right\rbrace^n \times \left\lbrace 0, 1 \right\rbrace^m \rightarrow \left\lbrace 0, 1 \right\rbrace^n$ as follows: for a seed $s \in \left\lbrace 0, 1 \right\rbrace^n$, consider $G(s)$ as a $2^m \times n$ table and set $F(s, i)$ as the $i$-th row of $G(s)$.[^3] If $G$ is a secure PRG, then the PRF $F$ is also secure.
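A sketch of this table construction, with SHAKE-128 standing in for the PRG $G$ (an illustrative stand-in; also note that each query re-expands all of $G(s)$, so this is only practical for small $m$):

```python
# F(s, i) = the i-th row of G(s), viewing G(s) as a 2^m x n table.
# SHAKE-128 stands in for the PRG G (an illustrative assumption).
import hashlib

row_bytes, m = 16, 4                    # 2^4 rows of 16 bytes each

def G(s: bytes) -> bytes:
    return hashlib.shake_128(s).digest(row_bytes * 2 ** m)

def F(s: bytes, i: int) -> bytes:
    assert 0 <= i < 2 ** m              # inputs are m-bit strings
    return G(s)[i * row_bytes:(i + 1) * row_bytes]

rows = [F(b"seed", i) for i in range(2 ** m)]
assert len(set(rows)) == 2 ** m         # rows are distinct for this seed
```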
Similarly, a PRP $E$ is **secure** if it is **indistinguishable from a random permutation**.
>
> **Experiment $b$**.
> 1. The challenger selects $f \in \mathcal{P}[X]$ as follows
>    - If $b = 0$, choose $k \leftarrow \mathcal{K}$ and set $f = E_k = E(k, \cdot)$.
>    - If $b = 1$, choose $f \leftarrow \mathcal{P}[X]$.
> 2. The adversary sends a sequence of queries to the challenger.
>    - For $i = 1, \dots, q$, send $x_i \in X$ and receive $y_i = f(x_i) \in X$.
> 3. The adversary computes and outputs a bit $b' \in \left\lbrace 0, 1 \right\rbrace$.
>
> Let $W_b$ be the event that $\mathcal{A}$ outputs $1$ in experiment $b$. Then the **PRP-advantage** of $\mathcal{A}$ with respect to $E$ is defined as
>
> $$
> \mathrm{Adv}_{\mathrm{PRP}}^q[\mathcal{A}, E] = \left\lvert \Pr[W_0] - \Pr[W_1] \right\rvert.
> $$
>
> A PRP $E$ is **secure** if $\mathrm{Adv}_{\mathrm{PRP}}^q[\mathcal{A}, E]$ is negligible for any efficient $\mathcal{A}$.
### PRF Switching Lemma
Suppose that $\left\lvert X \right\lvert$ is sufficiently large. Then for $q$ queries of any adversary $\mathcal{A}$, it is highly probable that $f(x _ i)$ are all distinct, regardless of whether $f$ is a PRF or a PRP. Thus we have the following property of PRPs.
> **Lemma.** If $E: \mathcal{K} \times X \rightarrow X$ is a secure PRP and $\left\lvert X \right\lvert$ is sufficiently large, then $E$ is also a secure PRF.
>
> For any $q$-query adversary $\mathcal{A}$,
>
> $$
> \left\lvert \mathrm{Adv} _ {\mathrm{PRF}}[\mathcal{A}, E] - \mathrm{Adv} _ {\mathrm{PRP}}[\mathcal{A}, E] \right\lvert \leq \frac{q^2}{2\left\lvert X \right\lvert}.
> $$
This is a matter of *collisions* of $f(x _ i)$, so we use facts from the birthday problem.
*Proof*. Appendix A.4.
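As a worked example of the bound, the following sketch evaluates $q^2 / (2\left\lvert X \right\lvert)$ for an AES-sized domain $X = \left\lbrace 0, 1 \right\rbrace^{128}$: even after $2^{40}$ queries, the PRF- and PRP-advantages can differ by at most $2^{-49}$.

```python
# Evaluating the switching-lemma bound q^2 / (2|X|) for an AES-sized domain
# X = {0,1}^128: even after q = 2^40 queries, the PRF- and PRP-advantages
# differ by at most 2^-49, which is negligible.
from fractions import Fraction

def switching_bound(q: int, domain_bits: int) -> Fraction:
    return Fraction(q * q, 2 * 2**domain_bits)

bound = switching_bound(2**40, 128)  # = 2^-49
```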
A **block cipher** is actually a different name for PRPs, since a PRP $E$ is a keyed permutation.
Block ciphers commonly have the following form.
- A key $k$ is chosen uniformly from $\left\lbrace 0, 1 \right\rbrace^s$.
- The key $k$ goes through *key expansion* and generates $k _ 1, \dots, k _ n$. These are called **round keys**, where $n$ is the number of rounds.
- The plaintext goes through $n$ rounds; in each round, a round function and the round key are applied to the input.
- After $n$ rounds, the ciphertext is obtained.
### Feistel Network
Since block ciphers are PRPs, we have to build an invertible function. Suppose we are given **any** functions $F _ 1, \dots, F _ d : \left\lbrace 0, 1 \right\rbrace^n \rightarrow \left\lbrace 0, 1 \right\rbrace^n$. Can we build an **invertible** function $F : \left\lbrace 0, 1 \right\rbrace^{2n} \rightarrow \left\lbrace 0, 1 \right\rbrace^{2n}$?
![mc-02-feistel-network.png](../../../assets/img/posts/lecture-notes/modern-cryptography/mc-02-feistel-network.png)
It turns out the answer is yes. Given a $2n$-bit input, let $L _ 0$ and $R _ 0$ denote the left and right halves ($n$ bits each) of the input. Define
$$
L _ i = R _ {i-1}, \qquad R _ i = F _ i(R _ {i-1}) \oplus L _ {i-1}
$$
for $i = 1, \dots, d$. Then we can restore $L _ 0$ and $R _ 0$ from $L _ d$ and $R _ d$ by applying the same operations in **reverse order**. It is easy to see that $R _ {i-1}$ can be restored from $L _ i$. As for $L _ {i-1}$, we need a swap. Observe that
$$
F _ i(L _ i) \oplus R _ i = F _ i(R _ {i-1}) \oplus (F _ i(R _ {i-1}) \oplus L _ {i-1}) = L _ {i-1}.
$$
Note that we did not require $F _ i$ to be invertible. We can build invertible functions from arbitrary functions! These are called **Feistel networks**.
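The construction and its inverse can be sketched directly. In this minimal sketch the round functions are keyed SHA-256 truncations, invented purely to show that arbitrary non-invertible $F _ i$ still yield an invertible map.

```python
# A minimal Feistel network over 16-byte blocks. The round functions F_i are
# arbitrary and non-invertible (SHA-256 truncations, invented for
# illustration), yet the overall map is invertible.
import hashlib

N = 8  # half-block size in bytes

def round_fn(i: int, x: bytes) -> bytes:
    # An arbitrary, non-invertible round function F_i.
    return hashlib.sha256(bytes([i]) + x).digest()[:N]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))

def feistel_encrypt(block: bytes, rounds: int = 3) -> bytes:
    L, R = block[:N], block[N:]
    for i in range(1, rounds + 1):
        # L_i = R_{i-1},  R_i = F_i(R_{i-1}) XOR L_{i-1}
        L, R = R, xor(round_fn(i, R), L)
    return L + R

def feistel_decrypt(block: bytes, rounds: int = 3) -> bytes:
    L, R = block[:N], block[N:]
    for i in range(rounds, 0, -1):
        # L_{i-1} = F_i(L_i) XOR R_i,  R_{i-1} = L_i
        L, R = xor(round_fn(i, L), R), L
    return L + R
```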
> **Theorem.** (Luby-Rackoff'85) If $F : K \times \left\lbrace 0, 1 \right\rbrace^n \rightarrow \left\lbrace 0, 1 \right\rbrace^n$ is a secure PRF, then the $3$-round Feistel network using the functions $F _ i = F(k _ i, \cdot)$ is a secure PRP.
In DES, the function $F _ i$ is the DES round function.
![mc-02-des-round.png](../../../assets/img/posts/lecture-notes/modern-cryptography/mc-02-des-round.png)
These 4 modules are all invertible!
For DES, the S-box is the non-linear part. If the S-box were linear, then the entire DES cipher would be linear.
Specifically, there would be fixed binary matrices $B _ 1 \in \mathbb{Z} _ 2^{64 \times 64}$ and $B _ 2 \in \mathbb{Z} _ 2^{64 \times (48 \times 16)}$ such that
$$
\mathrm{DES}(k, m) = B _ 1 m \oplus B _ 2 \mathbf{k}
$$
where $\mathbf{k} \in \mathbb{Z} _ 2^{48 \times 16}$ is the vector of round keys.
Then we can attack DES with the same idea as for the OTP: if $c _ i = B _ 1 m _ i \oplus B _ 2 \mathbf{k}$, then $c _ 1 \oplus c _ 2 = B _ 1 (m _ 1 \oplus m _ 2)$.
Choosing the S-boxes at random results in an insecure block cipher, with high probability.
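The key-cancellation attack on a linear cipher can be demonstrated concretely. In this sketch, `B1` and `B2` are invented shift-XOR maps (linear over $\mathbb{Z} _ 2$) standing in for the matrices above.

```python
# Demonstrating why a linear cipher is broken: with E(k, m) = B1(m) XOR B2(k),
# where B1, B2 are linear maps over GF(2) (invented shift-XOR maps below), the
# key term cancels in c1 XOR c2, leaking B1(m1 XOR m2) to an eavesdropper.
import secrets

MASK = (1 << 64) - 1  # work on 64-bit blocks

def B1(m: int) -> int:
    # Linear over GF(2): an XOR of shifts distributes over XOR of inputs.
    return (m ^ (m << 1) ^ (m >> 3)) & MASK

def B2(k: int) -> int:
    return (k ^ (k << 2)) & MASK

def linear_encrypt(k: int, m: int) -> int:
    return B1(m) ^ B2(k)

key = secrets.randbits(64)
m1, m2 = 0x0123456789ABCDEF, 0xFEDCBA9876543210
c1, c2 = linear_encrypt(key, m1), linear_encrypt(key, m2)
leak = c1 ^ c2  # equals B1(m1 XOR m2): independent of the key
```

The eavesdropper learns a linear relation between the two plaintexts without ever touching the key, exactly as in the two-time-pad attack.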
For DES (and AES-128), it is known that *three pairs of plaintext, ciphertext blocks* are enough to uniquely determine the key.
If we were to find the key by brute force, DES is easy to break. We can strengthen the DES algorithm by using **nested ciphers**. The tradeoff here is that these are slower than the original DES.
> Define
> - (**2DES**) $2E: \mathcal{K}^2 \times \mathcal{M} \rightarrow \mathcal{M}$ as $2E((k _ 1, k _ 2), m) = E(k _ 1, E(k _ 2, m))$.
> - (**3DES**) $3E: \mathcal{K}^3 \times \mathcal{M} \rightarrow \mathcal{M}$ as $3E((k _ 1, k _ 2, k _ 3), m) = E(k _ 1, D(k _ 2, E(k _ 3, m)))$.[^4]
Then the key space has increased exponentially: for 2DES, the key space now has size $2^{112}$, so perhaps nested ciphers increase the level of security.
#### $2E$ is Insecure: Meet in the Middle
Unfortunately, 2DES is only as secure as DES, due to an attack strategy called **meet in the middle**. The idea is that if $c = E(k _ 1, E(k _ 2, m))$, then $D(k _ 1, c) = E(k _ 2, m)$.
![mc-02-2des-mitm.png](../../../assets/img/posts/lecture-notes/modern-cryptography/mc-02-2des-mitm.png)
Since we have a plaintext-ciphertext pair, we first build a table of $(k _ 2, E(k _ 2, m))$ over all $k _ 2 \in \mathcal{K}$, sorted by $E(k _ 2, m)$. Next, we check whether $D(k _ 1, c)$ is in the table, for each $k _ 1 \in \mathcal{K}$.
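The attack can be run end to end at small scale. Here an invented toy cipher with a 16-bit key stands in for DES; finding a consistent key pair costs roughly $2 \cdot 2^{16}$ cipher operations instead of $2^{32}$.

```python
# Meet-in-the-middle against double encryption, using an invented toy cipher
# with a 16-bit key as a stand-in for DES. Finding a consistent key pair costs
# roughly 2 * 2^16 cipher operations instead of 2^32.
import hashlib

def toy_encrypt(k: int, m: int) -> int:
    # Toy 32-bit "block cipher": XOR with a key-derived pad (not secure, just invertible).
    pad = int.from_bytes(hashlib.sha256(k.to_bytes(2, "big")).digest()[:4], "big")
    return m ^ pad

def toy_decrypt(k: int, c: int) -> int:
    return toy_encrypt(k, c)  # an XOR pad is its own inverse

def meet_in_the_middle(m: int, c: int):
    # Forward half: table of E(k2, m) over all k2; then match D(k1, c) against it.
    table = {toy_encrypt(k2, m): k2 for k2 in range(2**16)}
    for k1 in range(2**16):
        mid = toy_decrypt(k1, c)
        if mid in table:
            return k1, table[mid]
    return None

m = 0xDEADBEEF
c = toy_encrypt(0x1234, toy_encrypt(0xBEEF, m))  # 2E with keys (0x1234, 0xBEEF)
found = meet_in_the_middle(m, c)
```

A single plaintext-ciphertext pair may leave several consistent key pairs, which is why a few pairs are needed to pin down the true key.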
The complexity of this attack is shown as follows:
- Space complexity: We must store $\left\lvert \mathcal{K} \right\lvert$ entries with approximately $\log \left\lvert \mathcal{M} \right\lvert$ bits. Thus we need $\left\lvert \mathcal{K} \right\lvert \log \left\lvert \mathcal{M} \right\lvert$ bits in total.
The above argument can be generalized to any scheme $(E, D)$. Thus, the $2E$ construction is insecure.
There is another method called the **$EX$ construction** for a block cipher $(E, D)$ defined over $(\mathcal{K}, \mathcal{X})$. The $EX$ construction defines a new block cipher as follows. Intuitively, it can be thought of as applying OTP before and after encryption.
> Let $k _ 1 \in \mathcal{K}$ and $k _ 2, k _ 3 \in \mathcal{X}$.
> - Encryption: $EX\left((k _ 1, k _ 2, k _ 3), m\right) = E(k _ 1, m \oplus k _ 2) \oplus k _ 3$.
> - Decryption: $DX((k _ 1, k _ 2, k _ 3), c) = D(k _ 1, c \oplus k _ 3) \oplus k _ 2$.
Then the new cipher $(EX, DX)$ has key space $\mathcal{K} \times \mathcal{X}^2$, which is much larger than $\mathcal{K}$. Specifically for DESX, the key length would be $56 + 2 \times 64 = 184$ bits.
As a side note, using only $E(k _ 1, m) \oplus k _ 2$ or $E(k _ 1, m \oplus k _ 2)$ does not improve security, since either can be attacked with the meet-in-the-middle method. Similarly, DESX has a $184$-bit key, but the actual search space is only about $56 + 64 = 120$ bits.
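The $EX$ construction is a one-liner around the underlying cipher. In this sketch, `toy_encrypt`/`toy_decrypt` are an invented XOR-pad stand-in for the pair $(E, D)$.

```python
# Sketch of the EX construction: whitening keys k2, k3 are XORed before and
# after encryption, DESX-style. `toy_encrypt` and `toy_decrypt` are an
# invented stand-in for the underlying pair (E, D).
import hashlib

def toy_encrypt(k1: int, m: int) -> int:
    # Toy 64-bit cipher: XOR with a key-derived pad (illustration only).
    pad = int.from_bytes(hashlib.sha256(k1.to_bytes(8, "big")).digest()[:8], "big")
    return m ^ pad

def toy_decrypt(k1: int, c: int) -> int:
    return toy_encrypt(k1, c)

def ex_encrypt(k1: int, k2: int, k3: int, m: int) -> int:
    return toy_encrypt(k1, m ^ k2) ^ k3

def dx_decrypt(k1: int, k2: int, k3: int, c: int) -> int:
    return toy_decrypt(k1, c ^ k3) ^ k2
```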
### Attacks on AES
---
This notion can be formalized as a security game. The difference here is that we allow the adversary to submit multiple queries.
> **Experiment $b$.**
> 1. The challenger fixes a key $k \leftarrow \mathcal{K}$.
> 2. The adversary submits a sequence of queries to the challenger:
> - The $i$-th query is a pair of messages $m _ {i, 0}, m _ {i, 1} \in \mathcal{M}$ of the same length.
> 3. The challenger computes $c _ i = E(k, m _ {i, b})$ and sends $c _ i$ to the adversary.
> 4. The adversary computes and outputs a bit $b' \in \left\lbrace 0, 1 \right\rbrace$.
>
> Let $W _ b$ be the event that $\mathcal{A}$ outputs $1$ in experiment $b$. Then the **CPA advantage with respect to $\mathcal{E}$** is defined as
>
> $$
> \mathrm{Adv} _ {\mathrm{CPA}}[\mathcal{A}, \mathcal{E}] = \left\lvert \Pr[W _ 0] - \Pr[W _ 1] \right\lvert
> $$
>
> If the CPA advantage is negligible for all efficient adversaries $\mathcal{A}$, then the cipher $\mathcal{E}$ is **semantically secure against chosen plaintext attack**, or simply **CPA secure**.
The assumption that the adversary can obtain encryptions of any message of its choice may seem too strong, but such attacks do occur in practice.
### Deterministic Cipher is not CPA Secure
Suppose that $E$ is deterministic. Then we can construct an adversary that breaks CPA security. For example, the adversary can send the queries $(m _ 0, m _ 1)$ and $(m _ 0, m _ 2)$. If $b = 0$, the two received ciphertexts are the same, so the adversary can output $0$ and win the CPA security game.
Therefore, for *indistinguishability under chosen plaintext attack* (IND-CPA), encryption must produce different outputs even for the same plaintext.
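The two-query adversary can be simulated directly. In this sketch the challenger uses an invented deterministic toy cipher; the adversary guesses $b' = 0$ exactly when the two ciphertexts collide, winning with advantage $1$.

```python
# The two-query distinguisher against any deterministic cipher: query
# (m0, m1) and then (m0, m2). Under b = 0 both ciphertexts encrypt m0 and
# collide. The challenger uses an invented deterministic toy cipher here.
import hashlib
import secrets

def det_encrypt(k: bytes, m: bytes) -> bytes:
    return hashlib.sha256(k + m).digest()  # deterministic: same m, same c

def cpa_adversary(b: int) -> int:
    """Plays experiment b against a fresh challenger; returns the guess b'."""
    k = secrets.token_bytes(16)
    m0, m1, m2 = b"message-0", b"message-1", b"message-2"
    queries = [(m0, m1), (m0, m2)]
    c = [det_encrypt(k, pair[b]) for pair in queries]
    return 0 if c[0] == c[1] else 1
```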
We also formalize security for nonce-based encryption. It is basically the same as the CPA security game, except that each query also includes a fresh nonce.
> **Experiment $b$**.
> 1. The challenger fixes a key $k \leftarrow \mathcal{K}$.
> 2. The adversary submits a sequence of queries to the challenger.
> - The $i$-th query is a pair of messages $m _ {i, 0}, m _ {i, 1} \in \mathcal{M}$ of the same length, and a nonce $n _ i \in \mathcal{N} \setminus \left\lbrace n _ 1, \dots, n _ {i-1} \right\rbrace$.
> - Nonces should be unique.
> 3. The challenger computes $c _ i = E(k, m _ {i, b}, n _ i)$ and sends $c _ i$ to the adversary.
> 4. The adversary computes and outputs a bit $b' \in \left\lbrace 0, 1 \right\rbrace$.
>
> Let $W _ b$ be the event that $\mathcal{A}$ outputs $1$ in experiment $b$. Then the **CPA advantage with respect to $\mathcal{E}$** is defined as
>
> $$
> \mathrm{Adv} _ {\mathrm{nCPA}}[\mathcal{A}, \mathcal{E}] = \left\lvert \Pr[W _ 0] - \Pr[W _ 1] \right\lvert
> $$
>
> If the CPA advantage is negligible for all efficient adversaries $\mathcal{A}$, then the nonce-based cipher $\mathcal{E}$ is **semantically secure against chosen plaintext attack**, or simply **CPA secure**.
We learned how to encrypt a single block. How do we encrypt longer messages with block ciphers?
There are many ways of processing multiple blocks; the choice is called the **mode of operation**.
Additional explanation is available in [Modes of Operations (Internet Security)](../../internet-security/2023-09-18-symmetric-key-cryptography-2/#modes-of-operations).
### Electronic Codebook Mode (ECB)
There is a security proof for CBC mode.
> For any $q$-query adversary $\mathcal{A}$, there exists a PRP adversary $\mathcal{B}$ such that
>
> $$
> \mathrm{Adv} _ {\mathrm{CPA}}[\mathcal{A}, E] \leq 2 \cdot \mathrm{Adv} _ {\mathrm{PRP}}[\mathcal{B}, E] + \frac{2q^2L^2}{\left\lvert X \right\lvert}.
> $$
*Proof*. See Theorem 5.4.[^2]
From the above theorem, note that CBC is only secure as long as $q^2 L^2 \ll \left\lvert X \right\lvert$.
Also, CBC mode is not secure if the adversary can predict the IV of the next message. Proceed as follows:
> 1. Query the challenger for an encryption of $m _ 0$ and $m _ 1$.
> 2. Receive $\mathrm{IV} _ 0, E(k, \mathrm{IV} _ 0 \oplus m _ 0)$ and $\mathrm{IV} _ 1, E(k, \mathrm{IV} _ 1 \oplus m _ 1)$.
> 3. Predict the next IV as $\mathrm{IV} _ 2$, and set the new query pair as
>
> $$
> m _ 0' = \mathrm{IV} _ 2 \oplus \mathrm{IV} _ 0 \oplus m _ 0, \quad m _ 1' = \mathrm{IV} _ 2 \oplus \mathrm{IV} _ 1 \oplus m _ 1
> $$
>
> and send it to the challenger.
> 4. In experiment $b$, the adversary will receive $E(k, \mathrm{IV} _ b \oplus m _ b)$. Compare this with the result of the query from (2). The adversary wins with advantage $1$.
(More on this to be added)
We can also use a **unique** nonce to generate the IV. Specifically,
$$
\mathrm{IV} = E(k _ 1, n)
$$
where $k _ 1$ is a new key and $n$ is a nonce. The ciphertext starts with $n$ instead of the $\mathrm{IV}$.
Note that if $k _ 1$ is the same as the key used for encrypting messages, then this scheme is insecure. See Exercise 5.14.[^2]
### Counter Mode (CTR)
There is also a security proof for CTR mode.
> For any $q$-query adversary $\mathcal{A}$ against $E$, there exists a PRF adversary $\mathcal{B}$ such that
>
> $$
> \mathrm{Adv} _ {\mathrm{CPA}}[\mathcal{A}, E] \leq 2\cdot\mathrm{Adv} _ {\mathrm{PRF}}[\mathcal{B}, F] + \frac{4q^2L}{\left\lvert X \right\lvert}.
> $$
*Proof.* Refer to Theorem 5.3.[^2]
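A minimal sketch of counter-mode encryption: each block of the message is XORed with the pad $F(k, \mathrm{iv}), F(k, \mathrm{iv}+1), \dots$. Here the PRF is an HMAC-SHA256 truncation, an assumed stand-in rather than a real block cipher such as AES.

```python
# Counter-mode encryption sketch: XOR the message with the pad
# F(k, iv), F(k, iv+1), ... . The PRF F is an HMAC-SHA256 truncation here,
# standing in for a real block cipher such as AES.
import hashlib
import hmac

BLOCK = 16

def F(k: bytes, x: int) -> bytes:
    return hmac.new(k, x.to_bytes(16, "big"), hashlib.sha256).digest()[:BLOCK]

def ctr_encrypt(k: bytes, iv: int, m: bytes) -> bytes:
    out = bytearray()
    for i in range(0, len(m), BLOCK):
        pad = F(k, (iv + i // BLOCK) % 2**128)  # counter wraps mod 2^128
        out += bytes(a ^ b for a, b in zip(m[i : i + BLOCK], pad))
    return bytes(out)

ctr_decrypt = ctr_encrypt  # XOR stream: decryption is the same operation
```

Since CTR only uses $F$ in the forward direction, a PRF suffices; no invertibility is needed, which is why the bound above is stated against a PRF adversary.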
---
This is called **canonical verification**. All real-world MACs use canonical verification.
In the security definition of MACs, we allow the attacker to request tags for arbitrary messages of its choice; this is called a **chosen-message attack**. This assumption allows the attacker to collect a bunch of valid $(m, t)$ pairs. In this setting, we require the attacker to forge a **new** valid message-tag pair, different from the pairs the attacker already has. Also, it is not required that the forged message $m$ have any meaning. This is called **existential forgery**. A MAC system is secure if existential forgery is almost impossible. Note that we deliberately give the adversary this much power in the definition, to be conservative.
- The attacker is given $t _ i \leftarrow S(k, m _ i)$ for messages $m _ 1, \dots, m _ q$ of its choice.
- The attacker has a *signing oracle*.
- The attacker's goal is **existential forgery**.
  - **MAC**: generate a *new* valid message-tag pair $(m, t)$ such that $V(k, m, t) = 1$ and $m \notin \left\lbrace m _ 1, \dots, m _ q \right\rbrace$.
  - **Strong MAC**: generate a *new* valid message-tag pair $(m, t)$ such that $V(k, m, t) = 1$ and $(m, t) \notin \left\lbrace (m _ 1, t _ 1), \dots, (m _ q, t _ q) \right\rbrace$.
For strong MACs, the attacker only has to change the tag for the attack to succeed.
>
> 1. The challenger picks a random $k \leftarrow \mathcal{K}$.
> 2. $\mathcal{A}$ queries the challenger $q$ times.
> - For the $i$-th signing query, $\mathcal{A}$ sends a message $m _ i$ and receives $t _ i \leftarrow S(k, m _ i)$.
> 3. $\mathcal{A}$ outputs a new forged pair $(m, t)$ that is not among the queried pairs.
> - $m \notin \left\lbrace m _ 1, \dots, m _ q \right\rbrace$
> - $(m, t) \notin \left\lbrace (m _ 1, t _ 1), \dots, (m _ q, t _ q) \right\rbrace$ (for strong MAC)
>
> $\mathcal{A}$ wins if $(m, t)$ is a valid pair under $k$. Let this event be $W$. The **MAC advantage** with respect to $\Pi$ is defined as
>
> $$
> \mathrm{Adv} _ {\mathrm{MAC}}[\mathcal{A}, \Pi] = \Pr[W]
> $$
>
> and a MAC $\Pi$ is secure if the advantage is negligible for any efficient $\mathcal{A}$. In this case, we say that $\Pi$ is **existentially unforgeable under a chosen message attack**.
If a MAC is secure, the attacker learns almost nothing useful from the $q$ queries; i.e., the queries do not help produce a forgery.
### MAC Security with Verification Queries
The above definition can be modified to include **verification queries**, where the adversary $\mathcal{A}$ queries $(m _ j, t _ j) \in \mathcal{M} \times \mathcal{T}$ and the challenger responds with $V(k, m _ j, t _ j)$. $\mathcal{A}$ wins if any verification query returns $1$ ($\texttt{accept}$).
It can be shown that for **strong MACs**, these two definitions are equivalent. See Theorem 6.1.[^1] For (plain) MACs, they are not equivalent. See Exercise 6.7.[^1]
This MAC is **derived from $F$**, and is deterministic. This scheme is secure as long as $F$ is a secure PRF.
> For every efficient MAC adversary $\mathcal{A}$ against $\Pi$, there exists an efficient PRF adversary $\mathcal{B}$ such that
>
> $$
> \mathrm{Adv} _ {\mathrm{MAC}}[\mathcal{A}, \Pi] \leq \mathrm{Adv} _ {\mathrm{PRF}}[\mathcal{B}, F] + \frac{1}{\left\lvert Y \right\lvert}.
> $$
*Proof*. See Theorem 6.2.[^1]
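The PRF-derived MAC with canonical verification is a few lines of code. In this sketch, HMAC-SHA256 is used only as a convenient stand-in for a secure PRF.

```python
# The PRF-derived deterministic MAC with canonical verification:
# S(k, m) = F(k, m) and V accepts iff t equals F(k, m). HMAC-SHA256 is used
# here only as a convenient stand-in for a secure PRF.
import hashlib
import hmac

def S(k: bytes, m: bytes) -> bytes:
    return hmac.new(k, m, hashlib.sha256).digest()

def V(k: bytes, m: bytes, t: bytes) -> bool:
    return hmac.compare_digest(S(k, m), t)  # canonical (constant-time) verification
```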
The above construction uses a PRF, so it is restricted to messages of fixed size.
![mc-04-cbc-mac.png](../../../assets/img/posts/lecture-notes/modern-cryptography/mc-04-cbc-mac.png)
> **Definition.** For any message $m = (m _ 0, m _ 1, \dots, m _ {l-1}) \in \left\lbrace 0, 1 \right\rbrace^{nl}$, let $F _ k := F(k, \cdot)$.
>
> $$
> S _ \mathrm{CBC}(m) = F _ k(F _ k(\cdots F _ k(F _ k(m _ 0) \oplus m _ 1) \oplus \cdots) \oplus m _ {l-1}).
> $$
$S _ \mathrm{CBC}$ is similar to CBC mode encryption, but there is no intermediate output, and the IV is fixed as $0^n$.
> **Theorem.** If $F : \mathcal{K} \times \left\lbrace 0, 1 \right\rbrace^n \rightarrow \left\lbrace 0, 1 \right\rbrace^n$ is a secure PRF, then **for a fixed $l$**, CBC-MAC is secure for messages $\mathcal{M} = \left\lbrace 0, 1 \right\rbrace^{nl}$.
For any messages *shorter than* $nl$, CBC-MAC is not secure. So the length of the message must be fixed in advance.
To see this, consider the following **extension attack**.
1. Pick an arbitrary $m _ 0 \in \left\lbrace 0, 1 \right\rbrace^n$.
2. Request the tag $t = F(k, m _ 0)$.
3. Set $m _ 1 = t \oplus m _ 0$ and output $(m _ 0, m _ 1) \in \left\lbrace 0, 1 \right\rbrace^{2n}$ with $t$ as the tag.
Then the verification works, since
$$
S _ \mathrm{CBC}(k, (m _ 0, t \oplus m _ 0)) = F(k, F(k, m _ 0) \oplus (t \oplus m _ 0)) = F(k, m _ 0) = t.
$$
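The extension attack can be executed against a CBC-MAC oracle. In this self-contained sketch, `F` is again an HMAC-SHA256 truncation standing in for the PRF; the forged two-block message $(m _ 0, t \oplus m _ 0)$ verifies under the *same* tag $t$.

```python
# Executing the extension attack against a one-block CBC-MAC oracle. F is an
# HMAC-SHA256 truncation standing in for the PRF; the forged two-block
# message (m0, t XOR m0) verifies under the same tag t.
import hashlib
import hmac

N = 16

def F(k: bytes, x: bytes) -> bytes:
    return hmac.new(k, x, hashlib.sha256).digest()[:N]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(p ^ q for p, q in zip(a, b))

def cbc_mac(k: bytes, blocks: list) -> bytes:
    tag = bytes(N)
    for blk in blocks:
        tag = F(k, xor(tag, blk))
    return tag

key = b"a-secret-mac-key"   # known only to the signing oracle
m0 = b"arbitrary block!"    # step 1: pick any single block
t = cbc_mac(key, [m0])      # step 2: request its tag, t = F(k, m0)
m1 = xor(t, m0)             # step 3: forged second block
forgery_valid = cbc_mac(key, [m0, m1]) == t
```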
#### Random IV is Insecure #### Random IV is Insecure
@@ -165,21 +165,21 @@ If we use random IV instead of $0^n$, CBC-MAC is insecure. Suppose a random IV w
Then the verification works since
$$
S _ \mathrm{CBC}(k, \mathrm{IV} \oplus m) = F(k, (\mathrm{IV} \oplus m) \oplus \mathrm{IV}) = F(k, m) = t.
$$
#### Disclosing Intermediate Values is Insecure
If CBC-MAC outputs all intermediate values of $F(k, \cdot)$, then CBC-MAC is insecure. Consider the following attack.
1. Pick an arbitrary $(m _ 0, m _ 1) \in \left\lbrace 0, 1 \right\rbrace^{2n}$.
2. Request the computed values $(t _ 0, t)$, where $t _ 0 = F(k, m _ 0)$ and $t = F(k, m _ 1 \oplus t _ 0)$.
3. Send $(m _ 0, m _ 0 \oplus t _ 0) \in \left\lbrace 0, 1 \right\rbrace^{2n}$ and tag $t _ 0$.
Then the verification works since
$$
S _ \mathrm{CBC}(k, (m _ 0, m _ 0 \oplus t _ 0)) = F(k, F(k, m _ 0) \oplus (m _ 0 \oplus t _ 0)) = F(k, m _ 0) = t _ 0.
$$
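One way this failure plays out, sketched under the assumption that the IV travels in the clear with the tag (toy PRF again, for illustration only): the adversary keeps the tag but shifts the XOR difference from the IV into the message.

```python
import hmac, hashlib, secrets

BLOCK = 16

def F(k, x):
    # Stand-in PRF (assumption for illustration).
    return hmac.new(k, x, hashlib.sha256).digest()[:BLOCK]

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def sign(k, m):
    iv = secrets.token_bytes(BLOCK)  # random IV, sent in the clear
    return iv, F(k, xor(iv, m))      # one-block CBC-MAC with random IV

def verify(k, iv, m, t):
    return F(k, xor(iv, m)) == t

k = secrets.token_bytes(16)
m = b"transfer $100   "
iv, t = sign(k, m)
# Forgery: keep tag t, set IV' = 0 and message m' = IV XOR m.
# Verification computes F(k, IV' XOR m') = F(k, IV XOR m) = t.
assert verify(k, bytes(BLOCK), xor(iv, m), t)
```

The message $\mathrm{IV} \oplus m$ was never signed, yet it verifies with the old tag.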
The lesson is that *cryptographic constructions should be implemented exactly as specified, without any unproven variations*.
@@ -196,15 +196,15 @@ However, this cannot be used if the length of the message is not known in advanc
> **Proposition.** Appending the length of the message in CBC-MAC is insecure.
*Proof*. Let $n$ be the length of a block. Query $m _ 1, m _ 2, m _ 1 \parallel n \parallel m _ 3$ and receive $3$ tags, $t _ 1 = E _ k(E _ k(m _ 1) \oplus n)$, $t _ 2 = E _ k(E _ k(m _ 2) \oplus n)$, $t _ 3 = E _ k(E _ k(t _ 1 \oplus m _ 3) \oplus 3n)$.
Now forge a message-tag pair $(m _ 2 \parallel n \parallel (m _ 3 \oplus t _ 1 \oplus t _ 2), t _ 3)$. Then the tag is
$$
E _ k(E _ k(\overbrace{E _ k(E _ k(m _ 2) \oplus n)}^{t _ 2} \oplus m _ 3 \oplus t _ 1 \oplus t _ 2) \oplus 3n) = E _ k(E _ k(t _ 1 \oplus m _ 3) \oplus 3n)
$$
which equals $t _ 3$. Note that the same logic works if the length is *anywhere* in the message, except for the beginning.
### Encrypt Last Block (ECBC-MAC)
@@ -214,12 +214,12 @@ ECBC-MAC doesn't require us to know the message length in advance, but it is rel
![mc-04-ecbc-mac.png](../../../assets/img/posts/lecture-notes/modern-cryptography/mc-04-ecbc-mac.png)
> **Theorem.** Let $F : \mathcal{K} \times X \rightarrow X$ be a secure PRF. Then for any $l \geq 0$, $F _ \mathrm{ECBC} : \mathcal{K}^2 \times X^{\leq l} \rightarrow X$ is a secure PRF.
>
> For any efficient $q$-query PRF adversary $\mathcal{A}$ against $F _ \mathrm{ECBC}$, there exists an efficient PRF adversary $\mathcal{B}$ such that
>
> $$
> \mathrm{Adv} _ {\mathrm{PRF}}[\mathcal{A}, F _ \mathrm{ECBC}] \leq \mathrm{Adv} _ {\mathrm{PRF}}[\mathcal{B}, F] + \frac{2q^2l^2}{\left\lvert X \right\rvert}.
> $$
>
> [^2]
@@ -238,12 +238,12 @@ It is easy to see that (E)CBC is an extendable PRF.
#### Attacking ECBC with $\sqrt{\left\lvert X \right\rvert}$ Messages
1. Make $q = \sqrt{\left\lvert X \right\rvert}$ queries using random messages $m _ i \in X$ and obtain $t _ i = F _ \mathrm{ECBC}(k, m _ i)$.
2. With a high probability, there is a collision $t _ i = t _ j$ for $i \neq j$.
3. Query for $m _ i \parallel m$ and receive the tag $t$.
4. Return a forged pair $(m _ j \parallel m, t)$.
This works because ECBC is an extendable PRF. $t$ also works as a valid tag for $m _ j \parallel m$.
So ECBC becomes insecure after signing $\sqrt{\left\lvert X \right\rvert}$ messages.


@@ -34,18 +34,18 @@ Now we define a stronger notion of security against **chosen ciphertext attacks*
> **Experiment $b$.**
> 1. The challenger fixes a key $k \leftarrow \mathcal{K}$.
> 2. $\mathcal{A}$ makes a series of queries to the challenger, each of which is one of the following two types.
> - *Encryption*: Send $m _ i$ and receive $c' _ i = E(k, m _ i)$.
> - *Decryption*: Send $c _ i$ and receive $m' _ i = D(k, c _ i)$.
> - Note that $\mathcal{A}$ is not allowed to make a decryption query for any $c _ i'$.
> 3. $\mathcal{A}$ outputs a pair of messages $(m _ 0^\ast , m _ 1^\ast)$.
> 4. The challenger generates $c^\ast \leftarrow E(k, m _ b^\ast)$ and gives it to $\mathcal{A}$.
> 5. $\mathcal{A}$ is allowed to keep making queries, but not allowed to make a decryption query for $c^\ast$.
> 6. The adversary computes and outputs a bit $b' \in \left\lbrace 0, 1 \right\rbrace$.
>
> Let $W _ b$ be the event that $\mathcal{A}$ outputs $1$ in experiment $b$. Then the **CCA advantage with respect to $\mathcal{E}$** is defined as
>
> $$
> \mathrm{Adv} _ {\mathrm{CCA}}[\mathcal{A}, \mathcal{E}] = \left\lvert \Pr[W _ 0] - \Pr[W _ 1] \right\rvert.
> $$
>
> If the CCA advantage is negligible for all efficient adversaries $\mathcal{A}$, then $\mathcal{E}$ is **semantically secure against a chosen ciphertext attack**, or simply **CCA secure**.
@@ -54,7 +54,7 @@ Now we define a stronger notion of security against **chosen ciphertext attacks*
None of the encryption schemes seen thus far is CCA secure.
Recall the [CPA secure construction from PRF](../2023-09-19-symmetric-key-encryption/#secure-construction-from-prf). This scheme is not CCA secure. Suppose that the adversary is given $c^\ast = (r, F(k, r) \oplus m _ b)$. Then it can request a decryption of $c' = (r, s')$ for some $s'$ and receive $m' = s' \oplus F(k, r)$. Then $F(k, r) = m' \oplus s'$, so the adversary can successfully recover $m _ b$.
In general, any encryption scheme that allows ciphertexts to be *manipulated* in a controlled way cannot be CCA secure.
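The attack on the PRF-based scheme can be sketched in a few lines, again with HMAC-SHA256 standing in for the PRF (an assumption for illustration only):

```python
import hmac, hashlib, secrets

def F(k, r):
    # Stand-in PRF (assumption for illustration).
    return hmac.new(k, r, hashlib.sha256).digest()

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

k = secrets.token_bytes(16)
m_b = b"the challenge message, 32 bytes."
r = secrets.token_bytes(16)
c_star = (r, xor(F(k, r), m_b))   # challenge ciphertext (r, F(k,r) XOR m_b)

# Decryption query for c' = (r, s') with a fresh s', so c' != c_star.
s_prime = secrets.token_bytes(32)
m_prime = xor(s_prime, F(k, r))   # what the decryption oracle returns
pad = xor(m_prime, s_prime)       # recovers the pad F(k, r)
assert xor(c_star[1], pad) == m_b  # m_b is fully recovered
```

The decryption oracle never saw $c^\ast$, yet one query on a sibling ciphertext leaks the entire pad.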
@@ -66,14 +66,14 @@ Suppose that there is a proxy server in the middle, that forwards the message to
An adversary at destination $\texttt{25}$ wants to receive the message sent to destination $\texttt{80}$. This can be done by modifying the destination to $\texttt{25}$.
Suppose we used CBC mode encryption. Then the first block of the ciphertext would contain the IV, and the next block would contain $E(k, \mathrm{IV} \oplus m _ 0)$.
The adversary can generate a new ciphertext $c'$ without knowing the actual key. Set the new IV as $\mathrm{IV}' = \mathrm{IV} \oplus m^\ast$, where $m^\ast$ contains a payload that changes $\texttt{80}$ to $\texttt{25}$. (Such a payload can be calculated.)
Then the decryption works as normal,
$$
D(k, c _ 0) \oplus \mathrm{IV}' = (m _ 0 \oplus \mathrm{IV}) \oplus \mathrm{IV}' = m _ 0 \oplus m^\ast.
$$
The destination of the original message has been changed, even though the adversary had no information about the key.
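The IV manipulation needs no cryptography at all, only XOR. A minimal sketch (the header layout `dest: 80` is a hypothetical example, not the protocol's actual format):

```python
def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

# Hypothetical first plaintext block carrying the destination (16-byte block).
m0 = b"dest: 80        "
target = b"dest: 25        "

iv = bytes(range(16))     # the IV the sender happened to use
# The adversary never touches c0 = E(k, IV XOR m0); it only replaces the IV.
m_star = xor(m0, target)  # payload turning "80" into "25"
iv_new = xor(iv, m_star)

# Receiver decrypts: D(k, c0) XOR IV' = (m0 XOR IV) XOR IV' = m0 XOR m_star.
decrypted = xor(xor(m0, iv), iv_new)
assert decrypted == target
```

Note that `decrypted` is computed exactly as the receiver would, yet equals the adversary's chosen block.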
@@ -90,12 +90,12 @@ In this case, we fix the decryption algorithm so that $D : \mathcal{K} \times \m
>
> 1. The challenger picks a random $k \leftarrow \mathcal{K}$.
> 2. $\mathcal{A}$ queries the challenger $q$ times.
> - The $i$-th query is a message $m _ i$, and $\mathcal{A}$ receives $c _ i \leftarrow E(k, m _ i)$.
> 3. $\mathcal{A}$ outputs a candidate ciphertext $c \in \mathcal{C}$ that is not among the ciphertexts it was given by querying.
>
> $\mathcal{A}$ wins if $c$ is a valid ciphertext under $k$, i.e., $D(k, c) \neq \bot$.
>
> The **CI advantage** with respect to $\mathcal{E}$, $\mathrm{Adv} _ {\mathrm{CI}}[\mathcal{A}, \mathcal{E}]$, is defined as the probability that $\mathcal{A}$ wins the game. If the advantage is negligible for any efficient $\mathcal{A}$, we say that $\mathcal{E}$ provides **ciphertext integrity** (CI).
If a scheme provides ciphertext integrity, then decryption will almost surely return $\bot$ for a randomly generated ciphertext, and also for a valid ciphertext that was changed even slightly.
@@ -119,10 +119,10 @@ This theorem enables us to use AE secure schemes as a CCA secure scheme.
> **Theorem.** Let $\mathcal{E} = (E, D)$ be a cipher. If $\mathcal{E}$ is AE-secure, then it is CCA-secure.
>
> For any efficient $q$-query CCA adversary $\mathcal{A}$, there exist efficient adversaries $\mathcal{B} _ \mathrm{CPA}$ and $\mathcal{B} _ \mathrm{CI}$ such that
>
> $$
> \mathrm{Adv} _ {\mathrm{CCA}}[\mathcal{A}, \mathcal{E}] \leq \mathrm{Adv} _ {\mathrm{CPA}}[\mathcal{B} _ \mathrm{CPA}, \mathcal{E}] + 2q \cdot \mathrm{Adv} _ {\mathrm{CI}}[\mathcal{B} _ \mathrm{CI}, \mathcal{E}].
> $$
*Proof*. Check Theorem 9.1.[^1]
@@ -148,7 +148,7 @@ In **Encrypt-and-MAC**, encryption and authentication is done in parallel.
> Given a message $m$, the sender outputs $(c, t)$ where
>
> $$
> c \leftarrow E(k _ 1, m), \quad t \leftarrow S(k _ 2, m).
> $$
This approach does not provide AE. In general, the tag may leak some information about the original message. This is because MACs do not care about the privacy of messages.
@@ -162,10 +162,10 @@ In **MAC-then-Encrypt**, the tag is computed and the message-tag pair is encrypt
> Given a message $m$, the sender outputs $c$ where
>
> $$
> t \leftarrow S(k _ 2, m), \quad c \leftarrow E(k _ 1, m \parallel t).
> $$
>
> Decryption is done by $(m, t) \leftarrow D(k _ 1, c)$ and then verifying the tag with $V(k _ 2, m, t)$.
This is not secure either: it is known that an attacker can decrypt all traffic using a chosen ciphertext attack (padding oracle attacks). Check Section 9.4.2.[^1]
@@ -176,23 +176,23 @@ In **Encrypt-then-MAC**, the encrypted message is signed, and is known to be sec
> Given a message $m$, the sender outputs $(c, t)$ where
>
> $$
> c \leftarrow E(k _ 1, m), \quad t \leftarrow S(k _ 2, c).
> $$
>
> Decryption is done by returning $D(k _ 1, c)$ only if verification $V(k _ 2, c, t)$ succeeds.
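A compact sketch of Encrypt-then-MAC, using a toy SHA-256 counter keystream as the CPA component (an assumption for illustration; in practice use a real cipher such as AES-CTR) and HMAC as the strongly secure MAC:

```python
import hmac, hashlib, secrets

def keystream(k: bytes, n: int) -> bytes:
    # Toy keystream (assumption for illustration, not a vetted cipher).
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(k + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return out[:n]

def encrypt_then_mac(k1: bytes, k2: bytes, m: bytes):
    nonce = secrets.token_bytes(16)
    body = bytes(a ^ b for a, b in zip(m, keystream(k1 + nonce, len(m))))
    c = nonce + body
    t = hmac.new(k2, c, hashlib.sha256).digest()  # MAC over the FULL ciphertext
    return c, t

def decrypt(k1, k2, c, t):
    if not hmac.compare_digest(hmac.new(k2, c, hashlib.sha256).digest(), t):
        return None                               # reject before decrypting
    nonce, body = c[:16], c[16:]
    return bytes(a ^ b for a, b in zip(body, keystream(k1 + nonce, len(body))))

k1, k2 = secrets.token_bytes(16), secrets.token_bytes(16)  # independent keys
c, t = encrypt_then_mac(k1, k2, b"attack at dawn")
assert decrypt(k1, k2, c, t) == b"attack at dawn"
assert decrypt(k1, k2, c[:-1] + bytes([c[-1] ^ 1]), t) is None  # tampering fails
```

The design choice to check the MAC *before* decrypting is what turns MAC security into ciphertext integrity.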
> **Theorem.** Let $\mathcal{E} = (E, D)$ be a cipher and let $\Pi = (S, V)$ be a MAC system. If $\mathcal{E}$ is a CPA secure cipher and $\Pi$ is a strongly secure MAC, then $\mathcal{E} _ \mathrm{EtM}$ is AE secure.
>
> For every efficient CI adversary $\mathcal{A} _ \mathrm{CI}$ attacking $\mathcal{E} _ \mathrm{EtM}$, there exists an efficient MAC adversary $\mathcal{B} _ \mathrm{MAC}$ attacking $\Pi$ such that
>
> $$
> \mathrm{Adv} _ {\mathrm{CI}}[\mathcal{A} _ \mathrm{CI}, \mathcal{E} _ \mathrm{EtM}] = \mathrm{Adv} _ {\mathrm{MAC}}[\mathcal{B} _ \mathrm{MAC}, \Pi].
> $$
>
> For every efficient CPA adversary $\mathcal{A} _ \mathrm{CPA}$ attacking $\mathcal{E} _ \mathrm{EtM}$, there exists an efficient CPA adversary $\mathcal{B} _ \mathrm{CPA}$ attacking $\mathcal{E}$ such that
>
> $$
> \mathrm{Adv} _ {\mathrm{CPA}}[\mathcal{A} _ \mathrm{CPA}, \mathcal{E} _ \mathrm{EtM}] = \mathrm{Adv} _ {\mathrm{CPA}}[\mathcal{B} _ \mathrm{CPA}, \mathcal{E}].
> $$
*Proof*. See Theorem 9.2.[^1]
@@ -201,7 +201,7 @@ In **Encrypt-then-MAC**, the encrypted message is signed, and is known to be sec
#### Common Mistakes in EtM Implementation
- Do not use the same key for $\mathcal{E}$ and $\Pi$. The security proof above relies on the fact that the two keys $k _ 1, k _ 2 \in \mathcal{K}$ were chosen independently. See Exercise 9.8.[^1]
- MAC must be applied to the full ciphertext. For example, if the IV is not protected by the MAC, the attacker can create a new valid ciphertext by changing the IV.
[^1]: A Graduate Course in Applied Cryptography


@@ -29,12 +29,12 @@ But *cryptographic hash functions* are different. They should *avoid* collisions
Intuitively, a function $H$ is collision resistant if it is computationally infeasible to find a collision for $H$. Formally, this can also be defined in the form of a security game.
> **Definition.** Let $H$ be a hash function defined over $(\mathcal{M}, \mathcal{T})$. Given an adversary $\mathcal{A}$, the adversary outputs two messages $m _ 0, m _ 1 \in \mathcal{M}$.
>
> $\mathcal{A}$ wins the game if $H(m _ 0) = H(m _ 1)$ and $m _ 0 \neq m _ 1$. The **advantage** of $\mathcal{A}$ with respect to $H$ is defined as the probability that $\mathcal{A}$ wins the game.
>
> $$
> \mathrm{Adv} _ {\mathrm{CR}}[\mathcal{A}, H] = \Pr[H(m _ 0) = H(m _ 1) \wedge m _ 0 \neq m _ 1].
> $$
>
> If the advantage is negligible for any efficient adversary $\mathcal{A}$, then the hash function $H$ is **collision resistant**.
@@ -59,10 +59,10 @@ Let $\Pi = (S, V)$ be a MAC scheme defined over $(\mathcal{K}, \mathcal{M}, \mat
>
> If $\Pi$ is a secure MAC and $H$ is collision resistant, then $\Pi'$ is a secure MAC.
>
> For any efficient adversary $\mathcal{A}$ attacking $\Pi'$, there exist a MAC adversary $\mathcal{B} _ \mathrm{MAC}$ attacking $\Pi$ and an adversary $\mathcal{B} _ \mathrm{CR}$ attacking $H$ such that
>
> $$
> \mathrm{Adv} _ {\mathrm{MAC}}[\mathcal{A}, \Pi'] \leq \mathrm{Adv} _ {\mathrm{MAC}}[\mathcal{B} _ \mathrm{MAC}, \Pi] + \mathrm{Adv} _ {\mathrm{CR}}[\mathcal{B} _ \mathrm{CR}, H].
> $$
*Proof*. See Theorem 8.1.[^2]
@@ -83,8 +83,8 @@ Actually, the attacker doesn't have to hash that many messages. This is because
Let $N$ be the size of the hash space. (If the hash is $n$ bits, then $N = 2^n$.)
> 1. Sample $s$ uniform random messages $m _ 1, \dots, m _ s \in \mathcal{M}$.
> 2. Compute $x _ i \leftarrow H(m _ i)$.
> 3. Find and output a collision if it exists.
> **Lemma.** The above algorithm will output a collision with probability at least $1/2$ when $s \geq 1.2\sqrt{N}$.
@@ -92,7 +92,7 @@ Let $N$ be the size of the hash space. (If the hash is $n$ bits, then $N = 2^n$)
*Proof*. We show that the probability of no collisions is less than $1/2$. The probability that there is no collision is
$$
\prod _ {i=1}^{s-1}\left( 1-\frac{i}{N} \right) \leq \prod _ {i=1}^{s-1} \exp\left( -\frac{i}{N} \right) = \exp\left( -\frac{s(s-1)}{2N} \right).
$$
So solving $\exp\left( -s(s-1)/2N \right) < 1/2$ for $s$ gives approximately $s \geq \sqrt{(2\log2)N} \approx 1.17 \sqrt{N}$.
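The $\sqrt{N}$ behavior is easy to observe on a deliberately small hash space. The sketch below truncates SHA-256 to $24$ bits (a toy choice so that $N = 2^{24}$ is searchable) and counts how many hashes are needed before a collision appears:

```python
import hashlib

def H(m: bytes) -> bytes:
    # Truncated 24-bit toy hash, so N = 2^24 (for demonstration only).
    return hashlib.sha256(m).digest()[:3]

N = 2 ** 24

seen = {}
i = 0
while True:
    m = i.to_bytes(8, "big")   # distinct messages m_1, m_2, ...
    d = H(m)
    if d in seen:
        m1, m2 = seen[d], m    # collision found
        break
    seen[d] = m
    i += 1

assert m1 != m2 and H(m1) == H(m2)
assert i < N // 100            # used far fewer than N hashes
```

Typically the loop stops after a few thousand hashes, on the order of $\sqrt{N} = 4096$, rather than anywhere near $N \approx 16$ million.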
@@ -113,13 +113,13 @@ The Merkle-Damgård transform gives as a way to extend our input domain of the h
>
> 1. Given an input $m \in \left\lbrace 0, 1 \right\rbrace^{\leq L}$, pad $m$ so that the length of $m$ is a multiple of $l$.
> - The padding block $\mathrm{PB}$ must contain an encoding of the input message length, i.e., it is of the form $100\dots00 \parallel \left\lvert m \right\rvert$.
> 2. Then partition the input into $l$-bit blocks so that $m' = m _ 1 \parallel m _ 2 \parallel \cdots \parallel m _ s$.
> 3. Set $t _ 0 \leftarrow \mathrm{IV} \in \left\lbrace 0, 1 \right\rbrace^n$.
> 4. For $i = 1, \dots, s$, calculate $t _ i \leftarrow h(t _ {i-1}, m _ i)$.
> 5. Return $t _ s$.
- The function $h$ is called the **compression function**.
- The $t _ i$ values are called **chaining values**.
- Note that because the padding block can be at most $l$ bits, the maximum message length is $2^l$; usually $l = 64$, so it is enough.
- $\mathrm{IV}$ is fixed to some value, and is usually set to some complicated string.
- We included the length of the message in the padding. This will be used in the security proof.
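The steps above can be sketched directly. Here SHA-256 stands in for the compression function $h$ and the block and length-field sizes are toy choices (all assumptions for illustration; real designs fix their own $h$, $l$, and IV):

```python
import hashlib

L_BLOCK = 16    # l: block length in bytes (toy choice)
IV = bytes(32)  # fixed IV; n = 256 bits here

def h(t: bytes, block: bytes) -> bytes:
    # Toy compression function (assumption; any collision resistant h works).
    return hashlib.sha256(t + block).digest()

def md_hash(m: bytes) -> bytes:
    # Padding block: 1000...0 followed by an 8-byte encoding of the
    # message length in bits (Merkle-Damgård strengthening).
    pad = b"\x80"
    while (len(m) + len(pad) + 8) % L_BLOCK:
        pad += b"\x00"
    padded = m + pad + (8 * len(m)).to_bytes(8, "big")
    t = IV
    for i in range(0, len(padded), L_BLOCK):  # t_i = h(t_{i-1}, m_i)
        t = h(t, padded[i:i + L_BLOCK])
    return t
```

For example, `md_hash(b"abc")` chains one padded block through `h`; longer inputs simply chain more blocks while the output stays $n$ bits.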
@@ -130,17 +130,17 @@ The Merkle-Damgård construction is secure.
*Proof*. We prove this by contradiction. Suppose that an adversary $\mathcal{A}$ of $H$ found a collision for $H$. Let $H(m) = H(m')$ for $m \neq m'$. Now we construct an adversary $\mathcal{B}$ of $h$. $\mathcal{B}$ will examine $m$ and $m'$ and work its way backwards.
Suppose that $m = m _ 1 \cdots m _ u$ and $m' = m _ 1' \cdots m _ v'$. Let the chaining values be $t _ i = h(t _ {i-1}, m _ i)$ and $t _ i' = h(t _ {i-1}', m _ i')$. Then since $H(m) = H(m')$, the very last iteration should give the same output.
$$
h(t _ {u-1}, m _ u) = h(t _ {v-1}', m _ v').
$$
Suppose that $t _ {u-1} \neq t _ {v-1}'$ or $m _ u \neq m _ v'$. Then this is a collision for $h$, so $\mathcal{B}$ returns this collision, and we are done. So suppose otherwise. Then $t _ {u-1} = t _ {v-1}'$ and $m _ u = m _ v'$. But because the last block contains the padding, the padding values must be the same, which means that the lengths of the two messages must have been the same, so $u = v$.
Now we have $t _ {u-1} = t _ {u-1}'$, which implies $h(t _ {u-2}, m _ {u-1}) = h(t _ {u-2}', m _ {u-1}')$. We can now repeat the same process until the first block. If $\mathcal{B}$ did not find any collision, then it means that $m _ i = m _ i'$ for all $i$, so $m = m'$. This is a contradiction, so $\mathcal{B}$ must have found a collision.
By the above argument, we see that $\mathrm{Adv} _ {\mathrm{CR}}[\mathcal{A}, H] = \mathrm{Adv} _ {\mathrm{CR}}[\mathcal{B}, h]$.
### Attacking Merkle-Damgård Hash Functions
@@ -150,7 +150,7 @@ See Joux's attack.[^2]
Now we only have to build a collision resistant compression function. We can build these functions from either a block cipher, or by using number theoretic primitives.
Number theoretic primitives will be shown after we learn some number theory.[^3] An example is shown in [collision resistance using DL problem (Modern Cryptography)](../2023-10-03-key-exchange/#collision-resistance-based-on-dl-problem).
![mc-06-davies-meyer.png](../../../assets/img/posts/lecture-notes/modern-cryptography/mc-06-davies-meyer.png)
@@ -169,7 +169,7 @@ Due to the birthday attack, we see that this bound is the best possible.
There are other constructions of $h$ using the block cipher, but some of them are totally insecure. Here are some insecure examples:
$$
h _ 1(x, y) = E(y, x) \oplus y, \quad h _ 2(x, y) = E(x, x \oplus y) \oplus x.
$$
Also, just using $E(y, x)$ is insecure.
@@ -195,7 +195,7 @@ We needed a complicated construction for MACs that work on long messages. We mig
Here are a few approaches. Suppose that a compression function $h$ is given and $H$ is a Merkle-Damgård function derived from $h$.
Recall that [we can construct a MAC scheme from a PRF](../2023-09-21-macs/#mac-constructions-from-prfs), so either we want a secure PRF or a secure MAC scheme.
#### Prepending the Key
@@ -211,7 +211,7 @@ Define $S(k, m) = H(k \parallel M \parallel k)$. This can be proven to be a secu
#### Two-Key Nest
Define $S((k _ 1, k _ 2), m) = H(k _ 2 \parallel H(k _ 1 \parallel m))$. This can also be proven to be a secure PRF under reasonable assumptions. See Section 8.7.1.[^2]
This can be thought of as blocking the length extension attack on the prepending-the-key method.
![mc-06-hmac.png](../../../assets/img/posts/lecture-notes/modern-cryptography/mc-06-hmac.png)

This is a variant of the two-key nest, but the difference is that the keys $k_1, k_2$ are not independent. Choose a key $k \leftarrow \mathcal{K}$, and set

$$
k_1 = k \oplus \texttt{ipad}, \quad k_2 = k \oplus \texttt{opad}
$$

where $\texttt{ipad} = \texttt{0x363636}\ldots$ and $\texttt{opad} = \texttt{0x5C5C5C}\ldots$. Then

$$
\mathrm{HMAC}(k, m) = H(k_2 \parallel H(k_1 \parallel m)).
$$
The security proof given for the two-key nest does not apply here, since $k_1$ and $k_2$ are not independent. With stronger assumptions on $h$, we get an almost optimal security bound.
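As a sanity check, the construction above is exactly what standard HMAC implementations compute: the key is zero-padded to the block size of $H$ (64 bytes for SHA-256) before XORing with `ipad`/`opad`. A minimal sketch, compared against Python's `hmac` module:

```python
import hashlib
import hmac

def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    B = 64                                   # SHA-256 block size in bytes
    if len(key) > B:                         # long keys are hashed first
        key = hashlib.sha256(key).digest()
    key = key.ljust(B, b"\x00")              # pad key to the block size
    k1 = bytes(b ^ 0x36 for b in key)        # inner key: k xor ipad
    k2 = bytes(b ^ 0x5C for b in key)        # outer key: k xor opad
    inner = hashlib.sha256(k1 + msg).digest()
    return hashlib.sha256(k2 + inner).digest()   # H(k2 || H(k1 || m))

assert hmac_sha256(b"key", b"msg") == hmac.new(b"key", b"msg", hashlib.sha256).digest()
```

Note that the nesting is what defeats length extension: an attacker who extends the inner hash still faces the outer keyed application of $H$.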
## The Random Oracle Model

---

Let $p$ be a large prime, and let $q$ be another large prime dividing $p - 1$. We typically use very large random primes: $p$ is about $2048$ bits long, and $q$ is about $256$ bits long.

All arithmetic will be done in $\mathbb{Z}_p$. We also consider $\mathbb{Z}_p^\ast$, the **unit group** of $\mathbb{Z}_p$. Since $\mathbb{Z}_p$ is a field, $\mathbb{Z}_p^\ast = \mathbb{Z}_p \setminus \left\lbrace 0 \right\rbrace$, meaning that $\mathbb{Z}_p^\ast$ has order $p - 1$.

Since $q$ is a prime dividing $p - 1$, $\mathbb{Z}_p^\ast$ has an element $g$ of order $q$.[^1] Let

$$
G = \left\langle g \right\rangle = \left\lbrace 1, g, g^2, \dots, g^{q-1} \right\rbrace \leq \mathbb{Z}_p^\ast.
$$

We assume that the descriptions of $p$, $q$ and $g$ are generated at setup and shared by all parties. Now the actual protocol goes like this.
![mc-07-dhke.png](../../../assets/img/posts/lecture-notes/modern-cryptography/mc-07-dhke.png)

> 1. Alice chooses $\alpha \leftarrow \mathbb{Z}_q$ and computes $g^\alpha$.
> 2. Bob chooses $\beta \leftarrow \mathbb{Z}_q$ and computes $g^\beta$.
> 3. Alice and Bob exchange $g^\alpha$ and $g^\beta$ over an insecure channel.
> 4. Using $\alpha$ and $g^\beta$, Alice computes $g^{\alpha\beta}$.
> 5. Using $\beta$ and $g^\alpha$, Bob computes $g^{\alpha\beta}$.
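The steps above can be simulated directly. This sketch uses toy parameters chosen only so the numbers stay readable (never secure in practice): $p = 23$, $q = 11$ divides $p - 1 = 22$, and $g = 2$ has order $11$ in $\mathbb{Z}_{23}^\ast$ since $2^{11} = 2048 \equiv 1 \pmod{23}$.

```python
import secrets

# Toy parameters; real deployments use ~2048-bit p and ~256-bit q.
p, q, g = 23, 11, 2

alpha = secrets.randbelow(q)      # Alice's secret exponent
beta = secrets.randbelow(q)       # Bob's secret exponent
A = pow(g, alpha, p)              # Alice sends g^alpha over the channel
B = pow(g, beta, p)               # Bob sends g^beta over the channel

key_alice = pow(B, alpha, p)      # (g^beta)^alpha
key_bob = pow(A, beta, p)         # (g^alpha)^beta
assert key_alice == key_bob == pow(g, alpha * beta, p)
```

An eavesdropper sees only $p, q, g, g^\alpha, g^\beta$; recovering the shared key from these is exactly the CDH problem below.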
It works!
The protocol is secure if and only if the following holds.

> Let $\alpha, \beta \leftarrow \mathbb{Z}_q$. Given $g^\alpha, g^\beta \in G$, it is hard to compute $g^{\alpha\beta} \in G$.

This is called the **computational Diffie-Hellman assumption**. As we will see below, this is not as strong as the discrete logarithm assumption. But in the real world, the CDH assumption is reasonable enough for groups where the DL assumption holds.
## Discrete Logarithm and Related Assumptions
We have used $E(x) = g^x$ in the above implementation. This function is called the **discrete exponentiation function**. It is actually a *group isomorphism*, so it has an inverse function called the **discrete logarithm function**. The name comes from the fact that if $u = g^x$, then it can be written as '$x = \log_g u$'.

We required that $E$ be a one-way function for the protocol to work, so it must be hard to compute the discrete logarithm function. There are some problems related to the discrete logarithm, which are used as assumptions in security proofs. They are formalized as security games, as usual.

$G = \left\langle g \right\rangle \leq \mathbb{Z}_p^\ast$ will be a *cyclic group* of order $q$, and $g$ is given as a generator. Note that $g$ and $q$ are also given to the adversary.
### Discrete Logarithm Problem (DL)
> **Definition.** Let $\mathcal{A}$ be a given adversary.
>
> 1. The challenger chooses $\alpha \leftarrow \mathbb{Z}_q$ and sends $u = g^\alpha$ to the adversary.
> 2. The adversary calculates and outputs some $\alpha' \in \mathbb{Z}_q$.
>
> We define the **advantage in solving the discrete logarithm problem for $G$** as
>
> $$
> \mathrm{Adv}_{\mathrm{DL}}[\mathcal{A}, G] = \Pr[\alpha = \alpha'].
> $$
>
> We say that the **discrete logarithm (DL) assumption** holds for $G$ if for any efficient adversary $\mathcal{A}$, $\mathrm{Adv}_{\mathrm{DL}}[\mathcal{A}, G]$ is negligible.
So if we assume the DL assumption, it means that the DL problem is **hard**, i.e., no efficient adversary can solve the DL problem for $G$.
### Computational Diffie-Hellman Problem (CDH)
> **Definition.** Let $\mathcal{A}$ be a given adversary.
>
> 1. The challenger chooses $\alpha, \beta \leftarrow \mathbb{Z}_q$ and sends $g^\alpha, g^\beta$ to the adversary.
> 2. The adversary calculates and outputs some $w \in G$.
>
> We define the **advantage in solving the computational Diffie-Hellman problem for $G$** as
>
> $$
> \mathrm{Adv}_{\mathrm{CDH}}[\mathcal{A}, G] = \Pr[w = g^{\alpha\beta}].
> $$
>
> We say that the **computational Diffie-Hellman (CDH) assumption** holds for $G$ if for any efficient adversary $\mathcal{A}$, $\mathrm{Adv}_{\mathrm{CDH}}[\mathcal{A}, G]$ is negligible.
An interesting property here is that given $(g^\alpha, g^\beta)$ and a candidate $w$, it is hard to determine whether $w$ is a solution to the problem, i.e., whether $w = g^{\alpha\beta}$.
Since recognizing a solution to the CDH problem is hard, we have another assumption.

### Decisional Diffie-Hellman Problem (DDH)
> **Definition.** Let $\mathcal{A}$ be a given adversary. We define two experiments $0$ and $1$.
>
> **Experiment $b$**.
> 1. The challenger chooses $\alpha, \beta, \gamma \leftarrow \mathbb{Z}_q$ and computes the following.
>
> $$
> u = g^\alpha, \quad v = g^\beta, \quad w_0 = g^{\alpha\beta}, \quad w_1 = g^\gamma.
> $$
>
> 2. The challenger sends the triple $(u, v, w_b)$ to the adversary.
> 3. The adversary calculates and outputs a bit $b' \in \left\lbrace 0, 1 \right\rbrace$.
>
> Let $W_b$ be the event that $\mathcal{A}$ outputs $1$ in experiment $b$. We define the **advantage in solving the decisional Diffie-Hellman problem for $G$** as
>
> $$
> \mathrm{Adv}_{\mathrm{DDH}}[\mathcal{A}, G] = \left\lvert \Pr[W_0] - \Pr[W_1] \right\rvert.
> $$
>
> We say that the **decisional Diffie-Hellman (DDH) assumption** holds for $G$ if for any efficient adversary $\mathcal{A}$, $\mathrm{Adv}_{\mathrm{DDH}}[\mathcal{A}, G]$ is negligible.
For $\alpha, \beta, \gamma \in \mathbb{Z}_q$, the triple $(g^\alpha, g^\beta, g^\gamma)$ is called a **DH-triple** if $\gamma = \alpha\beta$. So the assumption says that no efficient adversary can distinguish DH-triples from non-DH-triples.
### Relations Between Problems
If we used the DL assumption and it turns out to be false, there will be an efficient algorithm for computing discrete logarithms.
Suppose we want something like a secret group chat, where there are $N$ ($\geq 3$) people who need to generate a shared secret key. It is known that $N$-party Diffie-Hellman is possible in $N - 1$ rounds. Here's how it goes. All indices are taken modulo $N$.

Each party $i$ chooses $\alpha_i \leftarrow \mathbb{Z}_q$ and computes $g^{\alpha_i}$. The parties communicate in a circular fashion, each passing the computed value to party $i + 1$. In the next round, party $i$ receives $g^{\alpha_{i-1}}$, computes $g^{\alpha_{i-1}\alpha_i}$, and passes it to the next party. After $N - 1$ rounds, all parties have the shared key $g^{\alpha_1\cdots\alpha_N}$.

Taking $\mathcal{O}(N)$ rounds is impractical in the real world, due to the amount of communication the above algorithm requires. Researchers are looking for methods to generate a shared key in a single round. This has been solved for $N = 3$ using bilinear pairings, but for $N \geq 4$ it is an open problem.
The adversary impersonates Bob when communicating with Alice, and impersonates Alice when communicating with Bob.
## Collision Resistance Based on DL Problem
Suppose that the DL problem is hard on the group $G = \left\langle g \right\rangle$ of prime order $q$. Choose an element $h \in G$, and define a hash function $H : \mathbb{Z}_q \times \mathbb{Z}_q \rightarrow G$ as

$$
H(\alpha, \beta) = g^\alpha h^\beta.
$$
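A sketch of the standard argument for why a collision would break the DL assumption: suppose an adversary finds $(\alpha_1, \beta_1) \neq (\alpha_2, \beta_2)$ with $H(\alpha_1, \beta_1) = H(\alpha_2, \beta_2)$. Then

$$
g^{\alpha_1} h^{\beta_1} = g^{\alpha_2} h^{\beta_2} \implies h^{\beta_1 - \beta_2} = g^{\alpha_2 - \alpha_1}.
$$

Here $\beta_1 \neq \beta_2$, since otherwise $g^{\alpha_1} = g^{\alpha_2}$ would force $\alpha_1 = \alpha_2$. As $q$ is prime, $\beta_1 - \beta_2$ is invertible in $\mathbb{Z}_q$, and the adversary recovers

$$
\log_g h = (\alpha_2 - \alpha_1)(\beta_1 - \beta_2)^{-1} \in \mathbb{Z}_q,
$$

contradicting the hardness of the DL problem. So $H$ is collision resistant under the DL assumption.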
The idea was to use *puzzles*, which are problems that can be solved with some effort.
![mc-07-merkle-puzzles.png](../../../assets/img/posts/lecture-notes/modern-cryptography/mc-07-merkle-puzzles.png)

> Let $\mathcal{E} = (E, D)$ be a block cipher defined over $(\mathcal{K}, \mathcal{M})$.
> 1. Alice chooses random pairs $(k_i, s_i) \leftarrow \mathcal{K} \times \mathcal{M}$ for $i = 1, \dots, L$.
> 2. Alice constructs $L$ puzzles, each defined as a triple $(E(k_i, s_i), E(k_i, i), E(k_i, 0))$.
> 3. Alice randomly shuffles these puzzles and sends them to Bob.
> 4. Bob picks a random puzzle $(c_1, c_2, c_3)$ and solves it by **brute force**, trying all $k \in \mathcal{K}$ until some $k$ with $D(k, c_3) = 0$ is found.
> - If Bob finds two different keys, he tells Alice that the protocol failed and they start over.
> 5. Bob computes $l = D(k, c_2)$ and $s = D(k, c_1)$, and sends $l$ to Alice.
> 6. Alice locates the $l$-th puzzle and sets $s = s_l$.
If successful, Alice and Bob agree on a secret message $s \in \mathcal{M}$. It can be seen that Alice has to do $\mathcal{O}(L)$ work, while Bob has to do $\mathcal{O}(\left\lvert \mathcal{K} \right\rvert)$ work.

For block ciphers, we commonly set $\mathcal{K}$ large enough so that brute force attacks are infeasible. So for Merkle puzzles, we reduce the key space. For example, if we were to use AES-128 as $\mathcal{E}$, we could set the first $96$ bits of the key to $0$. Then the search space is reduced to $2^{32}$, which is feasible for Bob.

Now consider an adversary who obtains all puzzles $P_i$ and the value $l$. To obtain the secret message $s_l$, the adversary has to locate the puzzle $P_l$. But since the puzzles are in random order, the adversary has to solve all puzzles until $P_l$ is found. Thus, the adversary must spend time $\mathcal{O}(L \left\lvert \mathcal{K} \right\rvert)$ to obtain $s$. So we have a quadratic gap here.
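The whole protocol can be simulated at toy scale. The "block cipher" below is a stand-in invented for the demo (XOR with a key-derived pad, so $D = E$), and the key space is shrunk to $8$ bits so Bob's brute force is visible:

```python
import hashlib
import secrets

KEYBITS = 8          # tiny key space so brute force is feasible in the demo
L = 16               # number of puzzles
BLK = 8              # "block" size in bytes

def E(k, m):         # toy cipher: XOR with a key-derived pad (so D == E)
    pad = hashlib.sha256(bytes([k])).digest()[:BLK]
    return bytes(a ^ b for a, b in zip(pad, m))

# 1-3. Alice builds L puzzles and shuffles them.
pairs = [(secrets.randbelow(2**KEYBITS), secrets.token_bytes(BLK)) for _ in range(L)]
puzzles = [(E(k, s), E(k, i.to_bytes(BLK, "big")), E(k, bytes(BLK)))
           for i, (k, s) in enumerate(pairs)]
shuffled = puzzles[:]
secrets.SystemRandom().shuffle(shuffled)

# 4-5. Bob picks one puzzle and brute-forces the key space.
c1, c2, c3 = shuffled[0]
k = next(kk for kk in range(2**KEYBITS) if E(kk, c3) == bytes(BLK))
l = int.from_bytes(E(k, c2), "big")   # puzzle index, sent to Alice in the clear
s_bob = E(k, c1)                      # the shared secret

# 6. Alice looks up the l-th puzzle; both sides now hold the same s.
assert s_bob == pairs[l][1]
```

Note that Bob solves one puzzle ($\mathcal{O}(\lvert\mathcal{K}\rvert)$ work), while an eavesdropper who sees only $l$ must solve puzzles until the one encoding index $l$ turns up.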
### Performance Issues
It is unknown whether we can get a better gap (than quadratic) using a general symmetric primitive.
To get exponential gaps, we need number theory.
[^1]: By Cauchy's theorem, or use the fact that $\mathbb{Z}_p^\ast$ is commutative. Finite commutative groups have a subgroup of every order that divides the order of the group.
[^2]: R. Impagliazzo and S. Rudich. Limits on the provable consequences of one-way permutations. In Proceedings of the Symposium on Theory of Computing (STOC), pages 44–61, 1989.

---

Let $n$ be a positive integer and let $p$ be prime.

> **Notation.** Let $\mathbb{Z}$ denote the set of integers. We will write $\mathbb{Z}_n = \left\lbrace 0, 1, \dots, n - 1 \right\rbrace$.

> **Definition.** Let $x, y \in \mathbb{Z}$. $\gcd(x, y)$ is the **greatest common divisor** of $x, y$. $x$ and $y$ are relatively prime if $\gcd(x, y) = 1$.

> **Definition.** The **multiplicative inverse** of $x \in \mathbb{Z}_n$ is an element $y \in \mathbb{Z}_n$ such that $xy = 1$ in $\mathbb{Z}_n$.

> **Lemma.** $x \in \mathbb{Z}_n$ has a multiplicative inverse if and only if $\gcd(x, n) = 1$.

> **Definition.** $\mathbb{Z}_n^\ast$ is the set of invertible elements in $\mathbb{Z}_n$, i.e., $\mathbb{Z}_n^\ast = \left\lbrace x \in \mathbb{Z}_n : \gcd(x, n) = 1 \right\rbrace$.

> **Lemma.** (Extended Euclidean Algorithm) For $x, y \in \mathbb{Z}$, there exist $a, b \in \mathbb{Z}$ such that $ax + by = \gcd(x, y)$.
Let $G$ be a group.
> **Definition.** $G$ is **cyclic** if there exists $g \in G$ such that $G = \left\langle g \right\rangle$.

> **Theorem.** $\mathbb{Z}_p^\ast$ is cyclic.

*Proof*. $\mathbb{Z}_p$ is a finite field, so $\mathbb{Z}_p^\ast = \mathbb{Z}_p \setminus \left\lbrace 0 \right\rbrace$ is cyclic.

> **Theorem.** If $G$ is a finite group, then $g^{\left\lvert G \right\rvert} = 1$ for all $g \in G$, i.e., $\left\lvert g \right\rvert \mid \left\lvert G \right\rvert$.

*Proof*. Consider $\left\langle g \right\rangle \leq G$; then the result follows from Lagrange's theorem.

> **Corollary.** (Fermat's Little Theorem) If $x \in \mathbb{Z}_p^\ast$, then $x^{p-1} = 1$.

*Proof*. $\mathbb{Z}_p^\ast$ has $p - 1$ elements.

> **Corollary.** (Euler's Generalization) If $x \in \mathbb{Z}_n^\ast$, then $x^{\phi(n)} = 1$.

*Proof*. $\mathbb{Z}_n^\ast$ has $\phi(n)$ elements, where $\phi(n)$ is Euler's totient function.
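Both corollaries are easy to check numerically; the sketch below verifies them exhaustively for the small moduli $n = 15$ and $p = 13$:

```python
from math import gcd

# Euler's generalization: x^phi(n) = 1 for every x in Z_n^*.
n = 15
units = [x for x in range(1, n) if gcd(x, n) == 1]   # Z_15^*
phi = len(units)                                      # phi(15) = 8
assert all(pow(x, phi, n) == 1 for x in units)

# Fermat's little theorem: x^(p-1) = 1 for every nonzero x mod a prime p.
p = 13
assert all(pow(x, p - 1, p) == 1 for x in range(1, p))
```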
---
There are group-specific algorithms that exploit the algebraic features of the group.
## Baby Step Giant Step Method (BSGS)
Let $G = \left\langle g \right\rangle$, where $g \in G$ has order $q$. $q$ need not be prime for this method. We are given $u = g^\alpha$, $g$, and $q$. Our task is to find $\alpha \in \mathbb{Z}_q$.

Set $m = \left\lceil \sqrt{q} \right\rceil$. $\alpha$ is currently unknown, but by the division algorithm, there exist integers $i, j$ such that $\alpha = i \cdot m + j$ and $0 \leq i, j < m$. Then $u = g^\alpha = g^{i \cdot m + j} = g^{im} \cdot g^j$. Therefore,

$$
u \cdot (g^{-m})^i = g^j.
$$
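This gives the algorithm: precompute the "baby steps" $g^j$ for $j < m$, then repeatedly multiply $u$ by $g^{-m}$ ("giant steps") until a baby step is hit. A sketch, tested on the toy group $\left\langle 2 \right\rangle \leq \mathbb{Z}_{23}^\ast$ of order $11$:

```python
from math import isqrt

def bsgs(g, u, q, p):
    """Solve g^x = u in Z_p^*, where g has order q; O(sqrt(q)) time and space."""
    m = isqrt(q - 1) + 1                          # m = ceil(sqrt(q))
    baby = {pow(g, j, p): j for j in range(m)}    # baby steps g^j, j = 0..m-1
    giant = pow(g, q - m, p)                      # g^{-m}, using g^q = 1
    cur = u % p
    for i in range(m):                            # giant steps: u * (g^{-m})^i
        if cur in baby:
            return i * m + baby[cur]
        cur = cur * giant % p
    return None                                   # no solution in <g>

assert bsgs(2, pow(2, 7, 23), 11, 23) == 7
```

Compared to brute force, this trades $\mathcal{O}(q)$ time for $\mathcal{O}(\sqrt{q})$ time and $\mathcal{O}(\sqrt{q})$ memory.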
Let $G$ be a cyclic group of composite order $n$. First, we start with a simple case.

### Prime Power Case
Let $G = \left\langle g \right\rangle$ be a cyclic group of order $q^e$.[^1] ($q > 1$, $e \geq 1$) We are given $g$, $q$, $e$ and $u = g^\alpha$, and we will find $\alpha$. ($0 \leq \alpha < q^e$)

For each $f = 0, \dots, e$, define $g_f = g^{(q^f)}$. Then

$$
(g_f)^{(q^{e-f})} = g^{(q^f) \cdot (q^{e-f})} = g^{(q^e)} = 1.
$$

So $g_f$ generates a cyclic subgroup of order $q^{e-f}$. In particular, $g_{e-1}$ generates a cyclic subgroup of order $q$. Using this fact, we will reduce the given problem to discrete logarithm problems on groups of smaller order.

We proceed by recursion on $e$. If $e = 1$, then $\alpha \in \mathbb{Z}_q$, so we have nothing to do. Suppose $e > 1$. Choose $f$ so that $1 \leq f \leq e - 1$. We can write $\alpha = i \cdot q^f + j$, where $0 \leq i < q^{e-f}$ and $0 \leq j < q^f$. Then
$$
u = g^\alpha = g^{i \cdot q^f + j} = (g_f)^i \cdot g^j.
$$
Since $g_f$ has order $q^{e-f}$, exponentiate both sides by $q^{e-f}$ to get

$$
u^{(q^{e-f})} = (g_f)^{q^{e-f} \cdot i} \cdot g^{q^{e-f} \cdot j} = (g_{e-f})^j.
$$

Now the problem has been reduced to a discrete logarithm problem with base $g_{e-f}$, which has order $q^f$. We can compute $j$ using algorithms for discrete logarithms.
After finding $j$, we have

$$
u / g^j = (g_f)^i,
$$

which is also a discrete logarithm problem with base $g_f$, which has order $q^{e-f}$. We can compute the $i$ that satisfies this equation. Finally, we compute $\alpha = i \cdot q^f + j$. We have reduced a discrete logarithm problem into two smaller discrete logarithm problems.
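The recursion can be sketched directly. Brute force stands in for the base case here (in practice one would use BSGS or Pollard's rho); the test group is $\mathbb{Z}_{257}^\ast$, where $3$ is a generator of order $256 = 2^8$:

```python
def dlog_bf(g, u, q, p):
    """Brute-force base case: g has order q in Z_p^*."""
    x = 1
    for a in range(q):
        if x == u:
            return a
        x = x * g % p
    raise ValueError("no solution")

def dlog_pp(g, u, q, e, p):
    """Discrete log of u base g, where g has order q**e, by the recursive reduction."""
    if e == 1:
        return dlog_bf(g, u, q, p)
    f = e // 2                          # split the exponent roughly in half
    g_f = pow(g, q**f, p)               # order q**(e-f)
    g_ef = pow(g, q**(e - f), p)        # order q**f
    j = dlog_pp(g_ef, pow(u, q**(e - f), p), q, f, p)   # alpha mod q**f
    v = u * pow(g, -j, p) % p           # u / g^j = (g_f)^i
    i = dlog_pp(g_f, v, q, e - f, p)
    return i * q**f + j

assert dlog_pp(3, pow(3, 200, 257), 2, 8, 257) == 200
```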
To get the best running time, choose $f \approx e/2$. Let $T(e)$ be the running time; then
$$
T(e) = 2\, T(e/2) + \mathcal{O}(e \log q).
$$
The $\mathcal{O}(e \log q)$ term comes from exponentiating both sides by $q^{e-f}$. Solving this recurrence gives

$$
T(e) = \mathcal{O}(e \cdot T_{\mathrm{base}} + e \log e \log q),
$$

where $T_{\mathrm{base}}$ is the complexity of the algorithm for the base case $e = 1$. $T_{\mathrm{base}}$ is usually the dominant term, since the best known algorithm takes $\mathcal{O}(\sqrt{q})$.
Thus, computing the discrete logarithm in $G$ is only as hard as computing it in the subgroup of prime order.
### General Case: Pohlig-Hellman Algorithm
Let $G = \left\langle g \right\rangle$ be a cyclic group of order $n = q_1^{e_1} \cdots q_r^{e_r}$, where the factorization of $n$ into distinct primes $q_i$ is given. We want to find $\alpha$ such that $g^\alpha = u$.

For $i = 1, \dots, r$, define $q_i^\ast = n / q_i^{e_i}$. Then $u^{q_i^\ast} = (g^{q_i^\ast})^\alpha$, where $g^{q_i^\ast}$ has order $q_i^{e_i}$ in $G$. Now compute $\alpha_i$ using the algorithm for the prime power case.

Then for all $i$, we have $\alpha \equiv \alpha_i \pmod{q_i^{e_i}}$. We can now use the Chinese remainder theorem to recover $\alpha$. Let $q_r$ be the largest prime; then the running time is bounded by
$$
\sum_{i=1}^r \mathcal{O}(e_i T(q_i) + e_i \log e_i \log q_i) = \mathcal{O}(T(q_r) \log n + \log n \log \log n)
$$
group operations. Thus, we can conclude the following.
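The full reduce-then-recombine procedure can be sketched as follows. For simplicity the sketch assumes the group order is squarefree (every $e_i = 1$), so each subproblem is just the base case solved by brute force; for higher prime powers, the recursion from the previous subsection would slot in. The test group is $\mathbb{Z}_{31}^\ast$, where $3$ is a generator of order $30 = 2 \cdot 3 \cdot 5$:

```python
def dlog_bf(g, u, q, p):
    """Brute-force discrete log in the order-q subgroup generated by g."""
    x = 1
    for a in range(q):
        if x == u:
            return a
        x = x * g % p
    raise ValueError("no solution")

def pohlig_hellman(g, u, n, prime_factors, p):
    """Find alpha with g^alpha = u, where g has squarefree order n = prod(prime_factors)."""
    x = 0
    for q in prime_factors:
        m = n // q                                   # this is q^* = n / q
        a_q = dlog_bf(pow(g, m, p), pow(u, m, p), q, p)   # alpha mod q
        # CRT accumulation: a_q * m * (m^{-1} mod q) is a_q mod q and 0 mod the others
        x = (x + a_q * m * pow(m, -1, q)) % n
    return x

assert pohlig_hellman(3, pow(3, 17, 31), 30, [2, 3, 5], 31) == 17
```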
### Consequences

- For a group with order $n = 2^k$, the Pohlig-Hellman algorithm will easily compute the discrete logarithm, since the largest prime factor is $2$. The DL assumption is false for this group.
- For primes of the form $p = 2^k + 1$, the group $\mathbb{Z}_p^\ast$ has order $2^k$, so the DL assumption is also false for these primes.
- In general, the order of $G$ must have at least one large prime factor for the DL assumption to hold.
- By the Pohlig-Hellman algorithm, discrete logarithms in groups of composite order are only a little harder than in groups of prime order. So we often use a prime order group.
## Information Leakage in Groups of Composite Order
Let $G = \left\langle g \right\rangle$ be a cyclic group of composite order $n$. We suppose that $n = n_1n_2$, where $n_1$ is a small prime factor.
By the Pohlig-Hellman algorithm, the adversary can compute $\alpha_1 \equiv \alpha \pmod{n_1}$ by computing the discrete logarithm of $u^{n_2}$ with base $g^{n_2}$.
Consider $n_1 = 2$. Then the adversary knows whether $\alpha$ is even or not.
> **Lemma.** $\alpha$ is even if and only if $u^{n/2} = 1$.
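The lemma can be checked exhaustively in a small group; here $\mathbb{Z}_{31}^\ast$ of even order $30$ with generator $3$ is an illustrative choice:

```python
# Verify: alpha is even  <=>  u^(n/2) = 1, for every exponent alpha.
p, g, n = 31, 3, 30   # Z_31^* is cyclic of even order 30, generated by 3
for alpha in range(n):
    u = pow(g, alpha, p)
    assert (pow(u, n // 2, p) == 1) == (alpha % 2 == 0)
```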
This lemma can be used to break the DDH assumption.
> **Lemma.** Given $u = g^\alpha$ and $v = g^\beta$, $\alpha\beta \in \mathbb{Z}_n$ is even if and only if $u^{n/2} = 1$ or $v^{n/2} = 1$.
*Proof*. $\alpha\beta$ is even if and only if either $\alpha$ or $\beta$ is even. By the above lemma, this is equivalent to $u^{n/2} = 1$ or $v^{n/2} = 1$.
The adversary applies this parity test to a DDH triple $(g^\alpha, g^\beta, g^\gamma)$. If $\gamma$ was chosen uniformly, then the adversary wins with probability $1/2$.
The above process can be generalized to any group whose order has a small prime factor. See Exercise 16.2.[^2] Thus, this is another reason we use groups of prime order.
- The DDH assumption does not hold in $\mathbb{Z}_p^\ast$, since its order $p-1$ is always even.
- Instead, we use a prime order subgroup of $\mathbb{Z}_p^\ast$ or a prime order elliptic curve group.
## Summary of Discrete Logarithm Algorithms
|Name|Time Complexity|Space Complexity|
|:-:|:-:|:-:|
|BSGS|$\mathcal{O}(\sqrt{q})$|$\mathcal{O}(\sqrt{q})$|
|Pohlig-Hellman|$\mathcal{O}(\sqrt{q_\mathrm{max}})$|$\mathcal{O}(1)$|
|Pollard's Rho|$\mathcal{O}(\sqrt{q})$|$\mathcal{O}(1)$|
- In generic groups, solving the DLP requires $\Omega(\sqrt{q})$ operations.
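For reference, a minimal baby-step giant-step implementation realizes the $\mathcal{O}(\sqrt{q})$ time/space row of the table (toy parameters for illustration):

```python
from math import isqrt

def bsgs(g, u, q, p):
    # Baby-step giant-step in the order-q subgroup of Z_p^* generated by g:
    # O(sqrt(q)) group operations and O(sqrt(q)) table entries.
    m = isqrt(q) + 1
    baby = {pow(g, j, p): j for j in range(m)}    # baby steps: g^j -> j
    giant = pow(g, -m, p)                         # g^{-m} (modular inverse power)
    x = u
    for i in range(m):
        if x in baby:                             # u * g^{-im} = g^j, so u = g^{im+j}
            return (i * m + baby[x]) % q
        x = (x * giant) % p
    raise ValueError("no discrete log found")

# In Z_31^* (order 30, generator 3): recover alpha = 17.
assert bsgs(3, pow(3, 17, 31), 30, 31) == 17
```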
## Candidates of Discrete Logarithm Groups
We need groups of prime order, and we cannot use $\mathbb{Z}_p^\ast$ itself. We have two candidates.
- Use a subgroup of $\mathbb{Z}_p^\ast$ having prime order $q$ such that $q \mid (p-1)$ as in Diffie-Hellman.
- Elliptic curve group modulo $p$.
### Reduced Residue Class $\mathbb{Z}_p^\ast$
There are many specific algorithms for discrete logarithms on $\mathbb{Z}_p^\ast$.
- Index-calculus
- Elliptic-curve method
Compare this with symmetric ciphers such as AES, where doubling the key size squares the cost of a brute-force attack.
All sizes are in bits. Thus we need a very large prime, for example $p > 2^{2048}$, for security these days.
### Elliptic Curve Group over $\mathbb{Z}_p$
Currently, the best-known attacks are generic attacks, so we can use much smaller parameters than $\mathbb{Z}_p^\ast$. Often the groups have sizes about $2^{256}$, $2^{384}$, $2^{512}$.
[^1]: We didn't require $q$ to be prime!
[^2]: A Graduate Course in Applied Cryptography


The following notion of security is only for an eavesdropping adversary.
>
> **Experiment** $b$.
> 1. The challenger computes $(pk, sk) \la G()$ and sends $pk$ to the adversary.
> 2. The adversary chooses $m_0, m_1 \in \mc{M}$ of the same length, and sends them to the challenger.
> 3. The challenger computes $c \la E(pk, m_b)$ and sends $c$ to the adversary.
> 4. $\mc{A}$ outputs a bit $b' \in \braces{0, 1}$.
>
> Let $W_b$ be the event that $\mc{A}$ outputs $1$ in experiment $b$. The **advantage** of $\mc{A}$ with respect to $\mc{E}$ is defined as
>
> $$
> \Adv[SS]{\mc{A}, \mc{E}} = \abs{\Pr[W_0] - \Pr[W_1]}.
> $$
>
> $\mc{E}$ is **semantically secure** if $\rm{Adv}_{\rm{SS}}[\mc{A}, \mc{E}]$ is negligible for any efficient $\mc{A}$.
Note that $pk$ is sent to the adversary, and the adversary can encrypt any message! Thus, encryption must be randomized. Otherwise, the adversary can compute $E(pk, m_b)$ for each $b$ and compare with the $c$ given by the challenger.
### Semantic Security $\implies$ CPA
For symmetric ciphers, semantic security (one-time) did not guarantee CPA security (many-time). But in public key encryption, semantic security implies CPA security. This is because *the attacker can encrypt any message using the public key*.
First, we check the definition of CPA security for public key encryption. It is similar to that of symmetric ciphers, compare with [CPA Security for symmetric key encryption (Modern Cryptography)](./2023-09-19-symmetric-key-encryption.md#cpa-security).
> **Definition.** For a given public-key encryption scheme $\mc{E} = (G, E, D)$ defined over $(\mc{M}, \mc{C})$ and given an adversary $\mc{A}$, define experiments 0 and 1.
>
> **Experiment $b$.**
> 1. The challenger computes $(pk, sk) \la G()$ and sends $pk$ to the adversary.
> 2. The adversary submits a sequence of queries to the challenger:
> - The $i$-th query is a pair of messages $m_{i, 0}, m_{i, 1} \in \mc{M}$ of the same length.
> 3. The challenger computes $c_i = E(pk, m_{i, b})$ and sends $c_i$ to the adversary.
> 4. The adversary computes and outputs a bit $b' \in \braces{0, 1}$.
>
> Let $W_b$ be the event that $\mc{A}$ outputs $1$ in experiment $b$. Then the **CPA advantage with respect to $\mc{E}$** is defined as
>
> $$
> \Adv[CPA]{\mc{A}, \mc{E}} = \abs{\Pr[W_0] - \Pr[W_1]}.
> $$
>
> If the CPA advantage is negligible for all efficient adversaries $\mc{A}$, then $\mc{E}$ is **semantically secure against chosen plaintext attack**, or simply **CPA secure**.
We formally prove the following theorem.
> For any $q$-query CPA adversary $\mc{A}$, there exists an SS adversary $\mc{B}$ such that
>
> $$
> \rm{Adv}_{\rm{CPA}}[\mc{A}, \mc{E}] = q \cdot \rm{Adv}_{\rm{SS}}[\mc{B}, \mc{E}].
> $$
*Proof*. The proof uses a hybrid argument. For $j = 0, \dots, q$, the *hybrid game* $j$ is played between $\mc{A}$ and a challenger that responds to the $q$ queries as follows:
- On the $i$-th query $(m_{i,0}, m_{i, 1})$, respond with $c_i$ where
- $c_i \la E(pk, m_{i, 1})$ if $i \leq j$.
- $c_i \la E(pk, m_{i, 0})$ otherwise.
So, the challenger in hybrid game $j$ encrypts $m_{i, 1}$ in the first $j$ queries, and encrypts $m_{i, 0}$ for the rest of the queries. If we define $p_j$ to be the probability that $\mc{A}$ outputs $1$ in hybrid game $j$, we have
$$
\Adv[CPA]{\mc{A}, \mc{E}} = \abs{p_q - p_0}
$$
since hybrid $q$ is precisely experiment $1$, and hybrid $0$ is precisely experiment $0$. With $\mc{A}$, we define $\mc{B}$ as follows.
1. $\mc{B}$ randomly chooses $\omega \la \braces{1, \dots, q}$.
2. $\mc{B}$ obtains $pk$ from the challenger, and forwards it to $\mc{A}$.
3. For the $i$-th query $(m_{i, 0}, m_{i, 1})$ from $\mc{A}$, $\mc{B}$ responds as follows.
- If $i < \omega$, $c_i \la E(pk, m_{i, 1})$.
- If $i = \omega$, forward the query to the challenger and forward its response to $\mc{A}$.
- Otherwise, $c_i \la E(pk, m_{i, 0})$.
4. $\mc{B}$ outputs whatever $\mc{A}$ outputs.
Note that $\mc{B}$ can encrypt queries on its own, since the public key is given. Define $W_b$ as the event that $\mc{B}$ outputs $1$ in experiment $b$ in the semantic security game. For $j = 1, \dots, q$, we have that
$$
\Pr[W_0 \mid \omega = j] = p_{j - 1}, \quad \Pr[W_1 \mid \omega = j] = p_j.
$$
In experiment $0$ with $\omega = j$, $\mc{A}$ receives encryptions of $m_{i, 1}$ in the first $j - 1$ queries and receives encryptions of $m_{i, 0}$ for the rest of the queries, which is exactly hybrid game $j - 1$. The second equation follows similarly.
Then the SS advantage can be calculated as
$$
\begin{aligned}
\Adv[SS]{\mc{B}, \mc{E}} &= \abs{\Pr[W_0] - \Pr[W_1]} \\
&= \frac{1}{q} \abs{\sum_{j=1}^q \Pr[W_0 \mid \omega = j] - \sum_{j = 1}^q \Pr[W_1 \mid \omega = j]} \\
&= \frac{1}{q} \abs{\sum_{j=1}^q (p_{j-1} - p_j)} \\
&= \frac{1}{q} \Adv[CPA]{\mc{A}, \mc{E}}.
\end{aligned}
$$
## CCA Security for Public Key Encryption
We also define CCA security for public key encryption, which models a wide spectrum of real-world attacks. The definition is also very similar to that of symmetric ciphers, compare with [CCA security for symmetric ciphers (Modern Cryptography)](./2023-09-26-cca-security-authenticated-encryption.md#cca-security).
> **Definition.** Let $\mc{E} = (G, E, D)$ be a public-key encryption scheme over $(\mc{M}, \mc{C})$. Given an adversary $\mc{A}$, define experiments $0$ and $1$.
>
> **Experiment $b$.**
> 1. The challenger computes $(pk, sk) \la G()$ and sends $pk$ to the adversary.
> 2. $\mc{A}$ makes a series of queries to the challenger, each of which is one of the following two types.
> - *Encryption*: Send $(m_{i, 0}, m_{i, 1})$ and receive $c'_i \la E(pk, m_{i, b})$.
> - *Decryption*: Send $c_i$ and receive $m'_i \la D(sk, c_i)$.
> - Note that $\mc{A}$ is not allowed to make a decryption query for any $c_i'$.
> 3. $\mc{A}$ outputs a pair of messages $(m_0^\ast, m_1^\ast)$.
> 4. The challenger generates $c^\ast \la E(pk, m_b^\ast)$ and gives it to $\mc{A}$.
> 5. $\mc{A}$ is allowed to keep making queries, but not allowed to make a decryption query for $c^\ast$.
> 6. The adversary computes and outputs a bit $b' \in \left\lbrace 0, 1 \right\rbrace$.
>
> Let $W_b$ be the event that $\mc{A}$ outputs $1$ in experiment $b$. Then the **CCA advantage with respect to $\mc{E}$** is defined as
>
> $$
> \rm{Adv}_{\rm{CCA}}[\mc{A}, \mc{E}] = \left\lvert \Pr[W_0] - \Pr[W_1] \right\rvert.
> $$
>
> If the CCA advantage is negligible for all efficient adversaries $\mc{A}$, then $\mc{E}$ is **semantically secure against a chosen ciphertext attack**, or simply **CCA secure**.
Similarly, 1CCA security implies CCA security, as in the above theorem. So to show CCA security, it suffices to prove 1CCA security.
### Active Adversaries in Symmetric vs Public Key
In symmetric key encryption, we studied [authenticated encryption (AE)](./2023-09-26-cca-security-authenticated-encryption.md#authenticated-encryption-(ae)), which required the scheme to be CPA secure and provide ciphertext integrity. In symmetric key settings, AE implied CCA.
However, in public-key schemes, adversaries can always create new ciphertexts using the public key, which makes the original definition of ciphertext integrity unusable. Thus we directly require CCA security.
Symmetric key encryption is significantly faster than public key encryption, so we use public key encryption only to share a symmetric key.
Generate $(pk, sk)$ for the public key encryption, and generate a symmetric key $k$. For the message $m$, encrypt it as
$$
(c, c_S) \la \big( E(pk, k), E_S(k, m) \big)
$$
where $E_S$ is the symmetric encryption algorithm and $E$ is the public-key encryption algorithm. The receiver decrypts $c$ to recover $k$, which can be used for decrypting $c_S$. This is a form of **hybrid encryption**. We are *encapsulating* the key $k$ inside a ciphertext, so we call this a **key encapsulation mechanism** (KEM).
We can use public-key schemes for KEM, but there are dedicated constructions for KEM which are more efficient. The dedicated algorithms do key generation and encryption in one shot.
> **Definition.** A KEM $\mc{E}_\rm{KEM}$ consists of a triple of algorithms $(G, E_\rm{KEM}, D_\rm{KEM})$.
>
> - The key generation algorithm generates $(pk, sk) \la G()$.
> - The encapsulation algorithm generates $(k, c_\rm{KEM}) \la E_\rm{KEM}(pk)$.
> - The decapsulation algorithm generates $k \la D_\rm{KEM}(sk, c_\rm{KEM})$.
Note that $E_\rm{KEM}$ only takes the public key as a parameter. The correctness condition is that for any $(pk, sk) \la G()$ and any $(k, c_\rm{KEM}) \la E_\rm{KEM}(pk)$, we must have $k \la D_\rm{KEM}(sk, c_\rm{KEM})$.
Using the KEM, the symmetric key is automatically encapsulated during the encryption process.
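As a sketch, this interface can be instantiated with a Diffie-Hellman-style KEM (essentially hashed ElGamal used as a KEM); the tiny group and the SHA-256 key derivation are illustrative assumptions, not a secure parameter choice:

```python
import hashlib
import secrets

# Toy DH-based KEM in a small prime-order subgroup of Z_p^*.
p, q, g = 23, 11, 4   # g = 4 generates the order-11 subgroup of Z_23^*

def G():
    sk = secrets.randbelow(q - 1) + 1
    return pow(g, sk, p), sk                 # (pk, sk)

def E_KEM(pk):
    # Encapsulation: note it takes only the public key, and outputs (k, c_KEM).
    beta = secrets.randbelow(q - 1) + 1
    k = hashlib.sha256(pow(pk, beta, p).to_bytes(4, "big")).digest()
    return k, pow(g, beta, p)

def D_KEM(sk, c):
    # Decapsulation: both sides derive the same key H(g^{sk * beta}).
    return hashlib.sha256(pow(c, sk, p).to_bytes(4, "big")).digest()

pk, sk = G()
k, c = E_KEM(pk)
assert D_KEM(sk, c) == k   # correctness condition
```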
> **Definition.** A KEM scheme is secure if any efficient adversary cannot distinguish between $(c_\rm{KEM}, k_0)$ and $(c_\rm{KEM}, k_1)$, where $k_0$ is generated by $E(pk)$, and $k_1$ is chosen randomly from $\mc{K}$.
Read more about this in Exercise 11.9.[^1]
We introduce a public-key encryption scheme based on the hardness of discrete logarithms.
> **Definition.** Suppose we have two parties Alice and Bob. Let $G = \left\langle g \right\rangle$ be a cyclic group of prime order $q$, and let $\mc{E}_S = (E_S, D_S)$ be a symmetric cipher.
>
> 1. Alice chooses $sk = \alpha \la \Z_q$, computes $pk = h = g^\alpha$ and sends $pk$ to Bob.
> 2. Bob also chooses $\beta \la \Z_q$ and computes $k = h^\beta = g^{\alpha\beta}$.
> 3. Bob sends $\big( g^\beta, E_S(k, m) \big)$ to Alice.
> 4. Alice computes $k = g^{\alpha\beta} = (g^\beta)^\alpha$ using $\alpha$ and recovers $m$ by decrypting $E_S(k, m)$.
As a concrete example, set $E_S(k, m) = k \cdot m$ and $D_S(k, c) = k^{-1} \cdot c$. The correctness property automatically holds. Therefore,
- $G$ outputs $sk = \alpha \la \Z_q$, $pk = h = g^\alpha$.
- $E(pk, m) = (c_1, c_2) \la (g^\beta, h^\beta \cdot m)$ where $\beta \la \Z_q$.
- $D(sk, c) = c_2 \cdot (c_1)^{-\alpha} = m$.
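A toy instantiation of these three algorithms; the small subgroup of $\mathbb{Z}_{23}^\ast$ is an illustrative assumption, since real deployments use much larger groups:

```python
import secrets

# Toy ElGamal with E_S(k, m) = k * m, in the order-11 subgroup of Z_23^*.
p, q, g = 23, 11, 4           # g = 4 generates the subgroup of order 11

def gen():
    alpha = secrets.randbelow(q - 1) + 1
    return pow(g, alpha, p), alpha            # pk = h = g^alpha, sk = alpha

def encrypt(h, m):
    beta = secrets.randbelow(q - 1) + 1       # fresh randomness for every message
    return pow(g, beta, p), (pow(h, beta, p) * m) % p    # (c1, c2)

def decrypt(alpha, c1, c2):
    return (c2 * pow(c1, -alpha, p)) % p      # c2 * c1^{-alpha} = m

h, alpha = gen()
c1, c2 = encrypt(h, 9)        # message 9 lies in the subgroup
assert decrypt(alpha, c1, c2) == 9
```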
### Security of ElGamal Encryption
> **Theorem.** If the DDH assumption holds on $G$, and the symmetric cipher $\mc{E}_S = (E_S, D_S)$ is semantically secure, then the ElGamal encryption scheme $\mc{E}_\rm{EG}$ is semantically secure.
>
> For any SS adversary $\mc{A}$ of $\mc{E}_\rm{EG}$, there exist a DDH adversary $\mc{B}$ and an SS adversary $\mc{C}$ for $\mc{E}_S$ such that
>
> $$
> \Adv[SS]{\mc{A}, \mc{E}_\rm{EG}} \leq 2 \cdot \Adv[DDH]{\mc{B}, G} + \Adv[SS]{\mc{C}, \mc{E}_S}.
> $$
*Proof Idea*. For any $m_0, m_1 \in G$ and random $\gamma \la \Z_q$,
$$
E_S(g^{\alpha\beta}, m_0) \approx_c E_S(g^{\gamma}, m_0) \approx_c E_S(g^\gamma, m_1) \approx_c E_S(g^{\alpha\beta}, m_1).
$$
The first two and last two ciphertexts are computationally indistinguishable since the DDH problem is hard. The second and third ciphertexts are also indistinguishable since $\mc{E}_S$ is semantically secure.
*Proof*. Full proof in Theorem 11.5.[^1]
Note that $\beta \la \Z_q$ must be chosen freshly for each encrypted message. This is the randomness part of the encryption, since $pk = g^\alpha$ and $sk = \alpha$ are fixed.
### Hashed ElGamal Encryption
The **hashed ElGamal encryption** scheme is a variant of the original ElGamal scheme that uses a hash function $H : G \ra \mc{K}$, where $\mc{K}$ is the key space of $\mc{E}_S$.
The only difference is that we use $H(g^{\alpha\beta})$ as the key.[^2]
> 1. Alice chooses $sk = \alpha \la \Z_q$, computes $pk = g^\alpha$ and sends $pk$ to Bob.
> 2. Bob also chooses $\beta \la \Z_q$ and computes $h^\beta = g^{\alpha\beta}$**, and sets $k = H(g^{\alpha\beta})$.**
> 3. Bob sends $\big( g^\beta, E_S(k, m) \big)$ to Alice.
> 4. Alice computes $g^{\alpha\beta} = (g^\beta)^\alpha$ using $\alpha$, **computes $k = H(g^{\alpha\beta})$** and recovers $m$ by decrypting $E_S(k, m)$.
This is also semantically secure, under the random oracle model.
> **Theorem.** Let $H : G \ra \mc{K}$ be modeled as a random oracle. If the CDH assumption holds on $G$ and $\mc{E}_S$ is semantically secure, then the hashed ElGamal scheme $\mc{E}_\rm{HEG}$ is semantically secure.
*Proof Idea*. Given a ciphertext $\big( g^\beta, E_S(k, m) \big)$ with $k = H(g^{\alpha\beta})$, the adversary learns nothing about $k$ unless it queries $H$ at $g^{\alpha\beta}$, because $H$ is modeled as a random oracle. But an adversary that computes $g^{\alpha\beta}$ breaks the CDH assumption on $G$. Thus, under the CDH assumption, $k$ is effectively random to the adversary, so the hashed ElGamal scheme is secure by the semantic security of $\mc{E}_S$.
*Proof*. Refer to Theorem 11.4.[^1]
Since the hashed ElGamal scheme is semantically secure, it is automatically CPA secure.
> **Definition.** Let $G = \left\langle g \right\rangle$ be a cyclic group of prime order $q$. Let $\mc{A}$ be a given adversary.
>
> 1. The challenger chooses $\alpha, \beta \la \Z_q$ and sends $g^\alpha, g^\beta$ to the adversary.
> 2. The adversary makes a sequence of **DH-decision oracle queries** to the challenger.
> - Each query has the form $(v, w) \in G^2$; the challenger replies with $1$ if $v^\alpha = w$, and $0$ otherwise.
> 3. The adversary calculates and outputs some $w \in G$.
This is also known as **gap-CDH**. Intuitively, it says that even if we have a DH-decision oracle, the CDH problem remains hard.
### CCA Security of Hashed ElGamal
> **Theorem.** If the gap-CDH assumption holds on $G$ and $\mc{E}_S$ provides AE and $H : G \ra \mc{K}$ is a random oracle, then the hashed ElGamal scheme is CCA secure.
*Proof*. See Theorem 12.4.[^1] (very long)
But this scheme is not CPA secure, since it is deterministic. For instance, one can choose the two challenge messages to be $1$ and $2$; the ciphertexts are then easily distinguishable by re-encrypting.
Also, the ciphertext is malleable by the **homomorphic property**: if $c _ 1 = m _ 1^e \bmod N$ and $c _ 2 = m _ 2^e \bmod N$, then $c = c _ 1 c _ 2 = (m _ 1 m _ 2)^e \bmod N$ is an encryption of $m _ 1 m _ 2$.
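A quick numeric check of this homomorphic property, with toy parameters (the classic $p = 61$, $q = 53$ example; real moduli are thousands of bits):

```python
# Toy RSA: tiny primes for illustration only.
p, q = 61, 53
N = p * q                  # 3233
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent, gcd(e, phi) = 1
d = pow(e, -1, phi)        # private exponent d = e^{-1} mod phi(N)

m1, m2 = 42, 55
c1, c2 = pow(m1, e, N), pow(m2, e, N)
c = (c1 * c2) % N          # product of the two ciphertexts

# c decrypts to m1 * m2: the scheme is malleable.
assert pow(c, d, N) == (m1 * m2) % N
```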
#### Attack on KEM
Assume that textbook RSA is used as a KEM. Suppose that $k$ is $128$ bits, and the attacker sees $c = k^e \bmod N$. With high probability ($80\%$), $k = k _ 1 \cdot k _ 2$ for some $k _ 1, k _ 2 < 2^{64}$. Using the homomorphic property, $c = k _ 1^e k _ 2^e \bmod N$, so the following meet-in-the-middle attack is possible.
1. Build a table of $c \cdot k _ 2^{-e} \bmod N$ for $1 \leq k _ 2 < 2^{64}$.
2. For each $1 \leq k _ 1 < 2^{64}$, compute $k _ 1^e \bmod N$ and check if it is in the table.
3. On a match, output $(k _ 1, k _ 2)$, so that $k = k _ 1 k _ 2$.
The attack has complexity $\mc{O}(2^{n/2})$ where $n$ is the key length.
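The attack can be sketched with toy sizes — a 16-bit key that splits into two 8-bit factors, and an insecurely small modulus chosen only for illustration:

```python
# Meet-in-the-middle on textbook-RSA KEM (toy sizes: 16-bit key, 8-bit factors).
p, q = 1009, 1013
N, e = p * q, 5                # gcd(e, (p-1)(q-1)) = 1
k = 11 * 13                    # encapsulated key k = 143 = k1 * k2
c = pow(k, e, N)               # the attacker observes only c

B = 2 ** 8
# Step 1: table of c * k2^{-e} mod N for all candidate k2.
table = {(c * pow(pow(k2, e, N), -1, N)) % N: k2 for k2 in range(1, B)}

# Step 2: find k1 with k1^e mod N in the table.
recovered = None
for k1 in range(1, B):
    t = pow(k1, e, N)
    if t in table:
        recovered = (k1 * table[t]) % N   # k1 * k2 = k
        break
assert recovered == k
```

Both loops cost about $2^{n/2}$ modular exponentiations, matching the stated complexity.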
### The RSA Problem
> **Definition.** Let $\mc{T} _ \rm{RSA} = (G, F, I)$ be the RSA trapdoor function scheme. Given an adversary $\mc{A}$,
>
> 1. The challenger chooses $(pk, sk) \la G()$ and $x \la \Z _ N$.
> - $pk = (N, e)$, $sk = (N, d)$.
> 2. The challenger computes $y \la x^e \bmod N$ and sends $pk$ and $y$ to the adversary.
> 3. The adversary computes and outputs $x' \in \Z _ N$.
>
> The adversary wins if $x = x'$. The advantage is defined as
>
> $$
> \rm{Adv} _ {\rm{RSA}}[\mc{A}, \mc{T} _ \rm{RSA}] = \Pr[x = x'].
> $$
>
> We say that the **RSA assumption** holds if the advantage is negligible for any efficient $\mc{A}$.
## RSA Public Key Encryption (ISO Standard)
- Let $(E _ S, D _ S)$ be a symmetric encryption scheme over $(\mc{K}, \mc{M}, \mc{C})$ that provides AE.
- Let $H : \Z _ N^{\ast} \ra \mc{K}$ be a hash function.
The RSA public key encryption is done as follows.
- Key generation is the same.
- Encryption
    1. Choose random $x \la \Z _ N^{\ast}$ and let $y = x^e \bmod N$.
    2. Compute $c \la E _ S(H(x), m)$.
    3. Output $c' = (y, c)$.
- Decryption
    - Output $D _ S(H(y^d), c)$.
This works because $x = y^d \bmod N$, so $H(y^d) = H(x)$. In short, this uses the RSA trapdoor function as a **key encapsulation mechanism**, and the actual encryption is done by symmetric encryption.
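A minimal sketch of this construction. SHA-256 stands in for $H$, and a simple XOR stream stands in for the AE scheme $(E _ S, D _ S)$ — both stand-ins and the tiny modulus are assumptions for illustration, not the standard's actual choices:

```python
import hashlib
import secrets

p, q = 1009, 1013                  # toy primes; real moduli are 2048+ bits
N, e = p * q, 5
d = pow(e, -1, (p - 1) * (q - 1))

def H(x: int) -> bytes:            # stand-in hash H : Z_N^* -> K
    return hashlib.sha256(str(x).encode()).digest()

def E_S(k: bytes, m: bytes) -> bytes:
    # Stand-in symmetric cipher: XOR pad, messages up to 32 bytes.
    # (NOT authenticated encryption; D_S is the same operation.)
    return bytes(a ^ b for a, b in zip(m, k))

def encrypt(m: bytes):
    x = secrets.randbelow(N - 2) + 2    # random x in Z_N
    y = pow(x, e, N)                    # y = x^e mod N
    return y, E_S(H(x), m)              # c' = (y, c)

def decrypt(y: int, c: bytes) -> bytes:
    x = pow(y, d, N)                    # recover x = y^d mod N
    return E_S(H(x), c)                 # same key H(x); XOR is its own inverse

y, c = encrypt(b"hello")
assert decrypt(y, c) == b"hello"
```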
## Digital Signatures
> **Definition.** A **signature scheme** $\mc{S} = (G, S, V)$ is a triple of efficient algorithms, where $G$ is a **key generation** algorithm, $S$ is a **signing** algorithm, and $V$ is a **verification** algorithm.
>
> - A probabilistic algorithm $G$ outputs a pair $(pk, sk)$, where $sk$ is called a secret **signing key**, and $pk$ is a public **verification key**.
> - Given $sk$ and a message $m$, a probabilistic algorithm $S$ outputs a **signature** $\sigma \la S(sk, m)$.
> - $V$ is a deterministic algorithm that outputs either $\texttt{accept}$ or $\texttt{reject}$ for $V(pk, m, \sigma)$.
The correctness property requires that all signatures generated by $S$ are accepted by $V$. For all $(pk, sk) \la G$ and $m \in \mc{M}$,
![mc-10-dsig-security.png](../../../assets/img/posts/lecture-notes/modern-cryptography/mc-10-dsig-security.png)
> **Definition.** Let $\mc{S} = (G, S, V)$ be a signature scheme defined over $(\mc{M}, \Sigma)$. Given an adversary $\mc{A}$, the game goes as follows.
>
> 1. The challenger generates $(pk, sk) \la G()$ and sends $pk$ to $\mc{A}$.
> 2. $\mc{A}$ makes a series of *signing queries* to the challenger.
> - Each query is a message $m _ i \in \mc{M}$, and the challenger responds with $\sigma _ i \la S(sk, m _ i)$.
> 3. $\mc{A}$ computes and outputs a candidate forgery pair $(m, \sigma) \in \mc{M} \times \Sigma$.
> - $m \notin \left\lbrace m _ 1, \dots, m _ q \right\rbrace$.
> - $(m, \sigma) \notin \left\lbrace (m _ 1, \sigma _ 1), \dots, (m _ q, \sigma _ q) \right\rbrace$. (strong)
>
> $\mc{A}$ wins if $V(pk, m, \sigma) = \texttt{accept}$; let this event be $W$. The advantage of $\mc{A}$ with respect to $\mc{S}$ is defined as
>
> $$
> \rm{Adv} _ {\rm{SIG}}[\mc{A}, \mc{S}] = \Pr[W].
> $$
>
> If the advantage is negligible for all efficient adversaries $\mc{A}$, the signature scheme $\mc{S}$ is (strongly) **secure**, i.e., $\mc{S}$ is (strongly) **existentially unforgeable under a chosen message attack**.
- The adversary does not make verification queries, since it can always check any signature itself with the public key.
This is often called the **hash-and-sign paradigm**, and the new signature scheme is also secure.
> **Theorem.** Suppose that $\mc{S}$ is a secure signature scheme and $H$ is a collision resistant hash function. Then $\mc{S}'$ is a secure signature scheme.
>
> If $\mc{A}$ is an adversary attacking $\mc{S}'$, then there exist an adversary $\mc{B} _ \mc{S}$ attacking $\mc{S}$ and an adversary $\mc{B} _ H$ attacking $H$ such that
>
> $$
> \rm{Adv} _ {\rm{SIG}}[\mc{A}, \mc{S}'] \leq \rm{Adv} _ {\rm{SIG}}[\mc{B} _ \mc{S}, \mc{S}] + \rm{Adv} _ {\rm{CR}}[\mc{B} _ H, H].
> $$
*Proof*. The proof is identical to the theorem for MACs.
- Just return $(\sigma^e, \sigma)$ for some $\sigma$. Then it passes verification.
- Attack using the homomorphic property.
    - Suppose we want to forge a signature on a message $m$.
    - Pick $m _ 1 \in \Z _ N^{\ast}$ and set $m _ 2 = m \cdot m _ 1^{-1} \bmod N$.
    - Query signatures for both messages and multiply the responses.
    - $\sigma = \sigma _ 1 \cdot \sigma _ 2 = m _ 1^d \cdot m^d \cdot m _ 1^{-d} = m^d \bmod N$.
    - Then $(m, \sigma)$ is a valid pair.
Because of the second attack, the textbook RSA signature is **universally forgeable**. This property is used to create **blind signatures**, where the signer creates a signature without any knowledge of the message. See Exercise 13.15.[^1]
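The homomorphic forgery can be walked through numerically. Toy parameters; the `signing_oracle` function models the challenger answering signing queries:

```python
# Forging a textbook-RSA signature on m via two signing queries (toy numbers).
p, q = 61, 53
N = p * q
phi = (p - 1) * (q - 1)
e = 17
d = pow(e, -1, phi)

def signing_oracle(msg: int) -> int:    # the challenger signs: sigma = msg^d mod N
    return pow(msg, d, N)

m = 1234                                # target message, never queried directly
m1 = 777                                # any invertible element of Z_N
m2 = (m * pow(m1, -1, N)) % N           # m2 = m * m1^{-1} mod N

sigma = (signing_oracle(m1) * signing_oracle(m2)) % N
assert pow(sigma, e, N) == m            # (m, sigma) passes verification
```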
### RSA Full Domain Hash Signature Scheme
Given a hash function $H : \mc{M} \ra \mc{Y}$, the **RSA full domain hash** signature scheme $\mc{S} _ \rm{RSA-FDH}$ is defined as follows.
- Key generation: $pk = (N, e)$ and $sk = (N, d)$ are chosen to satisfy $d = e^{-1} \bmod \phi(N)$ for $N = pq$.
- Sign: $S(sk, m) = H(m)^d \bmod N$.
- Verify: $V(pk, m, \sigma)$ outputs $\texttt{accept}$ if and only if $\sigma^e = H(m) \bmod N$.
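A sketch of RSA-FDH, with SHA-256 reduced mod $N$ standing in for the full domain hash (a real FDH must hash onto essentially all of $\Z _ N$, and the modulus here is a toy):

```python
import hashlib

# Toy RSA-FDH; SHA-256 reduced mod N stands in for the full-domain hash H.
p, q = 61, 53
N = p * q
e = 17
d = pow(e, -1, (p - 1) * (q - 1))

def H(m: bytes) -> int:
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % N

def sign(m: bytes) -> int:
    return pow(H(m), d, N)              # S(sk, m) = H(m)^d mod N

def verify(m: bytes, sigma: int) -> bool:
    return pow(sigma, e, N) == H(m)     # accept iff sigma^e = H(m) mod N

sigma = sign(b"hello")
assert verify(b"hello", sigma)
```

Hashing first defeats the homomorphic attack above: the adversary would have to find messages whose *hashes* multiply correctly.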
This scheme is now secure.
> **Theorem.** If the hash function $H$ is modeled as a random oracle, and the RSA assumption holds, then $\mc{S} _ \rm{RSA-FDH}$ is a secure signature scheme.
>
> For any $q$-query adversary $\mc{A}$ against hashed RSA, there exists an adversary $\mc{B}$ solving the RSA problem such that
>
> $$
> \rm{Adv} _ {\rm{SIG}}[\mc{A}, \mc{S} _ \rm{RSA-FDH}] \leq q \cdot \rm{Adv} _ {\rm{RSA}}[\mc{B}].
> $$
### Full Domain Hash Signature Scheme
The following is a description of a **full domain hash** scheme $\mc{S} _ \rm{FDH}$, constructed from a trapdoor permutation scheme $\mc{T} = (G, F, I)$.
- Key generation: $(pk, sk) \la G()$.
- Sign: $S(sk, m)$ returns $\sigma \la I(sk, H(m))$.
- Verify: $V(pk, m, \sigma)$ returns $\texttt{accept}$ if and only if $F(pk, \sigma) = H(m)$.
This scheme $\mc{S} _ \rm{FDH} = (G, S, V)$ is secure if $\mc{T}$ is a **one-way trapdoor permutation** and $H$ is a random oracle.
> **Theorem.** Let $\mc{T} = (G, F, I)$ be a one-way trapdoor permutation defined over $\mc{X}$. Let $H : \mc{M} \ra \mc{X}$ be a hash function, modeled as a random oracle. Then the derived FDH signature scheme $\mc{S} _ \rm{FDH}$ is a secure signature scheme.
*Proof*. See Theorem 13.3.[^1]
This scheme originally comes from the **Schnorr identification protocol**.
Let $G = \left\langle g \right\rangle$ be a cyclic group of prime order $q$. We consider an interaction between two parties, a prover $P$ and a verifier $V$. The prover has a secret $\alpha \in \Z _ q$ and the verification key is $u = g^\alpha$. **$P$ wants to convince $V$ that he knows $\alpha$, but does not want to reveal $\alpha$**.
![mc-10-schnorr-identification.png](../../../assets/img/posts/lecture-notes/modern-cryptography/mc-10-schnorr-identification.png)
The protocol $\mc{I} _ \rm{sch} = (G, P, V)$ works as follows.
> 1. A **secret key** $\alpha \la \Z _ q$ and a **verification key** $u \la g^\alpha$ are generated. The prover $P$ has $\alpha$ and the verifier $V$ has $u$.
> 2. $P$ chooses a random $\alpha _ t \la \Z _ q$, and sends $u _ t \la g^{\alpha _ t}$ to $V$.
> 3. $V$ chooses a random $c \la \Z _ q$ and sends it to $P$.
> 4. $P$ computes $\alpha _ z \la \alpha _ t + \alpha c \in \Z _ q$ and sends it to $V$.
> 5. $V$ checks if $g^{\alpha _ z} = u _ t \cdot u^c$, and accepts if and only if it holds.
- $u _ t$ is the **commitment** sent to the verifier.
- $c$ is the **challenge** sent to the prover.
    - If $P$ can predict the challenge, $P$ can choose $\alpha _ t$ and $\alpha _ z$ so that the verifier accepts.
- $\alpha _ z$ is the **response** sent to the verifier.
We must check a few things.
- **Correctness**: If $P$ has the correct $\alpha$, then $g^{\alpha _ z} = g^{\alpha _ t} \cdot (g^\alpha)^c = u _ t \cdot u^c$.
- **Soundness**: If $P$ does not have the correct $\alpha$, the proof is rejected with probability $1 - \frac{1}{q}$.
    - If we repeat the protocol $n$ times, the probability of rejection becomes $1 - \frac{1}{q^n} \ra 1$.
    - Thus $q$ (the size of the challenge space) must be large.
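One round of the protocol can be simulated directly. The group parameters below ($p = 607$, $q = 101$, $g = 122$) are a toy prime-order subgroup of $\Z _ p^{\ast}$, far too small for security:

```python
import secrets

# Toy prime-order subgroup: q = 101 divides p - 1 = 606, and g = 122 has order q.
p, q, g = 607, 101, 122

alpha = secrets.randbelow(q)            # prover's secret key
u = pow(g, alpha, p)                    # public verification key

alpha_t = secrets.randbelow(q)          # prover's random exponent
u_t = pow(g, alpha_t, p)                # commitment, sent to the verifier
c = secrets.randbelow(q)                # challenge, chosen by the verifier
alpha_z = (alpha_t + alpha * c) % q     # response, sent back to the verifier

# Verifier's check: g^{alpha_z} == u_t * u^c
assert pow(g, alpha_z, p) == (u_t * pow(u, c, p)) % p
```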
We *transform* the above protocol to a signature scheme.[^2] We need a hash function $H : \mc{M} \times G \ra \mc{C}$, modeled as a random oracle. The protocol originally involves interaction between two parties, but a signature is computed by a single party. Intuitively, $H$ will play the role of the verifier.
The **Schnorr signature scheme** $\mc{S} _ \rm{sch} = (G, S, V)$ is defined as follows.
- Key generation: a **secret key** $sk = \alpha \la \Z _ q$ and a **public key** $pk = u \la g^\alpha$ are generated.
- Sign: $S(sk, m)$ outputs $\sigma = (u _ t, \alpha _ z)$ where
    - Choose random $\alpha _ t \la \Z _ q$ and set $u _ t \la g^{\alpha _ t}$.
    - **Compute $c \la H(m, u _ t)$** and set $\alpha _ z \la \alpha _ t + \alpha c$.
- Verify: $V(pk, m, \sigma)$ outputs $\texttt{accept}$ if and only if $g^{\alpha _ z} = u _ t \cdot u^c$.
    - Here $c \la H(m, u _ t)$ can be computed by the verifier, and $u$ is known.
Since $H$ is modeled as a random oracle, the signer cannot predict the value of the challenge $c$. Also, $H$ must take both $m$ and $u _ t$ as input: without $m$, the signature would not be related to $m$ (the signature has no $m$ term inside it); without $u _ t$, the scheme is insecure since the Schnorr identification protocol is HVZK. See Exercise 19.12.[^1]
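A sketch in the same toy group as before, with SHA-256 reduced mod $q$ standing in for the random oracle $H$:

```python
import hashlib
import secrets

p, q, g = 607, 101, 122                 # toy subgroup; illustration only

def H(m: bytes, u_t: int) -> int:       # stand-in for the random oracle H(m, u_t)
    data = m + u_t.to_bytes(2, "big")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

alpha = secrets.randbelow(q)            # sk
u = pow(g, alpha, p)                    # pk

def sign(m: bytes):
    alpha_t = secrets.randbelow(q)      # fresh randomness for every signature
    u_t = pow(g, alpha_t, p)
    c = H(m, u_t)                       # the hash replaces the verifier's challenge
    return u_t, (alpha_t + alpha * c) % q

def verify(m: bytes, sig) -> bool:
    u_t, alpha_z = sig
    c = H(m, u_t)
    return pow(g, alpha_z, p) == (u_t * pow(u, c, p)) % p

assert verify(b"message", sign(b"message"))
```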
> **Theorem.** If $H$ is modeled as a random oracle and Schnorr's identification protocol is secure, then Schnorr's signature scheme is also secure.
*Proof*. See Theorem 19.7.[^1]
Note that $\alpha _ t \la \Z _ q$ must be chosen freshly at random for every signature.
## Digital Signature Algorithm
Schnorr's scheme was protected by a patent, so NIST opted for an ad hoc signature scheme based on a prime order subgroup of $\Z _ p^{\ast}$. This algorithm eventually became the **Digital Signature Algorithm** (DSA). The standard was later updated to support elliptic curve groups over a finite field, resulting in **ECDSA**.
## Public Key Infrastructure
How would you trust public keys? We introduce **digital certificates** for this.
Read more in [public key infrastructure (Internet Security)](../../internet-security/2023-10-16-pki/).
[^1]: A Graduate Course in Applied Cryptography
[^2]: By using the [Fiat-Shamir transform](../2023-11-07-sigma-protocols/#the-fiat-shamir-transform).
The scheme must satisfy the following properties. First, the commitment must open to a single message. This is called the **binding** property. Next, the commitment must not reveal any information about the message. This is called the **hiding** property.
> **Definition.** A commitment scheme $\mc{C} = (C, V)$ is **binding** if for every efficient adversary $\mc{A}$ that outputs a $5$-tuple $(c, m _ 1, o _ 1, m _ 2, o _ 2)$, the probability
>
> $$
> \Pr[m _ 1 \neq m _ 2 \land V(m _ 1, c, o _ 1) = V(m _ 2, c, o _ 2) = \texttt{accept}]
> $$
>
> is negligible.
> **Definition.** Let $\mc{C} = (C, V)$ be a commitment scheme. Given an adversary $\mc{A}$, define two experiments.
>
> **Experiment $b$**.
> 1. $\mc{A}$ sends $m _ 0, m _ 1 \in \mc{M}$ to the challenger.
> 2. The challenger computes $(c, o) \la C(m _ b)$ and sends $c$ to $\mc{A}$.
> 3. $\mc{A}$ computes and outputs $b' \in \braces{0, 1}$.
>
> Let $W _ b$ be the event that $\mc{A}$ outputs $1$ in experiment $b$. The **advantage** of $\mc{A}$ with respect to $\mc{C}$ is defined as
>
> $$
> \Adv{\mc{A}, \mc{C}} = \abs{\Pr[W _ 0] - \Pr[W _ 1]}.
> $$
>
> If the advantage is negligible for all efficient adversaries $\mc{A}$, then the commitment scheme $\mc{C}$ has the **hiding** property.
To commit a bit, we can use a secure PRG. The following is due to Naor.
> Let $G : \mc{S} \ra \mc{R}$ be a secure PRG where $\left\lvert \mc{R} \right\lvert \geq \left\lvert \mc{S} \right\lvert^3$ and $\mc{R} = \braces{0, 1}^n$. Suppose that Bob wants to commit a bit $b _ 0 \in \braces{0, 1}$.
>
> 1. Alice chooses a random $r \in \mc{R}$ and sends it to Bob.
> 2. Bob chooses a random $s \in \mc{S}$ and computes $c \la C(s, r, b _ 0)$, where
>
> $$
> C(s, r, b _ 0) = \begin{cases} G(s) & (b _ 0 = 0) \\ G(s) \oplus r & (b _ 0 = 1). \end{cases}
> $$
>
> Then Bob outputs $(c, s)$ as the commitment and the opening string.
> 3. During opening, Bob sends $(b _ 0, s)$ to Alice.
> 4. Alice accepts if and only if $C(s, r, b _ 0) = c$.
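A toy instantiation of Naor's scheme. Truncated SHA-256 stands in for the secure PRG, with $\lvert\mc{S}\rvert = 2^8$ and $\lvert\mc{R}\rvert = 2^{24}$ so that $\lvert\mc{R}\rvert = \lvert\mc{S}\rvert^3$:

```python
import hashlib
import secrets

# Toy PRG G : {0,1}^8 -> {0,1}^24 (SHA-256 truncated; a stand-in for a secure PRG).
def G(s: int) -> int:
    return int.from_bytes(hashlib.sha256(bytes([s])).digest()[:3], "big")

def C(s: int, r: int, b0: int) -> int:  # commitment function from the protocol
    return G(s) if b0 == 0 else G(s) ^ r

r = secrets.randbits(24) or 1           # Alice's random r (made nonzero for the demo)
b0 = 1                                  # the bit Bob commits to
s = secrets.randbits(8)                 # Bob's secret seed
c = C(s, r, b0)                         # commitment sent to Alice

assert C(s, r, b0) == c                 # honest opening is accepted
assert C(s, r, 1 - b0) != c             # same seed cannot open to the other bit
```

Cheating would require a second seed, which is unlikely to exist for Alice's choice of $r$, as argued below.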
Correctness is obvious, since Alice recomputes $C(s, r, b _ 0)$.
The hiding property follows since $G(s)$ and $G(s) \oplus r$ are indistinguishable if $G$ is a secure PRG.
The binding property follows if $1 / \left\lvert \mc{S} \right\lvert$ is negligible. For Bob to open $c$ as both $0$ and $1$, he must find two seeds $s _ 0, s _ 1 \in \mc{S}$ such that $c = G(s _ 0) = G(s _ 1) \oplus r$. Then $r = G(s _ 0) \oplus G(s _ 1)$, and there are at most $\left\lvert \mc{S} \right\lvert^2$ values of $r \in \mc{R}$ for which this can happen. The probability that Alice chooses such an $r$ is
$$
\left\lvert \mc{S} \right\lvert^2 / \left\lvert \mc{R} \right\lvert \leq \left\lvert \mc{S} \right\lvert^2 / \left\lvert \mc{S} \right\lvert^3 = 1 / \left\lvert \mc{S} \right\lvert,
$$
which is negligible.
A bit commitment scheme can be used for a **coin flipping protocol**. Suppose that Alice and Bob are flipping coins while they are physically distant from each other.
> 1. Bob chooses a random bit $b _ 0 \la \braces{0, 1}$.
> 2. Execute the commitment protocol.
> - Alice obtains a commitment string $c$ of $b _ 0$.
> - Bob keeps an opening string $o$.
> 3. Alice chooses a random bit $b _ 1 \la \braces{0, 1}$, and sends it to Bob.
> 4. Bob reveals $b _ 0$ and the opening string $o$ to Alice, and she verifies that $c$ is valid.
> 5. The final outcome is $b = b _ 0 \oplus b _ 1$.
After step $2$, Alice has no information about $b _ 0$ because of the hiding property. Her choice of $b _ 1$ is unbiased, and cannot affect the final outcome. Next, in step $4$, $b _ 0$ cannot be manipulated because of the binding property.
Thus, $b _ 0$ and $b _ 1$ are both random, so $b$ is either $0$ or $1$, each with probability $1/2$.[^1]
### Commitment Scheme from Hashing
> Let $H : \mc{X} \ra \mc{Y}$ be a collision resistant hash function, where $\mc{X} = \mc{M} \times \mc{R}$. $\mc{M}$ is the message space, and $\mc{R}$ is a finite nonce space. For $m \in \mc{M}$, the derived commitment scheme $\mc{C} _ H = (C, V)$ is defined as follows.
>
> - $C(m)$: choose random $o \la \mc{R}$, set $c = H(m, o)$ and output $(c, o)$.
> - $V(m, c, o)$: output $\texttt{accept}$ if and only if $c = H(m, o)$.
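A minimal sketch with SHA-256 as $H$ and a 16-byte nonce space; the fixed-length nonce keeps the encoding of $(m, o)$ unambiguous (real schemes must pin this down too):

```python
import hashlib
import secrets

# Hash-based commitment: H = SHA-256, R = 16-byte nonces.
def C(m: bytes):
    o = secrets.token_bytes(16)             # random opening string o from R
    return hashlib.sha256(m + o).digest(), o

def V(m: bytes, c: bytes, o: bytes) -> bool:
    return hashlib.sha256(m + o).digest() == c

c, o = C(b"my sealed bid: 100")
assert V(b"my sealed bid: 100", c, o)       # correct opening is accepted
assert not V(b"my sealed bid: 999", c, o)   # another message needs a collision
```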
Correctness is obvious. Correctness is obvious.
The binding property follows since $H$ is collision resistant. If it is easy to find a $5$-tuple $(c, m_1, o_1, m_2, o_2)$ such that $c = H(m_1, o_1) = H(m_2, o_2)$, $H$ is not collision resistant. The binding property follows since $H$ is collision resistant. If it is easy to find a $5$-tuple $(c, m _ 1, o _ 1, m _ 2, o _ 2)$ such that $c = H(m _ 1, o _ 1) = H(m _ 2, o _ 2)$, $H$ is not collision resistant.
The hiding property follows if $H$ is modeled as a random oracle, or has a property called **input hiding**. For adversarially chosen $m_1, m_2 \in \mc{M}$ and random $o \la \mc{R}$, the distributions of $H(m_1, o)$ and $H(m_2, o)$ are computationally indistinguishable. The hiding property follows if $H$ is modeled as a random oracle, or has a property called **input hiding**. For adversarially chosen $m _ 1, m _ 2 \in \mc{M}$ and random $o \la \mc{R}$, the distributions of $H(m _ 1, o)$ and $H(m _ 2, o)$ are computationally indistinguishable.
Additionally, this scheme is **non-malleable** if $H$ is modeled as a random oracle and $\mc{Y}$ is sufficiently large.[^2] Additionally, this scheme is **non-malleable** if $H$ is modeled as a random oracle and $\mc{Y}$ is sufficiently large.[^2]
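As a concrete sketch, the scheme can be instantiated with SHA-256 standing in for $H$; the byte-concatenation encoding of the pair $(m, o)$ and the 32-byte nonce length are illustrative assumptions (concatenation is unambiguous here only because the nonce has fixed length).

```python
import hashlib
import secrets

def commit(m: bytes) -> tuple[bytes, bytes]:
    """C(m): choose a random nonce o and output c = H(m, o)."""
    o = secrets.token_bytes(32)           # o <- R, a 256-bit nonce
    c = hashlib.sha256(m + o).digest()    # c = H(m, o)
    return c, o

def verify(m: bytes, c: bytes, o: bytes) -> bool:
    """V(m, c, o): accept iff c = H(m, o)."""
    return hashlib.sha256(m + o).digest() == c

c, o = commit(b"heads")
assert verify(b"heads", c, o)         # correctness
assert not verify(b"tails", c, o)     # a different message does not open c
```

Breaking binding here amounts to finding two openings of the same digest, i.e. a SHA-256 collision.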
### Pedersen Commitment
> Let $G = \left\langle g \right\rangle$ be a cyclic group of prime order $q$. Let $h$ be chosen randomly from $G$.
>
> - $C(m)$: choose random $o \la \mathbb{Z}_q$, set $c \la g^m h^o$ and return $(c, o)$.
> - $V(m, c, o)$: output $\texttt{accept}$ if and only if $c = g^m h^o$.

Correctness is obvious.

The binding property follows from the DL assumption. If an adversary finds $m_1, m_2, o_1, o_2$ such that $c = g^{m_1} h^{o_1} = g^{m_2} h^{o_2}$, then $h = g^{(m_2 - m_1)/(o_1 - o_2)}$, solving the discrete logarithm problem for $h$.

The hiding property follows since $h$ is uniform in $G$ and $o$ is also uniform in $\mathbb{Z}_q$. Then $g^m h^o$ is uniform in $G$, revealing no information about $m$.
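A toy sketch of the scheme follows. The parameters $p = 23$, $q = 11$, $g = 2$ and the exponent used to derive $h$ are illustrative assumptions; in a real deployment nobody may know $\log_g h$, and the group must be cryptographically large.

```python
import secrets

# Toy subgroup (illustrative): g = 2 generates an order-11 subgroup of Z_23^*.
p, q, g = 23, 11, 2
h = pow(g, 7, p)   # stand-in for a random h; log_g(h) must be unknown in practice

def commit(m: int) -> tuple[int, int]:
    """C(m): choose random o <- Z_q and return c = g^m h^o with opening o."""
    o = secrets.randbelow(q)
    c = pow(g, m, p) * pow(h, o, p) % p
    return c, o

def verify(m: int, c: int, o: int) -> bool:
    """V(m, c, o): accept iff c = g^m h^o."""
    return c == pow(g, m, p) * pow(h, o, p) % p

c, o = commit(5)
assert verify(5, c, o)
```

Note that hiding is information-theoretic here: for every candidate message $m'$ there is some opening $o'$ with $c = g^{m'} h^{o'}$, so $c$ alone determines nothing about $m$.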
## Post Quantum Cryptography
### Quantum Factorization

Let $n \in \mathbb{Z}$ and $0 \neq g \in \mathbb{Z}_n$. Let $\gamma_g : \mathbb{Z} \ra \mathbb{Z}_n$ be defined as $\gamma_g(\alpha) = g^\alpha$. This function is periodic, since $g^{\phi(n)} = 1$ by Euler's generalization. Also, the order of $g$ will certainly divide the period.

Thus, find a period $p$, and let $t$ be the smallest positive integer such that $g^{p/2^t} \neq 1$. Then $\gcd(n, g^{p/2^t} - 1)$ is a non-trivial factor of $n$ with probability about $1/2$ over the choice of $g$. See Exercise 16.10.[^3]
(Detailed explanation to be added...)

[^1]: There is one caveat. Bob gets to know the final result before Alice. If the outcome is not what he desired, he could abort the protocol in some way, like sending an invalid $c$, and go over the whole process again.

[^2]: A commitment scheme is **malleable** if a commitment $c = (c_1, c_2)$ of a message $m$ can be transformed into a commitment $c' = (c_1, c_2 + \delta)$ of a message $m + \delta$.

[^3]: A Graduate Course in Applied Cryptography.


We define these formally.
> **Definition.** Let $\mc{R} \subset \mc{X} \times \mc{Y}$ be a relation. A statement $y \in \mc{Y}$ is **true** if $(x, y) \in \mc{R}$ for some $x \in \mc{X}$. The set of true statements
>
> $$
> L_\mc{R} = \braces{y \in \mc{Y} : \exists x \in \mc{X},\; (x, y) \in \mc{R}}
> $$
>
> is called the **language** defined by $\mc{R}$.
But how do we define *zero knowledge*? What is *knowledge*?
> **Definition.** We say that a protocol is **honest verifier zero knowledge** (HVZK) if there exists an efficient algorithm $\rm{Sim}$ (simulator) on input $x$ such that the output distribution of $\rm{Sim}(x)$ is indistinguishable from the distribution of the verifier's view.
>
> $$
> \rm{Sim}(x) \approx \rm{View}_V[P(x, y) \lra V(x)]
> $$

For full *zero knowledge*, we require more: for every verifier $V^{\ast}$, possibly dishonest, there exists a simulator $\rm{Sim}$ such that $\rm{Sim}(x)$ is indistinguishable from the verifier's view $\rm{View}_{V^{\ast}}[P(x, y) \leftrightarrow V^{\ast}(x)]$.

If the proof is *zero knowledge*, the adversary can simulate conversations on his own without knowing the secret, meaning that the adversary learns nothing from the conversation.


The **soundness** property says that it is infeasible for any prover to make the verifier accept a false statement.
> 1. The adversary chooses a statement $y^{\ast} \in \mc{Y}$ and gives it to the challenger.
> 2. The adversary interacts with the verifier $V(y^{\ast})$, where the challenger plays the role of verifier, and the adversary is a possibly *cheating* prover.
>
> The adversary wins if $V(y^{\ast})$ outputs $\texttt{accept}$ but $y^{\ast} \notin L_\mc{R}$. The advantage of $\mc{A}$ with respect to $\Pi$ is denoted $\rm{Adv}_{\rm{Snd}}[\mc{A}, \Pi]$ and defined as the probability that $\mc{A}$ wins the game.
>
> If the advantage is negligible for all efficient adversaries $\mc{A}$, then $\Pi$ is **sound**.
> For every efficient adversary $\mc{A}$,
>
> $$
> \rm{Adv}_{\rm{Snd}}[\mc{A}, \Pi] \leq \frac{1}{N}
> $$
>
> where $N$ is the size of the challenge space.
The Schnorr identification protocol is actually a sigma protocol.
> The pair $(P, V)$ is a sigma protocol for the relation $\mc{R} \subset \mc{X} \times \mc{Y}$ where
>
> $$
> \mc{X} = \bb{Z}_q, \quad \mc{Y} = G, \quad \mc{R} = \left\lbrace (\alpha, u) \in \bb{Z}_q \times G : g^\alpha = u \right\rbrace.
> $$
>
> The challenge space $\mc{C}$ is a subset of $\bb{Z}_q$.

The protocol provides **special soundness**. If $(u_t, c, \alpha_z)$ and $(u_t, c', \alpha_z')$ are two accepting conversations with $c \neq c'$, then we have

$$
g^{\alpha_z} = u_t \cdot u^c, \quad g^{\alpha_z'} = u_t \cdot u^{c'},
$$

so we have $g^{\alpha_z - \alpha_z'} = u^{c - c'}$. Setting $\alpha^{\ast} = (\alpha_z - \alpha_z') / (c - c')$ satisfies $g^{\alpha^{\ast}} = u$, solving the discrete logarithm, and $\alpha^{\ast}$ is a proof.

As for HVZK, the simulator chooses $\alpha_z \la \bb{Z}_q$ and $c \la \mc{C}$ randomly and sets $u_t = g^{\alpha_z} \cdot u^{-c}$. Then $(u_t, c, \alpha_z)$ will be accepted. *Note that the order doesn't matter.* Also, the distribution is the same: $c$ and $\alpha_z$ are uniform over $\mc{C}$ and $\bb{Z}_q$, and the choice of $c$ and $\alpha_z$ determines $u_t$ uniquely. This is identical to the distribution in the actual protocol.
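Both arguments can be sketched in code. The toy group $p = 23$, $q = 11$, $g = 2$ and the helper names are illustrative assumptions; note how the simulator never touches $\alpha$, while the extractor recovers it from two accepting conversations sharing a commitment.

```python
import secrets

p, q, g = 23, 11, 2       # toy group: g generates an order-q subgroup of Z_23^*
alpha = 6                 # prover's secret
u = pow(g, alpha, p)      # public statement u = g^alpha

def accepts(u_t: int, c: int, a_z: int) -> bool:
    # Verifier's check: g^{alpha_z} = u_t * u^c
    return pow(g, a_z, p) == u_t * pow(u, c, p) % p

def simulate() -> tuple[int, int, int]:
    # HVZK simulator: sample the response and challenge first,
    # then solve for the commitment u_t = g^{alpha_z} * u^{-c}.
    a_z, c = secrets.randbelow(q), secrets.randbelow(q)
    u_t = pow(g, a_z, p) * pow(u, -c, p) % p
    return u_t, c, a_z

def extract(c1: int, z1: int, c2: int, z2: int) -> int:
    # Special soundness: alpha = (alpha_z - alpha_z') / (c - c') mod q
    return (z1 - z2) * pow(c1 - c2, -1, q) % q

u_t, c, a_z = simulate()
assert accepts(u_t, c, a_z)      # accepted without ever using alpha

a_t = secrets.randbelow(q)       # two real conversations, one commitment
z1, z2 = (a_t + alpha * 3) % q, (a_t + alpha * 5) % q
assert extract(3, z1, 5, z2) == alpha
```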
### Dishonest Verifier

In case of dishonest verifiers, $V$ may not follow the protocol. For example, $V$ may choose a non-uniform $c \in \mc{C}$ depending on the commitment $u_t$. In this case, the conversation from the actual protocol and the conversation generated by the simulator will have different distributions.

We need a different simulator. The simulator must also take the verifier's actions as input, to properly simulate the dishonest verifier.
The original protocol can be modified so that the challenge space $\mc{C}$ is smaller. The completeness property is obvious, and the soundness error grows, but we can always repeat the protocol.

As for zero knowledge, the simulator $\rm{Sim}_{V^{\ast}}(u)$ generates a verifier's view $(u, c, z)$ as follows.

- Guess $c' \la \mc{C}$. Sample $z' \la \bb{Z}_q$ and set $u' = g^{z'} \cdot u^{-c'}$. Send $u'$ to $V^{\ast}$.
- If the response from the verifier $V^{\ast}(u')$ is $c$ and $c \neq c'$, restart.
  - $c = c'$ holds with probability $1 / \left\lvert \mc{C} \right\rvert$, since $c'$ is uniform.
- Otherwise, output $(u, c, z) = (u', c', z')$.
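The rewinding loop above can be sketched as follows; the toy group, the tiny challenge space, and the particular `cheating_verifier` (which derives its challenge from the commitment) are all illustrative assumptions.

```python
import secrets

p, q, g = 23, 11, 2       # toy group (illustrative)
u = pow(g, 6, p)          # statement; the simulator never uses the witness
C = [0, 1, 2, 3]          # small challenge space, so the guess succeeds quickly

def cheating_verifier(u_commit: int) -> int:
    # A dishonest verifier may derive the challenge from the commitment.
    return u_commit % len(C)

def simulate(verifier) -> tuple[int, int, int]:
    while True:                           # expected |C| iterations
        c_guess = secrets.choice(C)       # guess c' <- C
        z = secrets.randbelow(q)          # sample z' <- Z_q
        u_commit = pow(g, z, p) * pow(u, -c_guess, p) % p
        if verifier(u_commit) == c_guess:
            return u_commit, c_guess, z   # guess matched: output the view
        # wrong guess: rewind the verifier and restart

u_t, c, z = simulate(cheating_verifier)
assert pow(g, z, p) == u_t * pow(u, c, p) % p   # an accepting view
```

The expected number of restarts is $\left\lvert \mc{C} \right\rvert$, which is why the challenge space must be kept small in this variant.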
But in most cases, it is enough to assume honest verifiers, as we will see soon.
This one is similar to the Schnorr protocol. It is used for proving knowledge of a representation of a group element.

Let $G = \left\langle g \right\rangle$ be a cyclic group of prime order $q$, and let $h \in G$ be some arbitrary group element, fixed as a system parameter. A **representation** of $u$ relative to $g$ and $h$ is a pair $(\alpha, \beta) \in \bb{Z}_q^2$ such that $g^\alpha h^\beta = u$.

**Okamoto's protocol** for the relation

$$
\mc{R} = \bigg\lbrace \big( (\alpha, \beta), u \big) \in \bb{Z}_q^2 \times G : g^\alpha h^\beta = u \bigg\rbrace
$$

goes as follows.

![mc-13-okamoto.png](../../../assets/img/posts/lecture-notes/modern-cryptography/mc-13-okamoto.png)

> 1. $P$ computes random $\alpha_t, \beta_t \la \bb{Z}_q$ and sends the commitment $u_t \la g^{\alpha_t}h^{\beta_t}$ to $V$.
> 2. $V$ computes challenge $c \la \mc{C}$ and sends it to $P$.
> 3. $P$ computes $\alpha_z \la \alpha_t + \alpha c$, $\beta_z \la \beta_t + \beta c$ and sends $(\alpha_z, \beta_z)$ to $V$.
> 4. $V$ outputs $\texttt{accept}$ if and only if $g^{\alpha_z} h^{\beta_z} = u_t \cdot u^c$.

Completeness is obvious.
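One full run of the protocol can be sketched as a straight-line script (the toy parameters are illustrative assumptions); the final check expands to $g^{\alpha_t + \alpha c} h^{\beta_t + \beta c} = u_t \cdot u^c$.

```python
import secrets

p, q, g = 23, 11, 2                        # toy group (illustrative)
h = pow(g, 7, p)                           # system parameter
alpha, beta = 3, 5                         # witness: a representation of u
u = pow(g, alpha, p) * pow(h, beta, p) % p

# 1. commitment
alpha_t, beta_t = secrets.randbelow(q), secrets.randbelow(q)
u_t = pow(g, alpha_t, p) * pow(h, beta_t, p) % p
# 2. challenge
c = secrets.randbelow(q)
# 3. response
alpha_z, beta_z = (alpha_t + alpha * c) % q, (beta_t + beta * c) % q
# 4. verification: g^{alpha_z} h^{beta_z} = u_t * u^c
assert pow(g, alpha_z, p) * pow(h, beta_z, p) % p == u_t * pow(u, c, p) % p
```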
The **Chaum-Pedersen protocol** is for convincing a verifier that a given triple is a DH-triple.

Let $G = \left\langle g \right\rangle$ be a cyclic group of prime order $q$. $(g^\alpha, g^\beta, g^\gamma)$ is a DH-triple if $\gamma = \alpha\beta$. Then, the triple $(u, v, w)$ is a DH-triple if and only if $v = g^\beta$ and $w = u^\beta$ for some $\beta \in \bb{Z}_q$.

The Chaum-Pedersen protocol for the relation

$$
\mc{R} = \bigg\lbrace \big( \beta, (u, v, w) \big) \in \bb{Z}_q \times G^3 : v = g^\beta \land w = u^\beta \bigg\rbrace
$$

goes as follows.

![mc-13-chaum-pedersen.png](../../../assets/img/posts/lecture-notes/modern-cryptography/mc-13-chaum-pedersen.png)

> 1. $P$ computes random $\beta_t \la \bb{Z}_q$ and sends commitments $v_t \la g^{\beta_t}$, $w_t \la u^{\beta_t}$ to $V$.
> 2. $V$ computes challenge $c \la \mc{C}$ and sends it to $P$.
> 3. $P$ computes $\beta_z \la \beta_t + \beta c$, and sends it to $V$.
> 4. $V$ outputs $\texttt{accept}$ if and only if $g^{\beta_z} = v_t \cdot v^c$ and $u^{\beta_z} = w_t \cdot w^c$.

Completeness is obvious.
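A sketch of one run, with illustrative toy parameters: the same response $\beta_z$ must pass two checks, one against base $g$ and one against base $u$, which is exactly what ties $v$ and $w$ to the same exponent $\beta$.

```python
import secrets

p, q, g = 23, 11, 2                                  # toy group (illustrative)
alpha, beta = 4, 9
u, v, w = pow(g, alpha, p), pow(g, beta, p), pow(g, alpha * beta, p)  # DH-triple

beta_t = secrets.randbelow(q)
v_t, w_t = pow(g, beta_t, p), pow(u, beta_t, p)      # 1. commitments
c = secrets.randbelow(q)                             # 2. challenge
beta_z = (beta_t + beta * c) % q                     # 3. response
# 4. verification: both relations must hold
assert pow(g, beta_z, p) == v_t * pow(v, c, p) % p
assert pow(u, beta_z, p) == w_t * pow(w, c, p) % p
```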
### Sigma Protocol for RSA

Let $(n, e)$ be an RSA public key, where $e$ is prime. The **Guillou-Quisquater** (GQ) protocol is used to convince a verifier that the prover knows an $e$-th root of $y \in \bb{Z}_n^{\ast}$.

The Guillou-Quisquater protocol for the relation

$$
\mc{R} = \bigg\lbrace (x, y) \in \big( \bb{Z}_n^{\ast} \big)^2 : x^e = y \bigg\rbrace
$$

goes as follows.

![mc-13-gq-protocol.png](../../../assets/img/posts/lecture-notes/modern-cryptography/mc-13-gq-protocol.png)

> 1. $P$ computes random $x_t \la \bb{Z}_n^{\ast}$ and sends the commitment $y_t \la x_t^e$ to $V$.
> 2. $V$ computes challenge $c \la \mc{C}$ and sends it to $P$.
> 3. $P$ computes $x_z \la x_t \cdot x^c$ and sends it to $V$.
> 4. $V$ outputs $\texttt{accept}$ if and only if $x_z^e = y_t \cdot y^c$.

Completeness is obvious.
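A sketch of one run; the toy modulus $n = 3233 = 61 \cdot 53$, exponent $e = 17$, and witness are illustrative assumptions (real RSA moduli are thousands of bits). The check expands to $(x_t x^c)^e = x_t^e \cdot (x^e)^c = y_t \cdot y^c$.

```python
import secrets
from math import gcd

n, e = 3233, 17              # toy RSA modulus 61 * 53 with prime e (illustrative)
x = 42                       # witness: an e-th root of y
y = pow(x, e, n)             # public statement

while True:                  # sample x_t <- Z_n^*
    x_t = secrets.randbelow(n - 1) + 1
    if gcd(x_t, n) == 1:
        break
y_t = pow(x_t, e, n)             # 1. commitment
c = secrets.randbelow(e)         # 2. challenge, taken below the prime e
x_z = x_t * pow(x, c, n) % n     # 3. response
assert pow(x_z, e, n) == y_t * pow(y, c, n) % n  # 4. verification
```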
Using the basic sigma protocols, we can build sigma protocols for complex statements.
The construction is straightforward, since we can just prove both statements.

Given two sigma protocols $(P_0, V_0)$ for $\mc{R}_0 \subset \mc{X}_0 \times \mc{Y}_0$ and $(P_1, V_1)$ for $\mc{R}_1 \subset \mc{X}_1 \times \mc{Y}_1$, we construct a sigma protocol for the relation $\mc{R}_\rm{AND}$ defined on $(\mc{X}_0 \times \mc{X}_1) \times (\mc{Y}_0 \times \mc{Y}_1)$ as

$$
\mc{R}_\rm{AND} = \bigg\lbrace \big( (x_0, x_1), (y_0, y_1) \big) : (x_0, y_0) \in \mc{R}_0 \land (x_1, y_1) \in \mc{R}_1 \bigg\rbrace.
$$

Given a pair of statements $(y_0, y_1) \in \mc{Y}_0 \times \mc{Y}_1$, the prover tries to convince the verifier that he knows a proof $(x_0, x_1) \in \mc{X}_0 \times \mc{X}_1$. This is equivalent to proving the AND of both statements.

> 1. $P$ runs $P_i(x_i, y_i)$ to get a commitment $t_i$. $(t_0, t_1)$ is sent to $V$.
> 2. $V$ computes challenge $c \la \mc{C}$ and sends it to $P$.
> 3. $P$ uses the challenge for both $P_0, P_1$ and obtains responses $z_0$, $z_1$, which are sent to $V$.
> 4. $V$ outputs $\texttt{accept}$ if and only if $(t_i, c, z_i)$ is an accepting conversation for $y_i$, for both $i = 0, 1$.

Completeness is clear.
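The AND composition can be sketched with two Schnorr sub-provers sharing one challenge (toy group and witnesses are illustrative assumptions).

```python
import secrets

p, q, g = 23, 11, 2                          # toy group (illustrative)
x0, x1 = 3, 8                                # the two witnesses
y0, y1 = pow(g, x0, p), pow(g, x1, p)        # the two statements

# 1. commitments from both sub-provers
r0, r1 = secrets.randbelow(q), secrets.randbelow(q)
t0, t1 = pow(g, r0, p), pow(g, r1, p)
# 2. a single shared challenge
c = secrets.randbelow(q)
# 3. responses from both sub-provers, using the same c
z0, z1 = (r0 + x0 * c) % q, (r1 + x1 * c) % q
# 4. both conversations must be accepting
assert pow(g, z0, p) == t0 * pow(y0, c, p) % p
assert pow(g, z1, p) == t1 * pow(y1, c, p) % p
```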
> **Theorem.** If $(P_0, V_0)$ and $(P_1, V_1)$ provide special soundness and are special HVZK, then the AND protocol $(P, V)$ defined above also provides special soundness and is special HVZK.

*Proof*. For special soundness, let $\rm{Ext}_0$, $\rm{Ext}_1$ be the knowledge extractors for $(P_0, V_0)$ and $(P_1, V_1)$, respectively. Then the knowledge extractor $\rm{Ext}$ for $(P, V)$ can be constructed straightforwardly. For statements $(y_0, y_1)$, suppose that $\big( (t_0, t_1), c, (z_0, z_1) \big)$ and $\big( (t_0, t_1), c', (z_0', z_1') \big)$ are two accepting conversations. Feed $\big( y_0, (t_0, c, z_0), (t_0, c', z_0') \big)$ to $\rm{Ext}_0$, and feed $\big( y_1, (t_1, c, z_1), (t_1, c', z_1') \big)$ to $\rm{Ext}_1$.

For special HVZK, let $\rm{Sim}_0$ and $\rm{Sim}_1$ be simulators for each protocol. Then the simulator $\rm{Sim}$ for $(P, V)$ is built by using $(t_0, z_0) \la \rm{Sim}_0(y_0, c)$ and $(t_1, z_1) \la \rm{Sim}_1(y_1, c)$. Set

$$
\big( (t_0, t_1), (z_0, z_1) \big) \la \rm{Sim}\big( (y_0, y_1), c \big).
$$

We have used the fact that the same challenge is used for both protocols.
However, the OR-proof construction is more difficult. The prover must convince the verifier that at least one of the statements is true, but **should not reveal which one is true.**

If the challenge is known in advance, the prover can cheat. We exploit this fact. For the proof of $y_0 \lor y_1$, do the real proof for $y_b$ and cheat for $y_{1-b}$.

Suppose we are given two sigma protocols $(P_0, V_0)$ for $\mc{R}_0 \subset \mc{X}_0 \times \mc{Y}_0$ and $(P_1, V_1)$ for $\mc{R}_1 \subset \mc{X}_1 \times \mc{Y}_1$. We assume that both use the same challenge space, and that both are special HVZK with simulators $\rm{Sim}_0$ and $\rm{Sim}_1$.

We combine the protocols to form a sigma protocol for the relation $\mc{R}_\rm{OR}$ defined on $\big( \braces{0, 1} \times (\mc{X}_0 \cup \mc{X}_1) \big) \times (\mc{Y}_0 \times \mc{Y}_1)$ as

$$
\mc{R}_\rm{OR} = \bigg\lbrace \big( (b, x), (y_0, y_1) \big) : (x, y_b) \in \mc{R}_b \bigg\rbrace.
$$

Here, $b$ denotes the actual statement $y_b$ to prove. For $y_{1-b}$, we cheat.

> $P$ is initialized with $\big( (b, x), (y_0, y_1) \big) \in \mc{R}_\rm{OR}$ and $V$ is initialized with $(y_0, y_1) \in \mc{Y}_0 \times \mc{Y}_1$. Let $d = 1 - b$.
>
> 1. $P$ computes $c_d \la \mc{C}$ and $(t_d, z_d) \la \rm{Sim}_d(y_d, c_d)$.
> 2. $P$ runs $P_b(x, y_b)$ to get a real commitment $t_b$ and sends $(t_0, t_1)$ to $V$.
> 3. $V$ computes challenge $c \la \mc{C}$ and sends it to $P$.
> 4. $P$ computes $c_b \la c \oplus c_d$, feeds it to $P_b(x, y_b)$, and obtains a response $z_b$.
> 5. $P$ sends $(c_0, z_0, z_1)$ to $V$.
> 6. $V$ computes $c_1 \la c \oplus c_0$, and outputs $\texttt{accept}$ if and only if $(t_0, c_0, z_0)$ is an accepting conversation for $y_0$ and $(t_1, c_1, z_1)$ is an accepting conversation for $y_1$.

Step $1$ is the cheating part, where the prover chooses a challenge and generates a commitment and a response from the simulator.

Completeness follows from the following.

- $c_b = c \oplus c_{1-b}$, so $c_1 = c \oplus c_0$ always holds.
- Both conversations $(t_0, c_0, z_0)$ and $(t_1, c_1, z_1)$ are accepted.
  - An actual proof is done for statement $y_b$.
  - For statement $y_{1-b}$, the simulator always outputs an accepting conversation.

$c_b = c \oplus c_d$ is random, so $P$ cannot manipulate the challenge. Also, $V$ checks $c_1 = c \oplus c_0$.
> **Theorem.** If $(P_0, V_0)$ and $(P_1, V_1)$ provide special soundness and are special HVZK, then the OR protocol $(P, V)$ defined above also provides special soundness and is special HVZK.

*Proof*. For special soundness, suppose that $\rm{Ext}_0$ and $\rm{Ext}_1$ are knowledge extractors. Let

$$
\big( (t_0, t_1), c, (c_0, z_0, z_1) \big), \qquad \big( (t_0, t_1), c', (c_0', z_0', z_1') \big)
$$

be two accepting conversations with $c \neq c'$. Define $c_1 = c \oplus c_0$ and $c_1' = c' \oplus c_0'$. Since $c \neq c'$, it must be the case that either $c_0 \neq c_0'$ or $c_1 \neq c_1'$. Now $\rm{Ext}$ works as follows.

- If $c_0 \neq c_0'$, output $\bigg( 0, \rm{Ext}_0\big( y_0, (t_0, c_0, z_0), (t_0, c_0', z_0') \big) \bigg)$.
- If $c_1 \neq c_1'$, output $\bigg( 1, \rm{Ext}_1\big( y_1, (t_1, c_1, z_1), (t_1, c_1', z_1') \big) \bigg)$.

Then $\rm{Ext}$ will extract the knowledge.

For special HVZK, define $c_0 \la \mc{C}$, $c_1 \la c \oplus c_0$. Then run each simulator to get

$$
(t_0, z_0) \la \rm{Sim}_0(y_0, c_0), \quad (t_1, z_1) \la \rm{Sim}_1(y_1, c_1).
$$

Then the simulator for $(P, V)$ outputs

$$
\big( (t_0, t_1), (c_0, z_0, z_1) \big) \la \rm{Sim}\big( (y_0, y_1), c \big).
$$

The simulator just simulates both statements and returns the messages as in the protocol. $c_b$ is random, and the remaining values have the same distribution, since the original two protocols were special HVZK.
### Example: OR of Sigma Protocols with Schnorr Protocol

Let $G = \left\langle g \right\rangle$ be a cyclic group of prime order $q$. The prover wants to convince the verifier that he knows the discrete logarithm of either $h_0$ or $h_1$ in $G$.
Suppose that the prover knows $x_b \in \bb{Z}_q$ such that $g^{x_b} = h_b$.
> 1. Choose $c_{1-b} \la \mc{C}$ and call the simulator for statement $1-b$ to obtain $(u_{1-b}, z_{1-b}) \la \rm{Sim}_{1-b}(h_{1-b}, c_{1-b})$.
> 2. $P$ sends two commitments $u_0, u_1$.
> - For $u_b$, choose random $y \la \bb{Z}_q$ and set $u_b = g^y$.
> - For $u_{1-b}$, use the value from the simulator.
> 3. $V$ sends a single challenge $c \la \mc{C}$.
> 4. Using $c_{1-b}$, split the challenge into $c_0$, $c_1$ so that they satisfy $c_0 \oplus c_1 = c$. Then send $(c_0, c_1, z_0, z_1)$ to $V$.
> - For $z_b$, calculate $z_b \la y + c_b x_b$.
> - For $z_{1-b}$, use the value from the simulator.
> 5. $V$ checks if $c = c_0 \oplus c_1$. $V$ accepts if and only if $(u_0, c_0, z_0)$ and $(u_1, c_1, z_1)$ are both accepting conversations.

- Since $c$ and $c_{1-b}$ are random, $c_b$ is also random. The prover cannot control both challenges, so at least one of the proofs must be genuine.
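As a sanity check, the protocol above can be sketched in Python over a toy Schnorr group ($p = 23$, $q = 11$, $g = 2$ — parameters far too small for real use). All function and variable names here are mine; the XOR challenge split and the special-HVZK simulator follow the steps above.

```python
import secrets

# Toy Schnorr group (illustrative only): g = 2 has prime order q = 11 mod p = 23.
p, q, g = 23, 11, 2

def inv(a):
    return pow(a, p - 2, p)  # modular inverse, p prime

def simulate(h, c):
    """Special HVZK simulator: pick z first, solve for the commitment u."""
    z = secrets.randbelow(q)
    u = pow(g, z, p) * inv(pow(h, c, p)) % p   # so that g^z = u * h^c
    return u, z

def or_prove(h0, h1, b, xb, c):
    """Prover knows x_b with g^{x_b} = h_b; c is the verifier's challenge."""
    c_other = secrets.randbelow(256)                 # challenge for the simulated branch
    u_other, z_other = simulate(h1 if b == 0 else h0, c_other)
    y = secrets.randbelow(q)
    u_real = pow(g, y, p)                            # honest commitment
    c_real = c ^ c_other                             # split so that c_0 XOR c_1 = c
    z_real = (y + c_real * xb) % q
    if b == 0:
        return (u_real, u_other), (c_real, c_other, z_real, z_other)
    return (u_other, u_real), (c_other, c_real, z_other, z_real)

def or_verify(h0, h1, us, resp, c):
    c0, c1, z0, z1 = resp
    ok0 = pow(g, z0, p) == us[0] * pow(h0, c0, p) % p
    ok1 = pow(g, z1, p) == us[1] * pow(h1, c1, p) % p
    return c == (c0 ^ c1) and ok0 and ok1
```

Note that the verifier cannot tell which branch was simulated: both transcripts satisfy the same Schnorr verification equation.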
### Generalized Constructions
Intuitively, it is hard to create a valid proof of a false statement.
> **Definition.** Let $\Phi = (G, V)$ be a non-interactive proof system for $\mc{R} \subset \mc{X} \times \mc{Y}$ with proof space $\mc{PS}$. An adversary $\mc{A}$ outputs a statement $y^{\ast} \in \mc{Y}$ and a proof $\pi^{\ast} \in \mc{PS}$ to attack $\Phi$.
>
> The adversary wins if $V(y^{\ast}, \pi^{\ast}) = \texttt{accept}$ and $y^{\ast} \notin L_\mc{R}$. The advantage of $\mc{A}$ with respect to $\Phi$ is defined as the probability that $\mc{A}$ wins, and is denoted as $\rm{Adv}_{\rm{niSnd}}[\mc{A}, \Phi]$.
>
> If the advantage is negligible for all efficient adversaries $\mc{A}$, $\Phi$ is **sound**.
The basic idea is **using a hash function to derive a challenge**, instead of a random challenge chosen by the verifier.
> **Definition.** Let $\Pi = (P, V)$ be a sigma protocol for a relation $\mc{R} \subset \mc{X} \times \mc{Y}$. Suppose that conversations $(t, c, z) \in \mc{T} \times \mc{C} \times \mc{Z}$. Let $H : \mc{Y} \times \mc{T} \rightarrow \mc{C}$ be a hash function.
>
> Define the **Fiat-Shamir non-interactive proof system** $\Pi_\rm{FS} = (G_\rm{FS}, V_\rm{FS})$ with proof space $\mc{PS} = \mc{T} \times \mc{Z}$ as follows.
>
> - For input $(x, y) \in \mc{R}$, $G_\rm{FS}$ runs $P(x, y)$ to obtain a commitment $t \in \mc{T}$. Then it computes the challenge $c = H(y, t)$, which is fed to $P(x, y)$, obtaining a response $z \in \mc{Z}$. $G_\rm{FS}$ outputs $(t, z) \in \mc{T} \times \mc{Z}$.
> - For input $\big( y, (t, z) \big) \in \mc{Y} \times (\mc{T} \times \mc{Z})$, $V_\rm{FS}$ verifies that $(t, c, z)$ is an accepting conversation for $y$, where $c = H(y, t)$.

Any sigma protocol can be converted into a non-interactive proof system. Its completeness is automatically given by the completeness of the sigma protocol.
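For concreteness, here is a sketch of the transform applied to the Schnorr protocol, using SHA-256 as the hash and a toy group (parameters and names are illustrative only, not a real implementation):

```python
import hashlib
import secrets

# Toy Schnorr group (illustrative only): g = 2 has prime order q = 11 mod p = 23.
p, q, g = 23, 11, 2

def H(y, t):
    """Derive the challenge from the statement and the commitment."""
    d = hashlib.sha256(f"{y}|{t}".encode()).digest()
    return int.from_bytes(d, "big") % q

def fs_prove(x, y):
    """Non-interactive proof of knowledge of x with g^x = y."""
    r = secrets.randbelow(q)
    t = pow(g, r, p)          # commitment
    c = H(y, t)               # challenge comes from the hash, not a verifier
    z = (r + c * x) % q       # response
    return t, z

def fs_verify(y, proof):
    t, z = proof
    c = H(y, t)               # recompute the challenge
    return pow(g, z, p) == t * pow(y, c, p) % p
```

The verifier recomputes $c = H(y, t)$ itself, so no interaction is needed.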
### Soundness of the Fiat-Shamir Transform

> **Theorem.** Let $\Pi$ be a sigma protocol for a relation $\mc{R} \subset \mc{X} \times \mc{Y}$, and let $\Pi_\rm{FS}$ be the Fiat-Shamir non-interactive proof system derived from $\Pi$ with hash function $H$. If $\Pi$ is sound and $H$ is modeled as a random oracle, then $\Pi_\rm{FS}$ is also sound.
>
> Let $\mc{A}$ be a $q$-query adversary attacking the soundness of $\Pi_\rm{FS}$. There exists an adversary $\mc{B}$ attacking the soundness of $\Pi$ such that
>
> $$
> \rm{Adv}_{\rm{niSnd^{ro}}}[\mc{A}, \Pi_\rm{FS}] \leq (q + 1) \rm{Adv}_{\rm{Snd}}[\mc{B}, \Pi].
> $$

*Proof Idea*. Suppose that $\mc{A}$ produces a valid proof $(t^{\ast}, z^{\ast})$ on a false statement $y^{\ast}$. Without loss of generality, $\mc{A}$ queries the random oracle at $(y^{\ast}, t^{\ast})$ within $q+1$ queries. Then $\mc{B}$ guesses which of the $q+1$ queries is the relevant one. If $\mc{B}$ guesses the correct query, the conversation $(t^{\ast}, c, z^{\ast})$ will be accepted and $\mc{B}$ succeeds. The factor $q+1$ comes from the guess made by $\mc{B}$.
$n$ voters are casting a vote, either $0$ or $1$. At the end, all voters learn the tally.
We can use the [multiplicative ElGamal encryption](../2023-10-19-public-key-encryption/#the-elgamal-encryption) scheme in this case. Assume that a trusted vote tallying center generates a key pair, keeps $sk = \alpha$ to itself and publishes $pk = h = g^\alpha$.
Each voter encrypts the vote $b_i$ and the ciphertext is
$$
(u_i, v_i) = (g^{\beta_i}, h^{\beta_i} \cdot g^{b_i})
$$
where $\beta_i \la \bb{Z}_q$. The vote tallying center aggregates all ciphertexts by multiplying everything. No need to decrypt yet. Then
$$
(u^{\ast}, v^{\ast}) = \left( \prod_{i=1}^n g^{\beta_i}, \prod_{i=1}^n h^{\beta_i} \cdot g^{b_i} \right) = \big( g^{\beta^{\ast}}, h^{\beta^{\ast}} \cdot g^{b^{\ast}} \big),
$$
where $\beta^{\ast} = \sum_{i=1}^n \beta_i$ and $b^{\ast} = \sum_{i=1}^n b_i$. Now decrypt $(u^{\ast}, v^{\ast})$ and publish the result $b^{\ast}$.[^4]
Since the ElGamal scheme is semantically secure, the protocol is also secure if all voters follow the protocol. But a dishonest voter can encrypt $b_i = -100$ or some arbitrary value.
To fix this, we can make each voter prove that the vote is valid. Using the [Chaum-Pedersen protocol for DH-triples](../2023-11-07-sigma-protocols/#the-chaum-pedersen-protocol-for-dh-triples) and the [OR-proof construction](../2023-11-07-sigma-protocols/#or-proof-construction), the voter can submit a proof that the ciphertext is an encryption of either $b_i = 0$ or $b_i = 1$. We can also apply the Fiat-Shamir transform here for efficiency, resulting in non-interactive proofs.
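A minimal sketch of the tallying idea (toy group, all names mine; the final tally $b^{\ast}$ is recovered by brute-forcing the small discrete log, which works because $b^{\ast} \leq n$):

```python
import secrets

# Toy Schnorr group (illustrative only): g = 2 has prime order q = 11 mod p = 23.
p, q, g = 23, 11, 2
alpha = secrets.randbelow(q)   # tallying center's secret key
h = pow(g, alpha, p)           # published public key

def encrypt_vote(b):
    """Multiplicative ElGamal: (g^beta, h^beta * g^b)."""
    beta = secrets.randbelow(q)
    return pow(g, beta, p), pow(h, beta, p) * pow(g, b, p) % p

votes = [1, 0, 1, 1, 0]
cts = [encrypt_vote(b) for b in votes]

# Aggregate by componentwise multiplication -- no decryption yet.
u_star = v_star = 1
for u, v in cts:
    u_star, v_star = u_star * u % p, v_star * v % p

# Decrypt: v*/(u*)^alpha = g^{b*}; recover b* by brute force over the small range.
m = v_star * pow(u_star, p - 1 - alpha, p) % p   # (u*)^{-alpha} via Fermat
b_star = next(b for b in range(len(votes) + 1) if pow(g, b, p) == m)
print(b_star)  # 3
```

Only the aggregate $b^{\ast}$ is ever decrypted; individual ciphertexts stay sealed.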
[^1]: The message flows in a shape that resembles the Greek letter $\Sigma$, hence the name *sigma protocol*.
[^2]: A Graduate Course in Applied Cryptography.
Suppose we have a function $f$ that takes $n$ inputs and produces $m$ outputs.
$$
(y_1, \dots, y_m) = f(x_1, \dots, x_n).
$$
$N$ parties $P_1, \dots, P_N$ are trying to evaluate this function with a protocol. Each $x_i$ is submitted by one of the parties, and each output $y_j$ will be given to one or more parties.
In **secure multiparty computation** (MPC), we wish to achieve some security guarantees.
Security must hold even if some of the parties behave adversarially.
### Example: Secure Summation

Suppose we have $n$ parties $P_1, \dots, P_n$ with private values $x_1, \dots, x_n$. We would like to *securely* compute the sum $s = x_1 + \cdots + x_n$.
> 1. Choose $M$ large enough so that $M > s$.
> 2. $P_1$ samples $r \la \Z_M$, computes $s_1 = r + x_1 \pmod M$ and sends it to $P_2$.
> 3. In the same manner, $P_i$ computes $s_i = s_{i-1} + x_i \pmod M$ and sends it to $P_{i+1}$.
> 4. As the final step, $s_n$ is returned to $P_1$, who outputs $s = s_n - r \pmod M$.
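The steps above can be sketched as follows (all parties simulated in one process; in a real deployment each assignment is a message between parties):

```python
import secrets

def secure_sum(xs, M):
    """Ring-based summation: P1 masks with r, each party adds its input mod M."""
    r = secrets.randbelow(M)     # P1's random mask
    s = (r + xs[0]) % M          # P1 -> P2
    for x in xs[1:]:
        s = (s + x) % M          # P_i -> P_{i+1}
    return (s - r) % M           # s_n returns to P1, who unmasks it

print(secure_sum([3, 1, 4, 1, 5], 100))  # 14
```

Correctness needs $M$ larger than the true sum, as required by step 1.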
This protocol seems secure since $r$ is random noise added to the actual partial sum. But the security actually depends on how we model adversarial behavior.
Consider the case where parties $P_2$ and $P_4$ team up (collusion). These two can share information between them. They have the following:
- $P_2$ has $s_1$, $s_2$, $x_2$.
- $P_4$ has $s_3$, $s_4$, $x_4$.

Using $s_2$ and $s_3$, they can compute $x_3 = s_3 - s_2$ and obtain the input of $P_3$. This violates privacy. Similarly, if $P_i$ and $P_j$ team up, they can compute the partial sum
$$
s_{j-1} - s_i = x_{i+1} + \cdots + x_{j-1},
$$
which leaks information about the inputs of $P_{i+1}, \dots, P_{j-1}$.
## Modeling Adversaries for Multiparty Computation
Thus, a secure protocol must provide security in the real world that is equivalent to security in the ideal world.
- If we show the existence of a simulator, a real world adversary's ability is the same as an adversary in the ideal world.

> **Definition.** Let $\mc{A}$ be the set of parties that are corrupted, and let $\rm{Sim}$ be a simulator algorithm.
> - $\rm{Real}(\mc{A}; x_1, \dots, x_n)$: each party $P_i$ runs the protocol with private input $x_i$. Let $V_i$ be the final view of $P_i$. Output $\braces{V_i : i \in \mc{A}}$.
> - $\rm{Ideal}_\rm{Sim}(x_1, \dots, x_n)$: output $\rm{Sim}(\mc{A}; \braces{(x_i, y_i) : i \in \mc{A}})$.
>
> A protocol is **secure against semi-honest adversaries** if there exists a simulator such that for every subset of corrupted parties $\mc{A}$, its views in the real and ideal worlds are indistinguishable.
This is a building block for building any MPC.
Suppose that the sender has data $m_1, \dots, m_n \in \mc{M}$, and the receiver has an index $i \in \braces{1, \dots, n}$. The sender wants to send exactly one message and hide the others. Also, the receiver wants to hide which message he received.
This problem is called 1-out-of-$n$ **oblivious transfer** (OT).
We show an example of 1-out-of-2 OT using the ElGamal encryption scheme.
It is known that $k$-out-of-$n$ OT is constructible from 1-out-of-2 OTs.

> Suppose that the sender Alice has messages $x_0, x_1 \in \braces{0, 1}\conj$, and the receiver Bob has a choice $\sigma \in \braces{0, 1}$.
>
> 1. Bob chooses $sk = \alpha \la \Z_q$ and computes $h = g^\alpha$, and chooses $h' \la G$.
> 2. Bob sets $pk_\sigma = h$ and $pk_{1-\sigma} = h'$ and sends $(pk_0, pk_1)$ to Alice.
> 3. Alice encrypts each $x_i$ using $pk_i$, obtaining two ciphertexts.
> - $\beta_0, \beta_1 \la \Z_q$.
> - $c_0 = \big( g^{\beta_0}, H(pk_0^{\beta_0}) \oplus x_0 \big)$, $c_1 = \big( g^{\beta_1}, H(pk_1^{\beta_1}) \oplus x_1 \big)$.
> 4. Alice sends $(c_0, c_1)$ to Bob.
> 5. Bob decrypts $c_\sigma$ with $sk$ to get $x_\sigma$.

Correctness is obvious.
Alice's view contains the following: $x_0, x_1, pk_0, pk_1, c_0, c_1$. Among these, $pk_0, pk_1$ are the values received from Bob. But these are random group elements, so she learns nothing about $\sigma$. The simulator can choose two random group elements to simulate Alice.
Bob's view contains the following: $\sigma, \alpha, g^\alpha, h', c_0, c_1, x_\sigma$. He only knows one private key, so he only learns $x_\sigma$, under the DL assumption. (He doesn't know the discrete logarithm of $h'$.) The simulator must simulate $c_0, c_1$, so it encrypts $x_\sigma$ with $pk_\sigma$, and as for $x_{1-\sigma}$, a random message is encrypted with $pk_{1-\sigma}$. This works because the encryption scheme is semantically secure, meaning that it doesn't reveal any information about the underlying message.
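A sketch of the semi-honest protocol, assuming SHA-256 as $H$ and XOR as the one-time-pad masking (toy group; sampling $h'$ via a random exponent is a simulation convenience — in the model Bob just picks a random group element and forgets everything else):

```python
import hashlib
import secrets

# Toy Schnorr group (illustrative only): g = 2 has prime order q = 11 mod p = 23.
p, q, g = 23, 11, 2

def H(x):
    """Hash a group element to a 32-byte pad."""
    return hashlib.sha256(str(x).encode()).digest()

def xor(a, b):
    return bytes(s ^ t for s, t in zip(a, b))

def ot_receiver_keys(sigma):
    """Bob: real public key for pk_sigma, random group element for the other."""
    alpha = secrets.randbelow(q)
    h = pow(g, alpha, p)
    h_prime = pow(g, secrets.randbelow(q), p)  # Bob 'forgets' this exponent
    pks = (h, h_prime) if sigma == 0 else (h_prime, h)
    return alpha, pks

def ot_sender(x0, x1, pks):
    """Alice: hashed-ElGamal encryption of each message under each pk."""
    cts = []
    for pk, x in zip(pks, (x0, x1)):
        beta = secrets.randbelow(q)
        cts.append((pow(g, beta, p), xor(H(pow(pk, beta, p)), x)))
    return cts

def ot_receiver_open(alpha, sigma, cts):
    """Bob can only derive the pad for the ciphertext under his real key."""
    gb, masked = cts[sigma]
    return xor(H(pow(gb, alpha, p)), masked)
```

Bob recovers $x_\sigma$ because $(g^{\beta})^{\alpha} = pk_\sigma^{\beta}$; for the other ciphertext he would need the discrete log of $h'$.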
The above works for **semi-honest** parties. To prevent malicious behavior, we fix the protocol a bit.
> 1. Alice sends a random $w \la G$ first.
> 2. Bob must choose $h$ and $h'$ so that $hh' = w$. $h$ is chosen the same way, and $h' = wh\inv$ is computed.
>
> The remaining steps are the same, except that Alice checks if $pk_0 \cdot pk_1 = w$.

Bob must choose $h, h'$ such that $hh' = w$. Otherwise, Bob could choose $\alpha' \la \Z_q$ and set $h' = g^{\alpha'}$, enabling him to decrypt both $c_0, c_1$ and revealing $x_0, x_1$. Under the DL assumption, Bob cannot find the discrete logarithm of $h'$, which prevents this malicious behavior.
### 1-out-of-$n$ OT Construction from ElGamal Encryption

Let $m_1, \dots, m_n \in \mc{M}$ be the messages to send, and let $i$ be an index. We will use ElGamal encryption on a cyclic group $G = \span{g}$ of prime order, with a hash function and a semantically secure symmetric cipher $(E_S, D_S)$.
> 1. Alice chooses $\beta \la \Z_q$, computes $v \la g^\beta$ and sends $v$ to Bob.
> 2. Bob chooses $\alpha \la \Z_q$, computes $u \la g^\alpha v^{-i}$ and sends $u$ to Alice.
> 3. For $j = 1, \dots, n$, Alice computes the following.
> - Compute $u_j \la u \cdot v^j = g^\alpha v^{j-i}$ as the public key for the $j$-th message.
> - Encrypt $m_j$ as $(g^\beta, c_j)$, where $c_j \la E_S\big( H(g^\beta, u_j^\beta), m_j \big)$.
> 4. Alice sends $(c_1, \dots, c_n)$ to Bob.
> 5. Bob decrypts $c_i$ as follows.
> - Compute the symmetric key $k \la H(v, v^\alpha)$ where $v = g^\beta$ from step $1$.
> - $m_i \la D_S(k, c_i)$.

Note that all ciphertexts $c_j$ were created from the same ephemeral key $\beta \in \Z_q$.
For correctness, we check that Bob indeed receives $m_i$ from the above protocol. Check that $u_i = u \cdot v^i = g^\alpha v^0 = g^\alpha$, so $u_i^\beta = g^{\alpha\beta} = v^\alpha$. Since $c_i = E_S\big( H(g^\beta, u_i^\beta), m_i \big) = E_S\big( H(v, v^\alpha), m_i \big)$, the decryption gives $m_i$.
Now is this oblivious? All that Alice sees is $u = g^\alpha v^{-i}$ from Bob. Since $\alpha \la \Z_q$, $u$ is uniformly distributed over $G$. Alice learns no information about $i$.
As for Bob, we need the **CDH assumption**. Suppose that Bob can query $H$ on the key material of two different ciphertexts $c_{j_1}, c_{j_2}$. Then he knows
$$
u_{j_1}^\beta / u_{j_2}^\beta = v^{\beta(j_1 - j_2)},
$$
and by raising both sides to the $(j_1 - j_2)\inv$ power (inverse in $\Z_q$), he can compute $v^\beta = g^{\beta^2}$. Thus, Bob has computed $g^{\beta^2}$ from $g^\beta$, and this breaks the CDH assumption.[^1] Thus Bob cannot query $H$ on two points, and is unable to decrypt two ciphertexts. He only learns $m_i$.
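The same protocol in a toy setting, using SHA-256 for $H$ and XOR in place of the symmetric cipher $(E_S, D_S)$ (a sketch under those assumptions, not a real implementation; parameters are far too small for security):

```python
import hashlib
import secrets

# Toy Schnorr group (illustrative only): g = 2 has prime order q = 11 mod p = 23.
p, q, g = 23, 11, 2

def H(a, b):
    """Hash two group elements to a 32-byte symmetric key."""
    return hashlib.sha256(f"{a}|{b}".encode()).digest()

def xor(a, b):
    return bytes(s ^ t for s, t in zip(a, b))

msgs = [b"msg1", b"msg2", b"msg3"]   # m_1, ..., m_n
i = 2                                # Bob's choice (1-indexed)

beta = secrets.randbelow(q)
v = pow(g, beta, p)                                 # Alice -> Bob
alpha = secrets.randbelow(q)
u = pow(g, alpha, p) * pow(v, (q - i) % q, p) % p   # Bob: u = g^alpha * v^{-i}

cts = []
for j, m in enumerate(msgs, start=1):
    u_j = u * pow(v, j, p) % p                      # public key for the j-th message
    cts.append(xor(H(v, pow(u_j, beta, p)), m))     # one ephemeral beta for all j

k = H(v, pow(v, alpha, p))                          # Bob's key: H(v, v^alpha)
print(xor(k, cts[i - 1]))  # b'msg2'
```

Only index $i$ yields $u_i^\beta = v^\alpha$, so Bob's key opens exactly one ciphertext.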
### OT for Computing $2$-ary Function with Finite Domain

We can use an OT for computing a $2$-ary function with finite domain.
Let $f : X_1 \times X_2 \ra Y$ be a deterministic function with $X_1$, $X_2$ both finite. There are two parties $P_1, P_2$ with inputs $x_1, x_2$, and they want to compute $f(x_1, x_2)$ without revealing their inputs.
Then we can use $1$-out-of-$\abs{X_2}$ OT to securely compute $f(x_1, x_2)$. Without loss of generality, suppose that $P_1$ is the sender.
$P_1$ computes $y_x = f(x_1, x)$ for all $x \in X_2$, resulting in $\abs{X_2}$ messages. Then $P_1$ performs 1-out-of-$\abs{X_2}$ OT with $P_2$. The value of $x_2$ will be used as the choice of $P_2$, which will be oblivious to $P_1$.[^2]
This method is inefficient, so we have better methods!
[^1]: Given $g^\alpha, g^\beta$, compute $g^{\alpha + \beta}$. Then compute $g^{\alpha^2}, g^{\beta^2}, g^{(\alpha+\beta)^2}$, and obtain $g^{2\alpha\beta}$. Exponentiate by $2\inv \in \Z_q$ to find $g^{\alpha\beta}$.
[^2]: Can $P_1$ learn the value of $x_2$ from the final output $y_{x_2} = f(x_1, x_2)$?
A **garbled circuit** is an *encrypted circuit*, with a pair of keys for each wire.
The garbler first encrypts the circuit. First, assign two keys, called **garbled values**, to each wire of the circuit.
Suppose we have an AND gate, where $C = \rm{AND}(A, B)$. For the wire $A$, the garbler assigns $A_0, A_1$, representing the bits $0$ and $1$ respectively. Note that this mapping is known only to the garbler. A similar process is done for wires $B$ and $C$.
Then we have the following garbled values, as in columns 1 to 3. Now, encrypt the values of $C$ with a semantically secure scheme $E$, and obtain the $4$th column. Then, permute the rows in random order so that they are indistinguishable.
|$A$|$B$|$C$|$C = \rm{AND}(A, B)$|
|:-:|:-:|:-:|:-:|
|$A_0$|$B_0$|$C_0$|$E(A_0 \parallel B_0, C_0)$|
|$A_0$|$B_1$|$C_0$|$E(A_0 \parallel B_1, C_0)$|
|$A_1$|$B_0$|$C_0$|$E(A_1 \parallel B_0, C_0)$|
|$A_1$|$B_1$|$C_1$|$E(A_1 \parallel B_1, C_1)$|
For evaluation, the **last column** will be given to the other party as the representation of the **garbled gate**. The inputs will be given as $A_x$ and $B_y$, but the evaluator will have no idea about the actual values of $x$ and $y$, hiding the actual input values. Although he doesn't know the underlying bit values, the evaluator is able to compute $C_z$ where $z = x \land y$. Similarly, the evaluator will not know whether $z$ is $0$ or $1$, hiding the output or intermediate values.
The above *garbling* process is done for all gates. For the last output gate, the garbler keeps an **output translation table** to himself, which maps $0$ to $C_0$ and $1$ to $C_1$. This is used for recovering the output bit when the evaluation is done and the evaluator sends the final garbled value.
> In summary, given a boolean circuit,
> 1. Assign garbled values to all wires in the circuit.
Note that the evaluator learns nothing during the evaluation.
There is a slight problem here. In some encryption schemes, a ciphertext can be decrypted by an incorrect key. If the above encryptions are in arbitrary order, how does the evaluator know if he decrypted the correct one?

One method is to add **redundant zeros** to $C_k$. Then the last column would contain $E\big( A_i \pll B_j, C_k \pll 0^n \big)$. When the evaluator decrypts these ciphertexts, the probability of getting redundant zeros with an incorrect key is negligible. But with this method, all four ciphertexts have to be decrypted in the worst case.

Another method is adding a bit to signal which ciphertext to decrypt. This method is called **point-and-permute**. The garbler chooses a random bit $b_A$ for each wire $A$. Then when drawing $A_0, A_1$, set the first bit (MSB) to $b_A$ and $1 - b_A$, respectively. Next, the ciphertexts are sorted in the order of $b_A$ and $b_B$. Then the evaluator can exploit this information during evaluation.

For example, if the evaluator has $X$ and $Y$ such that $\rm{MSB}(X) = 0$ and $\rm{MSB}(Y) = 1$, then he chooses the second ($01$ in binary) ciphertext entry to decrypt.

This method does not reduce security, since the bits $b_A$, $b_B$ are random. Also, the evaluator no longer has to decrypt all four ciphertexts, reducing the evaluation load.
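As an illustration, a single AND gate with point-and-permute can be garbled as in the following toy sketch. The cipher $E$ is modeled here as an XOR with a SHA-256 hash of both input labels, and the 16-byte label length is an arbitrary choice; the notes fix neither.

```python
import hashlib
import secrets

def enc(k_a: bytes, k_b: bytes, msg: bytes) -> bytes:
    # XOR with a hash of both input labels; decryption is the same operation.
    pad = hashlib.sha256(k_a + k_b).digest()[:len(msg)]
    return bytes(x ^ y for x, y in zip(msg, pad))

def wire_labels():
    # Two 16-byte labels per wire; the MSB of the first byte is the permute bit.
    b = secrets.randbits(1)
    l0 = bytes([b << 7]) + secrets.token_bytes(15)        # label for 0, MSB = b
    l1 = bytes([(1 - b) << 7]) + secrets.token_bytes(15)  # label for 1, MSB = 1 - b
    return l0, l1

def garble_and_gate():
    A, B, C = wire_labels(), wire_labels(), wire_labels()
    table = [None] * 4
    for x in (0, 1):
        for y in (0, 1):
            # Position in the table is determined by the (random) permute bits.
            pos = (A[x][0] >> 7) * 2 + (B[y][0] >> 7)
            table[pos] = enc(A[x], B[y], C[x & y])
    return A, B, C, table

def evaluate(table, label_a, label_b):
    # The evaluator reads the MSBs to pick exactly one ciphertext to decrypt.
    pos = (label_a[0] >> 7) * 2 + (label_b[0] >> 7)
    return enc(label_a, label_b, table[pos])
```

Holding one label per input wire, the evaluator recovers exactly one output label and learns no underlying bit.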
## Protocol Description
> 1. Alice garbles the circuit, generating garbled values and gates.
> 2. Garbled gate tables and the garbled values of Alice's inputs are sent to Bob.
> 3. For Bob's input wire $B$, Alice and Bob run a 1-out-of-2 OT protocol.
> - Alice provides $B_0$ and $B_1$ to the OT.
> - Bob inputs his input bit $b$ to the OT, and Bob now has $B_b$.
> 4. Bob has garbled values for all input wires, so he evaluates the circuit.
> 5. Bob sends the final garbled output to Alice.
> 6. Alice uses the output translation table to recover the final result bit.
Note that OT can be done in *parallel*, reducing the round complexity.
### Why is OT Necessary?

Suppose Alice gave both $B_0$ and $B_1$ to Bob. Bob doesn't know which one represents $0$ or $1$, but he can just run the evaluation for both inputs.

Suppose we have a $2$-input AND gate $C = \rm{AND}(A, B)$. Bob already has $A_x$ from Alice, so he evaluates for both $B_0$ and $B_1$, obtaining $C_{x \land 0}$ and $C_{x \land 1}$. If these are the same, Bob learns that $x = 0$. If they differ, $x = 1$.

So we need an OT to make sure that Bob only learns one of the garbled values.
## Summary of Yao's Protocol

Let $f$ be a given public function that Alice and Bob want to compute, in circuit representation. Let $(x_1, \dots, x_n)$ and $(y_1, \dots, y_m)$ be inputs provided by Alice and Bob, respectively.

Alice generates a garbled circuit $G(f)$ by assigning garbled values to each wire. She then gives Bob $G(f)$ and the garbled values of her inputs. Then Alice and Bob run several OTs in parallel for the garbled values of Bob's inputs.

Bob computes $G(f)$ and obtains a key of $f(x_1, \dots, x_n, y_1, \dots, y_m)$, which is sent to Alice, and Alice recovers the final result.
## Proof of Security (Semi-honest)
In the OT-hybrid model, we assume an ideal OT. In this case, Alice receives no messages, so her view is easy to simulate.
This case is harder to show. The simulator must construct a fake garbled circuit that is indistinguishable from the real one. But the simulator doesn't know Alice's inputs, so it cannot generate a real circuit.

Bob's view contains his inputs $(y_1, \dots, y_m)$ and the final output $z = (z_1, \dots, z_k)$. Thus, the simulator generates a fake garbled circuit that **always** outputs $z$. To do this, the garbled values for the wires can be chosen randomly and used as encryption keys. But the encrypted message is fixed to the (intermediate) output. For instance, the gate table consists of $E\big( A_i \pll B_j, C_0 \big)$ for a fixed $C_0$. In this way, the simulator can control the values of output wires and get $z$ for the final output.

The output translation tables can be generated using this method. An entry of the table would be $(z_i, C_0)$ where $C_0$ is the garbled value used for generating the gate table. As for $1 - z_i$, any random garbled value can be used.

Lastly, for communicating garbled values, Alice's input wires can be set to any two garbled values of the wire. Bob's input wires should be simulated by the simulator of the OT, which will result in any one of the two values on the wire.
For each wire of the circuit, two random *super-seeds* (garbled values) are used.
For example, for input wire $A$, let

$$
A_0 = a_0^1 \pll \cdots \pll a_0^n, \quad A_1 = a_1^1 \pll \cdots \pll a_1^n,
$$

where $a_0^k, a_1^k$ are seeds generated by party $P_k$.

Then for garbling gates, the super-seeds of the output wire are encrypted by the super-seeds of the input wires. As an example, suppose that we use $A_b = a_b^1 \pll \cdots \pll a_b^n$ to encrypt an output value $B$. Then we could use a secure PRG $G$ and set

$$
B \oplus G(a_b^1) \oplus \cdots \oplus G(a_b^n)
$$

as the garbled value.
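The XOR-of-PRG masking above can be sketched as follows. Modeling $G$ by a hash is an assumption for illustration only, not a secure PRG; the point is that a party missing even one seed cannot strip the corresponding pad.

```python
import hashlib

def G(seed: bytes, n_bytes: int = 16) -> bytes:
    # Toy PRG modeled by a hash; a real instantiation needs a secure PRG.
    return hashlib.sha256(b"prg" + seed).digest()[:n_bytes]

def mask(label: bytes, seeds: list) -> bytes:
    # Computes B xor G(a_b^1) xor ... xor G(a_b^n).
    out = label
    for seed in seeds:
        pad = G(seed, len(label))
        out = bytes(x ^ y for x, y in zip(out, pad))
    return out
```

Since XOR is an involution, applying `mask` again with the same seeds recovers the label.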


There are two types of MPC protocols, **generic** and **specific**.
## GMW Protocol

The **Goldreich-Micali-Wigderson** (GMW) **protocol** is designed for evaluating boolean circuits. In particular, it can be used for XOR and AND gates, which correspond to addition and multiplication in $\Z_2$. Thus, the protocol can be generalized for evaluating arbitrary arithmetic circuits.

We assume semi-honest adversaries and static corruption. The GMW protocol is known to be secure against any number of corrupted parties. We also assume that any two parties have private channels for communication.
The protocol can be broken down into $3$ phases.
### Input Phase

Suppose that we have $n$ parties $P_1, \dots, P_n$ with inputs $x_1, \dots, x_n \in \braces{0, 1}$. The inputs are bits, but they can be generalized to inputs over $\Z_q$ where $q$ is prime.

> Each party $P_i$ shares its input with other parties as follows.
>
> 1. Choose random $r_{i, j} \la \braces{0, 1}$ for all $j \neq i$ and send $r_{i, j}$ to $P_j$.
> 2. Set $r_{i, i} = x_i + \sum_{j \neq i} r_{i, j}$.

Then we see that $x_i = \sum_{j = 1}^n r_{i, j}$. Each party $P_j$ has a **share** $r_{i, j}$ of $x_i$. We have a notation for this,

$$
[x_i] = (r_{i, 1}, \dots, r_{i, n}).
$$

It means that $r_{i, 1}, \dots, r_{i, n}$ are shares of $x_i$.

After this phase, each party $P_j$ has $n$ shares $r_{1, j}, \dots, r_{n, j}$, one for each input $x_i$.
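The sharing step above can be sketched directly (over $\Z_2$, so addition and subtraction coincide):

```python
import secrets

def share_bit(x: int, n: int) -> list:
    # P_i sends r_{i,j} to each P_j (j != i) and keeps r_{i,i},
    # chosen so that all n shares sum to x over Z_2.
    r = [secrets.randbits(1) for _ in range(n - 1)]
    r.append((x + sum(r)) % 2)
    return r

def reconstruct(shares: list) -> int:
    return sum(shares) % 2
```

Any $n-1$ of the shares are uniformly random bits, so they reveal nothing about $x$.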
### Evaluation Phase
#### Evaluating XOR Gates

Since XOR is addition in $\Z_2$, each party can simply add all the input shares.
If $y = x_1 + \cdots + x_n$, then party $P_j$ will compute $y_j = \sum_{i=1}^n r_{i, j}$, which is a share of $y$, $[y] = (y_1, \dots, y_n)$. It can be checked that

$$
y = \sum_{j=1}^n y_j = \sum_{j=1}^n \sum_{i=1}^n r_{i, j}.
$$
#### Evaluating AND Gates
AND gates are not as simple as XOR gates. If $c = ab$,

$$
c = \paren{\sum_{i=1}^n a_i} \paren{\sum_{j=1}^n b_j} = \sum_{i=1}^n a_ib_i + \sum_{1 \leq i < j \leq n} (a_ib_j + a_j b_i).
$$
The first term can be computed internally by each party. The problem is the second term. $P_i$ doesn't know the values of $a_j$ and $b_j$. Therefore, we need some kind of interaction between $P_i$ and $P_j$, but no information should be revealed. We can use an OT for this.

> For every pair of parties $(P_i, P_j)$, perform the following.
>
> 1. $P_i$ chooses a random bit $s_{i, j}$ and computes all possible values of $a_ib_j + a_jb_i + s_{i, j}$. These values are used in the OT.
> 2. $P_i$ and $P_j$ run a $1$-out-of-$4$ OT.
> 3. $P_i$ keeps $s_{i, j}$ and $P_j$ receives $a_ib_j + a_jb_i + s_{i, j}$.

- If $a_ib_j + a_jb_i$ is exposed to any party, it reveals information about the other party's share.
- These are bits, so $P_i$ and $P_j$ get to keep a share of $a_ib_j + a_jb_i$. If these aren't bits, then $s_{i, j} - a_ib_j - a_jb_i$ must be computed for inputs to the OT.
- Since $a_j, b_j \in \braces{0, 1}$, it is possible to compute all possible values and use them in the OT. $(a_j, b_j)$ will be used as the choice of $P_j$.
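The cross-term step can be sketched as follows, with the 1-out-of-4 OT idealized as a table lookup (a sketch only; a real run would use an actual OT protocol):

```python
import secrets

def cross_term_shares(a_i: int, b_i: int, a_j: int, b_j: int):
    # P_i's four OT inputs, one per candidate pair (a_j, b_j).
    s = secrets.randbits(1)
    ot_inputs = [(a_i * bj + aj * b_i + s) % 2
                 for aj in (0, 1) for bj in (0, 1)]
    # Ideal 1-out-of-4 OT: P_j selects the entry matching its real (a_j, b_j).
    t = ot_inputs[2 * a_j + b_j]
    return s, t  # additive shares of a_i*b_j + a_j*b_i
```

Neither party sees the cross term itself: $P_i$ keeps only the random mask $s$, and $P_j$ receives only the masked value.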
### Output Phase
## Security Proof

We show the case when there are $n-1$ corrupted parties.[^1] Let $P_i$ be the honest party and assume that all others are corrupted. We will construct a simulator.

Let $(x_1, \dots, x_n)$ be inputs to the function, and let $[y] = (y_1, \dots, y_n)$ be output shares. The adversary's view contains $y$, and all $x_j$, $y_j$ values except for $x_i$ and $y_i$.

To simulate the input phase, choose random shares to be communicated, both for $P_i \ra P_j$ and $P_j \ra P_i$. The shares were chosen randomly, so they are indistinguishable from a real protocol execution.

For the evaluation phase, XOR gates can be computed internally, so we only consider AND gates.

- When $P_j$ is the receiver, choose a random bit as the value learned from the OT. Since the OT outputs have the form $a_ib_j + a_jb_i + s_{i, j}$ with $s_{i, j}$ random, a random bit has the same distribution.
- When $P_j$ is the sender, choose $s_{i, j}$ randomly and compute all $4$ possible values following the protocol.

Lastly, for the output phase, the simulator has to simulate the message $y_i$ from $P_i$. Since the final output $y$ is known and $y_j$ ($j \neq i$) is known, $y_i$ can be computed by the simulator.

We see that the distribution of the values inside the simulator is identical to the view in the real protocol execution.
### Beaver Triple Sharing

When Beaver triples are shared, $[x] = (x_1, x_2)$ and $[y] = (y_1, y_2)$ are chosen so that

$$
\tag{$\ast$}
z = z_1 + z_2 = (x_1 + x_2)(y_1 + y_2) = x_1y_1 + x_1y_2 + x_2y_1 + x_2y_2.
$$

> 1. Each party $P_i$ chooses random bits $x_i, y_i$. Now they must generate $z_1, z_2$ so that the values satisfy equation $(\ast)$ above.
> 2. $P_1$ chooses a random bit $s$ and computes all $4$ possible values of $s + x_1y_2 + x_2y_1$.
> 3. $P_1$ and $P_2$ run a $1$-out-of-$4$ OT.
> 4. $P_1$ keeps $z_1 = s + x_1y_1$, $P_2$ keeps $z_2 = (s + x_1y_2 + x_2y_1) + x_2y_2$.

Indeed, $z_1, z_2$ are shares of $z$.[^2] See also Exercise 23.5.[^3]
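The four steps can be sketched as follows, again idealizing the 1-out-of-4 OT as a table lookup indexed by $P_2$'s bits:

```python
import secrets

def beaver_triple():
    # Each party picks its random triple inputs.
    x1, y1 = secrets.randbits(1), secrets.randbits(1)  # P_1
    x2, y2 = secrets.randbits(1), secrets.randbits(1)  # P_2
    s = secrets.randbits(1)
    # P_1's four OT inputs s + x1*y2' + x2'*y1 over all candidate (x2', y2').
    ot = [(s + x1 * y2c + x2c * y1) % 2
          for x2c in (0, 1) for y2c in (0, 1)]
    z1 = (s + x1 * y1) % 2                   # kept by P_1
    z2 = (ot[2 * x2 + y2] + x2 * y2) % 2     # OT output plus P_2's local term
    return (x1, y1, z1), (x2, y2, z2)
```

By construction $z_1 + z_2 = x_1y_1 + x_1y_2 + x_2y_1 + x_2y_2 = (x_1 + x_2)(y_1 + y_2)$ over $\Z_2$, which is exactly $(\ast)$.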
### Evaluating AND Gates with Beaver Triples
Now, in the actual computation of AND gates, proceed as follows.
![mc-16-beaver-triple.png](../../../assets/img/posts/lecture-notes/modern-cryptography/mc-16-beaver-triple.png)
> Each $P_i$ has a share of inputs $a_i, b_i$ and a Beaver triple $(x_i, y_i, z_i)$.
>
> 1. Each $P_i$ computes $u_i = a_i + x_i$, $v_i = b_i + y_i$.
> 2. $P_i$ sends $u_i, v_i$ to $P_{3-i}$ and receives $u_{3-i}, v_{3-i}$ from $P_{3-i}$.
> 3. Each party now can compute $u = u_1 + u_2$, $v = v_1 + v_2$.
> 4. $P_1$ computes $c_1 = uv + uy_1 + vx_1 + z_1$, $P_2$ computes $c_2 = uy_2 + vx_2 + z_2$.
Note that

$$
\begin{aligned}
c = c_1 + c_2 &= uv + u(y_1 + y_2) + v(x_1 + x_2) + (z_1 + z_2) \\
&= uv + uy + vx + xy \qquad (\because z = xy) \\
&= u(v + y) + x(v + y) \\
&= (u + x)(v + y) = ab
\end{aligned}
$$
and $uv$ is public, so any party can include it in its share.

Also note that $u_i, v_i$ do not reveal any information about the input shares $a_i, b_i$: they are essentially *one-time pad* encryptions, since the triple values $x_i, y_i$ were chosen randomly. No OTs are needed during the actual computation.
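The four steps above can be sketched end to end for two parties, all arithmetic in $\Z_2$ (a sketch: the broadcast of $u_i, v_i$ is simulated by local variables):

```python
def beaver_and(a, b, t1, t2):
    # a = (a1, a2), b = (b1, b2): input shares held by P_1 and P_2.
    (a1, a2), (b1, b2) = a, b
    (x1, y1, z1), (x2, y2, z2) = t1, t2
    # Step 1-2: each party broadcasts u_i = a_i + x_i, v_i = b_i + y_i.
    u1, v1 = (a1 + x1) % 2, (b1 + y1) % 2
    u2, v2 = (a2 + x2) % 2, (b2 + y2) % 2
    # Step 3: both parties reconstruct u and v.
    u, v = (u1 + u2) % 2, (v1 + v2) % 2
    # Step 4: only P_1 adds the public term uv.
    c1 = (u * v + u * y1 + v * x1 + z1) % 2
    c2 = (u * y2 + v * x2 + z2) % 2
    return c1, c2  # shares of ab
```

The only messages exchanged are $u_i, v_i$; the derivation above shows $c_1 + c_2 = ab$.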
### Reusing Beaver Triples?

**Beaver triples are to be used only once!** If $u_1 = a_1 + x_1$ and $u_1' = a_1' + x_1$, then $u_1 + u_1' = a_1 + a_1'$, revealing $a_1 + a_1'$.

Thus, before the online phase, a huge amount of Beaver triples is shared to speed up the computation. This can be done efficiently using [OT extension](../2023-11-16-gmw-protocol/#ot-extension) described below.
## OT Extension

There is a technique called **OT extension**, which allows us to obtain many OTs from a small number of base OTs.
This protocol will extend $n$ OTs to $m$ OTs, where $m \gg n$.

- The sender has inputs $\paren{x_i^0, x_i^1}$ for $i = 1, \dots, m$.
- The receiver has a choice vector $\sigma = (\sigma_1, \dots, \sigma_m) \in \braces{0, 1}^m$.
- After the protocol, the receiver will get $x_i^{\sigma_i}$ for $i = 1, \dots, m$.
> **First phase.**
>
> 1. The receiver samples $n$ random strings $T_1, \dots, T_n \la \braces{0, 1}^m$ of length $m$.
> 2. The receiver prepares pairs $\paren{T_i, T_i \oplus \sigma}$ for $i = 1, \dots, n$ and plays the *sender* in the base OT.
> 3. The sender chooses random $s = (s_1, \dots, s_n) \in \braces{0, 1}^n$.
> 4. The sender plays the *receiver* in the base OT with input $s_i$ for $i = 1, \dots, n$.
In the first phase, the roles are temporarily switched.

- The receiver chose $n$ random $m$-bit vectors, and now has an $m \times n$ bit matrix $T$.
- For the $i$-th base OT, the receiver inputs $T_i$ and $T_i \oplus \sigma$. Therefore, if $s_i = 0$, the sender gets $T_i$. If $s_i = 1$, the sender gets $T_i \oplus \sigma$.
- Suppose that the sender gets $Q_i \in \braces{0, 1}^m$ in the $i$-th base OT. The sender will also have an $m \times n$ bit matrix $Q$.
$$
Q_i = \begin{cases} T_i & (s_i = 0) \\
T_i \oplus \sigma & (s_i = 1).
\end{cases}
$$
**Now consider each row separately!** Let $A[k]$ be the $k$-th row of matrix $A$.

If $\sigma_j = 0$, the XOR operation in $T_i \oplus \sigma$ has no effect on the $j$-th element (row), so the $j$-th elements of $T_i \oplus \sigma$ and $T_i$ are the same. Thus, we have $Q[j] = T[j]$.

On the other hand, suppose that $\sigma_j = 1$ and consider each element of $Q[j]$. The $i$-th element is the $j$-th element of $Q_i$. If $s_i = 0$, then $Q_i = T_i$, so it equals the $j$-th element of $T_i$. If $s_i = 1$, then $Q_i = T_i \oplus \sigma$, so the $j$-th element is flipped. Thus, $Q[j] = T[j] \oplus s$.
$$
Q[j] = \begin{cases} T[j] & (\sigma_j = 0) \\
T[j] \oplus s & (\sigma_j = 1).
\end{cases}
$$
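This row identity can be sanity-checked directly; the base OTs are idealized here as a direct computation of $Q$ from $T$, $s$, and $\sigma$:

```python
import secrets

m, n = 16, 8
sigma = [secrets.randbits(1) for _ in range(m)]  # receiver's choice vector
s = [secrets.randbits(1) for _ in range(n)]      # sender's random choices
T = [[secrets.randbits(1) for _ in range(m)] for _ in range(n)]  # columns T_i

# Base OTs: the sender learns Q_i = T_i if s_i = 0, else T_i xor sigma.
Q = [[T[i][j] ^ (s[i] & sigma[j]) for j in range(m)] for i in range(n)]

# Row view: Q[j] = T[j] when sigma_j = 0, and T[j] xor s when sigma_j = 1.
for j in range(m):
    T_row = [T[i][j] for i in range(n)]
    Q_row = [Q[i][j] for i in range(n)]
    if sigma[j] == 0:
        assert Q_row == T_row
    else:
        assert Q_row == [t ^ si for t, si in zip(T_row, s)]
```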
> **Second phase.** To perform the $j$-th transfer $(j = 1, \dots, m)$,
>
> 1. The sender sends $y_j^0 = H(j, Q[j]) \oplus x_j^0$ and $y_j^1 = H(j, Q[j] \oplus s) \oplus x_j^1$.
> 2. The receiver computes $H(j, T[j]) \oplus y_j^{\sigma_j}$.
If $\sigma_j = 0$, then the receiver gets

$$
H(j, T[j]) \oplus y_j^0 = H(j, T[j]) \oplus H(j, Q[j]) \oplus x_j^0 = x_j^0.
$$

If $\sigma_j = 1$,

$$
H(j, T[j]) \oplus y_j^1 = H(j, T[j]) \oplus H(j, Q[j] \oplus s) \oplus x_j^1 = x_j^1.
$$
We have just shown correctness.
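Both phases can be sketched end to end. $H$ is modeled here by SHA-256 and the base OTs are idealized, both assumptions for illustration:

```python
import hashlib
import secrets

def H(j: int, row: list) -> bytes:
    # Hash used as a mask, modeled as a random oracle.
    return hashlib.sha256(j.to_bytes(4, "big") + bytes(row)).digest()[:16]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(u ^ v for u, v in zip(a, b))

m, n = 8, 4
x = [(secrets.token_bytes(16), secrets.token_bytes(16)) for _ in range(m)]  # sender inputs
sigma = [secrets.randbits(1) for _ in range(m)]                             # receiver choices
T = [[secrets.randbits(1) for _ in range(m)] for _ in range(n)]
s = [secrets.randbits(1) for _ in range(n)]
Q = [[T[i][j] ^ (s[i] & sigma[j]) for j in range(m)] for i in range(n)]     # after phase one

for j in range(m):
    T_row = [T[i][j] for i in range(n)]
    Q_row = [Q[i][j] for i in range(n)]
    # Sender masks both inputs; Q[j] xor s equals T[j] exactly when sigma_j = 1.
    y0 = xor(H(j, Q_row), x[j][0])
    y1 = xor(H(j, [q ^ si for q, si in zip(Q_row, s)]), x[j][1])
    # Receiver unmasks the one message it can.
    assert xor(H(j, T_row), y0 if sigma[j] == 0 else y1) == x[j][sigma[j]]
```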
### Security Proof of OT Extension
Intuitively, the sender receives either $T_i$ or $T_i \oplus \sigma$. But the $T_i$ are chosen randomly, so they hide $\sigma$, revealing no information.

As for the receiver, the values $(x_j^0, x_j^1)$ are masked by a hash function, namely $H(j, Q[j])$ and $H(j, Q[j] \oplus s)$. The receiver can compute $H(j, T[j])$, which equals only one of them; since the receiver has no information about $s$, it cannot compute the other mask.
### Performance of OT Extension
Hence, with OT extensions, we can perform millions of OTs efficiently, which can be used especially for computing many Beaver triples during preprocessing.
[^1]: Intuitively, it may seem that proving security for $n-1$ corrupted parties would be the hardest. However, security for $n-1$ corrupted parties does not imply security for $n-2$ corrupted parties, in general.

[^2]: There is a variant of sharing Beaver triples, where a dealer generates all $x_i, y_i, z_i$ and gives them to each party.

[^3]: A Graduate Course in Applied Cryptography.


This is a sample scheme, which is insecure.
> Choose parameters $n$ and $q$ as security parameters.
>
> 1. Set secret key $\bf{s} = (s_1, \dots, s_n) \in \Z^n$.
> 2. For message $m \in \Z_q$, encrypt it as follows.
> - Randomly choose $\bf{a} = (a_1, \dots, a_n) \la \Z_q^n$.
> - Compute $b = -\span{\bf{a}, \bf{s}} + m \pmod q$.
> - Output ciphertext $\bf{c} = (b, \bf{a}) \in \Z_q^{n+1}$.
> 3. To decrypt $\bf{c}$, compute $m = b + \span{\bf{a}, \bf{s}} \pmod q$.
Correctness is trivial. Also, this encryption algorithm has the *additive homomorphism* property. If $b_1, b_2$ are encryptions of $m_1, m_2$, then Correctness is trivial. Also, this encryption algorithm has the *additive homomorphism* property. If $b _ 1, b _ 2$ are encryptions of $m _ 1, m _ 2$, then
$$ $$
b_1 = -\span{\bf{a}_1, \bf{s}} + m_1, \quad b_2 = -\span{\bf{a}_2, \bf{s}} + m_2 b _ 1 = -\span{\bf{a} _ 1, \bf{s}} + m _ 1, \quad b _ 2 = -\span{\bf{a} _ 2, \bf{s}} + m _ 2
$$ $$
in $\Z_q$. Thus, in $\Z _ q$. Thus,
$$ $$
b_1 + b_2 = -\span{\bf{a}_1 + \bf{a}_2, \bf{s}} + m_1 + m_2. b _ 1 + b _ 2 = -\span{\bf{a} _ 1 + \bf{a} _ 2, \bf{s}} + m _ 1 + m _ 2.
$$ $$
Decrypting the ciphertext $(b_1 + b_2, \bf{a}_1 + \bf{a}_2)$ will surely give $m_1 + m_2$. Decrypting the ciphertext $(b _ 1 + b _ 2, \bf{a} _ 1 + \bf{a} _ 2)$ will surely give $m _ 1 + m _ 2$.
But this scheme is not secure. After $n$ queries, the plaintext-ciphertext pairs can be transformed into a linear system of equations
@@ -100,16 +100,16 @@ $$
\bf{b} = -A \bf{s} + \bf{m},
$$
where $\bf{a} _ i$ are in the rows of $A$. This system can be solved for $\bf{s}$ with non-negligible probability.[^2]
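As a sanity check, the toy scheme above fits in a few lines of Python. This is a sketch with made-up toy parameters; as just discussed, the scheme is insecure.

```python
import random

n, q = 8, 2**16 + 1   # illustrative toy parameters; this scheme is insecure

def keygen():
    # secret s, sampled here from Z_q^n for simplicity
    return [random.randrange(q) for _ in range(n)]

def encrypt(s, m):
    a = [random.randrange(q) for _ in range(n)]
    b = (-sum(ai * si for ai, si in zip(a, s)) + m) % q
    return (b, a)

def decrypt(s, c):
    b, a = c
    return (b + sum(ai * si for ai, si in zip(a, s))) % q

def add(c1, c2):
    # additive homomorphism: component-wise sum of ciphertexts
    b1, a1 = c1
    b2, a2 = c2
    return ((b1 + b2) % q, [(x + y) % q for x, y in zip(a1, a2)])

s = keygen()
assert decrypt(s, add(encrypt(s, 3), encrypt(s, 5))) == 8
```

The final assertion checks exactly the additive property derived above: the sum of two ciphertexts decrypts to the sum of the messages.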
## Lattice Cryptography
Recall that schemes like RSA and ElGamal rely on the hardness of computational problems. The hardness of those problems makes the schemes secure. There are other (known to be) *hard* problems using **lattices**, and recent homomorphic encryption schemes use **lattice-based** cryptography.
> **Definition.** For $\bf{b} _ i \in \Z^n$ for $i = 1, \dots, n$, let $B = \braces{\bf{b} _ 1, \dots, \bf{b} _ n}$ be a basis. The set
>
> $$
> L = \braces{\sum _ {i=1}^n a _ i\bf{b} _ i : a _ i \in \Z}
> $$
>
> is called a **lattice**. The set $B$ is a basis of $L$.
@@ -128,16 +128,16 @@ for a small error $\bf{e}$, the problem is to find the closest lattice point $B\
It is known that all (including quantum) algorithms for solving BDD have cost $2^{\Omega(n)}$.
This problem is easy when we have a *short* basis, whose vectors are nearly orthogonal (pairwise angles close to $\pi/2$). For example, given $\bf{t}$, find $a _ i \in \R$ such that
$$
\bf{t} = a _ 1 \bf{b} _ 1 + \cdots + a _ n \bf{b} _ n
$$
and return $B\bf{u}$ as
$$
B\bf{u} = \sum _ {i=1}^n \lfloor a _ i \rceil \bf{b} _ i.
$$
Then this $B\bf{u} \in L$ is pretty close to $\bf{t} \notin L$.
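This rounding procedure (Babai's round-off) can be illustrated in two dimensions; the basis vectors and target point below are made-up toy values.

```python
# Babai round-off in 2D: solve t = a1*b1 + a2*b2 over the reals, round the a_i.
b1, b2 = (3, 1), (1, 2)          # a reasonably orthogonal toy basis
t = (7.3, 5.8)                   # target point, not on the lattice

# Cramer's rule for the real coordinates with t = a1*b1 + a2*b2
det = b1[0] * b2[1] - b1[1] * b2[0]
a1 = (t[0] * b2[1] - t[1] * b2[0]) / det
a2 = (b1[0] * t[1] - b1[1] * t[0]) / det

u1, u2 = round(a1), round(a2)    # round each real coordinate to an integer
Bu = (u1 * b1[0] + u2 * b2[0], u1 * b1[1] + u2 * b2[1])
# Bu is a lattice point close to t
```

With a long, skewed basis for the same lattice, the rounded point can land far from $\bf{t}$, which is why the quality of the basis matters.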
@@ -146,28 +146,28 @@ Then this $B\bf{u} \in L$ is pretty close to $\bf{t} \notin L$.
This is the problem we will mainly use for homomorphic schemes.
Let $\rm{LWE} _ {n, q, \sigma}(\bf{s})$ denote the LWE distribution, where
- $n$ is the number of dimensions,
- $q$ is the modulus,
- $\sigma$ is the standard deviation of the error.
Also, $D _ \sigma$ denotes the discrete Gaussian distribution with standard deviation $\sigma$.
> Let $\bf{s} = (s _ 1, \dots, s _ n) \in \Z _ q^n$ be a secret.
>
> - Sample $\bf{a} = (a _ 1, \dots, a _ n) \la \Z _ q^n$ and $e \la D _ \sigma$.
> - Compute $b = \span{\bf{a}, \bf{s}} + e \pmod q$.
> - Output $(b, \bf{a}) \in \Z _ q^{n+1}$.
>
> This is called an **LWE instance**.
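The sampling procedure above can be sketched as follows. As an assumption for illustration only, $D _ \sigma$ is approximated by rounding a continuous Gaussian, and the parameters are toy-sized.

```python
import random

def lwe_sample(s, q, sigma):
    """Return one sample (b, a) from the LWE distribution for secret s."""
    n = len(s)
    a = [random.randrange(q) for _ in range(n)]
    e = round(random.gauss(0, sigma))          # rounded continuous Gaussian
    b = (sum(ai * si for ai, si in zip(a, s)) + e) % q
    return b, a

q, sigma = 3329, 3.2                           # toy parameters
s = [random.randrange(q) for _ in range(16)]
b, a = lwe_sample(s, q, sigma)

# Knowing s, b - <a, s> mod q recovers the small error e (after centering).
e = (b - sum(ai * si for ai, si in zip(a, s))) % q
e = e - q if e > q // 2 else e
assert abs(e) <= 10 * sigma
```

Without $\bf{s}$, the pair $(b, \bf{a})$ looks uniform; that indistinguishability is exactly the decisional problem stated next.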
### Search LWE Problem
> Given many samples from $\rm{LWE} _ {n, q, \sigma}(\bf{s})$, find $\bf{s}$.
### Decisional LWE Problem (DLWE)
> Distinguish the two distributions $\rm{LWE} _ {n, q, \sigma}(\bf{s})$ and $U(\Z _ q^{n+1})$.
It is known that the two versions of the LWE problem are **equivalent** when $q$ is a prime bounded by some polynomial in $n$.
@@ -175,17 +175,17 @@ LWE problem can be turned into **assumptions**, just like the DL and RSA problem
## The BGV Scheme
The **BGV scheme** is by Brakerski-Gentry-Vaikuntanathan (2012). The scheme is defined over the finite field $\Z _ p$ and can perform arithmetic in $\Z _ p$.
> Choose security parameters $n$, $q$ and $\sigma$. It is important that $q$ is chosen as an **odd** integer.
>
> **Key Generation**
> - Set secret key $\bf{s} = (s _ 1, \dots, s _ n) \in \Z^n$.
>
> **Encryption**
> - Sample $\bf{a} \la \Z _ q^n$ and $e \la D _ \sigma$.
> - Compute $b = -\span{\bf{a}, \bf{s}} + m + 2e \pmod q$.
> - Output ciphertext $\bf{c} = (b, \bf{a}) \in \Z _ q^{n+1}$.
>
> **Decryption**
> - Compute $r = b + \span{\bf{a}, \bf{s}} \pmod q$.
@@ -206,16 +206,16 @@ $$
Under the LWE assumption, it can be proven that the scheme is semantically secure, i.e.,
$$
E(\bf{s}, 0) \approx _ c E(\bf{s}, 1).
$$
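A toy sketch of the scheme above, with the final reduction $m = r \bmod 2$ (after centering $r$ into $(-q/2, q/2]$) made explicit; as before, the Gaussian error is a rounded continuous Gaussian, and the parameters are illustrative assumptions.

```python
import random

n, q, sigma = 16, 2**13 - 1, 3.2   # toy parameters; q must be odd

def dot(a, s):
    return sum(x * y for x, y in zip(a, s)) % q

def encrypt(s, m):                 # m in {0, 1}
    a = [random.randrange(q) for _ in range(n)]
    e = round(random.gauss(0, sigma))
    b = (-dot(a, s) + m + 2 * e) % q
    return (b, a)

def decrypt(s, c):
    b, a = c
    r = (b + dot(a, s)) % q
    if r > q // 2:                 # center r into (-q/2, q/2]
        r -= q
    return r % 2                   # m = r mod 2, since r = m + 2e

s = [random.randrange(2) for _ in range(n)]     # binary secret, as used later
c0, c1 = encrypt(s, 0), encrypt(s, 1)
c_add = ((c0[0] + c1[0]) % q,
         [(x + y) % q for x, y in zip(c0[1], c1[1])])
assert decrypt(s, c0) == 0 and decrypt(s, c1) == 1
assert decrypt(s, c_add) == 1      # homomorphic addition: 0 + 1 = 1
```

The last assertion previews the addition rule proved in the next subsection.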
### Addition in BGV
Addition is easy!
> Let $\bf{c} = (b, \bf{a})$ and $\bf{c}' = (b', \bf{a}')$ be encryptions of $m, m' \in \braces{0, 1}$. Then, $\bf{c} _ \rm{add} = \bf{c} + \bf{c}'$ is an encryption of $m + m'$.
*Proof*. Decrypt $\bf{c} _ \rm{add} = (b + b', \bf{a} + \bf{a}')$. If
$$
r = b + \span{\bf{a}, \bf{s}} = m + 2e \pmod q
$$
@@ -230,10 +230,10 @@ $$
then we have
$$
r _ \rm{add} = b + b' + \span{\bf{a} + \bf{a}', \bf{s}} = r + r' = m + m' + 2(e + e') \pmod q.
$$
If $\abs{r + r'} < q/2$, then $m + m' = r _ \rm{add} \pmod 2$.
### Multiplication in BGV
@@ -241,10 +241,10 @@ If $\abs{r + r'} < q/2$, then $m + m' = r_\rm{add} \pmod 2$.
For multiplication, we need **tensor products**.
> **Definition.** Let $\bf{a} = (a _ 1, \dots, a _ n)^\top, \bf{b} = (b _ 1, \dots, b _ n)^\top$ be vectors. Then the **tensor product** $\bf{a} \otimes \bf{b}$ is a vector with $n^2$ dimensions such that
>
> $$
> \bf{a} \otimes \bf{b} = \big( a _ i \cdot b _ j \big) _ {1 \leq i, j \leq n}.
> $$
We will use the following property.
@@ -255,12 +255,12 @@ We will use the following property.
> \span{\bf{a}, \bf{b}} \cdot \span{\bf{c}, \bf{d}} = \span{\bf{a} \otimes \bf{c}, \bf{b} \otimes \bf{d}}.
> $$
*Proof*. Denote the components as $a _ i, b _ i, c _ i, d _ i$.
$$
\begin{aligned}
\span{\bf{a} \otimes \bf{c}, \bf{b} \otimes \bf{d}} &= \sum _ {i=1}^n\sum _ {j=1}^n a _ ic _ j \cdot b _ id _ j \\
&= \paren{\sum _ {i=1}^n a _ ib _ i} \paren{\sum _ {j=1}^n c _ j d _ j} = \span{\bf{a}, \bf{b}} \cdot \span{\bf{c}, \bf{d}}.
\end{aligned}
$$
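The identity is easy to verify numerically; the vectors below are arbitrary examples.

```python
from itertools import product

def dot(x, y):
    return sum(u * v for u, v in zip(x, y))

def tensor(x, y):
    # n^2-dimensional vector (x_i * y_j), row-major order
    return [xi * yj for xi, yj in product(x, y)]

a, b = [1, 2, 3], [4, 5, 6]
c, d = [7, 8, 9], [1, 0, 2]
# <a, b> * <c, d> == <a tensor c, b tensor d>
assert dot(a, b) * dot(c, d) == dot(tensor(a, c), tensor(b, d))
```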
@@ -281,26 +281,26 @@ $$
we have that
$$
r _ \rm{mul} = rr' = (m + 2e)(m' + 2e') = mm' + 2e\conj \pmod q.
$$
So $mm' = r _ \rm{mul} \pmod 2$ if $e\conj$ is small.
However, to compute $r _ \rm{mul} = rr'$ from the ciphertext,
$$
\begin{aligned}
r _ \rm{mul} &= rr' = (b + \span{\bf{a}, \bf{s}})(b' + \span{\bf{a}', \bf{s}}) \\
&= bb' + \span{b\bf{a}' + b' \bf{a}, \bf{s}} + \span{\bf{a} \otimes \bf{a}', \bf{s} \otimes \bf{s}}.
\end{aligned}
$$
Thus we define $\bf{c} _ \rm{mul} = (bb', b\bf{a}' + b' \bf{a}, \bf{a} \otimes \bf{a}')$; then this can be decrypted with $(1, \bf{s}, \bf{s} \otimes \bf{s})$ by the above equation.
> Let $\bf{c} = (b, \bf{a})$ and $\bf{c}' = (b', \bf{a}')$ be encryptions of $m, m'$. Then,
>
> $$
> \bf{c} _ \rm{mul} = \bf{c} \otimes \bf{c}' = (bb', b\bf{a}' + b' \bf{a}, \bf{a} \otimes \bf{a}')
> $$
>
> is an encryption of $mm'$ with $(1, \bf{s}, \bf{s} \otimes \bf{s})$.
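Putting the pieces together, tensor-product multiplication and decryption under $(1, \bf{s}, \bf{s} \otimes \bf{s})$ can be sketched as follows. Toy parameters are assumed, chosen so that the product noise $rr'$ stays well below $q/2$.

```python
import random
from itertools import product

n, q, sigma = 8, 2**15 - 1, 3.0    # toy parameters; q odd

def dot(x, y):
    return sum(u * v for u, v in zip(x, y)) % q

def tensor(x, y):
    return [xi * yj for xi, yj in product(x, y)]

def encrypt(s, m):                 # m in {0, 1}
    a = [random.randrange(q) for _ in range(n)]
    e = round(random.gauss(0, sigma))
    return ((-dot(a, s) + m + 2 * e) % q, a)

def mul(c, cp):
    # c_mul = (b*b', b*a' + b'*a, a tensor a')
    b, a = c
    bp, ap = cp
    return ((b * bp) % q,
            [(b * x + bp * y) % q for x, y in zip(ap, a)],
            [t % q for t in tensor(a, ap)])

def decrypt_mul(s, cm):
    # decrypt with the extended key (1, s, s tensor s)
    b0, a1, a2 = cm
    r = (b0 + dot(a1, s) + dot(a2, tensor(s, s))) % q
    if r > q // 2:
        r -= q
    return r % 2

s = [random.randrange(2) for _ in range(n)]
for m, mp in [(0, 0), (0, 1), (1, 1)]:
    assert decrypt_mul(s, mul(encrypt(s, m), encrypt(s, mp))) == m * mp
```

The $n^2$-dimensional third component is exactly the problem the next section addresses.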
@@ -319,58 +319,58 @@ The multiplication described above has two major problems.
### Dimension Reduction
First, we reduce the ciphertext dimension. In the ciphertext $\bf{c} _ \rm{mul} = (bb', b\bf{a}' + b' \bf{a}, \bf{a} \otimes \bf{a}')$, the component $\bf{a} \otimes \bf{a}'$ is causing the problem, since it must be decrypted with $\bf{s} \otimes \bf{s}$.
Observe that the following dot product is calculated during decryption.
$$
\tag{1} \span{\bf{a} \otimes \bf{a}', \bf{s} \otimes \bf{s}} = \sum _ {i = 1}^n \sum _ {j=1}^n a _ i a _ j' s _ i s _ j.
$$
The above expression has $n^2$ terms, so they have to be manipulated. The idea is to replace these terms with encryptions under $\bf{s}$, instead of $\bf{s} \otimes \bf{s}$.
Thus we use encryptions of $s _ i s _ j$ under $\bf{s}$. If we have ciphertexts of $s _ i s _ j$, we can evaluate the expression in $(1)$ since this scheme is *homomorphic*. Then the ciphertext can be decrypted only with $\bf{s}$, as usual. This process is called **relinearization**, and the ciphertexts of $s _ i s _ j$ are called **relinearization keys**.
#### First Attempt
> **Relinearization Keys**: for $1 \leq i, j \leq n$, perform the following.
> - Sample $\bf{u} _ {i, j} \la \Z _ q^{n}$ and $e _ {i, j} \la D _ \sigma$.
> - Compute $v _ {i, j} = -\span{\bf{u} _ {i, j}, \bf{s}} + s _ i s _ j + 2e _ {i, j} \pmod q$.
> - Output $\bf{w} _ {i, j} = (v _ {i, j}, \bf{u} _ {i, j})$.
>
> **Linearization**: given $\bf{c} _ \rm{mul} = (bb', b\bf{a}' + b' \bf{a}, \bf{a} \otimes \bf{a}')$ and $\bf{w} _ {i, j}$ for $1 \leq i, j \leq n$, output the following.
>
> $$
> \bf{c} _ \rm{mul}^\ast = (b _ \rm{mul}^\ast, \bf{a} _ \rm{mul}^\ast) = (bb', b\bf{a}' + b'\bf{a}) + \sum _ {i=1}^n \sum _ {j=1}^n a _ i a _ j' \bf{w} _ {i, j} \pmod q.
> $$
Note that the addition $+$ is the addition of two $(n+1)$-dimensional vectors. By plugging in $\bf{w} _ {i, j} = (v _ {i, j}, \bf{u} _ {i, j})$, we actually have
$$
b _ \rm{mul}^\ast = bb' + \sum _ {i=1}^n \sum _ {j=1}^n a _ i a _ j' v _ {i, j}
$$
and
$$
\bf{a} _ \rm{mul}^\ast = b\bf{a}' + b'\bf{a} + \sum _ {i=1}^n \sum _ {j=1}^n a _ i a _ j' \bf{u} _ {i, j}.
$$
Now we check correctness. $\bf{c} _ \rm{mul}^\ast$ should decrypt to $mm'$ with only $\bf{s}$.
$$
\begin{aligned}
b _ \rm{mul}^\ast + \span{\bf{a} _ \rm{mul}^\ast, \bf{s}} &= bb' + \sum _ {i=1}^n \sum _ {j=1}^n a _ i a _ j' v _ {i, j} + \span{b\bf{a}' + b'\bf{a}, \bf{s}} + \sum _ {i=1}^n \sum _ {j=1}^n a _ i a _ j' \span{\bf{u} _ {i, j}, \bf{s}} \\
&= bb' + \span{b\bf{a}' + b'\bf{a}, \bf{s}} + \sum _ {i=1}^n \sum _ {j=1}^n a _ i a _ j' \paren{v _ {i, j} + \span{\bf{u} _ {i, j}, \bf{s}}}.
\end{aligned}
$$
Since $v _ {i, j} + \span{\bf{u} _ {i, j}, \bf{s}} = s _ i s _ j + 2e _ {i, j} \pmod q$, the above expression further reduces to
$$
\begin{aligned}
&= bb' + \span{b\bf{a}' + b'\bf{a}, \bf{s}} + \sum _ {i=1}^n \sum _ {j=1}^n a _ i a _ j' \paren{s _ i s _ j + 2e _ {i, j}} \\
&= bb' + \span{b\bf{a}' + b'\bf{a}, \bf{s}} + \span{\bf{a} \otimes \bf{a}', \bf{s} \otimes \bf{s}} + 2\sum _ {i=1}^n\sum _ {j=1}^n a _ i a _ j' e _ {i, j} \\
&= rr' + 2e\conj \pmod q,
\end{aligned}
$$
@@ -380,57 +380,57 @@ and we have an encryption of $mm'$.
However, we require that
$$
e\conj = \sum _ {i=1}^n \sum _ {j=1}^n a _ i a _ j' e _ {i, j} \ll q
$$
for correctness. It is highly unlikely that this relation holds, since the coefficients $a _ i a _ j'$ will be large. They are random elements of $\Z _ q$ after all, so the size of the sum is about $\mc{O}(n^2 q)$.
#### Relinearization
We use a method to make $a _ i a _ j'$ smaller. The idea is to use the binary representation.
Let $a[k] \in \braces{0, 1}$ denote the $k$-th least significant bit of $a \in \Z _ q$. Then we can write
$$
a = \sum _ {0\leq k<l} 2^k \cdot a[k]
$$
where $l = \ceil{\log q}$. Then we have
$$
a _ i a _ j' s _ i s _ j = \sum _ {0\leq k <l} (a _ i a _ j')[k] \cdot 2^k s _ i s _ j,
$$
so instead of encryptions of $s _ i s _ j$, we use encryptions of $2^k s _ i s _ j$.
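The bit decomposition used here is just the standard binary expansion; a minimal sketch (the modulus $97$ is an arbitrary toy value):

```python
from math import ceil, log2

q = 97
l = ceil(log2(q))                 # l = ceil(log q) bits

def bits(a):
    """a[k] for 0 <= k < l: the k-th least significant bit of a."""
    return [(a >> k) & 1 for k in range(l)]

a = 73                            # 73 = 1 + 8 + 64
assert bits(a) == [1, 0, 0, 1, 0, 0, 1]
assert sum(2**k * ak for k, ak in enumerate(bits(a))) == a
```

Each coefficient $a _ {i, j}[k]$ is now a bit, which is what keeps the relinearization noise small below.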
For convenience, let $a _ {i, j} = a _ i a _ j'$. Now we have triple indices including $k$.
> **Relinearization Keys**: for $1 \leq i, j \leq n$ and $0 \leq k < \ceil{\log q}$, perform the following.
> - Sample $\bf{u} _ {i, j, k} \la \Z _ q^{n}$ and $e _ {i, j, k} \la D _ \sigma$.
> - Compute $v _ {i, j, k} = -\span{\bf{u} _ {i, j, k}, \bf{s}} + 2^k \cdot s _ i s _ j + 2e _ {i, j, k} \pmod q$.
> - Output $\bf{w} _ {i, j, k} = (v _ {i, j, k}, \bf{u} _ {i, j, k})$.
>
> **Linearization**: given $\bf{c} _ \rm{mul} = (bb', b\bf{a}' + b' \bf{a}, \bf{a} \otimes \bf{a}')$ and $\bf{w} _ {i, j, k}$ for $1 \leq i, j \leq n$ and $0 \leq k < \ceil{\log q}$, output the following.
>
> $$
> \bf{c} _ \rm{mul}^\ast = (b _ \rm{mul}^\ast, \bf{a} _ \rm{mul}^\ast) = (bb', b\bf{a}' + b'\bf{a}) + \sum _ {i=1}^n \sum _ {j=1}^n \sum _ {k=0}^{\ceil{\log q} - 1} a _ {i, j}[k] \bf{w} _ {i, j, k} \pmod q.
> $$
Correctness can be checked similarly. The bounds for summations are omitted for brevity; they range over $1 \leq i, j \leq n$ and $0 \leq k < \ceil{\log q}$.
$$
\begin{aligned}
b _ \rm{mul}^\ast + \span{\bf{a} _ \rm{mul}^\ast, \bf{s}} &= bb' + \sum _ {i, j, k} a _ {i, j}[k] \cdot v _ {i, j, k} + \span{b\bf{a}' + b'\bf{a}, \bf{s}} + \sum _ {i, j, k} a _ {i, j}[k] \cdot \span{\bf{u} _ {i, j, k}, \bf{s}} \\
&= bb' + \span{b\bf{a}' + b'\bf{a}, \bf{s}} + \sum _ {i, j, k} a _ {i, j}[k] \paren{v _ {i, j, k} + \span{\bf{u} _ {i, j, k}, \bf{s}}}.
\end{aligned}
$$
Since $v _ {i, j, k} + \span{\bf{u} _ {i, j, k}, \bf{s}} = 2^k \cdot s _ i s _ j + 2e _ {i, j, k} \pmod q$, the above expression further reduces to
$$
\begin{aligned}
&= bb' + \span{b\bf{a}' + b'\bf{a}, \bf{s}} + \sum _ {i, j, k} a _ {i, j}[k] \paren{2^k \cdot s _ i s _ j + 2e _ {i, j, k}} \\
&= bb' + \span{b\bf{a}' + b'\bf{a}, \bf{s}} + \sum _ {i, j} a _ {i, j}s _ i s _ j + 2\sum _ {i, j, k} a _ {i, j}[k] \cdot e _ {i, j, k} \\
&= bb' + \span{b\bf{a}' + b'\bf{a}, \bf{s}} + \span{\bf{a} \otimes \bf{a}', \bf{s} \otimes \bf{s}} + 2e\conj \\
&= rr' + 2e\conj \pmod q,
\end{aligned}
$$
@@ -439,10 +439,10 @@ $$
and we have an encryption of $mm'$. In this case,
$$
e\conj = \sum _ {i=1}^n\sum _ {j=1}^n \sum _ {k=0}^{\ceil{\log q} - 1} a _ {i, j}[k] \cdot e _ {i, j, k}
$$
is small enough to use, since $a _ {i, j}[k] \in \braces{0, 1}$. The size is about $\mc{O}(n^2 \log q)$, which is a lot smaller than $q$ for practical uses. We have reduced $n^2 q$ to $n^2 \log q$ with this method.
### Noise Reduction
@@ -452,42 +452,42 @@ $$
\abs{r} = \abs{m + 2e} < \frac{1}{2}q.
$$
But for multiplication, $\abs{r _ \rm{mul}} = \abs{rr' + 2e\conj}$, so the noise grows very fast. If the initial noise size was $N$, then after $L$ levels of multiplication, the noise is now $N^{2^L}$.[^3] To reduce noise, we use **modulus switching**.
Given $\bf{c} = (b, \bf{a}) \in \Z _ q^{n+1}$, we reduce the modulus to $q' < q$, which results in a smaller noise $e'$. This can be done by scaling $\bf{c}$ by $q'/q$ and rounding it.
> **Modulus Switching**: let $\bf{c} = (b, \bf{a}) \in \Z _ q^{n+1}$ be given.
>
> - Find $b'$ closest to $b \cdot (q' /q)$ such that $b' = b \pmod 2$.
> - Find $a _ i'$ closest to $a _ i \cdot (q'/q)$ such that $a _ i' = a _ i \pmod 2$.
> - Output $\bf{c}' = (b', \bf{a}') \in \Z _ {q'}^{n+1}$.
In summary, $\bf{c}' \approx \bf{c} \cdot (q'/q)$, and $\bf{c}' = \bf{c} \pmod 2$ component-wise.
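The parity-preserving rounding step can be sketched per component as follows; the moduli are made-up toy values (both odd, as the scheme requires), and $x$ is taken as an integer representative in $[0, q)$.

```python
def switch(x, q, qp):
    """Integer closest to x * qp / q with the same parity as x."""
    t = x * qp / q
    y = round(t)
    if (y - x) % 2 != 0:            # wrong parity: step to the nearest
        y += 1 if t > y else -1     # same-parity integer (distance <= 1)
    return y

q, qp = 1001, 143                   # toy odd moduli, qp = q / 7
b = 701
bp = switch(b, q, qp)               # b' ~ b * (q'/q), with b' = b (mod 2)
assert bp % 2 == b % 2
assert abs(bp - b * qp / q) <= 1
```

Each component thus moves by at most $1$ from the exact scaling $x \cdot (q'/q)$, which is where the $\epsilon _ i$ error terms in the correctness proof below come from.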
We check that the noise has been reduced and that decryption results in the same message $m$. Decryption of $\bf{c}'$ is done by $r' = b' + \span{\bf{a}', \bf{s}} \pmod{q'}$, so we must prove that $r' \approx r \cdot (q'/q)$ and $r' = r \pmod 2$. Then the noise is scaled down by $q'/q$ and the message is preserved.
Let $k \in \Z$ such that $b + \span{\bf{a}, \bf{s}} = r + kq$. By the choice of $b'$ and $a _ i'$,
$$
b' = b \cdot (q'/q) + \epsilon _ 0, \quad a _ i' = a _ i \cdot (q'/q) + \epsilon _ i
$$
for some $\abs{\epsilon _ i} \leq 1$. Then
$$
\begin{aligned}
b' + \span{\bf{a}', \bf{s}} &= b' + \sum _ {i=1}^n a _ i's _ i \\
&= b \cdot (q'/q) + \epsilon _ 0 + \sum _ {i=1}^n \paren{a _ i \cdot (q'/q) + \epsilon _ i} s _ i \\
&= (q'/q) \paren{b + \sum _ {i=1}^n a _ i s _ i} + \epsilon _ 0 + \sum _ {i=1}^n \epsilon _ i s _ i \\
&= (q'/q) \cdot (r + kq) + \epsilon _ 0 + \sum _ {i=1}^n \epsilon _ i s _ i \\
&= r \cdot (q'/q) + \epsilon _ 0 + \sum _ {i=1}^n \epsilon _ i s _ i + kq'.
\end{aligned}
$$
We additionally assume that $\bf{s} \in \Z _ 2^n$; then the error term is bounded by $n+1$, and $n \ll q$.[^4] Set
$$
r' = r \cdot (q'/q) + \epsilon _ 0 + \sum _ {i=1}^n \epsilon _ i s _ i,
$$
then we have $r' \approx r \cdot (q'/q)$.
@@ -502,7 +502,7 @@ Since $q, q'$ are odd, $r = r' \pmod 2$.
### Modulus Chain
Let the initial noise be $\abs{r} \approx N$. Set the maximal level $L$ for multiplication, and set $q _ {L} = N^{L+1}$. Then after each multiplication, switch the modulus to $q _ {k-1} = q _ k/N$ using the above method.
Multiplication increases the noise to $N^2$, and then modulus switching decreases the noise back to $N$, allowing further computation.
@@ -512,27 +512,27 @@ $$
N^{L+1} \ra N^L \ra \cdots \ra N.
$$
When we perform $L$ levels of computation and reach modulus $q _ 0 = N$, we cannot perform any multiplications. We must apply [bootstrapping](../2023-12-08-bootstrapping-ckks/#bootstrapping).
Note that without modulus switching, we need $q _ L > N^{2^L}$ for $L$ levels of computation, which is very large. Since we want $q$ to be small (for the hardness of the LWE problem), modulus switching is necessary. We now only require $q _ L > N^{L+1}$.
### Multiplication in BGV (Summary)
- Set up a modulus chain $q _ k = N^{k+1}$ for $k = 0, \dots, L$.
- Given two ciphertexts $\bf{c} = (b, \bf{a}) \in \Z _ {q _ k}^{n+1}$ and $\bf{c}' = (b', \bf{a}') \in \Z _ {q _ k}^{n+1}$ with modulus $q _ k$ and noise $N$:
- (**Tensor Product**) $\bf{c} _ \rm{mul} = \bf{c} \otimes \bf{c}' \pmod{q _ k}$.
  - Now we have $n^2$ dimensions and noise $N^2$.
- (**Relinearization**)
  - Back to $n$ dimensions and noise $N^2$.
- (**Modulus Switching**)
  - Modulus is switched to $q _ {k-1}$ and noise is back to $N$.
## BGV Generalizations and Optimizations
### From $\Z _ 2$ to $\Z _ p$
The above description is for messages $m \in \braces{0, 1} = \Z _ 2$. This can be extended to any finite field $\Z _ p$: replace $2$ with $p$ in the scheme. Then encryption of $m \in \Z _ p$ is done as
$$
b = -\span{\bf{a}, \bf{s}} + m + pe \pmod q,
$$
@@ -542,7 +542,7 @@ and we have $r = b + \span{\bf{a}, \bf{s}} = m + pe$, $m = r \pmod p$.
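The $\Z _ p$ variant can be sketched as a toy program. Everything concrete here ($n$, $q$, $p$, the noise range) is an illustrative assumption, far too small to be secure:

```python
import random

# Toy LWE-based encryption over Z_p: b = -<a, s> + m + p*e (mod q).
n, q, p = 16, 2 ** 20, 257

def keygen():
    return [random.randint(0, 1) for _ in range(n)]      # s in {0, 1}^n

def encrypt(s, m):
    a = [random.randrange(q) for _ in range(n)]
    e = random.randint(0, 4)                              # small noise
    b = (-sum(ai * si for ai, si in zip(a, s)) + m + p * e) % q
    return (b, a)

def decrypt(s, ct):
    b, a = ct
    r = (b + sum(ai * si for ai, si in zip(a, s))) % q    # r = m + p*e
    return r % p                                          # m = r (mod p)

s = keygen()
assert decrypt(s, encrypt(s, 123)) == 123
```

Decryption works as long as $m + pe < q$, which the toy noise range guarantees.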
### Packing Technique
Based on the Ring LWE problem, the plaintext space can be extended from $\Z _ p$ to $\Z _ p^n$ by using **polynomials**.
With this technique, the number of linearization keys is reduced from $n^2 \log q$ to $\mc{O}(1)$.
@@ -558,6 +558,6 @@ With this technique, the number of linearization keys is reduced from $n^2 \log
- Parallelization is effective for optimization, since multiplication is basically performing the same operations on different data.
[^1]: A homomorphism is a *confused name changer*. It can map different elements to the same name.
[^2]: The columns $\bf{a} _ i$ are chosen at random, so $A$ is invertible with high probability.
[^3]: Noise: $N \ra N^2 \ra N^4 \ra \cdots \ra N^{2^L}$.
[^4]: This is how $\bf{s}$ is chosen in practice.


@@ -40,18 +40,18 @@ Then $f(\bf{s}) = m$.
Let $\bf{s}' \in \braces{0, 1}^n$ be a new secret key. Generate the **bootstrapping keys**
$$
BK = \braces{\bf{k} _ i} _ {i=1}^n, \qquad \bf{k} _ i = E(\bf{s}', s _ i).
$$
Then by the homomorphic property of $f$,
$$
f(\bf{k} _ 1, \bf{k} _ 2, \dots, \bf{k} _ n) = f\big( E(\bf{s}', s _ 1), \dots, E(\bf{s}', s _ n) \big) = E\big( \bf{s}', f(s _ 1, \dots, s _ n) \big) = E(\bf{s}', m).
$$
#### Example with BGV
Technically, the expression $f(\bf{k} _ 1, \bf{k} _ 2, \dots, \bf{k} _ n)$ doesn't even type-check, since $f$ takes bits rather than ciphertexts, but it works. Consider a message $m$ encrypted with secret $\bf{s}$ in the BGV scheme.
$$
\bf{c} = (b, \bf{a}), \quad b = -\span{\bf{a}, \bf{s}} + m + 2e \pmod q.
@@ -60,15 +60,15 @@ $$
Decryption computes $r = b + \span{\bf{a}, \bf{s}} \pmod q$ and then takes the least significant bit. Consider it as a function
$$
f(\bf{s}) = b + \span{\bf{a}, \bf{s}} = b + \sum _ {i=1}^n a _ i s _ i.
$$
For a new key $\bf{s}' = (s _ 1', \dots, s _ n')$, generate bootstrapping keys $\bf{k} _ i = E(\bf{s}', s _ i)$; forcefully plugging them in gives
$$
\begin{aligned}
f(\bf{k} _ 1, \dots, \bf{k} _ n) &= b + \sum _ {i=1}^n a _ i E(\bf{s}', s _ i) = b + \sum _ {i=1}^n E(\bf{s}', a _ i s _ i) \\
&= b + E\paren{\bf{s}', \sum _ {i=1}^n a _ i s _ i} = b + E\paren{\bf{s}', \span{\bf{a}, \bf{s}}}.
\end{aligned}
$$
@@ -81,7 +81,7 @@ b' &=b -\span{\bf{a}', \bf{s}'} + \span{\bf{a}, \bf{s}} + 2e' \\
\end{aligned}
$$
Indeed, decrypting $b'$ will give $m$. So we have $E(\bf{s}', m)$ from $f(\bf{k} _ 1, \dots, \bf{k} _ n)$.[^1]
### Bootstrapping Procedure
@@ -89,13 +89,13 @@ Indeed, decrypting $b'$ will give $m$. So we have $E(\bf{s}', m)$ from $f(\bf{k}
>
> **Bootstrapping Key Generation**
> - Choose a new secret key $\bf{s}' \in \braces{0, 1}^n$.
> - Generate *bootstrapping keys* $BK = \braces{\bf{k} _ i} _ {i=1}^n$ where $\bf{k} _ i = E(\bf{s}', s _ i)$.
>
> **Bootstrapping**
> - Generate a circuit representation $f : \braces{0, 1}^n \ra \braces{0, 1}$ of the decryption function $D(\cdot, \bf{c})$.
> - Compute and output $\bf{c}' = f(\bf{k} _ 1, \dots, \bf{k} _ n)$.
The bootstrapping procedure returns an encryption of $m$ under $\bf{s}'$, as shown above. The key idea here is that the $\bf{k} _ i$ are *fresh* ciphertexts at level $L$. Even though a few levels are consumed during the evaluation of $f$, the resulting ciphertext $\bf{c}'$ is no longer at level $0$, allowing us to do more computation.
> Suppose that the homomorphic evaluation of $f$ requires depth $d$, consuming $d$ levels. Then we say that the BGV scheme is **bootstrappable** if $d < L$. The output ciphertext $\bf{c}'$ will have level $l = L - d > 0$, which we call the **remaining level**.
@@ -111,13 +111,13 @@ $$
\bf{s} \ra \bf{s}' \ra \bf{s}'' \ra \cdots
$$
Currently, we set $\bf{s}' = \bf{s}$ and make the chain **circular**, so the bootstrapping keys are $E(\bf{s}, s _ i)$: $\bf{s}$ is encrypted under itself. There is no known proof that this is secure; it is taken as an assumption called the **circular security assumption**.
Designing an FHE scheme without the circular security assumption is currently an open problem.
## CKKS Scheme
The [BGV scheme](../2023-11-23-bgv-scheme/#the-bgv-scheme) operates on $\Z _ p$, so it doesn't work on real numbers. The **Cheon-Kim-Kim-Song** (CKKS) scheme works on real numbers using approximate computation.
### Approximate Computation
@@ -129,7 +129,7 @@ $$
Here, $2.9979$ is the **significand**, $10$ is the base and $8$ is the exponent. We also call $10^8$ the **scaling factor**.
Floating point operations involve **rounding**, but rounding is not easy in homomorphic encryption. Using the BGV scheme on $\Z _ p$, there are two methods to do this.
- Bit-wise Encryption
    - A $32$-bit integer results in $32$ ciphertexts.
@@ -139,7 +139,7 @@ Floating point operations involve **rounding**, but rounding is not easy in homo
- Integer Encryption
    - To encrypt the significand, use a modulus large enough, such as $p > 2^{32}$.
    - For multiplication, use $p > 2^{64}$.
    - But rounding is hard in $\Z _ p$.
So our wish is to design an HE scheme that natively supports the rounding operation!
@@ -150,12 +150,12 @@ In the LWE problem, error was added for security. This can be exploited, since c
> Let $n, q, \sigma$ be parameters for LWE and set a scaling factor $\Delta > 0$.
>
> **Key Generation**
> - A secret key is chosen as $\bf{s} = (s _ 1, \dots, s _ n) \in \braces{0, 1}^n$, with its linearization gadget.
>
> **Encryption**: message $m \in \R$.
> - Randomly sample $\bf{a} = (a _ 1, \dots, a _ n) \la \Z _ q^n$ and $e \la D _ \sigma$.
> - Compute $b = -\span{\bf{a}, \bf{s}} + \round{\Delta \cdot m} + e \pmod q$.
> - Output ciphertext $\bf{c} = (b, \bf{a}) \in \Z _ q^{n+1}$.
>
> **Decryption**
> - Compute $\mu = b + \span{\bf{a}, \bf{s}} \pmod q$.
@@ -191,18 +191,18 @@ $$
### Addition in CKKS
> Let $\bf{c} = (b, \bf{a})$ and $\bf{c}' = (b', \bf{a}')$ be encryptions of $m, m' \in \R$. Then, $\bf{c} _ \rm{add} = \bf{c} + \bf{c}'$ is an encryption of $m + m'$.
*Proof*. Decrypt $\bf{c} _ \rm{add} = (b + b', \bf{a} + \bf{a}')$.
$$
\mu _ \rm{add} = \mu + \mu' = (b + b') + \span{\bf{a} + \bf{a}', \bf{s}} \pmod q.
$$
If $\abs{\mu + \mu'} < q/2$, then
$$
\mu _ \rm{add} = \mu + \mu' \approx \Delta \cdot (m + m'),
$$
so the decryption results in $\Delta\inv \cdot (\mu + \mu') \approx m + m'$.
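The scheme and the addition property just proved can be put together in a toy sketch; the parameters and noise range are assumptions chosen for readability, not for security:

```python
import random

# Toy CKKS-style encryption of reals via a scaling factor Delta.
n, q, Delta = 16, 2 ** 40, 2 ** 20

def keygen():
    return [random.randint(0, 1) for _ in range(n)]

def encrypt(s, m):
    a = [random.randrange(q) for _ in range(n)]
    e = random.randint(-8, 8)                             # small noise
    b = (-sum(ai * si for ai, si in zip(a, s)) + round(Delta * m) + e) % q
    return (b, a)

def decrypt(s, ct):
    b, a = ct
    mu = (b + sum(ai * si for ai, si in zip(a, s))) % q
    if mu >= q // 2:                                      # centered representative
        mu -= q
    return mu / Delta                                     # approximately m

def add(ct1, ct2):
    (b1, a1), (b2, a2) = ct1, ct2
    return ((b1 + b2) % q, [(x + y) % q for x, y in zip(a1, a2)])

s = keygen()
print(decrypt(s, add(encrypt(s, 3.14), encrypt(s, 2.71))))  # ≈ 5.85
```

The decrypted sum is only approximately $5.85$: the noise $e + e'$ survives, but divided by $\Delta$ it perturbs the result by only about $10^{-5}$.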
@@ -214,7 +214,7 @@ We also use [tensor products](../2023-11-23-bgv-scheme/#tensor-product), and the
> Let $\bf{c} = (b, \bf{a})$ and $\bf{c}' = (b', \bf{a}')$ be encryptions of $m, m' \in \R$. Then,
>
> $$
> \bf{c} _ \rm{mul} = \bf{c} \otimes \bf{c}' = (bb', b\bf{a}' + b' \bf{a}, \bf{a} \otimes \bf{a}')
> $$
>
> is an encryption of $mm'$ with $(1, \bf{s}, \bf{s} \otimes \bf{s})$.
@@ -223,7 +223,7 @@ We also use [tensor products](../2023-11-23-bgv-scheme/#tensor-product), and the
$$
\begin{aligned}
\mu _ \rm{mul} &= \mu\mu' = (b + \span{\bf{a}, \bf{s}})(b' + \span{\bf{a}', \bf{s}}) \\
&= bb' + \span{b\bf{a}' + b' \bf{a}, \bf{s}} + \span{\bf{a} \otimes \bf{a}', \bf{s} \otimes \bf{s}} \pmod q
\end{aligned}
$$
@@ -231,10 +231,10 @@ $$
if $\abs{\mu\mu'} < q/2$. Then
$$
\mu _ \rm{mul} = \mu\mu' \approx (\Delta \cdot m) \cdot (\Delta \cdot m') = \Delta^2 \cdot mm'.
$$
So $mm' \approx \Delta^{-2} \cdot \mu _ \rm{mul}$.
We have issues with multiplication, as we did in BGV.
@@ -246,42 +246,42 @@ We have issues with multiplication, as we did in BGV.
The relinearization procedure is almost the same as in [BGV relinearization](../2023-11-23-bgv-scheme/#relinearization).
For convenience, let $a _ {i, j} = a _ i a _ j'$.
> **Relinearization Keys**: for $1 \leq i, j \leq n$ and $0 \leq k < \ceil{\log q}$, perform the following.
> - Sample $\bf{u} _ {i, j, k} \la \Z _ q^{n}$ and $e _ {i, j, k} \la D _ \sigma$.
> - Compute $v _ {i, j, k} = -\span{\bf{u} _ {i, j, k}, \bf{s}} + 2^k \cdot s _ i s _ j + e _ {i, j, k} \pmod q$.
> - Output $\bf{w} _ {i, j, k} = (v _ {i, j, k}, \bf{u} _ {i, j, k})$.
>
> **Linearization**: given $\bf{c} _ \rm{mul} = (bb', b\bf{a}' + b'\bf{a}, \bf{a} \otimes \bf{a}')$ and $\bf{w} _ {i, j, k}$ for $1 \leq i, j \leq n$ and $0 \leq k < \ceil{\log q}$, output the following.
>
> $$
> \bf{c} _ \rm{mul}^\ast = (b _ \rm{mul}^\ast, \bf{a} _ \rm{mul}^\ast) = (bb', b\bf{a}' + b'\bf{a}) + \sum _ {i=1}^n \sum _ {j=1}^n \sum _ {k=0}^{\ceil{\log q} - 1} a _ {i, j}[k] \bf{w} _ {i, j, k} \pmod q.
> $$
Correctness can be checked. The bounds for summations are omitted for brevity; they range over $1 \leq i, j \leq n$ and $0 \leq k < \ceil{\log q}$.
$$
\begin{aligned}
b _ \rm{mul}^\ast + \span{\bf{a} _ \rm{mul}^\ast, \bf{s}} &= bb' + \sum _ {i, j, k} a _ {i, j}[k] \cdot v _ {i, j, k} + \span{b\bf{a}' + b'\bf{a}, \bf{s}} + \sum _ {i, j, k} a _ {i, j}[k] \cdot \span{\bf{u} _ {i, j, k}, \bf{s}} \\
&= bb' + \span{b\bf{a}' + b'\bf{a}, \bf{s}} + \sum _ {i, j, k} a _ {i, j}[k] \cdot \paren{v _ {i, j, k} + \span{\bf{u} _ {i, j, k}, \bf{s}}} \\
&= bb' + \span{b\bf{a}' + b'\bf{a}, \bf{s}} + \sum _ {i, j, k} a _ {i, j}[k] \paren{2^k \cdot s _ i s _ j + e _ {i, j, k}} \\
&= bb' + \span{b\bf{a}' + b'\bf{a}, \bf{s}} + \sum _ {i, j} a _ {i, j} s _ i s _ j + \sum _ {i, j, k} a _ {i, j}[k] \cdot e _ {i, j, k} \\
&= bb' + \span{b\bf{a}' + b'\bf{a}, \bf{s}} + \span{\bf{a} \otimes \bf{a}', \bf{s} \otimes \bf{s}} + e\conj \\
&= \mu _ \rm{mul} + e\conj \pmod q.
\end{aligned}
$$
Since
$$
e\conj = \sum _ {i, j, k} a _ {i, j}[k] \cdot e _ {i, j, k} \ll q,
$$
we have
$$
\mu _ \rm{mul}^\ast = \mu _ \rm{mul} + e\conj \approx \mu\mu' \approx \Delta^2 \cdot mm'.
$$
Note that the proof is identical to that of BGV linearization, except for the missing constant factor $2$ in the error.
@@ -290,12 +290,12 @@ Note that the proof is identical to that of BGV linearization, except for missin
In BGV, we used modulus switching for [noise reduction](../2023-11-23-bgv-scheme/#noise-reduction). It was for reducing the error and preserving the message. We also use modulus switching here, but for a different purpose: since the message can tolerate small numerical errors, we just want to reduce the scaling factor. This operation is called **rescaling**.
Given $\bf{c} = (b, \bf{a}) \in \Z _ q^{n+1}$ such that $b + \span{\bf{a}, \bf{s}} = \mu \pmod q$ and $\mu \approx \Delta^2 \cdot m$, we want to generate a new ciphertext of $m' \approx m$ that has a scaling factor reduced to $\Delta$. This can be done by dividing the ciphertext by $\Delta$ and then rounding it appropriately.
> **Modulus Switching**: let $\bf{c} = (b, \bf{a}) \in \Z _ q^{n+1}$ be given.
>
> - Let $q' = \Delta \inv \cdot q$.[^2]
> - Output $\bf{c}' = \round{\Delta\inv \cdot \bf{c}} \in \Z _ {q'}^{n+1}$.
Note that the modulus has been switched to $q'$. Constant multiplication and rounding are done component-wise on $\bf{c}$.
@@ -304,23 +304,23 @@ We check that $\bf{c}'$ has scaling factor $\Delta$. We know that $\mu' = b' + \
Let $k \in \Z$ such that $b + \span{\bf{a}, \bf{s}} = \mu + kq$. By the choice of $b'$ and $\bf{a}'$, we have
$$
b' = \Delta\inv \cdot b + \epsilon _ 0, \quad a _ i' = \Delta\inv \cdot a _ i + \epsilon _ i
$$
for some $\epsilon _ i$ such that $\abs{\epsilon _ i} \leq 0.5$. So we have
$$
\begin{aligned}
\mu' &= \Delta\inv \cdot \paren{b + \sum _ {i=1}^n a _ i s _ i} + \epsilon _ 0 + \sum _ {i=1}^n \epsilon _ i s _ i \\
&= \Delta\inv \cdot (\mu + kq) + \epsilon \approx \Delta \inv \cdot (\Delta^2 \cdot m) + kq' = \Delta \cdot m \pmod{q'},
\end{aligned}
$$
since $\epsilon = \epsilon _ 0 + \sum _ {i=1}^n \epsilon _ i s _ i$ is small.
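The computation above can be checked numerically. The sketch below builds a ciphertext directly with phase $\mu \approx \Delta^2 \cdot m$, as it would look after a multiplication, then rescales it; all concrete parameters are toy assumptions:

```python
import random

# Rescaling: divide every ciphertext component by Delta and round.
n, Delta = 8, 2 ** 10
q = Delta ** 3                  # current modulus
qp = q // Delta                 # q' = q / Delta

s = [random.randint(0, 1) for _ in range(n)]
m = 2.5

# A ciphertext with phase mu ≈ Delta^2 * m, as after a multiplication.
a = [random.randrange(q) for _ in range(n)]
mu = round(Delta ** 2 * m) + random.randint(-4, 4)
b = (mu - sum(ai * si for ai, si in zip(a, s))) % q

# Modulus switching / rescaling: round each component divided by Delta.
bp = round(b / Delta) % qp
ap = [round(ai / Delta) % qp for ai in a]

# Decrypt under the new modulus q': the phase is now ≈ Delta * m.
mup = (bp + sum(ai * si for ai, si in zip(ap, s))) % qp
if mup >= qp // 2:
    mup -= qp
print(mup / Delta)              # ≈ 2.5
```

The rounding errors $\epsilon _ i$ enter scaled down by $\Delta$, which is why the recovered value is only approximately $m$.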
### Modulus Chain
Using modulus switching, we can set $q _ L = \Delta^{L+1}$ where $L$ is the maximal level for multiplication. After each multiplication, the modulus is switched to $q _ {k-1} = q _ k / \Delta$.
Multiplication increases the scaling factor to $\Delta^2$, and then the rescaling operation reduces it back to $\Delta$.
@@ -330,19 +330,19 @@ $$
\Delta^{L+1} \ra \Delta^L \ra \cdots \ra \Delta.
$$
When we reach $q _ 0 = \Delta$, we cannot perform any more multiplications, so we apply [bootstrapping](../2023-12-08-bootstrapping-ckks/#bootstrapping) here.
### Multiplication in CKKS (Summary)
- Set up a modulus chain $q _ k = \Delta^{k+1}$ for $k = 0, \dots, L$.
- Given two ciphertexts $\bf{c} = (b, \bf{a}) \in \Z _ {q _ k}^{n+1}$ and $\bf{c}' = (b', \bf{a}') \in \Z _ {q _ k}^{n+1}$ with modulus $q _ k$ and **scaling factor** $\Delta$.
- (**Tensor Product**) $\bf{c} _ \rm{mul} = \bf{c} \otimes \bf{c}' \pmod{q _ k}$.
    - Now we have $n^2$ dimensions and scaling factor $\Delta^2$.
- (**Relinearization**)
    - Back to $n$ dimensions and scaling factor $\Delta^2$.
- (**Modulus Switching**; **Rescaling**)
    - Modulus is switched to $q _ {k-1}$ and scaling factor is back to $\Delta$.
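The full pipeline can be sketched end to end, with one simplification: relinearization is skipped, so the product is decrypted directly under the extended key $(1, \bf{s}, \bf{s} \otimes \bf{s})$. All concrete parameters are toy assumptions:

```python
import random

# Toy CKKS multiplication: tensor product, then rescale by Delta.
n, Delta = 8, 2 ** 12
qk = Delta ** 3                          # q_k
qk1 = qk // Delta                        # q_{k-1}

s = [random.randint(0, 1) for _ in range(n)]

def encrypt(m):
    a = [random.randrange(qk) for _ in range(n)]
    e = random.randint(-4, 4)
    b = (-sum(ai * si for ai, si in zip(a, s)) + round(Delta * m) + e) % qk
    return (b, a)

def tensor_mul(c1, c2):
    (b1, a1), (b2, a2) = c1, c2
    b = b1 * b2 % qk
    lin = [(b1 * y + b2 * x) % qk for x, y in zip(a1, a2)]   # b*a' + b'*a
    quad = [x * y % qk for x in a1 for y in a2]              # a tensor a'
    return (b, lin, quad)

def rescale(ct):
    def r(v):
        return round(v / Delta) % qk1
    b, lin, quad = ct
    return (r(b), [r(v) for v in lin], [r(v) for v in quad])

def decrypt_extended(ct):                # decrypt with (1, s, s tensor s)
    b, lin, quad = ct
    ss = [x * y for x in s for y in s]
    mu = (b + sum(x * y for x, y in zip(lin, s))
            + sum(x * y for x, y in zip(quad, ss))) % qk1
    if mu >= qk1 // 2:
        mu -= qk1
    return mu / Delta

c = rescale(tensor_mul(encrypt(1.5), encrypt(2.0)))
print(decrypt_extended(c))               # ≈ 3.0
```

In the real scheme, relinearization would bring the $n^2$ quadratic components back to $n$ dimensions before rescaling, as in the summary above.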
[^1]: The noise hasn't gone away since we didn't *fully evaluate* the decryption circuit, which takes the remainders from dividing by $q$ and $2$.
[^2]: No rounding is needed here provided $\Delta$ divides $q$, which holds since the modulus chain consists of powers of $\Delta$.


@@ -2,18 +2,24 @@
share: true
toc: true
math: true
categories:
- Mathematics
- Measure Theory
path: _posts/mathematics/measure-theory
tags:
- math
- analysis
- measure-theory
title: 01. Algebra of Sets
date: 2023-01-11
github_title: 2023-01-11-algebra-of-sets
image:
  path: /assets/img/posts/mathematics/measure-theory/mt-01.png
attachment:
  folder: assets/img/posts/mathematics/measure-theory
---
![mt-01.png](../../../assets/img/posts/mathematics/measure-theory/mt-01.png)
To study the Lebesgue integral, we first need a notion of the 'length' of a set. And to establish this notion of 'length', we need operations between sets and a structure on them.
@@ -65,19 +71,19 @@ Ring과 유사하지만 살짝 더 좋은 성질을 가진 구조를 가지고
We would like to extend this a bit further and allow countable operations as well.
**Definition.** ($\sigma$-ring) Let $\mathcal{R}$ be a ring. If $\displaystyle\bigcup _ {n=1}^\infty A _ n \in \mathcal{R}$ for all $A _ n \in \mathcal{R}$ ($n = 1, 2, \dots$), then $\mathcal{R}$ is called a **$\sigma$-ring**.
This means that $\mathcal{R}$ is closed under countable unions. With a little thought, one sees that the same holds for countable intersections.
**Remark.** Using the identity
$$\bigcap _ {n=1}^\infty A _ n = A _ 1 \setminus\bigcup _ {n=1}^\infty (A _ 1 \setminus A _ n),$$
we see that if $\mathcal{R}$ is a $\sigma$-ring and $A _ n \in \mathcal{R}$, then $\displaystyle\bigcap _ {n=1}^\infty A _ n \in \mathcal{R}$.
An algebra can be extended in the same way.
**Definition.** ($\sigma$-algebra) Let $\mathcal{F}$ be an algebra on $X$. If $\displaystyle\bigcup _ {n=1}^\infty A _ n \in \mathcal{F}$ for all $A _ n \in \mathcal{F}$ ($n = 1, 2, \dots$), then $\mathcal{F}$ is called a **$\sigma$-algebra**.
A $\sigma$-algebra is of course a $\sigma$-ring, so it is also closed under countable intersections.
@@ -101,11 +107,11 @@ $\sigma$-algebra는 당연히 $\sigma$-ring이기 때문에 countable한 교집
then $\phi$ is **additive**.
2. If, for pairwise disjoint sets $A _ i \in \mathcal{R}$,
$$\phi\left( \bigcup _ {i=1}^\infty A _ i \right) = \sum _ {i=1}^\infty \phi(A _ i)$$
whenever $\displaystyle\bigcup _ {i=1}^\infty A _ i \in \mathcal{R}$,[^1] then $\phi$ is **countably additive** ($\sigma$-additive).
We now define a function that captures the notion of 'length'. Such a function is called a measure.
@@ -115,9 +121,9 @@ $\sigma$-algebra는 당연히 $\sigma$-ring이기 때문에 countable한 교집
**Remark.**
1. If $\phi$ is additive, then for pairwise disjoint $A _ i \in \mathcal{R}$, the following holds.
$$\phi\left( \bigcup _ {i=1}^n A _ i \right) = \sum _ {i=1}^n \phi(A _ i).$$
This property is called *finite additivity*, and $\phi$ is said to be *finitely additive*.
@@ -129,7 +135,7 @@ $\phi(A) \in \mathbb{R}$ 인 $A \in \mathcal{R}$이 존재한다는 가정을
1. $\mu$ is **finite**. $\iff$ $\mu(X) < \infty$ for every $X \in \mathcal{F}$.
2. $\mu$ is **$\sigma$-finite**. $\iff$ There exists a sequence of sets $F _ 1 \subseteq F _ 2 \subseteq \cdots$ such that $\mu(F _ i) < \infty$ and $\displaystyle\bigcup _ {i=1}^\infty F _ i = X$.
## Basic Properties of Set Functions
@@ -143,57 +149,57 @@ $\phi$가 set function이라 하자.
holds.[^3]
- If $\phi$ is additive on a ring $\mathcal{R}$, then for $A _ 1, A _ 2 \in \mathcal{R}$ with $A _ 1 \subseteq A _ 2$,
$$\phi(A _ 2) = \phi(A _ 2 \setminus A _ 1) + \phi(A _ 1)$$
holds. Therefore,
1. If $\phi \geq 0$, then $\phi(A _ 1) \leq \phi(A _ 2)$. (monotonicity)
2. If $\lvert \phi(A _ 1) \rvert < \infty$, then $\phi(A _ 2 \setminus A _ 1) = \phi(A _ 2) - \phi(A _ 1)$.[^4]
- If $\phi$ is additive and $\phi \geq 0$, then for $A, B \in \mathcal{R}$,
$$\phi(A\cup B) \leq \phi(A) + \phi(B)$$
holds. Applying induction, for any $A _ i \in \mathcal{R}$,
$$\phi\left( \bigcup _ {n=1}^m A _ n \right) \leq \sum _ {n=1}^m \phi(A _ n)$$
holds. Here the $A _ i$ need not be pairwise disjoint. This property is called *finite subadditivity*.
Finally, we introduce theorems related to measures.
**Theorem.** Let $\mu$ be a measure on a $\sigma$-algebra $\mathcal{F}$. If $A _ n \in \mathcal{F}$ and $A _ 1 \subseteq A _ 2 \subseteq \cdots$, then
$$\lim _ {n\rightarrow\infty} \mu(A _ n) = \mu\left( \bigcup _ {n=1}^\infty A _ n \right)$$
holds.
**Proof.** Set $B _ 1 = A _ 1$ and $B _ n = A _ n \setminus A _ {n-1}$ for $n \geq 2$. The $B _ n$ are clearly pairwise disjoint. Therefore,
$$\mu(A _ n) = \mu\left( \bigcup _ {k=1}^n B _ k \right) = \sum _ {k=1}^n \mu(B _ k)$$
and, using the countable additivity of the measure,
$$\lim _ {n\rightarrow\infty} \mu(A _ n) = \lim _ {n\rightarrow\infty} \sum _ {k=1}^n \mu(B _ k) = \sum _ {n=1}^\infty \mu(B _ n) = \mu\left( \bigcup _ {n=1}^{\infty} B _ n \right) = \mu\left( \bigcup _ {n=1}^\infty A _ n \right)$$
follows. The last equality uses $\displaystyle\bigcup _ {n=1}^\infty A _ n = \bigcup _ {n=1}^\infty B _ n$.
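A quick numerical illustration of this theorem, using the finite measure $\mu(A) = \sum _ {k \in A} 2^{-k}$ on subsets of $\mathbb{N}$ (this particular measure is an assumption made for the demo):

```python
# mu(A) = sum of 2^(-k) over k in A: a finite measure on subsets of N.
def mu(A):
    return sum(2.0 ** -k for k in A)

def A(n):                       # increasing sets A_1 ⊆ A_2 ⊆ ...
    return set(range(1, n + 1))

# mu(A_n) increases to mu of the union, which is the geometric series 1.
vals = [mu(A(n)) for n in (1, 2, 5, 10, 50)]
print(vals)                     # 0.5, 0.75, 0.96875, ... → 1
```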
왠지 위 조건을 뒤집어서 $A_1 \supseteq A_2 \supseteq \cdots$ 인 경우 교집합에 대해서도 성립하면 좋을 것 같습니다. 왠지 위 조건을 뒤집어서 $A _ 1 \supseteq A _ 2 \supseteq \cdots$ 인 경우 교집합에 대해서도 성립하면 좋을 것 같습니다.
$$\lim_ {n\rightarrow\infty} \mu(A_n) = \mu\left( \bigcap_ {n=1}^\infty A_n \right).$$ $$\lim _ {n\rightarrow\infty} \mu(A _ n) = \mu\left( \bigcap _ {n=1}^\infty A _ n \right).$$
하지만 안타깝게도 조건이 부족합니다. $\mu(A_1) < \infty$ 라는 추가 조건이 필요합니다. 반례는 $A_n = [n, \infty)$ 생각해보면 됩니다. 정리의 정확한 서술은 다음과 같습니다. 증명은 연습문제로 남깁니다. 하지만 안타깝게도 조건이 부족합니다. $\mu(A _ 1) < \infty$ 라는 추가 조건이 필요합니다. 반례는 $A _ n = [n, \infty)$를 생각해보면 됩니다. 정리의 정확한 서술은 다음과 같습니다. 증명은 연습문제로 남깁니다.
**정리.** $\mu$가 $\sigma$-algebra $\mathcal{F}$의 measure라 하자. $A_n \in \mathcal{F}$ 에 대하여 $A_1 \supseteq A_2 \supseteq \cdots$ 이고 $\mu(A_1) < \infty$ 이면 **정리.** $\mu$가 $\sigma$-algebra $\mathcal{F}$의 measure라 하자. $A _ n \in \mathcal{F}$ 에 대하여 $A _ 1 \supseteq A _ 2 \supseteq \cdots$ 이고 $\mu(A _ 1) < \infty$ 이면
$$\lim_ {n\rightarrow\infty} \mu(A_n) = \mu\left( \bigcap_ {n=1}^\infty A_n \right)$$ $$\lim _ {n\rightarrow\infty} \mu(A _ n) = \mu\left( \bigcap _ {n=1}^\infty A _ n \right)$$
이 성립한다. 이 성립한다.
이 두 정리를 **continuity of measure**라고 합니다. 함수가 연속이면 극한이 함수 안으로 들어갈 수 있는 성질과 유사하여 이와 같은 이름이 붙었습니다. 어떤 책에서는 $A_1 \subseteq A_2 \subseteq\cdots$ 조건을 $A_n \nearrow \bigcup_n A_n$ 라 표현하기도 합니다. 그래서 이 조건에 대한 정리를 *continuity from below*라 하기도 합니다. 마찬가지로 $A_1 \supseteq A_2 \supseteq \cdots$ 조건을 $A_n \searrow \bigcap_n A_n$ 로 적고 이에 대한 정리를 *continuity from above*라 합니다. 이 두 정리를 **continuity of measure**라고 합니다. 함수가 연속이면 극한이 함수 안으로 들어갈 수 있는 성질과 유사하여 이와 같은 이름이 붙었습니다. 어떤 책에서는 $A _ 1 \subseteq A _ 2 \subseteq\cdots$ 조건을 $A _ n \nearrow \bigcup _ n A _ n$ 라 표현하기도 합니다. 그래서 이 조건에 대한 정리를 *continuity from below*라 하기도 합니다. 마찬가지로 $A _ 1 \supseteq A _ 2 \supseteq \cdots$ 조건을 $A _ n \searrow \bigcap _ n A _ n$ 로 적고 이에 대한 정리를 *continuity from above*라 합니다.
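As a toy numerical illustration (a sketch of mine, assuming 1-D Lebesgue measure, where the measure of an interval is its length): for the decreasing sets $A _ n = [0, 1 + 1/n]$ we have $\mu(A _ 1) = 2 < \infty$ and $\bigcap _ n A _ n = [0, 1]$, so continuity from above predicts $\mu(A _ n) \rightarrow 1$.

```python
# Continuity from above, checked numerically for 1-D Lebesgue measure
# (interval length). A_n = [0, 1 + 1/n] decreases to [0, 1].
def m_interval(a, b):
    """Length of the interval [a, b], i.e. its 1-D Lebesgue measure."""
    return b - a

measures = [m_interval(0, 1 + 1 / n) for n in range(1, 1001)]
assert measures[0] == 2.0              # mu(A_1) = 2 < infinity
assert abs(measures[-1] - 1.0) < 2e-3  # mu(A_n) -> mu([0, 1]) = 1

# The finiteness hypothesis matters: A_n = [n, infinity) has infinite
# measure for every n, while the intersection is empty with measure 0.
```

The commented counterexample at the end is exactly the $A _ n = [n, \infty)$ mentioned above.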
---

View File

@@ -0,0 +1,267 @@
---
share: true
toc: true
math: true
categories:
- Mathematics
- Measure Theory
path: _posts/mathematics/measure-theory
tags:
- math
- analysis
- measure-theory
title: 02. Construction of Measure
date: 2023-01-23
github_title: 2023-01-23-construction-of-measure
image:
path: /assets/img/posts/mathematics/measure-theory/mt-02.png
attachment:
folder: assets/img/posts/mathematics/measure-theory
---
![mt-02.png](../../../assets/img/posts/mathematics/measure-theory/mt-02.png)
Now we can start measuring sets in earnest, beginning with the sets we already know how to measure. We will work in $\mathbb{R}^p$, and from here on an interval of $\mathbb{R}$ is understood to cover every combination of open and closed endpoints: it may be any of the four forms $[a, b], (a, b), [a, b), (a, b]$.
## Elementary Sets
**Definition.** (Intervals of $\mathbb{R}^p$) Let $a _ i, b _ i \in \mathbb{R}$ with $a _ i \leq b _ i$, and let each $I _ i$ be an interval of $\mathbb{R}$. An interval of $\mathbb{R}^p$ is a set of the form
$$\prod _ {i=1}^p I _ i = I _ 1 \times \cdots \times I _ p.$$
For example, an interval of $\mathbb{R}^2$ is a rectangular region and an interval of $\mathbb{R}^3$ is a box-shaped region, except that the boundary may or may not be included.
A set obtained as a union of finitely many such intervals is called an elementary set.
**Definition.** (Elementary Set) A set that can be written as a union of finitely many intervals is called an **elementary set**. The collection of elementary sets of $\mathbb{R}^p$ is denoted by $\Sigma$.
Every interval is bounded, so any finite union of intervals is bounded as well.
**Remark.** Every elementary set is bounded.
Set operations can be carried out within the collection of elementary sets, and it is straightforward to check that $\Sigma$ forms a ring.
**Proposition.** $\Sigma$ is a ring. However, it is not a $\sigma$-ring: $\mathbb{R}^p$ is a countable union of elementary sets, yet it is unbounded and hence does not belong to $\Sigma$.
We know very well how to measure the length of an interval, and an elementary set, being a finite union of intervals, is just as easy to measure. We now define a length function $m: \Sigma \rightarrow[0, \infty)$. It is not yet a measure.
**Definition.** Let $a _ i, b _ i \in \mathbb{R}$ be the endpoints of the interval $I _ i$. For an interval $I = \displaystyle\prod _ {i=1}^p I _ i$ of $\mathbb{R}^p$, define
$$m(I) = \prod _ {i=1}^p (b _ i - a _ i).$$
**Definition.** Let $I _ i$ be pairwise disjoint intervals of $\mathbb{R}^p$. For $A = \displaystyle\bigcup _ {i=1}^n I _ i$, define
$$m(A) = \sum _ {i=1}^n m(I _ i).$$
Thinking of $\mathbb{R}, \mathbb{R}^2, \mathbb{R}^3$, we see that $m$ corresponds to length, area, and volume, respectively. On a union of pairwise disjoint intervals, $m$ is the sum of its values on the individual intervals: if a set can be split into non-overlapping intervals, it is natural for its 'length' to be the sum of the 'lengths' of those intervals.
This definition is also well defined: even when $A \in \Sigma$ can be written as a disjoint union of finitely many intervals in more than one way, the value of $m$ is the same.
**Remark.** $m$ is additive on $\Sigma$; that is, $m : \Sigma \rightarrow[0, \infty)$ is an additive set function.
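A minimal 1-D sketch of $m$ (my own illustration, not part of the construction): intervals are merged into a pairwise disjoint decomposition first, and well-definedness means any two decompositions of the same set give the same value.

```python
# Sketch: the length function m on elementary sets of R, given as a finite
# list of intervals (a, b). Overlapping intervals are merged first so that
# m is evaluated on a pairwise disjoint decomposition, as in the definition.
def m(intervals):
    total = 0.0
    cur = None  # current merged block (a, b)
    for a, b in sorted(intervals):
        if cur is None or a > cur[1]:   # disjoint from the current block
            if cur is not None:
                total += cur[1] - cur[0]
            cur = (a, b)
        else:                           # overlapping: extend the block
            cur = (cur[0], max(cur[1], b))
    if cur is not None:
        total += cur[1] - cur[0]
    return total

# Well-definedness: two decompositions of [0, 2] give the same value.
assert m([(0, 1), (1, 2)]) == m([(0, 2)]) == 2.0
```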
We would additionally like a regularity condition to hold.
**Definition.** (Regularity) Let the set function $\mu: \Sigma \rightarrow[0, \infty]$ be additive. We say that $\mu$ is **regular** on $\Sigma$ if for every $A \in \Sigma$ and every $\epsilon > 0$,
> there exist a closed set $F \in \Sigma$ and an open set $G \in \Sigma$ such that $F \subseteq A \subseteq G$ and $\mu(G) - \epsilon \leq \mu(A) \leq \mu(F) + \epsilon$.
It is easy to check that the $m$ defined above is regular.
From now on, assume that the set function $\mu: \Sigma \rightarrow[0, \infty)$ is finite, regular, and additive.
**Definition.** (Outer Measure) The **outer measure** $\mu^\ast: \mathcal{P}(\mathbb{R}^p) \rightarrow[0, \infty]$ of $E \in \mathcal{P}(\mathbb{R}^p)$ is defined by
$$\mu^\ast(E) = \inf \left\lbrace \sum _ {n=1}^\infty \mu(A _ n) : \text{open sets } A _ n \in \Sigma \text{ with } E \subseteq\bigcup _ {n=1}^\infty A _ n\right\rbrace.$$
It is called the outer measure because it approximates $E$ by measuring from the outside. Since the outer measure is defined on the entire power set, it would be nice if we could use it to measure every set. However, to be a measure it must be countably additive, and that is the hardest condition to satisfy; indeed, countable additivity fails on $\mathcal{P}(\mathbb{R}^p)$.
**Remark.**
- $\mu^\ast \geq 0$.
- If $E _ 1 \subseteq E _ 2$, then $\mu^\ast(E _ 1) \leq \mu^\ast(E _ 2)$. (Monotonicity)
**Theorem.**
1. If $A \in \Sigma$, then $\mu^\ast(A) = \mu(A)$.[^1]
2. Countable subadditivity holds:
$$\mu^\ast\left( \bigcup _ {n=1}^\infty E _ n \right) \leq \sum _ {n=1}^\infty \mu^\ast(E _ n), \quad (\forall E _ n \in \mathcal{P}(\mathbb{R}^p))$$
**Proof.**
(1) Let $A \in \Sigma$ and $\epsilon > 0$. By the regularity of $\mu$, there exists an open set $G \in \Sigma$ such that $A \subseteq G$ and
$$\mu^\ast(A) \leq \mu(G) \leq \mu(A) + \epsilon.$$
By the definition of $\mu^\ast$, there exist open sets $A _ n \in \Sigma$ such that $A \subseteq\displaystyle\bigcup _ {n=1}^\infty A _ n$ and
$$\sum _ {n=1}^\infty \mu(A _ n) \leq \mu^\ast(A) + \epsilon.$$
Again by regularity, there exists a closed set $F \in \Sigma$ with $F\subseteq A$ and $\mu(A) \leq \mu(F) + \epsilon$. Since $F \subseteq\mathbb{R}^p$ is closed and bounded, it is compact, so a finite subcover can be chosen: for some $N \in \mathbb{N}$, $F \subseteq\displaystyle\bigcup _ {i=1}^N A _ {i}$.
Therefore
$$\mu(A) \leq \mu(F) + \epsilon \leq \sum _ {i=1}^N \mu(A _ i) + \epsilon \leq \sum _ {n=1}^\infty \mu(A _ n) + \epsilon \leq \mu^\ast(A) + 2\epsilon.$$
Letting $\epsilon \rightarrow 0$ in both chains of inequalities gives $\mu(A) = \mu^\ast(A)$.
(2) If the right-hand side is $\infty$ there is nothing to prove, so assume $\mu^\ast(E _ n) < \infty$ for every $n\in \mathbb{N}$. Fix $\epsilon > 0$. For each $n \in \mathbb{N}$, there exist open sets $A _ {n, k} \in \Sigma$ such that $E _ n \subseteq\displaystyle\bigcup _ {k=1}^\infty A _ {n, k}$ and $\displaystyle\sum _ {k=1}^\infty \mu(A _ {n,k}) \leq \mu^\ast(E _ n) + 2^{-n}\epsilon$.
Since $\mu^\ast$ is defined as an infimum,
$$\mu^\ast\left( \bigcup _ {n=1}^\infty E _ n \right) \leq \sum _ {n=1}^\infty \sum _ {k=1}^\infty \mu(A _ {n,k}) \leq \sum _ {n=1}^\infty \mu^\ast(E _ n) + \epsilon$$
holds, and letting $\epsilon \rightarrow 0$ gives the desired inequality.
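The $2^{-n}\epsilon$ budgeting used in part (2) is worth seeing numerically: splitting one error budget $\epsilon$ as $\epsilon/2, \epsilon/4, \dots$ over countably many sets keeps the total slack at $\epsilon$ (a sketch of mine, not part of the proof).

```python
# The 2^{-n} epsilon trick: countably many error budgets eps/2, eps/4, ...
# sum to at most eps, so the total slack in the proof stays controlled.
eps = 0.1
budgets = [eps * 2 ** -n for n in range(1, 30)]
assert sum(budgets) < eps          # partial sums stay strictly below eps
assert eps - sum(budgets) < 1e-8   # and approach eps in the limit
```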
## $\mu$-measurable Sets
We now gather only the sets on which countable additivity holds and use them to construct a measure. The material below is preparation for that construction.
**Notation.** (Symmetric difference) $A \mathop{\mathrm{\triangle}}B = (A\setminus B) \cup (B \setminus A)$.
**Definition.**
- Define $d(A, B) = \mu^\ast(A \mathop{\mathrm{\triangle}}B)$.
- For a sequence of sets $A _ n$, write $A _ n \rightarrow A$ if $d(A _ n, A) \rightarrow 0$.
**Remark.**
- For $A, B, C \subseteq \mathbb{R}^p$, $d(A, B) \leq d(A, C) + d(C, B)$.
- For $A _ 1, A _ 2, B _ 1, B _ 2 \subseteq \mathbb{R}^p$, the following holds.
$$\left.\begin{array}{c}d(A _ 1 \cup A _ 2, B _ 1 \cup B _ 2) \\d(A _ 1 \cap A _ 2, B _ 1 \cap B _ 2) \\d(A _ 1 \setminus A _ 2, B _ 1 \setminus B _ 2)\end{array}\right\rbrace\leq d(A _ 1, B _ 1) + d(A _ 2, B _ 2).$$
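As a toy check of the triangle inequality (my own sketch, using counting measure on finite sets, for which the outer measure of a set is its cardinality and $d(A, B) = \lvert A \mathop{\mathrm{\triangle}} B \rvert$):

```python
# Toy check of d(A, B) = mu*(A triangle B) with counting measure on finite
# sets: the triangle inequality holds on every triple from a small universe.
from itertools import product

def d(A, B):
    return len(A ^ B)  # ^ is the symmetric difference of Python sets

universe = [frozenset(s) for s in (set(), {1}, {2}, {1, 2}, {2, 3}, {1, 2, 3})]
for A, B, C in product(universe, repeat=3):
    assert d(A, B) <= d(A, C) + d(C, B)
```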
**Definition.** (Finitely $\mu$-measurable) If there exist sets $A _ n \in \Sigma$ with $A _ n \rightarrow A$, then $A$ is said to be **finitely $\mu$-measurable**. The collection of finitely $\mu$-measurable sets is denoted by $\mathfrak{M} _ F(\mu)$.
This definition says that there exist elementary sets $A _ n$ with $\mu^\ast (A _ n \mathop{\mathrm{\triangle}}A) \rightarrow 0$, where the convergence is measured through the set function $\mu$.
**Definition.** ($\mu$-measurable) If $A = \displaystyle\bigcup _ {n=1}^\infty A _ n$ for some $A _ n \in \mathfrak{M} _ F(\mu)$, then $A$ is said to be **$\mu$-measurable**. The collection of $\mu$-measurable sets is denoted by $\mathfrak{M}(\mu)$.
**Remark.** $\mu^\ast(A) = d(A, \varnothing) \leq d(A, B) + \mu^\ast(B)$.
**Proposition.** If $\mu^\ast(A)$ or $\mu^\ast(B)$ is finite, then the following holds.
$$\lvert \mu^\ast(A) - \mu^\ast(B) \rvert \leq d(A, B).$$
**Corollary.** If $A \in \mathfrak{M} _ F(\mu)$, then $\mu^\ast(A) < \infty$.
**Proof.** There exist $A _ n \in \Sigma$ with $A _ n \rightarrow A$, so there exists $N \in \mathbb{N}$ with $d(A _ N, A) \leq 1$, and
$$\mu^\ast(A) \leq d(A _ N, A) + \mu^\ast(A _ N) \leq 1 + \mu^\ast(A _ N) < \infty.$$
**Corollary.** If $A _ n \rightarrow A$ and $A _ n, A \in \mathfrak{M} _ F(\mu)$, then $\mu^\ast(A _ n)\rightarrow\mu^\ast(A) < \infty$.
**Proof.** Since $\mu^\ast(A)$ and $\mu^\ast(A _ n)$ are finite, $\lvert \mu^\ast(A _ n) - \mu^\ast(A) \rvert \leq d(A _ n, A) \rightarrow 0$ as $n \rightarrow\infty$.
## Construction of Measure
The preparation is complete, so let us construct the measure! Although $\mu^\ast$ is not a measure on $\mathcal{P}(\mathbb{R}^p)$, it becomes one once the domain is narrowed slightly to $\mathfrak{M}(\mu)$.
**Theorem.** $\mathfrak{M}(\mu)$ is a $\sigma$-algebra, and $\mu^\ast$ is a measure on $\mathfrak{M}(\mu)$.
**Proof.** It suffices to show that $\mathfrak{M}(\mu)$ is a $\sigma$-algebra and that $\mu^\ast$ is countably additive on $\mathfrak{M}(\mu)$.
**(Step 0)** *$\mathfrak{M} _ F(\mu)$ is a ring.*
Let $A, B \in \mathfrak{M} _ F(\mu)$. Then there exist $A _ n, B _ n \in \Sigma$ with $A _ n \rightarrow A$ and $B _ n \rightarrow B$. Then
$$\left.\begin{array}{c}d(A _ n \cup B _ n, A \cup B) \\ d(A _ n \cap B _ n, A \cap B) \\ d(A _ n \setminus B _ n, A \setminus B)\end{array}\right\rbrace\leq d(A _ n, A) + d(B _ n, B) \rightarrow 0$$
so $A _ n \cup B _ n \rightarrow A \cup B$ and $A _ n \setminus B _ n \rightarrow A\setminus B$, which shows that $\mathfrak{M} _ F(\mu)$ is a ring.
**(Step 1)** *$\mu^\ast$ is additive on $\mathfrak{M} _ F(\mu)$.*
Since $\mu = \mu^\ast$ on $\Sigma$, the corollary above gives
$$\begin{matrix} \mu(A _ n) \rightarrow\mu^\ast(A), & \mu(A _ n\cup B _ n) \rightarrow\mu^\ast(A\cup B), \\ \mu(B _ n) \rightarrow\mu^\ast(B), & \mu(A _ n\cap B _ n) \rightarrow\mu^\ast(A\cap B). \end{matrix}$$
In general $\mu(A _ n) + \mu(B _ n) = \mu(A _ n \cup B _ n) + \mu(A _ n \cap B _ n)$, so letting $n \rightarrow\infty$ we obtain
$$\mu^\ast(A) + \mu^\ast(B) = \mu^\ast(A\cup B) + \mu^\ast(A \cap B).$$
If in addition $A \cap B = \varnothing$, it follows that $\mu^\ast$ is additive.
**(Step 2)** *$\mathfrak{M} _ F(\mu) = \lbrace A \in \mathfrak{M}(\mu) : \mu^\ast(A) < \infty\rbrace$.*[^2]
**Claim.** Every $A \in \mathfrak{M}(\mu)$ can be written as a union of pairwise disjoint elements of $\mathfrak{M} _ F(\mu)$.
**Proof.** Write $A = \bigcup A _ n'$ with $A _ n' \in \mathfrak{M} _ F(\mu)$, and define
> $A _ 1 = A _ 1'$, and $A _ n = A _ n' \setminus(A _ 1'\cup \cdots \cup A _ {n-1}')$ for $n \geq 2$.
Then the $A _ n$ are pairwise disjoint and $A _ n \in \mathfrak{M} _ F(\mu)$.
Using this fact, write $A = \displaystyle\bigcup _ {n=1}^\infty A _ n$ with pairwise disjoint $A _ n \in \mathfrak{M} _ F(\mu)$.
1. By countable subadditivity, $\displaystyle\mu^\ast(A) \leq \sum _ {n=1}^{\infty} \mu^\ast (A _ n)$ holds.
2. By Step 1 and $\displaystyle\bigcup _ {n=1}^k A _ n \subseteq A$, we have $\displaystyle\sum _ {n=1}^{k} \mu^\ast(A _ n) \leq \mu^\ast(A)$; letting $k \rightarrow\infty$ gives $\displaystyle\mu^\ast(A) \geq \sum _ {n=1}^\infty \mu^\ast(A _ n)$.
Therefore $\displaystyle\mu^\ast(A) = \sum _ {n=1}^\infty \mu^\ast(A _ n)$.[^3] [^4]
Now let $B _ n =\displaystyle\bigcup _ {k=1}^n A _ k$. Assuming $\mu^\ast(A) < \infty$, the convergence of $\displaystyle\sum _ {n=1}^\infty \mu^\ast(A _ n)$ gives
$$d(A, B _ n) = \mu^\ast\left( \bigcup _ {k=n+1}^\infty A _ k \right) = \sum _ {k=n+1}^{\infty} \mu^\ast(A _ k) \rightarrow 0 \text{ as } n \rightarrow\infty.$$
Since $B _ n \in \mathfrak{M} _ F(\mu)$, for each $n \in \mathbb{N}$ we can choose $C _ n \in \Sigma$ that makes $d(B _ n, C _ n)$ arbitrarily small. Then $d(A, C _ n) \leq d(A, B _ n) + d(B _ n, C _ n)$, so $d(A, C _ n)$ can be made arbitrarily small for sufficiently large $n$. Hence $C _ n \rightarrow A$, and we conclude that $A \in \mathfrak{M} _ F(\mu)$.
**(Step 3)** *$\mu^\ast$ is countably additive on $\mathfrak{M}(\mu)$.*
Let $A _ n \in \mathfrak{M}(\mu)$ be a partition of $A \in \mathfrak{M}(\mu)$. If $\mu^\ast(A _ m) = \infty$ for some $m \in \mathbb{N}$, then
$$\mu^\ast\left( \bigcup _ {n=1}^\infty A _ n \right) \geq \mu^\ast(A _ m) = \infty = \sum _ {n=1}^\infty \mu^\ast(A _ n)$$
so countable additivity holds.
If instead $\mu^\ast(A _ n) < \infty$ for every $n\in \mathbb{N}$, then by Step 2 each $A _ n \in \mathfrak{M} _ F(\mu)$, and
$$\mu^\ast(A) = \mu^\ast\left( \bigcup _ {n=1}^\infty A _ n \right) = \sum _ {n=1}^\infty \mu^\ast(A _ n)$$
holds.
**(Step 4)** *$\mathfrak{M}(\mu)$ is a $\sigma$-ring.*
If $A _ n \in \mathfrak{M}(\mu)$, then there exist $B _ {n, k} \in \mathfrak{M} _ F(\mu)$ with $\displaystyle A _ n = \bigcup _ k B _ {n,k}$. Then
$$\bigcup _ n A _ n = \bigcup _ {n, k} B _ {n, k} \in \mathfrak{M}(\mu).$$
If $A, B \in \mathfrak{M}(\mu)$, then $\displaystyle A = \bigcup A _ n$ and $\displaystyle B = \bigcup B _ n$ for some $A _ n, B _ n \in \mathfrak{M} _ F(\mu)$, so
$$A \setminus B = \bigcup _ {n=1}^\infty \left( A _ n \setminus B \right) = \bigcup _ {n=1}^\infty (A _ n\setminus(A _ n\cap B)).$$
Hence, since $\mathfrak{M} _ F(\mu)$ is a ring, it suffices to show that $A _ n \cap B \in \mathfrak{M} _ F(\mu)$. By definition,
$$A _ n \cap B = \bigcup _ {k=1}^\infty (A _ n \cap B _ k) \in \mathfrak{M}(\mu)$$
and $\mu^\ast(A _ n \cap B) \leq \mu^\ast(A _ n) < \infty$, so $A _ n\cap B \in \mathfrak{M} _ F(\mu)$ by Step 2. Therefore $A \setminus B$ is a countable union of elements of $\mathfrak{M} _ F(\mu)$, hence $A\setminus B \in \mathfrak{M}(\mu)$.
Therefore $\mathfrak{M}(\mu)$ is a $\sigma$-ring, and since $\mathbb{R}^p$ is itself a countable union of elementary sets, $\mathbb{R}^p \in \mathfrak{M}(\mu)$, so $\mathfrak{M}(\mu)$ is a $\sigma$-algebra.
---
Finally, we extend the definition of $\mu$ from $\Sigma$ to the $\sigma$-algebra $\mathfrak{M}(\mu)$ by setting $\mu = \mu^\ast$ on $\mathfrak{M}(\mu)$. When $\mu = m$ on $\Sigma$, the function $m$ extended in this way to $\mathfrak{M}(m)$ is called the **Lebesgue measure** on $\mathbb{R}^p$, and a set $A \in \mathfrak{M}(m)$ is called a Lebesgue measurable set.
[^1]: This statement is not obvious when $A$ is not open.
[^2]: That is, if $A$ is $\mu$-measurable and $\mu^\ast(A) < \infty$, then $A$ is finitely $\mu$-measurable.
[^3]: Since $A$ is a countable union of sets in $\mathfrak{M} _ F(\mu)$, its $\mu^\ast$ equals the sum of the $\mu^\ast$ of those sets.
[^4]: The proof is not finished yet: the $A _ n$ here are elements of $\mathfrak{M} _ F(\mu)$, not of $\mathfrak{M}(\mu)$.

View File

@@ -5,6 +5,7 @@ math: true
categories:
- Mathematics
- Measure Theory
path: _posts/mathematics/measure-theory
tags:
- math
- analysis
@@ -13,24 +14,24 @@ title: 03. Measure Spaces
date: 2023-01-24
github_title: 2023-01-24-measure-spaces
image:
path: /assets/img/posts/mathematics/measure-theory/mt-03.png
attachment:
folder: assets/img/posts/mathematics/measure-theory
---
## Remarks on Construction of Measure
These are additional remarks on the proof of the construction of measure.
![mt-03.png](../../../assets/img/posts/mathematics/measure-theory/mt-03.png)
**Proposition.** If $A$ is open, then $A \in \mathfrak{M}(\mu)$. Consequently $A^C \in \mathfrak{M}(\mu)$, so if $F$ is closed, then $F \in \mathfrak{M}(\mu)$.
**Proof.** Let $I(x, r)$ be the open box centered at $x\in \mathbb{R}^p$ with radius $r$. Clearly $I(x, r)$ is an element of $\mathfrak{M} _ F(\mu)$. Now we can write
$$A = \bigcup _ {\substack{x \in \mathbb{Q}^p, \; r \in \mathbb{Q}\\ I(x, r)\subseteq A}} I(x, r)$$
so $A$ is a countable union of elements of $\mathfrak{M} _ F(\mu)$, hence $A \in \mathfrak{M}(\mu)$. Since $\mathfrak{M}(\mu)$ is a $\sigma$-algebra, $A^C\in \mathfrak{M}(\mu)$, and it follows that every closed set $F$ is also an element of $\mathfrak{M}(\mu)$.
**Proposition.** If $A \in \mathfrak{M}(\mu)$, then for every $\epsilon > 0$
@@ -40,15 +41,15 @@ $$F \subseteq A \subseteq G, \quad \mu\left( G \setminus A \right) < \epsilon, \
This means that even after restricting the domain to $\mathfrak{M}(\mu)$, $\mu$ is still regular on $\mathfrak{M}(\mu)$.
**Proof.** Write $A = \bigcup _ {n=1}^\infty A _ n$ ($A _ n \in \mathfrak{M} _ F(\mu)$) and fix $\epsilon > 0$. For each $n \in \mathbb{N}$, choose open sets $B _ {n, k} \in \Sigma$ with $A _ n \subseteq\bigcup _ {k=1}^\infty B _ {n, k}$ and
$$\mu\left( \bigcup _ {k=1}^{\infty} B _ {n, k} \right) \leq \sum _ {k=1}^{\infty} \mu\left( B _ {n, k} \right) < \mu\left( A _ n \right) + 2^{-n}\epsilon.$$
Such a choice is possible.[^1]
Now we construct the open set: let $G _ n = \bigcup _ {k=1}^{\infty} B _ {n, k}$ and $G = \bigcup _ {n=1}^{\infty} G _ n$. Since $A _ n \in \mathfrak{M} _ F(\mu)$, we have $\mu\left( A _ n \right) < \infty$, and the following holds.
$$\begin{aligned} \mu\left( G \setminus A \right) & = \mu\left( \bigcup _ {n=1}^{\infty} G _ n \setminus\bigcup _ {n=1}^{\infty} A _ n \right) \leq \mu\left( \bigcup _ {n=1}^{\infty} G _ n \setminus A _ n \right) \\ &\leq \sum _ {n=1}^{\infty} \mu\left( G _ n \setminus A _ n \right) \leq \sum _ {n=1}^{\infty} 2^{-n}\epsilon = \epsilon. \end{aligned}$$
For the existence of the closed set, repeat the argument above for $A^C$: we can choose an open set $F^C$ with $A^C \subseteq F^C$ and $\mu\left( F^C \setminus A^C \right) < \epsilon$. Then $F$ is closed, and since $F^C \setminus A^C = F^C \cap A = A\setminus F$, we get $\mu\left( A \setminus F \right) < \epsilon$ with $F\subseteq A$.
@@ -56,7 +57,7 @@ $$\begin{aligned} \mu\left( G \setminus A \right) & = \mu\left( \bigcup_
The Borel $\sigma$-algebra can also be defined as the smallest $\sigma$-algebra containing the open sets of $\mathbb{R}^p$. If $O$ is the collection of open sets of $\mathbb{R}^p$, we define
$$\mathfrak{B} = \bigcap _ {O \subseteq G,\;G:\, \sigma\text{-algebra}} G.$$
Here 'smallest' is meant in the sense of set inclusion: for any $\sigma$-algebra $X$ satisfying the condition above, $\mathfrak{B} \subseteq X$; this is why the intersection is taken. From this definition we also see that $\mathfrak{B} \subseteq\mathfrak{M}(\mu)$.
@@ -66,23 +67,23 @@ $$\mathfrak{B} = \bigcap_ {O \subseteq G,\;G:\, \sigma\text{-algebra}} G$$
**Proposition.** If $A \in \mathfrak{M}(\mu)$, then there exist Borel sets $F$, $G$ with $F \subseteq A \subseteq G$. Moreover, $A$ can be written as the union of a Borel set and a $\mu$-measure zero set, and $A$ can be turned into a Borel set by taking its union with a suitable $\mu$-measure zero set.
**Proof.** Using the regularity on $\mathfrak{M}(\mu)$, choose open sets $G _ n$ and closed sets $F _ n$ satisfying
$$F _ n \subseteq A \subseteq G _ n, \quad \mu\left( G _ n \setminus A \right) < \frac{1}{n}, \quad \mu\left( A \setminus F _ n \right) < \frac{1}{n}.$$
Now define $F = \bigcup _ {n=1}^{\infty} F _ n$ and $G = \bigcap _ {n=1}^{\infty} G _ n$; then $F, G \in \mathfrak{B}$ and $F \subseteq A \subseteq G$.
Meanwhile, we can write $A = F \cup (A \setminus F)$ and $G = A \cup (G \setminus A)$. Since, as $n \rightarrow\infty$,
$$\left.\begin{array}{r}\mu\left( G \setminus A \right)\leq \mu\left( G _ n \setminus A \right) < \frac{1}{n} \\ \mu\left( A \setminus F \right) \leq \mu\left( A \setminus F _ n \right) < \frac{1}{n}\end{array}\right\rbrace \rightarrow 0$$
$A \in \mathfrak{M}(\mu)$ is the union of a Borel set and a $\mu$-measure zero set, and taking the union of $A \in \mathfrak{M}(\mu)$ with a suitable $\mu$-measure zero set yields a Borel set.
**Proposition.** For any measure $\mu$, the collection of $\mu$-measure zero sets is a $\sigma$-ring.
**Proof.** It suffices to check countable subadditivity; the rest is obvious. If $\mu\left( A _ n \right) = 0$ for every $n\in \mathbb{N}$, then
$$\mu\left( \bigcup _ {n=1}^{\infty} A _ n \right) \leq \sum _ {n=1}^{\infty} \mu\left( A _ n \right) = 0.$$
@@ -90,15 +91,15 @@ $$\mu\left( \bigcup_ {n=1}^{\infty} A_n \right) \leq \sum_ {n=1}^{\infty} \mu\le
**Proof.** Suppose $A$ is a countable set. Then $A$ is a countable union of points; each point is a closed subset of $\mathbb{R}^p$ with measure $0$, so $A$ is measurable (a union of closed sets) and $m(A) = 0$.
For the uncountable case, consider the Cantor set $P$. Define $E _ n$ as follows.
- $E _ 0 = [0, 1]$.
- $E _ 1 = \left[0, \frac{1}{3}\right] \cup \left[\frac{2}{3}, 1\right]$: the interval of $E _ 0$ split into thirds with the middle third removed.
- $E _ 2 = \left[0, \frac{1}{9}\right] \cup \left[\frac{2}{9}, \frac{3}{9}\right] \cup \left[\frac{6}{9}, \frac{7}{9}\right] \cup \left[\frac{8}{9}, 1\right]$: likewise, each interval of $E _ 1$ split into thirds with the middle thirds removed.
Repeating this process gives $E _ n$, and the Cantor set is defined as $P = \bigcap _ {n=1}^{\infty} E _ n$. Here $m(E _ n) = \left( \frac{2}{3} \right)^n$, and since $P \subseteq E _ n$, we have $m(P)\leq m(E _ n)$. Letting $n \rightarrow\infty$ gives $m(P) = 0$.
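The computation $m(E _ n) = (2/3)^n \rightarrow 0$ can be sketched directly: each stage keeps $2^n$ intervals of length $3^{-n}$.

```python
# Measure of the n-th stage of the Cantor construction: 2^n intervals,
# each of length 3^{-n}, so m(E_n) = (2/3)^n -> 0 as n grows.
def cantor_stage_measure(n):
    return 2 ** n * 3 ** -n

assert cantor_stage_measure(0) == 1.0   # E_0 = [0, 1]
assert cantor_stage_measure(1) == 2 / 3
assert cantor_stage_measure(50) < 1e-8  # already negligible
```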
**Remark.** $\mathfrak{M}(m) \subsetneq \mathcal{P}(\mathbb{R}^p)$: there exist subsets of $\mathbb{R}^p$ that are not measurable.[^2]

View File

@@ -5,6 +5,7 @@ math: true
categories:
- Mathematics
- Measure Theory
path: _posts/mathematics/measure-theory
tags:
- math
- analysis
@@ -13,14 +14,14 @@ title: 04. Measurable Functions
date: 2023-02-06
github_title: 2023-02-06-measurable-functions
image:
path: /assets/img/posts/mathematics/measure-theory/mt-04.png
attachment:
folder: assets/img/posts/mathematics/measure-theory
---
This is the final preparation before studying the Lebesgue integral, which is written as
$$\int _ X f \,d{\mu}$$
The notation has three main ingredients: the set $X$, the measure $\mu$, and the function $f$. Having covered sets and measures, we only need a short discussion of functions before we can define the Lebesgue integral!
@@ -54,7 +55,7 @@ $$\lbrace x \in X : f(x) > a\rbrace$$
**Proof.** First assume (1) and use the relation
$$\begin{aligned} \lbrace x : f(x) \geq a\rbrace & = f^{-1}\left( [a, \infty) \right) \\ & = f^{-1}\left( \bigcap _ {n=1}^{\infty} \left( a - \frac{1}{n}, \infty \right) \right) \\ & = \bigcap _ {n=1}^{\infty} f^{-1}\left( \left( a - \frac{1}{n}, \infty \right) \right) \end{aligned}$$
Since a countable intersection of measurable sets is measurable (a $\sigma$-algebra is closed under countable intersections), (2) holds. Now assuming (2),
@@ -62,7 +63,7 @@ $$\lbrace x : f(x) < a\rbrace = X \setminus\lbrace x : f(x) \geq a\rbrace$$
it follows that (3) holds. Assuming (3), the same method as above with
$$\begin{aligned} \lbrace x : f(x) \leq a\rbrace & = f^{-1}\left( (-\infty, a] \right) \\ & = f^{-1}\left( \bigcap _ {n=1}^{\infty} \left( -\infty, a + \frac{1}{n} \right) \right) \\ & = \bigcap _ {n=1}^{\infty} f^{-1}\left( \left( -\infty, a + \frac{1}{n} \right) \right) \end{aligned}$$
shows that (4) holds. Finally, assuming (4),
@@ -102,23 +103,23 @@ $$\begin{aligned} \lbrace x : \max\lbrace f, g\rbrace > a\rbrace & = \lbr
Next we consider sequences of functions. Is the limit of a sequence of measurable functions also measurable?
**Theorem.** Let $\lbrace f _ n\rbrace$ be a sequence of measurable functions. Then
$$\sup _ {n\in \mathbb{N}} f _ n, \quad \inf _ {n\in \mathbb{N}} f _ n, \quad \limsup _ {n \rightarrow\infty} f _ n, \quad \liminf _ {n \rightarrow\infty} f _ n$$
are all measurable.
**Proof.** The following identities hold.
$$\inf f _ n = -\sup\left( -f _ n \right), \quad \limsup f _ n = \inf _ n \sup _ {k\geq n} f _ k, \quad \liminf f _ n = -\limsup\left( -f _ n \right).$$
Hence it suffices to prove the claim for $\sup f _ n$. That $\sup f _ n$ is a measurable function is immediate from
$$\lbrace x : \sup _ {n\in\mathbb{N}} f _ n(x) > a\rbrace = \bigcup _ {n=1}^{\infty} \lbrace x : f _ n(x) > a\rbrace \in \mathscr{F}.$$
When $\lim f _ n$ exists, we have $\lim f _ n = \limsup f _ n = \liminf f _ n$, so the theorem above tells us that measurability is preserved under limits!
**Corollary.** The limit of a convergent sequence of measurable functions is measurable.
@@ -126,13 +127,13 @@ $\lim f_n$이 존재하는 경우, 위 명제를 이용하면 $\lim f_n = \limsu
**Theorem.** Let $f, g$ be measurable real functions defined on $X$. For any continuous function $F: \mathbb{R}^2 \rightarrow\mathbb{R}$, the function $h(x) = F\big(f(x), g(x)\big)$ is measurable. In particular, $f + g$ and $fg$ are measurable.[^2]
**Proof.** For $a \in \mathbb{R}$, define $G _ a = \lbrace (u, v)\in \mathbb{R}^2 : F(u, v) > a\rbrace$. Since $F$ is continuous, $G _ a$ is open, so $G _ a$ can be written as a countable union of open rectangles: for some $a _ n, b _ n, c _ n, d _ n\in \mathbb{R}$, writing
$$G _ a = \displaystyle\bigcup _ {n=1}^{\infty} (a _ n, b _ n) \times (c _ n, d _ n)$$
we obtain
$$\begin{aligned} \lbrace x \in X : F\bigl(f(x), g(x)\bigr) > a\rbrace = & \lbrace x \in X : \bigl(f(x), g(x)\bigr) \in G _ a\rbrace \\ = & \bigcup _ {n=1}^{\infty} \lbrace x \in X : a _ n < f(x) < b _ n,\, c _ n < g(x) < d _ n\rbrace \\ = & \bigcup _ {n=1}^{\infty} \lbrace x \in X : a _ n < f(x) < b _ n\rbrace \cap \lbrace x \in X : c _ n < g(x) < d _ n\rbrace \end{aligned}$$
Since $f, g$ are measurable, $\lbrace x \in X : F\bigl(f(x), g(x)\bigr) > a\rbrace$ is measurable. Taking $F(x, y) = x + y$ and $F(x, y) = xy$ shows that $f+g$ and $fg$ are measurable.
@@ -140,11 +141,11 @@ $$\begin{aligned} \lbrace x \in X : F\bigl(f(x), g(x)\bigr) > a\rbrace =
The material below is a very important building block used in the definition of the Lebesgue integral.
**Definition.** (Characteristic Function) The **characteristic function** $\chi _ E$ of a set $E \subseteq X$ is defined by
$$\chi _ E(x) = \begin{cases} 1 & (x\in E) \\ 0 & (x \notin E). \end{cases}$$
The characteristic function is also called the indicator function, among other names, and is sometimes written $\mathbf{1} _ E$ or $K _ E$.
## Simple Function ## Simple Function
@@ -152,63 +153,63 @@ $$\chi_E(x) = \begin{cases} 1 & (x\in E) \\ 0 & (x \notin E). \end{cas
Using the fact that its range is a finite set, a simple function can be written as follows.
**Remark.** Enumerate the range as $s(X) = \lbrace c _ 1, c _ 2, \dots, c _ n\rbrace$. Setting $E _ i = s^{-1}(c _ i)$, we can write
$$s(x) = \sum _ {i=1}^{n} c _ i \chi _ {E _ i}(x).$$
This shows that every simple function is a linear combination of characteristic functions. Of course, the $E _ i$ are pairwise disjoint.
If, in addition, each $E _ i$ is measurable, then by definition each $\chi _ {E _ i}$ is a measurable function. Hence every measurable simple function can be expressed as a linear combination of measurable $\chi _ {E _ i}$.
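To make the decomposition concrete, here is a small numerical sketch (a made-up finite $X$ and simple function; the helper names `chi` and `E` are ours, not from the notes):

```python
# Toy illustration: a simple function on a finite set X, decomposed as
# s = sum_i c_i * chi_{E_i} with E_i = s^{-1}(c_i). Values are made up.

X = list(range(10))
s = {x: x % 3 for x in X}            # simple function with range {0, 1, 2}

values = sorted(set(s.values()))     # c_1, ..., c_n: the (finite) range
E = {c: {x for x in X if s[x] == c} for c in values}   # E_i = s^{-1}(c_i)

def chi(A, x):
    """Characteristic function of the set A."""
    return 1 if x in A else 0

# s(x) == sum_i c_i * chi_{E_i}(x) for every x in X
assert all(s[x] == sum(c * chi(E[c], x) for c in values) for x in X)

# The E_i are pairwise disjoint and cover X
assert all(E[c] & E[d] == set() for c in values for d in values if c != d)
assert set().union(*E.values()) == set(X)
```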
![mt-04.png](../../../assets/img/posts/mathematics/measure-theory/mt-04.png)
The theorem below shows why simple functions serve as the building blocks of the Lebesgue integral: every function can be approximated by simple functions.
**Theorem.** Let $f : X \rightarrow\overline{\mathbb{R}}$. Then there exists a sequence of simple functions $s _ n$ such that, for every $x \in X$,
$$\lim _ {n \rightarrow\infty} s _ n(x) = f(x), \quad \lvert s _ n(x) \rvert \leq \lvert f(x) \rvert.$$
Moreover:
1. If $f$ is bounded, then $s _ n$ converges to $f$ uniformly.
2. If $f\geq 0$, the sequence $s _ n$ can be chosen increasing, with $\displaystyle\sup _ {n\in \mathbb{N}} s _ n = f$.
3. **If $f$ is measurable, the $s _ n$ can be chosen to be measurable simple functions.**
**Proof.** First consider the case $f \geq 0$. For $n \in \mathbb{N}$, define the sets $E _ {n, i}$ by
$$E _ {n, i} = \begin{cases} \left\lbrace x : \dfrac{i}{2^n} \leq f(x) < \dfrac{i+1}{2^n}\right\rbrace & (i = 0, 1, \dots, n\cdot 2^n - 1) \\ \lbrace x : f(x) \geq n\rbrace & (i = n\cdot 2^n) \end{cases}$$
Using these, set
$$s _ n(x) = \sum _ {i=0}^{n\cdot 2^n} \frac{i}{2^n} \chi _ {E _ {n, i}} (x).$$
Then each $s _ n$ is a simple function. From the definitions of $E _ {n, i}$ and $s _ n$ we immediately get $s _ n(x) \leq f(x)$, and also $\lvert f(x) - s _ n(x) \rvert \leq 2^{-n}$ for $x \in \lbrace x : f(x) < n\rbrace$. Even if there is a region where $f(x) \rightarrow\infty$, this causes no problem: for sufficiently large $n$, on $\lbrace x : f(x) \geq n\rbrace$ we have $s _ n(x) = n \rightarrow\infty$. Therefore
$$\lim _ {n \rightarrow\infty} s _ n(x) = f(x) \quad (x \in X).$$
For (1), suppose $f$ is bounded, say $f(x) < M$ for some $M > 0$. Then $\lbrace x : f(x) < n\rbrace = X$ for sufficiently large $n$, so for all $x \in X$
$$\lvert f(x) - s _ n(x) \rvert \leq 2^{-n},$$
which shows that $s _ n$ converges uniformly to $f$.
For (2), the sequence $s _ n$ is increasing by its definition; the condition $f \geq 0$ is clearly needed here. Since $s _ n(x) \leq s _ {n+1}(x)$, clearly $\displaystyle\sup _ {n\in \mathbb{N}} s _ n = f$.
For (3), if $f$ is measurable then each $E _ {n, i}$ is measurable, so $s _ n$ is a sequence of measurable simple functions.
For general $f$, write $f = f^+ - f^-$.[^3] By what we proved above, we can choose simple functions $g _ n, h _ n$ with $g _ n \rightarrow f^+$ and $h _ n \rightarrow f^-$. Setting $s _ n = g _ n - h _ n$, we get $\lvert s _ n(x) \rvert \leq \lvert f(x) \rvert$ and $s _ n \rightarrow f$.
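The dyadic construction $s _ n$ from the proof can be sketched numerically. This is an illustrative check, not part of the notes; the function $f(x) = x^2$ and the grid are arbitrary choices:

```python
import math

def s_n(f_x: float, n: int) -> float:
    """Dyadic approximation from the proof: s_n takes the value i/2^n on
    E_{n,i}, i.e. floor(2^n f(x))/2^n where f(x) < n, and n where f(x) >= n."""
    if f_x >= n:
        return n
    return math.floor(f_x * 2 ** n) / 2 ** n

f = lambda x: x * x                  # a bounded nonnegative function on [0, 2]
xs = [k / 100 for k in range(201)]

for n in range(1, 8):
    for x in xs:
        lo, hi = s_n(f(x), n), s_n(f(x), n + 1)
        assert lo <= f(x)            # s_n <= f
        assert lo <= hi              # increasing in n (f >= 0 case)
        if f(x) < n:
            assert f(x) - lo <= 2 ** (-n)   # |f - s_n| <= 2^{-n} on {f < n}
```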
This theorem makes it easy to prove that $f + g$ and $fg$ are measurable, provided $f+g$ and $fg$ are well defined, i.e. no situation like $\infty - \infty$ arises.
**Corollary.** If $f, g$ are measurable and $f + g$, $fg$ are well defined, then $f+g$ and $fg$ are measurable.
**Proof.** Approximate $f, g$ by measurable simple functions $f _ n, g _ n$. Then
$$f _ n + g _ n \rightarrow f + g, \quad f _ ng _ n \rightarrow fg,$$
and since measurability is preserved by limits, $f+g$ and $fg$ are measurable.


@@ -5,6 +5,7 @@ math: true
categories:
- Mathematics
- Measure Theory
path: _posts/mathematics/measure-theory
tags:
- math
- analysis
@@ -13,9 +14,9 @@ title: 05. Lebesgue Integration
date: 2023-02-13
github_title: 2023-02-13-lebesgue-integration
image:
  path: /assets/img/posts/mathematics/measure-theory/mt-05.png
attachment:
  folder: assets/img/posts/mathematics/measure-theory
---
## Lebesgue Integration
@@ -24,11 +25,11 @@ attachment:
For $E \in \mathscr{F}$, one way to define the integral over $E$ is to set
$$\mathscr{F} _ E = \lbrace A \cap E : A \in \mathscr{F}\rbrace, \quad \mu _ E = \mu| _ {\mathscr{F} _ E}$$
and define the integral on $(X, \mathscr{F} _ E, \mu _ E)$ with $\int = \int _ E$. But this is unnecessary: it suffices to take $\int = \int _ X$ and define
$$\int _ E f \,d{\mu} = \int f \chi _ E \,d{\mu}.$$
@@ -40,23 +41,23 @@ $$\int_E f \,d{\mu} = \int f \chi _E \,d{\mu}$$
**(Step 1)** For $A \in \mathscr{F}$, define
$$\int \chi _ A \,d{\mu} = \mu(A).$$
Since the function $\chi _ A$ takes the value $1$ only when $x \in A$ and $0$ otherwise, it is natural that integrating it over $X$ gives $\mu(A)$, which corresponds to the 'length' of $A$.
### Step 2. For Positive Measurable Simple Functions
Next we define the integral for measurable simple functions taking nonnegative values. Since both $f^+$ and $f^-$ in $f = f^+ - f^-$ are nonnegative, we handle this case first.
**(Step 2)** Let $f: X \rightarrow[0, \infty)$ be a measurable simple function. Then we can choose pairwise disjoint sets $\left( A _ k \right) _ {k=1}^n$ with $A _ k \in \mathscr{F}$ and numbers $\left( a _ k \right) _ {k=1}^n$ with $a _ k \in [0, \infty)$ such that
$$f(x) = \sum _ {k=1}^n a _ k \chi _ {A _ k}.$$
Now define
$$\int f\,d{\mu} = \sum _ {k=1}^n a _ k \mu(A _ k) \in [0, \infty].$$
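A toy sketch of the Step 2 formula on a finite measure space (point masses and function values are made up; exact arithmetic via `fractions` avoids rounding):

```python
from fractions import Fraction

# Toy measure space: X finite, mu given by point masses (made-up values).
mu_point = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1), 3: Fraction(2)}
X = set(mu_point)

def mu(A):
    return sum(mu_point[x] for x in A)

def integral_simple(f):
    """Step 2: for f = sum_k a_k chi_{A_k} with A_k = f^{-1}(a_k) disjoint,
    define  integral f dmu = sum_k a_k mu(A_k)."""
    return sum(a * mu({x for x in X if f[x] == a}) for a in set(f.values()))

f = {0: 2, 1: 0, 2: 2, 3: 5}   # a nonnegative simple function

# Agrees with the pointwise sum over this finite space, as expected:
assert integral_simple(f) == sum(f[x] * mu_point[x] for x in X)
print(integral_simple(f))       # → 13
```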
@@ -68,17 +69,17 @@ Well-definedness를 증명하기 위해 임의의 linear combination을 잡아
**Proof.** Suppose $f$ is represented in two ways:
$$f(x) = \sum _ {k=1}^n a _ k \chi _ {A _ k} = \sum _ {i=1}^m b _ i \chi _ {B _ i}.$$
Here $0\leq a _ k, b _ i < \infty$ and $A _ k, B _ i \in \mathscr{F}$ for $k = 1, \dots, n$ and $i = 1, \dots, m$, and each of the families $A _ k$ and $B _ i$ is pairwise disjoint and partitions $X$. Setting $C _ {k, i} = A _ k \cap B _ i$, we get
$$\sum _ {k=1}^n a _ k \mu(A _ k) = \sum _ {k=1}^n a _ k \mu\left( A _ k \cap \bigcup _ {i=1}^m B _ i \right) = \sum _ {k=1}^n \sum _ {i=1}^m a _ k \mu(C _ {k, i}),$$
$$\sum _ {i=1}^m b _ i \mu(B _ i) = \sum _ {i=1}^{m} b _ i \mu\left( B _ i \cap \bigcup _ {k=1}^n A _ k \right)= \sum _ {i=1}^m \sum _ {k=1}^n b _ i \mu(C _ {k, i}).$$
Now if $C _ {k, i} \neq \varnothing$, then $f(x) = a _ k = b _ i$ for $x \in C _ {k, i}$; if $C _ {k, i} = \varnothing$, then $\mu(C _ {k, i}) = 0$. Hence $b _ i \mu(C _ {k, i}) = a _ k \mu(C _ {k, i})$ for all $k, i$.[^1] Therefore
$$\int f \,d{\mu }= \sum _ {k=1}^n a _ k \mu(A _ k) = \sum _ {i=1}^m b _ i \mu(B _ i),$$
so the value of the integral is unique and the definition above is well-defined.
@@ -94,11 +95,11 @@ $$\int \left( af + bg \right) \,d{\mu} = a \int f \,d{\mu} + b \int g \,d{\mu}$$
**Proof.** As in Step 2 above, we may write
$$f = \sum _ {j=1}^m y _ j \chi _ {A _ j}, \quad g = \sum _ {k=1}^n z _ k \chi _ {B _ k},$$
where the $A _ j$ and the $B _ k$ are partitions of $X$ and $y _ j, z _ k \geq 0$. Again setting $C _ {j, k} = A _ j \cap B _ k$,
$$\begin{aligned} a \int f \,d{\mu} + b \int g \,d{\mu} & = \sum _ {j} ay _ j \mu(A _ j) + \sum _ k b z _ k \mu(B _ k) \\ & = \sum _ {j} ay _ j \sum _ k \mu(A _ j \cap B _ k) + \sum _ k b z _ k \sum _ j \mu(B _ k \cap A _ j) \\ & = \sum _ {j} \sum _ k ay _ j \mu(C _ {j, k}) + \sum _ k \sum _ j b z _ k \mu(C _ {j, k}) \\ & = \sum _ {j, k} (ay _ j + bz _ k) \mu(C _ {j, k}) = \int \left( af + bg \right) \,d{\mu}. \end{aligned}$$
@@ -126,11 +127,11 @@ $$\int f \,d{\mu} = \sup\left\lbrace \int h \,d{\mu}: 0\leq h \leq f, h \text{ m
This takes the supremum of the integrals of measurable simple functions lying below $f$; equivalently, we approximate $f$ from below by measurable simple functions. Note that when $f$ is itself a simple function, this agrees with the definition in Step 2.
![mt-05.png](../../../assets/img/posts/mathematics/measure-theory/mt-05.png)
We showed previously that if $f \geq 0$ is measurable, there is an increasing sequence of measurable simple functions $s _ n$. Computing the integrals of these $s _ n$,
$$\int _ E s _ n \,d{\mu} = \sum _ {i=1}^{n2^n} \frac{i - 1}{2^n}\mu\left( \left\lbrace x \in E : \frac{i-1}{2^n} \leq f(x) < \frac{i}{2^n}\right\rbrace \right) + n\mu(\lbrace x \in E : f(x)\geq n\rbrace).$$
We now expect the right-hand side to converge to $\displaystyle\int f \,d{\mu}$ as $n \rightarrow\infty$.
@@ -142,7 +143,7 @@ $$\int_E s_n \,d{\mu} = \sum_ {i=1}^{n2^n} \frac{i - 1}{2^n}\mu\left( \left\lbra
**(Step 4)** If $f$ is measurable, then so are $f^+, f^- \geq 0$. So for $E \in \mathscr{F}$ we define
$$\int _ E f \,d{\mu} = \int _ E f^+ \,d{\mu} - \int _ E f^- \,d{\mu},$$
except when $\infty - \infty$ appears on the right-hand side.
@@ -154,7 +155,7 @@ $$\int_E f \,d{\mu} = \int_E f^+ \,d{\mu} - \int_E f^- \,d{\mu}$$
**Definition.** (Lebesgue Integrable) If $f$ is measurable and
$$\int _ E \lvert f \rvert \,d{\mu} = \int _ E f^+ \,d{\mu} + \int _ E f^- \,d{\mu} < \infty,$$
then $f$ is said to be **Lebesgue integrable** on $E$ with respect to $\mu$.


@@ -0,0 +1,206 @@
---
share: true
toc: true
math: true
categories:
- Mathematics
- Measure Theory
path: _posts/mathematics/measure-theory
tags:
- math
- analysis
- measure-theory
title: 06. Convergence Theorems
date: 2023-03-25
github_title: 2023-03-25-convergence-theorems
image:
  path: /assets/img/posts/mathematics/measure-theory/mt-06.png
attachment:
  folder: assets/img/posts/mathematics/measure-theory
---
We now cover the convergence theorems that are used constantly in Lebesgue integration theory. With these theorems, many useful results follow easily.
## Monotone Convergence Theorem
First, the monotone convergence theorem (MCT). The assumption $f _ n \geq 0$ is essential here.
![mt-06.png](../../../assets/img/posts/mathematics/measure-theory/mt-06.png)
**Theorem.** (Monotone Convergence Theorem) Let $f _ n: X \rightarrow[0, \infty]$ be measurable with $f _ n(x) \leq f _ {n+1}(x)$ for all $x \in X$. If we set
$$\lim _ {n\rightarrow\infty} f _ n(x) = \sup _ {n} f _ n(x) = f(x),$$
then
$$\int f \,d{\mu} = \lim _ {n\rightarrow\infty} \int f _ n \,d{\mu} = \sup _ {n \in \mathbb{N}} \int f _ n \,d{\mu}.$$
**Proof.**
($\geq$) Since $f _ n(x) \leq f(x)$, monotonicity gives $\displaystyle\int f _ n \,d{\mu} \leq \displaystyle\int f \,d{\mu}$ for every $n \in \mathbb{N}$. Hence
$$\sup _ n \int f _ n \,d{\mu} \leq \int f \,d{\mu}.$$
($\leq$) Fix a real number $c \in (0, 1)$; at the end we will let $c \nearrow 1$. Let $s$ be a measurable simple function with $0 \leq s \leq f$. Then $c \cdot s(x) < f(x)$ for every $x \in X$ with $f(x) > 0$.
Now set
$$E _ n = \lbrace x \in X : f _ n(x) \geq cs(x)\rbrace.$$
Since $f _ n(x) - cs(x)$ is a measurable function, each $E _ n$ is measurable. As $f _ n$ is increasing, $E _ n\subseteq E _ {n+1} \subseteq\cdots$, and since $f _ n \rightarrow f$ we have $\bigcup _ {n=1}^\infty E _ n = X$.
Thus for each $x$, taking $n$ sufficiently large gives $f(x) \geq f _ n(x) \geq cs(x)$. Since $f _ n \geq f _ n \chi _ {E _ n} \geq cs \chi _ {E _ n}$,
$$\tag{\(\star\)} \int f _ n \,d{\mu} \geq \int f _ n \chi _ {E _ n} \,d{\mu} \geq c\int s \chi _ {E _ n} \,d{\mu},$$
where $s$ and $\chi _ {E _ n}$ are simple functions. Writing $s = \sum _ {k=0}^m y _ k \chi _ {A _ k}$, we get
$$s\chi _ {E _ n} = \sum _ {k=0}^m y _ k \chi _ {A _ k\cap E _ n} \implies \int s \chi _ {E _ n} \,d{\mu} = \sum _ {k=0}^m y _ k \mu(A _ k\cap E _ n).$$
Since $A _ k\cap E _ n \nearrow A _ k$ as $n\rightarrow\infty$, continuity of measure gives $\mu(A _ k \cap E _ n) \nearrow \mu(A _ k)$, and hence
$$\lim _ {n\rightarrow\infty} \int s \chi _ {E _ n}\,d{\mu} = \int s \,d{\mu}.$$
Now using ($\star$),
$$\lim _ {n\rightarrow\infty} \int f _ n \,d{\mu} \geq c\int s \,d{\mu},$$
so letting $c \nearrow 1$ and taking the supremum over all simple $0\leq s\leq f$,
$$\lim _ {n\rightarrow\infty} \int f _ n \,d{\mu} \geq \sup _ {0\leq s\leq f} \int s \,d{\mu} = \int f \,d{\mu},$$
which is the desired result.
**Remark.** If the inequality $0 \leq f _ n \leq f _ {n+1}$ holds only on a subset $E$ of the domain rather than on all of it, we can argue as follows:
$$0 \leq f _ n \chi _ E \leq f _ {n+1} \chi _ E \nearrow f \chi _ E.$$
Hence the monotone convergence theorem also holds on $E$:
> If $0\leq f _ n \leq f _ {n+1} \nearrow f$ on $E$, then $\displaystyle\lim _ {n\rightarrow\infty} \int _ E f _ n \,d{\mu} = \int _ E f \,d{\mu}$.
**Remark.** The theorem holds only for increasing sequences. For decreasing sequences, consider the counterexample $f _ n = \chi _ {[n, \infty)}$, for which $\chi _ {[n, \infty)} \searrow 0$ as $n \rightarrow\infty$.
Then, with respect to Lebesgue measure $m$,
$$\infty = \int \chi _ {[n, \infty)} \,d{m} \neq \int 0 \,d{m} = 0,$$
so the monotone convergence theorem fails here.
---
Previously we showed that if $f \geq 0$ is measurable, there is an increasing sequence of measurable simple functions $s _ n$, and computing their integrals gave
$$\int _ E s _ n \,d{\mu} = \sum _ {i=1}^{n2^n} \frac{i - 1}{2^n}\mu\left( \left\lbrace x \in E : \frac{i-1}{2^n} \leq f(x) < \frac{i}{2^n}\right\rbrace \right) + n\mu(\lbrace x \in E : f(x)\geq n\rbrace).$$
Since
$$f(x) = \displaystyle\lim _ {n\rightarrow\infty} s _ n(x),$$
the monotone convergence theorem now gives
$$\int _ E f \,d{\mu} = \lim _ {n\rightarrow\infty} \int _ E s _ n \,d{\mu},$$
which is exactly the result we hoped for. As explained before, this means the Lebesgue integral can be understood as computing area by slicing the range into fine pieces.
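The monotone convergence theorem can be sanity-checked on counting measure, where integrals are plain sums. The truncation sequence $f _ n = f\chi _ {\lbrace 1,\dots,n\rbrace}$ below is our own illustrative choice:

```python
from fractions import Fraction

# Counting measure on the positive integers; f(k) = 2^-k, so ∫ f dμ = 1.
f = lambda k: Fraction(1, 2 ** k)

def integral(g, support):
    # Integral against counting measure = plain sum over the support.
    return sum(g(k) for k in support)

# f_n = f * chi_{1..n} increases pointwise to f, and MCT says the
# integrals must increase to ∫ f dμ = 1.
prev = Fraction(0)
for n in range(1, 30):
    cur = integral(f, range(1, n + 1))    # ∫ f_n dμ = 1 - 2^-n
    assert prev <= cur <= 1               # monotone, bounded by ∫ f dμ
    assert cur == 1 - Fraction(1, 2 ** n)
    prev = cur
```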
---
The following is an example of how the monotone convergence theorem yields useful results easily.
**Remark.** For measurable functions $f, g \geq 0$ and $\alpha, \beta \in [0, \infty)$,
$$\int _ E \left( \alpha f + \beta g \right) \,d{\mu} = \alpha \int _ E f \,d{\mu} + \beta \int _ E g\,d{\mu}.$$
**Proof.** Measurable functions can be approximated by measurable simple functions, and since $f, g \geq 0$ the approximating sequences can be chosen increasing. So choose measurable simple functions $f _ n$, $g _ n$ with $0 \leq f _ n \leq f _ {n+1} \nearrow f$ and $0 \leq g _ n \leq g _ {n+1} \nearrow g$.
Then $\alpha f _ n + \beta g _ n \nearrow \alpha f + \beta g$, and $\alpha f _ n + \beta g _ n$ is an increasing sequence of measurable simple functions. By the monotone convergence theorem,
$$\int _ E \left( \alpha f _ n + \beta g _ n \right) \,d{\mu} = \alpha \int _ E f _ n \,d{\mu} + \beta \int _ E g _ n \,d{\mu} \rightarrow\alpha \int _ E f \,d{\mu} + \beta \int _ E g\,d{\mu}.$$
A similar argument applies to series.
**Theorem.** For measurable functions $f _ n: X \rightarrow[0, \infty]$, the function $\sum _ {n=1}^\infty f _ n$ is measurable and, by the monotone convergence theorem,
$$\int _ E \sum _ {n=1}^\infty f _ n \,d{\mu} = \sum _ {n=1}^\infty \int _ E f _ n \,d{\mu}.$$
**Proof.** $\sum _ {n=1}^\infty f _ n$ is a limit of measurable functions, hence measurable. Viewing the series as the limit of its partial sums, the partial sums form an increasing sequence since $f _ n \geq 0$. Applying the monotone convergence theorem gives the conclusion.
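For counting measure, this theorem is exactly a swap of two summations; here is a truncated numerical sketch (the limits `N` and `K` are arbitrary choices of ours):

```python
from fractions import Fraction

# f_n(k) = 2^-n * 2^-k on the positive integers with counting measure.
# The theorem says ∫ Σ_n f_n dμ = Σ_n ∫ f_n dμ; for counting measure this
# is a swap of the two summations, checked here on truncations.
N = 20
K = range(1, 25)

lhs = sum(                 # ∫ (Σ_n f_n) dμ: sum over k of the inner series
    sum(Fraction(1, 2 ** n * 2 ** k) for n in range(1, N + 1)) for k in K
)
rhs = sum(                 # Σ_n ∫ f_n dμ: integrate each f_n, then sum
    sum(Fraction(1, 2 ** n * 2 ** k) for k in K) for n in range(1, N + 1)
)
assert lhs == rhs
```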
## Fatou's Lemma
We now introduce a convergence theorem equivalent to the monotone convergence theorem, known as Fatou's lemma.
**Theorem.** (Fatou) Let $f _ n \geq 0$ be measurable and $E$ measurable. Then
$$\int _ E \liminf _ {n\rightarrow\infty} f _ n \,d{\mu} \leq \liminf _ {n\rightarrow\infty} \int _ E f _ n \,d{\mu}.$$
**Proof.** Set $g _ n = \displaystyle\inf _ {k \geq n} f _ k$, so that $\displaystyle\lim _ {n \rightarrow\infty} g _ n = \liminf _ {n\rightarrow\infty} f _ n$. It is easy to check that the $g _ n$ are increasing, and $g _ n \geq 0$. By definition, $g _ n \leq f _ k$ for every $k \geq n$, so
$$\int _ E g _ n \,d{\mu} \leq \inf _ {k\geq n} \int _ E f _ k \,d{\mu}.$$
Letting $n \rightarrow\infty$,
$$\int _ E \liminf _ {n\rightarrow\infty} f _ n \,d{\mu} = \lim _ {n \rightarrow\infty} \int _ E g _ n \,d{\mu} \leq \lim _ {n \rightarrow\infty} \inf _ {k \geq n}\int _ E f _ k \,d{\mu} = \liminf _ {n \rightarrow\infty} \int _ E f _ n \,d{\mu},$$
where the first equality holds by the monotone convergence theorem.
**Remark.** The proof above used the monotone convergence theorem. Conversely, Fatou's lemma can be used to prove the monotone convergence theorem, so the two are equivalent. We omit that proof.
**Remark.** One might expect an analogous statement for $\limsup$, namely
$$\int _ E \limsup _ {n \rightarrow\infty} f _ n \,d{\mu} \geq \limsup _ {n \rightarrow\infty} \int _ E f _ n \,d{\mu}.$$
Unfortunately, this fails. As a counterexample we can reuse $\chi _ {[n, \infty)}$ from above: the left-hand side is $0$ while the right-hand side is $\infty$. As we will see later, the inequality does hold when there exists $g \in \mathcal{L}^{1}$ with $\lvert f _ n \rvert \leq g$.
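A numerical sketch of the failure, using a different standard counterexample with finite integrals (a "moving bump" $f _ n = \chi _ {\lbrace n\rbrace}$ on counting measure; this example is ours, the notes use $\chi _ {[n,\infty)}$):

```python
# "Moving bump" f_n = chi_{{n}} on the integers with counting measure:
# each integral is 1, but f_n -> 0 pointwise, so Fatou's inequality is
# strict and the limsup version fails.
support = range(50)

def f(n, k):
    return 1 if k == n else 0           # f_n = characteristic function of {n}

integrals = [sum(f(n, k) for k in support) for n in range(50)]
assert all(I == 1 for I in integrals)   # ∫ f_n dμ = 1 for every n

# liminf_n f_n(k) = 0 for every fixed k (f_n(k) = 0 once n > k)
liminf_f = [min(f(n, k) for n in range(k + 1, 50)) for k in range(40)]
assert all(v == 0 for v in liminf_f)
# So ∫ liminf f_n dμ = 0 < 1 = liminf ∫ f_n dμ (Fatou can be strict), and
# ∫ limsup f_n dμ = 0 < 1 = limsup ∫ f_n dμ: the reversed inequality fails.
```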
## Properties of the Lebesgue Integral
We close by presenting a few properties of the Lebesgue integral.
1. If $f$ is measurable and bounded on $E$ with $\mu(E) < \infty$, then $\lvert f \rvert \leq M$ for some real $M > 0$, so
$$\int _ E \lvert f \rvert \,d{\mu} \leq \int _ E M \,d{\mu} = M\mu(E) < \infty.$$
Hence $f \in \mathcal{L}^{1}(E, \mu)$: under the assumption that $E$ has finite measure, every bounded function is Lebesgue integrable.
2. If $f, g \in \mathcal{L}^{1}(E, \mu)$ and $f \leq g$ on $E$, we wish to show monotonicity. Earlier we proved monotonicity only for $0 \leq f \leq g$; we now extend it to functions that may take negative values. Splitting into positive and negative parts, we can write
$$\chi _ E (x) f^+(x) \leq \chi _ E(x) g^+(x), \qquad \chi _ E(x) g^-(x) \leq \chi _ E (x) f^-(x),$$
from which
$$\int _ E f^+ \,d{\mu} \leq \int _ E g^+ \,d{\mu} < \infty, \qquad \int _ E g^- \,d{\mu} \leq \int _ E f^- \,d{\mu} < \infty.$$
Therefore
$$\int _ E f\,d{\mu} \leq \int _ E g \,d{\mu}$$
holds, so monotonicity also holds for functions taking negative values.
3. If $f \in \mathcal{L}^{1}(E, \mu)$ and $c \in \mathbb{R}$, then $cf \in \mathcal{L}^{1}(E, \mu)$, since
$$\int _ E \lvert c \rvert\lvert f \rvert \,d{\mu} = \lvert c \rvert \int _ E \lvert f \rvert\,d{\mu} < \infty.$$
Since the integral exists, we would like homogeneity to hold when computing its value. Earlier we proved this only for nonnegative scalars, and we extend it likewise; only the case $c < 0$ needs proof. In that case $(cf)^+ = -cf^-$ and $(cf)^- = -cf^+$, so
$$\int _ E cf \,d{\mu} = \int _ E (cf)^+ \,d{\mu} - \int _ E (cf)^- \,d{\mu} = -c \int _ E f^- \,d{\mu} - (-c) \int _ E f^+ \,d{\mu} = c\int _ E f\,d{\mu}.$$
4. For a measurable function $f$ with $a \leq f(x) \leq b$ on $E$ and $\mu(E) < \infty$,
$$\int _ E a \chi _ E \,d{\mu} \leq \int _ E f\chi _ E \,d{\mu} \leq \int _ E b \chi _ E \,d{\mu} \implies a \mu(E) \leq \int _ E f \,d{\mu} \leq b \mu(E).$$
That $f$ is Lebesgue integrable here uses the fact that $f$ is bounded.
5. If $f \in \mathcal{L}^{1}(E, \mu)$ and a measurable set $A \subseteq E$ is given, then $f$ is also Lebesgue integrable on the subset $A$, as the following inequality shows:
$$\int _ A \lvert f \rvert \,d{\mu} \leq \int _ E \lvert f \rvert\,d{\mu} < \infty.$$
6. What happens when we integrate over a set of measure zero? Let $\mu(E) = 0$ and let $f$ be measurable. Here we use the fact that $\min\lbrace \lvert f \rvert, n\rbrace\chi _ E$ is measurable and that $\min\lbrace \lvert f \rvert, n\rbrace\chi _ E \nearrow \lvert f \rvert\chi _ E$ as $n \rightarrow\infty$. Applying the monotone convergence theorem,
$$\begin{aligned} \int _ E \lvert f \rvert \,d{\mu} &= \lim _ {n \rightarrow\infty} \int _ E \min\lbrace \lvert f \rvert, n\rbrace \,d{\mu} \\ &\leq \lim _ {n \rightarrow\infty} \int _ E n \,d{\mu} = \lim _ {n \rightarrow\infty} n\mu(E) = 0. \end{aligned}$$
Hence $f \in \mathcal{L}^{1}(E, \mu)$ and $\displaystyle\int _ E f \,d{\mu} = 0$: integrating over a set of measure zero gives $0$.[^1]
[^1]: Since we defined $0\cdot\infty = 0$ for convenience, this holds even when $f \equiv \infty$.


@@ -2,15 +2,21 @@
share: true
toc: true
math: true
categories:
- Mathematics
- Measure Theory
path: _posts/mathematics/measure-theory
tags:
- math
- analysis
- measure-theory
title: 07. Dominated Convergence Theorem
date: 2023-04-07
github_title: 2023-04-07-dominated-convergence-theorem
image:
  path: /assets/img/posts/mathematics/measure-theory/mt-07.png
attachment:
  folder: assets/img/posts/mathematics/measure-theory
---
## Almost Everywhere
@@ -27,17 +33,17 @@ attachment:
**Theorem.** (Markov's Inequality) Let $u \in \mathcal{L}^{1}(E, \mu)$. Then for every $c > 0$,
$$\mu\left( \lbrace \lvert u \rvert \geq c\rbrace \cap E \right) \leq \frac{1}{c} \int _ E \lvert u \rvert \,d{\mu}.$$
**Proof.** $\displaystyle\int _ E \lvert u \rvert \,d{\mu} \geq \int _ {E\cap \lbrace \lvert u \rvert\geq c\rbrace} \lvert u \rvert \,d{\mu} \geq \int _ {E\cap \lbrace \lvert u \rvert\geq c\rbrace} c \,d{\mu} = c \mu\left( \lbrace \lvert u \rvert \geq c\rbrace \cap E \right)$.
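A quick numerical check of Markov's inequality on a toy finite measure space (random point masses and function values, our own illustrative data):

```python
import random

# Numeric check of Markov's inequality mu({|u| >= c}) <= (1/c) ∫ |u| dmu
# on a toy finite measure space with random point masses (made-up data).
random.seed(0)
X = range(200)
mass = {x: random.random() for x in X}       # mu({x})
u = {x: random.uniform(-5, 5) for x in X}    # an integrable function

int_abs_u = sum(abs(u[x]) * mass[x] for x in X)   # ∫ |u| dmu

for c in (0.5, 1.0, 2.0, 4.0):
    mu_tail = sum(mass[x] for x in X if abs(u[x]) >= c)
    assert mu_tail <= int_abs_u / c
```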
The theorem below tells us that integration over a set of measure zero can be ignored: even if there are points where $u(x) \neq 0$, they cannot affect the integral if the set of such points has measure zero.
**Theorem.** For $u\in \mathcal{L}^{1}(E, \mu)$, the following are equivalent.
1. $\displaystyle\int _ E \lvert u \rvert \,d{\mu} = 0$.
2. $u = 0$ $\mu$-a.e. on $E$.
@@ -47,11 +53,11 @@ $$\mu\left( \lbrace \lvert u \rvert \geq c\rbrace \cap E \right) \leq \frac{1}{c
(2 $\iff$ 3) Immediate from the definition, since $E\cap\lbrace u\neq 0\rbrace$ is measurable.
(2 $\implies$ 1) $\displaystyle\int _ E \lvert u \rvert \,d{\mu} = \int _ {E \cap \lbrace \lvert u \rvert > 0\rbrace} \lvert u \rvert \,d{\mu} + \int _ {E \cap \lbrace \lvert u \rvert = 0\rbrace} \lvert u \rvert \,d{\mu} = 0 + 0 = 0$.
(1 $\implies$ 3) Using Markov's inequality,
$$\mu\left( \left\lbrace \lvert u \rvert \geq \frac{1}{n}\right\rbrace \cap E \right) \leq n\int _ E \lvert u \rvert \,d{\mu} = 0.$$
Now letting $n\rightarrow\infty$ and using continuity of measure, $\mu\left( \lbrace \lvert u \rvert > 0\rbrace \cap E \right) = 0$.
@@ -59,7 +65,7 @@ $$\mu\left( \left\lbrace \lvert u \rvert \geq \frac{1}{n}\right\rbrace \cap E \r
**Remark.** Let $A, B$ be measurable. If $B \subseteq A$ and $\mu\left( A \setminus B \right) = 0$, then for every $f \in \mathcal{L}^{1}(A, \mu)$,
$$\int _ A f \,d{\mu} = \int _ B f \,d{\mu}.$$
@@ -67,41 +73,41 @@ $$\int_A f \,d{\mu} = \int_B f \,d{\mu}$$
**Theorem.** If $u \in \mathcal{L}^{1}(E, \mu)$, then $u(x) \in \mathbb{R}$ $\mu$-a.e. on $E$; that is, the set where $\lvert u(x) \rvert = \infty$ has measure zero.
**Proof.** $\mu\left( \lbrace \lvert u \rvert \geq 1\rbrace\cap E \right) \leq \displaystyle\int _ E \lvert u \rvert \,d{\mu} < \infty$.[^2] Hence
$$\begin{aligned} \mu\left( \lbrace \lvert u \rvert = \infty\rbrace \cap E \right) & = \mu\left( \bigcap _ {n=1}^\infty \lbrace x \in E : \lvert u(x) \rvert \geq n\rbrace \right) \\ & = \lim _ {n \rightarrow\infty} \mu\left( \lbrace \lvert u \rvert \geq n\rbrace \cap E \right) \leq \limsup _ {n\rightarrow\infty} \frac{1}{n} \int _ E \lvert u \rvert \,d{\mu} = 0. \end{aligned}$$
If $u$ is integrable, the region where its values are infinite does not affect the integral anyway, so it suffices to integrate where the values are finite.
**Corollary.** If $u \in \mathcal{L}^{1}(E, \mu)$, then $\displaystyle\int _ E u \,d{\mu} = \int _ {E \cap \lbrace \lvert u \rvert < \infty\rbrace} u \,d{\mu}$.
### Linearity of the Lebesgue Integral
At last, we prove linearity of the integral in the general case!
**정리.** $f_1, f_2 \in \mathcal{L}^{1}(E, \mu)$ 이면 $f_1 + f_2 \in \mathcal{L}^{1}(E, \mu)$ 이고 **정리.** $f _ 1, f _ 2 \in \mathcal{L}^{1}(E, \mu)$ 이면 $f _ 1 + f _ 2 \in \mathcal{L}^{1}(E, \mu)$ 이고
$$\int_E \left( f_1 + f_2 \right) \,d{\mu} = \int_E f_1 \,d{\mu} + \int_E f_2 \,d{\mu}$$ $$\int _ E \left( f _ 1 + f _ 2 \right) \,d{\mu} = \int _ E f _ 1 \,d{\mu} + \int _ E f _ 2 \,d{\mu}$$
이다. 이다.
**증명.** $\lvert f_1 + f_2 \rvert \leq \lvert f_1 \rvert + \lvert f_2 \rvert$ 임을 이용하면 $f_1+f_2 \in \mathcal{L}^{1}(E, \mu)$ 인 것은 당연하다. 이제 $f = f_1 + f_2$ 로 두고 **증명.** $\lvert f _ 1 + f _ 2 \rvert \leq \lvert f _ 1 \rvert + \lvert f _ 2 \rvert$ 임을 이용하면 $f _ 1+f _ 2 \in \mathcal{L}^{1}(E, \mu)$ 인 것은 당연하다. 이제 $f = f _ 1 + f _ 2$ 로 두고
$$N = \left\lbrace x : \max\left\lbrace f_1^+, f_1^-, f_2^+, f_2^-, f^+, f^-\right\rbrace = \infty \right\rbrace$$ $$N = \left\lbrace x : \max\left\lbrace f _ 1^+, f _ 1^-, f _ 2^+, f _ 2^-, f^+, f^-\right\rbrace = \infty \right\rbrace$$
으로 정의하자. 함수들이 모두 적분 가능하므로 위 정리에 의해 $\mu(N) = 0$ 이다. 그러므로 $E \setminus N$ 에서는 무한한 값이 없으므로 이항을 편하게 할 수 있다. 즉, 으로 정의하자. 함수들이 모두 적분 가능하므로 위 정리에 의해 $\mu(N) = 0$ 이다. 그러므로 $E \setminus N$ 에서는 무한한 값이 없으므로 이항을 편하게 할 수 있다. 즉,
$$f^+ - f^- = f_1^+ - f_1^- + f_2^+ - f_2^- \implies f^+ + f_1^- + f_2^- = f^- + f_1^+ + f_2^+$$ $$f^+ - f^- = f _ 1^+ - f _ 1^- + f _ 2^+ - f _ 2^- \implies f^+ + f _ 1^- + f _ 2^- = f^- + f _ 1^+ + f _ 2^+$$
이다. 그러면 이다. 그러면
$$\int_ {E\setminus N} f^+ \,d{\mu} + \int_ {E\setminus N} f_1^- \,d{\mu} + \int_ {E\setminus N} f_2^- \,d{\mu} = \int_ {E\setminus N} f^-\,d{\mu} + \int_ {E\setminus N} f_1^+\,d{\mu} + \int_ {E\setminus N} f_2^+ \,d{\mu}$$ $$\int _ {E\setminus N} f^+ \,d{\mu} + \int _ {E\setminus N} f _ 1^- \,d{\mu} + \int _ {E\setminus N} f _ 2^- \,d{\mu} = \int _ {E\setminus N} f^-\,d{\mu} + \int _ {E\setminus N} f _ 1^+\,d{\mu} + \int _ {E\setminus N} f _ 2^+ \,d{\mu}$$
이고, $\mu(N) = 0$ 임을 이용하여 $N$ 위에서의 적분값을 더해주면 이고, $\mu(N) = 0$ 임을 이용하여 $N$ 위에서의 적분값을 더해주면
$$\int_ {E \setminus N} f \,d{\mu} = \int_ {E \setminus N} f_1 \,d{\mu} + \int_ {E \setminus N} f_2 \,d{\mu} \implies \int_ {E} f \,d{\mu} = \int_ {E} f_1 \,d{\mu} + \int_ {E} f_2 \,d{\mu}$$ $$\int _ {E \setminus N} f \,d{\mu} = \int _ {E \setminus N} f _ 1 \,d{\mu} + \int _ {E \setminus N} f _ 2 \,d{\mu} \implies \int _ {E} f \,d{\mu} = \int _ {E} f _ 1 \,d{\mu} + \int _ {E} f _ 2 \,d{\mu}$$
를 얻는다. 를 얻는다.
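The linearity just proved can be checked numerically on a toy discrete measure space, where the Lebesgue integral reduces to a weighted sum of point masses (everything below is my own illustrative choice, not from the text):

```python
# Toy check of linearity of the integral on a finite measure space.
# Hypothetical setup: E = {0, 1, 2, 3} with point masses mu({x}).
E = [0, 1, 2, 3]
mu = {0: 0.5, 1: 1.0, 2: 0.25, 3: 2.0}

def integral(f):
    """Integral of f over E with respect to the discrete measure mu."""
    return sum(f(x) * mu[x] for x in E)

f1 = lambda x: x - 1       # takes both signs, so f1 = f1^+ - f1^- genuinely splits
f2 = lambda x: x ** 2

lhs = integral(lambda x: f1(x) + f2(x))
rhs = integral(f1) + integral(f2)
assert abs(lhs - rhs) < 1e-12   # ∫(f1 + f2) dμ = ∫f1 dμ + ∫f2 dμ
```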
@@ -109,19 +115,19 @@ $$\int_ {E \setminus N} f \,d{\mu} = \int_ {E \setminus N} f_1 \,d{\mu} + \int_
Using this, we now restate the convergence theorems. In the previous post the relevant hypotheses were required at every point; now they only need to hold almost everywhere. The proofs go through by removing the set on which the hypothesis fails.

**Theorem.** (Monotone convergence theorem) Suppose each $f _ n$ is measurable and $0 \leq f _ n(x) \leq f _ {n+1}(x)$ $\mu$-a.e. Setting

$$\lim _ {n\rightarrow\infty} f _ n(x) = f(x),$$

we have

$$\lim _ {n \rightarrow\infty} \int _ E f _ n \,d{\mu} = \int _ E f \,d{\mu}.$$
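As a numerical illustration of the monotone convergence theorem (the integrand is my own choice, not from the text): $f _ n(x) = \min(n, x^{-1/2})$ on $(0, 1]$ increases pointwise to the unbounded limit $f(x) = x^{-1/2}$, and the integrals $2 - \frac{1}{n}$ increase to $\int _ 0^1 x^{-1/2}\,d{x} = 2$.

```python
# Monotone convergence, numerically: f_n = min(n, x^{-1/2}) increases to
# x^{-1/2} on (0, 1]; exact integrals are 2 - 1/n -> 2.
# A left Riemann sum on a fine grid stands in for the Lebesgue integral.
N = 100_000
dx = 1.0 / N

def integral(f):
    return sum(f((i + 1) * dx) for i in range(N)) * dx

def f_n(n):
    return lambda x: min(float(n), x ** -0.5)

vals = [integral(f_n(n)) for n in (1, 2, 4, 8)]
assert all(a <= b for a, b in zip(vals, vals[1:]))   # integrals increase with n
assert abs(vals[-1] - (2 - 1 / 8)) < 1e-3            # matches 2 - 1/n
```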
**Theorem.** (Fatou) Suppose each $f _ n$ is measurable and $f _ n(x) \geq 0$ $\mu$-a.e. Then

$$\int _ E \liminf _ {n\rightarrow\infty} f _ n \,d{\mu} \leq \liminf _ {n\rightarrow\infty} \int _ E f _ n \,d{\mu}.$$
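The inequality can be strict. A standard example (my own illustration, not from the text): $f _ n = n\,\chi _ {(0, 1/n)}$ on $(0, 1]$ has $\int f _ n \,d{m} = 1$ for every $n$, yet $f _ n \to 0$ pointwise, so $\int \liminf f _ n \,d{m} = 0 < 1 = \liminf \int f _ n \,d{m}$.

```python
# Fatou's inequality can be strict: f_n = n * χ_(0, 1/n) on (0, 1].
# liminf f_n = 0 pointwise, yet ∫ f_n dm = 1 for all n.
# A left Riemann sum on a fine grid stands in for the Lebesgue integral.
N = 100_000
dx = 1.0 / N
grid = [(i + 1) * dx for i in range(N)]

def integral(values):
    return sum(values) * dx

for n in (10, 100, 1000):
    fn = [n if x < 1.0 / n else 0.0 for x in grid]
    assert abs(integral(fn) - 1.0) < 0.02     # each integral is (about) 1

liminf = [0.0 for _ in grid]                   # pointwise liminf of the f_n
assert integral(liminf) == 0.0                 # 0 = ∫ liminf < liminf ∫ = 1
```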
In a similar spirit, one can also consider the following proposition.
@@ -149,45 +155,45 @@ $$[f] = \lbrace g \in \mathcal{L}^{1}(E, \mu) : f \sim g\rbrace.$$
We close the discussion of convergence theorems with one final result, known as the dominated convergence theorem (DCT).

![mt-07.png](../../../assets/img/posts/mathematics/measure-theory/mt-07.png)

**Theorem.** (Dominated convergence theorem) For a measurable set $E$ and a measurable function $f$, let $\lbrace f _ n\rbrace$ be a sequence of measurable functions. If the limit $f(x) = \displaystyle\lim _ {n \rightarrow\infty} f _ n(x)$ exists in $\overline{\mathbb{R}}$ at almost every point of $E$ (pointwise convergence) and there exists $g \in \mathcal{L}^{1}(E, \mu)$ with $\lvert f _ n \rvert \leq g \quad \mu$-a.e. on $E$ ($\forall n \geq 1$), then

$$\lim _ {n \rightarrow\infty} \int _ E \lvert f _ n - f \rvert \,d{\mu} = 0.$$
**Remark.**

1. $f _ n, f \in \mathcal{L}^{1}(E, \mu)$.

2. By the properties of the integral,

$$\left\lvert \int f _ n \,d{\mu} - \int f \,d{\mu} \right\rvert \leq \int \lvert f _ n - f \rvert \,d{\mu},$$

so the conclusion of the theorem implies

$$\lim _ {n \rightarrow\infty} \int f _ n \,d{\mu} = \int f \,d{\mu}.$$
**Proof.** Define the set

$$A = \left\lbrace x \in E : \lim _ {n \rightarrow\infty} f _ n(x) \text{ exists and } f _ n(x), f(x), g(x) \in \mathbb{R},\ \lvert f _ n(x) \rvert \leq g(x)\right\rbrace.$$

By the hypotheses, $\mu\left( E\setminus A \right) = 0$, so it suffices to consider $x \in A$. There,

$$2g - \lvert f _ n - f \rvert \geq 2g - \bigl(\lvert f _ n \rvert + \lvert f \rvert \bigr) \geq 0.$$

Since $\lvert f _ n - f \rvert \rightarrow 0$ and $2g - \lvert f _ n - f \rvert \rightarrow 2g$, applying Fatou's lemma gives

$$\begin{aligned} 2 \int _ E g \,d{\mu} = \int _ A 2g \,d{\mu} & = \int _ A \liminf _ {n \rightarrow\infty} \big(2g - \lvert f _ n - f \rvert\big) \,d{\mu} \\ & \leq \liminf _ {n \rightarrow\infty} \left( 2 \int _ A g \,d{\mu} - \int _ A \lvert f _ n - f \rvert \,d{\mu} \right) \\ & = 2\int _ A g \,d{\mu} - \limsup _ {n \rightarrow\infty} \int _ A \lvert f _ n - f \rvert \,d{\mu} \leq 2 \int _ A g \,d{\mu}. \end{aligned}$$

Therefore

$$2 \int _ A g \,d{\mu} - \limsup _ {n \rightarrow\infty} \int _ A \lvert f _ n - f \rvert \,d{\mu} = 2 \int _ A g \,d{\mu},$$

and since $\displaystyle 0 \leq \int _ A g \,d{\mu} < \infty$ by assumption, $\displaystyle\limsup _ {n \rightarrow\infty} \int _ A \lvert f _ n - f \rvert \,d{\mu} = 0$.
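A small numerical check of the DCT (the example is mine, not from the text): on $[0, 1]$, $f _ n(x) = x^n$ is dominated by $g \equiv 1 \in \mathcal{L}^{1}[0, 1]$ and converges to $f = 0$ $m$-a.e. (everywhere except $x = 1$), and indeed $\int _ 0^1 \lvert f _ n - f \rvert \,d{m} = \frac{1}{n+1} \to 0$.

```python
# Dominated convergence, numerically: f_n(x) = x^n on [0, 1], |f_n| <= g = 1,
# f_n -> 0 a.e., so ∫ |f_n - f| dm = 1/(n + 1) -> 0.
# A midpoint Riemann sum stands in for the Lebesgue integral on [0, 1].
N = 100_000
dx = 1.0 / N

def integral(f):
    return sum(f((i + 0.5) * dx) for i in range(N)) * dx

errors = [integral(lambda x, n=n: x ** n) for n in (1, 10, 100, 1000)]
assert all(a >= b for a, b in zip(errors, errors[1:]))  # decreasing toward 0
assert errors[-1] < 2e-3                                 # about 1/1001
```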
[^1]: For example, '$f(x)$ is continuous', etc.

[^2]: To use continuity of measure, the first set must have finite measure.


@@ -0,0 +1,136 @@
---
share: true
toc: true
math: true
categories:
- Mathematics
- Measure Theory
path: _posts/mathematics/measure-theory
tags:
- math
- analysis
- measure-theory
title: 08. Comparison with the Riemann Integral
date: 2023-06-20
github_title: 2023-06-20-comparison-with-riemann-integral
image:
path: /assets/img/posts/mathematics/measure-theory/mt-08.png
attachment:
folder: assets/img/posts/mathematics/measure-theory
---
![mt-08.png](../../../assets/img/posts/mathematics/measure-theory/mt-08.png)
## Comparison with the Riemann Integral
First, to avoid confusion, for the Lebesgue measure $m$ we write the Lebesgue integral as

$$\int _ {[a, b]} f \,d{m} = \int _ {[a, b]} f \,d{x} = \int _ a^b f \,d{x},$$

and the Riemann integral as

$$\mathcal{R}\int _ a^b f\,d{x}.$$

**Theorem.** Let $a, b \in \mathbb{R}$ with $a < b$, and let $f$ be a bounded function.

1. If $f \in \mathcal{R}[a, b]$, then $f \in \mathcal{L}^{1}[a, b]$ and $\displaystyle\int _ a^b f\,d{x} = \mathcal{R}\int _ a^b f \,d{x}$.

2. $f \in \mathcal{R}[a, b]$ $\iff$ $f$ is continuous $m$-a.e. on $[a, b]$.

In plain terms, (1) says that if $f$ is Riemann integrable on $[a, b]$, then it is also Lebesgue integrable and the two integrals agree. This shows that the Lebesgue integral is at least as powerful as the Riemann integral.

Furthermore, (2) gives a condition equivalent to Riemann integrability. Because of the almost-everywhere qualifier, once we pass to $\mathcal{L}^1$ equivalence classes, this effectively says that only continuous functions are Riemann integrable.
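A classical example separating the two notions (my own addition, but a standard fact) is the Dirichlet function on $[0, 1]$:

```latex
% Dirichlet function: the indicator of the rationals in [0, 1]
f = \chi_{\mathbb{Q} \cap [0, 1]}
% f is discontinuous at every point, so by (2), f \notin \mathcal{R}[0, 1].
% But \mathbb{Q} \cap [0, 1] is countable, hence m-null, so
\int_0^1 f \, dm = m\bigl(\mathbb{Q} \cap [0, 1]\bigr) = 0
```

So $f$ is Lebesgue integrable with integral $0$ but not Riemann integrable, which is exactly the gap that (1) and (2) quantify.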
**Proof.** For each $k \in \mathbb{N}$, take a partition $P _ k = \lbrace a = x _ 0^k < x _ 1^k < \cdots < x _ {n _ k}^k = b\rbrace$ of $[a, b]$ such that $P _ k \subseteq P _ {k+1}$ (refinement) and $\lvert x _ {i}^k - x _ {i-1}^k \rvert < \frac{1}{k}$.

Then, from the definition of the Riemann integral,

$$\lim _ {k \rightarrow\infty} L(P _ k, f) = \mathcal{R}\underline{\int _ {a}^{b}} f\,d{x}, \quad \lim _ {k \rightarrow\infty} U(P _ k, f) = \mathcal{R} \overline{\int _ {a}^{b}} f \,d{x}.$$

Now define measurable simple functions $U _ k, L _ k$ by

$$U _ k = \sum _ {i=1}^{n _ k} \sup _ {x _ {i-1}^k \leq y \leq x _ {i}^k} f(y) \chi _ {(x _ {i-1}^k, x _ i^k]}, \quad L _ k = \sum _ {i=1}^{n _ k} \inf _ {x _ {i-1}^k \leq y \leq x _ {i}^k} f(y) \chi _ {(x _ {i-1}^k, x _ i^k]}.$$

Then clearly $L _ k \leq f \leq U _ k$ on $[a, b]$, and since simple functions are Lebesgue integrable,

$$\int _ a^b L _ k \,d{x} = L(P _ k, f), \quad \int _ a^b U _ k \,d{x} = U(P _ k, f).$$

Because the partitions were chosen with $P _ k \subseteq P _ {k + 1}$, the sequence $L _ k$ is increasing and $U _ k$ is decreasing. Hence, defining

$$L(x) = \lim _ {k \rightarrow\infty} L _ k(x), \quad U(x) = \lim _ {k \rightarrow\infty} U _ k(x),$$

the limits exist. Since $f, L _ k, U _ k$ are all bounded, the dominated convergence theorem gives

$$\int _ a^b L \,d{x} = \lim _ {k \rightarrow\infty} \int _ a^b L _ k \,d{x} = \lim _ {k \rightarrow\infty} L(P _ k, f) = \mathcal{R}\underline{\int _ {a}^{b}} f\,d{x} < \infty,$$

$$\int _ a^b U\,d{x} = \lim _ {k \rightarrow\infty} \int _ a^b U _ k \,d{x} = \lim _ {k \rightarrow\infty} U(P _ k, f) = \mathcal{R} \overline{\int _ {a}^{b}} f \,d{x} < \infty,$$

so $L, U \in \mathcal{L}^{1}[a, b]$.
Combining these facts, if $f \in \mathcal{R}[a, b]$, then

$$\mathcal{R}\underline{\int _ {a}^{b}} f\,d{x} = \mathcal{R}\overline{\int _ {a}^{b}} f\,d{x},$$

so

$$\int _ a^b (U - L)\,d{x} = 0,$$

which shows $U = L$ $m$-a.e. on $[a, b]$. Conversely, reading the argument backwards shows that $U = L$ $m$-a.e. on $[a, b]$ implies $f \in \mathcal{R}[a, b]$.

(1) By the discussion above, if $f \in \mathcal{R}[a, b]$ then $f = U = L$ a.e. on $[a, b]$. Hence $f$ is measurable, and

$$\int _ a^b f \,d{x} = \mathcal{R}\int _ a^b f\,d{x} < \infty \implies f \in \mathcal{L}^{1}[a, b].$$
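The simple-function squeeze $L _ k \leq f \leq U _ k$ above can be sketched numerically (illustrative choices of mine: $f(x) = x^2$ on $[0, 1]$ with uniform partitions of $2^k$ pieces, where the true integral is $1/3$):

```python
# Sketch of the squeeze L_k <= f <= U_k from the proof.
# Since f(x) = x^2 is monotone on [0, 1], the sup/inf over each
# subinterval sit at its endpoints.
def darboux_sums(f, a, b, pieces):
    """Integrals of the lower/upper simple functions, i.e. L(P, f) and U(P, f)."""
    h = (b - a) / pieces
    xs = [a + i * h for i in range(pieces + 1)]
    lower = sum(min(f(xs[i]), f(xs[i + 1])) * h for i in range(pieces))
    upper = sum(max(f(xs[i]), f(xs[i + 1])) * h for i in range(pieces))
    return lower, upper

f = lambda x: x * x
sums = [darboux_sums(f, 0.0, 1.0, 2 ** k) for k in (1, 4, 8, 12)]
for lower, upper in sums:
    assert lower <= 1 / 3 <= upper       # the integral is always squeezed
lower, upper = sums[-1]
assert upper - lower < 1e-3              # and the gap shrinks under refinement
```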
(2) Suppose $x \notin \bigcup _ {k=1}^{\infty} P _ k$. Then for any $\epsilon > 0$, choosing $n \in \mathbb{N}$ sufficiently large, there exists $j _ 0 \in \mathbb{N}$ such that $x \in (t _ {j _ 0-1}^n, t _ {j _ 0}^n)$ and

$$\lvert L _ n(x) - L(x) \rvert + \lvert U _ n(x) - U(x) \rvert < \epsilon,$$

where $t _ j^n$ denote the points of $P _ n$ and $M _ {j _ 0}^n, m _ {j _ 0}^n$ the sup and inf of $f$ on $(t _ {j _ 0-1}^n, t _ {j _ 0}^n)$. Then, for $y \in (t _ {j _ 0-1}^n, t _ {j _ 0}^n)$,

$$\begin{aligned} \lvert f(x) - f(y) \rvert & \leq M _ {j _ 0}^n - m _ {j _ 0}^n = M _ {j _ 0}^n - U(x) + U(x) - L(x) + L(x) - m _ {j _ 0}^n \\ & \leq U(x) - L(x) + \epsilon. \end{aligned}$$

By this inequality, if $y \in \lbrace x : U(x) = L(x)\rbrace \setminus\bigcup _ {k=1}^{\infty} P _ k$, then $f$ is continuous at $y$.

Therefore, writing $C _ f$ for the set of points at which $f$ is continuous,

$$\lbrace x : U(x) = L(x)\rbrace \setminus\bigcup _ {k=1}^{\infty} P _ k \subseteq C _ f \subseteq\lbrace x : U(x) = L(x)\rbrace.$$

Since $\bigcup _ {k=1}^{\infty} P _ k$ has measure $0$, $U = L$ $m$-a.e. is equivalent to $f$ being continuous $m$-a.e. Combined with the earlier discussion, $f \in \mathcal{R}[a, b]$ is equivalent to $f$ being continuous $m$-a.e. on $[a, b]$.
The following are byproducts of the proof.

**Remark.**

1. If $x \notin \bigcup _ {k=1}^\infty P _ k$, then $f$ is continuous at $x$ $\iff f(x) = U(x) = L(x)$.

2. $L(x) \leq f(x) \leq U(x)$, and $L(x), U(x)$, being limits of measurable functions, are also measurable.

3. Since $f$ is assumed bounded, it suffices to consider the case $f \geq 0$: if $\lvert f \rvert \leq M$, we may work with $f + M$ in place of $f$.
We can now borrow and use convenient properties of the Riemann integral.

1. When $f \geq 0$ is measurable, define $f _ n = f\chi _ {[0, n]}$. By the monotone convergence theorem,

$$\int _ 0^\infty f \,d{x} = \lim _ {n \rightarrow\infty} \int _ 0^\infty f _ n \,d{x} = \lim _ {n \rightarrow\infty} \int _ 0^n f \,d{x},$$

and the last integral can be computed as a Riemann integral.

2. For every closed bounded interval $I \subseteq (0, \infty)$, if $f \in \mathcal{R}(I)$ then $f \in \mathcal{L}^{1}(I)$. Taking $f _ n = f\chi _ {[0, n]}$, we have $\lvert f _ n \rvert \leq \lvert f \rvert$, so if $\lvert f \rvert \in \mathcal{L}^{1}(0, \infty)$, the dominated convergence theorem yields

$$\int _ 0^\infty f \,d{x} = \lim _ {n \rightarrow\infty} \int _ 0^\infty f _ n \,d{x} = \lim _ {n \rightarrow\infty} \int _ 0^n f \,d{x} = \lim _ {n \rightarrow\infty} \mathcal{R} \int _ 0^n f \,d{x}.$$

Similarly, taking $f _ n = f\chi _ {(1/n, 1)}$, the dominated convergence theorem gives

$$\int _ 0^1 f\,d{x} = \lim _ {n \rightarrow\infty} \int _ {0}^1 f _ n \,d{x} = \lim _ {n \rightarrow\infty}\int _ {1/n}^1 f \,d{x} = \lim _ {n \rightarrow\infty} \mathcal{R}\int _ {1/n}^1 f \,d{x}.$$


@@ -53,3 +53,12 @@ div.language-plaintext.highlighter-rouge {
div.footnotes {
font-size: 90%;
}
nav#breadcrumb {
font-family: "Palatino Linotype", Palatino, Pretendard;
}
/* for post title */
h1 {
font-family: "Palatino Linotype", Palatino, Pretendard;
}
