blog/_posts/algorithms/data-structures/2024-04-12-search-time-hash-tables.md
Sungchan Yi 23aeb29ad8 feat: breaking change (unstable) (#198)
2024-11-13 14:28:45 +09:00

---
share: true
toc: true
math: true
categories:
- Algorithms
- Data Structures
path: _posts/algorithms/data-structures
tags:
- algorithms
- data-structures
title: Search Time in Hash Tables
date: 2024-04-12
github_title: 2024-04-12-search-time-hash-tables
---
This note derives the expected cost of the `search` operation in hash tables, for both chaining and open addressing.
## Assumptions
- Let $m$ be the number of buckets in the hash table.
- Let $n$ be the number of entries currently in the hash table.
- Let $\alpha = n/m$ be the *load factor*.
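- Each element is equally likely to hash to any of the $m$ buckets, independently of where the other elements hash.
Under these assumptions, the results below show that the `search` operation takes constant expected time as long as the load factor $\alpha$ is bounded by a constant (and, for open addressing, bounded away from $1$).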
## Hashing with Chaining
### Unsuccessful Search
> **Theorem.** The expected time complexity of an unsuccessful search in a hash table using chaining is $\alpha$.
*Proof*. Since elements are hashed uniformly into the buckets, the expected number of elements is the same for every bucket. By linearity of expectation, these expectations sum to $n$, so each bucket holds $\alpha = n/m$ elements in expectation. An unsuccessful search scans the entire chain of the bucket that the key hashes to, so it examines $\alpha$ elements in expectation.
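
As a sanity check, the claim can be tested with a small Monte Carlo simulation. This is only a sketch: the function name `average_unsuccessful_cost`, the table sizes, and the use of Python's `random` module as a stand-in for a uniform hash function are all assumptions made for illustration.

```python
import random


def average_unsuccessful_cost(m: int, n: int, trials: int, seed: int = 0) -> float:
    """Estimate the number of elements examined by an unsuccessful search."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Simulate uniform hashing: each of the n keys lands in a random bucket.
        chain_lengths = [0] * m
        for _ in range(n):
            chain_lengths[rng.randrange(m)] += 1
        # An absent key hashes to a uniform bucket and scans its whole chain.
        total += chain_lengths[rng.randrange(m)]
    return total / trials


if __name__ == "__main__":
    m, n = 20, 40
    estimate = average_unsuccessful_cost(m, n, trials=5000)
    print(f"estimate = {estimate:.3f}, alpha = {n / m:.3f}")
```

With enough trials the estimate should settle close to $\alpha = n/m$, up to sampling noise.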
### Successful Search
> **Theorem.** The expected time complexity of a successful search in a hash table using chaining is $1 + \frac{\alpha}{2} - \frac{\alpha}{2n}$.
*Proof*. Suppose we are looking for the element $x$, and assume that new elements are inserted at the head of their chain. Then the elements examined before reaching $x$ are exactly those that were inserted after $x$ and hashed to the same bucket as $x$.
Each later element collides with $x$ with probability $\frac{1}{m}$, since the hash function is uniform by assumption. If $x$ was inserted as the $i$-th element, the expected number of elements examined equals
$$
1 + \sum _ {j = i + 1}^n \frac{1}{m}.
$$
The additional $1$ comes from searching $x$ itself. Averaging over all $i$ gives the final result.
$$
\begin{aligned}
\frac{1}{n}\sum _ {i=1}^n \paren{1 + \sum _ {j=i+1}^n \frac{1}{m}} &= 1 + \frac{1}{mn} \sum _ {i=1}^n \sum _ {j=i+1}^n 1 \\
&= 1 + \frac{1}{mn}\paren{n^2 - \frac{n(n+1)}{2}} \\
&= 1 + \frac{n(n-1)}{2mn} \\
&= 1+ \frac{(n-1)\alpha}{2n} = 1+ \frac{\alpha}{2} - \frac{\alpha}{2n}.
\end{aligned}
$$
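
The algebra above can be double-checked with exact rational arithmetic. The sketch below uses hypothetical helper names (`expected_successful_cost`, `closed_form`) and compares the averaged sum with the closed form for a few arbitrarily chosen table sizes.

```python
from fractions import Fraction


def expected_successful_cost(m: int, n: int) -> Fraction:
    """Average over i = 1..n of 1 + (n - i)/m, the expected cost for the i-th element."""
    total = sum(1 + Fraction(n - i, m) for i in range(1, n + 1))
    return total / n


def closed_form(m: int, n: int) -> Fraction:
    """The closed form 1 + alpha/2 - alpha/(2n) with alpha = n/m."""
    alpha = Fraction(n, m)
    return 1 + alpha / 2 - alpha / (2 * n)


if __name__ == "__main__":
    for m, n in [(10, 5), (7, 20), (100, 100)]:
        print(m, n, expected_successful_cost(m, n), closed_form(m, n))
```

Because `Fraction` avoids floating-point rounding, the two values should agree exactly for every $m, n \geq 1$.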
## Hashing with Open Addressing
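For open addressing, we first assume that $\alpha < 1$. The case $\alpha = 1$ will be handled separately. We also assume that there are no deletions, and that the probe sequence of each key is equally likely to be any permutation of the $m$ buckets (uniform hashing).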
### Unsuccessful Search
> **Theorem.** The expected number of probes in an unsuccessful search in a hash table using open addressing is at most $\frac{1}{1-\alpha}$.
*Proof*. Let the random variable $X$ be the number of probes made in an unsuccessful search. We want to find $\bf{E}[X]$, so we use the identity
$$
\bf{E}[X] = \sum _ {i \geq 1} \Pr[X \geq i].
$$
We want to bound $\Pr[X \geq i]$. For $X \geq i$ to happen, the first $i - 1$ probes must fail, i.e., each of them must hit an occupied bucket. Given that the first $j - 1$ probes failed, there are $m - j + 1$ buckets left to probe, of which $n - j + 1$ are occupied. Thus the $j$-th probe fails with probability $\frac{n - j + 1}{m - j + 1} \leq \frac{n}{m}$, where the inequality holds because $n < m$. Therefore,
$$
\begin{aligned}
\Pr[X \geq i] &= \frac{n}{m} \cdot \frac{n - 1}{m - 1} \cdot \cdots \cdot \frac{n - (i - 2)}{m - (i - 2)} \\
&\leq \paren{\frac{n}{m}}^{i-1} = \alpha^{i - 1}.
\end{aligned}
$$
Now we have
$$
\bf{E}[X] = \sum _ {i \geq 1} \Pr[X \geq i] \leq \sum _ {i\geq 1} \alpha^{i-1} = \frac{1}{1 - \alpha}.
$$
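
To see how tight the bound is, the sum $\sum _ {i \geq 1} \Pr[X \geq i]$ can be evaluated exactly from the product formula. The following is a sketch with a hypothetical function name; it assumes $n < m$ and uses exact fractions so that no rounding is involved.

```python
from fractions import Fraction


def expected_probes_unsuccessful(m: int, n: int) -> Fraction:
    """Sum Pr[X >= i] for i = 1, ..., n + 1; all later terms are zero."""
    if not 0 <= n < m:
        raise ValueError("requires 0 <= n < m")
    expected = Fraction(0)
    tail = Fraction(1)  # Pr[X >= 1] = 1
    for i in range(1, n + 2):
        expected += tail
        # Pr[X >= i + 1] = Pr[X >= i] * (n - (i - 1)) / (m - (i - 1))
        tail *= Fraction(n - (i - 1), m - (i - 1))
    return expected


if __name__ == "__main__":
    for m, n in [(2, 1), (10, 5), (100, 90)]:
        exact = expected_probes_unsuccessful(m, n)
        bound = Fraction(m, m - n)  # 1 / (1 - alpha)
        print(m, n, float(exact), float(bound))
```

For every valid pair the exact value should be no larger than $\frac{1}{1-\alpha}$, with the gap shrinking as $m$ grows at a fixed load factor.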
### Successful Search
> **Theorem.** The expected number of probes in a successful search in a hash table using open addressing is at most $\frac{1}{\alpha} \log \frac{1}{1- \alpha}$, where $\log$ denotes the natural logarithm.
*Proof*. On a successful search, the sequence of probes is exactly the same as the sequence of probes when that element was inserted.
Suppose that an element $x$ was the $i$-th inserted element. At the moment of insertion, the table held $i - 1$ elements, so the load factor was ${} \alpha _ i = (i-1)/m {}$. Inserting $x$ probes the same buckets as an unsuccessful search for $x$, so by the above theorem the expected number of probes was at most ${} 1/(1 -\alpha _ i) = \frac{m}{m-(i-1)} {}$. Averaging this bound over all $i$ gives
$$
\begin{aligned}
\frac{1}{n} \sum _ {i=1}^n \frac{m}{m - (i - 1)} &= \frac{m}{n} \sum _ {i=0}^{n-1} \frac{1}{m - i} \\
&\leq \frac{1}{\alpha} \int _ {m-n}^m \frac{1}{x}\,dx \\
&= \frac{1}{\alpha} \log \frac{1}{1-\alpha}.
\end{aligned}
$$
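
The integral step can be checked numerically. In this sketch, `average_insertion_bound` and `log_bound` are hypothetical names for the left-hand sum and the final logarithmic bound; the table sizes are arbitrary.

```python
import math


def average_insertion_bound(m: int, n: int) -> float:
    """(1/n) * sum_{i=1}^{n} m / (m - (i - 1)), the averaged per-insertion bound."""
    return sum(m / (m - (i - 1)) for i in range(1, n + 1)) / n


def log_bound(m: int, n: int) -> float:
    """(1/alpha) * ln(1 / (1 - alpha)) with alpha = n/m < 1."""
    alpha = n / m
    return (1 / alpha) * math.log(1 / (1 - alpha))


if __name__ == "__main__":
    for m, n in [(100, 50), (100, 90), (1000, 999)]:
        print(m, n, average_insertion_bound(m, n), log_bound(m, n))
```

The first value should never exceed the second, since each term $\frac{1}{k}$ of the sum is at most $\int _ {k-1}^{k} \frac{1}{x}\,dx$.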
### When the Hash Table is Full ($\alpha = 1$)
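First of all, every bucket is occupied, so an unsuccessful search must probe all $m$ buckets.
For a successful search, set $m = n$ in the above argument. Every insertion still happens at a load factor below $1$, so the per-insertion bound applies, and the average number of probes is at most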
$$
\frac{1}{m} \sum _ {i=1}^m \frac{m}{m - (i - 1)} = \sum _ {i=1}^m \frac{1}{i} = H _ m,
$$
where $H _ m$ is the $m$-th harmonic number.
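
The identity with the harmonic number can be verified exactly for small tables. The helper names below (`harmonic`, `average_probes_full_table`) are hypothetical and only serve this check.

```python
from fractions import Fraction


def harmonic(m: int) -> Fraction:
    """The m-th harmonic number H_m = 1 + 1/2 + ... + 1/m."""
    return sum(Fraction(1, k) for k in range(1, m + 1))


def average_probes_full_table(m: int) -> Fraction:
    """(1/m) * sum_{i=1}^{m} m / (m - (i - 1)) for a full table with n = m."""
    return sum(Fraction(m, m - (i - 1)) for i in range(1, m + 1)) / m


if __name__ == "__main__":
    for m in [1, 4, 16, 64]:
        print(m, float(average_probes_full_table(m)), float(harmonic(m)))
```

Since $H _ m$ grows like $\log m$, a successful search in a full table still needs only logarithmically many probes on average, in contrast to the $m$ probes of an unsuccessful search.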