We have formula two-pass external sort on a relation of size \(B\) blocks \[ B \le M(M-1). \] With \(B = 10{,}000\): \[ M^2 - M - 10{,}000 \ge 0 \;\Rightarrow\; M \ge \frac{1+\sqrt{1+4\cdot 10{,}000}}{2} = \frac{1+\sqrt{40{,}001}}{2} \approx 100.5. \] The minimal is \[ M = 101. \]
a.\(\delta\) duplicate elimination: requires sorting \(R\) in two passes \(101\) buffer blocks. b.\(\gamma\) group by: also \(101\) buffer blocks.
\[B(R)+B(S)\le\ M^2\]
10000+10000=20000
\[M \ge\sqrt{20000}\] => 142
c.a binary operation such as join or union: \(142\) buffers.linear hashing: \[ h_i(k) = k \bmod 2^i, \] if \(h_i(k) < p\) \[ h_{i+1}(k) = k \bmod 2^{i+1}. \]
\(i = 2,\; p = 0\), number of buckets = 4
[0]: 0000, 0100 [1]: 0001, 0101 [2]: 0010, 0110 [3]: 0011, 0111
\(i = 2,\; p = 1\), number of buckets = 5
[0]: 0000, 1000 [1]: 0001, 0101, 1001 [2]: 0010, 0110 [3]: 0011, 0111 [4]: 0100
\(i = 2,\; p = 2\), number of buckets = 6
[0]: 0000, 1000 [1]: 0001, 1001 [2]: 0010, 0110, 1010 [3]: 0011, 0111, 1011 [4]: 0100, 1100 [5]: 0101
\(i = 1,\; p = 1\), number of buckets = 3
[0]: 0000, 0100 [1]: 0001, 0011, 0101 ,0111 [2]: 0010, 0110
overflow block: 0111
\(i = 2,\; p = 0\), number of buckets = 4
[0]: 0000, 0100, 1000 [1]: 0001, 0101, 1001 [2]: 0010, 0110 [3]: 0011, 0111
\(i = 2,\; p = 1\), number of buckets = 5
[0]: 0000, 1000 [1]: 0001, 0101, 1001 [2]: 0010, 0110, 1010 [3]: 0011, 0111, 1011 [4]: 0100, 1100
| Num | Directory | depth | Keys |
|---|---|---|---|
| B1 | 0 | 1 | Null |
| B2 | 1 | 1 | 1111 |
| Num | Directory | depth | Keys |
|---|---|---|---|
| B1 | 0 | 1 | Null |
| B2 | 1 | 1 | 1111, 1110 |
| Num | Directory | depth | Keys |
|---|---|---|---|
| B1 | 0 | 1 | Null |
| B2 | 1 | 1 | 1111, 1110, 1101 |
| Num | Directory | depth | Keys |
|---|---|---|---|
| B1 | 000, 001, 010, 011 | 1 | Null |
| B2 | 100, 101 | 2 | Null |
| B3 | 110 | 3 | 1101, 1100 |
| B4 | 111 | 3 | 1111, 1110 |
| Num | Directory | depth | Keys |
|---|---|---|---|
| B1 | 000, 001, 010, 011 | 1 | Null |
| B2 | 100, 101 | 2 | 1011 |
| B3 | 110 | 3 | 1101, 1100 |
| B4 | 111 | 3 | 1111, 1110 |
| Num | Directory | depth | Keys |
|---|---|---|---|
| B1 | 000, 001, 010, 011 | 1 | Null |
| B2 | 100, 101 | 2 | 1011, 1010 |
| B3 | 110 | 3 | 1101, 1100 |
| B4 | 111 | 3 | 1111, 1110 |
| Num | Directory | depth | Keys |
|---|---|---|---|
| B1 | 000, 001, 010, 011 | 1 | Null |
| B2 | 100, 101 | 2 | 1011, 1010, 1001 |
| B3 | 110 | 3 | 1101, 1100 |
| B4 | 111 | 3 | 1111, 1110 |
| Num | Directory | depth | Keys |
|---|---|---|---|
| B1 | 000, 001, 010, 011 | 1 | Null |
| B2 | 100 | 3 | 1001, 1000 |
| B3 | 101 | 3 | 1011, 1010 |
| B4 | 110 | 3 | 1101, 1100 |
| B5 | 111 | 3 | 1111, 1110 |
| Num | Directory | depth | Keys |
|---|---|---|---|
| B1 | 000, 001, 010, 011 | 1 | 0111 |
| B2 | 100 | 3 | 1001, 1000 |
| B3 | 101 | 3 | 1011, 1010 |
| B4 | 110 | 3 | 1101, 1100 |
| B5 | 111 | 3 | 1111, 1110 |
| Num | Directory | depth | Keys |
|---|---|---|---|
| B1 | 000, 001, 010, 011 | 1 | 0111, 0110 |
| B2 | 100 | 3 | 1001, 1000 |
| B3 | 101 | 3 | 1011, 1010 |
| B4 | 110 | 3 | 1101, 1100 |
| B5 | 111 | 3 | 1111, 1110 |
| Num | Directory | depth | Keys |
|---|---|---|---|
| B1 | 000, 001, 010, 011 | 1 | 0111, 0110, 0101 |
| B2 | 100 | 3 | 1001, 1000 |
| B3 | 101 | 3 | 1011, 1010 |
| B4 | 110 | 3 | 1101, 1100 |
| B5 | 111 | 3 | 1111, 1110 |
| Num | Directory | depth | Keys |
|---|---|---|---|
| B1 | 000, 001 | 2 | Null |
| B2 | 010 | 3 | 0101, 0100 |
| B3 | 011 | 3 | 0111, 0110 |
| B4 | 100 | 3 | 1001, 1000 |
| B5 | 101 | 3 | 1011, 1010 |
| B6 | 110 | 3 | 1101, 1100 |
| B7 | 111 | 3 | 1111, 1110 |
| Num | Directory | depth | Keys |
|---|---|---|---|
| B1 | 000, 001 | 2 | 0011 |
| B2 | 010 | 3 | 0101, 0100 |
| B3 | 011 | 3 | 0111, 0110 |
| B4 | 100 | 3 | 1001, 1000 |
| B5 | 101 | 3 | 1011, 1010 |
| B6 | 110 | 3 | 1101, 1100 |
| B7 | 111 | 3 | 1111, 1110 |
| Num | Directory | depth | Keys |
|---|---|---|---|
| B1 | 000, 001 | 2 | 0011, 0010 |
| B2 | 010 | 3 | 0101, 0100 |
| B3 | 011 | 3 | 0111, 0110 |
| B4 | 100 | 3 | 1001, 1000 |
| B5 | 101 | 3 | 1011, 1010 |
| B6 | 110 | 3 | 1101, 1100 |
| B7 | 111 | 3 | 1111, 1110 |
| Num | Directory | depth | Keys |
|---|---|---|---|
| B1 | 000, 001 | 2 | 0011, 0010, 0001 |
| B2 | 010 | 3 | 0101, 0100 |
| B3 | 011 | 3 | 0111, 0110 |
| B4 | 100 | 3 | 1001, 1000 |
| B5 | 101 | 3 | 1011, 1010 |
| B6 | 110 | 3 | 1101, 1100 |
| B7 | 111 | 3 | 1111, 1110 |
| Num | Directory | depth | Keys |
|---|---|---|---|
| B1 | 000 | 3 | 0001, 0000 |
| B2 | 001 | 3 | 0011, 0010 |
| B3 | 010 | 3 | 0101, 0100 |
| B4 | 011 | 3 | 0111, 0110 |
| B5 | 100 | 3 | 1001, 1000 |
| B6 | 101 | 3 | 1011, 1010 |
| B7 | 110 | 3 | 1101, 1100 |
| B8 | 111 | 3 | 1111, 1110 |
(a) Bitmap: 01100000001000001000000
Drop trailing zeros: 01100000001000001 then lengths: [1, 0, 7, 5] i=1,prefix 0,binary 1, code = 0+1 i=0,prefix 0,binary 0, code = 0+0 i=7,prefix 110,binary 111, code = 110+111 i=5,prefix 110,binary 101, code = 110+101 Answer:0100110111110101
(b) Bitmap: 10000010000001001101
lengths: [0, 5, 6, 2, 0, 1] i=0,binary 0,prefix 0, code= 00 i=5,101,110,110101 i=6,110,110,110110 i=2,10,10,1010 i=0,00 i=1,01 Answer: 0011010111011010100001
In lecture we have: total bits = number of records × number of distinct values,which is m x n In this question A is a key, every tuple has a different A value. So the number of distinct values of A = number of tuples = n. Then we have:
\[ \underbrace{n}_{\text{bitmaps}} \times \underbrace{n}_{\text{bits per bitmap}} \;=\; n^2\ \text{bits}. \]