Account# | Name | Balance (e.g., 123, Joe, 451.00).Idea Data as Unix files; schema in separate file. Example tuple encodings like Smith#123#CS.
Query SELECT * FROM Students; → list tuples.
Projection+Filter→File SELECT * FROM R WHERE Cond | T; reads schema, scans R, writes qualifying tuples to T, and registers new file T.
SELECT * FROM R WHERE Cond; → read schema; scan R; test Cond; output matches.Students and Depts; check WHERE; output attribute(s).disk capacity=tracks*blocks*sectors*bytes
2^30=1tb
circle pi*diameters, bytes/(90%*diameters*pi), inner and outer are different, everage inner+outer/2
maxseek=d1+maxtrack/trackpersec,averageseek=d1+1/3(maxtrack/trackpersec)
T_fullrotation=time/rpm,T_averagerot=T_fullrotation/2
two pass:RX2+SX2+R+S,$\frac{block}{M}$ < M-1
$\frac{k}{k+1}$
Serverbreak_p:percent_break x recover_hour/8760 hour
raid0: 0 fault tolerance,raid1:mirror,raid3:1diskfail byte level xor,raid4,large read slow writer block level xor, raid5:1 disk can fail distributed parity balance performance,raid6:2disk can fail
raid4:XOR:00:0,01:1,10:1,11:0
number of track to move: left+right, left/total*left_avg+right/total*right_avg
seek=seek_overhead+track_avg/trackpersec
Record_per_block=$\frac{kb*1024}{record_bytes}$
Require_blocks=record_num/Record_per_block
Trand=Tseek+Trot_avg+Trans
Tscan < Trand*Tuple_num
dense Index Block:total_block=$\frac{index entries}{Record_per_block}+\frac{index entries}{Index_per_block}$
Sparse Index Block:total_block=$\frac{index entries}{Record_per_block}+\frac{index entries/Record_per_block}{Index_per_block}$
$RPM=\frac{600000}{avg_dalay*2}
read whole tracks=Tseek+avg_dalay*2*track_number
seek time=overhead+(time per cylinder)*N/3,random accesses distance = N/3.
The first-level index (on data) is typically dense, not sparse. The second-level index (on the first-level index) is sparse, not dense.
Clustering Index Built on a non-key attribute of an ordered data file where records with the same value are stored together (clustered).
Secondary Index on Key Attribute A secondary index built on an attribute that is also a key
Sparse Primary Index is is the usual case for a primary index. order file One entry per data block (not every record).
Dense Secondary Index One entry for every record in the data file. Built on a non-ordering attribute
Phase1 runs=$\frac{diskblock}{Memorysize}$
Mergesort $\frac{diskblock}{M} <= M-1, \sqrt{diskblock}+1$
Mergesort (M-1)*M=Block size
Sortest distance first, elevator: move up then down
Primary Index Built on a sorted (ordered) data file. The index key is the primary key (ordering key) of the data file. There is one index entry per data block (not per record). Points to the first record in each block.
Secondary Index Built on a non-ordering field (not the primary key). The data file itself may be unordered or ordered on another field. There can be multiple secondary indexes on the same file. Typically has one index entry per record, because data are not ordered on this field.
Secondary Level Index (or Multi-Level Index) This is not a different kind of index (primary or secondary) — it’s about the structure of the index. When an index (primary or secondary) is too large to fit in memory, we build another index on top of it — a second-level index. This creates a hierarchy (multi-level index) that speeds up lookups.
Search Key A field or combination of fields used to look up records in an index. It is a physical access concept, used in file organization and indexing. Does not have to be unique — multiple records can have the same search key value.
MovieStar(name CHAR(30) PK, address VARCHAR(255), gender CHAR(1), birthdate DATE).Primary Index Ordering (primary) key Yes Block Sparse 1 entry/block Sparse Primary Index Ordering key Yes Block Sparse Common primary index Dense Primary Index Primary key No Record Dense Used if file unordered Secondary Index Non-ordering field No Record Dense One per record Sparse Secondary Index Non-ordering field No – ❌ Not practical Secondary Index on Key Attribute Non-ordering unique field No Record Dense Each record unique Clustering Index Non-key attribute (with duplicates) Yes Distinct value Spar