Tuesday, July 18, 2006

File organization

File organization - what is hash join?
book: Database management systems

heap file
- Records in a heap file are stored in random order across the pages of the file.

record id (rid)
- Each record in a file has a unique identifier called a record id, or rid for short. An rid has the property that we can identify the disk address of the page containing the record by using the rid.
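
To make the two definitions above concrete, here is a minimal sketch of a heap file whose rids are (page number, slot number) pairs. That representation, the fixed page capacity, and the names Rid, HeapFile, insert and fetch are assumptions made for illustration; the notes only require that the rid lets us find the page holding the record.

```python
from dataclasses import dataclass, field
from typing import NamedTuple


class Rid(NamedTuple):
    page_no: int  # which page of the file holds the record
    slot_no: int  # position of the record within that page


@dataclass
class HeapFile:
    page_capacity: int  # records per page (hypothetical fixed size)
    pages: list = field(default_factory=list)

    def insert(self, record) -> Rid:
        # Records go wherever there is room; no ordering by key is maintained.
        if not self.pages or len(self.pages[-1]) >= self.page_capacity:
            self.pages.append([])
        self.pages[-1].append(record)
        return Rid(len(self.pages) - 1, len(self.pages[-1]) - 1)

    def fetch(self, rid: Rid):
        # The rid alone tells us which page to read.
        return self.pages[rid.page_no][rid.slot_no]
```

Usage: with page_capacity=2, the third inserted record lands on page 1, slot 0, and fetch with that rid returns it without looking at any other page.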

Page 276
There are three main alternatives for what to store as a data entry in an index:
(1) A data entry k* is an actual data record (with search key value k).
(2) A data entry is a (k, rid) pair, where rid is the record id of a data record with search key value k.
(3) A data entry is a (k, rid-list) pair, where rid-list is a list of record ids of data records with search key value k.
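
The three alternatives can be sketched as three small record types. This is only an illustration: the class names, the integer key, the dict used for a data record, and the (page number, slot number) form of an rid are all assumptions, not something the book prescribes.

```python
from dataclasses import dataclass
from typing import List, Tuple

Rid = Tuple[int, int]  # hypothetical (page number, slot number) pair


@dataclass
class Alt1Entry:
    """Alternative (1): the data entry is the data record itself."""
    key: int
    record: dict


@dataclass
class Alt2Entry:
    """Alternative (2): one (k, rid) pair per matching record."""
    key: int
    rid: Rid


@dataclass
class Alt3Entry:
    """Alternative (3): one (k, rid-list) pair per key value."""
    key: int
    rid_list: List[Rid]


def group_into_alt3(entries: List[Alt2Entry]) -> List[Alt3Entry]:
    """Collapse Alternative (2) entries that share a key into Alternative (3) entries."""
    grouped = {}
    for e in entries:
        grouped.setdefault(e.key, []).append(e.rid)
    return [Alt3Entry(k, rids) for k, rids in sorted(grouped.items())]
```

The helper shows how (2) and (3) relate: when several records share a key, (2) repeats the key once per record, while (3) stores the key once with all of the matching rids.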

Clustered Indexes:
- When a file is organized so that the ordering of data records is the same as or close to the ordering of data entries in some index, the index is said to be clustered; otherwise, it is an unclustered index.
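
One way to see why the distinction matters is to walk the data entries in index order and count how often the next record lives on a different page than the previous one. The sketch below assumes rids are (page number, slot number) pairs and that no pages are cached between accesses; the function name and the sample rid lists are made up for illustration.

```python
def page_visits(rids_in_index_order):
    """Count page visits when records are fetched in index order.

    `rids_in_index_order` is a list of (page_no, slot_no) pairs listed in the
    order of the index's data entries (hypothetical representation). A visit
    is counted every time the page differs from the one just read.
    """
    visits = 0
    previous_page = None
    for page_no, _slot in rids_in_index_order:
        if page_no != previous_page:
            visits += 1
            previous_page = page_no
    return visits


# Data records laid out in the same order as the data entries (clustered).
clustered = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)]
# Same six records, but the data entries point all over the file (unclustered).
unclustered = [(2, 0), (0, 1), (1, 1), (0, 0), (2, 1), (1, 0)]
```

For these sample lists the clustered order touches each of the three pages once, while the unclustered order moves to a different page for every one of the six records.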


8.4 COMPARISON OF FILE ORGANIZATIONS
Assume
- files and indexes are organized according to the composite search key
- all selection operations are specified on the fields of that search key

Organizations:
- File of randomly ordered employee records, or heap file
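- File of employee records sorted on the composite search key
- Clustered B+ tree file with the composite search key as its search key
- Heap file with an unclustered B+ tree index on the composite search key
- Heap file with an unclustered hash index on the composite search key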

Cost model:
- 'B' denotes the number of data pages for a table
- 'R' denotes the number of records per page
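
As a small worked example of how B and R are used, the sketch below computes the number of records in a table and the number of pages a full scan has to read. The function names and the sample values (1000 pages, 100 records per page) are hypothetical; only the two symbols come from the notes.

```python
def total_records(B: int, R: int) -> int:
    """Records in a table with B data pages and R records per page."""
    return B * R


def pages_read_by_full_scan(B: int) -> int:
    """A scan of the whole file has to read every data page once."""
    return B


# Hypothetical table: 1000 data pages, 100 records on each page.
B, R = 1000, 100
records = total_records(B, R)
scan_pages = pages_read_by_full_scan(B)
```

With these sample values the table holds 100,000 records, and scanning it reads all 1000 pages regardless of how many records match.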
