Database management, Autumn 2001, Exercise 3 (2.-5.10.)

The minimum requirement for active attendance: 3 tasks done

The tasks marked with '(**)' are counted as two tasks.

1. A file has 1 000 000 records with the length of 200 characters. The page (block) size is 4 KB. Let us use an ISAM structure where the length of the index key is 12 bytes, the page number is 4 bytes, and the entire index record 20 bytes (4 bytes for administration). How many blocks are needed for the file with the following assumptions (cf. EN, Ch. 6.2):

a) the data blocks will be only half loaded (load factor = 50%),
b) the data blocks are organized as a heap file, and a dense ISAM structure is built upon it with a load factor 50% for the leaf level of the ISAM structure,
c) as b, but with the load factor 80%?

2. The relation WORKS_ON is implemented as a heap file. In addition, we have a hash-based secondary index to this relation, the index key being ESSN.

a) How this solution relates to the hash file used in task 2.4 (last week)? Is it better or worse? Why?

b) How the relation WORKS_ON should be implemented if we need grouping on projects (PNO) as well as on employees (ESSN)?

3. (**) Assume that the relation EMPLOYEE is implemented as a heap file and we have the following indexes:
- an ISAM index based on the attribute pair (LNAME, FNAME),
- an ISAM index based on attribute SSN,
- a hash-based index based on attribute DNO.

a) Describe the file structure (data file + indexes) with example values. (See E&N, Figs. 6.4-5)

b) Explain how the indexes can be used. Are there some typical situations (e.g. typical queries) for which this index structure is not sufficient? Are there any other indexes which could help to use the relation EMPLOYEE? (Explain the situation and the structure of the index.)

4. (**) Assume that relation EMPLOYEE has the indexes given in task 3. Explain all operations (record fetches and record updates; both for the indexes and for the data file) when the following queries are executed:

insert into EMPLOYEE values (...);
delete from EMPLOYEE where SSN = '123456789' or SSN = '333445555';
update EMPLOYEE set SALARY = SALARY * 1.1 where SSN = '123456789';
update EMPLOYEE set DNO = 4 where FNAME = 'John' and LNAME = 'Smith';

5. (**) In a B+-tree of order 5 there are 2 to 4 index keys in every node. Assume that the following keys are inserted into an empty B+-tree in that order: 23, 65, 37, 60, 46, 92, 48, 71, 56, 59, 18, 21, 10, 74, 78, 15, 16, 20, 24.

a) Describe how the B+-tree structure is formed (stepwise). (See E&N, Fig. 6.12; the essential changes are of course sufficient.)

b) How the structure is changed when the keys 24, 23, 10, and 20 are deleted?

6. For the file of task 1, estimate the size and the number of disk accesses of two dynamic hash indexes: extendible hashing (E&N, 145-146) and dynamic hashing. The latter is not included in E&N, 3rd edition. It can be found on page 29 of the chapter 3 lecture material or in E&N's older version (E&N2, p. 93). (The 'dynamic hashing' case is optional in this task.)

The Finnish lecture notes are available via the Finnish course page ('Luentokalvot') in pdf-format or in a course folder in room A412 (on paper).