Tietokannan hallinta - exercise 2/spring 2002

Database management, Spring 2002, Exercise 2 (8.-12.4.)

The minimum requirement for active attendance: 3 tasks done

1. Assume that the relation PROJECT (in the COMPANY database) is implemented as a) a heap file, b) an ordered (sorted) file. Explain how is the query

   select * from PROJECT where PNUMBER > 10 and PNUMBER < 21;

executed. In b you should find two different principles for the execution.

2. A hash file structure contains B = 10 buckets. The hash key is integer-valued and the hash function h is defined by the equation h(v) = v mod B. A disk block can hold three records of the file.

a) Give the contents of the file structure when records with the keys 60, 86, 40, 18, 92, 50, 30, 22, 36, 35, 46, 66, 11, 15, 65, 32, 76, 16, 21, 56 (in this order) have first been inserted into the file and when the records with keys 18, 40, 46, and 56 have then been deleted from the file.

b) What is the average number of disk accesses for a search with a random key after the insertions? What is the same average after the deletions?

3. What attributes of the relations in the COMPANY database are a) suitable, b) unsuitable as a hash key?
c) Give three examples of different kind of hash functions for some attribute regarded suitable.

4. The relation WORKS_ON is implemented as a hash-based file, with ESSN as the hash key. How would you execute the following query? How many disk accesses it requires? How much buffer space is needed?
(Obs. The query and the implementation structure of the relation are quite (even "too" ?) closely compatible here - at least if we think the other uses the relation may have. The purpose of this special case is only to point out how hash structure could be used.)

        select ESSN, count(PNO)
        from WORKS_ON
        group by ESSN;

5. Explain how the following SQL statements are executed (in terms of operations on datafile and the two indexes) for a relation emp(ssn, name, salary) given on pages 7 to 9 of the (Finnish) lecture notes for the chapter 3 (chapter 3 = luku 3):

a) update emp set name = 'Franck' where name = 'Frank';
b) update emp set name = 'Takala' where name = 'Hakala';
c) update emp set ssn = 9122 where ssn = 1122;

It should be possible to use this example in spite of the Finnish language; there is a sparse (non-dense) index on name (index record: name and page number), and a dense index on ssn (index record: ssn and tuple identifier (tid) containing page number and record number)). A datafile of 7 records is also given.

6. Consider creating a sparse ISAM index (see page 168 in E&N3) for a set of data records with keys given in task 2 (60, 86, etc.). We assume that a disk page (disk block) can hold three data records or four index records. At creation time, we use a 100 % fill-factor, that is, every data page (maybe except he last) is filled with three records.

a) Show the contents of the structure after it is created. Explain then how the structure changes when the records with key values 66 and 50 are deleted, and how records with keys 20, 38, 45, 41, and 37 are then inserted.

b) Give an example for a sequence of records inserted into the structure (after the creation phase) with the following property: After the insertions, one additional disk access is needed (on average) as compared to the structure before these insertions. (The principle is sufficient, not the exact sequence of keys. Calculating the average means that every key of the file is searched with equal probability.)

The Finnish lecture notes are available via the Finnish course page ('Luentokalvot') in pdf-format or in a course folder in room A412 (on paper).