Updated 25.03.2008
58308110 Seminar: Management of Biological Databases (3 cu), spring 2008
17.01.--21.02. Th 10--12 C220, 13.03.--24.04. Th 10--12 C220
Instructor: Jan Lindström, PhD
Overview.
As of 2006, there are over 1,000 public and commercial biological databases. These biological databases usually contain genomics and proteomics data, but databases are also used in taxonomy. The data are nucleotide sequences of genes or amino acid sequences of proteins. Furthermore, information about function, structure, localisation on the chromosome, and clinical effects of mutations, as well as similarities between biological sequences, can be found.
Biological databases have become an important tool in helping scientists understand and explain a host of biological phenomena, from the structure of biomolecules and their interactions, to the whole metabolism of organisms, to the evolution of species. This knowledge supports the fight against diseases, assists in the development of medications, and helps in discovering basic relationships amongst species in the history of life.
The biological knowledge is distributed amongst many different general and specialized databases. This sometimes makes it difficult to ensure the consistency of information. Biological databases cross-reference other databases with accession numbers as one way of linking their related knowledge together.
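As a rough illustration of cross-referencing by accession number, the sketch below models two toy databases as Python dictionaries and follows the references from one record into the other. Everything in it is an assumption made for illustration: the database names, the record layout, the `resolve_xrefs` helper, and all accession numbers are made up and do not correspond to any real database. It also shows how a reference that points to a missing record surfaces as a consistency problem.

```python
# Toy illustration only: both "databases" and all accession numbers are made up.
from typing import Dict, List, Tuple

# Hypothetical sequence database, keyed by accession number.
# Each record lists cross-references (xrefs) into other databases.
sequence_db: Dict[str, dict] = {
    "SEQ0001": {"description": "example gene A",
                "xrefs": {"protein_db": ["PRT0001"]}},
    "SEQ0002": {"description": "example gene B",
                "xrefs": {"protein_db": ["PRT0002", "PRT9999"]}},
}

# Hypothetical protein database, also keyed by accession number.
protein_db: Dict[str, dict] = {
    "PRT0001": {"name": "example protein A"},
    "PRT0002": {"name": "example protein B"},
}


def resolve_xrefs(
    accession: str,
    source: Dict[str, dict],
    targets: Dict[str, Dict[str, dict]],
) -> Tuple[List[Tuple[str, str, dict]], List[Tuple[str, str]]]:
    """Follow the cross-references of one record.

    Returns (resolved, dangling): resolved holds (database, accession, record)
    triples; dangling holds (database, accession) pairs with no target record.
    """
    resolved: List[Tuple[str, str, dict]] = []
    dangling: List[Tuple[str, str]] = []
    for db_name, accessions in source[accession]["xrefs"].items():
        target = targets.get(db_name, {})
        for acc in accessions:
            if acc in target:
                resolved.append((db_name, acc, target[acc]))
            else:
                dangling.append((db_name, acc))
    return resolved, dangling


if __name__ == "__main__":
    found, missing = resolve_xrefs(
        "SEQ0002", sequence_db, {"protein_db": protein_db}
    )
    print("resolved:", found)
    print("dangling:", missing)
```

In this toy setup, the record `SEQ0002` refers to `PRT9999`, which does not exist in the protein database, so it is reported as dangling. Real databases are maintained independently of one another, which is one reason such inconsistencies can arise.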
An important resource for finding biological databases is a special yearly issue of the journal Nucleic Acids Research (NAR). The Database Issue of NAR is freely available, and categorizes many of the publicly available online databases related to biology and bioinformatics.
In this seminar, we cover topics related to biological database management.
Prerequisites.
All participants must have a bachelor's degree or have passed the Scientific Writing course. Background in basic database management is required. Knowledge about biological databases is a plus, but not required.
Structure of the Seminar
The language of the seminar is English. To pass the seminar, you need to do the following four tasks:
- 1. Write a paper about a topic agreed during the first meetings,
- 2. Review two papers written by other students,
- 3. Prepare a presentation and discuss it with the other students, and
- 4. Participate in the seminar by asking questions, raising discussions on the topic, and reviewing other students' work.
IEEE guidelines for the paper (LaTeX and Word) can be found in the IEEE Transactions author guide: http://www.ieee.org/pubs/authors.html.
Schedule
- 17.1.2008: Opening session
- 28.2.2008: First version of the paper is due.
- 10.3.2008: Reviews of assigned papers are due.
- 20.3.2008: Jing Tang: Combining Biological Databases and Text Mining to Support New Bioinformatics Applications (updated)
- 03.4.2008:
- 10.4.2008:
- 17.4.2008:
- 17.4.2008: Chen Ping: Incorporating Goal Analysis in Database Design: A Case Study from Biological Data Management. (updated)
- 24.4.2008: Hasegawa Hitomi: The Challenges of Modeling Biological Information for Genome Databases. (updated)
- 24.4.2008: Töyli Terhi: bdbms - A Database Management System for Biological Data. (updated)
- 24.4.2008: Final papers are due.
Grading
Students will be graded based on i) their written paper (40%), ii) their oral presentation (40%), and iii) their activity in commenting on other students' work and participating in the discussion (20%). To pass the course, each student must write the paper on the agreed subject and present their work. In addition, each student is required to attend at least 80% of the seminar presentations.
List of possible topics
- Mohamed Y. Eltabakh, Mourad Ouzzani, Walid G. Aref: bdbms - A Database Management System for Biological Data. CIDR 2007: 196-206.
- Ying Jin, Matthew L. Tescher, Huaqin Xu: Middleware Support to the Specification and Execution of Active Rules for Biological Database Constraint Management. IRI 2007: 406-411.
- Lei Jiang, Thodoros Topaloglou, Alexander Borgida, John Mylopoulos: Incorporating Goal Analysis in Database Design: A Case Study from Biological Data Management. RE 2006: 196-204.
- Lei Jiang, Thodoros Topaloglou, Alexander Borgida, John Mylopoulos: Incorporating Goal Analysis in Database Design: A Case Study from Biological Data Management. SEBD 2006: 14-19.
- Zhiyong Peng, Yuan Shi, Boxuan Zhai: Realization of Biological Data Management by Object Deputy Database System. T. Comp. Sys. Biology 2006: 49-67.
- Gautam B. Singh: Learning Information Patterns in Biological Databases. The Data Mining and Knowledge Discovery Handbook 2005: 1139-1158.
- Joshua Wing Kei Ho, Tristan Manwaring, Seok-Hee Hong, Uwe Röhm, David Cho Yau Fung, Kai Xu, Tim Kraska, David Hart: PathBank: Web-Based Querying and Visualization of an Integrated Biological Pathway Database. CGIV 2006: 84-89.
- Mousheng Xu, Susan Gauch: Associated Biological Information Retrieval from Distributed Databases. CIKM 1998: 193-200.
- Isabelle Mougenot, Thérèse Libourel, Patrice Déhais: Genetic Sequence Annotation within Biological Databases. DASFAA 1995: 333-341.
- N. Parimala: Graphical User Interface to Multiple Biological Databases. DEXA Workshops 2003: 50-54.
- Rami Rifaieh, Roger Unwin, Jeremy Carver, Mark A. Miller: SWAMI: Integrating Biological Databases and Analysis Tools Within User Friendly Environment. DILS 2007: 48-58.
- Anthony Kosky, I-Min A. Chen, Victor M. Markowitz, Ernest Szeto: Exploring Heterogeneous Biological Databases: Tools and Applications. EDBT 1998: 499-513.
- Shamkant B. Navathe, Andreas M. Kogelnik: The Challenges of Modeling Biological Information for Genome Databases. Conceptual Modeling 1997: 168-182.
- Mark Newsome, Cherri M. Pancake, Joe Hanus: HyperSQL: Web-based Query Interfaces for Biological Databases. HICSS (4) 1997: 329-339.
- René Witte, Christopher J. O. Baker: Combining Biological Databases and Text Mining to Support New Bioinformatics Applications. NLDB 2005: 310-321.
- Carlos A. Heuser, Luiz Fernando Bessa Seibel, Eduardo Kroth: Integrating Biological Databases. SBBD 2003: 3.
- Ela Hunt, Malcolm P. Atkinson, Robert W. Irving: A Database Index to Large Biological Sequences. VLDB 2001: 139-148.
- Guochun Xie, Reynold DeMarco, Richard Blevins, Yuhong Wang: Storing biological sequence databases in relational form. Bioinformatics 16(3): 288-289 (2000).
- Add your own topic here!
Links
Jan Lindstrom (Jan.Lindstrom@cs.Helsinki.FI)