The team of professor Heikki Mannila (University of Helsinki, Aalto University) has its background in algorithms and data mining. The main research emphasis is on sequence analysis (e.g., finding orders and segmentation), mining complex and heterogeneous data (especially pattern discovery), and structure discovery in high-dimensional data sets. The team has strong ties to application groups in gene mapping and genome structure, environmental sciences, linguistics, and telecommunications.
Subgroups
Data Mining
- Group leader: professor Heikki Mannila (until 2012, University of Helsinki, Aalto University)
The core Data Mining group studies the basic theory of data mining, develops data mining methods, and applies them to problems arising in other sciences and in industry.
Parsimonious Modelling
- Group leader: Jaakko Hollmén
- Group homepage: http://www.hiit.fi/pm/
The mission of the parsimonious modelling research group is to develop computational methods for data analysis and to apply these methods to two particular application fields: cancer genomics and environmental informatics. Both of these application fields exhibit problems of high-dimensional data and complex, unknown interactions between measurements. Parsimonious modelling aims at achieving maximally simple or compact models as a result of the data analysis process. In practical problems, parsimony makes results more understandable and interpretable.
Combinatorics, Algebra, and Computing (CO-ALCO)
- Group leaders: Mikko Koivisto and Petteri Kaski
The group develops and applies combinatorial and algebraic tools for computational problems, focusing on exact deterministic algorithms. Applications range from fundamental combinatorial problems to computational tasks associated with established probabilistic models in machine learning and data mining.
Phenomics
- Group leaders: professor Heikki Mannila (until 2012, University of Helsinki, Aalto University) and Mikko Koivisto
The Phenomics group develops and applies data mining techniques to identify new phenotypic and genotypic associations in population sample databases.