Biological Data Analysis

At the University of Tartu, J. Liivi Street 2, Tartu, Estonia.
Distance learning: 20 November, 2006 – 20 December, 2007;
Face-to-face meeting: 29 January – 3 February, 2007.

The field of bioinformatics is highly interdisciplinary. It relies heavily on other fields such as biology, computer science and statistics. Within the framework of the “NORDIC-BALTIC-RUSSIAN ACADEMIC NETWORK IN BIOINFORMATICS”, three courses are planned to cover important aspects of bioinformatics. The "Biological Data Analysis" course is the second of the three. Students are strongly encouraged, but not obliged, to plan to participate in all three courses.

The aim of the "Biological Data Analysis" course is to provide students with basic knowledge and elementary skills needed for the statistical analysis of biological data.

In the preparatory, distance-learning part of the course, students will be asked to download the free software package R and to practice simple numerical and graphical tools for data description using this software. R is very flexible software for biological data analysis and implements a very large number of methods, only a small subset of which will be covered during this course. The same software and its extensions will also be used in the next course, on bioinformatics.

The one-week face-to-face module will be held in Tartu, using the facilities of the Faculty of Mathematics and Computer Sciences, University of Tartu.

First, the concepts of randomness, random sampling and sampling variability will be introduced. Drawing conclusions from data while taking uncertainty into account is based on confidence intervals and statistical hypothesis testing. Their general principles will be presented together with practical examples of data analysis in R.
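
As a rough illustration of sampling variability and confidence intervals (not taken from the course materials, which use R), the following sketch in Python repeatedly draws random samples from a simulated population and checks how often an approximate 95% confidence interval for the mean contains the true value. The population parameters, the sample size and the helper name `sample_mean_ci` are assumptions chosen for the example, and the interval uses the normal approximation (z = 1.96) rather than the t-distribution.

```python
import random
import statistics


def sample_mean_ci(sample, z=1.96):
    """Return (mean, lower, upper) of an approximate 95% confidence interval.

    Uses the normal approximation: mean +/- z * standard error.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / n ** 0.5
    return mean, mean - z * standard_error, mean + z * standard_error


# Hypothetical population and sampling design (assumed values).
TRUE_MEAN, TRUE_SD, SAMPLE_SIZE, REPETITIONS = 50.0, 10.0, 40, 1000

rng = random.Random(2007)  # fixed seed so the run is reproducible
sample_means = []
covered = 0
for _ in range(REPETITIONS):
    sample = [rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    mean, lower, upper = sample_mean_ci(sample)
    sample_means.append(mean)
    if lower <= TRUE_MEAN <= upper:
        covered += 1

coverage = covered / REPETITIONS
spread_of_means = statistics.stdev(sample_means)

# The sample means vary from sample to sample (sampling variability);
# their spread should be close to TRUE_SD / sqrt(SAMPLE_SIZE).
print(f"spread of sample means: {spread_of_means:.2f}")
print(f"theoretical standard error: {TRUE_SD / SAMPLE_SIZE ** 0.5:.2f}")
print(f"share of intervals containing the true mean: {coverage:.3f}")
```

The share of intervals that contain the true mean should come out close to, though slightly below, 95%, because the normal approximation is a little too narrow for samples of this size.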

Next, some methods for exploring and testing associations in the data will be considered: simple two-sample tests and basic statistical models such as (simple and multivariate) linear regression and analysis of variance. Finally, some tools for dimensionality reduction and cluster analysis will be explored. The practical exercise sessions will be held in a computer lab.
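
To give a feel for two of the methods named above, here is a minimal sketch in Python (again only an illustration; the course itself works in R). It computes a Welch two-sample t statistic and fits a simple linear regression by ordinary least squares. The function names `welch_t` and `fit_line` and all data values are made up for the example.

```python
import statistics


def welch_t(a, b):
    """Welch two-sample t statistic (does not assume equal variances)."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    standard_error = (var_a / len(a) + var_b / len(b)) ** 0.5
    return (statistics.fmean(a) - statistics.fmean(b)) / standard_error


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept).
    """
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical measurements in two groups (made-up numbers).
treated = [5.1, 4.9, 5.6, 5.8, 6.0]
control = [4.1, 4.5, 4.0, 4.8, 4.3]
t_statistic = welch_t(treated, control)
print(f"Welch t statistic: {t_statistic:.2f}")

# Hypothetical dose-response data (made-up numbers).
dose = [1, 2, 3, 4, 5]
response = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(dose, response)
print(f"fitted line: response = {intercept:.2f} + {slope:.2f} * dose")
```

A large absolute t statistic suggests a difference between the group means. Turning it into a p-value requires the t-distribution, which this sketch leaves out.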

The course will consist of:

  1. a distance-learning part: one month before the course, students will receive the pre-reading and some exercises;
  2. a face-to-face part, in which Estonian and Norwegian teachers give lectures and practical sessions are held in a computer lab.