1 Introduction
1.1 What is Bioinformatics?
Bioinformatics is a recently coined term for a branch of science that
straddles the traditional domains of biology and informatics, the latter
itself a new area of research. Bioinformatics is primarily concerned with the
creation and application of information-based methodologies to the analysis
of biological data sets and the subsequent exploitation of the information
contained therein. The widespread adoption of technologies such as
microarrays, together with large-scale genome sequencing projects, means that
data are now generated daily in volumes far too large for manual examination
and subsequent exploitation.
This calls for suitable informatics tools for automated feature extraction
and analysis of these data sets, and the tools provided by bioinformatics are
intended to fill this gap.
In addition, biological systems and the processes driving them are
intrinsically noisy, or “fuzzy”, in nature. As a result, any data or
observations derived from them will inevitably be noisy as well. Because of
this, the mathematical techniques applied to biological data sets must be
able to handle the uncertainty that is invariably present in the data.
Statistical methods are the natural solution to this problem.
The effective use of bioinformatics therefore requires a sound mastery of the
underlying mathematical, and in particular statistical, principles. This short
course has been designed to provide a suitable starting point from which the
bioinformatics course may be approached more effectively.
The objective is to introduce all the relevant statistical concepts so that the
algorithms and methodologies used in bioinformatics can be more readily
understood and more effectively applied.