Modern biological databases comprise not only data, but also sophisticated
query facilities and bioinformatics data analysis tools. This book explores
the world of Bioinformatics Database Systems.
The book summarizes the popular and innovative bioinformatics repositories
currently available, including primary genetic and protein sequence
databases, phylogenetic databases, structure and pathway databases, microarray
databases and boutique databases. It also explores the data quality and
information integration issues involved in managing bioinformatics
databases, including data quality problems that have been observed in practice
and efforts in the data cleaning field.
This book surveys developments in pattern recognition techniques and approaches
for Computational Molecular Biology. Covering the field broadly, the authors
present fundamental and technical information on these techniques and
approaches and discuss their related problems. The text consists of
twenty-nine chapters, organized into seven
parts: Pattern Recognition in Sequences, Pattern Recognition in Secondary
Structures, Pattern Recognition in Tertiary Structures, Pattern Recognition in
Quaternary Structures, Pattern Recognition in Microarrays, Pattern Recognition
in Phylogenetic Trees, and Pattern Recognition in Biological Networks.
Biologists are stepping up their efforts to understand the biological
processes that underlie disease pathways in clinical contexts. This has
resulted in a flood of biological and clinical data, ranging from genomic and
protein sequences, DNA microarrays, protein interactions, and biomedical images
to disease pathways and electronic health records. To exploit these data for discovering
new knowledge that can be translated into clinical applications, there are
fundamental data analysis difficulties that have to be overcome. Practical
issues such as handling noisy and incomplete data, processing compute-intensive
tasks, and integrating various data sources are new challenges faced by
biologists in the post-genome era. This book covers the fundamentals of
state-of-the-art data mining techniques that have been designed to handle such
challenging data analysis problems, and demonstrates with real applications how
biologists and clinical scientists can employ data mining to make meaningful
observations and discoveries from a wide array of heterogeneous data, from
molecular biology to pharmaceutical and clinical domains.
Bioinformatics, a field devoted to the interpretation and analysis of
biological data using computational techniques, has evolved tremendously in
recent years due to the explosive growth of biological information generated by
the scientific community. Soft computing is a consortium of methodologies that
work synergistically and provide, in one form or another, flexible information
processing capabilities for handling real-life ambiguous situations. Several
research articles dealing with the application of soft computing tools to
bioinformatics have been published in the recent past; however, they are
scattered in different journals, conference proceedings and technical reports,
thus causing inconvenience to readers, students and researchers.
This book, unique in its nature, is aimed at providing a treatise in a unified
framework, with both theoretical and experimental results, describing the basic
principles of soft computing and demonstrating the various ways in which they
can be used for analyzing biological data in an efficient manner. Interesting
research articles from eminent scientists around the world are brought together
in a systematic way such that the reader will be able to understand the issues
and challenges in this domain, the existing ways of tackling them, recent
trends, and future directions. This book is the first of its kind to bring
together two important research areas, soft computing and bioinformatics, in
order to demonstrate how the tools and techniques in the former can be used for
efficiently solving several problems in the latter.
Bioinformatics is the science of managing, mining,
integrating, and interpreting information from biological
data at the genomic, metabolomic, proteomic, phylogenetic,
cellular, or whole organism levels.
The need for bioinformatics tools and expertise has increased as
genome sequencing projects have resulted in an exponential
growth in complete and partial sequence databases.
These and other projects require the development of
new ways to interpret the flood of
biological data that exists today and
that is anticipated in the future.
Data mining, or knowledge discovery from data (KDD),
in its most fundamental form, is the extraction of
interesting, nontrivial, implicit, previously unknown, and
potentially useful information from data.
With the substantial growth of biological data,
KDD will play a significant role in analyzing the data and
in solving emerging problems.
The aim of this book is to introduce the reader
to some of the best techniques
for data mining in bioinformatics (BIOKDD)
in the hope that the reader will build on them to
make new discoveries on his or her own.