
BIOINFORMATICS

DEFINITION
Application of computational techniques to understand and organize the information associated with biological macromolecules

AIMS
To organize data in a way that allows researchers to access existing information & to submit new entries as they are produced
To develop tools & resources that aid in the analysis of data
To use the tools to analyze the data & interpret the results in a biologically meaningful manner

SOURCE OF DATA USED
Raw DNA sequence
Protein sequence
Macromolecular structure
Genomes
Gene expression
Literature
Metabolic pathways

PROTEIN SEQUENCE DATABASES
Data classification between genomes & their products
Categorized as primary, composite & secondary
Primary databases act as a repository for the raw data (e.g. SWISS-PROT)
Composite databases compile & filter sequence data from different primary databases to produce combined non-redundant sets that are more complete than the individual databases (e.g. OWL)
Secondary databases contain information derived from protein sequences & help the user determine whether a new sequence belongs to a known protein family (e.g. PROSITE)

STRUCTURAL DATABASES
Databases of macromolecular structures
PDB provides the primary archive of all 3D structures for macromolecules (proteins, RNA, DNA)
Structures solved by X-ray crystallography and NMR
3 major databases classify proteins by structure to identify structural & evolutionary relationships (CATH, SCOP, FSSP)

NUCLEOTIDE & GENOME SEQUENCES
The biggest excitement is the availability of complete genome sequences for different organisms
Whole-genome sequencing is often conducted through international collaborations, so individual genomes are published at different sites
Entrez genome database combines complete & partial genomes in a single location
Clusters of Orthologous Groups (COG) predict the function of uncharacterized proteins & identify phylogenetic patterns of protein occurrence

GENE EXPRESSION DATA
Measures the amount of mRNA or protein products produced by the cell
3 main technologies: cDNA microarray, Affymetrix GeneChip, SAGE methods
In yeast, some studies measure mRNA levels throughout the whole cell cycle, others focus on a particular stage in the cycle

DATA INTEGRATION
The most profitable research integrates multiple sources of data
Not always straightforward to access because of differences in nomenclature & file formats
Separated according to sources of information

REDUNDANCY & MULTIPLICITY DATA
Data can be grouped together based on biologically meaningful similarities
Genes grouped by particular functions or by metabolic pathway
Organisms often have multiple copies of a particular gene through duplication
Proteins adopt equivalent structures even when they differ greatly in sequence
Analogous proteins have related folds but unrelated sequences
Homologous proteins are both sequentially & structurally similar

THE BIOINFORMATICS SPECTRUM
Allows expansion of biological analysis in two dimensions: depth & breadth
Depth: take a single gene & maximize understanding of the protein it encodes
Breadth: compare a gene with others

APPLICATION

FINDING HOMOLOGUES
Simplifies the problem of understanding complex genomes

RATIONAL DRUG DESIGN
Among the earliest medical applications

LARGE-SCALE CENSUS
Helps identify interesting subject areas for further detailed analysis

GENE EXPRESSION ANALYSIS
Compiles expression data for cells affected by different diseases

TRANSCRIPTION REGULATION

STRUCTURAL STUDIES
Provide valuable insight into the stereochemical principles of binding

GENOMIC STUDIES
Concentrated on model organisms & analysis of regulatory systems

GENE EXPRESSION STUDIES
Focused on devising methods to cluster genes by similarities in expression profiles
To determine proteins that are expressed together under different cellular conditions

INTRODUCTION AND OVERVIEW OF BIOINFORMATICS
FARAH ALIA BT RAHAMAT
2013235384
EH 222 7C
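
The gene expression studies entry mentions clustering genes by similarities in expression profiles. The following is a minimal sketch of one way that idea could look, not the method used in any particular study. It assumes Pearson correlation as the similarity measure and simple single-linkage grouping with a cutoff. The function names, the 0.9 threshold, and the four example genes with their values are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no shape information
    return cov / sqrt(var_x * var_y)


def cluster_by_expression(profiles, threshold=0.9):
    """Group genes whose profiles correlate at or above `threshold`.

    Single linkage: a gene joins a cluster if it correlates with any member,
    and clusters that become linked through that gene are merged.
    """
    clusters = []
    for gene, profile in profiles.items():
        linked = [c for c in clusters
                  if any(pearson(profile, profiles[m]) >= threshold for m in c)]
        merged = [gene]
        for c in linked:
            merged.extend(c)
            clusters.remove(c)
        clusters.append(sorted(merged))
    return sorted(clusters)


# Hypothetical expression levels under four cellular conditions.
example_profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 4.0, 6.2, 8.1],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [8.0, 6.1, 3.9, 2.0],
}

if __name__ == "__main__":
    print(cluster_by_expression(example_profiles))
```

With these made-up values, geneA and geneB rise together while geneC and geneD fall together, so the sketch places them in two separate groups. Correlation compares the shape of a profile rather than its absolute level, which is why geneB can sit at roughly twice the level of geneA and still be grouped with it.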
