The Structure of
Genes and Genomes
Negative control experiment: degrade the DNA
with DNase → virulence is no longer transmitted
– purine
• Adenine: A
• Guanine: G
– pyrimidine
• Thymine: T
• Cytosine: C
[Figure: nucleotide, with labels N, P, C, O, sugar, OH and the 5’ and 3’ positions]
The structure of DNA:
The Double Helix
WATSON, J.D. & CRICK, F.H.C. A Structure for
Deoxyribose Nucleic Acid. Nature 171, 737-738 (1953)
James Watson, Francis Crick and Maurice Wilkins, Nobel Prize 1962
The double helix
• DNA normally consists of two
antiparallel polynucleotide chains
– sugar–phosphate backbone
• phosphodiester bonds
• 5’ to 3’ connection
– complementary base pairs
• A–T
• G–C
• hydrogen bonds
– 2 per A – T
– 3 per G – C
• 5’ → 3’ chain polarity
• Major and minor grooves (see
model)
5’-AATTGGCCGATC-3’
3’-TTAACCGGCTAG-5’
Figure 1.9 Genomes 3 (© Garland Science 2007)
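The base-pairing and antiparallel rules can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the helper names (COMPLEMENT, H_BONDS, complement, reverse_complement, hydrogen_bonds) are hypothetical, and the input is the example duplex shown above.

```python
# Watson-Crick pairing rules: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Hydrogen bonds per base pair, as listed on the slide (2 per A-T, 3 per G-C).
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def complement(strand: str) -> str:
    """Return the paired bases in the same left-to-right order (so it reads 3'->5')."""
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand: str) -> str:
    """Return the partner strand written in the conventional 5'->3' direction."""
    return complement(strand)[::-1]


def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds in the duplex formed by `strand` and its partner."""
    return sum(H_BONDS[base] for base in strand)


top = "AATTGGCCGATC"              # top strand, written 5'->3'
print(complement(top))            # partner read 3'->5': TTAACCGGCTAG
print(reverse_complement(top))    # partner read 5'->3': GATCGGCCAATT
print(hydrogen_bonds(top))        # 6 A-T pairs * 2 + 6 G-C pairs * 3 = 30
```

Writing the partner strand both ways makes the antiparallel arrangement explicit: the same molecule reads TTAACCGGCTAG from its 3’ end and GATCGGCCAATT from its 5’ end.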
Rosalind Franklin
“The Dark Lady of DNA”
1920-1958
"The instant I saw the picture my mouth fell open and my pulse began
to race.... the black cross of reflections which dominated the picture
could arise only from a helical structure... mere inspection of the X-ray
picture gave several of the vital helical parameters." - J.D. Watson
X-ray diffraction studies by Franklin and Wilkins
revealed that DNA is helical, with two
distinctive regularities of 0.34 nm and 3.4 nm along
the axis of the molecule. They also showed
that DNA has a uniform thickness of 2 nm.
[Figure: Rosalind Franklin and Maurice Wilkins; helix dimensions labeled 2 nm, 3.4 nm and 10 bp]
The DNA double helix is 2 nm wide.
A stack of 10 base pairs (= one turn of the
helix) has a linear length of 3.4 nm.
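The helix dimensions translate directly into molecule lengths. The sketch below is only an illustration using the slide's values (0.34 nm per base pair, 10 bp per turn); the function names are hypothetical, and the genome size of ~3200 Mb is an assumed round figure for a human genome.

```python
RISE_PER_BP_NM = 0.34   # axial rise per base pair (value from the slide)
BP_PER_TURN = 10        # base pairs per helical turn (value from the slide)


def helix_length_nm(n_bp: float) -> float:
    """Linear length, in nm, of a double helix of n_bp base pairs."""
    return n_bp * RISE_PER_BP_NM


def helix_turns(n_bp: float) -> float:
    """Number of helical turns in a double helix of n_bp base pairs."""
    return n_bp / BP_PER_TURN


# One turn of the helix: about 3.4 nm.
print(helix_length_nm(BP_PER_TURN))

# Assumption: a human genome of ~3200 Mb (3.2e9 bp).
genome_bp = 3_200_000_000
genome_length_m = helix_length_nm(genome_bp) * 1e-9   # nm -> m
print(genome_length_m)   # roughly 1.09 m of DNA
```

The point of the second calculation is the contrast in scale: a molecule only 2 nm wide adds up to about a metre of DNA per genome copy.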
Erwin Chargaff’s rules (early 1950s):
in DNA, the amount of A equals the amount of T, and the amount of G equals the amount of C
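Chargaff's rules state that in double-stranded DNA the amount of A equals that of T and the amount of G equals that of C. The sketch below shows why base pairing guarantees this; the function name and the 6-base example sequence are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Translation table implementing A<->T and G<->C pairing.
PAIRING = str.maketrans("ATGC", "TACG")


def duplex_composition(top_strand: str) -> dict:
    """Return the fraction of each base counted over BOTH strands of the duplex."""
    bottom_strand = top_strand.translate(PAIRING)
    counts = Counter(top_strand + bottom_strand)
    total = sum(counts.values())
    return {base: counts[base] / total for base in "ATGC"}


# A single strand can be very biased (here 4 A, 1 G, 1 C, no T)...
composition = duplex_composition("AAAAGC")

# ...but across the duplex A matches T and G matches C.
print(composition)
```

Note that the rules apply to the duplex as a whole: a single strand taken alone need not contain equal amounts of A and T.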
DNA: summary
• Units of measurement
– base pair (bp)
– kilobase (kb)
– megabase (Mb)
• Replication: each strand serves as template
for synthesis of complement, using rules of
base pairing
• Information: specified by sequence of
nucleotides; may be copied into RNA
• Mutation: replacement, insertion, or deletion
of a nucleotide results in an altered sequence
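The three kinds of mutation listed above can be illustrated on a short sequence. This is a minimal sketch with hypothetical helper names; positions are 0-based, and the starting sequence is the example strand used earlier in the slides.

```python
def substitute(seq: str, pos: int, base: str) -> str:
    """Replace the nucleotide at `pos` with `base`."""
    return seq[:pos] + base + seq[pos + 1:]


def insert(seq: str, pos: int, bases: str) -> str:
    """Insert `bases` before position `pos`."""
    return seq[:pos] + bases + seq[pos:]


def delete(seq: str, pos: int, length: int = 1) -> str:
    """Delete `length` nucleotides starting at `pos`."""
    return seq[:pos] + seq[pos + length:]


original = "AATTGGCCGATC"
print(substitute(original, 4, "A"))   # AATTAGCCGATC (G replaced by A)
print(insert(original, 4, "T"))       # AATTTGGCCGATC (one base longer)
print(delete(original, 4))            # AATTGCCGATC (one base shorter)
```

A replacement keeps the length of the sequence unchanged, whereas an insertion or deletion shifts every downstream position.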
2. The Structure of Genes
Structure of genes
• Gene = transcriptional unit
• Gene may encode coding RNA (mRNA) OR
non-coding RNA (tRNA, rRNA, miRNA...)
[Figure: gene structure. A promoter precedes the DNA encoding a functional RNA; the RNA primary transcript runs from the TSS (transcription start site) to the TTS (transcription termination site).]
Primary transcript: E1 I1 E2 I2 E3 I3 E4
(nuclear processing steps, including splicing)
Mature transcript: E1 E2 E3 E4
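The step from primary to mature transcript can be sketched as removing the intron segments and joining the exons in order. This is an illustration only: the segment sequences are made up, and the function simply filters by label, whereas the cell recognizes splice sites in the RNA itself.

```python
# Primary transcript as ordered (label, sequence) segments.
# Assumption: sequences are hypothetical placeholders, not real gene data.
primary_transcript = [
    ("E1", "AUGGCU"),
    ("I1", "GUAAGUCAG"),
    ("E2", "GAUCCA"),
    ("I2", "GUGAGUUAG"),
    ("E3", "UUGCAC"),
    ("I3", "GUAUGUAAG"),
    ("E4", "GGAUAA"),
]


def splice(segments):
    """Keep exons (labels starting with 'E') in their original order."""
    return [(label, seq) for label, seq in segments if label.startswith("E")]


mature_transcript = splice(primary_transcript)
mature_labels = [label for label, _ in mature_transcript]
mature_sequence = "".join(seq for _, seq in mature_transcript)

print(mature_labels)     # ['E1', 'E2', 'E3', 'E4']
print(mature_sequence)   # AUGGCUGAUCCAUUGCACGGAUAA
```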
Spliceosomal introns are present only in eukaryotic genes
(but they may be absent from some eukaryotic genes)
Because of the abundance and large size of introns,
some eukaryotic genes can attain huge sizes
Neuraminidase
Hemagglutinin
Restricted to
vertebrates
By acquiring an envelope gene, a
retrotransposon can gain infectious
capacity and become a retrovirus
(example of an intermediate: gypsy in
fruit flies)
Example: HIV-1
Eukaryotic genome: 2 or 3 genomes per cell
The Mitochondrial Genome
- Circular
- Resembles a reduced
prokaryotic genome in
terms of organization and
gene numbers
- Encodes genes mostly
involved in the production
of energy (oxidative
phosphorylation) and in
translation (tRNAs, rRNAs)
- Maternally inherited
Mitochondrial genomes can vary in
size between species
Human (mammal): 16.5 kb
Marchantia (liverwort): 186 kb
Yeast (fungus): 75 kb
The Chloroplast Genome (plants)
- Circular
- Resembles a reduced
prokaryotic genome in
terms of organization and
gene numbers
- Encodes genes mostly
involved in photosynthesis
and electron transport
- ‘Maternally’ inherited
(transmitted through the
seed)
- Does not vary much in size
Marchantia (liverwort) cpDNA: 121 kb
4. The structure of
eukaryotic chromosomes
Eukaryotic nuclear genome: chromosomes (1)
• Linear structure
• Chromosome number is conserved within species but varies
greatly between species
• Ploidy refers to number of complete sets of chromosomes
– haploid (1n): one complete set of chromosomes (e.g. yeast)
– diploid (2n): two sets of chromosomes (e.g. most animals)
– polyploid (≥3n): more than two sets (e.g. many plants, a few animals)
• In diploids, chromosomes come in homologous pairs
(homologs)
– structurally similar (i.e. size and position of centromere)
– same assortment of genes (homologous genes)
– may carry different alleles of each gene: a gene is either in a
homozygous state (two identical alleles) or a heterozygous state (two different
alleles)
In humans, somatic cells have 2n = 46 chromosomes
Remy’s
karyotype
Human chromosomes 11 and 17
Eukaryotic nuclear genome:
chromosomes (2)
• Cytogenetics: microscopic study of chromosomes
• Variable centromere position
– telocentric: centromere at end
– acrocentric: centromere close to end
– metacentric: centromere in middle
– For human chromosomes: the p arm is the shorter arm, the q arm
the longer
• Telomere: end of chromosome
• Nucleolar organizer region (NOR): The chromosomal region
around which the nucleolus forms (contains rRNA gene
tandem array)
• Chromomere (or knob): small bead-like region of condensed
chromatin visible during meiosis and mitosis
Maize chromosomes (2n = 20)
Eukaryotic nuclear genome:
chromosomes (3)
• Considerable difference in size and in the
number of genes carried on chromosomes, both
between and within species
• Genes may occupy only a minor fraction of a
chromosome (extreme case is human Y)
Human chromosomes: size and gene density
[Figure: bar charts of chromosome size (Mb) and gene density (per 10 Mb) for human chromosomes 1 to 22, X and Y]
Eukaryotic nuclear genome:
chromosomes (4)
• Heterochromatin
– densely stained regions of highly condensed
DNA
– mostly made of non-coding repetitive DNA, with low
gene density and transcriptional activity
• Euchromatin:
– lightly stained, less compact chromatin
– contains most transcribed genes
[Figure: chromatin packaging, with histones H1, H2A, H2B, H3 and H4 and the solenoid]
Electron micrograph of a chromosome
showing long DNA loops emanating
from the protein scaffold (at the
bottom of the image). Note that there are
only loops, no ends, at the top of the
image.
5. Genome landscapes and
Comparative Genomics
Prokaryotes and eukaryotes have very
different genome landscapes
• In prokaryotes, genes are compactly
arranged, with little or no spacer sequences
in between (short intergenic regions) = most
of the genome is coding DNA
• In eukaryotes, there is considerable spacer
DNA between genes (large intergenic
regions) and within genes (introns) = most
of the genome is ‘non-coding’ DNA
Eukaryote genomes:
A whole lotta non-coding DNA
– Where is non-coding DNA? In introns, intergenic
regions, centromeric regions, telomeric regions
– the majority of non-coding DNA is repetitive DNA
= identical or nearly identical repeated units
– two types of repetitive DNA:
• Tandem repeats (e.g. DNA at centromeres and
telomeres)
• Interspersed repeats
– Most interspersed repeats are derived from
mobile genetic elements (aka transposable elements)
Genome Size
• In eukaryotes, most of the cell DNA is
from the nuclear genome
• Genome size is measured in pg or Mb
(1 pg ≈ 1000 Mb); the human genome is ~3.2 pg
• Nuclear genome size is extremely
variable among eukaryote species
• ‘C-value paradox’ : no obvious
correlation between genome size and
organism complexity
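The pg-to-Mb conversion can be written as a one-line function. The sketch below uses the slide's rule of thumb (1 pg ≈ 1000 Mb) as the default; the constant of 978 Mb per pg is a commonly used more precise factor and is an addition here, as are all the names.

```python
MB_PER_PG_ROUGH = 1000   # rule of thumb from the slide
MB_PER_PG = 978          # assumption: more precise factor, not from the slides


def pg_to_mb(picograms: float, mb_per_pg: float = MB_PER_PG_ROUGH) -> float:
    """Convert a DNA mass in picograms to a genome size in megabases."""
    return picograms * mb_per_pg


def mb_to_pg(megabases: float, mb_per_pg: float = MB_PER_PG_ROUGH) -> float:
    """Convert a genome size in megabases to a DNA mass in picograms."""
    return megabases / mb_per_pg


human_pg = 3.2
print(pg_to_mb(human_pg))              # about 3200 Mb with the rule of thumb
print(pg_to_mb(human_pg, MB_PER_PG))   # about 3130 Mb with the finer factor
```

The two factors differ by about 2%, which is negligible next to the many-thousand-fold variation in genome size among eukaryotes.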
The ‘C-value paradox’
Gregory 2005
The C-value of eukaryotic nuclei varies ~200,000-fold, but
there is only ~20-fold variation in the number of
protein-coding genes
[Figure: bar chart of genome size (Mb, 0 to 3000) across eukaryotes (Plasmodium, slime mold, budding yeast, fission yeast, Neurospora, Arabidopsis, Brassica, rice, maize, nematode, Drosophila, mosquito, sea squirt, zebrafish, Fugu, mouse, human), showing genomic DNA, TE DNA and protein-coding genes]
The amount of TE DNA correlates positively with genome size
What have we learned from the human genome sequence?
2001: first draft of the human genome sequence
Most of the human genome does not code for proteins
[Figure: pie chart of coding vs. non-coding DNA; coding = 1.5%]
Half of the human genome is derived from
Transposable Elements (TEs)
[Figure: pie chart with slices TE-derived DNA, coding and non-coding; labeled values 48.5% and 1.5%]
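These proportions are easier to grasp as absolute amounts of DNA. The sketch below is a rough back-of-the-envelope calculation: it assumes a genome of ~3200 Mb (3.2 pg at ~1000 Mb per pg), takes 1.5% as the coding fraction, and reads "half" as 0.5 for the TE-derived fraction. All names are hypothetical.

```python
GENOME_MB = 3200          # assumption: ~3.2 pg at ~1000 Mb per pg
CODING_FRACTION = 0.015   # 1.5% coding
TE_FRACTION = 0.5         # "half" of the genome, taken as 0.5


def fraction_to_mb(fraction: float, genome_mb: float = GENOME_MB) -> float:
    """Amount of DNA, in Mb, corresponding to a fraction of the genome."""
    return fraction * genome_mb


coding_mb = fraction_to_mb(CODING_FRACTION)
te_mb = fraction_to_mb(TE_FRACTION)

print(coding_mb)           # about 48 Mb of protein-coding sequence
print(te_mb)               # about 1600 Mb of TE-derived sequence
print(te_mb / coding_mb)   # TE-derived DNA outweighs coding DNA ~33-fold
```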
The human Genome Browser at UCSC
A snapshot of the Human Genome
Genes
Conservation in other species
TEs
TEs are the most rapidly changing components of the
genome
[Figure: genome browser snapshot with gene (exon), conservation and TE tracks; labels mark a conserved exon and human-specific, ape-specific and primate-specific TEs]
Rapid changes in genome size in the grasses
[Figure: grass phylogeny with divergence times of ~50 myr and ~10 myr]
Genome size: 4800 Mb, 430 Mb, 750 Mb, 2500 Mb
San Miguel et al. (1996) Nested Retrotransposons in the Intergenic Regions of the
Maize Genome. Science 274: 765-768
(+ other studies from Bennetzen lab)
Retrotransposon amplification has resulted in the
doubling of the maize genome in the last ~6 myr
Comparative genomics
• Study of similarities and differences among
genomes
• Many genes are shared among all living things or
between related groups
• Study of genes in model organisms provides useful
information regarding genes in other organisms
• Large genome projects produce a considerable
amount of information
– Requires computer analysis and development of new
software to analyze the avalanche of data (bioinformatics)
What have we learned from the human genome sequence?
2001: first draft of the human genome sequence
1996: S. cerevisiae
1998: C. elegans
2000: D. melanogaster, A. thaliana
2001: H. sapiens
2002: S. pombe, F. rubripes, P. falciparum, A. gambiae, O. sativa
THIS WEEK’S MENU
What is the structure of DNA?
What is the organization of a gene?
What are chromosomes? What is chromatin?
How is DNA organized at the chromosome and chromatin level?
How are genes organized in the genome?
What makes the genome of a prokaryote and a eukaryote different?
What’s in a genome?
How are genomes different among eukaryotes?
Overview
• Each species has a unique, fundamental set of
genetic information, its genome.
• The genome is composed of one or more DNA
molecules, each organized as a chromosome.
• The prokaryotic genome is generally compact and
made of a single circular chromosome.
• The eukaryotic nuclear genome consists of one or more sets
of linear chromosomes confined to the nucleus.
• A gene is a segment of DNA that is transcribed
into a ‘functional’ RNA molecule.
• Introns interrupt many eukaryotic genes.
• Eukaryotic genomes are littered with repetitive
DNA (mostly derived from transposable elements).