CONVOLUTIONAL CODES
A Dissertation
Doctor of Philosophy
by
Department of Mathematics
Notre Dame, Indiana
June 1999
ABSTRACT

In this dissertation the tools of linear systems theory are used to study convolutional codes and to develop decoding algorithms for them. In particular, the input-state-output representation of a convolutional code is examined. Properties of this representation are explored and its connection with various other representations is detailed. Notable among these are the connections between the syndrome former matrix and the local description of codewords obtained via the input-state-output representation. Also, for codes with rate k/n <= 1/2, the output-state-input representation is developed and the close connections with the input-state-output description are detailed. These connections are then used to enhance certain algebraic decoding schemes for convolutional codes.

Turbo codes are also given a brief treatment. A linear systems representation of these codes is developed and a few remarks on the decoding of turbo codes utilizing this representation are given.

In a much broader application of the input-state-output representation, a matrix Euclidean algorithm is developed. Not only does this algorithm efficiently decide if a given convolutional encoder is catastrophic, it can compute the greatest common left divisor of the encoder matrix in a straightforward manner. This has applications in many areas including finding minimal bases of rational vector spaces, obtaining irreducible matrix fraction descriptions of transfer functions and obtaining a basis for the free module generated by the columns of the matrix.

Finally, the class of Reed-Solomon convolutional codes is presented. These codes are shown to possess maximum distance separable generator and parity check subcodes. This property, along with the ability to use Berlekamp-Massey decoding on the subcodes, makes these convolutional codes ideal for the algebraic decoding scheme presented in this dissertation. Other properties of these codes are then examined, including their column distance function. Some theoretical results, examples and open problems regarding this issue are presented.
ACKNOWLEDGEMENTS

I need to thank the people who encouraged and supported my work toward this dissertation. First, my advisor, Joachim Rosenthal, has encouraged, enlightened and nurtured me as a mathematician. For that, I will always be grateful. I have benefitted greatly from the friendship and advice of my colleagues including Eric York, Paul Weiner, Steve Walk, Jeff Igo, Chris Monico and Roxana Smarandache.

I would like to thank the members of my defense committee, Heide Gluesing-Luerssen, Amarjit Budhiraja and Thomas Fuja, for their time, effort and observations. I would like to thank all of the faculty, staff and graduate students of the Mathematics Department at the University of Notre Dame. I would especially like to thank Alex Hahn and Juan Migliore for all they have done for me. I am also very grateful to Daniel Costello Jr. for all his advice and help.

I need to acknowledge all of the generous institutions which have supported my work. The Department of Mathematics and the Graduate School of the University of Notre Dame have provided an extraordinary learning environment. I am indebted to the Arthur J. Schmitt Foundation and to the Center for Applied Math at Notre Dame for their generous fellowships. I must also thank the Institute for Mathematics and its Applications and the National Science Foundation for their support.

Thanks must be given to my wonderful wife who has supported and encouraged me to an extent that I can never repay. I would also like to thank my family for all they have given me. I will never be able to show them just how much I love them.

I also would like to thank the Lady on The Dome and the entire Notre Dame family for a truly enriching and wonderful experience.
CONTENTS

CHAPTER 1 INTRODUCTION . . . 2
1.1 Communication, Coding and Shannon . . . 2
1.2 Block Codes . . . 3
1.3 Convolutional Codes . . . 4
1.4 Organization of this Dissertation . . . 4

CHAPTER 2 LINEAR SYSTEMS AND CONVOLUTIONAL CODES . . . 6
2.1 Behaviors . . . 6
2.2 Realization . . . 7
2.3 Special Properties of the ISO Representation . . . 9
2.4 An Algebraic Model for Turbo Codes . . . 11

CHAPTER 3 CONNECTIONS BETWEEN REPRESENTATIONS . . . 15
3.1 Generator Matrices and Sequences . . . 15
3.2 Parity Check Matrices . . . 16
3.3 Shift Registers . . . 17
3.4 From ISO Representations to Generator Matrices . . . 18
3.5 From the Local Description to Syndrome Formers . . . 18

CHAPTER 4 A MATRIX EUCLIDEAN ALGORITHM . . . 21
4.1 Background Material . . . 21
4.2 A Brief History of the Problem . . . 22
4.3 A Realization Algorithm . . . 22
4.4 The Controllability Space . . . 24
4.5 The Refining Algorithm . . . 26
4.6 The Situation of Constant Rows . . . 27
4.7 The Algorithm . . . 28
4.8 Examples . . . 29

CHAPTER 5 DECODING ERROR CORRECTING CODES . . . 32
5.1 Channel Models and Maximum Likelihood Decoding . . . 32
5.2 Decoding Parameters . . . 33
5.3 A Few Decoding Algorithms . . . 34
5.4 Majority Logic Decoding of Convolutional Codes . . . 34
5.5 Using the Local Description for Feedback Decoding . . . 36

CHAPTER 6 ALGEBRAIC DECODING OF CONVOLUTIONAL CODES USING THE LOCAL DESCRIPTION . . . 38
6.1 A Basic Algorithm . . . 38
6.2 Some Classes of Binary Input-State-Output Convolutional Codes . . . 40
6.3 Example Codes . . . 41
6.4 An Alternative to State Estimation . . . 42
6.5 Some Notes on the State Elimination Algorithm . . . 42
6.6 Enhanced State Estimation . . . 43
6.7 Analysis of the Enhanced Algorithm . . . 43
6.8 An Algebraic Look at Decoding Turbo Codes . . . 45

CHAPTER 7 MDS CONVOLUTIONAL CODES . . . 47
7.1 Reed-Solomon Convolutional Codes . . . 47
7.2 Further Properties of Reed-Solomon Convolutional Codes . . . 48
7.3 Some Insights into the Conjecture . . . 50
7.4 The Case when δ = 2 . . . 52
7.5 The Road Ahead and a Stronger Conjecture . . . 54

BIBLIOGRAPHY . . . 57
CHAPTER 1
INTRODUCTION

1.1 Communication, Coding and Shannon

One may reasonably define communication as the conveyance, reliable or otherwise, of information. Reliable and efficient communication is becoming an increasingly indispensable tool of the modern world. Indeed, as we are firmly entrenched in the "information age" our society is utterly dependent on reliable communication. Evidence of this dependence can be seen everywhere. Financial markets rely on computers to accurately receive and oversee trading activity. Banking transactions are "wired" instead of delivered by armored cars. Credit cards are verified automatically over phone lines or the internet for each purchase. All of these things involve communication in one form or another. Whether it is a person entering information into a computer, a computer displaying information to a person, two computers sharing data, or simply two people communicating with each other; all are forms of communication.

Of course, communication has come a long way. The earliest forms of communications between humans probably involved some sort of gesturing and primitive oral languages. These evolved into written languages with grammar and syntax. Today, communication has moved into the digital age. Information is encoded as sequences of 0's and 1's to allow for virtually instantaneous microchip-controlled communication.
Reliability in communication has always been important. Indeed almost all communication methods have some inherent error tolerances. In oral language, it is possible, to a large extent, for two people speaking the same language but with different accents and even dialects to communicate very well. In written language, errors in spelling are often easily corrected by the reader. For example, if one encounters the passage "The chilwren did very well in school", one could assume that "children" was the intended word. In fact, my automated spell checker has been urging me to make that very change.
Modern digital communication also has these tolerances for error. It comes in the form of error correcting codes. A reasonable definition of an error correcting code is a pair of sets (M, R) and two maps associated with these sets:

    φ : M → R   and   ψ : R → φ(M)

Here, M is the set of all possible messages, R is the set of all possible received messages, the map φ is injective and is called the encoder and the map ψ is called the estimator or preliminary decoder. Given this framework, we can present a model of communication employing an error correcting code. Suppose the message m ∈ M is to be transmitted. Then we have the following situation.
    m →(φ) r →(noise) r̃ →(ψ) r̄ →(φ^{-1}) m̄
This process can be explained as follows. The message m is encoded under the map φ to the transmitted message r. As r is transmitted over the communication channel it is subjected to various kinds of interference and distortion which is collectively labeled 'noise'. For example, thunderstorms often affect the quality of telephone connections and dust may affect the quality of sound produced by a compact disc. This noise may transform r into a possibly different r̃ ∈ R. Thus, the distorted message r̃ is received and the decoder must map r̃ to a reasonable element, r̄, of φ(M) using the estimator map ψ. Then, the encoding is reversed by applying φ^{-1} to obtain the decoded message m̄ ∈ M. The transmission successfully communicates the message if and only if m equals m̄. It is one of the main focuses of coding theory to develop codes which achieve a high rate of successful communication over a particular channel.
Example 1.1.1 Suppose two possible messages are to be sent. These messages can be represented in a digital communication system by 0 and 1. Hence, M = {0, 1}. We can define φ by φ(0) = 00000 and φ(1) = 11111. We can define ψ by the rule "estimate as 00000 if the received message has more 0's than 1's, and vice versa". Thus if 01011 is received, then ψ(01011) = 11111 and this is decoded as φ^{-1}(11111) = 1.
If we know that for our channel, each bit that is transmitted will be received erroneously with a probability of 0.1, it can easily be computed that a message will be decoded incorrectly with a probability of 0.00856. This means that there will be 11 times fewer errors by using this code as opposed to using no code at all!
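The failure probability quoted here can be checked directly: the majority-vote estimator fails exactly when three or more of the five transmitted bits are flipped. The following short sketch (ours, not part of the original text) recomputes it:

```python
from math import comb

def encode(bit):
    # Repetition encoder: phi(0) = 00000, phi(1) = 11111.
    return [bit] * 5

def decode(word):
    # Majority-vote estimator: pick the bit occurring most often.
    return 1 if sum(word) > len(word) / 2 else 0

# Probability that majority vote fails when each bit flips with p = 0.1:
# at least 3 of the 5 bits must be received in error.
p = 0.1
p_fail = sum(comb(5, i) * p**i * (1 - p)**(5 - i) for i in range(3, 6))

print(decode([0, 1, 0, 1, 1]))  # the received word 01011 decodes to 1
print(round(p_fail, 5))         # 0.00856
```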
The tradeoff in this situation is that instead of having to transmit only one digit to send each message, five digits have to be sent. Over time, this can add up to a lot more time and money to transmit these messages. So the notion of the rate of the error correcting code must be considered. The rate of a code is simply the ratio of the number of information bits (i.e. the length of a message, m ∈ M) to the number of code bits (i.e. the length of φ(m)). For the code of this example, the rate is 1/5.
It was Claude Shannon in 1948 [64] who showed that the goal of finding error correcting codes that allowed for a high probability of successful transmission was possible. Shannon showed that each channel has a constant associated with it called the channel capacity. Furthermore, he showed that there exist error correcting codes that achieve a successful transmission with probability arbitrarily close to 1 with the rate of the code arbitrarily close to (but below) the channel capacity. His proof showed only the existence of such codes. There is little indication of how to obtain such codes. The construction of these codes, along with efficient decoding algorithms, is the goal of modern coding theory. Error correcting codes generally fall into two categories: block codes and convolutional codes.
1.2 Block Codes

Let A be a (finite) set of symbols, called the message alphabet. Define M to be the set consisting of all sequences of symbols from A of length k. Also, define R to be the set consisting of all sequences of symbols from A of length n. We define k and n to be positive integers with k <= n.

Definition 1.2.1 A block code of rate k/n over the alphabet A is defined as the pair (M, R) together with an injective map φ : M → R and an estimator map ψ : R → φ(M).

An important special case of the above definition is when A is a finite field, F_q = GF(q). In this case, and when φ is a linear map φ : F_q^k → F_q^n, the code is called a linear block code.
From elementary linear algebra, we know that the map φ can be represented by a scalar matrix, G. This full rank k × n matrix is known as a generator matrix of the code. Using this generator matrix description, the encoding process for a message m ∈ F_q^k can be described as:

    m ↦ mG
Using the above definitions, it is clear that for a linear block code, the set of codewords, φ(M), is a linear subspace of F_q^n. Another way to describe this linear subspace is through the use of a kernel representation. Indeed, there exists a scalar matrix H of size (n - k) × n, called the parity check matrix, such that a vector x ∈ F_q^n is a codeword if and only if xH^T = 0.
The code in Example 1.1.1 is a linear block code over F_2. Indeed, a generator matrix is given by G = [ 1 1 1 1 1 ]. Also, a parity check matrix is given by

    H = [ 1 1 0 0 0 ]
        [ 1 0 1 0 0 ]
        [ 1 0 0 1 0 ]
        [ 1 0 0 0 1 ]
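As an illustration of the kernel description (a sketch of ours, not part of the original text), one can check that both codewords of this code have zero syndrome xH^T, while a corrupted word is flagged by a nonzero syndrome:

```python
# G and H for the length-5 repetition code of Example 1.1.1, over F_2.
G = [[1, 1, 1, 1, 1]]
H = [[1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0],
     [1, 0, 0, 1, 0],
     [1, 0, 0, 0, 1]]

def syndrome(x, H):
    # x H^T over F_2: one parity sum per row of H.
    return [sum(h * xi for h, xi in zip(row, x)) % 2 for row in H]

for m in (0, 1):
    codeword = [(m * g) % 2 for g in G[0]]  # encoding: m -> mG
    assert syndrome(codeword, H) == [0, 0, 0, 0]

# A corrupted word has a nonzero syndrome:
print(syndrome([0, 1, 0, 1, 1], H))  # [1, 0, 1, 1]
```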
Remark 1.2.2 Although error correcting codes are defined as a pair of sets together with an encoder map and an estimator map, in practice, all of these may not be specified. Often, when the focus is not on decoding, the estimator map is omitted. When the code is linear, it is common practice to identify the code solely by a generator or parity check matrix. Further, by abuse of notation, a code is often specified simply by the set of codewords φ(M). Hence, identifying the linear subspace that comprises the set of codewords is enough to identify the code.
This discussion of block codes is ended with some basic definitions.
Definition 1.2.3 The Hamming weight or simply weight of a vector in F^n is defined as the number of nonzero components of the vector. Notation: the weight of a vector x is given by wt(x).

Definition 1.2.4 The Hamming distance or simply distance between two vectors in A^n is the number of components in which they disagree. Notation: the distance between x and y is given by dist(x, y).

Definition 1.2.5 The distance of a code is the minimum distance between any two codewords. If the code is linear, then the distance is equal to the minimum weight of all the nonzero codewords.
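These definitions translate directly into code. A brief illustrative sketch (ours, not the dissertation's):

```python
def wt(x):
    # Hamming weight: the number of nonzero components of x.
    return sum(1 for c in x if c != 0)

def dist(x, y):
    # Hamming distance: the number of components where x and y disagree.
    return sum(1 for a, b in zip(x, y) if a != b)

# For a linear code, dist(x, y) = wt(x - y); over F_2, subtraction is XOR.
x, y = [1, 1, 1, 1, 1], [0, 1, 0, 1, 1]
assert dist(x, y) == wt([(a - b) % 2 for a, b in zip(x, y)])

# The repetition code {00000, 11111} has distance 5 = wt(11111).
print(dist([0] * 5, [1] * 5))  # 5
```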
1.3 Convolutional Codes

We now turn our attention to the other major class of error correcting codes. Suppose we have a rate k/n linear block code over F_q with scalar generator matrix G. Assume we have a sequence of messages u_0, u_1, u_2, ..., u_r from M that we wish to send. We could write this sequence as a polynomial over F_q in some indeterminate s as follows:

    u_0, u_1, u_2, ..., u_r  ↦  u_0 + u_1 s + u_2 s^2 + ... + u_r s^r

If we denote this polynomial by u(s), then the encoding of the entire sequence can be written simply as u(s) ↦ u(s)G, where the matrix multiplication is done on each coefficient of the polynomial.
It was Elias in 1955 [22] who suggested that the matrix G need not be scalar. In doing so, the notion of a convolutional code was born. The classical definition of convolutional codes can now be stated.

Definition 1.3.1 [23, 52] A rate k/n convolutional code is defined as a k dimensional F-linear subspace of F^n, where F is either the field of rational functions F_q(s) or the field of formal Laurent series F_q((s)). In either case, a k × n full rank matrix G(s), with entries in F_q[s], will serve as a generator matrix. The highest degree polynomial occurring in G(s) is called the memory, m, of the encoder. In short, the encoding of a message depends, not only on the bits to be sent at a certain time, but also on the bits sent in the previous m time intervals. Of course, polynomial parity check matrices exist as well, and are sometimes referred to as syndrome formers.
It should be remarked that the notions of Hamming weight and distance are easily extended to convolutional codes. The block code notion of the distance of a code also is readily extended, but it takes on the special name of free distance when it is applied to convolutional codes.

Since convolutional codes can potentially have infinite message sequences, some unique situations may arise. In particular, it may happen that an infinite weight message sequence is encoded as a finite weight codeword. If this happens, then a finite number of errors during transmission can lead the estimator map to estimate this received word as the all zero sequence, which, of course, is a valid codeword. Hence, a finite number of transmission errors can lead to an infinite number of errors in the message word. This unfortunate situation is called catastrophic encoding and is to be avoided. Fortunately, the following proposition, proven by Massey and Sain [51], characterizes this situation completely in terms of the generator matrix.
Proposition 1.3.2 A convolutional encoder G(s) is noncatastrophic if and only if the greatest common divisor of its full size minors is s^l, for some nonnegative integer l. If l = 0 the encoder is called observable.
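For a rate 1/n encoder the full size minors are simply the entries of G(s), so the criterion reduces to a scalar polynomial gcd computation. The sketch below (the encoders are our illustrative choices, not examples from the text) tests the criterion over F_2:

```python
# Polynomials over F_2 as coefficient lists, lowest degree first.

def deg(p):
    # Degree of p, with deg(0) = -1.
    for i in range(len(p) - 1, -1, -1):
        if p[i]:
            return i
    return -1

def pmod(a, b):
    # Remainder of a divided by b over F_2 (coefficients mod 2).
    a = a[:]
    db = deg(b)
    while deg(a) >= db:
        shift = deg(a) - db
        for i in range(db + 1):
            a[i + shift] ^= b[i]
    return a

def pgcd(a, b):
    # Euclidean algorithm for the gcd over F_2.
    while deg(b) >= 0:
        a, b = b, pmod(a, b)
    return a

def noncatastrophic(minors):
    # Proposition 1.3.2: the gcd of the full size minors must be s^l.
    g = minors[0]
    for m in minors[1:]:
        g = pgcd(g, m)
    d = deg(g)
    return d >= 0 and all(c == 0 for c in g[:d])

# G(s) = [1 + s + s^2, 1 + s^2]: gcd of the entries is 1 = s^0, observable.
print(noncatastrophic([[1, 1, 1], [1, 0, 1]]))  # True
# G(s) = [1 + s, 1 + s^2]: gcd is 1 + s, hence catastrophic.
print(noncatastrophic([[1, 1], [1, 0, 1]]))     # False
```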
CHAPTER 2
LINEAR SYSTEMS AND CONVOLUTIONAL CODES

In this chapter the basic notions of linear systems will be reviewed, starting with the concept of behaviors. These ideas are used to form an alternate definition of convolutional code. This definition is then refined by giving a first order representation and exploring its various properties. For a more thorough discussion of the systems theory which dominates the early part of this chapter, the reader is referred to [70, 36, 35, 68, 69, 37].
2.1 Behaviors

In this section the notion of dynamical systems and behaviors as well as some related concepts are introduced and briefly discussed. This is a review of the material contained in [58, 61, 71].
Definition 2.1.1 A dynamical system is a triple Σ = (T, A, B), where T ⊆ R is the time axis, A is the signal alphabet and B ⊆ A^T is called the behavior. The elements of B are called trajectories.
We will work with these abstract notions only long enough to write down an alternate definition of convolutional code, so we will omit any illuminating examples of the above definitions.

For our purposes we will take the discrete time axis T = Z_+. Let F = F_q and let the signal space be A = F^n. Hence, our behaviors will consist of subsets of the set of one sided infinite sequences of vectors in F^n.
Definition 2.1.2 Define the left shift operator, σ, and the right shift operator, σ^{-1}, on the sequence space A^T by

    σ(a_0, a_1, a_2, ...) = (a_1, a_2, a_3, ...)
    σ^{-1}(a_0, a_1, a_2, ...) = (0, a_0, a_1, a_2, ...)

A subset C ⊆ A^T is right (left) shift invariant if σ^{-1}C ⊆ C (σC ⊆ C). Also if for every element of C, at most finitely many components are nonzero, we say C has compact support.
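On a truncated window of a finite-support sequence, the two shift operators can be sketched as follows (an illustration of ours, not code from the text):

```python
def left_shift(seq):
    # sigma: (a0, a1, a2, ...) -> (a1, a2, a3, ...); pad the window with 0.
    return seq[1:] + (0,)

def right_shift(seq):
    # sigma^{-1}: (a0, a1, ...) -> (0, a0, a1, ...); the window drops its
    # last entry, which is harmless for a finite-support trajectory.
    return (0,) + seq[:-1]

c = (1, 1, 0, 1, 0, 0, 0, 0)  # a codeword window with compact support
assert left_shift(right_shift(c)) == c  # sigma undoes sigma^{-1}
print(right_shift(c))  # (0, 1, 1, 0, 1, 0, 0, 0): the delayed codeword
```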
With these definitions in hand, we are led to the following rather cryptic definition of a convolutional code.
Definition 2.1.3 A subset C ⊆ A^T is called a convolutional code if C is linear (as a vector space over F_q with componentwise addition), right shift invariant and has compact support.
These requirements may seem odd, but are actually quite natural. Certainly, we have seen that a linear subspace is desired. The right shift invariance means only that a valid codeword is still a valid codeword if it is delayed before it is sent. The condition of finite support is new. Obviously this contradicts the classical definition of a convolutional code given in Definition 1.3.1. In practice this restriction has little effect since one would never want to send an infinite codeword. More academically, this can be justified by the lack of a widely accepted definition of convolutional code.
With the usual identification between finite sequences and polynomials, the following theorem translates the new definition of convolutional code into a more recognizable form.
Theorem 2.1.4 [71, Theorem 3.1.2] Let C ⊆ A^T. Then C is a convolutional code if and only if C is an F[s]-submodule of F^n[s].

Corollary 2.1.5 [71, Corollary 3.1.3] As a submodule of a free module (over a PID), C has a well defined rank k. Hence there exists an injective module homomorphism

    φ : F^k[s] → F^n[s]
    u(s) ↦ v(s)

Equivalently, there is a full rank k × n polynomial matrix G(s) such that

    C = { v(s) | ∃ u(s) ∈ F^k[s] : v(s) = u(s)G(s) }
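Concretely, the encoding v(s) = u(s)G(s) is coefficient-wise polynomial multiplication. A sketch over F_2, using the illustrative rate 1/2 encoder G(s) = [1 + s + s^2, 1 + s^2] (our choice of encoder, not an example from the text):

```python
def pmul(a, b):
    # Product of two F_2 polynomials given as coefficient lists (lowest first).
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] ^= ai & bj
    return out

# Rate 1/2 encoder G(s) = [1 + s + s^2, 1 + s^2].
G = [[1, 1, 1], [1, 0, 1]]

def encode(u):
    # v(s) = u(s) G(s): one output polynomial per entry of G.
    return [pmul(u, g) for g in G]

u = [1, 0, 1]      # u(s) = 1 + s^2
print(encode(u))   # [[1, 1, 0, 1, 1], [1, 0, 0, 0, 1]]
```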
Let us introduce some terminology and notation. As before, the rate of the convolutional code is given by k/n. Given an encoder G(s), the maximum degree polynomial occurring in row i is called the row degree of row i and denoted by ν_i. After a possible reordering of the rows, we will assume that ν_1 ≥ ν_2 ≥ ... ≥ ν_k. As before, the memory of the encoder, m, is equal to ν_1. Of course, reordering the encoder might seem to change the code it defines, so let us resolve some issues of uniqueness with regard to encoders with the following proposition.

Proposition 2.1.6 [71, Lemma 3.1.6] Two encoders, G_1(s) and G_2(s), define the same convolutional code if and only if there exists a k × k unimodular matrix U(s) such that U(s)G_1(s) = G_2(s).

The complexity of an encoder, δ(G(s)), is given as the sum of the row degrees, δ(G(s)) = ν_1 + ... + ν_k. However, in light of the above proposition, there is a much better way to define the complexity of a code. The complexity of a convolutional code, δ(C), is the maximum degree of all the full size minors of any encoder G(s). This complexity is often referred to as the McMillan degree in the systems theory literature. When the context is clear, the notation will often be shortened to simply δ.

An encoder is minimal if δ(G(s)) = δ(C). It can be shown that any two minimal encoders must have the exact same row degrees (up to reordering). These row degrees for a minimal encoder are known as the Kronecker indices of the code.
2.2 Realization

In this section we will develop first order representations of convolutional codes as defined in the previous section. Again this section is a review of previous works including [58, 61, 71]. The following theorem proves the existence of a first order realization. It should be remarked that the results here are 'dual' statements of those in [37] as observed in [58].
Theorem 2.2.1 [58] Let C ⊆ F^n[s] be a rate k/n convolutional code of complexity δ. Then there exist size (δ + n - k) × δ matrices K, L and a size (δ + n - k) × n matrix M (all with scalar entries in F) such that the code C is defined by

    C = { v(s) ∈ F^n[s] | ∃ x(s) ∈ F^δ[s] : sKx(s) + Lx(s) + Mv(s) = 0 }.

Further, K has full column rank, [K | M] has full row rank and rank[s_0 K + L | M] = δ + n - k for all s_0 ∈ F.

A triple (K, L, M) satisfying the above is called a minimal representation of C.
Proposition 2.2.2 [58] If (K̄, L̄, M̄) is another representation of the convolutional code C then there exist unique invertible (scalar) matrices T and S such that

    (K̄, L̄, M̄) = (TKS^{-1}, TLS^{-1}, TM).
It can be shown, after a suitable transformation permitted by the above proposition, and possibly a reordering of the components of the code (obviously resulting in an 'equivalent' code) that the triple (K, L, M) can be written in the following special form.

    K = [ I ]      L = [ -A ]      M = [ 0  -B ]
        [ 0 ]          [ -C ]          [ I  -D ]

Here, A is size δ × δ, B is δ × k, C is (n - k) × δ and D is (n - k) × k.
This rewriting of the first order representation allows us to define a convolutional code in terms of the more familiar (A, B, C, D) representation in systems theory. Hence, we arrive at the input-state-output representation.
Definition 2.2.3 [Input-State-Output Definition of Convolutional Code] [61]
Let F = F_q be the Galois field of q elements and consider the matrices A ∈ F^{δ×δ}, B ∈ F^{δ×k}, C ∈ F^{(n-k)×δ} and D ∈ F^{(n-k)×k}. A rate k/n convolutional code C of complexity δ can be described by the linear system governed by the equations:

    x_{t+1} = A x_t + B u_t,
    y_t = C x_t + D u_t,                    (2.2.1)
    v_t = [ y_t ; u_t ],   x_0 = 0,

where [ y_t ; u_t ] denotes the vector obtained by stacking y_t on top of u_t.
Hence the input to the encoder at time t is the information or message vector, u_t. The encoder creates the parity vector, y_t, and the code vector, v_t, is transmitted across the channel. We will refer to the convolutional code created in this way by C(A, B, C, D).
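The recursion (2.2.1) is easy to simulate. The sketch below runs it over F_2 for an illustrative choice of (A, B, C, D) with δ = 2, k = 1, n = 2; the particular matrices are ours, not an example from the text:

```python
def mat_vec(M, x):
    # Matrix-vector product over F_2.
    return [sum(m * xi for m, xi in zip(row, x)) % 2 for row in M]

def vec_add(a, b):
    return [(ai + bi) % 2 for ai, bi in zip(a, b)]

def iso_encode(A, B, C, D, inputs):
    # Run x_{t+1} = A x_t + B u_t, y_t = C x_t + D u_t with x_0 = 0;
    # the transmitted vector at time t is v_t = [y_t ; u_t].
    x = [0] * len(A)
    out = []
    for u in inputs:
        y = vec_add(mat_vec(C, x), mat_vec(D, u))
        out.append(y + u)
        x = vec_add(mat_vec(A, x), mat_vec(B, u))
    return out

# delta = 2, k = 1, n = 2: a shift-register state with parity taps on it.
A = [[0, 0], [1, 0]]
B = [[1], [0]]
C = [[1, 1]]
D = [[1]]

code_seq = iso_encode(A, B, C, D, [[1], [0], [1], [0]])
print(code_seq)  # [[1, 1], [1, 0], [0, 1], [1, 0]]
```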
From a systems theory point of view, the variable x_t is referred to as the state of the system at time t. The input vector, u_t, is combined with the current state, x_t, to create the output, y_t. Also, the current input is used to update the state for the next time interval, x_{t+1}.
Some enlightening examples, as well as the connections between the various representations of convolutional codes, will be given in Chapter 3.

The set of code words is, by definition, equal to the set of trajectories { [y_t ; u_t] }_{t≥0} of the dynamical system (2.2.1). The following proposition characterizes those trajectories.
Proposition 2.2.4 (Local Description of Trajectories) [61] Let τ, γ be positive integers with τ < γ. Assume that the encoder is at state x_τ at time t = τ. Then any code sequence { [y_t ; u_t] }_{t≥0} governed by the dynamical system (2.2.1) must satisfy:

    [ y_τ     ]   [ C          ]        [ D            0            ...  0 ]  [ u_τ     ]
    [ y_{τ+1} ]   [ CA         ]        [ CB           D            ...  . ]  [ u_{τ+1} ]
    [ .       ] = [ .          ] x_τ +  [ CAB          CB           ...  . ]  [ .       ]
    [ .       ]   [ .          ]        [ .            .            ...  0 ]  [ .       ]
    [ y_γ     ]   [ CA^{γ-τ}   ]        [ CA^{γ-τ-1}B  CA^{γ-τ-2}B  ...  D ]  [ u_γ     ]

Moreover the evolution of the state vector x_t is given over time as:

    x_t = A^{t-τ} x_τ + [ A^{t-τ-1}B  ...  B ] [ u_τ ; ... ; u_{t-1} ],   t = τ+1, τ+2, ..., γ+1.    (2.2.2)

Proof: This follows easily by iterating the equations that define the system.
For γ ≥ 1 we will define the following notation:

    M_γ(A, B, C, D) = [ D           0          ...       0 ]
                      [ CB          D          ...       . ]
                      [ CAB         CB         ...       . ]        (2.2.3)
                      [ .           .          ...       0 ]
                      [ CA^{γ-2}B   CA^{γ-3}B  ...  CB   D ]

Some or all of the parameters may be omitted when the context is clear. This should not cause confusion with the M from the (K, L, M) first order representation because that representation will not be discussed any further in this dissertation.
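The matrix M_γ(A, B, C, D) can be assembled directly, and for a trajectory started at x_0 = 0 the first γ outputs satisfy [y_0; ...; y_{γ-1}] = M_γ [u_0; ...; u_{γ-1}]. The following sketch checks this over F_2 for a small illustrative realization (our own example matrices, not the text's):

```python
def mat_mul(X, Y):
    # Matrix product over F_2.
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y))) % 2
             for j in range(len(Y[0]))] for i in range(len(X))]

def mv(M, x):
    # Matrix-vector product over F_2.
    return [sum(m * xi for m, xi in zip(row, x)) % 2 for row in M]

def build_M(A, B, C, D, gamma):
    # Assemble M_gamma(A, B, C, D) as in (2.2.3): block lower triangular,
    # with D on the diagonal and C A^{i-j-1} B in block position (i, j).
    p, k = len(D), len(D[0])
    M = [[0] * (gamma * k) for _ in range(gamma * p)]
    for i in range(gamma):
        for j in range(i + 1):
            if i == j:
                blk = D
            else:
                AB = B
                for _ in range(i - j - 1):
                    AB = mat_mul(A, AB)  # A^{i-j-1} B
                blk = mat_mul(C, AB)
            for r in range(p):
                for c in range(k):
                    M[i * p + r][j * k + c] = blk[r][c]
    return M

def simulate(A, B, C, D, inputs):
    # Forward recursion (2.2.1) with x_0 = 0; returns the list of y_t.
    x = [0] * len(A)
    ys = []
    for u in inputs:
        ys.append([(a + b) % 2 for a, b in zip(mv(C, x), mv(D, u))])
        x = [(a + b) % 2 for a, b in zip(mv(A, x), mv(B, u))]
    return ys

A = [[0, 0], [1, 0]]; B = [[1], [0]]; C = [[1, 1]]; D = [[1]]
gamma, inputs = 4, [[1], [0], [1], [1]]
M = build_M(A, B, C, D, gamma)
flat_u = [u[0] for u in inputs]  # k = 1: each u_t is a single bit
assert mv(M, flat_u) == [y[0] for y in simulate(A, B, C, D, inputs)]
print(M)  # [[1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0], [0, 1, 1, 1]]
```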
Next the set of valid codewords is given as the kernel of a parity check matrix.
Proposition 2.2.5 (Global Description of Trajectories) [61]
{ [y_t ; u_t] ∈ F^n | t = 0, ..., γ } represents a valid code word if and only if:

    [  0   A^γ B        A^{γ-1}B   ...  AB   B ]   [ y_0 ]
    [      D            0          ...       0 ]   [  .  ]
    [      CB           D                    . ]   [ y_γ ]
    [ -I   CAB          CB         ...       . ] · [ u_0 ]  =  0.        (2.2.4)
    [      .            .          ...       0 ]   [  .  ]
    [      CA^{γ-1}B    CA^{γ-2}B  ...  CB   D ]   [ u_γ ]

Here -I denotes the identity block acting on the stacked vector (y_0, ..., y_γ), and the lower right block is M_{γ+1}(A, B, C, D).

Proof: Setting τ = 0 in Proposition 2.2.4 gives the bottom portion of the matrix. Since x_{γ+1} = 0 the top row of the matrix follows from the second part of Proposition 2.2.4.
Let A, B, C be scalar matrices over F of size δ × δ, δ × k and (n - k) × δ respectively. Let j be a positive integer and define

    Φ_j(A, B) := [ A^{j-1}B  A^{j-2}B  ...  AB  B ],        (2.2.5)

    Ω_j(A, C) := [ C ; CA ; CA^2 ; ... ; CA^{j-1} ].        (2.2.6)
Definition 2.2.6 Let A, B be matrices of size δ × δ and δ × k respectively. Then (A, B) is called a controllable pair if

    rank Φ_δ(A, B) = δ.        (2.2.7)

If (A, B) is a controllable pair then we call the smallest integer j having the property that rank Φ_j(A, B) = δ the controllability index of (A, B).
In a similar fashion we define:

Definition 2.2.7 Let A, C be matrices of size δ × δ and (n - k) × δ respectively. Then (A, C) is called an observable pair if

    rank Ω_δ(A, C) = δ.        (2.2.8)

If (A, C) is an observable pair then we call the smallest integer j having the property that rank Ω_j(A, C) = δ the observability index of (A, C).
It is true that the (A, B, C, D) representation of the convolutional code is minimal (in terms of complexity δ) if and only if (A, B) is a controllable pair [61]. Further, the convolutional code will be observable (i.e. noncatastrophic with l = 0) if and only if (A, C) form an observable pair [61]. Hence, it is natural to only consider codes whose representations satisfy these conditions.
Finally, let us translate the result of Proposition 2.2.2 to the input-state-output representation.

Proposition 2.2.8 For T ∈ Gl_δ(F) we have:

    C(A, B, C, D) = C(TAT^{-1}, TB, CT^{-1}, D)
2.3 Special Properties of the ISO Representation

In this section, the properties of the ISO representation (2.2.1) will be discussed. In particular, for the special case when the rank of D is k, which forces the rate of the code to be at most 1/2, it is possible to 'invert' the system. That is, the system can be rewritten so that the output drives the state and the input. The connections between these two representations, since they will play a fundamental role in what is to follow, will be discussed here. The connections between these representations and the other "classical" representations of convolutional codes will follow later in Chapter 3.
Let us consider a convolutional code defined by the matrices (A, B, C, D) as in (2.2.1). For this section we will assume that rank D = k = n - k. However, we note that what follows can also be done, with some minor modifications, for the case where rank D = k < n - k. This situation will not be of use to us in this dissertation and it will be much clearer if those details are omitted here. By simple algebraic manipulations of (2.2.1), we arrive at the following representation.
Proposition 2.3.1 (Output-State-Input Representation)
Given the (A, B, C, D) representation with the conditions discussed above (most importantly that D is invertible), the following linear system defines the same convolutional code.

    x_{t+1} = (A - BD^{-1}C) x_t + BD^{-1} y_t,
    u_t     = -D^{-1}C x_t + D^{-1} y_t,                        (2.3.9)
    v_t     = ( y_t ; u_t ),   x_0 = 0.

Proof: Clear, since the defining equations are simple algebraic manipulations of the previous ones.

In most practical cases, D can be chosen to be the identity, which will significantly improve the above notation. Also, we will employ the following shorthand notation for the above matrices.

    Ã := (A - BD^{-1}C),   B̃ := BD^{-1},   C̃ := -D^{-1}C,   D̃ := D^{-1}
The following connection between A and Ã will be useful.

Lemma 2.3.2 For i ≥ 1 we have

                                                       [ C        ]
    A^i = Ã^i + [ Ã^{i-1}B̃   Ã^{i-2}B̃   ...   ÃB̃   B̃ ] [ CA       ]
                                                       [  :       ]
                                                       [ CA^{i-1} ]
Proof: By induction on i. For i = 1, the result follows immediately from the definition of Ã. Assume the result holds for i ≤ j and show it for i = j + 1. Hence, we have

    Ã^j = A^j - Φ_j(Ã, B̃) Ω_j(A, C).

Multiplying on the left by (A - BD^{-1}C) gives

    Ã^{j+1} = A^{j+1} - BD^{-1}CA^j - [ Ã^j B̃   Ã^{j-1}B̃   ...   Ã²B̃   ÃB̃ ] Ω_j(A, C).

The new `extraneous' term is exactly the term `missing' from the matrix multiplication. Realizing this gives the desired result.
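Lemma 2.3.2 can be spot-checked numerically. The sketch below works over GF(2), where D = [1] gives D̃ = D and all minus signs disappear, so Ã = A + BC, B̃ = B and C̃ = C; the matrices are the small binary realization that appears later in Example 2.4.5 (an illustrative choice):

```python
# Verify A^i = Ã^i + Phi_i(Ã, B̃) Omega_i(A, C) over GF(2).

def mul(X, Y):
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y))) % 2
             for j in range(len(Y[0]))] for i in range(len(X))]

def add(X, Y):
    return [[(a + b) % 2 for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def mat_pow(X, i):
    R = [[int(r == c) for c in range(len(X))] for r in range(len(X))]
    for _ in range(i):
        R = mul(R, X)
    return R

def phi(A, B, j):          # [A^{j-1}B ... AB B]
    blocks, P = [], B
    for _ in range(j):
        blocks.insert(0, P)
        P = mul(A, P)
    return [sum((blk[r] for blk in blocks), []) for r in range(len(A))]

def omega(A, C, j):        # [C; CA; ...; CA^{j-1}]
    rows, Q = [], C
    for _ in range(j):
        rows.extend(Q)
        Q = mul(Q, A)
    return rows

A, B, C = [[0, 1], [1, 1]], [[0], [1]], [[0, 1]]
At = add(A, mul(B, C))     # Ã = A + BC over GF(2), since D = [1]
Bt, Ct = B, C              # B̃ = B and C̃ = C over GF(2)

for i in range(1, 6):
    lhs = mat_pow(A, i)
    rhs = add(mat_pow(At, i), mul(phi(At, Bt, i), omega(A, C, i)))
    assert lhs == rhs
print("Lemma 2.3.2 verified for i = 1..5")
```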
The output-state-input representation serves just as well as the traditional input-state-output description. In particular, a local description of the trajectories can be obtained in the same way as Proposition 2.2.4 by simply replacing (A, B, C, D) with (Ã, B̃, C̃, D̃). The same substitution into (2.2.2) will give another equation for the state based on the new representation. Similarly, the same is true for the global description of the trajectories provided in Proposition 2.2.5. Of course, we may also define the controllability matrices, Φ_j(Ã, B̃), and observability matrices, Ω_j(Ã, C̃), with respect to this representation. It is natural to ask how these various objects from the two representations are related. It happens that the relationship is quite nicely described, and we will do so in the following sequence of lemmas.
Lemma 2.3.3 Recall the definition of M_γ in (2.2.3). Then

    M_γ(A, B, C, D) = M_γ(Ã, B̃, C̃, D̃)^{-1}.
Proof: We will show that M_γ(Ã, B̃, C̃, D̃) M_γ(A, B, C, D) = I. The fact that the entries above the diagonal are 0 is trivial since we are multiplying two lower triangular matrices. The diagonal consists of all I_k's since D̃D = I_k. The next lower diagonal is all 0's since C̃B̃D + D̃CB = -D^{-1}CBD^{-1}D + D^{-1}CB = 0. All lower diagonals are zero since each block entry can be written in the form

    C̃ [ Ã^j + Φ_j(Ã, B̃) Ω_j(A, C) - A^j ] B.

An immediate application of Lemma 2.3.2 shows this to be 0. The fact that the reverse multiplication also gives the identity is clear, but can also be proven directly by obtaining an analog of Lemma 2.3.2 with the `tildes' and `non-tildes' switched.
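Lemma 2.3.3 is easy to confirm for a small case. The sketch below builds the lower block triangular Toeplitz matrix M (first column D, CB, CAB, ...) for a scalar binary realization (the one of Example 2.4.5, where D = [1] makes the tildes and signs collapse over GF(2)) and checks that the two matrices are mutually inverse:

```python
# Verify M(A,B,C,D) * M(Ã,B̃,C̃,D̃) = I over GF(2) for a scalar (k = 1) example.

def mul(X, Y):
    return [[sum(X[i][t] * Y[t][j] for t in range(len(Y))) % 2
             for j in range(len(Y[0]))] for i in range(len(X))]

def markov(A, B, C, D, n):
    """First column of M: D, CB, CAB, CA^2B, ... (scalar blocks here)."""
    seq, P = [D[0][0]], B
    for _ in range(n - 1):
        seq.append(mul(C, P)[0][0])
        P = mul(A, P)
    return seq

def toeplitz(seq):
    n = len(seq)
    return [[seq[i - j] if i >= j else 0 for j in range(n)] for i in range(n)]

A, B, C, D = [[0, 1], [1, 1]], [[0], [1]], [[0, 1]], [[1]]
At = [[(A[i][j] + B[i][0] * C[0][j]) % 2 for j in range(2)] for i in range(2)]
# Over GF(2) with D = [1]: B̃ = B, C̃ = C, D̃ = D.

M  = toeplitz(markov(A,  B, C, D, 5))
Mt = toeplitz(markov(At, B, C, D, 5))
I5 = [[int(i == j) for j in range(5)] for i in range(5)]
assert mul(Mt, M) == I5 and mul(M, Mt) == I5
print("M_5(A,B,C,D) and M_5(Ã,B̃,C̃,D̃) are mutually inverse over GF(2)")
```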
Lemma 2.3.4

    Φ_γ(A, B) M_γ(Ã, B̃, C̃, D̃) = Φ_γ(Ã, B̃)

    M_γ(Ã, B̃, C̃, D̃) Ω_γ(A, C) = -Ω_γ(Ã, C̃)

Proof: Both statements are a direct consequence of Lemma 2.3.2.
The above lemmas show that the input-state-output representation is very naturally related to the output-state-input representation. In Chapter 3, we will see how these representations are related to the more classical representations of convolutional codes.
2.4 An Algebraic Model for Turbo Codes

Turbo codes were introduced by Berrou, Glavieux and Thitimajshima in 1993 [14]. Their idea of using parallel concatenation of recursive systematic convolutional codes (RSCs) (see Definition 3.1.2) with an interleaver was a major step in terms of achieving low bit error rates (BERs) at signal-to-noise ratios near the Shannon limit. (See Chapter 5 for a more thorough explanation of these terms.) However, their ideas were originally met with skepticism, partly because of the phenomenal performance of the codes and partly because it was not clear why they worked. Many papers have since been devoted to these questions, e.g. [54, 20, 21, 19]. Much of this work has focused on the improved weight distribution obtained by using a suitable (usually random) interleaver. Others focused on developing a more suitable soft-decision decoder to be used in the turbo-decoding process, e.g. [13, 31]. This section will focus on developing turbo codes in the framework of the input-state-output representation for convolutional codes.
We will give a brief review of turbo codes. We will then present the natural generator and parity check descriptions of turbo codes that follow from the local description of Proposition 2.2.4. The decoding of turbo codes using this new description will be discussed in Chapter 6.
2.4.1 A Brief Review of Turbo Codes

Here the basic structure of turbo codes will be reviewed. The following is the design of the simplest of turbo codes. Only two parallel identical RSCs are used and no puncturing of the output bits is employed. Turbo codes could have many parallel RSCs, but typically only two are used, and even more typically, the RSCs will be rate 1/2 and identical. Puncturing is used very often to improve the rate of the code by `splicing' together the separate output streams. We will omit this part of the turbo coding scheme simply for clarity of presentation.
The input, u, to the turbo encoder is used in 3 ways. First, it is sent directly to the channel as part of the codeword. Thus, turbo codes are systematic. Second, the input is sent into the first RSC and the output, say y, of this convolutional encoder (separated from the input) is sent along the channel as the second part of the codeword. Finally, the input is also sent into an interleaver or permutation S, which rearranges the order of the input bits. The scrambled input bits are then sent into the RSC and its output, say ŷ, is sent as the final piece of the codeword. This general scheme is outlined in Figure 2.1.
[Figure 2.1: The basic turbo encoding scheme. The input u is sent to the channel directly, through the first RSC to produce y, and through the interleaver (S) and the second RSC to produce ŷ.]
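The scheme of Figure 2.1 can be sketched directly from the input-state-output equations. The code below assumes the binary RSC realization of Example 2.4.5 and an arbitrary illustrative interleaver permutation:

```python
# Turbo encoding sketch: codeword = (u, y, ŷ), with y from the RSC and
# ŷ from the same RSC driven by the interleaved input.

def rsc(u, A, B, C, D):
    """Run x_{t+1} = A x_t + B u_t, y_t = C x_t + D u_t over GF(2), x_0 = 0."""
    x, y = [0, 0], []
    for ut in u:
        y.append((C[0][0] * x[0] + C[0][1] * x[1] + D * ut) % 2)
        x = [(A[0][0] * x[0] + A[0][1] * x[1] + B[0] * ut) % 2,
             (A[1][0] * x[0] + A[1][1] * x[1] + B[1] * ut) % 2]
    return y

def turbo_encode(u, perm, A, B, C, D):
    y    = rsc(u, A, B, C, D)
    yhat = rsc([u[p] for p in perm], A, B, C, D)
    return u, y, yhat

# RSC of Example 2.4.5, transfer function (s^2+1)/(s^2+s+1):
A, B, C, D = [[0, 1], [1, 1]], [0, 1], [[0, 1]], 1
perm = [2, 0, 4, 1, 3]                 # an arbitrary illustrative interleaver S

u = [1, 0, 0, 0, 0]                    # impulse input
_, y, _ = turbo_encode(u, perm, A, B, C, D)
print(y)                               # [1, 1, 1, 0, 1]: first column of M_5
```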
    [ y_N       ]   [ D           0           ...        0 ] [ u_N       ]
    [ y_{N+1}   ]   [ CB          D            .         : ] [ u_{N+1}   ]
    [  :        ] = [ CAB         CB            .          ] [  :        ]     (2.4.10)
    [  :        ]   [  :           .             .       0 ] [  :        ]
    [ y_{N+N-1} ]   [ CA^{N-2}B   CA^{N-3}B   ...   CB   D ] [ u_{N+N-1} ]
Reverting to our earlier notation, the large matrix on the right-hand side of the above equation will be denoted M_N(A, B, C, D), or simply M. We are now ready to give a parity check matrix and a pseudo-generator matrix description of the turbo code. The proofs of these assertions are straightforward.
Theorem 2.4.1 A vector V ∈ F^{3N} is a valid N-block of the turbo code if and only if

    [ Φ_N(A,B)   0    0 ]
    [ M         -I    0 ] V = 0,
    [ MS         0   -I ]

where S is the interleaver (permutation) matrix. Thus any sequence in F is a codeword of the turbo code if and only if it is a sequence of valid N-blocks.
Lemma 2.4.2 If u ∈ F^N and u ∈ ker Φ_N(A,B), then the image Gu, where

        [ I  ]
    G = [ M  ]
        [ MS ]

is a valid N-block of the turbo code.
This description is not satisfying since the input, u, is constrained. This can be overcome with the following observation. With reasonable assumptions on A and B, including that (A, B) be a controllable pair and that the characteristic polynomial of A equals the minimal polynomial, the Cayley–Hamilton theorem implies that Φ_N(A,B)u = 0 if and only if p_A(s) | u(s), where p_A(s) is the minimal polynomial of the matrix A, and u(s) is the polynomial whose coefficients are the entries of the vector u. We know the degree of p_A(s) is δ. We write p_A(s) = p_δ s^δ + ... + p_1 s + p_0. We define the following matrix, P, of size N × (N - δ) (hence, we assume N > δ) with the property that the image of P is precisely the kernel of Φ_N(A,B):
        [ p_0               ]
        [ p_1   p_0         ]
        [  :    p_1   .     ]
    P = [ p_δ    :     .  p_0 ]
        [       p_δ      p_1 ]
        [             .   :  ]
        [ 0            p_δ   ]
We are now ready to present the generator matrix description of a turbo code.

Theorem 2.4.3 The valid N-blocks for the turbo code are generated by the following matrix G, i.e. for any v ∈ F^{N-δ}, Gv is a valid N-block. Also every valid N-block is of this form.

        [ P   ]
    G = [ MP  ]
        [ MSP ]

Proof: Follows from the definition of P and Lemma 2.4.2.
Remark 2.4.4 In the above generator matrix, the input v is not the input to the encoder; rather, Gv is the input to the encoder.
Example 2.4.5 A typical RSC chosen for binary turbo codes is given by [ 1   (s²+1)/(s²+s+1) ]. We will develop the representations of this section for this encoder. First, the matrices (A, B, C, D) are given by:

    A = [ 0 1 ] ,   B = [ 0 ] ,   C = [ 0 1 ],   D = [ 1 ].
        [ 1 1 ]         [ 1 ]

Clearly, and not surprisingly, p_A(s) = s² + s + 1. For the sake of exposition, we will choose a ridiculously small value N = 5. The matrix P and the matrix M are:

        [ 1 0 0 ]         [ 1 0 0 0 0 ]
        [ 1 1 0 ]         [ 1 1 0 0 0 ]
    P = [ 1 1 1 ] ,   M = [ 1 1 1 0 0 ]
        [ 0 1 1 ]         [ 0 1 1 1 0 ]
        [ 0 0 1 ]         [ 1 0 1 1 1 ]
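These matrices can be verified mechanically: M is built from the Markov parameters D, CB, CAB, ..., P from the coefficients (p_0, p_1, p_2) = (1, 1, 1) of p_A(s), and the columns of P should lie in ker Φ_5(A,B). A sketch over GF(2):

```python
import numpy as np

A = np.array([[0, 1], [1, 1]])
B = np.array([[0], [1]])
C = np.array([[0, 1]])
D = np.array([[1]])
N, delta = 5, 2

# Markov parameters D, CB, CAB, ... give the lower triangular Toeplitz M.
mk, P_ = [D[0, 0]], B
for _ in range(N - 1):
    mk.append((C @ P_)[0, 0] % 2)
    P_ = (A @ P_) % 2
M = np.array([[mk[i - j] if i >= j else 0 for j in range(N)] for i in range(N)])

# P: columns are shifts of the coefficient vector of p_A(s) = s^2 + s + 1.
p = [1, 1, 1]
P = np.array([[p[i - j] if 0 <= i - j <= delta else 0
               for j in range(N - delta)] for i in range(N)])

# Phi_N(A, B) = [A^{N-1}B ... AB B]
Phi = np.hstack([np.linalg.matrix_power(A, N - 1 - i) @ B % 2
                 for i in range(N)]) % 2

assert ((Phi @ P) % 2 == 0).all()   # image of P lies in ker Phi_N(A, B)
print(M.tolist())
print(P.tolist())
```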
CHAPTER 3

CONNECTIONS BETWEEN REPRESENTATIONS

In this chapter we will present some examples of convolutional codes using each of the various representations. In particular, example codes will be developed with the generator matrix description. From there, it will be shown how to transform this description into a parity check description and an input-state-output description. Also, the generator sequence representation and electronic shift register descriptions will be introduced and derived from the generator description. More importantly, codes developed using the input-state-output representation will be transformed into generator and parity check descriptions.
3.1 Generator Matrices and Sequences

We have already seen in Definition 1.3.1 that a convolutional code can be described in terms of a polynomial generator matrix, G(s). Before proceeding any further let us give some concrete examples.
Example 3.1.1 Let us define two binary convolutional encoders, G¹(s) and G²(s), by

    G¹(s) := [ s+1   s³+s+1 ]

    G²(s) := [ 1  0  s²+1 ]
             [ 0  1  s    ]

For G¹(s) it is clear that m = ν₁ = 3 and δ(G¹(s)) = 3. For G²(s) it is easy to see that m = ν₁ = 2, ν₂ = 1 and δ(G²(s)) = 3. Let us denote the codes they describe by C₁ and C₂ respectively. It is easy to check that δ(C₁) = 3 and that δ(C₂) = 2. Hence, G²(s) is not a minimal encoder. The `problem' with this encoder is that G²(s) is not a minimal basis of the rational vector space generated by its rows in the sense of [24]. This issue will be touched upon in Chapter 4.
Definition 3.1.2 A generator matrix is systematic if it is in the form [ G̃(s)   I ], where G̃(s) has entries in F(s). The bits of the codeword corresponding to the identity matrix are called the information bits of the codeword since they are exactly the bits of the message, or information, vector. A generator matrix is said to be recursive if it contains nonpolynomial entries.

We can see that G²(s) is systematic but G¹(s) is not.
For any generator matrix, we may write in the obvious way:

    G(s) = G_m s^m + G_{m-1} s^{m-1} + ... + G_1 s + G_0

where each G_i is simply the k × n matrix whose entries are the scalar coefficients of s^i in the corresponding polynomial entries of G(s). In doing so, the sequence

    G_0  G_1  ...  G_{m-1}  G_m

is called a generating sequence of the code. It is clear that there is a bijection between generator matrices and generating sequences and that transformation between the two is obvious.
Using generating sequences it is easy to write down a scalar generating matrix (as for block codes) for a convolutional code. However, since convolutional codes can have message words and codewords of arbitrary length, the generator matrix must accommodate. We arrive at the following semi-infinite generator matrix description.

Definition 3.1.3

    G = [ G_0  G_1  G_2  ...  G_m                          ]
        [ 0    G_0  G_1  ...  G_{m-1}  G_m                 ]
        [ 0    0    G_0  ...  G_{m-2}  G_{m-1}  G_m        ]
        [                 .                           .    ]
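Encoding with the semi-infinite matrix G is just a sliding convolution, y_t = Σ_i u_{t-i} G_i over GF(2). A minimal sketch using the generating sequence of G¹(s) = [ s+1   s³+s+1 ] from Example 3.1.1 (k = 1, n = 2, m = 3):

```python
# Convolutional encoding as a sliding dot product with the generating
# sequence G_0, ..., G_m (here from G1(s) = [s+1, s^3+s+1]).

G = [[1, 1],   # G_0: constant coefficients of (s+1, s^3+s+1)
     [1, 1],   # G_1
     [0, 0],   # G_2
     [0, 1]]   # G_3

def encode(u):
    """y_t = sum_i u_{t-i} G_i over GF(2); output has len(u)+m blocks."""
    m = len(G) - 1
    out = []
    for t in range(len(u) + m):
        block = [0, 0]
        for i, Gi in enumerate(G):
            if 0 <= t - i < len(u):
                block = [(b + u[t - i] * g) % 2 for b, g in zip(block, Gi)]
        out.append(block)
    return out

print(encode([1]))       # the generating sequence itself: G_0, G_1, G_2, G_3
print(encode([1, 1]))    # (s+1)*G(s): [[1,1],[0,0],[1,1],[0,1],[0,1]]
```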
3.2 Parity Check Matrices

Since we can also define a linear subspace as the kernel of a matrix, it is clear that a convolutional code can be described as the kernel of a polynomial matrix. The matrix is called a parity check matrix or syndrome former. If one has a generator matrix for a convolutional code, how does one obtain a parity check matrix? The answer is a rather simple application of linear algebra. Take the generator matrix G(s) and append the n × n identity matrix below it. Then perform elementary column operations (over F[s]) on this matrix until the top k rows come into the form [ L(s) | 0 ]. Then the bottom n rows and last n - k columns form the transpose of a parity check matrix. One may refer to [40, 39, 23] for a more in-depth discussion.
Remark 3.2.1 The matrix L(s) is a greatest common left divisor of the matrix G(s). This concept will be explained thoroughly in Chapter 4. For now it suffices to know that if L(s) is I_k, then the above algorithm also gives a polynomial inverse, G^{-1}(s), for the encoder G(s) in the first k columns of the bottom n rows. If L(s) is not the identity, then G(s) is catastrophic as we have defined it before, and more importantly for our purposes here, the parity check matrix will, depending on how we choose to define a convolutional code, actually define a larger code than the one defined by G(s). This point will be illustrated in the following examples.
Example 3.2.2 [Continuation of Example 3.1.1] For G¹(s), we have

    [ s+1   s³+s+1 ]                                   [ 1      0      ]
    [ 1     0      ]   after column operations is      [ s²+s   s³+s+1 ]
    [ 0     1      ]                                   [ 1      s+1    ]

Hence, H¹(s) = [ s³+s+1   s+1 ]. Similarly, it can be shown that H²(s) = [ s²+1   s   1 ] is the parity check matrix for G²(s).
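The computed parity check matrices can be verified with carry-less polynomial arithmetic: G(s)H(s)ᵀ must vanish, and the first column of the reduced matrix should be a right inverse of G¹(s). A sketch encoding GF(2)[s] polynomials as Python integer bitmasks (bit i is the coefficient of s^i):

```python
# Polynomials over GF(2) as int bitmasks: s+1 -> 0b11, s^3+s+1 -> 0b1011.

def pmul(a, b):
    """Carry-less multiplication = product in GF(2)[s]."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

G1 = [0b11, 0b1011]        # [s+1, s^3+s+1]
H1 = [0b1011, 0b11]        # [s^3+s+1, s+1]
assert pmul(G1[0], H1[0]) ^ pmul(G1[1], H1[1]) == 0      # G1 H1^T = 0

G1_inv = [0b110, 0b1]      # [s^2+s; 1], read off the first column above
assert pmul(G1[0], G1_inv[0]) ^ pmul(G1[1], G1_inv[1]) == 1  # right inverse

G2 = [[0b1, 0, 0b101], [0, 0b1, 0b10]]   # rows of G2(s)
H2 = [0b101, 0b10, 0b1]                  # [s^2+1, s, 1]
for row in G2:
    acc = 0
    for g, h in zip(row, H2):
        acc ^= pmul(g, h)
    assert acc == 0                      # G2 H2^T = 0
print("H1 and H2 are parity checks for G1 and G2")
```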
Example 3.2.3 Consider the binary catastrophic convolutional encoder

    G³(s) := [ s   0     s      ]
             [ 1   s+1   s²+s+1 ]

In Chapter 4, an alternative way based on state space realizations is developed for computing such divisors. However, the above calculation also shows that H³(s) = [ 1 | s | 1 ]. The polynomial vector [ 1 | 1 | s+1 ] is in the kernel of H³(s). However, no polynomial vector will multiply by G³(s) to generate this vector. The rational vector [ 1/(s+1)   1/(s+1) ] does generate this codeword. Therefore, H³(s) defines a larger convolutional code if we employ Definition 1.3.1 as opposed to the one given in Corollary 2.1.5. This seemingly critical difference in the two definitions is made irrelevant since we have already remarked that catastrophic encoders are to be avoided. Any catastrophic encoder, G(s), can be replaced by a noncatastrophic encoder L^{-1}(s)G(s).
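That G³(s) is catastrophic can also be detected from its full-size minors: an encoder is noncatastrophic exactly when the g.c.d. of its k × k minors is a unit, this g.c.d. being the determinant of a GCLD up to units. A sketch over GF(2) with integer-bitmask polynomials:

```python
# The gcd of the 2x2 minors of G3(s) = [s 0 s; 1 s+1 s^2+s+1] is s^2+s,
# which is not a unit, so the encoder is catastrophic.

def pmul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def pmod(a, b):
    while a and a.bit_length() >= b.bit_length():
        a ^= b << (a.bit_length() - b.bit_length())
    return a

def pgcd(a, b):
    while b:
        a, b = b, pmod(a, b)
    return a

r1 = [0b10, 0b0, 0b10]            # s, 0, s
r2 = [0b1, 0b11, 0b111]           # 1, s+1, s^2+s+1

minors = [pmul(r1[i], r2[j]) ^ pmul(r1[j], r2[i])
          for i in range(3) for j in range(i + 1, 3)]
g = 0
for m in minors:
    g = pgcd(g, m) if g else m
print(bin(g))                      # 0b110 = s^2+s, not a unit -> catastrophic
```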
Finally, let us remark that, in parallel with generating sequences, parity check sequences are also employed. Similarly, there exist semi-infinite parity check matrix descriptions of convolutional codes. We will refer to the semi-infinite parity check matrices as syndrome formers and their polynomial counterparts as parity check matrices, although typically the two terms are interchangeable.
[Shift-register diagram: a single-input encoder with input u(s) and outputs y1(s) and y2(s).]

[Shift-register diagram: a two-input encoder with inputs u1(s), u2(s) and outputs y1(s), y2(s) and y3(s).]
Of course, it is also quite simple to go from a shift register representation to a generator matrix description. The process of interchanging shift registers and (A, B, C, D) representations is also quite natural. This process is detailed nicely in Section 7.3 of [42].
3.4 From ISO Representations to Generator Matrices

Given matrices (A, B, C, D) which define a convolutional code by (2.2.1), it is natural to wonder what the generator matrix of the code is. In principle it is an easy task to compute the generator matrix.

If we take any codeword ( y(s) ; u(s) ) and its corresponding sequence of states x(s) (represented in polynomial form in the reverse way: (a_0, a_1, ..., a_n) ↔ a_0 s^n + a_1 s^{n-1} + ... + a_n), then the following equation holds:

    [ sI - A   0   -B ] [ x(s) ]
    [ -C       I   -D ] [ y(s) ] = 0.
                        [ u(s) ]

Proposition 3.4.1 [58] There exist polynomial matrices X(s), Y(s) and U(s) of sizes δ × k, (n-k) × k and k × k respectively, such that

    ker [ sI - A   0   -B ]      [ X(s) ]
        [ -C       I   -D ] = im [ Y(s) ]
                                 [ U(s) ]
of convolutional codes is nothing more than the Markov parameters CA^i B [66, p. 213]. As we will see later, similar results hold for the parity triangle of feedback decoding. Let us consider an example where this is not the case.
Example 3.5.4 Let us examine the code given by G(s) = [ s²+1   s²+s+1 ]. We have already seen that a realization of this code is given by:

    A = [ 1 1 ]     B = [ 1 ]     C = [ 1 0 ]     D = [ 1 ]
        [ 1 0 ]         [ 0 ]

                                                        [ u_t     ]
    [ 1 0 ]         [ 1 0 0 0 0   1 0 0 0 0 ]           [ u_{t+1} ]
    [ 1 1 ]         [ 1 1 0 0 0   0 1 0 0 0 ]           [ u_{t+2} ]
    [ 0 0 ]  x_t =  [ 1 0 1 0 0   1 1 1 0 0 ]           [ u_{t+3} ]
    [ 0 0 ]         [ 0 1 0 1 0   0 1 1 1 0 ]           [ u_{t+4} ]
    [ 0 0 ]         [ 0 0 1 0 1   0 0 1 1 1 ]           [ y_t     ]
                                                        [ y_{t+1} ]
                                                        [ y_{t+2} ]
                                                        [ y_{t+3} ]
                                                        [ y_{t+4} ]
We will further examine the relationship between the syndrome former and the local description in Section 5.5, where we utilize these concepts to explore various majority logic decoding schemes.
CHAPTER 4

A MATRIX EUCLIDEAN ALGORITHM

In this chapter, the question of determining if a generator matrix for a convolutional code is catastrophic, noncatastrophic or observable is addressed. Recall that these terms were discussed in Proposition 1.3.2. As will be discussed below, this problem also has much broader implications and many various applications. Hence, let us step back and analyze this problem in terms of computing common divisors for matrices.
We will develop an efficient algorithm for determining the greatest common left divisor (GCLD) of two polynomial matrices. Knowing this divisor allows for several immediate applications: In coding theory, a noncatastrophic convolutional encoder can be derived from an arbitrary one. In systems theory, irreducible matrix fraction descriptions of transfer function matrices can be found. In linear algebra, the greatest common divisor can be seen as a basis for a free module generated by the columns of the matrices.
The approach taken is based on recent ideas from systems theory. A minimal state space realization is obtained with minimal calculations, and from this the controllability matrix is analyzed to produce the GCLD. It will be shown that the derived algorithm is a natural extension of the Euclidean algorithm to the matrix case.

The results of this chapter were reported by the author in [5].
4.1 Background Material

Let F be an arbitrary field and consider the polynomial ring F[s]. If we are given two polynomial matrices E(s) and F(s), each with k rows, then we may define a greatest common left divisor (GCLD) to be any k × k polynomial matrix L(s) satisfying:

1. There exist polynomial matrices Ē(s) and F̄(s) such that L(s)Ē(s) = E(s) and L(s)F̄(s) = F(s).

2. If L′(s) is any other divisor of E(s) and F(s) then there exists a polynomial matrix D(s) such that L′(s)D(s) = L(s).
By an arbitrary choice, we will work with left divisors. The theory holds mutatis mutandis for right divisors.

Notice that GCLDs are not unique. For our applications we will assume that the matrix [E(s) | F(s)] is full rank. This implies that all GCLDs will be nonsingular and differ by a unimodular right factor [35]. Note also that the columns of the GCLD of the full rank polynomial matrix [E(s) | F(s)] form a basis for the free module spanned by the columns of [E(s) | F(s)] in F^k[s]. Two matrices are said to be coprime if their GCLD is a unimodular matrix.
Instead of starting with two separate matrices and then combining them into one, we may be given, as in the case of a convolutional encoder, a single full rank matrix P(s) of size k × n. We can speak of the GCLD of this single matrix by writing P(s) = [E(s) | F(s)] where usually E(s) is of size k × k and F(s) is of size k × (n-k), and hence the GCLD of P(s) is then the GCLD of E(s) and F(s). Obviously the GCLD does not depend on how we choose the division. Equivalently, we could define a GCLD of P(s) to be a matrix L(s) such that L(s)P̄(s) = P(s), where P̄(s) is a polynomial matrix whose Smith, or equivalently, Hermite form is [I_k | 0].
With this last description we are able to see several immediate applications. First, if we are given P(s) as a polynomial basis for a rational vector space [24], then by dividing by L(s) (i.e. taking P̄(s)) we get a minimal polynomial basis for the vector space (as defined in [24]). Secondly, if we are given P(s) as a generating set for its column module over F[s], then we observed earlier that the columns of the GCLD, L(s), of P(s) form a basis of the column module of P(s). In particular, if P(s) is of size 1 × 2 and has the form P(s) = (p(s), q(s)), p(s), q(s) ∈ F[s], then the GCLD of P(s) is nothing else than the greatest common divisor (g.c.d.) of p(s), q(s). Moreover our algorithm is in this case equivalent to Euclid's algorithm. Finally, if we are given P(s) as a convolutional encoder, then P(s) is an observable (i.e. noncatastrophic with delay 0) encoder if and only if L(s) is a unimodular matrix [2, 18].
Closely related to this last application, we can think of P(s) as describing over the real numbers R a linear behavior in the sense of Willems [69]:

    B = { w(t) ∈ C^∞(R, R^n) | P(d/dt) w(t) = 0 }.

The computation of the GCLD is then needed for the computation of the controllable sub-behavior of B.
The approach that will be taken here is to obtain a minimal state space representation of the associated behavior B with little or no calculation [57]. This state space representation will be controllable if and only if our behavioral system (or encoder) is observable. Further, the contribution of this chapter will be to calculate a GCLD of P(s) directly from the controllability matrix of this state space representation. As we shall see, the algorithm presented will be a natural generalization of the Euclidean algorithm to polynomial matrices.
4.2 A Brief History of the Problem

The problem of finding GCLDs is not new, and, indeed, there are several algorithms in existence. The most obvious way is to append the two matrices together as [E(s) | F(s)] and perform polynomial column operations (over F[s]) to bring the matrix to Smith or Hermite form [8]. The obvious drawback is that polynomial column operations can become quite tedious, especially if the degrees of the polynomial entries are high. This problem was overcome by Kung et al. [38, 15] with their approach using generalized Sylvester matrices. A problem with that algorithm is that the scalar matrices obtained from the original polynomial matrices were often quite large.
Several more recent works, using somewhat similar but distinct methods to the one proposed here, have appeared: Fuhrmann [25] obtained an algorithm using a matrix continued fraction representation. Antoulas [6] has done considerable work on the subject using recursive and partial realizations.

An excellent reference on the various techniques of computing GCDs in the case k = 1 can be found in [12]. In fact, the section on "G.C.D. Using Companion Matrix" from this book gives exactly our algorithm in the simple case k = 1. In this reference it was, unfortunately, not observed by the author that the companion matrix was, in fact, a realization of the polynomial matrix. This prevented the extension to the general case, where the author of that paper instead presents the algorithm of Kung et al.
4.3 A Realization Algorithm

We now present the realization algorithm we will need, preceded by some notation. For a more thorough account of the ideas involved, please refer to [57].

Partition P(s) into P(s) = [ E(s) | F(s) ], where E(s) is k × k and F(s) is k × (n-k). After some unimodular row operations we can assume that P(s) is row proper with row degrees (Kronecker indices) ν₁ ≥ ... ≥ ν_k. After a possible right multiplication by an n × n invertible matrix we can assume that the high order coefficient matrix, P_h, has the form [I_k | 0]. Assume that P(s) has no constant rows, i.e. ν_k ≥ 1. For i, j = 1, ..., k let

    e_{i,j}(s) = Σ_{l=0}^{ν_i} e^l_{i,j} s^l ,        f_i(s) = Σ_{l=0}^{ν_i - 1} f^l_i s^l

denote the polynomial entries of E(s) and the ith row of F(s) respectively.
Define for i = 1, ..., k matrices of sizes ν_i × ν_i, ν_i × (n-k) and 1 × ν_i respectively:

               [ 0   ...  ...  ...  -e⁰_{i,i}        ]            [ f⁰_i         ]
               [ 1   0              -e¹_{i,i}        ]            [ f¹_i         ]
    A_{i,i} := [ 0   1    .           :              ] ,  B_i :=  [  :           ] ,  C_i := [0, ..., 0, 1].    (4.3.1)
               [ :        .    0      :              ]            [ f^{ν_i-1}_i  ]
               [ 0   ...  0    1    -e^{ν_i-1}_{i,i} ]
For i, j = 1, ..., k, i ≠ j, define matrices of size ν_i × ν_j:

               [ 0  ...  0   -e⁰_{i,j}        ]
    A_{i,j} := [ :        :   -e¹_{i,j}       ]        (4.3.2)
               [ :        :    :              ]
               [ 0  ...  0   -e^{ν_i-1}_{i,j} ]
The matrices A_{i,i} are just the companion matrices for the polynomials e_{i,i}(s), while the matrices A_{i,j} are just ν_j - 1 columns of zeroes with the coefficient vector of the polynomial e_{i,j}(s) appended on the right. Similarly, each B_i is just the coefficient vectors of all the polynomials in the ith row of F(s). So these matrices are obtained with no calculations at all, provided that the matrix P(s) meets the somewhat stringent conditions imposed. If P_h does not have the form P_h = [I_k | 0] then it can be brought into this form with the unimodular operations outlined above.
Notice also the requirement that P(s) has no constant rows. If P(s) has constant rows then the row and column operations outlined above will transform P(s) into:

    P̂(s) = [ Ê₁(s)   Ê₂(s)   F̂(s) ]        (4.3.3)
           [ 0       I       0    ]

and [ Ê₁(s) | F̂(s) ] has no constant rows.
Right unimodular operations will not affect the GCLD; however, left operations will have to be `undone' once the GCLD of the resulting matrix is calculated. So all of these conditions can be met at the expense of some efficiency. From here on, assume that P(s) meets these requirements.
Theorem 4.3.1 [70, 57] Given P(s) = [ E(s) | F(s) ] satisfying P_h = [ I_k | 0 ], let A_{i,j}, B_i, C_i be defined as above. Let

         [ A_{1,1}  ...  A_{1,k} ]        [ B_1 ]        [ C_1       0  ]
    A := [    :      .      :    ] , B := [  :  ] , C := [      .       ]
         [ A_{k,1}  ...  A_{k,k} ]        [ B_k ]        [ 0       C_k ]

and let σ represent either the shift operator or the differential operator d/dt. Then

    σx(t) = Ax(t) + Bu(t),        (4.3.4)
    y(t)  = Cx(t)

represents a minimal state space realization of the system

    E(σ)y(t) + F(σ)u(t) = 0.      (4.3.5)

We see that A has size δ × δ (where δ = Σ_{i=1}^k ν_i), B is δ × (n-k), and C is k × δ.
The idea here is that controllability of the state space representation is equivalent to the controllability of the behavioral system given by P(s), which is equivalent to P(s) being an observable encoder [53, 57].
The relationship between the polynomial matrix P(s) and the matrices (A, B, C) is expressed in the following way: Consider the k × δ matrix

           [ 1  s  ...  s^{ν₁-1}   0  ...  0                              ]
    X(s) = [ 0  ...  0             1  s  ...  s^{ν₂-1}   0  ...  0        ]        (4.3.6)
           [                                    .                         ]
           [ 0  ...  0                          0  ...   1  s ... s^{ν_k-1} ]

which was called a basis matrix of size ν = [ν₁, ..., ν_k] in [57] since it has the property that every polynomial k-vector φ(s) ∈ F^k[s] whose ith component has degree at most ν_i - 1 can uniquely be described through φ(s) = X(s)α, α ∈ F^δ.
A direct calculation reveals that P(s) and the matrices (A, B, C) are related by:

    X(s) [ sI - A | B ] = P(s) [ C   0       ]        (4.3.7)
                               [ 0   I_{n-k} ]

Of course, we can multiply X(s) by an invertible matrix T ∈ Gl_δ on the right and obtain the equivalent realization (T^{-1}AT, T^{-1}B, CT). We will make use of this fact in Section 4.5 to obtain a more suitable realization.
4.4 The Controllability Space

We are given a k × n full rank polynomial matrix P(s) and wish to determine its GCLD, L(s). Write P(s) = L(s)P̄(s), where P̄(s) has Smith form [I_k | 0]. We will assume that the rows of P̄(s) form a minimal basis in the sense of Forney [24]. The row degrees (μ₁, ..., μ_k) of P̄(s) are therefore the minimal indices of the rational vector space generated by the rows of the matrix P(s). We will not assume that (μ₁, ..., μ_k) are ordered by size. Also write P̄(s) = [Ē(s) | F̄(s)] and let P̄_h = [I_k | 0] be the high order coefficient matrix.
Since L(s) is determined uniquely up to unimodular right multiplication, we have a choice as to which L(s) to work with, and hence which P̄(s) to work with. The following lemma relates L(s) and P̄(s) and it singles out a nice choice:
Lemma 4.4.1 If the rows of P̄(s) form a minimal basis having row degrees μ₁, ..., μ_k then L(s) is uniquely determined from the identity P(s) = L(s)P̄(s). The (i,j)-entry of L(s) has degree at most (ν_i - μ_j) or the entry is zero.

It is possible to choose P̄(s) such that the scalar matrix L₁ whose (i,j)-entry is the coefficient of s^{ν_i - μ_j} in the (i,j)-entry of L(s) is lower triangular.

Proof: The first part of the lemma is a direct consequence of [24]. The second part will be established by induction. Using elementary column operations on L(s) (this corresponds to elementary row operations on P̄(s)) it will be possible to eliminate all entries of the first row of L₁ with the exception of one entry. After a possible permutation of the columns we can assume that the first row of L₁ has, with the exception of the entry (1,1), all entries equal to zero. Proceeding inductively row by row will establish the claim.
Let P̄_h be the high order coefficient matrix of P̄(s). From the fact that both P_h and P̄_h have rank k and from the identity P_h = L₁P̄_h it follows that L₁ is invertible. As a direct consequence we have:

Lemma 4.4.2 Let d := Σᵢ μᵢ be the McMillan degree of P̄(s). Then
Now, realize P̄(s) to obtain matrices Ā, B̄, and C̄, relative to the canonical basis matrix X̄(s). Hence the following equation holds:

    X̄(s) [ sI - Ā | B̄ ] = [ Ē(s)T C̄ | F̄(s) ]        (4.4.9)

The controllability matrix of the pair (Ā, B̄) can also be computed. However, the usual definition of the controllability matrix is that of a d × d(n-k) matrix, where B̄ is of size d × (n-k). We can, however, naturally extend the size of this matrix to d × δ(n-k). This is necessary for the following key result.
Theorem 4.4.3

    L(s) X̄(s) C(Ā, B̄) = X(s) C(A, B)

Proof: Repeated applications of (4.4.8) give:

    X(s)C(A,B) = [ F | sF + ECB | ... | s^{δ-1}F + s^{δ-2}ECB + s^{δ-3}ECAB + ... + ECA^{δ-2}B ]

Repeated applications of (4.4.9) give:

    L(s)X̄(s)C(Ā,B̄) = [ F | sF + ET C̄B̄ | ... | s^{δ-1}F + s^{δ-2}ET C̄B̄ + s^{δ-3}ET C̄ĀB̄ + ... + ET C̄Ā^{δ-2}B̄ ]

By examining the above expressions, it is clear that the only step remaining in the proof is to show that CA^iB = T C̄Ā^iB̄ for all nonnegative integers i.
We notice that (sI - A)^{-1} = Σ_{i=0}^∞ A^i / s^{i+1}. Starting with the equation X(s)(sI - A) = E(s)C, we apply this inverse to obtain X(s) = E(s) Σ_{i=0}^∞ CA^i / s^{i+1}. Further:

    F(s) = X(s)B = E(s) Σ_{i=0}^∞ CA^i B / s^{i+1}.

Similarly, we have the equations:

    F̄(s) = X̄(s)B̄ = Ē(s)T Σ_{i=0}^∞ C̄Ā^i B̄ / s^{i+1}.

Multiplying the last equation by L(s) results in

    F(s) = E(s) Σ_{i=0}^∞ T C̄Ā^i B̄ / s^{i+1}.

Since E(s) has high order coefficient matrix I_k, the columns of E(s) are linearly independent over F[s] and we get:

    Σ_{i=0}^∞ CA^i B / s^{i+1} = Σ_{i=0}^∞ T C̄Ā^i B̄ / s^{i+1}.

Equating coefficients in the above expression gives us the desired equality and completes the proof.
This theorem is the key to the entire algorithm as the following corollary shows.

Corollary 4.4.4 There exists an invertible matrix W ∈ Gl_{(n-k)δ} such that

    X(s)C(A,B)W = [ s^{μ₁-1}ℓ₁  ...  sℓ₁  ℓ₁ | ... | s^{μ_k-1}ℓ_k  ...  sℓ_k  ℓ_k | 0_{k×((n-k)δ-d)} ].    (4.4.10)

In this representation the k × k matrix

    L(s) = [ ℓ₁, ..., ℓ_k ]

represents a greatest common left divisor of [E(s) F(s)] and μ₁, ..., μ_k are the row degrees of [Ē(s) F̄(s)].
Proof: Since P̄(s) is a minimal basis, its realization, (Ā, B̄), must be a controllable pair. Therefore, there exists a scalar matrix W ∈ Gl_{(n-k)δ} such that C(Ā, B̄)W = [ I_d | 0_{d×((n-k)δ-d)} ]. Hence, the theorem implies that X(s)C(A,B) is column equivalent to a matrix whose columns are exactly the columns of a GCLD and also multiples of these columns (as the multiplication X̄(s)C(Ā, B̄) indicates).
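For k = 1 the whole procedure of this chapter collapses into a few lines, which also makes the advertised connection with Euclid's algorithm concrete: realize [ e(s) | f(s) ] by the companion matrix of e(s) as in (4.3.1), form the controllability matrix C(A, B), column reduce it over the field, and read a g.c.d. off as the nonzero column of lowest degree (Corollary 4.4.4). A sketch over GF(2), where coefficient vectors are indexed so that position i holds the coefficient of s^i:

```python
# g.c.d. of e(s), f(s) over GF(2) from the controllability matrix of the
# companion realization of [e(s) | f(s)] (the k = 1 case of the algorithm).

def gcd_via_controllability(e, f):
    """e: coefficients e_0..e_{n-1} of a monic degree-n polynomial;
    f: coefficients of a nonzero f(s) with deg f < n, padded to length n."""
    n = len(e)
    A = [[0] * n for _ in range(n)]
    for i in range(1, n):
        A[i][i - 1] = 1                  # companion matrix of e(s)
    for i in range(n):
        A[i][n - 1] = e[i]               # (signs vanish over GF(2))
    cols, c = [], f[:]
    for _ in range(n):                   # C(A, B) = [B  AB ... A^{n-1}B]
        cols.append(c)
        c = [sum(A[i][j] * c[j] for j in range(n)) % 2 for i in range(n)]
    reduced = []                         # column echelon reduction over GF(2)
    for c in cols:
        while any(c):
            lead = max(i for i in range(n) if c[i])
            hit = next((r for r in reduced
                        if max(i for i in range(n) if r[i]) == lead), None)
            if hit is None:
                reduced.append(c)
                break
            c = [(a + b) % 2 for a, b in zip(c, hit)]
    # lowest-degree nonzero column = coefficient vector of a g.c.d.
    return min(reduced, key=lambda c: max(i for i in range(n) if c[i]))

# e = s^3 + 1 = (s+1)(s^2+s+1), f = s^2 + s = s(s+1): gcd is s + 1.
print(gcd_via_controllability([1, 0, 0], [0, 1, 1]))   # [1, 1, 0]  <->  s + 1
# coprime case: e = s^2+s+1, f = s+1: gcd is 1.
print(gcd_via_controllability([1, 1], [1, 1]))         # [1, 0]     <->  1
```

For k = 1 the rank of C(A, B) is n minus the degree of the g.c.d., so a rank-deficient controllability matrix is exactly the catastrophic (non-coprime) case.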
4.5 The Refining Algorithm

By Theorem 4.4.3 and Corollary 4.4.4, the columns of L(s) are contained in a matrix that is column equivalent (over F) to X(s)C(A,B). The question is now: how do we select these k columns of L(s) from the (n−k)δ columns of the controllability matrix? The answer is fairly simple: column reduce, and then choose the appropriate k columns in a manner that will be described below. However, we must first reconsider our choice of basis matrix X(s). The reason we have started with the one we have chosen is that it allows us to write down the matrices A and B a little more easily. The downside is that when we column reduce the controllability matrix we start by eliminating the lower degree terms of the polynomials in row 1 of the corresponding matrix X(s)C(A,B). It would make much more sense to start eliminating the highest degree terms in each row. We accomplish this by replacing the standard basis matrix X(s) introduced in (4.3.6) with the basis matrix

X(s) = [ s^{ν1−1}  0        ⋯  0         |  s^{ν1−2}  0        ⋯  0         |  ⋯
         0         s^{ν2−1} ⋯  0         |  0         s^{ν2−2} ⋯  0         |  ⋯
         ⋮                     ⋮         |  ⋮                     ⋮         |
         0         0        ⋯  s^{νk−1}  |  0         0        ⋯  s^{νk−2}  |  ⋯ ],

which we continue to denote by X(s).
In this representation, the monomial s^{νi−j} and the corresponding column are omitted as soon as the exponent νi − j becomes negative. The new basis matrix and the standard basis matrix are related by a simple permutation of the columns, i.e. there is a permutation matrix U such that the new basis matrix equals the standard one multiplied on the right by U. This permutation transforms the controllability matrix C(A,B) into U^{−1}C(A,B).
Although it is much simpler to explain the algorithm by performing the U transformation as above, in practice the computer would automatically perform the realization with respect to the new basis matrix X(s) and arrive at U^{−1}AU and U^{−1}B instead of A and B. The realization with respect to the new basis matrix is just as simple to compute as the original, yet it is in a more practical form and, by arriving at it directly, we do not waste time transforming basis matrices.
As mentioned earlier, the basis matrix X(s) (as well as the standard basis matrix) has the property that every polynomial k-vector φ(s) ∈ F^k[s] whose ith component has degree at most νi − 1 can be described uniquely through φ(s) = X(s)c, c ∈ F^δ. It is therefore possible to identify φ(s) with the δ-vector c. We will say that c is the coordinate vector of φ(s) with respect to the basis matrix X(s).
Theorem 4.5.1 Assume P(s) has Kronecker indices ν1 ≥ ⋯ ≥ νk and minimal indices μ1, …, μk, none of which equal zero. Let L(s) = [ℓ1, …, ℓk] be a GCLD whose (i, j)-entry has degree at most νi − μj or is zero. Assume that the matrix L1 is lower triangular (by Lemma 4.4.1) and let d = Σ_{i=1}^k μi. Then the δ×d scalar matrix whose columns form the coordinate vectors of

[ s^{μ1−1}ℓ1 ⋯ sℓ1 ℓ1 | ⋯ | s^{μk−1}ℓk ⋯ sℓk ℓk ]   (4.5.11)

is, after a possible permutation of the columns, in column echelon form.
Proof: This is an immediate consequence of the fact that L1 is lower triangular with nonzero diagonal elements, together with the specific choice of the basis matrix X(s).
As a consequence of this theorem we can immediately read off the minimal indices μ1, …, μk from the pivot indices of the column echelon form of C(A,B). A priori it is not true that X(s)C(A,B) has the particular form (4.5.11) even if C(A,B) is in column echelon form. One observes, however, that elementary column operations on C(A,B) correspond to unimodular operations on X(s)C(A,B). By Theorem 4.4.3 we also know that the columns of X(s)C(A,B) are in the column module of L(s). By the above remarks it is possible to identify k columns [c1, …, ck] from the column echelon form of C(A,B) such that X(s)[c1, …, ck] forms a GCLD of P(s). In the sequel we make this selection process more precise.
Assume that the controllability matrix C(A,B) is in column echelon form. We can think of the controllability matrix as being divided into row blocks. The top row block consists of k rows and corresponds (under multiplication by X(s)) to coefficients of degree νi − 1 for each respective row i. The next lower block corresponds to coefficients of degree νi − 2. Each lower block is similarly defined. If νi − j < 0, then no row corresponding to row i occurs in row block j (or any subsequent block). Based on this we define:
Definition 4.5.2 1. A column in the controllability matrix C(A,B) is said to "take its order in row i" if its leading coefficient occurs in a row which corresponds (under multiplication by X(s)) to an entry in row i of the resulting polynomial k-vector.

2. For each row i, 1 ≤ i ≤ k, consider all the column vectors of the controllability matrix taking their order in row i. From this set, the column vector whose leading coefficient is lowest (in the matrix, not necessarily in value) is called the "row leader for row i".
Theorem 4.5.3 If the column echelon form of C(A,B) has k row leaders [c1, …, ck], then X(s)[c1, …, ck] forms a GCLD of P(s).

Proof: It follows from our definition of the row leaders [c1, …, ck] that

deg det X(s)[c1, …, ck] = Σ_{i=1}^k νi − Σ_{i=1}^k μi = δ − d.

Since X(s)[c1, …, ck] is a subset of the column module of L(s), it follows that the columns of X(s)[c1, …, ck] generate this column module, and this completes the proof.
Remark 4.5.4 It can be shown, and is illustrated in an example in Section 4.8, that in the case of (n−k) = k = 1, i.e. in the situation where P(s) = (p1(s), p2(s)), the column reduction of the controllability matrix C(A,B) is exactly the Euclidean algorithm. The presented algorithm in this way generalizes Euclid's algorithm.
Remark 4.5.5 The column reduction of C(A,B) can be done very efficiently by iteratively computing the vectors A^i b_j, where b_j is the jth column of B. (See [66, 2] for more details.) Due to the very sparse structure of C(A,B) the column reduction is even easier.
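As a toy illustration of this iterative column reduction (a sketch only, not the optimized procedure of [66, 2]), the snippet below generates the Krylov columns b, Ab, A²b one at a time for the pair realizing the scalar fraction p/q with q = s³ − 1 and p = s² − 1, reducing each new column against the pivots found so far. Here A is taken to be the "multiplication by s modulo q" companion matrix, a convention that may differ from the realization used in the text by a basis change, which does not affect the rank; the rank 2 = deg q − deg gcd(p, q) exposes the nontrivial common factor s − 1.

```python
from fractions import Fraction

# q(s) = s^3 - 1, p(s) = s^2 - 1, gcd = s - 1, so the pair is uncontrollable.
q = [-1, 0, 0]                                  # low-order coefficients of q
b = [Fraction(-1), Fraction(0), Fraction(1)]    # p in the basis 1, s, s^2
n = 3

# A = multiplication by s modulo q(s): subdiagonal ones, last column -q.
A = [[Fraction(0)] * n for _ in range(n)]
for i in range(1, n):
    A[i][i - 1] = Fraction(1)
for i in range(n):
    A[i][n - 1] = Fraction(-q[i])

pivots = []                   # pairs (pivot_row, reduced column)
v = b[:]
for _ in range(n):            # process b, Ab, ..., A^(n-1) b one at a time
    col = v[:]
    for prow, pcol in pivots:                 # reduce against earlier pivots
        if col[prow] != 0:
            f = col[prow] / pcol[prow]
            col = [x - f * y for x, y in zip(col, pcol)]
    if any(x != 0 for x in col):              # surviving column is a new pivot
        prow = next(r for r in range(n) if col[r] != 0)
        pivots.append((prow, col))
    v = [sum(A[r][c] * v[c] for c in range(n)) for r in range(n)]   # v <- A v

print(len(pivots))            # rank of C(A, b); here 2 = deg q - deg gcd(p, q)
```

The full controllability matrix is never stored: each column is reduced as soon as it is produced, which is the point of the remark.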
4.6 The Situation of Constant Rows

As remarked earlier, the matrix P(s) that is used in the proof of our algorithm could have constant rows, and that poses problems when we try to realize this matrix. In this section we will deal with this case. In particular, assume that P(s) has ℓ constant rows, 0 ≤ ℓ ≤ k (νi = 0 for 1 ≤ i ≤ ℓ). Similar to before, we know that P(s) has (after possible right scalar multiplication) the form:

P(s) = [ I_ℓ  0     0
         0    E(s)  F(s) ].   (4.6.12)

Letting P̄r(s) = [E(s) | F(s)], we can obtain the realization matrices Ā, B̄, and C̄ relative to the basis matrix X̃(s) for P̄r(s). The following result can easily be shown using arguments mirroring those in the proof of Theorem 4.4.3:
Theorem 4.6.1

L(s) X̃(s) C(Ā, B̄) = X(s) C(A, B).

[…] the algorithm with the reduced matrix [Ê1(s) | F̂(s)] in order to find the remaining k columns of the GCLD.
Step 4 Obtain the realization matrices A and B relative to the basis matrix X(s) `by inspection'.

Step 5 Calculate the controllability matrix C(A,B) and column reduce it. (This may be done simultaneously to greatly improve efficiency [66, 2].)

Step 6 Pick out the `row leaders' from the column reduced controllability matrix C(A,B). Multiply the `row leaders' by X(s) and place them in the GCLD.
Step 7 If there are k row leaders, then go to Step 8. If there are fewer than k row leaders, then follow the algorithm of Section 4.6.

Step 8 Multiply the GCLD on the left by V^{−1} and stop.
Remark 4.7.1 The steps which take the most time are Steps 2 and 5. Step 2 is not necessary when P(s) is in the desired form. Of course, in general we will not know, or cannot guarantee, what form a matrix will have. However, in certain applications, such as searching for observable convolutional encoders [2, 18], we may prescribe what form the matrices will have.
After having computed the GCLD, L(s), there might arise the need to compute the `controllable part' P̄(s) as well. Let pi(s) and p̃i(s) denote the ith column of P(s) and P̄(s) respectively, i = 1, …, n. Consider for each index i the equation

L(s)p̃i(s) = pi(s).   (4.7.14)

We can view (4.7.14) as a system of δ + k linear equations in d + k unknowns. We therefore have to solve simultaneously n systems of equations in d + k unknowns. Due to the fact that the matrix L1 is already in lower triangular form, it follows that the coefficient matrix appearing in (4.7.14) is already in triangular form as well. A solution of (4.7.14) can therefore be computed very efficiently, and the method will be illustrated in the next section.
4.8 Examples

We have included some examples to aid in the understanding of the algorithm.

Example 4.8.1 First, take the case when P(s) is a 1×2 matrix. In this case we are just determining the GCD of two polynomials. Notice that P(s) will trivially satisfy all of the conditions unless the two polynomials have the same degree. In that case divide one into the other, and take the remainder in place of the original polynomial.
Let us work through the following example. Let

P(s) = [ s^6 + 5s^5 − 464s^4 + 1123s^3 − 887s^2 + 234s + 72    s^5 − 2s^4 − 342s^3 + 1177s^2 − 1170s + 504 ].
We get the following realization:

A = [  −5     1  0  0  0  0           B = [    1
       464    0  1  0  0  0                   −2
      −1123   0  0  1  0  0                 −342
       887    0  0  0  1  0                 1177
      −234    0  0  0  0  1                −1170
      −72     0  0  0  0  0 ]                504 ]

relative to the basis matrix X(s) = [s^5 s^4 s^3 s^2 s 1]. The corresponding column reduced controllability matrix is:

C(A,B) = [    1      0     0   0  0  0
             −2      3     0   0  0  0
           −342    −65     1   0  0  0
            1177    221  −19   0  0  0
           −1170   −220   23   0  0  0
             504     96  −12   0  0  0 ]

Since there is only one row of X(s)C(A,B), the row leader must be the rightmost nonzero column. Hence the GCLD is s^3 − 19s^2 + 23s − 12.
Notice that the first column of the above matrix corresponds with the polynomial of lesser degree from our original matrix. The second column corresponds (up to a scalar multiple) with the `first remainder' that one obtains when applying the Euclidean algorithm to the two polynomials in our matrix. The third column corresponds with the `second remainder', and also the last nonzero one, of the Euclidean algorithm. Because of this, our algorithm can be seen as an extension of the Euclidean algorithm to matrices.
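The correspondence can be checked directly. The sketch below (plain Python with exact rational arithmetic, not code from the dissertation) runs the polynomial Euclidean algorithm on the two entries of P(s) above and recovers the monic GCD s^3 − 19s^2 + 23s − 12, matching the third column of the reduced controllability matrix.

```python
from fractions import Fraction

def deg(a):
    """Degree of a coefficient list (low order first); -1 for the zero polynomial."""
    d = len(a) - 1
    while d >= 0 and a[d] == 0:
        d -= 1
    return d

def polymod(a, b):
    """Remainder of a(s) divided by b(s)."""
    a, b = [Fraction(x) for x in a], [Fraction(x) for x in b]
    da, db = deg(a), deg(b)
    while da >= db:
        f = a[da] / b[db]
        for i in range(db + 1):
            a[da - db + i] -= f * b[i]
        da = deg(a)
    return a[:da + 1]

def polygcd(a, b):
    while deg(b) >= 0:
        a, b = b, polymod(a, b)
    return [c / a[deg(a)] for c in a[:deg(a) + 1]]   # normalize monic

p1 = [72, 234, -887, 1123, -464, 5, 1]     # s^6 + 5s^5 - 464s^4 + ... + 72
p2 = [504, -1170, 1177, -342, -2, 1]       # s^5 - 2s^4 - 342s^3 + ... + 504
print(polygcd(p1, p2))    # [-12, 23, -19, 1], i.e. s^3 - 19s^2 + 23s - 12
```

The intermediate remainders produced by `polymod` agree, up to scalar multiples, with columns two and three of the reduced controllability matrix above.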
Example 4.8.2 Now let us look at a more nontrivial example. Let

P(s) := [ s^5   s^4 + s^2            s^4 + 2s^2
          s     s^3 + s^2 + s + 1    2s + 3     ].

The column reduced controllability matrix is:

C(A,B) = [ 1    0    0   0 0 0 0 0
           0    1    0   0 0 0 0 0
           0    1    0   0 0 0 0 0
           0    0    1   0 0 0 0 0
           0   −1    1   0 0 0 0 0
           1   −1    1   0 0 0 0 0
           0    0    0   0 0 0 0 0
           0    0    0   0 0 0 0 0 ]
We see that columns 2 and 3 take their order in row 2, while column 1 is the only column taking its order in row 1. Hence column 1 is the `row leader' for row 1 and column 3 is the `row leader' for row 2. It follows that μ1 = 1 and μ2 = 2. As an independent verification, we can also see directly from X(s)C(A,B) that column 2 is just s − 1 times column 3 and hence they are dependent:

X(s)C(A,B) = [ s^4   s^3 − s^2   s^2     0 0 0 0 0
               1     s^2 − 1     s + 1   0 0 0 0 0 ],
so the GCLD is

L(s) = [ s^4   s^2
         1     s + 1 ].
We can now also easily compute P̄(s) by solving the following linear system of equations:

[ 1  0  0  0  0 ]                       [ 1  0  0 ]
[ 0  0  1  0  0 ]                       [ 0  1  0 ]
[ 0  1  1  0  0 ]   [ a1  a2  a3 ]      [ 0  1  1 ]
[ 0  0  1  1  0 ]   [ b1  b2  b3 ]      [ 0  1  0 ]
[ 0  0  0  1  0 ]   [ c1  c2  c3 ]  =   [ 0  0  0 ]
[ 1  0  0  1  1 ]   [ d1  d2  d3 ]      [ 1  1  2 ]
[ 0  0  0  0  1 ]   [ e1  e2  e3 ]      [ 0  1  2 ]
[ 0  1  0  0  1 ]                       [ 0  1  3 ]
[ 0  0  0  0  0 ]                       [ 0  0  0 ]
[ 0  0  0  0  0 ]                       [ 0  0  0 ]
This corresponds to the equation L(s)P̄(s) = P(s), where P̄(s) is represented by the matrix:

[ a1 s + b1             a2 s + b2             a3 s + b3
  c1 s^2 + d1 s + e1    c2 s^2 + d2 s + e2    c3 s^2 + d3 s + e3 ].
The left-hand matrix in the above equation comes easily from the column reduced controllability matrix. It consists of the `row leaders' plus `shifts' of the row leaders. To be precise, for each i, the row leader for row i occurs, along with μi upward `shifts' of that row leader. Note that this necessitates adding another row block to the top of the scalar matrix. The right-hand matrix is simply the coefficients of the matrix P(s) with respect to the `augmented basis matrix'

[ s^5   0    |        ]
[ 0     s^3  |  X(s)  ].
Not only is the left-hand matrix easily constructed, but it will be lower triangular (up to column permutations), so that the above system can be solved instantaneously! The resulting matrix P̄(s) can now be stated:

P̄(s) = [ s   0          1
          0   s^2 + 1    2 ].
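As a quick sanity check (an illustration in plain Python, not part of the algorithm itself), one can multiply the computed factors back together and confirm that L(s)P̄(s) = P(s) for this example. Polynomials are coefficient lists, low order first.

```python
def polymul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def polyadd(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]

def trim(a):
    while len(a) > 1 and a[-1] == 0:
        a = a[:-1]
    return a

# L(s) and Pbar(s) from the example; entries are coefficient lists.
L = [[[0, 0, 0, 0, 1], [0, 0, 1]],          # [ s^4    s^2   ]
     [[1],             [1, 1]]]             # [ 1      s+1   ]
Pbar = [[[0, 1], [0],       [1]],           # [ s    0        1 ]
        [[0],    [1, 0, 1], [2]]]           # [ 0    s^2+1    2 ]
P = [[[0, 0, 0, 0, 0, 1], [0, 0, 1, 0, 1], [0, 0, 2, 0, 1]],
     [[0, 1],             [1, 1, 1, 1],    [3, 2]]]

prod = [[trim(polyadd(polymul(L[i][0], Pbar[0][j]),
                      polymul(L[i][1], Pbar[1][j])))
         for j in range(3)] for i in range(2)]
print(prod == [[trim(e) for e in row] for row in P])   # True
```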
CHAPTER 5
DECODING ERROR CORRECTING CODES
This chapter is included as a basic introduction to the decoding of error correcting codes. It is intended to provide the necessary background for subsequent chapters for those readers who are not familiar with this subject. A few of the more basic concepts are presented in detail, while some others are only briefly introduced with references to works containing a more thorough discussion. The reader is referred to [43, 44, 49] for general references on decoding. In addition, Section 5.5 provides new insight into majority logic decoding methods using the local description of convolutional codes presented in Section 2.2.4.
5.1 Channel Models and Maximum Likelihood Decoding

We must be able to decode the information that we encode if it is to be of any use. The fact that our encoded information is often corrupted by channel noise makes this task nontrivial. We have seen in Section 1.1 that the decoding process can be divided into two functions. The first attempts to remove the noise from the received word, thereby estimating which codeword was indeed sent. The second reverses the encoding process to determine what message was encoded. In practice, the second map is already specified by choosing the encoder, and hence can be considered part of the encoding process. So the heart of the decoding task is to find an efficient and effective noise-removing map.
Let us discuss what we mean by efficient and effective. By efficient we mean the decoder should work in a timely fashion and should be reasonably cheap and easy to build and maintain. These are concepts mainly associated with the `engineering side' of coding theory and, hence, fall outside the domain of this dissertation, although evaluating the complexity of a decoding algorithm is certainly a mathematical issue. By effective we mean, informally, that the decoder chooses the correct codeword with very high probability. Let us introduce some of the concepts that will make this idea more concrete.
Definition 5.1.1 Given a received word r and the set of all possible codewords {xi}_{i∈I}, let Pr(xi | r) denote the probability that the codeword xi was sent given that the word r was received. Then a decoder which outputs the codeword xj satisfying

Pr(xj | r) = max_{i∈I} Pr(xi | r)

is said to be a maximum likelihood decoder.
It is clear that this is intuitively the correct approach for constructing a good decoder. However, there are two immediate issues which arise from this definition. First, how does one compute the indicated probabilities? Second, for large codes it is impossible to compute all of these probabilities individually in a timely manner.

Let us address the first issue. (We will delay discussion of the second issue until Section 5.3.)
To compute these probabilities it is necessary to give our communication channel a `model'. This model must accurately reflect the real-life channel that it represents and, hopefully, it should have some definable mathematical properties. In reality it is not surprising that there are many different kinds of channels, and hence we must develop a separate model for each of them. We will content ourselves by considering two of the major channels.
Definition 5.1.2 A binary symmetric channel (BSC) is one such that A = F2 and, if we define Pr(1|0) = p0 and Pr(0|1) = p1 to be the probabilities that a transmitted 0 will be received as a 1, and that a transmitted 1 will be received as a 0 respectively, then p0 = p1 := p. Similarly, a q-ary symmetric channel is one such that A = Fq, where Pr(ai|ai) = 1 − p for all ai ∈ Fq and Pr(ai|aj) = p/(q − 1) for all ai ≠ aj.
So, if we are given a block code over a BSC with `transition probability' p and a received word r, it is a simple matter of computing binomial probabilities to determine which codeword is the most likely. Unless we are transmitting at a rate above the channel capacity, it is true that p < 1/2.
Therefore the issue of finding the most likely codeword is the same as finding the codeword which differs from r in the fewest components. That is, the decoder should choose xj such that

dist(xj, r) = min_{i∈I} dist(xi, r).

Such a decoding scheme is called nearest neighbor decoding. For many channels, nearest neighbor decoding and maximum likelihood decoding are equivalent.
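The equivalence over the BSC can be seen concretely in a few lines of Python. The small [6, 3] binary code below is an illustrative choice, not one from the text: brute-force maximum likelihood decoding (maximizing p^d (1 − p)^{n−d}) and nearest neighbor decoding select the same codeword whenever p < 1/2.

```python
from itertools import product

G = [[1, 0, 0, 1, 1, 0],
     [0, 1, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1]]          # generator of a small [6,3] binary code, d = 3
n = 6
codewords = [tuple(sum(m[i] * G[i][j] for i in range(3)) % 2 for j in range(n))
             for m in product([0, 1], repeat=3)]

def dist(x, y):
    return sum(a != b for a, b in zip(x, y))

def likelihood(x, r, p):
    d = dist(x, r)
    return p ** d * (1 - p) ** (n - d)     # binomial probability of r given x

p = 0.05
sent = codewords[5]
r = list(sent); r[2] ^= 1; r = tuple(r)    # one bit flipped by the channel
ml = max(codewords, key=lambda x: likelihood(x, r, p))
nn = min(codewords, key=lambda x: dist(x, r))
print(ml == nn == sent)                    # True: the two rules coincide
```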
Our first channel model makes use of hard decision decoding in that each bit of the received word is immediately estimated as either a 0 or 1. The probability of each estimate being incorrect is given by the channel transition probability p. While mathematically this is very nice, it is often not the best channel model. From a practical point of view one can think of the transmission of 0's and 1's by sending a voltage of −1 or 1. (This is an extremely oversimplified description.) The noise occurring during transmission over the channel can be modeled by adding or subtracting some amount of voltage. So for each bit the decoder will receive some real-valued voltage. Instead of making a hard decision on the bit (i.e. decoding all negative voltages as 0 and all positive voltages as 1), the decoder can assign a probability that each bit is a 1 (or 0) by using some channel information. In particular, it is reasonable to assume that the noise adheres to some sort of normal distribution with mean 0 and standard deviation σ. The standard deviation depends (primarily) on the relative strength of the energy per message bit Eb to the noise power spectral density N0/2. The ratio Eb/N0 is known as the signal to noise ratio (SNR). Thankfully, one need not know what a power spectral density is to compute σ. In practice, the knowledge of Eb/N0 given in decibels and the rate R = k/n are all that is needed to obtain σ:

σ = 10^{−(Eb/N0)/20} √(1/(2R)),

where Eb/N0 is expressed in decibels.
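For instance, a direct transcription of the formula above, with Eb/N0 given in dB:

```python
import math

def awgn_sigma(ebn0_db, rate):
    """Noise standard deviation for unit-energy antipodal signalling,
    given Eb/N0 in decibels and the code rate R = k/n."""
    return 10 ** (-ebn0_db / 20) * math.sqrt(1 / (2 * rate))

# At 0 dB and rate 1/2 the noise has sigma = 1; raising the SNR shrinks it.
print(awgn_sigma(0, 0.5))        # 1.0
print(awgn_sigma(3, 0.5) < 1)    # True
```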
Definition 5.1.3 The channel we have just described is called an additive white Gaussian noise (AWGN) channel, or simply a Gaussian channel.
A more thorough examination of this and other channels can be found in [17]. It is clear that the probabilities we desired to compute can be gotten easily from the normal distribution associated with the noise.
5.2 Decoding Parameters

Let us consider a hard decision channel where maximum likelihood decoding is equivalent to nearest neighbor decoding (e.g. a BSC). A very important parameter in determining the effectiveness of a code and decoding algorithm is the codeword error rate. Simply put, this is just the percentage of transmitted codewords that are decoded incorrectly. For block codes using a nearest neighbor decoding algorithm, the following theorem and corollary state the importance of the minimum distance of the code in regard to the codeword error rate.
Theorem 5.2.1 [49, amongst many others] Let a block code, C, have minimum distance d. Let t = ⌊(d − 1)/2⌋. Then the code can detect up to d − 1 errors and it can correct up to t errors.

Proof: Let x be the sent codeword and r = x + e be the received word. An error will be undetected if r = x1 ∈ C (x1 ≠ x). If fewer than d errors occur, then wt e < d and hence, by the triangle inequality (Hamming distance defines a metric!), r cannot be another codeword. Hence the error will be detected.
Consider the vector space F^n and our code C ⊆ F^n. For each codeword we may define a ball of radius ρ for any ρ ≥ 0. Simply, these balls contain all vectors in F^n whose distance from the center (codeword) is less than or equal to ρ. When ρ ≤ t, it is clear that the balls are disjoint. Hence, when t or fewer errors occur, r will lie in exactly one of the balls. Nearest neighbor decoding specifies that we decode as the codeword in the center of that ball.
Corollary 5.2.2 The codeword error rate, Pe, for a rate k/n (binary) block code with minimum distance d (t = ⌊(d − 1)/2⌋) transmitted over a BSC with channel transition probability p is given by

Pe ≤ Σ_{i=t+1}^{n} (n choose i) p^i (1 − p)^{n−i}.
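Numerically, the bound is a one-line tail sum; `math.comb` supplies the binomial coefficients. (The length-7, distance-3 code below is an illustrative choice, not one from the text.)

```python
from math import comb

def word_error_bound(n, t, p):
    """Probability of more than t channel errors in a block of n bits:
    the right-hand side of the bound in Corollary 5.2.2."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(t + 1, n + 1))

# A d = 3 (t = 1) code of length 7 on a fairly clean channel:
print(word_error_bound(7, 1, 0.01))   # about 0.002
```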
Proof: The right side of the above inequality is just the probability of more than t errors occurring in any codeword. (The inequality can be explained by the fact that for most codes the spheres of radius t about each codeword do not cover the entire vector space F^n. If one of the uncovered vectors is received, the decoder may `luck' into a correct decoding.)
On the other hand, when a soft-decision decoding algorithm is employed, or for convolutional codes, the notion of codeword error is not as important as the bit error rate or bit error probability. Simply, this is just the probability of a message bit being decoded incorrectly, and is denoted by Pb. Similar formulas exist for computing (or bounding) this probability for the various channel models.
5.3 A Few Decoding Algorithms

We would like to address the second issue which arose in the discussion on computing the relative probabilities of codewords given a received word: namely, that it is infeasible to compute every such probability. We shall present some of the most fundamental decoding techniques to show how this problem can be overcome.
Syndrome decoding is a hard decision method for decoding block codes. We start with a parity check matrix, H, for the code. Then we compute the cosets of the code as an additive subgroup of F^n. For each coset, we choose a vector with minimal weight and call it the coset leader. For every vector x ∈ F^n, the vector xH^T is called the syndrome of the vector. It is true that all vectors in the same coset have the same syndrome. Since every vector c in our code satisfies cH^T = 0, all of the codewords lie in the coset whose leader is the all zero vector and whose syndrome is the all zero vector.
Given a received word r = c + e, where e is the error vector, we decode in the following way. Compute the syndrome of r, which equals the syndrome of e. This syndrome must equal one of the syndromes of the coset leaders. The error vector must be one of the vectors in that coset. The most likely error vector is the one with minimal weight, i.e. the coset leader. (If more than one minimal weight vector exists, choose one.) Subtract the coset leader from the received word to obtain the decoded word.
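For the binary [7, 4] Hamming code this table is tiny, and the syndrome even names the error position directly. The following is a standard textbook construction, sketched here for illustration:

```python
# Parity check matrix H of the [7,4] Hamming code: column j holds the
# binary digits of j+1, so a single-bit error in position j produces
# the syndrome "j+1".
H = [[(j + 1) >> b & 1 for j in range(7)] for b in range(3)]

def syndrome(v):
    return tuple(sum(row[j] * v[j] for j in range(7)) % 2 for row in H)

# Coset leader table: every nonzero syndrome comes from a unique
# single-bit error, which is automatically the minimum weight leader.
leaders = {syndrome([int(j == e) for j in range(7)]): e for e in range(7)}

def decode(r):
    s = syndrome(r)
    c = list(r)
    if s != (0, 0, 0):
        c[leaders[s]] ^= 1     # subtract the coset leader (over F2, add it)
    return c

codeword = [1, 1, 1, 0, 0, 0, 0]        # columns 1, 2, 3 satisfy 1 xor 2 xor 3 = 0
received = [1, 1, 1, 0, 1, 0, 0]        # one channel error in position 4
print(decode(received) == codeword)     # True
```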
This is a very nice scheme from a mathematical viewpoint, but even the task of storing the coset leaders and their syndromes can become quite costly and cumbersome for large codes. Some codes have even more efficient ways of syndrome decoding. Hamming codes and BCH codes [49] are good examples of codes for which the error vector can be computed algebraically rather than having to store the complete list of coset leaders. There is also the technique of majority logic decoding using orthogonal parity checks. We will explore this technique as it is used to decode convolutional codes in Section 5.4 and then proceed to relate this technique to our local description of convolutional codes in Section 5.5.
Soft-decision decoding of block codes is becoming increasingly popular. Gallager developed a soft-decision algorithm based roughly on majority logic decoding in the early Sixties [26, 27]. While his methods showed promise for large block sizes, the method was widely ignored, probably due to the lack of sufficient technology to implement the decoding practically. The method was rediscovered and expanded on by MacKay et al. [46, 47]. With improvements in the technology used to test and implement error correcting codes, the codes developed for use with these methods are comparable with the highly touted turbo codes in terms of bit error rate performance [48, 67].
There are several prominent algorithms for the decoding of convolutional codes. Foremost among them is the Viterbi decoding algorithm. Sequential decoding, with its many variations, is another popular method. The reader is referred to [43, 40] for an excellent treatment of these `classical' topics. More recently, soft-decision decoding techniques have gained some attention since they can be used quite effectively in the decoding of turbo codes. Some of these include the BCJR algorithm [9] and various soft-output Viterbi algorithms (SOVA) [13, 28]. We shall discuss some new algebraic ideas for decoding convolutional codes in Chapter 6.
5.4 Majority Logic Decoding of Convolutional Codes

This section will provide a brief background on majority logic decoding of convolutional codes. (The underlying syndrome decoding technique can, of course, be used for block codes.) In particular, definite decoding, feedback decoding and threshold decoding will be discussed. For a more thorough investigation of these topics the reader is referred to [50, 55, 63, 43, 40].
The general idea of majority logic decoding is to use the syndrome former matrix of the convolutional code to obtain parity checks on the bits of the codeword. These parity checks are then used in somewhat different ways by the three basic types of majority logic decoding. First, let us focus on how the parity checks are obtained.
The syndrome former matrix (see Section 3.2) is created with (2m + 1)n columns. This matrix defines a set of equations on each window of length (2m + 1)n bits of the received sequence. A special time interval is selected from the window and a set of orthogonal parity checks is obtained from this set of equations on this time interval. Simply put, a set of parity checks is orthogonal on a bit ui if each check involves ui and no other bit is checked by more than one of the checks. For definite decoding, the `middle' time interval of the parity parallelogram is selected, and a set of orthogonal parity checks is created for each of the k information bits of that time interval. For feedback and threshold decoding, only the middle and right hand bits of the `parallelogram' are used, thus resulting in the `parity triangle'. We will soon see a more accurate description of this process using the representations of Section 3.5.
For the hard decision techniques, namely definite and feedback decoding, the decoding process is straightforward. The received bits are given a hard decision estimate based on the channel model and are fed into the majority logic decoder. That is, the bits are fed into the orthogonal parity check equations and each parity check is evaluated as a 0 or a 1. For each information bit, if a majority of its syndromes are 1, it is assumed to be in error and (assuming a binary code) its hard decision channel estimate is reversed. On the other hand, if a majority of its orthogonal parity checks are 0, it is assumed to be correct and its hard decision channel estimate stands.
Now comes the main difference between definite and feedback decoding. In definite decoding, the decisions made by the majority logic decoder are not used to update the bits in future time intervals. In feedback decoding, the decision made by the majority logic decoder is used in future computations. In particular, if the bit ui is determined to be in error, then all the values of the orthogonal parity checks are changed accordingly. In both methods, the sequence of syndromes is shifted by one time interval to prepare for the next set of incoming bits (rather than redundantly recalculate the previous syndromes).
For threshold decoding, the decoder does not immediately make a hard decision estimate. Therefore the syndromes will not evaluate to 0 or 1 and, hence, the decoder decision cannot be based simply on the majority of the orthogonal parity checks. In general, the procedure is to obtain a real valued function on the orthogonal parity checks and declare a bit to be in error if the function exceeds some `threshold' value.
Let us review the basic ideas involved by deriving the parity parallelograms and triangles for the code in Example 3.5.2.

Example 5.4.1 The 22-column syndrome former matrix, in which the parity parallelogram (see Remark 3.5.3) is clearly visible, is:

[ 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0
  0 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
  0 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0
  0 0 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0
  0 0 0 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0
  0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 ]
The syndrome equations derived are as follows:

[ s_t     ]   [ 1 1 1 0 0 1 0 0 0 0 0 ]                                    [ y_t     ]
[ s_{t+1} ]   [ 0 1 1 1 0 0 1 0 0 0 0 ]                                    [ y_{t+1} ]
[ s_{t+2} ] = [ 0 0 1 1 1 0 0 1 0 0 0 ] [ u_{t−5} u_{t−4} ⋯ u_{t+5} ]^T  + [ y_{t+2} ]
[ s_{t+3} ]   [ 0 0 0 1 1 1 0 0 1 0 0 ]                                    [ y_{t+3} ]
[ s_{t+4} ]   [ 0 0 0 0 1 1 1 0 0 1 0 ]                                    [ y_{t+4} ]
[ s_{t+5} ]   [ 0 0 0 0 0 1 1 1 0 0 1 ]                                    [ y_{t+5} ]

Here, the syndromes s_t, s_{t+3} and s_{t+5} form a set of orthogonal parity checks on u_t. Thus a majority logic decoder can correct a single error in any of the 13 bits that are checked by the orthogonal parity checks.
On the other hand, the parity triangle for feedback and threshold decoding is given by:

[ s_t     ]   [ 1 0 0 0 0 0 ] [ u_t     ]   [ y_t     ]
[ s_{t+1} ]   [ 0 1 0 0 0 0 ] [ u_{t+1} ]   [ y_{t+1} ]
[ s_{t+2} ] = [ 0 0 1 0 0 0 ] [ u_{t+2} ] + [ y_{t+2} ]
[ s_{t+3} ]   [ 1 0 0 1 0 0 ] [ u_{t+3} ]   [ y_{t+3} ]
[ s_{t+4} ]   [ 1 1 0 0 1 0 ] [ u_{t+4} ]   [ y_{t+4} ]
[ s_{t+5} ]   [ 1 1 1 0 0 1 ] [ u_{t+5} ]   [ y_{t+5} ]

Here, the syndromes s_t, s_{t+3}, s_{t+4} and the check formed as the sum (s_{t+1} + s_{t+5}) are a set of orthogonal parity checks on u_t. Thus a majority logic decoder can correct up to two errors in any of the 11 bits that are checked by the orthogonal parity checks.
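The feedback-decoder arithmetic for this example can be traced in a few lines of Python (a simulation sketch; the parity generation below simply forces the error-free syndromes to zero, standing in for the actual encoder):

```python
T = [[1, 0, 0, 0, 0, 0],          # parity triangle of the example:
     [0, 1, 0, 0, 0, 0],          # row i gives the information-bit
     [0, 0, 1, 0, 0, 0],          # taps of syndrome s_{t+i}
     [1, 0, 0, 1, 0, 0],
     [1, 1, 0, 0, 1, 0],
     [1, 1, 1, 0, 0, 1]]

def syndromes(u, y):
    return [(sum(T[i][j] * u[j] for j in range(6)) + y[i]) % 2
            for i in range(6)]

def flip_u0(u, y):
    """Majority vote of the four parity checks orthogonal on u_t."""
    s = syndromes(u, y)
    checks = [s[0], s[3], s[4], (s[1] + s[5]) % 2]
    return sum(checks) > len(checks) // 2

u = [1, 0, 1, 1, 0, 1]                       # transmitted information bits
y = [sum(T[i][j] * u[j] for j in range(6)) % 2 for i in range(6)]  # parities

u_err = [u[0] ^ 1] + u[1:]                   # channel error in u_t
print(flip_u0(u_err, y))                     # True: u_t gets corrected
print(flip_u0([u[0], u[1] ^ 1] + u[2:], y))  # False: u_t is left alone
```

An error in u_t fires all four orthogonal checks, while an error in any other checked bit fires at most one of them, which is exactly why the majority vote works.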
5.5 Using the Local Description for Feedback Decoding

In the usual setting of convolutional codes the parity triangle is obtained directly from the syndrome former, as shown in the above examples. However, the local description offers valuable insight into the nature of feedback decoding. Instead of identifying the parity triangle as the right half of the parity parallelogram, it should rather be identified as the top portion of the matrix in (3.5.2). It is in this setting that the relationship between the state of the encoder and the syndromes is fully revealed.

To be exact, we take the top m + 1 rows of equation (3.5.2). The left hand side consists of the first m rows of the observability matrix, followed by a row of zeros, times the state at time t. The right hand side consists of the parity triangle times the sequence of inputs, plus the sequence of outputs.
Theorem 5.5.1 The updated syndromes for feedback decoding for times t through t + m are given by:

[ s_t       ]   [ C        ]         [ D           0   ⋯    0   0 ] [ ũ_t       ]   [ ỹ_t       ]
[ s_{t+1}   ]   [ CA       ]         [ CB          D   ⋯    0   0 ] [ ũ_{t+1}   ]   [ ỹ_{t+1}   ]
[    ⋮      ] = [ ⋮        ] x̂_t  +  [ ⋮               ⋱        ⋮ ] [    ⋮      ] + [    ⋮      ]
[ s_{t+m−1} ]   [ CA^{m−1} ]         [ CA^{m−2}B   ⋯   CB   D  0 ] [ ũ_{t+m−1} ]   [ ỹ_{t+m−1} ]
[ s_{t+m}   ]   [ 0        ]         [ X           X   X    X  D ] [ ũ_{t+m}   ]   [ ỹ_{t+m}   ]

where the last row depends on the matrix A as described in Proposition 3.5.1. (The ũ's and ỹ's are the received bits; x̂_t is the state of the `encoder' on the receiving end based on the decoded input bits up to time t − 1.)
Proof: The proof follows from equation (2.2.2):

[ C        ]         [ C        ]
[ CA       ]         [ CA       ]
[ CA^2     ]         [ CA^2     ]                              [ û_0     ]
[ ⋮        ] x̂_t  =  [ ⋮        ] [ A^{t−1}B   ⋯   AB   B ]    [ ⋮       ] ,
[ CA^{m−1} ]         [ CA^{m−1} ]                              [ û_{t−1} ]
[ 0        ]         [ 0        ]

where the û's are the decoded inputs from previous time intervals. Multiplying out the right hand side gives exactly the left half of the parity parallelogram. The result is now immediate.
Remark 5.5.2 The above theorem states that having no errors in the feedback at any given time is equivalent to knowing the correct state of the encoder at that time. Also, if the matrix A is nilpotent, then only the most recently decoded input bits are required, since the rest of the above matrix will be 0. In that case, we will get the usual parity parallelogram.
CHAPTER 6

ALGEBRAIC DECODING OF CONVOLUTIONAL CODES USING THE LOCAL DESCRIPTION

In this chapter we will describe an algebraic decoding scheme for convolutional codes that was first put forth in [56]. Some example codes for this algorithm will be presented. We will proceed to make significant improvements in this algorithm and discuss more applications, including the decoding of turbo codes.
6.1 A Basic Algorithm

We begin by restating the notion of nearest neighbor decoding for convolutional codes. This is then used to develop a basic algorithm for the algebraic decoding of convolutional codes.

Assume a code sequence {vt}_{t≥0} = {(yt; ut)}_{t≥0} was sent and the sequence

{v̂t}_{t≥0} = {(ŷt; ût)}_{t≥0}

has been received. The decoding problem then asks for the minimization of the error

error := min_{{vt}∈C} Σ_{t=0}^∞ dist(vt, v̂t) = min Σ_{t=0}^∞ [dist(ut, ût) + dist(yt, ŷt)].   (6.1.1)

If no transmission error occurs then {v̂t}_{t≥0} is a valid trajectory and the error value in (6.1.1) is zero. Let

(ft; et)_{t≥0} := (ŷt − yt; ût − ut)_{t≥0}   (6.1.2)

be the sequence of errors.
We consider the received codeword sequence {v̂t}_{t≥0} as being divided into time intervals of size T + 1. There is also a positive integer, γ, chosen for each code subject to the following considerations: γ should be less than T. In particular, it is desirable for ⌊(T + 1)/γ⌋ to be greater than 1. We will want (the columns of) the depth-γ observability matrix of (A, C) to be the generator matrix of a block code with sufficiently good distance. This matrix must be full rank, so γ must be at least the observability index of (A, C). It is easy to see that the distance of the code is a nondecreasing function of γ. However, a balance must be found, as we shall see, between the distance of this code and the desire to maximize (T + 1)/γ.
Algorithm: Assume initially that
\[
\hat u_{T-\theta+1}, \hat u_{T-\theta+2}, \ldots, \hat u_T \tag{6.1.3}
\]
has been correctly transmitted. From Proposition 2.2.4 it follows that
\[
\begin{pmatrix} y_{T-\theta+1} \\ y_{T-\theta+2} \\ \vdots \\ y_T \end{pmatrix} - \begin{pmatrix} D & 0 & \cdots & 0 \\ CB & D & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ CA^{\theta-2}B & CA^{\theta-3}B & \cdots & D \end{pmatrix} \begin{pmatrix} u_{T-\theta+1} \\ u_{T-\theta+2} \\ \vdots \\ u_T \end{pmatrix}
\]
is in the column space of the block code generated by the columns of
\[
\Omega_\theta(A, C) = \begin{pmatrix} C \\ CA \\ \vdots \\ CA^{\theta-1} \end{pmatrix}. \tag{6.1.4}
\]
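The local description underlying this step is easy to check numerically. The following sketch is my own illustration, not from the dissertation; the binary matrices are arbitrary example values. It simulates the recursion x_{t+1} = Ax_t + Bu_t, y_t = Cx_t + Du_t over F_2 and confirms that the stacked outputs equal the Toeplitz term in the inputs plus the observability term in the window's starting state.

```python
import numpy as np

def iso_window(A, B, C, D, x0, u, p=2):
    """Run the input-state-output recursion x_{t+1}=Ax_t+Bu_t, y_t=Cx_t+Du_t over F_p."""
    x, ys = x0.copy(), []
    for ut in u:
        ys.append((C @ x + D * ut) % p)
        x = (A @ x + B.flatten() * ut) % p
    return np.array(ys), x

def local_description(A, B, C, D, x0, u, p=2):
    """Stacked form: y_t = D u_t + sum_{i<t} CA^{t-1-i}B u_i + CA^t x0 (mod p)."""
    theta = len(u)
    Apow = [np.eye(len(A), dtype=int)]
    for _ in range(theta):
        Apow.append((A @ Apow[-1]) % p)
    y = []
    for t in range(theta):
        acc = D * u[t] + C @ Apow[t] @ x0                     # observability term CA^t x0
        acc += sum(int(C @ Apow[t - 1 - i] @ B) * u[i]        # Toeplitz part CA^{t-1-i}B u_i
                   for i in range(t))
        y.append(int(acc) % p)
    return np.array(y)

# arbitrary example over F_2 with delta = 3 (illustrative values, not from the text)
A = np.array([[0, 1, 0], [0, 0, 1], [1, 1, 0]])
B = np.array([[1], [0], [1]])
C = np.array([1, 0, 1])
D = 1
x0 = np.array([1, 0, 1])
u = [1, 0, 1, 1, 0, 1]

y_rec, _ = iso_window(A, B, C, D, x0, u)
y_loc = local_description(A, B, C, D, x0, u)
assert (y_rec.flatten() == y_loc).all()
```

The two computations agree for every input, since both expand the same state-space recursion.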
The larger the value of θ, the bigger, in general, the distance of this block code. Assume, now, that this block code has distance d_gen, and let t_gen = ⌊(d_gen − 1)/2⌋. Then, as long as there are t_gen or

up to t_par errors in the sequence û_0, …, û_T. Decoding errors made in this stage are called errors of TYPE II. If desired, the error sequence f_0, …, f_T can now be computed from the description of the code provided in Proposition 2.2.4, or in Proposition 2.2.5.
Let us now deal with the situation where the received sequence in (6.1.3) is not correct. Several things might happen then. First, it is possible that we cannot compute the state vector x_{T−θ+1} from identity (6.1.5), in which case we conclude that the sequence given in (6.1.3) was not correct.
It is possible that a wrong state vector x_{T−θ+1} is computed. This will lead to a wrong syndrome vector in (6.1.6), and the computed error sequence e_0, …, e_T will then have weight more than t_par. At that point, we conclude that there must be an error in the sequence in (6.1.3). (The other possibility is that the state has been correctly estimated, but there are actually more than t_par errors in û_0, …, û_T. In this case, the algorithm won't be able, at this stage, to decode correctly.) Finally, there might occur the situation where we compute (by chance) the correct state vector x_{T−θ+1} from identity (6.1.5). In this situation we will correctly find the error sequence e_0, …, e_T as well as the error sequence f_0, …, f_T, and in a later stage of the algorithm we will recognize that (6.1.3) was not correct. We will handle this case by saying only that it is `improbable' and affects at most θ inputs (if one is concerned about bit error rate).
Assume now that it has been recognized that (6.1.3) is not correct. In this case we replace the sequence (6.1.3) with the sequence
\[
\hat u_{T-2\theta+1}, \hat u_{T-2\theta+2}, \ldots, \hat u_{T-\theta} \tag{6.1.7}
\]
and we assume that this is a correctly transmitted sequence. Again it might turn out that (6.1.7) is not a correctly transmitted sequence. One iteratively proceeds until one finds a correct sequence
\[
\hat u_{T-h\theta+1}, \hat u_{T-h\theta+2}, \ldots, \hat u_{T-(h-1)\theta}. \tag{6.1.8}
\]
If we are unable to find such a correct sequence, then we say an error of TYPE III occurs. Such a sequence can be found with high probability, depending on the channel, and, more importantly, the values of T and θ. This sequence will then allow one to correctly compute the state vector x_{T−hθ+1} as well as the errors {e_t | 0 ≤ t ≤ T − hθ}. After having computed x_{T−hθ+1}, we reiterate the algorithm by attempting to compute the state vector x_{2T−(h+1)θ+1}, making the assumption that the sequence
\[
\hat u_{2T-(h+1)\theta+1}, \hat u_{2T-(h+1)\theta+2}, \ldots, \hat u_{2T-h\theta} \tag{6.1.9}
\]
has been transmitted correctly.
6.2 Some Classes of Binary Input-State-Output Convolutional Codes
We will present two ideas for how to select the matrices (A, B, C, D) so as to construct codes with properties desirable for this decoding scheme. BCH type codes over larger fields have already been constructed using these ideas in [61, 62, 59, 71]. We will examine `maximum distance separable' convolutional codes over larger fields in Chapter 7. Our construction here will focus on permutation matrices and matrices with large order over the binary field.
Definition 6.2.1
Let A be a nonsingular matrix with minimum polynomial p_A(s). Then the order, n, of the matrix A is equal to the order of the polynomial p_A(s), which is the smallest integer, n, such that p_A(s) divides s^n − 1. Equivalently, n is the smallest integer such that A^n = I. See [41], for example.
If A is of size δ × δ, then A is primitive (as is its minimum polynomial) if its order is the maximum, q^δ − 1 (over F_q).
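As a quick illustration (my own sketch, not part of the text), the order of a companion matrix can be computed directly from the definition by repeated multiplication over F_2. The polynomial s^4 + s + 1 is primitive over F_2, so its 4 × 4 companion matrix has the maximal order 2^4 − 1 = 15.

```python
import numpy as np

def companion(coeffs):
    """Companion matrix over F_2 of the monic polynomial
    s^d + c_{d-1}s^{d-1} + ... + c_0, with coeffs = [c_0, ..., c_{d-1}]."""
    d = len(coeffs)
    M = np.zeros((d, d), dtype=int)
    M[1:, :-1] = np.eye(d - 1, dtype=int)   # subdiagonal of ones
    M[:, -1] = coeffs                       # last column carries the coefficients
    return M

def matrix_order(A, p=2, limit=10**6):
    """Smallest n >= 1 with A^n = I over F_p (Definition 6.2.1)."""
    I = np.eye(len(A), dtype=int)
    P = A.copy()
    for n in range(1, limit + 1):
        if (P == I).all():
            return n
        P = (P @ A) % p
    raise ValueError("order not found within limit")

# s^4 + s + 1 is primitive over F_2
A = companion([1, 1, 0, 0])
print(matrix_order(A))  # 15
```

The same routine applied to the companion matrix of a non-primitive polynomial returns a smaller order, which is exactly the distinction the definition draws.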
We will restrict ourselves to convolutional codes of rate 1/2, so our matrices will have sizes: A is δ × δ, B is δ × 1, C is 1 × δ, and D will be a 1 × 1 matrix (usually the identity). Notice that we now have a bound on the size of T. Since we require Φ_{T+1}(A, B) to be a parity check matrix, if T ≥ n, the order of A, then the distance of that code will necessarily be 2, and hence will provide for no error correction.
This bound emphasizes the need to choose a matrix, A, with large order. At this point we can go in several directions. Obviously, if we wish to maximize order, we may choose A as a companion matrix of a primitive polynomial. Notice that in this case, the choice of B is irrelevant, so long as B is not the zero vector. However, a matrix with high order doesn't necessarily correspond to good distance. In fact, the parity check subcode is generated by the minimum polynomial of the matrix, so the distance is immediately upper bounded by the weight of this polynomial (and will decrease as T increases). We will make this notion more precise.
There is a one-to-one correspondence between polynomials of degree d and vectors of length d + 1, given easily by: a_0 + a_1 s + a_2 s^2 + … + a_d s^d ↔ [a_0, a_1, a_2, …, a_d]. Every codeword (here, codeword refers to any vector u such that [B AB ⋯ A^T B] u = 0, i.e. a codeword of the parity check subcode) is a multiple of the minimum polynomial, under the above correspondence.
Definition 6.2.2 The ideal of all polynomials generated by p_A(s) is denoted by (p_A(s)). Let the intersection of this ideal with the set of all polynomials of degree less than or equal to d be denoted by (p_A(s))_{≤d}.
Using these definitions we may reformulate the above discussion:
Proposition 6.2.3 For each value of T, the set of codewords for the parity check subcode is precisely (p_A(s))_{≤T}. The distance of this code is exactly the weight of the minimum weight nonzero polynomial in this set.
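Proposition 6.2.3 can be checked by brute force for small parameters. The sketch below is my own illustration, with p(s) = s^4 + s + 1 chosen arbitrarily; it enumerates all nonzero multiples of p(s) of degree at most T over F_2 and reports the minimum weight. Since a weight-2 binary polynomial divisible by a primitive polynomial of order 15 must have degree at least 15, the minimum stays at wt(p) = 3 for all T < 15.

```python
from itertools import product

def poly_mult_gf2(a, b):
    """Multiply two GF(2)[s] polynomials given as coefficient lists (low degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] ^= ai & bj
    return out

def min_weight_multiple(p, T):
    """Minimum Hamming weight over all nonzero multiples q(s)p(s) of degree <= T
    (the parity check subcode distance of Proposition 6.2.3)."""
    d = len(p) - 1
    best = None
    for q in product([0, 1], repeat=T - d + 1):   # all q(s) with deg q <= T - deg p
        if not any(q):
            continue
        w = sum(poly_mult_gf2(list(q), p))
        best = w if best is None else min(best, w)
    return best

pA = [1, 1, 0, 0, 1]                # p_A(s) = s^4 + s + 1, weight 3, order 15
print(min_weight_multiple(pA, 10))  # 3
```

At T = 15 the multiple s^15 + 1 enters the set and the distance drops to 2, matching the bound T < n discussed above.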
Hence, the order of a matrix is not the sole determining factor in the quality of the code. Nevertheless, there are enough good codes that randomly selecting primitive polynomials with a high weight will shortly lead to a good parity check subcode. Since the generator subcode also depends heavily on A and not so much on C, this code also tends to have good distance properties at the same time. An example of a code constructed using these ideas is found below in Example 6.3.2.
Although primitive matrices have, in general, good distance properties, the corresponding parity check subcode lacks any real structure. If we are willing to give up a little distance for some structure, we can quite possibly take advantage of some soft decision decoding techniques. Permutation matrices have the advantages of possessing excellent structure and having potentially large order. Most importantly, if we select a sparse column, with say, t ones, as our matrix B, then each column of Φ_{T+1}(A, B) will have exactly t ones.
6.3 Example Codes
Example 6.3.1 Let us construct a simple example. Let us choose a permutation which is the direct sum of cycles of length 3, 5, 7, and 11. For simplicity we choose the cycles: (2 3 1)(5 6 7 8 4)(10 11 12 13 14 15 9)(17 18 19 20 21 22 23 24 25 26 16). The resulting matrix is 26 × 26, and has order 3 · 5 · 7 · 11 = 1155. We choose as our B matrix the vector with all zeros except a 1 in rows 1, 4, 9, and 16. The corresponding matrix Φ_68(A, B) has distance 5. We choose T to be 67. For a randomly chosen C matrix with weight 13, the code generated by the columns of Ω_θ(A, C) has distance 3 when θ = 34. Hence, we may let T = 101 and θ = 34, so (T + 1)/θ = 3; we will have three intervals to work with for estimating the state.
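The order claim in Example 6.3.1 is easy to verify computationally: the order of a permutation matrix is the least common multiple of its cycle lengths. The following sketch (my own check, using the cycles from the example) builds the permutation on 26 points and computes its order directly.

```python
from math import gcd

def perm_from_cycles(cycles, n):
    """Build a permutation of {0,...,n-1} from 1-based cycles: each entry maps to its successor."""
    sigma = list(range(n))
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            sigma[a - 1] = b - 1
    return sigma

def perm_order(sigma):
    """Order of the permutation = lcm of cycle lengths, found by walking each cycle."""
    n, order = len(sigma), 1
    seen = [False] * n
    for i in range(n):
        if not seen[i]:
            length, j = 0, i
            while not seen[j]:
                seen[j] = True
                j = sigma[j]
                length += 1
            order = order * length // gcd(order, length)
    return order

cycles = [(2, 3, 1), (5, 6, 7, 8, 4), (10, 11, 12, 13, 14, 15, 9),
          (17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 16)]
sigma = perm_from_cycles(cycles, 26)
print(perm_order(sigma))  # 3 * 5 * 7 * 11 = 1155
```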
We now give error probabilities for this code assuming nearest neighbor decoding over a BSC for various values of the channel transition probability p.
Example 6.3.2 We now present a simple example code based on a primitive matrix. Let A be the companion matrix for the primitive polynomial s^20 + s^19 + s^18 + s^17 + s^14 + s^11 + s^10 + s^9 + s^8 + s^7 + s^4 + s^2 + 1 over F_2. Choose B and C as simply the first unit vector, and D = 1. It turns out that for θ = 33, the generator subcode has distance 5. Choosing T = 65 provides a parity check subcode with distance 5. The error probabilities for the code are presented in the following table:
We defer a rigorous analysis of the decoding properties of this decoding algorithm until further refinements are presented. This analysis appears in Section 6.7.
TABLE 6.2: ERROR PROBABILITIES FOR THE CODE OF EXAMPLE 6.3.2

p       TYPE I     TYPE II    TYPE III   Block Error
.03     .0756      .3178      .2549      .5301
.01     .0044      .0287      .0225      .0547
.001    .000005    .000044    .000034    .00008
make full use of this information. That is, the decoding algorithm will not find the optimal solution. The algorithm is designed to trade off some error correcting capability for reduced complexity.
Let us make these remarks more precise. The full information of the convolutional code is presented in the global description of Proposition 2.2.5. The choice to use an iterative decoding method already limits the amount of information available to the decoder to the information provided by the local description of Proposition 2.2.4.
Theorem 6.5.1 The code defined by the parity check matrix of (6.4.1) is larger than the code defined by the parity check matrix of (3.5.1), i.e.:
\[
\ker\left[\, M(A,B,C,D) \mid I \,\right] \supseteq \ker\left[\, \Phi(A,B) \mid \Phi(\tilde A, \tilde B) \,\right]
\]
Proof: This follows immediately by multiplying the left-hand matrix above by Φ(Ã, B̃) and applying Lemma 2.3.4.
The theorem states that the proposed state elimination algorithm of Section 6.4 does not make full use of the information in the local description. Hence, the algorithm will not perform optimally for an iterative decoding scheme. However, the parity check matrix involved is considerably smaller and hence offers the possibility of providing a reduced complexity for decoding.
6.6 Enhanced State Estimation
In Section 6.4, the issue of finding a correct state in the algebraic decoding algorithm was sidestepped by `canceling' the state from the equations. However, the resulting parity check matrix was less than ideal for error correction. Our new approach is to increase the number of available `windows' that can be used to estimate a correct state. The idea is to make direct use of Lemmas 2.3.3 and 2.3.4. Namely, by multiplying equation (6.1.5) by M(Ã, B̃, C̃, D̃) we arrive at the following:
\[
\begin{bmatrix} \tilde D & 0 & \cdots & 0 \\ \tilde C \tilde B & \tilde D & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ \tilde C \tilde A^{\theta-2}\tilde B & \tilde C \tilde A^{\theta-3}\tilde B & \cdots & \tilde D \end{bmatrix} \begin{bmatrix} y_{T-\theta+1} \\ y_{T-\theta+2} \\ \vdots \\ y_T \end{bmatrix} - \begin{bmatrix} \hat u_{T-\theta+1} \\ \hat u_{T-\theta+2} \\ \vdots \\ \hat u_T \end{bmatrix} = \begin{bmatrix} \tilde C \\ \tilde C \tilde A \\ \vdots \\ \tilde C \tilde A^{\theta-1} \end{bmatrix} x_{T-\theta+1}. \tag{6.6.2}
\]
From this equation it is clear that we may now examine any outputs to estimate the state x_{T−θ+1}. Of course, when we do this, the matrix Ω_θ(Ã, C̃) will be used as the generator matrix for a linear block code, so it is desirable that this code have a large distance.
Taking this idea even further, it is possible to decode the first T + 1 time intervals using the received outputs. This follows from the output-state-input analog of (6.1.6), namely:
\[
\begin{bmatrix} \tilde A^{T}\tilde B & \cdots & \tilde B \end{bmatrix} \begin{pmatrix} \hat y_0 \\ \vdots \\ \hat y_T \end{pmatrix} = x_{T+1}. \tag{6.6.3}
\]
This idea depends on Φ_{T+1}(Ã, B̃) being the parity check matrix of a linear block code with good distance. Once the outputs are known, it is a simple task to produce the input (information) vector.
6.7 Analysis of the Enhanced Algorithm
Let us consider the decoding performance of the above algorithm given a rate k/n convolutional code, (A, B, C, D), with complexity δ, transmitted over a q-ary symmetric channel with transition probability p.
Denote the minimum distances of the parity check codes given by Φ_{T+1}(A, B) and Φ_{T+1}(Ã, B̃) (for some fixed T and θ) by d_par and d̃_par respectively. Similarly, denote the distances of the block codes generated by Ω_θ(A, C) and Ω_θ(Ã, C̃) by d_gen and d̃_gen respectively. Finally, let I = ⌊(T + 1)/θ⌋, the number of distinct intervals from which to make an estimate of the state.
The following lemma provides an upper bound on the probability that there are no θ consecutive correctly received inputs or outputs from which to estimate the state (i.e. the probability of Type III errors).
Lemma 6.7.1
\[
\begin{aligned}
Pr(\text{no error in the inputs of an interval}) &= (1-p)^{k\theta} \\
Pr(\text{no error in the outputs of an interval}) &= (1-p)^{(n-k)\theta} \\
Pr(\text{all } I \text{ input intervals are corrupt}) &= \left[1-(1-p)^{k\theta}\right]^I \\
Pr(\text{all } I \text{ output intervals are corrupt}) &= \left[1-(1-p)^{(n-k)\theta}\right]^I \\
Pr(\text{Type III error}) &\le \left[1-(1-p)^{k\theta}\right]^I \left[1-(1-p)^{(n-k)\theta}\right]^I
\end{aligned}
\]
Remark 6.7.4 If the distance of the generator subcodes was not a factor, we would be able to significantly reduce the probability of Type III errors. We have only considered `solving' (6.1.5) for all the inputs or all the outputs. In general, simple row operations would enable us to solve for either the input or output of each time unit. That would increase the number of possible `windows' in each interval from 2 up to 2^θ. Theoretically this would reduce the probability of Type III errors to virtually 0. Besides the complexity issues associated with this approach, the corresponding generator subcodes would change with each arrangement. It would be an impossible task to design a code with all 2^θ of these codes possessing good distance properties. It is extraordinary that we are able to find codes with good generator subcode distance properties for just our two chosen arrangements. Such codes will be presented in Section 7.1.
Similarly we can compute the probability of Type II errors.
Lemma 6.7.5
\[
Pr(\text{Type II error}) = 1 - \sum_{i=0}^{t_{par}} \binom{T+1}{i} p^i (1-p)^{T+1-i}
\]
where t_par = ⌊(d_par − 1)/2⌋.
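The bounds in Lemmas 6.7.1 and 6.7.5 are easy to evaluate numerically. The sketch below is my own code; the exponents kθ and (n − k)θ follow the reconstructed statement of Lemma 6.7.1, and the parameter values are only illustrative (rate 1/2, θ = 34, T = 101 as in Example 6.3.1).

```python
from math import comb

def type3_bound(p, k, n, theta, I):
    """Pr(Type III) <= [1-(1-p)^{k*theta}]^I * [1-(1-p)^{(n-k)*theta}]^I (Lemma 6.7.1)."""
    in_corrupt = (1 - (1 - p) ** (k * theta)) ** I
    out_corrupt = (1 - (1 - p) ** ((n - k) * theta)) ** I
    return in_corrupt * out_corrupt

def type2_prob(p, T, d_par):
    """Pr(Type II) = 1 - sum_{i<=t_par} C(T+1, i) p^i (1-p)^{T+1-i} (Lemma 6.7.5)."""
    t_par = (d_par - 1) // 2
    return 1 - sum(comb(T + 1, i) * p ** i * (1 - p) ** (T + 1 - i)
                   for i in range(t_par + 1))

p = 0.01
print(type3_bound(p, k=1, n=2, theta=34, I=3))
print(type2_prob(p, T=101, d_par=5))
```

Both quantities vanish as p goes to 0 and increase with p, which matches the qualitative behavior of the tabulated error rates.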
Remark 6.7.6 In theory, it would be possible to use both the input and output sequences to avoid Type II errors. That is, if the received input sequence has too many errors then one could try to decode using the received output sequence using (6.6.3). The problem is that in many cases, it is not known when a decoding error occurs. Of course, there are many equations with which we can check our decoded inputs. This would be done at the expense of complexity. A simpler alternative might be to decode on both sides, i.e. use (6.1.6) and (6.6.3) simultaneously and compare the results. This would help affirm a correct decoding, but there would be no obvious `tiebreaker' if the two decoded sequences disagreed.
We now have the following obvious upper bound on the probability of decoding error in the first block. Furthermore, if we assume that we have the correct state at the start of the next block, then the same bound holds for that block. (If that state is incorrect, we may obtain it in the same way we obtained the state above; i.e., the error propagation effect inherent in this incremental decoding scheme can be limited by the use of our state estimation algorithm.)
Theorem 6.7.7 The probability of block error in the enhanced decoding algorithm is upper bounded by
\[
1 - [1 - Pr(\text{Type I})][1 - Pr(\text{Type II})][1 - Pr(\text{Type III})].
\]
Examples of codes suitable for this decoding scheme will be presented in Section 7.1.
6.8 An Algebraic Look at Decoding Turbo Codes
The basic idea in turbo decoding is to have two separate decoders working together to iteratively decode the codeword. The first decoder receives the first output stream, y, and makes a soft decision estimate for each bit using any of a number of proposed schemes for decoding an RSC. This estimate is passed along to the second decoder, which also receives the other output stream, ŷ. Using both streams of data, the second decoder makes an updated estimate of the codeword. This data is then sent back to the original decoder, which tries to improve its estimate based on this new data. This swapping of information between decoders is repeated until a `reliable' decoding decision is obtained.
Again, there is a choice as to which soft-decision decoding algorithm to use in the above process. The original work done in this area used a modified version of the BCJR decoding algorithm [7]. More recently the focus has shifted to a soft output Viterbi algorithm (SOVA) [28, 13]. Another approach is offered in [31]. Regardless of which approach is used, the basic principle is the same.
Decoding is based on the fact that several parity check matrices are available to us. The first two are immediate from the input-state-output representation and the fact that we require each valid N block to begin and end with both encoders in the all zero state. The last two follow from the output-state-input representation (see Proposition 2.3.1).
\[
\begin{aligned}
\Phi_N(A, B)\, u &= 0 & (6.8.1) \\
\Phi_N(A, B)\, S u &= 0 & (6.8.2) \\
\Phi_N(\tilde A, \tilde B)\, y &= 0 & (6.8.3) \\
\Phi_N(\tilde A, \tilde B)\, \hat y &= 0 & (6.8.4)
\end{aligned}
\]
Again, the key idea, to maximize the decoding potential of the turbo code, is the sharing of the soft decision estimates between the decoders.
Remark 6.8.1 Nowhere is the choice between hard-decision and soft-decision decoding so clear as here. We can see that if N is greater than the order of the matrix A (which is less than 2^δ), then the parity check matrices must have distance 2 and offer no hard-decision decoding ability. On the other hand, if N is less than or equal to the order of A, then the only possible choice for S, with the requirements we have imposed, is the identity matrix, and our turbo code becomes little more than a repetition code composed with an RSC. Thus, soft-decision decoding is the only viable choice.
CHAPTER 7
MDS CONVOLUTIONAL CODES
In this chapter the class of MDS convolutional codes given by a Reed-Solomon type construction will be examined. In particular, it will be shown that this class of codes is extremely well suited to the algebraic decoding techniques of Section 6.6. Then we will investigate the column distance function for this class of codes and compare them with theoretical limits.
7.1 Reed-Solomon Convolutional Codes
The natural question that arises from the description of the enhanced algebraic decoding algorithm of Section 6.6 is whether there exist codes which have good distance properties for each of the subcodes. Ideally, these subcodes would also possess some other properties which would aid in the decoding process. The answer to this question is yes, if one considers fields larger than F_2. Consider the following code construction which was presented in [65]; similar codes were presented in [34].
We will only consider rate 1/2. The extension to rate 1/n is simple. The existence of these codes at higher rates is the content of [60]. Let us fix a complexity δ. Choose F_q so that q ≥ 3δ − 1. Find a primitive element, α, of F_q. Define the following matrices:
\[
A = \begin{bmatrix} \alpha & 0 & \cdots & 0 \\ 0 & \alpha^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \alpha^{\delta} \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}, \quad D = [1], \quad \text{and} \quad A' = \begin{bmatrix} \alpha^{\delta+1} & 0 & \cdots & 0 \\ 0 & \alpha^{\delta+2} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \alpha^{2\delta} \end{bmatrix}
\]
The definition of the matrix C requires some computation. We would like for Ã = A − BC to be similar to A'. This can be done by computing the characteristic polynomial of A − BC (with the entries of C denoted by variables), setting this polynomial equal to the characteristic polynomial of A', and solving for the entries of C.
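For small δ this computation can be done by brute force over F_q: enumerate candidate rows C and compare the characteristic polynomial of A − BC with that of A'. The sketch below is my own illustration, using δ = 2 and q = 7 with α = 3 (the parameters of Example 7.2.5 later in this chapter); it recovers C = [3 6].

```python
from itertools import product

def charpoly2(M, q):
    """Coefficients (trace, det) of the characteristic polynomial s^2 - tr(M)s + det(M)
    of a 2x2 matrix over F_q."""
    tr = (M[0][0] + M[1][1]) % q
    det = (M[0][0] * M[1][1] - M[0][1] * M[1][0]) % q
    return tr, det

q, alpha = 7, 3
A = [[alpha % q, 0], [0, alpha ** 2 % q]]          # diag(alpha, alpha^2) = diag(3, 2)
Ap = [[alpha ** 3 % q, 0], [0, alpha ** 4 % q]]    # diag(alpha^3, alpha^4) = diag(6, 4)
B = [1, 1]

target = charpoly2(Ap, q)
solutions = []
for c1, c2 in product(range(q), repeat=2):
    # A - BC with C = [c1 c2]
    ABC = [[(A[0][0] - B[0] * c1) % q, (-B[0] * c2) % q],
           [(-B[1] * c1) % q, (A[1][1] - B[1] * c2) % q]]
    if charpoly2(ABC, q) == target:
        solutions.append((c1, c2))
print(solutions)  # [(3, 6)]
```

Matching the two coefficients gives a linear system in the entries of C, so the solution is unique here.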
Definition 7.1.1 The codes defined above are called Reed-Solomon convolutional codes.
Theorem 7.1.2 (MDS Property of Reed-Solomon Codes)
1. The parity check subcodes defined by Φ_{T+1}(A, B) and Φ_{T+1}(Ã, B̃) of Reed-Solomon convolutional codes are maximum distance separable (MDS) codes and therefore have minimum distance δ + 1 for any T such that δ ≤ T ≤ q − 2.
2. The generator subcodes defined by Ω_θ(A, C) and Ω_θ(Ã, C̃) of Reed-Solomon convolutional codes are MDS block codes and therefore have minimum distance θ − δ + 1 for any θ such that δ ≤ θ ≤ q − 2.
Proof:
1. The matrix Φ_{T+1}(A, B) is a Vandermonde matrix, and hence every full-size minor is nonzero. Therefore, any vector in the kernel of this matrix (i.e. a codeword) must have weight at least δ + 1. On the other hand, since (Ã, B̃, C̃, D̃) also represents the same code and has the same (minimal) complexity, it must be true that (Ã, B̃) form a controllable pair. Since B̃ = B and Ã = A − BC = S⁻¹A'S for some invertible scalar matrix S, we conclude that (A', SB) form a controllable pair as well. It follows that SB has no zero entries. From this we conclude that Φ_{T+1}(A', SB) is a Vandermonde matrix. The result follows by noting that Φ_{T+1}(A', SB) = S Φ_{T+1}(Ã, B̃).
2. The result for the observability matrices will follow by the same line of reasoning as above once we can establish that the matrix C contains no zero entries. This is equivalent to saying that the pair (A, C), and hence the convolutional code, is observable. This can be seen as follows. The corresponding polynomial generator matrix representation of these codes is simply [p_A(s)  p_{A'}(s)], the characteristic polynomials of the matrices A and A'. Since these polynomials have distinct roots ({α, α², …, α^δ} and {α^{δ+1}, α^{δ+2}, …, α^{2δ}} respectively), the gcd of the full-size minors is trivial and hence the code is observable.
Remark 7.1.3 Not only do these codes possess excellent subcode distance properties, but the subcodes, being BCH codes (usually not technically Reed-Solomon), may be decoded efficiently via the Berlekamp-Massey algorithm.
Table 7.1 presents the probability of block error for many values of δ, q, θ, T and transition probability p when the enhanced decoding algorithm of Section 6.6 is used on a q-ary symmetric channel. Figure 7.1 graphs a few of these codes over a wide range of transition probabilities. The two charts show that, in general, there is not one code in this class that is optimal for all channel transition probabilities. In particular, it is apparent that lower complexity codes perform better on noisier channels, while higher complexity codes achieve better error rates on better quality channels.
FIGURE 7.1: Plot of block error probabilities for selected Reed-Solomon convolutional codes (q=13, δ=4, θ=8, T+1=16; q=32, δ=10, θ=16, T+1=32; q=47, δ=15, θ=23, T+1=69; q=128, δ=42, θ=50, T+1=150).
denote the γ-th truncation. Then the column distance function (CDF) of order γ is denoted by d_γ and defined as
\[
\begin{aligned}
d_\gamma &= \min\{\operatorname{dist}([v]_\gamma, [\tilde v]_\gamma) : [v]_0 \ne [\tilde v]_0\} \\
&= \min\{\operatorname{wt}([v]_\gamma) : [v]_0 \ne 0\}.
\end{aligned}
\]
For the special case of when γ = m (i.e. over one constraint length), d_m is called the minimum distance, d_min, of the convolutional code.
It is clear that the CDF is a nondecreasing function of γ, and that for all γ it must be that d_γ ≤ d_free. This distance parameter is important for many decoding schemes, including sequential decoding and majority logic decoding.
We now state the conjecture which we will discuss for the remainder of this chapter.
Conjecture 7.2.2 For a rate 1/2 Reed-Solomon convolutional code with complexity δ, and for δ − 1 ≤ γ ≤ 3δ − 1,
\[
d_\gamma \ge \gamma - \delta + 3.
\]
In particular, for γ = 3δ − 1, we have d_{3δ−1} = 2δ + 2.
Remark 7.2.3 It was shown in [65] that the free distance of these codes is 2δ + 2. So we can restate the above conjecture in these terms: if at any time, τ, two codewords are in the same state, x_τ, and do not agree at time τ + 1, then over the interval (τ, τ + 1, …, τ + 3δ − 1) the codewords must differ by the full free distance, 2δ + 2, of the code. This would be an important result since the full `decoding power' of the code would be contained in any interval of length 3δ.
There is a wealth of empirical evidence to support this conjecture, yet neither a proof nor a counterexample has been discovered. We offer several example codes, for small δ, which satisfy the properties of the conjecture.
Example 7.2.4 For δ = 1 and F_4, let α ∈ F_4 be a root of s² + s + 1. Define the Reed-Solomon convolutional code using the following matrices:
\[
A = [\alpha], \quad A' = [\alpha + 1], \quad B = [1], \quad C = [1], \quad D = [1]
\]
One can easily verify by hand that this code has d_2 = 4.
Example 7.2.5 For δ = 2 and F_7, choose α = 3 for the primitive element. Then the defining matrices of our code are given by:
\[
A = \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix}, \quad A' = \begin{bmatrix} 6 & 0 \\ 0 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad C = \begin{bmatrix} 3 & 6 \end{bmatrix}, \quad D = [1]
\]
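The column distance function of such a small code can be computed by exhaustive search, which is handy for checking Conjecture 7.2.2 on concrete examples. This sketch is my own code, using the δ = 2, F_7 matrices above; it enumerates all inputs with u_0 ≠ 0 starting from the zero state and returns d_γ. For this code it gives d_0 = 2 and d_2 = 3, consistent with the bound d_γ ≥ γ − δ + 3.

```python
from itertools import product

q = 7
A = [[3, 0], [0, 2]]   # diagonal, as in Example 7.2.5
B = [1, 1]
C = [3, 6]
D = 1

def weight_of(u):
    """Weight of the partial codeword (u_0, y_0), ..., (u_g, y_g) from the zero state."""
    x, w = [0, 0], 0
    for ut in u:
        yt = (C[0] * x[0] + C[1] * x[1] + D * ut) % q
        w += (ut != 0) + (yt != 0)
        # state update; A is diagonal here, so only the diagonal entries appear
        x = [(A[0][0] * x[0] + B[0] * ut) % q,
             (A[1][1] * x[1] + B[1] * ut) % q]
    return w

def column_distance(gamma):
    """d_gamma = min weight over all inputs (u_0, ..., u_gamma) with u_0 != 0."""
    return min(weight_of((u0,) + rest)
               for u0 in range(1, q)
               for rest in product(range(q), repeat=gamma))

print([column_distance(g) for g in range(3)])  # [2, 3, 3]
```

The enumeration grows as (q − 1)q^γ, so this is only practical for small γ, but that is exactly the regime the conjecture concerns.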
This implies either the input sequence or the output sequence has weight less than δ + 1. Since these codes have symmetric distance properties for the input-state-output and output-state-input representations, we can assume without loss of generality that the input sequence has weight less than δ + 1. In particular, this implies that any counterexample cannot be a complete codeword, since at least δ + 1 nonzero inputs are required to bring the underlying linear system into the all zero state once it has left (since Φ(A, B) has distance δ + 1).
Consider the special case when only the first w inputs are allowed to be nonzero and all other inputs are zero. Then the input is 1, u_1, u_2, …, u_{w−1}, 0, …, 0, and the local description of the code becomes:
\[
\begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_{w-1} \\ y_w \\ y_{w+1} \\ \vdots \\ y_{\gamma} \end{bmatrix} = \begin{bmatrix} D & 0 & \cdots & 0 \\ CB & D & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ CA^{w-2}B & \cdots & CB & D \\ CA^{w-1}B & \cdots & CAB & CB \\ CA^{w}B & \cdots & CA^{2}B & CAB \\ \vdots & & \vdots & \vdots \\ CA^{\gamma-1}B & \cdots & CA^{\gamma-w+1}B & CA^{\gamma-w}B \end{bmatrix} \begin{bmatrix} 1 \\ u_1 \\ \vdots \\ u_{w-1} \end{bmatrix}
\]
We will obtain a lower bound on the weights of the last γ − w + 1 outputs y_w, …, y_γ by rearranging the bottom half of the matrix on the right side of the above equation as follows:
\[
\begin{bmatrix} y_w \\ y_{w+1} \\ \vdots \\ y_{\gamma} \end{bmatrix} = \begin{bmatrix} B^T \\ B^T A \\ \vdots \\ B^T A^{\gamma-w} \end{bmatrix} \begin{bmatrix} c_1 & & \\ & \ddots & \\ & & c_{\delta} \end{bmatrix} \begin{bmatrix} A^{w-1}B & \cdots & AB & B \end{bmatrix} \begin{bmatrix} 1 \\ u_1 \\ \vdots \\ u_{w-1} \end{bmatrix}
\]
where c_1, …, c_δ denote the entries of C. This equation says that the last γ − w + 1 outputs form a codeword of a code generated by a Vandermonde matrix. This code has distance γ − δ − w + 2. Hence, if the input to this matrix is not the all zero vector, then the last γ − w + 1 outputs must have weight at least γ − δ − w + 2.
Let us analyze the input to this matrix to ensure that the all zero vector is not the input. First, since all the c_i are nonzero, the second matrix on the right side in the above equation is nonsingular. All that remains is to show that
\[
\begin{bmatrix} A^{w-1}B & \cdots & AB & B \end{bmatrix} \begin{bmatrix} 1 & u_1 & \cdots & u_{w-1} \end{bmatrix}^T
\]
is nonzero. However, from (2.2.2), we know that this expression is exactly the state at time w, x_w. Remark 7.3.1 specifically excludes the possibility that we reenter the all zero state. Hence, the input into the Vandermonde matrix is not the all zero vector, and we have proven the following lemma.
Lemma 7.3.2 Let δ ≤ γ ≤ 3δ − 1 and w ≤ δ. Then when the input is of the form 1, u_1, u_2, …, u_{w−1}, 0, …, 0, the outputs y_w, y_{w+1}, …, y_γ have weight at least γ − δ − w + 2.
For the special case when the first w inputs are nonzero, the conjectured property must hold. This is true since the inputs have weight w and the outputs have weight at least γ − δ − w + 3 (since y_0 ≠ 0). This lemma will be a valuable tool for use in analyzing the conjecture.
Let us consider an alternate approach based on induction on γ. This approach will further clarify some of the properties of these codes and will serve to produce a conjecture for the exact column distance function of these codes, rather than the previously stated lower bound conjecture.
There are 2γ + 2 bits in each partial codeword we are considering. The conjecture is that at least γ − δ + 3 of these are nonzero. We know that u_0 and y_0 are nonzero, so γ − δ + 1 of the remaining 2γ bits must be nonzero for the conjecture to hold. This means that at most γ + δ − 1 of these bits are zero. In this situation there are at least δ − 1 time units in which both the input and output bits are zero. Let us consider a time unit, τ, where this is the case. From the input-state-output representation, it follows that Cx_τ = 0, hence x_τ ∈ ker C.
If there exists a counterexample to the conjecture, then there must be at least δ time units in which the input and output bits are zero. This corresponds to δ states in the kernel of C. Since ker C has dimension δ − 1, it may be possible to prove the conjecture by proving some sort of linear independence property of the states.
Although the above argument has not led to a proof of the conjecture, it has led us to a plausible conjecture for the actual column distance function of these codes. We begin with a definition.
Definition 7.3.3 For any given input sequence u of length γ + 1 (with u_0 ≠ 0), denote by K_{γ,u} the number of time units in which both the input and output bits are zero. Denote by D_{γ,u} the number of time units in which both the input and output bits are nonzero.
The following proposition is clear.
Proposition 7.3.4 The weight of the partial codeword corresponding to the input sequence u is given by γ + 1 + D_{γ,u} − K_{γ,u}. The minimum such weight over all appropriate input sequences is d_γ.
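Proposition 7.3.4 is a pure counting identity: each of the γ + 1 time units contributes 2 to the weight if both bits are nonzero, 0 if both are zero, and 1 otherwise. The sketch below is my own check, reusing the δ = 2, F_7 code of Example 7.2.5; it verifies weight = γ + 1 + D_{γ,u} − K_{γ,u} for every admissible input.

```python
from itertools import product

q = 7
A = [[3, 0], [0, 2]]   # diagonal, as in Example 7.2.5
B = [1, 1]
C = [3, 6]
D = 1

def io_pairs(u):
    """Return the list of (u_t, y_t) pairs generated from the zero initial state."""
    x, pairs = [0, 0], []
    for ut in u:
        yt = (C[0] * x[0] + C[1] * x[1] + D * ut) % q
        pairs.append((ut, yt))
        x = [(A[0][0] * x[0] + B[0] * ut) % q,
             (A[1][1] * x[1] + B[1] * ut) % q]
    return pairs

gamma = 2
ok = True
for u0 in range(1, q):
    for rest in product(range(q), repeat=gamma):
        pairs = io_pairs((u0,) + rest)
        K = sum(ut == 0 and yt == 0 for ut, yt in pairs)   # K_{gamma,u}
        Dd = sum(ut != 0 and yt != 0 for ut, yt in pairs)  # D_{gamma,u}
        weight = sum((ut != 0) + (yt != 0) for ut, yt in pairs)
        ok = ok and (weight == gamma + 1 + Dd - K)
print(ok)  # True
```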
Definition 7.3.5 Consider all input sequences of the form (1, u_1, u_2, …, u_γ), and the corresponding sequence of states in the partial codeword each input sequence generates. For each such sequence, consider the dimension of the space spanned by the states that are in ker C. Denote by N_{γ,w} the maximum such dimension over all the input sequences of weight w.
Conjecture 7.3.6
\[
N_{\gamma,w} = \max_{\{u \,:\, \operatorname{wt}(u) = w\}} \left( K_{\gamma,u} - D_{\gamma,u} + 1 \right)
\]
Remark 7.3.7 The above conjecture, when combined with Proposition 7.3.4, states that d_γ is dependent on the number of states in a code sequence that are in ker C. That fact is trivial. What the conjecture intends to point out is that it requires nonzero inputs to drive the system into a state which facilitates low output; i.e., you cannot reduce output weight without increasing input weight, and conversely. Since N_{γ,w} is at most δ − 1, we see that Conjecture 7.3.6 combined with Proposition 7.3.4 reduces to Conjecture 7.2.2. Computation of the exact values of N_{γ,w}, as difficult as that may be, leads to the following conjectured exact value for d_γ.
Conjecture 7.3.8 Define:
\[
N_\gamma = \max_{1 \le w \le \delta} N_{\gamma,w}
\]
Then Proposition 7.3.4 and Conjecture 7.3.6 combine to state:
\[
d_\gamma = \gamma + 2 - N_\gamma
\]
Let us consider the case when γ = 4. We wish to show that d_4 ≥ 5. Here it suffices to consider input sequences of weight at most 2. An argument identical to that in the above proposition proves that any weight 1 input sequence must result in an output sequence of weight at least 4. So we need only consider the case when the input sequence has weight exactly 2. This means we let u_0 = 1 and u_i ≠ 0 for exactly one i such that 1 ≤ i ≤ 4.
The case when u_1 ≠ 0 is easily handled by Lemma 7.3.2. The case when u_4 ≠ 0 is proven by combining the result of Proposition 7.4.1 for γ = 3 with the fact that u_4 ≠ 0 to conclude d_4 ≥ 5.
Now consider the case when u_2 ≠ 0. We must show that at least 2 of {y_1, y_2, y_3, y_4} are nonzero. These outputs are given by:
\[
\begin{aligned}
y_1 &= CB \\
y_2 &= CAB + u_2 \\
y_3 &= CA^2B + CBu_2 \\
y_4 &= CA^3B + CABu_2
\end{aligned}
\]
We know that at most one of {CB, CAB, CA²B, CA³B} can be zero. We will treat each of these possibilities separately, starting with the case when all are nonzero. In this situation y_1 ≠ 0 and at least one of {y_3, y_4} must be nonzero because of MDS arguments similar to those of Lemma 7.3.2, and the desired result is obtained. When CA³B = 0 it must be that y_1 ≠ 0 and y_4 ≠ 0. When CA²B = 0 it must be that y_1 ≠ 0 and y_3 ≠ 0. Also, when CAB = 0 it is true that y_1 ≠ 0 and y_2 ≠ 0. The only case that remains is when CB = 0. If this is the case then y_3 ≠ 0. Then either y_2 or y_4 must be nonzero unless CA³B − (CAB)² = 0. However, we also have
\[
\tilde C \tilde A^3 \tilde B = -CA^3B + 2CA^2B \cdot CB + (CAB)^2 - 3CAB(CB)^2 + (CB)^4
\]
Taking into account that CB = 0 gives
\[
\tilde C \tilde A^3 \tilde B = (CAB)^2 - CA^3B
\]
Hence, CA³B − (CAB)² = 0 implies that C̃Ã³B̃ = 0, which is impossible since C̃B̃ = −CB = 0. This finishes the case for when u_2 ≠ 0.
The last case we need to consider is when u_3 ≠ 0. Again we must show that at least 2 of {y_1, y_2, y_3, y_4} are nonzero. These outputs are given by:
\[
\begin{aligned}
y_1 &= CB \\
y_2 &= CAB \\
y_3 &= CA^2B + u_3 \\
y_4 &= CA^3B + CBu_3
\end{aligned}
\]
The cases when CB = 0, CA²B = 0, CA³B = 0, and when none are zero are easily treated in a fashion similar to the case when u_2 ≠ 0. The only remaining case is when CAB = 0. Here, y_1 ≠ 0 and either y_3 or y_4 is nonzero unless CA³B − CB·CA²B = 0. There is no known argument which excludes this possibility, nor are there any known examples of when this situation arises. In any event, this case is rare and any code can easily be checked against these conditions. This results in the following proposition.
Proposition 7.4.2 When δ = 2, d_4 is `generically' at least 5. In particular, d_4 ≥ 5 unless CAB = 0 and CA³B − CB·CA²B = 0.
We wrap up our discussion of the δ = 2 case by considering d_5. Conjecture 7.2.2 claims this
should be 6. This result has not been explicitly proven, although it can be shown that it holds
`generically', and the set of equations governing this situation can be computed explicitly as in the
above case for γ = 4. However, since the hard decision error correcting capability remains the same
if we show the distance is 5 instead of the full 6, we will content ourselves with showing the much
easier result that d_5 ≥ 5.
Theorem 7.4.3 For rate 1/2 Reed-Solomon convolutional codes with δ = 2, we have d_5 ≥ 5.
Proof: Again it suffices to consider input sequences of weight at most 2. Also, the input
sequences of weight 1 are again seen to generate output sequences of weight at least 5 (1 more
than we need), so we need consider only the weight 2 input sequences. The cases when u_1 ≠ 0 and
u_5 ≠ 0 are easily taken care of by Lemma 7.3.2 and Proposition 7.4.2 respectively.
For u_2 ≠ 0, we have

y_1 = CB
y_2 = CAB + u_2
y_3 = CA^2B + CBu_2
y_4 = CA^3B + CABu_2
y_5 = CA^4B + CA^2Bu_2

The usual arguments show that at least 2 of {y_3, y_4, y_5} are nonzero. Since u_0, y_0 and u_2 are already
nonzero, the result follows.
Similarly, for u_4 ≠ 0, we have

y_1 = CB
y_2 = CAB
y_3 = CA^2B
y_4 = CA^3B + u_4
y_5 = CA^4B + CBu_4

Again, at least 2 of {y_1, y_2, y_3} must be nonzero.
Finally, for the case when u_3 ≠ 0 we have

y_1 = CB
y_2 = CAB
y_3 = CA^2B + u_3
y_4 = CA^3B + CBu_3
y_5 = CA^4B + CABu_3

At least one of {y_1, y_2} must be nonzero and at least one of {y_4, y_5} must be nonzero.
This completes the proof.
Remark 7.4.4 Although a general proof of the conjectures remains elusive, the cases for small δ
can be treated similarly to the case δ = 2 above. These small cases further support the conjecture,
but since they increase factorially in complexity, it is difficult to proceed much further by hand.
7.5 The Road Ahead and a Stronger Conjecture
We conclude with a discussion of the importance of the conjectures of this chapter by expanding
upon Remark 7.2.3. We will also introduce and discuss a stronger conjecture regarding the existence
of rate 1/2 convolutional codes with the smallest possible column distance function order necessary
to achieve the free distance.
If Conjecture 7.2.2 holds true, then it is possible, at least theoretically, to decode any block of
length 3δ of a received word as long as at most δ errors occurred. That is, we can decode if at
most δ of any 6δ consecutive bits (including inputs and outputs) are in error, regardless of the
positions in which the errors arise.
On the one hand, we may compare this to the algebraic decoding algorithm of Chapter 6. In that
algorithm it was critically important where the errors occurred. At most t_par errors could
occur in the (T − γ)-length sequence of inputs; at most t_gen errors could occur in the γ-length
sequence of outputs. In addition, it is necessary for an entire γ-length sequence of inputs to be
error free. The enhanced decoding algorithm of Section 6.6 allows a fraction more freedom in
where the errors can occur, but the basic restraints are still in place. Using the Reed-Solomon
codes of this chapter as an example, we see that since γ = 3δ/2 (see Table 7.1) we require at most
δ/2 errors in the last 3δ bits (and all of them either inputs or outputs). Further, at most δ/2 errors
can occur in the first T − 5δ/2 inputs. Altogether this amounts to correcting about δ errors
in a sequence of 11δ/2 bits. This is slightly better than the conjectured result until one accounts
for the `structured error' requirements imposed by the algorithm.
On the other hand, we may compare the conjectured results with the theoretical limits of the
CDF.
Theorem 7.5.1 For a rate 1/2 convolutional code with complexity δ and d_free = 2δ + 2, the
smallest possible γ such that d_γ = d_free is 2δ.
Proof: Assume there is some γ < 2δ such that d_γ = d_free. Let the code be given by the
polynomial encoder matrix [p(s) q(s)]. We can assume that p(0) ≠ 0, hence p(s) divides s^l − 1 for any
l divisible by the order of p(s). Given some such l that is also greater than γ, there is a
polynomial u(s) such that u(s)p(s) = s^l − 1. Consider the first γ + 1 time units of the codeword
u(s)[p(s) q(s)]. The first polynomial has only the degree 0 term as being nonzero (since l > γ + 1).
Even if every coefficient of u(s)q(s) (inside the first γ + 1 time units) is nonzero, the weight of this
partial codeword is at most γ + 2 < 2δ + 2 = d_free. From this it is clear that the smallest possible
γ is 2δ.
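The mechanics of this proof are easy to reproduce numerically. The sketch below works over F_11 with an illustrative choice of p(s) (the first polynomial of Example 7.5.2, used only as sample data): it finds the order l of s modulo p(s), forms u(s) = (s^l − 1)/p(s), and checks that u(s)p(s) contributes a single nonzero coefficient inside the first l time units.

```python
# Sketch of the proof mechanics of Theorem 7.5.1 over F_p: for p(s) with
# p(0) != 0, s has finite multiplicative order l in F_p[s]/(p(s)), so
# u(s) = (s^l - 1)/p(s) satisfies u(s)p(s) = s^l - 1, whose coefficients
# below degree l reduce to the single constant term -1.
p = 11
poly = [4, 10, 1]  # p(s) = s^2 + 10s + 4, ascending; illustrative choice

def poly_divmod(num, den):
    # quotient and remainder of num/den in F_p[s], ascending coefficients
    num = num[:]
    q = [0] * max(len(num) - len(den) + 1, 0)
    inv = pow(den[-1], p - 2, p)  # inverse of the leading coefficient
    for i in range(len(q) - 1, -1, -1):
        q[i] = num[i + len(den) - 1] * inv % p
        for j, d in enumerate(den):
            num[i + j] = (num[i + j] - q[i] * d) % p
    return q, num[:len(den) - 1]

def order_of_s(den):
    # smallest l >= 1 with s^l = 1 (mod den); requires den(0) != 0
    l = 1
    while True:
        s_l_minus_1 = [p - 1] + [0] * (l - 1) + [1]  # s^l - 1
        _, r = poly_divmod(s_l_minus_1, den)
        if all(c == 0 for c in r):
            return l
        l += 1

l = order_of_s(poly)
u, r = poly_divmod([p - 1] + [0] * (l - 1) + [1], poly)  # (s^l - 1)/p(s)
assert all(c == 0 for c in r)  # the division is exact
# within the first l time units, u(s)p(s) has a single nonzero coefficient
first = [sum(u[k] * poly[i - k] for k in range(len(u)) if 0 <= i - k < len(poly)) % p
         for i in range(l)]
assert sum(c != 0 for c in first) == 1
```

Note u(0) ≠ 0 comes for free, since u(0)p(0) = −1.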
The above theorem states that, at best, we can expect to decode up to δ errors for every 4δ
bits. This is not surprising, since this is the MDS bound for rate 1/2 block codes. Finding codes
which satisfy this property is not an easy task. In general, the field size may need to be very large.
When δ = 1, we have that 2δ = 3δ − 1, so that Reed-Solomon convolutional codes actually satisfy
this property. For larger δ, Reed-Solomon codes do not necessarily have this property. Based on
empirical evidence, very few, if any, Reed-Solomon convolutional codes for δ > 1 have this property.
None of the examples provided earlier in this chapter satisfy this stronger distance condition.
For δ > 1 it is not clear that there even exist convolutional codes which satisfy this stronger
condition. The following is one example for δ = 2.
Example 7.5.2 The convolutional code over F_11 generated by

[s^2 + 10s + 4    s^2 + s + 9]

has d_free = d_4 = 6.
Larger examples are much more difficult to come by. Conjecture 7.3.8 should offer a reasonable
starting place to look for more systematic means of discovering these codes. We will instead look
to guarantee the existence of these codes.
Conjecture 7.5.3 For a large enough field size, given a complexity δ, there exists a rate 1/2
convolutional code such that d_{2δ} = d_free = 2δ + 2.
Proof: [with a gap] Let g_1(s) = Σ_{i=0}^{δ} a_i s^i and g_2(s) = Σ_{i=0}^{δ} b_i s^i be polynomials over a `large
enough' finite field F. For any arbitrary u(s), with u(0) ≠ 0, let

u(s) [g_1(s)  g_2(s)] = [ Σ x_i s^i    Σ y_i s^i ].

We wish to show there exist polynomials g_1(s) and g_2(s) over some finite field such that
wt(x_0, …, x_{2δ+1}, y_0, …, y_{2δ+1}) ≥ 2δ + 2.
This is equivalent to the existence of the following syndrome former matrix with the property
that any set of 2δ + 2 columns which includes the first column must be linearly independent.
⎡ b_0                     a_0                  ⎤  ⎡ x_0       ⎤
⎢  ⋮    ⋱                  ⋮    ⋱              ⎥  ⎢  ⋮        ⎥
⎢ b_δ   ⋯   b_0           a_δ   ⋯   a_0        ⎥  ⎢ x_{2δ+1}  ⎥
⎢       ⋱    ⋮                  ⋱    ⋮         ⎥  ⎢ y_0       ⎥  =  0
⎣            b_δ                     a_δ       ⎦  ⎢  ⋮        ⎥
                                                  ⎣ y_{2δ+1}  ⎦
Denote the large matrix above by H. Form a 2δ + 2 subset of columns from H by selecting x_0 and
any other t − 1 of the x_i's and also any ℓ of the y_i's (with t + ℓ = 2δ + 2). We then have a morphism:

F : F^{2δ+1} × F^{δ+1} × F^{δ+1} → F^{2δ+2}

(x_{i_1}, …, x_{i_{t−1}}, y_{j_1}, …, y_{j_ℓ}, a_0, …, a_δ, b_0, …, b_δ) ↦ H [x_0  x_{i_1}  ⋯  x_{i_{t−1}}  y_{j_1}  ⋯  y_{j_ℓ}]^T,

where H is understood to be restricted to the selected columns.
The set of all points P with F(P) = 0 forms an affine variety X. We claim that

dim X ≤ dim domain(F) − (2δ + 2) = 2δ + 1.

To prove this claim we compute the Jacobian J(F(x⃗, 0⃗, 0⃗)). If one can show that this determinant
is nonzero for some point (x⃗, 0⃗, 0⃗) in a `large' irreducible component, then we could conclude
that the tangent space at that point would have dimension at most 2δ + 1 and hence the dimension
of X would be at most 2δ + 1. With this bound on the dimension of X, we see that the projection
of X onto F^{δ+1} × F^{δ+1} could not be a surjective function, which means that there must exist g_1(s)
and g_2(s) which satisfy Conjecture 7.5.3.
BIBLIOGRAPHY
[1] B.M. Allen. An algebraic framework for turbo codes. In Proc. of the 36th Allerton Conference
on Communication, Control, and Computing, pages 27–28, 1998.
[2] B.M. Allen and J. Rosenthal. Analysis of convolutional encoders via generalized Sylvester
matrices and state space realization. In Proc. of the 34th Allerton Conference on Communication,
Control, and Computing, pages 893–902, 1996.
[3] B.M. Allen and J. Rosenthal. Analyzing convolutional encoders using realization theory. In
Proceedings of the 1997 IEEE International Symposium on Information Theory, page 287,
Ulm, Germany, 1997.
[4] B.M. Allen and J. Rosenthal. Parity check decoding of convolutional codes whose systems
parameters have desirable algebraic properties. In Proceedings of the 1998 IEEE International
Symposium on Information Theory, page 307, Boston, MA, 1998.
[5] B.M. Allen and J. Rosenthal. A matrix Euclidean algorithm induced by state space realization.
Linear Algebra Appl., 288:105–121, 1999.
[6] A.C. Antoulas. On recursiveness and related topics in linear systems. IEEE Trans. Automat.
Contr., AC-31(12):1121–1135, 1986.
[7] L.R. Bahl, J. Cocke, F. Jelinek, and J. Raviv. Optimal decoding of linear codes for minimizing
symbol error rate. IEEE Trans. Inform. Theory, IT-20:284–287, March 1974.
[8] J.A. Ball, I. Gohberg, and L. Rodman. Interpolation of Rational Matrix Functions. Birkhäuser
Verlag, Basel-Berlin-Boston, 1990.
[9] J.S. Baras, R.W. Brockett, and P.A. Fuhrmann. State space models for infinite-dimensional
systems. IEEE Trans. Automat. Contr., AC-19:693–700, 1974.
[10] A.S. Barbulescu and S.S. Pietrobon. Interleaver design for turbo codes. Electronics Letters,
30(25):2107–2108, 1994.
[11] A.S. Barbulescu and S.S. Pietrobon. Terminating the trellis of turbo codes in the same state.
Electronics Letters, 31(1):22–23, 1995.
[12] S. Barnett. Polynomials and Linear Control Systems. M. Dekker, New York, 1983.
[13] C. Berrou, P. Adde, E. Angui, and S. Faudeil. A low complexity soft-output Viterbi decoder
architecture. In Proc. of IEEE Int. Conference on Communication, pages 737–740, Geneva,
Switzerland, May 1993.
[14] C. Berrou, A. Glavieux, and P. Thitimajshima. Near Shannon limit error-correcting coding
and decoding: Turbo codes. In Proc. of IEEE Int. Conference on Communication, pages
1064–1070, Geneva, Switzerland, May 1993.
[15] R.R. Bitmead, S.Y. Kung, B.D.O. Anderson, and T. Kailath. Greatest common divisors via
generalized Sylvester and Bezout matrices. IEEE Trans. Automat. Control, AC-23:1043–1047,
1978.
[16] W.J. Blackert, E.K. Hall, and S.G. Wilson. Turbo code termination and interleaver conditions.
Electronics Letters, 31(24):2082–2083, 1995.
[17] T. Cover and J. Thomas. Elements of Information Theory. Wiley & Sons, New York, 1991.
[18] A. Dholakia. Introduction to Convolutional Codes with Applications. Kluwer Academic
Publishers, 1994.
[19] D. Divsalar and R.J. McEliece. Effective free distance of turbo codes. Electronics Letters,
32(5):445–446, 1996.
[20] D. Divsalar and F. Pollara. On the design of turbo codes. TDA Progress Report, 42-123:99–121,
November 15 1995.
[21] S. Dolinar and D. Divsalar. Weight distributions for turbo codes using random and nonrandom
permutations. TDA Progress Report, 42-122:56–65, August 15 1995.
[22] P. Elias. Coding for noisy channels. IRE Conv. Rec., 4:37–46, 1955.
[23] G.D. Forney. Convolutional codes I: Algebraic structure. IEEE Trans. Inform. Theory,
IT-16(5):720–738, 1970.
[24] G.D. Forney. Minimal bases of rational vector spaces, with applications to multivariable linear
systems. SIAM J. Control Optim., 13(3):493–520, 1975.
[25] P.A. Fuhrmann. A matrix Euclidean algorithm and matrix continued fraction expansions.
Systems and Control Letters, 3:263–271, 1983.
[26] R.G. Gallager. Low-density parity check codes. IRE Trans. on Info. Theory, IT-8:21–28, 1962.
[27] R.G. Gallager. Low-Density Parity Check Codes. M.I.T. Press, Cambridge, MA, 1963. Number
21 in Research monograph series.
[28] J. Hagenauer and P. Hoeher. A Viterbi algorithm with soft-decision outputs and its applications.
In Proc. of IEEE Globecom '89, pages 47.11–47.17, 1989.
[29] U. Helmke, J. Rosenthal, and J.M. Schumacher. A controllability test for general first-order
representations. Automatica, 33(2):193–201, 1997.
[30] R. Johannesson and K. Sh. Zigangirov. Fundamentals of Convolutional Coding. IEEE Press,
New York, 1999.
[31] P. Jung. Novel low complexity decoder for turbo codes. Electronics Letters, 31(2):86–87, 1995.
[32] P. Jung. Comparison of turbo code decoders applied to short frame transmission systems.
IEEE Journal on Selected Areas in Communications, 14(3):530–537, 1996.
[33] P. Jung and M. Nasshan. Performance evaluation of turbo codes for short frame transmission
systems. Electronics Letters, 30(2):111–113, 1994.
[34] J. Justesen. An algebraic construction of rate 1/ν convolutional codes. IEEE Trans. Inform.
Theory, IT-21(1):577–580, 1975.
[35] T. Kailath. Linear Systems. Prentice-Hall, Englewood Cliffs, N.J., 1980.
[36] R.E. Kalman. Mathematical description of linear dynamical systems. SIAM J. Control Optim.,
1:152–192, 1963.
[37] M. Kuijper. First-Order Representations of Linear Systems. Birkhäuser, Boston, 1994.
[38] S.Y. Kung, T. Kailath, and M. Morf. A generalized resultant matrix for polynomial matrices.
In Proc. IEEE Conf. on Decision and Control (Florida), pages 892–895, 1976.
[39] L.H. Charles Lee. Computation of the right-inverse of G(D) and the left-inverse of H(D).
Electronics Letters, 26(13):904–906, 1990.
[40] L.H. Charles Lee. Convolutional Coding: Fundamentals and Applications. Artech House,
Norwalk, MA, 1997.
[41] R. Lidl and H. Niederreiter. Introduction to Finite Fields and their Applications. Cambridge
University Press, Cambridge, London, 1986.
[42] R. Lidl and H. Niederreiter. Introduction to Finite Fields and their Applications. Cambridge
University Press, Cambridge, London, 1994. Revised edition.
[43] S. Lin and D. Costello. Error Control Coding: Fundamentals and Applications. Prentice-Hall,
Englewood Cliffs, NJ, 1983.
[44] J.H. van Lint. Introduction to Coding Theory. Springer Verlag, Berlin, New York, 1982.
[45] C.C. MacDuffee. Some applications of matrices in the theory of equations. American
Mathematical Monthly, 57:154–161, 1950.
[46] D.J.C. MacKay. Good error-correcting codes based on very sparse matrices. IEEE Trans.
Inform. Theory, 45(2):399–431, 1999.
[47] D.J.C. MacKay and R.M. Neal. Near Shannon limit performance of low density parity check
codes. Electronics Letters, 32(18):1645–1646, 1996.
[48] D.J.C. MacKay, S.T. Wilson, and M.C. Davey. Comparison of constructions of irregular
Gallager codes. In Proc. of the 36th Allerton Conference on Communication, Control, and
Computing, pages 220–229, 1998.
[49] F.J. MacWilliams and N.J.A. Sloane. The Theory of Error-Correcting Codes. North Holland,
Amsterdam, 1977.
[50] J.L. Massey. Threshold Decoding. MIT Press, Cambridge, Massachusetts, 1963.
[51] J.L. Massey and M.K. Sain. Inverses of linear sequential circuits. IEEE Trans. on Computers,
C-17(4):330–337, 1968.
[52] Ph. Piret. Convolutional Codes, an Algebraic Approach. MIT Press, Cambridge, MA, 1988.
[53] M.S. Ravi, J. Rosenthal, and J.M. Schumacher. Homogeneous behaviors. Math. Contr.,
Sign., and Syst., 10:61–75, 1997.
[54] P. Robertson. Illuminating the structure of code and decoder of parallel concatenated recursive
systematic (turbo) codes. In Proceedings GLOBECOM '94, volume 3, pages 1298–1303, San
Francisco, CA, November 1994.
[55] J.P. Robinson. Error propagation and definite decoding of convolutional codes. IEEE Trans.
Inform. Theory, IT-14(1):121–128, January 1968.
[56] J. Rosenthal. An algebraic decoding algorithm for convolutional codes. In G. Picci and
D.S. Gilliam, editors, Dynamical Systems, Control, Coding, Computer Vision: New Trends,
Interfaces, and Interplay, pages 343–360. Birkhäuser, Boston-Basel-Berlin, 1999.
[57] J. Rosenthal and J.M. Schumacher. Realization by inspection. IEEE Trans. Automat. Contr.,
AC-42(9):1257–1263, 1997.
[58] J. Rosenthal, J.M. Schumacher, and E.V. York. On behaviors and convolutional codes. IEEE
Trans. Inform. Theory, 42(6):1881–1891, 1996.
[59] J. Rosenthal and R. Smarandache. Construction of convolutional codes using methods from
linear systems theory. In Proc. of the 35th Annual Allerton Conference on Communication,
Control, and Computing, pages 953–960, 1997.
[60] J. Rosenthal and R. Smarandache. Maximum distance separable convolutional codes. Technical
Report 1998-074, MSRI, Berkeley, California, 1998. To appear in Appl. Algebra Engrg. Comm.
Comput.
[61] J. Rosenthal and E.V. York. BCH convolutional codes. Technical report, University
of Notre Dame, Dept. of Mathematics, October 1997. Preprint # 271. Available at
http://www.nd.edu/~rosen/preprints.html. To appear in IEEE Trans. Inform. Theory.
[62] J. Rosenthal and E.V. York. A construction of binary BCH convolutional codes. In Proceedings
of the 1997 IEEE International Symposium on Information Theory, page 291, Ulm, Germany,
1997.
[63] L.D. Rudolph. Generalized threshold decoding of convolutional codes. IEEE Trans. Inform.
Theory, IT-16(6):739–745, November 1970.
[64] C.E. Shannon. A mathematical theory of communication. Bell Syst. Tech. J., 27:379–423 and
623–656, 1948.
[65] R. Smarandache and J. Rosenthal. A state space approach for constructing MDS rate 1/n
convolutional codes. In Proceedings of the 1998 IEEE Information Theory Workshop on
Information Theory, pages 116–117, Killarney, Kerry, Ireland, June 1998.
[66] E.D. Sontag. Mathematical Control Theory: Deterministic Finite Dimensional Systems.
Springer Verlag, New York, 1990.
[67] D.A. Spielman. Finding good LDPC codes. In Proc. of the 36th Allerton Conference on
Communication, Control, and Computing, pages 211–219, 1998.
[68] J.C. Willems. From time series to linear system. Part I: Finite dimensional linear time invariant
systems. Automatica, 22:561–580, 1986.
[69] J.C. Willems. Paradigms and puzzles in the theory of dynamical systems. IEEE Trans.
Automat. Control, AC-36(3):259–294, 1991.
[70] W.A. Wolovich. Linear Multivariable Systems, volume 11 of Appl. Math. Sc. Springer Verlag,
New York, 1974.
[71] E.V. York. Algebraic Description and Construction of Error Correcting Codes, a Systems
Theory Point of View. PhD thesis, University of Notre Dame, 1997. Available at
http://www.nd.edu/~rosen/preprints.html.