
LINEAR SYSTEMS ANALYSIS AND DECODING OF

CONVOLUTIONAL CODES

A Dissertation

Submitted to the Graduate School

of the University of Notre Dame
in Partial Fulfillment of the Requirements
for the Degree of

Doctor of Philosophy

by

Brian Michael Allen, B.A., M.A., M.S.

Joachim Rosenthal, Director

Department of Mathematics
Notre Dame, Indiana
June 1999
ABSTRACT
In this dissertation the tools of linear systems theory are used to study convolutional codes and to develop decoding algorithms for them.
In particular, the input-state-output representation of a convolutional code is examined. Properties of this representation are explored and its connection with various other representations is detailed. Notable among these are the connections between the syndrome former matrix and the local description of codewords obtained via the input-state-output representation. Also, for codes with rate k/n ≤ 1/2, the output-state-input representation is developed and the close connections with the input-state-output description are detailed. These connections are then used to enhance certain algebraic decoding schemes for convolutional codes.
Turbo codes are also given a brief treatment. A linear systems representation of these codes is developed and a few remarks on the decoding of turbo codes utilizing this representation are given.
In a much broader application of the input-state-output representation, a matrix Euclidean algorithm is developed. Not only does this algorithm efficiently decide if a given convolutional encoder is catastrophic, it can compute the greatest common left divisor of the encoder matrix in a straightforward manner. This has applications in many areas including finding minimal bases of rational vector spaces, obtaining irreducible matrix fraction descriptions of transfer functions and obtaining a basis for the free module generated by the columns of the matrix.
Finally, the class of Reed-Solomon convolutional codes is presented. These codes are shown to possess maximum distance separable generator and parity check subcodes. This property, along with the ability to use Berlekamp-Massey decoding on the subcodes, makes these convolutional codes ideal for the algebraic decoding scheme presented in this dissertation. Other properties of these codes are then examined, including their column distance function. Some theoretical results, examples and open problems regarding this issue are presented.

ACKNOWLEDGEMENTS
I need to thank the people who encouraged and supported my work toward this dissertation. First, my advisor, Joachim Rosenthal, has encouraged, enlightened and nurtured me as a mathematician. For that, I will always be grateful. I have benefitted greatly from the friendship and advice of my colleagues including Eric York, Paul Weiner, Steve Walk, Jeff Igo, Chris Monico and Roxana Smarandache.
I would like to thank the members of my defense committee, Heide Gluesing-Luerssen, Amarjit Budhiraja and Thomas Fuja, for their time, effort and observations. I would like to thank all of the faculty, staff and graduate students of the Mathematics Department at the University of Notre Dame. I would especially like to thank Alex Hahn and Juan Migliore for all they have done for me. I am also very grateful to Daniel Costello Jr. for all his advice and help.
I need to acknowledge all of the generous institutions which have supported my work. The Department of Mathematics and the Graduate School of the University of Notre Dame have provided an extraordinary learning environment. I am indebted to the Arthur J. Schmitt Foundation and to the Center for Applied Math at Notre Dame for their generous fellowships. I must also thank the Institute for Mathematics and its Applications and the National Science Foundation for their support.
Thanks must be given to my wonderful wife who has supported and encouraged me to an extent that I can never repay. I would also like to thank my family for all they have given me. I will never be able to show them just how much I love them.
I also would like to thank the Lady on the Dome and the entire Notre Dame family for a truly enriching and wonderful experience.
CONTENTS

CHAPTER 1  INTRODUCTION
1.1 Communication, Coding and Shannon
1.2 Block Codes
1.3 Convolutional Codes
1.4 Organization of this Dissertation

CHAPTER 2  LINEAR SYSTEMS AND CONVOLUTIONAL CODES
2.1 Behaviors
2.2 Realization
2.3 Special Properties of the ISO Representation
2.4 An Algebraic Model for Turbo Codes

CHAPTER 3  CONNECTIONS BETWEEN REPRESENTATIONS
3.1 Generator Matrices and Sequences
3.2 Parity Check Matrices
3.3 Shift Registers
3.4 From ISO Representations to Generator Matrices
3.5 From the Local Description to Syndrome Formers

CHAPTER 4  A MATRIX EUCLIDEAN ALGORITHM
4.1 Background Material
4.2 A Brief History of the Problem
4.3 A Realization Algorithm
4.4 The Controllability Space
4.5 The Refining Algorithm
4.6 The Situation of Constant Rows
4.7 The Algorithm
4.8 Examples

CHAPTER 5  DECODING ERROR CORRECTING CODES
5.1 Channel Models and Maximum Likelihood Decoding
5.2 Decoding Parameters
5.3 A Few Decoding Algorithms
5.4 Majority Logic Decoding of Convolutional Codes
5.5 Using the Local Description for Feedback Decoding

CHAPTER 6  ALGEBRAIC DECODING OF CONVOLUTIONAL CODES USING THE LOCAL DESCRIPTION
6.1 A Basic Algorithm
6.2 Some Classes of Binary Input-State-Output Convolutional Codes
6.3 Example Codes
6.4 An Alternative to State Estimation
6.5 Some Notes on the State Elimination Algorithm
6.6 Enhanced State Estimation
6.7 Analysis of the Enhanced Algorithm
6.8 An Algebraic Look at Decoding Turbo Codes

CHAPTER 7  MDS CONVOLUTIONAL CODES
7.1 Reed-Solomon Convolutional Codes
7.2 Further Properties of Reed-Solomon Convolutional Codes
7.3 Some Insights into the Conjecture
7.4 The Case when δ = 2
7.5 The Road Ahead and a Stronger Conjecture

BIBLIOGRAPHY
CHAPTER 1
INTRODUCTION
1.1 Communication, Coding and Shannon
One may reasonably define communication as the conveyance, reliable or otherwise, of information. Reliable and efficient communication is becoming an increasingly indispensable tool of the modern world. Indeed, as we are firmly entrenched in the "information age", our society is utterly dependent on reliable communication. Evidence of this dependence can be seen everywhere. Financial markets rely on computers to accurately receive and oversee trading activity. Banking transactions are "wired" instead of delivered by armored cars. Credit cards are verified automatically over phone lines or the internet for each purchase. All of these things involve communication in one form or another. Whether it is a person entering information into a computer, a computer displaying information to a person, two computers sharing data, or simply two people communicating with each other, all are forms of communication.
Of course, communication has come a long way. The earliest forms of communication between humans probably involved some sort of gesturing and primitive oral languages. These evolved into written languages with grammar and syntax. Today, communication has moved into the digital age. Information is encoded as sequences of 0's and 1's to allow for virtually instantaneous micro-chip controlled communication.
Reliability in communication has always been important. Indeed, almost all communication methods have some inherent error tolerances. In oral language it is possible, to a large extent, for two people speaking the same language but with different accents and even dialects to communicate very well. In written language, errors in spelling are often easily corrected by the reader. For example, if one encounters the passage "The chilwren did very well in school", one could assume that "children" was the intended word. In fact, my automated spell-checker has been urging me to make that very change.
Modern digital communication also has these tolerances for error. It comes in the form of error correcting codes. A reasonable definition of an error correcting code is a pair of sets (M, R) and two maps associated with these sets:

φ : M → R   and   ψ : R → φ(M)

Here, M is the set of all possible messages, R is the set of all possible received messages, the map φ is injective and is called the encoder, and the map ψ is called the estimator or preliminary decoder. Given this framework, we can present a model of communication employing an error correcting code. Suppose the message m ∈ M is to be transmitted. Then we have the following situation.
m ↦ r --(noise)--> r̃ ↦ r̄ ↦ m̄

This process can be explained as follows. The message m is encoded under the map φ to the transmitted message r. As r is transmitted over the communication channel it is subjected to various kinds of interference and distortion which is collectively labeled 'noise'. For example, thunderstorms often affect the quality of telephone connections and dust may affect the quality of sound produced by a compact disc. This noise may transform r into a possibly different r̃ ∈ R. Thus, the distorted message r̃ is received and the decoder must map r̃ to a reasonable element, r̄, of φ(M) using the estimator map ψ. Then, the encoding is reversed by applying φ⁻¹ to obtain the decoded message m̄ ∈ M. The transmission successfully communicates the message if and only if m equals m̄. It is one of the main focuses of coding theory to develop codes which achieve a high rate of successful communication over a particular channel.
Example 1.1.1 Suppose two possible messages are to be sent. These messages can be represented in a digital communication system by 0 and 1. Hence, M = {0, 1}. We can define φ by φ(0) = 00000 and φ(1) = 11111. We can define ψ by the rule "estimate as 00000 if the received message has more 0's than 1's, and vice versa". Thus if 01011 is received, then ψ(01011) = 11111 and this is decoded as φ⁻¹(11111) = 1.
If we know that for our channel each bit that is transmitted will be received erroneously with a probability of 0.1, it can easily be computed that a message will be decoded incorrectly with a probability of 0.00856. This means that there will be 11 times fewer errors by using this code as opposed to using no code at all!
The tradeoff in this situation is that instead of having to transmit only one digit to send each message, five digits have to be sent. Over time, this can add up to a lot more time and money to transmit these messages. So the notion of the rate of the error correcting code must be considered. The rate of a code is simply the ratio of the number of information bits (i.e. the length of a message, m ∈ M) to the number of code bits (i.e. the length of φ(m)). For the code of this example, the rate is 1/5.
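The decoding error probability quoted above can be checked directly. The sketch below (an illustration of the computation, not taken from the dissertation; the function name is my own) sums the ways a majority-vote decoder for the five-fold repetition code fails when each bit flips independently with probability 0.1.

```python
from math import comb

def repetition_error_prob(n: int, p: float) -> float:
    """Probability that majority-vote decoding of an n-fold repetition
    code fails, given independent bit-flip probability p."""
    # Decoding fails exactly when more than half of the n bits flip.
    return sum(comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(n // 2 + 1, n + 1))

p_err = repetition_error_prob(5, 0.1)
print(round(p_err, 5))          # 0.00856, matching the text
print(round(0.1 / p_err, 1))    # about 11.7: roughly 11 times fewer errors
```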
It was Claude Shannon in 1948 [64] who showed that the goal of finding error correcting codes that allow for a high probability of successful transmission was attainable. Shannon showed that each channel has a constant associated with it called the channel capacity. Furthermore, he showed that there exist error correcting codes that achieve a successful transmission with probability arbitrarily close to 1 with the rate of the code arbitrarily close to (but below) the channel capacity. His proof showed only the existence of such codes; there is little indication of how to obtain them. The construction of these codes, along with efficient decoding algorithms, is the goal of modern coding theory. Error correcting codes generally fall into two categories: block codes and convolutional codes.
1.2 Block Codes
Let A be a (finite) set of symbols, called the message alphabet. Define M to be the set consisting of all sequences of symbols from A of length k. Also, define R to be the set consisting of all sequences of symbols from A of length n. Here k and n are positive integers with k ≤ n.
Definition 1.2.1 A block code of rate k/n over the alphabet A is defined as the pair (M, R) together with an injective map φ : M → R and an estimator map ψ : R → φ(M).
An important special case of the above definition is when A is a finite field, F_q = GF(q). In this case, when φ is a linear map φ : F_q^k → F_q^n, the code is called a linear block code.
From elementary linear algebra, we know that the map φ can be represented by a scalar matrix, G. This full rank k × n matrix is known as a generator matrix of the code. Using this generator matrix description, the encoding process for a message m ∈ F_q^k can be described as:

m ↦ mG

Using the above definitions, it is clear that for a linear block code, the set of codewords, φ(M), is a linear subspace of F_q^n. Another way to describe this linear subspace is through the use of a kernel representation. Indeed, there exists a scalar matrix H of size (n − k) × n, called the parity check matrix, such that a vector x ∈ F_q^n is a codeword if and only if xH^T = 0.
The code in Example 1.1.1 is a linear block code over F_2. Indeed, a generator matrix is given by G = [ 1 1 1 1 1 ]. Also, a parity check matrix is given by

H = [ 1 1 0 0 0 ]
    [ 1 0 1 0 0 ]
    [ 1 0 0 1 0 ]
    [ 1 0 0 0 1 ]
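As a quick sanity check (a sketch, not part of the original text), one can verify over F_2 that both codewords produced by G lie in the kernel of H^T:

```python
import numpy as np

G = np.array([[1, 1, 1, 1, 1]])        # generator matrix, 1 x 5
H = np.array([[1, 1, 0, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 0, 0, 1, 0],
              [1, 0, 0, 0, 1]])        # parity check matrix, 4 x 5

for m in ([0], [1]):                   # every message in M = {0, 1}
    x = (np.array([m]) @ G) % 2        # codeword x = mG over F_2
    syndrome = (x @ H.T) % 2           # x is a codeword iff x H^T = 0
    print(m, x.ravel(), syndrome.ravel())
```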
Remark 1.2.2 Although error correcting codes are defined as a pair of sets together with an encoder map and an estimator map, in practice, all of these may not be specified. Often, when the focus is not on decoding, the estimator map is omitted. When the code is linear, it is common practice to identify the code solely by a generator or parity check matrix. Further, by abuse of notation, a code is often specified simply by the set of codewords φ(M). Hence, identifying the linear subspace that comprises the set of codewords is enough to identify the code.
This discussion of block codes is ended with some basic definitions.
Definition 1.2.3 The Hamming weight or simply weight of a vector in F^n is defined as the number of nonzero components of the vector. Notation: the weight of a vector x is given by wt(x).
Definition 1.2.4 The Hamming distance or simply distance between two vectors in A^n is the number of components in which they disagree. Notation: the distance between x and y is given by dist(x, y).
Definition 1.2.5 The distance of a code is the minimum distance between any two codewords.
If the code is linear, then the distance is equal to the minimum weight of all the nonzero codewords.
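Definitions 1.2.3 through 1.2.5 translate directly into code. The sketch below (hypothetical helper functions, not from the text) computes weight and distance and recovers the distance 5 of the repetition code of Example 1.1.1:

```python
def wt(x):
    """Hamming weight: number of nonzero components."""
    return sum(1 for c in x if c != 0)

def dist(x, y):
    """Hamming distance: number of components in which x and y disagree."""
    return sum(1 for a, b in zip(x, y) if a != b)

# For a linear code, the distance equals the minimum weight of the
# nonzero codewords; for the repetition code of Example 1.1.1:
codewords = [(0, 0, 0, 0, 0), (1, 1, 1, 1, 1)]
d = min(wt(c) for c in codewords if wt(c) > 0)
print(d)  # 5
```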
1.3 Convolutional Codes
We now turn our attention to the other major lass of error orre ting odes. Suppose we have a
rate k=n linear blo k ode over F q with s alar generator matrix G. Assume we have a sequen e
of messages u0; u1k; u2 ; : : : ; ur from M that we wish to send. We ould write this sequen e as a
polynomial over F q in some indeterminate s as follows:
! u0 + u1s + u2 s2 +    + ur sr
u0 ; u1 ; u2 ; : : : ; ur
If we denote this polynomial by u(s), then the en oding of the entire sequen e an be writ-
ten simply as u(s) 7! u(s)G, where the matrix multipli ation is done on ea h oeÆ ient of the
polynomial.
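Coefficient-wise encoding by a scalar G can be sketched as follows (an illustrative fragment, not from the dissertation, reusing the repetition-code generator of Example 1.1.1):

```python
import numpy as np

G = np.array([[1, 1, 1, 1, 1]])     # scalar generator matrix over F_2

# u(s) = 1 + 0*s + 1*s^2: the coefficients are the messages u_0, u_1, u_2
u_coeffs = [np.array([1]), np.array([0]), np.array([1])]

# u(s) |-> u(s)G : apply G to each coefficient of the polynomial
v_coeffs = [(u @ G) % 2 for u in u_coeffs]
for i, v in enumerate(v_coeffs):
    print(f"coefficient of s^{i}:", v)
```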
It was Elias in 1955 [22] who suggested that the matrix G need not be scalar. In doing so, the notion of a convolutional code was born. The classical definition of convolutional codes can now be stated.
Definition 1.3.1 [23, 52] A rate k/n convolutional code is defined as a k-dimensional F-linear subspace of F^n, where F is either the field of rational functions F(s) or the field of formal Laurent series F((s)). In either case, a k × n full rank matrix G(s) with entries in F[s] will serve as a generator matrix. The highest degree among the polynomials occurring in G(s) is called the memory, m, of the encoder. In short, the encoding of a message depends not only on the bits to be sent at a certain time, but also on the bits sent in the previous m time intervals. Of course, polynomial parity check matrices exist as well, and are sometimes referred to as syndrome formers.
It should be remarked that the notions of Hamming weight and distance are easily extended to convolutional codes. The block code notion of the distance of a code is also readily extended, but it takes on the special name of free distance when it is applied to convolutional codes.
Since convolutional codes can potentially have infinite message sequences, some unique situations may arise. In particular, it may happen that an infinite weight message sequence is encoded as a finite weight codeword. If this happens, then a finite number of errors during transmission can lead the estimator map to estimate this received word as the all zero sequence, which, of course, is a valid codeword. Hence, a finite number of transmission errors can lead to an infinite number of errors in the message word. This unfortunate situation is called catastrophic encoding and is to be avoided. Fortunately, the following proposition, proven by Massey and Sain [51], characterizes this situation completely in terms of the generator matrix.
Proposition 1.3.2 A convolutional encoder G(s) is non-catastrophic if and only if the greatest common divisor of its full size minors is s^l, for some non-negative integer l. If l = 0 the encoder is called observable.
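The Massey-Sain criterion is easy to apply when k = 1, where the full size minors are just the entries of G(s). The sketch below (my own illustration, not from the text; binary polynomials are stored as integer bitmasks with bit i holding the coefficient of s^i) computes the gcd over F_2 for two encoders: G(s) = [1 + s + s², 1 + s²] has gcd 1 (non-catastrophic, in fact observable), while G(s) = [1 + s, 1 + s²] has gcd 1 + s, which is not a power of s, so that encoder is catastrophic.

```python
def deg(p: int) -> int:
    """Degree of a binary polynomial stored as a bitmask (deg 0 = constant)."""
    return p.bit_length() - 1

def poly_mod(a: int, b: int) -> int:
    """Remainder of a divided by b over F_2 (carry-less arithmetic)."""
    while b and deg(a) >= deg(b):
        a ^= b << (deg(a) - deg(b))
    return a

def poly_gcd(a: int, b: int) -> int:
    """Euclidean algorithm for binary polynomials."""
    while b:
        a, b = b, poly_mod(a, b)
    return a

# G(s) = [1+s+s^2, 1+s^2]  ->  0b111, 0b101
print(bin(poly_gcd(0b111, 0b101)))  # 0b1: gcd = 1 = s^0, observable encoder
# G(s) = [1+s, 1+s^2]      ->  0b011, 0b101
print(bin(poly_gcd(0b011, 0b101)))  # 0b11: gcd = 1+s, catastrophic
```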

1.4 Organization of this Dissertation


An overview of this dissertation will now be presented. Chapter 2 will introduce some basic concepts of linear systems and how they can be used to define a convolutional code. The idea of turbo codes will be reviewed and a corresponding linear systems representation for these codes will be given.
Chapter 3 will discuss further properties of convolutional codes including the relationships between their various representations. In Chapter 4 the linear systems representation of convolutional codes is used to investigate catastrophic encoding. In the process, a matrix Euclidean algorithm will be developed and its many applications discussed. Some basic concepts of decoding error correcting codes will be presented in Chapter 5. Some of the major decoding schemes for linear block codes and convolutional codes will be given a brief survey. In Chapter 6 some algebraic decoding techniques for convolutional codes based on their linear systems representation are introduced. Various constructions of binary convolutional codes are developed to suit the algorithm. Some alternatives to the algebraic decoding scheme are put forward and their effectiveness in various situations is discussed and analyzed. In Chapter 7 Reed-Solomon convolutional codes are introduced and their effectiveness in one of the enhanced algebraic decoding schemes is shown. Exploring these codes further, the column distance functions of the codes will also be discussed. Examples and results will be given for lower complexities, and some open problems will be stated.

CHAPTER 2
LINEAR SYSTEMS AND CONVOLUTIONAL CODES
In this Chapter the basic notions of linear systems will be reviewed, starting with the concept of behaviors. These ideas are used to form an alternate definition of convolutional code. This definition is then refined by giving a first order representation and exploring its various properties. For a more thorough discussion of the systems theory which dominates the early part of this chapter, the reader is referred to [70, 36, 35, 68, 69, 37].
2.1 Behaviors
In this section the notion of dynamical systems and behaviors, as well as some related concepts, are introduced and briefly discussed. This is a review of the material contained in [58, 61, 71].
Definition 2.1.1 A dynamical system Σ is a triple Σ = (T, A, B), where T ⊆ ℝ is the time axis, A is the signal alphabet and B ⊆ A^T is called the behavior. The elements of B are called trajectories.
We will work with these abstract notions only long enough to write down an alternate definition of convolutional code, so we will omit any illuminating examples of the above definitions.
For our purposes we will take the discrete time axis T = ℤ₊. Let F = F_q and let the signal space be A = F^n. Hence, our behaviors will consist of subsets of the set of one-sided infinite sequences of vectors in F^n.
Definition 2.1.2 Define the left shift operator, σ, and the right shift operator, σ⁻¹, on the sequence space A^T by

σ(a_0, a_1, a_2, …) = (a_1, a_2, a_3, …),
σ⁻¹(a_0, a_1, a_2, …) = (0, a_0, a_1, a_2, …).

A subset C ⊆ A^T is right (left) shift invariant if σ⁻¹C ⊆ C (σC ⊆ C). Also, if for every element of C at most finitely many components are nonzero, we say C has compact support.
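On finite-support sequences the two shift operators act as below (a small illustrative sketch, not from the text; sequences are truncated to a finite window):

```python
def left_shift(a):
    """sigma: (a0, a1, a2, ...) -> (a1, a2, a3, ...)."""
    return a[1:]

def right_shift(a):
    """sigma^{-1}: (a0, a1, ...) -> (0, a0, a1, ...), a unit delay."""
    return (0,) + a

c = (1, 0, 1, 0, 0)           # a finite-support trajectory
print(left_shift(c))          # (0, 1, 0, 0)
print(right_shift(c))         # (0, 1, 0, 1, 0, 0): a delayed codeword
```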
With these definitions in hand, we are led to the following rather cryptic definition of a convolutional code.
Definition 2.1.3 A subset C ⊆ A^T is called a convolutional code if C is linear (as a vector space over F_q with component-wise addition), right shift invariant and has compact support.
These requirements may seem odd, but are actually quite natural. Certainly, we have seen that a linear subspace is desired. The right shift invariance means only that a valid codeword is still a valid codeword if it is delayed before it is sent. The condition of finite support is new. Obviously this contradicts the classical definition of a convolutional code given in Definition 1.3.1. In practice this restriction has little effect since one would never want to send an infinite codeword. More academically, this can be justified by the lack of a widely accepted definition of convolutional code.
With the usual identification between finite sequences and polynomials, the following theorem translates the new definition of convolutional code into a more recognizable form.
Theorem 2.1.4 [71, Theorem 3.1.2] Let C ⊆ A^T. Then C is a convolutional code if and only if C is an F[s]-submodule of F^n[s].
Corollary 2.1.5 [71, Corollary 3.1.3] As a submodule of a free module (over a PID), C has a well defined rank k. Hence there exists an injective module homomorphism

φ : F^k[s] → F^n[s],   u(s) ↦ v(s)

Equivalently, there is a full rank k × n polynomial matrix G(s) such that

C = { v(s) | ∃ u(s) ∈ F^k[s] : v(s) = u(s)G(s) }
6
Let us introduce some terminology and notation. As before, the rate of the convolutional code is given by k/n. Given an encoder G(s), the maximum degree occurring in row i is called the row degree of row i and denoted ν_i. After a possible reordering of the rows, we will assume that ν_1 ≥ ν_2 ≥ ⋯ ≥ ν_k. As before, the memory of the encoder, m, is equal to ν_1. Of course, reordering the encoder might seem to change the code it defines, so let us resolve some issues of uniqueness with regard to encoders with the following proposition.
Proposition 2.1.6 [71, Lemma 3.1.6] Two encoders, G_1(s) and G_2(s), define the same convolutional code if and only if there exists a k × k unimodular matrix U(s) such that U(s)G_1(s) = G_2(s).
The complexity of an encoder, δ(G(s)), is given as the sum of the row degrees, δ(G(s)) = Σ_i ν_i. However, in light of the above proposition, there is a much better way to define the complexity of a code. The complexity of a convolutional code, δ(C), is the maximum degree of all the full size minors of any encoder G(s). This complexity is often referred to as the McMillan degree in the systems theory literature. When the context is clear, the notation will often be shortened to simply δ.
An encoder is minimal if δ(G(s)) = δ(C). It can be shown that any two minimal encoders must have the exact same row degrees (up to reordering). These row degrees for a minimal encoder are known as the Kronecker indices of the code.
2.2 Realization
In this section we will develop first order representations of convolutional codes as defined in the previous section. Again this section is a review of previous works including [58, 61, 71].
The following theorem proves the existence of a first order realization. It should be remarked that the results here are 'dual' statements of those in [37], as observed in [58].
Theorem 2.2.1 [58] Let C ⊆ F^n[s] be a rate k/n convolutional code of complexity δ. Then there exist matrices K, L of size (δ + n − k) × δ and a matrix M of size (δ + n − k) × n (all with scalar entries in F) such that the code C is defined by

C = { v(s) ∈ F^n[s] | ∃ x(s) ∈ F^δ[s] : sKx(s) + Lx(s) + Mv(s) = 0 }.

Further, K has full column rank, [K | M] has full row rank and rank[s_0 K + L | M] = δ + n − k for all s_0 ∈ F.
A triple (K, L, M) satisfying the above is called a minimal representation of C.
Proposition 2.2.2 [58] If (K̄, L̄, M̄) is another representation of the convolutional code C then there exist unique invertible (scalar) matrices T and S such that

(K̄, L̄, M̄) = (TKS⁻¹, TLS⁻¹, TM).

It can be shown, after a suitable transformation permitted by the above proposition, and possibly a reordering of the components of the code (obviously resulting in an 'equivalent' code), that the triple (K, L, M) can be written in the following special form:

K = \begin{pmatrix} I \\ 0 \end{pmatrix}, \quad
L = \begin{pmatrix} -A \\ -C \end{pmatrix}, \quad
M = \begin{pmatrix} 0 & -B \\ I & -D \end{pmatrix}

Indeed, with v(s) partitioned as (y(s); u(s)), the rows of sKx + Lx + Mv = 0 then read x_{t+1} = Ax_t + Bu_t and y_t = Cx_t + Du_t. Here, A is size δ × δ, B is δ × k, C is (n − k) × δ and D is (n − k) × k.
This rewriting of the first order representation allows us to define a convolutional code in terms of the more familiar (A, B, C, D) representation in systems theory. Hence, we arrive at the input-state-output representation.
Definition 2.2.3 [Input-State-Output Definition of Convolutional Code] [61]
Let F = F_q be the Galois field of q elements and consider the matrices A ∈ F^{δ×δ}, B ∈ F^{δ×k}, C ∈ F^{(n−k)×δ}, and D ∈ F^{(n−k)×k}. A rate k/n convolutional code C of complexity δ can be described by the linear system governed by the equations:

x_{t+1} = A x_t + B u_t,
y_t = C x_t + D u_t,                                  (2.2.1)
v_t = (y_t; u_t),   x_0 = 0.

Hence the input to the encoder at time t is the information or message vector, u_t. The encoder creates the parity vector, y_t, and the code vector, v_t, is transmitted across the channel. We will refer to the convolutional code created in this way by C(A, B, C, D).
From a systems theory point of view, the variable x_t is referred to as the state of the system at time t. The input vector, u_t, is combined with the current state, x_t, to create the output, y_t. Also, the current input is used to update the state for the next time interval, x_{t+1}.
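The encoding equations (2.2.1) are straightforward to simulate. The sketch below uses a small binary system chosen purely for illustration (the particular matrices A, B, C, D are my assumption, not an example from the text):

```python
import numpy as np

# An illustrative rate 1/2 system over F_2 with delta = 2, k = 1, n = 2.
A = np.array([[0, 0],
              [1, 0]])
B = np.array([[1],
              [0]])
C = np.array([[1, 1]])
D = np.array([[1]])

def iso_encode(u_seq):
    """Run x_{t+1} = Ax_t + Bu_t, y_t = Cx_t + Du_t over F_2, x_0 = 0."""
    x = np.zeros((A.shape[0], 1), dtype=int)       # x_0 = 0
    code = []
    for u in u_seq:
        u = np.array([[u]])
        y = (C @ x + D @ u) % 2                    # parity vector y_t
        x = (A @ x + B @ u) % 2                    # state update x_{t+1}
        code.append((int(y[0, 0]), int(u[0, 0])))  # v_t = (y_t; u_t)
    return code

print(iso_encode([1, 0, 1]))   # [(1, 1), (1, 0), (0, 1)]
```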
Some enlightening examples, as well as the connections between the various representations of convolutional codes, will be given in Chapter 3.
The set of code words is, by definition, equal to the set of trajectories {(y_t; u_t)}_{t≥0} of the dynamical system (2.2.1). The following proposition characterizes those trajectories.
Proposition 2.2.4 (Local Description of Trajectories) [61] Let τ and γ be positive integers with τ < γ. Assume that the encoder is at state x_τ at time t = τ. Then any code sequence {(y_t; u_t)}_{t≥0} governed by the dynamical system (2.2.1) must satisfy:

\begin{pmatrix} y_\tau \\ y_{\tau+1} \\ \vdots \\ y_\gamma \end{pmatrix} =
\begin{pmatrix} C \\ CA \\ \vdots \\ CA^{\gamma-\tau} \end{pmatrix} x_\tau +
\begin{pmatrix}
D & 0 & \cdots & 0 \\
CB & D & \ddots & \vdots \\
CAB & CB & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
CA^{\gamma-\tau-1}B & CA^{\gamma-\tau-2}B & \cdots & CB \;\; D
\end{pmatrix}
\begin{pmatrix} u_\tau \\ u_{\tau+1} \\ \vdots \\ u_\gamma \end{pmatrix}

Moreover the evolution of the state vector x_t is given over time as:

x_t = A^{t-\tau} x_\tau +
\begin{pmatrix} A^{t-\tau-1}B & \cdots & AB & B \end{pmatrix}
\begin{pmatrix} u_\tau \\ \vdots \\ u_{t-1} \end{pmatrix},
\qquad t = \tau+1, \tau+2, \ldots, \gamma+1.          (2.2.2)

Proof: This follows easily by iterating the equations that define the system.
For γ ≥ 1 we will define the following notation:

M_\gamma(A, B, C, D) = \begin{pmatrix}
D & 0 & \cdots & 0 \\
CB & D & \ddots & \vdots \\
CAB & CB & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
CA^{\gamma-2}B & CA^{\gamma-3}B & \cdots & CB \;\; D
\end{pmatrix}                                         (2.2.3)

Some or all of the parameters may be omitted when the context is clear. This should not cause confusion with the M from the (K, L, M) first order representation because that representation will not be discussed any further in this dissertation.
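The block Toeplitz matrix of (2.2.3) can be assembled mechanically and the local description checked against direct simulation. The sketch below uses an arbitrarily chosen small binary system (the matrices are illustrative assumptions, not from the text) and confirms that with x_0 = 0 the stacked parities equal M_γ times the stacked inputs over F_2:

```python
import numpy as np

A = np.array([[0, 0], [1, 0]])
B = np.array([[1], [0]])
C = np.array([[1, 1]])
D = np.array([[1]])

def M(gamma):
    """Block lower triangular Toeplitz matrix of (2.2.3) over F_2."""
    p, k = D.shape
    out = np.zeros((gamma * p, gamma * k), dtype=int)
    for i in range(gamma):            # block row
        for j in range(i + 1):        # block column, j <= i
            blk = D if i == j else C @ np.linalg.matrix_power(A, i - j - 1) @ B
            out[i * p:(i + 1) * p, j * k:(j + 1) * k] = blk % 2
    return out

u = np.array([[1], [0], [1]])         # inputs u_0, u_1, u_2 stacked
y = (M(3) @ u) % 2                    # stacked parities, since x_0 = 0
print(y.ravel())                      # [1 1 0], agreeing with direct simulation
```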
Next the set of valid codewords is given as the kernel of a parity check matrix.

Proposition 2.2.5 (Global Description of Trajectories) [61]
{(y_t; u_t) ∈ F^n | t = 0, …, γ} represents a valid code word if and only if:

\begin{pmatrix}
0 \;\cdots\; 0 & A^{\gamma}B \;\; A^{\gamma-1}B \;\cdots\; AB \;\; B \\
I & -M_{\gamma+1}(A, B, C, D)
\end{pmatrix}
\begin{pmatrix} y_0 \\ y_1 \\ \vdots \\ y_\gamma \\ u_0 \\ u_1 \\ \vdots \\ u_\gamma \end{pmatrix} = 0.     (2.2.4)

Proof: Setting τ = 0 in Proposition 2.2.4 gives the bottom portion of the matrix. Since x_{γ+1} = 0, the top row of the matrix follows from the second part of Proposition 2.2.4.
Let A, B, C be scalar matrices over F of size δ × δ, δ × k and (n − k) × δ respectively. Let j be a positive integer and define

\[
\Phi_j(A,B) := \begin{pmatrix} A^{j-1}B & A^{j-2}B & \ldots & AB & B \end{pmatrix}, \tag{2.2.5}
\]

\[
\Omega_j(A,C) := \begin{pmatrix} C \\ CA \\ CA^2 \\ \vdots \\ CA^{j-1} \end{pmatrix}. \tag{2.2.6}
\]

Definition 2.2.6 Let A, B be matrices of size δ × δ and δ × k respectively. Then (A, B) is called a controllable pair if

\[
\operatorname{rank} \Phi_\delta(A,B) = \delta. \tag{2.2.7}
\]

If (A, B) is a controllable pair then we call the smallest integer κ having the property that rank Φ_κ(A, B) = δ the controllability index of (A, B).
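These rank conditions are easy to test mechanically. The sketch below builds Φ_j(A, B) and finds the controllability index for the pair from Example 3.4.2; `gf2_rank` is a helper of ours (Gaussian elimination over GF(2)), since floating-point rank is not meaningful for binary matrices.

```python
import numpy as np

def phi(A, B, j):
    """Phi_j(A, B) = [A^{j-1}B ... AB B], entries reduced mod 2."""
    return np.hstack([np.linalg.matrix_power(A, j - 1 - i) @ B
                      for i in range(j)]) % 2

def gf2_rank(M):
    """Rank of a 0/1 matrix over GF(2) via Gaussian elimination."""
    M, rank = M.copy() % 2, 0
    for col in range(M.shape[1]):
        pivot = next((r for r in range(rank, M.shape[0]) if M[r, col]), None)
        if pivot is None:
            continue
        M[[rank, pivot]] = M[[pivot, rank]]
        for r in range(M.shape[0]):
            if r != rank and M[r, col]:
                M[r] ^= M[rank]
        rank += 1
    return rank

def controllability_index(A, B):
    """Smallest kappa with rank Phi_kappa(A,B) = delta (None if uncontrollable)."""
    delta = A.shape[0]
    for kappa in range(1, delta + 1):
        if gf2_rank(phi(A, B, kappa)) == delta:
            return kappa
    return None

A = np.array([[1, 1], [1, 0]])
B = np.array([[1], [0]])
assert gf2_rank(phi(A, B, 2)) == 2      # (A, B) is a controllable pair
assert controllability_index(A, B) == 2
```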
In a similar fashion we define:

Definition 2.2.7 Let A, C be matrices of size δ × δ and (n − k) × δ respectively. Then (A, C) is called an observable pair if

\[
\operatorname{rank} \Omega_\delta(A,C) = \delta. \tag{2.2.8}
\]

If (A, C) is an observable pair then we call the smallest integer ν having the property that rank Ω_ν(A, C) = δ the observability index of (A, C).

It is true that the (A, B, C, D) representation of the convolutional code is minimal (in terms of complexity δ) if and only if (A, B) is a controllable pair [61]. Further, the convolutional code will be observable (i.e. non-catastrophic with delay 0) if and only if (A, C) form an observable pair [61]. Hence, it is natural to only consider codes whose representations satisfy these conditions.

Finally, let us translate the result of Proposition 2.2.2 to the input-state-output representation.

Proposition 2.2.8 For T ∈ Gl_δ(F) we have:

\[
C(A, B, C, D) = C(TAT^{-1}, TB, CT^{-1}, D).
\]
2.3 Special Properties of the ISO Representation
In this section, the properties of the ISO representation (2.2.1) will be discussed. In particular, for the special case when the rank of D is k, which forces the rate of the code to be at most 1/2, it is possible to `invert' the system. That is, the system can be rewritten so that the output drives the state and the input. The connections between these two representations, since they will play a fundamental role in what is to follow, will be discussed here. The connections between these representations and the other "classical" representations of convolutional codes will follow later in Chapter 3.
Let us consider a convolutional code defined by the matrices (A, B, C, D) as in (2.2.1). For this section we will assume that rank D = k = n − k. However, we will note that what follows can also be done with some minor modifications for the case where rank D = k < n − k. This situation will not be of use to us in this dissertation and it will be much clearer if those details are omitted here.
By simple algebraic manipulations of (2.2.1), we arrive at the following representation.

Proposition 2.3.1 (Output-State-Input Representation) Given the (A, B, C, D) representation with the conditions discussed above (most importantly that D is invertible), the following linear system defines the same convolutional code.

\[
\begin{aligned}
x_{t+1} &= (A - BD^{-1}C)x_t + BD^{-1}y_t, \\
u_t &= -D^{-1}Cx_t + D^{-1}y_t, \\
v_t &= \begin{pmatrix} y_t \\ u_t \end{pmatrix}, \qquad x_0 = 0.
\end{aligned} \tag{2.3.9}
\]

Proof: Clear, since the defining equations are simple algebraic manipulations of the previous ones.

In most practical cases, D can be chosen to be the identity, which significantly simplifies the above notation. Also, we will employ the following shorthand notation for the above matrices.

\[
\tilde{A} := A - BD^{-1}C, \qquad \tilde{B} := BD^{-1}, \qquad \tilde{C} := -D^{-1}C, \qquad \tilde{D} := D^{-1}.
\]
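The inversion is easy to check numerically. The sketch below runs the binary realization of Example 3.4.2 forward with (A, B, C, D), then recovers the input sequence from the output alone using (Ã, B̃, C̃, D̃); over GF(2) with D = 1 the inverses are trivial and the minus signs vanish, and the helper name `run` is our own.

```python
import numpy as np

# Realization from Example 3.4.2 (binary; D = 1, so D^{-1} = 1).
A = np.array([[1, 1], [1, 0]]); B = np.array([[1], [0]])
C = np.array([[1, 0]])

At = (A - B @ C) % 2      # A~ = A - B D^{-1} C
Bt, Ct = B, C             # B~ = BD^{-1} = B, C~ = -D^{-1}C = C mod 2

def run(A_, B_, C_, inputs):
    """Iterate x_{t+1} = A_ x_t + B_ w_t, out_t = C_ x_t + w_t over GF(2)."""
    x, out = np.zeros((2, 1), dtype=int), []
    for w in inputs:
        out.append((int((C_ @ x)[0, 0]) + w) % 2)
        x = (A_ @ x + B_ * w) % 2
    return out

u = [1, 1, 0, 1, 0, 0, 1]
y = run(A, B, C, u)               # input-state-output: u drives y
assert run(At, Bt, Ct, y) == u    # output-state-input: y recovers u
```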
The following connection between A and Ã will be useful.

Lemma 2.3.2 For i ≥ 1 we have

\[
A^i = \tilde{A}^i +
\begin{pmatrix} \tilde{A}^{i-1}\tilde{B} & \tilde{A}^{i-2}\tilde{B} & \cdots & \tilde{A}\tilde{B} & \tilde{B} \end{pmatrix}
\begin{pmatrix} C \\ CA \\ \vdots \\ CA^{i-1} \end{pmatrix}.
\]

Proof: By induction on i. For i = 1, the result follows immediately from the definition of Ã. Assume the result holds for i ≤ j and show for i = j + 1. Hence, we have

\[
\tilde{A}^j = A^j - \Phi_j(\tilde{A}, \tilde{B})\, \Omega_j(A, C).
\]

Multiplying on the left by (A − BD^{-1}C) gives

\[
\tilde{A}^{j+1} = A^{j+1} - BD^{-1}CA^j -
\begin{pmatrix} \tilde{A}^{j}\tilde{B} & \tilde{A}^{j-1}\tilde{B} & \cdots & \tilde{A}^{2}\tilde{B} & \tilde{A}\tilde{B} \end{pmatrix}
\Omega_j(A, C).
\]

The new `extraneous' term is exactly the term `missing' from the matrix multiplication. Realizing this gives the desired result.

The output-state-input representation serves just as well as the traditional input-state-output description. In particular, a local description of the trajectories can be obtained in the same way as Proposition 2.2.4 by simply replacing (A, B, C, D) with (Ã, B̃, C̃, D̃). The same substitution into (2.2.2) will give another equation for the state based on the new representation. Similarly, the same is true for the global description of the trajectories provided in Proposition 2.2.5. Of course, we may also define the controllability matrices Φ_γ(Ã, B̃), and observability matrices Ω_γ(Ã, C̃), with respect to this representation. It is natural to ask how these various objects from the two representations are related. It happens that the relationship is quite nicely described, and we will do so in the following sequence of lemmas.
Lemma 2.3.3 Recall the definition of M_γ in (2.2.3). Then

\[
M_\gamma(A, B, C, D) = M_\gamma(\tilde{A}, \tilde{B}, \tilde{C}, \tilde{D})^{-1}.
\]

Proof: We will show that M_γ(Ã, B̃, C̃, D̃) M_γ(A, B, C, D) = I_{γk}. The fact that the entries above the diagonal are 0 is trivial since we are multiplying two lower triangular matrices. The diagonal consists of all 1's since D̃D = I_k. The next lower diagonal is all 0's since C̃B̃D + D̃CB = −D^{-1}CBD^{-1}D + D^{-1}CB = 0. All lower diagonals are zero since each block entry can be written in the form

\[
\tilde{C} \left[ \tilde{A}^j + \Phi_j(\tilde{A}, \tilde{B})\, \Omega_j(A, C) - A^j \right] B.
\]

An immediate application of Lemma 2.3.2 shows this to be 0. The fact that the reverse multiplication also gives the identity is clear, but can also be proven directly by obtaining an analog of Lemma 2.3.2 with the `tildes' and `non-tildes' switched.
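The identity of Lemma 2.3.3 can be spot-checked numerically. The sketch below builds M_γ for both representations from the binary realization of Example 3.4.2 (D = 1, so the tilde matrices need no explicit inverses mod 2) and verifies that their product is the identity; `M_gamma` is a helper name of ours.

```python
import numpy as np

def M_gamma(A, B, C, D, g):
    """Block lower-triangular matrix M_gamma of (2.2.3), arithmetic mod 2."""
    k = B.shape[1]
    M = np.zeros((g * k, g * k), dtype=int)
    for i in range(g):
        for j in range(i + 1):
            blk = D if i == j else C @ np.linalg.matrix_power(A, i - j - 1) @ B
            M[i*k:(i+1)*k, j*k:(j+1)*k] = blk % 2
    return M

# Realization from Example 3.4.2; D = (1), so all inverses are trivial mod 2.
A = np.array([[1, 1], [1, 0]]); B = np.array([[1], [0]])
C = np.array([[1, 0]]);         D = np.array([[1]])
At = (A - B @ C) % 2            # A~ = A - B D^{-1} C
Bt, Ct, Dt = B, (-C) % 2, D     # B~, C~ = -D^{-1}C, D~ = D^{-1}

g = 4
prod = (M_gamma(At, Bt, Ct, Dt, g) @ M_gamma(A, B, C, D, g)) % 2
assert np.array_equal(prod, np.eye(g, dtype=int))
```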
Lemma 2.3.4

\[
\Phi_\gamma(A, B)\, M_\gamma(\tilde{A}, \tilde{B}, \tilde{C}, \tilde{D}) = \Phi_\gamma(\tilde{A}, \tilde{B}),
\]
\[
M_\gamma(\tilde{A}, \tilde{B}, \tilde{C}, \tilde{D})\, \Omega_\gamma(A, C) = -\Omega_\gamma(\tilde{A}, \tilde{C}).
\]

Proof: Both statements are a direct consequence of Lemma 2.3.2.
The above lemmas show that the input-state-output representation is very naturally related to the output-state-input representation. In Chapter 3, we will see how these representations are related to the more classical representations of convolutional codes.
2.4 An Algebraic Model for Turbo Codes
Turbo codes were introduced by Berrou, Glavieux and Thitimajshima in 1993 [14]. Their idea of using parallel concatenation of recursive systematic convolutional codes (RSCs) (see Definition 3.1.2) with an interleaver was a major step in terms of achieving low bit error rates (BERs) at signal-to-noise ratios near the Shannon limit. (See Chapter 5 for a more thorough explanation of these terms.) However, their ideas were originally met with skepticism, partly because of the phenomenal performance of the codes and partly because it was not clear why they worked. Many papers have since been devoted to these questions, e.g. [54, 20, 21, 19]. Much of this work has focused on the improved weight distribution obtained by using a suitable (usually random) interleaver. Others focused on developing a more suitable soft-decision decoder to be used in the turbo-decoding process, e.g. [13, 31]. This section will focus on developing turbo codes in the framework of the input-state-output representation for convolutional codes.
We will give a brief review of turbo codes. We will then present the natural generator and parity check descriptions of turbo codes that follow from the local description of Proposition 2.2.4. The decoding of turbo codes using this new description will be discussed in Chapter 6.
2.4.1 A Brief Review of Turbo Codes
Here the basic structure of turbo codes will be reviewed. The following is the design of the most simple of turbo codes. Only two parallel identical RSCs are used and no puncturing of the output bits is employed. Turbo codes could have many parallel RSCs, but typically only two are used, and even more typically, the RSCs will be rate 1/2 and be identical. Puncturing is used very often to improve the rate of the code by `splicing' together the separate output streams. We will omit this part of the turbo coding scheme simply for clarity of presentation.
The input, u, to the turbo encoder is used in 3 ways. First, it is sent directly to the channel as part of the codeword. Thus, turbo codes are systematic. Second, the input is sent into the first RSC and the output, say y, of this convolutional encoder (separated from the input) is sent along the channel as the second part of the codeword. Finally, the input is also sent into an interleaver or permutation S, which rearranges the order of the input bits. The scrambled input bits are then sent into the RSC and its output, say ŷ, is sent as the final piece of the codeword. This general scheme is outlined in Figure 2.1.
[Figure: the input u is sent directly to the channel; it also enters the first RSC, producing y, and passes through the interleaver S into the second RSC, producing ŷ.]

FIGURE 2.1: The basic encoding structure of turbo codes.
Here are a few points worth mentioning. First, the overall rate of the code as just described is 1/3, although typically, as mentioned above, puncturing is used to increase the rate to 1/2. Secondly, the size of the interleaver, S, plays a crucial role. The interleaver is just a known permutation on N elements, or input bits in this case. So we need to wait for each block of N input bits before we encode. Unfortunately, the best results of turbo coding come when N is extremely large. A typical choice for N is about 65,000. Needless to say this is not well suited to certain applications where delays of information cannot be tolerated. Regardless, some work has been done for smaller N, e.g. [33, 32].
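The encoding scheme of Figure 2.1 can be sketched in a few lines. The code below uses the RSC realization of Example 2.4.5 and a deliberately tiny N; the permutation chosen here is an arbitrary stand-in (it is not claimed to satisfy the zero-state condition discussed in the next subsection), and the names `rsc_encode` and `turbo_encode` are ours.

```python
import numpy as np

# RSC realization from Example 2.4.5 (binary; arithmetic mod 2, D = 1).
A = np.array([[0, 1], [1, 1]]); B = np.array([[0], [1]])
C = np.array([[0, 1]])

def rsc_encode(u_bits):
    """Run the RSC (2.2.1) from the zero state on one block of input bits."""
    x, out = np.zeros((2, 1), dtype=int), []
    for u in u_bits:
        out.append((int((C @ x)[0, 0]) + u) % 2)  # y_t = C x_t + D u_t
        x = (A @ x + B * u) % 2
    return out

def turbo_encode(u_bits, perm):
    """Rate-1/3 turbo codeword (u, RSC(u), RSC(S u)), no puncturing."""
    y = rsc_encode(u_bits)
    y_hat = rsc_encode([u_bits[p] for p in perm])
    return list(u_bits) + y + y_hat

perm = [2, 0, 4, 1, 3]          # arbitrary illustrative interleaver on N = 5 bits
u = [1, 0, 1, 1, 0]
v = turbo_encode(u, perm)
assert len(v) == 3 * len(u)     # overall rate 1/3
assert v[:5] == u               # systematic part of the codeword
```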
2.4.2 A New Description of Turbo Codes
Now we will use the representation of a convolutional code presented in Section 2.2 to obtain representations of turbo codes. In particular, we will obtain a generator matrix and parity check matrix description of turbo codes.
First, let us develop our turbo code (as described in Figure 2.1). To do this we specify the size of our interleaver and choose an interleaver. We will arbitrarily denote the size as N and the interleaver by S. Next we must specify which RSC we will use. Of course, this can be done by specifying scalar matrices (A, B, C, D) of the appropriate size. A concrete example will be presented at the end of this section.
Although not necessarily required, we will make the typical assumption that after each N time intervals BOTH RSCs will be back in the all zero state. (The paper by Blackert et al. [16] gives conditions on the interleaver which guarantee this is possible. For this dissertation, we will assume all interleavers are chosen with this property. Other inquiries into interleaver design are given in [10, 11].) This will make more clear the prevailing notion that turbo codes can be thought of as large block codes.
We can represent each N-block of output from the RSC as follows using Proposition 2.2.4. For τ ≥ 0 we have:
\[
\begin{pmatrix} y_{\tau N} \\ y_{\tau N+1} \\ \vdots \\ y_{\tau N+N-1} \end{pmatrix}
=
\begin{pmatrix}
D & 0 & \cdots & & 0 \\
CB & D & \ddots & & \vdots \\
CAB & CB & \ddots & & \vdots \\
\vdots & & \ddots & \ddots & 0 \\
CA^{N-2}B & CA^{N-3}B & \cdots & CB & D
\end{pmatrix}
\begin{pmatrix} u_{\tau N} \\ u_{\tau N+1} \\ \vdots \\ u_{\tau N+N-1} \end{pmatrix}. \tag{2.4.10}
\]

Reverting to our earlier notation, the large matrix on the right hand side of the above equation will be denoted M_N(A, B, C, D), or simply M. We are now ready to give a parity check matrix and a pseudo-generator matrix description of the turbo code. The proofs of these assertions are straightforward.
Theorem 2.4.1 A vector V ∈ F^{3N} is a valid N-block of the turbo code if and only if

\[
\begin{bmatrix}
\Phi_N(A, B) & 0 & 0 \\
-M & I & 0 \\
-MS & 0 & I
\end{bmatrix} V = 0,
\]

where S is the interleaver (permutation) matrix. Thus any sequence in F is a codeword of the turbo code if and only if it is a sequence of valid N-blocks.

Lemma 2.4.2 If u ∈ F^N and u ∈ ker Φ_N(A, B), then the image Gu, where

\[
G = \begin{bmatrix} I \\ M \\ MS \end{bmatrix},
\]

is a valid N-block of the turbo code.
This description is not satisfying since the input, u, is constrained. This can be overcome with the following observation. With reasonable assumptions on A and B, including that (A, B) be a controllable pair and that the characteristic polynomial of A equals the minimum polynomial, the Cayley–Hamilton theorem implies that Φ_N(A, B)u = 0 if and only if p_A(s) | u(s), where p_A(s) is the minimum polynomial of the matrix A, and u(s) is the polynomial whose coefficients are the entries of the vector u. We know the degree of p_A(s) is δ. We write p_A(s) = p_δ s^δ + … + p_1 s + p_0. We define the following matrix, P, of size N × (N − δ) (hence, we assume N > δ) with the property that the image of P is precisely the kernel of Φ_N(A, B):

\[
P = \begin{bmatrix}
p_0 & & & 0 \\
p_1 & p_0 & & \\
\vdots & p_1 & \ddots & \\
p_\delta & \vdots & \ddots & p_0 \\
 & p_\delta & & p_1 \\
 & & \ddots & \vdots \\
0 & & & p_\delta
\end{bmatrix}
\]

We are now ready to present the generator matrix description of a turbo code.

Theorem 2.4.3 The valid N-blocks for the turbo code are generated by the following matrix G; i.e., for any v ∈ F^{N−δ}, Gv is a valid N-block. Also every valid N-block is of this form.

\[
G = \begin{bmatrix} P \\ MP \\ MSP \end{bmatrix}
\]

Proof: Follows from the definition of P and Lemma 2.4.2.
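The defining property of P — that its image lies in the kernel of Φ_N(A, B) — can be checked directly on the example developed below (Example 2.4.5, with p_A(s) = s² + s + 1 and N = 5); `phi_N` and `p_matrix` are helper names of ours.

```python
import numpy as np

def phi_N(A, B, N):
    """Phi_N(A, B) = [A^{N-1}B ... AB B] over GF(2)."""
    return np.hstack([np.linalg.matrix_power(A, N - 1 - i) @ B
                      for i in range(N)]) % 2

def p_matrix(coeffs, N):
    """N x (N - delta) matrix whose columns are shifts of (p_0, ..., p_delta)."""
    delta = len(coeffs) - 1
    P = np.zeros((N, N - delta), dtype=int)
    for col in range(N - delta):
        P[col:col + delta + 1, col] = coeffs
    return P

# Example 2.4.5: A = [[0,1],[1,1]], B = (0,1)^T, p_A(s) = s^2 + s + 1, N = 5.
A = np.array([[0, 1], [1, 1]])
B = np.array([[0], [1]])
P = p_matrix([1, 1, 1], 5)
assert (phi_N(A, B, 5) @ P % 2 == 0).all()   # im P lies in ker Phi_5(A, B)
```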
Remark 2.4.4 In the above generator matrix, the input v is not the input to the encoder; rather Gv is the input to the encoder.

Example 2.4.5 A typical RSC chosen for binary turbo codes is given by [1  (s²+1)/(s²+s+1)]. We will develop the representations of this section for this encoder. First, the matrices (A, B, C, D) are given by:

\[
A = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad
C = [\,0\ 1\,], \quad D = [\,1\,].
\]

Clearly, and not surprisingly, p_A(s) = s² + s + 1. For the sake of exposition, we will choose a ridiculously small value N = 5. The matrix P and the matrix M are:

\[
P = \begin{bmatrix}
1 & 0 & 0 \\
1 & 1 & 0 \\
1 & 1 & 1 \\
0 & 1 & 1 \\
0 & 0 & 1
\end{bmatrix}
\qquad
M = \begin{bmatrix}
1 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 \\
1 & 0 & 1 & 1 & 1
\end{bmatrix}
\]
CHAPTER 3
CONNECTIONS BETWEEN REPRESENTATIONS
In this chapter we will present some examples of convolutional codes using each of the various representations. In particular, example codes will be developed with the generator matrix description. From there, it will be shown how to transform this description into a parity check description and input-state-output description. Also, the generator sequence representation and electronic shift register descriptions will be introduced and derived from the generator description. More importantly, codes developed using the input-state-output representation will be transformed into generator and parity check descriptions.
3.1 Generator Matrices and Sequences
We have already seen in Definition 1.3.1 that a convolutional code can be described in terms of a polynomial generator matrix, G(s). Before proceeding any further let us give some concrete examples.

Example 3.1.1 Let us define two binary convolutional encoders, G^1(s) and G^2(s), by

\[
G^1(s) := \begin{bmatrix} s+1 & s^3+s+1 \end{bmatrix}
\]

\[
G^2(s) := \begin{bmatrix} 1 & 0 & s^2+1 \\ 0 & 1 & s \end{bmatrix}
\]

For G^1(s) it is clear that m = ν_1 = 3 and δ(G^1(s)) = 3. For G^2(s) it is easy to see that m = ν_1 = 2, ν_2 = 1 and δ(G^2(s)) = 3. Let us denote the codes they describe by C_1 and C_2 respectively. It is easy to check that δ(C_1) = 3 and that δ(C_2) = 2. Hence, G^2(s) is not a minimal encoder. The `problem' with this encoder is that G^2(s) is not a minimal basis of the rational vector space generated by its rows in the sense of [24]. This issue will be touched upon in Chapter 4.

Definition 3.1.2 A generator matrix is systematic if it is in the form [ G̃(s)  I ], where G̃(s) has entries in F(s). The bits of the codeword corresponding to the identity matrix are called the information bits of the codeword since they are exactly the bits of the message, or information, vector. A generator matrix is said to be recursive if it contains non-polynomial entries.

We can see that G^2(s) is systematic but G^1(s) is not.
For any generator matrix, we may write in the obvious way:

\[
G(s) = G_m s^m + G_{m-1} s^{m-1} + \cdots + G_1 s + G_0
\]

where each G_i is simply the k × n matrix whose entries are the scalar coefficients of s^i in the corresponding polynomial entries of G(s). In doing so, the sequence

\[
G_0\ G_1\ \ldots\ G_{m-1}\ G_m
\]

is called a generating sequence of the code. It is clear that there is a bijection between generator matrices and generating sequences and that transformation between the two is obvious.
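The coefficient extraction is mechanical. As a sketch, the generating sequence of G^1(s) from Example 3.1.1 can be read off symbolically (the name `G_seq` is ours):

```python
import sympy as sp

s = sp.symbols('s')
# G^1(s) from Example 3.1.1; a 1 x 2 polynomial generator matrix.
G = sp.Matrix([[s + 1, s**3 + s + 1]])

m = max(sp.degree(e, s) for e in G)
# G_i is the k x n matrix of coefficients of s^i (over GF(2), so reduce mod 2).
G_seq = [G.applyfunc(lambda e: sp.expand(e).coeff(s, i) % 2)
         for i in range(m + 1)]

assert G_seq[0] == sp.Matrix([[1, 1]])   # G_0
assert G_seq[1] == sp.Matrix([[1, 1]])   # G_1
assert G_seq[2] == sp.Matrix([[0, 0]])   # G_2
assert G_seq[3] == sp.Matrix([[0, 1]])   # G_3 = G_m
```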
Using generating sequences it is easy to write down a scalar generating matrix (as for block codes) for a convolutional code. However, since convolutional codes can have message words and codewords of arbitrary length, the generator matrix must accommodate this. We arrive at the following semi-infinite generator matrix description.

Definition 3.1.3

\[
G = \begin{bmatrix}
G_0 & G_1 & G_2 & \cdots & G_m & & \\
0 & G_0 & G_1 & \cdots & G_{m-1} & G_m & \\
0 & 0 & G_0 & \cdots & G_{m-2} & G_{m-1} & G_m \\
 & & & \ddots & & & \ddots
\end{bmatrix}
\]
3.2 Parity Check Matrices
Since we can also define a linear subspace as the kernel of a matrix, it is clear that a convolutional code can be described as the kernel of a polynomial matrix. The matrix is called a parity check matrix or syndrome former. If one has a generator matrix for a convolutional code, how does one obtain a parity check matrix? The answer is a rather simple application of linear algebra. Take the generator matrix G(s) and append the n × n identity matrix below it. Then perform elementary (over F[s]) column operations on this matrix until the top k rows come into the form [ L(s) | 0 ]. Then the bottom n rows and last n − k columns form the transpose of a parity check matrix. One may refer to [40, 39, 23] for a more in depth discussion.

Remark 3.2.1 The matrix L(s) is a greatest common left divisor of the matrix G(s). This concept will be explained thoroughly in Chapter 4. For now it suffices to know that if L(s) is I_k, then the above algorithm also gives a polynomial inverse, G^{-1}(s), for the encoder G(s) in the first k columns of the bottom n rows. If L(s) is not the identity, then G(s) is catastrophic as we have defined it before, and more importantly for our purposes here, the parity check matrix will, depending on how we choose to define a convolutional code, actually define a larger code than the one defined by G(s). This point will be illustrated in the following examples.
Example 3.2.2 [Continuation of Example 3.1.1] For G^1(s), we have

\[
\begin{bmatrix} s+1 & s^3+s+1 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}
\quad \text{after column operations is} \quad
\begin{bmatrix} 1 & 0 \\ s^2+s & s^3+s+1 \\ 1 & s+1 \end{bmatrix}
\]

Hence, H^1(s) = [ s³+s+1   s+1 ]. Similarly, it can be shown that H^2(s) = [ s²+1   s   1 ] is the parity check matrix for G^2(s).
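The defining relation G(s)H(s)^T = 0 over GF(2) is easy to confirm symbolically for both pairs; `is_zero_mod2` is a helper name of ours.

```python
import sympy as sp

s = sp.symbols('s')

# Encoders and parity check matrices from Examples 3.1.1 and 3.2.2.
G1 = sp.Matrix([[s + 1, s**3 + s + 1]])
H1 = sp.Matrix([[s**3 + s + 1, s + 1]])
G2 = sp.Matrix([[1, 0, s**2 + 1], [0, 1, s]])
H2 = sp.Matrix([[s**2 + 1, s, 1]])

def is_zero_mod2(M):
    """True if every entry of the polynomial matrix vanishes over GF(2)."""
    return all(sp.Poly(e, s, modulus=2).is_zero for e in M)

assert is_zero_mod2(G1 * H1.T)
assert is_zero_mod2(G2 * H2.T)
```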
Example 3.2.3 Consider the binary catastrophic convolutional encoder

\[
G^3(s) := \begin{bmatrix} s & 0 & s \\ 1 & s+1 & s^2+s+1 \end{bmatrix}
\]

After performing the above algorithm we arrive at

\[
\begin{bmatrix}
s & 0 & 0 \\
1 & s+1 & 0 \\
1 & 0 & 1 \\
0 & 1 & s \\
0 & 0 & 1
\end{bmatrix}
\]

This shows that the greatest common left divisor, L(s), of G^3(s) is

\[
L(s) = \begin{bmatrix} s & 0 \\ 1 & s+1 \end{bmatrix}.
\]

In Chapter 4, an alternative way based on state space realizations is developed for computing such divisors. However, the above calculation also shows that H^3(s) = [ 1 | s | 1 ]. The polynomial vector [ 1 | 1 | s+1 ] is in the kernel of H^3(s). However, no polynomial vector will multiply by G^3(s) to generate this vector. The rational vector [ 1/(s+1)   1/(s+1) ] does generate this codeword. Therefore, H^3(s) defines a larger convolutional code if we employ Definition 1.3.1 as opposed to the one given in Corollary 2.1.5. This seemingly critical difference in the two definitions is made irrelevant since we have already remarked that catastrophic encoders are to be avoided. Any catastrophic encoder, G(s), can be replaced by a non-catastrophic encoder L^{-1}(s)G(s).

Finally, let us remark that, in parallel with generating sequences, parity check sequences are also employed. Similarly, there exist semi-infinite parity check matrix descriptions of convolutional codes. We will refer to the semi-infinite parity check matrices as syndrome formers and their polynomial counterparts as parity check matrices, although typically the two terms are interchangeable.
[Figure: shift-register circuit realizing the encoder G^1(s).]

FIGURE 3.1: Shift register for C_1.

[Figure: shift-register circuit realizing the encoder G^2(s).]

FIGURE 3.2: Shift register for C_2.
3.3 Shift Registers
Convolutional codes are widely used in digital communication systems today. One of the primary reasons for this is that there exists an efficient method of implementing the encoding process. Shift registers are an extremely simple and fast way to implement binary convolutional encoding, as we shall see.
A shift register, informally, is a circuit consisting of memory elements and modulo-2 adders. (For non-binary codes, constant multipliers can be used as well.) For a rate k/n convolutional encoder with row degrees ν_1, ν_2, …, ν_k, the shift register consists of k rows of memory elements, with ν_i + 1 elements in row i. These memory elements are initially filled with 0's. At the first time interval the k bits, u_i, of the message word are each entered into the first memory element of their respective row. The contents of all the other elements are shifted to the next memory element in the row, with the contents of the last element being discarded. This is the memory side of the encoding process. We still need to address how the codeword is output.
For each time interval, each of the n outputs, y_i, of the convolutional code is formed by adding modulo-2 (or over whatever field the code is defined) the contents of the `appropriate' memory elements. Which elements are `appropriate' is determined, of course, by the generator matrix (or other suitable description of the convolutional code). Suffice it to say that a few examples should make the process clear. The shift registers for C_1 and C_2 are presented in Figures 3.1 and 3.2.
Of course, it is also quite simple to go from a shift register representation to a generator matrix description. The process of interchanging shift registers and (A, B, C, D) representations is also quite natural. This process is detailed nicely in Section 7.3 of [42].
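The shift-register encoding just described can be sketched generically. The code below encodes with G^1(s) = [s+1, s³+s+1] from Example 3.1.1, under the assumption that s acts as a delay operator (the text fixes the convention only through the figures); `shift_register_encode` and the tap lists are our own names.

```python
# Shift-register encoding over GF(2): each output is the mod-2 sum of the
# current input and selected memory elements (the taps).

def shift_register_encode(u_bits, taps_list):
    """taps_list[j] holds the delays feeding output j (exponents of G^1 entries)."""
    memory = [0] * max(max(t) for t in taps_list)  # memory elements, all 0
    out = [[] for _ in taps_list]
    for u in u_bits:
        window = [u] + memory                      # window[d] = u_{t-d}
        for j, taps in enumerate(taps_list):
            out[j].append(sum(window[d] for d in taps) % 2)
        memory = window[:-1]                       # shift; discard last element
    return out

# s + 1 -> delays {0, 1};  s^3 + s + 1 -> delays {0, 1, 3}
y1, y2 = shift_register_encode([1, 0, 0, 0], [[0, 1], [0, 1, 3]])
assert y1 == [1, 1, 0, 0]          # impulse response = coefficients of s + 1
assert y2 == [1, 1, 0, 1]          # coefficients of s^3 + s + 1
```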
3.4 From ISO Representations to Generator Matrices
Given matrices (A, B, C, D) which define a convolutional code by (2.2.1), it is natural to wonder what the generator matrix of the code is. In principle it is an easy task to compute the generator matrix.
If we take any codeword (y(s), u(s)) and its corresponding sequence of states x(s) (represented in polynomial form in the reverse way: (a_0, a_1, …, a_n) ↔ a_0 s^n + a_1 s^{n-1} + ⋯ + a_n), then the following equation holds:

\[
\begin{bmatrix} sI - A & 0 & -B \\ -C & I & -D \end{bmatrix}
\begin{bmatrix} x(s) \\ y(s) \\ u(s) \end{bmatrix} = 0.
\]

Proposition 3.4.1 [58] There exist polynomial matrices X(s), Y(s) and U(s) of sizes δ × k, (n − k) × k and k × k respectively, such that

\[
\ker \begin{bmatrix} sI - A & 0 & -B \\ -C & I & -D \end{bmatrix}
= \operatorname{im} \begin{bmatrix} X(s) \\ Y(s) \\ U(s) \end{bmatrix}.
\]

Also, the matrix G(s) = [ Y(s)'  U(s)' ] forms a polynomial encoder.

In short the above proposition states that a generator matrix can be obtained by computing a basis for the kernel of the matrix on the left hand side of the equation. This can be done by hand, or by computer for larger matrices.
Example 3.4.2 Consider the binary code defined by the matrices:

\[
A := \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}, \quad
B := \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad
C := [\,1\ 0\,], \quad D := [\,1\,]
\]

A simple calculation reveals that G(s) = [ s²+1   s²+s+1 ].
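This calculation can be reproduced symbolically: over GF(2), det(sI − A) and C adj(sI − A)B + D det(sI − A) give the two entries of G(s) (denominator and numerator of the transfer function D + C(sI − A)^{-1}B). A small sketch:

```python
import sympy as sp

s = sp.symbols('s')
A = sp.Matrix([[1, 1], [1, 0]])
B = sp.Matrix([[1], [0]])
C = sp.Matrix([[1, 0]])
D = sp.Matrix([[1]])

sIA = s * sp.eye(2) - A
den = sIA.det()                                   # det(sI - A)
num = (C * sIA.adjugate() * B)[0, 0] + D[0, 0] * den

# Reduce both polynomials mod 2 and compare with Example 3.4.2.
assert sp.Poly(den, s, modulus=2) == sp.Poly(s**2 + s + 1, s, modulus=2)
assert sp.Poly(num, s, modulus=2) == sp.Poly(s**2 + 1, s, modulus=2)
```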
The next issue is to go from a generator matrix description to a first order representation. This process is called realization and was first discussed in Section 2.2. The actual process is simplified if we have a parity check matrix rather than a generator matrix. Then obtaining a (K, L, M) description is rather straightforward and is described in detail in [57]. We are more interested, however, in obtaining an (A, B, C, D) representation. We will show how this is done, and give an immediate application, in Chapter 4.
3.5 From the Local Description to Syndrome Formers
In this section we will show how the local description in Proposition 2.2.4 contains, or can easily be made to contain, a parity check matrix, or syndrome former, for the convolutional code. Let us start by recalling the local description in a slightly different form (written compactly with the notation of Section 2.2):

\[
\Omega_{\gamma+1}(A, C)\, x_\tau =
\bigl[\, -M_{\gamma+1}(A, B, C, D) \;\bigm|\; I \,\bigr]
\begin{pmatrix} u_\tau \\ \vdots \\ u_\gamma \\ y_\tau \\ \vdots \\ y_\gamma \end{pmatrix} \tag{3.5.1}
\]
For simplicity of notation, let us restrict ourselves to rate 1/2 codes. Let the minimum polynomial of the matrix A be denoted by p_A(s). To omit degenerate cases we will assume the degree of p_A(s) to be m. Here m is the `memory' of the encoder in the usual sense. This is also equal to δ in the rate 1/n case.

Lemma 3.5.1 By performing elementary row operations on the (m+1)st row block of the above matrix equation in accordance with the polynomial p_A(s) (e.g. if p_A(s) = s³ + s + 1, then add the row block beginning with C and the one beginning with CA to the row block beginning with CA³), the (m+1)st row block of the matrix on the left hand side will be 0. Repeating this operation mutatis mutandis for lower row blocks will eliminate all lower blocks of the matrix on the left hand side. The resulting matrix at and below the (m+1)st row block on the right hand side will be (up to row operations) the parity check matrix or syndrome former (in semi-infinite block form) for the convolutional code.

Proof: The first two statements follow immediately from the definition of minimum polynomial. The last statement is trivial since the lower portion of the resulting matrix has a shifted block structure and, by the resulting equation, multiplies with the code bits to equal 0. Thus, it must be a syndrome former, differing from the `standard' one by at most elementary row operations.
Letting S denote the top m rows of the observability matrix, we arrive at the following situation:

\[
\begin{bmatrix} S \\ 0 \end{bmatrix} x_\tau =
\begin{bmatrix} \text{State Equations} \\ \text{Syndrome Former} \end{bmatrix}
(\text{Codeword}) \tag{3.5.2}
\]

Simply put, the above equation says that the top portion of the matrix consists of equations on the bits of the codeword that depend on the state at any given time τ. The bottom portion of the matrix consists of the syndrome former matrix, as shown in Lemma 3.5.1. Let us clarify these points with a few examples.
Example 3.5.2 Let us consider the following code over F_2:

\[
A = \begin{bmatrix}
0&0&0&0&0 \\ 1&0&0&0&0 \\ 0&1&0&0&0 \\ 0&0&1&0&0 \\ 0&0&0&1&0
\end{bmatrix}
\quad
B = \begin{bmatrix} 1\\0\\0\\0\\0 \end{bmatrix}
\]

\[
C = [\,0\ 0\ 1\ 1\ 1\,] \qquad D = [\,1\,]
\]

\[
\begin{bmatrix}
0&0&1&1&1 \\ 0&1&1&1&0 \\ 1&1&1&0&0 \\ 1&1&0&0&0 \\ 1&0&0&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&0
\end{bmatrix} x_\tau =
\begin{bmatrix}
1&0&0&0&0&0&0&0 & 1&0&0&0&0&0&0&0 \\
0&1&0&0&0&0&0&0 & 0&1&0&0&0&0&0&0 \\
0&0&1&0&0&0&0&0 & 0&0&1&0&0&0&0&0 \\
1&0&0&1&0&0&0&0 & 0&0&0&1&0&0&0&0 \\
1&1&0&0&1&0&0&0 & 0&0&0&0&1&0&0&0 \\
1&1&1&0&0&1&0&0 & 0&0&0&0&0&1&0&0 \\
0&1&1&1&0&0&1&0 & 0&0&0&0&0&0&1&0 \\
0&0&1&1&1&0&0&1 & 0&0&0&0&0&0&0&1
\end{bmatrix}
\begin{pmatrix} u_\tau \\ u_{\tau+1} \\ \vdots \\ u_{\tau+7} \\ y_\tau \\ y_{\tau+1} \\ \vdots \\ y_{\tau+7} \end{pmatrix}
\]
Remark 3.5.3 The last 3 rows of the above matrix on the right-hand side are shifted versions of the syndrome former matrix for this encoder. In the above example, since the matrix A has a minimum polynomial of a very nice form, p_A(s) = s⁵, no row operations were necessary to obtain the syndrome former structure on the bottom half of the matrix equation. In general, whenever p_A(s) = s^i (for some positive integer i) the parity parallelogram [43, p. 423] used in definite decoding of convolutional codes is nothing more than the Markov parameters CA^iB [66, p. 213]. As we will see later, similar results hold for the parity triangle of feedback decoding. Let us consider an example where this is not the case.
Example 3.5.4 Let us examine the code given by G(s) = [ s²+1   s²+s+1 ]. We have already seen that a realization of this code is given by:

\[
A = \begin{bmatrix} 1&1 \\ 1&0 \end{bmatrix} \qquad
B = \begin{bmatrix} 1\\0 \end{bmatrix} \qquad
C = [\,1\ 0\,] \qquad D = [\,1\,]
\]

\[
\begin{bmatrix}
1&0 \\ 1&1 \\ 0&0 \\ 0&0 \\ 0&0
\end{bmatrix} x_\tau =
\begin{bmatrix}
1&0&0&0&0 & 1&0&0&0&0 \\
1&1&0&0&0 & 0&1&0&0&0 \\
1&0&1&0&0 & 1&1&1&0&0 \\
0&1&0&1&0 & 0&1&1&1&0 \\
0&0&1&0&1 & 0&0&1&1&1
\end{bmatrix}
\begin{pmatrix} u_\tau \\ u_{\tau+1} \\ u_{\tau+2} \\ u_{\tau+3} \\ u_{\tau+4} \\ y_\tau \\ y_{\tau+1} \\ y_{\tau+2} \\ y_{\tau+3} \\ y_{\tau+4} \end{pmatrix}
\]
We will further examine the relationship between the syndrome former and the local description in Section 5.5, where we utilize these concepts to explore various majority logic decoding schemes.
CHAPTER 4
A MATRIX EUCLIDEAN ALGORITHM
In this chapter, the question of determining if a generator matrix for a convolutional code is catastrophic, non-catastrophic or observable is addressed. Recall that these terms were discussed in Proposition 1.3.2. As will be discussed below, this problem also has much broader implications and many various applications. Hence, let us step back and analyze this problem in terms of computing common divisors for matrices.
We will develop an efficient algorithm for determining the greatest common left divisor (GCLD) of two polynomial matrices. Knowing this divisor allows for several immediate applications: In coding theory, a non-catastrophic convolutional encoder can be derived from an arbitrary one. In systems theory, irreducible matrix fraction descriptions of transfer function matrices can be found. In linear algebra, the greatest common divisor can be seen as a basis for a free module generated by the columns of the matrices.
The approach taken is based on recent ideas from systems theory. A minimal state space realization is obtained with minimal calculations, and from this the controllability matrix is analyzed to produce the GCLD. It will be shown that the derived algorithm is a natural extension of the Euclidean algorithm to the matrix case.
The results of this chapter were reported by the author in [5].
4.1 Background Material
Let F be an arbitrary field and consider the polynomial ring F[s]. If we are given two polynomial matrices E(s) and F(s), each with k rows, then we may define a greatest common left divisor (GCLD) to be any k × k polynomial matrix L(s) satisfying:
1. There exist polynomial matrices Ē(s) and F̄(s) such that L(s)Ē(s) = E(s) and L(s)F̄(s) = F(s).
2. If L̄(s) is any other divisor of E(s) and F(s) then there exists a polynomial matrix D(s) such that L̄(s)D(s) = L(s).
By an arbitrary choice, we will work with left divisors. The theory holds mutatis mutandis for right divisors.
Notice that GCLDs are not unique. For our applications we will assume that the matrix [E(s) | F(s)] is full rank. This implies that all GCLDs will be nonsingular and differ by a unimodular right factor [35]. Note also that the columns of the GCLD of the full rank polynomial matrix [E(s) | F(s)] form a basis for the free module spanned by the columns of [E(s) | F(s)] in F^k[s]. Two matrices are said to be coprime if their GCLD is a unimodular matrix.
Instead of starting with two separate matrices and then combining them into one, we may be given, as in the case of a convolutional encoder, a single full rank matrix P(s) of size k × n. We can speak of the GCLD of this single matrix by writing P(s) = [E(s) | F(s)], where usually E(s) is of size k × k and F(s) is of size k × (n − k), and hence the GCLD of P(s) is then the GCLD of E(s) and F(s). Obviously the GCLD does not depend on how we choose the division. Equivalently, we could define a GCLD of P(s) to be a matrix L(s) such that L(s)P̄(s) = P(s), where P̄(s) is a polynomial matrix whose Smith, or equivalently, Hermite form is [I_k | 0].
With this last description we are able to see several immediate applications. First, if we are given P(s) as a polynomial basis for a rational vector space [24], then by dividing by L(s) (i.e. taking P̄(s)) we get a minimal polynomial basis for the vector space (as defined in [24]). Secondly, if we are given P(s) as a generating set for its column module over F[s], then we observed earlier that the columns of the GCLD, L(s), of P(s) form a basis of the column module of P(s). In particular, if P(s) is of size 1 × 2 and has the form P(s) = (p(s), q(s)), p(s), q(s) ∈ F[s], then the GCLD of P(s) is nothing else than the greatest common divisor (g.c.d.) of p(s), q(s). Moreover, our algorithm is in this case equivalent to Euclid's algorithm. Finally, if we are given P(s) as a convolutional encoder, then P(s) is an observable (i.e. non-catastrophic with delay 0) encoder if and only if L(s) is a unimodular matrix [2, 18].
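In the scalar case the GCLD is the ordinary polynomial g.c.d., and the matrix algorithm specializes to Euclid's algorithm. A small sketch over GF(2) (the function name `euclid_gcd` is ours) illustrates the 1 × 2 case and the observability criterion:

```python
import sympy as sp

s = sp.symbols('s')

def euclid_gcd(p, q):
    """Euclid's algorithm in F[s]; the GCLD of the 1 x 2 matrix (p, q)."""
    p, q = sp.Poly(p, s, modulus=2), sp.Poly(q, s, modulus=2)
    while not q.is_zero:
        p, q = q, p.rem(q)
    return p.monic()

# The 1 x 2 encoder G^1(s) = [s+1, s^3+s+1] of Example 3.1.1 is coprime,
g = euclid_gcd(s + 1, s**3 + s + 1)
assert g == sp.Poly(1, s, modulus=2)   # unimodular GCLD: non-catastrophic

# while (s+1) * [s+1, s^3+s+1] has GCLD s+1.
g2 = euclid_gcd((s + 1)**2, (s + 1) * (s**3 + s + 1))
assert g2 == sp.Poly(s + 1, s, modulus=2)
```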
Closely related to this last application, we can think of P(s) as describing over the real numbers R a linear behavior in the sense of Willems [69]:

\[
\mathcal{B} = \left\{ w(t) \in C^\infty(\mathbb{R}, \mathbb{R}^n) \;\middle|\; P\!\left(\tfrac{d}{dt}\right) w(t) = 0 \right\}.
\]

The computation of the GCLD is then needed for the computation of the controllable sub-behavior of B.
The approach that will be taken here is to obtain a minimal state space representation of the associated behavior B with little or no calculation [57]. This state space representation will be controllable if and only if our behavioral system (or encoder) is observable. Further, the contribution of this chapter will be to calculate a GCLD of P(s) directly from the controllability matrix of this state space representation. As we shall see, the algorithm presented will be a natural generalization of the Euclidean algorithm to polynomial matrices.
4.2 A Brief History of the Problem
The problem of finding GCLDs is not new, and, indeed, several algorithms exist. The most obvious approach is to append the two matrices together as [E(s) | F(s)] and perform polynomial column operations (over F[s]) to bring the matrix to Smith or Hermite form [8]. The obvious drawback is that polynomial column operations can become quite tedious, especially if the degrees of the polynomial entries are high. This problem was overcome by Kung et al. [38, 15] with their approach using generalized Sylvester matrices. A problem with that algorithm is that the scalar matrices obtained from the original polynomial matrices are often quite large.
Several more recent works, using methods similar to but distinct from the one proposed here, have appeared: Fuhrmann [25] obtained an algorithm using a matrix continued fraction representation, and Antoulas [6] has done considerable work on the subject using recursive and partial realizations.
An excellent reference on the various techniques for computing GCDs in the case k = 1 is [12]. In fact, the section on "G.C.D. Using Companion Matrix" from that book gives exactly our algorithm in the simple case k = 1. Unfortunately, the author of that reference did not observe that the companion matrix was, in fact, a realization of the polynomial matrix. This prevented the extension to the general case, where the author instead presents the algorithm of Kung et al.
4.3 A Realization Algorithm
We now present the realization algorithm we will need, preceded by some notation. For a more thorough account of the ideas involved, please refer to [57].
Partition P(s) into P(s) = [E(s) | F(s)], where E(s) is k × k and F(s) is k × (n−k). After some unimodular row operations we can assume that P(s) is row proper with row degrees (Kronecker indices) ν₁ ≥ ⋯ ≥ ν_k. After a possible right multiplication by an n × n invertible matrix we can assume that the high order coefficient matrix, P_h, has the form [I_k | 0]. Assume that P(s) has no constant rows, i.e. ν_k ≥ 1. For i, j = 1, …, k let

$$e_{i,j}(s) = \sum_{\ell=0}^{\nu_i} e^{\ell}_{i,j}\, s^{\ell}, \qquad f_i(s) = \sum_{\ell=0}^{\nu_i - 1} f^{\ell}_{i}\, s^{\ell}$$

denote the polynomial entries of E(s) and the i-th row of F(s), respectively.
Define for i = 1, …, k matrices of sizes ν_i × ν_i, ν_i × (n−k) and 1 × ν_i respectively:

$$A_{i,i} := \begin{bmatrix} 0 & \cdots & \cdots & 0 & -e^{0}_{i,i} \\ 1 & 0 & & & -e^{1}_{i,i} \\ 0 & 1 & \ddots & & \vdots \\ \vdots & & \ddots & 0 & \vdots \\ 0 & \cdots & 0 & 1 & -e^{\nu_i-1}_{i,i} \end{bmatrix}, \quad B_i := -\begin{bmatrix} f^{0}_{i} \\ f^{1}_{i} \\ \vdots \\ f^{\nu_i-1}_{i} \end{bmatrix}, \quad C_i := [0, \ldots, 0, 1]. \qquad (4.3.1)$$

For i, j = 1, …, k, i ≠ j, define matrices of size ν_i × ν_j:

$$A_{i,j} := \begin{bmatrix} 0 & \cdots & 0 & -e^{0}_{i,j} \\ \vdots & & \vdots & -e^{1}_{i,j} \\ \vdots & & \vdots & \vdots \\ 0 & \cdots & 0 & -e^{\nu_i-1}_{i,j} \end{bmatrix} \qquad (4.3.2)$$

The matrices A_{i,i} are just the companion matrices of the polynomials e_{i,i}(s), while the matrices A_{i,j} are just ν_j − 1 columns of zeroes with the coefficient vector of the polynomial e_{i,j}(s) appended on the right. Similarly, each B_i is just the coefficient vectors of all the polynomials in the i-th row of F(s). So these matrices are obtained with no calculations at all, provided that the matrix P(s) meets the somewhat stringent conditions imposed. If P_h does not have the form P_h = [I_k | 0] then it can be brought into this form with the unimodular operations outlined above.
Notice also the requirement that P(s) have no constant rows. If P(s) has μ constant rows then the row and column operations outlined above will transform P(s) into:

$$\hat{P}(s) = \begin{bmatrix} \hat{E}_1(s) & \hat{E}_2(s) & \hat{F}(s) \\ 0 & I_\mu & 0 \end{bmatrix} \qquad (4.3.3)$$

where [Ê₁(s) | F̂(s)] has no constant rows.
Right unimodular operations will not affect the GCLD; however, left operations will have to be 'undone' once the GCLD of the resulting matrix is calculated. So all of these conditions can be met at the expense of some efficiency. From here on, assume that P(s) meets these requirements.
Theorem 4.3.1 [70, 57] Given P(s) = [E(s) | F(s)] satisfying P_h = [I_k | 0], let A_{i,j}, B_i, C_i be defined as above. Let

$$A := \begin{bmatrix} A_{1,1} & \cdots & A_{1,k} \\ \vdots & \ddots & \vdots \\ A_{k,1} & \cdots & A_{k,k} \end{bmatrix}, \qquad B := \begin{bmatrix} B_1 \\ \vdots \\ B_k \end{bmatrix}, \qquad C := \begin{bmatrix} C_1 & & 0 \\ & \ddots & \\ 0 & & C_k \end{bmatrix},$$

and let σ represent either the shift operator or the differential operator d/dt. Then

σx(t) = Ax(t) + Bu(t),    (4.3.4)
y(t) = Cx(t)

represents a minimal state space realization of the system

E(σ)y(t) + F(σ)u(t) = 0.    (4.3.5)

We see that A has size δ × δ (where δ = Σ_{i=1}^{k} ν_i), B is δ × (n−k), and C is k × δ.
The idea here is that controllability of the state space representation is equivalent to the controllability of the behavioral system given by P(s), which is equivalent to P(s) being an observable encoder [53, 57].
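For k = 1 the block construction above collapses to a single companion matrix, and the realization can be checked numerically. A minimal sketch (helper names and the exact sign convention are ours, chosen so that X(s)(sI − A) = e(s)C and X(s)B = −f(s) hold; the dissertation's convention may differ in inessential ways):

```python
import numpy as np

def realize_scalar(e, f):
    """Companion realization (A, B, C) of e(sigma)y + f(sigma)u = 0 for k = 1.

    e: coefficients [e0, e1, ..., 1] of the monic polynomial e(s), lowest degree first;
    f: coefficients [f0, ..., f_{nu-1}] of f(s), with deg f < deg e = nu.
    """
    nu = len(e) - 1
    A = np.zeros((nu, nu))
    A[1:, :-1] = np.eye(nu - 1)        # ones on the subdiagonal
    A[:, -1] = -np.asarray(e[:-1])     # last column: negated coefficients of e
    B = -np.asarray(f, dtype=float).reshape(nu, 1)
    C = np.zeros((1, nu)); C[0, -1] = 1.0
    return A, B, C

# Check X(s)(sI - A) = e(s)C and X(s)B = -f(s) at a sample point s0.
e = [6.0, -5.0, -2.0, 1.0]             # e(s) = s^3 - 2s^2 - 5s + 6
f = [3.0, 0.0, 1.0]                    # f(s) = s^2 + 3
A, B, C = realize_scalar(e, f)
s0 = 1.7
X = np.array([[1.0, s0, s0**2]])       # basis matrix X(s) = [1, s, ..., s^{nu-1}]
e_s0 = sum(c * s0**i for i, c in enumerate(e))
f_s0 = sum(c * s0**i for i, c in enumerate(f))
assert np.allclose(X @ (s0 * np.eye(3) - A), e_s0 * C)
assert np.allclose(X @ B, -f_s0)
```

The point of the construction is visible here: A, B, C are copied straight out of the coefficients of e and f, with no computation.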
The relationship between the polynomial matrix P(s) and the matrices (A, B, C) is expressed in the following way: Consider the k × δ matrix

$$X(s) = \begin{bmatrix} 1 \;\; s \;\; \cdots \;\; s^{\nu_1-1} & 0 & \cdots & 0 \\ 0 & 1 \;\; s \;\; \cdots \;\; s^{\nu_2-1} & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & 1 \;\; s \;\; \cdots \;\; s^{\nu_k-1} \end{bmatrix} \qquad (4.3.6)$$

which was called a basis matrix of size ν = [ν₁, …, ν_k] in [57], since it has the property that every polynomial k-vector φ(s) ∈ F^k[s] whose i-th component has degree at most ν_i − 1 can uniquely be described through φ(s) = X(s)α, α ∈ F^δ.
A direct calculation reveals that P(s) and the matrices (A, B, C) are related by:

$$X(s)\,[\, sI - A \mid B \,] = P(s) \begin{bmatrix} C & 0 \\ 0 & -I_{n-k} \end{bmatrix} \qquad (4.3.7)$$

Of course, we can multiply X(s) by an invertible matrix T ∈ GL_δ on the right and obtain the equivalent realization (T⁻¹AT, T⁻¹B, CT). We will make use of this fact in Section 4.5 to obtain a more suitable realization.
4.4 The Controllability Space
We are given a k × n full rank polynomial matrix P(s) and wish to determine its GCLD, L(s). Write P(s) = L(s)P̄(s), where P̄(s) has Smith form [I_k | 0]. We will assume that the rows of P̄(s) form a minimal basis in the sense of Forney [24]. The row degrees (ν̄₁, …, ν̄_k) of P̄(s) are therefore the minimal indices of the rational vector space generated by the rows of the matrix P(s). We will not assume that (ν̄₁, …, ν̄_k) are ordered by size. Also write P̄(s) = [Ē(s) | F̄(s)] and let P̄_h = [I_k | 0] be the high order coefficient matrix.
Since L(s) is determined uniquely up to unimodular right multiplication, we have a choice as to which L(s) to work with, and hence which P̄(s) to work with. The following lemma relates L(s) and P̄(s) and it singles out a nice choice:
Lemma 4.4.1 If the rows of P̄(s) form a minimal basis having row degrees ν̄₁, …, ν̄_k, then L(s) is uniquely determined from the identity P(s) = L(s)P̄(s). The (i, j)-entry of L(s) has degree at most (ν_i − ν̄_j) or the entry is zero.
It is possible to choose P̄(s) such that the scalar matrix L₁ whose (i, j)-entry is the coefficient of s^{ν_i − ν̄_j} in the (i, j)-entry of L(s) is lower triangular.
Proof: The first part of the lemma is a direct consequence of [24]. The second part will be established by induction. Using elementary column operations on L(s) (this corresponds to elementary row operations on P̄(s)) it is possible to eliminate all entries of the first row of L₁ with the exception of one entry. After a possible permutation of the columns we can assume that the first row of L₁ has, with the exception of the entry (1, 1), all entries equal to zero. Proceeding inductively row by row establishes the claim. □
Let P̄_h be the high order coefficient matrix of P̄(s). From the fact that both P_h and P̄_h have rank k and from the identity P_h = L₁P̄_h it follows that L₁ is invertible. As a direct consequence we have:
Lemma 4.4.2 Let d := Σ ν̄_i be the McMillan degree of P̄(s). Then

deg det L(s) = δ − d = δ − rank C(A, B).

Lemma 4.4.2 establishes a first relation between the GCLD, L(s), and the controllability matrix C(A, B). It should be noted that this result, for the case k = 1, was known already in 1950 by MacDuffee [45], if not earlier. It is the goal of this and the next section to show that under certain conditions it is possible to compute L(s) from the column space of C(A, B), i.e. from the reachability space of (A, B).
Since the high order coefficient matrix of P(s) has the form [I_k | 0] we can realize P(s) by inspection to obtain the scalar matrices A, B, and C relative to the basis matrix X(s). Hence the following equation holds:

X(s)[sI − A | B] = [E(s)C | −F(s)]    (4.4.8)

Note that in order to realize P̄(s), we need P̄_h = [I_k | 0] and that the row degrees, ν̄_i, of P̄(s) are at least one. To satisfy the first requirement, in general, we will have to multiply Ē(s) by a scalar matrix T to obtain a realizable form, i.e. P̄_r(s) = [Ē(s)T | F̄(s)]. The second requirement cannot be guaranteed. For this section and the following one we will assume that P̄(s) has no constant rows; the case where there are constant rows is considered in Section 4.6.
Now, realize P̄_r(s) to obtain matrices Ā, B̄, and C̄ relative to the canonical basis matrix X̄(s). Hence the following equation holds:

X̄(s)[sI − Ā | B̄] = [Ē(s)T C̄ | −F̄(s)]    (4.4.9)

The controllability matrix of the pair (Ā, B̄) can also be computed. However, the usual definition of the controllability matrix is that of a d × d(n−k) matrix, where B̄ is of size d × (n−k). We can, however, naturally extend the size of this matrix to d × δ(n−k). This is necessary for the following key result.
Theorem 4.4.3

L(s)X̄(s)C(Ā, B̄) = X(s)C(A, B)

Proof: Repeated applications of (4.4.8) give:

$$X(s)\,\mathcal{C}(A,B) = -\left[\, F \;\big|\; sF + ECB \;\big|\; \cdots \;\big|\; s^{\delta-1}F + s^{\delta-2}ECB + s^{\delta-3}ECAB + \cdots + ECA^{\delta-2}B \,\right]$$

Repeated applications of (4.4.9) give:

$$L(s)\bar{X}(s)\,\mathcal{C}(\bar{A},\bar{B}) = -\left[\, F \;\big|\; sF + ET\bar{C}\bar{B} \;\big|\; \cdots \;\big|\; s^{\delta-1}F + s^{\delta-2}ET\bar{C}\bar{B} + s^{\delta-3}ET\bar{C}\bar{A}\bar{B} + \cdots + ET\bar{C}\bar{A}^{\delta-2}\bar{B} \,\right]$$

By examining the above expressions, it is clear that the only step remaining in the proof is to show that CA^iB = T C̄Ā^iB̄ for all non-negative integers i.
We notice that (sI − A)⁻¹ = Σ_{i=0}^{∞} A^i/s^{i+1}. Starting with the equation X(s)(sI − A) = E(s)C, we apply this inverse to obtain X(s) = E(s) Σ_{i=0}^{∞} CA^i/s^{i+1}. Further:

$$-F(s) = X(s)B = E(s) \sum_{i=0}^{\infty} \frac{CA^iB}{s^{i+1}}.$$

Similarly, we have the equation:

$$-\bar{F}(s) = \bar{X}(s)\bar{B} = \bar{E}(s)T \sum_{i=0}^{\infty} \frac{\bar{C}\bar{A}^i\bar{B}}{s^{i+1}}.$$

Multiplying the last equation by L(s) results in

$$-F(s) = E(s) \sum_{i=0}^{\infty} \frac{T\bar{C}\bar{A}^i\bar{B}}{s^{i+1}}.$$

Since E(s) has high order coefficient matrix I_k, the columns of E(s) are linearly independent over F[s] and we get:

$$\sum_{i=0}^{\infty} \frac{CA^iB}{s^{i+1}} = \sum_{i=0}^{\infty} \frac{T\bar{C}\bar{A}^i\bar{B}}{s^{i+1}}.$$

Equating coefficients in the above expression gives us the desired equality and completes the proof. □
This theorem is the key to the entire algorithm, as the following corollary shows.
Corollary 4.4.4 There exists an invertible matrix W ∈ GL_{(n−k)δ} such that

$$X(s)\,\mathcal{C}(A,B)\,W = \left[\, s^{\bar\nu_1-1}\ell_1 \;\ldots\; s\ell_1 \;\; \ell_1 \;\Big|\; \ldots \;\Big|\; s^{\bar\nu_k-1}\ell_k \;\ldots\; s\ell_k \;\; \ell_k \;\Big|\; O_{k\times((n-k)\delta-d)} \,\right]. \qquad (4.4.10)$$

In this representation the k × k matrix

L(s) = [ℓ₁, …, ℓ_k]

represents a greatest common left divisor of [E(s) F(s)], and ν̄₁, …, ν̄_k are the row degrees of [Ē(s) F̄(s)].
Proof: Since P̄_r(s) is a minimal basis, its realization (Ā, B̄) must be a controllable pair. Therefore, there exists a scalar matrix W ∈ GL_{(n−k)δ} such that C(Ā, B̄)W = [I_d | O_{d×((n−k)δ−d)}]. Hence, the theorem implies that X(s)C(A, B) is column equivalent to a matrix whose columns are exactly the columns of a GCLD together with multiples of these columns (as the multiplication X̄(s)C(Ā, B̄) indicates). □
4.5 The Refining Algorithm
By Theorem 4.4.3 and Corollary 4.4.4, the columns of L(s) are contained in a matrix that is column equivalent (over F) to X(s)C(A, B). The question is now: how do we select these k columns of L(s) from the (n−k)δ columns of the controllability matrix? The answer is fairly simple: column reduce and then choose the appropriate k columns in a manner that will be described below. However, we must first reconsider our choice of basis matrix X(s). The reason we started with the one we chose is that it allows us to write down the matrices A and B a little more easily. The downside is that when we column reduce the controllability matrix we start by eliminating the lower degree terms of the polynomials in row 1 of the corresponding matrix X(s)C(A, B). It would make much more sense to start by eliminating the highest degree terms in each row. We accomplish this by replacing the standard basis matrix X(s) introduced in (4.3.6) with the basis matrix

$$\mathbb{X}(s) = \begin{bmatrix} s^{\nu_1-1} & 0 & \cdots & 0 & s^{\nu_1-2} & 0 & \cdots & 0 & \cdots & 1 & 0 & \cdots & 0 \\ 0 & s^{\nu_2-1} & & \vdots & 0 & s^{\nu_2-2} & & \vdots & \cdots & 0 & 1 & & \vdots \\ \vdots & & \ddots & 0 & \vdots & & \ddots & 0 & & \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & s^{\nu_k-1} & 0 & \cdots & 0 & s^{\nu_k-2} & \cdots & 0 & \cdots & 0 & 1 \end{bmatrix}$$

In this representation, the monomial s^c and the corresponding column are omitted as soon as the exponent c < 0. 𝕏(s) and X(s) are related by a simple permutation of the columns, i.e. there is a permutation matrix U such that 𝕏(s) = X(s)U. This permutation transforms the controllability matrix C(A, B) into U⁻¹C(A, B).
Although it is much simpler to explain the algorithm by performing the U transformation as above, in practice the computer would automatically perform the realization with respect to the new basis matrix 𝕏 and arrive at U⁻¹AU and U⁻¹B instead of A and B. The realization with respect to the new basis matrix is just as simple to compute as the original, yet it is in a more practical form, and by arriving at it directly we will not waste time transforming basis matrices.
As mentioned earlier, the basis matrix 𝕏(s) (as well as the basis matrix X(s)) has the property that every polynomial k-vector φ(s) ∈ F^k[s] whose i-th component has degree at most ν_i − 1 can uniquely be described through φ(s) = 𝕏(s)α, α ∈ F^δ. It is therefore possible to identify φ(s) with the δ-vector α. We will say that α is the coordinate vector of φ(s) with respect to the basis matrix 𝕏(s).
Theorem 4.5.1 Assume P(s) has Kronecker indices ν₁ ≥ ⋯ ≥ ν_k and minimal indices ν̄₁, …, ν̄_k, none of which equal zero. Let L(s) = [ℓ₁, …, ℓ_k] be a GCLD whose (i, j)-entry has degree at most ν_i − ν̄_j or is zero. Assume that the matrix L₁ is lower triangular (by Lemma 4.4.1) and let d = Σ_{i=1}^{k} ν̄_i. Then the δ × d scalar matrix whose columns form the coordinate vectors of

$$\left[\, s^{\bar\nu_1-1}\ell_1 \;\ldots\; s\ell_1 \;\; \ell_1 \;\Big|\; \ldots \;\Big|\; s^{\bar\nu_k-1}\ell_k \;\ldots\; s\ell_k \;\; \ell_k \,\right] \qquad (4.5.11)$$

is, after a possible permutation of the columns, in column echelon form.
Proof: This is an immediate consequence of the fact that L₁ is lower triangular with nonzero diagonal elements, together with the specific choice of the basis matrix 𝕏(s). □
As a consequence of this theorem we can immediately read off the minimal indices ν̄₁, …, ν̄_k from the pivot indices of the column echelon form of C(A, B). A priori it is not true that 𝕏(s)C(A, B) has the particular form (4.5.11) even if C(A, B) is in column echelon form. One observes, however, that elementary column operations on C(A, B) correspond to unimodular operations on 𝕏(s)C(A, B). By Theorem 4.4.3 we also know that the columns of 𝕏(s)C(A, B) are in the column module of L(s). By the above remarks it is possible to identify k columns [γ₁, …, γ_k] from the column echelon form of C(A, B) such that 𝕏(s)[γ₁, …, γ_k] forms a GCLD of P(s). In the sequel we make this selection process more precise.
Assume that the controllability matrix C(A, B) is in column echelon form. We can think of the controllability matrix as being divided into row blocks. The top row block consists of k rows and corresponds (under multiplication by 𝕏(s)) to coefficients of degree ν_i − 1 for each respective row i. The next lower block corresponds to coefficients of degree ν_i − 2. Each lower block is similarly defined. If ν_j − ℓ < 0 then no row corresponding to row j occurs in row block ℓ (or any subsequent block). Based on this we define:
Definition 4.5.2
1. A column in the controllability matrix C(A, B) is said to "take its order in row i" if its leading coefficient occurs in a row which corresponds (under multiplication by 𝕏(s)) to an entry in row i of the resulting polynomial k-vector.
2. For each row i, 1 ≤ i ≤ k, consider all the column vectors of the controllability matrix taking their order in row i. From this set, the column vector whose leading coefficient is lowest (in the matrix, not necessarily in value) is called the "row leader for row i".
Theorem 4.5.3 If the column echelon form of C(A, B) has k row leaders [γ₁, …, γ_k], then 𝕏(s)[γ₁, …, γ_k] forms a GCLD of P(s).
Proof: It follows from our definition of row leaders [γ₁, …, γ_k] that

$$\deg \det \mathbb{X}(s)[\gamma_1, \ldots, \gamma_k] = \sum_{i=1}^{k} \nu_i - \sum_{i=1}^{k} \bar\nu_i = \delta - d.$$

Since the columns of 𝕏(s)[γ₁, …, γ_k] form a subset of the column module of L(s), it follows that they generate this column module, and this completes the proof. □
Remark 4.5.4 It can be shown, and it is illustrated in an example in Section 4.8, that in the case (n−k) = k = 1, i.e. in the situation where P(s) = (p₁(s), p₂(s)), the column reduction of the controllability matrix C(A, B) is exactly the Euclidean algorithm. The presented algorithm generalizes Euclid's algorithm in this way.
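For the scalar case of Remark 4.5.4 the whole pipeline — companion realization, controllability matrix, column reduction, row leader — fits in a few lines. A sketch over the rationals (the function name, the decreasing-powers basis [s^{n−1}, …, s, 1], and the sign conventions are ours):

```python
from fractions import Fraction

def gcd_via_controllability(p, q):
    """g.c.d. of two polynomials (coefficient lists, highest degree first,
    p monic, deg q < deg p), read off the column-reduced controllability
    matrix of the companion realization of (p, q)."""
    n = len(p) - 1
    p = [Fraction(c) for c in p]
    # coefficients of q relative to the basis [s^{n-1}, ..., s, 1]
    qpad = [Fraction(0)] * (n - len(q)) + [Fraction(c) for c in q]
    # companion realization: ones on the superdiagonal, negated
    # coefficients of p (constant term at the bottom) in the first column
    A = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = Fraction(1)
    for i in range(n):
        A[i][0] = -p[i + 1]
    B = [-c for c in qpad]
    # controllability matrix, one column per power A^i B
    cols, v = [], B
    for _ in range(n):
        cols.append(v)
        v = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    # column reduction = Gaussian elimination on the transposed matrix
    rows, pivot = [list(c) for c in cols], 0
    for col in range(n):
        hit = next((r for r in range(pivot, n) if rows[r][col] != 0), None)
        if hit is None:
            continue
        rows[pivot], rows[hit] = rows[hit], rows[pivot]
        for r in range(n):
            if r != pivot and rows[r][col] != 0:
                f = rows[r][col] / rows[pivot][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[pivot])]
        pivot += 1
    # the 'row leader' is the nonzero column whose leading entry sits lowest;
    # its entries are the g.c.d. coefficients in the basis [s^{n-1}, ..., 1]
    lead = max((r for r in rows if any(r)),
               key=lambda r: next(i for i, a in enumerate(r) if a != 0))
    g = lead[next(i for i, a in enumerate(lead) if a != 0):]
    return [c / g[0] for c in g]          # normalized to be monic
```

For example, `gcd_via_controllability([1, 0, -7, 6], [1, 0, -1])` — i.e. gcd((s−1)(s−2)(s+3), (s−1)(s+1)) — returns the coefficients of s − 1. The intermediate columns A^iB are, up to scalars, exactly the successive Euclidean remainders of the pair.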
Remark 4.5.5 The column reduction of C(A, B) can be done very efficiently by iteratively computing the vectors A^i b_j, where b_j is the j-th column of B. (See [66, 2] for more details.) Due to the very sparse structure of C(A, B), the column reduction is even easier.
4.6 The Situation of Constant Rows
As remarked earlier, the matrix P̄(s) that is used in the proof of our algorithm could have constant rows, and that poses problems when we try to realize this matrix. In this section we will deal with this case. In particular, assume that P̄(s) has μ constant rows, 0 ≤ μ ≤ k (ν̄_i = 0 for 1 ≤ i ≤ μ). Similar to before, we know that P̄(s) has (after possible right scalar multiplication) the form:

$$\bar{P}(s) = \begin{bmatrix} I_\mu & 0 & 0 \\ 0 & \bar{E}(s) & \bar{F}(s) \end{bmatrix} \qquad (4.6.12)$$

Letting P̄_r(s) = [Ē(s) | F̄(s)], we can obtain the realization matrices Ā, B̄, and C̄ relative to the basis matrix X̄(s) for P̄_r(s). The following result can easily be shown using arguments mirroring those in the proof of Theorem 4.4.3:
Theorem 4.6.1

$$L(s) \begin{bmatrix} 0 \\ \bar{X}(s) \end{bmatrix} \mathcal{C}(\bar{A}, \bar{B}) = X(s)\,\mathcal{C}(A, B)$$

In analogy to Corollary 4.4.4 we have:
Corollary 4.6.2 Let ν̄_{μ+1}, …, ν̄_k be the nonzero minimal indices of P̄(s). Then there exists an invertible matrix W ∈ GL_{(n−k)δ} such that

$$X(s)\,\mathcal{C}(A,B)\,W = \left[\, s^{\bar\nu_{\mu+1}-1}\ell_{\mu+1} \;\ldots\; s\ell_{\mu+1} \;\; \ell_{\mu+1} \;\Big|\; \ldots \;\Big|\; s^{\bar\nu_k-1}\ell_k \;\ldots\; s\ell_k \;\; \ell_k \;\Big|\; O_{k\times((n-k)\delta-d)} \,\right]. \qquad (4.6.13)$$

In this representation [ℓ_{μ+1}, …, ℓ_k] represent k − μ generators of the column module of P(s).
By Lemma 4.4.2 we know that rank C(A, B) = Σ_{i=1}^{k} ν̄_i = d. Combining this with Corollary 4.6.2 and Theorem 4.5.3 results in:
Theorem 4.6.3 If P̄(s) has μ zero minimal indices, then the column echelon form of C(A, B) has k − μ row leaders [γ_{μ+1}, …, γ_k], and the columns of 𝕏(s)[γ_{μ+1}, …, γ_k] form an independent set of generators for a GCLD of P(s).
By this last theorem we will be able to compute the number of nonzero minimal indices of P̄(s), and we will always be able to identify k − μ 'row leaders' from the echelon form of C(A, B). This is very important. Otherwise, we could perform the algorithm, get k columns and think we are done, when in reality we would have selected columns that are unimodularly equivalent and ended up with a singular matrix!
The question now turns to: how do we select the remaining μ columns to fill up our matrix and arrive at a GCLD?
The answer is actually quite simple. Consider the high order coefficient matrix H of the k × (k−μ) matrix 𝕏(s)[γ_{μ+1}, …, γ_k]. This high order coefficient matrix is a sub-matrix of the matrix L₁ introduced in Lemma 4.4.1. The high order coefficient matrix of P(s) is assumed to be P_h = [I_k | 0]. It is therefore possible to augment H with columns from P_h such that the overall matrix L₁ becomes invertible. Correspondingly, we have a way of selecting μ columns from the first k columns of P(s) such that 𝕏(s)[γ_{μ+1}, …, γ_k] augmented by these columns results in a GCLD of P(s). Simply put: for every row i which does not have a row leader, simply select column i from the matrix P(s) to be in L(s).
4.7 The Algorithm
We now present the algorithm for computing a GCLD in a concise form:
Step 1 We are given a full rank polynomial matrix P(s).
Step 2 Check if the high order coefficient matrix P_h has the form [I | 0]. If not, then use right and left unimodular operations to bring it into this form. Keep track of any left unimodular operations in the matrix V(s).
Step 3 Check if P(s) has any constant rows. If P(s) has μ constant rows and is in the form (4.3.3), then the sub-matrix [Ê₂(s); I_μ] of (4.3.3) defines μ generators of a GCLD L(s). Continue the algorithm with the reduced matrix [Ê₁(s) | F̂(s)] in order to find the remaining k − μ columns of the GCLD.
Step 4 Obtain the realization matrices A and B relative to the basis matrix 𝕏(s) 'by inspection'.
Step 5 Calculate the controllability matrix C(A, B) and column reduce it. (This may be done simultaneously to greatly improve efficiency [66, 2].)
Step 6 Pick out the 'row leaders' from the column reduced controllability matrix C(A, B). Multiply the 'row leaders' by 𝕏(s) and place them in the GCLD.
Step 7 If there are k row leaders, then go to Step 8. If there are fewer than k row leaders, then follow the algorithm of Section 4.6.
Step 8 Multiply the GCLD on the left by V⁻¹ and stop.
Remark 4.7.1 The steps which take the most time are Steps 2 and 5. Step 2 is not necessary when P(s) is in the desired form. Of course, in general we will not know or cannot guarantee what form a matrix will have. However, in certain applications, such as searching for observable convolutional encoders [2, 18], we may prescribe what form the matrices will have.
After having computed the GCLD, L(s), there might arise the need to compute the 'controllable part' P̄(s) as well. Let p_i(s) and p̄_i(s) denote the i-th columns of P(s) and P̄(s) respectively, i = 1, …, n. Consider for each index i the equation

L(s)p̄_i(s) = p_i(s).    (4.7.14)

We can view (4.7.14) as a system of δ + k linear equations in d + k unknowns. We therefore have to solve simultaneously n systems of equations in d + k unknowns. Due to the fact that the matrix L₁ is already in lower triangular form, it follows that the coefficient matrix appearing in (4.7.14) is already in triangular form as well. A solution of (4.7.14) can therefore be computed very efficiently; the method will be illustrated in the next section.
4.8 Examples
We have included some examples to aid in the understanding of the algorithm.
Example 4.8.1 First, take the case when P(s) is a 1 × 2 matrix. In this case we are just determining the GCD of two polynomials. Notice that P(s) will trivially satisfy all of the conditions unless the two polynomials have the same degree. In that case divide one into the other, and take the remainder in place of the original polynomial.
Let us work through the following example. Let

$$P(s) = \left[\, s^6 + 5s^5 - 464s^4 + 1123s^3 - 887s^2 + 234s + 72 \;\;\big|\;\; s^5 - 2s^4 - 342s^3 + 1177s^2 - 1170s + 504 \,\right].$$

We get the following realization:

$$A = \begin{bmatrix} -5 & 1 & 0 & 0 & 0 & 0 \\ 464 & 0 & 1 & 0 & 0 & 0 \\ -1123 & 0 & 0 & 1 & 0 & 0 \\ 887 & 0 & 0 & 0 & 1 & 0 \\ -234 & 0 & 0 & 0 & 0 & 1 \\ -72 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} -1 \\ 2 \\ 342 \\ -1177 \\ 1170 \\ -504 \end{bmatrix}$$

relative to the basis matrix 𝕏(s) = [s⁵ s⁴ s³ s² s 1]. The corresponding column reduced controllability matrix is:

$$\mathcal{C}(A,B) = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ -2 & 1 & 0 & 0 & 0 & 0 \\ -342 & -65/3 & 1 & 0 & 0 & 0 \\ 1177 & 221/3 & -19 & 0 & 0 & 0 \\ -1170 & -220/3 & 23 & 0 & 0 & 0 \\ 504 & 32 & -12 & 0 & 0 & 0 \end{bmatrix}$$

Since there is only one row of 𝕏(s)C(A, B), the row leader must be the rightmost nonzero column. Hence the GCLD is s³ − 19s² + 23s − 12.
Notice that the first column of the above matrix corresponds to the polynomial of lesser degree from our original matrix. The second column corresponds to the 'first remainder' that one obtains when applying the Euclidean algorithm to the two polynomials in our matrix. The third column corresponds to the 'second remainder', which is also the last nonzero one, of the Euclidean algorithm. Because of this, our algorithm can be seen as an extension of the Euclidean algorithm to matrices.
Example 4.8.2 Now let us look at a more nontrivial example. Let

$$P(s) := \begin{bmatrix} s^5 & s^4 + s^2 & s^4 + 2s^2 \\ s & s^3 + s^2 + s + 1 & 2s + 3 \end{bmatrix}$$

Here ν₁ = 5, ν₂ = 3 and the realization matrices are

$$A = \begin{bmatrix} 0 & -1 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ -1 & -1 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} -1 \\ 0 \\ 0 \\ -2 \\ -2 \\ -3 \\ 0 \\ 0 \end{bmatrix}$$

relative to the basis matrix

$$\mathbb{X}(s) = \begin{bmatrix} s^4 & 0 & s^3 & 0 & s^2 & 0 & s & 1 \\ 0 & s^2 & 0 & s & 0 & 1 & 0 & 0 \end{bmatrix}.$$

The column reduced controllability matrix is

$$\mathcal{C}(A,B) = \begin{bmatrix} 1 & 0 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 1 & 0 & \cdots & 0 \\ 0 & -1 & 1 & 0 & \cdots & 0 \\ 1 & -1 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 \end{bmatrix}$$

We see that columns 2 and 3 take their order in row 2, while column 1 is the only column taking its order in row 1. Hence column 1 is the 'row leader' for row 1 and column 3 is the 'row leader' for row 2. It follows that ν̄₁ = 1 and ν̄₂ = 2. As an independent verification, we can also see directly from 𝕏(s)C(A, B) that column 2 is just s − 1 times column 3, and hence they are dependent:

$$\mathbb{X}(s)\,\mathcal{C}(A,B) = \begin{bmatrix} s^4 & s^3 - s^2 & s^2 & 0 & \cdots & 0 \\ 1 & s^2 - 1 & s + 1 & 0 & \cdots & 0 \end{bmatrix},$$

so the GCLD is

$$L(s) = \begin{bmatrix} s^4 & s^2 \\ 1 & s + 1 \end{bmatrix}.$$

We can now also easily compute P̄(s) by solving the following linear system of equations:

$$\begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \\ d_1 & d_2 & d_3 \\ e_1 & e_2 & e_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \\ 1 & 1 & 2 \\ 0 & 1 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

This corresponds to the equation L(s)P̄(s) = P(s), where P̄(s) is represented by the matrix:

$$\bar{P}(s) = \begin{bmatrix} a_1 s + b_1 & a_2 s + b_2 & a_3 s + b_3 \\ c_1 s^2 + d_1 s + e_1 & c_2 s^2 + d_2 s + e_2 & c_3 s^2 + d_3 s + e_3 \end{bmatrix}.$$

The left-hand matrix in the above equation comes easily from the column reduced controllability matrix. It consists of the 'row leaders' plus 'shifts' of the row leaders. To be precise, for each i, the row leader for row i occurs, along with ν̄_i upward 'shifts' of the row leader. Note that this necessitates adding another row block to the top of the scalar matrix.
The right-hand matrix is simply the coefficients of the matrix P(s) with respect to the 'augmented basis matrix'

$$\left[ \begin{bmatrix} s^5 & 0 \\ 0 & s^3 \end{bmatrix} \;\Big|\; \mathbb{X}(s) \right].$$

Not only is the left-hand matrix easily constructed, but it will be lower triangular (up to column permutations), so that the above system can be solved instantaneously! The resulting matrix P̄(s) can now be stated:

$$\bar{P}(s) = \begin{bmatrix} s & 0 & 1 \\ 0 & s^2 + 1 & 2 \end{bmatrix}.$$
CHAPTER 5
DECODING ERROR CORRECTING CODES
This chapter is included as a basic introduction to the decoding of error correcting codes. It is intended to provide the necessary background for subsequent chapters for those readers who are not familiar with this subject. A few of the more basic concepts are presented in detail, while some others are only briefly introduced, with references to works containing a more thorough discussion. The reader is referred to [43, 44, 49] for general references on decoding. In addition, Section 5.5 provides new insight into majority logic decoding methods using the local description of convolutional codes presented in Section 2.2.4.
5.1 Channel Models and Maximum Likelihood Decoding
We must be able to decode the information that we encode if it is to be of any use. The fact that our encoded information is often corrupted by channel noise makes this task non-trivial. We have seen in Section 1.1 that the decoding process can be divided into two functions, ψ and γ⁻¹. The map ψ attempts to remove the noise from the received word, thereby estimating which codeword was indeed sent. The map γ⁻¹ reverses the encoding process to determine what message was encoded. In practice, the map γ⁻¹ is already specified by choosing the encoder, γ, and hence can be considered part of the encoding process. So the heart of the decoding task is to find an efficient and effective ψ.
Let us discuss what we mean by efficient and effective. By efficient we mean the decoder should work in a timely fashion and should be reasonably cheap and easy to build and maintain. These are concepts mainly associated with the 'engineering side' of coding theory and hence fall outside the domain of this dissertation, although evaluating the complexity of the decoding algorithm prescribed by ψ is certainly a mathematical issue. By effective we mean, informally, that the map ψ chooses the correct codeword with very high probability. Let us introduce some of the concepts that will make this idea more concrete.
Definition 5.1.1 Given a received word r and the set of all possible codewords {x_i}_{i∈I}, let Pr(x_i | r) denote the probability that the codeword x_i was sent given that the word r was received. Then ψ, defined as follows, is said to be a maximum likelihood decoder:

ψ(r) = x_j, where Pr(x_j | r) = max_{i∈I} Pr(x_i | r).

It is clear that this is intuitively the correct approach for constructing a good decoder. However, two immediate issues arise from this definition. First, how does one compute the indicated probabilities? Second, for large codes it is impossible to compute all of these probabilities individually in a timely manner.
Let us address the first issue. (We will delay discussion of the second issue until Section 5.3.) To compute these probabilities it is necessary to give our communication channel a 'model'. This model must accurately reflect the real-life channel that it represents and, hopefully, it should have some definable mathematical properties. In reality it is not surprising that there are many different kinds of channels, and hence we must develop a separate model for each of them. We will content ourselves with considering two of the major channels.
Definition 5.1.2 A binary symmetric channel (BSC) is one such that A = F₂ and, defining p₀ to be the probability that a transmitted 0 will be received as a 1 and p₁ the probability that a transmitted 1 will be received as a 0, we have p₀ = p₁ =: p. Similarly, a q-ary symmetric channel is one such that A = F_q, where Pr(a_i | a_i) = 1 − p for all a_i ∈ F_q and Pr(a_i | a_j) = p/(q−1) for all a_i ≠ a_j.
So, if we are given a block code over a BSC with 'transition probability' p and a received word r, it is a simple matter of computing binomial probabilities to determine which codeword is the most likely. Unless we are transmitting at a rate above the channel capacity, it is true that p < 1/2.
Therefore the issue of finding the most likely codeword is the same as finding the codeword which differs from r in the fewest components. That is, ψ should choose x_j such that

dist(x_j, r) = min_{i∈I} dist(x_i, r).

Such a decoding scheme is called nearest neighbor decoding. For many channels, nearest neighbor decoding and maximum likelihood decoding are equivalent.
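As a toy illustration of nearest neighbor decoding (the code chosen and the helper names are ours, not from the text), consider the binary repetition code of length 3:

```python
def hamming_dist(x, y):
    """Number of positions in which two words differ."""
    return sum(a != b for a, b in zip(x, y))

def nearest_neighbor(received, codewords):
    """Return the codeword closest to the received word in Hamming distance."""
    return min(codewords, key=lambda c: hamming_dist(c, received))

# The [3,1] repetition code corrects any single bit error.
code = [(0, 0, 0), (1, 1, 1)]
assert nearest_neighbor((0, 1, 0), code) == (0, 0, 0)
assert nearest_neighbor((1, 0, 1), code) == (1, 1, 1)
```

Exhaustive search over all codewords, as here, is exactly what becomes infeasible for large codes — the second issue raised above.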
Our first channel model makes use of hard decision decoding in that each bit of the received word is immediately estimated as either a 0 or 1. The probability of each estimate being incorrect is given by the channel transition probability p. While mathematically this is very nice, it is often not the best channel model. From a practical point of view one can think of the transmission of 0's and 1's as sending a voltage of −1 or 1. (This is an extremely oversimplified description.) The noise occurring during transmission over the channel can be modeled by adding or subtracting some amount of voltage. So for each bit the decoder will receive some real-valued voltage. Instead of making a hard decision on the bit (i.e. decoding all negative voltages as 0 and all positive voltages as 1), the decoder can assign a probability that each bit is a 1 (or 0) by using some channel information. In particular, it is reasonable to assume that the noise adheres to some sort of normal distribution with mean 0 and standard deviation σ. The standard deviation depends (primarily) on the relative strength of the energy per message bit E_b to the noise power spectral density N₀/2. The ratio E_b/N₀ is known as the signal to noise ratio (SNR). Thankfully, one need not know what a power spectral density is to compute σ. In practice, the knowledge of E_b/N₀ given in decibels and the rate R = k/n are all that is needed to obtain σ:

$$\sigma = 10^{-(E_b/N_0)_{\mathrm{dB}}/20}\, \sqrt{1/(2R)}.$$

Definition 5.1.3 The channel we have just described is called an additive white Gaussian noise (AWGN) channel, or simply a Gaussian channel.
A more thorough examination of this and other channels can be found in [17]. It is clear that the probabilities we desire to compute can be gotten easily from the normal distribution associated with the noise.
5.2 Decoding Parameters
Let us consider a hard decision channel where maximum likelihood decoding is equivalent to nearest neighbor decoding (e.g. a BSC). A very important parameter in determining the effectiveness of a code and decoding algorithm is the codeword error rate. Simply put, this is just the percentage of transmitted codewords that are decoded incorrectly. For block codes using a nearest neighbor decoding algorithm, the following theorem and corollary state the importance of the minimum distance of the code in regards to the codeword error rate.
Theorem 5.2.1 [49, amongst many others] Let a block code, C, have minimum distance d. Let t = ⌊(d − 1)/2⌋. Then the code can detect up to d − 1 errors and it can correct up to t errors.
Proof: Let x be the sent codeword and r = x + e be the received word. An error will be undetected if r = x_1 ∈ C (x_1 ≠ x). If fewer than d errors occur, then wt(e) < d and hence, by the triangle inequality (Hamming distance defines a metric!), r cannot be another codeword. Hence the error will be detected.
Consider the vector space F^n and our code C ⊆ F^n. For each codeword we may define a ball of radius ρ for any ρ ≥ 0. Simply, these balls contain all vectors in F^n whose distance from the center (codeword) is less than or equal to ρ. When ρ ≤ t, it is clear that the balls are disjoint. Hence, when t or fewer errors occur, r will lie in exactly one of the balls. Nearest neighbor decoding specifies that we decode as the codeword in the center of that ball.
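The ball argument can be seen in miniature with the binary repetition code of length 5, which has d = 5 and hence t = 2. A minimal nearest neighbor decoder (names ours):

```python
def hamming_dist(a, b):
    """Number of components in which a and b differ."""
    return sum(x != y for x, y in zip(a, b))

def nearest_neighbor(received, codebook):
    """Decode to the codeword closest to the received word."""
    return min(codebook, key=lambda c: hamming_dist(c, received))

# Length-5 repetition code: d = 5, so t = (5 - 1) // 2 = 2.
code = [(0, 0, 0, 0, 0), (1, 1, 1, 1, 1)]
```

Any pattern of at most two bit errors leaves the received word inside the ball of radius t around the transmitted codeword, so it is decoded correctly.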
Corollary 5.2.2 The codeword error rate, P_e, for a rate k/n (binary) block code with minimum distance d (t = ⌊(d − 1)/2⌋) transmitted over a BSC with channel transition probability p is given by

P_e ≤ Σ_{i=t+1}^{n} C(n, i) p^i (1 − p)^{n−i}
Proof: The right side of the above inequality is just the probability of more than t errors occurring in any codeword. (The inequality can be explained by the fact that for most codes, the spheres of radius t about each codeword do not cover the entire vector space F^n. If one of these vectors is received, the decoder may `luck' into a correct decoding.)
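The right-hand side of the corollary can be computed directly; a short sketch (function name ours):

```python
from math import comb

def block_error_bound(n, t, p):
    """Probability of more than t channel errors among n transmitted
    bits; an upper bound on the codeword error rate P_e."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(t + 1, n + 1))
```

For the length-5 repetition code (t = 2) at p = 0.1 this evaluates to 0.00856.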
On the other hand, when a soft-decision decoding algorithm is employed, or for convolutional codes, the notion of codeword error is not as important as the bit error rate or bit error probability. Simply, this is just the probability of a message bit being decoded incorrectly, and is denoted by P_b. Similar formulas exist for computing (or bounding) this probability for the various channel models.
5.3 A Few Decoding Algorithms
We would like to address the second issue which arose in the discussion on computing the relative probabilities of codewords given a received word. Namely, that it is infeasible to compute every such probability. We shall present some of the most fundamental decoding techniques to show how this problem can be overcome.
Syndrome decoding is a hard decision method for decoding block codes. We start with a parity check matrix, H, for the code. Then we compute the cosets of the code as an additive subgroup of F^n. For each coset, we choose a vector with minimal weight and call it the coset leader. For every vector x ∈ F^n, the vector xH^T is called the syndrome of the vector. It is true that all vectors in the same coset have the same syndrome. Since every vector c in our code satisfies cH^T = 0, all of the codewords lie in the coset whose leader is the all zero vector and whose syndrome is the all zero vector.
Given a received word r = c + e, where e is the error vector, we decode in the following way. Compute the syndrome of r, which equals the syndrome of e. This syndrome must equal one of the syndromes of the coset leaders. The error vector must be one of the vectors in that coset. The most likely error vector is the one with minimal weight, i.e. the coset leader. (If more than one minimal weight vector exists, choose one.) Subtract the coset leader from the received word to obtain the decoded word.
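For a small single-error-correcting code the table of coset leaders can be stored in full. The sketch below (names ours) carries out this procedure for the binary [7,4] Hamming code, whose parity check matrix has the numbers 1 through 7 as its columns, so each nonzero syndrome is the syndrome of a unique single-bit error pattern:

```python
# Parity check matrix H of the [7,4] Hamming code:
# column j (1-indexed) is the binary expansion of j.
H = [[(j >> i) & 1 for j in range(1, 8)] for i in range(3)]

def syndrome(v):
    """Compute v H^T over F_2."""
    return tuple(sum(h[j] * v[j] for j in range(7)) % 2 for h in H)

# Coset leaders: the zero vector and the seven single-bit patterns.
leaders = {syndrome([0] * 7): [0] * 7}
for j in range(7):
    e = [0] * 7
    e[j] = 1
    leaders[syndrome(e)] = e

def decode(r):
    """Subtract the coset leader of r's coset from r."""
    e = leaders[syndrome(r)]
    return [(a - b) % 2 for a, b in zip(r, e)]
```

Flipping any single bit of a codeword lands in a coset whose leader is exactly that error pattern, so the decoder recovers the codeword.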
This is a very nice scheme from a mathematical viewpoint, but even the task of storing the coset leaders and their syndromes can become quite costly and cumbersome for large codes. Some codes have even more efficient ways of syndrome decoding. Hamming codes and BCH codes [49] are good examples of codes for which the error vector can be computed algebraically rather than having to store the complete list of coset leaders. There is also the technique of majority logic decoding using orthogonal parity checks. We will explore this technique as it is used to decode convolutional codes in Section 5.4 and then proceed to relate this technique to our local description of convolutional codes in Section 5.5.
Soft-decision decoding of block codes is becoming increasingly popular. Gallager developed a soft-decision algorithm based roughly on majority logic decoding in the early Sixties [26, 27]. While his methods showed promise for large block sizes, the method was widely ignored, probably due to the lack of sufficient technology to implement the decoding practically. The method was rediscovered and expanded on by MacKay et al. [46, 47]. With improvements in the technology used to test and implement error correcting codes, the codes developed for use with these methods are comparable with the highly touted turbo codes in terms of bit error rate performance [48, 67].
There are several prominent algorithms for the decoding of convolutional codes. Foremost among them is the Viterbi decoding algorithm. Sequential decoding, with its many variations, is another popular method. The reader is referred to [43, 40] for an excellent treatment of these `classical' topics. More recently, soft-decision decoding techniques have gained some attention since they can be used quite effectively in the decoding of turbo codes. Some of these include the BCJR algorithm [9] and various soft-output Viterbi algorithms (SOVA) [13, 28]. We shall discuss some new algebraic ideas for decoding convolutional codes in Chapter 6.
5.4 Majority Logic Decoding of Convolutional Codes
This section will provide a brief background on majority logic decoding of convolutional codes. (The underlying syndrome decoding technique can, of course, be used for block codes.) In particular,
definite decoding, feedback decoding and threshold decoding will be discussed. For a more thorough investigation of these topics the reader is referred to [50, 55, 63, 43, 40].
The general idea of majority logic decoding is to use the syndrome former matrix of the convolutional code to obtain parity checks on the bits of the codeword. These parity checks are then used in somewhat different ways by the three basic types of majority logic decoding. First, let us focus on how the parity checks are obtained.
The syndrome former matrix (see Section 3.2) is created with (2m + 1)n columns. This matrix defines a set of equations on each window of length (2m + 1)n bits of the received sequence. A special time interval is selected from the window and a set of orthogonal parity checks is obtained from this set of equations on this time interval. Simply put, a set of parity checks is orthogonal on a bit u_i if each check involves u_i and no other bit is checked by more than one of the checks. For definite decoding, the `middle' time interval of the parity parallelogram is selected, and a set of orthogonal parity checks is created for each of the k information bits of that time interval. For feedback and threshold decoding, only the middle and right hand bits of the `parallelogram' are used, thus resulting in the `parity triangle'. We will soon see a more accurate description of this process using the representations of Section 3.5.
For the hard decision techniques, namely definite and feedback decoding, the decoding process is straightforward. The received bits are given a hard decision estimate based on the channel model and are fed into the majority logic decoder. That is, the bits are fed into the orthogonal parity check equations and each parity check is evaluated as a 0 or a 1. For each information bit, if a majority of its syndromes are 1, it is assumed to be in error and (assuming a binary code) its hard decision channel estimate is reversed. On the other hand, if a majority of its orthogonal parity checks are 0, it is assumed to be correct and its hard decision channel estimate stands.
Now comes the main difference between definite and feedback decoding. In definite decoding, the decisions made by the majority logic decoder are not used to update the bits in future time intervals. In feedback decoding, the decoding made by the majority logic decoder is used in future computations. In particular, if the bit u_i is determined to be in error, then all the values of the orthogonal parity checks are changed. In both methods, the sequence of syndromes is shifted by one time interval to prepare for the next set of incoming bits (rather than redundantly recalculate the previous syndromes).
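For a binary code the decision rule just described reduces to a single comparison. A minimal sketch (names ours), where `checks` holds the 0/1 values of the orthogonal parity checks on a given information bit:

```python
def majority_logic_estimate(bit, checks):
    """Flip the hard-decision estimate of an information bit if a
    strict majority of its orthogonal parity checks evaluate to 1."""
    if 2 * sum(checks) > len(checks):
        return bit ^ 1
    return bit
```

In feedback decoding the flipped bit would additionally be fed back to update the stored syndrome values; in definite decoding it would not.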
For threshold decoding, the decoder does not immediately make a hard decision estimate. Therefore the syndromes will not evaluate to 0 or 1 and, hence, the decoder decision cannot be based simply on the majority of the orthogonal parity checks. In general, the procedure is to obtain a real valued function on the orthogonal parity checks and declare a bit to be in error if the function exceeds some `threshold' value.
Let us review the basic ideas involved by deriving the parity parallelograms and triangles for the code in Example 3.5.2.
Example 5.4.1 The 22-column syndrome former matrix, in which the parity parallelogram (see Remark 3.5.3) is clearly visible, is:

[ 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 ]
[ 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 ]
[ 0 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 ]
[ 0 0 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 ]
[ 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 ]
[ 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 1 ]
The syndrome equations derived are as follows:

[ s_τ     ]   [ 1 1 1 0 0 1 0 0 0 0 0 ] [ u_{τ−5} ]   [ y_τ     ]
[ s_{τ+1} ]   [ 0 1 1 1 0 0 1 0 0 0 0 ] [ u_{τ−4} ]   [ y_{τ+1} ]
[ s_{τ+2} ] = [ 0 0 1 1 1 0 0 1 0 0 0 ] [   ...   ] + [ y_{τ+2} ]
[ s_{τ+3} ]   [ 0 0 0 1 1 1 0 0 1 0 0 ] [   ...   ]   [ y_{τ+3} ]
[ s_{τ+4} ]   [ 0 0 0 0 1 1 1 0 0 1 0 ] [ u_{τ+4} ]   [ y_{τ+4} ]
[ s_{τ+5} ]   [ 0 0 0 0 0 1 1 1 0 0 1 ] [ u_{τ+5} ]   [ y_{τ+5} ]

where the input vector runs through u_{τ−5}, u_{τ−4}, ..., u_{τ+5}. Here, the syndromes s_τ, s_{τ+3} and s_{τ+5} form a set of orthogonal parity checks. Thus a majority logic decoder can correct a single error in any of the 13 bits that are checked by the orthogonal parity checks.
On the other hand, the parity triangle for feedback and threshold decoding is given by:

[ s_τ     ]   [ 1 0 0 0 0 0 ] [ u_τ     ]   [ y_τ     ]
[ s_{τ+1} ]   [ 0 1 0 0 0 0 ] [ u_{τ+1} ]   [ y_{τ+1} ]
[ s_{τ+2} ] = [ 0 0 1 0 0 0 ] [ u_{τ+2} ] + [ y_{τ+2} ]
[ s_{τ+3} ]   [ 1 0 0 1 0 0 ] [ u_{τ+3} ]   [ y_{τ+3} ]
[ s_{τ+4} ]   [ 1 1 0 0 1 0 ] [ u_{τ+4} ]   [ y_{τ+4} ]
[ s_{τ+5} ]   [ 1 1 1 0 0 1 ] [ u_{τ+5} ]   [ y_{τ+5} ]

Here, the syndromes s_τ, s_{τ+3}, s_{τ+4} and the check formed as the sum (s_{τ+1} + s_{τ+5}) are a set of orthogonal parity checks. Thus a majority logic decoder can correct up to two errors in any of the 11 bits that are checked by the orthogonal parity checks.
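The orthogonality of these four checks (on the input part of the parity triangle) can be verified mechanically; a sketch (names ours):

```python
# Input-bit part of the parity triangle from the example above.
triangle = [
    [1, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [1, 1, 1, 0, 0, 1],
]

def combine(*rows):
    """Sum of syndrome rows over F_2."""
    return [sum(col) % 2 for col in zip(*rows)]

def orthogonal_on(checks, i):
    """True if every check involves bit i and no other bit is
    checked by more than one of the checks."""
    if any(c[i] == 0 for c in checks):
        return False
    return all(sum(c[j] for c in checks) <= 1
               for j in range(len(checks[0])) if j != i)

# s_tau, s_{tau+3}, s_{tau+4} and s_{tau+1} + s_{tau+5}, as above.
checks = [triangle[0], triangle[3], triangle[4],
          combine(triangle[1], triangle[5])]
```

Each of the four checks involves u_τ, while every other input bit appears in at most one check, which is precisely the orthogonality condition of Section 5.4.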
5.5 Using the Local Description for Feedback Decoding
In the usual setting of convolutional codes the parity triangle is obtained directly from the syndrome former, as shown in the above examples. However, the local description offers valuable insight into the nature of feedback decoding. Instead of identifying the parity triangle as the right half of the parity parallelogram, it should rather be identified as the top portion of the matrix in (3.5.2). It is in this setting that the relationship between the state of the encoder and the syndromes is fully revealed.
To be exact, we take the top m + 1 rows of equation (3.5.2). The left hand side consists of the first m rows of the observability matrix, followed by a row of zeros, times the state at time τ. The right hand side consists of the parity triangle times the sequence of inputs plus the sequence of outputs.
Theorem 5.5.1 The updated syndromes for feedback decoding for times τ through τ + m are given by:

[ s_τ       ]   [ C        ]        [ D                       0 ] [ ũ_τ       ]   [ ỹ_τ       ]
[ s_{τ+1}   ]   [ CA       ]        [ CB         D    ...    0 ] [ ũ_{τ+1}   ]   [ ỹ_{τ+1}   ]
[   ...     ] = [   ...    ] x̂_τ +  [ ...                  ... ] [   ...     ] + [   ...     ]
[ s_{τ+m−1} ]   [ CA^{m−1} ]        [ CA^{m−2}B  ...  CB  D  0 ] [ ũ_{τ+m−1} ]   [ ỹ_{τ+m−1} ]
[ s_{τ+m}   ]   [ 0        ]        [ X   X      ...  X  X  D ] [ ũ_{τ+m}   ]   [ ỹ_{τ+m}   ]

where the last row depends on the matrix A as described in Proposition (3.5.1). (The ũ's and ỹ's are the received bits. x̂_τ is the state of the `encoder' on the receiving end based on the decoded input bits up to time τ − 1.)
Proof: The proof follows from equation (2.2.2).

[ C        ]        [ C        ]
[ CA       ]        [ CA       ]                        [ û_0     ]
[ CA^2     ] x̂_τ =  [ CA^2     ] [ A^{τ−1}B ... AB  B ] [  ...    ]
[  ...     ]        [  ...     ]                        [ û_{τ−1} ]
[ CA^{m−1} ]        [ CA^{m−1} ]
[ 0        ]        [ 0        ]

where the û's are the decoded inputs from previous time intervals. Multiplying out the right hand side gives exactly the left half of the parity parallelogram. The result is now immediate.
Remark 5.5.2 The above theorem states that having no errors in the feedback at any given time is equivalent to knowing the correct state of the encoder at that time. Also, if the matrix A is nilpotent, then only the most recently decoded input bits are required since the rest of the above matrix will be 0. In that case, we will get the usual parity parallelogram.
CHAPTER 6
ALGEBRAIC DECODING OF CONVOLUTIONAL CODES USING THE LOCAL DESCRIPTION
In this chapter we will describe an algebraic decoding scheme for convolutional codes that was first put forth in [56]. Some example codes for this algorithm will be presented. We will proceed to make significant improvements in this algorithm and discuss more applications including the decoding of turbo codes.
6.1 A Basic Algorithm
We begin by restating the notion of nearest neighbor decoding for convolutional codes. This is then used to develop a basic algorithm for the algebraic decoding of convolutional codes.
Assume a code sequence {v_t}_{t≥0} = {(y_t; u_t)}_{t≥0} was sent and the sequence

{v̂_t}_{t≥0} = {(ŷ_t; û_t)}_{t≥0}

has been received. The decoding problem then asks for the minimization of the error

error := min_{{v_t}∈C} Σ_{t=0}^{∞} dist(v_t, v̂_t) = min ( Σ_{t=0}^{∞} [dist(u_t, û_t) + dist(y_t, ŷ_t)] ).   (6.1.1)

If no transmission error occurs then {v̂_t}_{t≥0} is a valid trajectory and the error value in (6.1.1) is zero. Let

{(f_t; e_t)}_{t≥0} := {(ŷ_t − y_t; û_t − u_t)}_{t≥0}   (6.1.2)

be the sequence of errors.
We consider the received codeword sequence {v_t}_{t≥0} as being divided into time intervals of size T + 1. There is also a positive integer, γ, chosen for each code subject to the following considerations:
• γ should be less than T. In particular, it is desirable for ⌊(T + 1)/γ⌋ to be greater than 1.
• We will want (the columns of) Ω_γ(A, C) to be the generator matrix of a block code with sufficiently good distance. The matrix Ω_γ(A, C) must be full rank, so γ must be at least ν, the observability index of (A, C). It is easy to see that the distance of the code is a non-decreasing function of γ. However, a balance must be found, as we shall see, between the distance of this code and the desire to maximize ⌊(T + 1)/γ⌋.
Algorithm: Assume initially that

û_{T−γ+1}, û_{T−γ+2}, ..., û_T   (6.1.3)

has been correctly transmitted. From Proposition 2.2.4 it follows that

[ y_{T−γ+1} ]   [ D           0    ...   0 ] [ u_{T−γ+1} ]
[ y_{T−γ+2} ]   [ CB          D    ...   0 ] [ u_{T−γ+2} ]
[   ...     ] − [ CAB         CB   ...   0 ] [   ...     ]
[   ...     ]   [ ...              ...   0 ] [   ...     ]
[ y_T       ]   [ CA^{γ−2}B   ...  CB   D ] [ u_T       ]

is in the column space of the block code generated by the columns of

[ C        ]
[ CA       ]
[  ...     ]   (6.1.4)
[ CA^{γ−1} ]
The larger the value of γ, the bigger, in general, the distance of this block code. Assume, now, that this block code has distance d_gen, and let t_gen = ⌊(d_gen − 1)/2⌋. Then, as long as there are t_gen or fewer errors in the sequence

ŷ_{T−γ+1}, ŷ_{T−γ+2}, ..., ŷ_T

we can compute, using various standard decoding techniques for block codes, the error sequence

f_{T−γ+1}, f_{T−γ+2}, ..., f_T

as well as the state vector x_{T−γ+1} from the local description:

[ ŷ_{T−γ+1} ]   [ D           0    ...   0 ] [ u_{T−γ+1} ]   [ C        ]
[ ŷ_{T−γ+2} ]   [ CB          D    ...   0 ] [ u_{T−γ+2} ]   [ CA       ]
[   ...     ] − [ CAB         CB   ...   0 ] [   ...     ] = [  ...     ] x_{T−γ+1}.   (6.1.5)
[   ...     ]   [ ...              ...   0 ] [   ...     ]   [  ...     ]
[ ŷ_T       ]   [ CA^{γ−2}B   ...  CB   D ] [ u_T       ]   [ CA^{γ−1} ]

If a decoding error is made here, i.e. if more than t_gen errors occur in the above sequence of ŷ_i's, we call it an error of TYPE I.
Since the state vector x_{T−γ+1} is also equal to

x_{T−γ+1} = [ A^{T−γ}B ... B ] [ u_0; ...; u_{T−γ} ]

it is possible to compute the error sequence e_0, ..., e_{T−γ} from the syndrome vector

[ A^{T−γ}B ... B ] [ û_0; ...; û_{T−γ} ] − x_{T−γ+1}.   (6.1.6)

We see now that the key to decoding this part of the code sequence is to have (A^{T−γ}B ... B) be a parity check matrix for a code with good distance, or have some properties that make it easy to decode, e.g. low-density. In this way, one can either choose to decode algebraically with hard decision and require a good distance, or one can decode using Gallager-type algorithms or other soft-decision algorithms where structure is more important than distance. Let the distance of this parity check code be d_par, and let t_par = ⌊(d_par − 1)/2⌋. Hence, if we decode algebraically, we can correct up to t_par errors in the sequence û_0, ..., û_{T−γ}. Decoding errors made in this stage are called errors of TYPE II. If desired, the error sequence f_0, ..., f_{T−γ} can now be computed from the description of the code provided in Proposition 2.2.4, or in Proposition 2.2.5.
Let us now deal with the situation where the received sequence in (6.1.3) is not correct. Several things might happen then. First it is possible that we cannot compute the state vector x_{T−γ+1} from identity (6.1.5), in which case we conclude that the sequence given in (6.1.3) was not correct. It is possible that a wrong state vector x_{T−γ+1} is computed. This will lead to a wrong syndrome vector in (6.1.6) and the computed error sequence e_0, ..., e_{T−γ} will then have weight more than t_par. At that point, we conclude that there must be an error in the sequence in (6.1.3). (The other possibility is that the state has been correctly estimated, but there are actually more than t_par errors in û_0, ..., û_{T−γ}. In this case, the algorithm won't be able, at this stage, to decode correctly.) Finally, there might occur the situation where we compute (by chance) the correct state vector x_{T−γ+1} from identity (6.1.5). In this situation we will correctly find the error sequence e_0, ..., e_{T−γ} as well as the error sequence f_0, ..., f_{T−γ} and in a later stage of the algorithm we will recognize that (6.1.3) was not correct. We will handle this case by saying only that it is `improbable' and affects at most γ inputs (if one is concerned about bit error rate).
Assume now that it has been recognized that (6.1.3) is not correct. In this case we replace the sequence (6.1.3) with the sequence

û_{T−2γ+1}, û_{T−2γ+2}, ..., û_{T−γ}   (6.1.7)

and we assume that this is a correctly transmitted sequence. Again it might turn out that (6.1.7) is not a correctly transmitted sequence. One iteratively proceeds until one finds a correct sequence

û_{T−hγ+1}, û_{T−hγ+2}, ..., û_{T−(h−1)γ}.   (6.1.8)

If we are unable to find such a correct sequence, then we say an error of TYPE III occurs. Such a sequence can be found with high probability, depending on the channel, and, more importantly, the values of T and γ. This sequence will then allow one to correctly compute the state vector x_{T−hγ+1} as well as the errors {e_t, f_t | 0 ≤ t ≤ T − hγ}. After having computed x_{T−hγ+1} we reiterate the algorithm by attempting to compute the state vector x_{2T−(h+1)γ+1}, making the assumption that the sequence

û_{2T−(h+1)γ+1}, û_{2T−(h+1)γ+2}, ..., û_{2T−hγ}   (6.1.9)

has been transmitted correctly.
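To illustrate the role of the local description in the state estimation step, the following toy sketch (matrices and names ours, with δ = 2) recovers the state at the start of a window from error-free input/output pairs by brute force over F_2^δ; an actual decoder would instead solve (6.1.5) with a block decoding algorithm.

```python
import itertools

def mat_vec(M, v):
    """Matrix-vector product over F_2."""
    return [sum(m * x for m, x in zip(row, v)) % 2 for row in M]

def recover_state(A, B, C, D, u_seq, y_seq):
    """Brute-force the state x_tau consistent with the local
    description x_{t+1} = A x_t + B u_t, y_t = C x_t + D u_t."""
    match = None
    for x0 in itertools.product([0, 1], repeat=len(A)):
        x, ok = list(x0), True
        for u, y in zip(u_seq, y_seq):
            if (mat_vec(C, x)[0] + D * u) % 2 != y:
                ok = False
                break
            Bu = [row[0] * u for row in B]
            x = [(a + b) % 2 for a, b in zip(mat_vec(A, x), Bu)]
        if ok:
            match = list(x0)
    return match

# A toy rate 1/2 realization (our choice) with observable (A, C).
A = [[0, 1], [1, 1]]
B = [[1], [0]]
C = [[1, 0]]
D = 1
```

Since (A, C) is observable, a window of length at least the observability index determines the state uniquely when the window is error-free.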
6.2 Some Classes of Binary Input-State-Output Convolutional Codes
We will present two ideas for how to select the matrices (A, B, C, D) so as to construct codes with properties desirable for this decoding scheme. BCH type codes over larger fields have already been constructed using these ideas in [61, 62, 59, 71]. We will examine `maximum distance separable' convolutional codes over larger fields in Chapter 7. Our construction here will focus on permutation matrices and matrices with large order over the binary field.
Definition 6.2.1
• Let A be a nonsingular matrix with minimum polynomial p_A(s). Then the order, n, of the matrix A is equal to the order of the polynomial p_A(s), which is the smallest integer, n, such that p_A(s) divides s^n − 1. Equivalently, n is the smallest integer such that A^n = I. See [41], for example.
• If A is size δ × δ, then A is primitive (as well as its minimum polynomial) if its order is the maximum, q^δ − 1 (over F_q).
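A direct way to check Definition 6.2.1 for small δ (names ours): build the companion matrix and multiply until the identity reappears.

```python
def mat_mul(X, Y):
    """Multiply two square 0/1 matrices over F_2."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) % 2
             for j in range(n)] for i in range(n)]

def matrix_order(A, limit=10000):
    """Smallest n >= 1 with A^n = I over F_2 (None if above limit)."""
    n = len(A)
    I = [[int(i == j) for j in range(n)] for i in range(n)]
    P = [row[:] for row in A]
    for k in range(1, limit + 1):
        if P == I:
            return k
        P = mat_mul(P, A)
    return None

def companion(coeffs):
    """Companion matrix over F_2 of x^d + c_{d-1}x^{d-1} + ... + c_0,
    where coeffs = [c_0, ..., c_{d-1}]."""
    d = len(coeffs)
    M = [[0] * d for _ in range(d)]
    for i in range(1, d):
        M[i][i - 1] = 1
    for i in range(d):
        M[i][d - 1] = coeffs[i]
    return M
```

For the primitive polynomials x^3 + x + 1 and x^4 + x + 1 the orders are 2^3 − 1 = 7 and 2^4 − 1 = 15, while the non-primitive x^2 + 1 = (x + 1)^2 gives order 2.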
We will restrict ourselves to convolutional codes of rate 1/2, so our matrices will have sizes: A is δ × δ, B is δ × 1, C is 1 × δ, and D will be a 1 × 1 matrix (usually the identity).
Notice that we now have a bound on the size of T − γ. Since we require Φ_{T−γ+1}(A, B) to be a parity check matrix, if T − γ ≥ n, the order of A, then the distance of that code will necessarily be 2, and hence will provide for no error correction.
This bound emphasizes the need to choose a matrix, A, with large order. At this point we can go in several directions. Obviously, if we wish to maximize order, we may choose A as a companion matrix of a primitive polynomial. Notice that in this case, the choice of B is irrelevant, so long as B is not the zero vector. However, a matrix with high order doesn't necessarily correspond to good distance. In fact, the parity check subcode is generated by the minimum polynomial of the matrix, so the distance is immediately upper bounded by the weight of this polynomial (and will decrease as T − γ increases). We will make this notion more precise.
There is a one-one correspondence between polynomials of degree d and vectors of length d + 1 given easily by: a_0 + a_1 s + a_2 s^2 + ... + a_d s^d ↔ [a_0, a_1, a_2, ..., a_d]^T. Every codeword (here, codeword refers to any vector u such that [B AB ... A^{T−γ}B] u = 0, i.e. a codeword of the parity check subcode) is a multiple of the minimum polynomial, under the above correspondence.
Definition 6.2.2 The ideal of all polynomials generated by p_A(s) is denoted as (p_A(s)). Let the intersection of this ideal with the set of all polynomials of degree less than or equal to d be denoted by (p_A(s))_{≤d}.
Using these definitions we may reformulate the above discussion:
Proposition 6.2.3 For each value of T, the set of codewords for the parity check subcode is precisely (p_A(s))_{≤(T−γ)}. The distance of this code is exactly the weight of the minimum weight non-zero polynomial in this set.
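Proposition 6.2.3 can be checked by brute force for small parameters (names ours): enumerate all quotient polynomials and take the minimum weight of the resulting multiples of p_A(s).

```python
from itertools import product

def poly_mul(a, b):
    """Multiply binary polynomials given as low-to-high coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] ^= ai & bj
    return out

def parity_subcode_distance(p, d):
    """Minimum weight of a nonzero multiple of p(s) of degree <= d
    (brute force over all quotients; only feasible for small d)."""
    best = None
    for q in product([0, 1], repeat=d - (len(p) - 1) + 1):
        if not any(q):
            continue
        w = sum(poly_mul(p, list(q)))
        if best is None or w < best:
            best = w
    return best
```

For p_A(s) = s^3 + s + 1 and d = 6 the minimum weight is 3 (the polynomial itself), since p_A has order 7 and therefore no weight-2 multiple of degree at most 6 exists.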
Hence, the order of a matrix is not the sole determining factor in the quality of the code. Nevertheless, there are enough good codes that randomly selecting primitive polynomials with a high weight will shortly lead to a good parity check subcode. Since the generator subcode also depends heavily on A and not so much on C, this code also tends to have good distance properties at the same time. An example of a code constructed using these ideas is found below in Example 6.3.2.
Although primitive matrices have, in general, good distance properties, the corresponding parity check subcode lacks any real structure. If we are willing to give up a little distance for some structure, we can quite possibly take advantage of some soft decision decoding techniques. Permutation matrices have the advantages of possessing excellent structure and having potentially large order. Most importantly, if we select a sparse column, with say, t ones, as our matrix B, then each column of Φ_{T−γ+1}(A, B) will have exactly t ones.
6.3 Example Codes
Example 6.3.1 Let us construct a simple example. Let us choose a permutation which is the direct sum of cycles of length 3, 5, 7, and 11. For simplicity we choose the cycles: (2 3 1)(5 6 7 8 4)(10 11 12 13 14 15 9)(17 18 19 20 21 22 23 24 25 26 16). The resulting matrix is 26 × 26, and has order 3 · 5 · 7 · 11 = 1155. We choose as our B matrix the vector with all zeros except a 1 in rows 1, 4, 9, and 16. The corresponding matrix Φ_68(A, B) has distance 5. We choose T − γ to be 67. For a randomly chosen C matrix with weight 13 the code generated by the columns of Ω_γ(A, C) has distance 3 when γ = 34. Hence, we may let T = 101 and γ = 34, so (T + 1)/γ = 3; we will have three intervals to work with for estimating the state.
We now give error probabilities for this code assuming nearest neighbor decoding over a BSC for various values of the channel transition probability p.
TABLE 6.1: ERROR PROBABILITIES FOR THE CODE OF EXAMPLE 6.3.1

p      TYPE I   TYPE II   TYPE III   Block Error
.03    .2717    .3343     .2683      .6453
.01    .0454    .0310     .0242      .0974
.001   .0005    .00007    .000037    .0006
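The order 1155 claimed in the example is just the least common multiple of the cycle lengths; a quick check (names ours):

```python
from math import gcd
from functools import reduce

def permutation_order(cycle_lengths):
    """Order of a permutation matrix that is a direct sum of cycles:
    the lcm of the cycle lengths."""
    return reduce(lambda a, b: a * b // gcd(a, b), cycle_lengths, 1)
```

Since 3, 5, 7 and 11 are pairwise coprime, the lcm equals the product 1155; for non-coprime lengths the order would be strictly smaller than the product.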
Example 6.3.2 We now present a simple example code based on a primitive matrix. Let A be the companion matrix for the primitive polynomial s^20 + s^19 + s^18 + s^17 + s^14 + s^11 + s^10 + s^9 + s^8 + s^7 + s^4 + s^2 + 1 over F_2. Choose B and C as simply the first unit vector, and D = 1. It turns out that for γ = 33, the generator subcode has distance 5. Choosing T = 65 provides a parity check subcode with distance 5. The error probabilities for the code are presented in the following table:
We defer a rigorous analysis of the decoding properties of this decoding algorithm until further refinements are presented. This analysis appears in Section 6.7.
TABLE 6.2: ERROR PROBABILITIES FOR THE CODE OF EXAMPLE 6.3.2

p      TYPE I    TYPE II   TYPE III   Block Error
.03    .0756     .3178     .2549      .5301
.01    .0044     .0287     .0225      .0547
.001   .000005   .000044   .000034    .00008
6.4 An Alternative to State Estimation
In this section we refine the previous algebraic decoding algorithm by altering our treatment of the state. The approach is based on making use of both the input-state-output and output-state-input representations. For simplicity we will limit the presentation to codes of rate 1/2. The adaptation of the method is easily done for rates 1/n. Some effort is required to extend this method to general rates below 1/2 and there is no known extension to higher rates.
Let us recall the state representation given in (2.2.2) and the corresponding equation for the output-state-input representation. If we subtract these equations, we arrive at the following:

(A^{t−τ} − Ã^{t−τ}) x_τ = [ A^{t−τ−1}B ... B | Ã^{t−τ−1}B̃ ... B̃ ] [ −u_τ; ...; −u_{t−1}; y_τ; ...; y_{t−1} ]   (6.4.1)

This equation immediately suggests an iterative parity check decoding scheme: Assume the code has been correctly decoded up to time τ. In particular, x_τ is known. Then (6.4.1) allows us to use parity check decoding on the next t − τ time intervals of the codeword. Correctly decoding this part of the codeword gives the state x_t which enables us to repeat this process through the entire codeword.
The efficiency of this algorithm depends on how the parity check decoding is accomplished and on how the matrices (A, B, C, D) are chosen. As before, there are two basic choices for how to accomplish parity check decoding. First, one can use a straightforward nearest neighbor decoding scheme based solely on the minimum distance of the code. This scheme will turn out to be most efficient if the matrix appearing in (6.4.1) is the parity check matrix for some good block code.
Binary codes that satisfy this requirement are elusive to find. In fact, if k = 1 (hence n = 2) this parity check code must have distance 2. Only by increasing k and choosing a suitably complex matrix D can the distance be increased. For larger field sizes, some examples do exist (see [58, 61, 71]), however the algorithm is still limited by the elimination of information inherent in this process. These limitations will be described more fully in Section 6.5.
Alternatively, one can choose to implement a Gallager-type low-density soft-decision decoding algorithm such as in [27, 46]. Low-density decoding depends on having a parity check matrix with nearly uniform column and row densities. These techniques can overcome a lack of good distance properties. However, the basic strength of soft-decision techniques is that they make full use of all information available, at the expense of complexity, in order to decode. This refinement, as we shall see in Section 6.5, contradicts this idea by sacrificing information in favor of efficiency. Thus, this refinement is not well suited for soft-decision decoding methods.
6.5 Some Notes on the State Elimination Algorithm
We now offer some insight as to the effectiveness of the algorithm presented in Section 6.4.
In order to make full use of the error correcting capabilities of the convolutional codes with the iterative manner of the decoding schemes of this chapter, one must use all the information available when making decoding decisions. We will see that the algorithm presented in Section 6.4 does not make full use of this information. That is, the decoding algorithm will not find the optimal solution. The algorithm is designed to trade off some error correcting capability for reduced complexity.
Let us make these remarks more precise. The full information of the convolutional code is presented in the global description of Proposition 2.2.5. The choice to use an iterative decoding method already limits the amount of information available to the decoder to the information provided by the local description of Proposition 2.2.4.
Theorem 6.5.1 The code defined by the parity check matrix of (6.4.1) is larger than the code defined by the parity check matrix of (3.5.1), i.e.:

ker [ M(A, B, C, D) | I ] ⊆ ker [ Φ(A, B) | Φ(Ã, B̃) ]

Proof: This follows immediately by multiplying the left-hand matrix above by Φ(Ã, B̃) and applying Lemma 2.3.4.
The theorem states that the proposed state elimination algorithm of Section 6.4 does not make full use of the information in the local description. Hence, the algorithm will not perform optimally for an iterative decoding scheme. However, the parity check matrix involved is considerably smaller and hence offers the possibility of providing a reduced complexity for decoding.
6.6 Enhanced State Estimation
In Section 6.4, the issue of finding a correct state in the algebraic decoding algorithm was side-stepped by `canceling' the state from the equations. However, the resulting parity check matrix was less than ideal for error correction. Our new approach is to increase the number of available `windows' that can be used to estimate a correct state. The idea is to make direct use of Lemmas 2.3.3 and 2.3.4. Namely, by multiplying equation (6.1.5) by M(Ã, B̃, C̃, D̃) we arrive at the following:

[ û_{T−γ+1} ]   [ D̃           0     ...   0 ] [ y_{T−γ+1} ]   [ C̃         ]
[ û_{T−γ+2} ]   [ C̃B̃          D̃     ...   0 ] [ y_{T−γ+2} ]   [ C̃Ã        ]
[   ...     ] − [ C̃ÃB̃         C̃B̃    ...   0 ] [   ...     ] = [  ...      ] x_{T−γ+1}.   (6.6.2)
[   ...     ]   [ ...               ...   0 ] [   ...     ]   [  ...      ]
[ û_T       ]   [ C̃Ã^{γ−2}B̃   ...   C̃B̃   D̃ ] [ y_T       ]   [ C̃Ã^{γ−1}  ]

From this equation it is clear that we may now examine any γ outputs to estimate the state x_{T−γ+1}. Of course, when we do this, the matrix Ω_γ(Ã, C̃) will be used as the generator matrix for a linear block code, so it is desirable that this code have a large distance.
Taking this idea even further, it is possible to decode the first T − γ + 1 time intervals using the received outputs. This follows from the output-state-input analog of (6.1.6), namely:

[ Ã^{T−γ}B̃ ... B̃ ] [ ŷ_0; ...; ŷ_{T−γ} ] − x_{T−γ+1}.   (6.6.3)

This idea depends on Φ_{T−γ+1}(Ã, B̃) being the parity check matrix of a linear block code with good distance. Once the outputs are known, it is a simple task to produce the input (information) vector.
6.7 Analysis of the Enhanced Algorithm
Let us consider the decoding performance of the above algorithm given a rate k/n convolutional code, (A, B, C, D), with complexity δ, transmitted over a q-ary symmetric channel with transition probability p.
Denote the minimum distances of the parity check codes given by Φ_{T−γ+1}(A, B) and Φ_{T−γ+1}(Ã, B̃) (for some fixed T and γ) by d_par and d̃_par respectively. Similarly, denote the distances of the block codes generated by Ω_γ(A, C) and Ω_γ(Ã, C̃) by d_gen and d̃_gen respectively. Finally, let I = ⌊(T + 1)/γ⌋, the number of distinct intervals from which to make an estimate of the state.
The following lemma provides an upper bound on the probability that there are no γ consecutive correctly received inputs or outputs from which to estimate the state (i.e. the probability of Type III errors).
Lemma 6.7.1
\begin{align*}
Pr(\text{no error in } \theta \text{ inputs}) &= (1-p)^{k\theta} \\
Pr(\text{no error in } \theta \text{ outputs}) &= (1-p)^{(n-k)\theta} \\
Pr(\text{all } I \text{ input intervals are corrupt}) &= [1-(1-p)^{k\theta}]^I \\
Pr(\text{all } I \text{ output intervals are corrupt}) &= [1-(1-p)^{(n-k)\theta}]^I \\
Pr(\text{Type III error}) &\leq [1-(1-p)^{k\theta}]^I\,[1-(1-p)^{(n-k)\theta}]^I
\end{align*}
For the special case where $k = n-k$:
\[
Pr(\text{Type III error}) \leq [1-(1-p)^{k\theta}]^{2I}
\]
Proof: Follows from elementary properties of probability and binomial distributions. □
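As an illustrative sketch (not part of the dissertation's own analysis), the Type III bound of Lemma 6.7.1 is easy to evaluate numerically; the parameter values below are arbitrary demonstration choices.

```python
# Sketch: numerically evaluate the Type III bound of Lemma 6.7.1.
# The parameters (k, n, theta, T, p) are illustrative choices, not values
# fixed by the text.
from math import floor

def type_iii_bound(p, k, n, theta, T):
    """Upper bound on Pr(type III error) over a q-ary symmetric channel."""
    I = floor((T - theta + 1) / theta)        # number of distinct estimation intervals
    no_err_in = (1 - p) ** (k * theta)        # theta error-free input symbols
    no_err_out = (1 - p) ** ((n - k) * theta) # theta error-free output symbols
    return ((1 - no_err_in) ** I) * ((1 - no_err_out) ** I)

# Rate 1/2 (k = 1, n = 2), so k = n - k and the bound collapses to
# [1 - (1-p)^(k*theta)]^(2I), as in the special case of the lemma.
p, k, n, theta, T = 0.01, 1, 2, 4, 11
b = type_iii_bound(p, k, n, theta, T)
print(f"I = {floor((T - theta + 1) / theta)}, Pr(type III) <= {b:.3e}")
```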
Remark 6.7.2 The above lemma gives only an upper bound on the probability of type III errors because we have restricted ourselves to investigating only distinct intervals in search of $\theta$ consecutive correctly transmitted inputs or outputs. If the first interval is corrupt it is not necessary to `slide' $\theta$ time units. It is possible (although unlikely) that only the last time unit in the interval was corrupted and that it would suffice to slide only one unit before recomputing the state. This however would lead to an extreme rise in complexity. A more reasonable alternative would be to slide about $\theta/2$ time units. The code designer must make this determination in evaluating complexity limitations versus error correcting needs.
Now, assume that a Type III error has been avoided and we have found $\theta$ consecutive correctly transmitted inputs or outputs. The following lemma gives the probability that a correct state estimate cannot be obtained (i.e. a type I error occurs).

Lemma 6.7.3 If the correctly transmitted sequence is comprised of inputs:
\[
Pr(\text{Type I error}) = 1 - \sum_{i=0}^{t_{gen}} \binom{k\theta}{i} p^i (1-p)^{k\theta-i}
\]
If the correctly transmitted sequence is comprised of outputs:
\[
Pr(\text{Type I error}) = 1 - \sum_{i=0}^{\tilde t_{gen}} \binom{(n-k)\theta}{i} p^i (1-p)^{(n-k)\theta-i}
\]
where
\[
t_{gen} = \lfloor (d_{gen}-1)/2 \rfloor, \qquad \tilde t_{gen} = \lfloor (\tilde d_{gen}-1)/2 \rfloor.
\]
Proof: These statements are just the binomial probabilities that the number of errors does not exceed the error correcting capabilities of the generator subcodes. With this condition, the state estimate can be obtained from (6.1.5) or (6.6.2) respectively. □
Remark 6.7.4 If the distance of the generator subcodes was not a factor, we would be able to significantly reduce the probability of Type III errors. We have only considered `solving' (6.1.5) for all the inputs or all the outputs. In general, simple row operations would enable us to solve for either the input or the output of each time unit. That would increase the number of possible `windows' in each interval from 2 up to $2^\theta$. Theoretically this would reduce the probability of Type III errors to virtually 0. Besides the complexity issues associated with this approach, the corresponding generator subcodes would change with each arrangement. It would be an impossible task to design a code with all $2^\theta$ of these codes possessing good distance properties. It is extraordinary that we are able to find codes with good generator subcode distance properties for just our two chosen arrangements. Such codes will be presented in Section 7.1.
Similarly we can compute the probability of Type II errors.

Lemma 6.7.5
\[
Pr(\text{Type II error}) = 1 - \sum_{i=0}^{t_{par}} \binom{T-\theta+1}{i} p^i (1-p)^{T-\theta+1-i}
\]
where $t_{par} = \lfloor (d_{par}-1)/2 \rfloor$.
Remark 6.7.6 In theory, it would be possible to use both the input and output sequences to avoid Type II errors. That is, if the received input sequence has too many errors then one could try to decode using the received output sequence via (6.6.3). The problem is that in many cases it is not known when a decoding error occurs. Of course, there are many equations against which we can check our decoded inputs. This would be done at the expense of complexity. A simpler alternative might be to decode on both sides, i.e. use (6.1.6) and (6.6.3) simultaneously and compare the results. This would help affirm a correct decoding, but there would be no obvious `tie-breaker' if the two decoded sequences disagreed.

We now have the following obvious upper bound on the probability of decoding error in the first block. Furthermore, if we assume that we have the correct state at time $\tau$, then the same bound holds for the block beginning with $\tau$. (If the state $x_\tau$ is incorrect, we may obtain it in the same way we obtained the state above; i.e. the error propagation effect inherent in this incremental decoding scheme can be limited by the use of our state estimation algorithm.)

Theorem 6.7.7 The probability of block error in the enhanced decoding algorithm is upper bounded by
\[
1 - [1 - Pr(\text{type I})][1 - Pr(\text{type II})][1 - Pr(\text{type III})].
\]
Examples of codes suitable for this decoding scheme will be presented in Section 7.1.
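The three lemmas can be combined into the bound of Theorem 6.7.7; the sketch below is illustrative only. It assumes the MDS subcode distances $d_{gen} = \theta - \delta + 1$ and $d_{par} = \delta + 1$ of Section 7.1, and its output need not match Table 7.1 exactly, whose entries were computed by the author.

```python
# Sketch: combine Lemmas 6.7.1, 6.7.3 and 6.7.5 into the block-error bound of
# Theorem 6.7.7 for a rate 1/2 code. The subcode distances below assume the
# MDS values d_gen = theta - delta + 1 and d_par = delta + 1 of Section 7.1.
from math import comb, floor

def binom_tail(t, m, p):
    """Probability that more than t of m symbols are in error."""
    return 1 - sum(comb(m, i) * p**i * (1 - p)**(m - i) for i in range(t + 1))

def block_error_bound(p, delta, theta, T):
    k = 1                                         # rate 1/2: k = n - k = 1
    I = floor((T - theta + 1) / theta)
    pr3 = (1 - (1 - p)**(k * theta))**(2 * I)     # Lemma 6.7.1, case k = n - k
    t_gen = (theta - delta) // 2                  # floor((d_gen - 1)/2), MDS
    pr1 = binom_tail(t_gen, k * theta, p)         # Lemma 6.7.3
    t_par = delta // 2                            # floor((d_par - 1)/2), MDS
    pr2 = binom_tail(t_par, T - theta + 1, p)     # Lemma 6.7.5
    return 1 - (1 - pr1) * (1 - pr2) * (1 - pr3)  # Theorem 6.7.7

print(block_error_bound(p=0.01, delta=2, theta=4, T=7))
```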
6.8 An Algebraic Look at Decoding Turbo Codes

The basic idea in turbo decoding is to have two separate decoders working together to iteratively decode the codeword. The first decoder receives the first output stream, $y$, and makes a soft decision estimate for each bit using any of a number of proposed schemes for decoding an RSC. This estimate is passed along to the second decoder, which also receives the other output stream, $\hat y$. Using both streams of data, the second decoder makes an updated estimate of the codeword. This data is then sent back to the original decoder, which tries to improve its estimate based on this new data. This swapping of information between decoders is repeated until a `reliable' decoding decision is obtained.

Again, there is a choice as to which soft-decision decoding algorithm to use in the above process. The original work done in this area used a modified version of the BCJR decoding algorithm [7]. More recently the focus has shifted to a soft output Viterbi algorithm (SOVA) [28, 13]. Another approach is offered in [31]. Regardless of which approach is used, the basic principle is the same. Decoding is based on the fact that several parity check matrices are available to us. The first two are immediate from the input-state-output representation and the fact that we require each valid
45
N -blo k to begin and end with both en oders in the all zero state. The last two follow from the
output-state-input representation (See Proposition 2.3.1).

N (A; B ) u = 0 (6.8.1)
N (A; B ) S u = 0 (6.8.2)
~ ~
N (A; B ) y = 0 (6.8.3)
~ ~
N (A; B ) y^ = 0 (6.8.4)
Again, the key idea, to maximize the de oding potential of the turbo ode, is the sharing of the
soft de ision estimates between the de oders
Remark 6.8.1 Nowhere is the hoi e between hard-de ision and soft-de ision de oding so lear as
here. We an see that if N is greater than the order of the matrix A (whi h is less than 2Æ ) then
the parity he k matri es must have distan e 2 and o er no hard-de ision de oding ability. On the
other hand, if N is less than or equal to the order of A then the only possible hoi e for S , with the
requirements we have imposed, is the identity matrix and our turbo ode be omes little more than
a repetition ode omposed with an RSC. Thus, soft-de ision de oding is the only viable hoi e.

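The reason (6.8.1) acts as a parity check on valid $N$-blocks is that $\Gamma_N(A,B)\,u$ equals the state $x_N$ reached from $x_0 = 0$. The following sketch (illustrative, not from the dissertation; it borrows the small $\delta = 2$ system over $\mathbb F_7$ of Section 7.1) checks this equivalence exhaustively.

```python
# Sketch: for the system A = diag(3, 2), B = (1, 1) over F_7, check that
# Gamma_N(A, B) u = 0 holds exactly for those input blocks u that drive the
# state from zero back to zero -- which is why (6.8.1) is a parity check on
# valid N-blocks.
from itertools import product

p, N = 7, 3
A = [[3, 0], [0, 2]]
B = [1, 1]

def mat_vec(M, v):
    return [sum(M[r][c] * v[c] for c in range(len(v))) % p for r in range(len(M))]

for u in product(range(p), repeat=N):
    # State recursion x_{t+1} = A x_t + B u_t starting from x_0 = 0.
    x = [0, 0]
    for ut in u:
        x = [(xi + bi * ut) % p for xi, bi in zip(mat_vec(A, x), B)]
    # Gamma_N(A, B) u = sum_t A^(N-1-t) B u_t (A is diagonal here).
    gamma_u = [0, 0]
    Apow = [1, 1]                       # diagonal of A^(N-1-t), starting at I
    for t in range(N - 1, -1, -1):
        gamma_u = [(g + Apow[i] * B[i] * u[t]) % p for i, g in enumerate(gamma_u)]
        Apow = [Apow[0] * A[0][0] % p, Apow[1] * A[1][1] % p]
    assert (gamma_u == [0, 0]) == (x == [0, 0])
print("Gamma_N(A,B) u = 0  <=>  x_N = 0 for all", p**N, "input blocks")
```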
CHAPTER 7

MDS CONVOLUTIONAL CODES

In this chapter the class of MDS convolutional codes given by a Reed-Solomon type construction will be examined. In particular, it will be shown that this class of codes is extremely well suited to the algebraic decoding techniques of Section 6.6. Then we will investigate the column distance function for this class of codes and compare them with theoretical limits.

7.1 Reed-Solomon Convolutional Codes

The natural question that arises from the description of the enhanced algebraic decoding algorithm of Section 6.6 is whether there exist codes which have good distance properties for each of the subcodes. Ideally, these subcodes would also possess some other properties which would aid in the decoding process. The answer to this question is yes, if one considers fields larger than $\mathbb F_2$. Consider the following code construction which was presented in [65]; similar codes were presented in [34].

We will only consider rate $1/2$. The extension to rate $1/n$ is simple. The existence of these codes at higher rates is the content of [60]. Let us fix a complexity $\delta$. Choose $\mathbb F_q$ so that $q \geq 3\delta + 1$.
Find a primitive element, $\alpha$, of $\mathbb F_q$. Define the following matrices:
\[
A = \begin{bmatrix} \alpha & 0 & \cdots & 0 \\ 0 & \alpha^2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \alpha^{\delta} \end{bmatrix}, \quad
B = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}, \quad
D = [1], \text{ and}
\]
\[
A' = \begin{bmatrix} \alpha^{\delta+1} & 0 & \cdots & 0 \\ 0 & \alpha^{\delta+2} & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & \alpha^{2\delta} \end{bmatrix}.
\]
The definition of the matrix $C$ requires some computation. We would like for $\tilde A = A - BC$ to be similar to $A'$. This can be done by computing the characteristic polynomial of $A - BC$ (with the entries of $C$ denoted by variables), setting this polynomial equal to the characteristic polynomial of $A'$ and solving for the entries of $C$.

Definition 7.1.1 The codes defined above are called Reed-Solomon convolutional codes.
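As an illustration of this computation (a sketch, not the dissertation's own code), one can recover the matrix $C = [\,3\ \ 6\,]$ of Example 7.2.5, for $\delta = 2$ over $\mathbb F_7$ with $\alpha = 3$, by a small brute-force search.

```python
# Sketch: compute C so that A_tilde = A - B*C has the same characteristic
# polynomial as A', for the Reed-Solomon convolutional code with delta = 2
# over F_7 (alpha = 3). Arithmetic is mod 7 since the field is prime.
p = 7
alpha = 3
A  = [[alpha % p, 0], [0, alpha**2 % p]]        # diag(alpha, alpha^2)
Ap = [[alpha**3 % p, 0], [0, alpha**4 % p]]     # diag(alpha^3, alpha^4)
B  = [1, 1]

def charpoly2(M):
    """(trace, det) of a 2x2 matrix mod p, i.e. s^2 - tr*s + det."""
    tr = (M[0][0] + M[1][1]) % p
    det = (M[0][0] * M[1][1] - M[0][1] * M[1][0]) % p
    return tr, det

target = charpoly2(Ap)
# Brute-force search over the entries of C = [c1 c2].
solutions = []
for c1 in range(p):
    for c2 in range(p):
        At = [[(A[0][0] - B[0] * c1) % p, (-B[0] * c2) % p],
              [(-B[1] * c1) % p, (A[1][1] - B[1] * c2) % p]]
        if charpoly2(At) == target:
            solutions.append((c1, c2))
print(solutions)   # -> [(3, 6)], matching C = [3 6] of Example 7.2.5
```

The matching conditions are two linear equations in $c_1, c_2$ with nonsingular coefficient matrix, so the solution is unique.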
Theorem 7.1.2 (MDS Property of Reed-Solomon Codes)

1. The parity check subcodes defined by $\Gamma_{T-\theta+1}(A,B)$ and $\Gamma_{T-\theta+1}(\tilde A,\tilde B)$ of Reed-Solomon convolutional codes are maximum distance separable (MDS) codes and therefore have minimum distance $\delta+1$ for any $T-\theta$ such that $\delta \leq T-\theta \leq q-2$.

2. The generator subcodes defined by $\Omega_\theta(A,C)$ and $\Omega_\theta(\tilde A,\tilde C)$ of Reed-Solomon convolutional codes are MDS block codes and therefore have minimum distance $\theta-\delta+1$ for any $\theta$ such that $\delta \leq \theta \leq q-2$.
Proof:

1. The matrix $\Gamma_{T-\theta+1}(A,B)$ is a Vandermonde matrix and hence every full-size minor is nonzero. Therefore, any vector in the kernel of this matrix (i.e. a codeword) must have weight at least $\delta+1$. On the other hand, since $(\tilde A, \tilde B, \tilde C, \tilde D)$ also represents the same code and has the same (minimal) complexity, it must be true that $(\tilde A, \tilde B)$ form a controllable pair. Since $\tilde B = B$ and $\tilde A = A - BC = S^{-1}A'S$ for some invertible scalar matrix $S$, we conclude that $(A', SB)$ form a controllable pair as well. It follows that $SB$ has no zero entries. From this we conclude that $\Gamma_{T-\theta+1}(A', SB)$ is a Vandermonde matrix. The result follows by noting that $\Gamma_{T-\theta+1}(A', SB) = S\,\Gamma_{T-\theta+1}(\tilde A, \tilde B)$.
2. The result for the observability matrices will follow by the same line of reasoning as above once we can establish that the matrix $C$ contains no zero entries. This is equivalent to saying that the pair $(A,C)$, and hence the convolutional code, is observable. This can be seen as follows. The corresponding polynomial generator matrix representation of these codes is simply $[\,p_A(s)\ \ p_{A'}(s)\,]$, the characteristic polynomials of the matrices $A$ and $A'$. Since these polynomials have distinct roots ($\{\alpha, \alpha^2, \ldots, \alpha^\delta\}$ and $\{\alpha^{\delta+1}, \alpha^{\delta+2}, \ldots, \alpha^{2\delta}\}$ respectively) the gcd of the full-size minors is trivial and hence the code is observable. □

Remark 7.1.3 Not only do these codes possess excellent subcode distance properties, but the subcodes, being BCH codes (usually not technically Reed-Solomon), may be decoded efficiently via the Berlekamp-Massey algorithm.
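As a quick numerical sanity check (an illustrative sketch in the notation of Theorem 7.1.2, not code from the dissertation), the MDS property of the parity check subcode $\Gamma_4(A,B)$ can be verified for the $\delta = 2$, $q = 7$ code of Example 7.2.5.

```python
# Sketch: verify that the block code with parity check matrix
# Gamma_4(A, B) = [A^3 B, A^2 B, A B, B] (delta = 2, q = 7, alpha = 3)
# has minimum distance delta + 1 = 3, i.e. is MDS of length 4, dimension 2.
from itertools import product

p, delta, n = 7, 2, 4
eig = [3, 2]                     # diagonal of A = diag(alpha, alpha^2)
B = [1, 1]
# Columns of Gamma_n(A, B): A^(n-1)B, ..., A B, B.
cols = [[pow(e, j, p) * b % p for e, b in zip(eig, B)]
        for j in range(n - 1, -1, -1)]

dmin = min(
    sum(1 for s in v if s)                       # Hamming weight of the codeword
    for v in product(range(p), repeat=n)
    if any(v) and all(sum(c[r] * vi for c, vi in zip(cols, v)) % p == 0
                      for r in range(delta))     # v lies in ker Gamma_4(A, B)
)
print("minimum distance of the parity check subcode:", dmin)
assert dmin == delta + 1
```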
Table 7.1 presents the probability of block error for many values of $\delta$, $q$, $\theta$, $T$ and transition probability $p$ when the enhanced decoding algorithm of Section 6.6 is used on a $q$-ary symmetric channel. Figure 7.1 graphs a few of these codes over a wide range of transition probabilities. The two charts show that, in general, there is not one code in this class that is optimal for all channel transition probabilities. In particular, it is apparent that lower complexity codes perform better on noisier channels, while higher complexity codes achieve better error rates on better quality channels.

TABLE 7.1: ERROR PROBABILITIES FOR VARIOUS REED-SOLOMON CONVOLUTIONAL CODES USING THE ENHANCED DECODING ALGORITHM

      Code Parameters     |      Probabilities of Block Error
  q    δ    θ    T+1      |  p=.03    p=.01    p=.005    p=.001
  7    2    4      8      |  .0105    .0012    .0003     1.2e-5
  11   3    5     10      |  .0173    .0019    .0005     2.0e-5
  13   4    6     12      |  .0137    .0015    .0004     1.5e-5
  13   4    8     16      |  .0049    .0001    1.6e-5    1.2e-7
  16   5    7     14      |  .0193    .0021    .0005     2.1e-5
  16   5    9     18      |  .0072    .0002    2.4e-5    1.7e-5
  32   10   12    24      |  .0570    .0063    .0016     6.6e-5
  32   10   14    42      |  .0096    .0003    4.4e-5    3.6e-7
  32   10   16    32      |  .0232    .0005    3.6e-5    6.5e-8
  47   15   19    57      |  .0253    .0008    .0001     9.6e-7
  47   15   21    63      |  .0143    .0001    4.5e-6    6.0e-9
  47   15   23    69      |  .0169    8.0e-5   1.8e-6    1.7e-10
  64   21   27    81      |  .0389    .0003    1.4e-5    1.8e-8
  64   21   29    87      |  .0423    .0003    6.5e-6    6.6e-10
  128  42   50   150      |  .2415    .0039    .0001     1.6e-8
7.2 Further Properties of Reed-Solomon Convolutional Codes

We continue our exploration of Reed-Solomon convolutional codes by examining another distance parameter of these codes, namely the column distance function. For background on this distance parameter, please see [43, 30].
Definition 7.2.1 For any convolutional code, let $v$ and $\tilde v$ be any codewords. Also, for any codeword, let
\[
[v]_\gamma = (v_0, v_1, v_2, \ldots, v_\gamma)
\]
[Figure 7.1 is a semilog plot of the probability of block error (from $10^0$ down to $10^{-10}$) versus the transition probability $p$ (from 0 to 0.03) for the codes with $(q, \delta, \theta, T+1)$ equal to $(13, 4, 8, 16)$, $(32, 10, 16, 32)$, $(47, 15, 23, 69)$ and $(128, 42, 50, 150)$.]

FIGURE 7.1: Plot of block error probabilities for selected Reed-Solomon convolutional codes.

denote the th trun ation. Then the olumn distan e fun tion (CDF) of order is denoted by d
and de ned as
d = minfdist( [v℄ ; [v~℄ ) : [v℄0 6= [v~℄0 g
= minfwt ([v℄ ) : [v℄0 6= 0g:
For the spe ial ase of when = m (i.e. over one onstraint length), dm is alled the minimum
distan e, dmin , of the onvolutional ode.
It is lear that the CDF is a nonde reasing fun tion of and that for all it must be that
d  dfree . This distan e parameter is important for many de oding s hemes in luding sequential
de oding and majority logi de oding.
We now state the conjecture which we will discuss for the remainder of this chapter.

Conjecture 7.2.2 For a rate $1/2$ Reed-Solomon convolutional code with complexity $\delta$, and for $\delta - 1 \leq \gamma \leq 3\delta - 1$,
\[
d_\gamma \geq \gamma - \delta + 3.
\]
In particular, for $\gamma = 3\delta - 1$, we have $d_{3\delta-1} = 2\delta + 2$.

Remark 7.2.3 It was shown in [65] that the free distance of these codes is $2\delta+2$. So we can restate the above conjecture in these terms: if at any time, $\tau$, two codewords are in the same state, $x_\tau$, and do not agree at time $\tau$, then over the interval $(\tau, \tau+1, \ldots, \tau+3\delta-1)$ the codewords must differ by the full free distance, $2\delta+2$, of the code. This would be an important result since the full `decoding power' of the code would be contained in any interval of length $3\delta$.
There is a wealth of empirical evidence to support this conjecture, yet neither a proof nor a counterexample has been discovered. We offer several example codes, for small $\delta$, which satisfy the properties of the conjecture.

Example 7.2.4 For $\delta = 1$ and $\mathbb F_4$, let $\alpha$ be a root of $s^2 + s + 1$ in $\mathbb F_4$. Define the Reed-Solomon convolutional code using the following matrices:
\[
A = [\,\alpha\,], \quad A' = [\,\alpha+1\,], \quad B = [\,1\,], \quad C = [\,1\,], \quad D = [\,1\,]
\]
One can easily verify by hand that this code has $d_2 = 4$.
Example 7.2.5 For $\delta = 2$ and $\mathbb F_7$, choose $\alpha = 3$ for the primitive element. Then the defining matrices of our code are given by:
\[
A = \begin{bmatrix} 3 & 0 \\ 0 & 2 \end{bmatrix}, \quad
A' = \begin{bmatrix} 6 & 0 \\ 0 & 4 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad
C = [\,3\ \ 6\,], \quad D = [1]
\]
Again, an easy calculation verifies that this code has $d_5 = 6$.
Example 7.2.6 For $\delta = 3$ and $\mathbb F_{11}$, choose $\alpha = 2$ as the primitive element. Then we may define our code using the following matrices:
\[
A = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 8 \end{bmatrix}, \quad
A' = \begin{bmatrix} 5 & 0 & 0 \\ 0 & 10 & 0 \\ 0 & 0 & 9 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix},
\]
\[
C = [\,8\ \ 1\ \ 3\,], \quad D = [1]
\]
A short computer verification (although possible to do by hand) reveals that this code has $d_8 = 8$.
Example 7.2.7 For $\delta = 4$ and $\mathbb F_{13}$, choose $\alpha = 7$. Then the following matrices define our code:
\[
A = \begin{bmatrix} 7 & 0 & 0 & 0 \\ 0 & 10 & 0 & 0 \\ 0 & 0 & 5 & 0 \\ 0 & 0 & 0 & 9 \end{bmatrix}, \quad
A' = \begin{bmatrix} 11 & 0 & 0 & 0 \\ 0 & 12 & 0 & 0 \\ 0 & 0 & 6 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix},
\]
\[
C = [\,11\ \ 2\ \ 6\ \ 6\,], \quad D = [\,1\,]
\]
Again, a computer verification yields that $d_{11} = 10$.
Remark 7.2.8 It should be noted that for each of the example codes above, the full conjecture holds true, not just for $\gamma = 3\delta - 1$. In fact, for smaller $\gamma$ the bound is not tight. In particular, for the example when $\delta = 4$, it can be shown that $d_3 = 5$ while the bound of the conjecture is merely $d_3 \geq 2$. A better conjecture for smaller $\gamma$ will follow.
7.3 Some Insights into the Conjecture

We will analyze the nature of these codes in order to better understand why the conjecture should hold. What we learn will enable us to handle the conjecture for small values of $\delta$.

Remark 7.3.1 Let us make some observations which will simplify our analysis. First, we assume without loss of generality that the first input, $u_0$, is 1 (hence $y_0 = 1$). If the conjecture is false then there is a partial code sequence $\left\{\binom{u_0}{y_0}, \binom{u_1}{y_1}, \ldots, \binom{u_{3\delta-1}}{y_{3\delta-1}}\right\}$ with total weight less than $2\delta + 2$. This implies either the input sequence or the output sequence has weight less than $\delta+1$. Since these codes have symmetric distance properties for the input-state-output and output-state-input representations, we can assume without loss of generality that the input sequence has weight less than $\delta+1$. In particular this implies that any counterexample cannot be a complete codeword since at least $\delta+1$ nonzero inputs are required to bring the underlying linear system into the all zero state once it has left (since $\Gamma(A,B)$ has distance $\delta+1$).
Consider the special case when only the first $w$ inputs are allowed to be nonzero and all other inputs are zero. Then the input is $1, u_1, u_2, \ldots, u_{w-1}, 0, \ldots, 0$, and the local description of the code becomes:
\[
\begin{bmatrix}
y_0 \\ y_1 \\ \vdots \\ y_{w-1} \\ y_w \\ y_{w+1} \\ \vdots \\ y_\gamma
\end{bmatrix}
=
\begin{bmatrix}
D & 0 & \cdots & 0 \\
CB & D & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
CA^{w-2}B & \cdots & CB & D \\
CA^{w-1}B & \cdots & CAB & CB \\
CA^{w}B & \cdots & CA^2B & CAB \\
\vdots & & \vdots & \vdots \\
CA^{\gamma-1}B & \cdots & CA^{\gamma-w+1}B & CA^{\gamma-w}B
\end{bmatrix}
\begin{bmatrix}
1 \\ u_1 \\ \vdots \\ u_{w-1}
\end{bmatrix}
\]
We will obtain a lower bound on the weights of the last $\gamma - w + 1$ outputs $y_w, \ldots, y_\gamma$ by rearranging the bottom half of the matrix on the right side of the above equation as follows:
\[
\begin{bmatrix} y_w \\ y_{w+1} \\ \vdots \\ y_\gamma \end{bmatrix}
=
\begin{bmatrix} B^T \\ B^TA \\ \vdots \\ B^TA^{\gamma-w} \end{bmatrix}
\begin{bmatrix} c_1 & & \\ & \ddots & \\ & & c_\delta \end{bmatrix}
\begin{bmatrix} A^{w-1}B & \cdots & AB & B \end{bmatrix}
\begin{bmatrix} 1 \\ u_1 \\ \vdots \\ u_{w-1} \end{bmatrix}
\]
This equation says that the last $\gamma - w + 1$ outputs form a codeword of a code generated by a Vandermonde matrix. This code has distance $\gamma - w - \delta + 2$. Hence, if the input to this matrix is not the all zero vector then the last $\gamma - w + 1$ outputs must have weight at least $\gamma - w - \delta + 2$.
Let us analyze the input to this matrix to ensure that the all zero vector is not the input. First, since all the $c_i$ are nonzero, the second matrix on the right side in the above equation is nonsingular. All that remains is to show that
\[
[\,A^{w-1}B\ \cdots\ AB\ B\,]\,[\,1\ u_1\ \cdots\ u_{w-1}\,]^T
\]
is nonzero. However, from (2.2.2), we know that this expression is exactly the state at time $w$, $x_w$. Remark 7.3.1 specifically excludes the possibility that we re-enter the all zero state. Hence, the input into the Vandermonde matrix is not the all zero vector, and we have proven the following lemma.

Lemma 7.3.2 Let $\delta \leq \gamma \leq 3\delta-1$ and $w \leq \delta$. If the input is of the form $1, u_1, u_2, \ldots, u_{w-1}, 0, \ldots, 0$ then the outputs $y_w, y_{w+1}, \ldots, y_\gamma$ have weight at least $\gamma - w - \delta + 2$.

For the special case when only the first $w$ inputs are nonzero, the conjectured property must hold. This is true since the inputs have weight $w$ and the outputs have weight at least $\gamma - w - \delta + 3$ (since $y_0 \neq 0$). This lemma will be a valuable tool for use in analyzing the conjecture.
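As a small numerical check (again an illustrative sketch, not the author's code), the bound of Lemma 7.3.2 can be confirmed for the $\delta = 2$ code of Example 7.2.5 with $\gamma = 5$ and $w \in \{1, 2\}$.

```python
# Sketch: check the bound of Lemma 7.3.2 -- for inputs supported on the first
# w positions, the outputs y_w..y_gamma have weight >= gamma - w - delta + 2.
# Code: the delta = 2 Reed-Solomon convolutional code over F_7, gamma = 5.
from itertools import product

p, delta, gamma = 7, 2, 5
A = [[3, 0], [0, 2]]
B = [1, 1]
C = [3, 6]

def outputs(u):
    x, ys = [0, 0], []
    for ut in u:
        ys.append((C[0] * x[0] + C[1] * x[1] + ut) % p)   # D = [1]
        x = [(A[0][0] * x[0] + B[0] * ut) % p,
             (A[1][1] * x[1] + B[1] * ut) % p]
    return ys

for w in (1, 2):
    bound = gamma - w - delta + 2
    for head in product(range(p), repeat=w - 1):
        u = (1,) + head + (0,) * (gamma + 1 - w)          # u_0 = 1, zeros after w
        tail = outputs(u)[w:]                             # outputs y_w .. y_gamma
        assert sum(1 for y in tail if y) >= bound
print("Lemma 7.3.2 bound holds for w = 1, 2")
```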
Let us consider an alternate approach based on induction on $\gamma$. This approach will further clarify some of the properties of these codes and will serve to produce a conjecture for the exact column distance function of these codes, rather than the previously stated lower bound conjecture.

There are $2\gamma+2$ bits in each partial codeword we are considering. The conjecture is that at least $\gamma-\delta+3$ of these are nonzero. We know that $u_0$ and $y_0$ are nonzero, so $\gamma-\delta+1$ of the remaining $2\gamma$ bits must be nonzero for the conjecture to hold. This means that at most $\gamma+\delta-1$ of these bits are zero. In this situation there are at least $\delta-1$ time units in which both the input and output bits are zero. Let us consider a time unit, $\tau$, where this is the case. From the input-state-output representation, it follows that $Cx_\tau = 0$, hence $x_\tau \in \ker C$.

If there exists a counterexample to the conjecture, then there must be at least $\delta$ time units in which the input and output bits are zero. This corresponds to $\delta$ states in the kernel of $C$. Since $\ker C$ has dimension $\delta-1$, it may be possible to prove the conjecture by proving some sort of linear independence property of the states.

Although the above argument has not led to a proof of the conjecture, it has led us to a plausible conjecture for the actual column distance function of these codes. We begin with a definition.
Definition 7.3.3 For any given input sequence $u$ of length $\gamma+1$ (with $u_0 \neq 0$), denote by $K_{\gamma,u}$ the number of time units in which both the input and output bits are zero. Denote by $D_{\gamma,u}$ the number of time units in which both the input and output bits are nonzero.

The following proposition is clear.

Proposition 7.3.4 The weight of the partial codeword corresponding to the input sequence $u$ is given by $\gamma + 1 + D_{\gamma,u} - K_{\gamma,u}$. The minimum such weight over all appropriate input sequences is $d_\gamma$.
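The counting identity of Proposition 7.3.4 is easy to confirm numerically; the sketch below (illustrative only) checks it exhaustively for the $\delta = 2$ code of Example 7.2.5.

```python
# Sketch: verify weight([v]_gamma) = gamma + 1 + D - K on the delta = 2 code
# over F_7, where K counts time units with both bits zero and D counts time
# units with both bits nonzero.
from itertools import product

p, gamma = 7, 3            # a small gamma keeps the exhaustive search cheap
A = [[3, 0], [0, 2]]
B = [1, 1]
C = [3, 6]

def run(u):
    x, pairs = [0, 0], []
    for ut in u:
        y = (C[0] * x[0] + C[1] * x[1] + ut) % p     # D = [1]
        pairs.append((ut % p, y))
        x = [(A[0][0] * x[0] + B[0] * ut) % p,
             (A[1][1] * x[1] + B[1] * ut) % p]
    return pairs

for u in product(range(p), repeat=gamma + 1):
    if u[0] == 0:
        continue                                     # definition requires u_0 != 0
    pairs = run(u)
    weight = sum((a != 0) + (b != 0) for a, b in pairs)
    K = sum(1 for a, b in pairs if a == 0 and b == 0)
    D = sum(1 for a, b in pairs if a != 0 and b != 0)
    assert weight == gamma + 1 + D - K
print("Proposition 7.3.4 verified for all", (p - 1) * p**gamma, "inputs")
```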
Definition 7.3.5 Consider all input sequences of the form $(1, u_1, u_2, \ldots, u_\gamma)$, and the corresponding sequence of states in the partial codeword each input sequence generates. For each such sequence, consider the dimension of the space spanned by the states that are in $\ker C$. Denote by $N_{\gamma,w}$ the maximum such dimension over all the input sequences of weight $w$.

Conjecture 7.3.6
\[
N_{\gamma,w} = \max_{\{u\,:\,\operatorname{wt}(u)=w\}} \left( K_{\gamma,u} - D_{\gamma,u} + 1 \right)
\]
Remark 7.3.7 The above conjecture, when combined with Proposition 7.3.4, states that $d_\gamma$ is dependent on the number of states in a code sequence that are in $\ker C$. That fact is trivial. What the conjecture intends to point out is that it requires nonzero inputs to drive the system into a state which facilitates low output, i.e. you cannot reduce output weight without increasing input weight and conversely. Since $N_{\gamma,w}$ is at most $\delta-1$, we see that Conjecture 7.3.6 combined with Proposition 7.3.4 reduces to Conjecture 7.2.2. Computation of the exact values of $N_{\gamma,w}$, as difficult as that may be, leads to the following conjectured exact value for $d_\gamma$.

Conjecture 7.3.8 Define:
\[
N_\gamma = \max_{1 \leq w \leq \delta} N_{\gamma,w}
\]
Then Proposition 7.3.4 and Conjecture 7.3.6 combine to state:
\[
d_\gamma = \gamma + 2 - N_\gamma
\]
7.4 The Case when $\delta = 2$

We will give a thorough examination of the case when $\delta = 2$. (The case when $\delta = 1$ is completely trivial.) Even for this small case, there is no known proof of Conjecture 7.2.2. However, it can be easily shown that $d_5 \geq 5$, which is just as good from a hard decision error correction point of view.

Proposition 7.4.1 For Reed-Solomon convolutional codes with $\delta = 2$ and $1 \leq \gamma \leq 3$, we have
\[
d_\gamma \geq \gamma - \delta + 3 = \gamma + 1
\]
Proof: For $\gamma = 1$, the statement is trivial since $u_0 \neq 0$ and $y_0 \neq 0$. For the larger $\gamma$ it suffices to consider input sequences of weight 1 in light of Remark 7.3.1. Thus, $u_0 = 1$ and all other inputs are 0. We also have that $y_0 = Du_0 \neq 0$. The remaining outputs are given by:
\begin{align*}
y_1 &= CB \\
y_2 &= CAB \\
y_3 &= CA^2B
\end{align*}
The fact that at least two of these must be nonzero is a direct consequence of Lemma 7.3.2, or more directly, a consequence of the fact that the observability and controllability matrices are MDS. □
Let us consider the case when $\gamma = 4$. We wish to show that $d_4 \geq 5$. Here it suffices to consider input sequences of weight at most 2. An argument identical to that in the above proposition proves that any weight 1 input sequence must result in an output sequence of weight at least 4. So we need only consider the case when the input sequence has weight exactly 2. This means we let $u_0 = 1$ and $u_i \neq 0$ for exactly one $i$ such that $1 \leq i \leq 4$.

The case when $u_1 \neq 0$ is easily handled by Lemma 7.3.2. The case when $u_4 \neq 0$ is proven by combining the result of Proposition 7.4.1 for $\gamma = 3$ with the fact that $u_4 \neq 0$ to conclude $d_4 \geq 5$.

Now consider the case when $u_2 \neq 0$. We must show that at least 2 of $\{y_1, y_2, y_3, y_4\}$ are nonzero. These outputs are given by:
\begin{align*}
y_1 &= CB \\
y_2 &= CAB + u_2 \\
y_3 &= CA^2B + CBu_2 \\
y_4 &= CA^3B + CABu_2
\end{align*}
We know that at most one of $\{CB, CAB, CA^2B, CA^3B\}$ can be zero. We will treat each of these possibilities separately, starting with the case when all are nonzero. In this situation $y_1 \neq 0$ and at least one of $\{y_3, y_4\}$ must be nonzero because of MDS arguments similar to those of Lemma 7.3.2, and the desired result is obtained. When $CA^3B = 0$ it must be that $y_1 \neq 0$ and $y_4 \neq 0$. When $CA^2B = 0$ it must be that $y_1 \neq 0$ and $y_3 \neq 0$. Also, when $CAB = 0$ it is true that $y_1 \neq 0$ and $y_2 \neq 0$. The only case that remains is when $CB = 0$. If this is the case then $y_3 \neq 0$. Then either $y_2$ or $y_4$ must be nonzero unless $CA^3B - (CAB)^2 = 0$. However, we also have
\[
\tilde C\tilde A^3\tilde B = -CA^3B + 2CA^2B\,CB + (CAB)^2 - 3CAB(CB)^2 + (CB)^4
\]
Taking into account that $CB = 0$ gives
\[
\tilde C\tilde A^3\tilde B = (CAB)^2 - CA^3B
\]
Hence, $CA^3B - (CAB)^2 = 0$ implies that $\tilde C\tilde A^3\tilde B = 0$, which is impossible since $\tilde C\tilde B = -CB = 0$. This finishes the case for when $u_2 \neq 0$.
The last case we need to consider is when $u_3 \neq 0$. Again we must show that at least 2 of $\{y_1, y_2, y_3, y_4\}$ are nonzero. These outputs are given by:
\begin{align*}
y_1 &= CB \\
y_2 &= CAB \\
y_3 &= CA^2B + u_3 \\
y_4 &= CA^3B + CBu_3
\end{align*}
The cases when $CB = 0$, $CA^2B = 0$, $CA^3B = 0$ and when none are zero are easily treated in a fashion similar to the case when $u_2 \neq 0$. The only remaining case is when $CAB = 0$. Here, $y_1 \neq 0$ and either $y_3$ or $y_4$ is nonzero unless $CA^3B - CB\,CA^2B = 0$. There is no known argument which excludes this possibility, nor are there any known examples of when this situation arises. In any event, this case is rare and any code can easily be checked against these conditions. This results in the following proposition.

Proposition 7.4.2 When $\delta = 2$, $d_4$ is `generically' at least 5. In particular, $d_4 \geq 5$ unless $CAB = 0$ and $CA^3B - CB\,CA^2B = 0$.
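For the concrete $\delta = 2$ code of Example 7.2.5 the exceptional condition of Proposition 7.4.2 can be checked directly (an illustrative sketch; here $CAB = 0$ happens to hold, but the second condition fails, so $d_4 \geq 5$ still follows).

```python
# Sketch: evaluate the Markov parameters C A^i B of the delta = 2 code of
# Example 7.2.5 over F_7 and test the exceptional condition of
# Proposition 7.4.2. A brute-force search then confirms d_4 = 5.
from itertools import product

p = 7
eig = [3, 2]                 # A = diag(3, 2); B = (1, 1)
C = [3, 6]
markov = [sum(c * pow(e, i, p) for c, e in zip(C, eig)) % p for i in range(4)]
CB, CAB, CA2B, CA3B = markov
print("C B, C A B, C A^2 B, C A^3 B =", markov)
assert CAB == 0                                  # the first condition holds...
assert (CA3B - CB * CA2B) % p != 0               # ...but the second fails

def weight4(u):
    x, w = [0, 0], 0
    for ut in u:
        y = (C[0] * x[0] + C[1] * x[1] + ut) % p               # D = [1]
        w += (ut % p != 0) + (y != 0)
        x = [(eig[0] * x[0] + ut) % p, (eig[1] * x[1] + ut) % p]
    return w

d4 = min(weight4((u0,) + r) for u0 in range(1, p)
         for r in product(range(p), repeat=4))
print("d_4 =", d4)
assert d4 == 5
```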
We wrap up our discussion of the $\delta = 2$ case by considering $d_5$. Conjecture 7.2.2 claims this should be 6. This result has not been explicitly proven, although it can be shown that it holds `generically' and the set of equations governing this situation can be computed explicitly as in the above case for $\gamma = 4$. However, since the hard decision error correcting capability remains the same if we show the distance is 5 instead of the full 6, we will content ourselves with showing the much easier result that $d_5 \geq 5$.

Theorem 7.4.3 For rate 1/2 Reed-Solomon convolutional codes with $\delta = 2$, we have $d_5 \geq 5$.
Proof: Again it suffices to consider input sequences of weight at most 2. Also, the input sequences of weight 1 are again seen to generate output sequences of weight at least 5 (1 more than we need), so we need consider only the weight 2 input sequences. The cases when $u_1 \neq 0$ and $u_5 \neq 0$ are easily taken care of by Lemma 7.3.2 and Proposition 7.4.2 respectively.

For $u_2 \neq 0$, we have
\begin{align*}
y_1 &= CB \\
y_2 &= CAB + u_2 \\
y_3 &= CA^2B + CBu_2 \\
y_4 &= CA^3B + CABu_2 \\
y_5 &= CA^4B + CA^2Bu_2
\end{align*}
The usual arguments show that at least 2 of $\{y_3, y_4, y_5\}$ are nonzero. Since $u_0$, $y_0$ and $u_2$ are already nonzero the result follows.

Similarly for $u_4 \neq 0$, we have
\begin{align*}
y_1 &= CB \\
y_2 &= CAB \\
y_3 &= CA^2B \\
y_4 &= CA^3B + u_4 \\
y_5 &= CA^4B + CBu_4
\end{align*}
Again, at least 2 of $\{y_1, y_2, y_3\}$ must be nonzero.

Finally, for the case when $u_3 \neq 0$ we have
\begin{align*}
y_1 &= CB \\
y_2 &= CAB \\
y_3 &= CA^2B + u_3 \\
y_4 &= CA^3B + CBu_3 \\
y_5 &= CA^4B + CABu_3
\end{align*}
At least one of $\{y_1, y_2\}$ must be nonzero and at least one of $\{y_4, y_5\}$ must be nonzero.

This completes the proof. □
Remark 7.4.4 Although a general proof of the conjectures remains elusive, the cases for small $\delta$ can be treated similarly to the case $\delta = 2$ above. These small cases further support the conjecture, but since they increase factorially in complexity it is difficult to proceed much further by hand.
7.5 The Road Ahead and a Stronger Conjecture

We conclude with a discussion of the importance of the conjectures of this chapter by expanding upon Remark 7.2.3. We will also introduce and discuss a stronger conjecture regarding the existence of rate 1/2 convolutional codes with the smallest possible column distance function order necessary to achieve the free distance.

If Conjecture 7.2.2 holds true, then it is possible, at least theoretically, to decode any block of length $3\delta$ of a received word as long as at most $\delta$ errors occurred. That is, we can decode if at most $\delta$ of any $6\delta$ consecutive bits (including inputs and outputs) are in error, regardless of in which position the errors may arise.

On one hand, we may compare this to the algebraic decoding algorithm of Chapter 6. In that algorithm it was critically important where the errors occurred. At most $t_{par}$ errors could occur in the $(T-\theta)$-length sequence of inputs; at most $t_{gen}$ errors could occur in the $\theta$-length sequence of outputs. In addition, it is necessary for an entire $\theta$-length sequence of inputs to be error free. The enhanced decoding algorithm of Section 6.6 allows for a fraction more freedom of where the errors can occur, but the basic restraints are still in place. Using the Reed-Solomon codes of this chapter as an example, we see that since $\theta \approx 3\delta/2$ (see Table 7.1) we require at most $\delta/2$ errors in the last $3\delta$ bits (and all of them either inputs or outputs). Further, at most $\delta/2$ errors can occur in the first $T-\theta \approx 5\delta/2$ inputs. Altogether this amounts to correcting about $\delta$ errors in a sequence of $11\delta/2$ bits. This is slightly better than the conjectured result until one accounts for the `structured error' requirements imposed by the algorithm.
On the other hand, we may compare the conjectured results with the theoretical limits of the CDF.

Theorem 7.5.1 For a rate 1/2 convolutional code with complexity $\delta$ and $d_{free} = 2\delta+2$, the smallest possible $\gamma$ such that $d_\gamma = d_{free}$ is $2\delta$.

Proof: Assume there is some $\gamma < 2\delta$ such that $d_\gamma = d_{free}$. Let the code be given by the polynomial encoder matrix $[\,p(s)\ \ q(s)\,]$. We can assume that $p(0) \neq 0$, hence $p(s)$ divides $s^\ell - 1$ for any $\ell$ divisible by the order of $p(s)$. Given some such $\ell$ that is also greater than $\gamma$, there is a polynomial $u(s)$ such that $u(s)p(s) = s^\ell - 1$. Consider the first $\gamma+1$ time units of the codeword $u(s)[\,p(s)\ \ q(s)\,]$. The first polynomial has only the degree 0 term as being nonzero (since $\ell > \gamma+1$). Even if every coefficient of $u(s)q(s)$ (inside the first $\gamma+1$ time units) is nonzero, the weight of this partial codeword is at most $\gamma+2 < 2\delta+2 = d_{free}$. From this it is clear that the smallest possible $\gamma$ is $2\delta$. □
The above theorem states that, at best, we can expect to decode up to $\delta$ errors for every $4\delta$ bits. This is not surprising since this is the MDS bound for rate 1/2 block codes. Finding codes which satisfy this property is not an easy task. In general, the field size may need to be very large. When $\delta = 1$, we have that $2\delta = 3\delta - 1$, so that Reed-Solomon convolutional codes actually satisfy this property. For larger $\delta$, Reed-Solomon codes do not necessarily have this property. Based on empirical evidence, very few, if any, Reed-Solomon convolutional codes for $\delta > 1$ have this property. None of the examples provided earlier in this chapter satisfy this stronger distance condition.

For $\delta > 1$ it is not clear that there even exist convolutional codes which satisfy this stronger condition. The following is one example for $\delta = 2$.
Example 7.5.2 The convolutional code over $\mathbb F_{11}$ generated by
\[
[\,s^2 + 10s + 4 \quad s^2 + s + 9\,]
\]
has $d_{free} = d_4 = 6$.
Larger examples are much more difficult to come by. Conjecture 7.3.8 should offer a reasonable starting place to look for more systematic means of discovering these codes. We will instead look to guarantee the existence of these codes.
Conjecture 7.5.3 For a large enough field size, given a complexity $\delta$, there exists a rate $1/2$ convolutional code such that $d_{2\delta} = d_{free} = 2\delta+2$.

Proof: [with a gap] Let $g_1(s) = \sum_{i=0}^\delta a_is^i$ and $g_2(s) = \sum_{i=0}^\delta b_is^i$ be polynomials over a `large enough' finite field, $\mathbb F$. For any arbitrary $c(s)$, with $c(0) \neq 0$, let
\[
c(s)\,[\,g_1(s)\ \ g_2(s)\,] = \left[\,\textstyle\sum x_is^i \quad \sum y_is^i\,\right].
\]
We wish to show there exist some polynomials $g_1(s)$ and $g_2(s)$ over some finite field such that $\operatorname{wt}(x_0, \ldots, x_{2\delta+1}, y_0, \ldots, y_{2\delta+1}) \geq 2\delta+2$.
This is equivalent to the existence of the following syndrome former matrix with the property that any set of $2\delta+2$ columns which includes the first column must be linearly independent.
\[
\begin{bmatrix}
b_0 & & & & a_0 & & & \\
\vdots & \ddots & & & \vdots & \ddots & & \\
b_\delta & \cdots & b_0 & & a_\delta & \cdots & a_0 & \\
& \ddots & & \ddots & & \ddots & & \ddots \\
& & b_\delta & \cdots\ b_0 & & & a_\delta & \cdots\ a_0
\end{bmatrix}
\begin{bmatrix}
x_0 \\ \vdots \\ x_{2\delta+1} \\ y_0 \\ \vdots \\ y_{2\delta+1}
\end{bmatrix} = 0
\]
Denote the large matrix above by $H$. Form a $2\delta + 2$ subset of columns from $H$ by selecting $x_0$ and
any other $t - 1$ of the $x_i$'s and also any $\ell$ of the $y_i$'s (with $t + \ell = 2\delta + 2$). We then have a morphism:
$$
F : \mathbb{F}^{2\delta+1} \times \mathbb{F}^{\delta+1} \times \mathbb{F}^{\delta+1} \to \mathbb{F}^{2\delta+2},
$$
$$
(x_{i_2}, \ldots, x_{i_t}, y_{j_1}, \ldots, y_{j_\ell}, a_0, \ldots, a_\delta, b_0, \ldots, b_\delta)
\;\mapsto\;
H \begin{bmatrix} x_0 \\ x_{i_2} \\ \vdots \\ y_{j_\ell} \end{bmatrix}.
$$
The set of all points $P$ with $F(P) = 0$ forms an affine variety $X$. We claim that
$$
\dim X \geq \dim \mathrm{domain}(F) - (2\delta + 2) = 2\delta + 1.
$$
To prove this claim we compute the Jacobian $J(F(\vec{x}, \vec{0}, \vec{0}))$. If one can show that this determinant is nonzero for some point $(\vec{x}, \vec{0}, \vec{0})$ in a `large' irreducible component, then we could conclude
that the tangent space at that point would have dimension at most $2\delta + 1$ and hence the dimension
of $X$ would be at most $2\delta + 1$. With this bound on the dimension of $X$, we see that the projection
of $X$ onto $\mathbb{F}^{\delta+1} \times \mathbb{F}^{\delta+1}$ could not be a surjective function, which means that there must exist $g_1(s)$
and $g_2(s)$ which satisfy Conjecture 7.5.3. $\Box$
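For $\delta = 1$ the conjecture can be confirmed by exhaustive search even over a very small field. The sketch below (the search itself, and the choice of $\mathbb{F}_3$, are illustrative and not taken from the text) enumerates all degree-one pairs with every coefficient nonzero, which is forced since the input $u = 1$ alone must already yield weight $2\delta + 2 = 4$:

```python
import itertools

q = 3  # search over F_3; larger fields work as well

def col_dist(g1, g2, j):
    # column distance d_j: minimum weight of the first j+1 output pairs,
    # over inputs with u_0 != 0 (w.l.o.g. u_0 = 1 by scaling)
    best = None
    for rest in itertools.product(range(q), repeat=j):
        u = (1,) + rest
        w = 0
        for g in (g1, g2):
            for t in range(j + 1):
                coeff = sum(u[i] * g[t - i]
                            for i in range(len(u)) if 0 <= t - i < len(g)) % q
                w += coeff != 0
        best = w if best is None else min(best, w)
    return best

# all pairs of degree-1 polynomials g = g[0] + g[1]*s with nonzero coefficients
found = [(g1, g2)
         for g1 in itertools.product(range(1, q), repeat=2)
         for g2 in itertools.product(range(1, q), repeat=2)
         if col_dist(g1, g2, 2) == 4]
print(found[:3])
```

For example, the pair $g_1 = 1 + s$, $g_2 = 1 + 2s$, encoded as `((1, 1), (1, 2))`, attains $d_2 = 4$ over $\mathbb{F}_3$ and so appears in the search results.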