PREFACE
Linear Algebra plays an important role in Mathematics, Physics, and Engineering because of its wide applicability. The aim of this textbook is to give a rigorous and thorough treatment of various aspects of linear algebra, together with its applications. The book has been designed in a lucid and coherent manner so that Honours and Postgraduate students of various universities may reap considerable benefit from it. I have chosen the topics with great care and have tried to present them systematically with various examples.
The author expresses his sincere gratitude to his teacher Prof. S. Das, Department of Mathematics, R. K. Mission Residential College, Narendrapur, India, who taught him this course at the UG level. The author is thankful to his friends and colleagues, especially Dr. S. Bandyopadhyay, Mr. Utpal Samanta and Mr. Arup Mukhopadhyay of Bankura Christian College, and Dr. Jayanta Majumdar and Pratikhan Mandal of Durgapur Govt. College, for their great help and valuable suggestions in the preparation of the book. The author also extends his thanks to Prof. (Dr.) Madhumangal Pal, Dept. of Applied Mathematics, Vidyasagar University, for his encouragement and handy suggestions. I would also like to thank Dr. Sk. Md. Abu Nayeem of Aliah University, West Bengal, and my student Mr. Buddhadeb Roy for their support in typing the LaTeX version.
This book could not have been completed without the loving support and encouragement of my parents, wife (Mousumi) and son (Bubai). I extend my thanks to other well-wishers, relatives and students for enabling me to sustain enthusiasm for this book. Finally, I express my gratitude to Books and Allied (P) Ltd., especially Amit Ganguly, for bringing out this book.
Critical evaluation, suggestions and comments for further improvement of the book will be appreciated and gratefully acknowledged.
Prasun Kumar Nayak
(nayak_prasun@rediffmail.com)
Bankura Christian College,
Bankura, India, 722 101.

Dedicated to my parents
Sankar Nath Nayak and Mrs. Indrani Nayak
for their continuous encouragement and support.

Contents

1 Theory of Sets
  1.1 Sets
    1.1.1 Description of Sets
    1.1.2 Types of Sets
  1.2 Algebraic Operation on Sets
    1.2.1 Union of Sets
    1.2.2 Intersection of Sets
    1.2.3 Disjoint Sets
    1.2.4 Complement of a Set
    1.2.5 Difference
    1.2.6 Symmetric Difference
  1.3 Duality and Algebra of Sets
  1.4 Cartesian Product of Sets
  1.5 Cardinal Numbers
  1.6 Relation
    1.6.1 Equivalence Relation
  1.7 Equivalence Class
    1.7.1 Partitions
  1.8 Poset
    1.8.1 Dual Order
    1.8.2 Chain
    1.8.3 Universal Bounds
    1.8.4 Covering Relation
    1.8.5 Maximal and Minimal Elements
    1.8.6 Supremum and Infimum
  1.9 Lattices
    1.9.1 Lattice Algebra
    1.9.2 Sublattices
    1.9.3 Bounded Lattices
    1.9.4 Distributive Lattices
    1.9.5 Trivially Complement
  1.10 Mapping
    1.10.1 Types of Functions
    1.10.2 Composite Mapping
  1.11 Permutation
    1.11.1 Equal Permutations
    1.11.2 Identity Permutation
    1.11.3 Product of Permutations
    1.11.4 Inverse of Permutations
    1.11.5 Cyclic Permutation
  1.12 Enumerable Set

2 Theory of Numbers
  2.1 Number System
    2.1.1 Non-positional Number System
    2.1.2 Positional Number System
  2.2 Natural Number
    2.2.1 Basic Properties
    2.2.2 Well Ordering Principle
    2.2.3 Mathematical Induction
  2.3 Integers
    2.3.1 Divisibility
    2.3.2 Division Algorithm
  2.4 Common Divisor
    2.4.1 Greatest Common Divisor
  2.5 Common Multiple
    2.5.1 Lowest Common Multiple
  2.6 Diophantine Equations
    2.6.1 Linear Diophantine Equations
  2.7 Prime Numbers
    2.7.1 Relatively Prime Numbers
    2.7.2 Fundamental Theorem of Arithmetic
  2.8 Modular/Congruence System
    2.8.1 Elementary Properties
    2.8.2 Complete Set of Residues
    2.8.3 Reduced Residue System
    2.8.4 Linear Congruences
    2.8.5 Simultaneous Linear Congruences
    2.8.6 Inverse of a Modulo m
  2.9 Fermat's Theorem
    2.9.1 Wilson's Theorem
  2.10 Arithmetic Functions
    2.10.1 Euler's Phi Function
    2.10.2 The Möbius Function
    2.10.3 Divisor Function
    2.10.4 Floor and Ceiling Functions
    2.10.5 Mod Function

3 Theory of Matrices
  3.1 Matrix
    3.1.1 Special Matrices
    3.1.2 Square Matrix
  3.2 Matrix Operations
    3.2.1 Equality of Matrices
    3.2.2 Matrix Addition
    3.2.3 Matrix Multiplication
    3.2.4 Transpose of a Matrix
  3.3 Few Matrices
    3.3.1 Nilpotent Matrix
    3.3.2 Idempotent Matrix
    3.3.3 Involutory Matrix
    3.3.4 Periodic Matrix
    3.3.5 Symmetric Matrices
    3.3.6 Skew-symmetric Matrices
    3.3.7 Normal Matrix
  3.4 Determinants
    3.4.1 Product of Determinants
    3.4.2 Minors and Co-factors
    3.4.3 Adjoint and Reciprocal of Determinant
    3.4.4 Symmetric and Skew-symmetric Determinants
    3.4.5 Vandermonde's Determinant
    3.4.6 Cramer's Rule
  3.5 Complex Matrices
    3.5.1 Transpose Conjugate of a Matrix
    3.5.2 Hermitian Matrix
    3.5.3 Skew-Hermitian Matrix
    3.5.4 Unitary Matrix
    3.5.5 Normal Matrix
  3.6 Adjoint of a Matrix
    3.6.1 Reciprocal of a Matrix
    3.6.2 Inverse of a Matrix
    3.6.3 Singular Value Decomposition
  3.7 Orthogonal Matrix
  3.8 Submatrix
  3.9 Partitioned Matrix
    3.9.1 Square Block Matrices
    3.9.2 Block Diagonal Matrices
    3.9.3 Block Addition
    3.9.4 Block Multiplication
    3.9.5 Inversion of a Matrix by Partitioning
  3.10 Rank of a Matrix
    3.10.1 Elementary Operation
    3.10.2 Row-reduced Echelon Matrix
  3.11 Elementary Matrices
    3.11.1 Equivalent Matrices
    3.11.2 Congruent Matrices
    3.11.3 Similar Matrices

4 Vector Space
  4.1 Vector Space
    4.1.1 Vector Subspaces
  4.2 Linear Sum
    4.2.1 Smallest Subspace
    4.2.2 Direct Sum
  4.3 Quotient Space
  4.4 Linear Combination of Vectors
    4.4.1 Linear Span
    4.4.2 Linear Dependence and Independence
  4.5 Basis and Dimension
  4.6 Co-ordinatisation of Vectors
    4.6.1 Ordered Basis
    4.6.2 Co-ordinates
  4.7 Rank of a Matrix
    4.7.1 Row Space of a Matrix
    4.7.2 Column Space of a Matrix
  4.8 Isomorphic

5 Linear Transformations
  5.1 Linear Transformations
    5.1.1 Kernel of Linear Mapping
    5.1.2 Image of Linear Mapping
  5.2 Isomorphism
  5.3 Vector Space of Linear Transformation
    5.3.1 Product of Linear Mappings
    5.3.2 Invertible Mapping
  5.4 Singular and Non-singular Transformation
  5.5 Linear Operator
  5.6 Matrix Representation of Linear Transformation
  5.7 Orthogonal Linear Transformation
  5.8 Linear Functional
    5.8.1 Dual Space
    5.8.2 Second Dual Space
    5.8.3 Annihilators
  5.9 Transpose of a Linear Mapping

6 Inner Product Space
  6.1 Inner Product Space
    6.1.1 Euclidean Spaces
    6.1.2 Unitary Space
  6.2 Norm
  6.3 Orthogonality
    6.3.1 Orthonormal Set
    6.3.2 Orthogonal Complement
    6.3.3 Direct Sum
  6.4 Projection of a Vector

7 Matrix Eigenfunctions
  7.1 Matrix Polynomial
    7.1.1 Polynomials of Matrices
    7.1.2 Matrices and Linear Operator
  7.2 Characteristic Polynomial
    7.2.1 Eigen Value
    7.2.2 Eigen Vector
    7.2.3 Eigen Space
  7.3 Diagonalization
    7.3.1 Orthogonal Diagonalisation
  7.4 Minimal Polynomial
  7.5 Bilinear Forms
    7.5.1 Real Quadratic Forms
  7.6 Canonical Form
    7.6.1 Jordan Canonical Form
  7.7 Functions of a Matrix
    7.7.1 Powers of a Matrix
    7.7.2 Roots of a Matrix
  7.8 Series
    7.8.1 Exponential of a Matrix
    7.8.2 Logarithm of a Matrix
  7.9 Hyperbolic and Trigonometric Functions

8 Boolean Algebra
  8.1 Operation
    8.1.1 Unary Operation
    8.1.2 Binary Operation
  8.2 Boolean Algebra
    8.2.1 Boolean Algebra as a Lattice
    8.2.2 Boolean Algebra as an Algebraic System
    8.2.3 Boolean Algebra Rules
    8.2.4 Duality
    8.2.5 Partial Order Relation
  8.3 Boolean Function
    8.3.1 Constant
    8.3.2 Literal
    8.3.3 Variable
    8.3.4 Monomial
    8.3.5 Polynomial
    8.3.6 Factor
    8.3.7 Boolean Function
  8.4 Truth Table
  8.5 Disjunctive Normal Form
    8.5.1 Complete DNF
  8.6 Conjunctive Normal Form
    8.6.1 Complete CNF
  8.7 Switching Circuit

Chapter 1

Theory of Sets
George Cantor gave an intuitive definition of sets in 1895. Sets are the building blocks of various discrete structures, and the theory of sets is one of the most important tools of mathematics. The main aim of this chapter is to discuss some properties of sets.

1.1 Sets

A well-defined collection of distinct objects is called a set. Each object is known as an element or member of the set. The following are some examples of sets:
(i) All integers.
(ii) The positive rational numbers less than or equal to 5.
(iii) The planets in the solar system.
(iv) Indian rivers.
(v) 4th semester B.Sc. students of Burdwan University.
(vi) The people in a particular locality.
(vii) Cricketers in the world.
By the term well defined, we mean that we are given a collection of objects with certain definite property or properties, given in such a way that we are clearly able to distinguish whether a given object is in our collection or not. The following collections are not examples of sets:
(i) Good students of a class, because good is not a well-defined word; a student may be good for particular people, but he/she may not be good for other people.
(ii) Tall students, because tall is not a well-defined measurement.
(iii) Girls and boys of a particular locality, because there is no sharp boundary of age by which a girl or a boy can surely be identified.
These types of collections are designated as fuzzy sets. The elements of a set must be distinct and distinguishable. By distinct, it means that no element is repeated, and by distinguishable, that there is no doubt whether an element is in the set or not in the set.
(i) The standard mathematical symbols used to represent sets are upper-case letters like A, B, X, etc., and the elements of the set can be written in lower-case letters like a, b, p, q, x, y, etc.
(ii) If an element x is a member of a set A, we write x ∈ A, read as x belongs to A, or x is an element of A, or x is in A. The symbol ∈ is the Greek letter epsilon. On the other hand, if x is not an element of A, then we write x ∉ A, which is read as x does not belong to A or x is not an element of the set A.
(iii) If A is a set and a is any object, it should be easy to decide whether a ∈ A or a ∉ A. Only then is A termed well-defined.
For example, A = {1, 2, 3, 4, 5} is a set with elements 1, 2, 3, 4, 5. Here 1 ∈ A, but 6 ∉ A. Note that S = {1, 1, 3} is not a set, since its elements are not distinct.

1.1.1 Description of Sets

As a set is determined by its elements, we have to specify the elements of a set A in order to define A. Five common methods are used to describe sets: (i) the roster (list, enumeration or tabular) method, (ii) the selector (rule or set-builder) method, (iii) the characteristic function method, (iv) the diagrammatic method, and (v) the recursion method.
(i) Roster method
In this method, all elements are listed explicitly, separated by commas, and enclosed within braces { }. Sometimes parentheses ( ) or square brackets [ ] may also be used. A set is defined by naming all its members, so this form can be used only for finite sets. A set X whose elements are x1, x2, . . ., xn is usually written as X = {x1, x2, . . ., xn}. For example, the set of all natural numbers less than 5 can be represented as A = {1, 2, 3, 4}.
Sometimes it is not humanly possible to enumerate all elements, but after knowing some initial elements one can guess the others. In this case dots are used at the end within the braces. For example, the set of positive integers can be written as A = {1, 2, 3, 4, . . .}, the set of all integers as B = {. . ., −2, −1, 0, 1, 2, . . .}, etc.
It may be noted that the elements of a set can be written in any order, but the name of an element is listed only once. For example, {2, 3, 4}, {4, 3, 2} and {2, 4, 3} all represent the same set. Thus, while we describe a set in this manner, the order of the elements is not important.
(ii) Set-builder method
In this method, a set is specified by stating one or more properties which are uniquely satisfied by its elements. A set in this method is written as
A = {x : P1(x), P2(x), . . .},     (1.1)
i.e., x ∈ A if x satisfies the properties P1(x), P2(x), etc. The symbol : is read as such that; it is also denoted by | or /. For example,
A = {x : x is a positive even integer},
B = {x : x is a vowel in the English alphabet},
C = {x : x is an integer and 1 ≤ x ≤ 10}, etc.
It is required that the property P be such that, for any given x ∈ U, the universal set, the proposition P(x) is either true or false.
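The set-builder form maps directly onto set comprehensions in many programming languages; the predicate P(x) becomes the comprehension's filter clause. A minimal Python sketch (the bound 20 and the explicit alphabet string are choices made here so that every set is finite):

```python
# Set-builder descriptions written as Python set comprehensions.
C = {x for x in range(1, 11)}                   # {x : x is an integer and 1 <= x <= 10}
A = {x for x in range(1, 21) if x % 2 == 0}     # a finite slice of the positive even integers
B = {ch for ch in "abcdefghijklmnopqrstuvwxyz"  # {x : x is a vowel in the English alphabet}
     if ch in "aeiou"}
```

As in the mathematical notation, each comprehension states a membership condition rather than listing elements one by one.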
(iii) The characteristic method
A set is defined by a function, usually called a characteristic function, that declares which elements of U are members of the set and which are not. Let U = {u1, u2, . . ., un} be the universal set and A ⊆ U. Then the characteristic function of A is defined as
χA(ui) = 1, if ui ∈ A;  χA(ui) = 0, if ui ∉ A,     (1.2)
i.e., the characteristic function maps elements of U to elements of the set {0, 1}, which is formally expressed by χA : U → {0, 1}. For example, let U = {1, 2, 3, . . ., 10} and A = {2, 4, 6, 8, 10}; then χA(2) = χA(4) = χA(6) = χA(8) = χA(10) = 1 and χA(a) = 0 for all other elements a of U. It may be observed that χA is an onto function but not one-one.
(iv) Diagrammatic method
A set can be represented diagrammatically by closed figures like circles, triangles, rectangles, etc. The points in the interior of the figure represent the elements of the set. Such a representation is called a Venn diagram or Venn-Euler diagram, after the British mathematician John Venn. In this diagram the universal set U is represented by the interior of a rectangle and each subset of U is represented by a circle inside the rectangle.
If two sets are equal, they are represented by the same circle. If the sets A and B are disjoint, the circles for A and B are drawn so that they have no common area; if A and B intersect, the circles are drawn with a common area. If A ⊆ B, the circle for A is drawn fully inside the circle for B. This visual representation helps us to prove set identities very easily.
(v) Recursion method
A set can be described by giving one or more elements of the set and a rule for generating the remaining elements. The underlying process is called recursion. For example, (a) the set A = {1, 4, 7, . . .} can be described as A = {a0 = 1, an+1 = an + 3}; (b) F = {Fn : F0 = 0, F1 = 1, Fn = Fn−1 + Fn−2} is a set described by recursion. This set is called the set of Fibonacci numbers.
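Both recursively described sets can be generated mechanically from their recursion rules; a small Python sketch (the function names are illustrative, and each function produces only the first few elements of the infinite set):

```python
def arithmetic_set(n_terms):
    """A = {a0 = 1, a(n+1) = an + 3}: the first n_terms elements."""
    a, out = 1, []
    for _ in range(n_terms):
        out.append(a)
        a += 3            # apply the recursion rule a(n+1) = an + 3
    return out

def fibonacci_set(n_terms):
    """F: F0 = 0, F1 = 1, Fn = F(n-1) + F(n-2): the first n_terms elements."""
    f0, f1, out = 0, 1, []
    for _ in range(n_terms):
        out.append(f0)
        f0, f1 = f1, f0 + f1   # apply the recursion Fn = F(n-1) + F(n-2)
    return out
```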
Some standard sets and their notations
Some sets are frequently used in mathematical analysis and in algebraic structures; they are listed below.
N : the set of all natural numbers.
Z : the set of all integers.
Z+ : the set of all positive integers.
Q : the set of all rational numbers.
Q+ : the set of all positive rational numbers.
R : the set of all real numbers.
R+ : the set of all positive real numbers.
C : the set of all complex numbers.

1.1.2 Types of Sets

Null set
A set which contains no element is called a null set or empty set or void set, and is denoted by the Greek letter ∅ (read as phi). In the roster method it is denoted by { }. For example,
A = {x : x² + 4 = 0 and x ∈ R}
is a null set. To describe the null set, we can use any property which is not true for any element. It may be noted that the set {∅} or {0} is not a null set.
A set which is not a null set is called a non-empty set.
Singleton set
A set consisting of only a single element is called a singleton or unit set. For example, A = {0}, B = {x : 1 < x < 3, x is an integer} = {2}, and the solution set C = {x : x − 2 = 0} = {2}, etc., are examples of singleton sets. Note that {0} is not a null set, since it contains 0 as its member; it is a singleton set.

Finite and infinite sets
A set containing a finite number of elements is called a finite set; otherwise it is called an infinite set, i.e., a set which does not contain a definite number of elements. For example,
(i) A = {x : x is a consonant in the English alphabet},
(ii) B = {x : 1 < x < 15, x is an integer}
are finite sets, whereas
(i) A = {x : x is a rational number},
(ii) B = {x : x is a straight line in space}
are infinite sets. For a finite set, the process of counting the different elements comes to an end. We let |A| denote the number of elements of a finite set A.
Indexed set
A set whose elements are themselves sets is often referred to as a family of sets. Let us consider a family of n sets A1, A2, . . ., An in the form F = {Aα : α ∈ I}, where Aα corresponds to an element α in the set I. Here I is said to be the indexing set and α is called the index. In general, if I is an arbitrary set, then F = {Aα : α ∈ I} is an arbitrary collection of sets Aα indexed by I.
Set of sets
If the elements of a set are themselves some other sets, then this set is known as a family of sets, or a set of sets. For example, A = {{2, 3}, {5, 6, 8}, Z, R+} is a set of sets.
Subset and superset
Let A and B be two given sets. The set B is said to be a subset of A if
x ∈ B ⟹ x ∈ A,     (1.3)
i.e., every element of B is an element of A. This is very often denoted by B ⊆ A (read as B is contained or included in A), and is called set inclusion. For example,
(i) The set of all integers Z is a subset of the set of all rational numbers Q.
(ii) A = {a, b, c, d}, B = {a, b, c, d, e, f}. Here each element of A is also an element of B, thus A ⊆ B.
(iii) A = {1, 5, 7} and B = {1, 5, 7}. Here A ⊆ B and B ⊆ A.
(iv) ∅ is a subset of every set.
(v) The subsets of A = {2, 3, 4} are ∅, {2}, {3}, {4}, {2, 3}, {2, 4}, {3, 4} and {2, 3, 4}.
(vi) A = {2, 3, 5}, B = {2, 3, 6}. Then A ⊄ B and B ⊄ A, because 5 ∈ A but 5 ∉ B, and 6 ∈ B but 6 ∉ A.
If B ⊆ A, then A is called a superset of B, written A ⊇ B, which is read as A is a superset of B.
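Python's built-in set type models these inclusion relations directly; in the sketch below the operators <= and < play the roles of ⊆ and ⊂, using the example sets from above:

```python
A = {1, 5, 7}
B = {1, 5, 7}
C = {1, 2, 3, 4, 5, 7}

is_subset = A <= B          # True: every element of A is in B (A is a subset of B)
is_proper = A < C           # True: A is contained in C and A != C
not_proper = A < B          # False: A equals B, so A is not a proper subset of B
empty_in_all = set() <= A   # True: the null set is a subset of every set
```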


Proper subset
The set A is called proper subset of B if every element of A is a member of B and there is
at least one element in B such that it is not in the set A. It is written as A B. Therefore,
B is the proper subset of A if
(i) x B = x A
(ii) y A such that y
/ B.
In this case, B A and A 6= B and B is said to the proper subset of A and is denoted by
B A. If B is the subset of A(i.e. B A) A is called the super set of B. For example,
(i) {1, 2} is a proper subset of {1, 2, 3, 4}.
(ii) The set of vowels is a proper subset of the set of English alphabets.
(iii) N (set of natural numbers) is a proper subset of Z (set of integers).
Note the following:
(i) If there is even a single element in A which is not in B, then A is not a subset of B and we
write A ⊄ B. For example, {1, 2} ⊄ {2, 4, 6, 8, 9}.
(ii) If A ⊆ B or B ⊆ A, then the sets A and B are said to be comparable. For example,
if A = {1, 2}, B = {5, 6, 7} then A ⊄ B and B ⊄ A, so these are not comparable.
(iii) Every set is a subset of itself and every set is a subset of the universal set.
(iv) ∅ has no proper subset. Also, A − A = ∅.
(v) For any set A, A ⊆ A. This is known as the reflexive law of inclusion.
(vi) If A ⊆ B and B ⊆ C, then A ⊆ C. This is known as the transitive law of inclusion.
In a Venn diagram, the universal set is usually represented by a rectangular region and its
subsets by closed bounded regions inside the rectangular region.
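The subset relations above can be illustrated with Python's built-in set type; the sets chosen here are illustrative, not from the text.

```python
A = {2, 3, 4}
B = {2, 3}
C = {2, 3, 6}

# B ⊆ A: every element of B is an element of A.
print(B.issubset(A))          # True; B <= A also works
# Proper subset: B ⊆ A and B ≠ A.
print(B < A)                  # True
# Reflexive law: A ⊆ A, but A is not a proper subset of itself.
print(A <= A, A < A)          # True False
# A and C are not comparable: neither contains the other.
print(A <= C or C <= A)       # False
# The empty set is a subset of every set.
print(set() <= A)             # True
```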
Equality of sets
If A ⊆ B and B ⊆ A, then A and B contain the same members. Two sets A and B are
said to be equal if every element of A is an element of B and also every element of B is an
element of A; that is, A ⊆ B and B ⊆ A. The equality of two sets is denoted by A = B.
Conversely, if A = B then A ⊆ B and B ⊆ A must be satisfied. For example, A = {1, 4, 9}
and B = {4, 9, 1} are equal sets. To indicate that A and B are not equal, we write A ≠ B.
Theorem 1.1.1 The null set ∅ is a subset of every set.
Proof: Let A be an arbitrary set. Then, in order to show that ∅ ⊆ A, we must show that
there is no element of ∅ which is not contained in A. Since ∅ contains no element at all, no
such element can be found. Hence, ∅ ⊆ A.
Theorem 1.1.2 The number of subsets of a given set containing n elements is $2^n$.
Proof: Let A be an arbitrary set containing n elements. Then, one of its subsets is the
empty set ∅. Apart from this,
(i) The number of subsets of A, each containing 1 element, is $\binom{n}{1}$.
(ii) The number of subsets of A, each containing 2 elements, is $\binom{n}{2}$.
Theory of Sets

(iii) The number of subsets of A, each containing 3 elements, is $\binom{n}{3}$.
  ⋮
(iv) The number of subsets of A, each containing n elements, is $\binom{n}{n}$.
Therefore, the total number of subsets of A is
$$1 + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{n} = (1 + 1)^n = 2^n.$$
The number of proper subsets of a set with n elements is $2^n - 1$.
Power set
A set formed by all the subsets of a given non-empty set S is called the power set of the
set S and is denoted by P(S). If S = {a, b, c} then
P(S) = {∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}}.
Note that P(∅) = {∅}. The power set of any given set is always non-empty. The family of
all subsets of P(A) is called a second order power set of A and is denoted by P²(A), which
stands for P(P(A)). Similarly, higher order power sets P³(A), P⁴(A), . . . are defined.
The order of a set A is defined as the number of elements of A and is denoted by O(A). From
the above property, it is observed that the number of elements of the power set P(A) is $2^n$
if A contains n elements. For example, if A = {1, 2, −1} then O(A) = 3 and O{P(A)} = 8.
In general, if O(A) = n then O{P(A)} = $2^n$ and O(P²(A)) = $2^{2^n}$.
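The counting result above can be checked mechanically. The helper below is a sketch of my own (built on itertools), not part of the text.

```python
from itertools import chain, combinations

def power_set(s):
    """All subsets of s, returned as frozensets (the power set P(s))."""
    items = list(s)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

S = {'a', 'b', 'c'}
P = power_set(S)
print(len(P))                  # 8, i.e. 2**3 subsets
print(frozenset() in P)        # True: ∅ belongs to every power set
print(len(power_set(set())))   # 1, since P(∅) = {∅}
```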
Universal sets
In set theory it is observed that all the sets under consideration are subsets of
a certain set. This set is called the universal set and is usually denoted by U or S.
Conversely, the universal set is a superset of every set. For example, the set of real
numbers ℝ is the universal set for the set of integers Z and the set of rational numbers Q.
Again, the set of integers Z is the universal set for the set of even integers, the set of positive
integers, etc.
This is the set of all possible elements that are relevant and considered under the particular
context or application from which sets can be formed. The set U is not unique, and it is a
superset of each of the given sets. In a Venn diagram, the universal set is usually represented
by a rectangular region.

1.2 Algebraic Operations on Sets

Like addition, multiplication and other operations on numbers in arithmetic, there are certain operations on sets, namely union, intersection, complementation, etc. In this section,
we shall discuss several ways of combining different sets and develop some properties among
them.

1.2.1 Union of Sets

Let A and B be two given subsets of a universal set U. The union (or join) of the two subsets A
and B, denoted by A ∪ B, is defined by
A ∪ B = {x : x ∈ A or x ∈ B or both},    (1.4)
where the "or" means and/or, i.e., the set contains the elements which belong either to A
or to B or to both. The Venn diagram of Fig. 1.1 illustrates pictorially the meaning of ∪, where
U is the rectangular area, and A and B are disks. Union is also known as the join or logical sum
of A and B. Note that the common elements are to be taken only once. For example,

Figure 1.1: Venn diagram of A ∪ B (shaded area)


(i) If A = {1, 3, 4, 5, a, b} and B = {a, b, c, 2, 3, 4, 6} then
A ∪ B = {1, 2, 3, 4, 5, 6, a, b, c}.
(ii) If A = [2, 5] and B = [1, 3], then A ∪ B = [1, 5] = {x : 1 ≤ x ≤ 5}.
From the Venn diagram we get the following properties of set union:
(i) Union is idempotent, i.e., A ∪ A = A,
(ii) Set union is associative, i.e., (A ∪ B) ∪ C = A ∪ (B ∪ C),
(iii) A ∪ U = U : absorption by U. A ∪ ∅ = A : identity law.
(iv) Set union is commutative, i.e., A ∪ B = B ∪ A,
(v) A ⊆ A ∪ B and B ⊆ A ∪ B for sets A and B.
(vi) A ∪ U = U, U being the universal set, and if A ⊆ B then A ∪ B = B.
The union operation can be generalized to any number of sets. The union of the subsets
A₁, A₂, . . . , Aₙ is given by
$$\bigcup_{i=1}^{n} A_i = A_1 \cup A_2 \cup \cdots \cup A_n = \{x : x \in A_i \text{ for some } i = 1, 2, \ldots, n\},$$
and for a family of sets {Aᵢ : i ∈ I} it is defined as
$$\bigcup_{i \in I} A_i = \{x : x \in A_i \text{ for some } i \in I\}.$$

1.2.2 Intersection of Sets

Let A and B be two given subsets of a universal set U. The intersection of A and B is
denoted by A ∩ B and is defined by
A ∩ B = {x : x ∈ A and x ∈ B}.    (1.5)
The intersection of the sets A and B is the set of all elements which are in both the sets A
and B. The set A ∩ B is shown in Fig. 1.2. The intersection is also known as meet; A ∩ B is read
as "A intersection B" or "A meet B". For example, (i) let A = {2, 5, 6, 8} and B = {2, 6, 8, 9, 10};
then A ∩ B = {2, 6, 8}; (ii) if A = [2, 5] and B = [1, 3], then A ∩ B = [2, 3] = {x : 2 ≤ x ≤ 3}.
From the Venn diagram we get the following properties of intersection:


Figure 1.2: Venn diagram of A ∩ B (shaded area)


(i) Intersection is idempotent, i.e., A ∩ A = A, which follows from A ⊆ A,
(ii) A ∩ B ⊆ A; A ∩ B ⊆ B; and if A ⊆ B then A ∩ B = A.
(iii) A ∩ ∅ = ∅ : absorption by ∅. A ∩ U = A : identity law.
(iv) Set intersection is commutative, i.e., A ∩ B = B ∩ A.
(v) Set intersection is associative, i.e., (A ∩ B) ∩ C = A ∩ (B ∩ C).
The intersection of n subsets is given by
$$\bigcap_{i=1}^{n} A_i = A_1 \cap A_2 \cap \cdots \cap A_n = \{x : x \in A_i \text{ for all } i\}.$$
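The union and intersection operations, including their generalized forms over a family of sets, can be sketched with Python's set operators; the sets below are examples of my own choosing.

```python
A = {1, 3, 4, 5}
B = {3, 4, 6}
U = set(range(10))      # an assumed universal set for this example

print(A | B)            # union A ∪ B
print(A & B)            # intersection A ∩ B = {3, 4}
# Idempotent and commutative laws
assert A | A == A and A & A == A
assert A | B == B | A and A & B == B & A
# Identity laws, and absorption by U and ∅
assert A | set() == A and A & U == A
assert A | U == U and A & set() == set()
# Generalized union and intersection of a family of sets
family = [{1, 2, 3}, {2, 3, 4}, {2, 5}]
print(set().union(*family))       # ∪ Ai = {1, 2, 3, 4, 5}
print(set.intersection(*family))  # ∩ Ai = {2}
```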

Ex 1.2.1 For the two sets A = {x : 2 cos²x + sin x ≤ 2} and B = {x : x ∈ [π/2, 3π/2]}, find
A ∩ B. [KH 06]
Solution: The solution of the trigonometric relation 2 cos²x + sin x ≤ 2 is given by
2 cos²x + sin x ≤ 2 ⟺ sin x (1 − 2 sin x) ≤ 0
⟺ sin x ≤ 0 and 1 − 2 sin x ≥ 0,
or sin x ≥ 0 and 1 − 2 sin x ≤ 0
⟺ sin x ≤ 0 or sin x ≥ 1/2.
If x ∈ [π/2, 3π/2], then the solutions of sin x ≤ 0 are given by x ∈ [π, 3π/2], and the solutions of
sin x ≥ 1/2 are given by π/2 ≤ x ≤ 5π/6. Therefore,
A ∩ B = [π/2, 5π/6] ∪ [π, 3π/2].

1.2.3 Disjoint Sets

It is sometimes observed that the intersection of two non-empty sets produces the null
set. In this case, no element is common to A and B, and the two sets are called disjoint
or mutually exclusive sets. Thus two sets A and B are said to be disjoint if and only if
A ∩ B = ∅, i.e., they have no element in common. The Venn diagram of disjoint sets A and
B is shown in Fig. 1.3. For example, (i) A = {1, 2, 3} and B = {6, 7, 9}; (ii) if A = {x :
x is an even integer} and B = {x : x is an odd integer}, then A ∩ B = ∅, i.e., A and B are disjoint.
When A ∩ B ≠ ∅, the sets A and B are said to be intersecting.
Note 1.2.1 The three relations B ⊆ A, A ∪ B = A and A ∩ B = B are mutually equivalent,
i.e., any one implies the other two.


Figure 1.3: Disjoint sets A and B

1.2.4 Complement of a Set

Let A be a subset of a universal set U. Then the complement of the subset A with respect to
U, denoted by A′, Aᶜ or Ā, is defined by
A′ = {x : x ∈ U but x ∉ A},    (1.6)
i.e., the set contains the elements which belong to the universal set U but are not elements of
A. The Venn diagram of Aᶜ is shown in Fig. 1.4. Clearly, if A′ is the complement of A, then
A is the complement of A′.

Figure 1.4: Complement of A

For example, let A = {1, 3, 5, 7, 9}; if U = {1, 2, 3, 4, 5, 6, 7, 8, 9}
then A′ = {2, 4, 6, 8}. From the Venn diagram we have
(i) (A′)′ = A : involution property.
(ii) U′ = ∅; ∅′ = U.
(iii) If A ⊆ B then Bᶜ ⊆ Aᶜ, and conversely, if Aᶜ ⊆ Bᶜ then B ⊆ A.
(iv) A ∪ A′ = U : law of excluded middle. A ∩ A′ = ∅ : law of contradiction.
(v) (a) (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ and (b) (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ : De Morgan's laws.
In particular, for a finite family of subsets F = {A₁, A₂, . . . , Aₙ}, De Morgan's laws can
be written as
$$\left(\bigcap_{i=1}^{n} A_i\right)' = \bigcup_{i=1}^{n} A_i' \quad \text{and} \quad \left(\bigcup_{i=1}^{n} A_i\right)' = \bigcap_{i=1}^{n} A_i'.$$
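The complement laws and De Morgan's laws can be verified on a finite universal set; the sets and the `comp` helper below are of my own choosing, not from the text.

```python
U = set(range(1, 10))
A = {1, 3, 5, 7, 9}
B = {2, 3, 5, 7}

def comp(X):
    """Complement X' = U − X with respect to the universal set U."""
    return U - X

assert comp(A) == {2, 4, 6, 8}
assert comp(comp(A)) == A                        # involution: (A')' = A
assert A | comp(A) == U and A & comp(A) == set() # excluded middle / contradiction
assert comp(A | B) == comp(A) & comp(B)          # (A ∪ B)' = A' ∩ B'
assert comp(A & B) == comp(A) | comp(B)          # (A ∩ B)' = A' ∪ B'
print("De Morgan's laws hold for this example")
```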

Ex 1.2.2 Prove that (A ∩ C) ∪ (B ∩ C′) = ∅ ⇒ A ∩ B = ∅.

Solution: The relation (A ∩ C) ∪ (B ∩ C′) = ∅ gives A ∩ C = ∅ and B ∩ C′ = ∅. Now,
B ∩ C′ = ∅ ⇒ B ⊆ C.
Therefore, A ∩ C = ∅ ⇒ A ∩ B = ∅.


1.2.5 Difference

Let A, B be any two subsets of a universal set U. The difference of the two subsets A and B,
denoted by A − B or A\B, is a subset of A defined by
A − B = {x : x ∈ A and x ∉ B},    (1.7)
i.e., the set consisting of those elements of A which are not elements of B. Also,
B − A = {x : x ∈ B and x ∉ A}.    (1.8)
This is also called the relative complement of the set B with respect to the set A; A − B is
called the complement of B relative to A. The differences A − B and B − A are shown in
Fig. 1.5.

Figure 1.5: Set difference A − B and B − A

A − B is read as "A difference B" or "A minus B". For example, if A = {2, 4, 5, 8}
and B = {2, 5, 7, 10}, then A − B = {4, 8} and B − A = {7, 10}. From the Venn diagram we
have:
(i) A − A = ∅, A − ∅ = A,
(ii) A − B ⊆ A, B − A ⊆ B, and A − B = A if A ∩ B = ∅.
(iii) Set difference is non-commutative, i.e., in general A − B ≠ B − A,
(iv) A − B = ∅ when A ⊆ B.
(v) If A ∩ B = ∅ then A − B = A and B − A = B,
(vi) A − B = A ∩ B′.
(vii) (A − B) ∪ A = A, (A − B) ∪ B = A ∪ B and (A − B) ∩ B = ∅.
(viii) A − B = A if and only if A ∩ B = ∅.
(ix) A ∩ B, A − B and B − A are mutually exclusive.
If the set A is the universal set, the complement is absolute, known as complementation, and
is usually denoted by B′.
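The set-difference identities above can be sketched with Python's `-` operator on sets; the example sets and the universal set are my own, not the text's.

```python
U = set(range(1, 11))
A = {2, 4, 5, 8}
B = {2, 5, 7, 10}

print(A - B)    # elements of A not in B
print(B - A)    # elements of B not in A
assert A - B == A & (U - B)            # A − B = A ∩ B'
assert (A - B) | A == A                # (A − B) ∪ A = A
assert (A - B) | B == A | B            # (A − B) ∪ B = A ∪ B
assert (A - B) & B == set()            # (A − B) ∩ B = ∅
assert A - A == set() and A - set() == A
```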
Ex 1.2.3 For subsets A, B and C of a universal set U, show that
A − (B ∩ C) = (A − B) ∪ (A − C).
Solution: Let x be any element of A − (B ∩ C). Then, by definition,
A − (B ∩ C) = {x : x ∈ A and x ∉ (B ∩ C)}
= {x : x ∈ A and (x ∉ B or x ∉ C)}
= {x : (x ∈ A and x ∉ B) or (x ∈ A and x ∉ C)}
= {x : x ∈ (A − B) or x ∈ (A − C)} = (A − B) ∪ (A − C).
Hence, A − (B ∩ C) ⊆ (A − B) ∪ (A − C) and (A − B) ∪ (A − C) ⊆ A − (B ∩ C); consequently,
A − (B ∩ C) = (A − B) ∪ (A − C).

1.2.6 Symmetric Difference

Let A and B be two subsets of a universal set U. The symmetric difference, denoted by
A△B or A ⊕ B, is defined by
A△B = {x : x ∈ A or x ∈ B but x ∉ A ∩ B}
= {x : (x ∈ A and x ∉ B) or (x ∈ B and x ∉ A)}.
The set (A − B) ∪ (B − A) is also called the symmetric difference of A and B. Thus,
A△B = (A − B) ∪ (B − A) = (A ∩ B′) ∪ (A′ ∩ B)
= (A ∪ B) − (A ∩ B).
The Venn diagram of A△B is shown in Fig. 1.6.

Figure 1.6: Symmetric difference of A and B (shaded area)

If A = {1, 2, 4, 7, 9} and B = {2, 3, 7, 8, 9},
then A△B = {1, 3, 4, 8}. Note that,
(A − B) ∩ (B − A) = (A ∩ B′) ∩ (B ∩ A′)
= A ∩ (B′ ∩ B) ∩ A′
= (A ∩ ∅) ∩ A′ = ∅ ∩ A′ = ∅.
Therefore, A△B can be considered as the union of the disjoint subsets A − B and B − A.
From the definition, we have the following:
(i) Symmetric difference is commutative, i.e., A△B = B△A,
(ii) Symmetric difference is associative, i.e., (A△B)△C = A△(B△C),
(iii) A△∅ = A, for all subsets A,
(iv) A△A = ∅, for all subsets A,
(v) A△B = ∅ iff A = B,
(vi) A ∩ (B△C) = (A ∩ B)△(A ∩ C) : distributive property.
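Python exposes the symmetric difference as the `^` operator on sets, so the properties above can be sketched directly; the sets are the example from the text plus an extra set C of my own.

```python
A = {1, 2, 4, 7, 9}
B = {2, 3, 7, 8, 9}
C = {1, 2, 3}

assert A ^ B == {1, 3, 4, 8}             # A △ B
assert A ^ B == (A - B) | (B - A)        # (A − B) ∪ (B − A)
assert A ^ B == (A | B) - (A & B)        # (A ∪ B) − (A ∩ B)
assert A ^ B == B ^ A                    # commutative
assert (A ^ B) ^ C == A ^ (B ^ C)        # associative
assert A ^ set() == A and A ^ A == set()
assert A & (B ^ C) == (A & B) ^ (A & C)  # distributive property
print("symmetric-difference identities hold for this example")
```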

1.3 Duality and Algebra of Sets

Principle of duality
Let E be an equation (or law) of set algebra (involving ∪, ∩, U and ∅). If we replace ∪ by ∩,
∩ by ∪, U by ∅ and ∅ by U in E, then we obtain from E another law E*, which is also a valid law.
This is known as the principle of duality. For example,
(i) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C); its dual law is A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C),
(ii) A ∩ A′ = ∅; its dual is A ∪ A′ = U.
It is a fact of set algebra, called the principle of duality, that if any equation E is an identity,
then its dual E* is also an identity.
Algebra of sets
Some commonly used laws of sets are stated below. Note that the law stated in (b) is the
dual law of (a), and conversely.
1. Idempotent laws
(a) A ∪ A = A
(b) A ∩ A = A.
2. Identity laws
(i) (a) A ∪ ∅ = A
(b) A ∩ U = A.
(ii) (a) A ∩ ∅ = ∅
(b) A ∪ U = U.
3. Commutative laws
(a) A ∪ B = B ∪ A
(b) A ∩ B = B ∩ A.
4. Associative laws
(a) (A ∪ B) ∪ C = A ∪ (B ∪ C)
(b) (A ∩ B) ∩ C = A ∩ (B ∩ C).
5. Distributive laws
(a) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
(b) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
6. Inverse laws
(a) A ∪ Aᶜ = U
(b) A ∩ Aᶜ = ∅.
7. Domination laws
(a) A ∪ U = U
(b) A ∩ ∅ = ∅.
8. Absorption laws
(a) A ∪ (A ∩ B) = A
(b) A ∩ (A ∪ B) = A.
9. De Morgan's laws
(a) (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ
(b) (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ.
Let S₁ and S₂ be two set expressions. The equality S₁ = S₂ means S₁ ⊆ S₂ as well as S₂ ⊆ S₁.
These are the main algebraic laws on sets.
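Each law and its dual can be spot-checked numerically on randomly drawn finite sets; this sampling harness is a sketch of my own, not from the text.

```python
import random

random.seed(0)
U = set(range(12))

def rand_subset():
    """A random subset of the assumed universal set U."""
    return {x for x in U if random.random() < 0.5}

for _ in range(100):
    A, B, C = rand_subset(), rand_subset(), rand_subset()
    # distributive law and its dual
    assert A | (B & C) == (A | B) & (A | C)
    assert A & (B | C) == (A & B) | (A & C)
    # inverse law and its dual
    assert A | (U - A) == U
    assert A & (U - A) == set()
    # absorption law and its dual
    assert A | (A & B) == A
    assert A & (A | B) == A
print("all sampled laws and their duals hold")
```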
Property 1.3.1 Let A, B and C be any three finite sets. Then    [WBUT 09]
(i) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C); (ii) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
Proof: (i) Let x be any element of A ∪ (B ∩ C). Then,
x ∈ A ∪ (B ∩ C) ⟺ x ∈ A or x ∈ (B ∩ C)
⟺ x ∈ A or (x ∈ B and x ∈ C)
⟺ (x ∈ A or x ∈ B) and (x ∈ A or x ∈ C)
⟺ (x ∈ A ∪ B) and (x ∈ A ∪ C)
⟺ x ∈ (A ∪ B) ∩ (A ∪ C).
The symbol ⟺ stands for "implies and is implied by"; it also stands for "if and only if". Hence
A ∪ (B ∩ C) ⊆ (A ∪ B) ∩ (A ∪ C) and (A ∪ B) ∩ (A ∪ C) ⊆ A ∪ (B ∩ C). Hence
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).
(ii) Let x be any element of A ∩ (B ∪ C). Then,
x ∈ A ∩ (B ∪ C) ⟺ x ∈ A and x ∈ (B ∪ C)
⟺ x ∈ A and (x ∈ B or x ∈ C)
⟺ (x ∈ A and x ∈ B) or (x ∈ A and x ∈ C)
⟺ (x ∈ A ∩ B) or (x ∈ A ∩ C)
⟺ x ∈ (A ∩ B) ∪ (A ∩ C).
Hence, A ∩ (B ∪ C) ⊆ (A ∩ B) ∪ (A ∩ C) and (A ∩ B) ∪ (A ∩ C) ⊆ A ∩ (B ∪ C). Hence,
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).
Property 1.3.2 Let A, B and C be any three finite sets. Then,
(i) (A ∪ B)′ = A′ ∩ B′.
(ii) (A ∩ B)′ = A′ ∪ B′.
(iii) A − (B ∪ C) = (A − B) ∩ (A − C).
(iv) A − (B ∩ C) = (A − B) ∪ (A − C).
Proof: (i) Let x be an arbitrary element of (A ∪ B)′. Then,
x ∈ (A ∪ B)′ ⟺ x ∉ (A ∪ B)
⟺ x ∉ A and x ∉ B
⟺ x ∈ A′ and x ∈ B′
⟺ x ∈ (A′ ∩ B′).
Hence (A ∪ B)′ ⊆ A′ ∩ B′ and A′ ∩ B′ ⊆ (A ∪ B)′. Therefore,
(A ∪ B)′ = A′ ∩ B′.
(ii) Similarly, (A ∩ B)′ = A′ ∪ B′.
(iii) Let x be an arbitrary element of A − (B ∪ C). Now,
x ∈ [A − (B ∪ C)] ⟺ [(x ∈ A) and x ∉ (B ∪ C)]
⟺ [(x ∈ A) and (x ∉ B and x ∉ C)]
⟺ [(x ∈ A and x ∉ B) and (x ∈ A and x ∉ C)]
⟺ [x ∈ (A − B)] and [x ∈ (A − C)]
⟺ x ∈ [(A − B) ∩ (A − C)].
Thus, A − (B ∪ C) ⊆ (A − B) ∩ (A − C) and (A − B) ∩ (A − C) ⊆ A − (B ∪ C), and so
A − (B ∪ C) = (A − B) ∩ (A − C).
(iv) Similarly, A − (B ∩ C) = (A − B) ∪ (A − C).
Ex 1.3.1 Prove that (A − C) ∩ (B − C) = (A ∩ B) − C.
Solution: We shall use suitable laws of the algebra of sets. Using the results A − C = A ∩ C′
and B − C = B ∩ C′, we get,
L.H.S. = (A − C) ∩ (B − C) = (A ∩ C′) ∩ (B ∩ C′)
= (A ∩ C′) ∩ (C′ ∩ B) = {(A ∩ C′) ∩ C′} ∩ B
= (A ∩ C′) ∩ B = (A ∩ B) ∩ C′
= (A ∩ B) − C = R.H.S. (Proved)
Ex 1.3.2 Show that (A − B) ∪ (B − A) = (A ∪ B) − (A ∩ B), where A, B are sets.
Solution: We shall use suitable laws of the algebra of sets. Using the results A − B = A ∩ B′
and B − A = B ∩ A′, we get,
L.H.S. = (A − B) ∪ (B − A) = (A ∩ B′) ∪ (B ∩ A′)
= {(A ∩ B′) ∪ B} ∩ {(A ∩ B′) ∪ A′}
= {(A ∪ B) ∩ (B′ ∪ B)} ∩ {(A ∪ A′) ∩ (B′ ∪ A′)}
= {(A ∪ B) ∩ S} ∩ {S ∩ (B′ ∪ A′)}, S being the universal set,
= (A ∪ B) ∩ (B′ ∪ A′) = (A ∪ B) ∩ (A ∩ B)′
= (A ∪ B) − (A ∩ B) = R.H.S. (Proved)


Ex 1.3.3 Show that (A ∪ B) − (A ∪ C) = (B − A) − C, where A, B, C are sets.

Solution: We shall use suitable laws of the algebra of sets.
L.H.S. = (A ∪ B) − (A ∪ C) = (A ∪ B) ∩ (A ∪ C)′
= (A ∪ B) ∩ (A′ ∩ C′) = {(A ∪ B) ∩ A′} ∩ C′
= {(A ∩ A′) ∪ (B ∩ A′)} ∩ C′ = {∅ ∪ (B ∩ A′)} ∩ C′
= (B ∩ A′) ∩ C′ = (B − A) ∩ C′
= (B − A) − C = R.H.S. (Proved)
Ex 1.3.4 Show that (A ∩ B ∩ C) ∪ (A ∩ B ∩ C′) ∪ (A ∩ B′ ∩ C) ∪ (A ∩ B′ ∩ C′) = A, where
A, B, C are sets.    [KH 07]
Solution: We shall use suitable laws of the algebra of sets.
L.H.S. = (A ∩ B ∩ C) ∪ (A ∩ B ∩ C′) ∪ (A ∩ B′ ∩ C) ∪ (A ∩ B′ ∩ C′)
= (X ∩ C) ∪ (X ∩ C′) ∪ (Y ∩ C) ∪ (Y ∩ C′), where X = A ∩ B, Y = A ∩ B′,
= [X ∩ (C ∪ C′)] ∪ [Y ∩ (C ∪ C′)]
= (X ∩ U) ∪ (Y ∩ U), where U = the universal set,
= X ∪ Y = (A ∩ B) ∪ (A ∩ B′)
= A ∩ (B ∪ B′) = A ∩ U = A = R.H.S. (Proved)
Ex 1.3.5 If A ∪ B = A ∪ C and A ∩ B = A ∩ C simultaneously for subsets A, B, C of a set
S, prove that B = C.    [CH 09]
Solution: We shall use suitable laws of the algebra of sets.
B = (A ∩ B) ∪ B = (A ∩ B) ∪ (B ∩ B)
= (A ∩ C) ∪ B [as A ∩ B = A ∩ C]
= (A ∪ B) ∩ (C ∪ B)
= (A ∪ C) ∩ (B ∪ C) [as A ∪ B = A ∪ C]
= (A ∩ B) ∪ C = (A ∩ C) ∪ C [as A ∩ B = A ∩ C]
= C. (Proved)
From this, we conclude that A ∪ B = A ∪ C alone, or A ∩ B = A ∩ C alone, does not necessarily
imply B = C.
Ex 1.3.6 Define the complement of A as the set A′ such that A ∪ A′ = S and A ∩ A′ = ∅; show that
(A ∪ B)′ = A′ ∩ B′.
Solution: Let C = A ∪ B and D = A′ ∩ B′. We shall show that D = C′. Now,
C ∪ D = (A ∪ B) ∪ (A′ ∩ B′)
= (A ∪ B ∪ A′) ∩ (A ∪ B ∪ B′)
= {(A ∪ A′) ∪ B} ∩ {A ∪ (B ∪ B′)}
= {S ∪ B} ∩ {A ∪ S} = S ∩ S = S.
Also, using the relations C = A ∪ B and D = A′ ∩ B′, we get,
C ∩ D = (A ∪ B) ∩ (A′ ∩ B′)
= (A ∩ A′ ∩ B′) ∪ (B ∩ A′ ∩ B′)
= (∅ ∩ B′) ∪ (∅ ∩ A′) = ∅ ∪ ∅ = ∅.
Hence C′ = D, i.e., (A ∪ B)′ = A′ ∩ B′.

Ex 1.3.7 If A ⊆ B and C is any set, then show that A ∪ C ⊆ B ∪ C.


Solution: Let x be any element of A ∪ C. Hence,
x ∈ A ∪ C ⇒ x ∈ A or x ∈ C.
Again, x ∈ A ⇒ x ∈ B (since A ⊆ B). Therefore,
x ∈ A ∪ C ⇒ x ∈ A or x ∈ C
⇒ x ∈ B or x ∈ C
⇒ x ∈ B ∪ C.
Again, x ∈ C ⇒ x ∈ B ∪ C. Hence, A ∪ C ⊆ B ∪ C. (Proved)
Ex 1.3.8 Prove that (A′ ∩ B′ ∩ C) ∪ (B ∩ C) ∪ (A ∩ C) = C.
Solution: L.H.S. = (A′ ∩ B′ ∩ C) ∪ (B ∩ C) ∪ (A ∩ C). Now consider,
(B ∩ C) ∪ (A ∩ C) = (C ∩ B) ∪ (C ∩ A)
= C ∩ (B ∪ A) = (A ∪ B) ∩ C.
Now again, A′ ∩ B′ ∩ C = (A ∪ B)′ ∩ C. Hence,
L.H.S. = {(A ∪ B)′ ∩ C} ∪ {(A ∪ B) ∩ C}
= {(A ∪ B)′ ∪ (A ∪ B)} ∩ C = S ∩ C (S = the universal set)
= C = R.H.S. (Proved)
Ex 1.3.9 If A, B, C are subsets of U, prove that [A ∩ (B ∪ C)] ∩ [A′ ∪ (B′ ∩ C′)] = ∅.
Solution: Using the properties of sets, we get,
L.H.S. = [A ∩ (B ∪ C)] ∩ [A′ ∪ (B′ ∩ C′)]
= [A ∩ (B ∪ C) ∩ A′] ∪ [A ∩ (B ∪ C) ∩ (B′ ∩ C′)]
= [(A ∩ A′) ∩ (B ∪ C)] ∪ [A ∩ (B ∪ C) ∩ (B ∪ C)′]
= [∅ ∩ (B ∪ C)] ∪ [A ∩ ∅] = ∅ ∪ ∅ = ∅.

1.4 Cartesian Product of Sets

Let A and B be two nonempty sets. An ordered pair consists of two elements, say a ∈ A
and b ∈ B, and it is denoted by (a, b). The element a is called the first element or first
coordinate and the element b is called the second element or second coordinate. The ordered
pairs (a, b) and (b, a) are distinct unless a = b. Thus (a, a) is a well-defined ordered pair.
If a, c ∈ A and b, d ∈ B, two ordered pairs (a, b) and (c, d) are said to be equal, i.e.,
(a, b) = (c, d), if and only if a = c and b = d.
An ordered triple is an ordered collection of objects (a, b, c) where a is the first, b the second and c the
third element of the triple. An ordered triple can also be written in terms of ordered pairs as
((a, b), c). Similarly, an ordered quadruple is an ordered pair (((a, b), c), d) whose first element
is itself an ordered pair.
Definition 1.4.1 Let A and B be any two finite sets. The cartesian product (or cross
product or direct product) of A and B, denoted by A × B (read as "A cross B"), is the set
defined by
A × B = {(x, y) : x ∈ A and y ∈ B},
i.e., A × B is the set of all distinct ordered pairs (x, y), where the first element of the pair is an
element of A and the second is an element of B. For example, let A = {a, b} and B = {1, 2, 3}.
Then
A × B = {(a, 1), (a, 2), (a, 3), (b, 1), (b, 2), (b, 3)} and
B × A = {(1, a), (2, a), (3, a), (1, b), (2, b), (3, b)}.
The geometric representation of A × B is depicted in Fig. 1.7.

Figure 1.7: Representation of A × B


From this example, it is observed that A × B ≠ B × A; so, in general, A × B ≠ B × A.
Let A₁, A₂, . . . , Aₙ be a finite collection of non-empty sets. The cartesian product of the
collection, denoted by A₁ × A₂ × ⋯ × Aₙ, is the set defined by
$$\prod_{i=1}^{n} A_i = A_1 \times A_2 \times \cdots \times A_n = \{(x_1, x_2, \ldots, x_n) : x_i \in A_i\}.$$
In particular, if A₁ = A₂ = ⋯ = Aₙ = A, the cartesian product of the collection of sets,
denoted by Aⁿ, is the set of all ordered n-tuples,
Aⁿ = {(x₁, x₂, . . . , xₙ) : xᵢ ∈ A}.
If A = B = ℝ, the set of all real numbers, then ℝ × ℝ (= ℝ²) is the set of all points in the
plane. The ordered pair (a, b) represents a point in the plane. Similarly, ℝ × ℝ × ℝ (= ℝ³)
is the set of all points in space, i.e., (a, b, c) ∈ ℝ × ℝ × ℝ is a point in space. Below are some
important properties of the cartesian product:
(i) The cartesian product is non-commutative, i.e., in general A × B ≠ B × A.
(ii) If n(A) = p and n(B) = q then n(A × B) = n(B × A) = pq.
(iii) If A = ∅ or B = ∅ then A × B = ∅.
(iv) If either A or B is infinite and the other is empty then A × B = ∅.
(v) If either A or B is infinite and the other is non-empty then A × B is infinite.
Following are some results on the cartesian product:
(i) If A ⊆ B then A × C ⊆ B × C for any sets A, B, C.
(ii) If A ⊆ B and C ⊆ D then A × C ⊆ B × D.
(iii) If A ⊆ B then A × A = (A × B) ∩ (B × A).
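The cartesian product and the counting property n(A × B) = pq can be sketched with `itertools.product`; the example sets reuse the ones above, with C, D added by me.

```python
from itertools import product

A = {'a', 'b'}
B = {1, 2, 3}

AxB = set(product(A, B))
BxA = set(product(B, A))
print(sorted(AxB))
assert len(AxB) == len(A) * len(B) == 6      # n(A × B) = pq
assert AxB != BxA                            # non-commutative here
# (A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D), as in Ex 1.4.3 below
C, D = {'a'}, {2, 3, 5}
assert AxB & set(product(C, D)) == set(product(A & C, B & D))
```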


Result 1.4.1 For any non-empty sets A and B, A × B = B × A iff A = B.

Proof: Let A × B = B × A. Then,
x ∈ A ⇒ (x, y) ∈ A × B, where y ∈ B,
⇒ (x, y) ∈ B × A, since A × B = B × A,
⇒ x ∈ B.
Thus A ⊆ B. Similarly, B ⊆ A. Hence A = B. Conversely, let A = B. Then A × B = A × A
and B × A = A × A. Hence, A × B = B × A.
Ex 1.4.1 For three non-empty sets A, B, C, prove that
A × (B ∩ C) = (A × B) ∩ (A × C).
Solution: This result is a direct consequence of the definition. Let (x, y) be an arbitrary
element of the set A × (B ∩ C). Then,
(x, y) ∈ A × (B ∩ C) ⟺ x ∈ A and y ∈ (B ∩ C)
⟺ x ∈ A and [y ∈ B and y ∈ C]
⟺ [x ∈ A and y ∈ B] and [x ∈ A and y ∈ C]
⟺ [(x, y) ∈ A × B] and [(x, y) ∈ A × C]
⟺ (x, y) ∈ (A × B) ∩ (A × C).
Therefore, A × (B ∩ C) ⊆ (A × B) ∩ (A × C) and (A × B) ∩ (A × C) ⊆ A × (B ∩ C), and
consequently A × (B ∩ C) = (A × B) ∩ (A × C). Similarly,
A × (B ∪ C) = (A × B) ∪ (A × C).
Ex 1.4.2 For any three sets A, B, C prove that
A × (B − C) = (A × B) − (A × C).
Solution: Let (x, y) be an arbitrary element of A × (B − C). Then,
(x, y) ∈ A × (B − C) ⟺ x ∈ A and y ∈ (B − C)
⟺ x ∈ A and [y ∈ B and y ∉ C]
⟺ [x ∈ A and y ∈ B] and [x ∈ A and y ∉ C]
⟺ [(x, y) ∈ (A × B)] and [(x, y) ∉ (A × C)]
⟺ (x, y) ∈ (A × B) − (A × C).
Therefore, A × (B − C) ⊆ (A × B) − (A × C) and (A × B) − (A × C) ⊆ A × (B − C), and
consequently A × (B − C) = (A × B) − (A × C).
Ex 1.4.3 For any sets A, B, C and D, we have
(A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D).    [CH 96, 01]

Solution: Let (x, y) be an arbitrary element of (A × B) ∩ (C × D). Then,

(x, y) ∈ (A × B) ∩ (C × D) ⟺ (x, y) ∈ (A × B) and (x, y) ∈ (C × D)
⟺ (x ∈ A and y ∈ B) and (x ∈ C and y ∈ D)
⟺ (x ∈ A and x ∈ C) and (y ∈ B and y ∈ D)
⟺ x ∈ (A ∩ C) and y ∈ (B ∩ D)
⟺ (x, y) ∈ (A ∩ C) × (B ∩ D).
Thus, (A × B) ∩ (C × D) ⊆ (A ∩ C) × (B ∩ D) and (A ∩ C) × (B ∩ D) ⊆ (A × B) ∩ (C × D),
so that (A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D). Similarly, for any sets A, B, C and D, we
have A ⊆ B and C ⊆ D ⇒ (A × C) ⊆ (B × D).

1.5 Cardinal Numbers

The number of distinct elements in a set A is called the cardinal number of the set and it
is denoted by n(A), |A| or card(A). For example, n(∅) = 0, n({a}) = 1, n({a, b}) = 2,
n(Z) = ∞, etc. Following are important properties of the cardinal number:
(i) If A and B are disjoint, then
(a) n(A ∩ B) = n(∅) = 0 and (b) n(A ∪ B) = n(A) + n(B),
(ii) Let A and B be two finite sets with A ∩ B ≠ ∅; then
n(A ∪ B) = n(A) + n(B) − n(A ∩ B),
(iii) If A, B, C are three arbitrary finite sets then
n(A ∪ B ∪ C) = n(A) + n(B) + n(C) − n(A ∩ B)
− n(B ∩ C) − n(C ∩ A) + n(A ∩ B ∩ C),
(iv) Suppose we have any finite number of finite sets, say A₁, A₂, . . . , Aₘ. Let sₖ be the
sum of the cardinalities of all the k-fold intersections of these sets; then
n(A₁ ∪ A₂ ∪ ⋯ ∪ Aₘ) = s₁ − s₂ + s₃ − ⋯ + (−1)^{m−1} sₘ.
(v) n(A ∪ A) = n(A) and n(A ∩ A) = n(A).
The inclusion and exclusion principle
The number of elements in finite sets such as A ∪ B, A ∩ B, A△B, etc., is obtained by
adding as well as deleting certain elements. This method of finding the number of
elements in a finite set is known as the inclusion and exclusion principle.
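The two- and three-set inclusion–exclusion formulas can be verified on sample sets; the sets below are chosen by me for illustration.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}
C = {1, 4, 6, 7}

# Two sets: n(A ∪ B) = n(A) + n(B) − n(A ∩ B)
assert len(A | B) == len(A) + len(B) - len(A & B)
# Three sets: the formula of property (iii)
lhs = len(A | B | C)
rhs = (len(A) + len(B) + len(C)
       - len(A & B) - len(B & C) - len(C & A)
       + len(A & B & C))
assert lhs == rhs
print(lhs)   # 7
```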
Ex 1.5.1 If n(A) and n(B) denote the numbers of elements in the finite sets A and B
respectively, then prove that n(A) + n(B) = n(A ∪ B) + n(A ∩ B).
Solution: If A and B are disjoint, then n(A ∪ B) is equal to the sum of the number of elements of
A and the number of elements of B, that is, n(A ∪ B) = n(A) + n(B). If A and B are not disjoint, then

Figure 1.8:

A ∪ B is expressed as the union of the three disjoint sets A ∩ B, A − (A ∩ B) and B − (A ∩ B). Let
us draw the Venn diagram showing A ∩ B′, A ∩ B, A′ ∩ B and A′ ∩ B′, where A and B are
two subsets of the universal set U. From the diagram, we have
A = (A ∩ B′) ∪ (A ∩ B) and B = (A ∩ B) ∪ (A′ ∩ B),
and the sets A ∩ B′, A ∩ B and A′ ∩ B are disjoint. Therefore,
n(A) = n(A ∩ B′) + n(A ∩ B) and n(B) = n(A ∩ B) + n(A′ ∩ B),
so n(A) + n(B) = n(A ∩ B′) + 2n(A ∩ B) + n(A′ ∩ B).
Again, from the diagram, we have
A ∪ B = (A ∩ B′) ∪ (A ∩ B) ∪ (A′ ∩ B),
in which (A ∩ B′), (A ∩ B) and (A′ ∩ B) are disjoint sets, and so
n(A ∪ B) = n(A ∩ B′) + n(A ∩ B) + n(A′ ∩ B).
Subtracting, we get n(A) + n(B) = n(A ∪ B) + n(A ∩ B). Alternatively, let n(A) = r, n(B) = s and
n(A ∩ B) = t. Then from Fig. 1.8,
n(A − (A ∩ B)) = r − t and n(B − (A ∩ B)) = s − t.
Therefore, n(A ∪ B) = t + (r − t) + (s − t) = r + s − t,
i.e., n(A ∪ B) = n(A) + n(B) − n(A ∩ B).
Following are some results on Venn diagrams:
(i) n(A ∪ B) = n(A) + n(B) − n(A ∩ B), and n(A ∪ B) = n(A) + n(B)
provided A and B are disjoint, i.e., n(A ∩ B) = 0.
(ii) n(A ∩ B′) = n(A) − n(A ∩ B) and n(A′ ∩ B) = n(B) − n(A ∩ B).
(iii) n(A△B) = n(A) + n(B) − 2n(A ∩ B).
(iv) n(A′ ∪ B′) = n(U) − n(A ∩ B) and n(A′ ∩ B′) = n(U) − n(A ∪ B).
Ex 1.5.2 In a canteen, out of 122 students, 42 students buy ice-cream, 36 buy buns and
10 buy cakes; 15 students buy ice-cream and buns, 10 ice-cream and cakes, 4 cakes and buns
but not ice-cream, and 11 ice-cream and buns but not cakes. Find (i) how many students buy
nothing at all, (ii) how many students buy at least two items, (iii) how many students buy all three items.
Solution: Define the sets A, B and C such that A = set of students who buy cakes, B =
set of students who buy ice-cream, C = set of students who buy buns.
According to the question, we have n(A) = 10, n(B) = 42, n(C) = 36, n(B ∩ C) = 15,
n(A ∩ B) = 10, n[(A ∩ C) − B] = 4 and n[(B ∩ C) − A] = 11. Now, we have,
n(B ∪ C) = n(B) + n(C) − n(B ∩ C) = 42 + 36 − 15 = 63,
n(B ∪ C) − n(B) = 63 − 42 = 21
and n(B ∪ C) − n(C) = 63 − 36 = 27.
The above distribution of the students can be illustrated by a Venn diagram. Now the total
number of students buying something
= 10 + 6 + 21 + 4 + 4 + 11 + 17 = 73.
Therefore, the number of students who did not buy anything = 122 − 73 = 49. The number of
students buying all three items = 4.
Ex 1.5.3 In a recent survey of 500 students in a college, it was found that 150 students read
newspaper A, 200 read newspaper B, and 80 students read both the newspapers A and B.
Find how many read either newspaper.
Solution: Let X and Y denote the sets of students who read newspapers A and B
respectively. It is given that n(X) = 150, n(Y) = 200, n(X ∩ Y) = 80, n(U) = 500. The number of
students who read either A or B is n(X ∪ Y). Also,
n(X ∪ Y) = n(X) + n(Y) − n(X ∩ Y) = 150 + 200 − 80 = 270.


Ex 1.5.4 In a class of 42 students, each plays at least one of the three games: Cricket, Hockey
and Football. It is found that 14 play Cricket, 20 play Hockey and 24 play Football; 3 play
both Cricket and Football, 2 play both Hockey and Football, and none play all three games.
Find the number of students who play Cricket but not Hockey.
Solution: Let C, H and F be the sets of students who play Cricket, Hockey and Football.
Given that n(C) = 14, n(H) = 20, n(F) = 24, n(C ∩ F) = 3, n(H ∩ F) = 2, n(H ∩ F ∩ C) = 0
and n(H ∪ F ∪ C) = 42 (since each student plays at least one game). We know,
n(H ∪ F ∪ C) = n(H) + n(F) + n(C) − n(H ∩ F)
− n(H ∩ C) − n(F ∩ C) + n(H ∩ F ∩ C),
or, 42 = 20 + 24 + 14 − 2 − n(H ∩ C) − 3 + 0
⇒ n(H ∩ C) = 11.
Now, the number of students who play Cricket but not Hockey is n(C − H). Also, we know
n(C) = n(C − H) + n(C ∩ H),
i.e., n(C − H) = n(C) − n(C ∩ H) = 14 − 11 = 3.
Hence 3 students play Cricket but not Hockey.

1.6 Relation

Let A and B be two non-empty sets and a ∈ A, b ∈ B. A relation or binary relation ρ
between the two sets A and B is a subset of A × B, i.e., if (a, b) ∈ ρ, where ρ ⊆ A × B, we say
that a stands in the relationship ρ with b, and we write aρb, i.e.,
aρb ⟺ (a, b) ∈ ρ ⊆ A × B.
For example, let A = {4, 5, 6, 9} and B = {20, 22, 24, 28, 30}. Let us define a relation ρ from
A into B by stipulating aρb if and only if a divides b, where a ∈ A and b ∈ B. Then it is
clear that
ρ = {(4, 20), (4, 24), (4, 28), (5, 20), (5, 30), (6, 24), (6, 30)}.
If n(A) = n and n(B) = m then the number of elements of A × B is mn. It is known that
the number of elements of the power set of A × B is $2^{mn}$. Again, any
subset of A × B is a relation. Thus the total number of binary relations from A to B is
$2^{mn}$.
A relation between the non-empty set A and A itself is said to be a binary relation on
A. If n(A) = n, then there are $2^{n^2}$ relations on it.
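The divisibility relation of the example can be built literally as a subset of A × B; this construction is a sketch of my own, not from the text.

```python
from itertools import product

A = {4, 5, 6, 9}
B = {20, 22, 24, 28, 30}

# ρ = {(a, b) ∈ A × B : a divides b}
rho = {(a, b) for a, b in product(A, B) if b % a == 0}
print(sorted(rho))
assert rho == {(4, 20), (4, 24), (4, 28), (5, 20), (5, 30), (6, 24), (6, 30)}
# Total number of binary relations from A to B is 2^(mn) = 2^20 here.
assert 2 ** (len(A) * len(B)) == 2 ** 20
```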
Definition 1.6.1 [Domain and range] Let ρ be a relation from a set A into the set B.
Then the domain of ρ, denoted by D(ρ), is the set
D(ρ) = {a : a ∈ A, (a, b) ∈ ρ for some b ∈ B}.    (1.9)
The domain of a binary relation is the set of all first elements of the ordered pairs in the
relation. The range or image of ρ, denoted by I(ρ), is the set
I(ρ) = {b : b ∈ B, (a, b) ∈ ρ for some a ∈ A}.    (1.10)
The range of a binary relation is the set of all second elements of the ordered pairs in the
relation. For example, let A = {a, b, c} and B = {1, 2, 3}, and let a relation ρ = {(a, 1),
(b, 1), (c, 2), (a, 2)}. For this relation D(ρ) = {a, b, c} and I(ρ) = {1, 2}.
If (a, b) ∉ ρ, i.e., (a, b) ∈ (A × B) − ρ, then we say that a does not stand in the relationship ρ
with b. Let ρ ⊆ N × N be given by ρ = {(x, y) : x is a divisor of y}.
Then xρy if x is a divisor of y. For example, 3ρ6 holds but 2ρ5 does not.


Universal and null relation
We know that A × A is a subset of A × A, so ρ = A × A is defined as the universal relation on A. If
ρ = A × B then ρ is called the universal relation from A to B. The null set ∅ ⊆ A × B, and if ρ = ∅
then ρ is called the null relation or empty relation from A to B. In general, ∅ ⊆ ρ ⊆ A × B.
For example, in the set Z of integers, ρ = {(a, b) : a + b is an integer} is a universal
relation and ρ = {(a, b) : a + b is not an integer} is the null relation.
Identity relation
On the set A, the relation ρ = {(x, y) : x ∈ A, y ∈ A, x = y} is called the identity
or diagonal relation on A, and it will be denoted by I_A.
Inverse relation
Let ρ be a relation from a set A to B. The inverse of ρ, denoted by ρ⁻¹, is a relation from
the set B to A, defined by
ρ⁻¹ = {(y, x) : y ∈ B, x ∈ A, (x, y) ∈ ρ}.
For example,
(i) Let ρ = {(1, y), (1, z), (3, y)} from A = {1, 2, 3} to B = {x, y, z}; then its inverse is
ρ⁻¹ = {(y, 1), (z, 1), (y, 3)}.
(ii) Let A = {a, b, c} and B = {1, 2, 3}. The inverse of the relation ρ = {(a, 1), (b, 1), (c, 2), (a, 2)}
is ρ⁻¹ = {(1, a), (1, b), (2, c), (2, a)}.
Note that if ρ ⊆ A × B then ρ⁻¹ ⊆ B × A. It is easy to verify that the domain of ρ⁻¹ is
D(ρ⁻¹) = I(ρ), the range of ρ, and the range of ρ⁻¹ is I(ρ⁻¹) = D(ρ), the domain of ρ. Clearly, if
ρ is any relation, then (ρ⁻¹)⁻¹ = ρ.
Composition of relations
Let ρ₁ be a relation from a set A into the set B and ρ₂ be another relation from the set B
into C, i.e., ρ₁ ⊆ A × B and ρ₂ ⊆ B × C. The composition of ρ₁ and ρ₂, denoted by ρ₁ρ₂,
is the relation from A into C defined by
ρ₁ρ₂ = {(a, c) : (a, b) ∈ ρ₁ and (b, c) ∈ ρ₂ for some b ∈ B}.    (1.11)
Thus a(ρ₁ρ₂)c holds if there exists some b ∈ B such that aρ₁b and bρ₂c. For example, let
A = {1, 2, 3}, B = {x, y} and C = {t}, with ρ₁ = {(1, x), (2, y), (3, y)} and ρ₂ = {(x, t), (y, t)}.
Therefore, ρ₁ρ₂ = {(1, t), (2, t), (3, t)}. From this definition it follows that ρ¹ = ρ and ρⁿ =
ρρⁿ⁻¹, n > 1.
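The domain, range, inverse and composition of finite relations can be sketched with small helper functions; the helpers are assumptions of my own, not the text's notation.

```python
def domain(rho):
    """D(ρ): all first elements of the ordered pairs in ρ."""
    return {a for a, b in rho}

def image(rho):
    """I(ρ): all second elements of the ordered pairs in ρ."""
    return {b for a, b in rho}

def inverse(rho):
    """ρ⁻¹ = {(y, x) : (x, y) ∈ ρ}."""
    return {(b, a) for a, b in rho}

def compose(r1, r2):
    """For r1 ⊆ A×B and r2 ⊆ B×C, the relation r1r2 ⊆ A×C."""
    return {(a, c) for a, b in r1 for b2, c in r2 if b == b2}

r1 = {(1, 'x'), (2, 'y'), (3, 'y')}
r2 = {('x', 't'), ('y', 't')}
assert compose(r1, r2) == {(1, 't'), (2, 't'), (3, 't')}
assert inverse(inverse(r1)) == r1            # (ρ⁻¹)⁻¹ = ρ
assert domain(inverse(r1)) == image(r1)      # D(ρ⁻¹) = I(ρ)
assert image(inverse(r1)) == domain(r1)      # I(ρ⁻¹) = D(ρ)
```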
Definition 1.6.2 [Set operations on relations] Since every binary relation is a set of
ordered pairs, the set operations can also be defined on relations.
Let ρ₁ and ρ₂ be two relations from a set A to a set B. Then ρ₁ ∪ ρ₂, ρ₁ ∩ ρ₂, ρ₁ − ρ₂,
ρ₁′ are relations given by
(i) Union : a(ρ₁ ∪ ρ₂)b ⟺ aρ₁b or aρ₂b.
(ii) Intersection : a(ρ₁ ∩ ρ₂)b ⟺ aρ₁b and aρ₂b.
(iii) Difference : a(ρ₁ − ρ₂)b ⟺ aρ₁b holds and aρ₂b does not;
a(ρ₂ − ρ₁)b ⟺ aρ₂b holds and aρ₁b does not.
(iv) Complement : a(ρ₁′)b ⟺ aρ₁b does not hold.
These relations correspond to the set operations of union, intersection, difference and
complementation on sets. For example, let A = {a, b, c}, B = {2, 4, 6}, C = {a, b}, D = {2, 4, 5}.
Let ρ₁ be a relation from A into B defined as ρ₁ = {(a, 2), (b, 4), (c, 6)} and ρ₂ be another
relation from C into D defined by ρ₂ = {(a, 2), (b, 4), (b, 5)}. Thus,
(i) ρ₁ ∪ ρ₂ = {(a, 2), (b, 4), (c, 6), (b, 5)},
(ii) ρ₁ ∩ ρ₂ = {(a, 2), (b, 4)},
(iii) ρ₁ − ρ₂ = {(c, 6)}, ρ₂ − ρ₁ = {(b, 5)},
and ρ₁′ = the set of all ordered pairs of A × B that are not in ρ₁
= {(a, 4), (a, 6), (b, 2), (b, 6), (c, 2), (c, 4)}.

1.6.1 Equivalence Relation

Let A be a nonempty set and ρ be a relation on A. Then ρ is called
(i) reflexive, if for all a ∈ A, aρa;
(ii) symmetric, if whenever aρb holds, bρa must hold, for a, b ∈ A;
(iii) antisymmetric, if aρb and bρa hold, then a = b;
(iv) transitive, if whenever aρb and bρc hold, aρc must hold, for a, b, c ∈ A.
It may be remembered that the relation ρ is not reflexive if (a, a) ∉ ρ for at least one a ∈ A,
not symmetric if (a, b) ∈ ρ but (b, a) ∉ ρ for at least one pair (a, b), and not transitive if
(a, b) ∈ ρ, (b, c) ∈ ρ but (a, c) ∉ ρ for some a, b, c ∈ A. A relation which is not symmetric is
not necessarily antisymmetric.
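On a finite carrier set, the defining conditions can be tested mechanically; the predicates below are helpers of my own, applied to the relation of Ex 1.6.1 restricted to a finite window of Z.

```python
def is_reflexive(rho, A):
    return all((a, a) in rho for a in A)

def is_symmetric(rho):
    return all((b, a) in rho for a, b in rho)

def is_transitive(rho):
    return all((a, d) in rho
               for a, b in rho for c, d in rho if b == c)

# aρb iff ab ≥ 0, on the finite window {-3, ..., 3} of Z
A = set(range(-3, 4))
rho = {(a, b) for a in A for b in A if a * b >= 0}
assert is_reflexive(rho, A)
assert is_symmetric(rho)
assert not is_transitive(rho)   # e.g. (-2)ρ0 and 0ρ3, but not (-2)ρ3
```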
Ex 1.6.1 Let ρ be a relation on Z defined by aρb iff ab ≥ 0 for all a, b ∈ Z. Show that ρ is
reflexive and symmetric but not transitive.
Solution: (i) Let a ∈ Z. Then obviously a² ≥ 0, so aρa holds for all a ∈ Z. Therefore ρ is
reflexive.
(ii) Let a, b ∈ Z and let aρb hold. Then ab ≥ 0 ⇒ ba ≥ 0 ⇒ bρa. Hence ρ is symmetric.
(iii) Let a, b, c ∈ Z and let aρb and bρc hold. This does not imply ac ≥ 0. For
example, if a = −2, b = 0, c = 8, then ab = 0 ≥ 0 and bc = 0 ≥ 0 hold, but ac = −16 < 0. Hence
ρ is not transitive.
That is, ρ is reflexive and symmetric but not transitive.
Ex 1.6.2 Show that the relation aρb if ab > 0, a, b ∈ ℝ, is symmetric and transitive but not
reflexive.
Solution: (i) 0 ∈ ℝ, but 0ρ0 does not hold, since 0·0 ≯ 0. So ρ is not reflexive.
(ii) aρb ⇒ ab > 0 ⇒ ba > 0 ⇒ bρa. Hence ρ is symmetric.
(iii) Now,
(a, b) ∈ ρ; (b, c) ∈ ρ ⇒ ab > 0; bc > 0
⇒ ab²c > 0 ⇒ ac > 0, since b² > 0,
⇒ (a, c) ∈ ρ.
Hence ρ is transitive. Therefore, ρ is symmetric and transitive but not reflexive.

Relation


Ex 1.6.3 Verify whether the following relations on the set ℝ are reflexive, symmetric or transitive:
(i) xρy if |x − y| > 0; (ii) xρy if 1 + xy > 0; (iii) xρy if |x| ≤ y; (iv) xρy if 2x + 3y = 10.
Solution: (i) ρ is not reflexive, as for any x ∈ ℝ, x − x = 0 and hence |x − x| ≯ 0, i.e., (x, x) ∉ ρ.
Again, as |x − y| = |y − x|, we have |x − y| > 0 ⇒ |y − x| > 0, whence
xρy ⇒ yρx, for all x, y ∈ ℝ.
So ρ is symmetric. Consider 0, 1 ∈ ℝ; then |1 − 0| = |0 − 1| = 1 > 0 shows that 1ρ0 and 0ρ1, but (1, 1) ∉ ρ as |1 − 1| ≯ 0. Hence ρ is not transitive.
(ii) Since x² ≥ 0 for every x ∈ ℝ, we have 1 + x² > 0 and hence xρx for all x ∈ ℝ, whence ρ is reflexive.
Again, for x, y ∈ ℝ, if 1 + xy > 0 then 1 + yx > 0 as xy = yx, whence we see that
xρy ⇒ yρx, for all x, y ∈ ℝ.
So ρ is symmetric. Let 3, −1/9, −6 ∈ ℝ; then 1 + 3 · (−1/9) = 2/3 > 0 shows that 3ρ(−1/9), and 1 + (−1/9)(−6) = 5/3 > 0 shows that (−1/9)ρ(−6). But 1 + 3 · (−6) = −17 ≯ 0, so (3, −6) ∉ ρ, whence ρ is not transitive.
(iii) Consider −2 ∈ ℝ; then |−2| = 2 ≰ −2, whence (−2, −2) ∉ ρ, showing that ρ is not reflexive. Again |−2| = 2 ≤ 5 and so (−2)ρ5, but |5| = 5 ≰ −2, whence (5, −2) ∉ ρ and consequently ρ is not symmetric.
But if a, b, c ∈ ℝ, then |a| ≤ b, |b| ≤ c give |a| ≤ b ≤ |b| ≤ c, i.e., aρc, and so ρ is transitive.
(iv) As 2 · 1 + 3 · 1 = 5 ≠ 10, (1, 1) ∉ ρ, so ρ is not reflexive. As 2 · 1 + 3 · (8/3) = 10, 1ρ(8/3); but 2 · (8/3) + 3 · 1 = 25/3 ≠ 10, so (8/3, 1) ∉ ρ. So ρ is not symmetric.
As 2 · (1/2) + 3 · 3 = 10, (1/2)ρ3. As 2 · 3 + 3 · (4/3) = 10, 3ρ(4/3). But 2 · (1/2) + 3 · (4/3) = 5 ≠ 10, proving ((1/2), (4/3)) ∉ ρ. Hence ρ is not transitive, and so there is no question of an equivalence relation.
Ex 1.6.4 Verify whether the following relations are reflexive, symmetric or transitive:
(i) in ℤ, xρy if x + y is odd; (ii) in ℤ, xρy if |x − y| ≤ y.
Solution: (i) Since x + x = 2x is even, (x, x) ∉ ρ and ρ is not reflexive. Also,
xρy ⇒ x + y is odd
⇒ y + x is odd ⇒ yρx; for x, y ∈ ℤ.
Hence ρ is symmetric. Again, ρ is not transitive, since 1ρ2 (as 1 + 2 is odd) and 2ρ3 (as 2 + 3 is odd) but (1, 3) ∉ ρ (since 1 + 3 is even). Hence ρ is not reflexive, not transitive, but symmetric.
(ii) Here xρy if |x − y| ≤ y, for x, y ∈ ℤ. Now |x − x| = 0 ≤ x is not true for negative x. Hence ρ is not reflexive. Next,
1ρ3 since |1 − 3| = 2 < 3, but (3, 1) ∉ ρ since |3 − 1| = 2 > 1.
Hence ρ is not symmetric. By definition, xρy, yρz ⇒ |x − y| ≤ y; |y − z| ≤ z. Now,
|x − z| = |x − y + y − z| ≤ |x − y| + |y − z| ≤ y + z.
This suggests that |x − z| may not be ≤ z. Indeed, 9ρ7 and 7ρ4 hold, but |9 − 4| = 5 > 4, so (9, 4) ∉ ρ. Hence ρ is not transitive.


Result 1.6.1 The following tabular form gives a comprehensive idea, where ρ is the binary relation on ℝ:

xρy iff       Reflexive   Symmetric   Transitive
y = 4x        No          No          No
x < y         No          No          Yes
x ≠ y         No          Yes         No
xy > 0        No          Yes         Yes
y ≠ x + 2     Yes         No          No
x ≤ y         Yes         No          Yes
xy ≥ 0        Yes         Yes         No
x = y         Yes         Yes         Yes


Definition 1.6.3 [Equivalence relation] A binary relation ρ on a set A is said to be an equivalence relation (or an RST relation) if
(i) ρ is reflexive: (a, a) ∈ ρ for all a ∈ A, i.e., aρa for all a ∈ A;
(ii) ρ is symmetric: for any two elements a, b ∈ A, (a, b) ∈ ρ ⇒ (b, a) ∈ ρ, i.e., aρb ⇒ bρa;
(iii) ρ is transitive: (a, b) ∈ ρ, (b, c) ∈ ρ ⇒ (a, c) ∈ ρ, i.e., aρb, bρc ⇒ aρc.
For example, let A = {a, b, c} and let ρ = {(a, a), (a, b), (b, b), (b, a), (a, c), (c, a), (b, c), (c, b), (c, c)}. It is easy to verify that ρ is reflexive, as (a, a), (b, b), (c, c) ∈ ρ; symmetric, as (a, b), (b, a), (a, c), (c, a), (b, c), (c, b) ∈ ρ; and transitive. Hence ρ is an equivalence relation.
Ex 1.6.5 In ℤ, let xρy if 3x + 4y is divisible by 7. Verify whether ρ is an equivalence relation.
Solution: For every x ∈ ℤ, 3x + 4x = 7x is divisible by 7. Hence xρx for all x ∈ ℤ and so ρ is reflexive. By definition, xρy if 3x + 4y is divisible by 7, so xρy ⇒ 3x + 4y = 7k, where k is an integer. Now,
(3x + 4y) + (3y + 4x) = 7(x + y) = 7k₁, k₁ ∈ ℤ
or, 3y + 4x = 7(k₁ − k); k₁ − k ∈ ℤ.
Hence 3y + 4x is divisible by 7. Thus xρy ⇒ yρx, for x, y ∈ ℤ, so ρ is symmetric. Now,
xρy, yρz ⇒ 3x + 4y = 7k₁; 3y + 4z = 7k₂
⇒ 3x + 4z + 7y = 7(k₁ + k₂)
⇒ 3x + 4z = 7k, k ∈ ℤ
⇒ xρz; for x, y, z ∈ ℤ.
Hence ρ is transitive. Thus ρ is reflexive, symmetric and transitive, i.e., an equivalence relation.
Ex 1.6.6 In ℤ, define aρb iff a − b is divisible by a given positive integer n. Show that ρ is an equivalence relation.
Solution: (i) For all a ∈ ℤ, a − a = 0 is divisible by n. Hence aρa for all a ∈ ℤ, and so ρ is reflexive. Using the definition,
aρb ⇒ a − b is divisible by n
⇒ b − a is divisible by n
⇒ bρa; for a, b ∈ ℤ.


Hence ρ is symmetric. Now,
aρb, bρc ⇒ a − b, b − c are divisible by n
⇒ (a − b) + (b − c) is divisible by n
⇒ (a − c) is divisible by n
⇒ aρc; for a, b, c ∈ ℤ.
Hence ρ is transitive and consequently it is an equivalence relation.
Ex 1.6.7 In the set of all lines in a plane, let aρb if a ∥ b, and let aρb if a ⊥ b. Verify whether these relations are equivalence relations.
Solution: Any line is parallel to itself, so a ∥ a for every a. Hence aρa, and ρ is reflexive. Also,
aρb ⇒ a ∥ b ⇒ b ∥ a ⇒ bρa.
Hence ρ is symmetric. ρ is transitive, since
aρb, bρc ⇒ a ∥ b, b ∥ c ⇒ a ∥ c.
Hence ρ is reflexive, symmetric and transitive; in other words, it is an equivalence relation.
In the set of all lines in a plane, the relation aρb if a ⊥ b is only symmetric: no line is perpendicular to itself, and a ⊥ b, b ⊥ c give a ∥ c, not a ⊥ c.
Ex 1.6.8 Verify whether the following relations on the set ℤ are reflexive, symmetric or transitive: (i) xρy if |x − y| ≤ 3; (ii) xρy if x − y is a multiple of 6.
Solution: (i) Let a ∈ ℤ. Then |a − a| = 0 ≤ 3. Therefore aρa holds for all a ∈ ℤ, i.e., ρ is reflexive.
Let a, b ∈ ℤ and let aρb hold, i.e., |a − b| ≤ 3. Then |b − a| = |a − b| ≤ 3, i.e., bρa holds. Therefore ρ is symmetric.
Let a, b, c ∈ ℤ and let aρb, bρc hold. Then |a − b| ≤ 3 and |b − c| ≤ 3. Now,
|a − c| = |a − b + b − c| ≤ |a − b| + |b − c| ≤ 3 + 3 = 6,
i.e., |a − c| ≤ 3 does not hold for all a, b, c ∈ ℤ. Therefore ρ is not transitive. For example, 2ρ5 and 5ρ7 hold, but (2, 7) ∉ ρ as |2 − 7| = 5 ≰ 3.
(ii) Consider any integer x ∈ ℤ; then x − x = 0 = 0 · 6 shows that
xρx, for all x ∈ ℤ.
Hence ρ is reflexive. Now let xρy for some x, y ∈ ℤ. This means
x − y = 6k, k ∈ ℤ ⇒ y − x = (−k) · 6, where −k ∈ ℤ,
and this shows that yρx, whence ρ is symmetric. Finally, let x, y, z ∈ ℤ be such that xρy and yρz. Then x − y = 6k₁ and y − z = 6k₂ for some k₁, k₂ ∈ ℤ. Then
(x − y) + (y − z) = 6(k₁ + k₂)
⇒ x − z = 6(k₁ + k₂), where k₁ + k₂ ∈ ℤ.
This shows that xρz, whence ρ is transitive. Consequently ρ is an equivalence relation.
Ex 1.6.9 The relation ρ on the set ℕ × ℕ of ordered pairs of natural numbers is defined by (a, b)ρ(c, d) iff ad = bc. Prove that ρ is an equivalence relation.
Solution: (i) Let (a, b) ∈ ℕ × ℕ. Then (a, b)ρ(a, b) holds, as ab = ba; i.e., ρ is reflexive.
(ii) Let (a, b) and (c, d) be any elements of ℕ × ℕ and let (a, b)ρ(c, d) hold. Therefore,
ad = bc or, da = cb, i.e., (c, d)ρ(a, b)


holds. Hence ρ is symmetric.
(iii) Let (a, b), (c, d), (e, f) ∈ ℕ × ℕ and let (a, b)ρ(c, d), (c, d)ρ(e, f) hold. Then ad = bc and cf = de. Now,
(ad)(cf) = (bc)(de) or, af = be, as dc ≠ 0
⇒ (a, b)ρ(e, f) holds.
Therefore ρ is transitive. Hence ρ is an equivalence relation.
Ex 1.6.10 Show that the following relation ρ on ℤ is an equivalence relation: ρ = {(a, b) : a, b ∈ ℤ and a² + b² is a multiple of 2}.
[WBUT 08]
Solution: (i) Let a ∈ ℤ. Then a² + a² = 2a², which is a multiple of 2. Therefore aρa holds for all a ∈ ℤ. Thus ρ is reflexive.
(ii) Let a, b ∈ ℤ and let aρb hold. Then a² + b² is a multiple of 2, and hence b² + a² is a multiple of 2. Therefore bρa, and ρ is symmetric.
(iii) Let a, b, c ∈ ℤ and let aρb, bρc hold. Then a² + b² and b² + c² are both multiples of 2. Now,
a² + c² = (a² + b²) − (b² + c²) + 2c²
is a multiple of 2, i.e., aρc holds. Thus ρ is transitive. Hence ρ is an equivalence relation.
Ex 1.6.11 Let S be the set of all lines in 3-space. Let a relation ρ on the set S be defined by lρm if l, m ∈ S and l, m are coplanar. Examine whether ρ is an equivalence relation.
Solution: (i) Let l ∈ S. Then l is coplanar with itself. Therefore lρl holds for all l ∈ S, i.e., ρ is reflexive.
(ii) Let l, m ∈ S and let lρm hold. Then obviously mρl holds. That is, lρm ⇒ mρl. Therefore ρ is symmetric.
(iii) Let l, m, n ∈ S and let lρm, mρn both hold. Then l is coplanar with m, and m is coplanar with n. This does not always imply that l is coplanar with n. For example, let l lie on the xz-plane, take m as the x-axis and let n be a line on the xy-plane; for suitable choices, lρm and mρn hold, yet l and n lie on no common plane.
Thus ρ is not transitive, and hence ρ is not an equivalence relation.
Note 1.6.1 It may be noted that reflexivity, symmetry and transitivity are three independent properties of a relation, i.e., no two of them imply the third.
Elements of equivalence relation
In this section, some properties of the relations reflexive, symmetric and transitive are
presented.
Ex 1.6.12 Let A = {a₁, a₂, …, aₙ} be a finite set containing n elements. Find how many relations can be constructed on A, and how many of these are reflexive and symmetric.
Solution: Since A has n elements, A × A has n² elements. Hence 2^(n²) relations can be constructed on A, including the null relation and the relation A × A itself.
If a relation ρ on A is reflexive, then all n ordered pairs (aᵢ, aᵢ), i = 1, 2, …, n, must be in ρ. Each of the remaining n² − n = n(n − 1) ordered pairs (aᵢ, aⱼ), i ≠ j, may or may not be in ρ. Hence by the multiplication rule there are 2^(n(n−1)) reflexive relations on A.
To count the number of symmetric relations we write A × A as A₁ ∪ A₂, where
A₁ = {(aᵢ, aᵢ) : i = 1, 2, …, n} and A₂ = {(aᵢ, aⱼ) : i ≠ j; i, j = 1, 2, …, n}.
Thus every element of A × A is in exactly one of A₁, A₂. The set A₂ contains n(n − 1)/2 subsets of the form
{(aᵢ, aⱼ), (aⱼ, aᵢ)}, i ≠ j.

Relation

27

To construct a symmetric relation on A, for each such pair of A₂ we must take both of its ordered pairs or neither, and each ordered pair of A₁ may be taken or left out independently. Hence by the multiplication rule there are
2ⁿ · 2^(n(n−1)/2) = 2^(n(n+1)/2)
symmetric relations on A. Similarly, the number of relations which are both reflexive and symmetric is 2^(n(n−1)/2).
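For small n these counts can be confirmed by exhaustively enumerating every subset of A × A; the brute-force function below is a sketch for verification only (feasible only for n ≤ 3 or so, since there are 2^(n²) subsets):

```python
from itertools import product

def count_relations(n, predicate):
    """Count the relations on A = {0, ..., n-1}, i.e. the subsets of
    A x A, that satisfy the given predicate, by brute force."""
    A = range(n)
    pairs = list(product(A, A))
    total = 0
    for mask in range(2 ** len(pairs)):
        rho = {p for i, p in enumerate(pairs) if mask >> i & 1}
        if predicate(rho, A):
            total += 1
    return total

refl = lambda rho, A: all((a, a) in rho for a in A)
symm = lambda rho, A: all((b, a) in rho for (a, b) in rho)

print(count_relations(3, lambda r, A: True))   # 512 = 2**(3*3)
print(count_relations(3, refl))                # 64  = 2**(3*2)
print(count_relations(3, symm))                # 64  = 2**(3*4//2)
print(count_relations(3, lambda r, A: refl(r, A) and symm(r, A)))  # 8 = 2**(3*2//2)
```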
Ex 1.6.13 If R and S are equivalence relations on a set A, prove that R ∩ S is an equivalence relation on A.
Solution: Let R and S be equivalence relations on A. Then R ⊆ A × A and S ⊆ A × A. Hence R ∩ S ⊆ A × A, i.e., R ∩ S is a relation on A.
(i) Since R and S are reflexive, (a, a) ∈ R and (a, a) ∈ S for all a ∈ A. Thus (a, a) ∈ R ∩ S for all a ∈ A. Hence R ∩ S is reflexive.
(ii) Let (a, b) ∈ R ∩ S. Then (a, b) ∈ R and (a, b) ∈ S. Since R and S are symmetric, (b, a) ∈ R and (b, a) ∈ S. Therefore (b, a) ∈ R ∩ S. Hence R ∩ S is symmetric.
(iii) Let (a, b) ∈ R ∩ S and (b, c) ∈ R ∩ S. Then (a, b), (b, c) ∈ R and (a, b), (b, c) ∈ S. Since R is transitive, (a, b), (b, c) ∈ R ⇒ (a, c) ∈ R. Similarly, (a, c) ∈ S. Thus (a, c) ∈ R ∩ S, i.e., R ∩ S is transitive.
Hence R ∩ S is an equivalence relation. But the union of two equivalence relations is not necessarily an equivalence relation. For example, let A = {1, 2, 3} and let R = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1)}, S = {(1, 1), (2, 2), (3, 3), (2, 3), (3, 2)} be two equivalence relations on A. Then
R ∪ S = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1), (2, 3), (3, 2)}.
Here (1, 2), (2, 3) ∈ R ∪ S but (1, 3) ∉ R ∪ S, i.e., R ∪ S is not transitive and hence it is not an equivalence relation on A.
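The failure of transitivity for this particular R ∪ S is easy to confirm mechanically; a minimal Python sketch:

```python
# The two equivalence relations of the counterexample, on A = {1, 2, 3}.
R = {(1, 1), (2, 2), (3, 3), (1, 2), (2, 1)}
S = {(1, 1), (2, 2), (3, 3), (2, 3), (3, 2)}

union = R | S
transitive = all((a, d) in union
                 for (a, b) in union for (c, d) in union if b == c)
print(transitive)   # False: (1, 2) and (2, 3) are in R | S, but (1, 3) is not
```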
Theorem 1.6.1 If R and S are two relations from A into B, then
(a) if R ⊆ S, then R⁻¹ ⊆ S⁻¹;
(b) (R ∪ S)⁻¹ = R⁻¹ ∪ S⁻¹;
(c) (R ∩ S)⁻¹ = R⁻¹ ∩ S⁻¹;
(d) if R is reflexive, R⁻¹ is also reflexive;
(e) R is symmetric iff R = R⁻¹.
Ex 1.6.14 If R is an equivalence relation on a set A, then prove that R⁻¹ is also an equivalence relation on A.
Solution: Since R is an equivalence relation on A, R is reflexive, symmetric and transitive.
(i) Let a ∈ A. Then (a, a) ∈ R, and therefore (a, a) ∈ R⁻¹, i.e., R⁻¹ is reflexive.
(ii) Let (a, b) ∈ R⁻¹. Then (b, a) ∈ R, for a, b ∈ A,
⇒ (a, b) ∈ R, since R is symmetric
⇒ (b, a) ∈ R⁻¹.
Thus R⁻¹ is symmetric.
(iii) Let (a, b), (b, c) ∈ R⁻¹. Then (b, a), (c, b) ∈ R for a, b, c ∈ A.
⇒ (c, b), (b, a) ∈ R
⇒ (c, a) ∈ R, since R is transitive
⇒ (a, c) ∈ R⁻¹.
Therefore R⁻¹ is transitive. Hence R⁻¹ is an equivalence relation on A.


Antisymmetric relation
A relation ρ is said to be antisymmetric if aρb, bρa ⇒ a = b. For example,
(i) In ℝ, the relation ≤ is antisymmetric, since a ≤ b, b ≤ a ⇒ a = b.
(ii) In the set of all sets, the relation "is a subset of" is antisymmetric, for A ⊆ B, B ⊆ A ⇒ A = B.
(iii) Let F consist of all real-valued functions f(x) defined on [−1, 1], and let f ≤ g mean that f(x) ≤ g(x) for every x ∈ [−1, 1]. Then ≤ is antisymmetric on F.
(iv) The relation ρ defined on ℤ by aρb if and only if a is a divisor of b is not antisymmetric, since 2 | (−2) and (−2) | 2 but 2 ≠ −2. Note also that ρ contains pairs of elements x, y which are incomparable, in the sense that neither xρy nor yρx holds.
Partial ordering relation
A binary relation ≤ defined on a non-empty set A is said to be a partial ordering relation if ≤ is reflexive, antisymmetric and transitive. An alternative notation for specifying a partial ordering relation is:
(i) P1 (Reflexive): x ≤ x, for all x ∈ A.
(ii) P2 (Antisymmetric): x ≤ y and y ≤ x ⇒ x = y, for x, y ∈ A.
(iii) P3 (Transitive): x ≤ y and y ≤ z ⇒ x ≤ z, for x, y, z ∈ A.
If x ≤ y and x ≠ y, one writes x < y and says that x is less than, or properly contained in, y. The relation x ≤ y is also written y ≥ x, read "y contains x" (or "includes x"). Similarly, x < y is also written y > x. Strict inclusion is characterized by the anti-reflexive and transitive laws.
Digraph of a relation
A relation ρ on a finite set A can be represented by a diagram called a digraph or directed graph. Draw a dot for each element of A. Now join the dots corresponding to the elements aᵢ and aⱼ (aᵢ, aⱼ ∈ A) by an arrowed arc if and only if aᵢρaⱼ. In case aᵢρaᵢ for some aᵢ ∈ A, the arrowed arc from aᵢ comes back to itself and forms a loop. The resulting diagram of ρ is called a directed graph or digraph; the dots are called vertices and the arrowed arcs are called directed edges or arcs. Thus the ordered pair (A, ρ) is a directed graph of the relation ρ. Here two vertices aᵢ, aⱼ ∈ A are said to be adjacent if aᵢρaⱼ. For example, let A = {1, 2, 3, 4} and let a relation ρ on A be ρ = {(1, 1), (2, 2), (4, 4), (2, 3), (3, 2), (3, 4), (4, 1), (4, 2)}.

Figure 1.9: Directed graph of the relation ρ


The directed graph (A, ρ) is shown in Fig. 1.9. From the digraph representation of a relation one can test whether it is an equivalence relation or not. The following tests are to be performed:
(i) The relation is reflexive iff there is a loop on each vertex of the digraph.
(ii) The relation is symmetric iff whenever there is an arc from a vertex a to another vertex b, there is also an arc from b to a.
(iii) The relation is transitive iff whenever there is an arc from a vertex a to a vertex b and an arc from b to a vertex c, there is also an arc from a to c.
Ex 1.6.15 Find the relation ρ determined by Fig. 1.10.

Figure 1.10: Digraph on the vertices a, b, c, d

Solution: Since aᵢρaⱼ iff there is an edge from aᵢ to aⱼ,
ρ = {(a, a), (a, c), (b, c), (c, b), (c, c), (d, c), (d, d)}.
Matrix of a relation
A relation between two finite sets can also be represented by a matrix. Let A = {a₁, a₂, …, aₘ} and B = {b₁, b₂, …, bₙ}. The matrix for the relation ρ is denoted by M = [mᵢⱼ]ₘₓₙ, where
mᵢⱼ = 1, if (aᵢ, bⱼ) ∈ ρ,
mᵢⱼ = 0, if (aᵢ, bⱼ) ∉ ρ.
The matrix M is called the matrix of ρ. From this matrix one can check the properties of ρ.
Ex 1.6.16 Let A = {a, b, c} and B = {1, 2, 3, 4}. Consider a relation ρ from A into B given by ρ = {(a, 1), (a, 3), (b, 2), (b, 3), (b, 4), (c, 1), (c, 2), (c, 4)}. Then the matrix M is

        1  2  3  4
    a   1  0  1  0
M = b   0  1  1  1
    c   1  1  0  1

From the matrix M one can draw the digraph of the relation ρ and, conversely, from the digraph the matrix M can also be obtained.
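Building such a matrix from a relation is a one-line comprehension; relation_matrix below is an illustrative helper of our own, with rows and columns following the listed order of A and B:

```python
def relation_matrix(rho, A, B):
    """0/1 matrix with entry 1 iff (A[i], B[j]) is in rho."""
    return [[1 if (a, b) in rho else 0 for b in B] for a in A]

A = ["a", "b", "c"]
B = [1, 2, 3, 4]
rho = {("a", 1), ("a", 3), ("b", 2), ("b", 3), ("b", 4),
       ("c", 1), ("c", 2), ("c", 4)}

for row in relation_matrix(rho, A, B):
    print(row)
# [1, 0, 1, 0]
# [0, 1, 1, 1]
# [1, 1, 0, 1]
```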
Ex 1.6.17 Let A = {2, 4, 6} and let the relation ρ on A be given by the digraph shown in Fig. 1.11. Find the matrix M and the relation ρ.

Figure 1.11: Digraph of the relation ρ

Solution: The matrix M = [mᵢⱼ], where mᵢⱼ = 1 if (aᵢ, aⱼ) ∈ ρ and mᵢⱼ = 0 if (aᵢ, aⱼ) ∉ ρ. Therefore,

        2  4  6
    2   0  1  1
M = 4   1  1  0
    6   0  1  1

and the relation is ρ = {(2, 4), (2, 6), (4, 2), (4, 4), (6, 4), (6, 6)}.

1.7 Equivalence Class

Let ρ be an equivalence relation on a non-empty set A. Then for each a ∈ A, the elements x ∈ A satisfying xρa constitute a subset of A. This subset is called an equivalence class or equivalence set of a with respect to ρ. The equivalence class of a is denoted by cl(a), class a, (a), [a], Aₐ or Cₐ, i.e.,

[a] = {x : x ∈ A and (x, a) ∈ ρ, i.e., xρa} ⊆ A.    (1.12)

Again, the set of all equivalence classes of elements of A under the equivalence relation ρ on A is called the quotient set, denoted by A/ρ, i.e.,
A/ρ = {[a] : a ∈ A}.
For example, let A = {1, 2, 3, 4} and let ρ = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 4), (4, 3), (4, 4)} be an equivalence relation on A. This equivalence relation has the following equivalence classes:
[1] = {1, 2}, [2] = {1, 2}, [3] = {3, 4}, [4] = {3, 4}
and the quotient set is A/ρ = {[1], [2], [3], [4]} = {{1, 2}, {3, 4}}, since [1] = [2] and [3] = [4].
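Equivalence classes and the quotient set can be computed directly from definition (1.12); a small sketch using the example above, where equivalence_class is our own helper name:

```python
def equivalence_class(a, rho, A):
    """[a] = {x in A : (x, a) in rho}, as in (1.12)."""
    return {x for x in A if (x, a) in rho}

A = {1, 2, 3, 4}
rho = {(1, 1), (1, 2), (2, 1), (2, 2), (3, 3), (3, 4), (4, 3), (4, 4)}

# the distinct classes form the quotient set A/rho
quotient = {frozenset(equivalence_class(a, rho, A)) for a in A}
print(sorted(sorted(c) for c in quotient))   # [[1, 2], [3, 4]]
```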
Property 1.7.1 Given that ρ is an equivalence relation, it is reflexive. Therefore (a, a) ∈ ρ for all a ∈ A. Also,
[a] = {x : x ∈ A and (x, a) ∈ ρ}.
Therefore, from the definition, aρa ⇒ a ∈ [a] for all a ∈ A. Hence [a] ≠ ∅ for all a ∈ A, so [a] is a non-empty subset of A.
Property 1.7.2 Let ρ be an equivalence relation on the set A. If b ∈ [a], then [a] = [b], where a, b ∈ A.
Proof: Let b ∈ [a]. Then bρa holds. Let x be an arbitrary element of [b]. Then
x ∈ [b] and b ∈ [a] ⇒ xρb and bρa
⇒ xρa; by the transitive property
⇒ x ∈ [a].
Thus [b] ⊆ [a]. Similarly, it can be proved that [a] ⊆ [b]. Hence we arrive at the conclusion that [a] = [b].
Property 1.7.3 Two equivalence classes are either equal or disjoint.


Proof: If for any two classes [a] and [b] we have [a] ∩ [b] = ∅, then there is nothing to prove. If [a] ∩ [b] ≠ ∅, then let x ∈ [a] ∩ [b]. Then x ∈ [a] and x ∈ [b]. Therefore,
x ∈ [a], x ∈ [b] ⇒ xρa and xρb hold
⇒ aρx and xρb; by the symmetric property,
⇒ aρb; by the transitive property,
⇒ a ∈ [b].
Hence by the previous property, [a] = [b]. Thus for all a, b ∈ A, either [a] = [b] or [a] ∩ [b] = ∅, i.e., equivalence classes are either equal or disjoint.
Property 1.7.4 aρb if and only if a, b belong to the same equivalence class.
Proof: Let a, b ∈ [c] for some c ∈ A. Then by definition aρc and bρc. Hence aρc and cρb (by the symmetric property), so aρb (by the transitive property).
Conversely, let aρb. Then by definition b ∈ [a]. Also a ∈ [a] (since aρa). Hence a, b belong to the same class.
Property 1.7.5 Let ρ be an equivalence relation on the set A. Then A = ⋃_{a∈A} [a].
Proof: Let a ∈ A. Then a ∈ [a], therefore A ⊆ ⋃_{a∈A} [a]. Again, if X = ⋃_{a∈A} [a], then every element of X belongs to A. Therefore X ⊆ A, i.e., ⋃_{a∈A} [a] ⊆ A.
Hence A = ⋃_{a∈A} [a].

1.7.1 Partitions

Let ρ be an equivalence relation on a non-empty set S. Then the equivalence classes are each non-empty and pairwise disjoint, and the union of the family of classes is the set S.
Let S = S₁ ∪ S₂ ∪ …, where S₁, S₂, S₃, … are non-empty subsets of S. Precisely, a partition of S is a collection Π = {S₁, S₂, S₃, …} of non-empty subsets of S such that
(i) each x in S belongs to one of the Sᵢ;
(ii) the sets {Sᵢ} are mutually disjoint; that is, if Sᵢ ≠ Sⱼ then Sᵢ ∩ Sⱼ = ∅.
The set of all partitions of a set S is denoted by Π(S). The disjoint sets S₁, S₂, S₃, … are called cells or blocks. For example,
(i) the collection {{1, 3, 5}, {2, 4, 6, 8}, {7, 9}} of subsets of S = {1, 2, …, 8, 9} is a partition of S;
(ii) the collection {{1, 3, 5}, {2, 6}, {4, 8, 9}} of subsets of S = {1, 2, …, 8, 9} is not a partition of S, since 7 ∈ S does not belong to any of the subsets;
(iii) the collection {{1, 3, 5}, {2, 4, 6, 8}, {5, 7, 9}} of subsets of S = {1, 2, …, 8, 9} is not a partition of S, since {1, 3, 5} and {5, 7, 9} are not disjoint.

Other examples are:

(i) Let ℤ, ℤ⁻, ℤ⁺, ℤₑ, ℤₒ denote the set of integers, negative integers, positive integers, even integers and odd integers respectively. Then two partitions of ℤ are {ℤ⁻, {0}, ℤ⁺} and {ℤₑ, ℤₒ}.
(ii) Let ℤ be the set of integers. Consider the relation ρ = {(a, b) : (a − b) is divisible by 5}. It can be shown that ρ is an equivalence relation on ℤ. This relation partitions the set ℤ into five equivalence classes [a] = {x : xρa, i.e., x − a is divisible by 5}. Thus,
[0] = {x : x ∈ ℤ and xρ0}
    = {x : x − 0 is divisible by 5}
    = {x : x − 0 = 5k, k ∈ ℤ}
    = {…, −10, −5, 0, 5, 10, …}.
Similarly,
[1] = {x : x − 1 = 5k, k ∈ ℤ} = {…, −9, −4, 1, 6, 11, …}
[2] = {…, −8, −3, 2, 7, 12, …}
[3] = {…, −7, −2, 3, 8, 13, …}
[4] = {…, −6, −1, 4, 9, 14, …}.
It is observed that [0] ∪ [1] ∪ [2] ∪ [3] ∪ [4] = ℤ and any two of them are disjoint. Thus a partition of ℤ is {[0], [1], [2], [3], [4]}.
Ex 1.7.1 Find all partitions of S = {a, b, c}.
Solution: Since S = {a, b, c}, the set of partitions Π(S) is given by
Π(S) = { {{a}, {b}, {c}}, {{a, b}, {c}}, {{a}, {b, c}}, {{a, c}, {b}}, {{a, b, c}} }.
Ex 1.7.2 Determine whether the sets ∅, {1, 3, 5, 8}, {2, 4, 6, 9}, {5, 9, 11, 12} form a partition of the set S = {1, 2, 3, …, 12}.
Solution: Let S₁ = ∅, S₂ = {1, 3, 5, 8}, S₃ = {2, 4, 6, 9}, S₄ = {5, 9, 11, 12}. Here S₁ = ∅, which is not permitted in a partition; moreover S₂ ∩ S₄ = {5} ≠ ∅, and the elements 7, 10 ∈ S belong to none of the subsets. Hence the given subsets
∅, {1, 3, 5, 8}, {2, 4, 6, 9}, {5, 9, 11, 12}
do not form a partition of S.
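The three partition conditions can be checked mechanically; is_partition below is an illustrative helper of our own, applied to the sets of this example:

```python
def is_partition(blocks, S):
    """Blocks must be non-empty, pairwise disjoint, and cover S."""
    if any(not b for b in blocks):
        return False                 # an empty block is not allowed
    union = set()
    for b in blocks:
        if union & b:                # overlaps an earlier block
            return False
        union |= b
    return union == S                # every element of S must be covered

S = set(range(1, 13))
blocks = [set(), {1, 3, 5, 8}, {2, 4, 6, 9}, {5, 9, 11, 12}]
print(is_partition(blocks, S))       # False
```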
Theorem 1.7.1 (Fundamental theorem of equivalence relations) An equivalence relation ρ on a set A gives a partition of A into mutually disjoint equivalence classes, such that a, b belong to the same class if and only if aρb.
Proof: First, for a given a ∈ A, define the class [a] = {x : aρx, x ∈ A}. Let P be the set of all distinct equivalence classes in A. If a ∈ A, then a ∈ [a] and [a] ∈ P. Hence a belongs to the union of all members of P, and so the union of all members of P is A. Also the members of P are pairwise disjoint. Hence P is a partition of A. Finally, for two elements a, b ∈ A, it can be shown that aρb if and only if they belong to the same equivalence class.
Converse theorem: A partition P of a set A gives an equivalence relation for which the members of P are the equivalence classes.
We define a relation ρ in A by aρb if and only if a, b belong to the same class of P. Clearly a, a belong to the same class of the partition P. Hence aρa for all a ∈ A, and so ρ is reflexive. Let


a, b ∈ A and let aρb hold. Now,
aρb ⇒ a, b belong to the same class
⇒ b, a belong to the same class
⇒ bρa.
Hence ρ is symmetric. Let a, b, c ∈ A and let aρb, bρc hold. Then,
aρb, bρc ⇒ a, b and b, c belong to the same class.
Since b belongs to both classes, these two classes are the same subset of the partition P, and consequently a, c belong to one and the same subset of P. Thus
a, c belong to the same class ⇒ aρc.
Hence ρ is transitive. Therefore ρ is an equivalence relation on A, and the equivalence classes are just the classes of P. This completes the proof.
Ex 1.7.3 In ℤ, define aρb iff a − b is divisible by 7. Show that ρ is an equivalence relation. Hence find the corresponding partition of ℤ.
Solution: (i) For all a ∈ ℤ, a − a = 0 is divisible by 7. Hence aρa for all a ∈ ℤ, and so ρ is reflexive. Using the definition,
aρb ⇒ a − b is divisible by 7
⇒ b − a is divisible by 7
⇒ bρa; for a, b ∈ ℤ.
Hence ρ is symmetric. Now,
aρb, bρc ⇒ a − b, b − c are divisible by 7
⇒ (a − b) + (b − c) is divisible by 7
⇒ (a − c) is divisible by 7
⇒ aρc; for a, b, c ∈ ℤ.
Hence ρ is transitive and consequently it is an equivalence relation. For this relation, the equivalence classes are
Eₚ = {7k + p : k ∈ ℤ}, p = 0, 1, 2, 3, 4, 5, 6.
Therefore, the distinct equivalence classes can be written as
E₀ = {…, −14, −7, 0, 7, 14, …};  E₁ = {…, −13, −6, 1, 8, 15, …}
E₂ = {…, −12, −5, 2, 9, 16, …};  E₃ = {…, −11, −4, 3, 10, 17, …}
E₄ = {…, −10, −3, 4, 11, 18, …}; E₅ = {…, −9, −2, 5, 12, 19, …}
E₆ = {…, −8, −1, 6, 13, 20, …}.
Now we see that ℤ = E₀ ∪ E₁ ∪ ⋯ ∪ E₆ and Eᵢ ∩ Eⱼ = ∅ for i ≠ j, i.e., the classes are mutually disjoint. Therefore, {E₀, E₁, E₂, E₃, E₄, E₅, E₆} is a partition of ℤ.
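The classes E₀, …, E₆ can be generated programmatically over a finite window of ℤ; a sketch in which the window [−21, 21] is an arbitrary choice of ours:

```python
def residue_classes(n, lo=-21, hi=21):
    """E_p = {nk + p : k in Z}, shown on the finite window [lo, hi]."""
    return {p: sorted(x for x in range(lo, hi + 1) if (x - p) % n == 0)
            for p in range(n)}

classes = residue_classes(7)
print(classes[0])   # [-21, -14, -7, 0, 7, 14, 21]
print(classes[1])   # [-20, -13, -6, 1, 8, 15]
print(classes[6])   # [-15, -8, -1, 6, 13, 20]
```

Every integer in the window lands in exactly one class, illustrating that the Eₚ partition ℤ.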
Ex 1.7.4 Consider the equivalence relation ρ on ℤ given by xρy if and only if x² − y² is a multiple of 5. Find the corresponding partition of ℤ.


Solution: If x ∈ ℤ, then we have x = 5k + r, where 0 ≤ r < 5. Therefore x² = 25k² + 10kr + r²
⇒ x² ≡ r² (mod 5). For
r = 0, r² ≡ 0 (mod 5);  r = 1, r² ≡ 1 (mod 5);
r = 2, r² ≡ 4 (mod 5);  r = 3, r² = 9 ≡ 4 (mod 5);
r = 4, r² = 16 ≡ 1 (mod 5).
Hence x² ≡ 0², 1² or 2² (mod 5). Consequently, there are only three congruence classes [0], [1] and [2]. Here,
[0] = {a ∈ ℤ : a² ≡ 0 (mod 5)} = {5k : k ∈ ℤ} = A₀ (say),
[1] = {a ∈ ℤ : a² ≡ 1 (mod 5)} = {5k + 1 : k ∈ ℤ} ∪ {5k + 4 : k ∈ ℤ} = A₁ (say),
[2] = {a ∈ ℤ : a² ≡ 4 (mod 5)} = {5k + 2 : k ∈ ℤ} ∪ {5k + 3 : k ∈ ℤ} = A₂ (say).
Hence {A₀, A₁, A₂} is the corresponding partition of ℤ.
Theorem 1.7.2 The set ℤ/(n) of residue classes is a finite set of order n.
We shall first show that a − b is divisible by n iff a, b leave the same remainder when divided by n. Let a = np + r₁, where p, r₁ are integers and 0 ≤ r₁ < n, and let b = nq + r₂, where q, r₂ are integers and 0 ≤ r₂ < n. Hence,
a − b = n(p − q) + (r₁ − r₂).    (1.13)
If r₁ = r₂, this relation shows that a − b is divisible by n. Conversely, if a − b is divisible by n, then (1.13) shows that r₁ − r₂ is divisible by n. But 0 ≤ |r₁ − r₂| < n. Hence,
r₁ − r₂ = 0, i.e., r₁ = r₂.
Hence the result is proved. Now, all the possible remainders are 0, 1, 2, …, (n − 1). Accordingly we get n distinct classes in ℤ/(n). These are denoted by (0), (1), (2), …, (n − 1) (called class 0, class 1, etc.). Hence ℤ/(n) = {(0), (1), (2), …, (n − 1)} and it is a finite set of order n.

1.8 Poset

Let ρ be a relation on a set A satisfying the following three properties:
(i) (Reflexive) For any a ∈ A, we have aρa.
(ii) (Antisymmetric) If aρb and bρa, then a = b.
(iii) (Transitive) If aρb and bρc, then aρc.
Then ρ is called a partial order or, simply, an order relation, and is said to define a partial ordering of A. The set A with the partial order is called a partially ordered set or, simply, an ordered set or poset. Thus a non-empty set A together with a partial ordering relation ρ on A is called a partially ordered set (poset), usually denoted by (A, ρ). For example,
(i) Consider the set ℕ of positive integers. Here mρn means m divides n, written m|n, i.e., there exists an integer p such that mp = n, for m, n ∈ ℕ. For example, 2|4, 3|12, 7|21, and so on. Thus the relation of divisibility is a partial ordering and (ℕ, ρ) is a poset.
(ii) Let A be the set of all positive divisors of 72; then (A, ρ) is a poset, where mρn means m is a divisor of n, for m, n ∈ A.


(iii) Let P be the set of all real-valued continuous functions defined on [0, 1]. Let f, g ∈ P and let f ≤ g mean that f(x) ≤ g(x) for all x ∈ [0, 1]. Then (P, ≤) is a poset.
(iv) Let U be a non-empty universal set, i.e., a collection of sets, and let A be the set of all proper subsets of U. The relation P ≤ Q means set inclusion, i.e., P ⊆ Q, for P, Q ∈ A, and is a partial ordering. Specifically, P ⊆ P for any set P; if P ⊆ Q and Q ⊆ P then P = Q; and if P ⊆ Q and Q ⊆ R then P ⊆ R. Therefore (A, ⊆) is a poset.
(v) (ℝ, ≤) is a poset, where m ≤ n means m is less than or equal to n, for m, n ∈ ℝ.
Ex 1.8.1 Let A = {0, 1} and let α = (a₁, a₂, a₃), β = (b₁, b₂, b₃) ∈ A³. Define a relation ρ on A³ by αρβ if and only if aᵢ ≤ bᵢ, for i = 1, 2, 3. Prove that (A³, ρ) is a poset.
Solution: Here A = {0, 1} and the elements of A³ are the triples (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1). The relation ρ is defined by αρβ if and only if aᵢ ≤ bᵢ, for i = 1, 2, 3. Now,
αρα, for all α = (a₁, a₂, a₃) ∈ A³, since aᵢ ≤ aᵢ.
Hence ρ is reflexive. Let us now assume that α, β ∈ A³ and that αρβ, βρα both hold. Then
aᵢ ≤ bᵢ and bᵢ ≤ aᵢ ⇒ aᵢ = bᵢ; i = 1, 2, 3
⇒ α = β.
Hence ρ is antisymmetric. Let α = (a₁, a₂, a₃), β = (b₁, b₂, b₃), γ = (c₁, c₂, c₃) ∈ A³ and let αρβ and βργ both hold. Then
aᵢ ≤ bᵢ and bᵢ ≤ cᵢ ⇒ aᵢ ≤ cᵢ; for i = 1, 2, 3,
or, αργ; for α, β, γ ∈ A³,
and so the relation ρ is transitive. As ρ is reflexive, antisymmetric and transitive, (A³, ρ) is a poset.
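Since A³ has only eight elements, the three poset axioms can also be verified exhaustively; a brute-force sketch:

```python
from itertools import product

A3 = list(product((0, 1), repeat=3))        # the 8 triples over A = {0, 1}
leq = lambda a, b: all(x <= y for x, y in zip(a, b))   # componentwise order

reflexive     = all(leq(a, a) for a in A3)
antisymmetric = all(a == b for a in A3 for b in A3 if leq(a, b) and leq(b, a))
transitive    = all(leq(a, c) for a in A3 for b in A3 for c in A3
                    if leq(a, b) and leq(b, c))
print(reflexive, antisymmetric, transitive)   # True True True
```

Note that (A³, ρ) is only partially ordered: e.g. (1, 0, 0) and (0, 1, 1) are incomparable.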

1.8.1 Dual Order

Let ≤ be any partial ordering of a set S, so that (S, ≤) is a poset. Let ≽ be the binary relation on S such that for a, b ∈ S, a ≽ b if and only if b ≤ a. Then the relation ≽ is called the converse of the partial ordering relation ≤ and is denoted by ≥. It may easily be seen that (S, ≥) is also a poset. It follows that we can replace the relation ≤ in any theorem about posets by the relation ≥ throughout without affecting its truth. This is known as the principle of duality. The duality principle applies to algebra, to projective geometry and to logic.
Ex 1.8.2 Let (A, ≤) be a poset. Define a relation ≥ on A by a ≥ b if and only if b ≤ a, for a, b ∈ A. Show that (A, ≥) is a poset.
Solution: The relation ≥ on A is defined by a ≥ b if and only if b ≤ a, for a, b ∈ A.
(i) Since a ≤ a for all a ∈ A, we have a ≥ a for all a ∈ A, and hence ≥ is reflexive.
(ii) Let a, b ∈ A be such that a ≥ b and b ≥ a. Then b ≤ a and a ≤ b. Therefore b = a, as ≤ is antisymmetric. Therefore a ≥ b, b ≥ a ⇒ a = b, for a, b ∈ A. Hence ≥ is antisymmetric.
(iii) Let a, b, c ∈ A be such that a ≥ b and b ≥ c. Then b ≤ a and c ≤ b, i.e., c ≤ b and b ≤ a. This implies c ≤ a, since ≤ is transitive, i.e., a ≥ c. Therefore a ≥ b, b ≥ c ⇒ a ≥ c, for a, b, c ∈ A. Hence ≥ is transitive.
As ≥ is reflexive, antisymmetric and transitive, (A, ≥) is a poset.

1.8.2 Chain

Let (S, ≤) be a poset in which, given any x, y ∈ S, either x ≤ y or y ≤ x. A poset satisfying this condition is said to be simply, totally or linearly ordered and is called a chain. In other words, of any two distinct elements in a chain, one is less and the other greater. A subset of S is called an antichain if no two distinct elements in the subset are related. A poset (S, ≤) is called a totally ordered set or simply an ordered set if S is a chain, and in this case the binary relation ≤ is called a total ordering relation.
(i) Any subset S of a poset P is itself a poset under the same order relation (restricted to S).
(ii) Every subset of a linearly ordered set S is linearly ordered, i.e., any subset of a chain is a chain.
(iii) Although an ordered set S may not be linearly ordered, it is still possible for a subset A of S to be linearly ordered.
We frequently refer to the number of elements in a chain as the length of the chain. Consider the following examples:
(i) Consider the set ℕ of positive integers ordered by divisibility. Then 21 and 7 are comparable, since 7|21. On the other hand, 3 and 5 are non-comparable, since neither 3|5 nor 5|3. Thus ℕ is not linearly ordered by divisibility. Observe that A = {2, 6, 12, 36} is a linearly ordered subset of ℕ, since 2|6, 6|12 and 12|36.
(ii) The set ℕ of positive integers with the usual order ≤ is linearly ordered, and hence every subset of ℕ is also linearly ordered.
(iii) The power set P(A) of a set A with two or more elements is not linearly ordered by set inclusion. For instance, suppose a, b ∈ A. Then {a} and {b} are non-comparable. Observe that the empty set ∅, {a} and A do form a linearly ordered subset of P(A), since ∅ ⊆ {a} ⊆ A. Similarly, ∅, {b} and A form a linearly ordered subset of P(A).

1.8.3 Universal Bounds

In any poset P = (S, ≤), the elements O and I of S, when they exist, are called the universal bounds of P if for any element x ∈ S we have
O ≤ x and x ≤ I, i.e., O ≤ x ≤ I, for all x ∈ S.    (1.14)
We call the elements O and I the least element and the greatest element of S.
Lemma: A given poset (S, ≤) can have at most one least element and at most one greatest element.
Proof: Let O and O′ both be universal lower bounds of (S, ≤). Then, since O is a universal lower bound, we have O ≤ O′, and since O′ is a universal lower bound, we have O′ ≤ O. Hence by the antisymmetry property P2, we have O = O′, and similarly I = I′.
Posets need not have any universal bounds. Thus, under the usual relation of inequality, the real numbers form a poset (ℝ, ≤) which has no universal bounds (unless −∞ and +∞ are adjoined to form the extended reals).

1.8.4 Covering Relation

Let S be a partially ordered set, and suppose a, b ∈ S. We say that a is an immediate predecessor of b, or that b is an immediate successor of a, or that b is a cover of a, written a ≪ b, if a < b (i.e., a ≤ b and a ≠ b) and there is no element c ∈ S such that a < c < b.
For example, (ℕ, ≤) is a poset, where m ≤ n means m|n, for m, n ∈ ℕ. Therefore, in (ℕ, ≤), 6 covers 2 and 10 covers 2, but 8 does not cover 2, as 2 < 4 < 8, although 2 < 8.
Suppose S is a finite partially ordered set. Then the order on S is completely known once we know all pairs a, b ∈ S such that a ≪ b, that is, once we know the relation ≪ on S. This follows from the fact that x < y if and only if x ≪ y or there exist elements a₁, a₂, …, aₘ in S such that
x ≪ a₁ ≪ a₂ ≪ ⋯ ≪ aₘ ≪ y.
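For a finite divisibility poset, the covering relation can be computed directly from the definition (a ≪ b iff a < b with no element of S strictly between); a sketch for the poset of divisors of 12, with helper names of our own:

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def covers(S):
    """Pairs (a, b) with a < b in the divisibility order on S
    and no element of S strictly between them."""
    less = lambda a, b: a != b and b % a == 0
    return sorted((a, b) for a in S for b in S
                  if less(a, b) and not any(less(a, c) and less(c, b) for c in S))

print(covers(divisors(12)))
# [(1, 2), (1, 3), (2, 4), (2, 6), (3, 6), (4, 12), (6, 12)]
```

These pairs are exactly the edges one draws in the Hasse diagram of the poset.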
Hasse diagram
The Hasse diagram of a finite partially ordered set S is a directed graph whose vertices are the elements of S and in which there is a directed edge from a to b whenever a ≪ b in S.
A convenient way of displaying the ordering relation among the elements of an ordered set is by means of a graph whose vertices represent the elements of the set. Thus we define a graph whose vertices are the different elements a, b, c, … of S, in which a and b are joined by a segment if and only if a covers b or b covers a. If the graph is drawn so that whenever a covers b the vertex a is at a higher level than the vertex b, then the graph is called the Hasse diagram of S. The Hasse diagram of a poset S is a picture of S; hence it is very useful in describing types of elements of S. Sometimes we define a partially ordered set by simply presenting its Hasse diagram; note that Hasse diagrams need not be connected.
Ex 1.8.3 Let A = {a, b, c, d, e}. The diagram in Fig. 1.12 defines a partial order on A in
the natural way. That is, d ≤ b, d ≤ a, e ≤ c, and so on.

Figure 1.12: Hasse diagram
Ex 1.8.4 Let S = {a, b, c}. Then the power set P(S), i.e., the set of all subsets of S, has the
elements ∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}. The Hasse diagram of this poset is
shown in Fig. 1.13.

Figure 1.13: Hasse diagram
Ex 1.8.5 Let S be the set of all positive divisors of 12, i.e., S = {1, 2, 3, 4, 6, 12}. Here 2
covers 1, 4 covers 2, 12 covers 4, 3 covers 1, 6 covers 3, 12 covers 6, and 6 covers 2, but 12 does not
cover 2, as 2 < 6 < 12. The covering diagram of this poset (S, ≤) is given in Fig. 1.14.

Figure 1.14: Hasse diagram
Ex 1.8.6 Let S be the set of all positive divisors of 30, i.e., S = {1, 2, 3, 5, 6, 10, 15, 30}.
The covering diagram of this poset (S, ≤) is given in Fig. 1.15.

Figure 1.15: Hasse diagram
Ex 1.8.7 A partition of a positive integer m is a set of positive integers whose sum is m.
For instance, there are seven partitions of m = 5, as follows:
5, 3-2, 2-2-1, 1-1-1-1-1, 4-1, 3-1-1, 2-1-1-1.
We order the partitions of an integer m as follows: a partition P₁ precedes a partition P₂ if
the integers in P₁ can be added to obtain the integers in P₂ or, equivalently, if the integers in
P₂ can be further subdivided to obtain the integers in P₁. For example, 2-2-1 precedes
3-2, as 2 + 1 = 3. On the other hand, 3-1-1 and 2-2-1 are non-comparable. Fig.
1.16 gives the Hasse diagram of the partitions of m = 5.

Figure 1.16: Hasse diagram
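The count of seven partitions can be checked by enumeration. The following sketch (the recursion is mine, not the book's) lists the partitions of m as non-increasing tuples:

```python
# Enumerate partitions of m into parts of size at most `largest`.
def partitions(m, largest=None):
    """All partitions of m as non-increasing tuples of positive integers."""
    if m == 0:
        return [()]
    if largest is None:
        largest = m
    result = []
    for part in range(min(m, largest), 0, -1):
        for rest in partitions(m - part, part):
            result.append((part,) + rest)
    return result

print(len(partitions(5)), partitions(5)[:3])   # → 7 [(5,), (4, 1), (3, 2)]
```

Restricting each recursive call to parts no larger than the previous one guarantees that every partition is produced exactly once.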

1.8.5 Maximal and Minimal Elements

Let S be a partially ordered set. An element a of a poset (S, ≤) is minimal if no other element
of S strictly precedes (is less than) a. Similarly, an element b is called a maximal element if
no element of S strictly succeeds (is larger than) b. For example,
(i) (ℕ, ≤) is a poset, where m ≤ n means m | n, for m, n ∈ ℕ. This poset (ℕ, ≤) contains
no greatest element and no maximal element. The least element is 1, and 1 is the only
minimal element.
(ii) Let U = {1, 2, 3} and let A be the set of all non-empty proper subsets of U. Then
(A, ≤) is a poset, where P ≤ Q means P is a subset of Q, i.e., P ⊆ Q, for P, Q ∈ A.
This poset (A, ≤) contains no greatest element and no least element. The three minimal
elements are {1}, {2}, {3} and the three maximal elements are {1, 2}, {2, 3}, {1, 3}.


Geometrically speaking, a is a minimal element if no edge enters a (from below), and b is a
maximal element if no edge leaves b (in the upward direction). An element a ∈ S is called
a first element if a ≤ x for every element x ∈ S, i.e., if a precedes every other element in
S. Similarly, an element b in S is called a last element if y ≤ b for every y ∈ S, i.e., if b
succeeds every other element in S. Note the following:
(i) If S is infinite, then S may have no minimal and no maximal element. For instance,
the set ℤ of integers with the usual order has no minimal and no maximal element.
(ii) If S is finite, then S must have at least one minimal element and at least one maximal
element.
(iii) S can have more than one minimal and more than one maximal element. For example,
let X = {1, 2, 3} and S = P(X) − {∅, X}, with A ≤ B ⟺ A ⊆ B for A, B ∈ S. Here {1}, {2}, {3}
are minimal in the poset (S, ≤), and {1, 2}, {2, 3}, {1, 3} are maximal in the poset (S, ≤).
(iv) S can have at most one first element, which must be a minimal element, and S can
have at most one last element, which must be a maximal element. Generally speaking,
S may have neither a first nor a last element, even when S is finite.
(v) The least element in a poset is a minimal element, and the greatest element in a
poset is a maximal element, but the converse is not true. Consider the poset
whose Hasse diagram is given in Fig. 1.17: a, b, e are minimal elements and j, k are
maximal elements. Here f covers c, but f does not cover a.

Figure 1.17: Hasse diagram
(vi) Let A = {a, b, c, d, e}. The diagram in Fig. 1.12 defines a partial order on A in the
natural way. That is, d ≤ b, d ≤ a, e ≤ c, and so on. A has two minimal elements, d
and e, and neither is a first element. A has only one maximal element, a, which is also
a last element.
(vii) Let A = {1, 2, 3, 4, 6, 8, 9, 12, 18, 24} be ordered by the relation x divides y. The Hasse
diagram is given in Fig. 1.18. Unlike rooted trees, the direction of a line in the diagram
of a poset is always upward. A has two maximal elements, 18 and 24, and neither is a
last element. A has only one minimal element, 1, which is also a first element.

Figure 1.18: Hasse diagram

Figure 1.19: Hasse diagram


(viii) The diagram of a finite linearly ordered set, i.e., a finite chain, consists simply of one
path. For example, Fig. 1.19 shows the diagram of a chain with five elements. The
chain has one minimal element, x, which is a first element, and one maximal element,
v, which is a last element.
(ix) Let A be any non-empty set and let P(A) be the power set of A, ordered by set
inclusion. Then the empty set ∅ is a first element of P(A) since, for any set X, we
have ∅ ⊆ X. Moreover, A is a last element of P(A) since every element Y of P(A) is,
by definition, a subset of A, that is, Y ⊆ A.
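Minimal and maximal elements of a small finite poset can be found mechanically. The sketch below (the helper names are mine) checks example (ii) above, the non-empty proper subsets of {1, 2, 3} under inclusion:

```python
# Minimal elements: nothing strictly precedes them; maximal: nothing strictly succeeds.
def minimal(poset, leq):
    return [a for a in poset if not any(leq(x, a) and x != a for x in poset)]

def maximal(poset, leq):
    return [b for b in poset if not any(leq(b, x) and x != b for x in poset)]

S = [frozenset(s) for s in ({1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3})]
issub = lambda a, b: a <= b   # set inclusion as the partial order
print(len(minimal(S, issub)), len(maximal(S, issub)))   # → 3 3
```

As the text states, this poset has three minimal elements ({1}, {2}, {3}) and three maximal elements ({1, 2}, {1, 3}, {2, 3}), and therefore no first or last element.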

1.8.6 Supremum and Infimum

Let A be a subset of a partially ordered set (S, ≤). An element M ∈ S is called an upper
bound of the subset A if M succeeds every element of A, i.e., if, for every x ∈ A, we have
x ≤ M. If an upper bound of A precedes every other upper bound of A, then it is called the
supremum or least upper bound of A; it is denoted by sup(A), and we write sup(A) = M.
In particular, let (S, ≤) be any poset and let a, b ∈ S be given. Then an element d ∈ S is
called the glb or meet of a and b when

d ≤ a, d ≤ b and x ≤ a, x ≤ b ⟹ x ≤ d,    (1.15)

and we write d = a ∧ b.
An element m ∈ S is called a lower bound of a subset A of a poset S if m precedes
every element of A, i.e., if, for every y ∈ A, we have m ≤ y. If a lower bound of A succeeds
every other lower bound of A, then it is called the infimum or greatest lower bound of A;
it is denoted by inf(A), and we write inf(A) = m.
Dually, an element s ∈ S is called the lub or join of a and b when

a ≤ s, b ≤ s and a ≤ x, b ≤ x ⟹ s ≤ x.    (1.16)

In this case, we write s = a ∨ b.


Below are some examples:
(i) Let S = {a, b, c, d, e, f} be ordered as pictured in Fig. 1.20, and let A = {b, c, d}.

Figure 1.20: Hasse diagram

The upper bounds of A are e and f, since they succeed every element in A. The lower bounds
of A are a and b, since only a and b precede every element of A. Note that e and f
are non-comparable; hence sup(A) does not exist. However, b succeeds a, hence
inf(A) = b.


(ii) Let (ℕ, ≤) be a poset ordered by divisibility, where m ≤ n means m | n, for
m, n ∈ ℕ. The greatest common divisor of m and n in ℕ, denoted by gcd(m, n), is
the largest integer which divides m and n. The least common multiple of m and n,
denoted by lcm[m, n], is the smallest integer divisible by both m and n. From number
theory, every common divisor of m and n divides gcd(m, n), and lcm[m, n] divides every
common multiple of m and n. Thus
gcd(m, n) = inf{m, n} and lcm[m, n] = sup{m, n}.
In other words, inf{m, n} and sup{m, n} do exist for every pair of elements of ℕ
ordered by divisibility.
(iii) For any positive integer m, we will let Dm denote the set of divisors of m ordered
by divisibility. The Hasse diagram of D36 = {1, 2, 3, 4, 6, 9, 12, 18, 36} appears in Fig.
1.21. Again, inf{m, n} = gcd(m, n) and sup{m, n} = lcm[m, n] exist for any pair m, n
in Dm.

Figure 1.21: Hasse diagram
(iv) Let X be a non-empty set and P(X) be the power set of X. Then (P(X), ⊆) is a
poset, where A ≤ B means A is a subset of B, i.e., A ⊆ B, for A, B ∈ P(X). In this
poset (P(X), ⊆), the lub of A and B is the union A ∪ B, and the glb of A
and B is the intersection A ∩ B.
It is important to note that if the elements a, b in a poset (S, ≤) have an upper bound (a
lower bound), they need not have a least upper bound (a greatest lower bound). For the
poset whose Hasse diagram is shown in Fig. 1.22, the set A = {e, d, g} has the four elements
a, b, c, d as upper bounds, and d is the lub of A. But the subset A′ = {a, f, c} has no upper
bound.

Figure 1.22: Hasse diagram
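Example (ii) above — gcd as infimum and lcm as supremum under divisibility — can be checked numerically. A minimal sketch (the `lcm` helper is mine; Python's standard library only guarantees `math.gcd` in older versions):

```python
# Check that gcd/lcm behave as inf/sup for one pair in the divisibility order.
from math import gcd

def lcm(m, n):
    return m * n // gcd(m, n)

m, n = 12, 18
g, l = gcd(m, n), lcm(m, n)
assert m % g == 0 and n % g == 0        # g is a common divisor (lower bound)
assert l % m == 0 and l % n == 0        # l is a common multiple (upper bound)
for d in range(1, 100):
    if m % d == 0 and n % d == 0:
        assert g % d == 0               # every common divisor divides g
    if d % m == 0 and d % n == 0:
        assert d % l == 0               # every common multiple is divided by l
print(g, l)   # → 6 36
```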
Theorem 1.8.1 Let (A, ≤) be a poset and a, b ∈ A. Then any one of the relations (i) a ≤ b,
(ii) a ∧ b = a and (iii) a ∨ b = b implies the other two.
Proof: Let (A, ≤) be a poset, a, b ∈ A, and suppose a ≤ b. Then a ≤ a and a ≤ b, which
means that a is a lower bound of a and b. Let m be any lower bound of a and b; then, by
definition, m ≤ a, m ≤ b. Since a is a lower bound of a, b and any lower bound m satisfies
m ≤ a, a is the greatest lower bound of a, b, i.e., a = a ∧ b; consequently, (i) ⟹ (ii).
Let the relation (ii), i.e., a ∧ b = a, hold. Then, by definition, a is the glb of a and b.
Hence a ≤ a, a ≤ b. Also, we have b ≤ b. Now, a ≤ b and b ≤ b give that b is an upper bound
of a and b. Let n be any upper bound of a and b; then a ≤ n and b ≤ n. Since b is an upper
bound of a and b, and b ≤ n for any upper bound n, b is the lub of a, b, i.e., a ∨ b = b;
consequently, (ii) ⟹ (iii).
Let the relation (iii), i.e., a ∨ b = b, hold. Then, by definition, b is the lub of a and b.
As b is an upper bound of a, b, we have a ≤ b; consequently, (iii) ⟹ (i).
By the transitivity of the implication ⟹, it follows that (i) ⟹ (ii) and (iii); (ii) ⟹ (iii) and (i);
(iii) ⟹ (i) and (ii). Hence the theorem.

1.9 Lattices

Let L be a non-empty set closed under two binary operations called meet and join, denoted
respectively by ∧ and ∨. Then L is called a lattice if the following axioms hold, where a, b, c
are elements in L:
(i) (Commutative law): a ∧ b = b ∧ a; a ∨ b = b ∨ a.
(ii) (Associative law): (a ∧ b) ∧ c = a ∧ (b ∧ c); (a ∨ b) ∨ c = a ∨ (b ∨ c).
(iii) (Absorption law): a ∧ (a ∨ b) = a; a ∨ (a ∧ b) = a.
We will sometimes denote the lattice by (L, ∧, ∨) when we want to show which operations
are involved. A chain (L, ≤) is a lattice, since lub(a, b) = b and glb(a, b) = a when a ≤ b, and
lub(a, b) = a and glb(a, b) = b when b ≤ a. For example, Fig. 1.19 shows the diagram of a
chain with five elements, which is a lattice. Below are some examples:
(i) (ℕ, ≤) is a poset, where m ≤ n means m | n, for m, n ∈ ℕ. This poset (ℕ, ≤) is a lattice,
where for any two elements m, n ∈ ℕ, m ∨ n = lcm(m, n) and m ∧ n = gcd(m, n).
(ii) Let X be a non-empty set and P(X) be the power set of X. Then (P(X), ⊆) is a
poset, where A ≤ B means A is a subset of B, i.e., A ⊆ B, for A, B ∈ P(X). This
poset (P(X), ⊆) is a lattice, where for any two elements A, B ∈ P(X), A ∨ B = A ∪ B
and A ∧ B = A ∩ B.
(iii) (ℤ, ≤) is a poset, where m ≤ n means m is less than or equal to n, for m, n ∈ ℤ. This
poset (ℤ, ≤) is a lattice, where for any two elements m, n ∈ ℤ, m ∧ n = min{m, n}
and m ∨ n = max{m, n}. This is a chain.
(iv) (P, ≤) is a poset, where P is the set of all real-valued continuous functions
defined on [0, 1] and f ≤ g, for f, g ∈ P, means that f(x) ≤ g(x), ∀x ∈ [0, 1]. This poset
(P, ≤) is a lattice, where for any two elements f, g ∈ P,
(f ∨ g)(x) = max{f(x), g(x)}; (f ∧ g)(x) = min{f(x), g(x)}; x ∈ [0, 1].
(v) Let U = {1, 2, 3} and let A be the set of all non-empty proper subsets of U. Then
(A, ⊆) is a poset, where P ≤ Q means P is a subset of Q, i.e., P ⊆ Q, for P, Q ∈ A.
This poset (A, ⊆) is not a lattice, as the pair of elements {1, 2} and {2, 3} has no
lub and the pair of elements {1} and {2} has no glb.
(vi) For any positive integer m, we will let Dm denote the set of divisors of m ordered by
divisibility (|). Let D30 = {1, 2, 3, 5, 6, 10, 15, 30} denote the set of all divisors of 30.
The Hasse diagram of the lattice (D30, |) appears in Fig. 1.23. Here m ∨ n = lcm(m, n) and
m ∧ n = gcd(m, n). To find a ∨ b from the diagram, traverse upwards from the vertices
representing a and b and reach a meeting point of the two paths; the corresponding element
is a ∨ b. By traversing downwards, we can get a ∧ b similarly.

Figure 1.23: Hasse diagram
Ex 1.9.1 Show that the poset (L, ≤) represented by its Hasse diagram (Fig. 1.24) is a lattice.

Figure 1.24: Hasse diagram of Ex.1.9.1

Solution: We have to prove that each pair of elements of L = {a, b, c, d, e, f, g} has a lub
and a glb.

        a        b        c        d        e        f        g
a    (a, a)   (a, b)   (a, c)   (a, d)   (a, e)   (a, f)   (a, g)
b    (a, b)   (b, b)   (a, d)   (b, d)   (b, e)   (a, g)   (b, g)
c    (a, c)   (a, d)   (c, c)   (c, d)   (a, g)   (c, f)   (c, g)
d    (a, d)   (b, d)   (c, d)   (d, d)   (b, g)   (c, g)   (d, g)
e    (a, e)   (b, e)   (a, g)   (b, g)   (e, e)   (a, g)   (e, g)
f    (a, f)   (a, g)   (c, f)   (c, g)   (a, g)   (f, f)   (f, g)
g    (a, g)   (b, g)   (c, g)   (d, g)   (e, g)   (f, g)   (g, g)

If the element in the a-row and b-column is (x, y), then x = a ∧ b and y = a ∨ b.


Ex 1.9.2 Show that the following posets are not lattices:
(i) L₁ = ({1, 2, …, 12}, |).
(ii) L₂ = ({1, 2, 3, 4, 6, 9}, |).
Solution: For (L₁, |), 2 ∨ 7 = lcm{2, 7} = 14 ∉ L₁, and no smaller upper bound exists, so
L₁ is not a lattice. For (L₂, |), 4 ∨ 6 = lcm{4, 6} = 12 ∉ L₂, so L₂ is not a lattice.
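The same conclusion can be reached mechanically: a pair has a join in a finite divisibility poset exactly when its set of upper bounds inside the poset has a least member. A sketch (the `join` helper is mine, not the book's):

```python
# Least upper bound of x and y in the poset (S, |), or None if it does not exist.
def join(x, y, S):
    ubs = [z for z in S if z % x == 0 and z % y == 0]          # upper bounds in S
    least = [u for u in ubs if all(v % u == 0 for v in ubs)]   # u divides every ub
    return least[0] if least else None

L1 = list(range(1, 13))
L2 = [1, 2, 3, 4, 6, 9]
print(join(2, 7, L1), join(4, 6, L2))   # → None None
```

In L₁ the pair (2, 7) has no upper bound at all (its multiples start at 14), while in L₂ the pair (4, 6) has no upper bound inside the set; either failure is enough to rule out a lattice.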
Ex 1.9.3 Which of the following posets given in the Fig.1.25 is a lattice? Not a lattice?
Figure 1.25: Hasse diagram of Ex.1.9.3

Solution: P1: d, e, f are upper bounds of b and c. f cannot be the lub of b and c, since
d ≤ f and d ≠ f. Neither d nor e can be the lub of b and c, since d ≰ e and e ≰ d. So lub(b, c)
does not exist. Hence P1 is not a lattice.


P2 : It is not a lattice since there is no lower bound for (a, b) and hence glb(a, b) does not
exist.
P3 : It is not a lattice since glb(a, b) does not exist.

1.9.1 Lattice Algebra

The binary operations ∧ and ∨ in lattices have important algebraic properties, some of them
analogous to those of ordinary multiplication and addition.
Theorem 1.9.1 In any lattice the following identities hold:
(i) L1: x ∧ x = x and x ∨ x = x (Idempotency);
(ii) L2: x ∧ y = y ∧ x and x ∨ y = y ∨ x (Commutativity);
(iii) L3: (x ∧ y) ∧ z = x ∧ (y ∧ z) and (x ∨ y) ∨ z = x ∨ (y ∨ z) (Associativity);
(iv) L4: x ∧ (x ∨ y) = x and x ∨ (x ∧ y) = x (Absorption).
Moreover, x ≤ y is equivalent to each of the conditions x ∧ y = x and x ∨ y = y (Consistency).
Proof: By the principle of duality, which interchanges ∧ and ∨, it suffices to prove one of
the two identities in each of L1–L4.
L1: Since x ∧ y ≤ x, we have x ∧ x ≤ x. Also, d ≤ x, d ≤ y ⟹ d ≤ x ∧ y; it follows from
x ≤ x, x ≤ x that x ≤ x ∧ x. Hence x ∧ x = x.
L2: Since the meaning of glb{x, y} is not altered by interchanging x and y, it follows that
x ∧ y = y ∧ x.
L3: Since both x ∧ (y ∧ z) and (x ∧ y) ∧ z represent glb{x, y, z}, the result follows.
L4: Since x ∧ (x ∨ y) is a lower bound of x and x ∨ y, we have x ∧ (x ∨ y) ≤ x. Since
x ≤ x and x ≤ x ∨ y, x is a lower bound of x and x ∨ y, and since x ∧ (x ∨ y) is
the glb of x and x ∨ y, we must have x ≤ x ∧ (x ∨ y). Hence x ∧ (x ∨ y) = x.
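The identities L1–L4 can be verified by brute force on a concrete lattice. The sketch below checks them in the power-set lattice of {1, 2, 3}, taking meet = intersection and join = union (the enumeration code is mine):

```python
# Verify L1-L4 and the consistency condition on the lattice of subsets of {1,2,3}.
from itertools import chain, combinations

X = [1, 2, 3]
L = [frozenset(c) for c in chain.from_iterable(
        combinations(X, r) for r in range(len(X) + 1))]

for x in L:
    assert x & x == x and x | x == x                      # L1: idempotency
    for y in L:
        assert x & y == y & x and x | y == y | x          # L2: commutativity
        assert x & (x | y) == x and x | (x & y) == x      # L4: absorption
        assert (x <= y) == (x & y == x) == (x | y == y)   # consistency
        for z in L:
            assert (x & y) & z == x & (y & z)             # L3: associativity
            assert (x | y) | z == x | (y | z)
print(len(L), "elements; all identities hold")
```

Since the axioms are universally quantified over a finite set, the triple loop is an exhaustive proof for this particular lattice.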
Theorem 1.9.2 In a lattice L, y ≤ z implies x ∧ y ≤ x ∧ z and x ∨ y ≤ x ∨ z, ∀x ∈ L.
Proof: Since y ≤ z, we have y = y ∧ z. Therefore,
x ∧ y = x ∧ (y ∧ z) = (x ∧ x) ∧ (y ∧ z), as x ∧ x = x,
= x ∧ (x ∧ (y ∧ z)); associativity
= x ∧ ((x ∧ y) ∧ z); associativity
= x ∧ ((y ∧ x) ∧ z); commutativity
= (x ∧ (y ∧ x)) ∧ z = ((x ∧ y) ∧ x) ∧ z
= (x ∧ y) ∧ (x ∧ z).
Since x ∧ y = (x ∧ y) ∧ (x ∧ z), x ∧ y is the glb of x ∧ y and x ∧ z, so x ∧ y ≤ x ∧ z.
By the principle of duality, x ∨ y ≤ x ∨ z.


Theorem 1.9.3 Any lattice satisfies the distributive inequalities (or semi-distributive laws):
(i) x ∨ (y ∧ z) ≤ (x ∨ y) ∧ (x ∨ z);
(ii) (x ∧ y) ∨ (x ∧ z) ≤ x ∧ (y ∨ z).
Proof: We have x ∧ y ≤ x and x ∧ y ≤ y ≤ y ∨ z. Therefore x ∧ y is a lower bound of x
and y ∨ z. Since x ∧ (y ∨ z) is the glb of x and y ∨ z, we have
x ∧ y ≤ x ∧ (y ∨ z), and similarly, x ∧ z ≤ x ∧ (y ∨ z).
These show that x ∧ (y ∨ z) is an upper bound of x ∧ y and x ∧ z. But (x ∧ y) ∨ (x ∧ z) is
the lub of x ∧ y and x ∧ z. Therefore
(x ∧ y) ∨ (x ∧ z) ≤ x ∧ (y ∨ z).
The inequality (i) follows dually.

1.9.2 Sublattices

Let T be a non-empty subset of a lattice L (T ⊆ L). We say T is a sublattice of L if T is itself
a lattice (with respect to the operations of L). Therefore, T is a sublattice of L if and only
if T is closed under the operations ∧ and ∨ of L, i.e.,
a, b ∈ T ⟹ a ∧ b ∈ T and a ∨ b ∈ T.
For example, the set Dm of divisors of m is a sublattice of the positive integers ℕ under
divisibility.
Two lattices L and L′ are said to be isomorphic if there is a one-to-one correspondence
f : L → L′ such that
f(a ∧ b) = f(a) ∧ f(b) and f(a ∨ b) = f(a) ∨ f(b).

1.9.3 Bounded Lattices

A lattice L is said to have a lower bound 0 if for any element x ∈ L we have 0 ≤ x.
Analogously, L is said to have an upper bound I if for any x ∈ L we have x ≤ I. We say L
is bounded if L has both a lower bound 0 and an upper bound I. In such a lattice we have the
identities
a ∨ I = I, a ∧ I = a, a ∨ 0 = a, a ∧ 0 = 0,
for any element a ∈ L. For example:
(i) The non-negative integers with the usual ordering 0 < 1 < 2 < ⋯ have 0 as the lower
bound but have no upper bound.
(ii) The lattice P(U) of all subsets of any universal set U is a bounded lattice, with U as
an upper bound and the empty set ∅ as a lower bound.
Since a finite lattice L = {a₁, a₂, …, aₙ} has 0 = a₁ ∧ a₂ ∧ ⋯ ∧ aₙ and I = a₁ ∨ a₂ ∨ ⋯ ∨ aₙ,
every finite lattice is bounded.

1.9.4 Distributive Lattices

A lattice L in which the distributive laws
(i) x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z),
(ii) x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z)
hold for all elements x, y, z in L is called a distributive lattice. We note that, by the principle
of duality, condition (i) holds if and only if (ii) holds. If the lattice L is not distributive,
it is said to be non-distributive. Fig. 1.26(a) is a non-distributive lattice, since
a ∨ (b ∧ c) = a ∨ 0 = a but (a ∨ b) ∧ (a ∨ c) = I ∧ c = c.
Fig. 1.26(b) is also a non-distributive lattice. In fact, we have the following characterization
of such lattices: a lattice L is non-distributive if and only if it contains a sublattice
isomorphic to Fig. 1.26(a) or (b).

Figure 1.26: Hasse diagram
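The failing computation above can be replayed mechanically on the pentagon lattice {O, a, b, c, I}, with O < a < c < I and O < b < I (b incomparable with a and c). The meet/join helpers below are a sketch of mine, computing glb and lub directly from the order relation:

```python
# The pentagon lattice, encoded by its order relation as a set of (lower, upper) pairs.
leq = {('O', 'O'), ('O', 'a'), ('O', 'b'), ('O', 'c'), ('O', 'I'),
       ('a', 'a'), ('a', 'c'), ('a', 'I'),
       ('b', 'b'), ('b', 'I'),
       ('c', 'c'), ('c', 'I'),
       ('I', 'I')}
EL = ['O', 'a', 'b', 'c', 'I']

def meet(x, y):
    lbs = [z for z in EL if (z, x) in leq and (z, y) in leq]
    return next(w for w in lbs if all((z, w) in leq for z in lbs))  # greatest lb

def join(x, y):
    ubs = [z for z in EL if (x, z) in leq and (y, z) in leq]
    return next(w for w in ubs if all((w, z) in leq for z in ubs))  # least ub

lhs = join('a', meet('b', 'c'))             # a ∨ (b ∧ c) = a ∨ O = a
rhs = meet(join('a', 'b'), join('a', 'c'))  # (a ∨ b) ∧ (a ∨ c) = I ∧ c = c
print(lhs, rhs)   # → a c  (so the lattice is not distributive)
```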
Theorem 1.9.4 In a distributive lattice, a ∧ x = a ∧ y and a ∨ x = a ∨ y together imply
x = y.
Proof: We have,
x = x ∧ (x ∨ a), by L4,
= x ∧ (y ∨ a) = (x ∧ y) ∨ (x ∧ a)
= (y ∧ x) ∨ (y ∧ a) = y ∧ (x ∨ a)
= y ∧ (y ∨ a) = y.

1.9.5 Complements

The case in which a ∧ x = a ∧ y = O and a ∨ x = a ∨ y = I is of particular interest. In
general, by a complement of an element a in a lattice L with universal bounds O and I, we
mean an element x ∈ L such that
a ∧ x = O and a ∨ x = I.
The elements O and I are trivially complementary in any lattice. It is obvious that in any
chain elements other than O and I have no complements.
Theorem 1.9.5 In any distributive lattice L, a given element can have at most one complement.
Proof: Let the element a have two complements a′ and a″; then
a ∧ a′ = O = a ∧ a″ and a ∨ a′ = I = a ∨ a″.
Since the lattice is distributive, Theorem 1.9.4 gives a′ = a″.
Theorem 1.9.6 In any distributive lattice, the set of all complemented elements is a sublattice.
Proof: Let a, a′ and b, b′ be complementary pairs; then, by definition, a ∧ a′ = O = b ∧ b′
and a ∨ a′ = I = b ∨ b′. Now,
(a ∧ b) ∧ (a′ ∨ b′) = (a ∧ b ∧ a′) ∨ (a ∧ b ∧ b′)
= (a ∧ a′ ∧ b) ∨ (a ∧ O)
= (O ∧ b) ∨ O = O ∨ O = O.
Also, (a ∧ b) ∨ (a′ ∨ b′) = (a ∨ a′ ∨ b′) ∧ (b ∨ a′ ∨ b′)
= (I ∨ b′) ∧ (a′ ∨ I) = I ∧ I = I.
Hence a ∧ b and a′ ∨ b′ are complementary. Similarly, a ∨ b and a′ ∧ b′ are complementary.
Thus if L is a distributive lattice and S is the subset of L consisting of the complemented
elements of L, then for any two elements a and b (with complements a′ and b′
respectively) in S, a ∧ b and a ∨ b also belong to S. Hence, by definition, S is a sublattice.

1.10 Mapping

Let A and B be two non-empty sets. A function f from A to B, which is denoted by
f : A → B, is a relation from A to B with the property that for every a ∈ A there is exactly
one b ∈ B such that (a, b) ∈ f. Functions are also called mappings or transformations. The
element a ∈ A is called an argument of the function f, and f(a) is called the value or image
or f-image of a under f.
Figure 1.27: Pictorial representation of domain, co-domain and range


The set A is called the domain of the function f and B is called the co-domain of f.
The range of f consists of those elements of B which appear as the image of at least one
element of A; it is also known as the image set. The range of f is denoted by f(A), i.e.,
f(A) = {f(x) : x ∈ A}. Obviously, f(A) ⊆ B. For example, (i) let A = {1, 2, 3, 4} and
B = {x, y, z}, and let
f = {(1, x), (2, x), (3, y), (4, z)},
i.e., f(1) = x, f(2) = x, f(3) = y, f(4) = z.
Thus each element of A has a unique value in B, so f is a function.
(ii) Let A = {1, 2, 3} and B = {x, y, z}. Consider the relations
f₁ = {(1, x), (2, x)} and f₂ = {(1, x), (1, y), (2, z), (3, y)}.
The relation f₁ is a function from {1, 2} to B, as f₁(1) = x and f₁(2) = x, i.e., each element of
its domain has a unique image. But the relation f₂ is not a function, as f₂(1) = x and f₂(1) = y,
i.e., 1 ∈ A has two distinct images x and y.
If x(∈ A) corresponds to y(∈ B), it is said that y is the image of x under the mapping
f, and this is expressed by writing either xf = y or f(x) = y. In this case x is said to be the
pre-image or inverse image of y. Sometimes it is possible to write down a precise formula
showing how f(x) is determined by x. For example, f(x) = √x, f(x) = 2x + 5, f(x) = eˣ + 2x,
etc. That is, by the function f(x) = x³ we mean the function f : ℝ → ℝ which associates with
any x ∈ ℝ its cube x³. In the notation of binary relations, f = {(x, x³) : x ∈ ℝ}.
Thus, a subset f of A × B is called a function or mapping from A to B if to each a ∈ A
there exists a unique b ∈ B such that the ordered pair (a, b) ∈ f.
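The defining condition — each element of the domain paired with exactly one element of the co-domain — is easy to test mechanically. A sketch (the function name is mine, not the book's):

```python
# A relation (a set of ordered pairs) is a function on domain A when each a in A
# occurs exactly once as a first coordinate.
def is_function(rel, A):
    firsts = [a for (a, _) in rel]
    return all(firsts.count(a) == 1 for a in A)

f1 = {(1, 'x'), (2, 'x')}
f2 = {(1, 'x'), (1, 'y'), (2, 'z'), (3, 'y')}   # 1 has two distinct images
print(is_function(f1, {1, 2}), is_function(f2, {1, 2, 3}))   # → True False
```

The check fails in two ways: an element of A with no pair (count 0) or with more than one pair (count ≥ 2), matching the two ways a relation can fail to be a function.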

1.10.1 Types of Functions

Constant function
A function f : A B is said to be a constant function (or a constant mapping) if f maps
each element of A to one and the same element of B, i.e., f (A) is a singleton set. For
example, f (x) = 5 for all x R is a constant function.
Identity function
A function f : A A is said to be the identity function on A if f (x) = x for all x A. It
is denoted by IA .
Into function
A function f : A → B is said to be an into function if f(A) is a proper subset of B, i.e.,
f(A) ⊂ B. In this case, we say that f maps A into B.
Onto function
Let A and B be two non-empty sets. A mapping f : A → B is defined as surjective,
or a surjection, if
∀y ∈ B, ∃x ∈ A such that f(x) = y.    (1.17)
This is also called an onto mapping, and is denoted by f(A) = B. In this case, we say that
f maps A onto B. For example,
(i) Let f : ℤ → ℤ be given by f(x) = 3x, x ∈ ℤ. Then f is an into function, because
f(ℤ) = {0, ±3, ±6, ±9, …} is a proper subset of ℤ (the co-domain).
(ii) Let f : ℤ → ℤ be given by f(x) = x + 2, x ∈ ℤ. Then f is an onto function, because
f(ℤ) = ℤ (the co-domain).
Pre-image
If f : A → B is a function and x ∈ A, then f(x) is a unique element in B. The element x
is said to be a pre-image (or inverse image) of f(x).
One-to-one function
A function f : A → B is said to be a one-to-one function if different elements in A have
different images in B, i.e., if x₁ ≠ x₂ then f(x₁) ≠ f(x₂), for all x₁, x₂ ∈ A. A one-to-one
function is also known as one-one, or injective, or an injection. For example,
(i) The mapping f : ℤ⁺ → ℚ defined by n ↦ n/(2n + 1) is an injective mapping.
(ii) Let A = {1, 2, 3, 4, …} and B = {1, 1/2, 1/3, 1/4, …}, with the mapping f : x ↦ 1/x.
Here each element x ∈ A maps to exactly one element y ∈ B, and distinct elements
have distinct images, so this is a one-to-one mapping.
Ex 1.10.1 Two sets are given by X = {1, 2, 3, 4} and Y = {α, β, γ}, with f = {(1, α), (2, β),
(3, β), (4, β)} and g = {(1, α), (2, β), (3, γ)}. Test whether f, g are functions and, if they are
functions, test whether they are (i) injective, (ii) surjective.

Figure 1.28: Functions f and g


Solution: The pictorial representation of f is shown in Fig. 1.28(a). From Fig. 1.28(a)
it is seen that f is a function, since every element of X has a unique image. But not every
element of Y is the image of some element of X, so f is not surjective. Also, f is not injective,
since the elements 2, 3, 4 have the same image. The pictorial representation
of g is shown in Fig. 1.28(b). g is not a function, because not all elements of the domain X are
mapped to elements of the co-domain Y.
Difference between relation and function: Let A and B be two sets. By the definition
of a relation, a subset of A × B is a function if for each x ∈ A there is one and
only one ordered pair with first coordinate x. Thus every function is a relation but, conversely,
not every relation is a function. For example, let A = {a, b, c}, B = {0, 1, 2}; then
f = {(a, 0), (a, 1), (b, 1), (c, 2)} is a relation but not a function, since two different ordered
pairs, viz. (a, 0) and (a, 1), have the same first coordinate. If we take f as {(a, 0), (b, 1), (c, 2)},
then it becomes a function as well as a relation.
Ex 1.10.2 If ℤ is the set of integers and f : ℤ → ℤ is defined by f(x) = ½[x + |x|], test
whether it is injective or not. [CH04]
Solution: The mapping f : ℤ → ℤ defined by f(x) = ½[x + |x|] is given by
f(x) = ½(x + x) = x, for x ≥ 0,
= ½(x − x) = 0, for x < 0.
We know −1, −2 < 0, but f(−1) = f(−2) = 0. Thus f(x₁) = f(x₂) although x₁ ≠ x₂, so f is
not injective.
Ex 1.10.3 If ℤ⁺ is the set of positive integers and f : ℤ⁺ → ℤ is defined by
f(x) = x/2, when x is an even integer,
= −(x + 1)/2, when x is an odd integer,
is f injective or surjective? [CH03]
Solution: Let x, y ∈ ℤ⁺ and x, y be even integers; then f(x) = x/2, f(y) = y/2, so that
f(x) = f(y) ⟹ x = y. Next, let x, y ∈ ℤ⁺ and x, y be odd integers; then
f(x) = −(x + 1)/2 and f(y) = −(y + 1)/2,
so that f(x) = f(y) ⟹ x = y. Moreover, f maps even integers to positive values and odd
integers to negative values, so the two cases cannot overlap. So f is injective. Now, 0 ∈ ℤ,
but 0 has no pre-image under f. Therefore, f is not surjective.


Bijective mapping
A mapping f : A → B is defined as a bijective mapping, or bijection, if it is both injective
and surjective. For example, let f : ℤ → ℤ be given by f(x) = x + 1, x ∈ ℤ. This is an
injective and surjective mapping.
Ex 1.10.4 Decide whether the following mappings are surjective or injective.
(i) f : ℂ → ℝ defined by f(a + ib) = a² + b².
(ii) f : ℤ → ℤ⁺ defined by x ↦ x² + 1.
(iii) f : ℤ⁺ → ℚ defined by x ↦ x/(2x + 1).
Solution: (i) By definition, f(2 + 3i) = 4 + 9 = 13 and f(3 + 2i) = 9 + 4 = 13. So
f(2 + 3i) = f(3 + 2i) but 2 + 3i ≠ 3 + 2i.
Hence f(a + ib) is not an injective (one-one) mapping. It is not a surjection either: −3 ∈ ℝ,
but it has no pre-image in ℂ, since a² + b² ≥ 0 for all real values of a and b.
(ii) If x = 2 and x = −2, then f(2) = 2² + 1 = 5 and f(−2) = (−2)² + 1 = 5. The image
of both 2 and −2 is 5. Therefore,
f(x₁) = f(x₂) ⇏ x₁ = x₂.
Hence it is not an injective mapping. It is not a surjection: 3 ∈ ℤ⁺, but 3 = f(n)
gives 3 = n² + 1, i.e., n = ±√2 ∉ ℤ. Hence 3 has no pre-image in ℤ.
(iii) Let n₁ and n₂ be such that f(n₁) = f(n₂). Therefore,
f(n₁) = f(n₂) ⟹ n₁/(2n₁ + 1) = n₂/(2n₂ + 1)
⟹ 2n₁n₂ + n₁ = 2n₁n₂ + n₂ ⟹ n₁ = n₂.
Hence it is an injection. For positive n, we have n/(2n + 1) > 0; hence negative numbers in ℚ
have no pre-image. Hence it is not onto.

Ex 1.10.5 Is the mapping f : ℝ → (−1, 1), defined by f(x) = x/(1 + |x|), a bijective mapping?
Justify your answer. Here ℝ is the set of real numbers and (−1, 1) = {x ∈ ℝ : −1 < x < 1}. [KH 06]
Solution: Here the mapping f : ℝ → (−1, 1) is defined by f(x) = x/(1 + |x|); since
1 + |x| ≠ 0, the mapping is well defined. Therefore,
f(x) = x/(1 − x), when x < 0, i.e., −1 < f(x) < 0,
= x/(1 + x), when x > 0, i.e., 0 < f(x) < 1,
= 0, when x = 0.
In the first case, if f(x₁) = f(x₂), then
x₁/(1 − x₁) = x₂/(1 − x₂) ⟹ x₁ = x₂.
For the second case, if f(x₁) = f(x₂), then
x₁/(1 + x₁) = x₂/(1 + x₂) ⟹ x₁ = x₂.
This shows that f is one-one. Let y ∈ (−1, 1) and y = f(x). When x < 0, we have
y = x/(1 − x) ⟹ x = y/(1 + y), and f(y/(1 + y)) = y, as x < 0.
When x > 0, we have
y = x/(1 + x) ⟹ x = y/(1 − y), and f(y/(1 − y)) = y, as x > 0.
Thus every y ∈ (−1, 1) has a pre-image, and since f is also one-one, f is onto. Hence
f(x) is a bijective mapping.
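The two branch inverses found above combine into g(y) = y/(1 − |y|), so the bijection can be spot-checked numerically (a sketch of mine, sampling a handful of points rather than proving anything):

```python
# f maps R into (-1, 1); g recovers the pre-image, so g(f(x)) = x.
def f(x):
    return x / (1 + abs(x))

def g(y):
    return y / (1 - abs(y))

for x in (-100.0, -2.5, -1.0, 0.0, 0.5, 3.0, 42.0):
    y = f(x)
    assert -1 < y < 1                 # f lands in the co-domain (-1, 1)
    assert abs(g(y) - x) < 1e-9       # g undoes f
print("g(f(x)) = x on all sampled points")
```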


Ex 1.10.6 Let S be the set of all 2×2 real matrices [a b; c d] with ad − bc ≠ 0, and let ℝ*
denote the set of all non-zero real numbers. Show that the mapping f : S → ℝ* defined by
f([a b; c d]) = ad − bc is surjective but not injective.
Solution: Consider the two real matrices A = [2 0; 0 1] and B = [−1 0; 0 −2]. Since
2·1 − 0·0 = 2 ≠ 0 and (−1)·(−2) − 0·0 = 2 ≠ 0, we have A, B ∈ S. Now A ≠ B although, by
definition,
f(A) = 2 = f(B).
Thus A ≠ B but f(A) = f(B); therefore the given mapping f is not injective.
Also, for every non-zero real number k, the matrix [k 0; 0 1] has determinant
k·1 − 0·0 = k ≠ 0, and so belongs to S, with f([k 0; 0 1]) = k. Thus every element of ℝ* has a
pre-image in S, so f is surjective.
Ex 1.10.7 Discuss the mapping f : ℝ → (−1, 1) defined by f(x) = x/(1 + x²), x ∈ ℝ, where ℝ
is the set of real numbers and (−1, 1) = {x ∈ ℝ : −1 < x < 1}.
Solution: Since 1 + x² ≠ 0 for x ∈ ℝ, the given mapping is well defined. Take two elements
x₁, x₂ ∈ ℝ. If f(x₁) = f(x₂), then
x₁/(x₁² + 1) = x₂/(x₂² + 1) ⟹ (x₁ − x₂) − x₁x₂(x₁ − x₂) = 0
⟹ either x₁ = x₂ or x₁x₂ = 1.
Taking x₁ = 5 and x₂ = 1/5, we see that f(x₁) = f(x₂) = 5/26. Thus x₁ ≠ x₂ but still
f(x₁) = f(x₂). Therefore, f is not injective. Let y be an arbitrary element of (−1, 1); then
y = x/(x² + 1) ⟹ x = (1 ± √(1 − 4y²))/(2y), −1 < y < 1, y ≠ 0.
When y > 1/2, we have 1 − 4y² < 0, so √(1 − 4y²) ∉ ℝ and consequently x ∉ ℝ. Similarly,
when y < −1/2, again 1 − 4y² < 0 and x ∉ ℝ. Such y therefore have no pre-image.
This means that −1/2 ≤ f(x) ≤ 1/2, i.e., f does not map onto the entire co-domain (−1, 1). So it is
not onto.
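Both failures can be seen numerically (a sketch of mine; the grid sampling merely illustrates, it does not prove, the range bound):

```python
# f(x) = x/(1 + x^2): 5 and 1/5 collide at 5/26, and |f| stays within 1/2.
def f(x):
    return x / (1 + x * x)

assert abs(f(5) - 5 / 26) < 1e-12
assert abs(f(5) - f(1 / 5)) < 1e-12        # x1 * x2 = 1 gives a collision
xs = [i / 100 for i in range(-10000, 10001)]
assert all(-0.5 <= f(x) <= 0.5 for x in xs)
print("not injective; sampled range stays inside [-1/2, 1/2]")
```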


Restriction Mapping
Let f : A → B and A′ ⊆ A. The mapping g : A′ → B, such that g(x) = f(x), ∀x ∈ A′,
is said to be the restriction mapping of f to A′. It is denoted by f|A′, read as f
restricted to A′; f is said to be an extension of g to A. For example, let f : ℝ → ℝ be
given by f(x) = |x| − x, x ∈ ℝ, and g : ℝ⁺ → ℝ be given by g(x) = 0, ∀x ∈ ℝ⁺; then g = f|ℝ⁺.
Inverse mapping
Let the mapping f : A → B be a one-one onto mapping. Then, corresponding to each element
y ∈ B, there is a unique element x ∈ A such that f(x) = y. Thus a mapping, denoted by f⁻¹,
is defined as
f⁻¹ : B → A : f⁻¹(y) = x ⟺ f(x) = y.    (1.18)
The mapping f⁻¹ defined above is called the inverse of f. The pictorial representation of f⁻¹
is shown in Fig. 1.29. If f⁻¹ is the inverse of f and if f(x) = y, then x = f⁻¹(y). For example,
let A = {1, 2, 3}, B = {a, b, c} and f = {(1, a), (2, b), (3, c)}. Then the inverse relation of f is
f⁻¹ = {(a, 1), (b, 2), (c, 3)}, which is a function from B to A. Again, g = {(1, a), (2, a), (3, b)}
is a function from A to B, and its inverse relation g⁻¹ = {(a, 1), (a, 2), (b, 3)} is not a
function, since a has the two images 1 and 2 under g⁻¹.

Figure 1.29: Inverse function
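Storing a finite mapping as a dict makes this concrete. The sketch below (my helper, not the book's) builds the inverse relation and reports failure when two arguments share an image, i.e., when the mapping is not one-one:

```python
# The inverse relation of a mapping is itself a function exactly when the
# mapping is one-one; otherwise return None.
def inverse(mapping):
    inv = {}
    for x, y in mapping.items():
        if y in inv:      # y already has a pre-image: inverse is not a function
            return None
        inv[y] = x
    return inv

f = {1: 'a', 2: 'b', 3: 'c'}
g = {1: 'a', 2: 'a', 3: 'b'}
print(inverse(f), inverse(g))   # → {'a': 1, 'b': 2, 'c': 3} None
```

This mirrors the f and g of the example above: f is invertible, while g collapses 1 and 2 onto a, so g⁻¹ is not a function.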
Theorem 1.10.1 The necessary and sufficient condition for a mapping to be invertible is that
it is one-one and onto.
Proof: Suppose f : A → B is invertible. Then f⁻¹ : B → A is a function. Let b₁, b₂ ∈ B be
distinct. Since f⁻¹ is a function, it assigns them unique images a₁ = f⁻¹(b₁) and a₂ = f⁻¹(b₂),
i.e., b₁ = f(a₁) and b₂ = f(a₂); and a₁ ≠ a₂, for a₁ = a₂ would give b₁ = f(a₁) = f(a₂) = b₂.
Thus distinct elements of B come from distinct elements of A, i.e., f is one-one.
Again, since f⁻¹ is a function, every element of B must be mapped by f⁻¹ to some element
of A, i.e., every b ∈ B is of the form f(a); thus f is onto. Thus the condition is necessary.
Conversely, let f be bijective, i.e., one-one and onto. Since f is onto, every b ∈ B is the
image f(a) of some a ∈ A, and since f is one-one, this a is unique. So the rule f⁻¹(b) = a,
where f(a) = b, assigns to each element of B exactly one element of A. Hence f⁻¹ is a
function, i.e., f is invertible.
Theorem 1.10.2 If f : A → B is a bijective function, then f⁻¹ : B → A is also bijective.
Proof: Let b1, b2 be any two elements of B. Since f is bijective, there exist unique elements a1, a2 ∈ A such that b1 = f(a1) and b2 = f(a2), i.e., a1 = f⁻¹(b1) and a2 = f⁻¹(b2). Now,
f⁻¹(b1) = f⁻¹(b2) ⇒ a1 = a2 ⇒ f(a1) = f(a2) ⇒ b1 = b2.
Therefore f⁻¹(b1) = f⁻¹(b2) iff b1 = b2. Hence f⁻¹ is one-one.

Mapping


To prove f⁻¹ is onto: let a be any element of A. Since f is a function from A to B, there exists a unique element b ∈ B such that b = f(a), i.e., a = f⁻¹(b). That is, the image under f⁻¹ of the element b ∈ B is a ∈ A. Hence f⁻¹ is onto.
Ex 1.10.8 Let A = ℝ − {−½}, B = ℝ − {½} and let f : A → B be defined by f(x) = (x − 3)/(2x + 1), x ∈ A. Does f⁻¹ exist?
Solution: Since A = ℝ − {−½}, the mapping f(x) = (x − 3)/(2x + 1) is well defined on A. Let x1, x2 ∈ A. If f(x1) = f(x2), then
(x1 − 3)/(2x1 + 1) = (x2 − 3)/(2x2 + 1)
⇒ 2x1x2 + x1 − 6x2 − 3 = 2x1x2 + x2 − 6x1 − 3 ⇒ 7x1 = 7x2 ⇒ x1 = x2.
Therefore f is injective. Let y be an arbitrary element of B; then
y = (x − 3)/(2x + 1) ⇒ x = (y + 3)/(1 − 2y),
which is defined since y ≠ ½, ∀y ∈ B. Also,
f((y + 3)/(1 − 2y)) = [(y + 3)/(1 − 2y) − 3] / [2(y + 3)/(1 − 2y) + 1] = [y + 3 − 3(1 − 2y)] / [2(y + 3) + 1 − 2y] = 7y/7 = y,
so for each y ∈ B there exists an element x = (y + 3)/(1 − 2y) ∈ A such that f(x) = y. Hence f is surjective and consequently bijective. Therefore f⁻¹ exists.
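The inverse formula obtained in Ex 1.10.8 can also be checked numerically. The sketch below is our own illustration (the names f and f_inv are not from the book); it verifies that x = (y + 3)/(1 − 2y) really undoes f on sample points of the two domains.

```python
# Numerical check of Ex 1.10.8: f(x) = (x-3)/(2x+1) on A = R \ {-1/2},
# with candidate inverse f_inv(y) = (y+3)/(1-2y) on B = R \ {1/2}.
def f(x):
    return (x - 3) / (2 * x + 1)

def f_inv(y):
    return (y + 3) / (1 - 2 * y)

# f_inv undoes f, and f undoes f_inv, at every admissible sample point.
for x in [-3.0, -1.0, 0.0, 2.5, 10.0]:
    assert abs(f_inv(f(x)) - x) < 1e-9
for y in [-2.0, 0.0, 0.25, 4.0]:
    assert abs(f(f_inv(y)) - y) < 1e-9
```

A failed assertion for even one sample point would show that the candidate formula is not the inverse, which makes this a cheap sanity check on the algebra.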

Ex 1.10.9 If the function f : ℝ → ℝ is defined by f(x) = x² + 1, then find f⁻¹(−8) and f⁻¹(17).
Solution: We have, from the definition of the inverse mapping,
f⁻¹(−8) = {x ∈ ℝ : f(x) = −8} = {x ∈ ℝ : x² + 1 = −8} = {x ∈ ℝ : x = ±3√−1} = ∅,
as ±3√−1 are not real numbers. Again,
f⁻¹(17) = {x ∈ ℝ : f(x) = 17} = {x ∈ ℝ : x² + 1 = 17} = {x ∈ ℝ : x = ±4} = {−4, 4}.
Theorem 1.10.3 The inverse of a bijective mapping is also bijective.
Proof: Let f : A → B be a bijective mapping; we are to show that f⁻¹ : B → A is also bijective. Let y1, y2 ∈ B. As f is bijective, i.e., one-one and onto, f⁻¹(y1) = x1 and f⁻¹(y2) = x2 for unique x1, x2 ∈ A. Therefore,
f⁻¹(y1) = f⁻¹(y2) ⇒ x1 = x2 ⇒ f(x1) = f(x2) ⇒ y1 = y2,
since f(x1) = y1 and f(x2) = y2. Thus f⁻¹ is one-one. Given any element x ∈ A, we can find an element y ∈ B, namely y = f(x), for which f⁻¹(y) = x. Thus x is the f⁻¹-image of y ∈ B. This shows that f⁻¹ is an onto mapping. Hence the theorem.


Theorem 1.10.4 For a bijective mapping, the inverse mapping is unique.


Proof: Let, if possible, the mapping f : A → B have two inverses, say g : B → A and h : B → A. Let b be any element of B and let g(b) = a1, h(b) = a2, where a1, a2 ∈ A. Since g and h are inverses of f, f(a1) = f(g(b)) = b and f(a2) = f(h(b)) = b, so f(a1) = f(a2). Since f is a one-one function,
f(a1) = f(a2) ⇒ a1 = a2 ⇒ g(b) = h(b),
for every b ∈ B, i.e., h = g. Thus the inverse mapping of f : A → B is unique.
Theorem 1.10.5 If f is a one-one correspondence between A and B, then
f⁻¹of = IA and fof⁻¹ = IB.
Proof: Let a ∈ A be arbitrary and let f(a) = b. Then (f⁻¹of)(a) = f⁻¹(f(a)) = f⁻¹(b) = a = IA(a); thus f⁻¹of = IA. Again, (fof⁻¹)(b) = f(f⁻¹(b)) = f(a) = b = IB(b); hence fof⁻¹ = IB.
Ex 1.10.10 f : ℚ → ℚ is defined by f(x) = 5x + a, where a, x ∈ ℚ, the set of rational numbers. Show that f is one-one and onto. Find f⁻¹.
Solution: Let x1, x2 ∈ ℚ. If f(x1) = f(x2), then 5x1 + a = 5x2 + a ⇒ x1 = x2. Thus f is one-one. Let y = f(x) = 5x + a; then x = (y − a)/5. Since a, y ∈ ℚ, (y − a)/5 ∈ ℚ. Again,
f((y − a)/5) = 5 · (y − a)/5 + a = y.
Thus y ∈ ℚ is the image of the element (y − a)/5 ∈ ℚ. Hence f is onto. Since f is bijective, f is invertible and x = f⁻¹(y) = (y − a)/5.
Ex 1.10.11 Show that the functions f : ℝ → (1, ∞) and g : (1, ∞) → ℝ defined by f(x) = 3^{2x} + 1 and g(x) = ½ log₃(x − 1) are inverses of each other.
Solution: Let x ∈ ℝ; then (gof)(x) is given by
(gof)(x) = g{f(x)} = g(3^{2x} + 1) = ½ log₃[(3^{2x} + 1) − 1] = ½ log₃ 3^{2x} = ½ · 2x = x.
Therefore gof = I_ℝ. For x ∈ (1, ∞), we have
(fog)(x) = f{g(x)} = f(½ log₃(x − 1)) = 3^{2 · ½ log₃(x − 1)} + 1 = (x − 1) + 1 = x.
Therefore fog = I₍₁,∞₎, and so the functions f : ℝ → (1, ∞) and g : (1, ∞) → ℝ are inverses of each other.
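The pair in Ex 1.10.11 can be verified on sample points as well. This is a minimal sketch of ours (not the book's), using floating-point arithmetic, so the checks are up to a small tolerance rather than exact.

```python
import math

# Ex 1.10.11 numerically: f(x) = 3^(2x) + 1 on R, g(x) = (1/2) log_3(x - 1)
# on (1, oo); gof should be the identity on R, fog the identity on (1, oo).
def f(x):
    return 3 ** (2 * x) + 1

def g(x):
    return 0.5 * math.log(x - 1, 3)   # log base 3 via math.log(value, base)

for x in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    assert abs(g(f(x)) - x) < 1e-9    # gof = identity on R
for x in [1.5, 2.0, 10.0, 100.0]:
    assert abs(f(g(x)) - x) < 1e-6    # fog = identity on (1, oo)
```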
Ex 1.10.12 Show that the function f : [−π/2, π/2] → [−1, 1] such that f(x) = sin x is one-one and onto. Also find f⁻¹.
Solution: Let x1, x2 ∈ [−π/2, π/2] be any two numbers. Then
f(x1) = f(x2) ⇒ sin x1 = sin x2 ⇒ x1 = x2,
since sin is strictly increasing on [−π/2, π/2]. Hence f is one-one. Let y = f(x) = sin x, or x = sin⁻¹ y ∈ [−π/2, π/2], where y ∈ [−1, 1]. Also, f(x) = f(sin⁻¹ y) = sin(sin⁻¹ y) = y. Thus f is onto.
Hence f is one-one and onto, so f is invertible. Now,
y = sin x ⇔ x = sin⁻¹ y, or f⁻¹(y) = sin⁻¹ y, y ∈ [−1, 1].


Ex 1.10.13 Show that the mapping f : ℤ → ℤ defined by f(x) = x + 2 is a bijective mapping.
Solution: Let x1, x2 ∈ ℤ with f(x1) = f(x2). Then x1 + 2 = x2 + 2, i.e., x1 = x2. Thus f(x1) = f(x2) iff x1 = x2, and hence f is one-one.
Let y = f(x) = x + 2. Then x = y − 2 ∈ ℤ and
f(x) = f(y − 2) = (y − 2) + 2 = y ∈ ℤ (co-domain).
Thus f is onto. Hence f is a bijective mapping.
Ex 1.10.14 Let f : ℝ → ℝ be a function defined by
f(x) = 1, if x is rational; f(x) = −1, if x is irrational.
Show that f is neither injective nor surjective. (This function is known as Dirichlet's function.)
Solution: The function is not injective, since all rational numbers are mapped to 1 and all irrational numbers are mapped to −1, so distinct elements share an image.
It is not surjective, as only two elements of ℝ, viz., 1 and −1, occur as images of elements of ℝ.
But if we redefine the function as f : ℝ → {1, −1}, then it becomes a surjective function.
Ex 1.10.15 Show that the mapping f : ℝ → ℝ given by f(x) = |x|, x ∈ ℝ, is neither one-one nor onto.
Solution: The function f is not one-one, since f(−1) = |−1| = 1 and f(1) = |1| = 1; that is, different elements of ℝ have the same image.
Also, it is not onto, as f(ℝ) = ℝ⁺ ∪ {0} ≠ ℝ, i.e., only the non-negative numbers of ℝ (co-domain) occur as images of elements of ℝ (domain).
Theorem 1.10.6 If |A| = n = |B|, then the number of bijective mappings from A to B is n!.
Proof: Let Xn be the set of all bijective functions from A to B when |A| = |B| = n, and let A = {a1, a2, …, an} and B = {b1, b2, …, bn}. When n = 1 there is only one bijection, {(a1, b1)}.
When n = 2 there are two bijections, viz., {(a1, b1), (a2, b2)} and {(a1, b2), (a2, b1)}, i.e., X2 = {{(a1, b1), (a2, b2)}, {(a1, b2), (a2, b1)}}. Thus the number of bijections from A to B when n = 2 is 2 = 2!.
When n = 3, we construct all bijections starting from a bijective function of X2 as follows. Let {(a1, b1), (a2, b2)} ∈ X2. One can add the new elements a3 ∈ A and b3 ∈ B to this bijection in three different ways, shown below:
{(a1, b1), (a2, b2), (a3, b3)}, {(a1, b3), (a2, b2), (a3, b1)}, {(a1, b1), (a2, b3), (a3, b2)}.
Similarly, from the second element of X2 one can generate three other bijective functions. Hence X3 contains 6 = 3! bijective functions when |A| = |B| = 3.
We assume that Xn has n! bijections when |A| = |B| = n. Let A = {a1, a2, …, an+1}, B = {b1, b2, …, bn+1}, and let Xn+1 be the set of all bijections from A onto B.
Let {(a1, b1), (a2, b2), …, (an, bn)} be a bijective function of Xn. By introducing an+1 ∈ A and bn+1 ∈ B, one can generate the following bijections starting from the above bijection:
{(a1, b1), (a2, b2), …, (an, bn), (an+1, bn+1)}
{(a1, bn+1), (a2, b2), …, (an, bn), (an+1, b1)}
{(a1, b1), (a2, bn+1), …, (an, bn), (an+1, b2)}
⋮
{(a1, b1), (a2, b2), …, (an, bn+1), (an+1, bn)}.
Thus, starting from a single bijective function of Xn, we generate (n + 1) bijective functions by adding only one element to each of A and B. Hence the number of bijective functions in Xn+1 is n! × (n + 1) = (n + 1)!.
Hence, by mathematical induction, the number of different bijective functions from A to B is n! when |A| = |B| = n.

[Figure 1.30: Pictorial representation of different types of functions — (a) one-one into function, (b) many-one into function, (c) one-one onto function, (d) many-one onto function.]
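The count n! can be confirmed by brute force on a small example. The following sketch (our own, with hypothetical element names a1…a4, b1…b4) generates every bijection by pairing A with each arrangement of B, which is exactly the counting idea used in the proof.

```python
from itertools import permutations
from math import factorial

# Theorem 1.10.6 by exhaustion: for |A| = |B| = n, pairing A with each
# arrangement of B produces every bijection from A to B exactly once.
A = ['a1', 'a2', 'a3', 'a4']
B = ['b1', 'b2', 'b3', 'b4']

bijections = [dict(zip(A, perm)) for perm in permutations(B)]
assert len(bijections) == factorial(len(A))            # 4! = 24 bijections
# each of them hits all of B, so each really is onto (hence bijective)
assert all(set(f.values()) == set(B) for f in bijections)
```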
If f : A → B is a bijective function from A onto B, then |A| = |B|, i.e., both sets have the same number of elements. If |A| ≠ |B|, then f cannot be bijective.
Result 1.10.1 Let A and B be two non-empty sets with cardinalities m and n respectively. The number of possible non-empty relations from A to B is 2^{mn} − 1.
Many-one function
A function f : A → B is said to be a many-one function if two or more elements of A correspond to the same element of B. The constant function is an example of a many-one function. For example, let A = {1, 2, 3, 4} and B = {0, 1, 2}. Then
(i) f = {(1, 0), (2, 0), (3, 1), (4, 2)} is a many-one onto function, since its image is all of B;
(ii) f = {(1, 0), (2, 1), (3, 1), (4, 2)} is also a many-one onto function.
Again, let A = {1, 2, 3} and B = {0, 1, 2, 3}; then f = {(1, 0), (2, 0), (3, 3)} is a many-one into function, since its image {0, 3} is a proper subset of B.
Equality of two functions
Two functions f : A → B and g : C → D are said to be equal if A = C, B = D and f(x) = g(x) for all x ∈ A. It is written as f = g.
Ex 1.10.16 Suppose A = {1, 2, 3} and B = {8, 9}. Examine whether the following subsets of A × B are functions from A to B:
(i) f1 = {(1, 8), (1, 9), (2, 8), (3, 9)};
(ii) f2 = {(1, 9), (2, 9), (3, 9)};
(iii) f3 = {(1, 8), (2, 9), (3, 9)}.
How many mappings are there from A into B? Identify the one-one and onto mappings.
Solution: Here A and B are the domain and co-domain respectively. Then:
(i) f1 is not a function, since 1 ∈ A has two different images, 8 and 9.
(ii) f2 is a function; in particular, it is a constant function.
(iii) f3 is also a function.
The possible mappings from A to B are
g1 = {(1, 8), (2, 8), (3, 8)}, g2 = {(1, 9), (2, 9), (3, 9)},
g3 = {(1, 8), (2, 8), (3, 9)}, g4 = {(1, 8), (2, 9), (3, 8)},
g5 = {(1, 9), (2, 8), (3, 9)}, g6 = {(1, 9), (2, 9), (3, 8)},
g7 = {(1, 8), (2, 9), (3, 9)}, g8 = {(1, 9), (2, 8), (3, 8)}.
Therefore, there are 8 mappings from A to B. All of them are many-one and none is one-one, since |A| > |B|. The onto mappings are g3, g4, g5, g6, g7, g8.
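The bookkeeping in Ex 1.10.16 can be done by exhaustion. The sketch below is our own illustration: it lists all |B|^|A| = 2³ mappings as dictionaries and classifies them, reproducing the counts found above.

```python
from itertools import product

# Ex 1.10.16 by exhaustion: all mappings from A = {1,2,3} to B = {8,9}.
A, B = [1, 2, 3], [8, 9]
mappings = [dict(zip(A, images)) for images in product(B, repeat=len(A))]

assert len(mappings) == len(B) ** len(A)                  # 2^3 = 8 mappings
onto = [f for f in mappings if set(f.values()) == set(B)]
one_one = [f for f in mappings if len(set(f.values())) == len(A)]
assert len(onto) == 6      # all except the two constant mappings
assert len(one_one) == 0   # |A| > |B|, so no mapping can be one-one
```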

1.10.2 Composite mapping

Let A, B, C be any non-empty sets and let f : A → B and g : B → C be two functions. If a function h : A → C is defined by h(x) = g{f(x)}, ∀x ∈ A, then h is called the product or composite function of f and g. It is denoted by gof or gf. Thus the product or composite mapping of the mappings f and g, denoted by gof : A → C, is defined by
(gof)(x) = g[f(x)], for all x ∈ A.   (1.19)

[Figure 1.31: Composite function gof — x ↦ f(x) ↦ g{f(x)}, with h = gof.]

Under the mapping f, an element x ∈ A is mapped to an element y = f(x) ∈ B. Again, y is mapped by g to an element z = g(y) ∈ C, and hence z = g(y) = g[f(x)]. Obviously, the domain of gof is A and its co-domain is C. For example, let f : ℝ → ℝ and g : ℝ → ℝ be two functions where f(x) = 3x + 2 and g(x) = x² + 1. Now,
(fog)(x) = f(g(x)) = f(x² + 1) = 3(x² + 1) + 2 = 3x² + 5 and
(gof)(x) = g(f(x)) = g(3x + 2) = (3x + 2)² + 1 = 9x² + 12x + 5.
For this example it is seen that (fog)(x) ≠ (gof)(x). Thus in general fog ≠ gof, i.e., the product of functions is non-commutative.
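The worked example above translates directly into code. In this sketch of ours, compose is a hypothetical helper (not from the book) that builds the composite mapping; evaluating at x = 2 shows fog and gof disagree.

```python
# f(x) = 3x + 2 and g(x) = x^2 + 1 from the text, showing fog != gof.
def f(x):
    return 3 * x + 2

def g(x):
    return x ** 2 + 1

def compose(p, q):
    """Return the composite p o q, i.e. x -> p(q(x)): apply q first."""
    return lambda x: p(q(x))

fog, gof = compose(f, g), compose(g, f)
assert fog(2) == 3 * 2 ** 2 + 5            # 3x^2 + 5 at x = 2
assert gof(2) == 9 * 2 ** 2 + 12 * 2 + 5   # 9x^2 + 12x + 5 at x = 2
assert fog(2) != gof(2)                    # composition is not commutative
```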


Ex 1.10.17 Let f, g : ℝ → ℝ be two mappings defined by f(x) = |x| + x, x ∈ ℝ, and g(x) = |x| − x, x ∈ ℝ. Find fg and gf.
Solution: Here the two mappings f, g : ℝ → ℝ are given by
f(x) = 2x if x ≥ 0, and f(x) = 0 if x < 0;
g(x) = 0 if x ≥ 0, and g(x) = −2x if x < 0.
Now, let x ≥ 0. Then
(fg)(x) = f(g(x)) = f(0) = 0 and (gf)(x) = g(f(x)) = g(2x) = 0.
Now, let x < 0. Then
(fg)(x) = f(g(x)) = f(−2x) = −4x and (gf)(x) = g(f(x)) = g(0) = 0.
Therefore (gf)(x) = 0 for all x ∈ ℝ, and
(fg)(x) = 0 if x ≥ 0; (fg)(x) = −4x if x < 0.
Theorem 1.10.7 The product of any function with the identity function is the function itself.
Proof: Let f : X → Y, and let IX and IY denote the identity functions on X and Y respectively. We must show that IYof = f and foIX = f. Since f : X → Y and IY : Y → Y, we have IYof : X → Y. Now let x be an arbitrary element of X and let f(x) = y. Then
(IYof)(x) = IY[f(x)] = IY(y) = y = f(x) ⇒ IYof = f.
Again, since IX : X → X and f : X → Y, we have foIX : X → Y. Now, for arbitrary x ∈ X,
(foIX)(x) = f[IX(x)] = f(x) ⇒ foIX = f.
Therefore, the product of any function with the identity function is the function itself.
Theorem 1.10.8 The product of any invertible mapping f with its inverse mapping f⁻¹ is an identity mapping.
Proof: Let f be a one-one mapping of X onto Y, and let IX and IY be the identity mappings on X and Y respectively. Clearly f⁻¹ is a one-one mapping of Y onto X. Now,
f : X → Y and f⁻¹ : Y → X ⇒ f⁻¹of : X → X.
Let x be an arbitrary element of X and let f(x) = y, so that x = f⁻¹(y). Therefore,
(f⁻¹of)(x) = f⁻¹[f(x)] = f⁻¹(y) = x = IX(x) ⇒ f⁻¹of = IX.
Again, f⁻¹ : Y → X and f : X → Y, so fof⁻¹ : Y → Y. For an arbitrary y ∈ Y there is associated a unique x ∈ X such that f(x) = y, or x = f⁻¹(y). Therefore,
(fof⁻¹)(y) = f[f⁻¹(y)] = f(x) = y = IY(y) ⇒ fof⁻¹ = IY.
Hence f⁻¹of = IX and fof⁻¹ = IY; the product of any invertible mapping f with its inverse mapping f⁻¹ is an identity mapping.


Theorem 1.10.9 Composition of functions is associative.
Proof: Let X, Y, Z, T be four non-empty sets, and let f : X → Y, g : Y → Z and h : Z → T be three mappings. We are to show that (hog)of = ho(gof). Now hog : Y → T and gof : X → Z, so (hog)of : X → T and ho(gof) : X → T. Let x be an arbitrary element of X. Then
[(hog)of](x) = (hog)[f(x)] = h[g{f(x)}] = h[(gof)(x)] = [ho(gof)](x), ∀x ∈ X.
Hence (hog)of = ho(gof), and so composition of functions is associative.
Theorem 1.10.10 Let f : X → Y and g : Y → X. If gof is the identity function on X and fog is the identity function on Y, then g = f⁻¹.
Proof: We are given that gof = IX and fog = IY; we are to show that g = f⁻¹. We first prove that f is invertible, i.e., that it is one-one and onto. Now,
f(x1) = f(x2) ⇒ g[f(x1)] = g[f(x2)] ⇒ (gof)(x1) = (gof)(x2) ⇒ IX(x1) = IX(x2) ⇒ x1 = x2.
So f is one-one. To show that f is onto, let y be an arbitrary element of Y and let g(y) = x. Then
g(y) = x ⇒ f[g(y)] = f(x) ⇒ (fog)(y) = f(x) ⇒ IY(y) = f(x) ⇒ y = f(x).
Thus, for each y ∈ Y there is an x ∈ X such that f(x) = y, so f is onto. Now,
fog = IY ⇒ f⁻¹o(fog) = f⁻¹oIY ⇒ (f⁻¹of)og = f⁻¹ (associativity) ⇒ IXog = f⁻¹ ⇒ g = f⁻¹.
Theorem 1.10.11 Let f : A → B and g : B → C be invertible. Then gof is invertible and (gof)⁻¹ = f⁻¹og⁻¹.
Proof: Since f and g are invertible, they are bijective. Let a1, a2 ∈ A. Now,
(gof)(a1) = (gof)(a2) ⇒ g{f(a1)} = g{f(a2)} ⇒ f(a1) = f(a2), as g is one-one, ⇒ a1 = a2, as f is one-one.
Hence gof is one-one. Let c ∈ C. Since g is onto, c has a pre-image, say b ∈ B, such that g(b) = c. Again, since f is onto, b ∈ B has a pre-image a ∈ A such that f(a) = b. Now,
(gof)(a) = g{f(a)} = g(b) = c.
Thus, for every element c ∈ C there is an element a ∈ A such that (gof)(a) = c, and hence gof is onto. Thus gof is bijective and so it is invertible. Now,
(gof)(a) = c ⇒ (gof)⁻¹(c) = a.
Also, (f⁻¹og⁻¹)(c) = f⁻¹{g⁻¹(c)} = f⁻¹(b) = a, as g(b) = c and f(a) = b.
Therefore (gof)⁻¹(c) = (f⁻¹og⁻¹)(c), i.e., (gof)⁻¹ = f⁻¹og⁻¹.


Therefore, if f and g are one-one mappings of A onto B and of B onto C respectively, so that f and g are both invertible, then gof is also invertible and (gof)⁻¹ = f⁻¹og⁻¹.
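On finite sets, Theorem 1.10.11 can be checked directly by storing the functions as dictionaries. This is a sketch of ours (the sets and the helper invert are hypothetical, not from the book).

```python
# Theorem 1.10.11 on small finite sets, with mappings stored as dicts.
f = {1: 'x', 2: 'y', 3: 'z'}        # f : A -> B, bijective
g = {'x': 10, 'y': 20, 'z': 30}     # g : B -> C, bijective

def invert(h):
    """Inverse of a bijective dict: swap keys and values."""
    return {v: k for k, v in h.items()}

gof = {a: g[f[a]] for a in f}                  # g o f : A -> C
lhs = invert(gof)                              # (gof)^(-1)
g_inv, f_inv = invert(g), invert(f)
rhs = {c: f_inv[g_inv[c]] for c in g_inv}      # f^(-1) o g^(-1)
assert lhs == rhs == {10: 1, 20: 2, 30: 3}     # (gof)^(-1) = f^(-1) o g^(-1)
```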
Theorem 1.10.12 Let f : A → B and g : B → C both be injective. Then gof : A → C is also injective.
Proof: Let x1, x2 ∈ A. Now,
(gof)(x1) = (gof)(x2) ⇒ g{f(x1)} = g{f(x2)} ⇒ f(x1) = f(x2), since g is injective, ⇒ x1 = x2, since f is injective.
Hence gof is injective.
Theorem 1.10.13 Let f : A → B and g : B → C be two surjective functions. Then gof : A → C is surjective.
Proof: Let z ∈ C. Since g is surjective, there exists y ∈ B such that g(y) = z. Again, since f is surjective, there exists x ∈ A such that f(x) = y. Now,
(gof)(x) = g{f(x)} = g(y) = z,
and this holds for arbitrary z. Hence gof is surjective.
Theorem 1.10.14 If f : A → B and g : B → C are two mappings and gof is bijective, then f is one-one and g is onto.
Proof: Let any two elements x1, x2 ∈ A be such that f(x1) = f(x2). Then
(gof)(x1) = g[f(x1)] = g[f(x2)] = (gof)(x2).
Since gof is one-one, (gof)(x1) = (gof)(x2) ⇒ x1 = x2. Therefore f(x1) = f(x2) ⇒ x1 = x2, and hence f is one-one.
Again, since gof : A → C is onto, for every z ∈ C there exists an element x ∈ A such that (gof)(x) = z, i.e., z = g[f(x)]. Now, for each x ∈ A we have f(x) = y ∈ B. Thus, for each z ∈ C there is an element y ∈ B such that z = g(y). Hence g : B → C is onto.
Theorem 1.10.15 The inverse of the inverse of a function is the function itself, i.e., (f⁻¹)⁻¹ = f.
Proof: Let the mapping f : A → B be invertible. Then there is a function g = f⁻¹ : B → A such that
f(x) = y ⇔ x = g(y), x ∈ A and y ∈ B.
Again, f, being invertible, is one-one and onto, and so g is one-one and onto, i.e., g is invertible and g⁻¹ exists. Now,
(fog)(y) = f[g(y)] = f(x) = y, as f(x) = y.
This shows that fog = IB, the identity mapping on B. Thus f is the inverse of g, i.e., f = g⁻¹. This gives (f⁻¹)⁻¹ = f.


Definition 1.10.1 Images and inverse images of sets under a mapping: Let X and Y be any two non-empty sets and let f be a mapping of X into Y. Let A ⊆ X and B ⊆ Y. Then we define
f(A) = {y ∈ Y : y = f(x) for some x ∈ A}   (1.20)
and f⁻¹(B) = {x ∈ X : f(x) ∈ B}.   (1.21)
Thus, y ∈ f(A) ⇔ y = f(x) for some x ∈ A, and x ∈ f⁻¹(B) ⇔ f(x) ∈ B.
Note: f(x) ∈ f(A) does not necessarily imply that x ∈ A. For example, if we consider the mapping
f : ℝ → ℝ : f(x) = x², ∀x ∈ ℝ,
and if A = [0, 1] is a subset of ℝ, then obviously f(A) = [0, 1]. Also, by the definition of f, we have f(−1) = 1 ∈ [0, 1] = f(A), but −1 ∉ A. However, x ∈ A ⇒ f(x) ∈ f(A).
Theorem 1.10.16 If X and Y are two non-empty sets and f is a mapping of X into Y, then for any subsets A and B of X,
(i) f(A ∪ B) = f(A) ∪ f(B);  (ii) f(A ∩ B) ⊆ f(A) ∩ f(B).
Proof: (i) Let y be an arbitrary element of f(A ∪ B). Then
y ∈ f(A ∪ B) ⇔ y = f(x) for some x ∈ A ∪ B
⇔ y = f(x) for some x ∈ A or some x ∈ B
⇔ y ∈ f(A) or y ∈ f(B) ⇔ y ∈ f(A) ∪ f(B).
Consequently, f(A ∪ B) ⊆ f(A) ∪ f(B) and f(A) ∪ f(B) ⊆ f(A ∪ B), and hence f(A ∪ B) = f(A) ∪ f(B).
(ii) Let y be an arbitrary element of f(A ∩ B). Then
y ∈ f(A ∩ B) ⇒ y = f(x) for some x ∈ A ∩ B
⇒ y = f(x) for some x with x ∈ A and x ∈ B
⇒ y ∈ f(A) and y ∈ f(B) ⇒ y ∈ f(A) ∩ f(B).
Consequently, f(A ∩ B) ⊆ f(A) ∩ f(B). Note that the inclusion cannot in general be replaced by equality. For example, if A = [−1, 0] and B = [0, 1] are two subsets of the set ℝ of all real numbers and
f : ℝ → ℝ : f(x) = x², ∀x ∈ ℝ,
then clearly f(A) = [0, 1] and f(B) = [0, 1], so that f(A) ∩ f(B) = [0, 1]; and since A ∩ B = {0}, we have f(A ∩ B) = {0}. Thus f(A ∩ B) ≠ f(A) ∩ f(B), and in general only f(A ∩ B) ⊆ f(A) ∩ f(B) holds.
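The counterexample for part (ii) can be reproduced on finite stand-ins for the intervals. In this sketch of ours, A and B are small grids of points inside [−1, 0] and [0, 1] (an assumption for illustration, since code cannot hold a whole interval), chosen so that all squares are exact in binary floating point.

```python
# f(x) = x^2 on finite grids A ⊂ [-1, 0] and B ⊂ [0, 1]:
# the image of the intersection is strictly smaller than the
# intersection of the images, as in Theorem 1.10.16(ii).
def f(x):
    return x * x

A = {-1.0, -0.75, -0.5, -0.25, 0.0}
B = {0.0, 0.25, 0.5, 0.75, 1.0}

def image(S):
    return {f(x) for x in S}

assert image(A & B) == {0.0}                      # f(A ∩ B) = {0}
assert image(A) == image(B)                       # both equal the set of squares
assert image(A & B) < image(A) & image(B)         # proper inclusion
```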
Theorem 1.10.17 If X and Y are two non-empty sets and f is a mapping of X into Y, then for any subsets A and B of Y,
(i) f⁻¹(A ∪ B) = f⁻¹(A) ∪ f⁻¹(B) and (ii) f⁻¹(A ∩ B) = f⁻¹(A) ∩ f⁻¹(B).
Proof: (i) Let x be an arbitrary element of f⁻¹(A ∪ B). Then
x ∈ f⁻¹(A ∪ B) ⇔ f(x) ∈ A ∪ B ⇔ f(x) ∈ A or f(x) ∈ B
⇔ x ∈ f⁻¹(A) or x ∈ f⁻¹(B) ⇔ x ∈ f⁻¹(A) ∪ f⁻¹(B).
Consequently, f⁻¹(A ∪ B) ⊆ f⁻¹(A) ∪ f⁻¹(B) and f⁻¹(A) ∪ f⁻¹(B) ⊆ f⁻¹(A ∪ B), and hence f⁻¹(A ∪ B) = f⁻¹(A) ∪ f⁻¹(B).
(ii) Let x be an arbitrary element of f⁻¹(A ∩ B). Then
x ∈ f⁻¹(A ∩ B) ⇔ f(x) ∈ A ∩ B ⇔ f(x) ∈ A and f(x) ∈ B
⇔ x ∈ f⁻¹(A) and x ∈ f⁻¹(B) ⇔ x ∈ f⁻¹(A) ∩ f⁻¹(B).
Consequently, f⁻¹(A ∩ B) ⊆ f⁻¹(A) ∩ f⁻¹(B) and f⁻¹(A) ∩ f⁻¹(B) ⊆ f⁻¹(A ∩ B), and hence f⁻¹(A ∩ B) = f⁻¹(A) ∩ f⁻¹(B).
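Unlike images, pre-images respect both union and intersection, and this is easy to test even for a non-injective function. The sketch below is our own finite model (the sets and the helper preimage are illustrative assumptions).

```python
# Theorem 1.10.17 on a finite model: f(x) = x^2 on X = {-2,...,2},
# which is not injective, yet preimages distribute over both | and &.
X = {-2, -1, 0, 1, 2}
f = {x: x * x for x in X}

def preimage(S):
    return {x for x in X if f[x] in S}

A, B = {0, 1}, {1, 4}
assert preimage(A | B) == preimage(A) | preimage(B)   # union case
assert preimage(A & B) == preimage(A) & preimage(B)   # intersection case
```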
Theorem 1.10.18 If X and Y are two non-empty sets and f is a mapping of X into Y, then: (i) for any subset A of X, A ⊆ f⁻¹[f(A)], and in general A ≠ f⁻¹[f(A)]; (ii) for any subset B of Y, f[f⁻¹(B)] ⊆ B, and further, if B is a subset of the range of f, then f[f⁻¹(B)] = B.
Proof: (i) Let A be any subset of X. If A = ∅, the result is obvious. So let A ≠ ∅ and let x be an arbitrary element of A. Then
x ∈ A ⇒ f(x) ∈ f(A) ⇒ x ∈ f⁻¹[f(A)].
Hence A ⊆ f⁻¹[f(A)]. To show that in general A ≠ f⁻¹[f(A)], consider the mapping f : ℝ → ℝ : f(x) = x², ∀x ∈ ℝ, and let A = [−1, 0] ⊆ ℝ. Then obviously f(A) = [0, 1] and therefore f⁻¹[f(A)] = [−1, 1] ≠ A. Thus in general A ≠ f⁻¹[f(A)].
(ii) Let B be any subset of Y and let y be an arbitrary element of f[f⁻¹(B)]. Then
y ∈ f[f⁻¹(B)] ⇒ y = f(x) for some x ∈ f⁻¹(B)
⇒ y = f(x) such that f(x) ∈ B ⇒ y ∈ B.
Hence f[f⁻¹(B)] ⊆ B. Further, if B is a subset of the range of f, then for each y ∈ B there exists an x ∈ f⁻¹(B) such that y = f(x), and so
y ∈ B ⇒ y = f(x) for some x ∈ f⁻¹(B) ⇒ y ∈ f[f⁻¹(B)],
i.e., B ⊆ f[f⁻¹(B)]. Hence it follows that f[f⁻¹(B)] = B. Note that f[f⁻¹(B)] = B holds only when B is a subset of the range of f, so in general f[f⁻¹(B)] ≠ B. For example, consider the mapping f : ℝ → ℝ : f(x) = x², ∀x ∈ ℝ, and let B = [−1, 0]. Then f⁻¹(B) = {0}, and therefore f[f⁻¹(B)] = {0} ≠ [−1, 0] = B. Thus in general f[f⁻¹(B)] ≠ B.

1.11 Permutation

A permutation of a non-empty finite set is defined to be a bijective mapping of the set onto itself. Let a, b, c, …, k be any arrangement of the set of positive integers 1, 2, …, n. The one-one mapping p : S → S of the finite set S = {1, 2, …, n} onto itself, where p(1) = a, p(2) = b, …, p(n) = k, denoted by the symbol
p = ( 1 2 3 ⋯ n → p(1) p(2) p(3) ⋯ p(n) ) = ( 1 2 3 ⋯ n → a b c ⋯ k ),   (1.22)
is known as a permutation of degree n, or on n symbols. (The symbol is written here on one line, the arrow separating the top row of the usual two-row notation from the bottom row of images.) Obviously, the order of the columns in the symbol is immaterial so long as the corresponding elements above and below in each column remain unchanged. The order in which the first row is written does not matter; what actually matters is which element is replaced by which. Thus (1 2 3 → a b c), (2 1 3 → b a c) and (2 3 1 → b c a) are the same. In the standard form, the elements in the top row are in natural order. If p is a permutation on n symbols, then the set of all such permutations, denoted by Pn, contains n! distinct elements, as n distinct elements can be arranged in n! ways; it is known as the symmetric set of permutations.
Ex 1.11.1 Construct the symmetric set of permutations P3.
Solution: The symmetric set of permutations P3 contains 3! = 6 elements, where each permutation has 3 symbols. Therefore,
P3 = { (1 2 3 → 1 2 3), (1 2 3 → 1 3 2), (1 2 3 → 2 1 3), (1 2 3 → 2 3 1), (1 2 3 → 3 1 2), (1 2 3 → 3 2 1) }.

1.11.1 Equal permutations
Two permutations p and q of degree n are said to be equal if p(a) = q(a) for all a ∈ S. For example, the permutations p = (2 4 3 1 → 3 1 4 2) and q = (1 2 3 4 → 2 3 4 1) are equal permutations, since both send 1 → 2, 2 → 3, 3 → 4 and 4 → 1.

1.11.2 Identity permutation
If S = {a1, a2, …, an}, then the bijective mapping I : S → S defined by I(ai) = ai is called the identity permutation of degree n. For example,
I = (a b c → a b c) or (1 2 3 ⋯ n → 1 2 3 ⋯ n)
are identity permutations. Thus, if there is no change of the elements, i.e., if each element is replaced by itself, the permutation is called the identity permutation and is denoted by I.

1.11.3 Product of permutations
Since a permutation is just a bijective mapping, the product of two permutations is just the composite of two mappings. Let S = {a1, a2, …, an} and let p : S → S and q : S → S be two permutations of S. Since the range of q equals the domain of p, the composite mapping poq : S → S is defined. Since the permutations p and q are bijective, poq is also bijective; hence poq is a permutation of S. The product of the two permutations p and q, denoted by poq or simply pq, is defined by
pq = ( a1 a2 a3 ⋯ an → p[q(a1)] p[q(a2)] p[q(a3)] ⋯ p[q(an)] ).
Since composition of mappings is non-commutative, pq ≠ qp in general. Also, since composition of mappings is associative, p(qr) = (pq)r. Let p be a permutation on a finite set of degree n; then we define
pⁿ = p·p ⋯ p (n factors), n ∈ ℕ,
with p⁰ = I. Also, for all integral values of m, n we have the index laws (i) p^m p^n = p^{m+n} and (ii) (p^m)^n = p^{mn}. As pq ≠ qp in general, (pq)^n = p^n q^n does not hold. If there exists a least positive integer k such that p^k = I, then k is called the order of the permutation p.
Ex 1.11.2 If p = (1 2 3 4 5) and q = (2 3)(4 5), find pq.
Solution: Using the definition of the product of permutations, we have
pq = (1 2 3 4 5 → 2 3 4 5 1)(1 2 3 4 5 → 1 3 2 5 4) = (1 2 3 4 5 → 2 4 3 1 5) = (1 2 4).
The product is computed by the mapping procedure from the right, as follows from the definition (fg)(x) = f(g(x)): for instance, 2 → q(2) = 3 → p(3) = 4. Similarly,
qp = (1 2 3 4 5 → 1 3 2 5 4)(1 2 3 4 5 → 2 3 4 5 1) = (1 2 3 4 5 → 3 2 5 4 1) = (1 3 5).

1.11.4 Inverse of permutations
Let S = {a1, a2, …, an} and let p : S → S be a permutation of S. As p : S → S is a bijective mapping, it has a unique inverse, which is also bijective. Let p⁻¹ be the inverse; then p⁻¹ : S → S is the permutation of S defined by
p⁻¹ = ( p(a1) p(a2) p(a3) ⋯ p(an) → a1 a2 a3 ⋯ an ).
The important property is that pp⁻¹ = p⁻¹p = I.
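Products of permutations are easy to compute mechanically. In this sketch of ours (an illustrative encoding, not the book's), a permutation of degree n is stored as a tuple whose i-th entry is the image of i + 1; the values reproduce Ex 1.11.2.

```python
# p = (1 2 3 4 5) and q = (2 3)(4 5) from Ex 1.11.2, stored as tuples
# where the entry at index i is the image of i + 1.
p = (2, 3, 4, 5, 1)      # 1->2, 2->3, 3->4, 4->5, 5->1
q = (1, 3, 2, 5, 4)      # 1->1, 2->3, 3->2, 4->5, 5->4

def prod(a, b):
    """(ab)(x) = a(b(x)): apply b first, then a."""
    return tuple(a[b[i] - 1] for i in range(len(b)))

assert prod(p, q) == (2, 4, 3, 1, 5)   # the cycle (1 2 4)
assert prod(q, p) == (3, 2, 5, 4, 1)   # the cycle (1 3 5)
```

The two assertions also confirm the non-commutativity pq ≠ qp noted in the text.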
Ex 1.11.3 If p = (1 2 4 3) and q = (1 4 3 2), show that (pq)⁻¹ = q⁻¹p⁻¹.
Solution: Here p = (1 2 3 4 → 2 4 1 3) and q = (1 2 3 4 → 4 1 2 3), so the product of p and q is given by
pq = (1 2 3 4 → 2 4 1 3)(1 2 3 4 → 4 1 2 3) = (1 2 3 4 → 3 2 4 1) = (1 3 4).
Therefore,
(pq)⁻¹ = (3 2 4 1 → 1 2 3 4) = (1 2 3 4 → 4 2 1 3) = (1 4 3).
As p = (1 2 4 3) and q = (1 4 3 2), the inverses are given by
p⁻¹ = (2 4 1 3 → 1 2 3 4) = (1 2 3 4 → 3 1 4 2); q⁻¹ = (4 1 2 3 → 1 2 3 4) = (1 2 3 4 → 2 3 4 1).
Therefore,
q⁻¹p⁻¹ = (1 2 3 4 → 2 3 4 1)(1 2 3 4 → 3 1 4 2) = (1 2 3 4 → 4 2 1 3) = (1 4 3).
Hence (pq)⁻¹ = q⁻¹p⁻¹ is verified.
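Ex 1.11.3 can also be checked mechanically, again with our tuple encoding (entry at index i is the image of i + 1); the helpers prod and inv are illustrative, not from the book.

```python
# Ex 1.11.3: p = (1 2 4 3), q = (1 4 3 2) on {1, 2, 3, 4}.
p = (2, 4, 1, 3)   # 1->2, 2->4, 3->1, 4->3
q = (4, 1, 2, 3)   # 1->4, 2->1, 3->2, 4->3

def prod(a, b):
    """(ab)(x) = a(b(x)): apply b first, then a."""
    return tuple(a[b[i] - 1] for i in range(len(b)))

def inv(a):
    """Inverse permutation: if a sends i+1 to v, inv sends v back to i+1."""
    out = [0] * len(a)
    for i, v in enumerate(a):
        out[v - 1] = i + 1
    return tuple(out)

assert prod(p, q) == (3, 2, 4, 1)                    # the cycle (1 3 4)
assert inv(prod(p, q)) == prod(inv(q), inv(p))       # (pq)^(-1) = q^(-1) p^(-1)
assert inv(prod(p, q)) == (4, 2, 1, 3)               # the cycle (1 4 3)
```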

1.11.5 Cyclic permutation
Let S = {a1, a2, …, an}. A permutation p : S → S is said to be a cycle of length r, or an r-cycle, if there are r elements ai1, ai2, …, air in S such that p(ai1) = ai2, p(ai2) = ai3, …, p(air−1) = air, p(air) = ai1, and p(aj) = aj for all j ≠ i1, i2, …, ir. The cycle is denoted by (ai1, ai2, …, air), and the elements appearing in the fixed cyclic order ai1, ai2, …, air are said to be the elements of the cycle.
(i) Two cycles p and q on the same set S = {a1, a2, …, an} are said to be disjoint if they have no common elements.
(ii) A cycle of length 2 is called a transposition. A cycle of length 1 may be ignored.
Theorem 1.11.1 Every permutation on a finite set is either a cycle or it can be expressed as a product of disjoint cycles.
Proof: Let S = {a1, a2, …, an} and let p : S → S be a permutation on S. Consider the elements a1, p(a1), p²(a1), …; these cannot all be distinct, as all of them belong to the finite set S. Let k be the least positive integer such that p^k(a1) = a1. Then a1, p(a1), p²(a1), …, p^{k−1}(a1) are k distinct elements of S, because if p^r(a1) = p^s(a1) for some r, s with 0 ≤ s < r < k, then p^{r−s}(a1) = p⁰(a1) = a1, which contradicts the fact that k is the least positive integer satisfying p^k(a1) = a1.
Consider the k-cycle p1 = (a1, p(a1), p²(a1), …, p^{k−1}(a1)). If k = n, then p = p1 and the theorem is proved. If k < n, let am be the first element among a2, a3, …, an that does not belong to the cycle p1, and consider the elements am, p(am), p²(am), …. None of these belongs to p1, because if p^i(a1) = p^j(am) for some integers i, j, then p^{i−j}(a1) = am, a contradiction. Arguing as before, we arrive at a cycle p2 of length m, say. If k + m = n, then p is the product of the disjoint cycles p1 and p2. If k + m < n, let at be the first element among a3, a4, …, an that belongs to neither p1 nor p2, and proceed as before.
Since S is a finite set, this process terminates after a finite number of steps, and we arrive at the decomposition of p as the product p1 p2 ⋯ pr of disjoint cycles.
(i) Let p be a permutation on a finite set S = {a1, a2, …, an}. The order of the permutation p : S → S is the least positive integer n such that pⁿ = I, I being the identity permutation.
(ii) The order of a k-cycle is k.
(iii) The order of a permutation on a finite set is the l.c.m. of the lengths of its disjoint cycles.
(iv) Every permutation on a finite set S = {a1, a2, …, an}, n ≥ 2, can be expressed as a product of transpositions.
(v) A permutation p is said to be an even permutation if it can be expressed as a product of an even number of transpositions.
(vi) A permutation p is said to be an odd permutation if it can be expressed as a product of an odd number of transpositions.
(vii) The number of even permutations on a finite set S = {a1, a2, …, an}, n ≥ 2, is equal to the number of odd permutations on it.

Ex 1.11.4 Express p = (1 2 3 4 5 6 7 8 → 3 5 4 1 2 6 8 7) as the product of disjoint cycles.
Solution: Here p is not a cycle. Note that
p(1) = 3, p²(1) = p(3) = 4, p³(1) = p(4) = 1.
Thus the first cycle is (1 3 4). Since 2 ∉ (1 3 4), we compute
p(2) = 5, p²(2) = p(5) = 2.
Thus the second cycle is (2 5). Also, as 6 belongs to neither (1 3 4) nor (2 5), and p(6) = 6, the third cycle is (6). Again,
p(7) = 8, p²(7) = p(8) = 7.
Thus the fourth cycle is (7 8). All elements have been exhausted. Therefore,
p = (1 3 4)(2 5)(6)(7 8) = (1 3 4)(2 5)(7 8) = (1 4)(1 3)(2 5)(7 8).
Also, we see that p is the product of 4 (an even number of) transpositions, so p is an even permutation.
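The procedure followed in Ex 1.11.4 — trace each unvisited element around until it returns — is exactly a small algorithm. The sketch below is our own implementation of that procedure (the function name disjoint_cycles is ours); applied to the permutation of the example it recovers the same cycles.

```python
# Disjoint-cycle decomposition, following the procedure in Ex 1.11.4.
def disjoint_cycles(p):
    """p is a dict i -> p(i); returns the cycles, fixed points included."""
    seen, cycles = set(), []
    for start in sorted(p):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:          # follow start, p(start), p^2(start), ...
            seen.add(x)
            cycle.append(x)
            x = p[x]
        cycles.append(tuple(cycle))
    return cycles

p = {1: 3, 2: 5, 3: 4, 4: 1, 5: 2, 6: 6, 7: 8, 8: 7}
assert disjoint_cycles(p) == [(1, 3, 4), (2, 5), (6,), (7, 8)]
```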
Ex 1.11.5 Describe all the permutations on the set {1, 2, 3} and their respective orders.
Solution: There are 3! = 6 possible permutations on the set S = {1, 2, 3}, given by
(1 2 3 → 1 2 3) = I, (1 2 3 → 2 3 1) = (1 2 3), (1 2 3 → 3 1 2) = (1 3 2),
(1 2 3 → 1 3 2) = (2 3), (1 2 3 → 3 2 1) = (1 3), (1 2 3 → 2 1 3) = (1 2).
Thus the even permutations are {I, (1 2 3), (1 3 2)} and the odd permutations are {(2 3), (1 3), (1 2)}. As I¹ = I, the order of I is 1. As (1 2 3) and (1 3 2) are cycles of length 3, their orders are 3. As (2 3), (1 3) and (1 2) are cycles of length 2, their orders are 2.

1.12 Enumerable Set

Let S be a set and let ℕ be the set of natural numbers. The set S is defined to be enumerable (or denumerable, or countable) if there is a bijection f : S → ℕ. So corresponding to every positive integer n there exists one and only one element of an enumerable set; this element may be denoted by an or bn or un, etc. Thus a countable set can be written as {a1, a2, …, an, …}. For example, the set S = {2n : n ∈ ℕ} is an enumerable set.
(i) A countable set, in this sense, is an infinite set.
(ii) Obviously an enumerable set is an infinite set, but not every infinite set is enumerable. If an infinite set is enumerable, it is sometimes said to be enumerably infinite. It is needless to say that a non-enumerable infinite set cannot be written as {a1, a2, …, an, …}.
(iii) A set S is defined to be at most enumerable if it is either finite or enumerably infinite.
(iv) Any subset of an enumerable set is at most enumerable.
(v) Any superset of a non-enumerable set is non-enumerable.
Theorem 1.12.1 The union of a finite set and an enumerable set is enumerable.
Proof: Let A be a finite set, written as A = {a1, a2, …, ar}, and let B = {b1, b2, …, bn, …} be an enumerable set. If A ∩ B = ∅, we can define a bijective mapping f : A ∪ B → ℕ such that f(a1) = 1, f(a2) = 2, …, f(ar) = r, and then f(bk) = r + k for k = 1, 2, …. That is, A ∪ B may be written as
A ∪ B = {a1, a2, …, ar, ar+1, ar+2, …, ar+k, …},
where ar+k = bk, ∀k. Hence A ∪ B is an enumerable set. If A ∩ B ≠ ∅, let B1 = B − A; then B1 ∪ A = A ∪ B and B1 ∩ A = ∅. Now B1 is an infinite subset of B, and therefore B1 is enumerable. Hence B1 ∪ A is enumerable, and so A ∪ B is enumerable.
Theorem 1.12.2 Union of finite number of enumerable sets is enumerable.
Proof: Let A1 , A2 , . . . , Ar be a finite number of enumerable sets. Let
A1 = {a11 , a12 , a13 , . . . , a1n , . . .}
A2 = {a21 , a22 , a23 , . . . , a2n , . . .}
. . . . . . . . .
Ar = {ar1 , ar2 , ar3 , . . . , arn , . . .}
We can write the elements of A1 ∪ A2 ∪ · · · ∪ Ar as
A1 ∪ A2 ∪ · · · ∪ Ar = {a11 , a21 , a31 , . . . , ar1 , a12 , a22 , a32 , . . . , ar2 , . . . , a1n , a2n , a3n , . . . , arn , . . .}.
Hence A1 ∪ A2 ∪ · · · ∪ Ar is an enumerable set.
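The column-by-column listing in the proof above (a11, a21, . . . , ar1, a12, . . .) amounts to reading the r sequences in round-robin order. A minimal sketch of ours, assuming the sets are pairwise disjoint so no duplicates arise:

```python
from itertools import count, islice

def finite_union_enumeration(generators):
    """List a finite family of enumerable sets column by column:
    a11, a21, ..., ar1, a12, a22, ..., ar2, ... (as in the proof)."""
    while True:
        for g in generators:
            yield next(g)

# Three enumerable sets: multiples of 10, of 100, and of 1000
gens = [(10 * n for n in count(1)),
        (100 * n for n in count(1)),
        (1000 * n for n in count(1))]
first_six = list(islice(finite_union_enumeration(gens), 6))
# first_six == [10, 100, 1000, 20, 200, 2000]
```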

Theorem 1.12.3 Union of an enumerable set of enumerable sets is enumerable.

68

Theory of Sets

Proof: Let {A1 , A2 , . . . , An , . . .} be an enumerable set where each Ai is an enumerable set.
We are to show that A1 ∪ A2 ∪ · · · ∪ An ∪ · · · is enumerable. Let,
A1 = {a11 , a12 , a13 , . . . , a1n , . . .}
A2 = {a21 , a22 , a23 , . . . , a2n , . . .}
. . . . . . . . .
Ar = {ar1 , ar2 , ar3 , . . . , arn , . . .}
. . . . . . . . .
The elements of the union are arranged as:
{a11 ; a12 , a21 ; a13 , a22 , a31 ; . . .},
in which there are several blocks. The k-th block contains all aij such that i + j = k + 1, and
in each block the elements are written in the increasing order of the first suffix. In this
arrangement the element aij lies in the (i + j − 1)-th block and occupies the i-th position
in that block. Also the k-th block contains exactly k elements. Hence in the above
arrangement aij occurs at the [{1 + 2 + · · · + (i + j − 2)} + i]-th position. Hence the union
A1 ∪ A2 ∪ · · · is an enumerable set.
The set of all positive rational numbers can be written as Q+ = A1 ∪ A2 ∪ · · ·, where
Ai = {n/i : n ∈ N }; it is the union of an enumerable set of enumerable sets and hence
enumerable. Now, Q− is similar to Q+ , hence Q = Q+ ∪ Q− ∪ {0} is enumerable.
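The block-by-block (diagonal) arrangement and the position formula can be checked computationally. The following sketch is our own illustration, not the book's; it enumerates the index pairs (i, j):

```python
from itertools import count, islice

def diagonal_enumeration():
    """Enumerate pairs (i, j), i, j >= 1, block by block: the k-th
    block lists the pairs with i + j = k + 1 in increasing order of i."""
    for k in count(1):             # block number
        for i in range(1, k + 1):  # i-th position within the k-th block
            yield (i, k + 1 - i)

def position(i, j):
    """Position of a_ij in the arrangement: {1 + 2 + ... + (i+j-2)} + i."""
    s = i + j - 2
    return s * (s + 1) // 2 + i

pairs = list(islice(diagonal_enumeration(), 10))
# pairs begins (1,1); (1,2), (2,1); (1,3), (2,2), (3,1); ...
assert all(position(i, j) == n for n, (i, j) in enumerate(pairs, start=1))
```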
Theorem 1.12.4 The set of real numbers is non-enumerable.
Proof: We shall first show that the interval 0 < x ≤ 1 is non-enumerable. If possible, let
us assume that this set is enumerable. Then the real numbers lying in the above interval
can be written as {a1 , a2 , . . . , an , . . .}. Each such real number can be expressed as an
infinite decimal (if we agree not to use recurring 9's, this can be done in only one way). Let,
a1 = 0.a11 a12 a13 . . .
a2 = 0.a21 a22 a23 . . .
. . . . . . . . .
an = 0.an1 an2 an3 . . .
Now we construct a number b = 0.b1 b2 b3 . . ., where br is different from arr , 0 and 9 for all
r. Obviously b is a real number lying in 0 < x ≤ 1, and so b must itself appear somewhere
in the succession {a1 , a2 , . . . , an , . . .} if this succession is to contain all real numbers between
0 and 1. But b is different from every ai , since it differs from ai at least in the i-th place
of decimals. This contradicts the assumption that the given interval is an enumerable set.
Hence 0 < x ≤ 1 is non-enumerable. The whole set of real numbers is a superset of this
non-enumerable set and hence is non-enumerable.
(i) The open interval (0, 1) is non-enumerable. For if it were enumerable then (0, 1) ∪ {1},
i.e. 0 < x ≤ 1, would also be enumerable, which contradicts the above result.
(ii) The closed interval [0, 1], i.e. 0 ≤ x ≤ 1, being a superset of the non-enumerable set
0 < x ≤ 1, is also non-enumerable.

Enumerable Set

69

(iii) Any interval a ≤ x ≤ b is non-enumerable. First we shall prove that a < x ≤ b is
non-enumerable. Let us define the bijection f (x) = (x − a)/(b − a), (b > a); then as
x goes from a to b, f (x) goes from 0 to 1. Hence the interval a < x ≤ b is similar to
0 < x ≤ 1.
As 0 < x ≤ 1 is non-enumerable, so a < x ≤ b is also non-enumerable. Hence its super
set [a, b] is also non-enumerable.
(iv) The open interval (a, b) is non-enumerable.
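Cantor's diagonal construction in the proof of Theorem 1.12.4 can be illustrated on any finite list of decimal expansions. The following sketch is ours (the sample digits are arbitrary); it always chooses the digit 5, or 4 when the diagonal digit is itself 5, so that each b_r differs from a_rr, 0 and 9:

```python
def diagonal(decimals):
    """Given decimal expansions a_i = 0.a_i1 a_i2 ... (as digit strings),
    build b = 0.b_1 b_2 ... with b_r != a_rr, b_r != 0 and b_r != 9."""
    digits = []
    for r, a in enumerate(decimals):
        d = int(a[r])                          # r-th digit of the r-th number
        digits.append('5' if d != 5 else '4')  # differs from d, 0 and 9
    return '0.' + ''.join(digits)

# b differs from each listed number in at least one decimal place:
sample = ['14159', '55000', '33333', '99995']  # digits after "0."
b = diagonal(sample)
# b == '0.5455'
```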

Exercise 1
Section-A
[Multiple Choice Questions]
1. A ∪ ∅ and A ∩ ∅ are respectively
(a) A, A′ (b) ∅, ∅ (c) A, ∅ (d) A, A.
2. (A ∪ B) ∩ (B ∪ C) equals
(a) B (b) A ∪ B ∪ C (c) A ∩ (B ∪ C) (d) (A′ ∪ C ′)′ ∪ B.

3. ((((A ∪ B) ∩ A) ∪ B) ∩ A) equals
(a) A (b) B (c) A ∪ B (d) A ∩ B.
4. (A − B) ∪ (B − A) ∪ (A ∩ B) equals
(a) A ∪ B (b) Ac ∪ B c (c) A ∩ B (d) Ac ∩ B c .
5. The number of elements in the power set P (S) of the set S = {{}, 1, {2, 3}} is
(a) 2
(b) 4
(c) 8
(d) None of these.
6. Let A be a finite set of size n. The number of elements in the power set of A × A is
(a) 2^(2^n)
(b) 2^(n²)
(c) (2^n)²
(d) None of these.
7. The number of binary relations on a set with n elements is
(a) 2^(n²)
(b) 2^n
(c) 2^(2n)
(d) None of these.
8. Suppose A is a finite set with n elements. The number of elements in the largest
equivalence relation of A is
(a) 1
(b) n
(c) n + 1
(d) n2 .
9. The number of equivalence relations of the set {1, 2, 3, 4} is
(a) 4
(b) 15
(c) 16
(d) 24
10. The power set 2^S of the set S = {3, {1, 4}, 5} is
(a) {S, 3, 1, 4, {1, 3, 5}, {1, 4, 5}, {3, 4}, }
(b) {S, 3, {1, 4}, 5}
(c) {S, {3}, {3, {1, 4}}, {3, 5}, }
(d) None of these.
11. If there is no onto function from {1, 2, . . . , m} onto {1, 2, . . . , n}, then
(a) m = n (b) m < n (c) m > n (d) m ≠ n.
12. If |A B| = 12, A B and |A| = 3, then |B| is
(a) 12 (b) 9 (c) 9 (d) None of these.


13. Let P (S) denote the power set of the set S. Which of the following is always TRUE?
(a) P (P (S)) = P (S) (b) P (S) ∩ S = P (S) (c) P (S) ∩ P (P (S)) = {∅} (d) S ∉ P (S).
14. The number of relations from A to B with |A| = m and |B| = n is
(a) mn (b) 2^n (c) 2^m (d) 2^(mn).
15. If m ρ n iff m² = n, then
(a) (3, 9) ∈ ρ (b) (−3, 9) ∈ ρ (c) (3, −9) ∈ ρ (d) (9, 3) ∈ ρ.
16. The relation ρ on Z defined by m ρ n if m + n is even is
(a) Reflexive (b) Not reflexive (c) Not symmetric (d) Not antisymmetric.
17. The relations ∅ and A × A on a set A are
(a) Both reflexive (b) Both symmetric (c) Both antisymmetric (d) Both equivalence
relations.
18. The relation a ρ b if |a − b| = 2, where a and b are real numbers, is
(a) Neither reflexive nor symmetric (b) Neither symmetric nor transitive (c) An
equivalence relation (d) Symmetric but not transitive.
19. The relation ρ defined in Z by a ρ b if |a − b| < 2 is
(a) Not reflexive (b) Not symmetric (c) Not transitive (d) An equivalence relation.
20. The relation ρ defined in N by m ρ n if m² = n is
(a) Reflexive (b) Symmetric (c) Transitive (d) Antisymmetric.
21. The relation ρ defined in N by m ρ n if m|n or n|m is
(a) Not reflexive (b) Not symmetric (c) Not transitive (d) None of these.
22. A relation ρ defined in N by m ρ n if m and n are relatively prime is
(a) A partial ordering (b) Transitive (c) Not transitive (d) An equivalence relation.
23. The subset relation on a set of sets is
(a) A partial ordering (b) Transitive and symmetric only (c) Transitive and antisymmetric only (d) An equivalence relation.
24. The binary relation S = ∅ on the set A = {1, 2, 3} is
(a) Neither reflexive nor symmetric (b) Symmetric and reflexive (c) Transitive and
reflexive (d) Transitive and symmetric.

25. The less than relation, <, on real is


(a) A partial ordering since it is asymmetric and reflexive
(b) A partial ordering since it is anti-symmetric and reflexive
(c) Not a partial ordering because it is not asymmetric and not reflexive
(d) Not a partial ordering because it is not anti-symmetric and not reflexive.
26. A partial order ≼ is defined on the set S = {x, a1 , a2 , . . . , an , y} as x ≼ ai for all i and
ai ≼ y for all i, where n ≥ 1. The number of total orders on the set S which contain
the partial order ≼ is
(a) 1 (b) n (c) n + 2 (d) n!.
27. The number of possible partial orderings on {a, b, c} in which a ≼ b is
(a) 3
(b) 4
(c) 5
(d) 6

[Figure 1.32: Poset — Hasse diagram on the elements a, b, c, d, e, f, g]

28. In the lattice defined by the Hasse diagram (Fig. 1.32), how many complements does the element e have?
(a) 2 (b) 3 (c) 0 (d) 1.
29. The maximal and minimal elements of poset given by the Hasse diagram (Fig. 1.33)
are
[Figure 1.33: Poset — Hasse diagram on the elements 1, 2, 3, 4, 5, 6]
(a) Max=5,6; Min.=2 (b) Max.=5,6; Min.=1 (c) Max.=3,5: Min.=1,6 (d) None of
the above.
30. The greatest and least element of the poset given by the Hasse diagram (Fig. 1.34) are
[Figure 1.34: Poset — Hasse diagram on the elements 1, 2, 3, 4, 5]
(a) Greatest=4,5; least=1,2 (b) Greatest=5; least=1 (c) Greatest=None; least=1 (d)
None of the above.
31. If a b c in a lattice L
(a) a b = b c (b) a b = b c (c) a b = b c (d) a b = b c
32. In a lattice L, a b = b. Then
(a) a b (b) b a (c) a b = a (d) None of these.
33. Let X = {2, 3, 6, 12, 24}, and let ≼ be the partial order defined by x ≼ y if x divides y.
Then the number of edges in the Hasse diagram of (X, ≼) is
(a) 3
(b) 4
(c) 5
(d) None of these.
34. In a lattice L, ((a b) a) b is
(a) a b (b) a b (c) (a b) a (d) ((a b) a) b).
35. In a lattice, if a b and c d, then
(a) b c (b) a d (c) a c b d (d) None of these.
36. ({1, 2, 5, 10, 15, a}, |) is a lattice if the smallest value for a is
(a) 150 (b) 100 (c) 75 (d) 30.
37. S = {1, 2, 3, 12} and T = {1, 2, 3, 24}, then
(a) S and T are sublattices of (D24 , |)

(b) Neither S nor T are sublattices of (D24 , |)
(c) S and T are sublattices of ({1, 2, 3, 12}, |)
(d) S and T are sublattices of ({1, 2, 3, 24}, |)

38. S = {1, 2, 4, 8} and T = {1, 3, 9}, then


(a) Only S is a sublattice of (D72 , |)
(b) Only T is a sublattices of (D72 , |)
(c) Both S and T are sublattices of (D72 , |)
(d) Neither S nor T is a sublattice of (D72 , |).
39. If the posets P1 and P2 are given in Fig. 1.35, then
[Figure 1.35: Poset for Self-Test — Hasse diagrams of the posets P1 and P2]


(a) P1 and P2 are lattices (b) P1 is a lattice (c) P2 is a lattice (d) None of them is a
lattice.

40. ({1, 2, 4, 6, 12, 24}, |) is


(a) Not a poset (b) A lattice (c) A complemented lattice (d) A lattice which is not
complemented.
41. The lattice given in Fig. 1.36 is
[Figure 1.36: Hasse diagram]
(a) Complemented but not distributive (b) Distributive but not complemented (c)
Both complemented and distributive (d) Neither complemented nor distributive.
42. The lattice given in Fig. 1.37 is
[Figure 1.37: Hasse diagram]
(a) Complemented but not distributive (b) Distributive but not complemented (c)
Both complemented and distributive (d) Neither complemented nor distributive.
43. If a, b, c L, L being a distributive lattice, then
(a) (a b) c a (b c) (b) (a b) c (a b) c
(a b) c = c.

(c) (a b) c = a b (d)

44. (D45 , |) is not distributive since


(a) {1, 3, 5, 45} is a sublattice of D45 (b) {1, 3, 9, 45} is a sublattice of D45
(c) {1, 5, 9, 15, 45} is a sublattice of D45 (d) {1, 5, 15, 45} is a sublattice of D45 .


45. A chain with 3 elements is


(a) Complemented but not distributive (b) Distributive but not complemented (c)
Both complemented and distributive (d) Neither complemented nor distributive.
46. The lattice given in Fig. 1.38 is
[Figure 1.38: Hasse diagram]
(a) Complemented but not distributive (b) Distributive but not complemented (c)
Both complemented and distributive (d) Neither complemented nor distributive.
47. In a distributive lattice, if a ∧ b′ = 0, then
(a) b ≤ a (b) a ≤ b (c) a′ ∧ b = 0 (d) a ∨ b′ = 1.
48. ({1, 2, 3, 6, 12, 30, 60}, |) is
(a) Complemented but not distributive (b) Distributive but not complemented (c)
Both complemented and distributive (d) Neither complemented nor distributive.
49. ({{1}, {1, 2}, {1, 3}, {1, 2, 3}}, ) is
(a) Complemented but not distributive (b) Distributive but not complemented (c)
Both complemented and distributive (d) Neither complemented nor distributive.
50. The number of functions from an m element set to an n element set is
(a) m + n
(b) m^n
(c) n^m
(d) m × n.
51. Let A and B be sets with cardinalities m and n respectively. The number of one-one
mappings from A to B, when m < n, is
(a) m^n
(b) nPm
(c) nCm
(d) None of these.
52. It is given that there are exactly 97 functions from the set A to B. From this one can
conclude that
(a) |A| = 1, |B| = 97 (b) |A| = 97, |B| = 1 (c) |A| = 97, |B| = 97 (d) None of
these.
53. Let f : ℜ × ℜ → ℜ × ℜ be a bijective function defined by f (x, y) = (x + y, x − y). The
inverse function of f is given by
(a) f ⁻¹(x, y) = (1/(x + y), 1/(x − y)) (b) f ⁻¹(x, y) = (x − y, x + y)
(c) f ⁻¹(x, y) = ((x + y)/2, (x − y)/2) (d) f ⁻¹(x, y) = (2(x − y), 2(x + y)).
54. The range of g ∘ f when f : Z → Z and g : Z → Z are defined by f (n) = n + 1 and
g(n) = 2n is
(a) Z (b) Z⁺ (c) The set of all odd numbers (d) The set of all even numbers.
55. If f, g, h are functions from ℜ → ℜ defined by f (x) = x + 1, g(x) = x² + 2, h(x) = 2x + 1,
then (h ∘ g ∘ f )(2) is
(a) 20 (b) 23 (c) 21 (d) 22.
56. If f, g, h are functions from Z → Z defined by f (x) = x − 3, g(x) = 2x + 3, h(x) = x + 3,
then g ∘ f ∘ h is
(a) f (b) g (c) h (d) h ∘ g ∘ f.


57. If f is a function from Z → Z defined by f (x) = x + 2, then f ⁻³(10) is
(a) 7
(b) 6
(c) 5
(d) 4.
58. If f and g are functions from ℜ⁺ to ℜ⁺ defined by f (x) = eˣ and g(x) = x − 3, then
(g ∘ f )⁻¹(x) is
(a) log(3 + x)
(b) log(3 − x)
(c) e^(3x)
(d) log(x − 3).
59. f (A1 ∩ A2 ) = f (A1 ) ∩ f (A2 ) holds
(a) If f is injective (b) If f is surjective (c) If f is any function (d) For no function.
60. The relation {(x, y) ∈ ℜ² : ax + by = c} is an invertible function from ℜ → ℜ if
(a) a ≠ 0 (b) b ≠ 0, a ≠ 0 (c) c ≠ 0 (d) c ≠ 0, a ≠ 0.
61. The number of invertible functions from {1, 2, 3, 4, 5} to {a, b, c, d, e} is
(a) 5⁵ (b) 2⁵ (c) 5! (d) None of these.
62. The number of odd permutations of the set {1, 3, 5, 7, 9} is
(a) 15
(b) 30
(c) 60
(d) 120
63. Which one of the following is an even permutation?
(a) f = (1, 2, 3)(1, 2) (b) f = (1, 2)(1, 3)(1, 4)(2, 5) (c) f = (1, 2, 3, 4, 5)(1, 2, 3)(4, 5) (d)
None of these




64. Which power of the permutation f = (1 2 3 4 / 1 3 4 2) gives the identity (1 2 3 4 / 1 2 3 4)?
(a) f ² (b) f ³ (c) f ⁴ (d) f.
Section-B
[Objective Questions]
1. Let S be a non-empty set and P (S) be its power set. Show that there exists no
bijection from S to P (S).
2. Let f : X → Y be a map. Show that f is surjective if there exists a map h : Y → X
such that f ∘ h = IY (identity map).
3. Let f : X → Y be a map. Show that f is injective if and only if there is a map
g : Y → X such that g ∘ f = IX (identity map).
4. Consider the map f : ℜ² → ℜ² defined by f (x, y) = (x, 0). Let A = {(x, y) ∈ ℜ² : x − y = 0} and
B = {(x, y) ∈ ℜ² : x − y = 1}. Show that f (A ∩ B) ≠ f (A) ∩ f (B).
5. Show that every infinite set contains a countable subset.
6. Let N be the set of positive integers and ≼ be the divisibility relation defined by
a ≼ b if and only if b is divisible by a. Show that (N, ≼) is a poset.
7. Let ≤ denote the natural ordering in ℜ. Show that the poset (ℜ, ≤) has neither minimal
nor maximal elements.
8. Show that the following posets are not lattices:
(a) ({2, 3, 5, 30, 60, 120, 360}, |)
(b) ({1, 2, 3, 4, 6, 8, 12}, |)
(c) ({2, 3, 6, 12, 24, 36}, |)
(d) ({1, 2, 3, 6, 12, 30}, |).

Section-C
[Long Answer Questions]

1. A and B being any two sets, prove that the sets A − B, A ∩ B and B − A are pairwise
disjoint.
[VH94, 95, 99]
2. (a) If A, B, C be three nonempty sets such that A ∪ B = A ∪ C and A ∩ B = A ∩ C,
prove that B = C.
[VH00, CH05, 01, BH03]
(b) If A, B, C be three nonempty sets such that A ∩ C = B ∩ C and A ∩ C ′ = B ∩ C ′,
prove that A = B.
3. For each n ∈ N, let An = [n, 2n] = {x ∈ Z : n ≤ x ≤ 2n}. Find the value of
A4 ∪ A5 ∪ A6 ∪ A7 ∪ A8 .

4. Prove the following set-theoretic statements if true, or give a counterexample to disprove
them.
(a) A (B C) = (A B) (A C).
0

(b) (AB) = (B A) .

[CH09]
[CH08,10]

5. For the subsets A, B, C of an universal set U , prove the following:


(a) A (B C) = (A B) (A C).
(b) A (B C) = (A B) (A C).

[KH07]

(c) A (B C) = (A B) (A C).

[VH96]

(d) A (B C) = (A B) (A C).
(e) A (B C) = (A B) (A C).
(f) (A B) C = (A C) (B C).
(g) (A ∪ B)c = Ac ∩ B c and (h) (A ∩ B)c = Ac ∪ B c .
6. Prove that
(a) (A − B) ∩ B = ∅
(b) A − B, A ∩ B and B − A are mutually disjoint
(c) (A ∩ B) ∪ A = A.
(d) If A ⊆ B then show that A ∪ (B − A) = B.
7. (a) Show that A ⊆ B ⟺ A − B = ∅.
(b) If A ⊆ B and C is any set then show that A ∪ C ⊆ B ∪ C.
(c) If A ∪ X = A ∪ Y and A ∩ X = A ∩ Y then prove that X = Y .
(d) If A ∩ C = B ∩ C and A ∩ C ′ = B ∩ C ′, then prove that A = B.
8. Simplify the following expression by using the laws of algebra of sets.
(a) [{(A B) C}c B c ]c
(b) (Ac B c C) (B C) (A C)
(c) A (B C) (Ac (B c C c ))
(d) (A B)c (Ac B c ).
(e) (A B 0 ) (B C).

[KH08]

9. Let A = {1, 2, 3, 4}. List all subsets B of A such that {1, 2} ⊆ B.

10. Let A = {1, 2, 3} and B = {a, b}. Find A × B and B × A and verify that A × B ≠ B × A.
11. Prove that
(a) (A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D).
(b) A × (B ∩ C) = (A × B) ∩ (A × C).
(c) (A ∪ B) × C = (A × C) ∪ (B × C).
(d) (A C) (B D) = {(A B) (C D)} {(A B)
(C D)} {(A B) (C D)}.
12. Prove that
(a) A Δ B = A Δ C ⟹ B = C
(b) A ∩ (B Δ C) = (A ∩ B) Δ (A ∩ C).
13. If A and B are subsets of a set X, then prove that A ⊆ B ⟺ X − B ⊆ X − A.
14. S × T = T × S ⟺ S = T or one of them is ∅.
15. Find the power set of the set A = {a, b, c, 1}.
16. (a) If the number of elements of the set A is n, then show that the number of elements
of the power set P (A) is 2^n.
(b) If A and B are two non-empty sets having n elements in common, then prove that
A × B and B × A have n² elements in common.
17. If the set X has 5 elements, then find n(P (X)) and n(P (P (P (∅)))).
18. There are 1000 students in a college studying Physics, Chemistry and Mathematics. 658 study Physics, 418 study Chemistry and 328 study Mathematics. Use Venn
diagram to find the number of students studying Physics or Mathematics but not
Chemistry.
[JECA03]
19. Among 100 students, 32 study Mathematics, 20 study Physics, 45 study Biology, 15
study Mathematics and Biology, 7 study Mathematics and Physics, 10 study Physics
and Biology and 30 do not study any of three subjects.
(a) Find the number of students studying all three subjects.
(b) Find the number of students studying exactly one of the three subjects.
20. In a city, three daily newspaper A, B and C are established. 42 percent of the people
in that city read A, 6 percent read B, 60 percent read C, 24 percent read A and B,
34 percent read B and C, 32 percent read C and A, 8 percent do not read any of the
three newspapers. Find the percentage of the persons who read all the three papers.
21. Let A = {1, 2, 3, 4, 5, 6}. Determine whether or not each of the following is a partition of A. (a) P1 = {{1, 2, 3}, {1, 4, 5, 6}} (b) P2 = {{1, 2}, {3, 5, 6}} (c) P3 =
{{1, 3, 5}, {2, 4}, {6}} (d) P4 = {{1, 3, 5}, {2, 4, 6}}.
22. Let A, B, C be three finite subsets of U . Show that
(a) |A − B| = |A| − |A ∩ B|
(b) |A ∪ B| ≤ |A| + |B|
(c) |A ∪ B ∪ C| ≤ |A| + |B| + |C|


(d) |A ∪ B| = |A| + |B| − |A ∩ B|
(e) |A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| + |A ∩ B ∩ C|.
23. Let A = {1, 2, 3, 4}. For each of the following relations on A, decide whether it is
reflexive, symmetric, antisymmetric or transitive
(a) {(1, 3), (3, 1)}
(b) {(2, 2), (1, 1)}
(c) {(1, 2), (1, 4), (2, 3)}
(d) {(1, 1), (2, 2), (3, 3), (4, 4), (1, 3), (3, 1)}.
24. The following relations are defined on the set of real numbers. Find whether these
relations are reflexive, symmetric or transitive:
(a) aRb iff |a − b| > 0
(b) aRb iff 1 + ab > 0
(c) aRb iff |a| ≤ |b|
(d) aRb iff |a| = |b|.
25. In each of the following cases, examine whether the relation ρ is an equivalence relation
on the set given below:
(a) ρ = {(a, b) ∈ Z × Z : |a − b| ≤ 3}
(b) ρ = {(a, b) ∈ Z × Z : a − b is a multiple of 6}.
(c) x ρ y if and only if |x − y| ≤ y; x, y ∈ ℜ.
26. Let Z∗ be the set of nonzero integers and S = Z∗ × Z∗. Let ρ = {x : x = ((r, s), (t, u)) ∈
S × S with ru = st}. Prove that ρ is an equivalence relation.
[CH05]
27. A relation ρ on the set of integers Z is defined by ρ = {(a, b) : a, b ∈ Z and |a − b| ≤ 5}.
Is the relation ρ reflexive, symmetric and transitive?
[WBUT 07]
28. Determine whether the relation ρ on the set A of all triangles in the plane defined by
ρ = {(a, b) : triangle a is similar to the triangle b} is an equivalence relation.
29. In the set of all points in a plane show that the relation of equidistance from the origin
is an equivalence relation.
30. A relation ρ on the set of integers Z is defined as a ρ b iff (a − b) is divisible by m (a
positive integer). Show that ρ is an equivalence relation.
31. Determine whether the relation ρ is an equivalence relation on the set of positive
integers Z⁺.
(a) a ρ b iff a = 4b.
(b) a ρ b iff a = b².
32. If A and B be equivalence relations in a set X, show that A ∩ B is an equivalence
relation.
33. Let H be a subgroup of a group G. Show that the relation ρ = {(a, b) ∈ G × G :
a⁻¹b ∈ H} is an equivalence relation on the set G.
34. A relation ρ is defined on the set Z by a ρ b if and only if 2a + 3b is divisible by 5, for
a, b ∈ Z. Prove that ρ is an equivalence relation.
[VH97, 05]


35. A relation ρ is defined on the set Z by a ρ b if and only if either a = b or both a, b are
positive, for a, b ∈ Z. Prove that ρ is an equivalence relation on Z. Write down the
distinct equivalence classes of ρ.
36. A relation ρ is defined on the set Z by a ρ b if and only if a − b is divisible by 5, for a, b ∈ Z.
Prove that ρ is an equivalence relation. Write down the distinct equivalence classes of
ρ.
[VH99, 03]
37. A relation ρ is defined on the set Z by a ρ b if and only if a + b is even, for a, b ∈ Z. Prove
that ρ is an equivalence relation.
38. For natural numbers a and b, define a ρ b iff a² + b is even. Prove that ρ is an equivalence
relation on N .
39. For a, b ∈ R, define a ρ b iff a − b ∈ Z. Show that ρ is an equivalence relation.
40. For any integers a, b define
(a) a ρ1 b iff 2a + 3b = 5n for some integer n.
(b) a ρ2 b iff 3a + 4b is divisible by 7.
41. For a, b ∈ Z define
(a) a ρ1 b iff a² − b² is divisible by 3.
(b) a ρ2 b iff 3a + b is a multiple of 4.
42. A relation ρ is defined on the set Z by a ρ b if and only if a² − b² is an even integer. Prove
that ρ is an equivalence relation on Z and write down the equivalence classes. [BH02]
43. A relation ρ is defined on the set ℜ by a ρ b if and only if a − b is rational. Prove that ρ is
an equivalence relation on ℜ and the set of equivalence classes is uncountable. [BH03]
44. A relation ρ is defined on the set Z by a ρ b if and only if ma + nb is divisible by (m + n)
for a, b ∈ Z. Prove that ρ is an equivalence relation. Find out an infinite sequence of
positive integers lying in the equivalence class containing 0.
[BH05]
45. Let A be the set of all straight lines in the plane. Define
(a) a ρ1 b iff a is parallel to b
(b) a ρ2 b iff a is perpendicular to b.
Show that ρ1 is an equivalence relation but ρ2 is not.
46. Determine which of the following define equivalence relations in ℜ².
(a) (a, b) ρ (c, d) iff a + 2b = c + 2d.
(b) (a, b) ρ (c, d) iff a² + b = c + d².
(c) (a, b) ρ (c, d) iff ab = cd.
(d) (a, b) ρ (c, d) iff ab = c².
47. Let S be a finite set and let f : S → S. If f is one-to-one then show that f is onto.
Examine whether this remains true if the set S is infinite.
[CH: 08]
48. ρ1 is a relation on Z such that
ρ1 = {(a, b) : a, b ∈ Z; a − b = 5n, n ∈ Z}.
Show that ρ1 is an equivalence relation. If ρ2 be another relation defined by
ρ2 = {(a, b) : a, b ∈ Z; a − b = 3n, n ∈ Z},
show that the relation ρ1 ∪ ρ2 is symmetric but not transitive.


49. Given A = {1, 2, 3, 4} and B = {x, y, z}. Let ρ be the relation from A to B defined as
ρ = {(1, x), (2, y), (2, z), (3, z)}.
(a) Find the inverse relation ρ⁻¹ of ρ.
(b) Determine the domain and range of ρ.
50. Given A = {1, 2, 3, 4}. Let ρ be the relation on A defined as
ρ = {(1, 1), (2, 2), (2, 3), (3, 2), (4, 1), (4, 4)}.
(a) Draw its digraph. (b) Is ρ an equivalence relation?
51. If ρ is an equivalence relation, then prove that ρ⁻¹ is also an equivalence relation in
the set A.
52. If R and S are equivalence relations in the set A then show that R ∩ S is also an
equivalence relation in A.
53. Let A = {1, 2, 3, 4, 5}. Consider two equivalence relations
R = {(1, 2), (1, 1), (2, 1), (2, 2), (3, 3), (4, 4), (4, 5), (5, 4), (5, 5)}
and S = {(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (1, 3), (3, 1), (4, 5), (5, 4)}.
Determine the partitions corresponding to the following relations:
(a) R⁻¹, (b) R ∪ S, (c) R ∩ S.
54. Let ρ be an equivalence relation on the set A = {a, b, c, d} defined by the partition
P = {{a}, {b}, {c}, {d}}. Determine the elements of the equivalence relation and also find
the equivalence classes of ρ.
55. For the partition P = {{a}, {b, c}, {d, e}}, write the corresponding equivalence relation
on the set A = {a, b, c, d, e}.
56. Let S = {n ∈ N : 1 ≤ n ≤ 20}. Define a relation ρ on S by a ρ b iff 5 divides a − b
for all a, b ∈ S. Show that ρ is an equivalence relation on S. Find all the equivalence
classes.
57. Let A be a finite set with n elements. Prove that the number of reflexive relations
that can be defined on A is 2^(n²−n), the number of symmetric relations is 2^(n(n+1)/2),
and the number of relations that are both reflexive and symmetric is 2^(n(n−1)/2).
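The counts in the preceding exercise can be verified by brute force for small n. This sketch is our own illustration (not part of the text); it enumerates every subset of A × A:

```python
from itertools import product

def count_relations(n):
    """Brute-force count of reflexive, symmetric, and
    reflexive-and-symmetric relations on an n-element set."""
    pairs = [(i, j) for i in range(n) for j in range(n)]
    refl = symm = both = 0
    for bits in product([0, 1], repeat=len(pairs)):
        R = {p for p, b in zip(pairs, bits) if b}
        is_refl = all((i, i) in R for i in range(n))
        is_symm = all((j, i) in R for (i, j) in R)
        refl += is_refl
        symm += is_symm
        both += is_refl and is_symm
    return refl, symm, both

# For n = 3: 2^(n²−n) = 64, 2^(n(n+1)/2) = 64, 2^(n(n−1)/2) = 8
assert count_relations(3) == (64, 64, 8)
```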
58. Let A and B be two non-empty sets with cardinality m and n respectively. Show that
the number of possible relations from A to B is 2^(mn).
59. Let A = {1, 2, 3, 4} and B = {a, b, c}. Determine whether the relation R from A to B
is a function. If it a function, find its domain and range.
(a) R = {(1, a), (2, a), (3, b), (2, b)}, (b) R = {(1, c), (2, a), (3, b)},
(c) R = {(1, a), (2, b), (3, c), (4, b), (1, b)}, (d) R = {(1, c), (2, a), (3, a), (4, c)}.
60. If A = {2, 3, 4}, B = {−2, 0, 1, 4} and the relation f is defined as f (2) = 0, f (3) = 4,
f (4) = −2, find out whether f defines a mapping.
61. Let f : A → B and g : B → C be two mappings. Show that, if g ∘ f is injective, then f is
injective but g need not be so.
[CH: 09, 10]
62. Let f : A → B and g : B → C be both surjective; then prove that the composite
mapping g ∘ f : A → C is surjective. Give an example to show that f need not be surjective
when g ∘ f : A → C is surjective.


63. A mapping f : N × N → N is defined by f (m, n) = 2^m · 3^n. Show that f is injective
but not surjective.
[CH: 07]
64. If Z⁺ is the set of positive integers and f (n) = 2n + 1, then show that f : Z⁺ → Z⁺
is a one-one into mapping.
65. If R is the set of real numbers and f (x) = x² + 7, then prove that f : R → R is a
many-one into mapping.
66. If A and B be two sets having n distinct elements, show that the number of bijective
mappings from A to B is n!.
[ CH: 07]
67. Show that the function f : Q → Q defined by f (x) = 3x + 4 for all x ∈ Q is
one-one and onto, where Q is the set of rational numbers. Also find a formula that defines
the inverse function f ⁻¹.
68. A function f : Z → Z is defined by:
f (x) = x/2, if x is even;
     = 7, if x is odd.
Find the left inverse of f , if it exists.
[CH: 10]

69. Consider the sets A = {k, l, m, n} and B = {1, 2, 3, 4}. Let f : A → B be such that (a)
f = {(k, 4), (l, 1), (m, 2), (n, 3)}, (b) f = {(k, 1), (l, 2), (m, 1), (n, 2)}.
Determine whether f ⁻¹ is a function.

70. Let f (x) = 1 if x is rational, and f (x) = 0 if x is irrational, be a function from R to R.
Find f (0.5) and f (√2).
71. Is the mapping f : X → Z defined by f (x) = (2x − 1)/(1 − |2x − 1|) a bijection? Here
Z = the set of integers and X = {x : 0 < x < 1}.

72. If A = {1, 2} and B = {a, b}, find all relations from A into B. Determine which of these
relations are functions from A to B.
73. Show that the following functions are neither injective nor surjective:
(a) f : R → R given by f (x) = |x| + 1, ∀x ∈ R
(b) f : R → R given by f (x) = sin x, ∀x ∈ R.
74. Show that the following functions are injective but not surjective.
(a) f : Z Z given by f (x) = 2x + 3 xZ
(b) f : N N given by f (x) = sin x xZ.
75. Show that the following functions are surjective but not injective
(a) f : Z {1, 1} given by f (n) = (1)n , n Z
(b) f : N Z10 given by f (n) = [r], where r is the remainder when n is divided by
10.


76. Determine which of the following functions are bijective:
(a) f : R → R where f (x) = |x| + 1, ∀x ∈ R
(b) f : Z → Q where f (x) = 2x, ∀x ∈ Z
(c) f : R → R where f (x) = x² − 3x + 4, ∀x ∈ R
(d) f : R → S where f (x) = x/(1 + |x|), where S = {x ∈ R : −1 < x < 1}.

77. Let A be a finite set and let f : A → B be a surjective function. Show that the number
of elements of B cannot be greater than that of A.
78. Let A = {1, 2, 3}. Find all possible bijective functions from A into itself.
79. Let |A| = n. Prove that there can be n! different bijective functions on A.
80. Consider the functions f : R → R and g : R → R where f (x) = x + 2 and g(x) = x².
Find f ∘ g and g ∘ f .
81. Suppose f and g are two functions from R into R such that f ∘ g = g ∘ f . Does it
necessarily imply that f = g? Justify your answer.
82. Let f, g and h : R → R be defined by f (x) = x + 2, g(x) = 1/(1 + x²), h(x) = 3.
Compute g ∘ f , f ∘ g, g ∘ h ∘ f , g ∘ f ⁻¹ ∘ f and f ⁻¹ ∘ g ∘ f .


83. Let A = {1, 2, 3, 4} and define functions f, g : A → A by
f = {(1, 3), (3, 2), (3, 1), (4, 2)} and g = {(1, 4), (2, 3), (3, 1), (4, 2)}.
Find f ∘ g, g ∘ f , g⁻¹ ∘ f ∘ g, f ∘ g⁻¹ ∘ g and g ∘ g⁻¹ ∘ f .
84. Let A = {a, b, c}. Define f : A → A such that f = {(a, b), (b, a), (c, c)}. Find (a) f ²,
(b) f ³, (c) f ⁴. [Hint: f ³ = f ∘ f ∘ f ]

2|x|, if x < 0
85. Define f : Z N by f (x) =
Show that f has an inverse and find
2x + 1, if x 0.
1
1
f (25), f (20).
86. If f : x → x + 1 and g : x → 3x be mappings of the set of integers into itself, examine
whether each of f and g is surjective, injective. Also, show that f ∘ g ≠ g ∘ f . [VH95, 05]
87. Prove that the mapping f : ℜ → ℜ defined by f (x) = 2x + 3, x ∈ ℜ is a bijective
mapping.
[VH96]
88. Prove that the mapping f : Q → Q defined by f (x) = 5x + 2, x ∈ Q is a bijective
mapping.
[VH03]
89. Test whether the mapping f : C → ℜ defined by f (x) = |x|, x ∈ C is a bijective
mapping.
[VH98]
90. Show that the mapping f : ℜ → ℜ, defined by f (x) = x³ − x² is surjective but not
injective.
[BH03]
91. A mapping f : ℜ → ℜ is defined by f (x) = x/(x² + 1), x ∈ ℜ. Examine whether it is a
bijective mapping.
[VH97, CH05]

92. A mapping f : Z → Z is defined by f (x) = x² + x − 2, x ∈ Z; find f ⁻¹({4}) and
f {f (2)}.
[VH01]


93. A mapping f : ℜ → ℜ is defined by f (x) = x² + x − 2, x ∈ ℜ; find f ⁻¹({8}) and
f ⁻¹({17, 37}).
94. For the mappings f (x) = x² and g(x) = 1 + x, x ∈ ℜ, find the set {x ∈ ℜ : f ∘ g(x) =
g ∘ f (x)}.
95. For the mappings f (x) = |x| + x and g(x) = |x| − x, x ∈ ℜ, find f ∘ g, g ∘ f and the set
{x ∈ ℜ : f ∘ g(x) = g ∘ f (x)}.
[BH04]
96. For the mappings f : N → Q; f (x) = (3/2)x + 1 and g : Q → Q; g(x) = 6x, examine with
justification whether f ∘ g and g ∘ f are defined.
97. Let the mappings f, g : Z → Z be defined by f (x) = (−1)ˣ and g(x) = 2x, x ∈ Z; find
g ∘ f and f ∘ g.
98. Prove that the set of rational numbers in [0, 1] is countable.
[JECA06]




99. If f = (1 2 3 4 / 2 4 1 3) and g = (1 2 3 4 / 4 1 2 3), find f ∘ g, f ⁻¹, g⁻¹ and prove that
(f ∘ g)⁻¹ = g⁻¹ ∘ f ⁻¹.

 

100. Examine whether the permutations (1 2 3 4 5 6 / 3 1 5 6 4 2) and
(1 2 3 4 5 6 7 8 9 / 4 7 9 1 8 2 6 3 5) are odd or even.
101. Let X = {a, b, c} and f, g : X → X be defined by f (a) = b, f (b) = c, f (c) = a and
g(a) = a, g(b) = c, g(c) = b. Show that f ∘ g ≠ g ∘ f .
102. A relation ρ is defined on the set Z by a ρ b if and only if b is a divisor of a, for a, b ∈ Z.
Prove that ρ is a partial order relation.
103. Give an example of a partially ordered set which is a lattice and another which is not
lattice. Justify your answer.
104. Let X = {0, 1, 2, . . . , 100}; define a binary relation ≼ on X by x ≼ y if and only if
x divides y. (i) Prove that it is a partially ordered set. Find the least and greatest
elements of (X, ≼) if they exist. (ii) Is (X, ≼) a lattice? Justify the answer.

Chapter 2

Theory of Numbers
The integers are the basic elements of mathematics. The theory of numbers is concerned,
at least in its elementary aspects, with basic properties of the integers, and more particularly with the positive integers 1, 2, 3, . . ., known as natural numbers. Here we shall discuss
some basic properties of integers, including the well-ordering principle, mathematical induction,
the Euclidean algorithm, representation of integers, etc.

2.1 Number System

Number systems are basically of two types (i) Non-positional number system, (ii) Positional
number system.

2.1.1 Non-positional Number System

In the early days, people counted on fingers; when ten fingers were not adequate, small
stones, balls, sticks and pebbles were used to indicate values. This method of counting uses
an additive approach, the non-positional number system. Each symbol represents the same
value regardless of its position in the number, and the symbols are simply added to find
the value of the particular number. Since it is very difficult to perform arithmetic with such
a number system, positional number systems were developed as the centuries passed.
(i) In this system (the Roman number system), we have symbols I for 1, II for 2, III for 3,
and so on.
(ii) An example of earlier types of notation can be found in Roman numerals, which are
essentially additive: III = I + I + I, XXV = X + X + V. New symbols V, X, L, C, D, M, . . .
were used as the numbers increased in value: thus V is written rather than IIIII
for 5.
(iii) The only importance of position in Roman numerals lies in whether a symbol precedes
or follows another symbol, i.e., IV = 4, while VI = 6.
(iv) The clumsiness of this system can be seen easily if we try to multiply XII by XIV.
Calculating with Roman numerals was so difficult that early mathematicians were forced
to perform arithmetic operations almost entirely on abaci, or counting boards, translating their results back to Roman numeral form.
Some such Roman numerals are given below in tabular form (a bar over a symbol multiplies
its value by 1000):
   1 = I        2 = II       3 = III      4 = IV       5 = V
   6 = VI       7 = VII      8 = VIII     9 = IX      10 = X
  11 = XI      39 = XXXIX   40 = XL      41 = XLI     49 = XLIX
  50 = L       51 = LI      89 = LXXXIX  90 = XC      91 = XCI
  99 = XCIX   100 = C      200 = CC     300 = CCC    400 = CD
 500 = D      600 = DC     700 = DCC    800 = DCCC   900 = CM
1000 = M     1100 = MC    1200 = MCC   1300 = MCCC  1400 = MCD
1500 = MD    1600 = MDC   1700 = MDCC  1800 = MDCCC 1900 = MCM
2000 = MM    5000 = V̄    10000 = X̄   50000 = L̄  100000 = C̄
500000 = D̄  1000000 = M̄

Pencil and paper computations are unbelievably intricate and difficult in such systems. In fact, the ability to perform such operations as addition and multiplication was considered a great accomplishment in earlier civilizations.

2.1.2 Positional Number System

In a positional number system there are only a few symbols, called digits, and these symbols represent different values depending on the position they occupy in the number. In this number system the position of each digit is very important: the position determines the value the digit contributes. The value of each digit in such a number is determined by three considerations:
(i) the digit itself,
(ii) the position of the digit in the number,
(iii) the base of the number system.
The positional number systems are grouped as
(i) Decimal number system
(ii) Binary number system
(iii) Octal number system
(iv) Hexadecimal number system.
There are two characteristics of all number systems that are suggested by the value of the base:
(i) The value of the base determines the total number of different symbols or digits available to represent numbers in the positional number system. The base is commonly written as a subscript, as in (0011)_2.
(ii) The maximum value of a single digit is always equal to one less than the value of the base. For example, in base 2 the largest digit is 1.
Decimal number system : In this number system the base, or radix, is equal to 10, because altogether ten symbols or digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 are used. In day to day life this number system is the most useful. The general rule for representing numbers in the decimal system using positional notation is

a_{n-1}10^{n-1} + a_{n-2}10^{n-2} + . . . + a_1·10 + a_0        (2.1)

which is written as the digit string a_{n-1}a_{n-2} . . . a_1a_0, where n is the number of digits to the left of the decimal point. Since the positions are counted from 0, the exponent of the leading digit is one less than the total number of digits. For example,
(2586)_{10} = 2·10^3 + 5·10^2 + 8·10^1 + 6·10^0.
For the other positional number systems, the reader may consult the author's book on numerical methods.
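The expansion (2.1) can be exercised directly. The following sketch (the function names are our own, not the text's) converts between a digit list and its positional value in an arbitrary base:

```python
def digits_from_value(n, base=10):
    """Return the digit list [a_{k-1}, ..., a_1, a_0] of n in the given base."""
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, d = divmod(n, base)   # peel off the least significant digit
        digits.append(d)
    return digits[::-1]

def value_from_digits(digits, base=10):
    """Evaluate a_{k-1}*base^(k-1) + ... + a_1*base + a_0, as in (2.1)."""
    value = 0
    for d in digits:
        value = value * base + d   # Horner's rule for the positional expansion
    return value
```

For instance, `value_from_digits([2, 5, 8, 6])` reproduces the worked example 2586, and `value_from_digits([0, 0, 1, 1], 2)` evaluates (0011)_2.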

2.2 Natural Number

The set N of natural numbers is defined as a set in which the following axioms (known as Peano's axioms) are satisfied:
(i) every element a ∈ N has a unique successor, denoted by a*, with a* ∈ N;
(ii) if two natural numbers have equal successors, then they are themselves equal, i.e., a* = b* ⇒ a = b; a, b ∈ N;
(iii) there exists a unique element (denoted by 1) in N which has no predecessor;
(iv) if M ⊆ N is such that 1 ∈ M and k ∈ M ⇒ k* ∈ M, then M = N. This is called the principle of mathematical induction, or the first principle of finite induction.
The set of numbers 1, 2, 3, . . . is called the set of natural numbers and is denoted by N = {1, 2, 3, . . .}.

2.2.1 Basic Properties

We are acquainted with the following familiar properties:
(i) Closure law : a + b ∈ N, a·b ∈ N; a, b ∈ N
(ii) Associative law : (a + b) + c = a + (b + c); a, b, c ∈ N
(iii) Identity law : a + 0 = 0 + a = a, a·1 = 1·a = a; a ∈ Z
(iv) Additive inverse law : a + (−a) = (−a) + a = 0; a ∈ Z
(v) Commutative law : a + b = b + a; a, b ∈ N
(vi) Distributive law : a·(b + c) = a·b + a·c; a, b, c ∈ N
(vii) Cancellation law : a + b = a + c ⇒ b = c; a, b, c ∈ N
(Note that 0 and −a lie outside N; the identity and inverse laws hold in the set Z of integers.) The set of all natural numbers is closed with respect to addition and multiplication, but not closed with respect to subtraction and division.

2.2.2 Well Ordering Principle

The well ordering principle plays an important role in the proofs of the next sections. The principle states that
every non-empty subset of the set N of natural numbers has a unique least element.
That is, if S is a non-empty subset of N, then there exists m ∈ S such that m ≤ a for all a ∈ S; m is called the least element of S.
From the well ordering principle it follows that every descending chain of natural numbers must terminate.


Theorem 2.2.1 There is no integer m satisfying 0 < m < 1.
Proof: If possible, let there be integers in (0, 1), and consider the set S = {n ∈ Z : 0 < n < 1}. By assumption S is a non-empty subset of N. So by the well ordering principle it has a least element, say c, so that 0 < c < 1 and c ∈ N. Therefore 1 − c > 0 and also c > 0. Thus,
c(1 − c) > 0 ⇒ c − c^2 > 0 ⇒ 0 < c^2 < c < 1.
Thus c^2 ∈ Z and 0 < c^2 < 1. Hence c^2 ∈ S, but c^2 < c, which contradicts the fact that c is the least element of S. This contradiction shows that our assumption is wrong. Hence there is no integer m satisfying 0 < m < 1.

2.2.3 Mathematical Induction

Form 1: If M is a set of positive integers (M ⊆ N) with the two specific properties
(i) the integer 1 ∈ M, and
(ii) k ∈ M ⇒ k + 1 ∈ M,
then M = N.
Proof: Let F = N − M; it is sufficient to show that F = ∅, the null set. Let us suppose F ≠ ∅; then F is the set of all positive integers not in M. So by the well ordering principle it has a least element, q (say), with q ∈ F. Since 1 ∈ M, q ≠ 1, so q − 1 ∈ N. Since 0 < q − 1 < q and q is the least element of F, we have q − 1 ∉ F, i.e., q − 1 ∈ M. The hypothesis (ii) gives
q − 1 ∈ M ⇒ (q − 1) + 1 ∈ M ⇒ q ∈ M,
which is a contradiction. This contradiction shows that F = ∅ and consequently M = N.
Form 2: Let P(n) be a mathematical statement involving a positive integer n. If
(i) P(1) is valid, and
(ii) the validity of P(k) implies the validity of P(k + 1),
then P(n) is valid for every positive integer n.
Proof: Let M be the subset of N consisting of those natural numbers n for which P(n) is true. Since P(1) is true, 1 ∈ M, and condition (ii) gives k ∈ M ⇒ k + 1 ∈ M. Consider F = N − M; by the previous form, F = ∅, so M = N. Hence P(n) is valid for every positive integer n. We see that the two forms are equivalent.
Result 2.2.1 From the above equivalent forms we see that mathematical induction consists of three steps:
(i) Basis: show that P(1) is true.
(ii) Inductive hypothesis: assume that P(k) is true.
(iii) Inductive step: show that P(k + 1) is true.
Result 2.2.2 Although mathematical induction provides a standard technique for attempting to prove a statement about the positive integers, one disadvantage is that it gives no aid in formulating such statements.
Ex 2.2.1 Prove that 2^{3n} − 1 is divisible by 7 for all n ∈ N.


Solution: Let us write P(n) for 2^{3n} − 1. We have P(1) = 2^3 − 1 = 7, which is divisible by 7. Thus the proposition is true for n = 1. Let us consider P(m + 1) − P(m), which is given by
P(m + 1) − P(m) = (2^{3(m+1)} − 1) − (2^{3m} − 1)
= 2^{3m+3} − 2^{3m} = 2^{3m}(8 − 1)
= 7·2^{3m} = 7p,
where p = 2^{3m} is an integer. Hence P(m + 1) is divisible by 7 if P(m) is so. This proves that the proposition is true for n = m + 1 if it is true for n = m. Hence, by the principle of mathematical induction, the proposition is true for all n ∈ N.
Ex 2.2.2 Show that n^5 − n is divisible by 30 for all n ∈ N.
Solution: Let us write P(n) for n^5 − n. We have
P(1) = 1 − 1 = 0, which is divisible by 30.
P(2) = 2^5 − 2 = 30, which is divisible by 30.
Thus the proposition is true for n = 1, 2. Let P(m) = m^5 − m be divisible by 30, i.e., P(m) = 30k, where k ∈ N. Then
P(m + 1) = (m + 1)^5 − (m + 1)
= m^5 + 5m^4 + 10m^3 + 10m^2 + 5m + 1 − m − 1
= (m^5 − m) + 5m(m + 1)(m + 2)^2 − 15m(m + 1)^2
= 30k + 30q − 30r; q, r ∈ N,
since m(m + 1)(m + 2) is divisible by 6 and m(m + 1) by 2. So P(m + 1) is divisible by 30 if P(m) is divisible by 30. Hence, by the principle of mathematical induction, the proposition is true for all n ∈ N.
Ex 2.2.3 Show that n^3 − n is divisible by 6 for all n ∈ N.
Solution: Let us write P(n) for n^3 − n. We have
P(1) = 1 − 1 = 0, which is divisible by 6.
P(2) = 2^3 − 2 = 6, which is divisible by 6.
Thus the proposition is true for n = 1, 2. Let P(m) = m^3 − m be divisible by 6, i.e., P(m) = 6k, k ∈ N. Then
P(m + 1) = (m + 1)^3 − (m + 1) = m^3 + 3m^2 + 2m
= (m^3 − m) + 3m(m + 1) = 6k + 6q; q ∈ N,
as the product of two consecutive numbers is divisible by 2. So P(m + 1) is divisible by 6 if P(m) is divisible by 6. Hence, by the principle of mathematical induction, the proposition is true for all n ∈ N.
Ex 2.2.4 Show that 2·7^n + 3·5^n − 5 is divisible by 24 for all n ∈ N.
Solution: Let us write P(n) for 2·7^n + 3·5^n − 5. We have
P(1) = 2·7 + 3·5 − 5 = 24, which is divisible by 24.
Thus the proposition is true for n = 1. Let P(m) be divisible by 24, i.e., P(m) = 2·7^m + 3·5^m − 5 = 24q, q ∈ N. Now,
P(m + 1) = 2·7^{m+1} + 3·5^{m+1} − 5
= 7[2·7^m + 3·5^m − 5 − 3·5^m + 5] + 3·5^{m+1} − 5
= 7(2·7^m + 3·5^m − 5) − 6·5^m + 30
= 7·24·q − 6·5(5^{m−1} − 1)
= 7·24·q − 6·5·4(5^{m−2} + 5^{m−3} + . . . + 1)
= 24[7q − 5(5^{m−2} + 5^{m−3} + . . . + 1)].
Therefore, P(m + 1) is divisible by 24 if P(m) is divisible by 24. Hence, by the principle of mathematical induction, the proposition is true for all n ∈ N.
Ex 2.2.5 Show that 3^{4n+2} + 5^{2n+1} is divisible by 14 for all n ∈ N.
Solution: Let us write P(n) for 3^{4n+2} + 5^{2n+1}. We have
P(1) = 3^6 + 5^3 = 854 = 14·61, which is divisible by 14.
Thus the proposition is true for n = 1. Let P(m) be divisible by 14, i.e., P(m) = 3^{4m+2} + 5^{2m+1} = 14q, q ∈ N. Thus 5^{2m+1} = 14q − 3^{4m+2}. Now,
P(m + 1) = 3^{4(m+1)+2} + 5^{2(m+1)+1}
= 3^{4m+2}·3^4 + 5^{2m+1}·5^2
= 3^{4m+2}·81 + 25(14q − 3^{4m+2})
= 3^{4m+2}(81 − 25) + 25·14q
= 14[4·3^{4m+2} + 25q]; where q ∈ N
= 14k; where k = 4·3^{4m+2} + 25q ∈ N.
Therefore, P(m + 1) is divisible by 14 if P(m) is divisible by 14. Hence, by the principle of mathematical induction, the proposition is true for all n ∈ N.
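The divisibility claims of Exs 2.2.1-2.2.5 can be spot-checked numerically. The sketch below (the helper name is our own) is no substitute for the induction proofs, but it guards against slips in the algebra:

```python
def check_divisibility(expr, divisor, upto=200):
    """Verify that expr(n) is divisible by divisor for n = 1, ..., upto."""
    return all(expr(n) % divisor == 0 for n in range(1, upto + 1))

assert check_divisibility(lambda n: 2**(3*n) - 1, 7)              # Ex 2.2.1
assert check_divisibility(lambda n: n**5 - n, 30)                 # Ex 2.2.2
assert check_divisibility(lambda n: n**3 - n, 6)                  # Ex 2.2.3
assert check_divisibility(lambda n: 2*7**n + 3*5**n - 5, 24)      # Ex 2.2.4
assert check_divisibility(lambda n: 3**(4*n+2) + 5**(2*n+1), 14)  # Ex 2.2.5
```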
Ex 2.2.6 Show that n^n > 1·3·5 . . . (2n − 1) for n > 1.
Solution: For n = 2, the LHS = 2^2 = 4 and the RHS = 1·3 = 3. As 4 > 3, the inequality holds for n = 2. Let the result hold for n = m, i.e., m^m > 1·3·5 . . . (2m − 1). Hence,
(2m + 1)m^m > 1·3·5 . . . (2m − 1)(2m + 1).
Now,
(m + 1)^{m+1} − (2m + 1)m^m = m^{m+1}[(1 + 1/m)^{m+1} − (2 + 1/m)]
= m^{m+1}[1 + ^{m+1}C_1(1/m) + ^{m+1}C_2(1/m^2) + . . . + 1/m^{m+1} − 2 − 1/m]
= m^{m+1}[(m + 1)/(2m) + . . . + 1/m^{m+1}] > 0,
since 1 + (m + 1)/m − 2 − 1/m = 0 and the remaining binomial terms are all positive. Hence,
(m + 1)^{m+1} > (2m + 1)m^m > 1·3·5 . . . (2m + 1).
Thus the inequality holds for n = m + 1 when it holds for n = m. Hence it is true for all positive integers n > 1.
Ex 2.2.7 For what natural numbers n is the inequality 2^n > n^2 valid?


Solution: We shall prove this by using the principle of mathematical induction. For
n = 1: 2 > 1, so the inequality is valid.
n = 2: 2^2 = 2^2, so the inequality is not valid.
n = 3: 2^3 = 8 < 9 = 3^2, so the inequality is not valid.
n = 4: 2^4 = 4^2, so the inequality is not valid.
n = 5: 2^5 = 32 > 25 = 5^2, so the inequality is valid.
Let 2^k > k^2, where k > 4 and k ∈ N. Since k^2 > 2k + 1 for k > 4, we have
2^k > 2k + 1,
so 2^k + 2^k > k^2 + 2k + 1, i.e., 2^{k+1} > (k + 1)^2.
Thus the inequality is valid for n = k + 1 when it is valid for n = k and k > 4. Hence, by the principle of mathematical induction, the inequality is valid for n = 1 and n ≥ 5.
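The case analysis above can be confirmed by exhaustion over a small range; the inequality fails exactly at n = 2, 3, 4:

```python
# Exhaustive check of 2**n > n**2: the inequality holds exactly for
# n = 1 and n >= 5, matching the induction argument above.
valid = [n for n in range(1, 50) if 2**n > n**2]
assert valid == [1] + list(range(5, 50))
```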
Ex 2.2.8 Prove that the product of r consecutive natural numbers is divisible by r!.
Solution: Let p_n = n(n + 1)(n + 2) · · · (n + r − 1); n ∈ N. Then
p_{n+1} = (n + 1)(n + 2) · · · (n + r),
so n·p_{n+1} = (n + r)p_n = n·p_n + r·p_n,
i.e., p_{n+1} − p_n = r·(p_n/n) = r × a product of (r − 1) consecutive natural numbers.
If the product of (r − 1) consecutive natural numbers is divisible by (r − 1)!, then
p_{n+1} − p_n = k·r!; k ∈ N.
Now p_1 = r!, so p_2, p_3, p_4, . . . are also multiples of r!. Thus, if the product of (r − 1) consecutive natural numbers is divisible by (r − 1)!, then the product of r consecutive natural numbers is divisible by r!. The product of two consecutive natural numbers is divisible by 2!, so the product of three consecutive natural numbers is divisible by 3!, and so on.
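The result can also be seen from the identity n(n+1)···(n+r−1) = r!·C(n+r−1, r), so the quotient is a binomial coefficient. A quick numerical check (helper names are our own):

```python
from math import factorial

def product_consecutive(n, r):
    """Product n(n+1)...(n+r-1) of r consecutive natural numbers."""
    p = 1
    for k in range(n, n + r):
        p *= k
    return p

# Every such product is divisible by r!, as Ex 2.2.8 asserts.
assert all(product_consecutive(n, r) % factorial(r) == 0
           for n in range(1, 30) for r in range(1, 8))
```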

Ex 2.2.9 Prove that (2 + √3)^n + (2 − √3)^n is an even integer for all n ∈ N.
Solution: Let p_n be the statement that (2 + √3)^n + (2 − √3)^n is an even integer. Since
(2 + √3)^1 + (2 − √3)^1 = 4 = an even integer, p_1 is true. Let p_1, p_2, . . . , p_k be true; in particular, (2 + √3)^k + (2 − √3)^k is an even integer. Then, writing a = 2 + √3, b = 2 − √3, so that a + b = 4 and ab = 1,
(2 + √3)^{k+1} + (2 − √3)^{k+1} = a^{k+1} + b^{k+1}
= (a^k + b^k)(a + b) − (a^{k−1} + b^{k−1})ab
= 4(a^k + b^k) − (a^{k−1} + b^{k−1}).
This is an even integer, as a^k + b^k and a^{k−1} + b^{k−1} are even integers by assumption. This shows that p_{k+1} is true whenever p_1, p_2, . . . , p_k are true. Therefore, by the second principle of mathematical induction, the statement is true for all n ∈ N.
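The identity used above says that s_n = (2 + √3)^n + (2 − √3)^n obeys the integer recurrence s_{n+1} = 4s_n − s_{n−1}, which avoids floating point entirely. A sketch (the function name is our own):

```python
def s(n):
    """(2+sqrt(3))**n + (2-sqrt(3))**n via the recurrence s_{k+1} = 4*s_k - s_{k-1}."""
    a, b = 2, 4          # s_0 = 2, s_1 = 4
    for _ in range(n):
        a, b = b, 4 * b - a
    return a

# Every term is an even integer, as Ex 2.2.9 proves.
assert all(s(n) % 2 == 0 for n in range(1, 50))
```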

2.3 Integers

The set of all integers, denoted by Z, consists of the whole numbers

Z = {0, ±1, ±2, ±3, . . .}.        (2.2)

The set of all positive integers is identified with the set of natural numbers N. We shall use the properties and principles of N in connection with the proof of any theorem about positive integers.

2.3.1 Divisibility

In this section we define divisibility and the division algorithm for two given integers, which are among the most important and fundamental concepts in number theory.
Definition 2.3.1 Let a ∈ Z and let x be any member of Z. Then ax is called a multiple of a. For example,
(i) 3 × 7 = 21, so 21 is called a multiple of 3. It is also a multiple of 7.
(ii) The number 0 is a multiple of every member of Z, as a·0 = 0, a ∈ Z.
There exist infinitely many elements which are multiples of a ∈ Z.
Definition 2.3.2 An integer a(≠ 0) is said to divide an integer b if
∃ a unique c ∈ Z such that ac = b.        (2.3)
This is expressed by saying a divides b, or a is a divisor of b, or b is divisible by a, and is denoted by a|b. We also say that b is a multiple of a, that a is a divisor of b, or that a is a factor of b. For example,
(i) 9|63 as 63 = 9·7, where 7 ∈ Z, i.e., 63 is a multiple of 9.
(ii) Also −81 is divisible by 3 as −81 = 3·(−27), and −27 ∈ Z.
(iii) Again 3 ∤ 16 because there is no integer x such that 16 = 3x.
The integers ±1, ±a are called the improper divisors of a nonzero integer a. We write a ∤ b to indicate that b is not divisible by a. Divisibility establishes a relation between any two integers, with the following elementary properties.
Property 2.3.1 If a|b, then every divisor of a divides b.
Proof: Since a|b, ∃ c ∈ Z such that b = ac. Let m be any divisor of a; then
a = md, for some d ∈ Z.
Thus b = mdc ⇒ m|b, as dc ∈ Z.
Property 2.3.2 If a|b and a ≠ 0, then (b/a)|b.
Proof: From the definition we have
a|b ⇒ b = ac ⇒ b/a = c, an integer.
Now, b = (b/a)·a, with b/a ∈ Z, so (b/a)|b.
Here b/a is called the conjugate divisor of a.
Property 2.3.3 For integers a, b, c ∈ Z,
(i) a|a, ∀ a(≠ 0) ∈ Z (reflexive property);
(ii) 1|a, −1|a and a|0, ∀ a(≠ 0) ∈ Z;
(iii) a|b, b|c ⇒ a|c (transitive property). The converse of this property need not hold. For example, if a = 5, b = 10, c = 15, then 5|15 but 10 ∤ 15, although 5|10.


(iv) a|b and b|a if and only if a = ±b.
These properties follow immediately from the definition.
Property 2.3.4 If a, b ∈ Z, then a|b implies a|bm for every m ∈ Z.
Proof: If a, b ∈ Z are such that a|b, then by definition b = ac; c ∈ Z. Therefore,
bm = (ac)m = a(cm); where cm ∈ Z,
⇒ a|(bm).
Thus, if a|b, then a|(bm), ∀ m ∈ Z. The converse is not always true. For example, let b = 5, m = 8 and a = 10; then 10|5·8, i.e., 10|40, but 10 ∤ 8 and 10 ∤ 5.
Also, if a|b, then ma|mb, ∀ m ≠ 0. This is known as the multiplication property. Similarly, the cancellation law states that ma|mb and m ≠ 0 imply a|b.
Property 2.3.5 If a|b and a|c, then a|(bm + cn); m, n being arbitrary integers.
Proof: The relations a|b, a|c ensure that there exist suitable integers x, y such that b = ax, c = ay. Hence mb = max and nc = nay. Thus, whatever the choice of integers m and n,
mb + nc = max + nay = a(mx + ny), where m, n, x, y ∈ Z,
⇒ a|(bm + cn).
The converse of this result need not hold. For example, let a = 5, b = 6, c = 7, m = 3 and n = 1; then mb + nc = 25 and 5|25, but 5 ∤ 6 and 5 ∤ 7.
This property extends by induction to sums of more than two terms. That is, if a|b_k for k = 1, 2, . . . , n, then
a|(b_1x_1 + b_2x_2 + . . . + b_nx_n)
for any integers x_1, x_2, . . . , x_n. This is known as the linearity property of divisibility.


Property 2.3.6 If a|b, then ∃ c ∈ Z such that b = ac. Also, b ≠ 0 implies c ≠ 0. Taking absolute values, we get
|b| = |ac| = |a||c|.
Because c ≠ 0, it follows that |c| ≥ 1, whence
|b| = |a||c| ≥ |a|.
Thus, if a|b and b ≠ 0, then |a| ≤ |b|. That is, for b ≠ 0,
a|b ⇒ |a| ≤ |b|.
Again, if a ≠ 0, then b|a ⇒ |b| ≤ |a|. Therefore, if a ≠ 0 and b ≠ 0, then
a|b and b|a ⇒ |a| = |b|.
This is known as the comparison property.
Property 2.3.7 If 0 ≤ a < b and b|a, then a = 0. For, let a ≠ 0; then
b|a ⇒ |b| ≤ |a| ⇒ b ≤ a,
which is a contradiction, as a, b are both non-negative and a < b. This contradiction establishes that a = 0. More generally, if b|a and |a| < |b|, then a = 0; for if a ≠ 0, then b|a ⇒ |b| ≤ |a|, which contradicts the hypothesis, and hence a = 0.


2.3.2 Division Algorithm

Given integers a and b (b > 0), there exist unique integers q and r such that

a = bq + r; where 0 ≤ r < b.        (2.4)

Proof: Existence : We begin by considering the set of non-negative integers
S = {a − bx : x ∈ Z, a − bx ≥ 0}.
First we shall show that S is non-empty. To do this, it suffices to exhibit a value of x making a − bx non-negative. Now,
a ≥ −|a| and b ≥ 1 ⇒ b|a| ≥ |a|,
so a + b|a| ≥ a + |a| ≥ 0 ⇒ a + b|a| ∈ S.
For this choice x = −|a|, S is a non-empty set of non-negative integers. Thus,
(i) either S contains 0 as its least element, or
(ii) S does not contain 0, so S is a non-empty subset of N and, by the well ordering principle, it has a least element, which is positive.
Hence in each case S has a least element r ≥ 0 (say), and r is of the form a − bq. Thus,
r = a − bq; q ∈ Z
⇒ a = bq + r; q ∈ Z and r ≥ 0.
We shall now show that r < b. If possible let r ≥ b; then r − b ≥ 0 and
r − b = a − b(q + 1); where (q + 1) ∈ Z,
so that (r − b) ∈ S is smaller than its smallest member r (since b > 0 ⇒ r − b < r), which is a contradiction. Hence r < b. Thus ∃ q, r ∈ Z with 0 ≤ r < b such that a = bq + r.
Uniqueness : To prove the uniqueness of the integers q, r, we assume that we can find another pair q_1, r_1 ∈ Z such that
a = bq_1 + r_1; 0 ≤ r_1 < b.
Subtracting, 0 = b(q − q_1) + (r − r_1),
or b(q − q_1) = r_1 − r ⇒ b|(r_1 − r),
where |r_1 − r| < b ⇒ r_1 − r = 0 ⇒ r = r_1,
so b(q − q_1) = 0 ⇒ q = q_1, as b > 0.
Thus q and r are unique, ending the proof. Also, it is clear that r = 0 if and only if b|a. This important theorem is known as the division algorithm. The advantage of this algorithm is that it allows us to prove assertions about all the integers by considering only a finite number of cases.
Result 2.3.1 The two integers q and r are termed the quotient and the remainder, respectively, in the division of a by b.
Result 2.3.2 Though it is an existence theorem, its proof actually gives us a method for computing the quotient q and remainder r.
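The quotient and remainder of the division algorithm can be computed directly. A minimal sketch (the function name is our own); note that Python's `divmod` already rounds the quotient down, so the remainder is non-negative even for a negative dividend:

```python
def division_algorithm(a, b):
    """Return (q, r) with a = b*q + r and 0 <= r < b, for b > 0."""
    if b <= 0:
        raise ValueError("b must be positive")
    q, r = divmod(a, b)   # floor division guarantees 0 <= r < b
    return q, r
```

For example, dividing −17 by 5 gives q = −4, r = 3, since −17 = 5·(−4) + 3 with 0 ≤ 3 < 5.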


Theorem 2.3.1 If a and b(> 0) are two integers, then there exist integers Q and R such that

a = bQ ± R; 0 ≤ R ≤ b/2.        (2.5)

Proof: For any two integers a and b with b > 0, the division algorithm shows that ∃ q, r ∈ Z such that
a = bq + r; 0 ≤ r < b.        (2.6)
Case 1: Let r < b/2. Taking q = Q and r = R in (2.6), we have
a = bQ + R; 0 ≤ R < b/2.
Case 2: Let r > b/2. Now a = bq + r can be written in the form
a = b(q + 1) + r − b = b(q + 1) − (b − r).
Taking q + 1 = Q and b − r = R, we have
a = bQ − R, with 0 < R < b/2, since R = b − r < b − b/2 = b/2.
Thus, combining Cases 1 and 2, we have
a = bQ ± R; 0 ≤ R < b/2.
Case 3: Let r = b/2. Then a = bq + r can be written in the form a = bQ + R, where we take q = Q and r = R = b/2. Again,
a = bq + r = b(q + 1) − (b − r) = bQ′ − R,
where q + 1 = Q′ and R = b − r = b/2. Thus it follows that for r = b/2, Q and R are not unique. In this case R is called the minimal remainder, i.e., the absolutely least remainder of a with respect to b.
Theorem 2.3.2 (Generalized division algorithm) : Given integers a and b (b ≠ 0), there exist unique integers q and r such that a = bq + r; 0 ≤ r < |b|.
Proof: When b is positive, this is the previous theorem. So it is enough to consider the case in which b is negative. When b is negative, then |b| > 0 as b ≠ 0. By the above theorem, ∃ unique integers q_1 and r such that
a = |b|q_1 + r; 0 ≤ r < |b|
⇒ a = (−b)q_1 + r = bq + r, where q = −q_1.
Hence the theorem.
Ex 2.3.1 If n is any positive integer, show that the product of the consecutive natural numbers n, n + 1, n + 2 is divisible by 6.
Solution: In case of division by 3, one of the numbers 0, 1, 2 will be the remainder, and correspondingly n is of one of the forms 3k, 3k + 1, 3k + 2; k ∈ Z. If
n = 3k, then 3|n; if n = 3k + 1, then 3|(n + 2); if n = 3k + 2, then 3|(n + 1).
Hence for any value of n in Z, 3|n(n + 1)(n + 2). In case of division by 2, one of the numbers 0, 1 will be the remainder, and correspondingly n is of the form 2k or 2k + 1; k ∈ Z. If
n = 2k then 2|n, and if n = 2k + 1 then 2|(n + 1).
Hence for any n ∈ Z, 2|n(n + 1), i.e., of the two consecutive integers n, n + 1 one is even, i.e., divisible by 2. Therefore,
2|n(n + 1)(n + 2) and 3|n(n + 1)(n + 2).
Since (2, 3) = 1, it follows that 6|n(n + 1)(n + 2). By the same procedure we can show that the product of m consecutive integers is divisible by m.
Ex 2.3.2 Show that the square of an odd integer is of the form 8k + 1; k ∈ Z.
Solution: By the division algorithm, when an integer is divided by 4 the remainder is one of 0, 1, 2, 3, and the integer is correspondingly of the form 4m, 4m + 1, 4m + 2 or 4m + 3. Of these forms, (4m + 1) and (4m + 3) are the odd integers. Now,
(4m + 1)^2 = 8(2m^2 + m) + 1,
where 2m^2 + m ∈ Z, which is of the form 8k + 1, and
(4m + 3)^2 = 8(2m^2 + 3m + 1) + 1,
where 2m^2 + 3m + 1 ∈ Z, which is of the form 8k + 1. Therefore the square of an odd integer is of the form 8k + 1; k ∈ Z.
Ex 2.3.3 Show that the square of any integer is of the form 4n or (4n + 1), for some n ∈ Z.
Solution: By the division algorithm, when an integer is divided by 2 the remainder is 0 or 1, and the integer is correspondingly of the form 2k or 2k + 1; k ∈ Z. Now,
(2k)^2 = 4k^2 = 4n; k^2 = n ∈ Z,
and (2k + 1)^2 = 4(k^2 + k) + 1 = 4n + 1; k^2 + k = n ∈ Z.
Hence the square of any integer is of the form 4n or 4n + 1; n ∈ Z.
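Both residue facts (Exs 2.3.2 and 2.3.3) are easy to confirm by exhausting the residue classes numerically:

```python
# Squares modulo 4 take only the values 0 and 1 (Ex 2.3.3), and squares of
# odd integers are always congruent to 1 modulo 8 (Ex 2.3.2).
assert {n * n % 4 for n in range(200)} == {0, 1}
assert {n * n % 8 for n in range(1, 200, 2)} == {1}
```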

2.4 Common Divisor

Let a and b be given arbitrary integers. If an integer d divides both a and b, i.e., if both

d|a and d|b,        (2.7)

then d is called a common divisor of a and b. The number of divisors of any non-zero integer is finite. Now,
(i) 1 is a common divisor of every pair of integers a and b, so the set of positive common divisors of a and b is non-empty.
(ii) Every integer divides zero, so if a = b = 0 then every integer serves as a common divisor of a and b. In this instance the set of positive common divisors of a and b is infinite.
(iii) However, when at least one of a and b is different from 0, there are only a finite number of positive common divisors.
Every pair of integers a and b, not both zero, thus has a non-empty finite set of positive common divisors, and every non-empty finite set of integers has a largest element; this largest common divisor, which can be expressed as a linear combination of a and b, is the gcd of the following definition.

2.4.1 Greatest Common Divisor

For two given integers a and b, with at least one of them different from zero, a positive integer d is defined to be the greatest common divisor (gcd) of a, b if
(i) d is a common divisor of a as well as b, i.e., d|a, d|b;
(ii) every common divisor of a, b is a divisor of d, i.e., for an integer c,
c|a, c|b ⇒ c|d.
The gcd of a, b is denoted by gcd(a, b) or simply (a, b). For more than two integers it is denoted by (a_1, a_2, · · · , a_n). From the definition it follows that
(a, b) = (−a, b) = (a, −b) = (−a, −b),
where a, b are integers, not both zero. For example,
(i) (12, 30) = 6, (9, 4) = 1, (0, 5) = 5, etc.
(ii) (−12, 30) = 6 and (−16, −40) = 8.
Result 2.4.1 Let d and d_1 both be greatest common divisors of the integers a and b. Then by the definition we find that d|d_1 and d_1|d. Hence there exist integers r and t such that d_1 = dr and d = d_1t. Now,
d = d_1t = drt, d ≠ 0 ⇒ rt = 1.
Thus r = t = ±1, and hence d = ±d_1. So it follows that two different gcds of a and b differ in their sign only; we take the positive value as the gcd.
Theorem 2.4.1 Any two integers a, b, not both of which are zero, have a unique gcd, which can be written in the form ma + nb; m, n ∈ Z.
Proof: Let us consider the set S of all positive linear combinations of a and b:
S = {xa + yb : x, y ∈ Z, xa + yb > 0}.
Assuming a ≠ 0 (otherwise interchange a and b), we have
a·a + 0·b = a^2 (> 0) ∈ S,
so S is a non-empty subset of N. Therefore, by the well ordering principle, it has a least element r (say), which is of the form
r = ma + nb; m, n ∈ Z.
We shall first show that r|a and r|b. If r is not a divisor of a, then by the division algorithm ∃ p, q ∈ Z such that a = pr + q; 0 < q < r, i.e.,
q = a − pr = a − p(ma + nb)
= (1 − mp)a + (−np)b,
where 1 − mp, −np ∈ Z and q > 0. Since 0 < q < r, this representation would imply that q is a member of S, contradicting the fact that r is the least element of S. Hence q = 0 and r|a, and similarly r|b. Next let c|a, c|b; then
c|a, c|b ⇒ a = ck_1, b = ck_2; k_1, k_2 ∈ Z,
so r = ma + nb = mck_1 + nck_2
= c(mk_1 + nk_2); where mk_1 + nk_2 ∈ Z.


Thus, c|r and so r = (a, b) and r = ma + nb; m, n Z. To the uniqueness of r, let there be
another gcd of a, b say r1 i.e r1 = (a, b) also. r|r1 and r1 |r i.e r, r1 are associates r1 = r.
But as r and r1 are both positive so r = r1 . Hence gcd is unique which can be expressed
as a linear combination of a and b with integral multiplier m and n. This is the Euclidean
algorithm for existence of gcd. Note the following:
(i) This method involves repeated application of the division algorithm.
(ii) If m, n are integers then (a, b) is the least positive integer of the form ma + nb, where
m and n range over integers.
(iii) The representation of d as ma + nb is not unique.
(iv) If a and b are integers, not both of which are zero, we have,
(a, b) = (b, a) = (a, b) = (a, b) = (a, b) = (a, b + ax),
for any integer x.
(v) The theorem does not give any algorithm how to express (a, b) in the desired form
ma + nb.
(vi) If d = (a1 , a2 , . . . , ar ), ai 6= 0; i then integers m1 , m2 , . . . , mr such that
d = a1 m1 + a2 m2 + . . . + ar mr .
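Although the existence proof above is non-constructive, the extended Euclidean algorithm computes both the gcd and one pair of multipliers m, n at once, by carrying the coefficients of a and b through each division step. A sketch (the function name is our own):

```python
def extended_gcd(a, b):
    """Return (d, m, n) with d = gcd(a, b) = m*a + n*b, for a, b >= 0."""
    old_r, r = a, b
    old_m, m = 1, 0
    old_n, n = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r   # remainders of the Euclidean algorithm
        old_m, m = m, old_m - q * m   # running coefficients of a
        old_n, n = n, old_n - q * n   # running coefficients of b
    return old_r, old_m, old_n
```

For 120 and 275 this returns the gcd 5 together with the coefficient pair (−16, 7), since 5 = (−16)·120 + 7·275.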
Theorem 2.4.2 (Method of finding the gcd) : For two given positive integers a, b, if a = bq + r; q, r ∈ Z, 0 ≤ r < b, then (a, b) = (b, r).
Proof: Let d = (a, b) and d_1 = (b, r). Since d is the gcd of a and b, d|a and d|b, i.e., ∃ k_1, k_2 ∈ Z such that a = dk_1, b = dk_2. Now,
a = bq + r ⇒ r = a − bq = dk_1 − dk_2q = d(k_1 − k_2q); k_1 − k_2q ∈ Z.
Thus d|r and also d|b, so d|d_1. Similarly we can get d_1|d: as d_1 = (b, r), d_1|b and d_1|r, so b = d_1b_1 and r = d_1r_1, whence
a = bq + r = d_1b_1q + d_1r_1 = d_1(b_1q + r_1).
Therefore d_1|a and also d_1|b ⇒ d_1|d. Since d|d_1, d_1|d and both are positive, d = d_1.
Ex 2.4.1 Find the gcd of 120 and 275 and express the gcd in the form 120m + 275n; m, n ∈ Z.
Solution: To find (120, 275), we apply the division algorithm repeatedly:
275 = 120·2 + 35
120 = 35·3 + 15
35 = 15·2 + 5
15 = 5·3 + 0.
The last non-zero remainder is the gcd, so (120, 275) = 5. Working backwards,
5 = 35 − 15·2 = 35 − (120 − 35·3)·2
= 7·35 − 2·120 = 7(275 − 120·2) − 2·120
= 7·275 + (−16)·120,
which is of the form 120m + 275n, where m = −16 and n = 7.
Ex 2.4.2 Show that (a, a + 2) = 1 or 2 for every integer a.
Solution: Let d = (a, a + 2); then d|a and d|(a + 2). Therefore, by the linearity property of divisibility, we have
d|(ma + n(a + 2)); m, n ∈ Z.
Taking m = −1 and n = 1, it follows that d|2, i.e., d is either 1 or 2.
Ex 2.4.3 Let a, b be two integers, not both zero. Prove that (ka, kb) = k(a, b) for any positive integer k.
Solution: Let (a, b) = d_1; then there exist m, n ∈ Z such that
d_1 = ma + nb, i.e., kd_1 = kma + knb.
Let (ka, kb) = d_2. Then d_2 divides ka and kb, so d_2 divides the least positive integer of the form kma + knb, i.e., d_2 divides kd_1. On the other hand, d_1 divides a and b, hence kd_1 divides ka and kb. But d_2 = (ka, kb), and consequently kd_1|d_2. Hence,
(ka, kb) = d_2 = kd_1 = k(a, b).
This is actually a distributive law.

2.5 Common Multiple

Let a_1, a_2, · · · , a_n be integers, all different from zero. An integer b is said to be a common multiple of a_1, a_2, · · · , a_n if a_i|b for i = 1, 2, · · · , n. Common multiples do exist; for example, 2·3·5 is a common multiple of the integers 2, 3, 5, none of which is zero.

2.5.1 Lowest Common Multiple

Let a, b ∈ Z, both non-zero, and consider the set
S = {x : x ∈ N and x is a common multiple of a and b}
= {x : x ∈ N such that a|x and b|x}.
Now a | |ab| and b | |ab|, and |ab| ∈ N, so |ab| ∈ S. Therefore S is a non-empty subset of the positive integers. Hence, by the well ordering principle, S has a least element, say m. This m is called the lowest common multiple (lcm) of a and b, written [a, b]. In general, the least of the positive common multiples of a_1, a_2, · · · , a_n is called the least common multiple of a_1, a_2, · · · , a_n and is denoted by [a_1, a_2, · · · , a_n]. If m = [a_1, a_2, · · · , a_n], then the set of common multiples of the integers is
{0, ±m, ±2m, ±3m, · · · }.
The lcm of any set of non-zero integers is unique. For example, the lcm of 2, 3, 6 is 6; that of −2, −3, −6 is 6; the lcm of −2, −6, 10 is 30.


Property 2.5.1 The relation between the gcd and the lcm is [a, b](a, b) = |ab|.
Proof: It is sufficient to prove the result for positive integers only. First consider (a, b) = 1. Suppose [a, b] = m; then m = ka for some k. Then b|ka and (a, b) = 1, therefore b|k, and so b ≤ k and ba ≤ ka = m. But ba, being a positive common multiple of b and a, cannot be less than the least common multiple, and so
ba = m = [a, b] when (a, b) = 1.
Now let us consider the general case (a, b) = d > 1. Then (a/d, b/d) = 1, so by the above,
[a/d, b/d] = (a/d)(b/d).
Using [ka, kb] = k[a, b] (Property 2.5.3 below) with k = d,
[a, b] = d[a/d, b/d] = d·(a/d)(b/d) = ab/d,
⇒ [a, b](a, b) = ab.
Hence the theorem. Thus, if
(a, b) = d, then [a, b] = ab/d.
From this we have: if (a, b) = 1, then [a, b] = |ab|.
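The identity [a, b](a, b) = |ab| gives the standard way of computing the lcm from the gcd. A sketch using the standard library's `math.gcd` (the `lcm` name is our own):

```python
from math import gcd

def lcm(a, b):
    """Lowest common multiple via the identity [a, b](a, b) = |ab|."""
    return abs(a * b) // gcd(a, b)

# The identity from Property 2.5.1, checked on the gcd example (120, 275) = 5.
assert lcm(120, 275) * gcd(120, 275) == 120 * 275
```

Dividing by the gcd before or after the product is a matter of taste in Python, where integers do not overflow; in fixed-width languages one would compute `a // gcd(a, b) * b` to keep intermediates small.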


Property 2.5.2 If m = [a_1, a_2, · · · , a_n] and l is an integer, then a_i|l for i = 1, 2, · · · , n if and only if m|l.
Proof: Since m = [a_1, a_2, · · · , a_n], we have a_i|m for i = 1, 2, · · · , n. First let m|l. Then
a_i|m, i = 1, 2, · · · , n and m|l ⇒ a_i|l, i = 1, 2, · · · , n.
Conversely, let a_i|l for i = 1, 2, · · · , n. Suppose m ∤ l; then by the division algorithm l = mq + r, where 0 < r < m, i.e., r = l − mq. Now,
a_i|l, a_i|m ⇒ a_i|r; i = 1, 2, · · · , n.
But this contradicts the fact that m is the least positive common multiple of the a_i, and hence m|l. Therefore, if [a_1, a_2] = m_2, [m_2, a_3] = m_3, [m_3, a_4] = m_4, . . . , [m_{n−1}, a_n] = m_n, then [a_1, a_2, . . . , a_n] = m_n.
Property 2.5.3 For k > 0, [ka, kb] = k[a, b].
Proof: Let m = [ka, kb]; then by definition ka|m and kb|m, so k|m and we may write m = kx.
If [a, b] = x_1, we note that a|x_1, b|x_1 ⇒ ak|kx_1, bk|kx_1, so m|kx_1, i.e., kx|kx_1, whence x|x_1. Also, ak|kx, bk|kx ⇒ a|x, b|x, and so x_1|x. Hence x = x_1, and therefore
m = kx = kx_1 = k[a, b] ⇒ [ka, kb] = k[a, b].
Hence the result.
Ex 2.5.1 Show that (a + b, [a, b]) = (a, b) for any two non-zero integers a and b.
Solution: Let d = (a, b) and write a = da′, b = db′, where (a′, b′) = 1. Then a + b = d(a′ + b′) and, by Property 2.5.1, [a, b] = ab/d = da′b′. Hence
(a + b, [a, b]) = d(a′ + b′, a′b′).
Now (a′ + b′, a′b′) = 1; for if a prime p divides a′b′, then p divides a′ or b′, say p|a′, and if also p|(a′ + b′), then p|b′, contradicting (a′, b′) = 1. Therefore,
(a + b, [a, b]) = d = (a, b).

2.6 Diophantine Equations

In this section we consider Diophantine equations, named after the Greek mathematician Diophantus of Alexandria. We apply the term Diophantine equation to an equation in one or more unknowns, with integer coefficients, which is to be solved in integers only. Such an equation is usually indeterminate (i.e., the number of equations is less than the number of unknowns). One of the basic interests in the theory of numbers is to obtain all solutions in Z of a given algebraic polynomial equation
a_0x^n + a_1x^{n−1} + · · · + a_{n−1}x + a_n = 0; a_i ∈ Z.
Such a problem is called a Diophantine problem, and we say we are solving a Diophantine equation. As an example, consider one of the oldest Diophantine equations: x^2 + y^2 = z^2, where x, y, z are pairwise relatively prime integers; its complete solution has the form
x = a^2 − b^2, y = 2ab, z = a^2 + b^2, with (a, b) = 1.
In equations of this type we usually look for the solutions in a restricted class of numbers, such as the positive integers or the negative integers.

2.6.1

Linear Diophantine Equations

Let us consider a linear diophantine equations in two unknown variables x and y as


ax + by = c;

where, a, b, c Z

(2.8)

with a, b integers (not both zero). An integral solution of (2.8) is a pair of integers (x0, y0) that, when substituted into the equation, satisfies it, i.e., we ask that ax0 + by0 = c. In finding the solution of a Diophantine equation ax + by = c, (a, b) = 1, we follow the following methods:
(i) Substitution method,
(ii) Simple continued fraction,
(iii) Euclidean algorithm method.
If (a, b) = 1 and if (x0, y0) is a particular solution of the linear Diophantine equation ax + by = c, then all solutions are x = x0 + bk; y = y0 − ak, for integral values of k.
(i) A given linear Diophantine equation can have a number of integral solutions, as is the case with 2x + 4y = 12, where
2·4 + 4·1 = 12; 2·2 + 4·2 = 12.
(ii) Conversely, there are some linear Diophantine equations, like 2x + 6y = 13, which have no solution at all, due to the fact that the LHS is an even integer whatever the choice of x and y, whereas the RHS is not.
So, our first task is to find out the condition for solvability of the linear Diophantine equations. The following theorem tells us when a Diophantine equation has a solution.
Theorem 2.6.1 The linear Diophantine equation ax + by = c has an integral solution if and only if (a, b) divides c, where a, b, c are integers such that a, b are not both zero.

Theory of Numbers

Proof: Let d = (a, b), the greatest common divisor of a and b. If d ∤ c, then there exist no integers x and y with ax + by = c. Suppose d|c. Since (a, b) = d, d can be expressed in the form d = ma + nb, where m, n ∈ Z. This can be put in the general form
d = a(m − kb) + b(n + ka), where k ∈ Z.
We have d|c, i.e., c = ld; l ∈ Z. Now,
ld = al(m − kb) + bl(n + ka)
or, c = a{l(m − kb)} + b{l(n + ka)}.
Let x0 = l(m − kb) ∈ Z and y0 = l(n + ka) ∈ Z, so that (x0, y0) is an integral solution of ax + by = c. Conversely, let (x0, y0) be an integral solution of the equation ax + by = c. Then ax0 + by0 = c, where x0, y0 are integers. Let (a, b) = d; then
d|a and d|b ⇒ d|(ax0 + by0); i.e., d|c.
Now, given any particular solution (x1, y1) of ax + by = c (for instance x1 = (c/d)x0, y1 = (c/d)y0, where ax0 + by0 = d), we are to find all integral solutions. Suppose r and s are integers satisfying ar + bs = c; we get
ar + bs = ax1 + by1 ⇒ (a/d)(r − x1) = (b/d)(y1 − s).   (2.9)
Now d = (a, b), so (a/d, b/d) = 1; then from (2.9) we conclude that (a/d)|(y1 − s) and (b/d)|(r − x1), and hence ∃ an integer t such that
r = x1 + (b/d)t and s = y1 − (a/d)t; t ∈ Z.
So the linear Diophantine equation ax + by = c (a, b, c ∈ Z) has a solution iff d = (a, b) divides c. Moreover, for any integral solution (x′, y′), ∃ an integer t such that
x′ = x0 + (b/d)t and y′ = y0 − (a/d)t.
In fact, (x0 + (b/d)t, y0 − (a/d)t) is an integral solution of the given equation for any integer t, as
a(x0 + (b/d)t) + b(y0 − (a/d)t) = (ax0 + by0) + (ab/d)t − (ab/d)t = ax0 + by0 = c.
Hence, if (x0, y0) is an integral solution of the given equation, then all the integral solutions are given by
x = x0 + (b/d)t; y = y0 − (a/d)t,
where t is any integer. Therefore there are infinitely many solutions of the given equation, one for each value of t.
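The constructive half of this proof is exactly the extended Euclidean algorithm. Below is a minimal Python sketch (the function names are mine, not the book's) that produces one particular solution (x0, y0) of ax + by = c when (a, b) divides c:

```python
def ext_gcd(a, b):
    # returns (d, m, n) with d = (a, b) and a*m + b*n = d
    if b == 0:
        return (abs(a), 1 if a >= 0 else -1, 0)
    d, m, n = ext_gcd(b, a % b)
    return (d, n, m - (a // b) * n)

def solve_diophantine(a, b, c):
    # one integral solution of a*x + b*y = c (a, b not both zero),
    # or None if (a, b) does not divide c
    d, m, n = ext_gcd(a, b)
    if c % d != 0:
        return None
    l = c // d
    # general solution: (m*l + (b//d)*t, n*l - (a//d)*t) for t in Z
    return (m * l, n * l)
```

For 108x + 45y = 81 (the next worked example) this yields the particular solution (−18, 45), and the full family is then x = −18 + (45/9)t, y = 45 − (108/9)t.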


Ex 2.6.1 Find all solutions of the Diophantine equation 108x + 45y = 81.
Solution: By the Euclidean algorithm, (45, 108) = 9. Because 9|81, an integral solution to this equation exists. To obtain the integer 9 as a linear combination of 108 and 45, we work as follows:
9 = 45 − 2·18 = 45 − 2(108 − 2·45) = 5·45 − 2·108.
Upon multiplying this relation by 9, we arrive at
81 = 9·9 = 9·[5·45 + (−2)·108] = 108·(−18) + 45·45,
so that x = −18 and y = 45 provide one integral solution to the given linear Diophantine equation. Also, the equation can be written in the form
108x + 45y = 81 = 108·(−18) + 45·45
or, (108/9)(x + 18) = (45/9)(45 − y).
Since 108/9 and 45/9 are prime to each other, we have
(x + 18)/(45/9) = (45 − y)/(108/9) = t, say.
Thus all integral solutions can be expressed as
x = −18 + (45/9)t, y = 45 − (108/9)t, where t ∈ Z
or, x = −18 + 5t, y = 45 − 12t; where t = 0, ±1, ±2, ….

Deduction 2.6.1 All integral solutions of ax + by = c, where a, b, c ∈ N and (a, b) = 1: Since (a, b) = 1, ∃ m, n ∈ Z such that am + bn = 1. Thus,
ax + by = c(am + bn) ⇒ a(x − cm) = −b(y − cn)
⇒ (x − cm)/(−b) = (y − cn)/a = t (say) ∈ Z
⇒ x = cm − bt, y = cn + at; t ∈ Z,
where, as (a, b) = 1, b|(x − cm) and a|(y − cn). This is the general solution in integers. For positive integral solutions, we must have
cm − bt > 0 and cn + at > 0, i.e., −cn/a < t < cm/b.
If we take cm/b = p + f1 and cn/a = q + f2, where p and q are integers with 0 < f1 ≤ 1 and 0 ≤ f2 < 1, then t ≤ p and t > −q. In this case the total number of solutions in positive integers is p + q.
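The bound −cn/a < t < cm/b can be cross-checked by brute-force enumeration; the following is a small sketch (the helper is mine, assuming a, b, c are positive and modest in size):

```python
def positive_solutions(a, b, c):
    # all (x, y) with x, y > 0 and a*x + b*y == c, by direct search over x
    return [(x, (c - a * x) // b)
            for x in range(1, c // a + 1)
            if (c - a * x) > 0 and (c - a * x) % b == 0]
```

For 5x + 3y = 52 it returns [(2, 14), (5, 9), (8, 4)].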

Ex 2.6.2 Find all positive integral solutions of 5x + 3y = 52.


Solution: Here 5 and 3 are prime to each other, i.e., d = (5, 3) = 1. Thus there exist m, n ∈ Z such that 5m + 3n = 1. Here m = 2, n = −3. Thus,
5x + 3y = 52[5·2 + 3·(−3)]
or, 5(x − 104) = −3(y + 156).
Since 5 and 3 are prime to each other, x − 104 is divisible by 3 and y + 156 is divisible by 5, and therefore
(x − 104)/(−3) = (y + 156)/5 = t; t ∈ Z
or, x = 104 − 3t; y = 5t − 156, where t = 0, ±1, ±2, ….
This is the general solution in integers. For a positive integral solution, we must have
104 − 3t > 0 and 5t − 156 > 0 ⇒ 156/5 < t < 104/3.
The solutions in positive integers correspond to t = 32, 33, 34, giving (x, y) = (8, 4), (5, 9) and (2, 14).
Ex 2.6.3 Find all positive integral solutions of 5x + 12y = 80.
Solution: Here 5 and 12 are prime to each other, i.e., d = (5, 12) = 1. Thus there exist m, n ∈ Z such that 5m + 12n = 1. Here m = 5, n = −2. Thus,
5x + 12y = 80(5·5 − 12·2)
or, 5(x − 400) = −12(y + 160).
Since 5 and 12 are prime to each other, x − 400 is divisible by 12 and y + 160 is divisible by 5, and therefore
(x − 400)/(−12) = (y + 160)/5 = t; t ∈ Z
or, x = 400 − 12t; y = 5t − 160, where t = 0, ±1, ±2, ….
This is the general solution in integers. For a positive integral solution, we must have
400 − 12t > 0 and 5t − 160 > 0 ⇒ 32 < t < 100/3.
The only solution in positive integers corresponds to t = 33, and the solution is x = 4, y = 5.
Ex 2.6.4 Find all positive integral solutions of 12x − 7y = 8.
Solution: Here 12 and 7 are prime to each other, i.e., d = (12, 7) = 1. Thus there exist m, n ∈ Z such that 12m + 7n = 1. Here m = 3, n = −5. Therefore,
12x − 7y = 8[12·3 + 7·(−5)]
or, 12(x − 24) = 7(y − 40).
Since 12 and 7 are prime to each other, x − 24 is divisible by 7 and y − 40 is divisible by 12, and so
(x − 24)/7 = (y − 40)/12 = t; t ∈ Z
or, x = 24 + 7t; y = 40 + 12t, where t = 0, ±1, ±2, ….
This is the general solution in integers. For a positive integral solution, we must have
24 + 7t > 0 and 40 + 12t > 0 ⇒ t > −24/7 and t > −10/3 ⇒ t ≥ −3.
Hence every t = −3, −2, −1, 0, 1, … gives a positive solution; the least, t = −3, gives x = 3, y = 4.

2.7 Prime Numbers

An integer p > 1 is called a prime number, or simply a prime, if there is no positive divisor d of p satisfying 1 < d < p, i.e., its only positive divisors are 1 and p. If p > 1 is not prime, it is called a composite number.
(i) The integer 1 is regarded as neither prime nor composite.
(ii) 2 is the only even prime number. All other prime numbers are necessarily odd.
For example, the prime numbers less than 10 are 2, 3, 5, 7, while 4, 6, 8, 9 are composite.

Prime Numbers

2.7.1

103

Relatively Prime Numbers

Two integers a and b, not both of which are zero, are said to be relatively prime or co-prime
if (a, b) = 1. In this case, it is guaranteed the existence of integers m and n such that
1 = ma + nb.
For example, 4 and 9 are not prime numbers, but they are relatively prime as (4, 9) = 1. A
set of integers a1, a2, …, an, not all zero, is said to be relatively prime if

(ai, aj) = 1; i ≠ j; i, j = 1, 2, …, n.   (2.10)

Ex 2.7.1 Prove that, for n > 3, the integers n, n + 2, n + 4 cannot be all primes.
Solution: Any integer n is of one of the forms 3k, 3k + 1, 3k + 2, where k ∈ Z. If
(i) n = 3k, then n is not a prime (as n > 3);
(ii) n = 3k + 1, then n + 2 = 3(k + 1) and it is not prime;
(iii) n = 3k + 2, then n + 4 = 3(k + 2) and it is not prime.
Thus in any case the integers n, n + 2, n + 4 cannot all be primes.
Theorem 2.7.1 If m(> 0) ∈ Z, then (ma, mb) = m(a, b), where a, b ∈ Z are not both zero.
Proof: Let (a, b) = k; then a = kA, b = kB; A, B ∈ Z and (A, B) = 1. Therefore,
ma = mkA; mb = mkB and (A, B) = 1
⇒ (ma, mb) = mk = m(a, b).
Theorem 2.7.2 If d = (a, b) > 0, then a/d and b/d are integers prime to each other.
Proof: We observe that, although a/d and b/d have the appearance of fractions, in fact they are integers, as d is a divisor of both a and b. Since d = (a, b), by the existence theorem ∃ m, n ∈ Z such that d = ma + nb. Therefore,
1 = (a/d)m + (b/d)n.
Since d|a, d|b, ∃ integers u, v, where a/d = u and b/d = v, such that 1 = um + vn. Since u, v are integers, the conclusion is that a/d and b/d are relatively prime.
Theorem 2.7.3 If (a, b) = 1, then for any integer c, (ac, b) = (c, b).
Proof: Let (ac, b) = d and (c, b) = d1. Since (a, b) = 1, ∃ m, n ∈ Z such that
am + bn = 1 ⇒ acm + bcn = c.
Now,
(ac, b) = d ⇒ d|ac, d|b ⇒ d|ac, d|bc; as b|bc
⇒ d|(acm + bcn) ⇒ d|c.
Then d|c, d|b ⇒ d|d1; as (c, b) = d1. Also,
(c, b) = d1 ⇒ d1|c, d1|b
⇒ d1|ac, d1|b; as c|ac
⇒ d1|d; as (ac, b) = d.
Thus it follows that d = d1. For example, (2, 5) = 1, c = 10, (20, 5) = 5 and (10, 5) = 5; therefore (2·10, 5) = (10, 5). Consequently, if (ai, b) = 1 for i = 1, 2, …, n, then (a1a2 ⋯ an, b) = 1.


Theorem 2.7.4 If p(> 1) is prime and p|ab, then p|a or p|b, where a, b are any two integers.
Proof: Let p be a prime and a, b integers such that p|ab. If a = 0 or b = 0, the result is true. If p|a, then the result is also true. Let us assume that p ∤ a. Because the only positive divisors of p are 1 and p itself, we have (p, a) = 1, so ∃ m, n ∈ Z such that 1 = ma + np. Multiplying both sides by b, we get
b = mab + npb = (mp)c + npb; letting ab = pc; c ∈ Z
= p(mc + nb) = p · (an integer),
so that p|b. Similarly, if p ∤ b, then p|a. Conversely, let us suppose that the integer p(> 1) satisfies the given condition. Let q be a positive divisor of p such that q < p, and write p = qr. Since p|p, we have p|qr. Hence either p divides q or p divides r. Since 0 < q < p, p ∤ q, so p|r. Thus ∃ k ∈ Z such that r = pk, so that
p = qr = qpk ⇒ qk = 1 ⇒ q = 1,
so we conclude that 1 and p are the only positive divisors of p. Hence p is a prime. Thus, if a positive integer p(> 1) has the property that, for any a, b ∈ Z,
p|ab ⇒ p|a or p|b,
then p is prime. For example, 12|8·3, but neither 8 nor 3 is divisible by 12; hence 12 is not prime. This theorem distinguishes prime numbers from composite numbers, which is a fundamental problem in number theory. Now,
(i) If p = ab, then at least one of a and b must be less than p.
(ii) If a(≠ ±1) ∈ Z, a must have a prime factor.
Ex 2.7.2 Show that the fraction (9n + 8)/(6n + 5) is irreducible for all n ∈ N.
Solution: It is sufficient to show that (9n + 8, 6n + 5) = 1. Let a = 9n + 8, b = 6n + 5; then 2a − 3b = 1. Therefore a and b are relatively prime. Hence the fraction (9n + 8)/(6n + 5) is irreducible
for all n ∈ N.
Theorem 2.7.5 If p is prime and p|a1a2 ⋯ an, then p|ai for some i with 1 ≤ i ≤ n.
Proof: We prove this by the principle of mathematical induction on n, the number of factors. When n = 1, i.e., if p|a1, the result is true. The case n = 2 is the previous theorem. As the induction hypothesis, let n > 2 and suppose that whenever p divides a product of fewer than n factors, it divides at least one of the factors. Now let p|a1a2 ⋯ an; then either p|an or p|a1a2 ⋯ a(n−1), and the inductive hypothesis ensures that p|ai for some choice of i with 1 ≤ i ≤ n − 1. In any event, p divides one of the integers a1, a2, …, an.
Therefore, if p, q1, q2, …, qn are all primes and p|q1q2 ⋯ qn, then p = qk for some k, where 1 ≤ k ≤ n.
Ex 2.7.3 If a, b are both primes with a > b ≥ 5, show that 24|a^2 − b^2.


Solution: Since a and b are primes > 3, both are of the form 3k + 1 or 3k + 2, where k ∈ Z. If a and b are both of the same form, then 3|a − b. If one is of the form 3k + 1 and the other of the form 3k + 2, then 3|a + b. Thus, in any case, 3|a^2 − b^2.
Since a, b are odd primes, they are of the form 4k + 1 or 4k + 3, where k ∈ Z. If both a and b are of the form 4k + 1, then 2|a + b and 4|a − b. If both are of the form 4k + 3, then 4|a + b and 2|a − b. If one is of each form, then 4|a + b and 2|a − b. Thus, in any case, 8|a^2 − b^2.
Since (3, 8) = 1, we have 24|a^2 − b^2.
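This result is easy to confirm empirically over all primes below 200; the brute-force check below is a sketch of mine, not part of the text:

```python
# primes >= 5 below 200, by trial division
primes = [p for p in range(5, 200)
          if all(p % q for q in range(2, int(p ** 0.5) + 1))]

# 24 divides a^2 - b^2 for every pair of such primes
ok = all((a * a - b * b) % 24 == 0 for a in primes for b in primes)
```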
Theorem 2.7.6 If (a, b) = 1 and b|ac then b|c.
Proof: Since b|ac, ∃ r ∈ Z such that ac = br. Also, (a, b) = 1, so ∃ m, n ∈ Z such that 1 = ma + nb. Multiplication of this equation by c produces
c = c·1 = c(ma + nb)
= mac + nbc = mbr + nbc
= b(mr + nc) = b · (an integer).
Hence b|c. This is known as Euclid's lemma. In particular, if ap = bq and (a, b) = 1, then a|q and b|p.
If a and b are not relatively prime, then the result may or may not be true. For example, 12|9·8, but 12 ∤ 9 and 12 ∤ 8.
Theorem 2.7.7 If a|c, b|c and (a, b) = 1 then ab|c.
Proof: Inasmuch as a|c, b|c, ∃ k1, k2 ∈ Z such that c = ak1 = bk2. Again, the relation (a, b) = 1 allows us to write m, n ∈ Z such that ma + nb = 1. Multiplying this equation by c, it appears that
c = c·1 = c(ma + nb) = mac + nbc
= mabk2 + nbak1; as c = ak1 = bk2
= ab(mk2 + nk1) = ab · (an integer).
Hence we have the divisibility statement ab|c.
Theorem 2.7.8 a|b if and only if ac|bc, where c 6= 0.
Proof: If ac|bc, then bc = (ac)q; q ∈ Z. Therefore,
c(b − aq) = 0 ⇒ b − aq = 0, as c ≠ 0
⇒ b = aq; i.e., a|b.
The converse part is obvious. Without the condition (a, b) = 1, a|c and b|c together may not imply ab|c. For example, 4|12 and 6|12 do not imply 4·6|12.
Theorem 2.7.9 If (a, b) = 1, then for any integer q, (a + bq, b) = 1.
Proof: Let (a + bq, b) = k, where k ≥ 1. If k = 1, then the result holds. Let k > 1; then
(a + bq, b) = k(> 1) ⇒ k|(a + bq), k|b
⇒ k|[(a + bq)·1 + b·(−q)] ⇒ k|a.
Therefore k|a, k|b ⇒ (a, b) ≠ 1, which is a contradiction. Hence k = 1, and therefore (a + bq, b) = 1.
Ex 2.7.4 If (a, b) = 1, prove that (a + b, a^2 − ab + b^2) = 1 or 3.


Solution: Let d = (a + b, a^2 − ab + b^2). Then,
(a + b, a^2 − ab + b^2) = d ⇒ (a + b, b^2 − 2ab) = d; as a^2 − ab + b^2 = a(a + b) + (b^2 − 2ab)
⇒ (a + b, 3ab) = d; as b^2 − 2ab = b(a + b) − 3ab
⇒ d|3a^2; as 3a^2 = 3a(a + b) − 3ab.
Also, d|3b^2; as 3b^2 = 3b(a + b) − 3ab. Therefore,
d | (3a^2, 3b^2) = 3(a^2, b^2) = 3, since (a, b) = 1 ⇒ (a^2, b^2) = 1.
Thus we have d = 1 or d = 3.
Theorem 2.7.10 If (a, b) = 1 and (a, c) = 1, then (a, bc) = 1.
Proof: Since (a, b) = 1 = (a, c), ∃ m1, m2, n1, n2 ∈ Z such that
1 = m1a + n1b = m2a + n2c
⇒ (n1b)(n2c) = (1 − m1a)(1 − m2a) = 1 − ak; k ∈ Z
⇒ ak + n1n2·bc = 1.
So if r = (a, bc), then r|1, so r = 1 ⇒ (a, bc) = 1. Thus if a is prime to b and a is prime to c, then a is prime to bc. From this theorem we have the following results:
(i) If (a, x) = 1, then (a, x^2) = 1 and, in general, (a, x^n) = 1.
(ii) If (a, x^n) = 1, then (a, x) = 1.
(iii) If (a, c) = (b, c) = 1, then (ab, c) = 1.
Theorem 2.7.11 If (a, b) = 1 and c|a then (c, b) = 1.
Proof: Since (a, b) = 1, ∃ m, n ∈ Z such that 1 = ma + nb. Also, c|a, so ∃ k ∈ Z such that a = ck. Now,
1 = ma + nb = mkc + nb = (mk)c + nb
⇒ (c, b) = 1; as mk, n ∈ Z.
Theorem 2.7.12 If (a, b) = 1 then, (a + b, ab) = 1.
Proof: Since a is prime to b, ∃ m, n ∈ Z such that am + bn = 1. This expression am + bn = 1 can be written in the form
a(m − n) + (a + b)n = 1.
Since m − n and n are integers, it follows that a is prime to a + b. Again, the expression am + bn = 1 can be written in the form
(a + b)m + b(n − m) = 1.
Since m and n − m are integers, it follows that a + b is prime to b. Hence a + b is prime to ab.
Ex 2.7.5 If (a, b) = 1, prove that (a^2, b) = 1 and (a^2, b^2) = 1.


Solution: Since (a, b) = 1, ∃ m, n ∈ Z such that am + bn = 1. Thus,
a^2·m^2 = (1 − bn)^2 = 1 − 2bn + b^2·n^2
⇒ a^2·m^2 + b(2n − bn^2) = 1.
Since m^2 and (2n − bn^2) ∈ Z, it follows that (a^2, b) = 1. As (a^2, b) = 1, ∃ m1, n1 ∈ Z such that a^2·m1 + b·n1 = 1. Then,
b^2·n1^2 = (1 − a^2·m1)^2 = 1 − 2a^2·m1 + a^4·m1^2
⇒ a^2(2m1 − a^2·m1^2) + b^2·n1^2 = 1.
Since (2m1 − a^2·m1^2) and n1^2 ∈ Z, it follows that (a^2, b^2) = 1. In general, if d = (a, b), then d^2 = (a^2, b^2).
Theorem 2.7.13 If c|ab and (a, c) = 1 then c|b.
Proof: Since (a, c) = 1, ∃ m, n ∈ Z such that 1 = ma + nc. Also, as c|ab, ∃ k ∈ Z such that ab = kc. Now,
1 = ma + nc ⇒ b = mab + nbc
= mkc + nbc = c(mk + nb)
= c · (an integer) ⇒ c|b.
Theorem 2.7.14 If a|c, b|c, (a, b) = d then ab|cd.

Ex 2.7.6 Prove that √m is irrational for any positive prime m.
Solution: If possible, let √m be a rational number. Then,
√m = p/q; where p, q ∈ Z, q > 0, (p, q) = 1
or, m = p^2/q^2 ⇒ p^2 = q^2·m = q(qm) ⇒ q|p^2.
If q > 1, then by the fundamental theorem of arithmetic ∃ a prime r such that r|q. Thus,
r|q and q|p^2 ⇒ r|p^2 ⇒ r|p.
Then (p, q) ≥ r > 1, so a contradiction arises unless q = 1. When q = 1, we have p^2 = m, which is not possible, as the square of an integer cannot be prime. Hence √m is irrational for any prime m.

Ex 2.7.7 Prove that Fermat's numbers are relatively prime to each other.
Solution: A number of the form Fn = 2^(2^n) + 1 is known as a Fermat number. Fermat conjectured that Fn is prime for every n ≥ 0, although this fails already for n = 5.
Let r be a common divisor of the two Fermat numbers Fn and F(n+k), k ≥ 1. Since Fn and F(n+k), being odd integers, cannot have any even integer as a common divisor, r is odd. Now,
Fn = 2^(2^n) + 1; F(n+k) = 2^(2^(n+k)) + 1 = (2^(2^n))^(2^k) + 1.
Thus F(n+k) − 2 = (2^(2^n))^(2^k) − 1 has a factor 2^(2^n) + 1 = Fn,
or, Fn | F(n+k) − 2; also, r|Fn ⇒ r | F(n+k) − 2.
So, r|F(n+k) and r | F(n+k) − 2 ⇒ r | F(n+k) − (F(n+k) − 2) ⇒ r|2,
which forces r = 1, since r is odd. Hence (Fn, F(n+k)) = 1, i.e., Fermat numbers are relatively prime to each other.
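The pairwise coprimality is easy to confirm numerically for small n. A sketch using math.gcd (F5's famous factor 641, found by Euler, is a known fact quoted here, not derived):

```python
from math import gcd

# Fermat numbers F_0, ..., F_6
fermat = [2 ** (2 ** n) + 1 for n in range(7)]

# every pair has gcd 1, as the proof above shows
pairwise_coprime = all(gcd(fermat[i], fermat[j]) == 1
                       for i in range(7) for j in range(i + 1, 7))

# F_5 is composite: 641 divides it, so not every F_n is prime
f5_composite = fermat[5] % 641 == 0
```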


Theorem 2.7.15 Every integer greater than 1 has a least divisor (other than 1) which is prime.
Proof: Let n(> 1) be a positive integer. Let S be the set of positive divisors of n other than 1. S is non-empty, as n ∈ S (since n|n). Thus S is a non-empty set of natural numbers; hence by the well-ordering principle it has a least element. Let k be the least element of S. Then k is the least divisor of n other than 1. We assert that k is prime; for if k were not prime, then
k = k1·k2, where 1 < k1 < k,
and k1|n, which contradicts the fact that k is the least divisor. Hence k is prime. Therefore a composite number has at least one prime divisor.
Ex 2.7.8 If 2^n − 1 is prime, prove that n is a prime.
Solution: Let n be composite; then n = pq, where p and q are integers greater than 1. Now,
2^n − 1 = 2^(pq) − 1 = (2^p − 1)(2^(p(q−1)) + 2^(p(q−2)) + ⋯ + 2^p + 1).
Each of the factors on the right hand side is evidently greater than 1, and therefore 2^n − 1 is composite. Hence, if 2^n − 1 is prime, then n is prime.
Ex 2.7.9 Let p be prime and a be a positive integer. Prove that a^n is divisible by p if and only if a is divisible by p.
Solution: Let a be divisible by p; then a = pk for some k ∈ Z. Thus,
a^n = p^n·k^n = p(p^(n−1)·k^n) = p·m,
where m = p^(n−1)·k^n ∈ Z. This shows that a^n is divisible by p. Conversely, let a be not divisible by p, i.e., (a, p) = 1. Therefore ∃ u, v ∈ Z such that au + pv = 1. Then,
a^n·u^n = (1 − pv)^n = 1 − ps; s ∈ Z
or, a^n·u^n + ps = 1 ⇒ (a^n, p) = 1.
Therefore a^n is not divisible by p. Thus p|a^n ⇒ p|a. Hence a^n is divisible by p if and only if a is divisible by p.

2.7.2 Fundamental Theorem of Arithmetic

Every integer greater than 1 is either a prime number or can be expressed as a product of finitely many positive primes, and the expression is unique except for the rearrangement of factors.
Proof: Existence: Let n(> 1) be a given positive integer. Since 2 is prime, if n is 2 or a prime number, there is nothing to prove. If n is a composite number, then it has a prime factor n1(> 1), and so ∃ an integer r1 such that
n = n1·r1, where 1 < r1 < n.
Among all such factorizations, choose n1 to be the least prime factor (this is possible by the well-ordering principle). If r1 is prime, then n is a product of two primes and the result is obtained. If r1 is not prime, then it has a least prime factor n2(> 1) and
n = n1·n2·r2; 1 < r2 < r1 < n.


This process of factorizing each composite factor is continued, n = n1·n2 ⋯ n(k−1)·r(k−1), where 1 < r(k−1) < r(k−2) < ⋯ < r2 < r1 < n gives a strictly descending chain of positive integers, and such a chain must terminate after finitely many steps. When it terminates, the last factor r(k−1) is prime, say nk. This leads to the prime factorization
n = n1·n2·n3 ⋯ nk, where the ni are all prime.
Uniqueness: To prove the uniqueness of representation, let us assume that the integer n can be represented as a product of primes in two ways, say,
n = n1·n2·n3 ⋯ nk = p1·p2·p3 ⋯ pl; k ≤ l,
where the n's and p's are all primes. Now p1|n1·n2·n3 ⋯ nk and p1 is prime, so p1|ni for some i with 1 ≤ i ≤ k. Since ni and p1 are both prime, we conclude p1 = ni. Without loss of generality we can say p1 = n1, and cancelling it,
n2·n3 ⋯ nk = p2·p3 ⋯ pl.
A similar argument shows that n2 = p2, …, nk = pk, leaving 1 = p(k+1) ⋯ pl, which is absurd (as each pi is prime and > 1) unless l = k. Hence l = k and pi = ni; i = 1, 2, 3, …, k, making the two factorizations of n identical. Thus n > 1 can be expressed as a product of finitely many positive primes, the representation being unique apart from the order of the factors. This is known as the fundamental theorem of arithmetic or the unique factorization theorem.
Result 2.7.1 Standard form: In applications of this theorem, any positive integer n(> 1) can be expressed uniquely in a canonical factorization as
n = p1^α1 · p2^α2 ⋯ pr^αr; αi ≥ 0, for i = 1, 2, …, r,
where pi is the ith prime, with p1 < p2 < ⋯ < pr, and the αi are integers. If no αi in the canonical form of n is greater than 1, then the integer n is said to be square free. For example, n = 70 = 2·5·7 is a square free number, whereas 140 = 2^2·5·7 is not square free.
Result 2.7.2 Given two integers a and b greater than one, by the fundamental theorem of arithmetic we may write
a = p1^α1 · p2^α2 ⋯ pr^αr and b = p1^β1 · p2^β2 ⋯ pr^βr; αi, βi ≥ 0;
then
(a, b) = p1^min{α1,β1} · p2^min{α2,β2} ⋯ pr^min{αr,βr},
[a, b] = p1^max{α1,β1} · p2^max{α2,β2} ⋯ pr^max{αr,βr}.
For example, let a = 491891400 = 2^3·3^3·5^2·7^2·11·13^2 and b = 1138845708 = 2^2·3^2·7·11^2·13^3·17; then
(a, b) = 2^2·3^2·5^0·7^1·11^1·13^2·17^0 = 468468 and [a, b] = 2^3·3^3·5^2·7^2·11^2·13^3·17 = 1195787993400.
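The min/max rule can be verified on the worked example above; the sketch below (helper names are mine) rebuilds both numbers from their exponent tables and compares against math.gcd:

```python
from math import gcd

# exponent dictionaries transcribed from the factorizations above
a_fact = {2: 3, 3: 3, 5: 2, 7: 2, 11: 1, 13: 2}
b_fact = {2: 2, 3: 2, 7: 1, 11: 2, 13: 3, 17: 1}

def from_factors(f):
    # multiply out a {prime: exponent} dictionary
    n = 1
    for p, e in f.items():
        n *= p ** e
    return n

ps = set(a_fact) | set(b_fact)
g = from_factors({p: min(a_fact.get(p, 0), b_fact.get(p, 0)) for p in ps})
l = from_factors({p: max(a_fact.get(p, 0), b_fact.get(p, 0)) for p in ps})
```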
Theorem 2.7.16 (Euclid's Theorem): The number of primes is infinite; alternatively, there is no greatest prime.
Proof: If possible, let the number of primes be finite, say p1, p2, …, pn, arranged in ascending order of magnitude p1 < p2 < ⋯ < pn, so that pn is the greatest prime. Let
q = (p1·p2 ⋯ pn) + 1.


Here we see that q > 1, so q is divisible by some prime p. But p1, p2, …, pn are the only prime numbers, so p must be equal to one of p1, p2, …, pn. Now,
p|p1·p2 ⋯ pn and p|q
⇒ p|(q − p1p2 ⋯ pn) ⇒ p|1.
The only positive divisor of the integer 1 is 1 itself, and because p > 1, a contradiction arises. Put differently: if q is prime, we get a contradiction as q > pn; if q is composite, it has a prime factor, but none of the primes p1, p2, …, pn divides q (since 1 is the remainder in each case), so a prime factor of q must be greater than pn, which is again a contradiction. This shows that there is no greatest prime, i.e., the number of primes is infinite. Now,
(i) Every positive integer greater than one has a prime divisor (factor).
(ii) If n (an integer greater than one) is not a prime, then n has a prime factor not exceeding √n.
(iii) No rational algebraic formula can represent prime numbers only.
(iv) Consider the following k consecutive integers:
(k + 1)! + 2, (k + 1)! + 3, …, (k + 1)! + (k + 1).
Each of these numbers is a composite number, as
n|(k + 1)! + n, if 2 ≤ n ≤ k + 1.
Thus there are arbitrarily large gaps in the series of primes.
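Point (iv) can be checked directly; the short sketch below (helper names are mine) produces k consecutive composites starting at (k + 1)! + 2:

```python
from math import factorial

def first_composite_run(k):
    # the k consecutive integers (k+1)! + 2, ..., (k+1)! + (k+1), each composite
    start = factorial(k + 1) + 2
    return list(range(start, start + k))

def is_prime(n):
    # simple trial division, adequate for this small demonstration
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True

run = first_composite_run(6)  # 6 consecutive composites: 7! + 2, ..., 7! + 7
```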
Ex 2.7.10 If pn is the nth prime number, then pn ≤ 2^(2^(n−1)).
Solution: Clearly, the equality sign holds for n = 1. As the hypothesis of induction, we assume that the result holds for all integers up to k > 1. Euclid's argument shows that the expression p1·p2 ⋯ pk + 1 is divisible by at least one prime not among p1, …, pk. If there are several such prime divisors, then p(k+1) does not exceed the smallest of these, so that
p(k+1) ≤ p1·p2 ⋯ pk + 1 ≤ 2·2^2 ⋯ 2^(2^(k−1)) + 1 = 2^(1+2+2^2+⋯+2^(k−1)) + 1 = 2^(2^k − 1) + 1.
However, 1 ≤ 2^(2^k − 1) for all k, whence
p(k+1) ≤ 2^(2^k − 1) + 2^(2^k − 1) = 2^(2^k).
Thus the result is true for n = k + 1 if it is true for all n ≤ k. It follows that, for n ≥ 1, there are at least (n + 1) primes less than 2^(2^n).
Ex 2.7.11 Determine which of the following integers are prime: 287 and 271.
Solution: First we find all primes p such that p^2 ≤ 287. These primes are 2, 3, 5, 7, 11, 13. Now 7|287 (287 = 7·41), hence 287 is not a prime. The primes satisfying p^2 ≤ 271 are 2, 3, 5, 7, 11, 13. None of these divides 271, hence 271 is a prime.
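The procedure used in this example, trial division by every p with p^2 ≤ n, is the primality test suggested by remark (ii) after Euclid's theorem. A minimal sketch:

```python
from math import isqrt

def is_prime(n):
    # trial division by every candidate p with p*p <= n
    if n < 2:
        return False
    for p in range(2, isqrt(n) + 1):
        if n % p == 0:
            return False
    return True
```

Here is_prime(287) is False (7 divides it) while is_prime(271) is True.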
Theorem 2.7.17 If p be a positive prime and n a positive integer, then a^n is divisible by p iff a is divisible by p, where a is any integer greater than 1.
Proof: Since a is an integer greater than 1, by the fundamental theorem of arithmetic we get
a = p1·p2 ⋯ pk,
where p1, p2, …, pk are primes (not necessarily distinct). Now a is divisible by p iff p equals one of p1, p2, …, pk. Also, as a^n = p1^n·p2^n ⋯ pk^n, the prime p divides a^n iff p equals some pi; hence a^n is divisible by p iff a is divisible by p.
divisible by p.

2.8 Modular/Congruence System

C. F. Gauss introduced the remarkable concept of congruence, a notion that makes it a powerful technique for simplifying many problems concerning divisibility of integers.
Let m > 0 be a fixed integer. Then an integer a is said to be congruent to another integer b modulo m if m|(a − b), i.e., if m is a divisor of (a − b). Symbolically, this is expressed as
a ≡ b (mod m).   (2.11)
The number m is called the modulus of the congruence, and b is called a residue of a modulo m. In particular, a ≡ 0 (mod m) if and only if m|a. Hence
a ≡ b (mod m) if and only if a − b ≡ 0 (mod m).
For example,
(i) 15 ≡ 7 (mod 8), −2 ≡ 1 (mod 3), 5^2 ≡ 1 (mod 2).
(ii) n is even if and only if n ≡ 0 (mod 2).
(iii) n is odd if and only if n ≡ 1 (mod 2).
(iv) a ≡ b (mod 1) for all a, b ∈ Z; this case (m = 1) is not so useful or interesting. Therefore m is taken to be a positive integer greater than 1.
(v) Let a, b be integers and m a positive integer; then a ≡ b (mod m) if and only if a = km + b for some integer k.
When m ∤ (a − b), we say that a is incongruent to b modulo m, and in this case we write a ≢ b (mod m). For example, 2 ≢ 6 (mod 5), −3 ≢ 3 (mod 5).
Ex 2.8.1 Use the theory of congruences to prove that 7 | 2^(5n+3) + 5^(2n+3); n(≥ 1) ∈ N.
Solution: 2^(5n+3) + 5^(2n+3) can be written as 8·32^n + 125·25^n. Now, since 32 ≡ 25 (mod 7),
32^n − 25^n ≡ 0 (mod 7), for all n ≥ 1,
⇒ 8·32^n − 8·25^n ≡ 0 (mod 7), for all n ≥ 1.
Also, we have 133·25^n ≡ 0 (mod 7) for all n ≥ 1, as 7|133, and so
8·32^n + 125·25^n = (8·32^n − 8·25^n) + 133·25^n ≡ 0 (mod 7)
for all n ≥ 1. Consequently, 7 | 2^(5n+3) + 5^(2n+3); n(≥ 1) ∈ N.
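The congruence argument can be sanity-checked with modular exponentiation (Python's three-argument pow); this quick check is mine, not part of the text:

```python
# 2^(5n+3) + 5^(2n+3) is divisible by 7 for n = 1, ..., 49
ok = all((pow(2, 5 * n + 3, 7) + pow(5, 2 * n + 3, 7)) % 7 == 0
         for n in range(1, 50))
```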

2.8.1 Elementary Properties

Congruence is a statement about divisibility, seen from a slightly different and more convenient point of view. The congruence symbol may be viewed as a generalized form of the equality sign, in the sense that its behavior with respect to addition and multiplication is reminiscent of ordinary equality. Some elementary properties of equality that carry over to congruences appear below.
Property 2.8.1 If a ≡ b (mod m), then a ≡ b (mod n), when n|m, m, n > 0.
Proof: By definition, n|m ⇒ m = nk for some k ∈ Z. Given a ≡ b (mod m), so
m|(a − b) ⇒ a − b = ml for some l ∈ Z
⇒ a − b = nkl = nr; r = kl ∈ Z
⇒ n|(a − b) ⇒ a ≡ b (mod n).


Property 2.8.2 a ≡ a (mod m), for any m > 0, and a ≡ 0 (mod m) if m|a.
Property 2.8.3 The relation "congruence modulo m", defined by a ≡ b (mod m) if m|(a − b), is an equivalence relation on the set of integers.
Proof: If m(> 0) be a fixed positive integer, then we define a relation ρ for any two elements a, b ∈ Z such that
aρb ⇔ a ≡ b (mod m).
We are to show that this relation is an equivalence relation.
Reflexivity: Let a be any integer; then we have a − a = 0 and m|0 for any m(> 0) ∈ Z. Thus it follows that
m|(a − a); a ∈ Z ⇒ a ≡ a (mod m), for all a ∈ Z
⇒ aρa; a ∈ Z.
Thus the relation ρ is reflexive.
Symmetry: Let a, b ∈ Z be such that aρb. Then,
aρb ⇒ a ≡ b (mod m) ⇒ m|(a − b) ⇒ m|(−1)(a − b)
⇒ m|(b − a) ⇒ b ≡ a (mod m).
Hence aρb ⇒ bρa; a, b ∈ Z. Therefore the relation ρ is symmetric.
Transitivity: Let a, b, c ∈ Z be such that aρb and bρc. Now,
aρb, bρc ⇒ a ≡ b (mod m), b ≡ c (mod m)
⇒ m|(a − b) and m|(b − c)
⇒ m|[(a − b) + (b − c)]
⇒ m|(a − c) ⇒ a ≡ c (mod m) ⇒ aρc.
So the relation ρ is transitive. The relation ρ, being reflexive, symmetric and transitive, is an equivalence relation. Thus congruence is an equivalence relation on Z.
Result 2.8.1 Hence the equivalence relation will partition Z into equivalence classes or residue classes modulo m. The number of these classes is m. They are denoted as
[a] = the class of all integers congruent to a (mod m).
Hence [0] = [m] = [2m] = ⋯ and [a] = [a + m] = [a + 2m] = ⋯. So the residue classes modulo 5 are
[0] = {…, −10, −5, 0, 5, 10, …}; [1] = {…, −9, −4, 1, 6, 11, …};
[2] = {…, −8, −3, 2, 7, 12, …}; [3] = {…, −7, −2, 3, 8, 13, …};
[4] = {…, −6, −1, 4, 9, 14, …}.
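This partition is exactly what Python's % operator computes (it always returns a nonnegative remainder for a positive modulus). A sketch of mine reproducing the classes modulo 5 over a small window of integers:

```python
def residue_classes(m, lo=-10, hi=15):
    # partition the integers in [lo, hi) into residue classes modulo m
    classes = {r: [] for r in range(m)}
    for n in range(lo, hi):
        classes[n % m].append(n)
    return classes

classes = residue_classes(5)
```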
Property 2.8.4 Two congruences with the same modulus can be added, subtracted, or multiplied, member by member, as if they were equations. Therefore, if a ≡ b (mod m) and c ≡ d (mod m), then a + c ≡ b + d (mod m), a − c ≡ b − d (mod m) and ac ≡ bd (mod m).
Proof: Since a ≡ b (mod m), c ≡ d (mod m), we may write a = mq + b and c = ms + d, for some choice of q, s ∈ Z. Hence, adding these equations, we obtain
a + c = m(q + s) + (b + d)
⇒ (a + c) − (b + d) = m(q + s).
Since q, s ∈ Z, so q + s ∈ Z, and as a congruence statement a + c ≡ b + d (mod m). The converse is not always true. For example, let a = 10, c = 5, b = 1, d = 2 and m = 4. Then (10 + 5) ≡ (1 + 2) (mod 4), but 10 ≢ 1 (mod 4) and 5 ≢ 2 (mod 4). Again,
ac = (mq + b)(ms + d) = m(bs + qd + qsm) + bd.
Since b, s, q, m, d ∈ Z, bs + qd + qsm ∈ Z says that ac − bd is divisible by m, whence ac ≡ bd (mod m). The converse is not always true. For example, 10·5 ≡ 1·2 (mod 4), but 10 ≢ 1 (mod 4) and 5 ≢ 2 (mod 4). In general, if a1 ≡ b1 (mod m), a2 ≡ b2 (mod m), …, an ≡ bn (mod m), then
a1·a2 ⋯ an ≡ b1·b2 ⋯ bn (mod m).
Property 2.8.5 If a ≡ b (mod m), then for every x ∈ Z, a + x ≡ b + x (mod m), a − x ≡ b − x (mod m) and ax ≡ bx (mod m).
Proof: As a ≡ b (mod m), ∃ λ ∈ Z such that
a − b = mλ ⇒ (a + x) − (b + x) = mλ
or, (a + x) ≡ (b + x) (mod m).
Also, as a − b = mλ with λ ∈ Z, m|(a − b), and so
m|(ax − bx) for x ∈ Z ⇒ ax ≡ bx (mod m).
The converse of the result a ≡ b (mod m) ⇒ ax ≡ bx (mod m) is not always true. For example,
2·4 ≡ 2·1 (mod 6), whereas 4 ≢ 1 (mod 6).
Thus we conclude that one cannot unrestrictedly cancel a common factor in the arithmetic of congruences. Cancellation does succeed when the factor is prime to the modulus; for example,
3·(−2) ≡ 3·14 (mod 8) ⇒ −2 ≡ 14 (mod 8).
Cancellation is allowed, however, only in a restricted sense, which is made precise in the following theorem.
Property 2.8.6 If a ≡ b (mod m) and d|m, m > 0, then a ≡ b (mod d).
Proof: Given that a ≡ b (mod m) and d|m, m > 0. This implies that there are two integers x and y such that
(a − b) = xm and m = yd.
Now (a − b) = xyd. So
(a − b) = zd, where xy = z.
Hence a ≡ b (mod d). The converse of the result is not always true. For example, let a = 5, b = 2 and m = 3; then 5 ≡ 2 (mod 3). Again 3|6, but 5 ≢ 2 (mod 6).
Property 2.8.7 A common factor which is relatively prime to the modulus can always be cancelled. More generally, if m be a positive integer and a, x, y be integers, then

ax ≡ ay (mod m) iff x ≡ y (mod m/d); where d = (a, m).


Proof: Let d = (a, m) ≠ 0, as m > 0; then by definition d|a, d|m, so that ∃ k, l ∈ Z such that a = kd, m = ld, where k and l are prime to each other. Since ax ≡ ay (mod m), i.e., m|(ax − ay), ∃ q ∈ Z such that ax = mq + ay. Now,
kdx = ldq + kdy ⇒ kx = lq + ky; d ≠ 0
⇒ k(x − y) = lq ⇒ l|k(x − y).
Since k and l are prime to each other, Euclid's lemma yields l|(x − y), which may be recast as x ≡ y (mod l), i.e.,
x ≡ y (mod m/d), i.e., x ≡ y (mod m/(a, m)).
Thus, a common factor a can be cancelled provided the modulus is divided by d = (a, m).
Conversely, let x ≡ y (mod m/d); then x − y = t(m/d) for some integer t. Hence,
a(x − y) = t(m/d)·a = t(m/d)·kd = m(tk)
⇒ ax ≡ ay (mod m).
This theorem gets its maximum force when the requirement that (a, m) = 1 is added, for
then the cancellation may be accomplished without a change in modulus. From this theorem
we have,
(i) If ax ≡ ay (mod m) and (a, m) = 1, then x ≡ y (mod m).
(ii) x ≡ y (mod mi), i = 1, 2, ..., r, if and only if x ≡ y (mod [m1, m2, ..., mr]).
(iii) If ax ≡ ay (mod m) and a|m, then x ≡ y (mod m/a). For example, 5·7 ≡ 5·10 (mod 15); as 5|15, we get 7 ≡ 10 (mod 3).
(iv) When ax ≡ 0 (mod m), with m a prime, then either a ≡ 0 (mod m) or x ≡ 0 (mod m).
(v) If ax ≡ ay (mod m) and m ∤ a, where m is a prime number, then x ≡ y (mod m).
(vi) It is unnecessary to stipulate that a ≢ 0 (mod m). Indeed, if a ≡ 0 (mod m), then (a, m) = m and in this case x ≡ y (mod 1), for all integers x and y.
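The cancellation rule of Property 2.8.7 can be checked numerically. The following Python sketch (not from the text; the helper name `congruent` is ours) verifies it on the example 2·4 ≡ 2·1 (mod 6) used above:

```python
# Check Property 2.8.7: ax ≡ ay (mod m) iff x ≡ y (mod m/d), d = gcd(a, m).
from math import gcd

def congruent(u, v, m):
    """True iff u ≡ v (mod m)."""
    return (u - v) % m == 0

# Example from the text: 2·4 ≡ 2·1 (mod 6) although 4 ≢ 1 (mod 6);
# cancelling 2 is only valid modulo 6/gcd(2, 6) = 3, and indeed 4 ≡ 1 (mod 3).
a, x, y, m = 2, 4, 1, 6
d = gcd(a, m)
assert congruent(a * x, a * y, m)      # 8 ≡ 2 (mod 6)
assert not congruent(x, y, m)          # but 4 ≢ 1 (mod 6)
assert congruent(x, y, m // d)         # while 4 ≡ 1 (mod 3)
```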
Property 2.8.8 Let a, b, c, d be integers and m a positive integer. If a ≡ b (mod m) and c ≡ d (mod m), then ax + cy ≡ (bx + dy) (mod m), for all integers x and y.
Proof: Since a ≡ b (mod m) and c ≡ d (mod m), we have m|(a − b) and m|(c − d), i.e., ∃ λ, μ ∈ Z such that a − b = λm and c − d = μm. For integers x, y we have
(ax + cy) − (bx + dy) = x(a − b) + y(c − d)
= λmx + μmy
= m(λx + μy).
Since λ, μ, x, y ∈ Z, so λx + μy ∈ Z, and we get ax + cy ≡ bx + dy (mod m).
Property 2.8.9 For arbitrary integers a and b, a ≡ b (mod m) iff a and b leave the same nonnegative principal remainder on division by m.
Proof: Let a = λ1 m + r and b = λ2 m + r, where r is the common principal remainder when a, b are divided by m, with λ1, λ2 ∈ Z and 0 ≤ r < m. Therefore,
a − b = (λ1 − λ2)m = λm, where λ = λ1 − λ2 ∈ Z,
⇒ m|(a − b), i.e., a ≡ b (mod m).

Modular/Congruence System

115

Conversely, let a ≡ b (mod m) and let a, b leave the remainders r1 and r2 respectively when divided by m. Hence,
a = λ1 m + r1; 0 ≤ r1 < m and λ1 ∈ Z,
b = λ2 m + r2; 0 ≤ r2 < m and λ2 ∈ Z,
⇒ a − b = m(λ1 − λ2) + r1 − r2
⇒ r1 − r2 = (a − b) + m(λ2 − λ1).
As a ≡ b (mod m), m|(a − b). Also m|m(λ2 − λ1); therefore,
m|[(a − b) + m(λ2 − λ1)] ⇒ m|(r1 − r2)
⇒ r1 − r2 = 0, since 0 ≤ |r1 − r2| < m,
⇒ r1 = r2.
Thus congruent numbers also have the same gcd with m. This theorem provides a useful characterization of congruence modulo m in terms of remainders upon division by m. For example, let m = 7. Since 23 ≡ 2 (mod 7) and −12 ≡ 2 (mod 7), i.e., 23 and −12 leave the same remainder upon division by 7, so 23 ≡ −12 (mod 7).
Property 2.8.10 If a ≡ b (mod m), then a^n ≡ b^n (mod m), where n is a positive integer.
Proof: For n = 1, the theorem is certainly true. We assume that the theorem is true for some positive integer k, so that a^k ≡ b^k (mod m). Also a ≡ b (mod m). These two relations together imply that
a·a^k ≡ b·b^k (mod m) ⇒ a^(k+1) ≡ b^(k+1) (mod m),
so that the theorem is seen to be true for the positive integer k + 1 if it is true for n = k. Hence the theorem is true for any positive integer n. The converse of the theorem is not true; for example, 5^2 ≡ 4^2 (mod 3) but 5 is not congruent to 4 (mod 3). Applications of powers are given below.
Ex 2.8.2 Prove that 19^20 ≡ 1 (mod 181).
Solution: We have 19^2 = 361 ≡ −1 (mod 181). Therefore,
19^20 ≡ (−1)^10 (mod 181) ≡ 1 (mod 181).
Ex 2.8.3 What is the remainder when 7^30 is divided by 4?
Solution: Let r be the remainder when 7^30 is divided by 4. Hence by definition, 7^30 − r is divisible by 4, where 0 ≤ r < 4, and so 7^30 ≡ r (mod 4). Now,
7 ≡ 3 (mod 4) ⇒ 7^2 ≡ 3^2 (mod 4).
But 3^2 ≡ 1 (mod 4), which implies that (7^2)^15 ≡ 1^15 (mod 4), i.e., 7^30 ≡ 1 (mod 4). Hence the remainder is 1.
Ex 2.8.4 Let f(x) = a0 + a1 x + ⋯ + an−1 x^(n−1) + an x^n be a polynomial whose coefficients ai are integers. If a ≡ b (mod m), then f(a) ≡ f(b) (mod m).
Solution: We have a ≡ b (mod m), so a^i ≡ b^i (mod m) for every positive integer i. Hence,
ai a^i ≡ ai b^i (mod m), where ai ∈ Z.


Putting i = 0, 1, 2, ..., n respectively and adding the congruences, we get
(a0 + a1 a + a2 a^2 + ⋯ + an a^n) ≡ (a0 + a1 b + a2 b^2 + ⋯ + an b^n) (mod m)
⇒ f(a) ≡ f(b) (mod m).
If f(x, y, z) be a polynomial in x, y, z with integral coefficients and x ≡ x′ (mod m), y ≡ y′ (mod m), z ≡ z′ (mod m), then
f(x, y, z) ≡ f(x′, y′, z′) (mod m).
Deduction 2.8.1 Let n = ak 10^k + ak−1 10^(k−1) + ⋯ + a2 10^2 + a1 10 + a0, where the ai are integers and 0 ≤ ai ≤ 9; i = 0, 1, ..., k, be the decimal representation of a positive integer n. Let S = a0 + a1 + ⋯ + ak and T = a0 − a1 + ⋯ + (−1)^k ak. Then
(i) n is divisible by 2 if and only if a0 is divisible by 2;
(ii) n is divisible by 9 if and only if S is divisible by 9;
(iii) n is divisible by 11 if and only if T is divisible by 11.
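Deduction 2.8.1 translates directly into code. The short Python sketch below (the function names mirror the S and T of the deduction but are ours, not the book's) tests the divisibility claims on the integers of Ex 2.8.7 and Ex 2.8.8:

```python
def digits(n):
    """Decimal digits of n, least significant first: a0, a1, ..., ak."""
    return [int(c) for c in str(n)][::-1]

def S(n):
    """Digit sum a0 + a1 + ... + ak (tests divisibility by 3 and 9)."""
    return sum(digits(n))

def T(n):
    """Alternating sum a0 - a1 + a2 - ... (tests divisibility by 11)."""
    return sum(d if i % 2 == 0 else -d for i, d in enumerate(digits(n)))

assert T(23456785) % 11 == 0 and 23456785 % 11 == 0   # Ex 2.8.7: divisible by 11
assert S(205769) % 3 != 0 and 205769 % 3 != 0          # Ex 2.8.8: not divisible by 3
```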
Ex 2.8.5 Show that an integer N is divisible by 3 if and only if the sum of the digits of N is divisible by 3.
Solution: The number N can be written as
N = am 10^m + am−1 10^(m−1) + ⋯ + a1 10 + a0.
Let f(x) = a0 + a1 x + ⋯ + am−1 x^(m−1) + am x^m,
so f(1) = am + am−1 + ⋯ + a1 + a0.
Therefore f(10) = N and f(1) = sum of the digits of N = S (say). Now,
10 ≡ 1 (mod 3) ⇒ f(10) ≡ f(1) (mod 3)
⇒ N ≡ S (mod 3) ⇒ 3|(N − S).
Thus 3|N if and only if 3|S, i.e., the integer N is divisible by 3 if and only if the sum of the digits of N is divisible by 3.
Ex 2.8.6 N is divisible by 5 if and only if the last digit is either 0 or 5.
Solution: Taking the last digit as a0, the number N can be written as
N = am 10^m + am−1 10^(m−1) + ⋯ + a1 10 + a0.
Let f(x) = a0 + a1 x + ⋯ + am−1 x^(m−1) + am x^m;
then f(10) = N and f(0) = a0. We have
10 ≡ 0 (mod 5) ⇒ f(10) ≡ f(0) (mod 5)
⇒ N ≡ a0 (mod 5) ⇒ 5|(N − a0).
Since 5|(N − a0), we have 5|N if and only if 5|a0. Also,
5|a0 ⇒ either a0 = 0 or a0 = 5.
Hence, 5|N if and only if the last digit of N is either 0 or 5.
Ex 2.8.7 Show that the integer 23456785 is divisible by 11.


Solution: Let N = 23456785; then N can be written as
N = 23456785 = 23·(1000)^2 + 456·1000 + 785.
Let f(x) = 23x^2 + 456x + 785; then f(1000) = N and f(−1) = 23 − 456 + 785 = 352. Now,
1000 ≡ −1 (mod 11) ⇒ f(1000) ≡ f(−1) (mod 11)
⇒ N ≡ 352 (mod 11) ⇒ 11|(N − 352).
Now 11|N if 11|352, and this is true (352 = 11·32). Therefore, 11|N.
Ex 2.8.8 Show that the integer 205769 is not divisible by 3.
Solution: Let N = 205769; then N can be written as
N = 205769 = 20·(100)^2 + 57·100 + 69.
Let f(x) = 20x^2 + 57x + 69; then f(100) = N and f(1) = 20 + 57 + 69 = 146. Now,
100 ≡ 1 (mod 3) ⇒ f(100) ≡ f(1) (mod 3)
⇒ N ≡ 146 (mod 3) ⇒ 3|(N − 146).
Now 3|N if and only if 3|146, and this is not true. Therefore, the integer 205769 is not divisible by 3.
Property 2.8.11 If a ≡ b (mod m), d|a, d|b, and the integers a, b, m are such that (d, m) = 1, then a/d ≡ b/d (mod m), d > 0.
Proof: Since d|a, d|b, ∃ a1, b1 ∈ Z such that a = da1, b = db1. Now,
a ≡ b (mod m) ⇒ m|(a − b)
⇒ m|d(a1 − b1); d(> 0) ∈ Z
⇒ m|(a1 − b1), as (m, d) = 1
⇒ a1 ≡ b1 (mod m) ⇒ a/d ≡ b/d (mod m).
If a ≡ b (mod m) and a ≡ b (mod n), where (m, n) = 1, then a ≡ b (mod mn). For example, 8·7 ≡ 2·7 (mod 6); since (7, 6) = 1, we get 8 ≡ 2 (mod 6). This is known as the restricted cancellation law.
Property 2.8.12 If a ≡ b (mod m) and if 0 ≤ |a − b| < m, then a = b.
Proof: Since m|(a − b), we have m ≤ |a − b| unless a − b = 0; hence a − b = 0, i.e., a = b.

2.8.2 Complete Set of Residues

Consider a fixed modulus m > 0. Given an integer a, let q and r be its quotient and remainder upon division by m, so that
a = mq + r; 0 ≤ r < m.
Then a − r = mq ⇒ a ≡ r (mod m).
r is called the least residue of a modulo m. For example, 5 ≡ 1 (mod 4), so 1 is the least residue of 5 modulo 4. Because there are m choices for r, we see that every integer is congruent modulo m to exactly one of the values 0, 1, 2, ..., m − 1. In particular, a ≡ 0 (mod m) if and only


if m|a. The set of integers 0, 1, 2, ..., m − 1 is called the set of least non-negative residues modulo m.
For example, let a = 8 and m = 3; then
8 = 3·2 + 2 ⇒ 8 − 2 = 3·2 ⇒ 8 ≡ 2 (mod 3).
Then 2 is the least residue of 8 modulo 3. We consider S = {0, 1, 2} and let a = 7 be any integer. Then 7 is congruent modulo 3 to exactly one member of S, namely 1. If a = 32, then 32 is congruent modulo 3 only to 2 ∈ S. Thus if a is any integer, it must be congruent modulo 3 to exactly one member of S. S = {0, 1, 2} is called the set of least non-negative residues modulo 3.
The following are the important properties of residue classes (writing [a] for the residue class of a modulo m):
(i) [a] = [b] if and only if a ≡ b (mod m).
(ii) Two integers a and b are in the same residue class if and only if a ≡ b (mod m).
(iii) The m residue classes [0], [1], ..., [m − 1] are disjoint and their union is the set of all integers.
(iv) Any two integers in a residue class are congruent modulo m, and any two integers belonging to two different residue classes are incongruent modulo m.
A set S = {r1, r2, ..., rm} of m integers is called a complete residue system modulo m if
ri ≢ rj (mod m) for 1 ≤ i < j ≤ m.
For example,
(i) Let m = 11, S = {0, 1, 2, ..., 10}; then ri ≢ rj (mod m) for i, j = 0, 1, ..., 10, i ≠ j, so S forms a complete system of residues.
(ii) Let m = 5; then S = {−12, −15, 82, −1, 31} forms a complete system of residues modulo 5.
The complete residue system of a given modulus is not unique. For example, S1 = {0, 1, ..., 8} and S2 = {5, 6, ..., 13} are two different complete residue systems modulo 9.
If r1, r2, ..., rm be a complete set of residues modulo m and (a, m) = 1, then ar1, ar2, ..., arm is also a complete set of residues modulo m. For if ari ≡ arj (mod m), 1 ≤ i < j ≤ m, then m|a(ri − rj); now
(a, m) = 1 ⇒ m|(ri − rj) ⇒ ri ≡ rj (mod m),
a contradiction; hence ari ≢ arj (mod m).
A reduced residue system modulo m is a set of integers ri such that (ri, m) = 1, ri ≢ rj (mod m) if i ≠ j, and such that every x prime to m is congruent modulo m to some integer ri of the set.
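The defining property — m integers, pairwise incongruent modulo m — is easy to test by reducing every member modulo m. A Python sketch (the function name is ours), using the examples from the text:

```python
def is_complete_residue_system(S, m):
    """True iff S consists of m integers that are pairwise incongruent mod m."""
    return len(S) == m and len({s % m for s in S}) == m

assert is_complete_residue_system([0, 1, 2, 3, 4], 5)
assert is_complete_residue_system([-12, -15, 82, -1, 31], 5)   # example (ii) above
assert not is_complete_residue_system([0, 1, 2, 3, 8], 5)      # 8 ≡ 3 (mod 5)
# Multiplying by a with (a, m) = 1 preserves the property:
assert is_complete_residue_system([3 * s for s in [-12, -15, 82, -1, 31]], 5)
```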
Property 2.8.13 If (a1, a2, ..., an) be a complete system of residues modulo m and b1, b2, ..., bn any set of integers such that
bi ≡ ai (mod m); i = 1, 2, ..., n,
then (b1, b2, ..., bn) is also a complete system (mod m).


Proof: Let ri be the least residue of ai modulo m, for i = 1, 2, ..., n; then ai ≡ ri (mod m). Again, given bi ≡ ai (mod m), the ri are also the least residues of the bi (mod m).
Since (a1, a2, ..., an) is a complete system, we claim ri ≢ rj (mod m) for i ≠ j. For if ri ≡ rj (mod m), then ri − rj is divisible by m, i.e.,
ri − rj = mk, for some k ∈ Z.
Now, the relation ai ≡ ri (mod m) shows that
ai − ri = mt1 and aj − rj = mt2, for some t1, t2 ∈ Z,
⇒ (ai − aj) − (ri − rj) = m(t1 − t2)
⇒ (ai − aj) = m(t1 − t2) + mk = mk1; k1 = t1 − t2 + k ∈ Z
⇒ ai ≡ aj (mod m),
which is a contradiction. Hence ri ≢ rj (mod m) for every i and j with i ≠ j. Therefore, the relation bi ≡ ai (mod m) shows that
bi ≢ bj (mod m), ∀ i and j with i ≠ j,
and hence (b1, b2, ..., bn) forms a complete residue system (mod m). Therefore, if a1, a2, ..., an be a complete system of residues modulo m, then
(i) k + a1, k + a2, ..., k + an,
(ii) ka1, ka2, ..., kan, (k, m) = 1,
also form a complete system of residues modulo m.
Property 2.8.14 A set of m integers which are in arithmetic progression with common difference d, (d, m) = 1, forms a complete residue system modulo m.
Proof: Let us consider the A.P. of m terms with common difference d as
a, a + d, a + 2d, ..., a + (m − 1)d; (d, m) = 1.
Suppose a + id ≡ a + jd (mod m) for some i, j = 0, 1, ..., m − 1 with i ≠ j. Then
id ≡ jd (mod m) ⇒ i ≡ j (mod m), since (d, m) = 1,
which is impossible, as 0 < |i − j| < m. Therefore,
a + id ≢ a + jd (mod m),
and the m integers form a complete residue system modulo m.
Ex 2.8.9 Find the remainder when 2^73 + 14^3 is divided by 11.
Solution: Here,
2 ≡ 2 (mod 11), 2^4 ≡ 5 (mod 11), 2^8 ≡ 3 (mod 11)
⇒ 2^10 ≡ 3·2^2 (mod 11) ≡ 1 (mod 11)
⇒ 2^70 ≡ 1 (mod 11) ⇒ 2^73 ≡ 8 (mod 11).
Again, 14^3 = (11 + 3)^3 ≡ 3^3 (mod 11) ≡ 5 (mod 11). Therefore,
2^73 + 14^3 ≡ 8 + 5 (mod 11) ≡ 2 (mod 11).
Hence the remainder is 2.
Ex 2.8.10 Find the least positive residue of 2^44 (mod 89).


Solution: We know 2^6 = 64 ≡ −25 (mod 89), i.e.,
2^6 ≡ −5^2 (mod 89) ⇒ (2^6)^2 ≡ (5^2)^2 (mod 89)
or, 2^12 ≡ 625 − 7·89 (mod 89) ⇒ 2^12 ≡ 2 (mod 89)
⇒ 2^11 ≡ 1 (mod 89), as (2, 89) = 1,
or, (2^11)^4 ≡ 1 (mod 89) ⇒ 2^44 ≡ 1 (mod 89).
The least positive residue is 1; therefore, 89|(2^44 − 1).
Ex 2.8.11 What is the remainder when 1^5 + 2^5 + ⋯ + 100^5 is divided by 4?
Solution: We have the following results:
1 ≡ 1 (mod 4) ⇒ 1^5 ≡ 1 (mod 4)
3 ≡ −1 (mod 4) ⇒ 3^5 ≡ −1 (mod 4)
5 ≡ 1 (mod 4) ⇒ 5^5 ≡ 1 (mod 4)
...
99 ≡ −1 (mod 4) ⇒ 99^5 ≡ −1 (mod 4).
Adding the above 50 congruence relations, we get
1^5 + 3^5 + ⋯ + 99^5 ≡ [1 − 1 + ⋯ + 1 − 1] (mod 4) ≡ 0 (mod 4).
Again, we have the following results:
2 ≡ 2 (mod 4) ⇒ 2^5 ≡ 32 ≡ 0 (mod 4)
4 ≡ 0 (mod 4) ⇒ 4^5 ≡ 0 (mod 4)
6 ≡ 2 (mod 4) ⇒ 6^5 ≡ 0 (mod 4)
...
100 ≡ 0 (mod 4) ⇒ 100^5 ≡ 0 (mod 4).
Adding the above 50 congruence relations, we get
2^5 + 4^5 + ⋯ + 100^5 ≡ 0 (mod 4).
Adding the results, we get
1^5 + 2^5 + 3^5 + 4^5 + ⋯ + 100^5 ≡ 0 (mod 4).
Thus the given expression is completely divisible by 4, and hence the remainder is zero when it is divided by 4.
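The conclusion of Ex 2.8.11 can be confirmed directly with Python's three-argument pow, which reduces modulo 4 at every step instead of forming the huge powers:

```python
# Sum of k^5 mod 4 over k = 1, ..., 100; pow(k, 5, 4) computes k^5 mod 4.
total = sum(pow(k, 5, 4) for k in range(1, 101)) % 4
assert total == 0   # so 1^5 + 2^5 + ... + 100^5 ≡ 0 (mod 4)
```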
Ex 2.8.12 Show that 3·5^(2n+1) + 2^(3n+1) ≡ 0 (mod 17) for all n ≥ 1.
Solution: The expression 3·5^(2n+1) + 2^(3n+1) can be written as
3·5^(2n+1) + 2^(3n+1) = 15·5^(2n) + 2·2^(3n).
We have the following results:
25 ≡ 8 (mod 17) ⇒ 5^2 ≡ 8 (mod 17)
or, (5^2)^n ≡ 8^n (mod 17) ⇒ 15·5^(2n) ≡ 15·8^n (mod 17).
Also, 2^(3n) = 8^n ⇒ 2·2^(3n) ≡ 2·8^n (mod 17).


Adding the two results, we get
15·5^(2n) + 2·2^(3n) ≡ 8^n(15 + 2) (mod 17) ≡ 0 (mod 17),
or, 3·5^(2n+1) + 2^(3n+1) ≡ 0 (mod 17).
Ex 2.8.13 For any natural number n, show that (2n + 1)^2 ≡ 1 (mod 8).
Solution: Here (2n + 1)^2 can be written as
(2n + 1)^2 = 4n^2 + 4n + 1 = 4n(n + 1) + 1.
Now, as n ∈ Z, n and (n + 1) are two consecutive integers, so 2|n(n + 1), so that ∃ k ∈ Z such that n(n + 1) = 2k. Therefore,
(2n + 1)^2 = 4n(n + 1) + 1 = 8k + 1
⇒ (2n + 1)^2 − 1 = 8k, where k ∈ Z
⇒ (2n + 1)^2 ≡ 1 (mod 8).
Ex 2.8.14 Prove that 1! + 2! + ⋯ + 1000! ≡ 3 (mod 15).
Solution: For every integer n ≥ 0 we have (5 + n)! ≡ 0 (mod 15), since (5 + n)! contains both the factors 3 and 5. Now,
1! + 2! + 3! + 4! = 33, so that 1! + 2! + 3! + 4! ≡ 3 (mod 15). Hence
1! + 2! + ⋯ + 1000! ≡ 3 (mod 15).
Hence the remainder is 3 when 1! + 2! + ⋯ + 1000! is divided by 15.
Ex 2.8.15 Find all natural numbers n ≤ 100 such that n ≡ 0 (mod 7).
Solution: Here 100 = 14·7 + 2. Hence the required natural numbers n ≤ 100 such that n ≡ 0 (mod 7) are 7, 14, 21, ..., 98.

2.8.3 Reduced Residue System

By a reduced residue system modulo m we mean any set of φ(m) integers, incongruent modulo m, each of which is relatively prime to m, where φ(m) is Euler's phi function. A reduced set of residues modulo m can be obtained by deleting from a complete set of residues modulo m those members that are not relatively prime to m. A reduced set of residues, therefore, consists of the numbers of a complete system which are relatively prime to the modulus. For example, in the modulo 8 system, a complete set of residues is {0, 1, 2, ..., 7}; its reduced system is {1, 3, 5, 7}.
Ex 2.8.16 Find the reduced residue system of m = 40.
Solution: We note that 40 = 5·2^3, and suitable reduced residue systems modulo 5 and modulo 2^3 respectively are 1, 9, 17, 33 and 1, 11, 21, 31. The products, reduced modulo 40, can be arranged in the following table:

        1     9    17    33
  1     1     9    17    33
 11    11    19    27     3
 21    21    29    37    13
 31    31    39     7    23

(for instance, 11·9 ≡ 19, 31·17 ≡ 7 and 21·33 ≡ 13 (mod 40)). Therefore, the reduced residue system is {1, 3, 7, 9, 11, 13, 17, 19, 21, 23, 27, 29, 31, 33, 37, 39} modulo 40.
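The same system can be produced mechanically by deleting from {0, 1, ..., 39} the members not relatively prime to 40, exactly as the definition above prescribes. A Python sketch (the function name is ours):

```python
from math import gcd

def reduced_residues(m):
    """Members of the complete residue system {0, ..., m-1} coprime to m."""
    return [r for r in range(m) if gcd(r, m) == 1]

rrs = reduced_residues(40)
assert rrs == [1, 3, 7, 9, 11, 13, 17, 19, 21, 23, 27, 29, 31, 33, 37, 39]
assert len(rrs) == 16   # phi(40) = phi(8) * phi(5) = 4 * 4 = 16
```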


Ex 2.8.17 Find the least positive residue of 3^36 (mod 77).
Solution: We know 3^4 = 81 ≡ 4 (mod 77); therefore,
3^12 ≡ 4^3 (mod 77) ≡ −13 (mod 77)
3^24 ≡ 169 (mod 77) ≡ 15 (mod 77)
3^36 ≡ 15·(−13) (mod 77) ≡ 36 (mod 77).
Hence the least positive residue is 36.

2.8.4 Linear Congruences

Let a, b be two integers and m a positive integer. An equation in an unknown x of the form ax ≡ b (mod m) is called a linear congruence, and by a solution of such an equation we mean an integer x0 for which ax0 ≡ b (mod m). By definition,
ax0 ≡ b (mod m) ⇔ ax0 − b = mk, for some k ∈ Z
⇔ m|(ax0 − b).
In finding the solution we observe that the set of non-negative integers S = {0, 1, 2, ..., m − 1} forms a complete residue system modulo m. If the congruence is solvable, at least one member of S satisfies the linear congruence ax ≡ b (mod m). Thus, the system has either a single solution or more than one solution which are incongruent to each other mod m. The problem of finding all integers that satisfy the linear congruence ax ≡ b (mod m) is identical with that of obtaining all solutions of the linear Diophantine equation ax − my = b. For example,
(i) Consider 4x ≡ 3 (mod 5) and S = {0, 1, 2, 3, 4}. We see that only 2 ∈ S satisfies the linear congruence, so 2 is the only solution. Also we observe that (a, m) = (4, 5) = 1.
(ii) Consider 6x ≡ 3 (mod 9) and S = {0, 1, 2, ..., 8}. Note that 2, 5, 8 ∈ S satisfy the linear congruence.
Thus, this linear congruence has more than one solution. Also,
2 ≢ 5 (mod 9), 5 ≢ 8 (mod 9), 2 ≢ 8 (mod 9).
Hence the solutions are incongruent to each other. Here we observe that (a, m) = (6, 9) ≠ 1, i.e., a and m are not prime to each other. Hence, x ≡ 2, 5, 8 (mod 9).
Note: Now, x = 2 and x = 8 both satisfy the linear congruence 2x ≡ 4 (mod 6); as 2 ≡ 8 (mod 6), they are not counted as different solutions. Therefore, when we speak of the number of solutions of a congruence ax ≡ b (mod m), we shall mean the number of incongruent integers satisfying this congruence.
Theorem 2.8.1 If x1 be a solution of the linear congruence ax ≡ b (mod m) and if x1 ≡ x2 (mod m), then x2 is also a solution of the congruence.
Proof: Given that x1 is a solution of the linear congruence ax ≡ b (mod m); therefore, ax1 ≡ b (mod m). Now,
x2 ≡ x1 (mod m) ⇒ ax2 ≡ ax1 (mod m)
⇒ ax2 ≡ b (mod m).


Thus, x2 is a solution of the linear congruence ax ≡ b (mod m). From this theorem we have: if x1 be a solution of the linear congruence ax ≡ b (mod m), then
x1 + λm; λ = 0, ±1, ±2, ...
is also a solution. All these solutions belong to one residue class modulo m and are not counted as different solutions.
Theorem 2.8.2 Let a, b and m be integers with m > 0 and (a, m) = 1. Then the congruence ax ≡ b (mod m) has a unique solution.
Proof: Since (a, m) = 1, ∃ u, v ∈ Z such that au + mv = 1, and so
a(bu) + m(bv) = b ⇒ a(bu) ≡ b (mod m).
This shows that x = bu is a solution of the linear congruence ax ≡ b (mod m). Let x1, x2 be solutions of the linear congruence ax ≡ b (mod m); then ax1 ≡ b (mod m) and ax2 ≡ b (mod m). This gives
ax1 ≡ ax2 (mod m) ⇒ x1 ≡ x2 (mod m), as (a, m) = 1.
This proves that the congruence has a unique solution. The solutions are written in the form x = bu + λm; λ = 0, ±1, ±2, ..., and they all belong to one and only one residue class modulo m.
Result 2.8.2 In finding the solution when (a, m) = 1, let a/m be expressed as a simple continued fraction with an even number of quotients, and let y0/x0 be the last convergent but one. Then ax0 − my0 = 1, so that
ax0 ≡ 1 (mod m) ⇒ abx0 ≡ b (mod m)
⇒ bx0 is the required solution of ax ≡ b (mod m), i.e., x ≡ bx0 (mod m).
Ex 2.8.18 Solve the linear congruence 5x ≡ 3 (mod 11).
Solution: Since d = (a, m) = (5, 11) = 1, the given linear congruence has a unique solution. Since (5, 11) = 1, ∃ integers u, v such that 5u + 11v = 1. Here u = −2, v = 1, so
5·(−2) + 11·1 = 1,
i.e., 5·(−2) ≡ 1 (mod 11) ⇒ 5·(−6) ≡ 3 (mod 11).
Therefore, the value of x is −6. All the solutions are x ≡ −6 (mod 11), i.e., x ≡ 5 (mod 11). All the solutions are congruent to 5 (mod 11), and therefore the given congruence has a unique solution.
Ex 2.8.19 Solve the linear congruence 47x ≡ 11 (mod 249).
Solution: Since (47, 249) = 1, the given linear congruence 47x ≡ 11 (mod 249) has a unique solution. We express 47/249 as a simple continued fraction with an even number of quotients:
47/249 = 0 + 1/(5 + 1/(3 + 1/(2 + 1/(1 + 1/4)))).
The last convergent but one is p4/q4 = 10/53 = y0/x0, so x0 = 53. Therefore,
x ≡ 11·53 (mod 249)
or, x ≡ 583 − 2·249 (mod 249)
or, x ≡ 85 (mod 249).


Theorem 2.8.3 Let a, b and m be integers with m > 0. The linear congruence ax ≡ b (mod m) has a solution if and only if d|b, where d = (a, m). Also, if d|b, then there are exactly d mutually incongruent solutions modulo m.
Proof: The given linear congruence ax ≡ b (mod m), i.e., ax − b = my for some y ∈ Z, is equivalent to the linear Diophantine equation ax − my = b. Since (a, m) = d, so
d|a and d|m ⇒ a = da1 and m = dm1
for some integers a1 and m1 with (a1, m1) = 1. Then
ax ≡ b (mod m) ⇔ da1 x ≡ b (mod dm1).    (2.12)
We require those values of x for which da1 x − b is divisible by dm1. No such value of x is obtained unless d|b. Thus if dm1|(da1 x − b) and d|b, i.e., if b = db1 for some integer b1, then
dm1|(da1 x − db1) ⇒ m1|(a1 x − b1)
⇒ a1 x − b1 is divisible by m1
⇒ a1 x ≡ b1 (mod m1), where (a1, m1) = 1,    (2.13)
which is an equivalent form of (2.12). Therefore, by Theorem 2.8.2 the congruence (2.13) has one solution x0 < m1, and the d distinct solutions can be written in the form
x0, x0 + m1, x0 + 2m1, ..., x0 + (d − 1)m1,
i.e., x0, x0 + m/d, x0 + 2(m/d), ..., x0 + (d − 1)(m/d); m1 = m/d,    (2.14)
which are also d solutions of (2.12). We assert that these integers are incongruent modulo m, and that all other solutions are congruent to some one of them. If it happens that
x0 + k1(m/d) ≡ x0 + k2(m/d) (mod m), where 0 ≤ k1 < k2 ≤ d − 1,
then k1(m/d) ≡ k2(m/d) (mod m).
Now ((m/d), m) = m/d, and so by Property 2.8.7 the factor m/d can be cancelled to arrive at the congruence
k1 ≡ k2 (mod d) ⇒ d|(k2 − k1),
which is impossible, due to the fact that 0 < |k2 − k1| < d. Hence no two of the d integers in (2.14) are congruent modulo m.
Now we are to show that any other solution x0 + (m/d)k is congruent modulo m to one of the d integers listed in (2.14), i.e., the congruence has no solutions except those listed in (2.14). Using the division algorithm, we get k = qd + r; 0 ≤ r ≤ d − 1, and therefore
x0 + (m/d)k = x0 + (m/d)(qd + r) = x0 + mq + (m/d)r
≡ x0 + (m/d)r (mod m); 0 ≤ r < d,
with x0 + (m/d)r being one of our d selected solutions. Thus, if x0 is any solution of ax ≡ b (mod m), then the d incongruent distinct solutions are given by
x ≡ x0 + (m/d)k (mod m); k = 0, 1, 2, ..., d − 1,
i.e., x0, x0 + m/d, x0 + 2(m/d), ..., x0 + (d − 1)(m/d).
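Theorem 2.8.3 is constructive: test d|b, solve the reduced congruence a1 x ≡ b1 (mod m1) — unique by Theorem 2.8.2 — then list the d incongruent solutions. A Python sketch (the function name is ours; `pow(a1, -1, m1)` computes the modular inverse, available from Python 3.8):

```python
from math import gcd

def solve_linear_congruence(a, b, m):
    """All incongruent solutions of ax ≡ b (mod m), following Theorem 2.8.3:
    solvable iff d = gcd(a, m) divides b, with the d solutions
    x0, x0 + m/d, ..., x0 + (d-1)m/d."""
    d = gcd(a, m)
    if b % d != 0:
        return []                      # no solution
    a1, b1, m1 = a // d, b // d, m // d
    x0 = (b1 * pow(a1, -1, m1)) % m1   # unique solution of a1 x ≡ b1 (mod m1)
    return [x0 + k * m1 for k in range(d)]

assert solve_linear_congruence(5, 3, 11) == [5]                       # Ex 2.8.18
assert solve_linear_congruence(20, 10, 35) == [4, 11, 18, 25, 32]     # Ex 2.8.20
assert solve_linear_congruence(25, 15, 120) == [15, 39, 63, 87, 111]  # Ex 2.8.21
```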


Ex 2.8.20 Solve the linear congruence 20x ≡ 10 (mod 35).
Solution: Since d = (a, m) = (20, 35) = 5 and 5|10, the given linear congruence has exactly d = 5 incongruent solutions. The congruence 20x ≡ 10 (mod 35) is equivalent to the linear congruence
4x ≡ 2 (mod 7), where (4, 7) = 1,
so 4x ≡ 2 (mod 7) has a unique solution. Since (4, 7) = 1, ∃ integers u, v such that 4u + 7v = 1. Here u = 2, v = −1, so
4·2 + 7·(−1) = 1 ⇒ 4·2 ≡ 1 (mod 7) ⇒ 4·4 ≡ 2 (mod 7).
Therefore, the value of x is 4. The incongruent solutions are
x = 4 + (35/5)t = 4 + 7t; t = 0, 1, 2, 3, 4.

Ex 2.8.21 Solve: 25x ≡ 15 (mod 120).
Solution: Here d = (25, 120) = 5 and 5 (= d) divides 15 (= b), so the given linear congruence has exactly 5 incongruent solutions. The congruence 25x ≡ 15 (mod 120) is equivalent to the linear congruence
5x ≡ 3 (mod 24), where (5, 24) = 1,
or, 5x ≡ 75 (mod 24) ⇒ x ≡ 15 (mod 24),
i.e., x = 15 + 24t; t = 0, 1, 2, 3, 4,
having 5 solutions modulo 120. Therefore, x = 15, 39, 63, 87, 111 are the 5 incongruent solutions.

2.8.5 Simultaneous Linear Congruences

Here we consider the problem of solving a system of simultaneous linear congruences
a1 x ≡ b1 (mod m1), a2 x ≡ b2 (mod m2), ..., ak x ≡ bk (mod mk).    (2.15)
We assume that the moduli mr are relatively prime in pairs. Evidently, the system of two or more linear congruences will not have a solution unless dr|br ∀r, where dr = (ar, mr). When these conditions are satisfied, the factor dr can be cancelled in the r-th congruence to produce a new system having the same set of solutions as the original one:
a1 x ≡ b1 (mod n1), a2 x ≡ b2 (mod n2), ..., ak x ≡ bk (mod nk),
where nr = mr/dr and (ni, nj) = 1 for i ≠ j. Also, (ar, nr) = 1. The solutions of the individual congruences assume the form
x ≡ c1 (mod n1), x ≡ c2 (mod n2), ..., x ≡ ck (mod nk).    (2.16)
Thus, the problem is reduced to one of finding a simultaneous solution of a system of congruences of this type. The kind of problem that can be solved by simultaneous congruences is given by the following theorem.
Deduction 2.8.2 Let x ≡ a (mod p) and x ≡ b (mod q) be two simultaneous congruences and let (p, q) = d. Since x ≡ a (mod p), x = a + py, where y is given by
a + py ≡ b (mod q) ⇒ py ≡ b − a (mod q).
If b − a is not divisible by d, then the solution does not exist. But if d|(b − a), there is only one solution y1 with y1 < q/d which satisfies the last congruence, and the general value of y is given by

y = y1 + (q/d)t; t ∈ Z,
so that x = x1 + (pq/d)t, where x1 = a + py1.
Hence, x = a + py1 + (pq/d)t; t ∈ Z.

Thus a solution x1 of the given congruences exists if and only if b − a is divisible by d = (p, q), and the congruences are equivalent to a single congruence x ≡ x1 (mod l), where l = [p, q].
Ex 2.8.22 Find the general values of x for x ≡ 1 (mod 6) and x ≡ 4 (mod 9).
Solution: Here, with p = 6, q = 9, a = 1 and b = 4, we have b − a = 3 and (9, 6) = 3 = d, so that d|(b − a) and the solution exists. Now,
x ≡ 1 (mod 6) ⇒ x = 1 + 6y,
where y is given by
1 + 6y ≡ 4 (mod 9) ⇒ 6y ≡ 3 (mod 9),
which has a solution y = 2 (= y1) < q/d = 3, and the general values of y are given by
y = y1 + (q/d)t = 2 + 3t; t ∈ Z,
or, x = 1 + 6(2 + 3t) = 13 + 18t, as x = 1 + 6y,
which gives the general values; the given congruences are equivalent to a single congruence x ≡ 13 (mod 18), where [p, q] = [9, 6] = 18.
Theorem 2.8.4 Let m1, m2, ..., mr denote r positive integers with (mi, mj) = 1, 1 ≤ i < j ≤ r, and let a1, a2, ..., ar be arbitrary integers. Then the system of linear congruences
x ≡ a1 (mod m1), x ≡ a2 (mod m2), ..., x ≡ ar (mod mr)    (2.17)
has a unique simultaneous solution x0 modulo the product m = m1·m2 ⋯ mr, i.e., x ≡ x0 (mod m).
Proof: Here we take m = m1·m2 ⋯ mr. For each k = 1, 2, ..., r let us define
Mk = m/mk = m1·m2 ⋯ mk−1·mk+1 ⋯ mr,
i.e., Mk is the product of all the integers mi with the factor mk omitted. By hypothesis, the mi are relatively prime in pairs, i.e., (mi, mj) = 1 for i ≠ j, so that (Mk, mk) = 1. Hence, by the theory of linear congruences, it is possible to solve the linear congruence
Mk x ≡ 1 (mod mk).
Let the unique solution be xk. We are to show that the integer
x0 = a1 M1 x1 + a2 M2 x2 + ⋯ + ar Mr xr
is the simultaneous common solution of the given system of congruences. As mk|Mi for i ≠ k, we have Mi ≡ 0 (mod mk). Hence
x0 = a1 M1 x1 + a2 M2 x2 + ⋯ + ar Mr xr ≡ ak Mk xk (mod mk).
But the integer xk was chosen to satisfy the congruence Mk x ≡ 1 (mod mk), which gives
Mk xk ≡ 1 (mod mk) ⇒ ak Mk xk ≡ ak (mod mk)
⇒ x0 ≡ ak·1 = ak (mod mk).
This shows that a solution x0 to the system (2.17) of congruences exists. Let x′ be any other integer that satisfies these congruences; then
x′ ≡ ak ≡ x0 (mod mk); k = 1, 2, ..., r
⇒ mk|(x′ − x0), for each value of k.
As (mi, mj) = 1, we have
m1 m2 ⋯ mr|(x′ − x0) ⇒ x′ ≡ x0 (mod m).
This shows the uniqueness of the solution. This is the Chinese remainder theorem.
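The proof above is itself a construction: form Mk = m/mk, invert Mk modulo mk, and assemble x0 = Σ ak Mk xk. A Python sketch (the function name is ours; requires Python 3.8+ for `math.prod` and the modular inverse `pow(..., -1, ...)`):

```python
from math import prod

def crt(residues, moduli):
    """Simultaneous solution of x ≡ a_k (mod m_k) for pairwise coprime moduli,
    built exactly as in the proof of Theorem 2.8.4."""
    m = prod(moduli)
    x0 = 0
    for a_k, m_k in zip(residues, moduli):
        M_k = m // m_k
        x_k = pow(M_k, -1, m_k)        # solves M_k x ≡ 1 (mod m_k)
        x0 += a_k * M_k * x_k
    return x0 % m

assert crt([2, 5, 4], [7, 19, 5]) == 499      # Ex 2.8.24
assert crt([1, 2, 6], [9, 11, 13]) == 838     # Ex 2.8.27
```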
Ex 2.8.23 Solve the simultaneous linear congruences x ≡ 36 (mod 41), x ≡ 5 (mod 17).
Solution: From the given simultaneous linear congruences we see that m1 = 41, m2 = 17, so that (m1, m2) = (41, 17) = 1. Let m = m1·m2 = 41·17 = 697 and let
M1 = m/m1 = 697/41 = 17, M2 = m/m2 = 697/17 = 41;
then (M1, 41) = 1 and (M2, 17) = 1. Since (M1, 41) = 1, the linear congruence 17x ≡ 1 (mod 41) has a unique solution. Since
17·(−12) + 41·5 = 1, i.e., 17·(−12) ≡ 1 (mod 41),
the solution is x1 ≡ −12 (mod 41) ≡ 29 (mod 41). Since (M2, 17) = 1, the linear congruence 41x ≡ 1 (mod 17) has a unique solution. Since
41·5 + 17·(−12) = 1, i.e., 41·5 ≡ 1 (mod 17),
the solution is x2 ≡ 5 (mod 17). Therefore, the common integer solution of the given system of linear congruences is given by
x0 ≡ a1 M1 x1 + a2 M2 x2 ≡ {36·(17·29) + 5·(41·5)} (mod 697)
≡ 18773 (mod 697) ≡ 651 (mod 697);
indeed 651 ≡ 36 (mod 41) and 651 ≡ 5 (mod 17).
Ex 2.8.24 Solve the following system of linear congruences: x ≡ 2 (mod 7), x ≡ 5 (mod 19) and x ≡ 4 (mod 5).
Solution: Let m = 7·19·5 = 665. Now we consider the following linear congruences:
(m/7)x ≡ 1 (mod 7), (m/19)x ≡ 1 (mod 19) and (m/5)x ≡ 1 (mod 5),
i.e., 95x ≡ 1 (mod 7), 35x ≡ 1 (mod 19) and 133x ≡ 1 (mod 5),
i.e., (91 + 4)x ≡ 1 (mod 7), (38 − 3)x ≡ 1 (mod 19) and (130 + 3)x ≡ 1 (mod 5).
Now we consider the system of congruences
4x ≡ 1 (mod 7), −3x ≡ 1 (mod 19) and 3x ≡ 1 (mod 5).


Now, x = 2 is a solution of the first linear congruence, x = 6 is a solution of the second, −3x ≡ 1 (mod 19), and x = 2 is a solution of the third, 3x ≡ 1 (mod 5). Therefore, the linear congruences
95x ≡ 1 (mod 7), 35x ≡ 1 (mod 19) and 133x ≡ 1 (mod 5)
are satisfied by x = 2, 6, 2 respectively. Hence a solution of the given system is given by
x0 = 2·2·95 + 5·6·35 + 4·2·133 = 2494,
and the unique solution is given by
x ≡ 2494 (mod 7·19·5) ⇒ x ≡ 499 (mod 665).
Ex 2.8.25 If x ≡ a (mod 16), x ≡ b (mod 5) and x ≡ c (mod 11), then show that
x ≡ 385a + 176b − 560c (mod 880).
Solution: Here m1 = 16, m2 = 5, m3 = 11, and they are relatively prime in pairs; therefore m = m1 m2 m3 = 880 and the Chinese remainder theorem is applicable. Now,
(m/m1)y1 ≡ 1 (mod 16) ⇒ 55y1 ≡ 33 (mod 16)
⇒ 5y1 ≡ 3 (mod 16) ⇒ y1 ≡ 7 (mod 16);
(m/m2)y2 ≡ 1 (mod 5) ⇒ 176y2 ≡ 1 (mod 5)
⇒ y2 ≡ 1 (mod 5);
(m/m3)y3 ≡ 1 (mod 11) ⇒ 80y3 ≡ 1 (mod 11)
⇒ y3 ≡ 4 (mod 11) ⇒ y3 ≡ −7 (mod 11).
The integer solution of the system is given by
x0 ≡ (m/m1)y1 a + (m/m2)y2 b + (m/m3)y3 c (mod 880)
≡ 55·7·a + 176·1·b + 80·(−7)·c (mod 880)
≡ 385a + 176b − 560c (mod 880).
Ex 2.8.26 Find four consecutive integers divisible by 3, 4, 5, 7 respectively.
Solution: Let n, n + 1, n + 2 and n + 3 be four consecutive integers divisible by 3, 4, 5 and 7 respectively; then
n ≡ 0 (mod 3), n + 1 ≡ 0 (mod 4), n + 2 ≡ 0 (mod 5) and n + 3 ≡ 0 (mod 7),
i.e., n ≡ 0 (mod 3), n ≡ 3 (mod 4), n ≡ 3 (mod 5) and n ≡ 4 (mod 7). (i)
We are to solve the simultaneous linear congruences (i) by using the Chinese remainder theorem. For this, let m = 3·4·5·7 = 420, as the moduli are prime to each other. Now let
M1 = m/3 = 140, M2 = m/4 = 105, M3 = m/5 = 84 and M4 = m/7 = 60,
where (M1, 3) = 1, (M2, 4) = 1, (M3, 5) = 1 and (M4, 7) = 1.
(i) Since (140, 3) = 1, the linear congruence 140x ≡ 1 (mod 3) has a unique solution (mod 3), and the solution is x = x1 = 2.
(ii) Since (105, 4) = 1, the linear congruence 105x ≡ 1 (mod 4) has a unique solution (mod 4), and the solution is x = x2 = 1.
(iii) Since (84, 5) = 1, the linear congruence 84x ≡ 1 (mod 5) has a unique solution (mod 5), and the solution is x = x3 = 4.
(iv) Since (60, 7) = 1, the linear congruence 60x ≡ 1 (mod 7) has a unique solution (mod 7), and the solution is x = x4 = 2.
Thus, with a1 = 0, a2 = 3, a3 = 3, a4 = 4, the common integer solution of the given system of congruences is given by
x0 ≡ a1 M1 x1 + a2 M2 x2 + a3 M3 x3 + a4 M4 x4 (mod 420)
≡ 0 + 3·105·1 + 3·84·4 + 4·60·2 (mod 420)
≡ 1803 (mod 420) ≡ 123 (mod 420).
Therefore, the consecutive integers are n, n + 1, n + 2 and n + 3, where
n = 123 + 420t; t = 0, 1, 2, ...

Ex 2.8.27 Find the integer between 1 and 1000 which leaves the remainders 1, 2, 6 when divided by 9, 11, 13 respectively.
Solution: The required integer between 1 and 1000 is a solution of the system of linear congruences
x ≡ 1 (mod 9), x ≡ 2 (mod 11) and x ≡ 6 (mod 13).
We solve this system by using the Chinese remainder theorem. For this, let m = 9·11·13 = 1287. Now we consider the congruences
13·11x ≡ 1 (mod 9), 13·9x ≡ 1 (mod 11) and 9·11x ≡ 1 (mod 13),
i.e., 143x ≡ 1 (mod 9), 117x ≡ 1 (mod 11) and 99x ≡ 1 (mod 13),
i.e., (144 − 1)x ≡ 1 (mod 9), (110 + 7)x ≡ 1 (mod 11) and (91 + 8)x ≡ 1 (mod 13).
Now we consider the system of congruences
−x ≡ 1 (mod 9), 7x ≡ 1 (mod 11) and 8x ≡ 1 (mod 13).
Notice that x = 8 (≡ −1) is a solution of the first linear congruence, x = 8 is a solution of the second linear congruence 7x ≡ 1 (mod 11), and x = 5 is a solution of the third linear congruence 8x ≡ 1 (mod 13). Therefore, the linear congruences 143x ≡ 1 (mod 9), 117x ≡ 1 (mod 11) and 99x ≡ 1 (mod 13) are satisfied by x = 8, 8, 5 respectively. Hence a solution of the given system is given by
x0 = 1·8·11·13 + 2·8·9·13 + 6·5·9·11 = 5986,
and the unique solution is given by
x ≡ 5986 (mod 9·11·13) ⇒ x ≡ 838 (mod 1287).
Hence the required integer is 838.
Ex 2.8.28 Solve the linear congruence 32x ≡ 79 (mod 1225) by applying the Chinese remainder theorem.
Solution: The canonical form of 1225 is 1225 = 5^2·7^2, and (5^2, 7^2) = 1. Thus, solving the given linear congruence 32x ≡ 79 (mod 1225) is equivalent to finding a simultaneous solution of the congruences
32x ≡ 79 (mod 25) and 32x ≡ 79 (mod 49),
equivalently, 7x ≡ 4 (mod 25) and 16x ≡ 15 (mod 49), (i)
with solutions a1 ≡ 22 (mod 25) and a2 ≡ 4 (mod 49) respectively. We are to combine these by using the Chinese remainder theorem. For this, let m = 25·49, as the moduli are prime to each other. Now let
M1 = m/25 = 49, M2 = m/49 = 25,
where (M1, 25) = 1, (M2, 49) = 1.
(i) Since (49, 25) = 1, the linear congruence 49x ≡ 1 (mod 25) has a unique solution (mod 25), and the solution is x = x1 = 24.
(ii) Since (25, 49) = 1, the linear congruence 25x ≡ 1 (mod 49) has a unique solution (mod 49), and the solution is x = x2 = 2.
Thus, the common integer solution of the given system of congruences is given by
x0 ≡ a1 M1 x1 + a2 M2 x2 (mod m)
≡ 22·49·24 + 4·25·2 (mod 1225)
≡ 26072 (mod 1225) ≡ 347 (mod 1225).
Ex 2.8.29 Give an example of a congruence which has more roots than its degree.
Solution: The congruence x^2 ≡ 1 (mod 8) has four distinct solutions x ≡ 1, 3, 5, 7 (mod 8), yet x^2 ≡ 1 (mod 8) is of degree 2.
Result 2.8.3 (Symbolic fraction method) Let the linear congruence be ax ≡ b (mod m), where (a, m) = 1; then we have
ax ≡ b + mh (mod m),
where h is an arbitrary integer, which may be chosen so that a|(b + mh). For example, if we consider 5x ≡ 2 (mod 16), then
x

2.8.6

2 + 3.16
2
(mod 16) x
(mod 16) x 10(mod 16).
5
5

Inverse of a Modulo m

If (a, m) = 1, then the linear congruence ax ≡ b (mod m) has a unique solution modulo m. The unique solution of ax ≡ 1 (mod m) is sometimes called the multiplicative inverse or reciprocal of a modulo m, denoted by a′. From the definition it follows that, if a′ is the reciprocal of a, then x ≡ ba′ (mod m) is the solution of ax ≡ b (mod m). An element a is said to be a unit modulo m if it has an inverse modulo m.
Since (1, 12) = 1 = (5, 12) = (7, 12) = (11, 12), the elements 1, 5, 7, 11 are units modulo 12.
Ex 2.8.30 Find the inverse of 12 modulo 17, if it exists.
Solution: Consider the linear congruence 12x ≡ 1 (mod 17). Since (12, 17) = 1, the linear congruence 12x ≡ 1 (mod 17) has a solution, and hence an inverse of 12 modulo 17 exists. By the division algorithm,
17 = 12·1 + 5; 12 = 5·2 + 2 and 5 = 2·2 + 1.
Working backwards,
1 = 5 − 2·2 = 5 − 2·(12 − 5·2) = 5 − 2·12 + 5·4
= 5·5 − 2·12 = 5·(17 − 12·1) − 2·12
= 5·17 − 5·12 − 2·12 = 12·(−7) + 17·5.
This shows that 12·(−7) ≡ 1 (mod 17). Therefore, x = −7 is a solution of 12x ≡ 1 (mod 17). Hence −7, i.e., 10, is an inverse of 12 modulo 17.
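The back-substitution carried out by hand above is the extended Euclidean algorithm. A minimal Python sketch (the helper name inverse_mod is our own, not from the text) is:

```python
def inverse_mod(a, m):
    """Inverse of a modulo m via the extended Euclidean algorithm.
    Maintains the invariant s*a ≡ r (mod m) for each remainder r."""
    r0, r1 = m, a
    s0, s1 = 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        s0, s1 = s1, s0 - q * s1
    if r0 != 1:
        raise ValueError("a is not invertible modulo m")
    return s0 % m

print(inverse_mod(12, 17))   # 10, i.e. -7 (mod 17)
```

The same routine confirms the next example as well: inverse_mod(35, 48) returns 11.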


Ex 2.8.31 If possible, find the inverse of 35 modulo 48.
Solution: Since (35, 48) = 1, the inverse of 35 modulo 48 exists. Now
48 = 1·35 + 13, 35 = 2·13 + 9, 13 = 1·9 + 4, 9 = 2·4 + 1,
so 1 = 9 + (−2)·4 = 9 + (−2)[13 + (−1)·9] = 3·9 − 2·13 = 3·[35 − 2·13] − 2·13 = 3·35 − 8·13 = 3·35 − 8·[48 − 35] = 11·35 + (−8)·48.
Hence 11 is the inverse of 35 modulo 48.

2.9 Fermat's Theorem

Let a be an integer and let p be a prime. If p does not divide a, then

a^(p−1) ≡ 1 (mod p), (2.18)

and, for any integer a, a^p ≡ a (mod p).

Proof: Consider the set of (p − 1) integers
R = {a, 2a, 3a, …, (p − 1)a}, where p ∤ a.
We shall first show that no two distinct members of the above (p − 1) integers are congruent to each other modulo p. If possible, let
ra ≡ sa (mod p); 1 ≤ s < r ≤ p − 1
⟹ (r − s)a ≡ 0 (mod p) ⟹ p | (r − s)a
⟹ p | (r − s) or p | a; as p is prime.
Since 1 ≤ s < r ≤ p − 1, we have p ∤ (r − s), and by hypothesis p ∤ a. Hence ra ≢ sa (mod p). Also, we find that ra ≢ 0 (mod p) for r = 1, 2, …, p − 1. Hence
ra ≡ k (mod p), where k ∈ Z and 0 < k ≤ p − 1.
Since no two distinct members of R are congruent to each other and there are (p − 1) distinct integers a, 2a, …, (p − 1)a, it follows that the (p − 1) integers in R must be congruent modulo p to 1, 2, …, (p − 1) taken in some order. Therefore,
a·2a·3a ⋯ (p − 1)a ≡ 1·2·3 ⋯ (p − 1) (mod p)
⟹ a^(p−1) [1·2·3 ⋯ (p − 1)] ≡ 1·2·3 ⋯ (p − 1) (mod p)
⟹ a^(p−1) (p − 1)! ≡ (p − 1)! (mod p)
⟹ a^(p−1) ≡ 1 (mod p); as (p, (p − 1)!) = 1.
Hence the theorem. The converse of this theorem is not always true. For example, 2^340 ≡ 1 (mod 341), but 341 = 11·31 is not a prime number.
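The counterexample to the converse can be verified directly (an illustration; Python's built-in three-argument pow does fast modular exponentiation):

```python
# 341 = 11·31 is composite, yet 2^340 ≡ 1 (mod 341): Fermat's
# congruence alone does not certify primality.
print(pow(2, 340, 341))   # 1
print(341 == 11 * 31)     # True
```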
Result 2.9.1 Let p be a divisor of a, i.e., p | a; then a = pk for some k ∈ Z. Therefore,
a^p − a = a(a^(p−1) − 1) = pk(a^(p−1) − 1) = pt,
where t = k(a^(p−1) − 1) ∈ Z. Hence a^p − a is divisible by p, i.e.,
a^p − a ≡ 0 (mod p) ⟹ a^p ≡ a (mod p).
Ex 2.9.1 Prove that (1/5)n^5 + (1/3)n^3 + (7/15)n is an integer for every n.


Solution: By Fermat's theorem, n^5 ≡ n (mod 5) and n^3 ≡ n (mod 3). Then
5 | (n^5 − n) and 3 | (n^3 − n) ⟹ n^5 = 5t + n, n^3 = 3s + n,
for some integers t and s. Now
(1/5)n^5 + (1/3)n^3 + (7/15)n = (t + s) + (3n + 5n + 7n)/15 = (t + s + n) = an integer.
Ex 2.9.2 Use Fermat's theorem to prove that a^12 − b^12 is divisible by 13·7, where a, b are both prime to 91.
Solution: Since a is prime to 91, a is prime to both 13 and 7. Using Fermat's theorem,
a^12 − 1 ≡ 0 (mod 13) and a^6 − 1 ≡ 0 (mod 7).
Since a^6 − 1 ≡ 0 (mod 7), it follows that a^12 − 1 ≡ 0 (mod 7). As a^12 − 1 ≡ 0 (mod 13) and a^12 − 1 ≡ 0 (mod 7), we get
a^12 − 1 ≡ 0 (mod 91); as (13, 7) = 1.
Similarly, b^12 − 1 ≡ 0 (mod 91); as (13, 7) = 1. Hence
a^12 − b^12 ≡ 0 (mod 91),
which is the required result.
Theorem 2.9.1 If p be a prime > 2, then 1^p + 2^p + ⋯ + (p − 1)^p ≡ 0 (mod p).
Proof: We have
1^p ≡ 1 (mod p); 2^p ≡ 2 (mod p), …, (p − 1)^p ≡ (p − 1) (mod p).
Adding all the results, we get
1^p + 2^p + ⋯ + (p − 1)^p ≡ {1 + 2 + ⋯ + (p − 1)} (mod p)
≡ (1/2)p(p − 1) (mod p)
≡ 0 (mod p); since p − 1 is even.

Theorem 2.9.2 If p be a prime and a be prime to p, then a^(p(p−1)) ≡ 1 (mod p^2).
Proof: Fermat's theorem states that if p be a prime and a an integer prime to p, then a^(p−1) ≡ 1 (mod p). Hence there exists q ∈ Z such that a^(p−1) = 1 + qp. Therefore,
a^(p(p−1)) = (1 + qp)^p = 1 + p·qp + [p(p − 1)/2!](qp)^2 + ⋯ + (qp)^p
= 1 + kp^2, where k ∈ Z.
Hence, by definition, a^(p(p−1)) ≡ 1 (mod p^2).

Ex 2.9.3 Show that every odd prime factor of n^2 + 1 is of the form 4m + 1, where m is an integer.
Solution: Let p be an odd prime factor of n^2 + 1; then p is not a divisor of n, and n^2 + 1 ≡ 0 (mod p), i.e., n^2 ≡ −1 (mod p). By Fermat's theorem, we have
n^(p−1) ≡ 1 (mod p) ⟹ (n^2)^((p−1)/2) ≡ 1 (mod p).
But (n^2)^((p−1)/2) ≡ (−1)^((p−1)/2) (mod p), as n^2 ≡ −1 (mod p). Hence (−1)^((p−1)/2) ≡ 1 (mod p), so (p − 1)/2 must be an even integer, i.e., (p − 1)/2 = 2m (say).
Therefore p = 4m + 1, where m is an integer. From this it follows that no prime factor of n^2 + 1 can be of the form 4m − 1, where m is an integer.

Ex 2.9.4 If p is prime to a, then a^(p^(n−1)(p−1)) ≡ 1 (mod p^n).

Solution: By Fermat's theorem, we have a^(p−1) ≡ 1 (mod p). Using the theorem that if a ≡ 1 (mod p^n) then a^p ≡ 1 (mod p^(n+1)), we get successively
a^(p(p−1)) ≡ 1 (mod p^2), a^(p^2(p−1)) ≡ 1 (mod p^3), …, a^(p^(n−1)(p−1)) ≡ 1 (mod p^n).

Result 2.9.2 If p is a prime and p ≠ 2, then a^((p−1)/2) ≡ ±1 (mod p), when p ∤ a.

Proof: By Fermat's theorem, we have
a^(p−1) − 1 ≡ 0 (mod p)
or, [a^((p−1)/2) − 1][a^((p−1)/2) + 1] ≡ 0 (mod p); p ≠ 2
⟹ p | [a^((p−1)/2) − 1] or p | [a^((p−1)/2) + 1]; as p is prime
⟹ a^((p−1)/2) − 1 ≡ 0 (mod p) or a^((p−1)/2) + 1 ≡ 0 (mod p)
⟹ a^((p−1)/2) ≡ ±1 (mod p).

Ex 2.9.5 Show that the square of any integer prime to 5 is of the form 5k ± 1.
Solution: By Result 2.9.2, if p ∤ a, then
a^((p−1)/2) ≡ ±1 (mod p), i.e., a^((p−1)/2) = pk ± 1 for some k ∈ Z.
When p = 5, this gives a^2 = 5k ± 1. Thus the square of any integer prime to 5 is of the form 5k ± 1.


Ex 2.9.6 Prove that the eighth power of any integer is of the form 17k or 17k ± 1.
Solution: Let a be an integer divisible by 17; then a^8 = 17k for some integer k. If a is not divisible by 17, then (a, 17) = 1, and by Fermat's theorem,
a^16 − 1 ≡ 0 (mod 17) ⟹ (a^8 − 1)(a^8 + 1) ≡ 0 (mod 17).
Either a^8 − 1 ≡ 0 (mod 17), or a^8 + 1 ≡ 0 (mod 17). Now,
a^8 − 1 ≡ 0 (mod 17) ⟹ a^8 = 17k + 1,
a^8 + 1 ≡ 0 (mod 17) ⟹ a^8 = 17k − 1.
Hence a^8 = 17k or 17k ± 1, where a is an integer.

2.9.1 Wilson's Theorem

Statement: If p be a prime, then (p − 1)! + 1 ≡ 0 (mod p).


Proof: For the prime number p, the set S of integers which are less than and prime to p is S = {1, 2, 3, …, p − 1}. Let a be one of the integers 1, 2, …, p − 1. Then no two of the integers 1·a, 2·a, …, (p − 1)·a are congruent modulo p; for if ra ≡ sa (mod p) for some integers r, s, then, as (a, p) = 1,
r ≡ s (mod p); 1 ≤ r < s ≤ p − 1,
which is a contradiction. Also, none of these integers is divisible by p. This means that the integers a, 2a, …, (p − 1)a are congruent to 1, 2, …, p − 1 modulo p, taken in some order. In particular, since (a, p) = 1 for each a ∈ S, the linear congruence ax ≡ 1 (mod p) has a unique solution, i.e., for each a ∈ S there is a unique a′ ∈ S such that aa′ ≡ 1 (mod p). If a = a′, then
a^2 ≡ 1 (mod p) ⟹ a^2 − 1 ≡ 0 (mod p)
⟹ p | (a^2 − 1) ⟹ p | (a + 1)(a − 1)
⟹ either (a − 1) ≡ 0 (mod p) or (a + 1) ≡ 0 (mod p).
Since p is a prime and a < p, it follows that when a − 1 ≡ 0 (mod p), a = 1, and when a + 1 ≡ 0 (mod p), a = p − 1.
If we omit the integers 1 and p − 1 from S, the remaining p − 3 integers 2, 3, …, p − 2 can be grouped into (p − 3)/2 pairs (a, a′) satisfying aa′ ≡ 1 (mod p), a ≠ a′ and 1 < a < p − 1. Multiplying these (p − 3)/2 pair congruences, we have
2·3 ⋯ (p − 2) ≡ 1 (mod p) ⟹ (p − 2)! ≡ 1 (mod p),
or, (p − 1)! ≡ (p − 1) (mod p) ≡ (−1) (mod p)
⟹ (p − 1)! + 1 ≡ 0 (mod p).
The converse of this theorem is also true, i.e., if (p − 1)! + 1 ≡ 0 (mod p), then p (> 1) is a prime. For, if p be not a prime, then p is composite and has a divisor d with 1 < d < p. Since 1 < d < p, d divides one of the factors of (p − 1)!, so d | (p − 1)!. Thus
d | (p − 1)! + 1 and d | (p − 1)! ⟹ d | 1,
which is absurd, as d ≠ 1. Therefore p cannot be composite, and so p is prime.
This theorem provides a necessary and sufficient condition for determining the primality of a positive integer p. When p assumes large values, (p − 1)! becomes very large, and in this case the test is impracticable.
Ex 2.9.7 Show that 70! + 1 ≡ 0 (mod 71).
Solution: Since 71 is a prime number, by Wilson's theorem,
(71 − 1)! ≡ (−1) (mod 71) ⟹ 70! + 1 ≡ 0 (mod 71).
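A direct numerical check of Ex 2.9.7 (illustrative only):

```python
# Wilson's theorem for p = 71: (p-1)! + 1 is divisible by p.
import math

p = 71
print((math.factorial(p - 1) + 1) % p)   # 0
```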
Ex 2.9.8 For a prime p,
(p − 1)! ≡ (−1)^((p−1)/2) [1·2 ⋯ ((p−1)/2)]^2 (mod p).
Show that the integer ((p−1)/2)! satisfies the congruence x^2 + 1 ≡ 0 (mod p) or x^2 − 1 ≡ 0 (mod p) according as p = 4k + 1 or p = 4k + 3.

Solution: We consider the set of integers 1, 2, 3, …, (p−1)/2, (p+1)/2, …, (p − 2), (p − 1), where p is prime. Now, p − 1 ≡ −1 (mod p), p − 2 ≡ −2 (mod p), …, (p+1)/2 ≡ −(p−1)/2 (mod p). Now,
(p − 1)! = 1·2·3 ⋯ ((p−1)/2) · ((p+1)/2) ⋯ (p − 2)·(p − 1)
≡ 1·2·3 ⋯ ((p−1)/2) · (−(p−1)/2) ⋯ (−2)·(−1) (mod p)
≡ 1·(−1)·2·(−2) ⋯ ((p−1)/2)·(−(p−1)/2) (mod p)
≡ (−1)^((p−1)/2) [1·2 ⋯ ((p−1)/2)]^2 (mod p)
≡ (−1)^((p−1)/2) [((p−1)/2)!]^2 (mod p); (p−1)/2 being an integer.

Again, by Wilson's theorem, (p − 1)! ≡ −1 (mod p). Therefore,
−1 ≡ (−1)^((p−1)/2) [((p−1)/2)!]^2 (mod p).
Now, if p is a prime of the form 4k + 1 for some k ∈ N, then
−1 ≡ (−1)^((4k+1−1)/2) [((p−1)/2)!]^2 (mod p)
or, −1 ≡ [((p−1)/2)!]^2 (mod p)
or, [((p−1)/2)!]^2 ≡ −1 (mod p).
Thus x = ((p−1)/2)! satisfies the congruence x^2 + 1 ≡ 0 (mod p) when p is of the form 4k + 1.
Again, if p is of the form 4k + 3 for some k ∈ N, then
−1 ≡ (−1)^((4k+3−1)/2) [((p−1)/2)!]^2 (mod p)
or, −1 ≡ −[((p−1)/2)!]^2 (mod p)
or, [((p−1)/2)!]^2 ≡ 1 (mod p).
Thus x = ((p−1)/2)! satisfies the congruence x^2 − 1 ≡ 0 (mod p).

Ex 2.9.9 If p is an odd prime, then show that
1^2·3^2·5^2 ⋯ (p − 2)^2 ≡ (−1)^((p+1)/2) (mod p).

Solution: We use the results p − k ≡ −k (mod p) and k ≡ −(p − k) (mod p). Now,
(p − 1)! = 1·2·3 ⋯ ((p−1)/2) · ((p+1)/2) ⋯ (p − 2)·(p − 1)
= {1·(p − 1)}{3·(p − 3)} ⋯ {(p − 2)·2}
≡ 1·(−1)·3·(−3) ⋯ (p − 2)·(−(p − 2)) (mod p)
≡ (−1)^((p−1)/2) 1^2·3^2·5^2 ⋯ (p − 2)^2 (mod p).
Again, using Wilson's theorem (p − 1)! ≡ −1 (mod p), we have
−1 ≡ (−1)^((p−1)/2) 1^2·3^2·5^2 ⋯ (p − 2)^2 (mod p)
or, 1^2·3^2·5^2 ⋯ (p − 2)^2 ≡ (−1)(−1)^((p−1)/2) (mod p)
or, 1^2·3^2·5^2 ⋯ (p − 2)^2 ≡ (−1)^((p+1)/2) (mod p).

Ex 2.9.10 If p be a prime number, then show that
(p − 1)! ≡ p − 1 (mod (1 + 2 + ⋯ + (p − 1))).


Solution: Using Wilson's theorem (p − 1)! ≡ −1 (mod p), we have
(p − 1)! ≡ (p − 1) (mod p), i.e., p | (p − 1)! − (p − 1).
Also, (p−1)/2 | (p − 1)! − (p − 1), as p − 1 is even and (p − 1) | (p − 1)! − (p − 1). Since (p, (p−1)/2) = 1,
p·(p−1)/2 | (p − 1)! − (p − 1) ⟹ (p − 1)! − (p − 1) ≡ 0 (mod p(p−1)/2)
or, (p − 1)! ≡ p − 1 (mod (1 + 2 + ⋯ + (p − 1))),
as 1 + 2 + ⋯ + (p − 1) = p(p−1)/2.
Ex 2.9.11 Prove that 4(29)! + 5! is divisible by 31.
Solution: In Wilson's theorem, let p = 31 (prime); then (30)! + 1 ≡ 0 (mod 31), i.e.,
(31 − 1)(29)! + 1 ≡ 0 (mod 31)
⟹ −(29)! + 1 ≡ 0 (mod 31) ⟹ 4(29)! − 4 ≡ 0 (mod 31)
⟹ 4(29)! − 4 + 124 ≡ 0 (mod 31) ⟹ 4(29)! + 120 ≡ 0 (mod 31), as 31 | 124.
Thus 4(29)! + 5! is divisible by 31.
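The same claim can be checked numerically (illustration only):

```python
# Ex 2.9.11: 4·(29)! + 5! should be divisible by 31.
import math

print((4 * math.factorial(29) + math.factorial(5)) % 31)   # 0
```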

2.10 Arithmetic Functions

An arithmetic or number-theoretic function is a real or complex valued function whose domain is the set of positive integers. If f is an arithmetic function, we write f(n) for its values. For example, f : N → N defined by f(n) = n or f(n) = n^2 is an arithmetic function, but f(x) = log x, defined for all positive real x, is not an arithmetic function, since its domain is not N. The following are arithmetic functions:
(i) f(n) = 2n for all n ∈ N.
(ii) f(n) = 1/n for all n ∈ N.
(iii) f(n) = n + 1/n for all n ∈ N.
An arithmetic function not identically zero is said to be normalized if f(1) = 1. Several arithmetical functions play an important role in the study of divisibility properties of integers and the distribution of primes. Here we shall discuss two important number-theoretic functions:
(i) the Euler totient function or phi function,
(ii) the Möbius function.

2.10.1 Euler's Phi Function

Let n ∈ N. Then the number of positive integers not exceeding n and prime to n is denoted by φ(n), with φ(1) = 1. Thus the function φ : N → N defined by

φ(n) = Σ′_(k=1)^n 1 (2.19)

is known as Euler's phi function, where the dash ′ indicates that the sum is extended over those k (≤ n) satisfying (k, n) = 1. For example, let n = 12; then k = 1, 5, 7, 11, where k < n and (k, n) = 1. Thus

φ(12) = Σ′_(k=1)^12 1 = 1 + 1 + 1 + 1 = 4.

A short table for φ(n) is given as follows:

n    : 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
φ(n) : 1 1 2 2 4 2 6 4 6  4 10  4 12  6  8

The function φ is a number-theoretic function.
Result 2.10.1 Let n be a prime integer. If p is a positive integer such that p < n, then (p, n) = 1. Hence the number of positive integers not exceeding n and relatively prime to n is n − 1, so that φ(n) = n − 1.
Result 2.10.2 The following are important:
(i) If n = p1^α1 p2^α2 ⋯ pk^αk, where pi (1 ≤ i ≤ k) are distinct primes and αi ∈ N for 1 ≤ i ≤ k, then the number of positive divisors of n is (1 + α1)(1 + α2) ⋯ (1 + αk).
(ii) The highest power of a prime p contained in n! is denoted by k(n!), where
k(n!) = [n/p] + [n/p^2] + [n/p^3] + ⋯
Ex 2.10.1 Find the highest power of 3 contained in 100!.
Solution: The highest power of the prime p = 3 contained in 100! is
k(100!) = [100/3] + [100/3^2] + [100/3^3] + [100/3^4] + [100/3^5] + ⋯
= 33 + 11 + 3 + 1 + 0 = 48.
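The formula used above is easy to code. A small Python sketch (the helper name highest_power is our own):

```python
def highest_power(p, n):
    """Exponent of the prime p in n!, i.e. [n/p] + [n/p^2] + ... ."""
    total, pk = 0, p
    while pk <= n:
        total += n // pk
        pk *= p
    return total

print(highest_power(3, 100))   # 48, matching the computation above
```

It also answers Section-B Question 10: highest_power(7, 1000) returns 164.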


Theorem 2.10.1 If p is a prime, then φ(p^k) = p^k (1 − 1/p), where k is a positive integer.
Proof: When k = 1, φ(p) = p − 1, as p is prime. If k > 1, let us arrange the integers from 1 to p^k in the following way:

1                2                …  p − 1                    p
p + 1            p + 2            …  p + (p − 1)              2p
2p + 1           2p + 2           …  2p + (p − 1)             3p
⋮                ⋮                   ⋮                        ⋮
(p^(k−1) − 1)p + 1  (p^(k−1) − 1)p + 2  …  (p^(k−1) − 1)p + (p − 1)  p^(k−1)·p

Thus, if k > 1, we find that for a positive integer q (≤ p^k), (q, p^k) ≠ 1 if and only if q is one of p, 2p, 3p, …, p^(k−1)·p, and their number is p^(k−1). If q is not equal to one of p, 2p, 3p, …, p^(k−1)·p, then (q, p^k) = 1. Now there are p^k integers from 1 to p^k. Among these integers, there are p^(k−1) integers which are not relatively prime to p^k. Hence the remaining integers are relatively prime to p^k; their total number is p^k − p^(k−1), and so
φ(p^k) = p^k − p^(k−1) = p^k (1 − 1/p). (2.20)


If the integer n (> 1) is of the form n = p1^α1 p2^α2 ⋯ pr^αr, where p1, p2, …, pr are primes, so that p1^α1, p2^α2, …, pr^αr are prime to one another, then
φ(n) = φ(p1^α1) φ(p2^α2) ⋯ φ(pr^αr)
= p1^α1 (1 − 1/p1) · p2^α2 (1 − 1/p2) ⋯ pr^αr (1 − 1/pr)
= p1^α1 p2^α2 ⋯ pr^αr (1 − 1/p1)(1 − 1/p2) ⋯ (1 − 1/pr)
= n (1 − 1/p1)(1 − 1/p2) ⋯ (1 − 1/pr) = n ∏_(i=1)^r (1 − 1/pi). (2.21)
Ex 2.10.2 Find φ(191) and φ(260).
Solution: First we are to test whether 191 is a prime number or not. For this, we find all primes p satisfying p^2 ≤ 191. These primes are 2, 3, 5, 7, 11 and 13. None of these primes divides 191, so 191 is a prime.
Therefore, by definition, φ(191) = 191 − 1 = 190.
By using the unique factorization theorem, we get 260 = 2^2·5·13. Therefore,
φ(260) = φ(2^2·5·13) = 260(1 − 1/2)(1 − 1/5)(1 − 1/13)
= 260 · (1/2) · (4/5) · (12/13) = 96.
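Formula (2.21) gives a simple way to compute φ via trial-division factorization. A Python sketch (the helper name phi is ours, not from the text):

```python
def phi(n):
    """Euler's phi via the product formula: for each prime p | n,
    multiply the running result by (1 - 1/p)."""
    result, p, m = n, 2, n
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p    # result *= (1 - 1/p)
        p += 1
    if m > 1:                        # one remaining prime factor
        result -= result // m
    return result

print(phi(191), phi(260))   # 190 96
```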
Theorem 2.10.2 Let m and n be two positive integers. If m, n are relatively prime, then φ(mn) = φ(m)φ(n), i.e., φ is multiplicative.
Proof: Given that (m, n) = 1. We consider the product mn. Then the first mn numbers can be arranged in n lines, each containing m numbers, thus:

1            2            …  k            …  m
m + 1        m + 2        …  m + k        …  2m
2m + 1       2m + 2       …  2m + k       …  3m
⋮            ⋮               ⋮               ⋮
(n − 1)m + 1  (n − 1)m + 2  …  (n − 1)m + k  …  nm

Now, consider the vertical column beginning with k. If (k, m) = 1, all the terms of this column will be prime to m; but if k and m have a common divisor, no number in the column will be prime to m. Now, the first row contains φ(m) numbers prime to m; therefore there are φ(m) vertical columns in each of which every term is prime to m. Let us suppose that the vertical column which begins with k is one of these. This column is in arithmetic progression with common difference m, and since (m, n) = 1, its n terms when divided by n leave the remainders
0, 1, 2, 3, …, n − 2, n − 1
in some order. Hence the column contains φ(n) integers prime to n. Thus in the table there are φ(m)φ(n) integers which are prime to m and also to n, and therefore to mn; i.e., φ(mn) = φ(m)φ(n).
This theorem can be extended to a finite number of positive integers as
φ(m1 m2 ⋯ mr) = φ(m1)φ(m2) ⋯ φ(mr),
where m1, m2, …, mr are prime to one another. More generally, if (m, n) = d, then
φ(mn) = φ(m)φ(n) · d/φ(d).


Theorem 2.10.3 If n is a positive integer, then
φ(2n) = φ(n), if n is odd;
φ(2n) = 2φ(n), if n is even.
Proof: When n is odd, 2 and n are prime to each other, so
φ(2n) = φ(2)φ(n) = 1·φ(n) = φ(n).
When n is even, let n = 2^k·p, where p is an odd integer. Therefore,
φ(n) = φ(2^k·p) = 2^k (1 − 1/2) φ(p) = 2^(k−1) φ(p),
φ(2n) = φ(2^(k+1)) φ(p) = 2^k φ(p),
so φ(2n) = 2φ(n). If p = 1, then also φ(2n) = 2φ(n), since φ(1) = 1.
Ex 2.10.3 Find all integers n such that
(i) φ(n) = n/2, (ii) φ(n) = φ(2n) and (iii) φ(n) = 12.
Solution: (i) If n is a prime, then φ(n) = n − 1, and n − 1 = n/2 gives n = 2. In general, n/2 can be written as n(1 − 1/2), so by formula (2.21) the values of n are exactly those of the form n = 2^α, α ∈ N.
(ii) By Theorem 2.10.3, φ(2n) = φ(n) holds precisely when n is odd; thus the values of n are n = 1, 3, 5, 7, 9, ….
(iii) Case 1: Let n be a prime; then φ(n) = n − 1 = 12 gives n = 13.
Case 2: Now, 12 can be written as 12 = 2^2·3, and
n(1 − 1/2)(1 − 1/3) = 12 ⟹ n · (1/2) · (2/3) = 12 ⟹ n = 36.
Thus, among others, the values n = 13 and n = 36 satisfy φ(n) = 12 (the remaining solutions are n = 21, 26, 28).
Ex 2.10.4 Solve for x, y, z ∈ N, where φ(x − 5) + φ(3y − 5) + φ(5z − 18) = 3.
Solution: We see that φ(n) ∈ N, and therefore the given equation will be satisfied if and only if φ(x − 5) = 1, φ(3y − 5) = 1, φ(5z − 18) = 1. Now,
φ(x − 5) = 1 ⟹ x − 5 = 1 or 2, i.e., x = 6 or 7;
φ(3y − 5) = 1 ⟹ 3y − 5 = 1 or 2, i.e., y = 2, but y ≠ 7/3, as y ∈ N;
φ(5z − 18) = 1 ⟹ 5z − 18 = 1 or 2, i.e., z = 4, but z ≠ 19/5, as z ∈ N.
Thus the solutions are (6, 2, 4) and (7, 2, 4).


Theorem 2.10.4 Let a, m (> 0) be integers. If (a, m) = 1, then a^φ(m) ≡ 1 (mod m).
Proof: For each positive integer m ≥ 1, φ(m) is Euler's phi function, defined as
φ(1) = 1 and φ(m) = Σ_(1≤k≤m, (k,m)=1) 1.
Thus for m = 1 the result holds trivially. Fix a positive integer m and take an integer a coprime to m. Let {r1, r2, …, r_φ(m)} be a reduced residue system mod m. Then {ar1, ar2, …, ar_φ(m)} is also a reduced residue system mod m in some order, since each (ari, m) = 1 and the ari are incongruent to each other. Hence the product of all the integers in the first set is congruent to the product of those in the second set. Therefore,
ar1·ar2 ⋯ ar_φ(m) ≡ r1·r2 ⋯ r_φ(m) (mod m)
⟹ a^φ(m) r1·r2 ⋯ r_φ(m) ≡ r1·r2 ⋯ r_φ(m) (mod m)
⟹ a^φ(m) ≡ 1 (mod m); as (r1·r2 ⋯ r_φ(m), m) = 1.
Each ri is relatively prime to m, so we can cancel each ri and obtain the theorem. This is known as the Euler–Fermat theorem. This theorem can be used to calculate the solution of a linear congruence.
Result 2.10.3 If (a, m) = 1, the solution (unique mod m) of the linear congruence ax ≡ b (mod m) is given by x ≡ b·a^(φ(m)−1) (mod m).
Ex 2.10.5 Solve the linear congruence 5x ≡ 3 (mod 24).
Solution: Since (5, 24) = 1, there is a unique solution, given by
x ≡ 3·5^(φ(24)−1) ≡ 3·5^7 (mod 24); as φ(24) = φ(3)φ(8) = 2·4 = 8
≡ 3·5 (mod 24); as 5^2 ≡ 1 (mod 24) ⟹ 5^6 ≡ 1 (mod 24)
⟹ x ≡ 15 (mod 24).
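Result 2.10.3 can be applied directly in code (an illustration, reusing φ(24) = 8 computed above):

```python
# Solve ax ≡ b (mod m) with (a, m) = 1 via x ≡ b·a^(φ(m)-1) (mod m).
a, b, m, phi_m = 5, 3, 24, 8
x = (b * pow(a, phi_m - 1, m)) % m
print(x, (a * x) % m)   # 15 3
```

The second printed value checks that 5·15 ≡ 3 (mod 24).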
Ex 2.10.6 Solve the linear congruence 25x ≡ 15 (mod 120).
Solution: Here d = (25, 120) = 5. As d | 15, the congruence has exactly five solutions modulo 120. To find them, we are to solve the linear congruence 5x ≡ 3 (mod 24). Thus the five solutions are given by
x ≡ 15 + 24k; k = 0, 1, 2, 3, 4
or, x ≡ 15, 39, 63, 87, 111 (mod 120).
Ex 2.10.7 If n > 7 is prime, prove that n^6 − 1 is divisible by 504.
Solution: Since 7 is a prime and n is prime to 7, by Fermat's theorem, n^6 − 1 is divisible by 7. By Euler's theorem, as n is prime to 9, n^φ(9) − 1 is divisible by 9. Now,
φ(9) = φ(3^2) = 9(1 − 1/3) = 6,
so n^6 − 1 is divisible by 9. Since n > 7 is an odd prime, n is of the form 4k + 1 or 4k + 3, where k (≥ 1) ∈ N. Also,
n^6 − 1 = (n − 1)(n + 1)(n^4 + n^2 + 1).
If n = 4k + 1, then (n − 1)(n + 1) = 4k(4k + 2), and
if n = 4k + 3, then (n − 1)(n + 1) = (4k + 2)(4k + 4).
Therefore, in either case n^6 − 1 is divisible by 8. Since the three consecutive integers 7, 8, 9 are pairwise prime to each other, 7·8·9 | n^6 − 1, i.e., 504 | n^6 − 1.


Ex 2.10.8 Use the Euler–Fermat theorem to find the unit digit of 3^100. [NET11]
Solution: Since 3 is prime to 10, by the Euler–Fermat theorem,
3^φ(10) ≡ 1 (mod 10), where φ(10) = 4,
or, 3^4 ≡ 1 (mod 10) ⟹ 3^100 = 3^(4·25) ≡ 1 (mod 10).
Thus the unit digit of 3^100 is 1.
Result 2.10.4 The sum of the divisors of a positive integer n is denoted by σ(n), i.e., σ(n) = Σ_(d|n) d, and it is an arithmetical function. For example, consider the positive integer 4. The divisors of 4 are 1, 2, 4. Therefore, σ(4) = 1 + 2 + 4 = 7. Similarly, σ(6) = 12, σ(10) = 18, σ(15) = 24. In general, if n = p1^α1 p2^α2 ⋯ pk^αk, then
σ(n) = Σ_(d|n) d = [(p1^(α1+1) − 1)/(p1 − 1)] · [(p2^(α2+1) − 1)/(p2 − 1)] ⋯ [(pk^(αk+1) − 1)/(pk − 1)]. (2.22)
Ex 2.10.9 Find the number of positive divisors of 50000. [NET12]
Solution: We have 50000 = 2^4·5^5. Thus the number of positive divisors of 50000 is
(4 + 1)(5 + 1) = 5·6 = 30.
Result 2.10.5 Consider a positive integer n and write S = {1, 2, …, n}. Define ∼ on S by
a ∼ b ⟺ (a, n) = (b, n).
Then ∼ is an equivalence relation. For a divisor d of n,
A(d) = {k ∈ S : (k, n) = d}
is an equivalence class, so S = ∪_(d|n) A(d). For example, let n = 6, S = {1, 2, …, 6}. The divisors of 6 are 1, 2, 3, 6. Now
A(1) = {1, 5}, A(2) = {2, 4}, A(3) = {3}, A(6) = {6}.
Note that these sets A(1), A(2), A(3), A(6) are disjoint and their union is S.

2.10.2 The Möbius Function

The Möbius function μ(n) is defined as
μ(n) = 1, if n = 1;
μ(n) = (−1)^r, if n = p1·p2 ⋯ pr (pi's are distinct primes);
μ(n) = 0, if a^2 | n for some a > 1, i.e., n has a square factor > 1.
The following is a table showing some values of μ(n):

n    : 1  2  3  4  5  6  7  8  9  10
μ(n) : 1 −1 −1  0 −1  1 −1  0  0   1

The Möbius function arises in many different places in number theory. One of its fundamental properties is a remarkably simple formula for the divisor sum Σ_(d|n) μ(d), extended over the positive divisors of n.


Theorem 2.10.5 If n ≥ 1, then
Σ_(d|n) μ(d) = ⌊1/n⌋ = 1, if n = 1; = 0, if n > 1.
Proof: The formula is clearly true if n = 1. Assume, then, n > 1; the proof is by induction on the number of distinct prime factors of n.
Case 1: Let n = p^α. Then
Σ_(d|p^α) μ(d) = μ(1) + μ(p) + μ(p^2) + ⋯ + μ(p^α)
= 1 + (−1) + 0 + ⋯ + 0 = 1 − 1 = 0.
Case 2: Assume that the result is true for integers with at most k distinct prime factors, and let n = ap^α, where a is an integer with k distinct prime factors and p ∤ a. Now,
Σ_(d|n) μ(d) = Σ_(d|a) μ(d) + Σ_(d|a) μ(pd) + Σ_(d|a) μ(p^2 d) + ⋯ + Σ_(d|a) μ(p^α d)
= Σ_(d|a) μ(d) + μ(p) Σ_(d|a) μ(d) (the terms involving p^2, …, p^α vanish)
= [1 + μ(p)] Σ_(d|a) μ(d) = 0.
Case 3: More directly, let n = p1^a1·p2^a2 ⋯ pr^ar > 1 be the standard factorization of n. In the sum Σ_(d|n) μ(d), the only nonzero terms come from d = 1 and from those divisors of n which are products of distinct primes. Hence
Σ_(d|n) μ(d) = μ(1) + Σ_(1≤i≤r) μ(pi) + Σ_(1≤i<j≤r) μ(pi pj) + ⋯ + μ(p1 ⋯ pr)
= μ(1) + μ(p1) + ⋯ + μ(pr) + μ(p1 p2) + ⋯ + μ(p(r−1) pr) + ⋯ + μ(p1 ⋯ pr)
= 1 + C(r, 1)(−1) + C(r, 2)(−1)^2 + ⋯ + (−1)^r
= (1 − 1)^r = 0.
Theorem 2.10.6 If F(n) = Σ_(d|n) f(d) for every positive integer n, then
f(n) = Σ_(d|n) μ(d) F(n/d).
Proof: By using the definition,
Σ_(d|n) μ(d) F(n/d) = Σ_(d|n) μ(d) Σ_(δ|n/d) f(δ) = Σ_(δ|n) f(δ) Σ_(d|n/δ) μ(d) = f(n),
since, by Theorem 2.10.5, the inner sum Σ_(d|n/δ) μ(d) vanishes except when n/δ = 1, i.e., δ = n.
This is called the Möbius inversion formula. If f(n) and g(n) are two arithmetic functions satisfying the condition f(n) = Σ_(d|n) g(d), then {f(n), g(n)} is called a Möbius pair. For example, {n, φ(n)} is a Möbius pair.
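The Möbius pair {n, φ(n)} can be checked numerically: since n = Σ_(d|n) φ(d), inversion gives φ(n) = Σ_(d|n) μ(d)·(n/d). A Python sketch (the helper names mu and phi_by_count are ours):

```python
from math import gcd

def mu(n):
    """Möbius function via trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:        # p^2 divides the original n
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def phi_by_count(n):
    """phi(n) by the defining count of k <= n with (k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

n = 36
lhs = sum(mu(d) * (n // d) for d in range(1, n + 1) if n % d == 0)
print(lhs, phi_by_count(n))   # 12 12
```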

2.10.3 Divisor Function

Let n be a positive integer. The divisor function τ : N → N, giving the number of positive divisors of n, denoted by τ(n), satisfies
τ(n) = 1, if n = 1;
τ(n) = 2, if n = p (a prime);
τ(n) > 2, if n is composite.
Note that {τ(n), 1} is a Möbius pair. Let a positive integer n (> 1) be expressed in canonical form as
n = p1^a1·p2^a2 ⋯ pr^ar, ai ≥ 1, for i = 1, 2, …, r,
where the pi are primes with p1 < p2 < ⋯ < pr. If m is a positive divisor of n, then m is of the form p1^u1·p2^u2 ⋯ pr^ur, where
0 ≤ u1 ≤ a1, 0 ≤ u2 ≤ a2, …, 0 ≤ ur ≤ ar.
Thus the positive divisors of n are in one-one correspondence with the totality of r-tuples (u1, u2, …, ur) satisfying the above inequalities. The number of such r-tuples is (a1 + 1)(a2 + 1) ⋯ (ar + 1). Hence the total number of positive divisors of n is
τ(n) = (a1 + 1)(a2 + 1) ⋯ (ar + 1).
The total number of positive divisors τ(n) includes both the divisors 1 and n. For example,
τ(4) = τ(2^2) = 2 + 1 = 3; τ(12) = τ(2^2·3) = (2 + 1)(1 + 1) = 6.
The sum of all positive divisors of a positive integer n is denoted by σ(n). Every positive divisor of n in the canonical form is a term in the product
(1 + p1 + ⋯ + p1^a1)(1 + p2 + ⋯ + p2^a2) ⋯ (1 + pr + ⋯ + pr^ar),
and conversely, each term in the product is a divisor of n. Thus the sum of all positive divisors of n = p1^a1·p2^a2 ⋯ pr^ar is
σ(n) = (1 + p1 + ⋯ + p1^a1)(1 + p2 + ⋯ + p2^a2) ⋯ (1 + pr + ⋯ + pr^ar)
= [(p1^(a1+1) − 1)/(p1 − 1)] · [(p2^(a2+1) − 1)/(p2 − 1)] ⋯ [(pr^(ar+1) − 1)/(pr − 1)],
with σ(1) = 1. The functions τ and σ are examples of number-theoretic functions. Both τ and σ are multiplicative functions, i.e., for (m, n) = 1,
τ(mn) = τ(m)τ(n) and σ(mn) = σ(m)σ(n).
A positive integer n is said to be a perfect number if σ(n) = 2n, i.e., if n is the sum of all its positive divisors excluding itself. For example, 6, 28, etc. are perfect numbers.
Ex 2.10.10 Find τ(360) and σ(360).
Solution: The number 360 can be written in canonical form as 360 = 2^3·3^2·5. Therefore,
τ(360) = (1 + 3)(1 + 2)(1 + 1) = 24,
σ(360) = [(2^4 − 1)/(2 − 1)] · [(3^3 − 1)/(3 − 1)] · [(5^2 − 1)/(5 − 1)] = 15·13·6 = 1170.
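A brute-force check of Ex 2.10.10 (illustration only):

```python
# Count and sum the divisors of 360 directly.
divisors = [d for d in range(1, 361) if 360 % d == 0]
print(len(divisors), sum(divisors))   # 24 1170
```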


Ex 2.10.11 The total number of positive divisors of a positive integer n is odd if and only if n is a perfect square.
Solution: Let a positive integer n (> 1) be expressed in canonical form as
n = p1^a1·p2^a2 ⋯ pr^ar, ai ≥ 1, for i = 1, 2, …, r,
where the pi are primes with p1 < p2 < ⋯ < pr. If n is a perfect square, then each of a1, a2, …, ar is an even integer, so each factor ai + 1 is odd and τ(n) is odd. If, however, n = 1, a perfect square, then τ(n) = 1 and it is odd.
Conversely, let τ(n) be odd. Then each of the factors a1 + 1, a2 + 1, …, ar + 1 must be odd. Consequently, each of a1, a2, …, ar must be even, and n is therefore a perfect square. This completes the proof.

2.10.4 Floor and Ceiling Functions

Let x be any real number. The floor function of x is denoted by ⌊x⌋ and is the greatest integer less than or equal to x. That is, ⌊·⌋ : R → Z, where ⌊x⌋ = greatest integer less than or equal to x. For example, ⌊8.25⌋ = 8, ⌊8.75⌋ = 8, ⌊−10.6⌋ = −11, ⌊8⌋ = 8, ⌊−3⌋ = −3, ⌊√26⌋ = 5, etc.
The ceiling function of x ∈ R is denoted by ⌈x⌉ and is the smallest integer greater than or equal to x. Thus ⌈·⌉ : R → Z, where ⌈x⌉ = least integer greater than or equal to x. For example, ⌈8.25⌉ = 9, ⌈8.75⌉ = 9, ⌈−4.6⌉ = −4, ⌈5⌉ = 5, ⌈−5⌉ = −5, etc.
Properties
1. ⌊x⌋ = n ⟺ n ≤ x < n + 1, where n is an integer.
2. ⌈x⌉ = n ⟺ n − 1 < x ≤ n, where n is an integer.
3. x − 1 < ⌊x⌋ ≤ x ≤ ⌈x⌉ < x + 1.
4. ⌊x + n⌋ = ⌊x⌋ + n, where n is an integer.
5. ⌈x + n⌉ = ⌈x⌉ + n, where n is an integer.
6. ⌊x⌋ + ⌊y⌋ ≤ ⌊x + y⌋, and the inequality may be strict when x, y ∉ Z.
7. ⌈x + y⌉ ≤ ⌈x⌉ + ⌈y⌉, and the inequality may be strict when x, y ∉ Z.
8. ⌈⌈x⌉/m⌉ = ⌈x/m⌉, where m is a positive integer.
9. ⌊⌊x⌋/m⌋ = ⌊x/m⌋, where m is a positive integer.

2.10.5 Mod Function

Let m be a positive integer. The (mod m) function is defined by fm(a) = b, where b is the remainder when a is divided by m. The relation fm(a) = b is also written a ≡ b (mod m), 0 ≤ b < m; thus fm(a) = b when b − a is divisible by m. The integer m is called the modulus, and a ≡ b (mod m) is read as "a is congruent to b modulo m". It can also be defined as: fm(a) is the unique integer r such that a = mq + r, 0 ≤ r < m, for some integer q. This function is also written as a (mod m). For example,
f7(35) = 0, as 7 divides 35 − 0, or 35 = 5·7 + 0;
f5(36) = 1, as 5 divides 36 − 1, or 36 = 5·7 + 1.
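For positive m, the (mod m) function is exactly Python's % operator. A trivial sketch (the function name f is ours):

```python
def f(m, a):
    """The (mod m) function: remainder of a on division by m, m > 0."""
    return a % m

print(f(7, 35), f(5, 36))   # 0 1
```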

Exercise 2
Section-A
[Multiple Choice Questions]
1. Fundamental theorem of arithmetic: every positive integer n > 1 can be expressed uniquely as a product of
(a) Primes (b) Positive integers (c) Perfect squares (d) None of the above.


2. The division algorithm states: let a and b be integers with b ≠ 0; then there exist integers q and r such that
(a) a − bq = r (b) a = bq − r (c) a = q − r + b (d) All of the above.
3. Suppose a, b and c are integers. Which of the following are true?
(a) If a | b and b | c, then a | c (b) If a | b and b | c, then a | (b + c) and a | (b − c) (c) If x > 0, then gcd(ax, bx) = x·gcd(a, b) (d) For any integer x, gcd(a, b) = gcd(a, b + ax).
4. gcd(540, 168) =
(a) 168 (b) 34 (c) 12 (d) None of the above.

5. Two integers a and b are said to be relatively prime or coprime if
(a) gcd(a, b) = a (b) gcd(a, b) = 1 (c) gcd(a, b) = ab (d) All of the above.
6. For the linear congruence equation ax ≡ b (mod m), where d = gcd(a, m), if d does not divide b then the equation has
(a) A unique solution (b) No solution (c) Two solutions (d) None of the above.
7. Consider the congruence equation 8x ≡ 12 (mod 28). Then
(a) the equation has no solution (b) it has the unique solution 5 (c) 5, 12, 19 and 26 are the four solutions (d) All of the above.
8. The solution of 235x ≡ 54 (mod 7) is
(a) x ≡ 12 (mod 7) (b) x ≡ 3 (mod 7) (c) x ≡ 5 (mod 7) (d) x ≡ 4 (mod 7)

9. By Fermat's theorem, the remainder of 8^103 when divided by 103 is
(a) 8 (b) 7 (c) 6 (d) 10
10. The unit digit of 2^100 is NET(June)11
(a) 2 (b) 4 (c) 6 (d) 8

11. The number of elements in the set {m : 1 ≤ m ≤ 1000, m and 1000 are relatively prime} is NET(June)11
(a) 100 (b) 250 (c) 300 (d) 400
12. The number of positive divisors of 50000 is NET(June)12
(a) 20 (b) 30 (c) 40 (d) 50

13. The last digit of (38)^1031 is NET(June)12
(a) 6 (b) 2 (c) 4 (d) 8
Section-B
[Objective Questions]

1. Show that every integer > 1 has a prime factor.


2. If n > 1 is an integer, show that there exists a prime integer p such that p divides n.
3. Explain the fundamental theorem of arithmetic by an example.
4. Let a, b, n be positive integers such that a ≡ b (mod n); show that (a, n) = (b, n).
5. Show that 3^(2n) ≡ 1 (mod 8), for all integers n ≥ 1.
6. Find all the integral solutions of the equation 5x + 4y = 9.
7. Explain Wilson's theorem for integers by an example.


8. Find φ(14), where φ is the Euler function.


9. Define the Möbius μ-function. Find μ(15).
10. Find the highest power of 7 contained in 1000!. Ans: 164
Section-C
[Long Answer Questions]
1. Prove the following by mathematical induction:
(a) 1.2 + 2.22 + 3.23 + 4.24 + + n.2n = (n 1)2n+1 + 2, n N .
(b) 1 + 2 + ⋯ + n = n(n + 1)/2, n ∈ N. KU (H):09

(c) 1·1! + 2·2! + ⋯ + n·n! = (n + 1)! − 1, n ≥ 1.


(d) 1 + 1/2^2 + 1/3^2 + ⋯ + 1/n^2 ≤ 2 − 1/n, n ≥ 1.
(e) 1/2 + 2/2^2 + 3/2^3 + ⋯ + n/2^n = 2 − (n + 2)/2^n, n ≥ 1.

2. Prove by mathematical induction that, for every n ∈ N,


(a) 33n+3 8n 7 is divisible by 49.
(b) 3^(2n+1) + 2^(n+2) is divisible by 7.
(c) 3^(2^n) − 1 is not exactly divisible by 2^(n+3).
(d) 2^(2n+1) − 9n^2 + 3n − 2 is divisible by 54.
(e) 7^(2n) + 16n − 1 is divisible by 64. JECA06
(f) 3^(2n) − 8n − 1 is divisible by 64. KU (H):07

3. Prove the following inequalities by induction on n ∈ N:
(a) 2^n < n! for all n ≥ 4.
(b) n! > 3^n for all integers n ≥ 7.
(c) n^2 < n! for all integers n ≥ 4.
4. Use the division algorithm to show that
(a) The product of any k consecutive integers is divisible by k!.
5. (a) Show that n^2 + 2 is not divisible by 4 for any integer n.
(b) If p is a prime greater than 3, show that 24 | p^2 − 1.
(c) If n is an integer not divisible by 2 or 3, then 32 | (n^2 + 3)(n^2 + 7).
(d) If n is an odd integer, then 24 | (n^2 + 3).
(e) Prove that (2n)! is divisible by n!(n + 1)!.
6. (a) Let a1, a2, …, an be any non-zero integers and d = (a1, a2, …, an). Then there exist m1, m2, …, mn ∈ Z such that d = a1 m1 + a2 m2 + ⋯ + an mn.
(b) Prove that for any two integers u and v, where v > 0, there exist two unique integers m and n such that u = mv + n, where 0 ≤ n < v.
(c) If (a, b) = [a, b] for two positive integers a, b, prove that a = b.
(d) If a > 1, prove that (a^m − 1, a^n − 1) = a^((m,n)) − 1.
7. Prove that √2, √3, √5 are irrational numbers. BH '97

8. Prove that the product of the first n Fermat numbers is 2^(2^n) − 1.


9. Prove that every square integer is of one of the forms 5m, 5m + 1, 5m − 1, where m is an integer.
10. Prove that the prime factor of n2 + 1 is of the form 4m + 1.
11. Find integers m, n such that
(a) (95, 102) = 95m + 102n.
(b) (723, 24) = 723m + 24n.
(c) (426, 246) = 426m + 246n.
12. If (a, b) = 1, show that
(a) (a + b, a b) = 1 or 2.
(b) (a + b, ab) = 1 and (a^2 + b^2, a^2 b^2) = 1.
13. If k be a positive integer, then (ka, kb) = k(a, b).
14. (a) Prove that n^12 - 1 is divisible by 7, if (n, 7) = 1.
(b) If p and p^2 + 8 are both prime numbers, prove that p = 3.
(c) If 2^n - 1 is prime, prove that n is prime.
(d) Prove that n^4 + 4^n is a composite number for every natural number n > 1.
15. Show that a natural number is divisible by 9 if and only if the sum of its digits is
divisible by 9.
16. Find the integer between 1 and 1000 which leaves the remainders 1, 2, 6 when divided by 9, 11, 13 respectively.
17. Solve the Diophantine equations:
(a) 56x + 72y = 40 :    x = 20 + 9t, y = -15 - 7t.
(b) 8x - 27y = 125 :    x = -1169 + 27t, y = -351 + 8t.
(c) 7x + 11y = 1 :      x = 8 - 11t, y = -5 + 7t.
(d) 68x - 157y = 1 :    x = -30 + 157t, y = -13 + 68t.
(e) 13x - 17y = 5 :     x = 20 - 17t, y = 15 - 13t.

18. The sum of two positive integers is 100. If one is divided by 7 the remainder is 1, and
if the other is divided by 9 the remainder is 7. Find the numbers. Ans: 57, 43.
19. For any natural number n show that
(a) (2n + 1)^2 ≡ 1 (mod 8).
(b) 4·6^n + 5^(n+1) ≡ 9 (mod 20).
20. Show that
(a) 2^41 ≡ 3 (mod 23).
(b) 3^15 ≡ 1 (mod 13).
21. Find all the natural numbers n ≤ 100 that satisfy
(i) n ≡ 10 (mod 7),
(ii) n ≡ 3 (mod 17),
(iii) n ≡ 10 (mod 17).


22. If a ≡ b (mod m) and x ≡ y (mod m), then prove that
(a) ap + xq ≡ bp + yq (mod m),
(b) ax ≡ by (mod m).
23. If a ≡ b (mod m), then prove that (a, m) = (b, m), i.e., congruent numbers have the same GCD with m.
24. Solve the linear congruences:
(a) 7x ≡ 3 (mod 15) :     Ans: x ≡ 9 (mod 15).
(b) 37x ≡ 7 (mod 127) :   Ans: x ≡ 86 (mod 127).
(c) 29x ≡ 1 (mod 13).
(d) 15x ≡ 9 (mod 18) :    Ans: x = 3 + 6t; t = 0, 1, 2.

25. A certain number of sixes and nines are added to give a sum of 126. If the number of sixes and nines are interchanged, the new sum is 114. How many sixes and nines were there originally?
26. Show that the solution of the system
x ≡ a (mod 21), x ≡ b (mod 16) is x ≡ 64a - 63b (mod 336).
27. Find the solution of the system with the help of the Chinese Remainder Theorem:
(a) x ≡ 5 (mod 4), x ≡ 3 (mod 7), x ≡ 2 (mod 9).
(b) x ≡ 3 (mod 6), x ≡ 5 (mod 8), x ≡ 2 (mod 11).
(c) x ≡ 1 (mod 3), x ≡ 2 (mod 5), x ≡ 3 (mod 7).    Ans: x ≡ 52 (mod 105).
28. Solve the systems of congruences:
(a) x ≡ 1 (mod 3), x ≡ 2 (mod 4), x ≡ 3 (mod 5).
(b) x ≡ 11 (mod 15), x ≡ 6 (mod 35).    Ans: x ≡ 41 (mod 105).
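For pairwise coprime moduli, the construction behind the Chinese Remainder Theorem can be written down directly; a sketch (the function name is ours, and pow(b, -1, m) computes a modular inverse in Python 3.8+):

```python
from math import prod

def crt(residues, moduli):
    # Solve x ≡ r_i (mod m_i) for pairwise coprime moduli m_i.
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)
    return x % M
```

For exercise 27(c), crt([1, 2, 3], [3, 5, 7]) returns 52, in agreement with the stated answer x ≡ 52 (mod 105).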

29. Use Fermat's theorem to prove that for two positive integers a, b, the number a^540 - b^540 is divisible by 541 if both a and b are prime to 541.
30. Use Fermat's theorem to prove that
(a) 1! + 2! + 3! + ... + 79! + 80! ≡ 1 (mod 80),
(b) 1^(p-1) + 2^(p-1) + 3^(p-1) + ... + (p-1)^(p-1) ≡ -1 (mod p),
(c) 1^p + 2^p + 3^p + ... + (p-1)^p ≡ 0 (mod p),
when p is an odd prime.
31. If p is an odd prime, then show that
2^2 · 4^2 · 6^2 ... (p-1)^2 ≡ (-1)^((p-1)/2) (mod p).
32. Show that 28! + 233 ≡ 0 (mod 899).
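Several of the congruences above can be confirmed directly with fast modular exponentiation (Python's three-argument pow); for instance:

```python
from math import factorial

# Exercise 20: 2^41 ≡ 3 (mod 23) and 3^15 ≡ 1 (mod 13)
assert pow(2, 41, 23) == 3
assert pow(3, 15, 13) == 1

# Exercise 30(b) for p = 13: each k^(p-1) ≡ 1 (mod p), so the sum is ≡ p - 1 ≡ -1
p = 13
assert sum(pow(k, p - 1, p) for k in range(1, p)) % p == p - 1

# Exercise 32: 28! + 233 ≡ 0 (mod 899), where 899 = 29 * 31
assert (factorial(28) + 233) % 899 == 0
```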

Chapter 3

Theory of Matrices

In this chapter, we investigate the concepts and properties of matrices and discuss some of the simple operations by which two or more matrices can be combined. Matrices are important in every field of science and engineering.

3.1 Matrix

A matrix is a collection of numbers ordered by rows and columns. It is customary to enclose the numbers of a matrix in brackets [ ] or parentheses ( ). For example, the following is a matrix:
        A = [3 5 3; 0 6 1].
This matrix has two rows and three columns and it is referred to as a 2 by 3 or 2 × 3 matrix. Let F be a field of scalars and let the elements aij (i = 1, 2, ..., m; j = 1, 2, ..., n), not necessarily distinct, belong to the field F. If we construct a rectangular array A of the mn quantities aij in m rows and n columns, then A is said to be a matrix of order or size m × n (read as m by n) over the field F, usually written in the form
        A = [ a11 a12 ... a1n ]
            [ a21 a22 ... a2n ]
            [ ...         ... ]                                       (3.1)
            [ am1 am2 ... amn ]
The mn quantities aij are called elements or constituents or coordinates or entries of the matrix. Frequently, the matrix is written simply as A = [aij] or [aij]m×n or (aij) or (aij)m×n, where aij is the ij-th entry, appearing in the i-th row and j-th column. The numbers a11, a22, ..., ann form the main or leading or principal diagonal. If the elements of the matrix A belong to the field R of real numbers, then A is called a real matrix.

3.1.1 Special Matrices

Row and column matrices

We draw attention to the fact that each row of an m × n matrix has n components, where n is the number of columns, and each column has m components, where m is the number of rows. A matrix having a single row (column) is called a row (column) matrix. The i-th row and j-th column of the matrix A are

        [ai1 ai2 ... ain], 1 ≤ i ≤ m,   and   [a1j a2j ... amj]^T, 1 ≤ j ≤ n,     (3.2)
respectively. For example, let us consider the real matrix A = [1 6 7; 2 4 3] of order (size) 2 × 3. It has two rows, [1 6 7] and [2 4 3], and three columns, [1 2]^T, [6 4]^T and [7 3]^T. A 1 × n or n × 1 matrix is also known as an n-vector. Row and column matrices are sometimes called row vectors and column vectors. A matrix having only one row is called a row matrix, while a matrix having only one column is called a column matrix.

3.1.2 Square Matrix

For an m × n matrix [aij]m×n, if m = n, i.e., the number of rows equals the number of columns, then the matrix is said to be a square matrix. An n × n square matrix is said to be of order n and is sometimes known as an n-square matrix. The elements a11, a22, ..., ann are known as the diagonal elements of A. For example, [1 6 7; 2 4 3; 4 3 6] is a square matrix of order 3, with 1, 4, 6 as the leading diagonal.
Null matrix

If the entries of a matrix are all zero, i.e., aij = 0 for all pairs of i and j, then the matrix A = [aij]m×n is said to be a null or zero matrix of order m × n and is denoted by 0m×n. For example, 0 = [0 0 0; 0 0 0] is an example of a 2 × 3 null matrix. If any one of the aij's is non-zero, then A is said to be a non-zero matrix.

Diagonal matrix

A square matrix A with all non-diagonal elements zero is called a diagonal matrix. For example, [1 0; 0 4], [8 0; 0 0] and [0 0; 0 0] are examples of diagonal matrices. So, for a diagonal matrix A = [aij], aij = 0 for i ≠ j, and it is denoted by A = diag(d11, d22, ..., dnn). If in a diagonal matrix all the diagonal elements are equal, then the diagonal matrix is called a scalar or constant matrix. Thus for a scalar matrix A = [aij], we have
        aij = k, for i = j;
            = 0, for i ≠ j,
and it is denoted by [k]. For example, the diagonal matrix [2 0; 0 2] is a scalar matrix.
Ex 3.1.1 If a matrix B commutes with a diagonal matrix, no two diagonal elements of which are equal to each other, show that B must be a diagonal matrix.
Solution: Let A be a diagonal matrix of order n whose elements are
        aij = ai δij; 1 ≤ i, j ≤ n,


where the ai are scalars such that ai ≠ aj if i ≠ j, and δij is the Kronecker delta. Let the ij-th element of B be bij. Given that AB = BA, so taking the ij-th elements of both sides, we have
        Σ(p=1..n) aip bpj = Σ(p=1..n) bip apj
or,     Σ(p=1..n) ai δip bpj = Σ(p=1..n) bip aj δpj
or,     ai bij = bij aj, i.e., (ai - aj) bij = 0.
This shows that, if i ≠ j, then bij = 0. The only elements of B which can be different from zero are the diagonal elements bii for 1 ≤ i ≤ n, proving that B is a diagonal matrix.
Identity matrix

If in a scalar matrix all the diagonal elements are unity, then it is called the identity matrix or unit matrix. The n-th order identity matrix is denoted by In and is written as
        In = [1 0 ... 0; 0 1 ... 0; ... ; 0 0 ... 1].                 (3.3)
The identity matrix can be written as I = [δij], where δij is the Kronecker delta, defined by δij = 0 if i ≠ j and δij = 1 if i = j. We shall denote the i-th column of I by ei. Thus ei has 1 in the i-th position and 0's elsewhere. A permutation matrix is a square matrix with entries 0's and 1's such that each row and each column contains exactly one 1.
Triangular matrix

If in a square matrix A = [aij] all the elements below the diagonal are zero, i.e., aij = 0 for i > j, then the matrix is said to be upper triangular, and unit upper triangular if in addition aii = 1 for all i. For example, [8 4 9; 0 4 7; 0 0 6] is an upper triangular matrix.
If in a square matrix A = [aij] all the elements above the diagonal are zero, i.e., aij = 0 for i < j, then the matrix is said to be lower triangular, and unit lower triangular if in addition aii = 1 for all i. For example, [8 0 0; 2 4 0; 1 3 6] is a lower triangular matrix. A square matrix A = [aij] is said to be a triangular matrix if it is either upper triangular or lower triangular. In a diagonal matrix the non-diagonal elements are all zero, so a diagonal matrix is both upper and lower triangular.
A matrix is said to be upper Hessenberg if aij = 0 when i > j + 1, and lower Hessenberg if aij = 0 when i < j - 1.
Ex 3.1.2 Find an upper triangular matrix A such that A^3 = [8 -57; 0 27].

Solution: Let the required upper triangular matrix be A = [a b; 0 c]. Then,

        A^2 = [a b; 0 c][a b; 0 c] = [a^2, ab + bc; 0, c^2],
        A^3 = A^2 A = [a^2, ab + bc; 0, c^2][a b; 0 c] = [a^3, a^2 b + abc + bc^2; 0, c^3],
so that a^3 = 8; c^3 = 27; a^2 b + abc + bc^2 = -57.
Hence a = 2, c = 3 and 19b = -57, i.e., b = -3, so that A = [2 -3; 0 3].
Trace of a matrix

The spur or trace of a square matrix A = [aij]n×n is the sum of the diagonal elements:
        tr A = a11 + a22 + ... + ann = Σ(i=1..n) aii.                 (3.4)
For example, the traces of the above matrices are 5 + 4 = 9 and 1 + 4 + 6 = 11 respectively. If A is an m × n real matrix, then tr(AA^T) ≥ 0, with equality if and only if A is a null matrix. If A and B are square matrices of the same order, then
(i) tr A + tr B = tr(A + B),
(ii) tr A^T = tr A,
(iii) tr(BA) = tr(AB).
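Properties (i) and (iii) are easy to check numerically; a small sketch with illustrative matrices (the helper names are ours):

```python
def mat_mul(A, B):
    # (i, j) entry of AB is the sum of a_ik * b_kj over k.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
S = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

assert trace(A) + trace(B) == trace(S)               # property (i)
assert trace(mat_mul(A, B)) == trace(mat_mul(B, A))  # property (iii)
```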
For an m × n matrix with m ≠ n, i.e., the number of rows not equal to the number of columns, the matrix is said to be a rectangular matrix. For example, A = [1 6 7; 2 4 3] is a rectangular matrix of order 2 × 3.
Ex 3.1.3 If A and B are any two 2 × 2 matrices, show that AB - BA = I2 cannot hold under any circumstances.
Solution: If possible, let AB - BA = I2. Then
        tr(AB - BA) = tr(I2)
or,     tr(AB) - tr(BA) = tr(I2) = 1 + 1 = 2.
But tr(AB) = tr(BA), hence this cannot hold. Therefore, AB - BA = I2 cannot hold under any circumstances.
Band matrix

A real matrix A = [aij]m×n is said to be a band matrix with bandwidth k if
        aij = 0 for |i - j| > k.                                      (3.5)
If k = 1, then the matrix is called tridiagonal, and if k = 0, then it is called diagonal. A square matrix is called diagonally dominant if
        |aii| ≥ Σ(j=1..n, j≠i) |aij|;  i = 1, 2, ..., n.              (3.6)
Equations containing a diagonal matrix can be easily solved, and hence some algorithms for the solution of linear equations actually try to transform the original matrix to an equivalent diagonal form.


Filled and sparse matrix

If most elements of a matrix are nonzero, then it is said to be filled, while if most of the elements are zero, then it is said to be sparse.

3.2 Matrix Operations

In this section, we define the algebraic operations on matrices that produce new matrices out of given matrices. These operations are useful in applications of matrices.

3.2.1 Equality of matrices

Two matrices A = [aij]m×n and B = [bij]m×n are said to be equal iff they are of the same order and each element of A is equal to the corresponding element of B, i.e., aij = bij for all i and j. For example, A = [2^3 5^2; 3^2 4^3] and B = [8 25; 9 64] are equal matrices. Two matrices are said to be comparable if they are of the same type.

Ex 3.2.1 Find the values of x, y, z and u which satisfy the matrix equation
        [x + 3, 2y + x; z - 1, 4u - 6] = [0, -7; 3, 2u].
Solution: Since the matrices are equal, x + 3 = 0, 2y + x = -7, z - 1 = 3, 4u - 6 = 2u. The solution of these equations is x = -3, y = -2, z = 4 and u = 3. Hence the required values of x, y, z, u are -3, -2, 4, 3 respectively.

3.2.2 Matrix Addition

For the addition of two matrices, the matrices must be of the same order. Let A = [aij]m×n and B = [bij]m×n be two given matrices of the same order. The sum of A and B, denoted by A + B, is obtained by adding the corresponding elements of A and B: if
        A + B = C = [cij]m×n,
then the elements of C are given by
        cij = aij + bij; 1 ≤ i ≤ m, 1 ≤ j ≤ n.
Let A = [5 6; 7 8], B = [3 1; 2 0] and C = [2 1 3; 9 6 1] be three matrices of order 2 × 2, 2 × 2 and 2 × 3 respectively. As A and B are of the same order, A + B is defined and
        A + B = [5 6; 7 8] + [3 1; 2 0] = [5+3, 6+1; 7+2, 8+0] = [8 7; 9 8].
Since the two matrices A and C are not of the same order, they are not conformable for addition, i.e., A + C, and hence B + C, are not defined. Matrix subtraction works in the same way, except that the elements are subtracted instead of added. For example, if
        A = [a1 b1 c1; a2 b2 c2] and B = [x1 y1 z1; x2 y2 z2], then
        A - B = [a1 - x1, b1 - y1, c1 - z1; a2 - x2, b2 - y2, c2 - z2].

3.2.3 Matrix Multiplication

Multiplication of matrices by a scalar

If A is a matrix [aij]m×n and k is a scalar quantity, then the product kA or Ak is the matrix [bij]m×n where bij = k aij. Thus, if k ∈ F and A = [aij]m×n is any matrix, then
        P = kA = [pij]m×n; where pij = k aij; 1 ≤ i ≤ m, 1 ≤ j ≤ n.
Therefore, we see that for scalar multiplication each element of P is obtained by multiplying the corresponding element of A by k. The negative of A is obtained by multiplying A scalarly by (-1). The difference between two matrices A and B of the same order m × n is defined as A - B = A + (-1)B.
For example, let A = [5 6; 7 8] and B = [3 1; 2 0] be two matrices of order 2 × 2. The scalar multiple of A by 3 and the difference A - B are given by
        3A = 3[5 6; 7 8] = [3·5, 3·6; 3·7, 3·8] = [15 18; 21 24],
        A - B = [5 6; 7 8] - [3 1; 2 0] = [5-3, 6-1; 7-2, 8-0] = [2 5; 5 8].
Two m × n matrices A and B are equal if (A - B) equals the null matrix. Let A, B be two matrices such that A + B and AB are defined. Then the following properties are satisfied:
(i) kA = Ak,
(ii) k(A + B) = kA + kB; k ∈ F,
(iii) (k + l)A = kA + lA; k, l ∈ F,
(iv) A(kB) = k(AB) = (kA)B.
Thus, the scalar multiplication of matrices is commutative, associative and distributive. If A1, A2, ..., Ak are m × n matrices and c1, c2, ..., ck are scalars, then an expression of the form
        c1 A1 + c2 A2 + ... + ck Ak
is called a linear combination of A1, A2, ..., Ak, and c1, c2, ..., ck are called its coefficients.
Theorem 3.2.1 Matrix addition is commutative as well as associative.
Proof: Let A = [aij]m×n, B = [bij]m×n and C = [cij]m×n be three matrices of the same order, so that A + B, B + C, A + C, B + A, C + A, C + B are all defined. Let
        X = A + B = [xij]m×n, where xij = aij + bij;
        Y = B + A = [yij]m×n, where yij = bij + aij.
Here X and Y are of the same order, and
        xij = aij + bij = bij + aij, as aij, bij ∈ F,
            = yij, for all 1 ≤ i ≤ m and 1 ≤ j ≤ n,
or, A + B = B + A.

Hence matrix addition is commutative. Now,
        (A + B) + C = [xij]m×n + [cij]m×n = [rij]m×n, where rij = xij + cij = aij + bij + cij;
        A + (B + C) = [aij]m×n + [sij]m×n = [tij]m×n, where tij = aij + sij = aij + bij + cij.
Since rij = tij for every pair of i and j, we have (A + B) + C = A + (B + C). Therefore, matrix addition is associative.
Since matrix addition is associative, we can define A + B + C as the matrix A + (B + C), which is the same as (A + B) + C. More generally,
        Σ(i=1..n) Ai = A1 + A2 + ... + An.

Multiplication of a matrix by another matrix

If the number of columns of a matrix A is equal to the number of rows of another matrix B, then the matrices A and B are said to be conformable for the product AB, and the product AB is said to be defined. The number of rows and the number of columns of C = AB are equal to the number of rows of A and the number of columns of B, respectively. Let
        A = [a1 b1 c1; a2 b2 c2] and B = [x1 y1; x2 y2; x3 y3]; then
        AB = [a1x1 + b1x2 + c1x3, a1y1 + b1y2 + c1y3; a2x1 + b2x2 + c2x3, a2y1 + b2y2 + c2y3].
For the product of two matrices A and B, the number of columns of the matrix A must be equal to the number of rows of the matrix B; otherwise it is impossible to form the product of A and B. Let A = [aij]m×p and B = [bij]p×n be two matrices. Here A, B are conformable for the product AB. The ij-th element is obtained by multiplying the i-th row of A by the j-th column of B. Hence,
        [ a11 a12 ... a1p ][ b11 b12 ... b1n ]   [ c11 c12 ... c1n ]
        [ a21 a22 ... a2p ][ b21 b22 ... b2n ] = [ c21 c22 ... c2n ]
        [ ...             ][ ...             ]   [ ...             ]
        [ am1 am2 ... amp ][ bp1 bp2 ... bpn ]   [ cm1 cm2 ... cmn ]
In the product, the matrix A is called the pre-factor and B is called the post-factor. Clearly, AB is the m × n matrix whose ij-th element is
        cij = ai1 b1j + ai2 b2j + ... + aip bpj = Σ(k=1..p) aik bkj.  (3.7)
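Formula (3.7) translates directly into code; a minimal sketch (the function name is ours):

```python
def mat_mul(A, B):
    # A is m x p, B is p x n; c_ij = sum over k of a_ik * b_kj, as in (3.7).
    m, p, n = len(A), len(B), len(B[0])
    assert all(len(row) == p for row in A), "columns of A must equal rows of B"
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(n)]
            for i in range(m)]
```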

In the product AB, we say that B is pre-multiplied by A and that A is post-multiplied by B. In order that both AB and BA exist, if A is of order m × n, then B must be of order n × m. In general, matrix multiplication is not commutative. The difference between the two matrices AB and BA is known as the commutator of A and B and is denoted by
        [A, B] = AB - BA.                                             (3.8)


It should be clear that [B, A] = -[A, B]. If, in particular, AB is equal to BA, the two matrices A and B are said to commute with each other. AB and BA can be equal only when both A and B are square matrices of the same order. The anticommutator of the matrices A and B, denoted by {A, B}, is defined by
        {A, B} = AB + BA.                                             (3.9)

Ex 3.2.2 Consider the matrices A = [1 5; 3 -2] and B = [2 3 -1; 4 0 5]. Here A is of order 2 × 2 and B is of order 2 × 3. So the product AB is defined and
        AB = [1·2 + 5·4, 1·3 + 5·0, 1·(-1) + 5·5; 3·2 + (-2)·4, 3·3 + (-2)·0, 3·(-1) + (-2)·5]
           = [22, 3, 24; -2, 9, -13],
which is of order 2 × 3. Notice that BA is not defined here.




Ex 3.2.3 Consider the matrices A = [1 -5; -3 -2] and B = [2 1; 4 -6]. Then
        AB = [-18, 31; -14, 9],   BA = [-1, -12; 22, -8].
Hence AB and BA are both defined but AB ≠ BA. The commutator of A and B is
        [A, B] = AB - BA = [-18, 31; -14, 9] - [-1, -12; 22, -8] = [-17, 43; -36, 17].
The anticommutator of A and B is
        {A, B} = AB + BA = [-18, 31; -14, 9] + [-1, -12; 22, -8] = [-19, 19; 8, 1].
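The commutator and anticommutator of (3.8) and (3.9) can be computed for any pair of square matrices; a sketch with matrices of our own choosing (helper names are ours):

```python
def mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def commutator(A, B):
    AB, BA = mul(A, B), mul(B, A)
    return [[x - y for x, y in zip(r, s)] for r, s in zip(AB, BA)]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = commutator(A, B)

# Since tr(AB) = tr(BA), the commutator always has trace zero.
assert C[0][0] + C[1][1] == 0
```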




Ex 3.2.4 Consider the matrices P = [2 3; 3 5] and Q = [1 0; 0 1]. Here
        PQ = [2 3; 3 5][1 0; 0 1] = [2 3; 3 5] = QP.
So we can conclude that if A is an m × p matrix and B is a p × n matrix, then AB is an m × n matrix. BA is not defined if m ≠ n. If m = n but p ≠ m, then AB (of order m × m) and BA (of order p × p) are of different sizes. Even if both AB and BA are defined, they may not be of the same order and hence may not be equal; and even if AB and BA are defined and of the same order, they may not be equal.
Result 3.2.1 In ordinary algebra, we know that
        ab = 0 implies either a = 0 or b = 0.
But in matrix theory, AB = 0 does not necessarily imply that either A = 0 or B = 0. For example, let
        A = [1 2; 2 4], B = [6 -4; -3 2]; then
        AB = [1 2; 2 4][6 -4; -3 2] = [0 0; 0 0].
In this case A is called a left divisor of zero and B is called a right divisor of zero.
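The zero-divisor pair above (with the signs as we read them) is easy to confirm:

```python
def mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2], [2, 4]]
B = [[6, -4], [-3, 2]]

# A and B are both non-zero, yet their product AB is the null matrix.
assert mul(A, B) == [[0, 0], [0, 0]]
```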


Result 3.2.2 In ordinary algebra, we know that
        ab = ac implies either a = 0 or b = c.
But in matrix theory, AB = AC does not necessarily imply that either A = 0 or B = C. For example, let
        A = [1 2; 2 4], B = [4 2 6; 3 6 9] and C = [0 6 8; 5 4 8];
then
        AB = [10 14 24; 20 28 48] = AC,
but neither A = 0 nor B = C.
Result 3.2.3 Let us consider matrix multiplication with special structures.
(i) The multiplication of [1; 3] and [2 4] gives
        [1; 3][2 4] = [2 4; 6 12].
(ii) Let us consider A^T = [1 0 2] and B = [-2; 3; 5]; then A^T B = [8].
(iii) Let C^T = [1 0 3 4]; then
        B C^T = [-2; 3; 5][1 0 3 4] = [-2 0 -6 -8; 3 0 9 12; 5 0 15 20].

Ex 3.2.5 If A = [3 -4; 1 -1], prove that A^n = [1 + 2n, -4n; n, 1 - 2n], where n is a positive integer.

Solution: We shall prove this by using the principle of mathematical induction. Now,
        A^2 = A·A = [3 -4; 1 -1][3 -4; 1 -1]
            = [3·3 + (-4)·1, 3·(-4) + (-4)·(-1); 1·3 + (-1)·1, 1·(-4) + (-1)·(-1)]
            = [5, -8; 2, -3] = [1 + 2·2, -4·2; 2, 1 - 2·2],
        A^3 = A^2·A = [5 -8; 2 -3][3 -4; 1 -1] = [7, -12; 3, -5] = [1 + 2·3, -4·3; 3, 1 - 2·3].
Thus the result is true for n = 1, 2, 3. Let the result be true for n = k; then
        A^(k+1) = A^k·A = [1 + 2k, -4k; k, 1 - 2k][3 -4; 1 -1]
                = [3 + 2k, -4 - 4k; k + 1, -1 - 2k]
                = [1 + 2(k + 1), -4(k + 1); k + 1, 1 - 2(k + 1)].
Thus the result is true for n = k + 1 if it is true for n = k; but it is true for n = 1, 2, 3. Thus, by the principle of mathematical induction, A^n = [1 + 2n, -4n; n, 1 - 2n].
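The closed form proved above can be cross-checked against repeated multiplication; a sketch (helper names are ours):

```python
def mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_pow(A, n):
    # n-th power by repeated multiplication, for n >= 1.
    R = A
    for _ in range(n - 1):
        R = mul(R, A)
    return R

A = [[3, -4], [1, -1]]
for n in range(1, 12):
    assert mat_pow(A, n) == [[1 + 2 * n, -4 * n], [n, 1 - 2 * n]]
```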
Theorem 3.2.2 Matrix multiplication is associative.


Proof: Let A = [aij]m×n, B = [bjk]n×p and C = [ckl]p×q be three matrices such that the products A(BC) and (AB)C are defined. We are to show that A(BC) = (AB)C. Now,
        AB = [dik]m×p, where dik = Σ(j=1..n) aij bjk;
        BC = [ejl]n×q, where ejl = Σ(k=1..p) bjk ckl.
Now, (AB)C = [uil]m×q, where
        uil = Σ(k=1..p) dik ckl = Σ(k=1..p) Σ(j=1..n) aij bjk ckl,
and A(BC) = [vil]m×q, where
        vil = Σ(j=1..n) aij ejl = Σ(j=1..n) Σ(k=1..p) aij bjk ckl.
Since the sums are equal, i.e., corresponding elements in (AB)C and A(BC) are equal, we have A(BC) = (AB)C.
Theorem 3.2.3 Matrix multiplication is distributive over addition, i.e., if A, B, C are three matrices such that A(B + C), AB, AC and (A + B)C are defined, then
(i) A(B + C) = AB + AC (left distributivity),
(ii) (A + B)C = AC + BC (right distributivity).
Proof: Let A = [aij]m×n, B = [bjk]n×p and C = [cjk]n×p be matrices such that A(B + C), AB and AC are defined. Now,
        B + C = [djk]n×p, where djk = bjk + cjk.
Let A(B + C) = [eik]m×p; then
        eik = Σ(j=1..n) aij djk = Σ(j=1..n) aij (bjk + cjk)
            = Σ(j=1..n) aij bjk + Σ(j=1..n) aij cjk.
Let AB = [fik]m×p and AC = [gik]m×p; then fik = Σ(j=1..n) aij bjk and gik = Σ(j=1..n) aij cjk. If AB + AC = [hik]m×p, then
        hik = fik + gik = Σ(j=1..n) aij bjk + Σ(j=1..n) aij cjk.
As the corresponding elements of A(B + C) and AB + AC are equal, we conclude that A(B + C) = AB + AC. Similarly, the right distributive law (A + B)C = AC + BC follows. Also, if k is a scalar, then k(AB) = (kA)B = A(kB). Using the distributive laws, we can prove that
        (Σ(i=1..k) Ai)(Σ(j=1..l) Bj) = Σ(i=1..k) Σ(j=1..l) Ai Bj,
where the summation on the RHS can be taken in any order.


Definition 3.2.1 Let A be a square matrix. For any positive integer m, A^m is defined as
        A^m = A·A ... A (m times).
When A is an n × n non-null matrix, we define A^0 = In, in analogy with real numbers. Using the laws of matrix multiplication, it is easy to see that for a square matrix A,
        A^m A^n = A^(m+n) and (A^m)^n = A^(mn)
for non-negative integers m and n. It is important to note that (AB)^n ≠ A^n B^n in general; the equality holds when AB = BA. Also, it follows that A^m A^n = A^n A^m, i.e., the powers of A commute.

Ex 3.2.6 If [4; 1; 3] A = [4 8 4; 1 2 1; 3 6 3], then find A.
Solution: Let the given equation be of the form XA = B. Since the size of the matrix X is 3 × 1 and that of the matrix B is 3 × 3, the size of the matrix A should be 1 × 3. Hence we can take A = [a b c]. Now, from the given relation we have
        [4; 1; 3][a b c] = [4 8 4; 1 2 1; 3 6 3]
or,     [4a 4b 4c; a b c; 3a 3b 3c] = [4 8 4; 1 2 1; 3 6 3].
Equating both sides we get 4a = 4, 4b = 8, 4c = 4; a = 1, b = 2, c = 1; 3a = 3, 3b = 6, 3c = 3. Therefore, a = 1, b = 2, c = 1. Hence the required matrix A is [1 2 1].




Ex 3.2.7 If n ∈ N and A = [cos θ, sin θ; -sin θ, cos θ], then show that A^n = [cos nθ, sin nθ; -sin nθ, cos nθ].
Solution: Here we use the principle of mathematical induction. Now,
        A^2 = [cos θ, sin θ; -sin θ, cos θ][cos θ, sin θ; -sin θ, cos θ]
            = [cos^2 θ - sin^2 θ, 2 sin θ cos θ; -2 sin θ cos θ, cos^2 θ - sin^2 θ]
            = [cos 2θ, sin 2θ; -sin 2θ, cos 2θ].
Thus the result is true for n = 2. Let the result be true for n = k. Now,
        A^(k+1) = A^k A = [cos kθ, sin kθ; -sin kθ, cos kθ][cos θ, sin θ; -sin θ, cos θ]
                = [cos kθ cos θ - sin kθ sin θ, cos kθ sin θ + sin kθ cos θ;
                   -(sin kθ cos θ + cos kθ sin θ), cos kθ cos θ - sin kθ sin θ]
                = [cos(k + 1)θ, sin(k + 1)θ; -sin(k + 1)θ, cos(k + 1)θ].
Therefore, the result is true for n = k + 1 if it is true for n = k. But the result is true for n = 2; hence it is true for n = 3, 4, .... Thus, by mathematical induction, the result is true for every positive integer n.

3.2.4 Transpose of a Matrix

Let A = [aij]m×n be a given matrix. The n × m matrix obtained by interchanging the rows and columns of A, namely A^T = [aji]n×m, is said to be the transpose of the matrix A. For example, let
        A = [2 3 6; 3 5 7] and B = [2 1 5];
then
        A^T = [2 3; 3 5; 6 7] and B^T = [2; 1; 5].
The transpose of the transpose of a matrix is the given matrix itself, i.e., (A^T)^T = A.
Theorem 3.2.4 If A and B are two matrices such that A + B is defined, then (A + B)^T = A^T + B^T.
Proof: Let A = [aij]m×n and B = [bij]m×n be two given matrices such that A + B is defined. Also, let A + B = [cij]m×n, where cij = aij + bij. Now,
        ij-th element of (A + B)^T = ji-th element of (A + B)
        = ji-th element of [cij]m×n = aji + bji
        = ij-th element of A^T + ij-th element of B^T
        = ij-th element of (A^T + B^T).
Also, the order of (A + B)^T is n × m and the order of A^T + B^T is n × m. Hence (A + B)^T = A^T + B^T.
If k is a scalar, then (kA)^T = kA^T. If A, B are two matrices of the same order, then
        (rA + sB)^T = rA^T + sB^T,
where r, s are scalars.
Theorem 3.2.5 If A, B are two matrices of appropriate sizes, then (AB)^T = B^T A^T.
Proof: Let A = [aij]m×n and B = [bjk]n×p be two given matrices such that AB is defined, with order m × p. Also, the order of A^T is n × m and the order of B^T is p × n, so that the order of B^T A^T is p × m. Therefore,
        order of (AB)^T = order of B^T A^T.
Now, the ik-th element of AB is obtained by multiplying the i-th row of A with the k-th column of B, which is
        ai1 b1k + ai2 b2k + ... + ain bnk,
and the ik-th element of AB is the ki-th element of (AB)^T. Also, the k-th column of B becomes the k-th row of B^T and the i-th row of A becomes the i-th column of A^T. Now,
        ki-th element of B^T A^T = [b1k b2k ... bnk][ai1 ai2 ... ain]^T
        = b1k ai1 + b2k ai2 + ... + bnk ain
        = ai1 b1k + ai2 b2k + ... + ain bnk
        = ki-th element of (AB)^T.

Therefore, (AB)^T = B^T A^T, i.e., the transpose of the product of two matrices is equal to the product of their transposes taken in reverse order. This statement can be extended to several matrices as
        (AB ... KL)^T = L^T K^T ... B^T A^T,
which can be proved by induction. From this result it follows that if A is a square matrix, then
        (A^n)^T = (A^T)^n,  n ∈ N.
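Theorem 3.2.5 is also easy to check numerically for a rectangular pair; a sketch (helper names are ours):

```python
def mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(row) for row in zip(*A)]

A = [[1, 0, 2], [3, 1, 4]]    # 2 x 3
B = [[2, 1], [0, 5], [1, 3]]  # 3 x 2

# (AB)^T equals B^T A^T
assert transpose(mul(A, B)) == mul(transpose(B), transpose(A))
```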

Ex 3.2.8 Find the matrices A and B such that 2A + 3B = I2 and A + B = 2A^T.

Solution: From the second equation we have B = 2A^T - A. Substituting in the first equation,
        2A + 3B = I2, i.e., 2A + 6A^T - 3A = I2,
        i.e., -A + 6A^T = I2, and, taking transposes, -A^T + 6A = I2.
Solving the equations -A + 6A^T = I2 and -A^T + 6A = I2, we get A = (1/5)I2. Using the relation B = 2A^T - A, we get B = (1/5)I2.
Ex 3.2.9 Find the matrices A and B such that
        2A + B^T = [2 5; 10 2] and A^T + 2B = [1 8; 4 1].
Solution: Taking the transpose of the first equation we get
        2A^T + B = [2 10; 5 2]. Also, A^T + 2B = [1 8; 4 1],
so, multiplying the second equation by 2 and subtracting the transposed first equation,
        3B = 2[1 8; 4 1] - [2 10; 5 2] = [0 6; 3 0], i.e., B = [0 2; 1 0].
From the first given equation we get
        A = (1/2)([2 5; 10 2] - B^T) = (1/2)([2 5; 10 2] - [0 1; 2 0]) = (1/2)[2 4; 8 2] = [1 2; 4 1].

3.3 Few Matrices

3.3.1 Nilpotent Matrix

If, for a least positive integer r,
        A^r = 0, the null matrix,                                     (3.10)
then the non-null matrix A is said to be a nilpotent matrix; the least such r is called its index. For example, let
        A = [2 -4; 1 -2]; then A^2 = [0 0; 0 0].
Therefore, A is a nilpotent matrix of index 2.

Ex 3.3.1 Show that A = [ab, b^2; -a^2, -ab] is a nilpotent matrix of index 2.


Solution: We are to show that A^2 = 0. Now,
        A^2 = [ab, b^2; -a^2, -ab][ab, b^2; -a^2, -ab] = [0 0; 0 0].
Therefore, the given matrix A is a nilpotent matrix of index 2.
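A nilpotency index can also be found by direct computation; a sketch (the function name is ours) applied to the family of Ex 3.3.1 with a = 2, b = 3:

```python
def mul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

def nilpotency_index(A, max_power=10):
    # Smallest r with A^r = 0, or None if none is found up to max_power.
    n = len(A)
    zero = [[0] * n for _ in range(n)]
    P = A
    for r in range(1, max_power + 1):
        if P == zero:
            return r
        P = mul(P, A)
    return None

a, b = 2, 3
A = [[a * b, b * b], [-a * a, -a * b]]
assert nilpotency_index(A) == 2
```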


Ex 3.3.2 Find all non-null real matrices [a b; c d] that are nilpotent of index 2.
Solution: Let A = [a b; c d] be a non-null real matrix such that A^2 = 0. Therefore,
        [a b; c d][a b; c d] = [a^2 + bc, ab + bd; ac + cd, bc + d^2] = [0 0; 0 0],
so that a^2 + bc = 0, b(a + d) = 0, c(a + d) = 0, bc + d^2 = 0. If a + d ≠ 0, then b = c = 0, whence a = d = 0, a contradiction; so d = -a and a^2 + bc = 0. Therefore, the non-null matrices are given by
        [a, b; c, -a]; a^2 + bc = 0; a, b, c ∈ R.

3.3.2 Idempotent Matrix

A matrix A is said to be an idempotent matrix if A^2 = A. For example, let
        A = [2 -3 -5; -1 4 5; 1 -3 -4]; then A^2 = A.
Therefore, A is an idempotent matrix. The identity matrix is idempotent, as I^2 = I.

Ex 3.3.3 If A is an idempotent matrix of order n, show that In - A is also idempotent.
Solution: Since A is an idempotent matrix, by definition A^2 = A. Now,
        (In - A)^2 = (In - A)(In - A) = In^2 - In·A - A·In + A^2
                   = In - 2A + A^2 = In - 2A + A = In - A.
Hence, if A is an idempotent matrix, the matrix In - A is so.
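Both claims, that the standard idempotent example [2 -3 -5; -1 4 5; 1 -3 -4] (the signs as we read the example above) satisfies A^2 = A, and that I - A then does too, can be verified directly:

```python
def mul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

A = [[2, -3, -5], [-1, 4, 5], [1, -3, -4]]
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
IA = [[I[i][j] - A[i][j] for j in range(3)] for i in range(3)]

assert mul(A, A) == A      # A^2 = A
assert mul(IA, IA) == IA   # (I - A)^2 = I - A
```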
Ex 3.3.4 If A and B are two matrices such that AB = A and BA = B, then show that A^T, B^T and A, B are idempotent.
Solution: From the given first relation, AB = A, we have
        (AB)^T = A^T, i.e., B^T A^T = A^T.
Also, as B = BA, we have B^T = (BA)^T = A^T B^T, and hence
        A^T = B^T A^T = (A^T B^T) A^T = A^T (B^T A^T) = A^T A^T = (A^T)^2.
From the relation BA = B, we have
        (BA)^T = B^T, i.e., A^T B^T = B^T.


Also, as A = AB, we have A^T = (AB)^T = B^T A^T, and hence
        B^T = A^T B^T = (B^T A^T) B^T = B^T (A^T B^T) = B^T B^T = (B^T)^2.
Therefore, A^T and B^T are idempotent. Also,
        A = AB = A(BA) = (AB)A = AA = A^2
        and B = BA = B(AB) = (BA)B = BB = B^2.
This shows that A and B are also idempotent.

3.3.3 Involutory Matrix

A matrix A is said to be an involutory matrix if A^2 = I. For example, let
        A = [-5 -8 0; 3 5 0; 1 2 -1]; then A^2 = [1 0 0; 0 1 0; 0 0 1] = I.
The identity matrix is also involutory, as I^2 = I; hence the identity matrix is involutory as well as idempotent.


Ex 3.3.5 Find all non-null real matrices A = [a b; c d] such that A is an involutory matrix.
Solution: Here we are to find all non-null real matrices A such that A^2 = I2. Therefore,
        [a b; c d][a b; c d] = [a^2 + bc, ab + bd; ac + cd, bc + d^2] = [1 0; 0 1],
so that a^2 + bc = 1, b(a + d) = 0, c(a + d) = 0, bc + d^2 = 1. Either a + d = 0, with a^2 + bc = 1, or b = c = 0, with a = ±1, d = ±1. Therefore, the non-null real matrices are given by
        [a, b; c, -a]; a^2 + bc = 1; a, b, c ∈ R,
together with I2 and -I2.

3.3.4 Periodic Matrix

A matrix A is said to be a periodic matrix if A^(k+1) = A, where k is a positive integer; the least such k is the period of A. For example, let A = [2 -3 -5; -1 4 5; 1 -3 -4]; then A^2 = A. Therefore, A is a periodic matrix of period 1.

3.3.5 Symmetric Matrices

A square matrix A = [aij]n×n is said to be symmetric if
        A^T = A, i.e., aij = aji for all pairs (i, j).                (3.11)
For example, the matrix A = [a h g; h b f; g f c] is symmetric. In a symmetric matrix A, the elements of A are symmetric with respect to the main diagonal of A. A diagonal matrix is


always symmetric. An advantage of working with a symmetric matrix A is that only half of A needs to be stored, and the amount of calculation required is also halved.
A matrix A = [aij]n×n is said to be pseudo-symmetric if
        aij = a(n+1-j),(n+1-i) for all i and j.                       (3.12)
Now, note the following:
(i) If A, B are two symmetric matrices of the same order and c is a scalar, then A + B and cA are symmetric matrices.
(ii) However, if A, B are symmetric matrices of the same order, then AB may not be symmetric. For example, let A = [2 1; 1 4] and B = [3 4; 4 1]; then AB = [10 9; 19 8], which is not symmetric. If A, B are two symmetric matrices of the same order, then AB is symmetric if and only if AB = BA.
(iii) If A is a symmetric matrix, then A^n is symmetric for all n ∈ N.
(iv) The product of any matrix with its transpose is symmetric. If A is an m × n matrix, then AA^T and A^T A are symmetric matrices of order m and n respectively.
(v) A matrix A is diagonal if and only if it is symmetric and upper triangular.

3.3.6 Skew-symmetric Matrices

A square matrix A = [aij]n×n is said to be skew-symmetric if
        A^T = -A, i.e., aij = -aji for all pairs (i, j).              (3.13)
For a skew-symmetric matrix A = [aij]n×n we have, by definition,
        aii = -aii for i = j, i.e., 2aii = 0, so that aii = 0.
So all the diagonal elements of a skew-symmetric matrix are zero. For example, A = [0 2; -2 0] is a skew-symmetric matrix of order 2. If A is a skew-symmetric matrix, then A^n is symmetric or skew-symmetric according as n is an even or odd positive integer.

Theorem 3.3.1 Every square matrix can be uniquely expressed as the sum of a symmetric matrix and a skew-symmetric matrix.
Proof: Let A be any given square matrix. Let us write
        A = (1/2)(A + A^T) + (1/2)(A - A^T) = B + C, say,
where B = (1/2)(A + A^T) and C = (1/2)(A - A^T). Now,
        B^T = (1/2)(A + A^T)^T = (1/2)[A^T + (A^T)^T] = (1/2)[A^T + A] = B.
Therefore, B is a symmetric matrix. Again,
        C^T = (1/2)(A - A^T)^T = (1/2)[A^T - (A^T)^T] = (1/2)[A^T - A] = -(1/2)[A - A^T] = -C.


Therefore, C is a skew-symmetric matrix. So, every square matrix can be expressed as the sum of a symmetric matrix and a skew-symmetric matrix. Now we show that the representation is unique. For this, let A = M + N, where M is symmetric and N is skew-symmetric. Now,
        A^T = (M + N)^T = M^T + N^T = M - N,
so that A + A^T = 2M and A - A^T = 2N, i.e.,
        M = (1/2)(A + A^T) = B and N = (1/2)(A - A^T) = C.
Thus the representation is unique. Therefore, every square matrix can be uniquely expressed as the sum of a symmetric matrix and a skew-symmetric matrix.

Ex 3.3.6 Express $A = \begin{pmatrix} 2 & 5 & 3 \\ 7 & 1 & 1 \\ 1 & 3 & 4 \end{pmatrix}$ as a sum of a symmetric and a skew-symmetric matrix.

Solution: For the given matrix $A$, we have $A^T = \begin{pmatrix} 2 & 7 & 1 \\ 5 & 1 & 3 \\ 3 & 1 & 4 \end{pmatrix}$. Now,
$$A + A^T = \begin{pmatrix} 2 & 5 & 3 \\ 7 & 1 & 1 \\ 1 & 3 & 4 \end{pmatrix} + \begin{pmatrix} 2 & 7 & 1 \\ 5 & 1 & 3 \\ 3 & 1 & 4 \end{pmatrix} = \begin{pmatrix} 4 & 12 & 4 \\ 12 & 2 & 4 \\ 4 & 4 & 8 \end{pmatrix},$$
$$A - A^T = \begin{pmatrix} 2 & 5 & 3 \\ 7 & 1 & 1 \\ 1 & 3 & 4 \end{pmatrix} - \begin{pmatrix} 2 & 7 & 1 \\ 5 & 1 & 3 \\ 3 & 1 & 4 \end{pmatrix} = \begin{pmatrix} 0 & -2 & 2 \\ 2 & 0 & -2 \\ -2 & 2 & 0 \end{pmatrix}.$$
Now the symmetric matrix is $\frac{1}{2}(A + A^T)$ and the skew-symmetric matrix is $\frac{1}{2}(A - A^T)$. Therefore,
$$\begin{pmatrix} 2 & 5 & 3 \\ 7 & 1 & 1 \\ 1 & 3 & 4 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 4 & 12 & 4 \\ 12 & 2 & 4 \\ 4 & 4 & 8 \end{pmatrix} + \frac{1}{2}\begin{pmatrix} 0 & -2 & 2 \\ 2 & 0 & -2 \\ -2 & 2 & 0 \end{pmatrix}.$$
This representation is unique: the given square matrix $A$ is thus uniquely expressed as a sum of a symmetric matrix and a skew-symmetric matrix.
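The decomposition of Ex 3.3.6 can be sketched in code (not the author's, but following the formulas of Theorem 3.3.1 directly):

```python
import numpy as np

# Symmetric/skew-symmetric decomposition A = (A + A^T)/2 + (A - A^T)/2.
A = np.array([[2, 5, 3],
              [7, 1, 1],
              [1, 3, 4]], dtype=float)

B = (A + A.T) / 2   # symmetric part
C = (A - A.T) / 2   # skew-symmetric part

assert np.array_equal(B, B.T)     # B is symmetric
assert np.array_equal(C, -C.T)    # C is skew-symmetric
assert np.array_equal(B + C, A)   # they reconstruct A
print(B)
print(C)
```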
Ex 3.3.7 Show that $(I_3 - A)(I_3 + A)$ is a symmetric matrix, where $A$ is a $3 \times 3$ symmetric or skew-symmetric matrix.

Solution: $A$ is symmetric if $A^T = A$ and skew-symmetric if $A^T = -A$. Let $B = (I_3 - A)(I_3 + A)$; then
$$B = (I_3 - A)(I_3 + A) = I_3 + A - A - A^2 = I_3 - A^2,$$
$$B^T = [(I_3 - A)(I_3 + A)]^T = [I_3 - A^2]^T = I_3 - (A^T)^2 = I_3 - A^2,$$
whether $A$ is symmetric or skew-symmetric, since $(A^T)^2 = (\pm A)^2 = A^2$ in either case. Hence $B^T = B$, and consequently $B = (I_3 - A)(I_3 + A)$ is a symmetric matrix.

3.3.7  Normal Matrix

A real matrix $A$ is normal if it commutes with its transpose $A^T$, i.e., if $AA^T = A^TA$. If $A$ is symmetric, orthogonal or skew-symmetric, then $A$ is normal. There are also other normal matrices.

166                                                        Theory of Matrices

For example, let $A = \begin{pmatrix} 6 & 3 \\ -3 & 6 \end{pmatrix}$; then
$$AA^T = \begin{pmatrix} 6 & 3 \\ -3 & 6 \end{pmatrix}\begin{pmatrix} 6 & -3 \\ 3 & 6 \end{pmatrix} = \begin{pmatrix} 45 & 0 \\ 0 & 45 \end{pmatrix} = A^TA.$$
Since $AA^T = A^TA$, the matrix $A$ is normal.

3.4  Determinants

A very important issue in the study of matrix algebra is the concept of the determinant. In this section, various properties of determinants are studied; the methods for their computation and one of their applications are discussed.
The $x$-eliminant of the two linear equations $a_{11}x + a_{12} = 0$ and $a_{21}x + a_{22} = 0$ is
$$-\frac{a_{12}}{a_{11}} = -\frac{a_{22}}{a_{21}}, \text{ i.e., } a_{11}a_{22} - a_{12}a_{21} = 0.$$
Now, the expression $(a_{11}a_{22} - a_{12}a_{21})$ can be written in the form $\begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$. Let $A = [a_{ij}]$ be a square matrix of order $n$. We define the determinant of $A$ of order $n$ as
$$\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} \tag{3.14}$$
and it is denoted by $\det A$ or $|A|$ or $\Delta$. If we consider the matrix $A$ of order 2, then the determinant of order 2 is $|A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}$, which is the $x$-eliminant given above.
Similarly, for a matrix $A$ of order 3, we have
$$|A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}$$
$$= a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}),$$
which is the $x, y$-eliminant of the system of equations
$$a_{11}x + a_{12}y + a_{13} = 0; \quad a_{21}x + a_{22}y + a_{23} = 0; \quad a_{31}x + a_{32}y + a_{33} = 0.$$
An $n$th-order determinant contains $n!$ terms in its expansion, of which $\frac{1}{2}n!$ terms are positive and the remaining $\frac{1}{2}n!$ terms are negative. For example, let
$$A = \begin{pmatrix} 1 & -2 & 0 \\ 2 & 1 & 3 \\ -1 & 0 & 2 \end{pmatrix}; \text{ then}$$
$$|A| = 1(1\cdot 2 - 3\cdot 0) - (-2)[2\cdot 2 - (-1)\cdot 3] + 0[2\cdot 0 - (-1)\cdot 1] = 2 + 14 + 0 = 16.$$
Definition 3.4.1 Let $M_{n\times n}$ be the set of all square matrices of order $n$ whose elements belong to a field of scalars $F$. Then a mapping $f : M_{n\times n} \to F$, which assigns a scalar to each matrix $A \in M_{n\times n}$, is called the determinant function on the set $M_{n\times n}$, and it is denoted by $\det A$, or $|A|$, or $\det(a_{ij})$. The determinant associated with a square matrix $A = [a_{ij}]$ of order $n \times n$ is the scalar (a real or a complex number) defined by
$$\det A = \det(a_{ij}) = \sum_{\sigma} \operatorname{Sgn}\sigma\; a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)},$$
where $\sigma$ is the permutation $\begin{pmatrix} 1 & 2 & \cdots & n \\ \sigma(1) & \sigma(2) & \cdots & \sigma(n) \end{pmatrix}$ and $\operatorname{Sgn}\sigma = \pm 1$ according as $\sigma$ is an even or an odd permutation; the summation extends over all possible permutations $\sigma(1), \sigma(2), \ldots, \sigma(n)$ of the $n$ second subscripts in the $a$'s. $\det A$ is said to be a determinant of order $n$ and is denoted by vertical bars $\begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$, or shortly by $|a_{ij}|_n$.

The summation $\sum$ is said to be the expansion of $\det A$. It contains $n!$ terms, as there are $n!$ permutations of the set $\{1, 2, \ldots, n\}$. Since there are $\frac{1}{2}n!$ even and $\frac{1}{2}n!$ odd permutations on the set $\{1, 2, \ldots, n\}$, the expansion of $\det A$ contains $\frac{1}{2}n!$ positive terms and $\frac{1}{2}n!$ negative terms. Each term is a product of $n$ elements. The first subscripts of the $a$'s run over $\{1, 2, \ldots, n\}$ in natural order, and the second subscripts form a permutation $\sigma(1), \sigma(2), \ldots, \sigma(n)$ of $\{1, 2, \ldots, n\}$, each of which can occur only once. It is observed that each product $a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}$ is constituted by taking one and only one element from each row and each column of $A$, and it has a positive or a negative sign depending on whether $\sigma$ is an even or an odd permutation.


Let $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$. Then $\det A = \sum_{\sigma} \operatorname{Sgn}\sigma\; a_{1\sigma(1)} a_{2\sigma(2)}$, where $\sigma$ runs over the permutations of the set $\{1, 2\}$. There are two permutations of $\{1, 2\}$, namely $\sigma_1 = \begin{pmatrix} 1 & 2 \\ 1 & 2 \end{pmatrix}$ and $\sigma_2 = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$; $\sigma_1$ is even and $\sigma_2$ is odd. Therefore,
$$\det A = \operatorname{Sgn}\sigma_1\, a_{11}a_{22} + \operatorname{Sgn}\sigma_2\, a_{12}a_{21} = a_{11}a_{22} - a_{12}a_{21}.$$
Thus $\det A = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}$.

Let $A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$; $\det A = \sum_{\sigma} \operatorname{Sgn}\sigma\; a_{1\sigma(1)} a_{2\sigma(2)} a_{3\sigma(3)}$, where $\sigma$ is a permutation of $\{1, 2, 3\}$. There are six permutations of $\{1, 2, 3\}$; they are
$$\sigma_1 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix},\; \sigma_2 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix},\; \sigma_3 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix},\; \sigma_4 = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix},\; \sigma_5 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix},\; \sigma_6 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}.$$
Among them $\sigma_1, \sigma_4, \sigma_5$ are even and $\sigma_2, \sigma_3, \sigma_6$ are odd. Therefore,
$$\det A = \operatorname{Sgn}\sigma_1\, a_{11}a_{22}a_{33} + \operatorname{Sgn}\sigma_2\, a_{11}a_{23}a_{32} + \operatorname{Sgn}\sigma_3\, a_{12}a_{21}a_{33} + \operatorname{Sgn}\sigma_4\, a_{12}a_{23}a_{31} + \operatorname{Sgn}\sigma_5\, a_{13}a_{21}a_{32} + \operatorname{Sgn}\sigma_6\, a_{13}a_{22}a_{31}$$
$$= a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31}.$$
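The permutation expansion translates directly into code. The following sketch (not the author's) computes a determinant from this definition; it is exponential in $n$ and is meant only to illustrate the formula:

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation (given as a tuple) by counting inversions."""
    inv = sum(1 for i in range(len(p))
                for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def det(a):
    """Determinant via sum over all n! permutations of column indices."""
    n = len(a)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= a[i][p[i]]   # one element from each row and column
        total += term
    return total

# The 3x3 example matrix used earlier in this section (signs restored):
A = [[1, -2, 0], [2, 1, 3], [-1, 0, 2]]
print(det(A))   # 16
```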
If the first two columns of $A$ are adjoined to its right, then the expansion of a $3 \times 3$ determinant can be obtained as the product of diagonal elements with the assigned signs shown in Fig. 3.1.

[Figure 3.1: Row expansion of a third order determinant. The array $a_{11}, a_{12}, a_{13}$; $a_{21}, a_{22}, a_{23}$; $a_{31}, a_{32}, a_{33}$ is written with the first two columns repeated to the right; the three products along diagonals running down to the right carry a $+$ sign and the three along diagonals running down to the left carry a $-$ sign.]

Each product of $\det A$ is obtained by taking the row subscripts in the natural order and performing permutations among the column subscripts, and hence it is known as the row expansion of $\det A$. Similarly, $\det A$ is obtained by taking the column subscripts in natural order and making all possible permutations among the row subscripts. Thus
$$\det A = \sum_{\sigma} \operatorname{Sgn}\sigma\; a_{\sigma(1)1} a_{\sigma(2)2} \cdots a_{\sigma(n)n}. \tag{3.15}$$

Ex 3.4.1 Find the number of $2 \times 2$ matrices over $\mathbb{Z}_3$ (the field with three elements) with determinant 1. [IIT-JAM 10]

Solution: Let us take a $2 \times 2$ matrix as $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, where each of $a, b, c, d$ can take the three values $\{0, 1, 2\}$. The given condition is $ad - bc = 1$ in $\mathbb{Z}_3$, so the possibilities are $(ad, bc) = (1, 0)$, $(0, 2)$ or $(2, 1)$. Now $ad = 0$ is possible in five ways, namely
$$(a, d) = (0, 0), (0, 1), (0, 2), (1, 0), (2, 0),$$
while $ad = 1$ is possible in two ways, $(a, d) = (1, 1), (2, 2)$, and $ad = 2$ in two ways, $(a, d) = (1, 2), (2, 1)$; the counts for $bc$ are the same. Therefore the number of matrices with $(ad, bc) = (1, 0)$ or $(0, 2)$ is $2 \cdot 5 + 5 \cdot 2 = 20$, and the number with $(ad, bc) = (2, 1)$ is $2 \cdot 2 = 4$.

Therefore, the number of $2 \times 2$ matrices over $\mathbb{Z}_3$ with determinant 1 is $20 + 4 = 24$.
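The count in Ex 3.4.1 is small enough to verify by brute force (a quick check, not from the text):

```python
from itertools import product

# Count 2x2 matrices over Z_3 whose determinant is 1 modulo 3.
count = sum(1 for a, b, c, d in product(range(3), repeat=4)
            if (a * d - b * c) % 3 == 1)
print(count)   # 24
```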
Ex 3.4.2 Let $D_n$ be a determinant of order $n$ in which the diagonal elements are 1, those just above and just below the diagonal are $a$, and all other elements are zero. Prove that $D_4 - D_3 + a^2 D_2 = 0$, and hence find the value of
$$\Delta_4 = \begin{vmatrix} 1 & \frac{1}{2} & 0 & 0 \\ \frac{1}{2} & 1 & \frac{1}{2} & 0 \\ 0 & \frac{1}{2} & 1 & \frac{1}{2} \\ 0 & 0 & \frac{1}{2} & 1 \end{vmatrix}.$$

Solution: According to the definition of $D_n$, we get the form of $D_4$ as
$$D_4 = \begin{vmatrix} 1 & a & 0 & 0 \\ a & 1 & a & 0 \\ 0 & a & 1 & a \\ 0 & 0 & a & 1 \end{vmatrix} = \begin{vmatrix} 1 & a & 0 \\ a & 1 & a \\ 0 & a & 1 \end{vmatrix} - a\begin{vmatrix} a & a & 0 \\ 0 & 1 & a \\ 0 & a & 1 \end{vmatrix} = D_3 - a\cdot a\begin{vmatrix} 1 & a \\ a & 1 \end{vmatrix} = D_3 - a^2 D_2,$$
so that $D_4 - D_3 + a^2 D_2 = 0$. For the particular given determinant, $a = \frac{1}{2}$, and
$$\Delta_2 = \begin{vmatrix} 1 & \frac{1}{2} \\ \frac{1}{2} & 1 \end{vmatrix} = \frac{3}{4}, \qquad \Delta_3 = \Delta_2 - \frac{1}{4}\Delta_1 = \frac{3}{4} - \frac{1}{4} = \frac{1}{2},$$
so that
$$\Delta_4 = \Delta_3 - \frac{1}{4}\Delta_2 = \frac{1}{2} - \frac{1}{4}\cdot\frac{3}{4} = \frac{5}{16}.$$
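The recursion of Ex 3.4.2 is easy to run exactly with rational arithmetic (a sketch, not from the text):

```python
from fractions import Fraction

# D_n = D_{n-1} - a^2 D_{n-2} for the tridiagonal determinant, a = 1/2.
a = Fraction(1, 2)
D = {1: Fraction(1), 2: 1 - a * a}   # D_1 = 1, D_2 = 1 - a^2 = 3/4
for n in range(3, 5):
    D[n] = D[n - 1] - a * a * D[n - 2]
print(D[4])   # 5/16
```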


Property 3.4.1 The determinants of a matrix and its transpose are equal, i.e., $|A| = |A^T|$.

Proof: First consider a matrix $A$ of order 2; then
$$|A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21} = a_{11}a_{22} - a_{21}a_{12} = |A^T|.$$
Next consider a matrix of order 3; then
$$|A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31})$$
$$= a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{21}(a_{12}a_{33} - a_{13}a_{32}) + a_{31}(a_{12}a_{23} - a_{13}a_{22}) = \begin{vmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{vmatrix} = |A^T|.$$
This property is true for determinants of any order. For example, let $A = \begin{pmatrix} 1 & 2 \\ 4 & 5 \end{pmatrix}$; then
$$|A| = \begin{vmatrix} 1 & 2 \\ 4 & 5 \end{vmatrix} = 1\cdot 5 - 2\cdot 4 = -3 \quad \text{and} \quad |A^T| = \begin{vmatrix} 1 & 4 \\ 2 & 5 \end{vmatrix} = 1\cdot 5 - 4\cdot 2 = -3.$$
Hence $|A| = |A^T|$. From this property we can say that a theorem which holds for some row operations on $A$ also holds equally well when the corresponding column operations are made on $A$.
Property 3.4.2 The interchange of two rows (or columns) of a square matrix $A$ changes the sign of $|A|$, but its numerical value remains unaltered.

Proof: Let $A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$ and $A' = \begin{pmatrix} a_{21} & a_{22} & a_{23} \\ a_{11} & a_{12} & a_{13} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$ be square matrices of order 3, where $A'$ is obtained by interchanging the first two rows of $A$. Therefore,
$$|A'| = a_{21}\begin{vmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{vmatrix} - a_{22}\begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} + a_{23}\begin{vmatrix} a_{11} & a_{12} \\ a_{31} & a_{32} \end{vmatrix}$$
$$= -\{a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31})\} = -|A|.$$
A similar proof holds for the interchange of any two columns, by considering the equivalent column expansion of $|A|$. This is true for square matrices of any order.
Property 3.4.3 If in a square matrix $A$ of order $n$ two rows (columns) are equal or identical, then the value of $|A|$ is 0.

Proof: Let $|A| = \Delta$, the value of the determinant. We know that the interchange of any two rows or columns of a determinant changes the sign of the determinant without changing its numerical value. If we interchange the two identical rows, the matrix $A$ remains unchanged while the determinant changes sign. Therefore,
$$\Delta = -\Delta \;\Rightarrow\; 2\Delta = 0 \;\Rightarrow\; \Delta = |A| = 0.$$
If $|A| = 0$, then the matrix $A$ is called a singular matrix; otherwise it is non-singular.


Property 3.4.4 Let the elements of the $m$th row of $A$ all be zero. If we expand the determinant of $A$ with respect to the $m$th row, each term in the expansion contains a factor zero; hence the value of $|A|$ is 0. Thus, if a row or a column of any matrix consists entirely of zeros, then $|A| = 0$.

Result 3.4.1 If two rows (or columns) of a matrix $A$ become identical for $x = a$, then $(x - a)$ is a factor of $|A|$. Further, if $r$ rows (or columns) become identical for $x = a$, then $(x - a)^{r-1}$ is a factor of $|A|$.
Ex 3.4.3 Prove without expanding that $|A| = \begin{vmatrix} a^2 & a & 1 \\ b^2 & b & 1 \\ c^2 & c & 1 \end{vmatrix} = -(a - b)(b - c)(c - a)$.

Solution: Let us consider the elements of $A$ as polynomials in $a$. When $a = b$, two rows of the matrix $A$ become identical; therefore $(a - b)$ is a factor of $|A|$. Now let us consider the elements of $A$ as polynomials in $b$. When $b = c$, two rows of the matrix $A$ become identical; therefore $(b - c)$ is a factor of $|A|$. Similarly, $(c - a)$ is a factor of $|A|$. Also, the expansion of $|A|$ is a polynomial in $a, b, c$ of degree 3, and the term $a^2b$ occurs exactly once in it, with coefficient $+1$. Therefore,
$$|A| = k(a - b)(b - c)(c - a),$$
where the constant $k$ is independent of $a, b, c$. Equating coefficients of $a^2b$ on both sides of this equality, we get
$$1 = k\cdot 1\cdot 1\cdot(-1) \;\Rightarrow\; k = -1.$$
Therefore, $|A| = -(a - b)(b - c)(c - a)$.
Property 3.4.5 If every element of any row (or column) of a matrix $A$ is multiplied by a factor $k$, then $|A|$ is multiplied by the same factor $k$.

Property 3.4.6 If we add $k$ times the elements of any row (column) of a matrix $A$ to the corresponding elements of any other row (column), the value of the determinant of $A$ remains unchanged. Therefore
$$\begin{vmatrix} a & b \\ c & d \end{vmatrix}, \quad \begin{vmatrix} a & b \\ c + ak & d + bk \end{vmatrix}, \quad \begin{vmatrix} a + bk & b \\ c + dk & d \end{vmatrix}$$
are of the same value. Also, if in a square matrix $A$ one row (column) can be expressed as a linear combination of the other rows (columns), then $|A| = 0$.

Property 3.4.7 If every element of any row (or column) of a matrix $A$ can be expressed as the sum of two quantities, then the determinant can also be expressed as the sum of two determinants. Thus,
$$\begin{vmatrix} a_{11} + k_1 & a_{12} + k_2 & \cdots & a_{1n} + k_n \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix} + \begin{vmatrix} k_1 & k_2 & \cdots & k_n \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}.$$
Property 3.4.8 Let $f_1(x), f_2(x), g_1(x)$ and $g_2(x)$ be differentiable functions of the real variable $x$. Then
$$\frac{d}{dx}\begin{vmatrix} f_1(x) & f_2(x) \\ g_1(x) & g_2(x) \end{vmatrix} = \begin{vmatrix} \frac{d}{dx}f_1(x) & \frac{d}{dx}f_2(x) \\ g_1(x) & g_2(x) \end{vmatrix} + \begin{vmatrix} f_1(x) & f_2(x) \\ \frac{d}{dx}g_1(x) & \frac{d}{dx}g_2(x) \end{vmatrix}.$$
This result can be extended to determinants of any finite order.

Ex 3.4.4 If $f(x) = \begin{vmatrix} x^n & \sin x & \cos x \\ n! & \sin\frac{n\pi}{2} & \cos\frac{n\pi}{2} \\ a & a^2 & a^3 \end{vmatrix}$, then show that $f^{(n)}(0) = 0$.

Solution: Only the first row depends on $x$; in the sum-of-determinants expansion of the derivative, every determinant in which a constant row is differentiated has a zero row and vanishes. Hence, by the property of the derivative of a determinant, differentiating $n$ times gives
$$f^{(n)}(x) = \begin{vmatrix} \frac{d^n}{dx^n}x^n & \frac{d^n}{dx^n}\sin x & \frac{d^n}{dx^n}\cos x \\ n! & \sin\frac{n\pi}{2} & \cos\frac{n\pi}{2} \\ a & a^2 & a^3 \end{vmatrix} = \begin{vmatrix} n! & \sin\left(x + \frac{n\pi}{2}\right) & \cos\left(x + \frac{n\pi}{2}\right) \\ n! & \sin\frac{n\pi}{2} & \cos\frac{n\pi}{2} \\ a & a^2 & a^3 \end{vmatrix}.$$
Therefore,
$$f^{(n)}(0) = \begin{vmatrix} n! & \sin\frac{n\pi}{2} & \cos\frac{n\pi}{2} \\ n! & \sin\frac{n\pi}{2} & \cos\frac{n\pi}{2} \\ a & a^2 & a^3 \end{vmatrix} = 0,$$
as the first two rows are identical.

3.4.1  Product of Determinants

The product of two determinants of order $n$ is also a determinant of order $n$. Let $|a_{ij}|$ and $|b_{ij}|$ be two determinants of order $n$. Then their product is defined by
$$|a_{ij}| \cdot |b_{ij}| = |c_{ij}|, \quad \text{where } c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj},$$
i.e.,
$$\begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix} \cdot \begin{vmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{n1} & \cdots & b_{nn} \end{vmatrix} = \begin{vmatrix} \sum_k a_{1k}b_{k1} & \cdots & \sum_k a_{1k}b_{kn} \\ \vdots & & \vdots \\ \sum_k a_{nk}b_{k1} & \cdots & \sum_k a_{nk}b_{kn} \end{vmatrix}.$$
This rule of multiplication is called the matrix rule, or the rule of multiplication of rows by columns. Since an interchange of rows and columns does not alter the value of a determinant, the product can be obtained in other forms also; $c_{ij}$ may equally be taken as $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{jk}$, which is the rule of multiplication of rows by rows. Similarly, we can define multiplication of columns by columns. From the definition, if $A$ and $B$ are square matrices of the same order, then $|AB| = |A| \cdot |B|$.
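The multiplicative property $|AB| = |A|\cdot|B|$ is easy to spot-check numerically (a sketch, not from the text; the matrices are arbitrary):

```python
import numpy as np

# |AB| = |A| * |B| for two fixed integer matrices of order 3.
A = np.array([[1.0, 2.0, 0.0],
              [3.0, 1.0, 4.0],
              [0.0, 2.0, 5.0]])
B = np.array([[2.0, 1.0, 1.0],
              [0.0, 3.0, 2.0],
              [1.0, 0.0, 4.0]])

lhs = np.linalg.det(A @ B)
rhs = np.linalg.det(A) * np.linalg.det(B)
print(np.isclose(lhs, rhs))   # True
```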

Ex 3.4.5 Prove without expanding that
$$\begin{vmatrix} 1 & a & a^2 & a^3 + bcd \\ 1 & b & b^2 & b^3 + cda \\ 1 & c & c^2 & c^3 + dab \\ 1 & d & d^2 & d^3 + abc \end{vmatrix} = 0.$$

Solution: The given determinant can be written in the form
$$\Delta = \begin{vmatrix} 1 & a & a^2 & a^3 \\ 1 & b & b^2 & b^3 \\ 1 & c & c^2 & c^3 \\ 1 & d & d^2 & d^3 \end{vmatrix} + \begin{vmatrix} 1 & a & a^2 & bcd \\ 1 & b & b^2 & cda \\ 1 & c & c^2 & dab \\ 1 & d & d^2 & abc \end{vmatrix} = \Delta_1 + \Delta_2, \text{ say.}$$
Now we simplify $\Delta_2$. Multiplying the rows by $a, b, c, d$ respectively ($R_1' = aR_1$, $R_2' = bR_2$, $R_3' = cR_3$, $R_4' = dR_4$),
$$\Delta_2 = \frac{1}{abcd}\begin{vmatrix} a & a^2 & a^3 & abcd \\ b & b^2 & b^3 & abcd \\ c & c^2 & c^3 & abcd \\ d & d^2 & d^3 & abcd \end{vmatrix} = \begin{vmatrix} a & a^2 & a^3 & 1 \\ b & b^2 & b^3 & 1 \\ c & c^2 & c^3 & 1 \\ d & d^2 & d^3 & 1 \end{vmatrix} = -\begin{vmatrix} 1 & a & a^2 & a^3 \\ 1 & b & b^2 & b^3 \\ 1 & c & c^2 & c^3 \\ 1 & d & d^2 & d^3 \end{vmatrix} = -\Delta_1,$$
applying three successive interchanges to bring $C_4$ to $C_1$. Therefore $\Delta = \Delta_1 + \Delta_2 = 0$.

Ex 3.4.6 Let $m, n \in \mathbb{N}$ with $m \ge n - 1 \ge 1$ and $\binom{m}{r} = {}^m C_r$. Prove that
$$\Delta_n = \begin{vmatrix} 1 & 1 & 1 & \cdots & 1 \\ \binom{m}{1} & \binom{m+1}{1} & \binom{m+2}{1} & \cdots & \binom{m+n-1}{1} \\ \binom{m}{2} & \binom{m+1}{2} & \binom{m+2}{2} & \cdots & \binom{m+n-1}{2} \\ \vdots & \vdots & \vdots & & \vdots \\ \binom{m}{n-1} & \binom{m+1}{n-1} & \binom{m+2}{n-1} & \cdots & \binom{m+n-1}{n-1} \end{vmatrix} = 1.$$

Solution: Let $\Delta_n$ be the given determinant. Subtracting from each column, beginning with the second, the column preceding it, and using the Pascal identity $\binom{m+k}{r} - \binom{m+k-1}{r} = \binom{m+k-1}{r-1}$, we have
$$\Delta_n = \begin{vmatrix} 1 & 0 & 0 & \cdots & 0 \\ \binom{m}{1} & 1 & 1 & \cdots & 1 \\ \binom{m}{2} & \binom{m}{1} & \binom{m+1}{1} & \cdots & \binom{m+n-2}{1} \\ \vdots & \vdots & \vdots & & \vdots \\ \binom{m}{n-1} & \binom{m}{n-2} & \binom{m+1}{n-2} & \cdots & \binom{m+n-2}{n-2} \end{vmatrix}.$$
Expanding in terms of the first row,
$$\Delta_n = \begin{vmatrix} 1 & 1 & \cdots & 1 \\ \binom{m}{1} & \binom{m+1}{1} & \cdots & \binom{m+n-2}{1} \\ \vdots & \vdots & & \vdots \\ \binom{m}{n-2} & \binom{m+1}{n-2} & \cdots & \binom{m+n-2}{n-2} \end{vmatrix} = \Delta_{n-1}.$$
Therefore $\Delta_n = \Delta_{n-1} = \Delta_{n-2} = \cdots = \Delta_2$. But
$$\Delta_2 = \begin{vmatrix} 1 & 1 \\ \binom{m}{1} & \binom{m+1}{1} \end{vmatrix} = (m + 1) - m = 1.$$
Consequently, $\Delta_n = 1$.
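The result of Ex 3.4.6 can be checked numerically for particular values (a sketch, not from the text; $m = 5$ is an arbitrary choice):

```python
import numpy as np
from math import comb

# The binomial determinant Delta_n equals 1; check for m = 5, n = 2..5.
m = 5
for n in range(2, 6):
    M = np.array([[comb(m + j, i) for j in range(n)] for i in range(n)],
                 dtype=float)
    assert round(np.linalg.det(M)) == 1
print("Delta_n = 1 for n = 2..5")
```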
Ex 3.4.7 Prove that $\begin{vmatrix} b^2 + c^2 & a^2 & a^2 \\ b^2 & c^2 + a^2 & b^2 \\ c^2 & c^2 & a^2 + b^2 \end{vmatrix} = 4a^2b^2c^2$. [WBUT 2007]

Solution:
$$\begin{vmatrix} b^2 + c^2 & a^2 & a^2 \\ b^2 & c^2 + a^2 & b^2 \\ c^2 & c^2 & a^2 + b^2 \end{vmatrix} = \begin{vmatrix} 0 & -2c^2 & -2b^2 \\ b^2 & c^2 + a^2 & b^2 \\ c^2 & c^2 & a^2 + b^2 \end{vmatrix} \quad [R_1' = R_1 - R_2 - R_3]$$
$$= -2\begin{vmatrix} 0 & c^2 & b^2 \\ b^2 & c^2 + a^2 & b^2 \\ c^2 & c^2 & a^2 + b^2 \end{vmatrix} = -2\begin{vmatrix} 0 & c^2 & b^2 \\ b^2 & a^2 & 0 \\ c^2 & 0 & a^2 \end{vmatrix} \quad [R_2' = R_2 - R_1,\; R_3' = R_3 - R_1]$$
$$= -2\{0 - c^2(a^2b^2 - 0) + b^2(0 - a^2c^2)\} = -2(-2a^2b^2c^2) = 4a^2b^2c^2.$$
Ex 3.4.8 If $\tan^{-1}\sqrt{\dfrac{a-c}{c+x}} + \tan^{-1}\sqrt{\dfrac{a-c}{c+y}} + \tan^{-1}\sqrt{\dfrac{a-c}{c+z}} = 0$, prove that
$$\begin{vmatrix} 1 & x & (a+x)\sqrt{c+x} \\ 1 & y & (a+y)\sqrt{c+y} \\ 1 & z & (a+z)\sqrt{c+z} \end{vmatrix} = 0.$$

Solution: Let $\alpha = \tan^{-1}\sqrt{\frac{a-c}{c+x}}$, $\beta = \tan^{-1}\sqrt{\frac{a-c}{c+y}}$, $\gamma = \tan^{-1}\sqrt{\frac{a-c}{c+z}}$. From the given condition we get $\alpha + \beta + \gamma = 0$, i.e., $\alpha + \beta = -\gamma$, so
$$\tan(\alpha + \beta) = \tan(-\gamma), \text{ i.e., } \frac{\tan\alpha + \tan\beta}{1 - \tan\alpha\tan\beta} = -\tan\gamma,$$
or, $\tan\alpha + \tan\beta + \tan\gamma = \tan\alpha\tan\beta\tan\gamma$.

Again let $x + c = X^2$, $y + c = Y^2$, $z + c = Z^2$; then $\tan\alpha = \frac{\sqrt{a-c}}{X}$, $\tan\beta = \frac{\sqrt{a-c}}{Y}$, $\tan\gamma = \frac{\sqrt{a-c}}{Z}$. Therefore, from the above relation we get
$$\sqrt{a-c}\left(\frac{1}{X} + \frac{1}{Y} + \frac{1}{Z}\right) = \frac{(\sqrt{a-c})^3}{XYZ}, \text{ or, } YZ + ZX + XY = a - c.$$
Now the given determinant $\Delta$ becomes, since $a + x = a + X^2 - c$, etc.,
$$\Delta = \begin{vmatrix} 1 & X^2 - c & (X^2 + a - c)X \\ 1 & Y^2 - c & (Y^2 + a - c)Y \\ 1 & Z^2 - c & (Z^2 + a - c)Z \end{vmatrix} = \begin{vmatrix} 1 & X^2 & (X^2 + XY + YZ + ZX)X \\ 1 & Y^2 & (Y^2 + XY + YZ + ZX)Y \\ 1 & Z^2 & (Z^2 + XY + YZ + ZX)Z \end{vmatrix},$$
using $C_2' = C_2 + cC_1$ and $a - c = XY + YZ + ZX$. Since $(X^2 + XY + YZ + ZX)X = X^2(X + Y + Z) + XYZ$, etc., we get
$$\Delta = \begin{vmatrix} 1 & X^2 & X^2(X+Y+Z) \\ 1 & Y^2 & Y^2(X+Y+Z) \\ 1 & Z^2 & Z^2(X+Y+Z) \end{vmatrix} + \begin{vmatrix} 1 & X^2 & XYZ \\ 1 & Y^2 & XYZ \\ 1 & Z^2 & XYZ \end{vmatrix}$$
$$= (X + Y + Z)\begin{vmatrix} 1 & X^2 & X^2 \\ 1 & Y^2 & Y^2 \\ 1 & Z^2 & Z^2 \end{vmatrix} + XYZ\begin{vmatrix} 1 & X^2 & 1 \\ 1 & Y^2 & 1 \\ 1 & Z^2 & 1 \end{vmatrix} = (X + Y + Z)\cdot 0 + XYZ\cdot 0 = 0.$$


Ex 3.4.9 Show that the value of $\begin{vmatrix} \cos(x+a) & \sin(x+a) & 1 \\ \cos(x+b) & \sin(x+b) & 1 \\ \cos(x+c) & \sin(x+c) & 1 \end{vmatrix}$ is independent of $x$.

Solution: If $\Delta$ is the value of the determinant, then applying $R_2' = R_2 - R_1$, $R_3' = R_3 - R_1$ and expanding along the third column,
$$\Delta = \begin{vmatrix} \cos(x+b) - \cos(x+a) & \sin(x+b) - \sin(x+a) \\ \cos(x+c) - \cos(x+a) & \sin(x+c) - \sin(x+a) \end{vmatrix} = \begin{vmatrix} 2\sin\frac{2x+a+b}{2}\sin\frac{a-b}{2} & -2\cos\frac{2x+a+b}{2}\sin\frac{a-b}{2} \\ 2\sin\frac{2x+a+c}{2}\sin\frac{a-c}{2} & -2\cos\frac{2x+a+c}{2}\sin\frac{a-c}{2} \end{vmatrix}$$
$$= 4\sin\frac{a-b}{2}\sin\frac{a-c}{2}\begin{vmatrix} \sin\frac{2x+a+b}{2} & -\cos\frac{2x+a+b}{2} \\ \sin\frac{2x+a+c}{2} & -\cos\frac{2x+a+c}{2} \end{vmatrix} = 4\sin\frac{a-b}{2}\sin\frac{a-c}{2}\sin\frac{c-b}{2},$$
which is independent of $x$.
Ex 3.4.10 If $A + B + C = \pi$, then show that $\begin{vmatrix} \sin^2 A & \cot A & 1 \\ \sin^2 B & \cot B & 1 \\ \sin^2 C & \cot C & 1 \end{vmatrix} = 0$.

Solution: Applying $R_2' = R_2 - R_1$, $R_3' = R_3 - R_1$ and expanding along the third column,
$$\Delta = \begin{vmatrix} \sin^2 B - \sin^2 A & \cot B - \cot A \\ \sin^2 C - \sin^2 A & \cot C - \cot A \end{vmatrix} = \begin{vmatrix} \sin(B-A)\sin(B+A) & \sin(A-B)/(\sin A \sin B) \\ \sin(C-A)\sin(C+A) & \sin(A-C)/(\sin A \sin C) \end{vmatrix}$$
$$= -\sin(B-A)\sin(C-A)\begin{vmatrix} \sin(B+A) & 1/(\sin A \sin B) \\ \sin(C+A) & 1/(\sin A \sin C) \end{vmatrix}.$$
Since $A + B + C = \pi$, we have $\sin(B+A) = \sin C$ and $\sin(C+A) = \sin B$, so the last determinant equals
$$\frac{\sin C}{\sin A \sin C} - \frac{\sin B}{\sin A \sin B} = \frac{1}{\sin A} - \frac{1}{\sin A} = 0.$$
Hence $\Delta = 0$.


Ex 3.4.11 If $2s = a + b + c$, show that $\begin{vmatrix} a^2 & (s-a)^2 & (s-a)^2 \\ (s-b)^2 & b^2 & (s-b)^2 \\ (s-c)^2 & (s-c)^2 & c^2 \end{vmatrix} = 2s^3(s-a)(s-b)(s-c)$.

Solution: Let $s - a = \alpha$, $s - b = \beta$, $s - c = \gamma$. Hence
$$\alpha + \beta + \gamma = 3s - (a + b + c) = 3s - 2s = s, \qquad \beta + \gamma = 2s - (b + c) = a.$$
Similarly $\gamma + \alpha = b$ and $\alpha + \beta = c$. Now the determinant becomes
$$\Delta = \begin{vmatrix} (\beta+\gamma)^2 & \alpha^2 & \alpha^2 \\ \beta^2 & (\gamma+\alpha)^2 & \beta^2 \\ \gamma^2 & \gamma^2 & (\alpha+\beta)^2 \end{vmatrix} = \begin{vmatrix} (\beta+\gamma)^2 - \alpha^2 & 0 & \alpha^2 \\ 0 & (\gamma+\alpha)^2 - \beta^2 & \beta^2 \\ \gamma^2 - (\alpha+\beta)^2 & \gamma^2 - (\alpha+\beta)^2 & (\alpha+\beta)^2 \end{vmatrix} \quad [C_1' = C_1 - C_3,\; C_2' = C_2 - C_3]$$
$$= (\alpha+\beta+\gamma)^2\begin{vmatrix} \beta+\gamma-\alpha & 0 & \alpha^2 \\ 0 & \gamma+\alpha-\beta & \beta^2 \\ -(\alpha+\beta-\gamma) & -(\alpha+\beta-\gamma) & (\alpha+\beta)^2 \end{vmatrix}$$
$$= (\alpha+\beta+\gamma)^2\begin{vmatrix} \beta+\gamma-\alpha & 0 & \alpha^2 \\ 0 & \gamma+\alpha-\beta & \beta^2 \\ -2\beta & -2\alpha & 2\alpha\beta \end{vmatrix} \quad [R_3' = R_3 - R_1 - R_2]$$
$$= 2(\alpha+\beta+\gamma)^2\begin{vmatrix} \beta+\gamma-\alpha & 0 & \alpha^2 \\ 0 & \gamma+\alpha-\beta & \beta^2 \\ -\beta & -\alpha & \alpha\beta \end{vmatrix}$$
$$= 2(\alpha+\beta+\gamma)^2\left\{(\beta+\gamma-\alpha)\left[\alpha\beta(\gamma+\alpha-\beta) + \alpha\beta^2\right] + \alpha^2\beta(\gamma+\alpha-\beta)\right\}$$
$$= 2(\alpha+\beta+\gamma)^2\cdot\alpha\beta\gamma(\alpha+\beta+\gamma) = 2\alpha\beta\gamma(\alpha+\beta+\gamma)^3 = 2s^3(s-a)(s-b)(s-c).$$




Ex 3.4.12 Show that $\begin{vmatrix} (b+c)^2 & c^2 & b^2 \\ c^2 & (c+a)^2 & a^2 \\ b^2 & a^2 & (a+b)^2 \end{vmatrix} = 2(ab + bc + ca)^3$.

Solution: Write $\omega = ab + bc + ca$. Multiplying the columns by $a^2, b^2, c^2$ respectively,
$$\Delta = \frac{1}{a^2b^2c^2}\begin{vmatrix} (ab+ca)^2 & b^2c^2 & b^2c^2 \\ c^2a^2 & (bc+ab)^2 & c^2a^2 \\ a^2b^2 & a^2b^2 & (ca+bc)^2 \end{vmatrix}$$
$$= \frac{1}{a^2b^2c^2}\begin{vmatrix} (ab+ca)^2 & b^2c^2 - (ab+ca)^2 & b^2c^2 - (ab+ca)^2 \\ c^2a^2 & (bc+ab)^2 - c^2a^2 & 0 \\ a^2b^2 & 0 & (ca+bc)^2 - a^2b^2 \end{vmatrix} \quad [C_2' = C_2 - C_1,\; C_3' = C_3 - C_1]$$
$$= \frac{\omega^2}{a^2b^2c^2}\begin{vmatrix} (ab+ca)^2 & bc - ab - ca & bc - ab - ca \\ c^2a^2 & ab + bc - ca & 0 \\ a^2b^2 & 0 & bc + ca - ab \end{vmatrix},$$
since $b^2c^2 - (ab+ca)^2 = \omega(bc - ab - ca)$, $(bc+ab)^2 - c^2a^2 = \omega(ab + bc - ca)$ and $(ca+bc)^2 - a^2b^2 = \omega(bc + ca - ab)$. Applying $R_1' = R_1 - (R_2 + R_3)$, the first row becomes $(2a^2bc,\; -2ab,\; -2ca) = 2a(abc,\; -b,\; -c)$; taking out $2a$ from $R_1$ and then $a$ from $C_1$,
$$\Delta = \frac{2\omega^2}{b^2c^2}\begin{vmatrix} bc & -b & -c \\ ac^2 & ab + bc - ca & 0 \\ ab^2 & 0 & bc + ca - ab \end{vmatrix}$$
$$= \frac{2\omega^2}{b^2c^2}\left\{bc(ab+bc-ca)(bc+ca-ab) + abc^2(bc+ca-ab) + ab^2c(ab+bc-ca)\right\}$$
$$= \frac{2\omega^2}{b^2c^2}\cdot b^2c^2(ab + bc + ca) = 2(ab + bc + ca)^3.$$


Ex 3.4.13 For a fixed positive integer $n$, if $\Delta = \begin{vmatrix} n! & (n+1)! & (n+2)! \\ (n+1)! & (n+2)! & (n+3)! \\ (n+2)! & (n+3)! & (n+4)! \end{vmatrix}$, then show that $\left[\dfrac{\Delta}{(n!)^3} - 4\right]$ is divisible by $n$.

Solution: Taking out $n!$, $(n+1)!$ and $(n+2)!$ from the first, second and third rows respectively, we get
$$\Delta = n!\,(n+1)!\,(n+2)!\begin{vmatrix} 1 & n+1 & (n+1)(n+2) \\ 1 & n+2 & (n+2)(n+3) \\ 1 & n+3 & (n+3)(n+4) \end{vmatrix}$$
$$= (n!)^3(n+1)^2(n+2)\begin{vmatrix} 1 & n+1 & (n+1)(n+2) \\ 0 & 1 & 2(n+2) \\ 0 & 1 & 2(n+3) \end{vmatrix} \quad [R_2' = R_2 - R_1,\; R_3' = R_3 - R_2]$$
$$= (n!)^3(n+1)^2(n+2)\{2(n+3) - 2(n+2)\} = 2(n!)^3(n+1)^2(n+2).$$
Therefore
$$\frac{\Delta}{(n!)^3} - 4 = 2(n+1)^2(n+2) - 4 = 2n(n^2 + 4n + 5),$$
so $\left[\dfrac{\Delta}{(n!)^3} - 4\right]$ is divisible by $n$.
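The closed form $\Delta = 2(n!)^3(n+1)^2(n+2)$ derived above can be verified exactly with integer arithmetic (a sketch, not from the text):

```python
from math import factorial

def det3(m):
    """Exact 3x3 integer determinant by cofactor expansion."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

for n in range(1, 8):
    M = [[factorial(n + i + j) for j in range(3)] for i in range(3)]
    delta = det3(M)
    assert delta == 2 * factorial(n) ** 3 * (n + 1) ** 2 * (n + 2)
    assert delta // factorial(n) ** 3 - 4 == 2 * n * (n * n + 4 * n + 5)
print("verified for n = 1..7")
```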


Ex 3.4.14 Show that $\begin{vmatrix} -bc & bc + b^2 & bc + c^2 \\ ca + a^2 & -ca & ca + c^2 \\ ab + a^2 & ab + b^2 & -ab \end{vmatrix} = (ab + bc + ca)^3$.

Solution: Let the value of the determinant be $\Delta$ and write $\omega = ab + bc + ca$. Multiplying the rows by $a, b, c$ respectively,
$$\Delta = \frac{1}{abc}\begin{vmatrix} -abc & abc + ab^2 & abc + ac^2 \\ abc + a^2b & -abc & abc + bc^2 \\ abc + a^2c & abc + b^2c & -abc \end{vmatrix} = \frac{1}{abc}\begin{vmatrix} a\omega & b\omega & c\omega \\ abc + a^2b & -abc & abc + bc^2 \\ abc + a^2c & abc + b^2c & -abc \end{vmatrix} \quad [R_1' = R_1 + R_2 + R_3].$$
Dividing the columns by $a, b, c$ respectively (which cancels the factor $\frac{1}{abc}$),
$$\Delta = \omega\begin{vmatrix} 1 & 1 & 1 \\ bc + ab & -ca & ab + bc \\ bc + ac & ac + bc & -ab \end{vmatrix} = \omega\begin{vmatrix} 1 & 0 & 0 \\ bc + ab & -\omega & 0 \\ bc + ac & 0 & -\omega \end{vmatrix} \quad [C_2' = C_2 - C_1,\; C_3' = C_3 - C_1]$$
$$= \omega(-\omega)(-\omega) = (ab + bc + ca)^3.$$


Ex 3.4.15 Show that $\begin{vmatrix} 1 + a^2 - b^2 & 2ab & -2b \\ 2ab & 1 - a^2 + b^2 & 2a \\ 2b & -2a & 1 - a^2 - b^2 \end{vmatrix} = (1 + a^2 + b^2)^3$.

Solution: Let the value of the determinant be $\Delta$. Applying $C_1' = C_1 - bC_3$ and $C_2' = C_2 + aC_3$,
$$\Delta = \begin{vmatrix} 1 + a^2 + b^2 & 0 & -2b \\ 0 & 1 + a^2 + b^2 & 2a \\ b(1 + a^2 + b^2) & -a(1 + a^2 + b^2) & 1 - a^2 - b^2 \end{vmatrix} = (1 + a^2 + b^2)^2\begin{vmatrix} 1 & 0 & -2b \\ 0 & 1 & 2a \\ b & -a & 1 - a^2 - b^2 \end{vmatrix}$$
$$= (1 + a^2 + b^2)^2\{(1 - a^2 - b^2 + 2a^2) - 2b(0 - b)\} = (1 + a^2 + b^2)^2(1 + a^2 + b^2) = (1 + a^2 + b^2)^3.$$


Ex 3.4.16 Show that $\begin{vmatrix} 1 + a & 1 & 1 \\ 1 & 1 + b & 1 \\ 1 & 1 & 1 + c \end{vmatrix} = abc\left(1 + \dfrac{1}{a} + \dfrac{1}{b} + \dfrac{1}{c}\right)$.

Solution: Let the value of the determinant be $\Delta$. Taking out $a, b, c$ from the first, second and third rows respectively,
$$\Delta = abc\begin{vmatrix} \frac{1}{a} + 1 & \frac{1}{a} & \frac{1}{a} \\ \frac{1}{b} & \frac{1}{b} + 1 & \frac{1}{b} \\ \frac{1}{c} & \frac{1}{c} & \frac{1}{c} + 1 \end{vmatrix} = abc\left(1 + \frac{1}{a} + \frac{1}{b} + \frac{1}{c}\right)\begin{vmatrix} 1 & 1 & 1 \\ \frac{1}{b} & \frac{1}{b} + 1 & \frac{1}{b} \\ \frac{1}{c} & \frac{1}{c} & \frac{1}{c} + 1 \end{vmatrix}$$
$$\left[R_1' = R_1 + R_2 + R_3, \text{ the common factor } 1 + \tfrac{1}{a} + \tfrac{1}{b} + \tfrac{1}{c} \text{ taken out of the first row}\right]$$
$$= abc\left(1 + \frac{1}{a} + \frac{1}{b} + \frac{1}{c}\right)\begin{vmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{vmatrix} \quad \left[R_2' = R_2 - \tfrac{1}{b}R_1,\; R_3' = R_3 - \tfrac{1}{c}R_1\right]$$
$$= abc\left(1 + \frac{1}{a} + \frac{1}{b} + \frac{1}{c}\right).$$



Ex 3.4.17 Show that $\begin{vmatrix} a & b & ax + by \\ b & c & bx + cy \\ ax + by & bx + cy & 0 \end{vmatrix} = (b^2 - ac)(ax^2 + 2bxy + cy^2)$.

Solution: Let the value of the determinant be $\Delta$. Applying $C_3' = C_3 - xC_1 - yC_2$,
$$\Delta = \begin{vmatrix} a & b & 0 \\ b & c & 0 \\ ax + by & bx + cy & -ax^2 - 2bxy - cy^2 \end{vmatrix} = -(ax^2 + 2bxy + cy^2)\begin{vmatrix} a & b \\ b & c \end{vmatrix}$$
$$= -(ax^2 + 2bxy + cy^2)(ac - b^2) = (b^2 - ac)(ax^2 + 2bxy + cy^2).$$



Ex 3.4.18 Show that $\begin{vmatrix} 0 & (a-b)^2 & (a-c)^2 \\ (b-a)^2 & 0 & (b-c)^2 \\ (c-a)^2 & (c-b)^2 & 0 \end{vmatrix} = 2(b-c)^2(c-a)^2(a-b)^2$.

Solution: Since the $(i,j)$th entry is $(a_i - a_j)^2 = a_i^2 - 2a_ia_j + a_j^2$ (with $a_1 = a$, $a_2 = b$, $a_3 = c$), the determinant can be written as a row-by-row product:
$$\Delta = \begin{vmatrix} (a-a)^2 & (a-b)^2 & (a-c)^2 \\ (b-a)^2 & (b-b)^2 & (b-c)^2 \\ (c-a)^2 & (c-b)^2 & (c-c)^2 \end{vmatrix} = \begin{vmatrix} a^2 & a & 1 \\ b^2 & b & 1 \\ c^2 & c & 1 \end{vmatrix} \cdot \begin{vmatrix} 1 & -2a & a^2 \\ 1 & -2b & b^2 \\ 1 & -2c & c^2 \end{vmatrix}.$$
Now
$$\begin{vmatrix} a^2 & a & 1 \\ b^2 & b & 1 \\ c^2 & c & 1 \end{vmatrix} = -(a-b)(b-c)(c-a), \qquad \begin{vmatrix} 1 & -2a & a^2 \\ 1 & -2b & b^2 \\ 1 & -2c & c^2 \end{vmatrix} = -2\begin{vmatrix} 1 & a & a^2 \\ 1 & b & b^2 \\ 1 & c & c^2 \end{vmatrix} = -2(a-b)(b-c)(c-a).$$
Therefore $\Delta = 2(a-b)^2(b-c)^2(c-a)^2$.


Ex 3.4.19 Show that $\begin{vmatrix} x & l & m & 1 \\ \alpha & x & n & 1 \\ \alpha & \beta & x & 1 \\ \alpha & \beta & \gamma & 1 \end{vmatrix} = (x - \alpha)(x - \beta)(x - \gamma)$ for any values of $l, m, n$.

Solution: Let the value of the determinant be $\Delta$. Applying $R_2' = R_2 - R_1$, $R_3' = R_3 - R_1$, $R_4' = R_4 - R_1$ and expanding along the fourth column,
$$\Delta = \begin{vmatrix} x & l & m & 1 \\ \alpha - x & x - l & n - m & 0 \\ \alpha - x & \beta - l & x - m & 0 \\ \alpha - x & \beta - l & \gamma - m & 0 \end{vmatrix} = (-1)^{1+4}\begin{vmatrix} \alpha - x & x - l & n - m \\ \alpha - x & \beta - l & x - m \\ \alpha - x & \beta - l & \gamma - m \end{vmatrix}$$
$$= -(\alpha - x)\begin{vmatrix} 1 & x - l & n - m \\ 1 & \beta - l & x - m \\ 1 & \beta - l & \gamma - m \end{vmatrix} = (x - \alpha)\begin{vmatrix} 1 & x - l & n - m \\ 0 & \beta - x & x - n \\ 0 & 0 & \gamma - x \end{vmatrix} \quad [R_2' = R_2 - R_1,\; R_3' = R_3 - R_2]$$
$$= (x - \alpha)(\beta - x)(\gamma - x) = (x - \alpha)(x - \beta)(x - \gamma),$$
which does not involve $l, m, n$; consequently $\Delta = (x - \alpha)(x - \beta)(x - \gamma)$ holds for any values of $l, m, n$.


Ex 3.4.20 Prove that $\begin{vmatrix} (b+c)^2 & a^2 & a^2 \\ b^2 & (c+a)^2 & b^2 \\ c^2 & c^2 & (a+b)^2 \end{vmatrix} = 2abc(a + b + c)^3$. [WBUT 2004, 2009]

Solution:
$$\Delta = \begin{vmatrix} (b+c)^2 & a^2 - (b+c)^2 & a^2 - (b+c)^2 \\ b^2 & (c+a)^2 - b^2 & 0 \\ c^2 & 0 & (a+b)^2 - c^2 \end{vmatrix} \quad [C_2' = C_2 - C_1,\; C_3' = C_3 - C_1]$$
$$= (a+b+c)^2\begin{vmatrix} (b+c)^2 & a - b - c & a - b - c \\ b^2 & c + a - b & 0 \\ c^2 & 0 & a + b - c \end{vmatrix}$$
$$= (a+b+c)^2\begin{vmatrix} 2bc & -2c & -2b \\ b^2 & c + a - b & 0 \\ c^2 & 0 & a + b - c \end{vmatrix} \quad [R_1' = R_1 - R_2 - R_3]$$
$$= 2(a+b+c)^2\begin{vmatrix} bc & -c & -b \\ b^2 & c + a - b & 0 \\ c^2 & 0 & a + b - c \end{vmatrix} = \frac{2(a+b+c)^2}{bc}\begin{vmatrix} bc & -bc & -bc \\ b^2 & b(c + a - b) & 0 \\ c^2 & 0 & c(a + b - c) \end{vmatrix}$$
(multiplying $C_2$ by $b$ and $C_3$ by $c$)
$$= 2(a+b+c)^2\begin{vmatrix} 1 & -1 & -1 \\ b^2 & bc + ab - b^2 & 0 \\ c^2 & 0 & ac + bc - c^2 \end{vmatrix} \quad [\text{taking out } bc \text{ from the first row}]$$
$$= 2(a+b+c)^2\begin{vmatrix} 1 & 0 & 0 \\ b^2 & bc + ab & b^2 \\ c^2 & c^2 & ac + bc \end{vmatrix} \quad [C_2' = C_2 + C_1,\; C_3' = C_3 + C_1]$$
$$= 2(a+b+c)^2\begin{vmatrix} b(a+c) & b^2 \\ c^2 & c(a+b) \end{vmatrix} = 2bc(a+b+c)^2\begin{vmatrix} a+c & b \\ c & a+b \end{vmatrix}$$
$$= 2bc(a+b+c)^2(a^2 + ab + ac) = 2abc(a+b+c)^3.$$
Ex 3.4.21 Show that $\begin{vmatrix} a^2 + \lambda & ab & ac \\ ab & b^2 + \lambda & bc \\ ac & bc & c^2 + \lambda \end{vmatrix}$ is divisible by $\lambda^2$, and find the other factor.

Solution:
$$\begin{vmatrix} a^2 + \lambda & ab & ac \\ ab & b^2 + \lambda & bc \\ ac & bc & c^2 + \lambda \end{vmatrix} = abc\begin{vmatrix} a + \frac{\lambda}{a} & b & c \\ a & b + \frac{\lambda}{b} & c \\ a & b & c + \frac{\lambda}{c} \end{vmatrix}$$
[dividing the first, second and third rows by $a, b, c$ respectively]
$$= \begin{vmatrix} a^2 + \lambda & b^2 & c^2 \\ a^2 & b^2 + \lambda & c^2 \\ a^2 & b^2 & c^2 + \lambda \end{vmatrix}$$
[multiplying the first, second and third columns by $a, b, c$ respectively]
$$= \begin{vmatrix} \lambda + a^2 + b^2 + c^2 & b^2 & c^2 \\ \lambda + a^2 + b^2 + c^2 & b^2 + \lambda & c^2 \\ \lambda + a^2 + b^2 + c^2 & b^2 & c^2 + \lambda \end{vmatrix} \quad [C_1' = C_1 + C_2 + C_3]$$
$$= (\lambda + a^2 + b^2 + c^2)\begin{vmatrix} 1 & b^2 & c^2 \\ 1 & b^2 + \lambda & c^2 \\ 1 & b^2 & c^2 + \lambda \end{vmatrix} \quad [\lambda + a^2 + b^2 + c^2 \text{ taken out}]$$
$$= (\lambda + a^2 + b^2 + c^2)\begin{vmatrix} 1 & b^2 & c^2 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda \end{vmatrix} \quad [R_2' = R_2 - R_1,\; R_3' = R_3 - R_1]$$
$$= (\lambda + a^2 + b^2 + c^2)\lambda^2.$$
Hence $\lambda^2$ and $\lambda + a^2 + b^2 + c^2$ are the only two factors of the given determinant.
Ex 3.4.22 If $\sum_{i=1}^{4} a_i^2 = \sum_{i=1}^{4} b_i^2 = \sum_{i=1}^{4} c_i^2 = \sum_{i=1}^{4} d_i^2 = 1$ and $\sum_{i=1}^{4} a_ib_i = \sum_{i=1}^{4} b_ic_i = \sum_{i=1}^{4} c_id_i = \sum_{i=1}^{4} d_ia_i = \sum_{i=1}^{4} a_ic_i = \sum_{i=1}^{4} b_id_i = 0$, then show that
$$\Delta = \begin{vmatrix} a_1 & b_1 & c_1 & d_1 \\ a_2 & b_2 & c_2 & d_2 \\ a_3 & b_3 & c_3 & d_3 \\ a_4 & b_4 & c_4 & d_4 \end{vmatrix} = \pm 1.$$

Solution: Let the given determinant be $\Delta$. Then
$$\Delta^2 = \begin{vmatrix} a_1 & b_1 & c_1 & d_1 \\ a_2 & b_2 & c_2 & d_2 \\ a_3 & b_3 & c_3 & d_3 \\ a_4 & b_4 & c_4 & d_4 \end{vmatrix} \cdot \begin{vmatrix} a_1 & b_1 & c_1 & d_1 \\ a_2 & b_2 & c_2 & d_2 \\ a_3 & b_3 & c_3 & d_3 \\ a_4 & b_4 & c_4 & d_4 \end{vmatrix}.$$
Using multiplication column by column, we get
$$\Delta^2 = \begin{vmatrix} \sum a_i^2 & \sum a_ib_i & \sum a_ic_i & \sum a_id_i \\ \sum a_ib_i & \sum b_i^2 & \sum b_ic_i & \sum b_id_i \\ \sum a_ic_i & \sum b_ic_i & \sum c_i^2 & \sum c_id_i \\ \sum a_id_i & \sum b_id_i & \sum c_id_i & \sum d_i^2 \end{vmatrix} = \begin{vmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{vmatrix} = 1.$$
Therefore $\Delta^2 = 1$, i.e., $\Delta = \pm 1$.
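The hypotheses of Ex 3.4.22 say exactly that the columns of the matrix are orthonormal. The following sketch (not from the text) builds such a matrix via a QR factorization and confirms that its determinant is $\pm 1$:

```python
import numpy as np

# An orthogonal 4x4 matrix has orthonormal columns and determinant +-1.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))

assert np.allclose(Q.T @ Q, np.eye(4))           # columns orthonormal
print(np.isclose(abs(np.linalg.det(Q)), 1.0))    # True
```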
Ex 3.4.23 Prove that
$$\begin{vmatrix} a+1 & a & a & a \\ a & a+2 & a & a \\ a & a & a+3 & a \\ a & a & a & a+4 \end{vmatrix} = 24\left(1 + \frac{a}{1} + \frac{a}{2} + \frac{a}{3} + \frac{a}{4}\right). \qquad \text{[WBUT 2004]}$$

Solution: Dividing the first, second, third and fourth columns by $1, 2, 3, 4$ respectively,
$$\Delta = (1\cdot 2\cdot 3\cdot 4)\begin{vmatrix} 1 + \frac{a}{1} & \frac{a}{2} & \frac{a}{3} & \frac{a}{4} \\ \frac{a}{1} & 1 + \frac{a}{2} & \frac{a}{3} & \frac{a}{4} \\ \frac{a}{1} & \frac{a}{2} & 1 + \frac{a}{3} & \frac{a}{4} \\ \frac{a}{1} & \frac{a}{2} & \frac{a}{3} & 1 + \frac{a}{4} \end{vmatrix}$$
$$= 24\left(1 + \frac{a}{1} + \frac{a}{2} + \frac{a}{3} + \frac{a}{4}\right)\begin{vmatrix} 1 & \frac{a}{2} & \frac{a}{3} & \frac{a}{4} \\ 1 & 1 + \frac{a}{2} & \frac{a}{3} & \frac{a}{4} \\ 1 & \frac{a}{2} & 1 + \frac{a}{3} & \frac{a}{4} \\ 1 & \frac{a}{2} & \frac{a}{3} & 1 + \frac{a}{4} \end{vmatrix}$$
$$[C_1' = C_1 + C_2 + C_3 + C_4, \text{ the common factor taken out of the first column}]$$
$$= 24\left(1 + \frac{a}{1} + \frac{a}{2} + \frac{a}{3} + \frac{a}{4}\right)\begin{vmatrix} 1 & \frac{a}{2} & \frac{a}{3} & \frac{a}{4} \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{vmatrix} \quad [R_2' = R_2 - R_1,\; R_3' = R_3 - R_1,\; R_4' = R_4 - R_1]$$
$$= 24\left(1 + \frac{a}{1} + \frac{a}{2} + \frac{a}{3} + \frac{a}{4}\right).$$

Ex 3.4.24 If $u = ax^4 + 4bx^3 + 6cx^2 + 4dx + e$, $u_{11} = ax^2 + 2bx + c$, $u_{12} = bx^2 + 2cx + d$, $u_{22} = cx^2 + 2dx + e$, then prove that
$$\begin{vmatrix} a & b & c & u_{11} \\ b & c & d & u_{12} \\ c & d & e & u_{22} \\ u_{11} & u_{12} & u_{22} & 0 \end{vmatrix} = -u\begin{vmatrix} a & b & c \\ b & c & d \\ c & d & e \end{vmatrix}.$$

Solution: Multiplying the first row by $x^2$,
$$\begin{vmatrix} a & b & c & u_{11} \\ b & c & d & u_{12} \\ c & d & e & u_{22} \\ u_{11} & u_{12} & u_{22} & 0 \end{vmatrix} = \frac{1}{x^2}\begin{vmatrix} ax^2 & bx^2 & cx^2 & u_{11}x^2 \\ b & c & d & u_{12} \\ c & d & e & u_{22} \\ u_{11} & u_{12} & u_{22} & 0 \end{vmatrix} = \frac{1}{x^2}\begin{vmatrix} u_{11} & u_{12} & u_{22} & u \\ b & c & d & u_{12} \\ c & d & e & u_{22} \\ u_{11} & u_{12} & u_{22} & 0 \end{vmatrix}$$
$$[\text{using } R_1' = R_1 + 2xR_2 + R_3, \text{ since } u_{11}x^2 + 2xu_{12} + u_{22} = u]$$
$$= \frac{1}{x^2}\begin{vmatrix} 0 & 0 & 0 & u \\ b & c & d & u_{12} \\ c & d & e & u_{22} \\ u_{11} & u_{12} & u_{22} & 0 \end{vmatrix} \quad [R_1'' = R_1' - R_4]$$
$$= \frac{u}{x^2}(-1)^{1+4}\begin{vmatrix} b & c & d \\ c & d & e \\ u_{11} & u_{12} & u_{22} \end{vmatrix} = -\frac{u}{x^2}\begin{vmatrix} b & c & d \\ c & d & e \\ ax^2 & bx^2 & cx^2 \end{vmatrix} \quad [R_3' = R_3 - 2xR_1 - R_2]$$
$$= -u\begin{vmatrix} b & c & d \\ c & d & e \\ a & b & c \end{vmatrix} = -u\begin{vmatrix} a & b & c \\ b & c & d \\ c & d & e \end{vmatrix},$$
by interchanging the first and third rows and then the first and second rows (an even permutation of the rows, so the sign is unchanged).
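The identity of Ex 3.4.24 can be spot-checked numerically (a sketch, not from the text; the values of $a, \ldots, e, x$ are arbitrary):

```python
import numpy as np

a, b, c, d, e, x = 1.0, 1.0, 2.0, 1.0, 1.0, 1.0
u = a*x**4 + 4*b*x**3 + 6*c*x**2 + 4*d*x + e
u11 = a*x**2 + 2*b*x + c
u12 = b*x**2 + 2*c*x + d
u22 = c*x**2 + 2*d*x + e

lhs = np.linalg.det(np.array([[a, b, c, u11],
                              [b, c, d, u12],
                              [c, d, e, u22],
                              [u11, u12, u22, 0.0]]))
rhs = -u * np.linalg.det(np.array([[a, b, c],
                                   [b, c, d],
                                   [c, d, e]]))
print(np.isclose(lhs, rhs))   # True
```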

3.4.2  Minors and Co-factors

Let us consider an $n \times n$ matrix $A = [a_{ij}]$. Consider $M_{ij}$, the $(n-1) \times (n-1)$ sub-matrix of $A$ obtained by deleting the $i$th row and $j$th column of $A$. Now, $|M_{ij}|$ is called the minor of $a_{ij}$. The co-factor $A_{ij}$ of the element $a_{ij}$ is defined as
$$A_{ij} = (-1)^{i+j}|M_{ij}|. \tag{3.16}$$
For the matrix $A = \begin{pmatrix} 1 & -2 & 4 \\ 3 & -6 & 5 \\ -2 & 0 & -1 \end{pmatrix}$, the pairs $|M_{22}|, A_{22}$ and $|M_{31}|, A_{31}$ are given by
$$|M_{22}| = \begin{vmatrix} 1 & 4 \\ -2 & -1 \end{vmatrix} = -1 + 8 = 7; \qquad A_{22} = (-1)^{2+2}|M_{22}| = 7;$$
$$|M_{31}| = \begin{vmatrix} -2 & 4 \\ -6 & 5 \end{vmatrix} = -10 + 24 = 14; \qquad A_{31} = (-1)^{3+1}|M_{31}| = 14,$$
which are respectively the minors and co-factors. It is obvious that if $(i + j)$ is even, then the minor and co-factor of $a_{ij}$ are the same. Since each term in the expansion of a determinant contains one element from any particular row (or column), we can express the expansion as a linear function of that row (or column); for
$$|A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} = a_{11}A_{11} + a_{12}A_{12} + a_{13}A_{13},$$
where $A_{11}, A_{12}, A_{13}$ are the co-factors of $a_{11}, a_{12}$ and $a_{13}$ respectively.
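Minors, co-factors and the co-factor (Laplace) expansion can be sketched in a few lines of code (not the author's; the example matrix is the one used just above, with indices shifted to 0-based):

```python
import numpy as np

def minor(a, i, j):
    """|M_ij|: determinant of a with row i and column j deleted."""
    sub = np.delete(np.delete(a, i, axis=0), j, axis=1)
    return np.linalg.det(sub)

def cofactor(a, i, j):
    """A_ij = (-1)^(i+j) |M_ij|."""
    return (-1) ** (i + j) * minor(a, i, j)

A = np.array([[1.0, -2.0, 4.0],
              [3.0, -6.0, 5.0],
              [-2.0, 0.0, -1.0]])

print(round(minor(A, 1, 1)))      # 7   (|M_22| in 1-based notation)
print(round(cofactor(A, 2, 0)))   # 14  (A_31 in 1-based notation)

# the co-factor expansion along the first row reproduces det A
expansion = sum(A[0, j] * cofactor(A, 0, j) for j in range(3))
assert np.isclose(expansion, np.linalg.det(A))
```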
Complementary minor, algebraic complement, principal minor: Let $M$ be a minor of order $m$ in $|A| = |a_{ij}|_{n\times n}$. Now, if we delete all the rows and columns forming $M$, the minor $N$ of order $(n - m)$ formed by the remaining rows and columns is called the complementary minor of $M$. For the determinant
$$\Delta = \begin{vmatrix} 1 & 2 & 4 & 6 \\ 3 & 6 & 5 & 0 \\ 2 & 0 & 1 & 9 \\ 2 & 7 & 5 & 4 \end{vmatrix}, \qquad \begin{vmatrix} 1 & 2 \\ 3 & 6 \end{vmatrix} \text{ and } \begin{vmatrix} 1 & 9 \\ 5 & 4 \end{vmatrix}$$
are complementary. Let $M$ be a minor of order $r$ in which rows $i_1, i_2, \ldots, i_r$ and columns $j_1, j_2, \ldots, j_r$ are present. The algebraic complement of $M$ is
$$(-1)^{i_1 + i_2 + \cdots + i_r + j_1 + j_2 + \cdots + j_r}M', \tag{3.17}$$
where $M'$ is the complementary minor of $M$. The algebraic complement of a single element $a_{ij}$ in $|a_{ij}|$ is the co-factor of $a_{ij}$ in $|a_{ij}|$. In the above example, the algebraic complement of $\begin{vmatrix} 1 & 2 \\ 3 & 6 \end{vmatrix}$ in $\Delta$ is
$$(-1)^{1+2+1+2}\begin{vmatrix} 1 & 9 \\ 5 & 4 \end{vmatrix} = \begin{vmatrix} 1 & 9 \\ 5 & 4 \end{vmatrix}.$$
If the row and column indices in a minor are the same, then the minor is said to be a principal minor; equivalently, the diagonal elements of the minor are taken from the diagonal elements of the matrix. In the above example, the minor $\begin{vmatrix} 6 & 0 \\ 7 & 4 \end{vmatrix}$ of $\Delta$ (rows 2, 4 and columns 2, 4) is a principal minor. Since in a principal minor the sum of the row subscripts and the identical column subscripts is always even, the attached sign is always positive; in this case the algebraic complement of a principal minor is equal to its complementary minor.


Ex 3.4.25 If $\Delta = \begin{vmatrix} a & h & 0 \\ h & b & f \\ 0 & f & c \end{vmatrix}$ and $\Delta_0$ is the determinant obtained from the adjugate determinant
$$\begin{vmatrix} bc - f^2 & -ch & fh \\ -ch & ac & -af \\ fh & -af & ab - h^2 \end{vmatrix}$$
by taking out the factors $fc$, $1$ and $ah$ from its first, second and third rows respectively, find $\Delta_0/\Delta^2$ in its simplest form.

Solution: By the construction of $\Delta_0$, the adjugate determinant of $\Delta$ equals $fc \cdot 1 \cdot ah \cdot \Delta_0 = cafh\,\Delta_0$. By Jacobi's theorem (Theorem 3.4.2, proved below), the adjugate determinant of a third-order determinant $\Delta$ equals $\Delta^{3-1} = \Delta^2$. Hence
$$\Delta^2 = cafh\,\Delta_0, \quad \text{or,} \quad \frac{\Delta_0}{\Delta^2} = \frac{1}{cafh}.$$
Theorem 3.4.1 (Laplace's theorem): In an $n$th-order square matrix $A$, $|A|$ can be expressed as the aggregate of the products of all minors of order $r$ formed from any $r$ selected rows of $A$ and their corresponding algebraic complements.

According to this theorem, we expand $|A| = |a_{ij}|_{4\times 4}$. Let the first two rows be selected. If we expand the determinant in terms of minors of order 2, we get
$$|A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix}(-1)^{1+2+1+2}\begin{vmatrix} a_{33} & a_{34} \\ a_{43} & a_{44} \end{vmatrix} + \begin{vmatrix} a_{11} & a_{13} \\ a_{21} & a_{23} \end{vmatrix}(-1)^{1+2+1+3}\begin{vmatrix} a_{32} & a_{34} \\ a_{42} & a_{44} \end{vmatrix}$$
$$+ \begin{vmatrix} a_{11} & a_{14} \\ a_{21} & a_{24} \end{vmatrix}(-1)^{1+2+1+4}\begin{vmatrix} a_{32} & a_{33} \\ a_{42} & a_{43} \end{vmatrix} + \begin{vmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{vmatrix}(-1)^{1+2+2+3}\begin{vmatrix} a_{31} & a_{34} \\ a_{41} & a_{44} \end{vmatrix}$$
$$+ \begin{vmatrix} a_{12} & a_{14} \\ a_{22} & a_{24} \end{vmatrix}(-1)^{1+2+2+4}\begin{vmatrix} a_{31} & a_{33} \\ a_{41} & a_{43} \end{vmatrix} + \begin{vmatrix} a_{13} & a_{14} \\ a_{23} & a_{24} \end{vmatrix}(-1)^{1+2+3+4}\begin{vmatrix} a_{31} & a_{32} \\ a_{41} & a_{42} \end{vmatrix}.$$
Ex 3.4.26 Using Laplace's theorem, show that
$$|A| = \begin{vmatrix} a & b & c & d \\ -b & a & d & -c \\ -c & -d & a & b \\ -d & c & -b & a \end{vmatrix} = (a^2 + b^2 + c^2 + d^2)^2.$$

Solution: Using Laplace's theorem with the first two rows, we get
$$|A| = (-1)^{1+2+1+2}\begin{vmatrix} a & b \\ -b & a \end{vmatrix}\begin{vmatrix} a & b \\ -b & a \end{vmatrix} + (-1)^{1+2+1+3}\begin{vmatrix} a & c \\ -b & d \end{vmatrix}\begin{vmatrix} -d & b \\ c & a \end{vmatrix}$$
$$+ (-1)^{1+2+1+4}\begin{vmatrix} a & d \\ -b & -c \end{vmatrix}\begin{vmatrix} -d & a \\ c & -b \end{vmatrix} + (-1)^{1+2+2+3}\begin{vmatrix} b & c \\ a & d \end{vmatrix}\begin{vmatrix} -c & b \\ -d & a \end{vmatrix}$$
$$+ (-1)^{1+2+2+4}\begin{vmatrix} b & d \\ a & -c \end{vmatrix}\begin{vmatrix} -c & a \\ -d & -b \end{vmatrix} + (-1)^{1+2+3+4}\begin{vmatrix} c & d \\ d & -c \end{vmatrix}\begin{vmatrix} -c & -d \\ -d & c \end{vmatrix}$$
$$= (a^2 + b^2)(a^2 + b^2) + (ad + bc)(ad + bc) + (ac - bd)(ac - bd)$$
$$+ (ac - bd)(ac - bd) + (ad + bc)(ad + bc) + (c^2 + d^2)(c^2 + d^2)$$
$$= (a^2 + b^2)^2 + 2(a^2d^2 + b^2c^2 + 2abcd + a^2c^2 + b^2d^2 - 2abcd) + (c^2 + d^2)^2$$
$$= (a^2 + b^2)^2 + 2(a^2 + b^2)(c^2 + d^2) + (c^2 + d^2)^2 = (a^2 + b^2 + c^2 + d^2)^2.$$
From this we conclude that if $a, b, c, d$ are real numbers, then the given determinant is non-singular if and only if at least one of $a, b, c, d$ is non-zero.
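The identity of Ex 3.4.26 can be checked numerically for concrete values (a sketch, not from the text):

```python
import numpy as np

a, b, c, d = 1.0, 2.0, 3.0, 4.0
A = np.array([[ a,  b,  c,  d],
              [-b,  a,  d, -c],
              [-c, -d,  a,  b],
              [-d,  c, -b,  a]])
lhs = np.linalg.det(A)
rhs = (a*a + b*b + c*c + d*d) ** 2
print(np.isclose(lhs, rhs))   # True (both equal 900)
```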

3.4.3 Adjoint and Reciprocal of Determinant

Let A = [aij] be a square matrix of order n and Aij be the cofactor of aij in |A|. The determinant |Aij|
is called the adjoint or adjugate of |A|. Similarly, the reciprocal or inverse of |A| is defined as the
determinant whose (i, j)th element is Aij/|A|, i.e.,

|A|′ = |Aij/|A||;  where |A| ≠ 0.    (3.18)

Theorem 3.4.2 Jacobi's theorem: Let A = [aij] be a non-singular matrix of order n
and Aij be the cofactor of aij in |A|; then |Aij| = |A|^(n−1).
Proof: Let A = [aij]n×n be a square matrix of order n, and let Aij denote the cofactor of the
(i, j)th element aij in det A. Now,

            | a11 a12 ⋯ a1n | | A11 A21 ⋯ An1 |
|A|·|Aij| = | a21 a22 ⋯ a2n | | A12 A22 ⋯ An2 |
            |  ⋮          ⋮ | |  ⋮          ⋮ |
            | an1 an2 ⋯ ann | | A1n A2n ⋯ Ann |

            | Σk a1k A1k  Σk a1k A2k  ⋯  Σk a1k Ank |
          = | Σk a2k A1k  Σk a2k A2k  ⋯  Σk a2k Ank |
            |  ⋮                                  ⋮ |
            | Σk ank A1k  Σk ank A2k  ⋯  Σk ank Ank |

            | |A|  0  ⋯  0  |
          = |  0  |A| ⋯  0  | = |A|^n,   as  Σ_{k=1}^{n} aik Ajk = |A| if i = j, and = 0 if i ≠ j.
            |  ⋮          ⋮ |
            |  0   0  ⋯ |A| |

Therefore |A|·|Aij| = |A|^n, so that |Aij| = |A|^(n−1), as |A| ≠ 0.
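Jacobi's theorem is easy to verify numerically. The following Python sketch (example matrix and helper names ours, not from the text) builds the full cofactor matrix of a 3×3 matrix and compares its determinant with |A|^(n−1) = |A|²:

```python
def det(m):
    # Determinant by cofactor expansion along the first row.
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] *
               det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cof(m, i, j):
    # Cofactor A_ij: signed minor obtained by deleting row i and column j.
    sub = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(sub)

A = [[2, 1, 0], [1, 3, 2], [0, 1, 4]]   # a non-singular example
C = [[cof(A, i, j) for j in range(3)] for i in range(3)]
print(det(A))   # 16
print(det(C))   # 256 = |A|^(3-1), as Jacobi's theorem asserts
```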




Ex 3.4.27 If the adjugate of
    | a h g |            | A H G |
Δ = | h b f |   be  Δ′ = | H B F | ,
    | g f c |            | G F C |
prove that
(BC − F²)/a = (CA − G²)/b = (AB − H²)/c = Δ  and  (GH − AF)/f = (HF − BG)/g = (FG − CH)/h = Δ.
Solution: Let us consider the following product:
| 1 0 0 | | a h g |   | a h g |
| H B F | | h b f | = | 0 Δ 0 | ,
| G F C | | g f c |   | 0 0 Δ |
since the elements of any row of Δ multiplied by the cofactors of another row sum to zero, while
multiplied by their own cofactors they sum to Δ. Taking determinants on both sides,
(BC − F²)·Δ = aΔ²,  i.e.,  (BC − F²)/a = Δ.
Similarly, by considering the products
| A H G | | a h g |        | A H G | | a h g |
| 0 1 0 | | h b f |  and   | H B F | | h b f |
| G F C | | g f c |        | 0 0 1 | | g f c |
we get
(CA − G²)/b = (AB − H²)/c = Δ respectively. Now we consider the product
| A H G | | a h g |   | Δ 0 0 |
| 0 0 1 | | h b f | = | g f c | ,
| G F C | | g f c |   | 0 0 Δ |
so that (GH − AF)·Δ = fΔ², i.e.,
(GH − AF)/f = Δ.
Similarly, by considering the products
| 0 0 1 | | a h g |        | A H G | | a h g |
| H B F | | h b f |  and   | 1 0 0 | | h b f |
| G F C | | g f c |        | G F C | | g f c |
we get
(HF − BG)/g = Δ and (FG − CH)/h = Δ respectively.



            | bc − a²  ca − b²  ab − c² |
Ex 3.4.28 Prove that | ca − b²  ab − c²  bc − a² | = (a³ + b³ + c³ − 3abc)².
            | ab − c²  bc − a²  ca − b² |

Solution: Let us consider the determinant
    | a b c |
Δ = | b c a |
    | c a b |
whose value, obtained by direct expansion, is −(a³ + b³ + c³ − 3abc). Now, the adjoint of Δ is
     | A B C |
Δ′ = | B C A |
     | C A B |
where A, B, C are the cofactors of a, b, c in Δ. Therefore, A = bc − a², B = ac − b², C = ab − c². By Jacobi's
theorem, Δ′ = Δ^(3−1) = Δ²,

     | bc − a²  ca − b²  ab − c² |
or,  | ca − b²  ab − c²  bc − a² | = (a³ + b³ + c³ − 3abc)².
     | ab − c²  bc − a²  ca − b² |
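The identity of Ex 3.4.28 can be spot-checked with concrete numbers. In this Python sketch (the values a, b, c and the helper name are ours), both sides are evaluated with exact integer arithmetic:

```python
def det3(m):
    # Determinant of a 3x3 matrix by the rule of Sarrus.
    (a, b, c), (d, e, f), (g, h, i) = m
    return a*e*i + b*f*g + c*d*h - c*e*g - b*d*i - a*f*h

a, b, c = 2, 3, 5
lhs = det3([[b*c - a*a, c*a - b*b, a*b - c*c],
            [c*a - b*b, a*b - c*c, b*c - a*a],
            [a*b - c*c, b*c - a*a, c*a - b*b]])
rhs = (a**3 + b**3 + c**3 - 3*a*b*c) ** 2
base = det3([[a, b, c], [b, c, a], [c, a, b]])
print(lhs == rhs, base * base == rhs)  # True True
```

Note that det3 of the base circulant is −(a³ + b³ + c³ − 3abc), so only its square matches the right-hand side, in agreement with the solution above.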

3.4.4 Symmetric and Skew-symmetric Determinants

If A be a symmetric matrix of order n, then |A| is said to be a symmetric determinant of order
n. If A be a skew-symmetric matrix of order n, then |A| is said to be a skew-symmetric determinant
of order n. For example,
| a h g |        |  0  a  b |
| h b f |  and   | −a  0  c |
| g f c |        | −b −c  0 |
are examples of symmetric and skew-symmetric
determinants respectively. Now,
(i) The adjoint of a symmetric determinant is symmetric.
(ii) The square of any determinant is symmetric.
(iii) In a skew-symmetric determinant |A| = |aij|n×n, Aij = (−1)^(n−1) Aji, where Aij and
Aji are the cofactors of aij and aji in |A| respectively.
(iv) The adjoint of a skew-symmetric determinant of order n is symmetric or skew-symmetric
according as n is even or odd.
Theorem 3.4.3 The value of every skew-symmetric determinant of odd order is zero.


Proof: Let A be a skew-symmetric matrix of order n, where n is an odd number. Hence by
definition, A^T = −A. Therefore,
|A^T| = |−A|  ⇒  |A| = (−1)^n |A| = −|A|
⇒  2|A| = 0  ⇒  |A| = 0.
Therefore, the value of every skew-symmetric determinant of odd order is zero.
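Theorem 3.4.3 can be verified in exact integer arithmetic; in this sketch (example entries ours), a 5×5 skew-symmetric matrix with arbitrary integer entries gives determinant exactly zero:

```python
def det(m):
    # Determinant by cofactor expansion along the first row.
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] *
               det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

# A 5x5 (odd-order) skew-symmetric matrix: a_ij = -a_ji, zero diagonal.
S = [[0, 2, -3, 1, 4],
     [-2, 0, 5, -1, 2],
     [3, -5, 0, 6, -7],
     [-1, 1, -6, 0, 8],
     [-4, -2, 7, -8, 0]]
print(det(S))  # 0
```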
Theorem 3.4.4 The value of every skew-symmetric determinant of even order is a perfect
square.
Proof: Let A = [aij] be a skew-symmetric matrix of order n, n even; then by definition aij = −aji
and aii = 0. Let,

     |  0    a12  ⋯  a1n |
     | a21    0   ⋯  a2n |
An = |  ⋮             ⋮  |
     | an1   an2  ⋯   0  |

Now let Aij be the cofactor of aij in |aij|. Then Aij is a determinant of order (n − 1).
Now, if we transform Aij into Aji, every row of it must be multiplied by (−1); therefore,
Aij = (−1)^(n−1) Aji, so that for even n, Aji = −Aij. Thus,

          |  0     A12  ⋯  A1n |
          | −A12    0   ⋯  A2n |
adj|An| = |  ⋮              ⋮  |
          | −A1n  −A2n  ⋯   0  |

By Jacobi's theorem applied to a 2 × 2 minor of the adjoint, we have,

|  0    A12 |          |  0     a34  ⋯  a3n |
| −A12   0  | = |An| · | −a34    0   ⋯  a4n | = |An| |An−2|.
                       |  ⋮              ⋮  |
                       | −a3n  −a4n  ⋯   0  |

Therefore |An||An−2| = A²12, which is a product relation between perfect squares; it holds for n > 2. For n = 2,
        |  0    a12 |
|A2| =  | −a12   0  | = a²12,
which is a perfect square. If |An−2| is a perfect square, then |An| is also a perfect square, by
the relation |An||An−2| = A²12. Let the result be true for n = m; then it is true for
m + 2, since |Am+2||Am| = A²12 is a perfect square. Also, it is true for n = 2, 4. Hence by the method of
induction the result is true for any even positive integer n.


Ex 3.4.29 Show that the value of
    |  0  a  b  c |
    | −a  0  d  e |
Δ = | −b −d  0  f |
    | −c −e −f  0 |
is a perfect square.
Solution: Expanding Δ by the minors of the elements of the first column, we get,

      |  a  b  c |       |  a  b  c |       |  a  b  c |
Δ = a | −d  0  f | − b   |  0  d  e | + c   |  0  d  e |
      | −e −f  0 |       | −e −f  0 |       | −d  0  f |

= af(af − be + cd) − be(af − be + cd) + cd(af − be + cd)
= (af − be + cd)(af − be + cd) = (af − be + cd)².
Since the given determinant Δ is a skew-symmetric determinant of even order, this also verifies
that its value must be a perfect square.
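The closed form (af − be + cd)² can be checked directly. This Python sketch (sample values ours) evaluates the 4×4 skew-symmetric determinant exactly and compares it with the square:

```python
def det(m):
    # Determinant by cofactor expansion along the first row.
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] *
               det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

a, b, c, d, e, f = 1, 2, 3, 4, 5, 6
S = [[0, a, b, c],
     [-a, 0, d, e],
     [-b, -d, 0, f],
     [-c, -e, -f, 0]]
print(det(S), (a*f - b*e + c*d) ** 2)  # 64 64
```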

3.4.5 Vandermonde's Determinant

The Vandermonde determinant is defined by

| x0²  x1²  x2² |
| x0   x1   x2  | = (x0 − x1)(x0 − x2)(x1 − x2) = ∏_{0≤i<j≤2} (xi − xj).
| 1    1    1   |

The difference product of 3 numbers x0, x1, x2 is defined as
D.P.(x0, x1, x2) = (x0 − x1)(x0 − x2)(x1 − x2) = ∏_{0≤i<j≤2} (xi − xj).

Similarly, the other difference products are defined by
D.P.(x0, x1, x2, x3) = (x0 − x1)(x0 − x2)(x0 − x3)(x1 − x2)(x1 − x3)(x2 − x3) = ∏_{0≤i<j≤3} (xi − xj),
D.P.(x0, x1, ..., xn) = ∏_{0≤i<j≤n} (xi − xj).

Therefore, the Vandermonde determinant can be written in D.P. form as

| x0²  x1²  x2² |
| x0   x1   x2  | = ∏_{0≤i<j≤2} (xi − xj) = D.P.(x0, x1, x2).
| 1    1    1   |

Hence the Vandermonde determinant can be written in D.P. form. In general,

| x0^n  x1^n  ⋯  xn^n |
|  ⋮               ⋮  |
| x0    x1    ⋯  xn   | = ∏_{0≤i<j≤n} (xi − xj) = D.P.(x0, x1, x2, ..., xn).
| 1     1     ⋯  1    |
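The identity between the Vandermonde determinant and the difference product can be verified exactly. In this Python sketch (the sample points are ours), the rows carry descending powers, as in the display above:

```python
def det(m):
    # Determinant by cofactor expansion along the first row.
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] *
               det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

xs = [2, 3, 5, 7]
n = len(xs)
# Row r holds the powers x^(n-1-r): top row x^3, ..., bottom row 1.
V = [[x ** (n - 1 - r) for x in xs] for r in range(n)]

dp = 1  # difference product D.P.(x0, ..., x_{n-1})
for i in range(n):
    for j in range(i + 1, n):
        dp *= xs[i] - xs[j]

print(det(V) == dp)  # True
```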

3.4.6 Cramer's Rule

Let us consider a system of n linear algebraic equations in n unknowns x1, x2, ..., xn
of the explicit form

a11 x1 + a12 x2 + ... + a1n xn = b1
a21 x1 + a22 x2 + ... + a2n xn = b2
  ⋮                                    (3.19)
an1 x1 + an2 x2 + ... + ann xn = bn

where the n² coefficients aij and the n constants b1, b2, ..., bn are given real numbers. The
system (3.19) can be written in matrix notation as Ax = b, where A is the real n × n coefficient
matrix in which aij is the coefficient of xj in the ith equation, b^T = [b1, b2, ..., bn] is
the prescribed column n-vector and x^T = [x1, x2, ..., xn] is the unknown column n-vector
to be computed. The equations (3.19) are said to form a homogeneous system if bi = 0 for all i,
and a non-homogeneous system if at least one bi ≠ 0.
Cramer's rule is the simplest method for the solution of a non-homogeneous system (3.19) of n linear
equations in n unknowns. Here it will be assumed that Δ = det(A) ≠ 0, so that a unique solution
for x1, x2, ..., xn exists. From the properties of determinants, we have,

       | a11 x1  a12  ⋯  a1n |
x1 Δ = | a21 x1  a22  ⋯  a2n |
       |  ⋮                ⋮ |
       | an1 x1  an2  ⋯  ann |

       | a11 x1 + a12 x2 + ⋯ + a1n xn   a12  ⋯  a1n |
     = | a21 x1 + a22 x2 + ⋯ + a2n xn   a22  ⋯  a2n |   (C1′ = C1 + x2 C2 + ⋯ + xn Cn)
       |  ⋮                                       ⋮ |
       | an1 x1 + an2 x2 + ⋯ + ann xn   an2  ⋯  ann |

       | b1  a12  ⋯  a1n |
     = | b2  a22  ⋯  a2n |   [using (3.19)] = Δ1 (say).    (3.20)
       |  ⋮            ⋮ |
       | bn  an2  ⋯  ann |

Therefore, x1 = Δ1/Δ. In general, let Ajk be the cofactor of ajk in Δ. If we multiply each
equation of (3.19) by the cofactor in Δ of the coefficient of xk in that equation and add the
results, we have,

(Σ_j aj1 Ajk) x1 + (Σ_j aj2 Ajk) x2 + ⋯ + (Σ_j ajk Ajk) xk + ⋯ + (Σ_j ajn Ajk) xn = Σ_j bj Ajk,

i.e.,  Δ·xk = Σ_{j=1}^{n} Ajk bj;  k = 1, 2, ..., n,

since Σ_j ajr Ajk = Δ if r = k and 0 otherwise. Since the right-hand side would reduce to Δ if
b1, b2, ..., bn were replaced by a1k, a2k, ..., ank, let Δi denote the determinant obtained from Δ
by replacing its ith column by the column vector b. Then the unique solution of (3.19) is given by

xi = Δi/Δ = Δ^{−1} Σ_{j=1}^{n} Aji bj;  i = 1, 2, ..., n.    (3.21)

This is the well-known Cramer's rule. Various methods have been devised to evaluate the
value of a numerical determinant. The following cases may arise:
1. The homogeneous system (bi = 0 for all i) has only the trivial solution x1 = x2 = ⋯ =
xn = 0 if Δ ≠ 0, and a non-trivial solution (at least one of the x's non-zero) exists
when Δ = 0. When Δ = 0, the solution is not unique.
2. The non-homogeneous system of equations is consistent if Δ ≠ 0. In this
case the system (3.19) has a unique solution.
3. If Δ = 0 and all of the Δi's are 0, then the system (3.19) may or may not be consistent. If
consistent, it admits an infinite number of solutions.

4. If Δ = 0 and at least one of the Δi (i = 1, 2, ..., n) is non-zero, then the system is inconsistent
or incompatible and (3.19) has no solution.
Cramer's rule has the following serious drawbacks:
(i) If the size of (3.19) is large (n ≥ 4), Cramer's rule requires an enormous amount of
computation for the evaluation of determinants, and no efficient method exists for the evaluation of determinants. Hence this rule is important theoretically, but certainly not from the numerical
point of view.
(ii) For large systems, this method is not useful, as it involves a large number of manipulations. Cramer's rule requires the computation of (n + 1) determinants, each of order n (for
a system (3.19) of n equations in n variables). To evaluate a determinant of order n
requires (n! − 1) additions and n!(n − 1) multiplications. The total number of multiplications and divisions required to solve (3.19) by this method is (n − 1)(n + 1)! + n.
The scenario is shown in Figure 3.2.
A system of linear equations:
Δ ≠ 0: consistent, with a unique solution.
Δ = 0 and Δ1 = Δ2 = ⋯ = Δn = 0: consistent, with infinitely many solutions.
Δ = 0 and at least one Δi non-zero: inconsistent.

Figure 3.2: Different cases for existence of solutions.

Ex 3.4.30 Solve the following system of equations by Cramer's rule:
x + 2y + 3z = 6, 2x + 4y + z = 7, 3x + 2y + 9z = 14.
Solution: The given system of linear equations can be written in the form Ax = b, where
the coefficient matrix is
    | 1 2 3 |
A = | 2 4 1 | ,
    | 3 2 9 |
b^T = (6 7 14) and x^T = (x y z). Here the determinant of the coefficient matrix A is given by

    | 1 2 3 |
Δ = | 2 4 1 | = 1(4·9 − 1·2) + 2(1·3 − 2·9) + 3(2·2 − 4·3) = −20 (≠ 0).
    | 3 2 9 |

Hence the present system has a unique solution. The determinants Δi, obtained by replacing
the ith column of Δ by the constant vector, are given by

     | 6  2 3 |           | 1  6 3 |           | 1 2  6 |
Δ1 = | 7  4 1 | = −20; Δ2 = | 2  7 1 | = −20; Δ3 = | 2 4  7 | = −20.
     | 14 2 9 |           | 3 14 9 |           | 3 2 14 |

Hence by Cramer's rule the unique solution is

x = −20/−20 = 1;  y = −20/−20 = 1;  z = −20/−20 = 1.

Now the sum of the three equations is 6x + 8y + 13z = 27. Substituting the values of x, y, z
we get LHS = 27, which checks the solution.
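Cramer's rule is short to implement. This Python sketch (function names ours) solves Ex 3.4.30 in exact rational arithmetic by replacing one column at a time:

```python
from fractions import Fraction

def det(m):
    # Determinant by cofactor expansion along the first row.
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] *
               det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cramer(A, b):
    # x_i = det(A_i) / det(A), where A_i is A with column i replaced by b.
    D = det(A)
    return [Fraction(det([row[:i] + [b[r]] + row[i + 1:]
                          for r, row in enumerate(A)]), D)
            for i in range(len(A))]

A = [[1, 2, 3], [2, 4, 1], [3, 2, 9]]
b = [6, 7, 14]
print(cramer(A, b))  # [Fraction(1, 1), Fraction(1, 1), Fraction(1, 1)], i.e. x = y = z = 1
```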
Ex 3.4.31 Find for what values of a and b the system of equations
x + 4y + 2z = 1, 2x + 7y + 5z = 2b, 4x + ay + 10z = 2b + 1
has (i) a unique solution, (ii) no solution and (iii) an infinite number of solutions over the
field of rational numbers.
Solution: The given system of equations can be written in the form Ax = b, where
    | 1 4 2  |
A = | 2 7 5  | ,  b^T = (1  2b  2b + 1) and x^T = (x y z).
    | 4 a 10 |
Hence,
    | 1 4 2  |
Δ = | 2 7 5  | = 1(7·10 − 5a) − 4(2·10 − 5·4) + 2(2a − 7·4) = 14 − a,
    | 4 a 10 |

     | 1    4 2  |
Δ1 = | 2b   7 5  | = 76 − 5a − 68b + 4ab.
     | 2b+1 a 10 |

(i) When 14 − a ≠ 0, i.e., a ≠ 14, then Δ ≠ 0; in this case the solution will be unique.
(ii) When 14 − a = 0, i.e., a = 14, then Δ = 0 and Δ1 = 6 − 12b. If 6 − 12b ≠ 0,
i.e., b ≠ 1/2, then the system has no solution.
(iii) If a = 14 and b = 1/2, then the system of linear equations reduces to x + 4y + 2z =
1, y − z = 1. Thus the system is consistent and has an infinite number of solutions over the
field of rational numbers.

3.5 Complex Matrices

A matrix A is said to be complex if its elements are complex numbers. If A be a complex matrix,
then it can be expressed as A = P + iQ, i = √−1, where P, Q are real matrices. The matrix
A = | 1+3i  −i |
    | 3+i    4 |
can be written as
A = | 1+3i  −i |   | 1 0 |     | 3 −1 |
    | 3+i    4 | = | 3 4 | + i | 1  0 | = P + iQ,
so A is a complex matrix. If each element of the matrix A be replaced by its conjugate,
then the matrix so obtained is called the conjugate of A and is denoted by Ā. Thus, if
A = P + iQ, then Ā = P − iQ. Also, if A = [aij], then Ā = [bij], where bij is the conjugate of aij. Therefore,
for the above example,
Ā = | 1−3i  i |
    | 3−i   4 | .
A matrix A is real if and only if Ā = A.
Property 3.5.1 If Ā and B̄ be the conjugates of the matrices A and B respectively, then,
(i) the conjugate of Ā is A itself;
(ii) the conjugate of A + B is Ā + B̄, A and B being conformable for addition;
(iii) the conjugate of kA is k̄Ā, k being a complex number;
(iv) the conjugate of AB is ĀB̄, A and B being conformable for the product.

3.5.1 Transpose Conjugate of a Matrix

The transpose of the conjugate of a matrix A is called the transpose conjugate of A and is
denoted by A*. Thus,
A* = (Ā)^T = the conjugate of A^T.    (3.22)
It is also called the tranjugate of a matrix. For example, let
A = | 3+2i  3    |          | 3−2i  3    |
    | −i    3+4i | ; then Ā = | i     3−4i | , and so,
A* = (Ā)^T = | 3−2i  i    |
             | 3     3−4i | ,
which is also the conjugate of A^T. As A* = (Ā)^T, A* is the transpose conjugate of A. Note that if A is real, then
A* = A^T.
Property 3.5.2 If A* and B* be the tranjugates of the matrices A and B respectively, then,
(i) [A*]* = A.
(ii) [A + B]* = A* + B*; A, B being conformable for addition.
(iii) [kA]* = k̄A*; k being a complex number.
(iv) [AB]* = B*A*; A, B being conformable for the product.
Consider a complex matrix A. The relationship between A and its conjugate transpose A*
yields the following important kinds of complex matrices.

3.5.2 Hermitian Matrix

A complex matrix A of order n × n is said to be Hermitian if A* = A. From the definition,
it is clear that a square complex matrix A = [aij] is Hermitian if and only if symmetric
elements are conjugate, i.e., aij = āji, in which case each diagonal element aii must be real. For
example, consider the following complex matrix:
    |  7     1−i   3+2i |
A = | 1+i     2    1+2i |
    | 3−2i   1−2i    1  |
By inspection, the diagonal elements of A are real, and the symmetric elements 1−i and 1+i,
3+2i and 3−2i, 1+2i and 1−2i are conjugate. Thus A is a Hermitian matrix.
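The inspection above amounts to checking A* = A, which is straightforward to do mechanically. In this Python sketch (the helper name is ours), complex numbers carry a built-in conjugate method:

```python
def conj_transpose(A):
    # A* = conjugate of the transpose of A.
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

A = [[7,      1 - 1j, 3 + 2j],
     [1 + 1j, 2,      1 + 2j],
     [3 - 2j, 1 - 2j, 1]]
print(conj_transpose(A) == A)  # True: A is Hermitian
```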
Ex 3.5.1 Write down the most general symmetric and Hermitian matrices of order 2.
Solution: The most general symmetric complex matrix of order 2 is
     | a+ib  e+if |
As = | e+if  c+id | ;  i = √−1,
which has six real independent parameters. The most general Hermitian matrix of order 2 can
be written in terms of four real independent parameters in the form
     | a     c+id |
Ah = | c−id  b    | ;  i = √−1.
Ex 3.5.2 If A be a square matrix, then show that AA* and A*A are Hermitian.
Solution: By Property 3.5.2, [AB]* = B*A*. Here we have,
[AA*]* = [A*]* A* = AA*.
Hence AA* is Hermitian. Similarly,
[A*A]* = A* [A*]* = A*A.
Therefore, A*A is Hermitian.

3.5.3 Skew-Hermitian Matrix

A complex matrix A of order n × n is said to be skew-Hermitian if A* = −A. A square
complex matrix A = [aij] is skew-Hermitian if and only if aij = −āji, in which case each
diagonal element aii is either zero or purely imaginary. For example, the complex
matrix
    |   0     2−i    6−3i |
A = | −2−i     0     1+5i |
    | −6−3i  −1+5i    −i  |
is a skew-Hermitian matrix.
Ex 3.5.3 If S = P + iQ be a skew-Hermitian matrix, where P and Q are real matrices, then show that P is a skew-symmetric
matrix and Q is a real symmetric matrix.
Solution: Let S = P + iQ be skew-Hermitian, where P and Q are real matrices.
Now,
S̄ = P − iQ and (S̄)^T = P^T − iQ^T.
Since S is skew-Hermitian, by definition, (S̄)^T = −S, i.e.,
P^T − iQ^T = −P − iQ  ⇒  P^T = −P and Q^T = Q.
Therefore, P is a skew-symmetric matrix and Q is a real symmetric matrix.
Ex 3.5.4 If A be a Hermitian matrix, then show that iA is skew-Hermitian.
Solution: Since A is Hermitian, by definition, A* = A. Now,
[iA]* = ī A* = −i A = −(iA).
Therefore, iA is skew-Hermitian.
Ex 3.5.5 Let A be an n × n matrix whose elements are complex numbers. Show that A + A*
is Hermitian and A − A* is skew-Hermitian.
Solution: Let P = A + A*; then using Property 3.5.2, we get,
P* = (A + A*)* = A* + (A*)* = A* + A = P.
Hence P is Hermitian. Let Q = A − A*; then using Property 3.5.2, we get,
Q* = (A − A*)* = A* − (A*)* = A* − A = −Q.
Hence Q is skew-Hermitian.

3.5.4 Unitary Matrix

A complex square matrix A of order n is said to be unitary if

A*A = AA* = In;  i.e.,  A* = A^{−1}.    (3.23)

Thus A must necessarily be square and invertible. We note that a complex matrix A is
unitary if and only if its rows (columns) form an orthonormal set relative to the dot product
of complex vectors. For example, let
A = (1/2) | 1+i  1−i |             A* = (1/2) | 1−i  1+i |
          | 1−i  1+i | ;  then,              | 1+i  1−i | ,

so, AA* = (1/2) | 1+i  1−i | · (1/2) | 1−i  1+i |   | 1 0 |
                | 1−i  1+i |         | 1+i  1−i | = | 0 1 | = I2

        = (1/2) | 1−i  1+i | · (1/2) | 1+i  1−i |
                | 1+i  1−i |         | 1−i  1+i | = A*A.

Since A*A = AA* = I2, A is a unitary matrix. Note that when a matrix A is real,
Hermitian is the same as symmetric and unitary is the same as orthogonal.
Theorem 3.5.1 For a unitary matrix A, the determinant has unit modulus, i.e., | |A| | = 1.
Proof: Let A be a unitary matrix of order n; then by definition, A*A = In. Therefore,
|A*A| = |In|  ⇒  |A*| |A| = 1
⇒  conj(|A|) · |A| = 1  ⇒  | |A| |² = 1  ⇒  | |A| | = 1.
Therefore, for a unitary matrix A, the determinant of A has modulus 1.
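Both the unitarity of the example matrix above and the unit modulus of its determinant can be verified directly. This Python sketch (helper names ours) works with exact binary fractions, so the comparisons are exact:

```python
A = [[(1 + 1j) / 2, (1 - 1j) / 2],
     [(1 - 1j) / 2, (1 + 1j) / 2]]

def conj_transpose(M):
    # A* = conjugate of the transpose.
    return [[M[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

Astar = conj_transpose(A)
print(matmul(A, Astar))  # [[(1+0j), 0j], [0j, (1+0j)]], i.e. the identity

d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
print(abs(d))            # 1.0: det of a unitary matrix has unit modulus
```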
Ex 3.5.6 If A be an unitary matrix and I + A is non-singular
Solution:

3.5.5 Normal Matrix

A complex square matrix A is said to be normal if it commutes with A*, i.e., if AA* = A*A.
For example, let
A = | 2+3i  1    |             | 2−3i  −i   |
    | i     1+2i | ; then A* = | 1     1−2i | ,

AA* = | 2+3i  1    | | 2−3i  −i   |   | 14    4−4i |
      | i     1+2i | | 1     1−2i | = | 4+4i  6    |

    = | 2−3i  −i   | | 2+3i  1    |
      | 1     1−2i | | i     1+2i | = A*A.

Since AA* = A*A, the complex matrix A is normal. This definition reduces to that for real
matrices when A is real.

3.6 Adjoint of a Matrix

Let A = [aij]n×n be a square matrix of order n. Let Aij be the cofactor of the (i, j)th element
aij in det A. Then the square matrix [Aij]^T is said to be the adjugate or adjoint of A and
is denoted by adj A. So we first find the cofactor of each element in det A; then the adjoint
of A is obtained by transposing the matrix of cofactors. For example, let
A = | 1 2 |
    | 3 4 | ; then,
A11 = (−1)^(1+1) · 4 = 4,  A12 = (−1)^(1+2) · 3 = −3,
A21 = (−1)^(2+1) · 2 = −2, A22 = (−1)^(2+2) · 1 = 1.

                  | A11 A12 |^T   |  4 −2 |
Therefore, adj A = | A21 A22 |   = | −3  1 | .
                  | 1 0 2 |
For the matrix A = | 1 1 0 | , we get,
                  | 2 0 1 |

A11 = (−1)^(1+1) | 1 0 | = 1,   A12 = (−1)^(1+2) | 1 0 | = −1,
                 | 0 1 |                         | 2 1 |

A13 = (−1)^(1+3) | 1 1 | = −2,  A21 = (−1)^(2+1) | 0 2 | = 0,
                 | 2 0 |                         | 0 1 |

A22 = (−1)^(2+2) | 1 2 | = −3,  A23 = (−1)^(2+3) | 1 0 | = 0,
                 | 2 1 |                         | 2 0 |

A31 = (−1)^(3+1) | 0 2 | = −2,  A32 = (−1)^(3+2) | 1 2 | = 2,
                 | 1 0 |                         | 1 0 |

A33 = (−1)^(3+3) | 1 0 | = 1.
                 | 1 1 |

Therefore, the adjugate of A is given by

        | A11 A12 A13 |^T   |  1  0 −2 |
adj A = | A21 A22 A23 |   = | −1 −3  2 | .
        | A31 A32 A33 |     | −2  0  1 |
From the definition, we have,
(i) adj(A^T) = (adj A)^T, and adj(kA) = k^(n−1) adj A, where k is any scalar.
(ii) If O be a zero matrix of order n, then adj O = O.
(iii) If I be a unit matrix of order n, then adj I = I.
(iv) If A is symmetric, then adj A is symmetric.
(v) If A is skew-symmetric, then adj A is symmetric or skew-symmetric according as the
order of A is odd or even.
(vi) For square matrices A, B of the same order, adj(AB) = (adj B)(adj A).
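The adjugate computation above (cofactors, then transpose) translates directly into code. In this Python sketch (helper names ours), the 3×3 example is recomputed and the fundamental relation A·(adj A) = |A|I of the following theorem is checked at the same time:

```python
def det(m):
    # Determinant by cofactor expansion along the first row.
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] *
               det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cof(m, i, j):
    # Cofactor A_ij: signed minor after deleting row i and column j.
    sub = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(sub)

def adj(m):
    # Adjugate: transpose of the cofactor matrix.
    n = len(m)
    return [[cof(m, j, i) for j in range(n)] for i in range(n)]

A = [[1, 0, 2], [1, 1, 0], [2, 0, 1]]
adjA = adj(A)
print(adjA)   # [[1, 0, -2], [-1, -3, 2], [-2, 0, 1]]

# A . adj(A) should be |A| times the identity (here |A| = -3).
prod = [[sum(A[i][k] * adjA[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)]
print(prod)   # [[-3, 0, 0], [0, -3, 0], [0, 0, -3]]
```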
Theorem 3.6.1 If A be a square matrix of order n, then

A·(adj A) = |A| In = (adj A)·A.    (3.24)

Proof: Let A = [aij]n×n be a square matrix of order n. Let Aij denote the cofactor of the
(i, j)th element aij in det A. The (i, j)th element of A(adj A) is the inner product of the ith row
of A and the jth column of adj A, as

           | a11 a12 ⋯ a1n | | A11 A21 ⋯ An1 |
A(adj A) = | a21 a22 ⋯ a2n | | A12 A22 ⋯ An2 |
           |  ⋮          ⋮ | |  ⋮          ⋮ |
           | an1 an2 ⋯ ann | | A1n A2n ⋯ Ann |

           | Σk a1k A1k  Σk a1k A2k  ⋯  Σk a1k Ank |
         = | Σk a2k A1k  Σk a2k A2k  ⋯  Σk a2k Ank |
           |  ⋮                                  ⋮ |
           | Σk ank A1k  Σk ank A2k  ⋯  Σk ank Ank |

           | |A|  0  ⋯  0  |
         = |  0  |A| ⋯  0  | = |A| In,   as  Σ_{k=1}^{n} aik Ajk = |A| if i = j, and 0 if i ≠ j.
           |  ⋮          ⋮ |
           |  0   0  ⋯ |A| |

Similarly, taking the product of adj A and A and proceeding as before, we get (adj A)A =
|A|In. Therefore,
A(adj A) = |A|In = (adj A)A.
Thus the product of a matrix with its adjoint is commutative, and it is a scalar matrix whose
diagonal element is |A|.
Theorem 3.6.2 If A be a non-singular matrix of order n, then |adj A| = |A|^(n−1).
Proof: We know that A(adj A) = |A|In. Hence,
|A(adj A)| = ||A| In|  ⇒  |A| |adj A| = |A|^n
⇒  |adj A| = |A|^(n−1);  as |A| ≠ 0.
Therefore, if A be a non-singular square matrix of order n, then |adj A| = |A|^(n−1).



             | bc − a²  ca − b²  ab − c² |   | a b c |²
Ex 3.6.1 Show that, | ca − b²  ab − c²  bc − a² | = | b c a | .    [KH:09]
             | ab − c²  bc − a²  ca − b² |   | c a b |

Solution: If the right hand side is |A|², then the adjoint of A is given by,
        | bc − a²  ca − b²  ab − c² |
adj A = | ca − b²  ab − c²  bc − a² | .
        | ab − c²  bc − a²  ca − b² |
If A is non-singular of order 3, then by Theorem 3.6.2, we get,

          | bc − a²  ca − b²  ab − c² |          | a b c |²
|adj A| = | ca − b²  ab − c²  bc − a² | = |A|² = | b c a | .
          | ab − c²  bc − a²  ca − b² |          | c a b |
Theorem 3.6.3 If A be a non-singular matrix of order n, then adj(adj A) = |A|^(n−2) A.
Proof: We know that A(adj A) = |A|In. Now, putting adj A in place of A, we get,
(adj A)(adj(adj A)) = |adj A| In
or, (adj A)(adj(adj A)) = |A|^(n−1) In;  as |adj A| = |A|^(n−1)
or, A(adj A)(adj(adj A)) = |A|^(n−1) A
or, |A| In (adj(adj A)) = |A|^(n−1) A
or, adj(adj A) = |A|^(n−2) A;  as |A| ≠ 0.
Therefore, if A be a non-singular matrix of order n, then adj(adj A) = |A|^(n−2) A.


                              |  1  3 −4 |
Ex 3.6.2 Find the matrix A, if adj A = | −2  2 −2 | .
                              |  1 −3  4 |
Solution: Since adj A is given, computing the cofactors of adj A we get,

             | 2 0  2 |
adj(adj A) = | 6 8 10 | .
             | 4 6  8 |

Now, from the relation |adj A| = |A|^(n−1) we have,

          |  1  3 −4 |
|adj A| = | −2  2 −2 | = 4 = |A|²  ⇒  |A| = 2.
          |  1 −3  4 |

Using the relation adj(adj A) = |A|^(n−2) A, the matrix A is given by,

        | 2 0  2 |   | 1 0 1 |
A = (1/2) | 6 8 10 | = | 3 4 5 | .
        | 4 6  8 |   | 2 3 4 |
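The recovery of A from its adjugate can be automated along exactly these lines. This Python sketch (helper names ours) uses |adj A| = |A|² and adj(adj A) = |A|A for n = 3, taking the positive root as in the text:

```python
def det(m):
    # Determinant by cofactor expansion along the first row.
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] *
               det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cof(m, i, j):
    sub = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(sub)

def adj(m):
    # Adjugate: transpose of the cofactor matrix.
    n = len(m)
    return [[cof(m, j, i) for j in range(n)] for i in range(n)]

adjA = [[1, 3, -4], [-2, 2, -2], [1, -3, 4]]   # the given adjugate
d = det(adjA)                # |adj A| = |A|^2 for a 3x3 matrix; here 4
absA = round(d ** 0.5)       # so |A| = 2 (positive root, as in the text)
A = [[x // absA for x in row] for row in adj(adjA)]  # adj(adj A) = |A| A
print(A)              # [[1, 0, 1], [3, 4, 5], [2, 3, 4]]
print(adj(A) == adjA) # True: the recovered A has the required adjugate
```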

3.6.1 Reciprocal of a Matrix

For a non-singular square matrix A = [aij]n×n of order n, the reciprocal matrix of A is
defined by

              | A11/|A|  A21/|A|  ⋯  An1/|A| |
(1/|A|) adj A = | A12/|A|  A22/|A|  ⋯  An2/|A| |    (3.25)
              |   ⋮                      ⋮   |
              | A1n/|A|  A2n/|A|  ⋯  Ann/|A| |

For example, let,
    | −6  2  5 |                               |  7 11  8 |
A = |  4 −2 −1 | ; then |A| = 2 and adj A = | 12 18 14 | . Therefore,
    |  0  1 −3 |                               |  4  6  4 |

the reciprocal of A is
                      |  7 11  8 |
(1/|A|) adj A = (1/2) | 12 18 14 | .
                      |  4  6  4 |

3.6.2 Inverse of a Matrix

Let A be a square matrix of order n. If for another square matrix B of the same order,

A·B = B·A = In,    (3.26)

is satisfied, then B is called the reciprocal or inverse of A and is denoted by B = A^{−1}. Since
A and B are conformable for the products AB and BA with AB = BA, A and B are square
matrices of the same order. Thus a matrix A may have an inverse, i.e., A may be invertible,
only when it is a square matrix. By the property of the adjoint matrix, we have,
A·(adj A) = |A|In = (adj A)A
⇒  A·(1/|A|)(adj A) = In = (1/|A|)(adj A)·A;  provided |A| ≠ 0.


Again, by the property of the inverse, A·B = B·A = In. Comparing, we get,

B = A^{−1} = (1/|A|) adj A,  provided |A| ≠ 0.    (3.27)

Therefore, the inverse of a matrix exists if it is non-singular. The inverse of a non-singular
triangular matrix is a triangular matrix of the same dimension and structure.


12
Ex 3.6.3 Find the inverse of A =
.
34


ab
Solution: To find A1 , let A1 =
, then using AA1 = I2 , we get,
cd


 

12
ab
10
=
34
cd
01

 

a + 2c b + 2d
10

=
3a + 2c 3b + 4d
01
a + 2c = 1, b + 2d = 0, 3a + 2c = 0, 3b + 4d = 1
3
1
a = 2, b = 1, c = , d = .
 2

  2
2 1
ab
1
A =
= 3
1 .
cd
2 2
Moreover, A1 satisfies the property that


 

2 1
12
10
=
,
3
1
34
01
2 2


2 1
we conclude that A is non singular and that A1 = 3
1 .
2 2
Theorem 3.6.4 The inverse of a matrix is unique.
Proof: Let A be an invertible matrix of order n. Also, let, B and C are the inverses of A.
Then by definition of inverse, we have,
A.B = B.A = In and A.C = C.A = In .
Using the property that matrix multiplication is associative, we get,
C.(A.B) = (C.A).B C.In = In B C = B.
Hence, the inverse of a matrix is unique.
Theorem 3.6.5 The necessary and sufficient condition for the existence of the inverse of
a square matrix A is that A is non-singular.
Proof: First, let A be an n × n invertible matrix and let B be the inverse of A. Then, by
definition, A·B = B·A = In. Therefore,
|A·B| = |In|  ⇒  |A|·|B| = 1.
Therefore, |A| ≠ 0 and consequently A is non-singular. Hence the condition is necessary.
Conversely, let A be non-singular, i.e., |A| ≠ 0. Now,
A·(adj A) = |A|In = (adj A)A
⇒  A·(1/|A|)(adj A) = In = (1/|A|)(adj A)A;  as |A| ≠ 0.
Hence by the definition of the inverse, A^{−1} = (1/|A|) adj A and it exists. Hence the condition is sufficient.


                              | 2 2 0 |
Ex 3.6.4 Find the matrix A, if adj A = | 2 5 1 | and |A| = 2.
                              | 0 1 1 |
Solution: Since adj A and |A| are given,

A^{−1} = (1/|A|)(adj A) = (1/2) | 2 2 0 |
                                | 2 5 1 | = B (say).
                                | 0 1 1 |

Then |B| = (1/2)³ |adj A| = (1/8)·4 = 1/2 ≠ 0, and

adj B = (1/2)² adj(adj A) = (1/4) |  4 −2  2 |
                                  | −2  2 −2 | .
                                  |  2 −2  6 |

Therefore, the matrix A is given by

A = B^{−1} = (1/|B|) adj B = (1/2) |  4 −2  2 |   |  2 −1  1 |
                                   | −2  2 −2 | = | −1  1 −1 | .
                                   |  2 −2  6 |   |  1 −1  3 |
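The formula A^{−1} = (1/|A|) adj A is easy to implement exactly with rational arithmetic. In this Python sketch (helper names ours), the matrix A recovered in Ex 3.6.4 is inverted and the product A·A^{−1} is checked against the identity:

```python
from fractions import Fraction

def det(m):
    # Determinant by cofactor expansion along the first row.
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] *
               det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cof(m, i, j):
    sub = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(sub)

def inverse(m):
    # A^{-1} = (1/|A|) adj A, valid when |A| != 0.
    n, d = len(m), det(m)
    return [[Fraction(cof(m, j, i), d) for j in range(n)] for i in range(n)]

A = [[2, -1, 1], [-1, 1, -1], [1, -1, 3]]
Ainv = inverse(A)
check = [[sum(A[i][k] * Ainv[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
print(check == [[1, 0, 0], [0, 1, 0], [0, 0, 1]])  # True
```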
Theorem 3.6.6 If A and B are invertible square matrices of the same order, then the inverse
of the product of the two matrices is the product of their inverses in the reverse order, i.e.,
(AB)^{−1} = B^{−1}A^{−1}.
Proof: Let A and B be invertible square matrices of the same order n; then |A| ≠ 0 and
|B| ≠ 0. Therefore, |AB| = |A|·|B| ≠ 0 and hence AB is invertible. Now,
(AB)(B^{−1}A^{−1}) = A(BB^{−1})A^{−1}, by the associative property,
= A In A^{−1} = AA^{−1} = In.
Again, (B^{−1}A^{−1})(AB) = B^{−1}(A^{−1}A)B, by the associative property,
= B^{−1} In B = B^{−1}B = In.
Hence by the definition and uniqueness theorem of the inverse, we have (AB)^{−1} = B^{−1}A^{−1}. Continuing, we get: if A1, A2, ..., Ak be k invertible matrices of the same order, then,

(A1·A2⋯Ak)^{−1} = Ak^{−1}⋯A2^{−1}·A1^{−1}.    (3.28)

Theorem 3.6.7 If A be an invertible matrix, A^{−1} is invertible and (A^{−1})^{−1} = A.
Proof: Let A be an n × n invertible matrix; then |A| ≠ 0 and AA^{−1} = A^{−1}A = In. Now,
|A|·|A^{−1}| = |AA^{−1}| = |In| = 1,
which shows that |A^{−1}| ≠ 0; hence A^{−1} is invertible. From the definition and uniqueness
theorem of the inverse, we get that A is the inverse of A^{−1} and hence (A^{−1})^{−1} = A.
Theorem 3.6.8 If A be an invertible matrix, then A^T is an invertible matrix and
(A^T)^{−1} = (A^{−1})^T.
Proof: Let A be invertible; then |A| ≠ 0. Thus |A^T| = |A| ≠ 0. Therefore, A^T is invertible.
Also, from the relation AA^{−1} = A^{−1}A = I, we get,
(AA^{−1})^T = (A^{−1}A)^T = I^T
⇒  (A^{−1})^T A^T = A^T (A^{−1})^T = I.
From the definition and uniqueness theorem of the inverse, we get that (A^{−1})^T is the inverse of A^T and
hence (A^T)^{−1} = (A^{−1})^T.
Theorem 3.6.9 If A be an invertible matrix, then adj A^{−1} = (adj A)^{−1} = (1/|A|) A.
Proof: From the definition of the inverse, we have, AA^{−1} = I = A^{−1}A. Therefore,
adj(AA^{−1}) = adj(I) = I  ⇒  (adj A^{−1})(adj A) = I = (adj A)(adj A^{−1})
⇒  adj A^{−1} = (adj A)^{−1}.
Again, A^{−1} = (1/|A|) adj A, so adj A = |A| A^{−1} and so,
adj A^{−1} = (adj A)^{−1} = (|A| A^{−1})^{−1} = (1/|A|) A.

Theorem 3.6.10 If the sum of the elements in each row of a non-singular matrix is k (≠ 0),
then the sum of the elements in each row of the inverse matrix is k^{−1}.
Proof: Let A = [aij]n×n be a given non-singular matrix, so |A| ≠ 0. Since the sum of
the elements in each row of A is k (≠ 0),

Σ_{j=1}^{n} aij = k;  i = 1, 2, ..., n.

Now, the sum of the elements of the jth row of A^{−1} is (1/|A|) Σ_{i=1}^{n} Aij. Also,

k Σ_{i=1}^{n} Aij = Σ_{i=1}^{n} (Σ_{r=1}^{n} air) Aij = Σ_{r=1}^{n} (Σ_{i=1}^{n} air Aij) = |A|,

since Σ_{i=1}^{n} air Aij = |A| if r = j and 0 otherwise. Hence Σ_{i} Aij = |A|/k, so the sum of the
elements in each row of A^{−1} is (1/|A|)·(|A|/k) = k^{−1}.
Therefore, if the sum of the elements in each row of a non-singular matrix is k (≠ 0), then the
sum of the elements in each row of the inverse matrix is k^{−1}.
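Theorem 3.6.10 lends itself to a quick exact check. In this Python sketch (the example matrix is ours), every row of A sums to k = 5, and each row of A^{−1} is found to sum to 1/5:

```python
from fractions import Fraction

def det(m):
    # Determinant by cofactor expansion along the first row.
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] *
               det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def cof(m, i, j):
    sub = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return (-1) ** (i + j) * det(sub)

A = [[4, -1, 2], [0, 3, 2], [1, 1, 3]]  # every row sums to k = 5
d = det(A)                              # 20, so A is non-singular
# The j-th row of A^{-1} is (A_1j, A_2j, ..., A_nj) / |A|.
row_sums = [sum(Fraction(cof(A, i, j), d) for i in range(3))
            for j in range(3)]
print(row_sums)  # [Fraction(1, 5), Fraction(1, 5), Fraction(1, 5)]
```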
Ex 3.6.5 If A and B are both square matrices of order n and A has an inverse, show that
(A + B)A^{−1}(A − B) = (A − B)A^{−1}(A + B).
Solution: Since A has an inverse, A^{−1} exists. Now,
LHS = (A + B)A^{−1}(A − B) = (A + B)(A^{−1}A − A^{−1}B)
= (A + B)(I − A^{−1}B) = A − AA^{−1}B + B − BA^{−1}B
= A − B + B − BA^{−1}B = A − BA^{−1}B.
Again,
RHS = (A − B)A^{−1}(A + B) = (A − B)(I + A^{−1}B)
= A + AA^{−1}B − B − BA^{−1}B = A + B − B − BA^{−1}B = A − BA^{−1}B.
Therefore, if A and B are both square matrices of order n and A has an inverse, then
(A + B)A^{−1}(A − B) = (A − B)A^{−1}(A + B).


Ex 3.6.6 Show that if the non-singular symmetric matrices A and B commute, then A^{−1}B, AB^{−1}
and A^{−1}B^{−1} are symmetric.
Solution: By the given condition, AB = BA and |A| ≠ 0, |B| ≠ 0. Also, as A and B are
symmetric matrices, A^T = A and B^T = B. Now,
(A^{−1}B)^T = B^T(A^{−1})^T = B(A^T)^{−1}
= BA^{−1} = A^{−1}BAA^{−1} = A^{−1}B;  as AB = BA  ⇒  B = A^{−1}BA.
Therefore, A^{−1}B is symmetric. Similarly, AB^{−1} is also symmetric. Also,
(A^{−1}B^{−1})^T = {(BA)^{−1}}^T = {(AB)^{−1}}^T;  as AB = BA
= {(AB)^T}^{−1} = (B^T A^T)^{−1} = (A^T)^{−1}(B^T)^{−1}
= A^{−1}B^{−1};  as A^T = A and B^T = B.
Therefore, A^{−1}B^{−1} is symmetric.


Ex 3.6.7 If A = |  2 −1 |
                | −1  2 | , then show that A² − 4A + 3I = O. Hence obtain A^{−1}.
Solution: For the given matrix A, we have,

A² = |  2 −1 | |  2 −1 |   |  5 −4 |
     | −1  2 | | −1  2 | = | −4  5 | .

Thus the expression A² − 4A + 3I becomes,

|  5 −4 |     |  2 −1 |     | 1 0 |   | 0 0 |
| −4  5 | − 4 | −1  2 | + 3 | 0 1 | = | 0 0 | = O.

Now the expression A² − 4A + 3I = O shows that A is non-singular (so that A^{−1} exists), since
A·(4I − A) = 3I, and can be written in the form

A^{−1} = (1/3)(4I − A) = (1/3) | 2 1 |
                               | 1 2 | .

 

Ex 3.6.8 Find A from the equation A | 4 1 |   | 2 3 |
                                    | 3 2 | = | 9 1 | .
Solution: The given matrix equation can be written in the form AB = C. Then,

|B| = | 4 1 |
      | 3 2 | = 8 − 3 = 5 ≠ 0,

B^{−1} = (1/|B|)(adj B) = (1/5) |  2 −1 |
                                | −3  4 | .

So, A = CB^{−1} = | 2 3 | (1/5) |  2 −1 |         | 4−9   −2+12 |
                  | 9 1 |       | −3  4 | = (1/5) | 18−3  −9+4  |

               = (1/5) | −5  10 |   | −1  2 |
                       | 15  −5 | = |  3 −1 | .

 

Ex 3.6.9 Show that the inverse of | A O |    | A^{−1}         O      |
                                  | B C | is | −C^{−1}BA^{−1} C^{−1} | , where A and C are non-singular.
Solution: Let us consider the product,

| A O | | A^{−1}         O      |   | AA^{−1}                    O       |   | I O |
| B C | | −C^{−1}BA^{−1} C^{−1} | = | BA^{−1} − CC^{−1}BA^{−1}  CC^{−1}  | = | O I | = I.

Again,

| A^{−1}         O      | | A O |   | A^{−1}A                      O       |   | I O |
| −C^{−1}BA^{−1} C^{−1} | | B C | = | −C^{−1}BA^{−1}A + C^{−1}B  C^{−1}C   | = | O I | = I.

Therefore, | A^{−1}         O      |                       | A O |
           | −C^{−1}BA^{−1} C^{−1} | is the inverse of | B C | .
Deduction 3.6.1 Solution of a system of linear equations by the matrix inverse method:
Here we shall be concerned with the solution of a system of n linear algebraic equations
in n unknowns x1, x2, ..., xn of the explicit form (3.19), where the n² coefficients
aij and the n constants b1, b2, ..., bn are given real numbers. The system (3.19) can be written in
matrix notation as Ax = b, where A is the real n × n coefficient matrix in which aij is
the coefficient of xj in the ith equation, b^T = [b1, b2, ..., bn] is the prescribed column n-vector and
x^T = [x1, x2, ..., xn] is the unknown column n-vector to be computed up to
a desired degree of accuracy.
If det(A) ≠ 0, then A^{−1}, the inverse of the matrix A, exists and is unique. Thus the
matrix inversion method finds the solution of (3.19) as

Ax = b  ⇒  A^{−1}Ax = A^{−1}b = (1/|A|)(adj A) b
⇒  xi = Σ_{j=1}^{n} (A^{−1})_{ij} bj;  i = 1, 2, ..., n.    (3.29)

Thus we see that in the solution of a system (3.19) by the matrix method, the chief problem is
the inversion of the coefficient matrix A. This method is obviously unsuitable for solving
large systems, since the computation of A^{−1} by cofactors, i.e., the evaluation of determinants,
then becomes exceedingly difficult.
Ex 3.6.10 Using the matrix inversion method, solve the system of equations
x + 2y + 3z = 6, 2x + 4y + z = 7, 3x + 2y + 9z = 14.
Solution: The given non-homogeneous system can be written as Ax = b, where A is the
coefficient matrix and b is the constant vector. The solution of the system can be written as
x = A^{−1}b provided |A| ≠ 0. Here |A| = −20 (≠ 0). Hence A is non-singular and A^{−1} exists.
Now

        |  34 −12 −10 |                       |  34 −12 −10 |
adj A = | −15   0   5 | ,  A^{−1} = −(1/20) | −15   0   5 | .
        |  −8   4   0 |                       |  −8   4   0 |

Hence the solution is given by A^{−1}b = (1 1 1)^T. Therefore the solution is x = y = z = 1.
Result 3.6.1 This method is obviously unsuitable for solving large systems, since the computation of A^{−1} by cofactors, i.e., the evaluation of determinants, then becomes exceedingly
difficult. Various methods have been devised to evaluate A^{−1}.
If a given matrix is of higher order, then we apply numerical methods to find the
inverse. For further discussion the reader may see the Numerical Analysis book of the author.

3.6.3 Singular Value Decomposition

For a rectangular matrix, a decomposition similar to the LU and QR decompositions is possible, known as the singular value decomposition (SVD). It plays a significant role in matrix theory. It is also used to find the generalized inverse of a singular matrix, which has several applications in image processing.
Let A be an m × n (m ≥ n) real matrix. Then the n × n matrix AᵀA and the m × m matrix AAᵀ are symmetric, positive semi-definite, and have non-negative eigenvalues, say λk. We can find n orthonormalized eigenvectors Xk of AᵀA such that
    AᵀA Xk = λk Xk.
Let Yk be the orthonormalized eigenvectors of AAᵀ; then
    AAᵀ Yk = λk Yk.
Then, to solve the eigenvalue problem, we find an orthogonal matrix U such that A can be decomposed in the form
    A = U D Vᵀ,
which is called the singular value decomposition of the matrix A, where the n × n matrix V consists of the n orthonormalized eigenvectors Xk of AᵀA. If some λk = 0, then AXk = 0, so the corresponding column of AV (= UD) is identically zero, as its norm is √λk = 0. Also UᵀU = VᵀV = VVᵀ = In, and the diagonal matrix D is

        [ σ1   0   ⋯   0  ]
    D = [  0  σ2   ⋯   0  ]
        [  ⋮   ⋮   ⋱   ⋮  ]
        [  0   0   ⋯  σn  ]

The values σ1, σ2, ..., σn, where σk = √λk, are called the singular values of A, satisfying σ1 ≥ σ2 ≥ ... ≥ σn ≥ 0. All eigenvalues of AᵀA should be non-negative, except for possible perturbations due to rounding errors; any small negative λk produced by finite-precision computation is therefore treated as zero. If the rank of A is r (< n), then
    σr+1 = σr+2 = ... = σn = 0,  i.e.,  √λr+1 = √λr+2 = ... = √λn = 0.
If all the σi are distinct, satisfying σ1 > σ2 > ... > σn, then the singular value decomposition of the matrix A is unique. One possible disadvantage of this method is that AᵀA must be formed explicitly, and this may lead to a loss of information due to the use of finite-length computer arithmetic.

Ex 3.6.11 Find the SVD of

        [ 1  2 ]
    A = [ 2  1 ]
        [ 1  3 ]

and hence find the generalized inverse of A.

Solution: Here

        [ 1  2 ]                   [ 1  2  1 ]
    A = [ 2  1 ]   so that   Aᵀ =  [ 2  1  3 ] .
        [ 1  3 ]

Hence

           [ 1  2  1 ] [ 1  2 ]    [ 6   7 ]
    AᵀA =  [ 2  1  3 ] [ 2  1 ] =  [ 7  14 ] .
                       [ 1  3 ]


Theory of Matrices

Hence the eigenvalues of AᵀA are λ1 = 18.0623 and λ2 = 1.9377, and the corresponding eigenvectors are X1 = [0.5803, 1]ᵀ and X2 = [1, −0.5803]ᵀ. Also σ1 = √λ1 = 4.2499 and σ2 = √λ2 = 1.3920. The eigenvectors Y1, Y2 of AAᵀ are given by

    Y1 = (1/σ1) A X1 = (1/4.2499) [ 2.5803   2.1606   3.5803 ]ᵀ = [ 0.6071   0.5084   0.8424 ]ᵀ,
    Y2 = (1/σ2) A X2 = (1/1.3920) [ −0.1606  1.4197  −0.7409 ]ᵀ = [ −0.1154  1.0199  −0.5323 ]ᵀ.

Hence the singular value decomposition of A is given by

        [ 1  2 ]   [ 0.6071  −0.1154 ]
    A = [ 2  1 ] = [ 0.5084   1.0199 ] [ 4.2499    0    ] [ 0.5803     1    ]
        [ 1  3 ]   [ 0.8424  −0.5323 ] [   0    1.3920 ] [   1     −0.5803 ]

i.e., A = U D Vᵀ. Thus the generalized inverse A⁻¹ is given by

    A⁻¹ = V D⁻¹ Uᵀ = [ 0.5803     1    ] [ 1/4.2499     0     ] [  0.6071  0.5084   0.8424 ]
                     [   1     −0.5803 ] [    0     1/1.3920 ] [ −0.1154  1.0199  −0.5323 ] .
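As a cross-check of Ex 3.6.11 (not part of the original text), NumPy's built-in SVD reproduces the singular values above, and the Moore–Penrose pseudo-inverse satisfies A⁺A = I₂:

```python
import numpy as np

A = np.array([[1.0, 2],
              [2, 1],
              [1, 3]])

# Thin SVD: A = U @ diag(s) @ Vt, singular values in decreasing order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Generalized (Moore-Penrose) inverse of the rectangular matrix A.
Aplus = np.linalg.pinv(A)
```

The singular values are the square roots of the eigenvalues 18.0623 and 1.9377 of AᵀA.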

3.7

Orthogonal Matrix

A square matrix A of order n is said to be orthogonal if
    AAᵀ = AᵀA = In.
For example, let

          1 [ 1   2   2 ]
    A  =  — [ 2   1  −2 ] .
          3 [ 2  −2   1 ]

Then

           1 [ 1   2   2 ]  1 [ 1   2   2 ]ᵀ    1 [ 9  0  0 ]   [ 1  0  0 ]
    AAᵀ =  — [ 2   1  −2 ]  — [ 2   1  −2 ]  =  — [ 0  9  0 ] = [ 0  1  0 ] = AᵀA.
           3 [ 2  −2   1 ]  3 [ 2  −2   1 ]     9 [ 0  0  9 ]   [ 0  0  1 ]

Hence A is an orthogonal matrix. Unit matrices are always orthogonal, as IᵀI = IIᵀ = I.
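A quick numerical check of this example (illustration only, not from the text):

```python
import numpy as np

# The example matrix (1/3)[[1,2,2],[2,1,-2],[2,-2,1]].
A = np.array([[1.0, 2, 2],
              [2, 1, -2],
              [2, -2, 1]]) / 3

# Orthogonality: both A A^T and A^T A must equal the identity.
orthogonal = np.allclose(A @ A.T, np.eye(3)) and np.allclose(A.T @ A, np.eye(3))
```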

Ex 3.7.1 Determine the values of α, β, γ so that

        [ 0   2β   γ ]
    A = [ α    β  −γ ]
        [ α   −β   γ ]

is orthogonal.

Solution: Since the matrix A is orthogonal, by definition AAᵀ = I = AᵀA. So

           [ 2α²   0    0  ]   [ 1  0  0 ]
    AᵀA =  [  0   6β²   0  ] = [ 0  1  0 ]
           [  0    0   3γ² ]   [ 0  0  1 ]

⟹ 2α² = 1, 6β² = 1 and 3γ² = 1 ⟹ α = ±1/√2, β = ±1/√6, γ = ±1/√3.
Ex 3.7.2 Find an orthogonal matrix of order 3 whose first row is a multiple of (2, 1, 2).

Solution: Normalizing (2, 1, 2) we get (2/3, 1/3, 2/3). Taking (2/3, 1/3, 2/3) as the first row, let the orthogonal matrix A be

        [ 2/3  1/3  2/3 ]                  [ 2/3  p  x ]
    A = [  p    q    r  ]   so that  Aᵀ =  [ 1/3  q  y ] .
        [  x    y    z  ]                  [ 2/3  r  z ]

Using the definition of an orthogonal matrix, AAᵀ = AᵀA = I, we have
    p² + q² + r² = 1,  2p + q + 2r = 0,  2x + y + 2z = 0,
    px + qy + rz = 0,  x² + y² + z² = 1.
Since there are five equations in six unknowns, there is an infinite number of solutions satisfying the equations. Taking q = 0, we have r = −p and so 2p² = 1, i.e., p = ±1/√2. Taking p = 1/√2, we get r = −1/√2; then px + qy + rz = 0 gives x − z = 0, i.e., x = z, and 2x + y + 2z = 0 gives y = −4x. Therefore, using the relation x² + y² + z² = 1, we get 18x² = 1, i.e., x² = 1/18. Taking x = 1/(3√2), we have y = −4/(3√2), z = 1/(3√2). Therefore, one such orthogonal matrix is given by

        [   2/3       1/3      2/3    ]
    A = [  1/√2        0      −1/√2   ]
        [ 1/(3√2)  −4/(3√2)  1/(3√2)  ]

Theorem 3.7.1 An orthogonal matrix A is non-singular and |A| = ±1.

Proof: Let A be an orthogonal matrix of order n. Then by definition AAᵀ = AᵀA = In. Therefore,
    |AAᵀ| = |In| ⟹ |Aᵀ||A| = 1 ⟹ |A|² = 1, as |Aᵀ| = |A|.
Hence A is non-singular and |A| = ±1.
Ex 3.7.3 Obtain the most general orthogonal matrix of order 2.

Solution: Let us start with an arbitrary matrix of order 2, which can be written as

    A = [ a  b ]
        [ c  d ] ,

where a, b, c, d are any scalars, real or complex. If this is to be an orthogonal matrix, its elements must satisfy AAᵀ = AᵀA = I2, i.e.,

    [ a  b ] [ a  c ]   [ a² + b²   ac + bd ]   [ 1  0 ]
    [ c  d ] [ b  d ] = [ ac + bd   c² + d² ] = [ 0  1 ]

⟹ a² + b² = 1 = c² + d²;  ac + bd = 0.
The general solution is a = cos θ, b = sin θ, where θ is real or complex, and c = cos φ, d = sin φ, where φ is a scalar. Using the relation ac + bd = 0, we get
    cos(θ − φ) = 0  ⟹  φ = θ ± π/2.
Thus the most general orthogonal matrix of order 2 becomes

    A = [ a  b ]   [  cos θ   sin θ ]
        [ c  d ] = [ ∓sin θ  ±cos θ ]

for some value of θ. Choosing the upper signs, we get the most general orthogonal matrix of order 2 with |A| = 1, while with the lower signs we get the most general orthogonal matrix of order 2 with |A| = −1.
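The two sign choices above are the familiar rotation and reflection matrices; this short check (not part of the text, the angle 0.7 is an arbitrary choice) verifies both are orthogonal with determinants +1 and −1:

```python
import numpy as np

theta = 0.7  # any angle

# Upper signs: rotation, |R| = +1.
R = np.array([[np.cos(theta),  np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])
# Lower signs: reflection, |S| = -1.
S = np.array([[np.cos(theta),  np.sin(theta)],
              [np.sin(theta), -np.cos(theta)]])
```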


Theorem 3.7.2 The product of two orthogonal matrices of the same order is orthogonal.
Proof: Let A, B be two orthogonal matrices of order n. Then by definition AAᵀ = AᵀA = In and BBᵀ = BᵀB = In. Now,
    (AB)ᵀ(AB) = (BᵀAᵀ)(AB) = Bᵀ(AᵀA)B = (BᵀIn)B = BᵀB = In.
Similarly, (AB)(AB)ᵀ = In. Hence AB is orthogonal.

Theorem 3.7.3 If A is an orthogonal matrix, then A⁻¹ = Aᵀ.
Proof: Let A be an orthogonal square matrix of order n; then AᵀA = AAᵀ = In. Thus
    A(AᵀA) = AIn and (AAᵀ)A = InA ⟹ [AAᵀ − In]A = 0.
Since A is an orthogonal matrix, |A| ≠ 0, and so
    AAᵀ − In = 0 ⟹ AAᵀ = In = AᵀA (similarly).
From the definition and uniqueness of the inverse, A⁻¹ = Aᵀ. Similarly, it can be shown that the transpose of an orthogonal matrix is orthogonal.
Theorem 3.7.4 The inverse of an orthogonal matrix is orthogonal.
Proof: Let A be an orthogonal matrix of order n; then |A| ≠ 0 and A⁻¹ exists. Now,
    (A⁻¹)ᵀ(A⁻¹) = (Aᵀ)⁻¹(A⁻¹) = (AAᵀ)⁻¹ = (In)⁻¹ = In.
Hence A⁻¹ is orthogonal. Also, using the definition we can show that the transpose of an orthogonal matrix is also orthogonal.

Ex 3.7.4 Let A be an orthogonal matrix. Then kA is an orthogonal matrix if k = ±1.
Solution: Since A is an orthogonal matrix of order n, we have by definition AᵀA = AAᵀ = In. Now, kA is an orthogonal matrix if
    (kA)ᵀ(kA) = In ⟹ (kAᵀ)(kA) = In ⟹ k²AᵀA = In ⟹ k² = 1, i.e., k = ±1.
Thus, if kA is an orthogonal matrix then k = ±1.
Ex 3.7.5 Let A and B be orthogonal with |A| + |B| = 0. Prove that A + B is singular.
Solution: Since A and B are orthogonal matrices, |A| ≠ 0 and |B| ≠ 0, and each equals ±1; the condition |A| + |B| = 0 then gives |A||B| = −1. Let Aᵀ + Bᵀ = Cᵀ, so that C = A + B. Then A(Aᵀ + Bᵀ)B = (I + ABᵀ)B = B + A, i.e., ACᵀB = A + B. Therefore,
    |A + B| = |A||Cᵀ||B| = |A||B||A + B| = −|A + B|;  as |A| + |B| = 0
⟹ 2|A + B| = 0 ⟹ |A + B| = 0.
Therefore, A + B is singular.

Ex 3.7.6 If the matrices A and B are orthogonal, then show that the matrix

    [ A  O ]
    [ O  B ]

is also orthogonal.

Solution: Let

    C = [ A  O ]
        [ O  B ] .

Since A and B are orthogonal, AAᵀ = I and BBᵀ = I. Now,

    CCᵀ = [ A  O ] [ Aᵀ  O  ]   [ AAᵀ   O  ]   [ I  O ]
          [ O  B ] [ O   Bᵀ ] = [  O   BBᵀ ] = [ O  I ] = I.

Hence C, i.e., [[A, O],[O, B]], is orthogonal.
Ex 3.7.7 Let A be a skew-symmetric matrix and (I + A) a non-singular matrix; then show that B = (I − A)(I + A)⁻¹ is orthogonal.
Solution: Since the matrix A is skew-symmetric, Aᵀ = −A, and so (I − A)ᵀ = I + A and (I + A)ᵀ = I − A. Now,
    Bᵀ = [(I − A)(I + A)⁻¹]ᵀ = [(I + A)⁻¹]ᵀ(I − A)ᵀ = {(I + A)ᵀ}⁻¹(I − A)ᵀ = (I − A)⁻¹(I + A).
Also,
    (I + A)(I − A) = I − A + A − A² = (I − A)(I + A).
We are to show that BᵀB = I. For this,
    BᵀB = (I − A)⁻¹(I + A)(I − A)(I + A)⁻¹ = (I − A)⁻¹(I − A)(I + A)(I + A)⁻¹ = I·I = I.
Hence B = (I − A)(I + A)⁻¹ is orthogonal. Conversely, let B = (I − A)(I + A)⁻¹ be orthogonal; then by definition BᵀB = I. Therefore,
    [(I + A)⁻¹]ᵀ(I − A)ᵀ(I − A)(I + A)⁻¹ = I
or, [(I + A)ᵀ]⁻¹(I − A)ᵀ(I − A)(I + A)⁻¹ = I
or, (I + Aᵀ)⁻¹(I − Aᵀ)(I − A)(I + A)⁻¹ = I
or, (I − Aᵀ)(I − A) = (I + Aᵀ)(I + A)
or, I − A − Aᵀ + AᵀA = I + A + Aᵀ + AᵀA
or, 2(A + Aᵀ) = 0 ⟹ Aᵀ = −A.
Therefore, A is skew-symmetric.
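This construction (the Cayley transform) can be tested numerically on a random skew-symmetric matrix; a sketch, not part of the text:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = M - M.T                       # skew-symmetric: A^T = -A
I = np.eye(4)

# I + A is nonsingular for any real skew-symmetric A
# (its eigenvalues are 1 + i*t with t real, hence nonzero).
B = (I - A) @ np.linalg.inv(I + A)
```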

3.8

Submatrix

Let A = [aij]m×n be a matrix. Any matrix obtained by omitting some rows or columns (or both) of a given matrix A is called a submatrix of A. Thus, if we consider a square matrix A = [aij] of order n and delete some, but not all, of its rows or columns, we obtain a submatrix of A. For instance, if A = [aij]4×4, then

    [ a31  a32  a33 ]
    [ a41  a42  a43 ]

is a submatrix of A. The determinant of a square submatrix of order r, obtained from a given m × n matrix A by omitting (m − r) rows and (n − r) columns, is called a minor of A of order r. The submatrix formed by the elements of the first r rows and columns of A is called the leading submatrix of order r, and its determinant is known as the leading minor of order r.

3.9

Partitioned Matrix

A matrix A can be divided into submatrices by drawing horizontal lines between rows and/or vertical lines between columns; the matrices so obtained are called partitioned or block matrices of A. Consider the above matrix A = [aij]4×4. Then

    [ a11  a12  a13  a14 ]         [ a11 | a12  a13 | a14 ]
    [ ------------------ ]         [ a21 | a22  a23 | a24 ]
    [ a21  a22  a23  a24 ]   and   [ -------------------- ]
    [ a31  a32  a33  a34 ]         [ a31 | a32  a33 | a34 ]
    [ a41  a42  a43  a44 ]         [ a41 | a42  a43 | a44 ]

are partitioned matrices of A. A partitioned matrix can be represented economically by denoting each constituent submatrix by a single matrix symbol. Thus the above partitioned matrices of A can be written as

    [ A11 ]         [ A11  A12  A13 ]
    [ A21 ]   and   [ A21  A22  A23 ]

respectively, where, for the first matrix, A11 = (a11 a12 a13 a14), and in the second,

    A11 = [ a11 ]
          [ a21 ]

and so on. The augmented matrix [A : b] of a linear system Ax = b is a partitioned matrix. Partitioning of matrices is useful to effect addition and multiplication by handling smaller matrices.

3.9.1

Square Block Matrices

Let M be a block matrix. Then M is called a square block matrix if:
(i) M is a square matrix;
(ii) the blocks form a square matrix;
(iii) the diagonal blocks are also square matrices.
The latter two conditions will occur if and only if there are the same number of horizontal and vertical lines and they are placed symmetrically. Consider the following two block matrices:

        [ 1  2 | 3  4 | 5 ]           [ 1  2 | 3  4 | 5 ]
        [ 1  1 | 1  1 | 1 ]           [ 1  1 | 1  1 | 1 ]
        [ ---------------- ]           [ ---------------- ]
    A = [ 9  8 | 7  6 | 5 ] ;     B = [ 9  8 | 7  6 | 5 ]
        [ ---------------- ]           [ 4  4 | 4  4 | 4 ]
        [ 4  4 | 4  4 | 4 ]           [ ---------------- ]
        [ 3  5 | 3  5 | 3 ]           [ 3  5 | 3  5 | 3 ]

The block matrix A is not a square block matrix, since the second and third diagonal blocks are not square. On the other hand, the block matrix B is a square block matrix.

3.9.2

Block Diagonal Matrices

Let M = [Aij] be a square block matrix such that the non-diagonal blocks are all zero matrices, i.e., Aij = 0 for i ≠ j. Then M is called a block diagonal matrix. We sometimes denote such a block diagonal matrix by writing
    M = diag(A11, A22, ..., Arr).
The importance of block diagonal matrices is that the algebra of the block matrix is frequently reduced to the algebra of the individual blocks. Specifically, suppose f(x) is a polynomial and M is the above block diagonal matrix. Then f(M) is a block diagonal matrix and
    f(M) = diag(f(A11), f(A22), ..., f(Arr)).
Also, M is invertible if and only if each Aii is invertible, and in such a case M⁻¹ is a block diagonal matrix and
    M⁻¹ = diag(A11⁻¹, A22⁻¹, ..., Arr⁻¹).
Analogously, a square block matrix is called a block upper triangular matrix if the blocks below the diagonal are zero matrices, and a block lower triangular matrix if the blocks above the diagonal are zero matrices. Consider the following block matrices:

(i) A is block upper triangular, since the block below the diagonal is a zero block:

        [ 1  2 | 0 ]
    A = [ 3  4 | 5 ]
        [ --------- ]
        [ 0  0 | 6 ]

(ii) B is block lower triangular, since the blocks above the diagonal are zero blocks:

        [ 1 | 0  0 | 0 ]
        [ ------------- ]
    B = [ 2 | 3  4 | 0 ]
        [ 5 | 0  6 | 0 ]
        [ ------------- ]
        [ 0 | 7  8 | 9 ]

(iii) C is block diagonal, since the blocks above and below the diagonal are zero blocks:

        [ 1 | 0  0 ]
    C = [ --------- ]
        [ 0 | 2  3 ]
        [ 0 | 4  5 ]

(iv) D is neither block upper triangular nor block lower triangular:

        [ 1  2 | 0 ]
    D = [ 3  4 | 5 ]
        [ --------- ]
        [ 0  6 | 7 ]

Also, no other partitioning of D will make it into either a block upper triangular matrix or a block lower triangular matrix.

3.9.3

Block Addition

Let A = [Aij] and B = [Bij] be block matrices with the same numbers of row and column blocks, and suppose that corresponding blocks have the same size. Then adding the corresponding blocks of A and B also adds the corresponding elements of A and B:

            [ A11 + B11   A12 + B12   ⋯   A1n + B1n ]
    A + B = [ A21 + B21   A22 + B22   ⋯   A2n + B2n ]
            [     ⋮            ⋮       ⋱       ⋮     ]
            [ Am1 + Bm1   Am2 + Bm2   ⋯   Amn + Bmn ]

where A and B are conformable for addition. Multiplying each block of A by a scalar k multiplies each element of A by k. Thus,

         [ kA11  kA12  ⋯  kA1n ]
    kA = [ kA21  kA22  ⋯  kA2n ]
         [   ⋮     ⋮    ⋱    ⋮  ]
         [ kAm1  kAm2  ⋯  kAmn ]

Suppose M and N are block diagonal matrices where corresponding blocks have the same size, say M = diag(Ai) and N = diag(Bi); then M + N = diag(Ai + Bi).

3.9.4

Block Multiplication

Let A = [Aik] and B = [Bkj] be block matrices such that they are conformable for multiplication. Then the block product of A and B is given by

         [ C11  C12  ⋯  C1n ]                     p
    AB = [ C21  C22  ⋯  C2n ] ,   where  Cij  =   Σ  Aik Bkj ,
         [  ⋮    ⋮    ⋱   ⋮  ]                   k=1
         [ Cm1  Cm2  ⋯  Cmn ]

provided all the products of the form Aik Bkj can be formed. Further, if M is a block diagonal matrix, then M^k is given by
    M^k = diag(A11^k, A22^k, ..., Arr^k).

Ex 3.9.1 Compute AB using block multiplication, where

        [ 1  2 | 1 ]             [ 1  2  3 | 1 ]
    A = [ 3  4 | 0 ]   and   B = [ 4  5  6 | 1 ]
        [ --------- ]             [ ------------ ]
        [ 0  0 | 2 ]             [ 0  0  0 | 1 ]

Solution: Here

    A = [ E    F ]          B = [ R    S ]
        [ O12  G ]   and        [ O13  T ] ,

where E, F, G, R, S, T are the indicated blocks, and O12 and O13 are zero matrices of the indicated sizes. Hence

    AB = [ E    F ] [ R    S ]   [ ER   ES + FT ]   [  9  12  15 | 4 ]
         [ O12  G ] [ O13  T ] = [ O13    GT    ] = [ 19  26  33 | 7 ] ,
                                                    [ --------------- ]
                                                    [  0   0   0 | 2 ]

since ER = [[9, 12, 15],[19, 26, 33]], ES + FT = [[3],[7]] + [[1],[0]] = [[4],[7]] and GT = [2].
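The blockwise product in Ex 3.9.1 can be verified with NumPy's `np.block` (a check for illustration, not part of the text):

```python
import numpy as np

E = np.array([[1.0, 2], [3, 4]]); F = np.array([[1.0], [0]]); G = np.array([[2.0]])
R = np.array([[1.0, 2, 3], [4, 5, 6]]); S = np.array([[1.0], [1]]); T = np.array([[1.0]])

# Assemble the full matrices from their blocks.
A = np.block([[E, F], [np.zeros((1, 2)), G]])
B = np.block([[R, S], [np.zeros((1, 3)), T]])

# Blockwise product: Cij = sum_k Aik Bkj.
AB_blocks = np.block([[E @ R, E @ S + F @ T],
                      [np.zeros((1, 3)), G @ T]])
```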



Ex 3.9.2 If M = diag(A, B), where A = [[1, 2],[3, 4]] and B = [5], find M².

Solution: For the given matrices A and B, we have

    A² = [ 1  2 ] [ 1  2 ]   [  7  10 ]
         [ 3  4 ] [ 3  4 ] = [ 15  22 ]   and   B² = [25].

Since M is block diagonal, square each block:

                           [  7  10 |  0 ]
    M² = diag(A², B²)  =   [ 15  22 |  0 ]
                           [ ------------ ]
                           [  0   0 | 25 ]

3.9.5

Inversion of a Matrix by Partitioning

When a matrix is very large and it is not possible to store the entire matrix in the primary memory of a computer at a time, the matrix partition method is used to find the inverse of the matrix. This method is also very useful when a few more variables, and consequently a few more equations, are added to the original system.
Let the coefficient matrix A be partitioned as

    A = [ B  C ]                                                   (3.30)
        [ D  E ]

where B is an l × l matrix, C is an l × m matrix, D is an m × l matrix and E is an m × m matrix; and l, m are positive integers with l + m = n. Let A⁻¹ be partitioned as

    A⁻¹ = [ P  Q ]                                                 (3.31)
          [ R  S ]

where the matrices P, Q, R and S are of the same orders as those of the matrices B, C, D and E respectively. Then

    AA⁻¹ = [ B  C ] [ P  Q ]   [ I1  0  ]
           [ D  E ] [ R  S ] = [ 0   I2 ] ,                        (3.32)

where I1 and I2 are identity matrices of order l and m respectively. From (3.32), we have
    BP + CR = I1;  BQ + CS = 0  and  DP + ER = 0;  DQ + ES = I2.
Now, BQ + CS = 0 gives Q = −B⁻¹CS, i.e., DQ = −DB⁻¹CS. Also, from DQ + ES = I2, we have (E − DB⁻¹C)S = I2. Therefore, S = (E − DB⁻¹C)⁻¹. Similarly, the other matrices are

    S = (E − DB⁻¹C)⁻¹
    Q = −B⁻¹CS
    R = −(E − DB⁻¹C)⁻¹DB⁻¹ = −SDB⁻¹
    P = B⁻¹(I1 − CR) = B⁻¹ − B⁻¹CR.

It may be noted that, to find the inverse of A, it is required to determine the inverses of two matrices, B and (E − DB⁻¹C), of orders l × l and m × m respectively. That is, to compute the inverse of the matrix A of order n × n, the inverses of two lower order (roughly half) matrices are to be determined. If the matrices B, C, D, E are still too large to fit in the computer memory, they are partitioned further.
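The four formulas above (the Schur-complement form of the partitioned inverse) can be sketched and checked against a direct inverse; the test matrix below is an arbitrary well-conditioned choice, not from the text:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)) + 4 * np.eye(4)   # well-conditioned test matrix
B, C = A[:2, :2], A[:2, 2:]                       # blocks as in the text, l = m = 2
D, E = A[2:, :2], A[2:, 2:]

Binv = np.linalg.inv(B)
S = np.linalg.inv(E - D @ Binv @ C)   # S = (E - D B^{-1} C)^{-1}
Q = -Binv @ C @ S                     # Q = -B^{-1} C S
R = -S @ D @ Binv                     # R = -S D B^{-1}
P = Binv - Binv @ C @ R               # P = B^{-1} - B^{-1} C R

Ainv = np.block([[P, Q], [R, S]])
```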

Ex 3.9.3 Find the inverse of the matrix

        [ 3  3  4 ]
    A = [ 2  1  1 ]
        [ 1  3  5 ]

using the matrix partition method. Hence find the solution of the system of equations
    3x1 + 3x2 + 4x3 = 5;  2x1 + x2 + x3 = 7;  x1 + 3x2 + 5x3 = 6.

Solution: Let the matrix A be partitioned as

        [ 3  3 | 4 ]
    A = [ 2  1 | 1 ]  =  [ B  C ]
        [ --------- ]     [ D  E ] ,
        [ 1  3 | 5 ]

where

    B = [ 3  3 ] ,   C = [ 4 ] ,   D = [ 1  3 ],   E = [ 5 ],
        [ 2  1 ]         [ 1 ]

and A⁻¹ = [[P, Q],[R, S]], where P, Q, R and S are given by
    S = (E − DB⁻¹C)⁻¹,  R = −SDB⁻¹,  P = B⁻¹ − B⁻¹CR,  Q = −B⁻¹CS.
Now,

    B⁻¹ = −(1/3) [  1  −3 ]  =  [ −1/3   1 ] ,    B⁻¹C = [ −1/3 ] ,    DB⁻¹ = [ 5/3  −2 ].
                 [ −2   3 ]     [  2/3  −1 ]             [  5/3 ]

    E − DB⁻¹C = 5 − [1  3] [ −1/3 ]  =  5 − 14/3  =  1/3,   so  S = [3].
                           [  5/3 ]

    R = −SDB⁻¹ = −3 [5/3  −2] = [−5  6].

    Q = −B⁻¹CS = − [ −1/3 ] · 3  =  [  1 ] .
                   [  5/3 ]         [ −5 ]

    P = B⁻¹ − B⁻¹CR = [ −1/3   1 ] − [ −1/3 ] [−5  6]  =  [ −1/3   1 ] − [   5/3  −2 ]  =  [ −2    3 ] .
                      [  2/3  −1 ]   [  5/3 ]             [  2/3  −1 ]   [ −25/3  10 ]     [  9  −11 ]

Therefore, A⁻¹ is given by

          [ −2    3    1 ]
    A⁻¹ = [  9  −11   −5 ] .
          [ −5    6    3 ]

Hence, the solution of the given system of equations is given by

                [ −2    3    1 ] [ 5 ]   [  17 ]
    x = A⁻¹b =  [  9  −11   −5 ] [ 7 ] = [ −62 ] .
                [ −5    6    3 ] [ 6 ]   [  35 ]

Hence the required solution is x1 = 17, x2 = −62, x3 = 35.
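A direct numerical check of Ex 3.9.3 (illustration only):

```python
import numpy as np

A = np.array([[3.0, 3, 4], [2, 1, 1], [1, 3, 5]])
b = np.array([5.0, 7, 6])

Ainv = np.linalg.inv(A)   # should match the partitioned result
x = Ainv @ b              # solution of the system
```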

3.10

Rank of a Matrix

The rank of a matrix A of order m × n is defined to be the greatest positive integer r such that
(i) there exists at least one square submatrix of A of order r whose determinant is not equal to zero, and
(ii) the determinant of every square submatrix of A of order (r + 1) is zero.
In other words, the rank of A is defined to be the greatest positive integer r such that A has at least one non-zero minor of order r; it is denoted by ρ(A) or r(A). Now,
(i) The rank of a zero matrix is defined to be 0.
(ii) Every minor of order greater than (r + 1) can be expressed in terms of minors of order (r + 1). So every minor of order greater than r is zero.
(iii) The rank of a non-singular square matrix of order n is n, and the rank of a singular square matrix of order n is less than n. The rank of a unit matrix of order n is n.
(iv) For a non-zero m × n matrix A, we have 0 < rank of A ≤ min{m, n}.
(v) Rank of A = rank of Aᵀ, since A and Aᵀ have identical minors.
So we first take the highest order minors and continue by decreasing the order of the minors by one, until a non-zero minor is obtained; its order is the rank of the matrix. If the order of the given matrix is greater than 3, then this method becomes laborious in general.
Ex 3.10.1 Find the ranks of the following matrices:

        [ 1  0  −1 ]          [ 1  2  2 ]
(a) A = [ 1  2   3 ] ,  (b) B = [ 1  0  2 ] ,  (c) C = [ 1  2  3 ]
        [ 0  1   0 ]          [ 2  1  4 ]              [ 2  4  6 ] .

Solution: (a) det A = −4 ≠ 0. Therefore, the rank of the matrix A is 3.
(b) det B = 0, as the third column is twice the first, so the rank of B is not 3. But

    | 1  2 |
    | 1  0 |  =  −2 ≠ 0   (a submatrix of order 2).

Hence the rank of B is 2.
(c) The submatrices of order 2 of C are

    [ 1  2 ]   [ 1  3 ]   [ 2  3 ]
    [ 2  4 ] , [ 2  6 ] , [ 4  6 ] .

The determinants of all these submatrices are zero. Therefore rank C is less than 2. But among the submatrices of order one, viz., [1], [2], [3], [4], [6], there is at least one non-zero element, and hence the rank of C is 1.

3.10.1

Elementary Operations

We now present some convenient operations by which the rank can be easily obtained; an elementary operation is such an operation or transformation. When the transformations are applied to rows, they are said to be elementary row transformations, and when applied to columns they are said to be elementary column transformations.
The following operations on a matrix A = [aij]m×n are called elementary operations:
(i) Interchange of any two rows (or columns), that is, replace the rth row [ar1 ar2 ⋯ arn] by the sth row [as1 as2 ⋯ asn] and replace [as1 as2 ⋯ asn] by [ar1 ar2 ⋯ arn]. It is denoted by Rrs (or Crs).
(ii) Multiplication of the ith row (or ith column) by a non-zero scalar c, denoted by cRi (or cCi), or Ri(c) (or Ci(c)): multiply the ith row of A by c ≠ 0, i.e., replace [ai1 ai2 ⋯ ain] by [cai1 cai2 ⋯ cain].
(iii) Addition of c times the jth row (or column) to the ith row (or column), denoted by Ri + cRj (or Ci + cCj), or Rij(c) (or Cij(c)): add c times row j of A to row i of A, i ≠ j, i.e., replace [ai1 ai2 ⋯ ain] by [ai1 + caj1  ai2 + caj2  ⋯  ain + cajn].
Comparing the elementary operations with the properties of determinants, we observe that after elementary transformations a singular matrix remains singular, and the determinant of a non-singular matrix is altered only by a non-zero scalar multiple. Also, the determinants of the submatrices of all orders of any m × n matrix are affected similarly. Therefore, elementary transformations do not affect the rank of a matrix.

Ex 3.10.2 Find the rank of

        [ 3  5  7 ]
    A = [ 2  1  3 ]
        [ 1  4  4 ]

by the minor method.

Solution: A is a 3 × 3 matrix, so it has minors of orders 1, 2, 3. The minor of order 3 is

        | 3  5  7 |   | 0  0  0 |
    Δ = | 2  1  3 | = | 2  1  3 | = 0;   R1 − (R2 + R3).
        | 1  4  4 |   | 1  4  4 |

Hence the rank of A is < 3. The second order minors constructed from the first two rows are

    | 3  5 |        | 3  7 |        | 5  7 |
    | 2  1 | = −7;  | 2  3 | = −5;  | 1  3 | = 8.

Similarly, we can construct minors using the first and third, and the second and third, rows. Since a non-zero minor of order 2 exists, the rank of the given matrix A is 2.

Ex 3.10.3 Determine the rank of

        [    k        1      0   ]
    A = [    3      k − 2    1   ]
        [ 3(k + 1)    0    k + 1 ]

for different values of k.

Solution: Using the elementary column operation C13(−3), i.e., C1 − 3C3, we get

        [    k        1      0   ]     [ k    1      0   ]
    A = [    3      k − 2    1   ]  →  [ 0  k − 2    1   ]  =  B (say).
        [ 3(k + 1)    0    k + 1 ]     [ 0    0    k + 1 ]

Now, |B| = k(k − 2)(k + 1). If k ≠ 0, 2, −1, then |B| ≠ 0 and the rank of the given matrix is 3. If k = 0, one minor of the equivalent matrix B is

    |  1  0 |
    | −2  1 |  =  1 ≠ 0.

Therefore, the rank of the given matrix is 2. Similarly, the rank is 2 for k = 2 or k = −1. Hence the rank of the given matrix is either 3 or 2, according to the value of k.

Ex 3.10.4 Determine the rank of the matrix

        [ 1  2  1  0 ]
    A = [ 2  4  8  6 ]
        [ 0  0  5  8 ]
        [ 3  6  6  3 ]

Solution: Let us apply elementary row operations on A to reduce it to a row-echelon matrix.

        [ 1  2  1  0 ]  R2 − 2R1  [ 1  2  1  0 ]  (1/6)R2  [ 1  2  1  0 ]
    A = [ 2  4  8  6 ]     →      [ 0  0  6  6 ]     →     [ 0  0  1  1 ]
        [ 0  0  5  8 ]  R4 − 3R1  [ 0  0  5  8 ]           [ 0  0  5  8 ]
        [ 3  6  6  3 ]            [ 0  0  3  3 ]           [ 0  0  3  3 ]

    R1 − R2   [ 1  2  0  −1 ]  (1/3)R3  [ 1  2  0  −1 ]  R1 + R3  [ 1  2  0  0 ]
    R3 − 5R2  [ 0  0  1   1 ]     →     [ 0  0  1   1 ]  R2 − R3  [ 0  0  1  0 ]
    R4 − 3R2  [ 0  0  0   3 ]           [ 0  0  0   1 ]     →     [ 0  0  0  1 ]  = R, say.
       →      [ 0  0  0   0 ]           [ 0  0  0   0 ]           [ 0  0  0  0 ]

R is a row-reduced echelon matrix and R has 3 non-zero rows. Therefore rank R = 3. Since A is row equivalent to R, rank A = 3.
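The result of Ex 3.10.4 can be confirmed with NumPy's rank routine (a check for illustration, not part of the text):

```python
import numpy as np

A = np.array([[1, 2, 1, 0],
              [2, 4, 8, 6],
              [0, 0, 5, 8],
              [3, 6, 6, 3]])

# matrix_rank computes the rank via the SVD of A.
rank = np.linalg.matrix_rank(A)
```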

3.10.2

Row-reduced Echelon Matrix

An m × n matrix A is said to be a row-reduced echelon matrix if it satisfies the following properties:
(i) All zero rows, if there are any, appear at the bottom of the matrix.
(ii) The first non-zero entry from the left of a non-zero row is 1. This entry is called the leading one of its row.
(iii) For each non-zero row, the leading one appears to the right of and below any leading ones in preceding rows.
(iv) If a column contains a leading one, then all other entries in that column are zero.
For example,

    [ 1  0  0 ]        [ 1  0  1 ]
    [ 0  1  0 ]  and   [ 0  1  3 ]
    [ 0  0  1 ]        [ 0  0  0 ]

are examples of row-reduced echelon matrices. Any matrix A of order m × n and rank r (> 0) can be reduced to one of the following forms:

    [ Ir | 0 ]      [ Ir ]
    [ ------- ] ,   [ -- ] ,   [ Ir | 0 ],   [ Ir ],
    [ 0  | 0 ]      [ 0  ]

by a sequence of elementary transformations. These reduced forms are called the normal forms of A.

Ex 3.10.5 Find the rank of the matrix

        [ 3  1  2 ]
    A = [ 6  2  4 ]
        [ 3  1  2 ]

by the normalization method.

Solution: By using the elementary transformations, we get

      R21(−2)  [ 3  1  2 ]  C21(−1/3)  [ 3  0  0 ]  R1(1/3)  [ 1  0  0 ]
    A    →     [ 0  0  0 ]      →      [ 0  0  0 ]     →     [ 0  0  0 ] .
      R31(−1)  [ 0  0  0 ]  C31(−2/3)  [ 0  0  0 ]           [ 0  0  0 ]

Thus the rank of the given matrix A is 1.

Ex 3.10.6 Obtain the fully reduced normal form of the matrix

    [ 0  0  1  2   1 ]
    [ 1  3  1  0   3 ]
    [ 2  6  4  2   8 ]
    [ 3  9  4  2  10 ]

Solution: Let us apply elementary operations on the matrix:

    [ 0  0  1  2   1 ]       [ 1  3  1  0   3 ]            [ 1  3  1  0  3 ]
    [ 1  3  1  0   3 ]  R12  [ 0  0  1  2   1 ]  R3 − 2R1  [ 0  0  1  2  1 ]
    [ 2  6  4  2   8 ]   →   [ 2  6  4  2   8 ]  R4 − 3R1  [ 0  0  2  2  2 ]
    [ 3  9  4  2  10 ]       [ 3  9  4  2  10 ]     →      [ 0  0  1  2  1 ]

    R1 − R2   [ 1  3  0  −2  2 ]            [ 1  3  0  −2  2 ]  R1 + 2R3  [ 1  3  0  0  2 ]
    R3 − 2R2  [ 0  0  1   2  1 ]  (−1/2)R3  [ 0  0  1   2  1 ]  R2 − 2R3  [ 0  0  1  0  1 ]
    R4 − R2   [ 0  0  0  −2  0 ]     →      [ 0  0  0   1  0 ]     →      [ 0  0  0  1  0 ]
       →      [ 0  0  0   0  0 ]            [ 0  0  0   0  0 ]            [ 0  0  0  0  0 ]

    C2 − 3C1  [ 1  0  0  0  0 ]           [ 1  0  0  0  0 ]       [ 1  0  0  0  0 ]
    C5 − 2C1  [ 0  0  1  0  1 ]  C5 − C3  [ 0  0  1  0  0 ]  C23  [ 0  1  0  0  0 ]
       →      [ 0  0  0  1  0 ]     →     [ 0  0  0  1  0 ]  C34  [ 0  0  1  0  0 ]  = R, say.
              [ 0  0  0  0  0 ]           [ 0  0  0  0  0 ]   →   [ 0  0  0  0  0 ]

R is the fully reduced normal form.
Deduction 3.10.1 (Solution of a system of linear equations by the rank method): Here we shall be concerned with the numerical computation of the solution of a system of n linear algebraic equations in n unknowns x1, x2, ..., xn of the explicit form (3.19), where the n² coefficients aij and the n constants b1, b2, ..., bn are given real numbers. The system (3.19) can be written in matrix notation as Ax = b, where A is the real n × n coefficient matrix in which aij is the coefficient of xj in the ith equation, bᵀ = [b1, b2, ..., bn] is the prescribed column n-vector and xᵀ = [x1, x2, ..., xn] is the unknown column n-vector.
(i) A non-homogeneous system of n equations in n unknowns has a unique solution if and only if A is non-singular, i.e., det(A) ≠ 0.
(ii) If r(A), r(A, b) are the ranks of the coefficient and the augmented matrix respectively, the necessary and sufficient condition for the existence of a unique solution of the consistent system Ax = b is r(A) = r(A, b) = number of unknowns.
(iii) If r(A) ≠ r(A, b), then the equations are inconsistent or overdetermined and they have no solution. If b = 0 and det(A) ≠ 0, then the system has only the unique trivial solution x = 0.
(iv) If r(A) = r(A, b) < the number of unknowns, then the equations have an infinite number of solutions.
A homogeneous system of equations leads to an eigenvalue problem, and such a system possesses only the trivial solution if |A| ≠ 0, and a non-trivial solution if r(A) = k < n.
This is shown in Figure 3.3: a system of linear equations has a solution when rank(Ac) = rank(A) — a unique solution if rank(Ac) = rank(A) = n, and infinitely many solutions if rank(Ac) = rank(A) < n — and has no solution when rank(Ac) ≠ rank(A).

Figure 3.3: Different cases for the existence of solutions.


Ex 3.10.7 Find for what values of a and b the system of equations
    x + y + z = 1,  x + 2y − z = b,  5x + 7y + az = b²
has (i) a unique solution, (ii) no solution, and (iii) an infinite number of solutions over the field of rational numbers.

Solution: The given system of linear equations can be written in the form Ax = B, where the augmented matrix is given by

            [ 1  1   1 |  1  ]
    (A|B) = [ 1  2  −1 |  b  ]
            [ 5  7   a |  b² ]

Let us apply elementary row operations on (A|B):

           R2 − R1   [ 1  1    1   |    1   ]  R1 − R2   [ 1  0    3   |     2 − b    ]
    (A|B)  R3 − 5R1  [ 0  1   −2   |  b − 1 ]  R3 − 2R2  [ 0  1   −2   |     b − 1    ]
              →      [ 0  2  a − 5 | b² − 5 ]     →      [ 0  0  a − 1 | b² − 2b − 3 ]

(i) If a ≠ 1, then the ranks of A and (A|B) are both 3 = the order of the matrix. Therefore, in this case the system has a unique solution.
(ii) If a = 1 and b² − 2b − 3 ≠ 0, then rank (A|B) = 3 and rank A = 2, and therefore the system is inconsistent. Thus if a = 1 and b ≠ −1, 3, the system has no solution.
(iii) If a = 1 and b² − 2b − 3 = 0, then rank (A|B) = rank A = 2, and therefore the system is consistent. Thus if a = 1, b = −1 or a = 1, b = 3, the system has an infinite number of solutions.
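The three cases of Ex 3.10.7 can be reproduced by comparing ranks numerically; a sketch (not part of the text), using representative values of a and b:

```python
import numpy as np

def ranks(a, b):
    """Ranks of the coefficient and augmented matrices of Ex 3.10.7."""
    A = np.array([[1.0, 1, 1],
                  [1, 2, -1],
                  [5, 7, a]])
    Ab = np.column_stack([A, [1.0, b, b * b]])
    return np.linalg.matrix_rank(A), np.linalg.matrix_rank(Ab)
```

With a ≠ 1 both ranks are 3 (unique solution); with a = 1 and b ∉ {−1, 3} the ranks differ (no solution); with a = 1 and b = 3 they agree and equal 2 (infinitely many solutions).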

Ex 3.10.8 Find for what values of a and b the system of equations
    x1 + 4x2 + 2x3 = 1,  2x1 + 7x2 + 5x3 = 2b,  4x1 + ax2 + 10x3 = 2b + 1
has (i) a unique solution, (ii) no solution, and (iii) an infinite number of solutions over the field of rational numbers.

Solution: If the matrix form of the system is AX = B, then

        [ 1  4   2 ]        [ x1 ]        [   1    ]
    A = [ 2  7   5 ] ,  X = [ x2 ] ,  B = [   2b   ] .
        [ 4  a  10 ]        [ x3 ]        [ 2b + 1 ]

Let us apply elementary row operations on the augmented matrix (A|B):

            [ 1  4   2 |    1   ]  R21(−2)  [ 1    4      2 |    1   ]  R2(−1)  [ 1    4      2 |    1   ]
    (A|B) = [ 2  7   5 |   2b   ]     →     [ 0   −1      1 | 2b − 2 ]    →     [ 0    1     −1 | 2 − 2b ]
            [ 4  a  10 | 2b + 1 ]  R31(−4)  [ 0  a − 16   2 | 2b − 3 ]  R32(2)  [ 0  a − 14   0 | 1 − 2b ]

(i) The solution of the system will be unique if ρ(A) = 3. For this, a − 14 ≠ 0, i.e., a ≠ 14.
(ii) The system has no solution if ρ(A) ≠ ρ(A|B). If a = 14 and 1 − 2b ≠ 0, i.e., b ≠ 1/2, then ρ(A) = 2 and ρ(A|B) = 3. In this case the system has no solution.
(iii) If a = 14 and b = 1/2, then ρ(A) = ρ(A|B) = 2. The system is consistent and one (= 3 − 2) variable is free. The equations are equivalent to
    x1 + 4x2 + 2x3 = 1,  x2 − x3 = 1.
Considering x3 as arbitrary, x1 = −3 − 6x3, x2 = 1 + x3. Putting rational values for x3, an infinite number of solutions of the system over the field of rational numbers is obtained.

3.11

Elementary Matrices

A square matrix obtained from a unit matrix In of order n by a single elementary transformation is called an elementary matrix of order n. There are three different forms of elementary matrices:
(i) The matrix Eij is obtained by interchanging the ith and jth rows (or columns) of a unit matrix. Also, |Eij| = −1.
(ii) The matrix Ei(c) is obtained by multiplying the ith row (or column) of a unit matrix by a non-zero scalar c. Also, |Ei(c)| = c ≠ 0.
(iii) Eij(c) is obtained by multiplying every element of the jth row of a unit matrix by c and adding them to the corresponding elements of the ith row. Also, |Eij(c)| = 1.
Therefore, an elementary matrix is non-singular. Every elementary row (column) transformation of a matrix can be brought about by pre-(post-)multiplication with an elementary matrix. Now,
(i) The interchange of the ith and jth rows of Eij will transform Eij to the unit matrix. This transformation is effected by pre-multiplication by Eij. Thus
    Eij Eij = I ⟹ (Eij)⁻¹ = Eij.
(ii) If the ith row of Ei(c) is multiplied by 1/c, it is transformed to the unit matrix. This is nothing but pre-multiplication by Ei(1/c), i.e.,
    Ei(1/c) Ei(c) = I ⟹ [Ei(c)]⁻¹ = Ei(1/c).
(iii) Similarly, Eij(−c) Eij(c) = I gives [Eij(c)]⁻¹ = Eij(−c).
Thus, the inverse of an elementary matrix is an elementary matrix of the same type.
An elementary matrix is one obtained from a unit matrix by subjecting it to any one of the elementary transformations. Let

         [ 1  0  0 ]         [ 1  0  0 ]        [ k  0  0 ]        [ 1  k  0 ]
    I3 = [ 0  1  0 ] = E0 ,  E1 = [ 0  0  1 ] , E2 = [ 0  1  0 ] , E3 = [ 0  1  0 ]
         [ 0  0  1 ]         [ 0  1  0 ]        [ 0  0  1 ]        [ 0  0  1 ]

be three matrices obtained from the unit matrix I3 by the elementary row transformations
    I3 → E1 (R23),   I3 → E2 (kR1),   I3 → E3 (R1 + kR2).
The matrices E1, E2, E3 obtained from a unit matrix by elementary row operations are referred to as left elementary matrices. The elementary row (column) transformations of a matrix A can be obtained by pre-multiplying (post-multiplying) A by the corresponding elementary matrices. Consider the matrix

        [ a1  b1  c1 ]
    A = [ a2  b2  c2 ] .
        [ a3  b3  c3 ]

Then

           [ 1  0  0 ] [ a1  b1  c1 ]   [ a1  b1  c1 ]
    E1 A = [ 0  0  1 ] [ a2  b2  c2 ] = [ a3  b3  c3 ] ,
           [ 0  1  0 ] [ a3  b3  c3 ]   [ a2  b2  c2 ]

           [ k  0  0 ] [ a1  b1  c1 ]   [ ka1  kb1  kc1 ]
    E2 A = [ 0  1  0 ] [ a2  b2  c2 ] = [ a2   b2   c2  ] ,
           [ 0  0  1 ] [ a3  b3  c3 ]   [ a3   b3   c3  ]

           [ 1  k  0 ] [ a1  b1  c1 ]   [ a1 + ka2  b1 + kb2  c1 + kc2 ]
    E3 A = [ 0  1  0 ] [ a2  b2  c2 ] = [    a2        b2        c2    ] .
           [ 0  0  1 ] [ a3  b3  c3 ]   [    a3        b3        c3    ]

Here we see that the matrix A is subjected to the same elementary row transformations R23, kR1 and R1 + kR2, respectively, as the unit matrix was to obtain E1, E2, E3. Similarly, we obtain elementary column transformations of a matrix A by post-multiplying it with a matrix known as a right elementary matrix.
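The action of the three left elementary matrices can be checked numerically; a sketch (not part of the text), with a sample A and k = 5:

```python
import numpy as np

k = 5.0
A = np.arange(1.0, 10).reshape(3, 3)                  # sample [[a1,b1,c1],...]

E1 = np.array([[1.0, 0, 0], [0, 0, 1], [0, 1, 0]])    # R23 applied to I3
E2 = np.diag([k, 1.0, 1])                             # kR1 applied to I3
E3 = np.array([[1.0, k, 0], [0, 1, 0], [0, 0, 1]])    # R1 + kR2 applied to I3
```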
Theorem 3.11.1 If A be an n n matrix, the following are equivalent
(i) A is invertible
(ii) A is row-equivalent to the n n identity matrix.

218

Theory of Matrices

(iii) A is the product of elementary matrices.


Proof: Let R be the row-reduced echelon matrix which is row equivalent to A. Then
$$R = E_k \cdots E_2 E_1 A,$$
where $E_1, E_2, \ldots, E_k$ are elementary matrices. We know each elementary matrix E is invertible, so
$$A = E_1^{-1} E_2^{-1} \cdots E_k^{-1} R.$$
As a product of invertible matrices is invertible, we see that A is invertible if and only if R is invertible. Since the square matrix R is a row-reduced echelon matrix, R is invertible if and only if each row of R contains a non-zero entry, i.e., if and only if R = I, the identity matrix. We have now shown that A is invertible if and only if R = I, and if R = I, then $A = E_1^{-1} E_2^{-1} \cdots E_k^{-1}$. Thus (i), (ii) and (iii) are equivalent statements about the $n \times n$ matrix A.
Note: If A is an invertible $n \times n$ matrix and a sequence of elementary row operations reduces A to the identity, then the same sequence of operations, applied to I, yields $A^{-1}$.

Ex 3.11.1 Find the inverse of A where $A=\begin{pmatrix}1&1&2\\2&4&4\\3&3&7\end{pmatrix}$.
Solution: Let us form the $3 \times 6$ matrix $(A|I_3)$ and perform elementary row operations to reduce A to a row-reduced echelon matrix.
$$(A|I_3)=\left(\begin{array}{ccc|ccc}1&1&2&1&0&0\\2&4&4&0&1&0\\3&3&7&0&0&1\end{array}\right)\xrightarrow[R_3-3R_1]{R_2-2R_1}\left(\begin{array}{ccc|ccc}1&1&2&1&0&0\\0&2&0&-2&1&0\\0&0&1&-3&0&1\end{array}\right)$$
$$\xrightarrow{\frac12 R_2}\left(\begin{array}{ccc|ccc}1&1&2&1&0&0\\0&1&0&-1&\frac12&0\\0&0&1&-3&0&1\end{array}\right)\xrightarrow{R_1-R_2}\left(\begin{array}{ccc|ccc}1&0&2&2&-\frac12&0\\0&1&0&-1&\frac12&0\\0&0&1&-3&0&1\end{array}\right)$$
$$\xrightarrow{R_1-2R_3}\left(\begin{array}{ccc|ccc}1&0&0&8&-\frac12&-2\\0&1&0&-1&\frac12&0\\0&0&1&-3&0&1\end{array}\right)=(I_3|A^{-1}).$$
Therefore $A^{-1}=\begin{pmatrix}8&-\frac12&-2\\-1&\frac12&0\\-3&0&1\end{pmatrix}$.
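The $(A|I_3) \to (I_3|A^{-1})$ procedure of the note above can be sketched in code. This is a minimal illustration with names of our own choosing, using exact rational arithmetic so the half-integer entries come out exactly:

```python
from fractions import Fraction as F

def inverse(A):
    """Invert a square matrix by reducing (A|I) to (I|A^{-1})."""
    n = len(A)
    # augmented matrix (A|I) with Fraction entries
    M = [[F(x) for x in row] + [F(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)  # pivot row
        M[c], M[p] = M[p], M[c]
        M[c] = [x / M[c][c] for x in M[c]]                # scale pivot to 1
        for r in range(n):
            if r != c and M[r][c] != 0:                   # clear the column
                M[r] = [x - M[r][c] * y for x, y in zip(M[r], M[c])]
    return [row[n:] for row in M]

A = [[1, 1, 2], [2, 4, 4], [3, 3, 7]]
Ainv = inverse(A)
assert Ainv == [[F(8), F(-1, 2), F(-2)],
                [F(-1), F(1, 2), F(0)],
                [F(-3), F(0), F(1)]]
```

The assertion reproduces the $A^{-1}$ obtained in Ex 3.11.1.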

3.11.1 Equivalent Matrices

Two matrices A and B are said to be equivalent if it is possible to pass from one to the other by a chain of elementary transformations; this fact is written $A \sim B$. Equivalent matrices have the following properties:
(i) Any non-singular matrix is equivalent to the unit matrix.


(ii) An $m \times n$ matrix B is equivalent to A if and only if $B = PAQ$, where P and Q are two suitable non-singular matrices of order m and n respectively.
(iii) If $B = PA$, then B is row equivalent to A, and if $B = AQ$, then B is column equivalent to A.
(iv) Equivalent matrices have the same rank.
(v) If all the operations performed on a matrix A are elementary row operations, this is equivalent to multiplying A on the left by a suitable non-singular matrix P. Thus $PA = I$ when $|A| \neq 0$ and $|P| \neq 0$, which gives $A = P^{-1}$. Since P is a product of elementary matrices and the inverse of an elementary matrix is again elementary, a non-singular matrix is a product of some elementary matrices.


Ex 3.11.2 Express $A=\begin{pmatrix}1&2\\2&5\end{pmatrix}$ as the product of elementary matrices.
Solution: Applying elementary row operations on A we get,
$$\begin{pmatrix}1&2\\2&5\end{pmatrix}\xrightarrow{R_2-2R_1}\begin{pmatrix}1&2\\0&1\end{pmatrix}\xrightarrow{R_1-2R_2}\begin{pmatrix}1&0\\0&1\end{pmatrix}.$$
A is reduced to a row-reduced echelon matrix equivalent to $I_2$, so A is non-singular. Now,
$$(R_1-2R_2)(R_2-2R_1)A = I_2,\quad\text{i.e.,}\quad E_{12}(-2)E_{21}(-2)A = I_2$$
$$\text{or, } A = [E_{21}(-2)]^{-1}[E_{12}(-2)]^{-1} = E_{21}(2)E_{12}(2).$$
A has been expressed as the product of the two elementary matrices $E_{21}(2)$, $E_{12}(2)$. Also,
$$A^{-1} = E_{12}(-2)E_{21}(-2) = \begin{pmatrix}1&-2\\0&1\end{pmatrix}\begin{pmatrix}1&0\\-2&1\end{pmatrix} = \begin{pmatrix}5&-2\\-2&1\end{pmatrix}.$$
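A factorization of this kind can be checked by multiplying the elementary factors back together. Below is a small sketch (helper names are ours), with the convention that $E_{ij}(c)$ is the identity with $c$ placed in position $(i,j)$, i.e. the matrix that adds $c$ times row $j$ to row $i$:

```python
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def E(i, j, c):
    # elementary matrix E_ij(c): identity with c in position (i, j)
    M = [[F(1), F(0)], [F(0), F(1)]]
    M[i - 1][j - 1] = F(c)
    return M

A = matmul(E(2, 1, 2), E(1, 2, 2))        # E21(2) E12(2)
assert A == [[F(1), F(2)], [F(2), F(5)]]  # recovers the original matrix

Ainv = matmul(E(1, 2, -2), E(2, 1, -2))   # E12(-2) E21(-2)
assert Ainv == [[F(5), F(-2)], [F(-2), F(1)]]
```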

Ex 3.11.3 Show that the matrix $\begin{pmatrix}2&0&1\\3&3&0\\6&2&3\end{pmatrix}$ is non-singular and express it as a product of elementary matrices.
Solution: Let the given matrix be denoted by A. We apply elementary row operations on A to reduce it to a row-reduced echelon matrix.
$$A\xrightarrow{\frac12 R_1}\begin{pmatrix}1&0&\frac12\\3&3&0\\6&2&3\end{pmatrix}\xrightarrow[R_3-6R_1]{R_2-3R_1}\begin{pmatrix}1&0&\frac12\\0&3&-\frac32\\0&2&0\end{pmatrix}\xrightarrow{\frac13 R_2}\begin{pmatrix}1&0&\frac12\\0&1&-\frac12\\0&2&0\end{pmatrix}$$
$$\xrightarrow{R_3-2R_2}\begin{pmatrix}1&0&\frac12\\0&1&-\frac12\\0&0&1\end{pmatrix}\xrightarrow[R_2+\frac12 R_3]{R_1-\frac12 R_3}\begin{pmatrix}1&0&0\\0&1&0\\0&0&1\end{pmatrix}.$$
Since A is row equivalent to $I_3$, A is non-singular. We observe that,
$$(R_2+\tfrac12 R_3)(R_1-\tfrac12 R_3)(R_3-2R_2)(\tfrac13 R_2)(R_3-6R_1)(R_2-3R_1)(\tfrac12 R_1)A = I_3$$
$$\text{or, } E_{23}(\tfrac12)E_{13}(-\tfrac12)E_{32}(-2)E_2(\tfrac13)E_{31}(-6)E_{21}(-3)E_1(\tfrac12)A = I_3$$
$$\text{or, } A = [E_1(\tfrac12)]^{-1}[E_{21}(-3)]^{-1}[E_{31}(-6)]^{-1}[E_2(\tfrac13)]^{-1}[E_{32}(-2)]^{-1}[E_{13}(-\tfrac12)]^{-1}[E_{23}(\tfrac12)]^{-1}$$
$$\text{or, } A = E_1(2)E_{21}(3)E_{31}(6)E_2(3)E_{32}(2)E_{13}(\tfrac12)E_{23}(-\tfrac12).$$

3.11.2 Congruent Matrices

Let A and B be $n \times n$ matrices over a field F. A matrix A is said to be congruent to a matrix B, written $A \simeq B$, if there exists a non-singular matrix P over F such that $B = P^TAP$. For example, let $A=\begin{pmatrix}2&-1\\2&3\end{pmatrix}$ and $B=\begin{pmatrix}6&-6\\3&9\end{pmatrix}$; then there is a non-singular matrix $P=\begin{pmatrix}1&-2\\1&1\end{pmatrix}$ such that
$$P^TAP=\begin{pmatrix}1&1\\-2&1\end{pmatrix}\begin{pmatrix}2&-1\\2&3\end{pmatrix}\begin{pmatrix}1&-2\\1&1\end{pmatrix}=\begin{pmatrix}6&-6\\3&9\end{pmatrix}=B.$$
Thus the matrix A is congruent to the matrix B, and the matrix P is called the transforming matrix for the congruence from A to B.
(i) An operation by elementary congruent transformations is known as a congruence operation.
(ii) If $B = P^TAP$ under a congruence operation, then A and B have the same rank.
An elementary congruent transformation of a matrix is defined as a pair of elementary transformations, one of which is applied to the rows and the other of which is the corresponding transformation applied to the columns.
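A congruence $B = P^TAP$ can be checked directly by multiplying out. The sketch below is ours, and it assumes one concrete reading of the worked example's matrices (with the minus signs as shown in the code):

```python
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(M):
    return [list(col) for col in zip(*M)]

# assumed matrices for the example: A = (2 -1; 2 3), P = (1 -2; 1 1)
A = [[F(2), F(-1)], [F(2), F(3)]]
P = [[F(1), F(-2)], [F(1), F(1)]]

B = matmul(matmul(transpose(P), A), P)           # B = P^T A P
assert B == [[F(6), F(-6)], [F(3), F(9)]]
assert P[0][0] * P[1][1] - P[0][1] * P[1][0] != 0  # P is non-singular
```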

3.11.3 Similar Matrices

Let A and B be two square matrices of the same order n over the field F. If there is an invertible $n \times n$ matrix P over F such that $B = P^{-1}AP$, then B is similar to A over F, or B is obtained from A by a similarity transformation. For example, let $A=\begin{pmatrix}5&-1\\2&1\end{pmatrix}$ and $B=\begin{pmatrix}5&1\\-2&1\end{pmatrix}$; then there is a non-singular matrix $P=\begin{pmatrix}1&2\\4&7\end{pmatrix}$ such that
$$P^{-1}AP=\begin{pmatrix}-7&2\\4&-1\end{pmatrix}\begin{pmatrix}5&-1\\2&1\end{pmatrix}\begin{pmatrix}1&2\\4&7\end{pmatrix}=\begin{pmatrix}5&1\\-2&1\end{pmatrix}=B.$$
Thus, the matrix A is similar to the matrix B.
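A similarity $B = P^{-1}AP$ can likewise be verified numerically. This sketch is ours and assumes one concrete reading of the example's matrices (signs as in the code):

```python
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inv2(M):
    # inverse of a 2x2 matrix by the adjugate formula
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# assumed matrices for the example: A = (5 -1; 2 1), P = (1 2; 4 7)
A = [[F(5), F(-1)], [F(2), F(1)]]
P = [[F(1), F(2)], [F(4), F(7)]]

B = matmul(matmul(inv2(P), A), P)             # B = P^{-1} A P
assert B == [[F(5), F(1)], [F(-2), F(1)]]
```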
Theorem 3.11.2 Similarity of matrices is an equivalence relation on the set of $n \times n$ matrices over F.
Proof: Let A, B and C be $n \times n$ matrices over F. Let $\sim$ be the relation of similarity between matrices; that is, $A \sim B$ if there exists an invertible matrix P of order $n \times n$ such that $B = P^{-1}AP$. Now,
(i) Since $IA = AI$, I being the $n \times n$ identity matrix, it follows that $A = I^{-1}AI$ for every $n \times n$ matrix A. Hence $A \sim A$ for every A in the set of $n \times n$ matrices, so the relation is reflexive.
(ii) Now suppose that $A \sim B$ holds. Then there exists an invertible matrix P such that
$$B = P^{-1}AP \Rightarrow PB = AP \Rightarrow A = PBP^{-1} = (P^{-1})^{-1}BP^{-1}.$$
This shows that $B \sim A$ holds. Hence $\sim$ is symmetric.
(iii) Finally, suppose that $A \sim B$ and $B \sim C$ hold. Then there exist invertible matrices P and Q such that
$$A = P^{-1}BP \text{ and } B = Q^{-1}CQ \Rightarrow A = P^{-1}(Q^{-1}CQ)P = (QP)^{-1}C(QP).$$


Since QP is invertible, it follows that $A \sim C$ holds. Hence $\sim$ is transitive. Thus we see that the relation $\sim$ is reflexive, symmetric and transitive, and hence it is an equivalence relation.

Exercise 3
Section-A [Multiple Choice Questions]




23
1 0
1. If A =
,B=
then 2A B is

4 5
 2 3




10
2 3
33
56
(a)
(b)
(c)
(d)
01
1 1
22
67


12
2. If A =
then A2 5A is equal to
34
(a) I
(b) 2I
(c) 0
(d) A I


cos sin
3. If A =
then A3 is
sin cos






cos 3 sin 3
cos3 sin3
cos 2 sin 2
(a)
(b)
(c)
sin 3 cos 3
sin 2 cos 2
sin3 cos3


cos sin
(d)
sin cos
4. Matrix A has p rows and p + 5 columns. Matrix B has q rows and 11 − q columns.
Both AB and BA exist. The values of p and q are
(a) p = 2, q = 3
(b) p = 3, q = 8
(c) p = q = 3
(d) p = 0, q = 0




1 2
30
5. If A + B =
and A B =
then A is
2 0
26








2 1
1 1
20
4 2
(a)
(b)
(c)
(d)
2 3
0 3
31
4 6


21
6. If A =
then A2 4A + 3I is equal to
12
(a) A
(b) I
(c) I
(d) 0

12 3
7. The value of [2 3 4] 0 1 5 is
3 1 1
(a) [2 0 12]
(b) [4 3 4]
(c) [6 15 4]
(d) [14 11 17]

 

2 35
7 85
8. If 2A +
=
then A is
0 1 2
4 1 4






550
5/2 5/2 0
9/2 11/2 5
(a)
(b)
(c)
402
2 0 1
2 1 3


23 4
(d)
0 1 1






10
23
1 1
9. If A =
,B=
,C=
, then the value of AB + AC is
 0 1
 4 5
2 3


12
34
3/2 2
1/2 1
(a)
(b)
(c)
(d)
22
68
3 4
1 1



   
x1
2
1
10. If
=
then the values of x and y are
2y
4
0
(a) x = 3/2, y = 1
(b) x = 3, y = 7
x = y = 7/5

(c) x = 1/2, y = 2/3

(d)

11. The number of 2 × 2 matrices over Z₃ (the field with three elements) with determinant
1 is
[IIT-JAM10]
(a) 24
(b) 60
(c) 20
(d) 30


a b c


12. The value of the determinant 1 2 0 is
2 4 0
(a) a+b+c

(b) 2

(c) 4
(d) 0


0 a b


13. The value of the determinant a 0 c is
b c 0
(b) abc
(c) abc


2000 2001 2002


14. The value of 2003 2004 2005 is
2006 2007 2008
(a) 0

(d) 2abc

(a) 2000

(b) 0
(c) 45
(d) none of these


x x + 1 x + 2


15. The value of x + 3 x + 4 x + 5 is
x + 6 x + 7 x + 8
(b) x3
(c) 0


5 10 20


16. The value of 0 5 10 is
0 0 2
(a) x

(a) 10

(b) 50

(c) 100

[WBUT 2007]

(d) x + 1

(d) 200

17. The value of the skew-symmetric determinant of odd order is


(a) a perfect square

(b) 0
(c) 1
(d) none of these


x 2 2 5


18. The value x of the equation x 7 3 6 = 0 is
2x 6 4 7
(a) 6

(b) 0

(c) 3
(d) 5


1 1 1


19. The root of the equation x = 0 are
x2 2 2
(c) + ,
(d) + 1, + 1


x + 2 3
3

20. One factor of the determinant 3 x + 4 5 = 0 is
3
5 x + 4
(a) 1, 1

(a) x 1

(b) ,

(b) x 2

(c) x 3

(d) x + 1




1 2 3


21. The value of the determinant 2 3 4 is
3 4 5
(a) 2

(b) 6

(c) 0

(d) 1


1 2 1


22. The value of the determinant 2 3 2 is
1 4 1
(a) 5
(b) 0
(c) 10
(d) 15


1


23. If , are the roots of the equation x2 2x + 5 = 0 then the value of 0 is
0
0
(a) 10

(c)

(b) 0

(d) 5

24. If 1 and 2 be two determinants such that their values are respectively 5 and 10.
Then the value of their product is
(a) 5
25. Let 1
(a) 0

(b) 10
(c) 50
(d) 2




5 10 15
5 0 0




= 0 2 0 and 2 = 0 2 0 . Then 1 2 is
0 0 1
0 0 1
(c) 100

(b) 100

(d) 10

26. Let A be a square matrix such that AAT = I then det A is equal to
(a) 0
(b) 2
(c) 1
(d) none of these


0 a b c d


a 0 a b d


27. The value of b a 0 a c is
c b a 0 d


d d c d 0
(a) 0

(c) abcd

(b) abcd

(d) 1



1 2 0


28. The cofactor of the element 2 in the determinant 0 4 3 is
1 2 4
(a) 3

(b) 3

(c) 4

(d) 22



1 2 6


29. The minor of the element 1 of the determinant 2 1 4 is
3 2 1
(a) 15

(b) 17

(c) 17

(d) 15

30. If 0 be the adjoint determinant of the determinant of order 4 then the value of 0
is
(a)

(b) 2

(c) 3

(d) 4

31. If the value of a determinant is 5 and if its first row is multiplied by 3 then the value
of the new determinant is
(a) 3

(b) 5

(c) 15

(d) 5/3


100002
0 1 0 0 2 0

0 0 1 2 0 0

is
32. The determinant of the matrix

0 0 2 1 0 0
0 2 0 0 1 0
200001
(a) 0
(b) -9
(c) -27
(d) 1.


33. The matrix $\begin{pmatrix}a&2\\3&1\end{pmatrix}$ is singular when a is equal to    NET(Dec)'11
(a) 1   (b) 2   (c) 3   (d) 6

0 1 2
34. The matrix 1 3 is singular when is equal to
21 2
(a) 1/2

(b) 2

(c) 6
(d) 1/2


2 1
35. The adjoint of the matrix
is
3 2






2 1
2 1
2 3
(a)
(b)
(c)
3 2
3 2
12


31
36. The inverse of the matrix
is
21






1 1
1 1
11
1
(a)
(b)
(c) 2
2 3
2 3
23

23 1
37. The inverse of the matrix 0 a 1 exists if
0 0 3
(a) a = 0

(c) a = 3

(b) a = 2


38. The matrix

cos sin
sin cos

12
(d)
2 3


(d)

31
21

(d) a ≠ 0


is orthogonal for

(a) all values of


(b) = /2
(c) = 0
(d) =

cos sin 0
39. The matrix sin cos 0 is orthogonal when k is equal to
0
0 2k
(a) 1

(b) 1/2

(c) 1/3

(d) 1/4

40. If A is an orthogonal matrix then det A is equal to


(a) 0

(b) 2

(c) 4

(d) 1

41. If A is an orthogonal matrix then A1 is equal to


(a) A

(b) A2

(c) AT

(d) none of these

42. If A is an orthogonal matrix then which matrix is not orthogonal


(a) 2A

(b) AT

(c) A1

(d) A2

43. Three matrices A, B, C are such that AB = AC. Then B = C when


(a) A is singular

(b) A is null

(c) A is non-singular

(d) for all A


44. If A is a singular matrix then AB is


(a) singular

(b) non-singular


12
45. The rank of the matrix
is
30
(a) 0

(b) 2

(a) 4

(b) 3

(a) 1

(b) 2

(c) 3

(a) 0

(b) 1

(c) 2

(c) orthogonal

(d) symmetric

(c) 1

(d) 3

2000
46. The rank of the matrix 0 2 0 0
0040
(c) 2
(d) 1


234
47. The rank of the matrix
468
(d) none of these

2000
48. The rank of the matrix 0 0 0 1 is
0002
(d) 3

1xxx
x 1 x x

49. The rank of the matrix


x x 1 x is one when
xxx1
(c) x = 2
(d) x = 1/3

242
50. If the rank of the matrix 2 1 2 is 2 then the value of x is
10x
(a) x = 0

(b) x = 1

(a) 0

(b) 1

(c) 2

(d) 3

51. If A and B are two matrices such that AB is determinable. Then rank(AB) is equal
to
(a) rank A (b) rank B (c) min{rank A, rank B } (d) max{rank A, rank B }



320
210
52. If A =
and B = 1 2 5 then rank(AB) is
123
334
(a) 2

(b) 3

(c) 1

(d) none of these

53. If the rank of the matrix A is 5 then the rank of the matrix 7A is
(a) 1

(b) 5

(c) 7

(d) 12

54. If P is a non-zero column matrix and Q is a row matrix then the rank of P Q is
(a) 0

(b) 3

(c) 1

(d) none of these

55. The following system of equations x + 2y = 3, 2x + ay = b has unique solution if


(a) a = 5
(b) a = 4
(c) a = 4, b = 1
(d) a ≠ 4
56. The system of equations x + 4y + z = 0, 4x + y z = 0 has
(a) unique solution
(b) many solutions
(c) no solution


57. The system of equations x + y = 3, x + ay = b has no solution if


(a) a = 1, b = 3
(b) a = 1, b ≠ 3
(c) a ≠ 1, b = 3
(d) a ≠ 1, b ≠ 3
58. For what value of a does the system of equations x + y + z = 1, x + 2y − z = 3, 5x + 7y + az = 9
have many solutions?
(a) 4
(b) 1
(c) 3
(d) 0
59. Let A and Ac represent respectively the coefficient and augmented matrices of a system
of n equations containing n variables. The system of equation AX = b has many
solutions if
(a) rank(A) ≠ rank(Ac)
(b) rank(A) = rank(Ac ) = n
(c) rank(A) =rank(Ac )< n
(d) none of these
60. Let A, B be nn real matrices. Which the following statements is correct?NET2012(June)
(a) rank(A + B) = rank (A)+rank(B)
(b) rank(A + B) rank (A)+rank(B)
(c) rank(A + B) = min{rank(A), rank(B)}
(d) rank(A + B) = max{rank(A), rank(B)}
61. Let AX = b be a system of equations with n equations and n variables and Ac be the
augmented matrix [A : b]. The system has a unique solution if
(a) rank(A) = rank(Ac ) = n
(b) rank(A) = rank(Ac ) < n,
(c) rank(A) = rank(Ac ) < n 1
(d) rank(A) ≠ rank(Ac)
62. The system of equations AX = b, where A is the coefficient matrix of order n n, is
consistent if
(a) rank(A) = rank(Ac )< n
(b) rank(A) = rank(Ac ) = n
(c) rank(A) ≠ rank(Ac)
(d) rank(A) = rank(Ac) ≤ n
63. Let A be a 5 × 4 matrix with real entries such that the space of all solutions of the linear system $AX^t = [1, 2, 3, 4, 5]^t$ is given by
$$\{[1 + 2s,\ 2 + 3s,\ 3 + 4s,\ 4 + 5s]^t : s \in \mathbb{R}\};$$
then the rank of A is equal to
(a) 4
(b) 3
(c) 2

NET(Dec)11
(d) 1.

64. The values of k, for which k(1, 0, 1) is/are the solution(s) of the system of equations
x + 2y z = 0, 2x + y 2z = 0 is
(a) k = 1
(b) k 6= 1
(c) k is any value
(d) none of these
65. A system of equations is called consistent if it has
(a) no solution
(b) has unique solution
(c) has many solutions
(d) none of these
66. The system of equations x1 + 2x2 = 6, 3x1 + 6x2 = 5 is
(a) consistent
(b) inconsistent
67. The solution of the following system of equations x1 + 2x2 + 3x3 = 6, x2 + 3x3 = 4,
x3 = 2 is
(a) (1, 1, 1)
(b) (2, 1, 1)
(c) (0, 2, 2)
(d) (4, 2, 2)
68. The number of solutions of the equations x + 2y = 5, 2x + 4y = 3 is
(a) 1
(b) infinite
(c) 2
(d) 0


69. The system of equations 2x − y + 3z = 9, x + y + z = 6, x − y + z = 2 has


(a) a unique non-zero solution
(b) infinitely many solutions
(c)no solution
(d) zero solution
70. Let $S=\{A : A=[a_{ij}]_{5\times5},\ a_{ij}=0\text{ or }1\ \forall i,j;\ \sum_j a_{ij}=1\ \forall i\text{ and }\sum_i a_{ij}=1\ \forall j\}$. Then the number of elements in S is    NET(June)'11
(a) $5^2$   (b) $5^5$   (c) $5!$   (d) 55
71. Let D be a non-zero $n \times n$ real matrix with $n \ge 2$. Which of the following implications is valid?    NET(June)'11
(a) det(D) = 0 implies rank(D) = 0   (b) det(D) = 1 implies rank(D) ≠ 1
(c) det(D) = 1 implies rank(D) ≠ 0   (d) det(D) = n implies rank(D) ≠ 1
Section-B
[Objective Questions]
1. Give an example of a 3 3 skew-Hermitian matrix.
2. Let A = X + iY be a skew-Hermitian matrix. Show that the diagonal elements of X
are all purely imaginary or 0 and Y is a real symmetric matrix.
3. Let A be a 3 3 real matrix with det(A) = 6. Then show that det(adjA) = 36.
4. Let A be m m real matrix. Show that the row rank of A is the same as the column
rank of A.

5. Find the column rank of the matrix $A=\begin{pmatrix}1&4&6\\2&5&9\\3&6&12\end{pmatrix}$.
6. Let A be a nonsingular real matrix of order n. Show that $\det(A^{-1}) = (\det A)^{-1}$.
7. Consider the group of all non-singular $3 \times 3$ real matrices under matrix multiplication. Show that the two matrices $A=\begin{pmatrix}1&0&0\\1&3&0\\1&2&1\end{pmatrix}$ and $B=\begin{pmatrix}3&0&4\\0&1&0\\0&0&1\end{pmatrix}$ are conjugate.
Section-C
[Long Answer Questions]
1. Obtain A + B and A − B in each of the following cases:
(a) $A=\begin{pmatrix}3&9\\2&7\\5&6\end{pmatrix}$; $B=\begin{pmatrix}1&6\\3&0\\8&11\end{pmatrix}$.
(b) $A=\begin{pmatrix}a^2&b^2\\2a&ac\end{pmatrix}$; $B=\begin{pmatrix}b^2&bc\\ac&c^2\end{pmatrix}$.
2. Find AB and BA and determine the commutator and anticommutator.
(a) $A=\begin{pmatrix}3&9\\1&7\end{pmatrix}$; $B=\begin{pmatrix}2&7\\1&5\end{pmatrix}$.
(b) $A=\begin{pmatrix}5&6\\3&2\end{pmatrix}$; $B=\begin{pmatrix}4&3\\2&1\end{pmatrix}$.


3. Usingthe following



matrices

3
1

1 0
1
3
1

;
A=
;b=
;C=2
0 1
3 1
3 1




1 3
1
3

;
; F = 12
D = 21
3 1
3 1
2
2
2
Show that (i) A = B = C = I, (ii) AB = D, (iii) AC = BA = F.


4. If $A=\begin{pmatrix}2&5\\3&1\end{pmatrix}$, find scalars a, b such that $I + aA + bA^2 = 0$.
5. How many multiplications of scalars are needed to compute the product AB, where A is an $m \times n$ matrix and B is an $n \times p$ matrix?

2
2
5

6. If A = 1 , B = (3 0 1 5) and C =
8 . Compute ABC, which of the two
0
1
possible ways of doing this is easier?
7. Show that Ak , for all k 2

1 1 1
2i
(a) for the matrix A = 1 2 ; = e 3 , is
1 2
k

Ak = (1) 2 3 2 I; k = even
= (1)

k1
2

k1
2

A; k = odd.


(b) for the matrix $A=\begin{pmatrix}a&b\\0&1\end{pmatrix}$, is $A^k=\begin{pmatrix}a^k&b(a^{k-1}+a^{k-2}+\cdots+a+1)\\0&1\end{pmatrix}$.
(c) for the matrix $A=\begin{pmatrix}1&1&1\\0&1&1\\0&0&1\end{pmatrix}$, is $A^k=\begin{pmatrix}1&k&\frac{k(k+1)}{2}\\0&1&k\\0&0&1\end{pmatrix}$.    JECA'08

8. Show that every even power of the matrix $A=\begin{pmatrix}1&0&0\\0&1&0\\a&b&-1\end{pmatrix}$, where a and b are arbitrary scalars, equals the unit matrix, and every odd power equals A itself.

9. Find the upper triangular matrix A such that $A^3=\begin{pmatrix}8&57\\0&27\end{pmatrix}$.    JECA'00

10. Let $x = (2, 1, 1, 0)^T$, $y = (1, 1, 1, 3)$, $A=\begin{pmatrix}2&1&3\\1&0&1\end{pmatrix}$, $u = (1, 4, 2)^T$ and $v = (1, 4, 3)^T$. Show that the matrix product $Aux^Tyv^T$ is defined. Evaluate the product.

11. If $A=\begin{pmatrix}a&b\\c&d\end{pmatrix}$, prove that
$$A^2-(a+d)A+(ad-bc)I_2=0.$$
If $ad-bc \neq 0$, find $A^{-1}$.
12. Find all 2 2 real matrices which commute with

JECA04




01
(a) the matric A =
. Ans:
00


23
(b) the real matrix A =
.
14


ab
, a, b <.
0a

13. Find the matrices A and B, if 2A + B T =

JECA98





2 5
18
, AT + 2B =
.
10 2
41

14. If AB = B and BA = A show that A and B are both idempotent.


15. If for two matrices A and B, AB = BA = In , then prove that A is nonsingular and
A1 = B.
JECA06




17 8
21
16. Find the matrix A, if (i) A2 =
and (ii) A2 =
.
8 17
02


ab
17. Find all the real matrices A =
, such that A2 = I2 .
BH02
cd


31
18. If A =
find B = A3 3A2 + 2A, C = 2A2 + 3A + I, BC and CB.
20

54 0
19. If A = 1 3 8 , find column vectors u, v such that uT Av = 8. Are u and v unique?
2 6 12
20. Prove that, if A and B are two matrices such that AB = A and BA = B, then AT , B T
are idempotent.
21. Show that there are no 2 2 matrices A and B such that AB BA = I2 holds.


ab
22. Consider 2 2 matrix A =
. If a + d = 1 = ad bc, then find A3 .
Gate0 98
cd


1 3 3


23. Prove that 2 0 9 is divisible by 19.
JECA04
3 6 1


59
24. Let A =
. Find |A2004 3A2003 |.
12


1 1 1 1


1 x 1 1
, prove that 0 (x) = 3(x 1)2 .
25. If (x) =

1 1 x 1
1 1 1 x
26. If , , , be the roots of the equation x4 x3 + 2x2 + x + 1 = 0, find the value of
2

+ 1 0
0
0

0 2 + 1 0
0

.
2
0
0

+
1
0

2
0
0
0 + 1


a
b a + b

c b + c = 0, then prove that either a, b, c are in GP or is a root of
27. If b
a + b b + c 0
the equation ax2 + 2bx + c = 0.

JECA02


28. If , , are the roots of x2 (px + q) = r(x + 1), prove that




1 + 1
1

1 1 + 1 = 0.


1
1 1+
29. If , , are the roots of ax2 + bx + c = 0 then find the value of



1
cos( ) cos

cos( )
1
cos .

cos
cos
1
2

b + c2 ab
ac

30. Express 4 = ba c2 + a2 bc as a square of a determinant of order 3. Hence
ca
cb a2 + b2
determine the value of 4.
CH98, JECA05


2
2
2


bc a
ca b
ab c


31. Show that bc + ca + ab bc ca + ab bc + ca ab = (b c)(c a)(a b)(a + b +
(a + b)(a + c) (b + c)(b + a) (c + a)(c + b)
c)(ab + bc + ca).
32. Prove that


(b + c)2
a2
a2

(c + a)2
b2 = 2abc(a + b + c)3 .
(a) b2
BH98
2
2
c
c
(a + b)2


1 a a2 a3 + bcd


1 b b2 b3 + cda

= 0.
(b)
BH99
2 3

1 c c 2 c 3 + dab
1 d d d + abc


1 + a 1
1
1

1 1+b 1

1
1
1
1
1
(c)
= abcd 1 + a + b + c + d . CH00, BH00, 04, V H02
1
1
1
+
c
1


1
1
1 1 + d
3 2
1


(d) 3 2 1 = ( )( )( )( + + ).
BH01, 03, V H05
3 2 1


m

2r 1
Cr
1
m
2

P
2m
m + 1 , then find
33. m be a positive integer and 4r = m 1
4r .
r=1
sin2 (m2 ) sin2 (m) sin(m2 )
34. Using Laplaces theorem, show that,


0 a b c


a 0 d e
= (af be + cd)2 .
(a)

b
d
0
f


c e f 0


a b a b


b a b a
= 4(a2 + b2 )(c2 + d2 ).
(b)

c
d
c
d


d c d c

BH02, 05, V H03

[WBUT 2005]




a b c d


b a d c

= (a2 + b2 + c2 + d2 )2 .
(c) |A| =

c d a b
d c b a
Hence show that the matrix A in which a, b, c, d are real numbers, is non-singular, if
and only if at least one of a, b, c, d is non-zero.
CH98, 02
35. Solve the system of equations by Cramers rule.
(a) x + 2y 3z = 1, 2x y + z = 4 and x + 3y = 5.
36. Expressthe matrix
of a symmetric and a skew-symmetric
matrix.
A as the
sum

231
451
1 3 4
(i)A = 7 5 6 . (ii)A = 3 7 2 JECA98; (iii)A = 7 0 6 .
467
168
28 1
37. Show that every matrix can be expressed uniquely as the sum of a real and a purely
imaginary matrix.
38. (a) Show that the sum of two hermitian matrices is a hermitian matrix.
(b) Show that the product of two hermitian matrices is hermitian if and only if the
two matrices commute with each other.
(c) Prove that in a Hermitian matrix, the diagonal elements are all real.
(d) Let S and A be the matrices obtained by taking the real and imaginary parts,
respectively, of each element of a hermitian matrix H, i.e., H = S + iA, where S
and A are real. Show that S is a symmetric matrix while A is an antisymmetric
matrix.
(e) If H1 is hermitian and H2 is antihermitian, show that both H1 +iH2 and H1 iH2
are hermitian.
(f) Show that any hermitian matrix of order two can be expressed uniquely as a
linear combination of the four vectors:

 
 



10
1 0
01
0 i
;
;
and
.
01
0 1
10
i 0
The last three matrices are known as the Pauli spin matrices for a spin
in quantum mechanics.

1
2

particle

39. (a) If A be a square matrix, then show that A + AT is symmetric and A AT is skew
symmetric.
(b) If A and B are Hermitian, show that A+B, AB +BA are Hermitian and AB BA
is skew Hermitian.
(c) Let A be an n n matrix which is both Hermitian and unitary. Then show that
A2 = I.
Gate0 01
(d) If a matrix A is triangular as well as hermitian, show that A is diagonal.
(e) Let P be a hermitian matrix with the property P 2 = P . Show that for any vector
X, the vectors P X and (I P )X are orthogonal to each other.


cos ei
sin ei
40. (a) Show that the most general unitary matrix is
.
sin ei() cos ei()

(b) Show that a unitary triangular matrix must be diagonal.
(c) Prove that the determinant of a Hermitian matrix is real.
(d) Show that a unitary matrix commutes with its own Hermitian conjugate.
(e) If H is a Hermitian matrix, show that U = (H iI)(H +iI)1 is a unitary matrix.

(f) If A is skew-Hermitian, B is symmetric, AB = BA and B + A is non-singular,


show that (B A)(B + A)T is unitary.


cos
sin
41. (a) Show that the most general orthogonal matrix of order 2 is
.
sin cos
(b) Show
that the most general orthogonal matrix of order 3 is

cos cos cos sin sin cos cos sin sin cos cos sin
sin cos cos + cos sin sin cos sin + cos cos sin sin .
sin cos
sin sin
cos

cos sin 0
(c) Find k such that A = sin cos 0 is an orthogonal matrix.
0
0 k
(d) Find the condition
on thereal scalars a and b for which the following matrix is

a+b ba
orthogonal:
.
ab a+b
(e) If A, B are two orthogonal matrices and detA detB < 0, prove that A + B is
singular.
JECA08
(f) If A is a skew symmetric matrix, then show that P = (I A)(I +A) is orthogonal.
JECA07
(g) If A and B are commutating orthogonal matrices such that I + A and I + B are
non-singular, show that (I AB)(I + A + B + AB)1 is skew symmetric.

qpp
42. (a) Find the condition for which the matrix p q p , where p and q are numbers,
ppq
is nonsingular. Show that the inverse, when exists, is a matrix of the same form.


(b) If $A=\begin{pmatrix}a&b\\c&d\end{pmatrix}$, prove by using elementary row operations that A is invertible if and only if $ad - bc \neq 0$. If this condition holds, find $A^{-1}$.


cosh x sinh x
, and hence show that
(c) Find the inverse of the matrix A =
sinh x cosh x


cosh nx sinh nx
An =
, n Z.
sinh nx cosh nx
43. Find the inverse of the matrix

4 0 1
1 1 0
(a) 2 3 5
Ans: 15 1 0 1
11
1 4 0
5 1 1

3 3 2
1 1 0
(b) 4 3 2
Ans: 0 1 2
2 2 1
2 0 3

BH01, 04

VH02

44. Find the inverse of the matrix by row and column operations.


113
(a) 0 1 2 ; C3 (C1 + 2C2 ); 19 C3 ; R1 R2 ; R3 R2 ; R3 3R1 . BH98
710

3 1 1
(b) 0 1 2 ; R3 (3R3 R2 ); C3 + (2C2 C1 ); 12 R1 ; 1R2 . BH98, 00
1 0 1

1 2 1
45. If A = 1 4 1 , find the matrix B such that AB = 6I3 and hence solve the
3 0 3
system of equations 2x + y + z = 5, x y = 0, 2x + y z = 1
CH01

310
46. Find the inverse of the matrix 0 2 3 and hence solve the system of equations
102
3x + y = 4, 2x + 3z = 2, x + 2z = 6.
BH02.
1 2 2
3

47. Show that the matrix 23 13 23 is orthogonal and hence solve the system of equa2
2 1
3 3 3
tions x + 2y + 2z = 2, 2x y + 2z = 1, 2x 2y + z = 7.
BH03.
48. Solve the following system of equations
(a) x + 2y + 3z = 14, 2x y + 5z = 15, 3x + 2y + 4z = 13.

CH02

(b) 2x y + 3z = 1, x + 2y + 2z = 0, 3x + 4y 4z = 2

BH98

(c) 2x + 4y + 3z + w = 1, x + 2y + z 3w = 1, 3x + 6y + 4z 2w = 4.

BH99

(d) 2x y + 3z = 0, x + 2y 4z = 1, 4x + 3y 2z = 3.

BH00

(e) y z = 0, 3x 2y + 4z = 1, 9x + y + 8z = 0.

BH01

by the matrix inverse method.


49. Solve, if possible, the following system of equations
(a) 2x + y 3z = 8, x y 2z = 2, x + 2y z = 10.

BH03

(b) 3x + y = 4, 2x + 3z = 2, x + 2z = 6.

BH04

(c) x + y + z = 6, x + 2y + 3z = 14, x y + z = 2.

BH04.

(d) x y + 2z = 6, x + y + z = 8, 3x + 5y 7z = 14.

BH05.

by matrix method.
50. (a) Determine the values of a, b so that the system of equations x + 4y + 2z =
1, 2x + 7y + 5z = 2b, 4x + ay + 10z = 2b + 1 has (i) unique solution, (ii) no
solution (iii) many solutions in the field of rational numbers.
CH95
(b) Determine the values of k so that the system of equations x + y z = 1, 2x +
3y + kz = 3, x + ky + 3z = 2 has (i) unique solution, (ii) no solution (iii) many
solutions in the field of real numbers.
CH05
(c) Determine the values of a, b, c so that the system of equations x + 2y + z =
1, 3x + y + 2z = b, ax y + 4z = b2 has (i) unique solution, (ii) no solution (iii)
many solutions in the field of real numbers.
CH97, V H03

(d) Determine the values of a, b, c so that the system of equations x + y + z = 1, x +
2y z = b, 5x + 7y + az = b2 has (i) unique solution, (ii) no solution (iii) many
solutions in the field of real numbers.
CH99

3
51. If , ,
are in AP
and are roots of x + qx + r = 0, then find the rank of

.
JECA99

52. Determine
the rankof thefollowing matrices

1 2 1 0 3
1 2 1 0
1 2 1 0
2 4 4 1 1
2 4 4 1
2 4 4 1

(i)
0 0 5 2 4 (ii) 0 0 5 2 (iii) 0 0 5 2
3 6 8 1 6
1 2 0 3
1 2 0 3

2 4 2 0
1 1 2 0
13 4 3
1 2 2 3
2 2 1 5

(iv)
0 0 5 2 (v) 1 3 1 0 (vi) 3 9 12 3 .
13 4 1
36 8 1
1 7 4 1
53. Obtain
echelon matrix which
to
a row reduced

is row equivalent

1 2 1 0
0012 1
2 1 3 2
2 4 4 6

(ii) 1 1 5 2 (iii) 1 3 1 0 3
(i)
0 0 5 2
2 6 4 2 8
1 1 1 1
3 6 8 1
3 9 4 2 10

2 3 1 4
2 1 3
(iv) 0 1 2 1 (v) 3 2 1 and hence find its rank.
0 2 4 2
1 4 5

abc
54. If a, b, c be real and unequal, find the rank of the matrix b c a , when (i)a+b+c = 0
cab
and (ii)a + b + c 6= 0.

a 1 1
1 a 1

55. Determine the rank of


V H03
1 1 a , when a 6= 1 and a = 1.
1 1 1

201
56. Express the matrix A = 3 3 0 as a product of elementary matrices and hence find
623
1
A .
BH02, CH05
57. Let be a real number. Prove that the matrices


 i

cos sin
e
0
and
sin cos
0 ei
are similar over the field of complex numbers.

BU (M.Sc.)02

58. Obtain the normal form under congruence and the rank of the symmetric matrix

023
A = 2 4 5.
356

Chapter 4

Vector Space
In many applications in mathematics, the sciences, and engineering, the notion of a vector space arises. Here we define the notion and structure of the vector space. In geometry, a vector has 1, 2 or 3 components and it has a direction. In three dimensions, a vector can be represented uniquely by three components. Here the three-dimensional vector is extended to an n-dimensional vector and studied from an algebraic point of view. An n-dimensional vector has n components.

4.1 Vector Space

Let (F, +, ·) be a field. Let (V, ⊕) be a system, where V is a non-empty set, and let ⊙ be an external composition of F with V. V is said to be a vector space or linear space over the field F if the following axioms are satisfied:
(i) (V, ⊕) is an abelian group.
(ii) a ∈ F, α ∈ V ⇒ a ⊙ α ∈ V.
(iii) (a + b) ⊙ α = a ⊙ α ⊕ b ⊙ α; ∀a, b ∈ F and α ∈ V.
(iv) a ⊙ (α ⊕ β) = a ⊙ α ⊕ a ⊙ β; ∀a ∈ F, α, β ∈ V.
(v) (a·b) ⊙ α = a ⊙ (b ⊙ α); ∀a, b ∈ F and α ∈ V.
(vi) 1 ⊙ α = α, where 1 is the identity element in F.

The vector space V over the field F is very often denoted by V(F). The elements of V are called vectors and are denoted by α, β, γ, … etc., and the elements of F are called scalars and are denoted by a, b, c, … etc. The operation ⊕ is called vector addition, and the operation ⊙ is called scalar multiplication.
1. The field of scalars F is called the ground field of the vector space V(F).
2. The external composition ⊙ of F with V is called multiplication by scalars.
3. If F is the field ℝ of real numbers, then V(ℝ) is called a real vector space. Similarly, if F is the field of rational numbers (ℚ) or the field of complex numbers (ℂ), then V is called a rational or a complex vector space respectively.
4. A vector space V = {θ} consisting of the zero vector alone is called a trivial vector space.

Elementary properties
Here we shall discuss some elementary properties of a vector space. In a vector space V(F), with null vector θ, we have
(i) cθ = θ, for any c ∈ F.
(ii) 0α = θ; ∀α ∈ V, 0 ∈ F.
(iii) (−a)α = −(aα) = a(−α); ∀a ∈ F and α ∈ V.
(iv) a(α − β) = aα − aβ; ∀a ∈ F and α, β ∈ V.
(v) aα = θ ⇒ either a = 0 or α = θ; a ∈ F and α ∈ V.
(vi) For a, b ∈ F and any non-null vector α in V, aα = bα ⇒ a = b.
(vii) For α, β ∈ V and any non-zero scalar a in F, aα = aβ ⇒ α = β.
Proof: The properties follow directly from the definition of a vector space.
(i) Since θ is the null element in V, we have θ + θ = θ in V. Thus,
cθ + θ = cθ, as θ is the additive identity in (V, +),
and cθ = c(θ + θ) = cθ + cθ,
so cθ = θ, by the cancellation law in the group (V, +).
(ii) 0 is the zero element in F, so 0 + 0 = 0 in F. Now
0α + θ = 0α = (0 + 0)α = 0α + 0α.
Using the cancellation law in the group (V, +), we have 0α = θ.
(iii) Since (−a) is the additive inverse of a in F, we have
θ = 0α = [a + (−a)]α = aα + (−a)α.
Thus (−a)α is the additive inverse of aα, i.e., (−a)α = −(aα), and similarly a(−α) = −(aα). Thus (−a)α = −(aα) = a(−α).
(iv) Using the definition of subtraction, α − β = α + (−β). Thus, using property (iii), we get
a(α − β) = a[α + (−β)] = aα + a(−β) = aα + [−(aβ)] = aα − aβ.
Hence the property. Also, α + α = 1α + 1α = (1 + 1)α = 2α.
(v) Let aα = θ and let a ≠ 0; then a⁻¹ exists in F. Now,
aα = θ and a ≠ 0 ⇒ a⁻¹(aα) = a⁻¹θ
⇒ (a⁻¹a)α = θ, as (ab)α = a(bα) and a⁻¹θ = θ,
⇒ 1α = α = θ, as 1α = α by definition.
Thus aα = θ and a ≠ 0 ⇒ α = θ. Again, let aα = θ and α ≠ θ. Let, if possible, a ≠ 0. Then a⁻¹ exists, and so
α = 1α = (a⁻¹a)α = a⁻¹(aα) = a⁻¹θ = θ,
which is a contradiction. So whenever aα = θ and α ≠ θ, then a = 0. Hence,
aα = θ ⇒ either a = 0 or α = θ.


(vi) Let a, b be any two scalars and α a non-null vector in V such that aα = bα holds. Then,
aα = bα and α ≠ θ ⇒ aα − bα = θ and α ≠ θ
⇒ (a − b)α = θ and α ≠ θ
⇒ a − b = 0 ⇒ a = b.
(vii) Let α, β be any two vectors in V and a a non-zero scalar in F such that aα = aβ holds. Then,
aα = aβ and a ≠ 0 ⇒ aα − aβ = θ and a ≠ 0
⇒ a(α − β) = θ and a ≠ 0
⇒ α − β = θ ⇒ α = β.
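The properties above can be spot-checked numerically in a concrete vector space, say ℚ³ with componentwise operations. This sketch (names are ours) is only an illustration for particular vectors, not a proof:

```python
from fractions import Fraction as F

# vectors of Q^3, with addition and scalar multiplication componentwise
def add(u, v):
    return tuple(x + y for x, y in zip(u, v))

def smul(c, u):
    return tuple(c * x for x in u)

theta = (F(0), F(0), F(0))          # the null vector
alpha = (F(1), F(-2), F(3, 2))
beta = (F(4), F(0), F(-1))
a = F(5)

assert smul(a, theta) == theta                              # (i)   c.theta = theta
assert smul(F(0), alpha) == theta                           # (ii)  0.alpha = theta
assert smul(-a, alpha) == smul(a, smul(F(-1), alpha))       # (iii) (-a)alpha = a(-alpha)
assert smul(a, add(alpha, smul(F(-1), beta))) == \
       add(smul(a, alpha), smul(-a, beta))                  # (iv)  a(alpha - beta)
```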
Ex 4.1.1 (Vector space of matrices) Let V be the set of all m × n matrices with entries from the field F. Show that V is a vector space over F under the usual addition of matrices and multiplication of a matrix by a scalar as the two compositions.
Solution: Let A = [aij ]mn ; B = [bij ]mn ; C = [cij ]mn be any three matrices in V , where
aij , bij , cij F. The + composition on V , defined by
(aij ) + (bij ) = (aij + bij )
and the external composition (known as multiplication of matrices by real numbers) be
defined by c(aij ) = (caij ).
(i)
A + B = [aij ] + [bij ] = [aij + bij ]mn .
Since aij + bij F (as F is a field), so, A + B V ; A, B V . So the closure axiom is
satisfied.
(ii) We know, matrix addition is always associative, so,
A + (B + C) = [aij + bij + cij ] = (A + B) + C; A, B, C V.
(iii) Let θ = [0]_{m×n}; as 0 is the additive identity in F, 0 ∈ F, and so θ ∈ V and
A + θ = [a_ij + 0] = [0 + a_ij] = θ + A; ∀A ∈ V.
Hence θ is the additive identity in V.
(iv) As (−a_ij) is the additive inverse of a_ij, (−a_ij) ∈ F, and so
A + (−A) = (−A) + A = θ; ∀A ∈ V.
Hence (−A) ∈ V is the additive inverse of A in V.
(v) We know, matrix addition is always commutative, so,
A + B = [aij + bij ] = [bij + aij ]; + is abelian in F
= B + A; A, B V.
Hence addition (+) composition is commutative in V .
(vi) If A = [aij ]mn and c F is an arbitrary element, then cA is also a m n matrix and
cA = c[aij ]mn = [caij ]mn .
As caij F , so, cA V . Therefore closure axiom with respect to multiplication is satisfied.
(vii) Now,
c[A + B] = c[aij + bij ] = [caij + cbij ]
= [caij ] + [cbij ] = cA + cB; A, B V.
(viii)
(c + d)A = [(c + d)aij] = [caij + daij]; as F is a field
= [caij] + [daij] = cA + dA; ∀A ∈ V.
(ix)
(cd)A = [(cd)aij] = [c(daij)] = c[daij] = c(dA).
(x)
1A = 1[aij] = [1aij] = [aij] = A; as 1 ∈ F.
Since all the axioms for a vector space hold, V(F) is a vector space. This space is called
the vector space of matrices and is denoted by Mm×n(F).
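The matrix-space axioms above can be spot-checked numerically. The following is an illustrative sketch (not part of the text), with matrices as nested lists and exact rational entries so the equalities are exact:

```python
from fractions import Fraction as Fr

def add(A, B):
    """Entrywise matrix sum [aij + bij]."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    """Scalar multiple [c * aij]."""
    return [[c * x for x in row] for row in A]

A = [[Fr(1), Fr(2)], [Fr(3), Fr(4)]]
B = [[Fr(0), Fr(-1)], [Fr(5), Fr(2)]]
c, d = Fr(2), Fr(-3)

assert scale(c, add(A, B)) == add(scale(c, A), scale(c, B))  # (vii): c(A+B) = cA + cB
assert scale(c + d, A) == add(scale(c, A), scale(d, A))      # (viii): (c+d)A = cA + dA
assert scale(c * d, A) == scale(c, scale(d, A))              # (ix): (cd)A = c(dA)
assert scale(Fr(1), A) == A                                  # (x): 1A = A
print("axioms (vii)-(x) verified on a sample")
```

A check on one sample does not prove the axioms, of course; it only illustrates them concretely.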
Ex 4.1.2 (Vector space of polynomials) Let P[x] be the set of all polynomials over the real
field ℜ. Show that P[x] is a vector space with the ordinary addition of polynomials and the
multiplication of each coefficient of the polynomial by a member of ℜ as the scalar
multiplication composition.
Solution: A real polynomial in x of degree k is a function that is expressible as
f = c0 + c1x + c2x^2 + … + ckx^k, where c0, c1, c2, …, ck ∈ ℜ, with ck ≠ 0. The addition
composition (+) on P[x] is defined as
f + g = (c0 + c1x + … + ckx^k) + (d0 + d1x + … + dlx^l)
= (c0 + d0) + (c1 + d1)x + … + (ck + dk)x^k + dk+1x^{k+1} + … + dlx^l; if k < l
= (c0 + d0) + (c1 + d1)x + … + (cl + dl)x^l + cl+1x^{l+1} + … + ckx^k; if k > l
= (c0 + d0) + (c1 + d1)x + … + (ck + dk)x^k; if k = l
(i.e., add the coefficients of like-power terms), and an external composition of ℜ with P[x],
called multiplication of polynomials by real numbers, is defined by,
rf(x) = (rc0) + (rc1)x + (rc2)x^2 + … + (rck)x^k; r ∈ ℜ.
We are to show that P[x](ℜ) is a vector space with respect to the above defined
compositions. It is easy to verify that (P[x], +) is an abelian group. Now, if f, g ∈ P[x], then
∀a, b ∈ ℜ, we have,
(i) a[f + g] = a[(c0 + c1x + c2x^2 + …) + (d0 + d1x + d2x^2 + …)]
= a(c0 + d0) + a(c1 + d1)x + …
= a(c0 + c1x + c2x^2 + …) + a(d0 + d1x + d2x^2 + …)
= af + ag.
(ii) (a + b)f = (a + b)[c0 + c1x + c2x^2 + … + ckx^k]
= (a + b)c0 + (a + b)c1x + … + (a + b)ckx^k
= (ac0 + ac1x + … + ackx^k) + (bc0 + bc1x + … + bckx^k) = af + bf.
The remaining axioms are verified similarly.
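The two compositions on P[x] can be sketched in code by storing a polynomial as its list of coefficients [c0, c1, c2, …]; the padding in poly_add mirrors the three cases k < l, k > l, k = l above. An illustrative sketch (not part of the text):

```python
def poly_add(f, g):
    """Add coefficients of like-power terms, padding the shorter list with zeros."""
    n = max(len(f), len(g))
    f = f + [0] * (n - len(f))
    g = g + [0] * (n - len(g))
    return [a + b for a, b in zip(f, g)]

def poly_scale(r, f):
    """Multiply each coefficient by the scalar r."""
    return [r * c for c in f]

f = [1, 0, 2]            # 1 + 2x^2
g = [3, -1]              # 3 - x
print(poly_add(f, g))    # [4, -1, 2], i.e. 4 - x + 2x^2
print(poly_scale(5, g))  # [15, -5]
```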
Ex 4.1.3 (Continuous function space) Prove that the set C[a, b] of all real valued continuous
functions defined on the interval [a, b] forms a real vector space with respect to addition,
defined by,
(f + g)(x) = f(x) + g(x); f, g ∈ C[a, b],
and multiplication by a real number by,
(cf)(x) = cf(x); f ∈ C[a, b], c ∈ ℜ.

Solution: Let f, g, h be any three elements of C[a, b]. The addition composition and
multiplication by a scalar are defined by,
(f + g)(x) = f(x) + g(x); f, g ∈ C[a, b]
(cf)(x) = cf(x); f ∈ C[a, b], c ∈ ℜ.
(i) We know the sum of two continuous functions is also a continuous function, so,
f + g ∈ C[a, b]; ∀f, g ∈ C[a, b].
Hence the closure property holds.
(ii) Now,
[f + (g + h)](x) = f(x) + (g + h)(x); by definition
= f(x) + g(x) + h(x)
= (f + g)(x) + h(x) = [(f + g) + h](x).
Thus, f + (g + h) = (f + g) + h; ∀f, g, h ∈ C[a, b].
Therefore, the addition composition is associative.
(iii) Let θ(x) = 0, ∀x ∈ [a, b]; then θ(x) is also a continuous function on [a, b], i.e., θ ∈
C[a, b] and,
(f + θ)(x) = f(x) + θ(x)
= f(x) + 0 = f(x)
= θ(x) + f(x) = (θ + f)(x),
⇒ f + θ = f = θ + f; ∀f ∈ C[a, b].
Hence θ is the additive identity in C[a, b]. The zero vector θ in C[a, b] maps every x ∈ [a, b]
into the zero element 0 ∈ ℜ.
(iv) We know, if f(x) is a continuous function on [a, b], then −f(x) is also continuous on
[a, b] and
[f + (−f)](x) = f(x) + (−f(x)) = θ(x)
= (−f(x)) + f(x) = [(−f) + f](x)
⇒ f + (−f) = θ = (−f) + f; ∀f ∈ C[a, b].
Therefore, (−f) is the additive inverse of f in C[a, b].
(v) Since f(x) + g(x) = g(x) + f(x) for every x ∈ [a, b], we have,
f + g = g + f; ∀f, g ∈ C[a, b].
(vi) The multiplication of a continuous function with a real number, given by,
(cf)(x) = cf(x); c ∈ ℜ, f ∈ C[a, b],
again yields a continuous function, so cf ∈ C[a, b].
(vii) Now,
[c(f + g)](x) = c[f(x) + g(x)] = cf(x) + cg(x)
= (cf + cg)(x)
⇒ c(f + g) = cf + cg; ∀f, g ∈ C[a, b].
(viii)
[(c + d)f](x) = (c + d)f(x) = cf(x) + df(x)
= [cf + df](x)
⇒ (c + d)f = cf + df; ∀f ∈ C[a, b]; c, d ∈ ℜ.
(ix)
[(cd)f](x) = (cd)f(x) = c[df(x)] = [c(df)](x)
⇒ (cd)f = c(df); ∀c, d ∈ ℜ and f ∈ C[a, b].
(x)
(1f)(x) = 1f(x) = f(x)
⇒ 1f = f; ∀f ∈ C[a, b], and 1 ∈ ℜ.
Since all the axioms for a vector space hold, C[a, b](ℜ) is a vector space. This space is
called the vector space of continuous functions.
Ex 4.1.4 Consider the vector space F³, where F is the Galois field of order 3, i.e., F =
{0, 1, 2} and addition and multiplication in F are modulo 3. In this vector space, find (i)
(1, 1, 2) + (0, 2, 2), (ii) the negative of (0, 1, 2) and (iii) 2(1, 1, 2).
Solution: According to the definition of addition,
(1, 1, 2) + (0, 2, 2) = (1 + 0, 1 + 2, 2 + 2) = (1, 3, 4)
= (1, 0, 1), as 3 ≡ 0 (mod 3) and 4 ≡ 1 (mod 3).
Let the negative of (0, 1, 2) be (x1, x2, x3); then by definition,
(0, 1, 2) + (x1, x2, x3) = (0, 0, 0)
⇒ (x1, 1 + x2, 2 + x3) = (0, 0, 0)
⇒ x1 = 0, 1 + x2 = 0, 2 + x3 = 0
⇒ x1 = 0, 1 + x2 = 3, 2 + x3 = 3, as 3 ≡ 0 (mod 3)
⇒ x1 = 0, x2 = 2, x3 = 1.
Thus the negative of (0, 1, 2) is (0, 2, 1). Also, by definition,
2(1, 1, 2) = (2, 2, 4) = (2, 2, 1), as 4 ≡ 1 (mod 3).
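The computations of this example can be reproduced mechanically. The following illustrative sketch (not part of the text) performs the componentwise mod-3 arithmetic:

```python
def add3(u, v):
    """Vector addition in GF(3)^3, componentwise modulo 3."""
    return tuple((a + b) % 3 for a, b in zip(u, v))

def scale3(c, u):
    """Scalar multiplication in GF(3)^3."""
    return tuple((c * a) % 3 for a in u)

def neg3(u):
    """Additive inverse: the vector v with u + v = (0, 0, 0)."""
    return tuple((-a) % 3 for a in u)

print(add3((1, 1, 2), (0, 2, 2)))  # (1, 0, 1)
print(neg3((0, 1, 2)))             # (0, 2, 1)
print(scale3(2, (1, 1, 2)))        # (2, 2, 1)
```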

4.1.1 Vector Subspaces
In the study of algebraic structures, it is of interest to examine subsets that possess
the same structure as the set under consideration.
Definition 4.1.1 Let V(F) be a vector space. A non-empty subset W of V is called a sub
vector space or vector subspace of V, if W is a vector space in its own right with respect to
the addition and multiplication-by-scalar compositions on V, restricted to the points of W.
Note that every vector space has at least two subspaces:
(i) In an arbitrary vector space V(F), V itself is a subspace of V. This subspace is called
the improper subspace of V.
(ii) In an arbitrary vector space V(F), the set {θ} consisting of only the null vector forms a
subspace. This subspace is called the trivial or zero subspace of V.
Criterion for identifying subspaces
Let V(F) be a vector space. The necessary and sufficient conditions for a non-empty subset
W of V to be a subspace of V are that,
(i) α ∈ W, β ∈ W ⇒ α + β ∈ W, i.e., W is closed under vector addition, and
(ii) a ∈ F, α ∈ W ⇒ aα ∈ W, i.e., W is closed under scalar multiplication.
Proof: Condition necessary: Let us first suppose that W is a subspace of V(F). Then,
by definition, W is a vector space in its own right. Consequently, W must be closed under
addition, and the scalar multiplication on W over F must be well defined. Hence,
α ∈ W, β ∈ W ⇒ α + β ∈ W,
by the closure property in the group (W, +), and,
a ∈ F, α ∈ W ⇒ aα ∈ W,
by the definition of the vector space W(F). Thus, the conditions are necessary.
Condition sufficient: Let the given conditions be satisfied in W. Now, if α is an arbitrary
element of W and 1 is the unity of F, then −1 ∈ F and therefore, according to the given
conditions, we have
−1 ∈ F, α ∈ W ⇒ (−1)α ∈ W ⇒ −α ∈ W.
Thus every element in W has its additive inverse in W. Consequently,
α ∈ W, β ∈ W ⇒ α ∈ W, −β ∈ W
⇒ [α + (−β)] ∈ W ⇒ α − β ∈ W.
This shows that ⟨W, +⟩ is a subgroup of the additive group ⟨V, +⟩. Moreover, all the
elements of W being elements of V, and the addition of vectors being commutative in V,
it is so in W. Therefore, ⟨W, +⟩ is an abelian subgroup of the additive group ⟨V, +⟩. Also,
it is given that the scalar multiplication composition is well defined in W. Further, all
elements of W being elements of V, the remaining four conditions of the vector space are
satisfied by the elements of W, as they are hereditary properties. Thus, W is by itself a
vector space over F. Hence, W is a subspace of V(F).
Result 4.1.1 The necessary and sufficient condition for a non-empty subset W of a vector
space V(F) to be a subspace is that,
α ∈ W, β ∈ W ⇒ aα + bβ ∈ W; ∀a, b ∈ F.    (4.1)
Deduction 4.1.1 Thus a subset W of a vector space V is a subspace of V if and only if
the following four properties hold:
(i) α + β ∈ W; ∀α, β ∈ W.
(ii) cα ∈ W; ∀c ∈ F and α ∈ W.
(iii) W has a zero vector.
(iv) Each vector in W has an additive inverse in W.
Ex 4.1.5 Let S = {(a, b, c) : a, b, c ∈ ℜ and a − 2b − 3c = 0}. Show that S is a subspace of
the real vector space ℜ³.

Solution: Obviously, S is a nonempty subset of the real vector space ℜ³(ℜ). Let α =
(a1, b1, c1) and β = (a2, b2, c2) ∈ S; then ai, bi, ci ∈ ℜ and
a1 − 2b1 − 3c1 = 0; a2 − 2b2 − 3c2 = 0.
For any two scalars x, y ∈ ℜ, we have,
xα + yβ = x(a1, b1, c1) + y(a2, b2, c2)
= (xa1 + ya2, xb1 + yb2, xc1 + yc2).
Since ai, bi, ci ∈ ℜ and x, y ∈ ℜ, we have xa1 + ya2, xb1 + yb2, xc1 + yc2 ∈ ℜ and,
(xa1 + ya2) − 2(xb1 + yb2) − 3(xc1 + yc2)
= x(a1 − 2b1 − 3c1) + y(a2 − 2b2 − 3c2)
= x·0 + y·0 = 0.
Therefore, xα + yβ ∈ S, which shows that S is a subspace of ℜ³(ℜ).
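The closure argument above can also be exercised numerically. A minimal illustrative sketch (not part of the text), with random vectors constructed to satisfy a − 2b − 3c = 0:

```python
import random

def in_S(v):
    """Membership test for S = {(a, b, c) : a - 2b - 3c = 0} (up to float error)."""
    a, b, c = v
    return abs(a - 2 * b - 3 * c) < 1e-9

def combo(x, u, y, v):
    """The linear combination x*u + y*v of two 3-vectors."""
    return tuple(x * ui + y * vi for ui, vi in zip(u, v))

random.seed(0)
for _ in range(100):
    b1, c1, b2, c2 = (random.uniform(-5, 5) for _ in range(4))
    alpha = (2 * b1 + 3 * c1, b1, c1)       # built to lie in S
    beta  = (2 * b2 + 3 * c2, b2, c2)       # built to lie in S
    x, y = random.uniform(-5, 5), random.uniform(-5, 5)
    assert in_S(combo(x, alpha, y, beta))   # closure: x*alpha + y*beta ∈ S
print("closure verified on 100 random samples")
```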
Ex 4.1.6 In the real vector space ℜ³, every plane through the origin is a subspace of ℜ³.
Solution: Let a plane through the origin be ax + by + cz = 0, where a, b, c are given real
constants with (a, b, c) ≠ (0, 0, 0). Consider the set of points of this plane,
S = {(x, y, z) ∈ ℜ³ : ax + by + cz = 0}.
Obviously, S is a nonempty subset of the real vector space ℜ³(ℜ). Let α = (x1, y1, z1) and
β = (x2, y2, z2) ∈ S; then xi, yi, zi ∈ ℜ and
ax1 + by1 + cz1 = 0; ax2 + by2 + cz2 = 0.
For any two scalars p, q ∈ ℜ, we have,
pα + qβ = p(x1, y1, z1) + q(x2, y2, z2) = (px1 + qx2, py1 + qy2, pz1 + qz2).
Since xi, yi, zi ∈ ℜ and p, q ∈ ℜ, we have px1 + qx2, py1 + qy2, pz1 + qz2 ∈ ℜ and,
a(px1 + qx2) + b(py1 + qy2) + c(pz1 + qz2)
= p(ax1 + by1 + cz1) + q(ax2 + by2 + cz2) = p·0 + q·0 = 0.
Therefore, pα + qβ ∈ S, which shows that S is a subspace of ℜ³(ℜ). In ℜ³, any plane not
passing through the origin is not a subspace. In general, the solution set of a system of
homogeneous linear equations in n unknowns forms a subspace of ℜⁿ, called the solution
space, whereas the solution set of a non-homogeneous system of linear equations in n
unknowns is not a subspace of ℜⁿ.
Ex 4.1.7 Show that W = {(x, y, z) ∈ ℜ³ : x^2 + y^2 + z^2 = 5} is not a subspace of ℜ³.
Solution: Let α = (x1, y1, z1) and β = (x2, y2, z2) be any two vectors of W. Therefore,
x1^2 + y1^2 + z1^2 = 5 and x2^2 + y2^2 + z2^2 = 5. Now,
α + β = (x1 + x2, y1 + y2, z1 + z2) ∉ W, in general,
because (x1 + x2)^2 + (y1 + y2)^2 + (z1 + z2)^2 need not equal 5. For example, α = β = (0, 1, 2) ∈ W,
but α + β = (0, 2, 4) gives 0 + 4 + 16 = 20 ≠ 5. Hence W is not a subspace of ℜ³.
Ex 4.1.8 If a vector space V is the set of real valued continuous functions over ℜ, then
show that the set W of solutions of the differential equation 2(d²y/dx²) − 9(dy/dx) + 2y = 0
is a subspace of V.
Solution: Here W = {y = f(x) : 2(d²y/dx²) − 9(dy/dx) + 2y = 0}. Since y = 0 is the
trivial solution, 0 ∈ W. Thus, W is a nonempty subset of the real vector space V. Let
y1, y2 ∈ W; then,
2(d²y1/dx²) − 9(dy1/dx) + 2y1 = 0 and 2(d²y2/dx²) − 9(dy2/dx) + 2y2 = 0.
Let a, b be two scalars in ℜ; then,
2(d²/dx²)(ay1 + by2) − 9(d/dx)(ay1 + by2) + 2(ay1 + by2)
= a[2(d²y1/dx²) − 9(dy1/dx) + 2y1] + b[2(d²y2/dx²) − 9(dy2/dx) + 2y2]
= a·0 + b·0 = 0.
Since ay1 + by2 satisfies the differential equation, we have,
y1, y2 ∈ W ⇒ ay1 + by2 ∈ W; ∀a, b ∈ ℜ.
Therefore, W is a subspace.
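The subspace property can also be seen numerically. Assuming the standard fact (supplied here, not stated in the text) that y = e^{rx} solves the equation when 2r² − 9r + 2 = 0, with roots r = (9 ± √65)/4, the following illustrative sketch checks that a linear combination of two such solutions still satisfies the equation:

```python
import math

# roots of the characteristic equation 2r^2 - 9r + 2 = 0 (assumed standard theory)
r1 = (9 + math.sqrt(65)) / 4
r2 = (9 - math.sqrt(65)) / 4

def residual(a, b, x):
    """Evaluate 2y'' - 9y' + 2y for y = a*e^{r1 x} + b*e^{r2 x}."""
    y   = a * math.exp(r1 * x) + b * math.exp(r2 * x)
    yp  = a * r1 * math.exp(r1 * x) + b * r2 * math.exp(r2 * x)
    ypp = a * r1 ** 2 * math.exp(r1 * x) + b * r2 ** 2 * math.exp(r2 * x)
    return 2 * ypp - 9 * yp + 2 * y

# y1, y2 are solutions; a*y1 + b*y2 should remain a solution for any a, b
for x in (-1.0, 0.0, 0.5, 1.0):
    assert abs(residual(3.0, -2.0, x)) < 1e-6
print("a*y1 + b*y2 remains a solution")
```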
Ex 4.1.9 Let V be the vector space of all functions from the real field ℜ into ℜ, and let,
Ve = {f ∈ V : f(−x) = f(x), ∀x ∈ ℜ}
be the set of even functions. Show that Ve is a subspace of V.
Solution: First we show that Ve is non-empty. Here θ ∈ Ve, as,
θ(−x) = 0 = θ(x).
Now, if f, g ∈ Ve, then for any scalars a, b ∈ ℜ and ∀x ∈ ℜ, we have,
(af + bg)(−x) = (af)(−x) + (bg)(−x)
= af(−x) + bg(−x)
= af(x) + bg(x); as both f, g are even
= (af)(x) + (bg)(x) = (af + bg)(x).
This shows that whenever f and g are even functions, af + bg is also even. Thus,
f, g ∈ Ve ⇒ af + bg ∈ Ve; ∀a, b ∈ ℜ.
Hence Ve is a subspace of V. Similarly, the set of all odd functions, given by,
Vo = {f ∈ V : f(−x) = −f(x), ∀x ∈ ℜ},
is a subspace of V.
Ex 4.1.10 Let V be the vector space of square matrices of order n over the field ℜ. Prove
that the set of all symmetric matrices in V is a subspace of V.
Solution: Let W = {A ∈ V : Aᵀ = A}, where A = [aij]n×n and aij ∈ ℜ. Now, if
A = [aij], B = [bij] ∈ W, then for any scalars a, b ∈ ℜ, we have,
[aA + bB]ᵀ = [a[aij] + b[bij]]ᵀ
= [[aaij] + [bbij]]ᵀ; where aaij ∈ ℜ, bbij ∈ ℜ
= [aaji] + [bbji]
= [aaij] + [bbij]; as aij = aji and bij = bji
= a[aij] + b[bij] = aA + bB.
This shows that A, B ∈ W ⇒ aA + bB ∈ W; ∀a, b ∈ ℜ. Hence, W is a subspace of V.
Similarly, the set of all skew-symmetric matrices in V is a subspace of V.


Ex 4.1.11 Let W denote the collection of all elements of the form
[ a   b ]
[ −b  a ]
from the space M2(F). Prove that W(F) is a subspace of M2(F).
Solution: Here, W = {[a b; −b a] : a, b ∈ F}. Let Ai = [ai bi; −bi ai]; i = 1, 2, be in W; then,
A1 + A2 = [a1 + a2   b1 + b2; −(b1 + b2)   a1 + a2] ∈ W,
cA1 = c[a1 b1; −b1 a1] = [ca1 cb1; −cb1 ca1] ∈ W; c ∈ F.
Therefore, W(F) is a subspace of M2(F).
Two important subspaces
Theorem 4.1.1 Let V be a vector space over the field F and let α ∈ V. Then W = {cα : c ∈
F} forms a subspace of V.
Proof: Clearly, W is non-empty. Here we consider the two cases α = θ and α ≠ θ.
Case 1: Let α = θ; then W = {cθ} = {θ}, so that W is a subspace of V.
Case 2: Let α ≠ θ; then W is a non-empty subset of V, as α ∈ W. Let β1, β2 ∈ W; then
for some scalars c1, c2 ∈ F, we have,
β1 = c1α; β2 = c2α
⇒ β1 + β2 = (c1 + c2)α ∈ W; as c1 + c2 ∈ F.
Thus, β1 ∈ W, β2 ∈ W ⇒ β1 + β2 ∈ W.
Let k ∈ F be another scalar. Then,
kβ1 = k(c1α) = (kc1)α ∈ W, as kc1 ∈ F.
So, k ∈ F, β1 ∈ W ⇒ kβ1 ∈ W.
Hence W is a subspace of V. This subspace is said to be generated by the vector α, and α
is said to be the generator of the subspace W.
Theorem 4.1.2 Let V(F) be a vector space and α, β ∈ V. Then the set of all linear
combinations, i.e., W = {cα + dβ : c ∈ F, d ∈ F}, forms a subspace of V.
Proof: As θ = (0α + 0β) ∈ W, W is a non-empty subset of V. Let γ1 = c1α + d1β ∈ W
and γ2 = c2α + d2β ∈ W, where the scalars c1, c2, d1, d2 ∈ F. Since c1, c2, d1, d2 ∈ F, we
have c1 + c2 ∈ F and d1 + d2 ∈ F by the closure axiom in F. Now,
γ1 + γ2 = c1α + d1β + c2α + d2β
= (c1 + c2)α + (d1 + d2)β ∈ W.
Therefore, γ1 ∈ W, γ2 ∈ W ⇒ γ1 + γ2 ∈ W.
Let k ∈ F be another scalar. Then,
kγ1 = k(c1α + d1β) = (kc1)α + (kd1)β
∈ W; as kc1 ∈ F, kd1 ∈ F.
So, k ∈ F, γ1 ∈ W ⇒ kγ1 ∈ W.
Thus, W is a subspace of V. The set {α, β} is said to be a generating set of the subspace
W. In general, if α1, α2, …, αr ∈ V, then
W = {c1α1 + c2α2 + … + crαr : ci ∈ F}
forms a subspace of V, and {α1, α2, …, αr} is a generating set of the subspace W.

Algebra of vector subspaces
Theorem 4.1.3 The intersection of any two subspaces of a vector space is also a subspace
of the same.
Proof: Let W1, W2 be two subspaces of a vector space V(F). Clearly, θ ∈ W1 and θ ∈ W2,
and so,
θ ∈ W1 ∩ W2 ⇒ W1 ∩ W2 ≠ ∅.
Thus, W1 ∩ W2 is non-empty. If W1 ∩ W2 = {θ}, then ∀a, b ∈ F, we have,
aθ + bθ = θ + θ = θ ∈ {θ} = W1 ∩ W2.
Thus W1 ∩ W2 is a subspace of V(F). Now, let W1 ∩ W2 ≠ {θ} and let α, β ∈ W1 ∩ W2 and
a, b ∈ F. Therefore,
α, β ∈ W1 ∩ W2 ⇒ α, β ∈ W1 and α, β ∈ W2.
Since W1 as well as W2 is a subspace of V,
α ∈ W1, β ∈ W1 ⇒ aα + bβ ∈ W1
α ∈ W2, β ∈ W2 ⇒ aα + bβ ∈ W2
⇒ aα + bβ ∈ W1 ∩ W2.
Thus, α ∈ W1 ∩ W2, β ∈ W1 ∩ W2 ⇒ aα + bβ ∈ W1 ∩ W2; ∀a, b ∈ F.
Therefore, W1 ∩ W2 is also a subspace of V(F).
Theorem 4.1.4 The intersection of an arbitrary collection of subspaces of a vector space
is a subspace of the same.
Proof: Let {Wk : k ∈ Λ} be an arbitrary collection of subspaces of a vector space V(F).
Then,
θ ∈ each Wk ⇒ θ ∈ ∩_{k∈Λ} Wk ⇒ W = ∩_{k∈Λ} Wk ≠ ∅.
Thus, W is non-empty. Now, let α, β ∈ W; then
α, β ∈ ∩_{k∈Λ} Wk ⇒ α, β ∈ each Wk
⇒ aα + bβ ∈ each Wk; [as each Wk is a subspace]
⇒ aα + bβ ∈ ∩_{k∈Λ} Wk; ∀a, b ∈ F.
Hence ∩_{k∈Λ} Wk is a subspace of V(F).

Theorem 4.1.5 The union of two subspaces of a vector space is a subspace if and only if
one is contained in the other.
Proof: Let W1 and W2 be two subspaces of a vector space V(F); then we are to show that
W1 ∪ W2 is a subspace of V iff either W1 ⊆ W2 or W2 ⊆ W1, i.e., either W1 − W2 = ∅ or
W2 − W1 = ∅. If possible, let us assume that both W1 − W2 ≠ ∅ and W2 − W1 ≠ ∅. Then
there exist vectors α, β such that α ∈ W1 but α ∉ W2, and β ∈ W2 but β ∉ W1. Now,
α ∈ W1 ⇒ α ∈ W1 ∪ W2 and β ∈ W2 ⇒ β ∈ W1 ∪ W2
⇒ α + β ∈ W1 ∪ W2; as W1 ∪ W2 is a subspace of V(F)
⇒ α + β ∈ W1 or α + β ∈ W2.

Again, if α + β ∈ W1, then W1 being a subspace,
α + β ∈ W1, α ∈ W1 ⇒ (α + β) − α ∈ W1; as W1 is a subspace
⇒ β ∈ W1,
which is a contradiction. Similarly,
α + β ∈ W2, β ∈ W2 ⇒ (α + β) − β ∈ W2; as W2 is a subspace
⇒ α ∈ W2,
which is a contradiction. Thus, α + β ∉ W1, α + β ∉ W2, and so α + β ∉ W1 ∪ W2. So
our assumption that both W1 − W2 ≠ ∅ and W2 − W1 ≠ ∅ is not tenable, and so either
W1 − W2 = ∅ or W2 − W1 = ∅, i.e., W1 ⊆ W2 or W2 ⊆ W1.
Conversely, let W1 and W2 be two subspaces of a vector space V(F) such that
either W1 ⊆ W2 or W2 ⊆ W1
⇒ either W1 ∪ W2 = W2 or W1 ∪ W2 = W1.
But W1 and W2 being subspaces of V and W1 ∪ W2 being either equal to W2 or equal to
W1, in each case W1 ∪ W2 is a subspace of V(F). Thus, a vector space cannot be the union
of two proper subspaces.
Result 4.1.2 The union of two subspaces of a vector space V(F) is not, in general, a
subspace of V(F). For example, let ℜ be the field of real numbers, and let us consider two
subspaces S and T of the vector space ℜ³, where,
S = {(a1, a2, 0) : a1, a2 ∈ ℜ} and T = {(0, a2, a3) : a2, a3 ∈ ℜ}.
Now, consider two particular elements α = (1, 2, 0) and β = (0, 3, 4) of S and T respectively.
Here, α ∈ S ⊆ S ∪ T and β ∈ T ⊆ S ∪ T, but α + β = (1, 5, 4) ∉ S ∪ T. Thus here,
α ∈ S ∪ T, β ∈ S ∪ T but α + β ∉ S ∪ T.
Hence (S ∪ T)(ℜ) is not a subspace of ℜ³(ℜ).

4.2 Linear Sum
Let W1 and W2 be two subspaces of a vector space V(F). Then the subset,
W1 + W2 = {s + t : s ∈ W1, t ∈ W2}    (4.2)
is said to be the linear sum of the subspaces W1 and W2. Clearly, each element of W1 + W2
is expressible as the sum of an element of W1 and an element of W2.
Result 4.2.1 Let α ∈ W1; then,
α = α + θ; where α ∈ W1 and θ ∈ W2
⇒ α ∈ W1 + W2 ⇒ W1 ⊆ W1 + W2.
Similarly, W2 ⊆ W1 + W2.
Theorem 4.2.1 Let W1 and W2 be two subspaces of a vector space V(F); then,
W1 + W2 = {w1 + w2 : w1 ∈ W1 and w2 ∈ W2}
is a subspace of V(F).

Proof: Let W1, W2 be two subspaces of a vector space V(F). Then each of W1 and W2 is
nonempty, and so W1 + W2 ≠ ∅. Let α, β be two arbitrary elements of W1 + W2; then,
α = α1 + α2 for some α1 ∈ W1 and α2 ∈ W2,
and β = β1 + β2 for some β1 ∈ W1 and β2 ∈ W2.
Let a, b be any two scalars in F; then
α1 ∈ W1, β1 ∈ W1 ⇒ aα1 + bβ1 ∈ W1; as W1 is a subspace,
α2 ∈ W2, β2 ∈ W2 ⇒ aα2 + bβ2 ∈ W2; as W2 is a subspace.
Therefore, we get,
aα + bβ = a(α1 + α2) + b(β1 + β2)
= (aα1 + bβ1) + (aα2 + bβ2) ∈ W1 + W2.
Thus, α, β ∈ W1 + W2 ⇒ aα + bβ ∈ W1 + W2; ∀a, b ∈ F.
Thus W1 + W2 is a subspace of V(F).
4.2.1 Smallest Subspace
Let S be a subset of a vector space V(F). Then a subset U of V is called the smallest
subspace containing S, if U is a subspace of V containing S and is itself contained in every
subspace of V containing S. Such a subspace is also called the subspace generated or
spanned by S, and we shall denote it by {S}. Clearly, {S} is the intersection of all subspaces
of V, each containing S.
Theorem 4.2.2 The subspace W1 + W2 is the smallest subspace of V containing the
subspaces W1 and W2.
Proof: Let S be any subspace of V containing the subspaces W1 and W2. Let α be an
element of W1 + W2; then,
α = α1 + α2, for some α1 ∈ W1, α2 ∈ W2.
Since W1 ⊆ S, W2 ⊆ S, we have α1 ∈ S, α2 ∈ S. Also, as S is a subspace of V, we get,
α1, α2 ∈ S ⇒ α1 + α2 ∈ S
⇒ α ∈ S.
Thus, α = α1 + α2 ∈ W1 + W2 ⇒ α ∈ S
⇒ W1 + W2 ⊆ S.
Thus, W1 + W2 is the smallest subspace containing W1 and W2.

4.2.2 Direct Sum
A vector space V(F) is said to be the direct sum of its subspaces W1 and W2, denoted by
V = W1 ⊕ W2, if each element of V is uniquely expressible as the sum of an element of W1
and an element of W2, i.e., if each α ∈ V is uniquely expressed as,
α = α1 + α2; α1 ∈ W1 and α2 ∈ W2.
W1 and W2 are said to be complementary subspaces.
Theorem 4.2.3 The necessary and sufficient conditions for a vector space V(F) to be the
direct sum of its subspaces W1 and W2 are,
(i) V = W1 + W2 and (ii) W1 ∩ W2 = {θ}.

Proof: First suppose that V = W1 ⊕ W2. Then each element of V is expressed uniquely
as the sum of an element of W1 and an element of W2. Consequently, V = W1 + W2, and
so (i) is satisfied. Now, to verify the validity of (ii), if possible, let α(≠ θ) ∈ W1 ∩ W2. Then
α is a non-zero vector common to both W1 and W2. We may write,
α = α + θ, where α ∈ W1 and θ ∈ W2
α = θ + α, where θ ∈ W1 and α ∈ W2.
This shows that a non-zero element α ∈ V is expressible in at least two ways as the sum of
an element of W1 and an element of W2. This contradicts the fact that V = W1 ⊕ W2. Hence
θ is the only vector common to both W1 and W2. Thus W1 ∩ W2 = {θ}. Therefore,
V = W1 ⊕ W2 ⇒ V = W1 + W2 and W1 ∩ W2 = {θ}.
Conversely, let the conditions (i) and (ii) hold; we are to show that V = W1 ⊕ W2, i.e.,
that each element of V is expressed uniquely as the sum of an element of W1 and an element
of W2. Now, the condition V = W1 + W2 reveals that each element of V is expressed as the
sum of an element of W1 and an element of W2. Hence we are to show that this expression
is unique. Let, if possible, α(≠ θ) ∈ V be expressed as,
α = α1 + α2; α1 ∈ W1 and α2 ∈ W2
α = β1 + β2; β1 ∈ W1 and β2 ∈ W2
⇒ α1 + α2 = β1 + β2
⇒ α1 − β1 = β2 − α2 ∈ W1 ∩ W2; as α1 − β1 ∈ W1 and β2 − α2 ∈ W2
⇒ α1 − β1 = θ and β2 − α2 = θ; as W1 ∩ W2 = {θ}
⇒ α1 = β1 and α2 = β2.
Thus each element of V is uniquely expressed as the sum of an element of W1 and an
element of W2. Hence V = W1 ⊕ W2.
Ex 4.2.1 Let W1 and W2 be two subspaces of ℜ³(ℜ), where W1 is the xy plane and W2 is
the yz plane. Examine whether ℜ³ is the direct sum of W1 and W2.
Solution: Here the vector space is V = ℜ³(ℜ) = {(x, y, z) : x, y, z ∈ ℜ}. The two
subspaces W1 and W2 are given by,
W1 = {(a, b, 0) : a, b ∈ ℜ}; W2 = {(0, b, c) : b, c ∈ ℜ}.
The linear sum of W1 and W2 is given by,
W1 + W2 = {(a, b, c) : a, b, c ∈ ℜ}.
Since every vector in ℜ³ is the sum of a vector in W1 and a vector in W2, W1 + W2 = ℜ³.
Also,
W1 ∩ W2 = {(0, b, 0) : b ∈ ℜ} ≠ {(0, 0, 0)}.
Thus, ℜ³ is not the direct sum of W1 and W2. Also, take a particular example, say (5, 7, 9);
then,
(5, 7, 9) = (5, 5, 0) + (0, 2, 9); (5, 7, 9) = (5, 4, 0) + (0, 3, 9).
Thus the decompositions are not unique. This also shows that ℜ³ is not the direct sum of
W1 and W2.
Ex 4.2.2 In the vector space V of all real valued continuous functions defined on the set ℜ
of real numbers, let Ve and Vo denote the sets of even and odd functions respectively. Show
that Ve and Vo are subspaces of V and V = Ve ⊕ Vo.

Solution: It has already been shown that Ve as well as Vo is a subspace of V. Now, in
order to show that V = Ve ⊕ Vo, we must prove that V = Ve + Vo and Ve ∩ Vo = {θ}. Let
f be an arbitrary element of V. Then ∀x ∈ ℜ, we have,
f(x) = (1/2)[f(x) + f(−x)] + (1/2)[f(x) − f(−x)]
= fe(x) + fo(x) = (fe + fo)(x),
where fe(x) = (1/2)[f(x) + f(−x)] and fo(x) = (1/2)[f(x) − f(−x)]. Also,
fe(−x) = (1/2)[f(−x) + f(x)] = fe(x)
and fo(−x) = (1/2)[f(−x) − f(x)] = −fo(x).
Therefore, fe ∈ Ve and fo ∈ Vo, and consequently,
f = fe + fo; where fe ∈ Ve, fo ∈ Vo.
This shows that every element of V is expressed as the sum of an element of Ve and an
element of Vo, so V = Ve + Vo. Also, the condition Ve ∩ Vo = {θ} follows from the fact that
the zero function is the only real valued function on ℜ which is both even and odd. Thus,
V = Ve + Vo; Ve ∩ Vo = {θ} ⇒ V = Ve ⊕ Vo.
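The decomposition f = fe + fo can be sketched directly in code. An illustrative sketch (not part of the text), using an arbitrary sample function:

```python
def even_part(f):
    """fe(x) = [f(x) + f(-x)] / 2 is always even."""
    return lambda x: (f(x) + f(-x)) / 2

def odd_part(f):
    """fo(x) = [f(x) - f(-x)] / 2 is always odd."""
    return lambda x: (f(x) - f(-x)) / 2

f = lambda x: x ** 3 + 4 * x ** 2 - 2 * x + 7   # an arbitrary sample function
fe, fo = even_part(f), odd_part(f)

for x in (-2.0, -0.5, 0.0, 1.0, 3.0):
    assert fe(x) == fe(-x)                       # fe is even
    assert fo(x) == -fo(-x)                      # fo is odd
    assert abs(fe(x) + fo(x) - f(x)) < 1e-12     # f = fe + fo
print("f = fe + fo with fe even, fo odd")
```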
Ex 4.2.3 Let V be the vector space of square matrices of order n over the field ℜ. Let Vs
and Va be the subspaces of symmetric and antisymmetric matrices in V respectively. Show
that V = Vs ⊕ Va.
Solution: It has already been shown that Vs as well as Va is a subspace of V. Now, in
order to show that V = Vs ⊕ Va, we must prove that V = Vs + Va and Vs ∩ Va = {θ}. Let
A be an arbitrary element of V and write A = X + Y, where,
X = (1/2)(A + Aᵀ) and Y = (1/2)(A − Aᵀ).
Since Xᵀ = X and Yᵀ = −Y, we have X ∈ Vs and Y ∈ Va. If M ∈ Vs ∩ Va, then M = Mᵀ
and Mᵀ = −M. It implies that M = −M, i.e., M = 0 = θ. Hence, Vs ∩ Va = {θ}. Thus,
V = Vs + Va; Vs ∩ Va = {θ} ⇒ V = Vs ⊕ Va.
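A short illustrative sketch of the decomposition A = X + Y (not part of the text; the sample matrix is chosen here, and exact rational entries are used):

```python
from fractions import Fraction as Fr

def transpose(A):
    """Transpose of a square matrix given as nested lists."""
    return [list(row) for row in zip(*A)]

def split(A):
    """Return X = (A + A^T)/2 and Y = (A - A^T)/2."""
    n = len(A)
    X = [[(A[i][j] + A[j][i]) / 2 for j in range(n)] for i in range(n)]
    Y = [[(A[i][j] - A[j][i]) / 2 for j in range(n)] for i in range(n)]
    return X, Y

A = [[Fr(1), Fr(7), Fr(3)],
     [Fr(0), Fr(2), Fr(-5)],
     [Fr(4), Fr(9), Fr(6)]]
X, Y = split(A)

assert transpose(X) == X                                  # X is symmetric
assert transpose(Y) == [[-v for v in row] for row in Y]   # Y is antisymmetric
assert all(X[i][j] + Y[i][j] == A[i][j] for i in range(3) for j in range(3))
print("A = X + Y verified")
```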

4.3 Quotient Space
Let W be a subspace of a vector space V(F) and let α ∈ V. Then the set given by,
α + W = {α + w : w ∈ W} ⊆ V
is called a coset of W in V and is denoted by α + W. The set of all distinct cosets of W in
V is denoted by V/W.
Theorem 4.3.1 Let W be a subspace of a vector space V(F). Then the set V/W of all
cosets W + α, where α ∈ V, is a vector space over the field F with respect to the addition
and scalar multiplication defined by,
(i) (W + α) + (W + β) = W + (α + β)

(ii) a(W + α) = W + aα; ∀α, β ∈ V and a ∈ F.
Proof: First we are to show that the compositions are well defined. Let W + α = W + α1
and W + β = W + β1. Then,
W + α = W + α1 and W + β = W + β1
⇒ α − α1 ∈ W and β − β1 ∈ W
⇒ (α − α1) + (β − β1) ∈ W
⇒ (α + β) − (α1 + β1) ∈ W
⇒ W + (α + β) = W + (α1 + β1)
⇒ (W + α) + (W + β) = (W + α1) + (W + β1).
This shows that the addition composition on V/W is well defined. Again,
W + α = W + α1 ⇒ α − α1 ∈ W
⇒ a(α − α1) ∈ W, i.e., aα − aα1 ∈ W
⇒ W + aα = W + aα1
⇒ a(W + α) = a(W + α1).
So the scalar multiplication is well defined. Now, it is easy to show that (V/W, +) is an
abelian group. In fact, the coset W + θ = W is the additive identity, and every coset W + α
in V/W has its additive inverse W + (−α) in V/W. Moreover, ∀(W + α), (W + β) ∈ V/W
and ∀a, b ∈ F, we have,
(i) a[(W + α) + (W + β)] = a[W + (α + β)]
= W + a(α + β) = W + (aα + aβ)
= (W + aα) + (W + aβ) = a(W + α) + a(W + β).
(ii) (a + b)(W + α) = W + (a + b)α = W + (aα + bα)
= (W + aα) + (W + bα) = a(W + α) + b(W + α).
(iii) (ab)(W + α) = W + (ab)α = W + a(bα)
= a(W + bα) = a[b(W + α)].
(iv) 1(W + α) = W + 1α = W + α.
Hence V/W is a vector space, and this vector space V/W is called the quotient space of V
by W.
Theorem 4.3.2 Let W be a subspace of a vector space V(F), and α, β ∈ V. Then α + W =
β + W if and only if α − β ∈ W.
Proof: First let α + W = β + W, and let γ ∈ α + W. Then,
γ = α + w1 = β + w2; for some w1, w2 ∈ W
⇒ α − β = w2 − w1 ∈ W.
Conversely, let α − β ∈ W. Then,
α = β + w3; for some w3 ∈ W,
and β = α + w4; for some w4 ∈ W.

Let γ ∈ α + W; then γ = α + w5 for some w5 ∈ W. Thus,
γ = α + w5 = (β + w3) + w5
= β + w6; w6 = w3 + w5 ∈ W
⇒ γ ∈ β + W.
Therefore, γ ∈ α + W ⇒ γ ∈ β + W, so α + W ⊆ β + W. Let δ ∈ β + W; then,
δ = β + w7; for some w7 ∈ W
= (α + w4) + w7 = α + w8; w8 = w4 + w7 ∈ W
⇒ δ ∈ α + W, so, β + W ⊆ α + W.
Hence it follows that α + W = β + W. Hence the theorem.
4.4 Linear Combination of Vectors
Let V(F) be a vector space, and let S = {α1, α2, …, αn} be a finite subset of V(F). A vector
β ∈ V is said to be a linear combination of the vectors α1, α2, …, αn if β can be expressed
in the form
β = c1α1 + c2α2 + … + cnαn    (4.3)
for some scalars c1, c2, …, cn in F. Note that, to express β as a linear combination of
α1, α2, …, αn, we are to solve a system AC = B of linear equations in the unknowns
C = (c1, c2, …, cn)ᵀ, where B = β and the columns of the coefficient matrix A are the αk's.
If the system AC = B has no solution, then β cannot be expressed as a linear combination
of α1, α2, …, αn.
Ex 4.4.1 Express (2, 1, −6) as a linear combination of (1, 1, 2), (3, −1, 0) and (2, 0, −1) in
the real vector space ℜ³(ℜ).
Solution: Let β = (2, 1, −6), α1 = (1, 1, 2), α2 = (3, −1, 0) and α3 = (2, 0, −1). Here we
seek scalars c1, c2, c3 ∈ ℜ such that the relation β = c1α1 + c2α2 + c3α3 holds. Using the
values of α1, α2 and α3, we get,
(2, 1, −6) = c1(1, 1, 2) + c2(3, −1, 0) + c3(2, 0, −1)
= (c1 + 3c2 + 2c3, c1 − c2, 2c1 − c3).
Equating corresponding entries leads to the linear system
c1 + 3c2 + 2c3 = 2
c1 − c2 = 1
2c1 − c3 = −6.
The above system of equations is consistent and has the solution c1 = −7/8, c2 = −15/8,
c3 = 17/4. Hence β is a linear combination of the αi's as β = −(7/8)α1 − (15/8)α2 + (17/4)α3.
Alternatively, we write down the augmented matrix M of the equivalent system of linear
equations, where α1, α2, α3 are the first three columns of M and β is the last column, and
then reduce M to echelon form:
[1  3  2 |  2]      [1  3  2 |  2]      [1  3  2 |  2]
[1 −1  0 |  1]  →   [0 −4 −2 | −1]  →   [0 −4 −2 | −1].
[2  0 −1 | −6]      [0 −6 −5 | −10]     [0  0 −4 | −17]
The last matrix corresponds to a triangular system, which has the solution
c1 = −7/8, c2 = −15/8, c3 = 17/4.
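Assuming the data of this example carries the minus signs lost in typesetting, i.e., β = (2, 1, −6), α1 = (1, 1, 2), α2 = (3, −1, 0), α3 = (2, 0, −1), the solution can be verified exactly with rational arithmetic. An illustrative sketch (not part of the text):

```python
from fractions import Fraction as Fr

def combo(coeffs, vectors):
    """The linear combination sum_i c_i * alpha_i of 3-vectors."""
    return tuple(sum(c * v[k] for c, v in zip(coeffs, vectors)) for k in range(3))

a1, a2, a3 = (1, 1, 2), (3, -1, 0), (2, 0, -1)
c = (Fr(-7, 8), Fr(-15, 8), Fr(17, 4))

assert combo(c, (a1, a2, a3)) == (2, 1, -6)  # beta = -(7/8)a1 - (15/8)a2 + (17/4)a3
print("beta reproduced exactly")
```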

Ex 4.4.2 In the vector space ℜ³(ℜ), show that (2, 5, 3) cannot be expressed as a linear
combination of the vectors (1, −3, 2), (2, −4, −1) and (1, −5, 7).
Solution: Let β = (2, 5, 3), α1 = (1, −3, 2), α2 = (2, −4, −1) and α3 = (1, −5, 7). The
vector β = (2, 5, 3) is a linear combination of α1, α2, α3 if we can find scalars c1, c2, c3 ∈ ℜ
so that β = c1α1 + c2α2 + c3α3 holds. Using the values of α1, α2 and α3, we get,
(2, 5, 3) = c1(1, −3, 2) + c2(2, −4, −1) + c3(1, −5, 7)
= (c1 + 2c2 + c3, −3c1 − 4c2 − 5c3, 2c1 − c2 + 7c3).
Equating corresponding entries leads to the linear system
c1 + 2c2 + c3 = 2
−3c1 − 4c2 − 5c3 = 5
2c1 − c2 + 7c3 = 3.
The above system of equations is not consistent, so it has no solution. Hence β cannot be
expressed as a linear combination of the αi's.
Ex 4.4.3 Express p(t) = 3t^2 + 5t − 5 as a linear combination of the polynomials p1(t) =
t^2 + 2t + 1, p2(t) = 2t^2 + 5t + 4 and p3(t) = t^2 + 3t + 6.
Solution: We seek scalars c1, c2, c3 ∈ ℜ so that p(t) = c1p1 + c2p2 + c3p3, i.e.,
3t^2 + 5t − 5 = c1(t^2 + 2t + 1) + c2(2t^2 + 5t + 4) + c3(t^2 + 3t + 6)
= (c1 + 2c2 + c3)t^2 + (2c1 + 5c2 + 3c3)t + (c1 + 4c2 + 6c3)
⇒ c1 + 2c2 + c3 = 3; 2c1 + 5c2 + 3c3 = 5; c1 + 4c2 + 6c3 = −5.
The system of equations has the solution c1 = 3, c2 = 1, c3 = −2. Therefore,
p(t) = 3p1 + p2 − 2p3.
Ex 4.4.4 Let
A = [1 1]   B = [0 0]   C = [0  2]   X = [4 0]
    [1 0],      [1 1],      [0 −1],      [2 0].
Express X as a linear combination of A, B, C.
Solution: We seek scalars p, q, r such that X = pA + qB + rC holds. Now,
[4 0]     [1 1]     [0 0]     [0  2]   [p      p + 2r]
[2 0] = p [1 0] + q [1 1] + r [0 −1] = [p + q  q − r ]
⇒ p = 4, p + 2r = 0, p + q = 2, q − r = 0, i.e., p = 4, q = −2, r = −2.
Hence the required relation is X = 4A − 2B − 2C.
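Assuming the matrices intended here are A = [1 1; 1 0], B = [0 0; 1 1], C = [0 2; 0 −1] and X = [4 0; 2 0] (the minus sign in C and the entries of X are recovered from the equations, so they are an assumption), the relation X = 4A − 2B − 2C can be checked entrywise. An illustrative sketch:

```python
def lincomb(scalars, matrices):
    """Entrywise linear combination sum(s * M) of 2x2 matrices."""
    return [[sum(s * M[i][j] for s, M in zip(scalars, matrices))
             for j in range(2)] for i in range(2)]

A = [[1, 1], [1, 0]]
B = [[0, 0], [1, 1]]
C = [[0, 2], [0, -1]]
X = [[4, 0], [2, 0]]

assert lincomb((4, -2, -2), (A, B, C)) == X   # X = 4A - 2B - 2C
print("X = 4A - 2B - 2C verified")
```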

4.4.1 Linear Span
Let V(F) be a vector space and S be a non-empty subset of V.
(i) If S is finite, the set of all linear combinations of the vectors of S, which is the smallest
subspace of V containing S, is defined as the linear span of S and is denoted by L(S) or
span S. Thus, if S = {α1, α2, …, αn}, then,
L(S) = {c1α1 + c2α2 + … + cnαn : ci ∈ F}.    (4.4)
(ii) If S is infinite, the set of all linear combinations of a finite number of vectors from S
is defined as the linear span of S and is denoted by L(S) or span S.
In both cases, the space L(S) is said to be generated or spanned by the set S, and S is said
to be the set of generators of L(S). For convenience, L(∅) = {θ}, where ∅ is the null set.
Ex 4.4.5 Determine the subspace of ℜ³ spanned by the vectors α = (1, 2, 3), β = (−1, 2, 4).
Show that γ = (3, 2, 2) is in the subspace.
Solution: L{α, β} is the set of vectors {cα + dβ}, where c, d are real numbers and
cα + dβ = c(1, 2, 3) + d(−1, 2, 4) = (c − d, 2c + 2d, 3c + 4d).
If γ ∈ L{α, β}, then there must be real numbers c, d such that
(3, 2, 2) = (c − d, 2c + 2d, 3c + 4d).
Therefore, c − d = 3, 2c + 2d = 2, 3c + 4d = 2. These equations are consistent and their
solution is c = 2, d = −1. Hence γ ∈ L{α, β}, i.e., γ belongs to the subspace generated by
the vectors α, β.
Ex 4.4.6 Find the condition among x, y, z such that the vector (x, y, z) belongs to the space
generated by α = (2, 1, 0), β = (1, −1, 2), γ = (0, 3, −4).
Solution: If (x, y, z) ∈ L{α, β, γ}, then (x, y, z) can be expressed as a linear combination
of α, β, γ. Let (x, y, z) = c1α + c2β + c3γ; then,
(x, y, z) = c1(2, 1, 0) + c2(1, −1, 2) + c3(0, 3, −4)
= (2c1 + c2, c1 − c2 + 3c3, 2c2 − 4c3).
This gives 2c1 + c2 = x, c1 − c2 + 3c3 = y, 2c2 − 4c3 = z. Multiplying the second equation
by 2 and subtracting it from the first, we get,
3c2 − 6c3 = x − 2y or, c2 − 2c3 = (x − 2y)/3.
Again, from the third equation, c2 − 2c3 = z/2. Hence (x − 2y)/3 = z/2 or, 2x − 4y − 3z = 0,
which is the required condition.
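The derived condition can be stress-tested: every vector actually in the span must satisfy 2x − 4y − 3z = 0. An illustrative sketch (not part of the text), assuming the vectors α = (2, 1, 0), β = (1, −1, 2), γ = (0, 3, −4) as above, with random integer coefficients so the arithmetic is exact:

```python
import random

def combo(c1, c2, c3):
    """The vector c1*alpha + c2*beta + c3*gamma in the span."""
    a, b, g = (2, 1, 0), (1, -1, 2), (0, 3, -4)
    return tuple(c1 * ai + c2 * bi + c3 * gi for ai, bi, gi in zip(a, b, g))

random.seed(1)
for _ in range(100):
    x, y, z = combo(*(random.randint(-9, 9) for _ in range(3)))
    assert 2 * x - 4 * y - 3 * z == 0   # the condition derived above
print("condition 2x - 4y - 3z = 0 holds on the span")
```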
Theorem 4.4.1 The linear span L(S) of a non-empty subset S of a vector space V(F) is
the smallest subspace of V containing S.
Proof: Let S = {α1, α2, …, αn} ⊆ V and let,
L(S) = {c1α1 + c2α2 + … + cnαn : ci ∈ F}.
Now, L(S) is a nonempty subset of V. Let α1 ∈ S; then we can write,
α1 = 1·α1 ∈ L(S).
Therefore, α1 ∈ S ⇒ α1 ∈ L(S) ⇒ S ⊆ L(S).
Now, in order to show that L(S) is a subspace of V(F), let α and β be any two arbitrary
elements of L(S). Then each one of them is a linear combination of a finite number of
elements of S. Let,
α = a1α1 + a2α2 + … + amαm; αi ∈ S and ai ∈ F,
and β = b1β1 + b2β2 + … + bnβn; βj ∈ S and bj ∈ F.
Now, for a, b ∈ F, we have,

254

Vector Space

a + b = a

m
X

!
ai i

+b

i=1
m
X

n
X

bj j

j=1

(aai )i +

i=1

n
X

(bbj )j ,

j=1

which is a linear combination of finite number of elements of S and so is a member of L(S).


Thus,
L(S), L(S) a + b L(S); a, b F.
Thus L(S) is a subspace of V (F ). Next, let W be any subspace of V containing the set S
and let L(S). Then,
= c1 1 + c2 2 + . . . + cn n ,
for some scalars c1 , c2 , . . . , cn F . Since W is a subspace of V containing i and ci F , it
follows that
ci i W ; i = 1, 2, . . . , n.
Since W is a subspace and c1 1 , c2 2 , . . . , cn n W, it follows that,
c1 1 + c2 2 + . . . + cn n W
W.
Thus, L(S) W L(S) W.
Hence L(S) is the smallest subspace of V (F ) containing S and it is called the subspace
spanned or generated by S.
Theorem 4.4.2 If S and T be nonempty finite subsets of a vector space V (F ), then
(i) S ⊆ T ⇒ L(S) ⊆ L(T ).
(ii) S is a subspace of V ⇔ L(S) = S.
(iii) L{L(S)} = L(S).
(iv) L(S ∪ T ) = L(S) + L(T ).
Proof: (i) Let S = {α1 , α2 , . . . , αr } and T = {α1 , α2 , . . . , αr , αr+1 , . . . , αn } be two subsets
of a vector space V (F ) and let α ∈ L(S). Then for some scalars ci ∈ F , we have,
α = c1 α1 + c2 α2 + . . . + cr αr .
Thus, α ∈ L(S) ⇒ α = c1 α1 + . . . + cr αr + cr+1 αr+1 + . . . + cn αn ; cr+1 = cr+2 = . . . = cn = 0
⇒ α ∈ L(T ) ⇒ L(S) ⊆ L(T ).


(ii) Let S be a subspace of V and let α ∈ L(S). Then α is a linear combination of a finite
number of elements of S, i.e.,
α = c1 α1 + c2 α2 + . . . + cn αn ; for some ci ∈ F, αi ∈ S.
But S is a subspace of V and so is closed under linear combinations, so
c1 α1 + c2 α2 + . . . + cn αn ∈ S
and so α ∈ S and therefore, L(S) ⊆ S. As
α ∈ S ⇒ α = 1·α ∈ L(S)
and we know L(S) is the smallest subspace containing S, so S ⊆ L(S). Combining these
two, we conclude, L(S) = S. Conversely, let L(S) = S; then L(S) being a subspace of V ,
so is therefore S.
(iii) Let S1 = L(S), then S1 is a subspace of V (F ). Therefore, by (ii), we have,
L(S1 ) = S1 , i.e., L{L(S)} = L(S).
(iv) It has been already proved that the linear sum of two subspaces is a subspace and
the linear span of a non empty subset of a vector space is its subspace. Hence L(S) + L(T )
as well as L(S ∪ T ) is a subspace of V (F ). Let S and T be two finite sets, given by,
S = {α1 , α2 , . . . , αr } and T = {β1 , β2 , . . . , βk }
and let γ ∈ L(S ∪ T ). Then γ is a linear combination of a finite number of elements of S ∪ T .
Thus,
γ ∈ L(S ∪ T ) ⇒ γ = c1 α1 + . . . + cr αr + d1 β1 + . . . + dk βk ; for some ci , di ∈ F
= α + β (say); α = c1 α1 + . . . + cr αr ∈ L(S), β = d1 β1 + . . . + dk βk ∈ L(T )
⇒ γ ∈ L(S) + L(T )
⇒ L(S ∪ T ) ⊆ L(S) + L(T ). (4.5)
Again, let δ ∈ L(S) + L(T ), then,
δ = δ1 + δ2 ; for some δ1 ∈ L(S) and δ2 ∈ L(T )
= c1 α1 + . . . + cr αr + d1 β1 + . . . + dk βk ; for some ci , di ∈ F
⇒ δ ∈ L(S ∪ T ); as α1 , α2 , . . . , αr , β1 , β2 , . . . , βk ∈ S ∪ T.
So, δ ∈ L(S) + L(T ) ⇒ δ ∈ L(S ∪ T )
⇒ L(S) + L(T ) ⊆ L(S ∪ T ). (4.6)
Hence from (4.5) and (4.6) we have, L(S ∪ T ) = L(S) + L(T ).


Theorem 4.4.3 If S and T be two non empty finite subsets of a vector space V (F ) and
each element of T is a linear combination of the vectors of S, then L(T ) ⊆ L(S).
Proof: Let S = {α1 , α2 , . . . , αr }, T = {β1 , β2 , . . . , βm } and let βi = ci1 α1 + ci2 α2 + . . . + cir αr
for some cij ∈ F ; i = 1, 2, . . . , m and j = 1, 2, . . . , r. Let α be an element of L(T ), then
α = a1 β1 + a2 β2 + . . . + am βm ; for some ai ∈ F
= a1 (c11 α1 + . . . + c1r αr ) + a2 (c21 α1 + . . . + c2r αr ) + . . . + am (cm1 α1 + . . . + cmr αr )
= (a1 c11 + . . . + am cm1 )α1 + . . . + (a1 c1r + . . . + am cmr )αr
= d1 α1 + d2 α2 + . . . + dr αr ∈ L(S),
as di = a1 c1i + a2 c2i + . . . + am cmi ∈ F ; i = 1, 2, . . . , r.
Thus, α ∈ L(T ) ⇒ α ∈ L(S) and so L(T ) ⊆ L(S).


Ex 4.4.7 Determine the subspace of ℝ³ spanned by the vectors α = (1, 2, 3), β = (3, 1, 0).
Examine, if γ = (2, −1, 3) and δ = (−1, 3, 6) are in the subspace.
Solution: Let S = {α, β}, where α = (1, 2, 3) and β = (3, 1, 0). Then,
L(S) = L{α, β} = {cα + dβ; c, d ∈ ℝ}
= {c(1, 2, 3) + d(3, 1, 0); c, d ∈ ℝ}
= {(c + 3d, 2c + d, 3c); c, d ∈ ℝ} ⊆ ℝ³.
Let γ ∈ L(S), then there must be real numbers c, d such that,
(2, −1, 3) = (c + 3d, 2c + d, 3c)
⇒ c + 3d = 2, 2c + d = −1, 3c = 3.
These equations are inconsistent and so γ is not in L(S). Let δ ∈ L(S), then there must be
real numbers c, d such that,
(−1, 3, 6) = (c + 3d, 2c + d, 3c)
⇒ c + 3d = −1, 2c + d = 3, 3c = 6.
These equations are consistent with c = 2, d = −1, so that δ = 2α − β, showing that δ ∈ L(S).
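The two consistency checks above can be reproduced with a least-squares solve: the system has an exact solution precisely when the residual vanishes. This is an illustrative sketch, not part of the text.

```python
import numpy as np

# alpha = (1,2,3) and beta = (3,1,0) of Ex 4.4.7, as the columns of A.
A = np.array([[1.0, 3.0],
              [2.0, 1.0],
              [3.0, 0.0]])

def coords_in_span(v):
    # Least-squares solution of A c = v; v lies in L{alpha, beta}
    # iff A c reproduces v exactly (zero residual).
    c, *_ = np.linalg.lstsq(A, v, rcond=None)
    return c if np.allclose(A @ c, v) else None

gamma = np.array([2.0, -1.0, 3.0])   # inconsistent system: not in the span
delta = np.array([-1.0, 3.0, 6.0])   # consistent: delta = 2*alpha - beta
c_gamma, c_delta = coords_in_span(gamma), coords_in_span(delta)
```

For δ the routine returns the coordinates (2, −1), matching δ = 2α − β; for γ it returns None.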
Ex 4.4.8 In the vector space V3 (ℝ), consider the vectors α = (1, 2, 1), β = (3, 1, 5), γ =
(−3, 4, −7). Show that the subspaces spanned by S = {α, β} and T = {α, β, γ} are the same.
Solution: Since S ⊆ T, we have L(S) ⊆ L(T ). Now, we are to show that γ can be expressed
as a linear combination of α and β. Let γ = aα + bβ, for some scalars a, b ∈ F , then,
(−3, 4, −7) = a(1, 2, 1) + b(3, 1, 5) = (a + 3b, 2a + b, a + 5b)
⇒ a + 3b = −3, 2a + b = 4, a + 5b = −7 ⇒ a = 3, b = −2.
Therefore, γ = 3α − 2β. Now, let δ be an arbitrary element of L(T ). Then,
δ = a1 α + a2 β + a3 γ ; for some scalars a1 , a2 , a3 ∈ F
= a1 α + a2 β + a3 (3α − 2β)
= (a1 + 3a3 )α + (a2 − 2a3 )β ∈ L(S).
⇒ δ ∈ L(S) ⇒ L(T ) ⊆ L(S).
Therefore, L(S) = L(T ) and so the subspaces spanned by S = {α, β} and T = {α, β, γ} are
the same.
Ex 4.4.9 Let S = {α, β, γ} and T = {α, β, α + β, β + γ} be subsets of a real vector space
V . Show that, L(S) = L(T ).
Solution: S and T are finite subsets of V and each element of T is a linear combination of
the vectors of S and therefore L(T ) ⊆ L(S). Again,
α = α + 0β + 0(α + β) + 0(β + γ)
β = 0α + β + 0(α + β) + 0(β + γ)
γ = 0α − β + 0(α + β) + (β + γ).
This shows that, each element of S is a linear combination of the vectors of T , and so
L(S) ⊆ L(T ). It follows that L(S) = L(T ).


Ex 4.4.10 If α = (1, 2, −1), β = (2, −3, 2), γ = (4, 1, 3) and δ = (−3, 1, 2) ∈ ℝ³(ℝ), prove that
L({α, β}) ≠ L({γ, δ}).
Solution: Let, if possible, L({α, β}) = L({γ, δ}); then ∃ scalars x, y ∈ ℝ for arbitrary
a, b ∈ ℝ, such that, xα + yβ = aγ + bδ, i.e.,
x(1, 2, −1) + y(2, −3, 2) = a(4, 1, 3) + b(−3, 1, 2)
⇒ (x + 2y, 2x − 3y, −x + 2y) = (4a − 3b, a + b, 3a + 2b)
⇒ x + 2y = 4a − 3b, 2x − 3y = a + b, −x + 2y = 3a + 2b.
From the first and third, we have, x = (1/2)(a − 5b), y = (1/4)(7a − b). Now,
2x − 3y = −(17/4)(a + b) ≠ a + b, in general.
Therefore, these equations are inconsistent and so L({α, β}) ≠ L({γ, δ}).
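Equality of two spans can also be tested mechanically: two finite sets span the same subspace exactly when neither adds rank to the other. The check below is an illustrative sketch, not part of the text.

```python
import numpy as np

def same_span(U, W):
    # Row spaces agree iff stacking the two sets adds no rank to either.
    r = np.linalg.matrix_rank
    return r(U) == r(W) == r(np.vstack([U, W]))

# Rows: alpha, beta and gamma, delta of Ex 4.4.10.
U = np.array([[1.0, 2.0, -1.0], [2.0, -3.0, 2.0]])
W = np.array([[4.0, 1.0, 3.0], [-3.0, 1.0, 2.0]])
equal = same_span(U, W)
```

Here both pairs have rank 2 but the stacked matrix has rank 3, so the two planes differ.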


Theorem 4.4.4 The linear sum of two subspaces W1 and W2 of a vector space V (F ) is
generated by their union: W1 + W2 = L(W1 ∪ W2 ).
Proof: It has been already proved that the linear sum of two subspaces is a subspace
and the linear span of a non empty subset of a vector space is its subspace. Consequently,
W1 + W2 as well as L(W1 ∪ W2 ) is a subspace of V (F ). Now, let α ∈ W1 + W2 , then
α = α1 + α2 ; for some α1 ∈ W1 and α2 ∈ W2
= 1·α1 + 1·α2 .
Therefore, α is a linear combination of α1 , α2 ∈ W1 ∪ W2 and so, α ∈ L(W1 ∪ W2 ). Thus,
α ∈ W1 + W2 ⇒ α ∈ L(W1 ∪ W2 )
⇒ W1 + W2 ⊆ L(W1 ∪ W2 ).
Again, W1 ⊆ W1 + W2 and W2 ⊆ W1 + W2
⇒ W1 ∪ W2 ⊆ W1 + W2 .
But, L(W1 ∪ W2 ) being the smallest subspace containing W1 ∪ W2 , so L(W1 ∪ W2 ) ⊆ W1 + W2 .
Therefore L(W1 ∪ W2 ) = W1 + W2 .

4.4.2 Linear Dependence and Independence

The concept of linear dependence and independence plays an important role in the theory
of linear algebra and in mathematics in general.
Definition 4.4.1 Let V (F ) be a vector space. A finite set of vectors S = {α1 , α2 , . . . , αn }
of V (F ) is said to be linearly dependent (LD) if ∃ scalars c1 , c2 , . . . , cn ∈ F , not all zero,
such that,
c1 α1 + c2 α2 + . . . + cn αn = θ. (4.7)
An arbitrary set S of vectors of a vector space V (F ) is said to be linearly dependent in V if
∃ a finite subset of S which is linearly dependent in V .
Ex 4.4.11 Prove that the set of vectors {α1 , α2 , α3 }, where α1 = (2, 2, 3), α2 = (0, −4, −1)
and α3 = (3, 1, 4) in ℝ³(ℝ) is linearly dependent.
Solution: Let c1 , c2 , c3 ∈ ℝ be three scalars such that c1 α1 + c2 α2 + c3 α3 = θ holds. Then,
c1 (2, 2, 3) + c2 (0, −4, −1) + c3 (3, 1, 4) = θ = (0, 0, 0)
⇒ (2c1 + 3c3 ; 2c1 − 4c2 + c3 ; 3c1 − c2 + 4c3 ) = (0, 0, 0)
⇒ 2c1 + 3c3 = 0; 2c1 − 4c2 + c3 = 0; 3c1 − c2 + 4c3 = 0,
a non-trivial solution of which is c1 = 3, c2 = 1, c3 = −2.
So, ∃ scalars c1 = 3, c2 = 1, c3 = −2, not all zero, such that c1 α1 + c2 α2 + c3 α3 = θ holds.
Thus, {α1 , α2 , α3 } is linearly dependent in ℝ³.
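A dependence relation like the one above is a nonzero vector in the kernel of the matrix whose columns are the given vectors. The symbolic check below is an illustrative sketch (with the signs as read here), not part of the text.

```python
import sympy as sp

# Columns are alpha1, alpha2, alpha3 of Ex 4.4.11.
A = sp.Matrix([[2, 0, 3],
               [2, -4, 1],
               [3, -1, 4]])
kernel = A.nullspace()   # a nonzero kernel vector encodes a dependence relation
relation = kernel[0] if kernel else None
```

The one-dimensional kernel is spanned by a multiple of (3, 1, −2), recovering 3α1 + α2 − 2α3 = θ.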
Ex 4.4.12 If C is the field of complex numbers, prove that the vectors (x1 , y1 ), (x2 , y2 ) ∈
V2 (C) are linearly dependent if and only if x1 y2 − x2 y1 = 0.
Solution: Let a, b ∈ C, then,
a(x1 , y1 ) + b(x2 , y2 ) = (ax1 + bx2 , ay1 + by2 ) = (0, 0)
⇒ ax1 + bx2 = 0; ay1 + by2 = 0.
The necessary and sufficient condition for these equations to possess a common non-zero
solution (a, b) is that,
| x1 y1 |
| x2 y2 | = 0, i.e., x1 y2 − x2 y1 = 0.
Hence, the given vectors are linearly dependent if and only if x1 y2 − x2 y1 = 0.
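The criterion is easy to exercise on concrete pairs; the instances below are illustrative, not from the text.

```python
# Check x1*y2 - x2*y1 = 0 on one dependent and one independent pair in C^2.

def det2(v, w):
    # 2x2 determinant formed from two vectors of C^2.
    (x1, y1), (x2, y2) = v, w
    return x1 * y2 - x2 * y1

dep = det2((1 + 1j, 2), (2 + 2j, 4))   # second vector = 2 * first: dependent
ind = det2((1, 0), (0, 1))             # standard basis pair: independent
```

The dependent pair gives determinant 0; the independent pair gives 1.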
Definition 4.4.2 Let V (F ) be a vector space. A finite set of vectors S = {α1 , α2 , . . . , αn }
of V (F ) is said to be linearly independent (LI) if for scalars c1 , c2 , . . . , cn ∈ F ,
c1 α1 + c2 α2 + . . . + cn αn = θ ⇒ c1 = c2 = . . . = cn = 0. (4.8)
An arbitrary set S of vectors of a vector space V (F ) is said to be linearly independent in V if
every finite subset of S is linearly independent in V .
Ex 4.4.13 Prove that the set of vectors α1 = (2, 1, 4), α2 = (−3, 2, 1) and α3 = (1, −3, −2)
in V3 (ℝ) is linearly independent.
Solution: Let c1 , c2 , c3 ∈ ℝ be three scalars such that c1 α1 + c2 α2 + c3 α3 = θ holds. Then,
c1 (2, 1, 4) + c2 (−3, 2, 1) + c3 (1, −3, −2) = θ = (0, 0, 0)
⇒ (2c1 − 3c2 + c3 ; c1 + 2c2 − 3c3 ; 4c1 + c2 − 2c3 ) = (0, 0, 0)
⇒ 2c1 − 3c2 + c3 = 0; c1 + 2c2 − 3c3 = 0; 4c1 + c2 − 2c3 = 0
⇒ c1 = 0, c2 = 0, c3 = 0.
So, c1 α1 + c2 α2 + c3 α3 = θ holds only for c1 = c2 = c3 = 0.
Thus, {α1 , α2 , α3 } is linearly independent in ℝ³.
Ex 4.4.14 In the vector space P [x] of all polynomials over the field F , the infinite set
S = {1, x, x², . . .} is linearly independent.
Solution: In order to show that the given infinite set S is linearly independent, we must
show that every finite subset of S is linearly independent. Let A = {x^m1 , x^m2 , . . . , x^mr }
be an arbitrary finite subset of S, so that each mi is a distinct non negative integer. Now, let
a1 , a2 , . . . , ar ∈ F be r scalars such that,
a1 x^m1 + a2 x^m2 + . . . + ar x^mr = θ
holds, θ being the zero polynomial. Since this holds identically in x, we have by the definition of equality of
polynomials, a1 = a2 = . . . = ar = 0. This shows that A is linearly independent and hence S
is linearly independent.

Linear Combination of Vectors

259

Ex 4.4.15 Find the values of x such that the vectors (1, 2, 1), (x, 3, 1) and (2, x, 0) are
linearly dependent. [WBUT 2005]
Solution: If the given vectors are linearly dependent, then
c1 (1, 2, 1) + c2 (x, 3, 1) + c3 (2, x, 0) = θ has a non-trivial solution, which requires
| 1 2 1 |
| x 3 1 | = 0 ⇒ 1(0 − x) − 2(0 − 2) + 1(x² − 6) = 0
| 2 x 0 |
or, (x − 2)(x + 1) = 0 or, x = −1, 2.
Hence the required values of x are −1, 2.
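The determinant condition can be solved symbolically; the snippet below is an illustrative sketch, not part of the text.

```python
import sympy as sp

x = sp.symbols('x')
# Rows are the three vectors of Ex 4.4.15; they are dependent iff det = 0.
M = sp.Matrix([[1, 2, 1],
               [x, 3, 1],
               [2, x, 0]])
poly = sp.factor(M.det())     # factors as (x - 2)*(x + 1)
roots = sp.solve(M.det(), x)
```

The determinant expands to x² − x − 2, giving the two values found above.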
Ex 4.4.16 Prove that, the vector space of all periodic functions f (x) with period T contains
an infinite set of linearly independent vectors.
Solution: Let us consider the infinite set of periodic functions
S = { 1, cos(2nπx/T ), sin(2nπx/T ) ; n ∈ N }.
Consider a finite subset Sn (for each positive integer n) of S as,
Sn = { 1, cos(2πx/T ), sin(2πx/T ), . . . , cos(2nπx/T ), sin(2nπx/T ) }.
To prove Sn is linearly independent, let ∃ scalars a0 , a1 , . . . , an , b1 , b2 , . . . , bn such that,
a0 + Σ(r=1 to n) [ ar cos(2rπx/T ) + br sin(2rπx/T ) ] = 0. (i)
Integrating both sides of (i) from 0 to T ,
a0 T + Σ(r=1 to n) [ ar ∫₀ᵀ cos(2rπx/T ) dx + br ∫₀ᵀ sin(2rπx/T ) dx ] = 0
⇒ a0 T = 0 ⇒ a0 = 0.
Now we use the following integration formulas,
∫₀ᵀ cos(2rπx/T ) cos(2kπx/T ) dx = ∫₀ᵀ sin(2rπx/T ) sin(2kπx/T ) dx = (T /2)δrk
and, ∫₀ᵀ sin(2rπx/T ) cos(2kπx/T ) dx = 0,
where δrk = 1 if r = k and δrk = 0 otherwise. Multiplying both sides of (i) by cos(2kπx/T )
and integrating from 0 to T , we get,
a0 ·0 + Σ(r=1 to n) [ ar (T /2)δrk + br ·0 ] = 0 ⇒ ak = 0; k = 1, 2, . . . , n.
Similarly, multiplying both sides of (i) by sin(2kπx/T ) and integrating from 0 to T , we get,
bk = 0; k = 1, 2, . . . , n.
Thus, Sn is linearly independent for every positive integer n and consequently S is linearly
independent.
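The orthogonality formulas that drive this argument can be verified symbolically for sample indices; this check is an illustration, not part of the text.

```python
import sympy as sp

x, T = sp.symbols('x T', positive=True)
r, k = 2, 3   # one sample pair with r != k
# The three orthogonality relations used in Ex 4.4.16.
cc = sp.integrate(sp.cos(2*r*sp.pi*x/T) * sp.cos(2*k*sp.pi*x/T), (x, 0, T))
sc = sp.integrate(sp.sin(2*r*sp.pi*x/T) * sp.cos(2*k*sp.pi*x/T), (x, 0, T))
c2 = sp.integrate(sp.cos(2*r*sp.pi*x/T)**2, (x, 0, T))   # the r = k case: T/2
```

The cross terms vanish while the squared term integrates to T/2, exactly as the formulas state.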


Ex 4.4.17 Prove that, if α, β, γ are linearly independent vectors of a complex vector space
V (C), then so also are α + β, β + γ, γ + α.
Solution: Let a, b, c ∈ C be any three scalars such that
a(α + β) + b(β + γ) + c(γ + α) = θ
⇒ (a + c)α + (a + b)β + (b + c)γ = θ
⇒ a + c = a + b = b + c = 0; as {α, β, γ} is linearly independent.
As
| 1 0 1 |
| 1 1 0 | = 2 ≠ 0,
| 0 1 1 |
the given homogeneous system has a unique solution, and the unique
solution is trivial. Thus, a = b = c = 0, which shows that {α + β, β + γ, γ + α} is linearly
independent.
Theorem 4.4.5 A set containing a single non null vector is linearly independent.
Proof: Let S = {α}; α ≠ θ, be a subset of a vector space V (F ). Let for some scalar a ∈ F ,
we have
aα = θ ⇒ a = 0; as α ≠ θ.
Therefore, the set is linearly independent. The empty set ∅ is also (vacuously) linearly independent.
Theorem 4.4.6 Every set of vectors containing the null vector is linearly dependent.
Proof: Let S = {α1 , α2 , . . . , αr , . . . , αm }, where at least one of them, say αr = θ. Then, it
is clear that,
0α1 + 0α2 + . . . + 1·αr + . . . + 0αm = θ,
i.e., c1 α1 + c2 α2 + . . . + cm αm = θ holds, where cr = 1 ≠ 0.
Hence S is linearly dependent. Thus, we conclude, if the set S = {α1 , α2 , . . . , αm } of vectors
in a vector space V (F ) is linearly independent, then none of the vectors in S can be a zero
vector.
Theorem 4.4.7 Every subset of a linearly independent set is linearly independent.
Proof: Let the set S = {α1 , α2 , . . . , αm } be a linearly independent set of vectors and let
T = {α1 , α2 , . . . , αr }; 1 ≤ r ≤ m,
be its subset. Let for some scalars c1 , c2 , . . . , cr ∈ F, we have,
c1 α1 + c2 α2 + . . . + cr αr = θ
⇒ c1 α1 + c2 α2 + . . . + cr αr + 0αr+1 + . . . + 0αm = θ
⇒ c1 = c2 = . . . = cr = 0; as S is linearly independent.
This shows that T is linearly independent.
Ex 4.4.18 Prove that the four vectors x = (1, 0, 0), y = (0, 1, 0), z = (0, 0, 1), u = (1, 1, 1) in
ℝ³ form a linearly dependent subset of ℝ³, but any three of them are linearly independent.
Solution: Let us consider the relation ax + by + cz + du = θ
or, a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1) + d(1, 1, 1) = (0, 0, 0)
or, (a + d, b + d, c + d) = (0, 0, 0)
or, a + d = 0, b + d = 0, c + d = 0 or, a = b = c = −d.
Let d = 1. Therefore, a = b = c = −1. Hence x, y, z, u are linearly dependent and the
relation is
(1, 0, 0) + (0, 1, 0) + (0, 0, 1) − (1, 1, 1) = θ.
Let us take the three vectors x, y, z; then a = b = c = 0. Hence x, y, z are linearly independent.
If we take y, z, u then
d = 0, b + d = 0, c + d = 0 or, b = c = d = 0.
Thus y, z, u are linearly independent. Similarly, {x, y, u} and {x, z, u} are
linearly independent.
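These dependence claims reduce to rank computations: four vectors in ℝ³ have rank at most 3, while each triple here has full rank. The check below is an illustrative sketch, not part of the text.

```python
import numpy as np

# The four vectors of Ex 4.4.18.
x, y, z, u = np.array([[1.0, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]])
rank_all = np.linalg.matrix_rank(np.vstack([x, y, z, u]))       # 3 < 4: dependent
triple_ranks = [np.linalg.matrix_rank(np.vstack(t))
                for t in [(x, y, z), (x, y, u), (x, z, u), (y, z, u)]]
```

All four triples have rank 3, so each is linearly independent, while the full set of four is dependent.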
Theorem 4.4.8 Every superset of a linearly dependent set is linearly dependent.
Proof: Let S = {α1 , α2 , . . . , αr } be a linearly dependent set of vectors and let
T = {α1 , α2 , . . . , αr , αr+1 , . . . , αm }
be its superset. Now, S being linearly dependent, ∃ scalars c1 , c2 , . . . , cr , not all zero, such
that,
c1 α1 + c2 α2 + . . . + cr αr = θ
⇒ c1 α1 + c2 α2 + . . . + cr αr + cr+1 αr+1 + . . . + cm αm = θ,
where cr+1 = . . . = cm = 0.
Thus ∃ scalars c1 , c2 , . . . , cm , not all zero, such that c1 α1 + . . . + cm αm = θ holds. Hence the
set T is linearly dependent.
Theorem 4.4.9 A set {α1 , α2 , . . . , αn } of nonzero vectors in a vector space V (F ) is linearly
dependent, if and only if, some αk (2 ≤ k ≤ n) in the set is a linear combination of its
preceding vectors α1 , α2 , . . . , αk−1 .
Proof: Let S = {α1 , α2 , . . . , αn } be a linearly dependent set of non null vectors. Since
α1 ≠ θ, the set {α1 } is linearly independent.
Let k be the first integer such that {α1 , α2 , . . . , αk } is linearly dependent. Clearly, 2 ≤ k ≤
n. Thus, ∃ scalars c1 , c2 , . . . , ck , not all zero, such that
c1 α1 + c2 α2 + . . . + ck αk = θ.
Here ck ≠ 0, for, otherwise {α1 , α2 , . . . , αk−1 } would be linearly dependent and the same
would contradict our hypothesis that k is the first integer between 2 and n for which
{α1 , α2 , . . . , αk } is linearly dependent. Hence as ck ≠ 0, ck⁻¹ ∈ F exists. Thus, from
the above relation, we get,
αk = −ck⁻¹c1 α1 − ck⁻¹c2 α2 − . . . − ck⁻¹ck−1 αk−1 ,
i.e., αk is a linear combination of the preceding vectors α1 , α2 , . . . , αk−1 of the set.
Conversely, let some αk (2 ≤ k ≤ n) be a linear combination of the preceding vectors
α1 , α2 , . . . , αk−1 , so that ∃ scalars c1 , c2 , . . . , ck−1 , such that,
αk = c1 α1 + c2 α2 + . . . + ck−1 αk−1
⇒ c1 α1 + c2 α2 + . . . + ck−1 αk−1 + (−1)αk = θ,
i.e., c1 α1 + . . . + ck αk = θ, where ck = −1 ≠ 0.
Since the above equality holds for scalars c1 , c2 , . . . , ck−1 , −1 in F and one of them is non
zero, the set of vectors {α1 , α2 , . . . , αk } is linearly dependent. Since every superset of
a linearly dependent set is linearly dependent, it follows that {α1 , α2 , . . . , αk , . . . , αn } is
linearly dependent. From this theorem, it is observed that if {α1 , α2 , . . . , αn } is a linearly
independent set of vectors in a vector space, then the vectors must be distinct and none can be
the zero vector.


Theorem 4.4.10 A set of vectors {α1 , α2 , . . . , αn } in a vector space V (F ) is linearly
dependent, if and only if, at least one of the vectors of the set can be expressed as a linear
combination of the others.
Proof: Since the set S = {α1 , α2 , . . . , αn } is linearly dependent, ∃ scalars c1 , c2 , . . . , cn ∈
F , not all zero, such that,
c1 α1 + c2 α2 + . . . + cn αn = θ.
Let ck ≠ 0, then ck⁻¹ exists, ck⁻¹ ∈ F and ck⁻¹ck = ck ck⁻¹ = 1, the identity element in F .
Now,
ck αk = −c1 α1 − c2 α2 − . . . − ck−1 αk−1 − ck+1 αk+1 − . . . − cn αn
⇒ αk = −ck⁻¹c1 α1 − ck⁻¹c2 α2 − . . . − ck⁻¹ck−1 αk−1 − ck⁻¹ck+1 αk+1 − . . . − ck⁻¹cn αn
= d1 α1 + d2 α2 + . . . + dk−1 αk−1 + dk+1 αk+1 + . . . + dn αn ,
where dr = −ck⁻¹cr ; r = 1, 2, . . . , k − 1, k + 1, . . . , n.
This shows that αk is a linear combination of the vectors α1 , α2 , . . . , αk−1 , αk+1 , . . . , αn .
Conversely, let one of the vectors αj of the set S be a linear combination of the other vectors
of the set S. Then for some scalars ci ∈ F (i = 1, 2, . . . , j − 1, j + 1, . . . , n), we have,
αj = c1 α1 + c2 α2 + . . . + cj−1 αj−1 + cj+1 αj+1 + . . . + cn αn
⇒ c1 α1 + c2 α2 + . . . + cj−1 αj−1 + (−1)αj + cj+1 αj+1 + . . . + cn αn = θ.
Here all the scalars in the equality belong to F and since at least one of them is non zero,
S = {α1 , α2 , . . . , αn } is linearly dependent.
Theorem 4.4.11 If S is a linearly independent subset of the vector space V (F ) and L(S) =
V , then no proper subset of S can span V .
Proof: Let T be a proper subset of S and L(T ) = V . Since T is a proper subset, there is a
vector α ∈ S − T . Now, α ∈ V, so α is a linear combination of the vectors of T . Hence T ∪ {α}
is linearly dependent. But T ∪ {α} ⊆ S, so it must be linearly independent, a contradiction.
Thus T can not span V .

4.5 Basis and Dimension

Here we are to discuss the structure of a vector space V (F ) by determining a smallest set
of vectors in V that completely describes V .
Definition 4.5.1 Basis of a vector space : Let V (F ) be a vector space. A nonempty
subset S of vectors in V (F ) is said to be its basis, if
(i) S is linearly independent in V , and
(ii) S generates V ; i.e., L(S) = V .
If α1 , α2 , . . . , αn form a basis for a vector space V (F ), then they must be distinct and non
null, so we write them as a set S = {α1 , α2 , . . . , αn }. Note that, if {α1 , α2 , . . . , αn } be a
basis for the vector space V (F ), then {cα1 , α2 , . . . , αn } is also a basis, when c ≠ 0. Thus,
a basis for a non zero vector space is never unique.
Definition 4.5.2 Dimension of a vector space : Let V (F ) be a vector space. The vector
space V (F ) is said to be finite dimensional or finitely generated if there exists a finite subset
S of vectors in V , such that V = L(S).
(i) A vector space which is not finitely generated is known as an infinite dimensional
vector space.
(ii) The null space {θ} has no basis, since its only nonempty subset {θ} is linearly dependent;
it is finite dimensional, being generated by the empty set ∅, and so the vector space {θ} is said
to be of dimension 0.
The number of elements in any basis set S of a finite dimensional vector space
V (F ) is called the dimension of the vector space and is denoted by dim V . For example, the set
S = {e1 , e2 , . . . , en }, where e1 = (1, 0, . . . , 0), e2 = (0, 1, 0, . . . , 0), . . . , en = (0, 0, . . . , 1), is a
basis of ℝⁿ(ℝ). This is known as the standard basis of ℝⁿ.
Ex 4.5.1 Show that the vectors {α1 , α2 , α3 }, where α1 = (1, 0, −1), α2 = (1, 2, 1) and α3 =
(0, −3, 2), form a basis of ℝ³. Express each of the standard basis vectors as a linear combination
of these vectors.
Solution: Let S = {α1 , α2 , α3 }. To show S is linearly independent, let ∃ scalars c1 , c2 , c3 ∈
ℝ, such that,
c1 α1 + c2 α2 + c3 α3 = θ
⇒ c1 (1, 0, −1) + c2 (1, 2, 1) + c3 (0, −3, 2) = θ
⇒ (c1 + c2 , 2c2 − 3c3 , −c1 + c2 + 2c3 ) = (0, 0, 0).
Thus, we obtain the linear system of equations,
c1 + c2 = 2c2 − 3c3 = −c1 + c2 + 2c3 = 0,
whose coefficient determinant
|  1 1  0 |
|  0 2 −3 | = 10 ≠ 0.
| −1 1  2 |
Thus, the homogeneous system has the unique solution c1 = c2 = c3 = 0, which shows that
S is linearly independent. To show that S spans V3 (ℝ), let (a, b, c) be an arbitrary element
in V3 (ℝ). We now seek constants c1 , c2 , c3 ∈ ℝ, such that
(a, b, c) = c1 α1 + c2 α2 + c3 α3 = (c1 + c2 , 2c2 − 3c3 , −c1 + c2 + 2c3 )
⇒ c1 = (1/10)(7a − 2b − 3c), c2 = (1/10)(3a + 2b + 3c), c3 = (1/5)(a − b + c).
Thus (a, b, c) ∈ V3 (ℝ) can be written as,
(a, b, c) = (1/10)(7a − 2b − 3c)(1, 0, −1) + (1/10)(3a + 2b + 3c)(1, 2, 1) + (1/5)(a − b + c)(0, −3, 2),
i.e., every element in V3 (ℝ) can be expressed as a linear combination of elements of S and
so L(S) = V3 (ℝ), and consequently S is a basis of V3 (ℝ). Therefore,
(1, 0, 0) = (7/10)(1, 0, −1) + (3/10)(1, 2, 1) + (1/5)(0, −3, 2)
(0, 1, 0) = −(1/5)(1, 0, −1) + (1/5)(1, 2, 1) − (1/5)(0, −3, 2)
(0, 0, 1) = −(3/10)(1, 0, −1) + (3/10)(1, 2, 1) + (1/5)(0, −3, 2).
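The coordinates above are the solutions of a linear system, so they can be recomputed symbolically; the snippet below is an illustrative sketch, not part of the text.

```python
import sympy as sp

# Columns are alpha1, alpha2, alpha3 of Ex 4.5.1.
A = sp.Matrix([[1, 1, 0],
               [0, 2, -3],
               [-1, 1, 2]])
d = A.det()                           # 10, so the columns form a basis
e1 = A.solve(sp.Matrix([1, 0, 0]))    # coordinates of (1, 0, 0) in this basis
```

The solve reproduces the coefficients (7/10, 3/10, 1/5) found for (1, 0, 0).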

Ex 4.5.2 Let W = {(x, y, z) ∈ ℝ³ : x − 4y + 3z = 0}. Show that W is a subspace of ℝ³.
Find a basis and the dimension of the subspace W of ℝ³.
Solution: As in the previous section, it is easily verified that W is a subspace of ℝ³. Let
ξ = (a, b, c) ∈ W , then a, b, c ∈ ℝ satisfy a − 4b + 3c = 0. Therefore,
ξ = (a, b, c) = (4b − 3c, b, c) = b(4, 1, 0) + c(−3, 0, 1).
Let α = (4, 1, 0) and β = (−3, 0, 1), then,
α, β ∈ W and ξ = bα + cβ; b, c ∈ ℝ
⇒ ξ ∈ L{α, β} ⇒ W ⊆ L{α, β}.
Again, α ∈ W, β ∈ W , so L{α, β} ⊆ W , which gives W = L{α, β}. Now,
c1 α + c2 β = θ; c1 , c2 ∈ ℝ
⇒ c1 (4, 1, 0) + c2 (−3, 0, 1) = θ
⇒ 4c1 − 3c2 = 0, c1 = 0, c2 = 0 ⇒ c1 = c2 = 0.
Therefore, α, β are linearly independent in W . Hence {α, β} is a basis of W and dim W = 2.
Ex 4.5.3 Show that W = {(x, y, z) ∈ ℝ³ : 2x − y + 3z = 0 and x + y + z = 0} is a subspace
of ℝ³. Find a basis of W . What is its dimension?
Solution: W is non-empty as (0, 0, 0) ∈ W . Let α = (a1 , a2 , a3 ), β = (b1 , b2 , b3 ) and
α, β ∈ W . Then,
2a1 − a2 + 3a3 = 0, a1 + a2 + a3 = 0
and 2b1 − b2 + 3b3 = 0, b1 + b2 + b3 = 0.
Also, 2(c1 a1 + c2 b1 ) − (c1 a2 + c2 b2 ) + 3(c1 a3 + c2 b3 )
= c1 (2a1 − a2 + 3a3 ) + c2 (2b1 − b2 + 3b3 ) = 0
and (c1 a1 + c2 b1 ) + (c1 a2 + c2 b2 ) + (c1 a3 + c2 b3 )
= c1 (a1 + a2 + a3 ) + c2 (b1 + b2 + b3 ) = 0.
Now,
c1 α + c2 β = (c1 a1 + c2 b1 , c1 a2 + c2 b2 , c1 a3 + c2 b3 ).
Therefore c1 α + c2 β ∈ W . Hence W is a subspace of ℝ³. Let ξ = (a, b, c) be any vector of W .
Then 2a − b + 3c = 0 and a + b + c = 0. Solving these two equations, we get a = −4c/3, b = c/3.
Therefore ξ = (−4c/3, c/3, c) = (c/3)(−4, 1, 3).
Since c is arbitrary and any vector of W can be expressed in terms of (−4, 1, 3), therefore,
W = L{(−4, 1, 3)}.
Hence {(−4, 1, 3)} is a basis of W . Since the number of vectors in this basis is 1, the
dimension of W is 1.
Ex 4.5.4 Show that W = {(x1 , x2 , x3 , x4 ) ∈ ℝ⁴ : x1 − x2 + x3 − x4 = 0} is a subspace of
the four dimensional real vector space ℝ⁴. Find the dimension of W .
Solution: Let α = (a1 , a2 , a3 , a4 ), β = (b1 , b2 , b3 , b4 ) be two vectors of W . Then
a1 − a2 + a3 − a4 = 0 and b1 − b2 + b3 − b4 = 0.
Therefore, (ca1 + db1 ) − (ca2 + db2 ) + (ca3 + db3 ) − (ca4 + db4 )
= c(a1 − a2 + a3 − a4 ) + d(b1 − b2 + b3 − b4 ) = 0.
Then,
cα + dβ = c(a1 , a2 , a3 , a4 ) + d(b1 , b2 , b3 , b4 )
= (ca1 + db1 , ca2 + db2 , ca3 + db3 , ca4 + db4 ) ∈ W.
Hence W is a subspace of ℝ⁴. Now, let α = (a1 , a2 , a3 , a4 ) ∈ W , where a1 − a2 + a3 − a4 = 0.
Then
α = (a1 , a2 , a3 , a1 − a2 + a3 ) = a1 (1, 0, 0, 1) + a2 (0, 1, 0, −1) + a3 (0, 0, 1, 1).
Since a1 , a2 , a3 are arbitrary and α is a linear combination of (1, 0, 0, 1), (0, 1, 0, −1), (0, 0, 1, 1),
therefore,
W = L{(1, 0, 0, 1), (0, 1, 0, −1), (0, 0, 1, 1)}.
Again, the vectors (1, 0, 0, 1), (0, 1, 0, −1), (0, 0, 1, 1) are linearly independent, since
a(1, 0, 0, 1) + b(0, 1, 0, −1) + c(0, 0, 1, 1) = θ
implies a = b = c = 0. Therefore, {(1, 0, 0, 1), (0, 1, 0, −1), (0, 0, 1, 1)} is a basis of W .
Since there are three vectors in the basis of W , the dimension of W is 3.
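W is the kernel of the single linear constraint x1 − x2 + x3 − x4 = 0, so a basis can also be produced mechanically; the check below is an illustrative sketch, not part of the text.

```python
import sympy as sp

# W is the kernel of the 1 x 4 constraint matrix of Ex 4.5.4.
A = sp.Matrix([[1, -1, 1, -1]])
basis = A.nullspace()   # one basis vector per free variable
dim_W = len(basis)
```

A single constraint on ℝ⁴ leaves three free variables, so the nullspace has dimension 3.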


Ex 4.5.5 Show that, S = {t² + 1, t − 1, 2t + 2} is a basis for the vector space P2 .
Solution: To do this, we must show that S spans P2 and is linearly independent. To show
that it spans P2 , we take any vector in P2 , i.e., a polynomial at² + bt + c, and must find constants
k1 , k2 and k3 such that,
at² + bt + c = k1 (t² + 1) + k2 (t − 1) + k3 (2t + 2)
= k1 t² + (k2 + 2k3 )t + (k1 − k2 + 2k3 ).
Since the polynomials agree for all values of t only if the coefficients of the respective powers
of t agree, we get the linear system,
a = k1 ; b = k2 + 2k3 ; c = k1 − k2 + 2k3
⇒ k1 = a; k2 = (1/2)(a + b − c); k3 = (1/4)(b + c − a).
Hence S spans P2 . To show that S is linearly independent, we form,
k1 (t² + 1) + k2 (t − 1) + k3 (2t + 2) = θ
⇒ k1 t² + (k2 + 2k3 )t + (k1 − k2 + 2k3 ) = θ
⇒ k1 = 0; k2 + 2k3 = 0; k1 − k2 + 2k3 = 0.
The only solution to this homogeneous system is k1 = k2 = k3 = 0, which implies that S is
linearly independent. Thus, S is a basis for P2 . The set of vectors {tⁿ, tⁿ⁻¹, . . . , t, 1} forms
a basis for the vector space Pn , called the natural or standard basis for Pn .
Ex 4.5.6 Find a basis for the subspace V of P2 consisting of all vectors of the form pt² +
qt + s, where s = p − q.
Solution: Every vector in V , being of the form pt² + qt + s with s = p − q, can be written as
pt² + qt + p − q = p(t² + 1) + q(t − 1),
so the vectors t² + 1 and t − 1 span V . Moreover, these vectors are linearly independent
because neither one is a multiple of the other. This conclusion could also be reached by
writing the equation,
k1 (t² + 1) + k2 (t − 1) = θ
⇒ k1 t² + k2 t + (k1 − k2 ) = θ.
Since this equation is to hold for all values of t, we must have k1 = 0 and k2 = 0. Hence
{t² + 1, t − 1} is a basis of V and dim V = 2.
Ex 4.5.7 Prove that the set S = {(1, 2, 1), (2, 1, 1), (1, 1, 2)} is a basis of ℝ³.
Solution: Let α = (1, 2, 1), β = (2, 1, 1), γ = (1, 1, 2). Now,
    | 1 2 1 |
Δ = | 2 1 1 | = 1(2 − 1) − 2(4 − 1) + 1(2 − 1) = −4 ≠ 0.
    | 1 1 2 |
Hence {α, β, γ} is linearly independent. Let ξ = (x, y, z) be an arbitrary vector of ℝ³.
Let us examine if ξ ∈ L{α, β, γ}. If possible let ξ = c1 α + c2 β + c3 γ, where the ci are real.
Therefore,
c1 + 2c2 + c3 = x, 2c1 + c2 + c3 = y, c1 + c2 + 2c3 = z.
This is a system of three linear equations in c1 , c2 , c3 . The coefficient determinant Δ =
−4 ≠ 0. Therefore, there exists a unique solution for c1 , c2 , c3 . This proves that ξ ∈
L{α, β, γ}. Thus any vector of ℝ³ can be generated by the vectors α, β, γ. Hence {α, β, γ}
generates ℝ³. Therefore, {α, β, γ} is a basis of ℝ³.


Ex 4.5.8 Prove that (2, 0, 0), (0, 1, 0) are linearly independent but do not form a basis of
ℝ³.
Solution: Let c1 (2, 0, 0) + c2 (0, 1, 0) = (0, 0, 0)
or, 2c1 = 0, c2 = 0 or, c1 = c2 = 0.
Therefore, the given vectors are linearly independent. Let (1, 2, 3) be a vector of ℝ³. Then
d1 (2, 0, 0) + d2 (0, 1, 0) = (1, 2, 3) or, (2d1 , d2 , 0) = (1, 2, 3).
Equating both sides, we get 2d1 = 1, d2 = 2, 0 = 3. The last relation is not possible.
Hence the vector (1, 2, 3) ∈ ℝ³ can not be expressed using the given vectors, i.e., the
given vectors do not generate ℝ³. Hence they do not form a basis of ℝ³.
Ex 4.5.9 If {α, β, γ} be a basis of a real vector space V and c ≠ 0 be a real number, examine
whether {α + cβ, β + cγ, γ + cα} is a basis of V or not. [WBUT 2003]
Solution: Let α + cβ = δ1 , β + cγ = δ2 , γ + cα = δ3 . Let us consider the relation
c1 δ1 + c2 δ2 + c3 δ3 = θ, where c1 , c2 , c3 are real. Therefore,
c1 (α + cβ) + c2 (β + cγ) + c3 (γ + cα) = θ
or, (c1 + c3 c)α + (c1 c + c2 )β + (c2 c + c3 )γ = θ
or, c1 + cc3 = 0, c1 c + c2 = 0, c2 c + c3 = 0,
since α, β, γ are linearly independent. The coefficient determinant of the above system of
equations in c1 , c2 , c3 is
    | 1 0 c |
Δ = | c 1 0 | = c³ + 1.
    | 0 c 1 |
If c³ + 1 = 0, or, c = −1, then Δ = 0 and hence the vectors δ1 , δ2 , δ3 are linearly dependent
and therefore, {α + cβ, β + cγ, γ + cα} does not form a basis. But, if c ≠ −1, then Δ ≠ 0 and
δ1 , δ2 , δ3 are linearly independent. V is a vector space of dimension 3 and {δ1 , δ2 , δ3 } is a
linearly independent set containing 3 vectors of V . Therefore, {δ1 , δ2 , δ3 } is a basis of V .
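The determinant computation that decides the two cases can be done symbolically; the snippet below is an illustrative sketch, not part of the text.

```python
import sympy as sp

c = sp.symbols('c')
# Coefficient matrix of the system in Ex 4.5.9.
M = sp.Matrix([[1, 0, c],
               [c, 1, 0],
               [0, c, 1]])
d = sp.expand(M.det())          # c**3 + 1
roots = sp.solve(sp.Eq(d, 0), c)
```

The determinant is c³ + 1, whose only real root is c = −1, the single value for which the new set fails to be a basis.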
Ex 4.5.10 Let V be the vector space of all polynomials with real coefficients of degree at
most n, where n ≥ 2. Considering elements of V as functions from ℝ to ℝ, define W =
{ p ∈ V : ∫₀¹ p(x)dx = 0 }. Show that W is a subspace of V and dim(W ) = n. [IIT-JAM 11]
Solution: The subset W of V is given by
W = { p ∈ V : ∫₀¹ p(x)dx = 0 }.
Clearly 0 ∈ W as ∫₀¹ 0 dx = 0, i.e., W is nonempty. Let p1 (x), p2 (x) ∈ W , then
∫₀¹ p1 (x)dx = 0 and ∫₀¹ p2 (x)dx = 0.
Let a, b ∈ ℝ, then
∫₀¹ [ap1 (x) + bp2 (x)]dx = a ∫₀¹ p1 (x)dx + b ∫₀¹ p2 (x)dx = a·0 + b·0 = 0.
This implies ap1 (x) + bp2 (x) ∈ W if p1 (x), p2 (x) ∈ W ; hence W is a subspace of V . Now
let p ∈ W , where
p(x) = a0 + a1 x + a2 x² + · · · + an xⁿ.
Then
∫₀¹ p(x)dx = a0 + a1 /2 + a2 /3 + · · · + an /(n + 1) = 0.
This is a single non-trivial linear condition on the n + 1 coefficients a0 , a1 , . . . , an ; in other
words, W is the kernel of the nonzero linear functional p ↦ ∫₀¹ p(x)dx on V . Since
dim V = n + 1, it follows that dim W = (n + 1) − 1 = n. Indeed, the n polynomials
x − 1/2, x² − 1/3, . . . , xⁿ − 1/(n + 1)
all lie in W , are linearly independent (their leading terms are distinct), and so span an
n-dimensional subspace of W , which must be all of W . Hence dim W = n.
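The kernel argument can be checked for a sample degree bound by viewing the integral functional as a row vector acting on the coefficient space; this is an illustrative sketch, not part of the text.

```python
import sympy as sp

n = 4   # a sample value of the degree bound; dim V = n + 1
# The integral functional sends coefficients (a0, ..., an) to
# a0 + a1/2 + ... + an/(n+1); W is its kernel.
row = sp.Matrix([[sp.Rational(1, k + 1) for k in range(n + 1)]])
dim_W = len(row.nullspace())
```

A single nonzero functional on an (n + 1)-dimensional space has an n-dimensional kernel, as the count confirms.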
Existence theorem
Theorem 4.5.1 Every finite dimensional vector space has a finite basis.
Proof: Let V (F ) be a finite dimensional vector space; then V = L(S), where S is a finite
subset of V . Let S = {α1 , α2 , . . . , αn }, and we may assume S does not contain θ, as
L(S − {θ}) = L(S).
If S is linearly independent, then it is a finite basis of V and the theorem follows. If S is
linearly dependent, ∃ some αk (2 ≤ k ≤ n) in S such that αk is a linear combination of the
preceding vectors α1 , α2 , . . . , αk−1 . If S1 = S − {αk }, then,
L(S1 ) = L(S) = V.
If S1 is linearly independent, then it is a finite basis of V and so we are done. If S1 is
linearly dependent, then ∃ some αl (l > k) which is a linear combination of the preceding
vectors. In the same way, we can say that if S2 = S1 − {αl }, then,
L(S2 ) = L(S1 ) = L(S) = V.
Now, if S2 is linearly independent, it becomes a finite basis; otherwise we continue to proceed
in the same manner till, after a finite number of steps, we obtain a linearly independent
subset of S which generates V .
It is clear that each step consists in the exclusion of one vector, and the resulting set still
generates V . At the worst, we may be left with a single non null vector generating V , which is
clearly linearly independent and so will become a basis. Thus there must exist a linearly
independent subset of S generating V .
Result 4.5.1 (Deletion theorem). If a vector space V over a field F be spanned by a
linearly dependent set {α1 , α2 , . . . , αn }, then V can also be generated by a suitable proper
subset of {α1 , α2 , . . . , αn }.
Replacement theorem
Theorem 4.5.2 Let {α1 , α2 , . . . , αn } be a basis of a vector space V (F ) and β (≠ θ) ∈ V ,
where β = c1 α1 + c2 α2 + . . . + cn αn ; ci ∈ F. Then if ck ≠ 0, β can replace αk to give a new
basis of V .
Proof: Since ck ≠ 0, ck⁻¹ exists, ck⁻¹ ∈ F and ck⁻¹ck = 1, where 1 is the identity element
in F . Now,
β = c1 α1 + c2 α2 + . . . + ck−1 αk−1 + ck αk + ck+1 αk+1 + . . . + cn αn
or, ck αk = β − c1 α1 − c2 α2 − . . . − ck−1 αk−1 − ck+1 αk+1 − . . . − cn αn
⇒ αk = ck⁻¹[β − c1 α1 − . . . − ck−1 αk−1 − ck+1 αk+1 − . . . − cn αn ]
= d1 α1 + d2 α2 + . . . + dk−1 αk−1 + dk β + dk+1 αk+1 + . . . + dn αn ,
where the dr are given by,
dr = −ck⁻¹cr ; r = 1, 2, . . . , k − 1, k + 1, . . . , n and dk = ck⁻¹.
Hence αk is a linear combination of the vectors α1 , . . . , αk−1 , β, αk+1 , . . . , αn . Now, we are
to show that {α1 , α2 , . . . , αk−1 , β, αk+1 , . . . , αn } is linearly independent. Let p1 , p2 , . . . , pn
be n scalars such that,
p1 α1 + . . . + pk−1 αk−1 + pk β + pk+1 αk+1 + . . . + pn αn = θ.
Substituting β = c1 α1 + c2 α2 + . . . + cn αn , we get,
(p1 + pk c1 )α1 + . . . + (pk−1 + pk ck−1 )αk−1 + pk ck αk + (pk+1 + pk ck+1 )αk+1 + . . . + (pn + pk cn )αn = θ.
Since {α1 , α2 , . . . , αn } is LI, comparing the coefficients, we get
pi + pk ci = 0; i = 1, 2, . . . , k − 1, k + 1, . . . , n and pk ck = 0
⇒ pk = 0, as ck ≠ 0, and then pi = 0; i = 1, 2, . . . , k − 1, k + 1, . . . , n.
This shows that {α1 , α2 , . . . , αk−1 , β, αk+1 , . . . , αn } is linearly independent. Now, we are to
show that
L{α1 , α2 , . . . , αk−1 , β, αk+1 , . . . , αn } = V.
For this, let,
S = {α1 , α2 , . . . , αk−1 , αk , αk+1 , . . . , αn }
and T = {α1 , α2 , . . . , αk−1 , β, αk+1 , . . . , αn }.
Since β is a linear combination of the vectors of S, each element of T is a linear combination
of the vectors of S. Therefore,
L(T ) ⊆ L(S).
Also, since αk is a linear combination of the vectors of T , each element of S is a linear
combination of the vectors of T . Therefore,
L(S) ⊆ L(T ).
Therefore, L(S) = L(T ) = V . Hence, T = {α1 , α2 , . . . , αk−1 , β, αk+1 , . . . , αn } fulfils both
the conditions for a basis of V . Hence T is a new basis of V .
Corollary: If {α1 , α2 , . . . , αn } be a basis of the finite dimensional vector space V (F ), then
any set of linearly independent vectors of V contains at most n vectors.
Ex 4.5.11 Prove that the set of vectors $(1, 1, 0, 1), (1, -2, 0, 0), (1, 0, -1, 2)$ is LI in $\Re^4$. Extend this set to a basis of $\Re^4$. Express $\alpha = (x_1, x_2, x_3, x_4)$ in terms of the basis so formed.
Solution: Let $\alpha_1 = (1, 1, 0, 1)$, $\alpha_2 = (1, -2, 0, 0)$, $\alpha_3 = (1, 0, -1, 2)$. Let $c_1, c_2, c_3 \in \Re$ be scalars such that
$$c_1\alpha_1 + c_2\alpha_2 + c_3\alpha_3 = \theta$$
$$\Rightarrow c_1(1, 1, 0, 1) + c_2(1, -2, 0, 0) + c_3(1, 0, -1, 2) = \theta$$
$$\Rightarrow (c_1 + c_2 + c_3, c_1 - 2c_2, -c_3, c_1 + 2c_3) = \theta$$
$$\Rightarrow c_1 + c_2 + c_3 = 0 = c_1 - 2c_2 = -c_3 = c_1 + 2c_3 \Rightarrow c_1 = c_2 = c_3 = 0.$$
Hence $\{\alpha_1, \alpha_2, \alpha_3\}$ is linearly independent in $\Re^4$. Let $\{e_1, e_2, e_3, e_4\}$ be the standard basis of $\Re^4$. Then
$$\alpha_1 = 1e_1 + 1e_2 + 0e_3 + 1e_4.$$
Since the coefficient of $e_1$ is non zero, by the replacement theorem, $\{\alpha_1, e_2, e_3, e_4\}$ is a new basis. Now,

$$\alpha_2 = 1e_1 - 2e_2 = \alpha_1 - 3e_2 - e_4.$$
Since the coefficient of $e_2$ is non zero, by the replacement theorem, $\{\alpha_1, \alpha_2, e_3, e_4\}$ is a new basis of $\Re^4$. Also,
$$\alpha_3 = 1e_1 - e_3 + 2e_4 = \alpha_1 - e_2 - e_3 + e_4 = \alpha_1 - \frac{1}{3}\big[\alpha_1 - \alpha_2 - e_4\big] - e_3 + e_4 = \frac{2}{3}\alpha_1 + \frac{1}{3}\alpha_2 - e_3 + \frac{4}{3}e_4.$$
Since the coefficient of $e_3$ is non zero, by the replacement theorem, $\{\alpha_1, \alpha_2, \alpha_3, e_4\}$ is a new basis of $\Re^4$ and $L(\{e_1, e_2, e_3, e_4\}) = L(\{\alpha_1, \alpha_2, \alpha_3, e_4\})$. Also,
$$e_3 = \frac{2}{3}\alpha_1 + \frac{1}{3}\alpha_2 - \alpha_3 + \frac{4}{3}e_4;\quad e_2 = \frac{1}{3}(\alpha_1 - \alpha_2 - e_4);\quad e_1 = \frac{2}{3}\alpha_1 + \frac{1}{3}\alpha_2 - \frac{2}{3}e_4.$$
Now,
$$\alpha = (x_1, x_2, x_3, x_4) = x_1e_1 + x_2e_2 + x_3e_3 + x_4e_4$$
$$= \Big(\frac{2x_1}{3} + \frac{x_2}{3} + \frac{2x_3}{3}\Big)\alpha_1 + \Big(\frac{x_1}{3} - \frac{x_2}{3} + \frac{x_3}{3}\Big)\alpha_2 - x_3\alpha_3 - \Big(\frac{2x_1}{3} + \frac{x_2}{3} - \frac{4x_3}{3} - x_4\Big)e_4.$$
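The coefficients above can be checked numerically. The sketch below (a hypothetical helper of our own, not from the text) solves the $4 \times 4$ system whose columns are $\alpha_1, \alpha_2, \alpha_3, e_4$ by exact Gauss-Jordan elimination over the rationals, recovering $e_1 = \frac{2}{3}\alpha_1 + \frac{1}{3}\alpha_2 - \frac{2}{3}e_4$ as the special case $x = (1, 0, 0, 0)$ of the formula for $\alpha$.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square system A c = b exactly by Gauss-Jordan elimination."""
    n = len(A)
    m = [[Fraction(x) for x in row] + [Fraction(v)] for row, v in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if m[r][col] != 0)  # find a pivot
        m[col], m[piv] = m[piv], m[col]
        m[col] = [x / m[col][col] for x in m[col]]               # scale pivot row
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [a - f * p for a, p in zip(m[r], m[col])]  # eliminate
    return [row[-1] for row in m]

# the columns of the system are alpha1, alpha2, alpha3, e4
alpha1, alpha2, alpha3, e4 = (1, 1, 0, 1), (1, -2, 0, 0), (1, 0, -1, 2), (0, 0, 0, 1)
cols = [alpha1, alpha2, alpha3, e4]
A = [[cols[j][i] for j in range(4)] for i in range(4)]

coords = solve(A, [1, 0, 0, 0])   # express e1 in the new basis
print(coords)                     # the coefficients 2/3, 1/3, 0, -2/3
```

Feeding any $(x_1, x_2, x_3, x_4)$ as the right-hand side reproduces the closed-form coefficients derived above.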
Ex 4.5.12 Obtain a basis of $\Re^3$ containing the vector $(1, 0, 2)$.
Solution: $\Re^3$ is a real vector space of dimension 3. The standard basis of $\Re^3$ is $\{e_1, e_2, e_3\}$, where $e_1 = (1, 0, 0)$, $e_2 = (0, 1, 0)$, $e_3 = (0, 0, 1)$. Let $\alpha = (1, 0, 2)$ be the given vector; then
$$\alpha = 1e_1 + 0e_2 + 2e_3.$$
Since the coefficient of $e_1$ is non zero, by the replacement theorem, $e_1$ can be replaced by $\alpha$ to give a new basis of $\Re^3$. Hence a basis of $\Re^3$ containing the given vector is
$$\{(1, 0, 2), (0, 1, 0), (0, 0, 1)\}.$$
The replacement can be done in more than one way, and thus different bases for $\Re^3$ can be obtained.
Ex 4.5.13 Obtain a basis of $\Re^3$ containing the vectors $(2, -1, 0)$ and $(1, 3, 2)$.
Solution: We know $\{e_1, e_2, e_3\}$ is the standard basis of $\Re^3$, where $e_1 = (1, 0, 0)$, $e_2 = (0, 1, 0)$, $e_3 = (0, 0, 1)$. Let $\alpha = (2, -1, 0)$ and $\beta = (1, 3, 2)$ be the two given vectors; then
$$\alpha = 2e_1 - e_2 + 0e_3.$$
Since the coefficient of $e_1$ is non zero, by the replacement theorem, $\alpha$ can replace $e_1$ to give a new basis $\{\alpha, e_2, e_3\}$ of $\Re^3$. Now,
$$\beta = 1e_1 + 3e_2 + 2e_3 = \frac{1}{2}(\alpha + e_2) + 3e_2 + 2e_3 = \frac{1}{2}\alpha + \frac{7}{2}e_2 + 2e_3.$$
Since the coefficient of $e_2$ is non zero, by the replacement theorem, $\beta$ can replace $e_2$ to give a new basis $\{\alpha, \beta, e_3\}$ of $\Re^3$. Hence a basis of $\Re^3$ containing the given vectors is
$$\{(2, -1, 0), (1, 3, 2), (0, 0, 1)\}.$$
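The replacement procedure of the last two examples can be mimicked mechanically: keep the given vectors and append a standard basis vector whenever it enlarges the span. A small sketch (the helper names are ours, not the book's), using exact rank computation:

```python
from fractions import Fraction

def rank(rows):
    """Row rank of a matrix (list of row tuples/lists) via exact elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0]) if m else 0):
        piv = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for r in range(rk + 1, len(m)):
            f = m[r][col] / m[rk][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

def extend_to_basis(vectors, n):
    """Extend a linearly independent list in R^n to a basis by appending
    those standard basis vectors e_i that strictly increase the rank."""
    basis = [list(v) for v in vectors]
    for i in range(n):
        e = [0] * n
        e[i] = 1
        if rank(basis + [e]) > rank(basis):
            basis.append(e)
    return basis

print(extend_to_basis([(2, -1, 0), (1, 3, 2)], 3))
```

Note that this run appends $e_1$ rather than the $e_3$ chosen in the worked example; extensions to a basis are not unique.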


Invariance theorem
Theorem 4.5.3 Let V (F ) be a finite dimensional vector space, then any two bases of V
have the same number of vectors.
Proof: Let $V(F)$ be a finite dimensional vector space and $B_1 = \{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ and $B_2 = \{\beta_1, \beta_2, \ldots, \beta_r\}$ be two bases of $V(F)$. We are to show that $n = r$. If possible, let $r > n$. Then $\alpha_i \neq \theta$, $\beta_i \neq \theta$, and using the replacement theorem, $\beta_1$ can replace some $\alpha_i$ to give a new basis of $V$. Without loss of generality (by changing the order of the $\alpha$'s), we can say $\{\beta_1, \alpha_2, \ldots, \alpha_n\}$ is a basis of $V(F)$. Let
$$\beta_2 = c_1\beta_1 + \sum_{i=2}^{n} c_i\alpha_i;$$
then we assert that some $c_i\ (i > 1) \neq 0$, for if $c_i = 0\ (i > 1)$, then $\beta_2 = c_1\beta_1$, showing that $\{\beta_1, \beta_2\}$ is linearly dependent; but $\{\beta_1, \beta_2\}$, being a subset of a linearly independent set, is linearly independent. Thus some $c_i\ (i > 1) \neq 0$. Hence $\beta_2$ can replace some $\alpha_i\ (i \geq 2)$ to give a new basis of $V$. Without loss of generality, we assume $\{\beta_1, \beta_2, \alpha_3, \ldots, \alpha_n\}$ is a basis of $V(F)$. Proceeding in this way, after $n$ steps, $\{\beta_1, \beta_2, \ldots, \beta_n\}$ is a basis of $V(F)$.
Therefore, $\beta_{n+1}$ is a linear combination of $\beta_1, \beta_2, \ldots, \beta_n$, giving that $\{\beta_1, \beta_2, \ldots, \beta_n, \beta_{n+1}\}$ is linearly dependent, which is a contradiction, since it is a subset of the linearly independent set $\{\beta_1, \beta_2, \ldots, \beta_n, \ldots, \beta_r\}$. This contradiction shows that our assumption is wrong. Hence $r \not> n$. Similarly, by interchanging the roles of the two bases, we have $r \not< n$, and consequently $r = n$.
Thus, although a vector space has many bases, we have just shown that for a particular vector space $V$, all bases have the same number of vectors. Therefore, all finite dimensional vector spaces of the same dimension differ only in the nature of their elements; their algebraic properties are identical.
Extension theorem
Theorem 4.5.4 Every linearly independent subset in a finite dimensional vector space
V (F ) is either a basis of V or it can be extended to form a basis of V .
Proof: Let $S = \{\alpha_1, \alpha_2, \ldots, \alpha_r\}$ be a linearly independent subset of a finite dimensional vector space $V(F)$. Now, $L(S)$ being the smallest subspace containing $S$, $L(S) \subseteq V$. If $L(S) = V$, then $S$ is a finite basis of $V$. If $L(S)$ is a proper subspace of $V$, then
$$V - L(S) \neq \emptyset.$$
Let $\beta_1 \in V - L(S)$; then form $S_1 = \{\alpha_1, \alpha_2, \ldots, \alpha_r, \beta_1\}$, and we are to prove that $S_1$ is linearly independent. Let $c_1, c_2, \ldots, c_r, c_{r+1} \in F$ be scalars such that
$$\sum_{i=1}^{r} c_i\alpha_i + c_{r+1}\beta_1 = \theta.$$
We assert that $c_{r+1} = 0$, because if $c_{r+1} \neq 0$, then $c_{r+1}^{-1}$ exists in $F$ and then
$$\beta_1 = \sum_{i=1}^{r} (-c_{r+1}^{-1}c_i)\alpha_i;\quad -c_{r+1}^{-1}c_i \in F$$
shows that $\beta_1 \in L(S)$, which is a contradiction. So $c_{r+1} = 0$. Also, $\{\alpha_1, \alpha_2, \ldots, \alpha_r\}$ is linearly independent, so $c_1 = c_2 = \ldots = c_r = 0$; $c_{r+1} = 0$. Therefore, $S_1 = \{\alpha_1, \alpha_2, \ldots, \alpha_r, \beta_1\}$ is linearly independent. Now, $L(S_1) \subseteq V$. If $L(S_1) = V$, then $S_1$ is a basis of $V$, where $S_1 \supset S$, and as $S_1$ is an extension of $S$, the theorem is proved. If $L(S_1)$ is a proper subspace of $V$, then
$$V - L(S_1) \neq \emptyset,$$
and proceeding as before we get $S_2 = \{\alpha_1, \alpha_2, \ldots, \alpha_r, \beta_1, \beta_2\} \supset S_1$. If $L(S_2) = V$, then $S_2$ is a basis; if not, we continue to repeat the procedure until, after a finite number of steps, we get a linearly independent set which generates $V$ and contains $S$. As $V$ is finite dimensional, after a finite number of steps we come to a finite set
$$S_k = \{\alpha_1, \alpha_2, \ldots, \alpha_r, \beta_1, \beta_2, \ldots, \beta_k\}$$
as an extension of $S$ and also as a basis of $V$. Hence either $S$ is already a basis or it can be extended to form a basis of $V$.
Deduction 4.5.1 Every set of $(n + 1)$ or more vectors in an $n$-dimensional vector space $V(F)$ is linearly dependent.
Proof: Since $V(F)$ is a finite dimensional vector space with dimension $n$, every basis of $V$ contains exactly $n$ vectors. Now, suppose $S$ is a linearly independent subset of $V$ containing $(n + 1)$ or more vectors. Then by the extension theorem, either $S$ is already a basis of $V$ or it can be extended to form a basis of $V$. But in either case the basis of $V$ would contain $(n + 1)$ or more vectors, which is contrary to the hypothesis that $V$ is $n$-dimensional. Thus $S$ is linearly dependent, and so is every superset of the same.
Deduction 4.5.2 If $V(F)$ is a finite dimensional space of dimension $n$, then any linearly independent set of $n$ vectors in $V$ forms a basis of $V$.
Proof: Since $V(F)$ is a finite dimensional vector space with $\dim V = n$, every basis of $V$ contains exactly $n$ vectors. Now, if $S$ is a linearly independent set of $n$ vectors in $V$, then by the extension theorem either $S$ is already a basis of $V$ or it can be extended to form a basis of $V$. But in the latter case, the basis of $V$ would contain more than $n$ vectors, contradicting the hypothesis that $V$ is $n$-dimensional. Consequently, the former statement, that $S$ forms a basis of $V$, is true.
Deduction 4.5.3 If $V(F)$ is a finite dimensional space of dimension $n$, then any subset consisting of $n$ vectors in $V$ which generates $V$ forms a basis of $V$.
Proof: Since $V(F)$ is a finite dimensional vector space with $\dim V = n$, every basis of $V$ contains exactly $n$ vectors. Let $S$ be a set of $n$ vectors in $V$ generating $V$. If $S$ is linearly independent, then it forms a basis of $V$; otherwise there would exist a proper subset of $S$ forming a basis of $V$. But in that case the basis of $V$ would contain fewer than $n$ elements, contradicting the hypothesis that $V$ is $n$-dimensional. Hence $S$ cannot be linearly dependent, and so it must form a basis of $V$.
Theorem 4.5.5 Let V (F ) be a vector space. A subset B = {1 , 2 , . . . , n } of V is a basis
of V if and only if every element of V has a unique representation as a linear combination
of the vectors of B.
Proof: Let $B = \{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ be a basis of $V(F)$ and $\alpha \in V$. Then every vector $\alpha \in V$ can be written as a linear combination of the vectors in $B$, as $B$ spans $V$. Now, let
$$\alpha = \sum_{i=1}^{n} c_i\alpha_i;\ \text{for some scalars } c_i \in F \quad\text{and}\quad \alpha = \sum_{i=1}^{n} d_i\alpha_i;\ \text{for some other scalars } d_i \in F.$$
Subtracting the second from the first, we obtain
$$\sum_{i=1}^{n} (c_i - d_i)\alpha_i = \alpha - \alpha = \theta$$
$$\Rightarrow c_i - d_i = 0\ \forall i,\ \text{as } \{\alpha_1, \alpha_2, \ldots, \alpha_n\} \text{ is linearly independent}$$
$$\Rightarrow c_i = d_i;\quad 1 \leq i \leq n,$$
and so the $c_i$'s are unique. Hence there is only one way to express $\alpha$ as a linear combination of the vectors in $B$. Conversely, let $B = \{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ be a subset of $V$ such that every vector of $V$ has a unique representation as a linear combination of the vectors of $B$. Clearly
$$V = L(B) = L(\{\alpha_1, \alpha_2, \ldots, \alpha_n\}).$$
Now, $\theta \in V$, and by the condition, $\theta$ has a unique representation as a linear combination of the vectors of $B$. Let $\theta = \sum_{i=1}^{n} c_i\alpha_i$, which is satisfied by $c_1 = c_2 = \ldots = c_n = 0$, and because of the uniqueness in the condition, it follows that
$$\sum_{i=1}^{n} c_i\alpha_i = \theta \Rightarrow c_i = 0\ \forall i$$
$\Rightarrow B$ is a LI set $\Rightarrow B$ is a basis of $V(F)$.
Result 4.5.2 If $U$ is a subspace of a finite dimensional vector space $V$ and $\dim V = n$, then $U$ is finite dimensional and $\dim U \leq n$.
Ex 4.5.14 Find a basis of $\Re^3$ containing the vectors $(1, 2, 0)$ and $(1, 3, 1)$.
Solution: Since $\dim \Re^3 = 3$, three vectors are needed to generate $\Re^3$. Let $\alpha = (1, 2, 0)$, $\beta = (1, 3, 1)$, and let the third vector be $e_1 = (1, 0, 0)$. Now, the determinant formed by the vectors $e_1, \alpha, \beta$ is
$$\begin{vmatrix} 1 & 0 & 0 \\ 1 & 2 & 0 \\ 1 & 3 & 1 \end{vmatrix} = 2 \neq 0.$$
So the vectors $e_1, \alpha, \beta$ are linearly independent. Also, the number of vectors is three and they belong to $\Re^3$. Hence $\{(1, 2, 0), (1, 3, 1), (1, 0, 0)\}$ is a basis of $\Re^3$ containing $\alpha$ and $\beta$.
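The $3 \times 3$ determinant test used above is easy to verify directly; a minimal sketch (the helper name is ours):

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# rows e1, alpha, beta from Ex 4.5.14
print(det3([(1, 0, 0), (1, 2, 0), (1, 3, 1)]))  # 2, non zero, so the rows are LI
```

A zero determinant would instead signal linear dependence, in which case a different standard basis vector should be tried as the third vector.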
Ex 4.5.15 $W_1$ and $W_2$ are two subspaces of $\Re^4$ defined by
$W_1 = \{(x, y, z, w) : x, y, z, w \in \Re,\ 3x + y + z + 2w = 0\}$, $W_2 = \{(x, y, z, w) : x, y, z, w \in \Re,\ x + y - z + 2w = 0\}$. Find $\dim(W_1 \cap W_2)$.
Solution: The subspace $W_1 \cap W_2 = \{(x, y, z, w) : x, y, z, w \in \Re,\ 3x + y + z + 2w = 0,\ x + y - z + 2w = 0\}$. Now, solving the equations $3x + y + z + 2w = 0$, $x + y - z + 2w = 0$ for $y, z$, we get $y = -2x - 2w$, $z = -x$. Therefore,
$$(x, y, z, w) = (x, -2x - 2w, -x, w) = x(1, -2, -1, 0) + w(0, -2, 0, 1).$$
Thus the set $\{(1, -2, -1, 0), (0, -2, 0, 1)\}$ generates the subspace $W_1 \cap W_2$. The vectors $(1, -2, -1, 0)$ and $(0, -2, 0, 1)$ are linearly independent, as
$$c_1(1, -2, -1, 0) + c_2(0, -2, 0, 1) = (0, 0, 0, 0) \ \text{implies}\ c_1 = 0,\ c_2 = 0.$$
Hence $\{(1, -2, -1, 0), (0, -2, 0, 1)\}$ is a basis of $W_1 \cap W_2$, and hence the dimension of $W_1 \cap W_2$ is 2.
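Here $\dim(W_1 \cap W_2)$ is the nullity of the $2 \times 4$ coefficient matrix of the two equations, i.e. $4$ minus its rank. A quick cross-check (the `rank` helper is ours):

```python
from fractions import Fraction

def rank(rows):
    """Row rank via exact Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        piv = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for r in range(rk + 1, len(m)):
            f = m[r][col] / m[rk][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

coeffs = [(3, 1, 1, 2), (1, 1, -1, 2)]   # 3x+y+z+2w = 0 and x+y-z+2w = 0
dim_intersection = 4 - rank(coeffs)       # nullity = number of unknowns - rank
print(dim_intersection)                   # 2

# the basis vectors found above satisfy both equations
for v in [(1, -2, -1, 0), (0, -2, 0, 1)]:
    assert all(sum(c * x for c, x in zip(row, v)) == 0 for row in coeffs)
```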


Dimension of a subspace
Theorem 4.5.6 Every non null subspace $W$ of a finite dimensional vector space $V(F)$ is finite dimensional and $\dim W \leq \dim V$.
Proof: Since $V$ is finite dimensional, every basis of $V$ contains a finite number of elements, say $n$, and so every set of $(n + 1)$ or more vectors in $V$ is linearly dependent. Consequently, a linearly independent set of vectors in $W$ contains at most $n$ elements. Let
$$S = \{\alpha_1, \alpha_2, \ldots, \alpha_m\},$$
where $m \leq n$, be a maximal linearly independent set in $W$. Now, if $\beta$ is an arbitrary element of $W$, then, $S$ being a maximal linearly independent set,
$$S_1 = \{\alpha_1, \alpha_2, \ldots, \alpha_m, \beta\}$$
is linearly dependent, and hence the vector $\beta$ is a linear combination of $\alpha_1, \alpha_2, \ldots, \alpha_m$, showing that $S$ generates $W$. Accordingly, $\dim W = m \leq n = \dim V$. Moreover, when $W$ is a proper subspace of $V$, there is a vector $\beta \in V$ not contained in $W$, and as such $\beta$ cannot be expressed as a linear combination of elements of $S$, the basis of $W$. Consequently, the set obtained by adjoining $\beta$ to $S$ forms a linearly independent subset of $V$, and so the basis of $V$ contains more than $m$ vectors. Hence, in this case,
$$\dim W < \dim V.$$
Again, if $V = W$, then every basis of $V$ is also a basis of $W$, and therefore
$$V = W \Rightarrow \dim W = \dim V.$$
On the other hand, let $W$ be a subspace of $V$ such that $\dim W = \dim V = n$ (say). Now, if $S$ is a basis of $W$, then, it being a linearly independent subset of $V$ containing $n$ vectors, it also generates $V$. Thus each of $V$ and $W$ is generated by $S$, so in this case $V = W$. Hence
$$V = W \Leftrightarrow \dim W = \dim V.$$
Dimension of a linear sum
Theorem 4.5.7 Let $W_1$ and $W_2$ be subspaces of a finite dimensional vector space $V(F)$. Then $W_1 + W_2$ is finite dimensional and
$$\dim(W_1 + W_2) = \dim W_1 + \dim W_2 - \dim(W_1 \cap W_2).$$
Proof: Every subspace of a finite dimensional vector space is finite dimensional, and $V(F)$ is finite dimensional; so therefore are its subspaces $W_1, W_2, W_1 \cap W_2$ and $W_1 + W_2$, and
$$\dim(W_1 \cap W_2) \leq \dim W_1;\quad \dim W_1 \leq \dim(W_1 + W_2) \leq \dim V$$
$$\dim(W_1 \cap W_2) \leq \dim W_2;\quad \dim W_2 \leq \dim(W_1 + W_2) \leq \dim V.$$
Let $B = \{\alpha_1, \alpha_2, \ldots, \alpha_r\}$ be a basis of $W_1 \cap W_2$. Since $W_1 \cap W_2$ is a subspace of $W_1$ as well as of $W_2$, $B$ can be extended to form a basis of $W_1$ and of $W_2$. Let the extended sets $B_1$ and $B_2$, which form the bases of $W_1$ and $W_2$ respectively, be
$$B_1 = \{\alpha_1, \alpha_2, \ldots, \alpha_r;\ \beta_1, \beta_2, \ldots, \beta_s\}$$
$$B_2 = \{\alpha_1, \alpha_2, \ldots, \alpha_r;\ \gamma_1, \gamma_2, \ldots, \gamma_t\}.$$

Obviously, $\dim(W_1 \cap W_2) = r$, $\dim W_1 = r + s$ and $\dim W_2 = r + t$. Consider the set
$$B_0 = \{\alpha_1, \ldots, \alpha_r;\ \beta_1, \ldots, \beta_s;\ \gamma_1, \ldots, \gamma_t\}.$$
We shall show that the set $B_0$ is a basis of $W_1 + W_2$. First, we shall show that $L(B_0) = W_1 + W_2$. Now, any $\xi \in L(B_0)$ is of the form
$$\xi = \sum_{i=1}^{r} c_i\alpha_i + \sum_{i=1}^{s} b_i\beta_i + \sum_{i=1}^{t} k_i\gamma_i = \Big[\sum_{i=1}^{r} c_i\alpha_i + \sum_{i=1}^{s} b_i\beta_i\Big] + \Big[\sum_{i=1}^{r} 0\,\alpha_i + \sum_{i=1}^{t} k_i\gamma_i\Big]$$
$$= \xi_1 + \xi_2;\ \text{where } \xi_1 \in W_1 \text{ and } \xi_2 \in W_2$$
$$\Rightarrow \xi \in W_1 + W_2 \Rightarrow L(B_0) \subseteq W_1 + W_2. \quad (4.9)$$

Again, any $\xi \in W_1 + W_2$ is of the form $\xi = \xi_1 + \xi_2$, where $\xi_1 \in W_1$, $\xi_2 \in W_2$, so
$$\xi = \Big[\sum_{i=1}^{r} c_i\alpha_i + \sum_{i=1}^{s} b_i\beta_i\Big] + \Big[\sum_{i=1}^{r} k_i\alpha_i + \sum_{i=1}^{t} l_i\gamma_i\Big]$$
$$= \sum_{i=1}^{r} c_i'\alpha_i + \sum_{i=1}^{s} b_i\beta_i + \sum_{i=1}^{t} l_i\gamma_i;\quad c_i' = c_i + k_i,\ b_i, l_i \in F$$
$$\Rightarrow \xi \in L(B_0) \Rightarrow W_1 + W_2 \subseteq L(B_0). \quad (4.10)$$

Hence from (4.9) and (4.10), it follows that $L(B_0) = W_1 + W_2$. Next, we are to show that $B_0$ is LI. For this, let scalars $x_i\ (i = 1, 2, \ldots, r)$, $y_i\ (i = 1, 2, \ldots, s)$, and $z_i\ (i = 1, 2, \ldots, t) \in F$ be such that
$$\sum_{i=1}^{r} x_i\alpha_i + \sum_{i=1}^{s} y_i\beta_i + \sum_{i=1}^{t} z_i\gamma_i = \theta$$
$$\Rightarrow \sum_{i=1}^{t} (-z_i)\gamma_i = \sum_{i=1}^{r} x_i\alpha_i + \sum_{i=1}^{s} y_i\beta_i = \delta\ \text{(say)}.$$
Since $\delta \in W_1$ as well as $\delta \in W_2$, so $\delta \in W_1 \cap W_2$. Hence, for some $u_i \in F$,
$$\delta = \sum_{i=1}^{r} u_i\alpha_i \Rightarrow \sum_{i=1}^{r} x_i\alpha_i + \sum_{i=1}^{s} y_i\beta_i = \sum_{i=1}^{r} u_i\alpha_i$$
$$\Rightarrow \sum_{i=1}^{r} (x_i - u_i)\alpha_i + \sum_{i=1}^{s} y_i\beta_i = \theta$$
$$\Rightarrow x_i - u_i = 0;\ i = 1, 2, \ldots, r\ \text{and}\ y_i = 0;\ i = 1, 2, \ldots, s;\ \text{as } B_1 \text{ is LI}.$$
Again, $\delta = \sum_{i=1}^{t}(-z_i)\gamma_i$ gives $\sum_{i=1}^{r} u_i\alpha_i + \sum_{i=1}^{t} z_i\gamma_i = \theta$, so $u_i = 0$ and $z_i = 0$, as $B_2$ is LI. Hence $x_i = u_i = 0$, $y_i = 0$ and consequently $z_i = 0$.
Therefore, $B_0$ is linearly independent, and so $B_0$ is a basis of the finite dimensional subspace $W_1 + W_2$, and
$$\dim(W_1 + W_2) = r + s + t = (r + s) + (r + t) - r = \dim W_1 + \dim W_2 - \dim(W_1 \cap W_2).$$
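The formula just proved can be cross-checked numerically. If the columns of $U$ and $V$ are bases of $W_1$ and $W_2$, a standard observation is that each null vector of the block matrix $[U \mid V]$ corresponds to a vector of $W_1 \cap W_2$ (since $Ua + Vb = \theta$ means $Ua = -Vb$ lies in both spans), so $\dim(W_1 \cap W_2)$ equals the nullity of $[U \mid V]$. A sketch with a concrete pair of planes in $\Re^3$ (the setup and helper names are ours, not the book's):

```python
from fractions import Fraction

def rank(rows):
    """Row rank via exact Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        piv = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for r in range(rk + 1, len(m)):
            f = m[r][col] / m[rk][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

u1, u2 = (1, 0, 0), (0, 1, 0)      # basis of W1 (the xy-plane)
v1, v2 = (0, 1, 0), (0, 0, 1)      # basis of W2 (the yz-plane)

dim_w1 = rank([u1, u2])
dim_w2 = rank([v1, v2])
dim_sum = rank([u1, u2, v1, v2])   # W1 + W2 is spanned by all four vectors

# rows of the 3x4 block matrix [U | V]; its nullity is dim(W1 n W2)
blocked = [[u1[i], u2[i], v1[i], v2[i]] for i in range(3)]
dim_int = 4 - rank(blocked)

assert dim_sum == dim_w1 + dim_w2 - dim_int
print(dim_w1, dim_w2, dim_sum, dim_int)
```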
Ex 4.5.16 Suppose $W_1$ and $W_2$ are distinct four-dimensional subspaces of a vector space $V$, where $\dim V = 6$. Find the possible dimensions of $W_1 \cap W_2$.
Solution: Since the subspaces $W_1$ and $W_2$ are distinct four-dimensional subspaces of $V$, $W_1 + W_2$ properly contains $W_1$ and $W_2$. Consequently, $\dim(W_1 + W_2) > 4$. But $\dim(W_1 + W_2)$ cannot be greater than 6, as $\dim V = 6$. Therefore we have the following two possibilities: (i) $\dim(W_1 + W_2) = 5$, or (ii) $\dim(W_1 + W_2) = 6$. Using the theorem on the dimension of a linear sum, we have
$$\dim(W_1 \cap W_2) = \dim W_1 + \dim W_2 - \dim(W_1 + W_2) = 8 - \dim(W_1 + W_2).$$
Therefore, (i) $\dim(W_1 \cap W_2) = 3$, or (ii) $\dim(W_1 \cap W_2) = 2$.
Ex 4.5.17 If $U = L\{(1, 2, 1), (2, 1, 3)\}$ and $W = L\{(1, 0, 0), (0, 1, 0)\}$, show that $U, W$ are subspaces of $\Re^3$. Determine $\dim U$, $\dim W$, $\dim(U \cap W)$, $\dim(U + W)$.
Solution: Let $\alpha = (1, 2, 1)$, $\beta = (2, 1, 3)$, $\gamma = (1, 0, 0)$, $\delta = (0, 1, 0)$. Then $\{\alpha, \beta\}$ is linearly independent, as $c_1(1, 2, 1) + c_2(2, 1, 3) = (0, 0, 0)$ implies $c_1 = 0, c_2 = 0$. Also, it is given that $U = L\{\alpha, \beta\}$. Hence $U$ is a subspace of $\Re^3$ of dimension 2.
Again $\{\gamma, \delta\}$ is linearly independent, as $d_1(1, 0, 0) + d_2(0, 1, 0) = (0, 0, 0)$ implies $d_1 = 0, d_2 = 0$. Also, $\{\gamma, \delta\}$ generates $W$. Therefore, $W$ is a subspace of $\Re^3$ and $\{\gamma, \delta\}$ is a basis of $W$, so $\dim W = 2$.
Let $\xi$ be a vector in $U \cap W$. Then $\xi = a\alpha + b\beta$ for some real numbers $a, b$. Also, $\xi = c\gamma + d\delta$ for some real numbers $c, d$. Therefore,
$$a(1, 2, 1) + b(2, 1, 3) = c(1, 0, 0) + d(0, 1, 0)$$
or, $(a + 2b, 2a + b, a + 3b) = (c, d, 0)$, or, $a + 2b = c$, $2a + b = d$, $a + 3b = 0$. Solving, we get $a = -3b$, $c = -b$, $d = -5b$. Hence $\xi = (-b, -5b, 0) = -b(1, 5, 0)$, where $b$ is arbitrary. Therefore $U \cap W$ is a subspace of dimension 1. Now,
$$\dim(U + W) = \dim U + \dim W - \dim(U \cap W) = 2 + 2 - 1 = 3.$$
Thus, $\dim U = 2$, $\dim W = 2$, $\dim(U \cap W) = 1$ and $\dim(U + W) = 3$.
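These dimensions can be checked with a rank computation: $\dim(U + W)$ is the rank of the four spanning rows stacked together, and appending $(1, 5, 0)$ to the generators of $U$ should not raise the rank, confirming $(1, 5, 0) \in U$. A sketch (the `rank` helper is ours):

```python
from fractions import Fraction

def rank(rows):
    """Row rank via exact Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0])):
        piv = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for r in range(rk + 1, len(m)):
            f = m[r][col] / m[rk][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

U = [(1, 2, 1), (2, 1, 3)]
W = [(1, 0, 0), (0, 1, 0)]

print(rank(U), rank(W))            # 2 2
print(rank(U + W))                 # 3, so dim(U + W) = 3
print(rank(U + [(1, 5, 0)]))       # 2, so (1, 5, 0) already lies in U
# hence dim of the intersection is 2 + 2 - 3 = 1, matching the basis (1, 5, 0)
```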
Dimension of a direct sum
Theorem 4.5.8 If a finite dimensional vector space V (F ) is the direct sum of its subspaces
W1 and W2 , then,
dim(V ) = dim W1 + dim W2 .
Proof: Since $V$ is finite dimensional, so therefore are its subspaces $W_1$ and $W_2$. Let $S_1 = \{\alpha_1, \alpha_2, \ldots, \alpha_k\}$ and $S_2 = \{\beta_1, \beta_2, \ldots, \beta_l\}$ be bases of $W_1$ and $W_2$ respectively, so that $\dim W_1 = k$ and $\dim W_2 = l$. We are to show that $S = \{\alpha_1, \ldots, \alpha_k, \beta_1, \ldots, \beta_l\}$ is a basis of $V$. Now, since $V = W_1 \oplus W_2$, every $\xi \in V$ can be expressed as
$$\xi = \alpha + \beta;\quad \alpha \in W_1 \text{ and } \beta \in W_2$$
$$= \sum_{i=1}^{k} c_i\alpha_i + \sum_{j=1}^{l} d_j\beta_j;\ \text{for some } c_i, d_j \in F.$$
This shows that $\xi$ can be expressed as a linear combination of elements of $S$. Thus $S$ generates $V$. Now, we are to show that $S$ is linearly independent. For this, let
$$\sum_{i=1}^{k} c_i\alpha_i + \sum_{j=1}^{l} d_j\beta_j = \theta \Rightarrow \sum_{i=1}^{k} c_i\alpha_i = \sum_{j=1}^{l} (-d_j)\beta_j \in W_1 \cap W_2$$

as $\sum_{i=1}^{k} c_i\alpha_i \in W_1$ and $\sum_{j=1}^{l} (-d_j)\beta_j \in W_2$
$$\Rightarrow \sum_{i=1}^{k} c_i\alpha_i = \theta \ \text{and}\ \sum_{j=1}^{l} (-d_j)\beta_j = \theta;\ \text{as } W_1 \cap W_2 = \{\theta\}$$
$$\Rightarrow c_i = 0\ \forall i \ \text{and}\ d_j = 0\ \forall j,\ \text{as } S_1, S_2 \text{ are LI}$$
$\Rightarrow S$ is linearly independent. Thus, $S$ is a basis of $V$, and consequently, $\dim(V) = k + l = \dim W_1 + \dim W_2$.
Theorem 4.5.9 Existence of complementary subspace: Every subspace of a finite
dimensional vector space has a complement.
Proof: Let $W_1$ be a subspace of a finite dimensional vector space $V(F)$. Then we are to find a subspace $W_2$ of $V$ such that $V = W_1 \oplus W_2$. Since $V$ is finite dimensional, so therefore is its subspace $W_1$. Let $S_1 = \{\alpha_1, \alpha_2, \ldots, \alpha_m\}$ be a basis of $W_1$. Then $S_1$ is a linearly independent subset of $V$ and therefore it can be extended to form a basis of $V$. Let the extended set
$$S_2 = \{\alpha_1, \alpha_2, \ldots, \alpha_m, \beta_1, \beta_2, \ldots, \beta_n\}$$
be a basis of $V$. Let us denote by $W_2$ the subspace generated by $\{\beta_1, \beta_2, \ldots, \beta_n\}$. We shall show that $V = W_1 \oplus W_2$, which is equivalent to $V = W_1 + W_2$ and $W_1 \cap W_2 = \{\theta\}$. Let $\xi$ be an arbitrary element of $V$. As $S_2$ is a basis of $V$, we have
$$\xi = \sum_{i=1}^{m} a_i\alpha_i + \sum_{j=1}^{n} b_j\beta_j;\ \text{for some scalars } a_i, b_j$$
$$= \alpha + \beta,\ \text{where}\ \alpha = \sum_{i=1}^{m} a_i\alpha_i \in W_1 \ \text{and}\ \beta = \sum_{j=1}^{n} b_j\beta_j \in W_2.$$
Thus, each element of $V$ is expressible as the sum of an element of $W_1$ and an element of $W_2$, so $V = W_1 + W_2$. Now, in order to show that $W_1 \cap W_2 = \{\theta\}$, let $\alpha = \sum_{i=1}^{m} a_i\alpha_i \in W_1$ and $\beta = \sum_{j=1}^{n} b_j\beta_j \in W_2$ be equal. Then
$$\alpha = \beta \Rightarrow \sum_{i=1}^{m} a_i\alpha_i = \sum_{j=1}^{n} b_j\beta_j \Rightarrow \sum_{i=1}^{m} a_i\alpha_i + \sum_{j=1}^{n} (-b_j)\beta_j = \theta$$
$$\Rightarrow a_i = 0\ \forall i \ \text{and}\ b_j = 0\ \forall j;\ \text{as } S_2 \text{ is linearly independent}$$
$$\Rightarrow \sum_{i=1}^{m} a_i\alpha_i = \theta \ \text{and}\ \sum_{j=1}^{n} b_j\beta_j = \theta \Rightarrow \alpha = \beta = \theta.$$
Thus, no non-zero vector is common to both $W_1$ and $W_2$, i.e., $W_1 \cap W_2 = \{\theta\}$. Therefore,
$$V = W_1 \oplus W_2.$$

Theorem 4.5.10 Dimension of a quotient space: Let $V(F)$ be a finite dimensional vector space and $W$ be a subspace of $V$. Then
$$\dim(V/W) = \dim V - \dim W.$$
Proof: Let $\dim V = n$ and $\dim W = m$. Let $S_1 = \{\alpha_1, \alpha_2, \ldots, \alpha_m\}$ be a basis of $W$. By the extension theorem, $S_1$ can be extended to $S_2 = \{\alpha_1, \ldots, \alpha_m, \beta_1, \beta_2, \ldots, \beta_{n-m}\}$ to form a basis of $V$. We claim that the set
$$S_3 = \{W + \beta_1, W + \beta_2, \ldots, W + \beta_{n-m}\}$$
of $(n - m)$ cosets is a basis of $V/W$. First, we are to show that $S_3$ is linearly independent. Now, for scalars $b_1, b_2, \ldots, b_{n-m} \in F$, we have
$$b_1(W + \beta_1) + b_2(W + \beta_2) + \ldots + b_{n-m}(W + \beta_{n-m}) = W + \theta$$
$$\Rightarrow (W + b_1\beta_1) + (W + b_2\beta_2) + \ldots + (W + b_{n-m}\beta_{n-m}) = W + \theta$$
$$\Rightarrow W + (b_1\beta_1 + b_2\beta_2 + \ldots + b_{n-m}\beta_{n-m}) = W + \theta$$
$$\Rightarrow b_1\beta_1 + b_2\beta_2 + \ldots + b_{n-m}\beta_{n-m} \in W$$
$$\Rightarrow b_1\beta_1 + \ldots + b_{n-m}\beta_{n-m} = a_1\alpha_1 + \ldots + a_m\alpha_m,\ \text{for some } a_i \in F$$
$$\Rightarrow a_1\alpha_1 + \ldots + a_m\alpha_m + (-b_1)\beta_1 + \ldots + (-b_{n-m})\beta_{n-m} = \theta$$
$$\Rightarrow a_1 = a_2 = \ldots = a_m = 0;\ b_1 = b_2 = \ldots = b_{n-m} = 0;\ \text{as } S_2 \text{ is LI}$$
$$\Rightarrow b_1 = b_2 = \ldots = b_{n-m} = 0,\ \text{in particular}.$$

Therefore, $S_3$ is linearly independent. Moreover, if $W + \alpha$ is an arbitrary element of $V/W$, then $\alpha \in V$ and, $S_2$ being a basis of $V$, we have for some scalars $a_i, b_j \in F$,
$$\alpha = \sum_{i=1}^{m} a_i\alpha_i + \sum_{j=1}^{n-m} b_j\beta_j$$
$$\Rightarrow W + \alpha = W + \Big[\sum_{i=1}^{m} a_i\alpha_i + \sum_{j=1}^{n-m} b_j\beta_j\Big] = \Big[W + \sum_{i=1}^{m} a_i\alpha_i\Big] + \Big[W + \sum_{j=1}^{n-m} b_j\beta_j\Big]$$
$$= W + \sum_{j=1}^{n-m} b_j\beta_j;\ \text{as}\ \sum_{i=1}^{m} a_i\alpha_i \in W \Rightarrow W + \sum_{i=1}^{m} a_i\alpha_i = W$$
$$\Rightarrow W + \alpha = \sum_{j=1}^{n-m} b_j(W + \beta_j).$$
This shows that $W + \alpha \in L(\{W + \beta_1, \ldots, W + \beta_{n-m}\})$. Thus, each element of $V/W$ is expressible as a linear combination of elements of $S_3$, i.e., $S_3$ generates $V/W$. So $S_3$ is a basis of $V/W$. Therefore,
$$\dim(V/W) = n - m = \dim V - \dim W.$$
Ex 4.5.18 Let $V = \Re^4$ and $W$ be the subspace of $V$ generated by the vectors $(1, 0, 0, 0), (1, 1, 0, 0)$. Find a basis of the quotient space $V/W$.
Solution: Let $\alpha = (1, 0, 0, 0)$ and $\beta = (1, 1, 0, 0)$. Since $\alpha, \beta$ are linearly independent, $S = \{\alpha, \beta\}$ is a basis of $W$. The linearly independent set $S$ in $V$ can be extended to a basis of $V$. Let $\gamma = (0, 0, 1, 0)$ and $\delta = (0, 0, 0, 1)$; then $S_1 = \{\alpha, \beta, \gamma, \delta\}$ is linearly independent in $V$ and so is a basis of $V$. The elements of a basis of the quotient space $V/W$ are $W + \gamma$ and $W + \delta$, and so $\dim V/W = 2$. Here $\dim V = 4$, $\dim W = 2$ and $\dim V/W = 2 = 4 - 2$, so that
$$\dim(V/W) = \dim V - \dim W.$$

4.6 Co-ordinatisation of Vectors

Let $V$ be an $n$-dimensional vector space; then $V$ has a basis $S$ with $n$ vectors in it. Here we shall discuss an ordered basis $S = \{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ for $V$.

4.6.1 Ordered Basis

If the vectors of the basis set $S$ of a finite dimensional vector space $V(F)$ are enumerated in some fixed order, then $S$ is called an ordered basis.

4.6.2 Co-ordinates

Let $S = \{\alpha_1, \alpha_2, \ldots, \alpha_n\}$ be an ordered basis of a finite dimensional vector space $V(F)$. Then, for scalars $c_1, c_2, \ldots, c_n$, each $\alpha \in V$ can be uniquely expressed in the form
$$\alpha = c_1\alpha_1 + c_2\alpha_2 + \cdots + c_n\alpha_n = \sum_{i=1}^{n} c_i\alpha_i. \quad (4.11)$$
For each $\alpha \in V$, the unique ordered $n$-tuple $(c_1, c_2, \ldots, c_n)$ is called the co-ordinate vector of $\alpha$ relative to the ordered basis $S$ and is denoted by $(\alpha)_S$. The entries of $(\alpha)_S$ are called the co-ordinates of $\alpha \in V$ with respect to $S$.
(i) We assert that the set of vectors in $S$ should be ordered, because a change in $(\alpha)_S$ occurs if the relative order of vectors in $S$ is changed.
(ii) For a non zero vector space $V_n(F)$, the co-ordinates of all vectors in $V_n$ are unique relative to the ordered basis $S$.
(iii) The co-ordinate vectors of the vectors in an abstract space $V(F)$ of dimension $n$ relative to an ordered basis are the elements of $F^n$.
Ex 4.6.1 Find the co-ordinate vector of $\alpha = (1, 3, 1)$ relative to the ordered basis $B = \{\alpha_1, \alpha_2, \alpha_3\}$ of $\Re^3$, where $\alpha_1 = (1, 1, 1)$, $\alpha_2 = (1, 1, 0)$, $\alpha_3 = (1, 0, 0)$.
Solution: It is easy to verify that $B$ is a basis of $\Re^3(\Re)$. Let $c_1, c_2, c_3 \in \Re$ be scalars such that $c_1\alpha_1 + c_2\alpha_2 + c_3\alpha_3 = \alpha$ holds, so
$$c_1(1, 1, 1) + c_2(1, 1, 0) + c_3(1, 0, 0) = (1, 3, 1)$$
$$\Rightarrow (c_1 + c_2 + c_3, c_1 + c_2, c_1) = (1, 3, 1).$$
Set corresponding components equal to each other to obtain the system
$$c_1 + c_2 + c_3 = 1;\quad c_1 + c_2 = 3;\quad c_1 = 1 \Rightarrow c_1 = 1,\ c_2 = 2,\ c_3 = -2.$$
This is the unique solution of the system, and hence the co-ordinate vector of $\alpha$ with respect to the basis $B$ is $(\alpha)_B = (1, 2, -2)$. Conversely, if the co-ordinate vector relative to the ordered basis $B$ is $(a, b, c)$, then the vector is given by
$$\alpha = a\alpha_1 + b\alpha_2 + c\alpha_3 = a(1, 1, 1) + b(1, 1, 0) + c(1, 0, 0) = (a + b + c, a + b, a).$$
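A co-ordinate vector like $(\alpha)_B$ is just the solution of a small linear system. The sketch below repeats Ex 4.6.1 exactly over the rationals (the `solve` helper is ours, not the book's):

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square system A c = b exactly by Gauss-Jordan elimination."""
    n = len(A)
    m = [[Fraction(x) for x in row] + [Fraction(v)] for row, v in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[piv] = m[piv], m[col]
        m[col] = [x / m[col][col] for x in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [a - f * p for a, p in zip(m[r], m[col])]
    return [row[-1] for row in m]

basis = [(1, 1, 1), (1, 1, 0), (1, 0, 0)]
alpha = (1, 3, 1)
# row i of the system collects the i-th components of the basis vectors
A = [[basis[j][i] for j in range(3)] for i in range(3)]
coords = solve(A, alpha)
print(coords)   # the coordinates 1, 2, -2

# reconstruct alpha from its coordinates, as in the converse direction
recon = tuple(sum(c * basis[j][i] for j, c in enumerate(coords)) for i in range(3))
assert recon == alpha
```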
Ex 4.6.2 In the vector space $V$ of polynomials in $t$ of degree at most 3, consider the basis $B = \{1, 1-t, (1-t)^2, (1-t)^3\}$. Find the co-ordinate vector of $\alpha = 3 - 2t - t^2 \in V$ relative to the basis $B$.
Solution: It is easy to verify that $B$ is a basis of $V$. Set $\alpha$ as a linear combination of the polynomials in the basis $B$, using unknown scalars $c_1, c_2, c_3, c_4 \in \Re$, such that
$$c_1 \cdot 1 + c_2(1 - t) + c_3(1 - t)^2 + c_4(1 - t)^3 = \alpha = 3 - 2t - t^2$$
$$\Rightarrow (c_1 + c_2 + c_3 + c_4) - (c_2 + 2c_3 + 3c_4)t + (c_3 + 3c_4)t^2 - c_4t^3 = 3 - 2t - t^2.$$
Set corresponding coefficients of the same powers of $t$ equal to each other to obtain the system of linear equations
$$c_1 + c_2 + c_3 + c_4 = 3;\quad c_2 + 2c_3 + 3c_4 = 2;\quad c_3 + 3c_4 = -1;\quad c_4 = 0,$$
from which we have the unique solution $c_1 = 0$, $c_2 = 4$, $c_3 = -1$ and $c_4 = 0$. Hence the co-ordinate vector of $\alpha$ with respect to the basis $B$ is $(\alpha)_B = (0, 4, -1, 0)$.
Ex 4.6.3 In the vector space $W$ of $2 \times 2$ symmetric matrices over $\Re$, consider the basis
$$B = \left\{\begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}, \begin{pmatrix} 4 & 1 \\ 1 & 0 \end{pmatrix}, \begin{pmatrix} 3 & -2 \\ -2 & 1 \end{pmatrix}\right\}.$$
Find the co-ordinate vector of the matrix $\alpha = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix} \in W$ relative to the basis $B$.
Solution: It is easy to verify that $B$ is a basis of $W$. Set $\alpha \in W$ as a linear combination of the matrices in the basis $B$, using the unknown scalars $c_1, c_2, c_3 \in \Re$, as
$$c_1\begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix} + c_2\begin{pmatrix} 4 & 1 \\ 1 & 0 \end{pmatrix} + c_3\begin{pmatrix} 3 & -2 \\ -2 & 1 \end{pmatrix} = \alpha = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$$
$$\Rightarrow \begin{pmatrix} c_1 + 4c_2 + 3c_3 & -c_1 + c_2 - 2c_3 \\ -c_1 + c_2 - 2c_3 & 2c_1 + c_3 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}.$$
Set corresponding entries equal to each other to obtain the system of linear equations
$$c_1 + 4c_2 + 3c_3 = 1;\quad -c_1 + c_2 - 2c_3 = 2;\quad 2c_1 + c_3 = 4,$$
from which we have the unique solution $c_1 = 3$, $c_2 = 1$, $c_3 = -2$. Hence the co-ordinate vector of $\alpha$ with respect to the basis $B$ is $(\alpha)_B = (3, 1, -2)$. Since $\dim W = 3$, $(\alpha)_B$ must be a vector in $\Re^3$.

4.7 Rank of a Matrix

Here we obtain an effective method for finding a basis for a vector space $V$ spanned by a given set of vectors. We attach a unique number to a matrix $A$ which, as we show later, gives information about the dimension of the solution space of a homogeneous system with coefficient matrix $A$.

4.7.1 Row Space of a Matrix

Let $A = [a_{ij}]_{m \times n}$ be an arbitrary $m \times n$ matrix over the field $F$, i.e., $a_{ij} \in F$. Let $R_1, R_2, \ldots, R_m$ be the $m$ row vectors of $A$, where $R_i \in V_n$. Then $L(\{R_1, R_2, \ldots, R_m\})$ is a subspace of the linear space $F^n$, called the row space of $A$, and is denoted by $R(A)$. The dimension of the row space $R(A)$ is called the row rank of $A$.
(i) $R(A^T) = C(A)$.
(ii) The matrices $A$ and $B$ are row equivalent, written $A \sim B$, if $B$ can be obtained from $A$ by a sequence of elementary row operations.

(iii) Row equivalent matrices have the same row space.
(iv) Every matrix $A$ is row equivalent to a unique matrix in row canonical form.

Ex 4.7.1 Find the row space and row rank of the matrix $A = \begin{pmatrix} 6 & 7 & 2 & 1 \\ 1 & 2 & 1 & 4 \\ 2 & 4 & 2 & 8 \end{pmatrix}$.
Solution: Here the row vectors are $R_1 = (6, 7, 2, 1)$, $R_2 = (1, 2, 1, 4)$ and $R_3 = (2, 4, 2, 8)$. Now the row space of $A$ is the linear span of the row vectors $\{R_1, R_2, R_3\}$. Hence the row space is
$$R(A) = \{a_1R_1 + a_2R_2 + a_3R_3;\ a_1, a_2, a_3 \in \Re\},$$
where
$$a_1R_1 + a_2R_2 + a_3R_3 = a_1(6, 7, 2, 1) + a_2(1, 2, 1, 4) + a_3(2, 4, 2, 8)$$
$$= (6a_1 + a_2 + 2a_3, 7a_1 + 2a_2 + 4a_3, 2a_1 + a_2 + 2a_3, a_1 + 4a_2 + 8a_3).$$
Now, $\{R_1, R_2, R_3\}$ is linearly dependent, as $R_3 = 2R_2$, but $\{R_1, R_2\}$ is linearly independent. Hence $\{R_1, R_2\}$ is a basis of the row space $R(A)$, and so $\dim R(A) = 2$. Consequently, the row rank of $A$ is 2.
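The row rank found by inspection agrees with the mechanical computation: row reduce and count the non zero rows. A sketch with exact rational arithmetic (the helper names are ours):

```python
from fractions import Fraction

def rref(rows):
    """Reduce a matrix to row canonical (reduced row echelon) form, exactly."""
    m = [[Fraction(x) for x in r] for r in rows]
    lead = 0
    for col in range(len(m[0])):
        piv = next((r for r in range(lead, len(m)) if m[r][col] != 0), None)
        if piv is None:
            continue
        m[lead], m[piv] = m[piv], m[lead]
        m[lead] = [x / m[lead][col] for x in m[lead]]
        for r in range(len(m)):
            if r != lead and m[r][col] != 0:
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[lead])]
        lead += 1
    return m

A = [(6, 7, 2, 1), (1, 2, 1, 4), (2, 4, 2, 8)]
row_rank = sum(1 for r in rref(A) if any(x != 0 for x in r))
print(row_rank)   # 2, since R3 = 2 R2
```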
Ex 4.7.2 Determine which of the following matrices have the same row space:
$$A = \begin{pmatrix} 1 & 2 & -1 \\ 3 & 4 & 5 \end{pmatrix},\quad B = \begin{pmatrix} 1 & -1 & 2 \\ 2 & 3 & -1 \end{pmatrix},\quad C = \begin{pmatrix} 1 & 1 & 3 \\ 2 & 1 & 10 \\ 3 & 5 & 1 \end{pmatrix}.$$
Solution: We row reduce each matrix to row canonical form:
$$A = \begin{pmatrix} 1 & 2 & -1 \\ 3 & 4 & 5 \end{pmatrix} \sim \begin{pmatrix} 1 & 2 & -1 \\ 0 & -2 & 8 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 7 \\ 0 & 1 & -4 \end{pmatrix}.$$
$$B = \begin{pmatrix} 1 & -1 & 2 \\ 2 & 3 & -1 \end{pmatrix} \sim \begin{pmatrix} 1 & -1 & 2 \\ 0 & 5 & -5 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \end{pmatrix}.$$
$$C = \begin{pmatrix} 1 & 1 & 3 \\ 2 & 1 & 10 \\ 3 & 5 & 1 \end{pmatrix} \sim \begin{pmatrix} 1 & 1 & 3 \\ 0 & -1 & 4 \\ 0 & 2 & -8 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & 7 \\ 0 & 1 & -4 \\ 0 & 0 & 0 \end{pmatrix}.$$
Since the non zero rows of the reduced form of $A$ and of the reduced form of $C$ are the same, $A$ and $C$ have the same row space. On the other hand, the non zero rows of the reduced form of $B$ are not the same as the others, and so $B$ has a different row space.
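The comparison in Ex 4.7.2 is exactly the statement "equal row canonical forms means equal row spaces". The check below row reduces each matrix and compares the non zero rows (the helper name is ours; the matrix entries are as reconstructed above):

```python
from fractions import Fraction

def canonical_rows(rows):
    """Non zero rows of the row canonical form; equal lists of rows
    mean equal row spaces."""
    m = [[Fraction(x) for x in r] for r in rows]
    lead = 0
    for col in range(len(m[0])):
        piv = next((r for r in range(lead, len(m)) if m[r][col] != 0), None)
        if piv is None:
            continue
        m[lead], m[piv] = m[piv], m[lead]
        m[lead] = [x / m[lead][col] for x in m[lead]]
        for r in range(len(m)):
            if r != lead and m[r][col] != 0:
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[lead])]
        lead += 1
    return [r for r in m if any(x != 0 for x in r)]

A = [(1, 2, -1), (3, 4, 5)]
B = [(1, -1, 2), (2, 3, -1)]
C = [(1, 1, 3), (2, 1, 10), (3, 5, 1)]

print(canonical_rows(A) == canonical_rows(C))   # True: same row space
print(canonical_rows(A) == canonical_rows(B))   # False: different row space
```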
Ex 4.7.3 Let $\alpha_1 = (1, 1, -1)$, $\alpha_2 = (2, 3, -1)$, $\alpha_3 = (3, 1, -5)$ and $\beta_1 = (1, -1, -3)$, $\beta_2 = (3, -2, -8)$, $\beta_3 = (2, 1, -3)$. Show that the subspace of $\Re^3$ generated by the $\alpha_i$ is the same as the subspace generated by the $\beta_i$.
Solution: Let us consider two matrices $A$ and $B$, where the rows of $A$ are the $\alpha_i$ and the rows of $B$ are the $\beta_i$. We row reduce each matrix to row canonical form:
$$A = \begin{pmatrix} 1 & 1 & -1 \\ 2 & 3 & -1 \\ 3 & 1 & -5 \end{pmatrix} \sim \begin{pmatrix} 1 & 1 & -1 \\ 0 & 1 & 1 \\ 0 & -2 & -2 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & -2 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}.$$
$$B = \begin{pmatrix} 1 & -1 & -3 \\ 3 & -2 & -8 \\ 2 & 1 & -3 \end{pmatrix} \sim \begin{pmatrix} 1 & -1 & -3 \\ 0 & 1 & 1 \\ 0 & 3 & 3 \end{pmatrix} \sim \begin{pmatrix} 1 & 0 & -2 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{pmatrix}.$$
Since the non zero rows of the reduced form of $A$ and of the reduced form of $B$ are the same, $A$ and $B$ have the same row space. Hence the subspace of $\Re^3$ generated by the $\alpha_i$ is the same as the subspace generated by the $\beta_i$.

4.7.2 Column Space of a Matrix

Let $A = [a_{ij}]_{m \times n}$ be an arbitrary $m \times n$ matrix over the field $F$, i.e., $a_{ij} \in F$. Let $C_1, C_2, \ldots, C_n$ be the $n$ column vectors of $A$, where $C_i \in V_m$. Then $L(\{C_1, C_2, \ldots, C_n\})$ is a subspace of the linear space $F^m$, called the column space of $A$, and is denoted by $C(A)$. The dimension of the column space $C(A)$ is called the column rank of $A$.

Ex 4.7.4 Find the column space and column rank of the matrix $A = \begin{pmatrix} 6 & 7 & 2 & 1 \\ 1 & 2 & 1 & 4 \\ 2 & 4 & 2 & 8 \end{pmatrix}$.
Solution: Here the column vectors are $C_1 = (6, 1, 2)$, $C_2 = (7, 2, 4)$, $C_3 = (2, 1, 2)$ and $C_4 = (1, 4, 8)$. Now the column space of $A$ is the linear span of the column vectors $\{C_1, C_2, C_3, C_4\}$. Hence the column space of $A$ is
$$C(A) = \{b_1C_1 + b_2C_2 + b_3C_3 + b_4C_4;\ b_1, b_2, b_3, b_4 \in \Re\},$$
where
$$b_1C_1 + b_2C_2 + b_3C_3 + b_4C_4 = b_1(6, 1, 2) + b_2(7, 2, 4) + b_3(2, 1, 2) + b_4(1, 4, 8)$$
$$= (6b_1 + 7b_2 + 2b_3 + b_4, b_1 + 2b_2 + b_3 + 4b_4, 2b_1 + 4b_2 + 2b_3 + 8b_4).$$
Now, $\{C_1, C_2, C_3, C_4\}$ is linearly dependent, but $\{C_1, C_2\}$ is linearly independent. Hence $\{C_1, C_2\}$ is a basis of the column space $C(A)$, and so $\dim C(A) = 2$. Consequently, the column rank of $A$ is 2.
Theorem 4.7.1 Let $A = [a_{ij}]_{m \times n}$ be a matrix over the field $F$ and $P = [p_{ij}]_{m \times m}$ be a square matrix over the same field. Then,
(i) the row space of $PA$ is a subspace of the row space of $A$;
(ii) the row space of $PA$ is the same as the row space of $A$ if $P$ is non singular.
Proof: (i) Let $R_1, R_2, \ldots, R_m$ be the row vectors of $A$ and $\rho_1, \rho_2, \ldots, \rho_m$ be the row vectors of $PA$. Then
$$\rho_i = p_{i1}R_1 + p_{i2}R_2 + \ldots + p_{im}R_m;\quad i = 1, 2, \ldots, m.$$
Therefore, each $\rho_i$ is a linear combination of the vectors $R_1, R_2, \ldots, R_m$. Hence
$$L(\{\rho_1, \rho_2, \ldots, \rho_m\}) \subseteq L(\{R_1, R_2, \ldots, R_m\}),$$
i.e., the row space of $PA$ is a subspace of the row space of $A$.
(ii) Let $PA = B$. Since $P$ is non singular, $P^{-1}$ exists and $A = P^{-1}B$. Hence, by (i), the row space of $P^{-1}B$ is a subspace of the row space of $B$, i.e., $R(A) \subseteq R(PA)$. Again $R(PA) \subseteq R(A)$ as in (i), and so $R(A) = R(PA)$.
Corollary: If $B$ is row equivalent to $A$, then there exists a non singular square matrix $P$ of order $m$, a product of elementary matrices, such that $B = PA$. Hence
$$R(A) = R(PA) = R(B).$$
Therefore, row equivalent matrices have the same row space.
Corollary: It is shown that
$$R(A) = R(PA) = R(B) \Rightarrow \dim R(A) = \dim R(B) \Rightarrow \text{row rank of } A = \text{row rank of } B.$$
Hence, pre-multiplication by a non singular matrix does not alter the row rank.

Result 4.7.1 The elementary row transformations on $A$, (i) $R_i \leftrightarrow R_j$, or (ii) $R_i \to kR_i;\ k \neq 0$, or (iii) $R_i \to R_i + kR_j$, do not alter the row space or row rank of the matrix.
Theorem 4.7.2 The non zero row vectors of a row reduced echelon form R of the matrix
A = [aij ]mn form a basis of the row space of A.
Proof: Let A = [aij ]mn and R1 , R2 , . . . , Rr be the non zero row vectors of the row reduced
echelon matrix R. Other m r row vectors of R are null vectors . Hence the row space is
generated by
{R1 , R2 , . . . , Rr , }
and the row space R = L({R1 , R2 , . . . , Rr , }). Since the generator {R1 , R2 , . . . , Rr , } contains a null vector , the vectors of the set is linearly dependent. Using deletion theorem,
the null vector can be deleted from the generating set. So the new generating set is
{R1 , R2 , . . . , Rr }, which contains non null vectors. Now we shall show that {R1 , R2 , . . . , Rr }
is linearly independent. Let,
Ri = (ai1 , ai2 , . . . , ain ).
Since R is a row reduced echelon matrix, there are positive integers k1 , k2 , . . . kr satisfying
the following conditions:
(i) the leading 1 of Ri occurs in column ki
(ii) k1 < k2 < . . . < kr
(iii) aikj = ij
(iv) aij = 0; if j < ki .
Let us consider the relation c1 R1 + c2 R2 + . . . + cr Rr = , where ci s are scalars. Then,
c1 R1 + c2 R2 + . . . + cr Rr =
c1 (a11 , a12 , . . . , a1n ) + . . . + cr (ar1 , ar2 , . . . , arn ) = (0, 0, . . . , 0)
(c1 a11 + . . . + cr ar1 , c1 a12 + . . . + cr ar2 , . . . , c1 a1n + . . . + cr arn ) = (0, 0, . . . , 0).
Here,
a1k1 = 1 a2k1 = 0 . . . ark1 = 0
a1k2 = 0 a2k2 = 0 . . . ark2 = 0
..
..
..
..
.
.
.
.
a1kr = 0 a2kr = 0 . . . arkr = 1
Equating the k1-th, k2-th, . . . , kr-th components, we have
c1 = c2 = . . . = cr = 0.
Hence {R1, R2, . . . , Rr} is linearly independent and consequently it is a basis of the row space of R. Since R is row equivalent to A, the row space of A is the same as that of R and therefore {R1, R2, . . . , Rr} is a basis of the row space of A. Hence every matrix is row equivalent to a unique row reduced echelon matrix, called its row canonical form.
Corollary: The row rank of a row reduced echelon matrix R is the number of non-zero rows of R. Hence dim R(R) = dim R(A) = row rank of A.
Corollary: The determinant rank of A is the order of the largest square submatrix of A whose determinant is non-zero. Hence
dim R(A) = row rank of A = determinant rank of A.
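The procedure in the theorem above can be carried out mechanically: reduce the matrix by elementary row operations and keep the non-zero rows. A minimal pure-Python sketch over the rationals (the function names are illustrative, not from the text):

```python
from fractions import Fraction

def row_echelon(rows):
    """Reduce a list of rows to reduced row echelon form by elementary row operations."""
    m = [[Fraction(x) for x in r] for r in rows]
    pivot_row = 0
    for col in range(len(m[0])):
        # find a row (at or below pivot_row) with a nonzero entry in this column
        pr = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if pr is None:
            continue
        m[pivot_row], m[pr] = m[pr], m[pivot_row]      # Ri <-> Rj
        piv = m[pivot_row][col]
        m[pivot_row] = [x / piv for x in m[pivot_row]]  # Ri -> (1/piv) Ri
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                k = m[r][col]                           # Ri -> Ri - k Rj
                m[r] = [a - k * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m

def row_space_basis(rows):
    """Non-zero rows of the echelon form form a basis of the row space (Theorem 4.7.2)."""
    return [r for r in row_echelon(rows) if any(x != 0 for x in r)]
```

For example, `row_space_basis([[1, 2, 3], [2, 4, 6], [1, 1, 1]])` returns two rows, confirming that the row space has dimension 2.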

Isomorphic

283

Theorem 4.7.3 For any m × n matrix A, the row rank and the column rank are equal.
Proof: Let A = [aij]m×n be a matrix, where aij ∈ F. Also, let R1, R2, . . . , Rm and C1, C2, . . . , Cn be the row and column vectors of A respectively. Let the row rank of A be r and let a basis of the row space of A be {β1, β2, . . . , βr}, where
βi = (bi1, bi2, . . . , bin)
and bij = akj for some k. Since {β1, β2, . . . , βr} is a basis, by the property of a basis, for some suitable scalars cij ∈ F,
R1 = c11β1 + c12β2 + . . . + c1rβr
R2 = c21β1 + c22β2 + . . . + c2rβr
. . . (i)
Rm = cm1β1 + cm2β2 + . . . + cmrβr
The j-th component of Ri is aij, so considering the j-th components of (i), we have
a1j = c11b1j + c12b2j + . . . + c1rbrj
a2j = c21b1j + c22b2j + . . . + c2rbrj
. . .
amj = cm1b1j + cm2b2j + . . . + cmrbrj
Hence Cj = b1jγ1 + b2jγ2 + . . . + brjγr, for j = 1, 2, . . . , n, where
γi = (c1i, c2i, . . . , cmi)T; i = 1, 2, . . . , r.
This shows that every column vector of A belongs to the linear span of the r vectors γ1, γ2, . . . , γr, and so the column space of A has dimension at most r. Hence
dimension of the column space ≤ r ⟹ column rank of A ≤ r
⟹ column rank of A ≤ row rank of A. (ii)
Also, we know,
row rank of A = column rank of AT
⟹ column rank of AT ≤ row rank of AT
⟹ row rank of A ≤ column rank of A. (iii)
Combining (ii) and (iii), we have row rank of A = column rank of A. Also, if A and B are two matrices of the same type over the same field F, then
rank of (A + B) ≤ rank of A + rank of B.
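The equality of row rank and column rank can be spot-checked numerically: compute the row rank of a matrix and of its transpose (whose rows are the columns of the original) by Gaussian elimination. A small illustrative sketch over the rationals (the `rank` helper is illustrative, not from the text):

```python
from fractions import Fraction

def rank(mat):
    """Row rank via Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in mat]
    rk = 0
    for col in range(len(m[0])):
        pr = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pr is None:
            continue
        m[rk], m[pr] = m[pr], m[rk]
        for r in range(rk + 1, len(m)):
            f = m[r][col] / m[rk][col]
            m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

A = [[1, 2, 3, 1], [1, 3, 5, -2], [3, 8, 13, -3]]
At = [list(col) for col in zip(*A)]   # transpose: columns of A become rows
assert rank(A) == rank(At) == 2       # row rank equals column rank
```

Here the rows of `At` are the columns of `A`, so `rank(At)` is the column rank of `A`; both come out equal, as the theorem asserts.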

4.8 Isomorphic

We see that, to each vector α ∈ V, there corresponds, relative to a given basis B = {α1, α2, . . . , αn}, an n-tuple [α]B in Fn. On the other hand, if (c1, c2, . . . , cn) ∈ Fn, then
c1α1 + c2α2 + . . . + cnαn
is a vector in V. Thus, the basis B determines a one-to-one correspondence between the vectors in V and the n-tuples in Fn. Also, if α = c1α1 + . . . + cnαn corresponds to (c1, c2, . . . , cn) and β = d1α1 + . . . + dnαn corresponds to (d1, d2, . . . , dn), then
α + β = (c1 + d1)α1 + . . . + (cn + dn)αn corresponds to (c1, c2, . . . , cn) + (d1, d2, . . . , dn)
and for any scalar m ∈ F,
mα = (mc1)α1 + . . . + (mcn)αn corresponds to m(c1, c2, . . . , cn),
i.e., [α + β]B = [α]B + [β]B and [mα]B = m[α]B.
Thus the one-to-one correspondence V → Fn preserves the vector space operations of vector addition and scalar multiplication. Such a one-to-one correspondence V → Fn is called an isomorphism, i.e., V ≅ Fn.
Ex 4.8.1 Test whether the following matrices in V = M2×3 are linearly independent or not:
A = [1 2 3; 4 0 1], B = [1 3 4; 6 5 4], C = [3 8 11; 16 10 9].
Solution: The coordinate vectors of the matrices in the usual basis of M2×3 are
[A] = (1, 2, 3, 4, 0, 1); [B] = (1, 3, 4, 6, 5, 4); [C] = (3, 8, 11, 16, 10, 9).
Form the matrix M whose rows are the above coordinate vectors and reduce M to an echelon form:
[1 2 3 4 0 1; 1 3 4 6 5 4; 3 8 11 16 10 9] → [1 2 3 4 0 1; 0 1 1 2 5 3; 0 2 2 4 10 6] → [1 2 3 4 0 1; 0 1 1 2 5 3; 0 0 0 0 0 0].
Since the echelon matrix has only two non-zero rows, the coordinate vectors [A], [B] and [C] span a subspace of dimension 2 and so are linearly dependent. Accordingly, the original matrices A, B, C are linearly dependent.
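The dependence found in the echelon computation can also be exhibited explicitly: the coordinate vectors satisfy C = A + 2B componentwise, a relation recoverable from the echelon steps. A quick illustrative check (not part of the original text):

```python
# coordinate vectors of A, B, C in the usual basis of M2x3
A = [1, 2, 3, 4, 0, 1]
B = [1, 3, 4, 6, 5, 4]
C = [3, 8, 11, 16, 10, 9]

# the explicit dependence relation C = A + 2B holds entry by entry,
# so {A, B, C} is linearly dependent
assert all(c == a + 2 * b for a, b, c in zip(A, B, C))
```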

Exercise 4
Section-A
[Multiple Choice Questions]
1. Let V1 and V2 be subspaces of a vector space V. Which of the following is necessarily a subspace of V?
[NET(June)12]
(a) V1 ∩ V2 (b) V1 ∪ V2 (c) V1 + V2 = {x + y : x ∈ V1, y ∈ V2} (d) V1 \ V2 = {x : x ∈ V1, x ∉ V2}
2. Let n be a positive integer and let Hn be the space of all n × n matrices A = (aij) with entries in R satisfying aij = ars whenever i + j = r + s (i, j, r, s = 1, 2, . . . , n). Then the dimension of Hn, as a vector space over R, is
[NET(Dec)11]
(a) n²
(b) n² − n + 1
(c) 2n + 1
(d) 2n − 1.
3. The value of k for which the vectors (1, 5) and (2, k) are linearly dependent is
(a) k = 1
(b) k = 5
(c) k = 2
(d) k = 10
4. The value of k for which the vectors (1, 0, 0), (0, 2, 0) and (0, 0, k) are linearly dependent is
(a) k = 0
(b) k = 1
(c) k = 2
(d) k = −1


5. The values of x for which the vectors (x, 1, 0), (0, x, 1) and (1, 1, 1) are linearly dependent are
(a) 0, 1
(b) 0, 2
(c) 0, 3
(d) 1, 2
6. The value of k such that the vectors (1, k, 2), (0, 1, 2) and (1, 1, 1) are linearly independent is
(a) k = 3/2
(b) k ≠ 3/2
(c) k = −3/2
(d) k ≠ −3/2
7. If {(1, 2, 1), (2, 0, 1), (1, 1, k)} is a basis of R3 then the value of k is
(a) k = 2
(b) k ≠ 2
(c) k = 0
(d) k ≠ 1
8. If {(1, 0, 0), (0, 1, 0), (0, 0, 1)} is a basis of vector space V then its dimension is
(a) 0
(b) 1
(c) 2
(d) 3
9. If W = {(x, y, z) ∈ R3 : x + y = 2z} is a subspace of R3, then one of its bases is
(a) {(2, 0, 1), (0, 2, 1)}
(b) {(1, 1, 0), (0, 1, 1)}
(c) {(2, 1, 0), (1, 0, 1)}
(d) none of these
10. If W = {(x, y, z) ∈ R3 : x + 2y − 3z = 0} is a subspace of R3, then one of its bases is
(a) {(1, 1, 1), (1, 0, 1)}
(b) {(−2, 1, 0), (3, 0, 1)}
(c) {(1, 0, 0), (0, 1, 0)}
(d) {(2, 1, 0), (1, 0, 0)}
11. If S = {(x, y, z) ∈ R3 : x + y + z = 0} is a subspace of R3, then the dimension of S is
(a) 0
(b) 1
(c) 2
(d) 3
12. If S = {(x, y, z) ∈ R3 : 2x − y + 4z = 0, x − y + z = 0} is a subspace of R3, then the
dimension of S is
(a) 4
(b) 3
(c) 2
(d) 1
13. The set {(1, 0, 0), (0, 1, 0), (0, 0, 1)} is a basis of
(a) R
(b) R2
(c) R3
(d) R4
14. The set {(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)} is a basis of
(a) R
(b) R2
(c) R3
(d) R4
15. If α = (3, 7), β = (2, 4) and γ = (1, 1), then the linear combination of α in terms of β, γ is
(a) α = β + 5γ
(b) α = −β + 5γ
(c) α = (1/3)β + (5/3)γ
(d) α = (1/3)β + γ
16. If U and W are two subspaces of V and dim U = 2, dim W = 3, dim(U ∩ W) = 1,
then dim(U + W) is
(a) 1
(b) 3
(c) 6
(d) 4
17. Let U = {(x, y, z) ∈ R3 : x + y + z = 0}, V = {(x, y, z) ∈ R3 : x − y + 2z = 0},
W = {(x, y, z) ∈ R3 : 2x − y + z = 0}. Then S = {(1, 2, 0), (0, 1, 1)} is a basis of
(a) U
(b) W
(c) V
(d) none of these
18. If V is a vector space of all polynomials of degree n, then dimension of V is
(a) 0
(b) 1
(c) n
(d) infinite
19. Let T : Rn → Rn be a linear transformation, where n ≥ 2. For k ≤ n, let E =
{v1, v2, . . . , vk} ⊆ Rn and F = {T v1, T v2, . . . , T vk}. Then
[IIT-JAM11]
(a) If E is linearly independent, then F is linearly independent (b) If F is linearly
independent, then E is linearly independent (c) If E is linearly independent, then F
is linearly dependent (d) If F is linearly independent, then E is linearly dependent


20. The dimension of the vector space of all symmetric matrices of order n × n (n ≥ 2) with real entries and trace equal to zero is
[NET(June)11]
(a) {(n² − n)/2} − 1 (b) {(n² + n)/2} − 1 (c) {(n² − 2n)/2} − 1 (d) {(n² + 2n)/2} − 1
21. The dimension of the vector space of all symmetric matrices A = (aij) of order n × n (n ≥ 2) with real entries, a11 = 0 and trace equal to zero is
[NET(June)12]
(a) {(n² + n − 4)/2} (b) {(n² − n + 4)/2} (c) {(n² + n − 3)/2} (d) {(n² − n + 3)/2}.
22. Let C be an n × n real matrix. Let W be the vector space spanned by {I, C, C², . . . , C^2n}. The dimension of the vector space W is
[NET(June)12]
(a) 2n
(b) at most n
(c) n²
(d) at most 2n.

23. Let M be the vector space of all 3 × 3 real matrices and let
A = [2 1 0; 0 2 0; 0 0 3].
Which of the following are subspaces of M?
[NET(June)11]
(a) {X ∈ M : XA = AX}
(b) {X ∈ M : X + A = A + X}
(c) {X ∈ M : trace(AX) = 0}
(d) {X ∈ M : det(AX) = 0}
24. Let W = {p(B) : p is a polynomial with real coefficients}, where
B = [0 1 0; 0 0 1; 1 0 0].
The dimension d of the vector space W satisfies
[NET(June)11]
(a) 4 ≤ d ≤ 6 (b) 6 ≤ d ≤ 9 (c) 3 ≤ d ≤ 8 (d) 3 ≤ d ≤ 4
Section-B
[Objective Questions]
1. Show that the straight lines and the planes in R3 through the origin (0, 0, 0) are proper subspaces of the linear space R3.
2. Show that the dimension of the vector space V = {(x1, x2, . . . , xn) ∈ Rn : x1 + x2 + . . . + xn = 0} is n − 1.
3. Prove that every finitely generated vector space has a finite basis.
4. Let V1 = {(x, y, z) ∈ R3 : x = y = z} and V2 = {(0, y, z) ∈ R3}; find V1 ∩ V2.
5. Let V = {(x, y, z) ∈ R3 : 2x + 3y + 5z = 5}. Is V a vector space over R? Justify your answer.
6. Show that the following sets of vectors are linearly independent:
(a) {(1, 0, 1), (0, 1, 1), (1, 1, 1)} in R3
(b) {(1, 3, 0), (0, 1, 1)} in R3
(c) {(1, 1, 1, 0), (1, 1, 0, 1), (1, 0, 1, 1), (0, 1, 1, 1)} in R4
(d) {(2, 6, 1, 8), (0, 10, 4, 3), (0, 0, 1, 4), (0, 0, 0, 8)} in R4
(e) {(1, 2, 2), (2, 1, 2), (2, 2, 1)} in R3.
[WBUT 2004]
7. Test whether the following sets of vectors in Euclidean 3-space are linearly dependent or not:
(a) (1, 2, 1), (3, 1, 2) and (5, 3, 0)
[BH98]
(b) (1, 0, 1), (2, 1, 3) and (0, 1, 5)
[BH99]
(c) (1, 2, 3), (2, 1, 3) and (3, 0, 1)
[BH96]


8. Show that the following sets of vectors are linearly independent:
(a) {(1, 2, 0), (3, 1, 1), (4, 1, 1)} in R3
(b) {(1, 0, 1), (2, 1, 3), (1, 0, 0), (1, 0, 1)} in R3
(c) {(1, 2, 3, 4), (3, 1, 2, 1), (1, 5, 8, 7)} in R4.
9. Determine k so that the set S is linearly dependent:
(a) S = {(1, 3, 1), (2, k, 0), (0, 4, 1)} in R3
[WBUT 2007]
(b) S = {(1, 2, 1), (k, 1, 1), (0, 1, 1)} in R3
[WBUT 2005]
(c) S = {(1, 2, 1), (k, 3, 1), (2, k, 0)} in R3
[VH 97]
(d) {(k, 1, 1), (1, k, 1), (1, 1, k)}
(e) S = {(0, 1, k), (1, k, 1), (k, 1, 0)} in R3(R).
[SET 10]

10. Find t, for which the following vectors are linearly independent:
(a) (cos t, sin t), (− sin t, cos t)
(b) (cos t, sin t), (sin t, cos t)
(c) (e^t, e^−t), (e^−t, e^t).






11. Let A = [1 1; 1 1], B = [1 1; 0 0], C = [1 0; 0 1] be three matrices in M2(R). Are they linearly independent over R? Justify your answer.
[BH06]
12. Examine whether the set S is a basis:
(a) S = {(1, 1, 0), (0, 1, 1), (1, 0, 1)} for R3
(b) S = {(1, 1, 0), (1, 0, 0), (1, 1, 1)} for R3
(c) S = {(1, 2, 1, 2), (2, 3, 0, 1), (1, 2, 1, 4), (1, 3, 1, 0)} for V4(R)
[VH99, 01]
(d) S = {(2, 1, 0, 1), (1, 1, 2, 0), (3, 0, 2, 1), (0, 1, 2, 3)} for R4
(e) S = {(1, 2, 1), (2, 1, 0), (1, 1, 2)} of R3
[VH96]
(f) S = {(1, 2, 1, 2), (2, 3, 0, 1), (1, 2, 1, 4), (1, 3, 1, 0)} of V4(R).

13. Examine whether in <3 , the vector (1,0,7) is in the span of S = {(0, 1, 2), (1, 2, 3)}.
Section-C
[Long Answer Questions]
1. (a) Let P[x] be the set of all polynomials in x of degree ≤ n over the real field R, i.e.,
P[x] = {f(x) : f(x) = a0 + a1x + a2x² + . . . + anxⁿ, ai ∈ R}.
Show that P[x] is a vector space with ordinary addition of polynomials and the multiplication of each coefficient of the polynomial by a member of R as the scalar multiplication composition.
(b) Let V be the set of all m × n matrices whose elements belong to the field F. Show that V is a vector space over F with respect to the operations of matrix addition and scalar multiplication.
(c) Show that the set C of complex numbers is a vector space over itself with the usual compositions.
(d) Show that the set of all odd functions from R to itself is a vector space with respect to addition and scalar multiplication of functions.
(e) Let V be the set of all ordered pairs (x, y) of real numbers and let R be the field of real numbers. Define
(x1, y1) + (x2, y2) = (3y1 + 3y2, −x1 − x2) and c(x, y) = (3cx, cx).
Verify that V with these operations is not a vector space over R.

(f) Let V = R² = {(a1, a2) : a1, a2 ∈ R} and F = R. Define addition and scalar multiplication on R² as follows:
(a1, a2) + (b1, b2) = (a1 + b1 + 1, a2 + b2 + 1) and
c(a1, a2) = (ca1 + c − 1, ca2 + c − 1).
Show that V is a vector space over R.
(g) (Vector space of n-tuples) Let n be a fixed integer (≥ 1). Then (Rn, +, ·) is a vector space over R, where
(x1, x2, . . . , xn) + (y1, y2, . . . , yn) = (x1 + y1, x2 + y2, . . . , xn + yn)
c(x1, x2, . . . , xn) = (cx1, cx2, . . . , cxn).
(h) The set of all ordered triplets (x1, x2, x3) of real numbers such that x1/3 = x2/4 = x3/2 forms a real vector space, where the operations of addition and multiplication are defined as above; i.e., the set of all points on any line passing through the origin in R3 forms a vector space.
(i) V = {f(t) : f(t) = c1 cos t + c2 sin t, c1, c2 ∈ F}, that is, f is a solution of the differential equation d²x/dt² + x = 0 and c1, c2 are scalars of the field. Then V is a vector space over F.
(j) The power set V of a fixed non-empty set forms a vector space over F = {0, 1} with respect to the operations
A + B = (A − B) ∪ (B − A) = A △ B,
cA = A if c = 1 and cA = ∅, the null set, if c = 0.
(k) The set of all real-valued random variables on a fixed sample space forms a vector
space over < under the usual operations.

(l) For the vector space F^X, show that the set {f : 0 ∈ range of f} is a generating set provided X has at least two elements, and {f : 0 ∉ range of f} is a generating set provided F has at least 3 elements.
2. (a) Show that S = {p(t) : ∫_−1^1 (2t + 3)p(t) dt = 0} is a subspace of P4 and find a generating set of S with size 3.


(b) If a vector space V is the set of real valued continuous functions over R, then show that the set W of solutions of d²y/dx² + p dy/dx + qy = 0 is a subspace of V.
(c) Let R3 be the vector space of all 3-tuples of real numbers. Show that W = {(x1, x2, 0) : x1, x2 ∈ R} is a subspace of R3.
(d) Show that the set of all n × n real diagonal matrices is a subspace of Rn×n.
(e) Show that the set of all n × n real symmetric matrices is a subspace of Rn×n.
(f) Show that the set of all n × n real skew-symmetric matrices is a subspace of Rn×n.
(g) Show that the set of all n × n real triangular matrices is a subspace of Rn×n.
3. Does the set S = {(x, y) ∈ R² : xy ≥ 0} form a vector space? Give reasons.
[Gate04]

4. Show that any field can be considered to be a vector space over a subfield of it. BH98
5. Show that the following subsets of R3 are subspaces of R3:
(i) W = {(x, 2x, 3x) : x ∈ R}
(ii) W = {(x, y, z) ∈ R3 : x + 2y + 4z = 0}
(iii) W = {(x, y, z) ∈ R3 : 2x − 2y + 5z = 0, x − y + 2z = 0}.
6. Show that the following subsets of R3 are not subspaces of R3:
(i) U = {(x, y, 3) : x, y ∈ R}
(ii) U = {(x, y, z) ∈ R3 : x + 2y − z = 3}
(iii) U = {(x, y, z) ∈ R3 : 2x − y + z = 1, x − 3y + z = 0}.
7. (a) Show that f1(t) = 1, f2(t) = t − 2, and f3(t) = (t − 2)² form a basis of P3. Express 3t² − 5t + 4 as a linear combination of f1, f2, f3.
(b) Show that α = (8, 17, 36) is a linear combination of β = (1, 0, 5), γ = (0, 3, 4) and δ = (1, 1, 1).
8. (a) Show that α = (1, 0, 0), β = (0, 1, 0), γ = (0, 0, 1) and δ = (1, 1, 1) in V3 form a linearly dependent set in R3, but any three of them are linearly independent.
(b) Show that the vectors 2x³ + x² + x + 1, x³ + 3x² + x − 2 and x³ + 2x² − x + 3 of P[x], the real vector space of all polynomials, are linearly independent.
(c) Show that the vectors (1, 2, 1), (3, 1, 5) and (3, 4, 7) are linearly dependent in R3.
[VH96]
(d) Determine the subspace of R3 spanned by the vectors (1, 2, 3), (2, 3, 4) and examine if (1, 1, 1) is in this subspace.
(e) Show that the vectors 1 + x + 2x², 3 − x + x², 2 + x, 7 + 5x + x² of P(x), the vector space of polynomials in x over R, are linearly dependent.
(f) If α, β and γ are linearly independent vectors, find the number of linearly independent vectors in the set {α − β, β − γ, γ − α}.
[BH99]
9. Let f1, f2, . . . , fn be real valued functions defined on [a, b] such that each fi has continuous derivatives up to order (n − 1). If the Wronskian W(f1, f2, . . . , fn)(x) = 0 and W(f1, f2, . . . , fn−1)(x) ≠ 0 in [a, b], show that f1, f2, . . . , fn are LD on [a, b].
[Gate97]
10. The vectors (a1, b1) and (a2, b2) are linearly independent in R². Find the rank of the subspace of R² generated by {(a1, b1), (a2, b2), (a3, b3)}.
[BH98]
11. Find the dimension of the vector space of all solutions of the set of equations x1 + x2 + x3 = 0, x1 − x3 = 0 and x2 − x3 = 0.
[BH98]
12. Show that the vectors (1, 2, 0), (3, −1, 1) and (4, 1, 1) are linearly dependent. Find two of them which are linearly independent.
[BH02]
13. Show that the vector space of all periodic functions f(t) with period T contains an infinite set of linearly independent vectors.
14. (a) Examine whether in R3, (1, 0, 7) is in the span of S = {(0, 1, 2), (1, 2, 3)}.
[BH06]
(b) Consider the vectors α1 = (1, 3, 2) and α2 = (−2, 4, 3) in R3. Show that the span of {α1, α2} is
{(c1, c2, c3) : c1 − 7c2 + 10c3 = 0}
and show that it can also be written as {(α, β, (7β − α)/10) : α, β ∈ R}.
(c) Consider the vectors α1 = (1, 0, 1, −1), α2 = (1, −1, 2, 0) and α3 = (2, 1, 1, −3) in R4. Show that the span of {α1, α2, α3} is
{(c1, c2, c3, c4) : c1 + c2 + c4 = 0, 2c1 − c3 + c4 = 0}
and show that it can also be written as {(α, β, α − β, −α − β) : α, β ∈ R}.

(d) Consider the vectors α1 = (1, 2, 1, −1), α2 = (2, 4, 1, 1), α3 = (1, 2, 2, −4) and α4 = (3, 6, 2, 0) in R4. Show that the span of {α1, α2, α3, α4} is
{(c1, c2, c3, c4) : 2c1 − c2 = 0, 2c1 − 3c3 − c4 = 0}
and show that it can also be written as {(α, 2α, β, 2α − 3β) : α, β ∈ R}.
(e) In R², let α = (3, 1), β = (2, 1). Show that L({α, β}) = R².
(f) Show that the set S = {(1, 2, 3, 0), (2, 1, 0, 3), (1, 1, 1, 1), (2, 3, 4, 1)} is linearly dependent in R4. Find a linearly independent subset S1 of S such that L(S) = L(S1).

15. (a) For what values of a do the vectors (1 + a, 1, 1), (1, 1 + a, 1) and (1, 1, 1 + a) form a basis of V3(R)?
(b) Show that the set {(1, 0, 0), (1, 1, 0), (1, 1, 1)} forms a basis of V3(R). Find the co-ordinates of the vector (a, b, c) with respect to the above basis.
[VH99]
(c) Show that {α + β, β + γ, γ + α} and {α, α + β, α + β + γ} are bases of R3, if {α, β, γ} is a basis of R3.
16. (a) Find a basis for R3 containing the set of vectors {(1, 2, 1), (2, 4, 2)}.
[BH99]
(b) Find a basis for R3 containing the set of vectors {(1, 2, 0), (1, 3, 1)}.
[VH04, 00]
(c) Find a basis for R3 containing the vector (1, 2, 0).
[BH00]

17. Show that if {α1, α2, α3} is a basis of a vector space V of dimension 3, then {α1 + α2 + α3, α2 + α3, α3} is also a basis of V.
[BH03, 05]
18. Show that the sets S = {α, β} and T = {α, β, α + β} of real vectors generate the same vector space.
[BH01]
BH01
19. (a) Suppose that S = {(x, y, z) ∈ R3 : z = 2x − y}. Show that S is a subspace of the vector space R3 over the field of reals. Find a basis of S containing the vector (1, 2, 0).
[VH04]
(b) Suppose that S = {(x, y, z) ∈ R3 : 2x + y − z = 0}. Show that S is a subspace of the vector space R3 over the field of reals. Find a basis and the dimension of S.
[NBH05]
(c) Show that S = {(x, y, z) ∈ R3 : x + y − z = 0 and 2x − y + z = 0} is a subspace of the vector space R3. Find the dimension of S.
[VH02]
(d) Suppose that S = {(x, y, z) ∈ R3 : x + y + z = 0}. Show that S is a subspace of the vector space R3 over the field of reals. Find a basis of S.
[VH00, 97]
(e) Show that S = {a + ib, c + id} is a basis of C(R) if ad − bc ≠ 0.
(f) Let V2 = {[a b; b c] : a, b, c ∈ Q}. Show that V2 is a subspace of the vector space of 2 × 2 real matrices with usual matrix addition and scalar multiplication. Write a basis of V2 and find the dimension of V2.
20. (a) Find a basis for the subspace of R3 defined by {(x1, x2, x3) : x1 + x2 + x3 = 0}.
[JU(M.Sc.)06]
(b) Find the conditions on a, b, c so that (a, b, c) ∈ R3 belongs to the space generated by α = (2, 1, 0), β = (1, −1, 2) and γ = (0, 3, −4).
(c) Determine a basis of the subspace spanned by the vectors α1 = (2, 3, 1), α2 = (3, 0, 1), α3 = (0, 2, 1) and α4 = (1, 1, 1).


(d) Let α1 = (1, 2, 0, 3, 0), α2 = (1, 2, 1, 1, 0), α3 = (0, 0, 1, 4, 0), α4 = (2, 4, 1, 10, 1) and α5 = (0, 0, 0, 0, 1). Find the dimension of the linear span of {α1, α2, . . . , α5}.
[Gate04]
21. (a) Show that the yz plane W = {(0, b, c)} in R3 is generated by (0, 1, 1) and (0, 2, 1).
(b) Show that the complex numbers w = 2 + 3i and z = 1 − 2i generate the complex field C as a vector space over the real field R.
22. Find a basis of the span of the vectors (1, 0, 1, 2, 1), (2, 1, 1, 3, 0), (0, 1, 3, 1, 2), (3, 1, 0, 1, 1), (3, 1, 0, 3, 0) in R5.
23. Using the replacement theorem, determine a basis of R4 containing the vectors (1, 2, 1, 3), (2, 1, 1, 0) and (3, 2, 1, 1).
[CH10]

24. Consider the subspaces of R4:
S = {(x, y, z, w) ∈ R4 : 2x + y + z + w = 0}
T = {(x, y, z, w) ∈ R4 : x + 2y + z + w = 0}
Determine a basis of the subspace S ∩ T and hence determine dim(S ∩ T).
[CH10]
25. Two subspaces of R3 are U = {(x, y, z) : x + y + z = 0} and W = {(x, y, z) : x + 2y − z = 0}. Find dim U, dim W, dim(U ∩ W), dim(U + W).
26. Let W be an m dimensional subspace of an n dimensional vector space V, where m < n. Find dim(V/W).
[Gate98]
27. Extend the linearly independent subset A of V to a basis of V in each of the following cases:
(a) V = Rn and A = {(1, 1, . . . , 1)}.
(b) V = F4 with F = GF(3) and A = {(1, 2, 1, 2), (1, 1, 2, 1)}.
28. Show that α1 = 2 + 3t, α2 = 3 + 5t, α3 = 5 − 8t² + t³ and α4 = 4t − t² form a basis of P4. Find the coordinates of the vector a0 + a1t + a2t² + a3t³ with respect to this coordinate system.
29. In the real vector space R3(R), show that the co-ordinate vector of α = (3, 1, 4) with respect to the basis {(1, 1, 1), (0, 1, 1), (0, 0, 1)} is identical to α itself.
30. (a) Let S and T be subspaces of a vector space V with dim(S) = 2, dim(T) = 3 and dim(V) = 5. Find the minimum and maximum possible values of dim(S + T) and show that every (integer) value between these can be attained.
(b) Let S and T be two subspaces of R24 such that dim(S) = 19 and dim(T) = 17. Find the possible values of dim(S ∩ T).
[Gate04]
31. Determine the rank of the row space of [1 0 1; 0 1 0; 1 1 1].
32. Show that A and B have the same column space if and only if AT and BT have the same row space.
33. Find two different subspaces of dimension 1 and two different subspaces of dimension 2 contained in the subspace {(x1, x2, x3, x4) : x1 + 2x2 + 3x3 + 4x4 = 0} of R4.


34. Let f : R3 → R3 be a linear mapping. Show that there exists a one dimensional subspace V of R3 such that it is invariant under f.
[Gate96]
35. Let B = {α1, α2, α3} be the basis for C³, where α1 = (1, 0, 1), α2 = (1, 1, 1) and α3 = (2, 2, 0). Find the dual basis of B.
[BU(M.Sc)03]

Chapter 5

Linear Transformations
We have studied homomorphisms from one algebraic system to another, namely, group homomorphisms and ring homomorphisms. On parallel lines we shall study vector space homomorphisms. Since the vector space V(F) is comprised of two algebraic systems, a group (V, +) and a field (F, +, ·), there may be some confusion as to which operations are to be preserved by such a function. Generally, vector space homomorphisms are called linear mappings or linear transformations.
In this chapter, the notion of a linear transformation (or mapping or function) and its different properties are studied. Here we shall show that a linear transformation can be represented by a matrix. The kernel or null space, rank, nullity and other features of linear transformations are discussed here.

5.1 Linear Transformations

Let V and W be two vector spaces over the same field F. Then a mapping T : V → W with domain V and codomain W is called a linear transformation or linear mapping or homomorphism of V into W, if
(i) Additive property: T(α + β) = T(α) + T(β);
(ii) Homogeneous property: T(cα) = cT(α);
for all α and β in V and all scalars c in F. Thus T : V → W is linear if it preserves the two basic operations of a vector space, vector addition and scalar multiplication. Note the following facts:
(i) A linear mapping T : V → W is completely characterized by the condition (principle of superposition)
T(aα + bβ) = aT(α) + bT(β); a, b ∈ F and α, β ∈ V. (5.1)
This single condition, which replaces the above two conditions, is sometimes used as the definition. More generally, for any scalars ci ∈ F and for any vectors αi ∈ V, we get
T(c1α1 + c2α2 + . . . + cnαn) = c1T(α1) + c2T(α2) + . . . + cnT(αn).
(ii) If c = 0, then T(θV) = θW, i.e., every linear mapping takes the null vector of V into the null vector θW of W.

(iii) The term linear transformation rather than linear mapping is frequently used for linear mappings of the form T : Rn → Rm.
(iv) The mapping T : V → V defined by T(α) = α; α ∈ V, is a linear mapping. This is called the identity mapping on V and is denoted by IV.
(v) Two linear transformations T1 and T2 from V(F) to W(F) are said to be equal iff T1(α) = T2(α); α ∈ V.
(vi) The mapping T : V → W defined by T(α) = θW; α ∈ V, θW being the null vector in W, is a linear mapping. This is called the zero mapping and is denoted by 0T.
(vii) A one-to-one linear transformation of V onto W is called an isomorphism. In case there is an isomorphism of V onto W, we say that V is isomorphic to W, written V ≅ W.
(viii) Any m × n real matrix A determines a mapping TA : Fn → Fm by TA(α) = Aα (where the vectors α ∈ Fn and Aα ∈ Fm are written as columns). This matrix mapping is linear.
(ix) A transformation T is called nilpotent of index n if Tn = θ but Tn−1 ≠ θ.
The following are some important linear transformations:
(i) Projection: T : R3 → R2, defined by T(x, y, z) = (x, y).
(ii) Dilation: T : R3 → R3, defined by T(α) = rα; r > 1.
(iii) Contraction: T : R3 → R3, defined by T(α) = rα; 0 < r < 1.
(iv) Reflection: T : R2 → R2, defined by T(x, y) = (x, −y).
(v) Rotation: T : R2 → R2, defined by T(α) = [cos θ − sin θ; sin θ cos θ] α.
In geometry, rotations, reflections and projections provide us with another class of linear transformations. These transformations can be used to study rigid motion in Rn.
Ex 5.1.1 The mapping T : R2 → R defined by T(x, y) = 2x + y for all (x, y) ∈ R2 is a linear transformation.
Solution: Let c1, c2 ∈ R and α = (a1, a2), β = (b1, b2) be any two elements of R2. Then
T(c1α + c2β) = T(c1a1 + c2b1, c1a2 + c2b2)
= 2(c1a1 + c2b1) + (c1a2 + c2b2) = c1(2a1 + a2) + c2(2b1 + b2)
= c1T(a1, a2) + c2T(b1, b2) = c1T(α) + c2T(β).
Hence T is a linear transformation.
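The linearity verified above can also be spot-checked on finitely many vectors; such a check is evidence, not a proof, since linearity must hold for all vectors and scalars. A small illustrative sketch (`is_linear_on` is a hypothetical helper, not from the text):

```python
def T(v):
    """T : R^2 -> R, T(x, y) = 2x + y (Ex 5.1.1)."""
    x, y = v
    return 2 * x + y

def is_linear_on(T, samples, scalars):
    """Spot-check additivity and homogeneity of T on finitely many sample vectors."""
    add = all(T((a[0] + b[0], a[1] + b[1])) == T(a) + T(b)
              for a in samples for b in samples)
    hom = all(T((c * a[0], c * a[1])) == c * T(a)
              for c in scalars for a in samples)
    return add and hom

samples = [(1, 2), (-3, 4), (0, 0), (5, -7)]
assert is_linear_on(T, samples, [-2, 0, 1, 3])
```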
Ex 5.1.2 Let F[x] be the vector space of all polynomials in the indeterminate x over R. Prove that the mapping T : F[x] → F[x] defined by T[p(x)] = d/dx [p(x)]; p(x) ∈ F[x], is a linear mapping.
Solution: Here the mapping T : F[x] → F[x] is defined by T[p(x)] = d/dx [p(x)]; p(x) ∈ F[x]. Now, for a, b ∈ R and p1(x), p2(x) ∈ F[x], we have
T[ap1(x) + bp2(x)] = d/dx [ap1(x) + bp2(x)]; by definition
= a d/dx [p1(x)] + b d/dx [p2(x)]
= aT[p1(x)] + bT[p2(x)].
Therefore, the mapping T : F[x] → F[x] is linear. This T : F[x] → F[x] is called the derivative mapping. If p(x) is a polynomial of degree n, then
T^(n+1)[p(x)] = d^(n+1)/dx^(n+1) p(x) = 0,
i.e., T^(n+1) = θ on such polynomials. Thus T is a non-zero transformation a finite power of which is θ.
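Representing a polynomial by its coefficient list, the derivative mapping and its nilpotent behaviour on polynomials of degree ≤ n can be illustrated directly (a sketch; `deriv` is a hypothetical helper, not from the text):

```python
def deriv(p):
    """Derivative of the polynomial a0 + a1 x + a2 x^2 + ... given as [a0, a1, a2, ...]."""
    return [i * p[i] for i in range(1, len(p))] or [0]

# applying T = d/dx (n+1) times to a degree-n polynomial gives the zero polynomial
p = [1, -5, 0, 2]          # 1 - 5x + 2x^3, degree n = 3
for _ in range(4):         # n + 1 = 4 applications of T
    p = deriv(p)
assert all(c == 0 for c in p)
```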


Ex 5.1.3 Let V be the vector space of all real valued continuous functions on [a, b]. Prove that the mapping T : V → R defined by T[f] = ∫_a^b f(x) dx; f ∈ V, is a linear mapping.
Solution: Here the mapping T : V → R is defined by T[f] = ∫_a^b f(x) dx; f ∈ V. Now, for a1, a2 ∈ R and f, g ∈ V, we have
T[a1f + a2g] = ∫_a^b [a1f(x) + a2g(x)] dx; by definition
= a1 ∫_a^b f(x) dx + a2 ∫_a^b g(x) dx = a1T[f] + a2T[g].
Therefore, the mapping T : V → R is linear. This mapping T : V → R is called the integral mapping. The linear transformation T[f](x) = ∫_a^x f(t) dt is not only continuous but has a continuous first derivative.

Ex 5.1.4 Let W(F) be a subspace of a vector space V(F). Prove that the mapping T : V → V/W defined by T(α) = α + W; α ∈ V, is a linear transformation.
Solution: Let a, b ∈ F; then for α, β ∈ V, we have
T(aα + bβ) = aα + bβ + W = (aα + W) + (bβ + W)
= a(α + W) + b(β + W) = aT(α) + bT(β).
Therefore, T is a linear mapping.
Ex 5.1.5 Let the mapping T : P1 → P2 be defined by T[p(x)] = xp(x) + x². Is T a linear transformation?
Solution: Let p1(x), p2(x) ∈ P1; then
T[p1(x) + p2(x)] = x[p1(x) + p2(x)] + x²
= (xp1(x) + x²) + (xp2(x) + x²) − x²
= T[p1(x)] + T[p2(x)] − x² ≠ T[p1(x)] + T[p2(x)].
Therefore, we conclude that T is not a linear transformation.


Properties of linear transformation
Let V and W be two vector spaces over the same field F and T : V → W be a linear mapping. Let α, β ∈ V, and θV, θW be the null elements of V and W respectively. We now have:
(i) T(θV) = θW.
(ii) T(−α) = −T(α); α ∈ V.
(iii) T(α − β) = T(α) − T(β); α, β ∈ V.
(iv) T(α + α + . . . + α) (n times) = T(nα) = nT(α), n a positive integer.
(v) T(−mα) = −T(mα) = −mT(α), m a positive integer.
(vi) T((m/n)α) = (m/n)T(α), m and n integers, n ≠ 0.
Proof: These properties are derived from the definition of a linear transformation. Here T : V → W is a linear mapping. (i) For any α ∈ V, we have
T(α) = T(α + θV) = T(α) + T(θV); as T is linear
⟹ T(α) + θW = T(α) + T(θV); θW is the zero in W
⟹ T(θV) = θW; by the cancellation law in (W, +).
Thus, if T(θV) ≠ θW, then T is not a linear transformation. For example, let T : R2 → R2, defined by T(x, y) = (x + 4, y + 7), be a translation mapping. Note that T(θ) = T(0, 0) = (4, 7) ≠ (0, 0). Thus, the zero vector is not mapped into the zero vector, hence T is not linear.
(ii) Using the first result, we have
θW = T(θV) = T[α + (−α)]
= T(α) + T(−α); since T is linear
⟹ T(−α) = −T(α).
If T is the identity operator, then T(−α) = −α.
From this result, it follows that the principle of homogeneity of a linear transformation (T(cα) = cT(α)) follows from the principle of additivity when c is rational, but this is not the case if c is irrational. Again, a transformation may satisfy the property of homogeneity without satisfying the property of additivity.
Ex 5.1.6 Prove that the following mappings are not linear: (i) T : R2 → R2 defined by T(x, y) = (xy, x); (ii) T : R2 → R3 defined by T(x, y) = (x + 5, 7y, x + y); (iii) T : R3 → R2 defined by T(x, y, z) = (|x|, y + z).
Solution: (i) Let α = (1, 2) and β = (3, 4); then α + β = (4, 6). Also, by definition,
T(α) = (2, 1) and T(β) = (12, 3)
⟹ T(α) + T(β) = (14, 4)
but T(α + β) = (24, 4) ≠ T(α) + T(β).
Therefore, the mapping T : R2 → R2 defined by T(x, y) = (xy, x) is not linear.
(ii) Since T(0, 0) = (5, 0, 0) ≠ (0, 0, 0), the mapping T : R2 → R3 defined by T(x, y) = (x + 5, 7y, x + y) is not linear.
(iii) Let α = (1, 2, 3) and c = −3; then cα = (−3, −6, −9) so that, by definition, T(cα) = (3, −15). Also, we have
T(α) = (1, 5) and cT(α) = (−3, −15) ≠ (3, −15) = T(cα).
Therefore, the mapping T : R3 → R2 defined by T(x, y, z) = (|x|, y + z) is not linear.
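The failure of additivity in part (i) can be checked numerically: T(1, 2) = (2, 1) and T(3, 4) = (12, 3), while T(4, 6) = (4·6, 4) = (24, 4) ≠ (14, 4). A small illustrative sketch of that check:

```python
def T(v):
    """T : R^2 -> R^2, T(x, y) = (x*y, x) -- Ex 5.1.6(i), not linear."""
    x, y = v
    return (x * y, x)

a, b = (1, 2), (3, 4)
s = (a[0] + b[0], a[1] + b[1])          # a + b = (4, 6)
Ta, Tb = T(a), T(b)                     # (2, 1) and (12, 3)
Tsum = (Ta[0] + Tb[0], Ta[1] + Tb[1])   # T(a) + T(b) = (14, 4)
assert T(s) == (24, 4) and Tsum == (14, 4)
assert T(s) != Tsum                     # additivity fails, so T is not linear
```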

5.1.1 Kernel of Linear Mapping
Let V and W be two vector spaces over the same field F and T : V → W be a linear mapping. The kernel or null space of the linear mapping T, denoted by Ker T, is the set of elements of V such that T(α) = θW; θW being the null vector in W, i.e.,
Ker T = {α : α ∈ V and T(α) = θW}. (5.2)
For example, let T : V → V and T0 : V → W be the identity and zero mappings respectively; then N(T) = {θ} and N(T0) = V. Since θV ∈ Ker T, Ker T is never an empty set.
Figure 5.1: Range and kernel of a linear transformation T : V → W (if dim V = n and the range has dimension r, the kernel has dimension n − r).
Ker T is also called the null space of T and is denoted by N(T). Also, dim N(T) ≤ dim V.
Ex 5.1.7 A mapping T : <3 <3 , defined by, T (x, y, z) = (x + 2y + 3z, 3x + 2y + z, x +
y + z); (x, y, z) <3 . Show that T is linear. Find kerT and dimension of kerT .
Solution: Let = (x1 , y1 , z1 ) <3 and = (x2 , y2 , z2 ) <3 . Now,
T () = (x1 + 2y1 + 3z1 , 3x1 + 2y1 + z1 , x1 + y1 + z1 )
T () = (x2 + 2y2 + 3z2 , 3x2 + 2y2 + z2 , x2 + y2 + z2 )
T ( + ) = T (x1 + x2 , y1 + y2 , z1 + z2 )
= ((x1 + x2 ) + 2(y1 + y2 ) + 3(z1 + z2 ), 3(x1 + x2 )
+ 2(y1 + y2 ) + (z1 + z2 ), (x1 + x2 ) + (y1 + y2 ) + (z1 + z2 ))
= (x1 + 2y1 + 3z1 , 3x1 + 2y1 + z1 , x1 + y1 + z1 )
+ (x2 + 2y2 + 3z2 , 3x2 + 2y2 + z2 , x2 + y2 + z2 )
= T () + T (); , <3 .

298

Linear Transformations

Let c ∈ ℜ be any scalar; then cα = (cx1, cy1, cz1). Therefore, using the definition,
T(cα) = T(cx1, cy1, cz1)
= (cx1 + 2cy1 + 3cz1, 3cx1 + 2cy1 + cz1, cx1 + cy1 + cz1)
= c(x1 + 2y1 + 3z1, 3x1 + 2y1 + z1, x1 + y1 + z1)
= cT(α); ∀ c ∈ ℜ and α ∈ ℜ³.
Hence T is a linear mapping. Let (x1, y1, z1) ∈ KerT; then by the definition of kerT, we
have
kerT = {(x1, y1, z1) ∈ ℜ³ : T(x1, y1, z1) = (0, 0, 0)},
i.e., x1 + 2y1 + 3z1 = 0, 3x1 + 2y1 + z1 = 0, x1 + y1 + z1 = 0.
From the first two equations, by cross-multiplication, we have
x1/(−4) = y1/8 = z1/(−4), i.e., x1/1 = y1/(−2) = z1/1 = k (say),
or, x1 = k, y1 = −2k, z1 = k,
which satisfies the last equation x1 + y1 + z1 = 0. Thus, (x1, y1, z1) = k(1, −2, 1); k ∈ ℜ.
Let α = (1, −2, 1); then kerT = L{α} and so dim kerT = 1.
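The kernel found above can be checked numerically. The following illustrative sketch (the helper name `T` is mine, not the book's) verifies that every multiple of (1, −2, 1) is annihilated by the map of Ex 5.1.7:

```python
# Illustrative check (not part of the text): the kernel of
# T(x, y, z) = (x + 2y + 3z, 3x + 2y + z, x + y + z) from Ex 5.1.7
# is the line spanned by (1, -2, 1).

def T(x, y, z):
    """The linear map of Ex 5.1.7."""
    return (x + 2*y + 3*z, 3*x + 2*y + z, x + y + z)

# Every multiple k(1, -2, 1) must map to the zero vector.
for k in range(-3, 4):
    assert T(k, -2*k, k) == (0, 0, 0)

# A vector off that line is not in the kernel.
assert T(1, 0, 0) != (0, 0, 0)
```

Running the loop for several values of k confirms dim kerT = 1: the kernel is exactly one line.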
Ex 5.1.8 Find a basis and the dimension of kerT, where the linear mapping T : ℜ³ → ℜ² is
defined by T(x, y, z) = (x + y, y + z).
Solution: To find a basis and the dimension of kerT, set T(α) = θ, where α = (x, y, z).
Therefore, we have the homogeneous system
(x + y, y + z) = (0, 0) ⇒ x + y = 0 = y + z.
The solution space is given by x = −y = z, with y the free variable; hence dim(kerT),
i.e., nullity(T) = 1. Now, (1, −1, 1) is a solution, and so {(1, −1, 1)} forms a basis for kerT.
Deduction 5.1.1 Kernel of matrix mapping: The kernel of any m × n matrix A over
F, viewed as a linear map A : Fⁿ → Fᵐ, consists of all vectors α for which Aα = θ. This
means that the kernel of A is the solution space of the homogeneous system Ax = θ, called
the null space of A.

Ex 5.1.9 Consider the matrix mapping A : ℜ⁴ → ℜ³, where
A =
[ 1  2   3   1 ]
[ 1  3   5  −2 ]
[ 3  8  13  −3 ].
Find a basis and the dimension of the kernel of A.
Solution: By definition, kerA is the solution space of the homogeneous system Ax = θ,
where we take the variables as x = (x1, x2, x3, x4)ᵀ. Therefore, reduce the matrix A of
coefficients to echelon form:
[ 1  2   3   1 ]     [ 1  2  3   1 ]     [ 1  2  3   1 ]
[ 1  3   5  −2 ]  →  [ 0  1  2  −3 ]  →  [ 0  1  2  −3 ]
[ 3  8  13  −3 ]     [ 0  2  4  −6 ]     [ 0  0  0   0 ].
Thus Ax = θ becomes
x1 + 2x2 + 3x3 + x4 = 0 and x2 + 2x3 − 3x4 = 0.

These equations show that the variables x3 and x4 are free, so that dim ker(A) = 2.
Also, we see that (1, −2, 1, 0) and (−7, 3, 0, 1) satisfy the equations, so that a basis of kerA
is {(1, −2, 1, 0), (−7, 3, 0, 1)}.
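The two kernel vectors above can be verified by direct matrix-vector multiplication. A minimal illustrative sketch (the helper `matvec` is my own naming):

```python
# Illustrative check (not part of the text): the two vectors found in
# Ex 5.1.9 really lie in the kernel of the matrix map A : R^4 -> R^3.

A = [[1, 2,  3,  1],
     [1, 3,  5, -2],
     [3, 8, 13, -3]]

def matvec(M, v):
    """Multiply a matrix (list of rows) by a column vector."""
    return [sum(row[i] * v[i] for i in range(len(v))) for row in M]

for v in [(1, -2, 1, 0), (-7, 3, 0, 1)]:
    assert matvec(A, v) == [0, 0, 0]
```

Since the two vectors are clearly linearly independent (look at the last two coordinates), they do form a basis of the two-dimensional kernel.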
Theorem 5.1.1 Let V and W be two vector spaces over a field F and let T : V → W be a
linear mapping. Then kerT is a subspace of V.
Proof: By definition of kerT, we have
KerT = {α : α ∈ V and T(α) = θW}.
Since T(θV) = θW, θV ∈ kerT. Therefore kerT is non-empty. To prove the theorem, we
consider the following two cases:
Case 1: Let kerT = {θV}; then obviously kerT is a subspace of V.
Case 2: Let kerT ≠ {θV}, and let α and β ∈ kerT. Then by definition,
T(α) = θW;  T(β) = θW.
Since T is a linear transformation, ∀ a, b ∈ F, we have
T(aα + bβ) = T(aα) + T(bβ); as T is linear
= aT(α) + bT(β); as T is linear
= aθW + bθW = θW
⇒ aα + bβ ∈ kerT; ∀ α, β ∈ kerT and ∀ a, b ∈ F.
Hence kerT is a subspace of V. This subspace is also known as the null space of T. The
dimension of kerT is called the nullity of T, i.e., nullity(T) = dim(kerT).
Theorem 5.1.2 Let V and W be two finite dimensional vector spaces over a field F and
let T : V → W be linear. Then T is injective if and only if kerT = {θV}.
Proof: First let T be an injective mapping from V into W. Since T(θV) = θW in W, θV is a
preimage of θW, and since T is injective, θV is the only preimage of θW. Therefore,
kerT = {θV}.
Conversely, let kerT = {θV}. We wish to show that T is one-to-one. Let α, β be two elements
of V such that T(α) = T(β) in W. Now,
θW = T(α) − T(β) = T(α − β); as T is linear
⇒ α − β ∈ kerT ⇒ α − β = θV; as kerT = {θV}.
Thus, T(α) = T(β) ⇒ α = β,
and so T is injective. Hence the theorem. Note that, if the linear mapping T : V → V is
such that kerT = {θV}, then a basis of V is mapped onto another basis of V. Thus T is
injective if and only if dim(kerT) = 0.
Note: If T(α) = β and T(α′) = β, then α − α′ ∈ kerT. In other words, any two solutions
to T(α) = β differ by an element of the kernel of T.
Ex 5.1.10 Prove that there cannot be a one-one linear transformation T : ℜ³ → ℜ².
Solution: Here dim(ℜ³) = 3 and dim(ℜ²) = 2. By the dimension theorem,
dim(ℜ³) = dim(ImT) + dim(kerT)
or, 3 ≤ 2 + dim(kerT), since ImT ⊆ ℜ² gives dim(ImT) ≤ 2,
⇒ dim(kerT) ≥ 1 ≠ 0.
Thus, there cannot be a one-one linear transformation T : ℜ³ → ℜ².

Theorem 5.1.3 Let V and W be vector spaces over a field F and T : V → W be a linear
mapping such that kerT = {θ}. Then the images of a linearly independent set of vectors in
V are linearly independent in W.
Proof: Let S = {α1, α2, . . . , αn} be a linearly independent set in V. We are to show that
{T(α1), T(α2), . . . , T(αn)} is a LI set in W. For some scalars c1, c2, . . . , cn ∈ F, we have
c1T(α1) + c2T(α2) + . . . + cnT(αn) = θW
⇒ T(c1α1) + T(c2α2) + . . . + T(cnαn) = θW; as T is linear
⇒ T(c1α1 + c2α2 + . . . + cnαn) = θW; as T is linear
⇒ c1α1 + c2α2 + . . . + cnαn = θV; as kerT = {θV}
⇒ c1 = c2 = . . . = cn = 0; as S is LI.
Hence {T(α1), T(α2), . . . , T(αn)} is a linearly independent set of vectors in W.

5.1.2  Image of Linear Mapping

Let V and W be two vector spaces over the same field F and T : V → W be a linear
mapping. The range or image of the linear mapping T, denoted by R(T) or ImT, is the
set of all images of all elements of V, i.e.,
R(T) = ImT = {T(α) ∈ W : α ∈ V}.
If ImT = W, we say that T is onto.
Ex 5.1.11 Show that the mapping T : ℜ³ → ℜ³, defined by
T(x, y, z) = (2x + y + 3z, 3x − y + z, 4x + 3y + z); ∀ (x, y, z) ∈ ℜ³,
is linear. Find ImT and the dimension of ImT.
Solution: Let α = (x1, y1, z1) ∈ ℜ³ and β = (x2, y2, z2) ∈ ℜ³. By definition,
T(α) = (2x1 + y1 + 3z1, 3x1 − y1 + z1, 4x1 + 3y1 + z1)
T(β) = (2x2 + y2 + 3z2, 3x2 − y2 + z2, 4x2 + 3y2 + z2)
T(α + β) = T(x1 + x2, y1 + y2, z1 + z2)
= (2(x1 + x2) + (y1 + y2) + 3(z1 + z2), 3(x1 + x2)
− (y1 + y2) + (z1 + z2), 4(x1 + x2) + 3(y1 + y2) + (z1 + z2))
= (2x1 + y1 + 3z1, 3x1 − y1 + z1, 4x1 + 3y1 + z1)
+ (2x2 + y2 + 3z2, 3x2 − y2 + z2, 4x2 + 3y2 + z2)
= T(α) + T(β); ∀ α, β ∈ ℜ³.
Let c ∈ ℜ. Then cα = (cx1, cy1, cz1). By definition,
T(cα) = T(cx1, cy1, cz1)
= (2cx1 + cy1 + 3cz1, 3cx1 − cy1 + cz1, 4cx1 + 3cy1 + cz1)
= c(2x1 + y1 + 3z1, 3x1 − y1 + z1, 4x1 + 3y1 + z1)
= cT(α); ∀ c ∈ ℜ and α ∈ ℜ³.
Hence T is linear. Let β be an arbitrary vector in ImT. Then,
β = (2x + y + 3z, 3x − y + z, 4x + 3y + z)
= x(2, 3, 4) + y(1, −1, 3) + z(3, 1, 1).        (5.3)
Hence β is a linear combination of the vectors (2, 3, 4), (1, −1, 3) and (3, 1, 1). Also, for
scalars c1, c2, c3 ∈ ℜ, if c1(2, 3, 4) + c2(1, −1, 3) + c3(3, 1, 1) = θ, then, since
| 2   1  3 |
| 3  −1  1 | ≠ 0,
| 4   3  1 |
we must have c1 = c2 = c3 = 0, so that S = {(2, 3, 4), (1, −1, 3), (3, 1, 1)} is linearly
independent. Hence,
ImT = L({(2, 3, 4), (1, −1, 3), (3, 1, 1)}).
Since S is linearly independent, the dimension of ImT is 3.
Ex 5.1.12 Show that a linear transformation T : ℜ³ → ℜ³ whose image ImT is the subspace
S = {(x, y, z) ∈ ℜ³ : x + y + z = 0} is T(x, y, z) = (x, y, −x − y).
Solution: Let α = (x, y, z) ∈ S, so that α = (x, y, −x − y) = x(1, 0, −1) + y(0, 1, −1). Let
{e1, e2, e3} be the standard basis of ℜ³; then there exists a unique linear transformation T
such that
T(e1) = (1, 0, −1) = α1, T(e2) = (0, 1, −1) = α2, T(e3) = (0, 0, 0) = θ.
Now α = (x, y, z) = xe1 + ye2 + ze3, so that
T(α) = T(x, y, z) = xT(e1) + yT(e2) + zT(e3)
= x(1, 0, −1) + y(0, 1, −1) + z(0, 0, 0) = (x, y, −x − y).
Ex 5.1.13 Find a linear mapping T : ℜ³ → ℜ³ whose image space is spanned by (1, 2, 3)
and (4, 5, 6).
Solution: Let {e1, e2, e3} be the standard basis of ℜ³. Then there exists a unique linear
transformation T such that T(e1) = (1, 2, 3) = α, T(e2) = (4, 5, 6) = β and T(e3) =
(0, 0, 0) = θ. Let (x, y, z) ∈ ℜ³; then
(x, y, z) = xe1 + ye2 + ze3
⇒ T(x, y, z) = xT(e1) + yT(e2) + zT(e3)
= x(1, 2, 3) + y(4, 5, 6) + z(0, 0, 0)
= (x + 4y, 2x + 5y, 3x + 6y).
Since {e1, e2, e3} is a basis of ℜ³, {α, β, θ} generates the range space. Thus the range space
is L({α, β, θ}) = L({α, β}).
Deduction 5.1.2 Image of matrix mapping: Let A be any m × n matrix over a field F,
viewed as a linear map A : Fⁿ → Fᵐ. Now the usual basis vectors span Fⁿ, so their images
Aei; i = 1, 2, . . . , n, which are precisely the columns of A, span the image of A. Therefore
ImA = column space(A).

Ex 5.1.14 Consider the matrix mapping A : ℜ⁴ → ℜ³, where
A =
[ 1  2   3   1 ]
[ 1  3   5  −2 ]
[ 3  8  13  −3 ].
Find a basis and the dimension of the image of A.
Solution: By definition, the column space of A is ImA. Therefore, reduce the matrix Aᵀ
to echelon form:
[ 1   1   3 ]     [ 1   1   3 ]     [ 1  1  3 ]
[ 2   3   8 ]  →  [ 0   1   2 ]  →  [ 0  1  2 ]
[ 3   5  13 ]     [ 0   2   4 ]     [ 0  0  0 ]
[ 1  −2  −3 ]     [ 0  −3  −6 ]     [ 0  0  0 ].
Thus a basis of ImA is {(1, 1, 3), (0, 1, 2)} and dim(ImA) = 2.
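As a sanity check, every column of A should be a linear combination of the two basis vectors just found. The following illustrative sketch does this by inspection (the coefficient pairs are my own working, not from the book):

```python
# Illustrative check: every column of A from Ex 5.1.14 is a linear
# combination of the image basis b1 = (1, 1, 3), b2 = (0, 1, 2).

b1, b2 = (1, 1, 3), (0, 1, 2)
cols = [(1, 1, 3), (2, 3, 8), (3, 5, 13), (1, -2, -3)]
coeffs = [(1, 0), (2, 1), (3, 2), (1, -3)]  # (c1, c2) for each column

for col, (c1, c2) in zip(cols, coeffs):
    combo = tuple(c1 * u + c2 * v for u, v in zip(b1, b2))
    assert combo == col
```

Since b1 and b2 are linearly independent and reach all four columns, dim(ImA) = 2, in agreement with the echelon computation.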

Theorem 5.1.4 Let V and W be two vector spaces over a field F and let T : V → W be a
linear mapping. Then ImT is a subspace of W.
Proof: Let θV and θW be the null elements of V and W respectively. Since
T(θV) = θW, we have θW ∈ ImT.
Hence ImT is not an empty set. To prove the theorem, consider the following two cases:
Case 1: Let ImT = {θW}; then obviously ImT is a subspace of W.
Case 2: Let ImT ≠ {θW}. Let α, β ∈ ImT; then ∃ α′, β′ ∈ V such that T(α′) = α and
T(β′) = β, where α, β ∈ W. Therefore, ∀ a, b ∈ F, we have
aα + bβ = aT(α′) + bT(β′)
= T(aα′) + T(bβ′); as T is linear
= T(aα′ + bβ′); as T is linear
⇒ aα + bβ ∈ ImT; as aα′ + bβ′ ∈ V.
This proves that ImT is a subspace of W. ImT is also called the range of T and is denoted
by R(T). The dimension of R(T) is called the rank of T.
Ex 5.1.15 Let a linear transformation T : ℜ³ → ℜ³ be defined by
T(α) = Aα, where A =
[ 1  0  1 ]
[ 1  1  2 ]
[ 2  1  3 ]
and α = (a1, a2, a3)ᵀ. Is T onto? Is T one-to-one? Find bases for range T and ker T.
Solution: Given any β = (a, b, c)ᵀ ∈ ℜ³, where a, b and c are any real numbers, can we
find α so that T(α) = β? We seek a solution to the linear system
[ 1  0  1 ] [a1]   [a]
[ 1  1  2 ] [a2] = [b]
[ 2  1  3 ] [a3]   [c]
and we find the reduced row echelon form of the augmented matrix to be
[ 1  0  1 | a         ]
[ 0  1  1 | b − a     ]
[ 0  0  0 | c − b − a ].
Thus a solution exists only for c − a − b = 0, so T is not onto. To find a basis for range T,
we note that
T(α) = Aα = (a1 + a3, a1 + a2 + 2a3, 2a1 + a2 + 3a3)ᵀ
= a1(1, 1, 2)ᵀ + a2(0, 1, 1)ᵀ + a3(1, 2, 3)ᵀ.
This means that {(1, 1, 2)ᵀ, (0, 1, 1)ᵀ, (1, 2, 3)ᵀ} spans range T, i.e., range T is the subspace
of ℜ³ spanned by the columns of the matrix defining T. The first two vectors in this set
are LI, and the third is the sum of the first two. Therefore, the first two vectors form a basis for range
T, and dim(range T) = 2. To find kerT, we wish to find all α ∈ ℜ³ so that T(α) = θ.
Solving the resulting homogeneous system
a1 + a3 = 0, a1 + a2 + 2a3 = 0, 2a1 + a2 + 3a3 = 0,
we find that a1 = −a3 and a2 = −a3. Thus kerT consists of all vectors of the form
k(−1, −1, 1)ᵀ, where k ∈ ℜ. Moreover, dim(kerT) = 1. As kerT ≠ {θ}, it follows that T
is not one-to-one.
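Both conclusions of Ex 5.1.15 can be verified numerically. The following illustrative sketch (my helper name `apply`) checks the consistency condition c − a − b = 0 on sample images and the kernel vector:

```python
# Illustrative check for Ex 5.1.15: every image (a, b, c) of T satisfies
# c - a - b = 0, and the kernel is spanned by (-1, -1, 1).

A = [[1, 0, 1], [1, 1, 2], [2, 1, 3]]

def apply(M, v):
    return [sum(row[i] * v[i] for i in range(3)) for row in M]

# The image condition holds for each sample input, hence T is not onto.
for alpha in [(1, 0, 0), (0, 1, 0), (0, 0, 1), (2, -5, 7)]:
    a, b, c = apply(A, alpha)
    assert c - a - b == 0

# The kernel vector maps to zero, so T is not one-to-one.
assert apply(A, (-1, -1, 1)) == [0, 0, 0]
```

Because the condition holds on the three standard basis vectors, linearity forces it on every input, which is exactly why T cannot be onto.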
Theorem 5.1.5 Let V and W be finite dimensional vector spaces over a field F and T :
V → W be linear. Let β = {α1, α2, . . . , αn} be a basis of V; then
R(T) = Span(T(β)) = Span{T(α1), T(α2), . . . , T(αn)}.
Proof: Let β′ ∈ ImT; then ∃ an element α ∈ V such that T(α) = β′. Since α ∈ V, we can
write α = c1α1 + c2α2 + . . . + cnαn, where c1, c2, . . . , cn are uniquely determined. Thus,
T(α) = T(c1α1 + c2α2 + . . . + cnαn)
= T(c1α1) + T(c2α2) + . . . + T(cnαn)
= c1T(α1) + c2T(α2) + . . . + cnT(αn); as T is linear.
As each T(αi) ∈ ImT, it follows that ImT is generated by T(α1), T(α2), . . . , T(αn). Thus,
T(α) is completely determined by the elements T(α1), T(α2), . . . , T(αn). When
kerT = {θ}, {α1, α2, . . . , αn} is LI and in this case {T(α1), T(α2), . . . , T(αn)} is a
basis of ImT. Therefore, we conclude that if {α1, α2, . . . , αn} spans a vector space V, then
T(α1), T(α2), . . . , T(αn) span ImT, where T : V → W is linear.
Note: This theorem provides a method for finding a spanning set for the range of a linear
transformation. For example, define a linear transformation T : P2(ℜ) → M2×2(ℜ) by
T(f(x)) =
[ f(1) − f(2)   0    ]
[ 0             f(0) ].
Since β = {1, x, x²} is a basis for P2(ℜ), we have
R(T) = Span(T(β)) = Span{T(1), T(x), T(x²)}
= Span{ [ 0 0 ; 0 1 ], [ −1 0 ; 0 0 ], [ −3 0 ; 0 0 ] }
= Span{ [ 0 0 ; 0 1 ], [ −1 0 ; 0 0 ] }.
Thus we have found a basis for R(T), and so dim(R(T)) = 2.
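The three images in the spanning set above can be computed mechanically. An illustrative sketch, treating polynomials simply as Python functions:

```python
# Illustrative computation mirroring the worked note: the map
# T(f) = [[f(1) - f(2), 0], [0, f(0)]] applied to the basis {1, x, x^2}.

def T(f):
    return [[f(1) - f(2), 0], [0, f(0)]]

assert T(lambda t: 1) == [[0, 0], [0, 1]]
assert T(lambda t: t) == [[-1, 0], [0, 0]]
assert T(lambda t: t**2) == [[-3, 0], [0, 0]]
# T(x^2) is 3 times T(x), so the first two images already span the
# range and dim R(T) = 2.
```

The check makes the dependence T(x²) = 3·T(x) explicit, which is why the spanning set collapses to two matrices.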
Deduction 5.1.3 Let T : V → W be a linear transformation of an n dimensional vector
space V into a vector space W. Also, let S = {α1, α2, . . . , αn} be a basis of V. If α is an
arbitrary vector in V, then T(α) is completely determined by {T(α1), T(α2), . . . , T(αn)}.
Theorem 5.1.6 Let V and W be finite dimensional vector spaces over a field F and T :
V → W be a linear mapping. Then ImT is finite dimensional.

Proof: Since V is finite dimensional, let dimV = n and S = {α1, α2, . . . , αn} be a basis of
V. If β ∈ ImT, then ∃ a vector α ∈ V such that β = T(α). Since α ∈ V,
α = c1α1 + c2α2 + . . . + cnαn; for some ci ∈ F
⇒ β = T(c1α1 + . . . + cnαn) = c1T(α1) + . . . + cnT(αn); as T is linear.
Therefore ImT = L({T(α1), T(α2), . . . , T(αn)}), i.e., ImT is generated by a finite set, so
it is finite dimensional. Note that,
(i) if kerT = {θ}, then the images of a LI set of vectors in V are LI in W, and then
{T(α1), T(α2), . . . , T(αn)} is a basis of ImT.
(ii) If T : V → V is a linear mapping on V such that kerT = {θ}, then a basis of V is
mapped onto another basis of V.
Ex 5.1.16 A mapping T : ℜ³ → ℜ³ is defined by
T(x1, x2, x3) = (x1 + x2 + x3, 2x1 + x2 + 2x3, x1 + 2x2 + x3),
where (x1, x2, x3) ∈ ℜ³. Show that T is a linear mapping. Find ImT and dim ImT.
Solution: It can be easily verified that T is a linear mapping. If {α1, α2, α3} is a basis
of the domain space ℜ³, ImT is the linear span of the vectors T(α1), T(α2), T(α3). We
know {e1, e2, e3}, where e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1), is the standard basis of
ℜ³, and here
T(e1) = (1, 2, 1), T(e2) = (1, 1, 2), T(e3) = (1, 2, 1).
Since T(e1) = T(e3), ImT is the linear span of the vectors (1, 2, 1) and (1, 1, 2). Hence
ImT = L{(1, 2, 1), (1, 1, 2)}. Since the vectors (1, 2, 1) and (1, 1, 2) are linearly independent,
dim ImT = 2.
Ex 5.1.17 A mapping T : ℜ³ → ℜ⁴ is defined by
T(x, y, z) = (y + z, z + x, x + y, x + y + z),
where (x, y, z) ∈ ℜ³. Show that {T(e1), T(e2), T(e3)} is a basis of ImT, where {e1 =
(1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1)} is the standard basis of ℜ³.
Solution: First, we find kerT. Let α = (x, y, z) ∈ ℜ³ be such that T(α) = θ′; then
y + z = 0, z + x = 0, x + y = 0, x + y + z = 0
⇒ x = y = z = 0, i.e., kerT = {θ}.
Using the definition of T, we have T(e1) = (0, 1, 1, 1), T(e2) = (1, 0, 1, 1) and T(e3) =
(1, 1, 0, 1). First we show that the set {(0, 1, 1, 1), (1, 0, 1, 1), (1, 1, 0, 1)} is LI.
Let c1, c2, c3 ∈ ℜ be scalars such that
c1T(e1) + c2T(e2) + c3T(e3) = θ′
⇒ (c2 + c3, c1 + c3, c1 + c2, c1 + c2 + c3) = (0, 0, 0, 0)
⇒ c1 = c2 = c3 = 0.
This shows that {T(e1), T(e2), T(e3)} is LI in ℜ⁴; and since {e1, e2, e3} spans ℜ³, the set
{T(e1), T(e2), T(e3)} spans ImT. Hence {T(e1), T(e2), T(e3)} is a basis of ImT.

Ex 5.1.18 Let T : P1 → P2 be a linear transformation for which we know that T(x + 1) =
x² − 1 and T(x − 1) = x² + x. Find T(7x + 3) and T(ax + b).
Solution: It can be easily verified that {x + 1, x − 1} is a basis for P1. Also, 7x + 3 can be
written as the linear combination 7x + 3 = 5(x + 1) + 2(x − 1); therefore,
T(7x + 3) = T(5(x + 1) + 2(x − 1)) = 5T(x + 1) + 2T(x − 1)
= 5(x² − 1) + 2(x² + x) = 7x² + 2x − 5.
Writing ax + b as a linear combination of the given basis vectors, we see that
ax + b = ((a + b)/2)(x + 1) + ((a − b)/2)(x − 1)
⇒ T(ax + b) = T(((a + b)/2)(x + 1) + ((a − b)/2)(x − 1))
= ((a + b)/2) T(x + 1) + ((a − b)/2) T(x − 1)
= ((a + b)/2)(x² − 1) + ((a − b)/2)(x² + x)
= ax² + ((a − b)/2) x − (a + b)/2.
Theorem 5.1.7 Let T : V → V be a linear transformation on a finite dimensional vector
space V(F). Then the following statements are equivalent:
(i) ImT ∩ kerT = {θ}.
(ii) T[T(α)] = θ ⇒ T(α) = θ; ∀ α ∈ V.
Proof: First, we suppose that (i) holds, and let T[T(α)] = θ. Then,
T[T(α)] = θ ⇒ T(α) ∈ kerT, and T(α) ∈ ImT; as α ∈ V
⇒ T(α) ∈ kerT ∩ ImT
⇒ T(α) = θ; as ImT ∩ kerT = {θ}.
Thus (i) ⇒ (ii). Again, let us suppose that (ii) is true and, if possible, let (i) be not true.
Then ∃ β(≠ θ) ∈ ImT ∩ kerT. Now
β ∈ ImT ∩ kerT ⇒ β ∈ ImT and β ∈ kerT
⇒ β = T(α), for some α ∈ V, and T(β) = θ
⇒ T(β) = T[T(α)] = θ
⇒ T(α) = θ; by (ii)
⇒ β = T(α) = θ,
which is a contradiction. Therefore ImT ∩ kerT = {θ}. Hence (ii) ⇒ (i). Therefore, the
given statements are equivalent.
Theorem 5.1.8 (Sylvester's Law) Let V and W be two vector spaces over a field F. Let V
be finite dimensional and T : V → W be linear; then
rank(T) + nullity(T) = dim(ImT) + dim(kerT) = dimV.        (5.4)

Proof: We know that if V is a finite dimensional vector space, then both kerT and ImT
are finite dimensional.
Case 1: Let kerT = V; then T(α) = θW for every α ∈ V. Hence ImT = {θW} and so
dim(ImT) = 0. Thus,
dim(kerT) + dim(ImT) = dimV + 0 = dimV.
Hence the theorem holds good in this case.
Case 2: Let kerT = {θ} and let {α1, α2, . . . , αn} be a basis of V, so that dimV = n. Then
{T(α1), T(α2), . . . , T(αn)} is a basis of ImT, so dim ImT = n. Thus dim kerT = 0 and
therefore,
dim(kerT) + dim(ImT) = 0 + n = dimV.
Case 3: Let kerT be a non-trivial proper subspace of V. Let S1 = {α1, α2, . . . , αm} be a
basis of kerT, so that dim(kerT) = m. Then S1 is a LI subset of V and so it can be extended
to form a basis of V. Let the extended basis of V be S2 = {α1, α2, . . . , αm, αm+1, . . . , αn}.
We claim that S = {T(αm+1), . . . , T(αn)} is a basis of ImT. First we show that S
spans ImT. Let β ∈ ImT; then ∃ α ∈ V such that β = T(α). Since S2 is a basis of V, we can
find a unique set of scalars c1, c2, . . . , cn such that α = c1α1 + c2α2 + . . . + cnαn. Then,
β = T(α) = T(c1α1 + . . . + cnαn)
= c1T(α1) + . . . + cnT(αn); since T is linear
= cm+1T(αm+1) + . . . + cnT(αn); as T(αi) = θ for i = 1, 2, . . . , m.
Thus, every element of ImT is expressible as a linear combination of elements of S and so
S generates ImT. Next, we show that S is LI. Suppose that
cm+1T(αm+1) + . . . + cnT(αn) = θ
⇒ T(cm+1αm+1 + . . . + cnαn) = θ; as T is linear
⇒ cm+1αm+1 + . . . + cnαn ∈ KerT
⇒ cm+1αm+1 + . . . + cnαn = b1α1 + . . . + bmαm; as S1 generates KerT
⇒ b1α1 + . . . + bmαm + (−cm+1)αm+1 + . . . + (−cn)αn = θ
⇒ b1 = b2 = . . . = bm = 0 and cm+1 = . . . = cn = 0; as S2 is LI,
and in particular cm+1 = . . . = cn = 0.
This shows that S is LI and consequently S is a basis of ImT. Thus rank(T) = dim(ImT) =
n − m and nullity(T) = dim(KerT) = m. Therefore,
rank(T) + nullity(T) = (n − m) + m = n = dimV.
This theorem is called the dimension theorem.
Result: Reflecting on the action of a linear transformation, we see intuitively that the larger
the nullity, the smaller the rank. In other words, the more vectors that are carried into θ,
the smaller the range. The same heuristic reasoning tells us that the larger the rank, the
smaller the nullity. This balance between rank and nullity is made precise by
the dimension theorem.
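For a matrix map the dimension theorem says rank + nullity = number of columns, which can be checked mechanically. An illustrative sketch using exact fraction arithmetic on the matrix of Ex 5.1.9 (the `rank` helper is my own, not the book's):

```python
# Illustrative check of the dimension theorem on the matrix of Ex 5.1.9:
# rank(A) + nullity(A) should equal the number of columns n.
from fractions import Fraction as F

def rank(M):
    """Rank via fraction-exact Gaussian elimination (count of pivots)."""
    M = [[F(x) for x in row] for row in M]
    r, rows, cols = 0, len(M), len(M[0])
    for c in range(cols):
        piv = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if piv is None:
            continue                      # no pivot in this column
        M[r], M[piv] = M[piv], M[r]       # swap pivot row into place
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A = [[1, 2, 3, 1], [1, 3, 5, -2], [3, 8, 13, -3]]
n = 4
assert rank(A) == 2           # dim ImA, as found in Ex 5.1.14
nullity = n - rank(A)
assert nullity == 2           # dim kerA, as found in Ex 5.1.9
assert rank(A) + nullity == n # Sylvester's law for this matrix
```

This ties together Ex 5.1.9 (nullity 2) and Ex 5.1.14 (rank 2) for the same matrix, exactly as equation (5.4) predicts.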

Deduction 5.1.4 If T : V → W is a linear transformation and dimV = dimW, then T is
one-to-one if and only if T maps V onto W.
Proof: First of all, suppose T is one-to-one, so that KerT = {θ}, which means that
dim(KerT) = 0. Using Sylvester's theorem, we have
dim(kerT) + dim(ImT) = dimV = dimW, by hypothesis.
Here dim(kerT) = 0; therefore dim(ImT) = dimW, which shows that T is onto. Conversely, let
T be onto. Then ImT = W, which implies that dim(ImT) = dimW. We have
dim(kerT) + dim(ImT) = dimV = dimW
⇒ dimW + dim(kerT) = dimW
⇒ dim(kerT) = 0 ⇒ kerT = {θ}.
Hence T is one-to-one.
Deduction 5.1.5 Notably, the conditions of one-to-one and onto are equivalent in an
important special case.
Let V and W be vector spaces of equal (finite) dimension, and let T : V → W be linear.
Then the following statements are equivalent:
(i) T is one-to-one.
(ii) T is onto.
(iii) rank(T) = dim(V).
Ex 5.1.19 Determine the linear mapping T : ℜ³ → ℜ³ that maps the basis (0, 1, 1), (1, 0, 1),
(1, 1, 0) of ℜ³ to (2, 1, 1), (1, 2, 1), (1, 1, 2) respectively. Find KerT and ImT. Verify that
dim KerT + dim ImT = 3.
Solution: Let α1 = (0, 1, 1), α2 = (1, 0, 1), α3 = (1, 1, 0) and β1 = (2, 1, 1), β2 = (1, 2, 1),
β3 = (1, 1, 2) ∈ ℜ³. Let α = (x, y, z) be an arbitrary vector in ℜ³; then ∃ unique scalars
c1, c2, c3 ∈ ℜ such that
α = c1α1 + c2α2 + c3α3
⇒ (x, y, z) = c1(0, 1, 1) + c2(1, 0, 1) + c3(1, 1, 0) = (c2 + c3, c1 + c3, c1 + c2)
⇒ c2 + c3 = x, c1 + c3 = y, c1 + c2 = z
⇒ c1 = (1/2)(y + z − x), c2 = (1/2)(x − y + z), c3 = (1/2)(x + y − z).
Since T is linear, hence
T(α) = c1T(0, 1, 1) + c2T(1, 0, 1) + c3T(1, 1, 0) = c1β1 + c2β2 + c3β3
= (1/2)(y + z − x)(2, 1, 1) + (1/2)(x − y + z)(1, 2, 1) + (1/2)(x + y − z)(1, 1, 2)
⇒ T(x, y, z) = (y + z, x + z, x + y); ∀ (x, y, z) ∈ ℜ³,
which is the required linear transformation. Now let (x, y, z) ∈ KerT; then
y + z = 0, x + z = 0, x + y = 0 ⇒ x = y = z = 0.

Hence KerT = {θ} and so dim KerT = 0. Also, ImT is the linear span of the vectors
T(α1), T(α2), T(α3), i.e.,
ImT = L{T(α1), T(α2), T(α3)},
where {α1, α2, α3} is any basis of the domain space ℜ³. Since {(0, 1, 1), (1, 0, 1), (1, 1, 0)} is
a basis of ℜ³, so
ImT = L{(2, 1, 1), (1, 2, 1), (1, 1, 2)}.
Now, as the set of vectors {(2, 1, 1), (1, 2, 1), (1, 1, 2)} is linearly independent, dim ImT = 3.
Hence,
dim KerT + dim ImT = 0 + 3 = 3.
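The transformation derived in Ex 5.1.19 can be verified against the prescribed images, and the triviality of its kernel against a determinant. An illustrative sketch:

```python
# Illustrative check of Ex 5.1.19: T(x, y, z) = (y + z, x + z, x + y)
# maps the given basis to the prescribed images and has trivial kernel.

def T(x, y, z):
    return (y + z, x + z, x + y)

assert T(0, 1, 1) == (2, 1, 1)
assert T(1, 0, 1) == (1, 2, 1)
assert T(1, 1, 0) == (1, 1, 2)

# The matrix of T has nonzero determinant, so kerT = {0} and
# dim KerT + dim ImT = 0 + 3 = 3.
M = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
det = (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
       - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
       + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))
assert det == 2
```

A nonzero determinant simultaneously certifies dim KerT = 0 and dim ImT = 3, which is the verification the example asks for.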
Ex 5.1.20 Let T : ℜ² → ℜ² be defined by T(x, y) = (x + y, x). Prove that T is one-to-one
and onto.
Solution: It is easy to verify that T is a linear transformation, and it is easy to see that
N(T) = {θ}; so T is one-to-one. Since ℜ² is finite dimensional and the domain and codomain
have equal dimension, T must also be onto, by Deduction 5.1.4.
Ex 5.1.21 Determine the linear mapping T : ℜ³ → ℜ² which maps the basis vectors
(1, 0, 0), (0, 1, 0), (0, 0, 1) of ℜ³ to the vectors (1, 1), (2, 3), (3, 2) respectively. Find KerT
and ImT.
Solution: Let α = (x, y, z) be an arbitrary vector in ℜ³; then
α = (x, y, z) = x(1, 0, 0) + y(0, 1, 0) + z(0, 0, 1)
⇒ T(x, y, z) = xT(1, 0, 0) + yT(0, 1, 0) + zT(0, 0, 1)
= x(1, 1) + y(2, 3) + z(3, 2) = (x + 2y + 3z, x + 3y + 2z),
which is the required linear transformation. To find kerT, let α = (x, y, z) ∈ ℜ³ be such
that T(α) = θ; then
x + 2y + 3z = 0 = x + 3y + 2z ⇒ y − z = 0 and x = −5y.
Now, if α = (−5, 1, 1), then kerT = L{α} and so dim kerT = 1. Let β be an arbitrary vector
in ImT; then β can be expressed in the form
β = x(1, 1) + y(2, 3) + z(3, 2).
Now, {(1, 1), (2, 3), (3, 2)} forms a LD set but {(1, 1), (2, 3)} forms a LI set, so that ImT =
L{(1, 1), (2, 3)} = ℜ² and dim ImT = 2.
Ex 5.1.22 Let T : P2(ℜ) → P3(ℜ) be a linear transformation defined by T(f(x)) = 2f′(x) +
∫₀ˣ 3f(t) dt. Prove that T is injective.
Solution: Let β = {1, x, x²} be a basis of P2(ℜ); then
R(T) = span{T(1), T(x), T(x²)} = span{3x, 2 + (3/2)x², 4x + x³}.
Since {3x, 2 + (3/2)x², 4x + x³} is linearly independent, rank(T) = 3. Since dim(P3(ℜ)) = 4,
T is not onto. From the dimension theorem,
nullity(T) + 3 = 3 ⇒ nullity(T) = 0 ⇒ N(T) = {θ}.
Therefore T is one-to-one.
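The three basis images used above can be computed symbolically with coefficient lists. An illustrative sketch (the representation and helper names are my own, not the book's):

```python
# Illustrative check of Ex 5.1.22: T(f) = 2 f'(x) + integral_0^x 3 f(t) dt,
# with a polynomial stored as coefficients [a0, a1, a2, ...] (ascending powers).
from fractions import Fraction as F

def deriv(p):
    return [F(i) * p[i] for i in range(1, len(p))] or [F(0)]

def integ(p):
    return [F(0)] + [F(p[i], i + 1) for i in range(len(p))]

def T(p):
    d = [2 * c for c in deriv(p)]
    s = [3 * c for c in integ(p)]
    n = max(len(d), len(s))
    d += [F(0)] * (n - len(d))
    s += [F(0)] * (n - len(s))
    return [a + b for a, b in zip(d, s)]

assert T([F(1)]) == [F(0), F(3)]                          # T(1)   = 3x
assert T([F(0), F(1)]) == [F(2), F(0), F(3, 2)]           # T(x)   = 2 + (3/2)x^2
assert T([F(0), F(0), F(1)]) == [F(0), F(4), F(0), F(1)]  # T(x^2) = 4x + x^3
```

The three images involve different top powers (x, x², x³), which makes their linear independence, and hence rank(T) = 3, immediate.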

Deduction 5.1.6 Let us consider a system of m linear equations in n unknowns, Ax = B,
where the coefficient matrix A may be viewed as a linear mapping A : Fⁿ → Fᵐ. Thus, the
solution set of the equation Ax = B may be viewed as the preimage of the vector B ∈ Fᵐ
under the linear mapping A. Further, the solution space of the associated homogeneous
system Ax = θ may be viewed as the kernel of the linear mapping A. Applying Sylvester's
theorem to this homogeneous system yields
dim(kerA) = dim Fⁿ − dim(ImA) = n − rankA.
But n is exactly the number of unknowns in the homogeneous system Ax = θ. Note that if
r is the rank of the coefficient matrix A, which is also the number of pivot variables in an
echelon form of Ax = θ, then s = n − r is the number of free variables. Accordingly,
dimW = s, where W is the solution space, and the s solutions obtained by setting one free
variable equal to 1 and the rest to 0 form a basis for W.
Theorem 5.1.9 (Linear mapping with prescribed images): Let V and W be two
vector spaces over the same field F. Let {α1, α2, . . . , αn} be an ordered basis of the finite
dimensional vector space V and β1, β2, . . . , βn be an arbitrary set (not necessarily distinct)
of n vectors in W. Then ∃ a unique linear mapping T : V → W such that T(αi) = βi, for
i = 1, 2, . . . , n.
Proof: To prove this theorem, there are three basic steps. Let α be an arbitrary element
of V.
Step 1: Mapping definition: Since {α1, α2, . . . , αn} is a basis of V, ∃ a unique n-tuple
(c1, c2, . . . , cn) such that
α = c1α1 + c2α2 + · · · + cnαn.
For this vector let us define a mapping T : V → W by
T(α) = c1β1 + c2β2 + · · · + cnβn.
Then T is a well defined rule associating with each vector α in V a vector T(α) in W.
From the definition it is clear that T(αi) = βi; i = 1, 2, . . . , n. Since the constants ci are
unique, T is well defined.
Step 2: Linearity of mapping: Here we are to prove that T is linear. For this let α, β ∈ V,
where α = c1α1 + · · · + cnαn and β = d1α1 + · · · + dnαn, the unique scalars ci, di ∈ F
being determined by the basis {α1, α2, . . . , αn}. Then for a, b ∈ F, we have
T(aα + bβ) = T[(ac1 + bd1)α1 + · · · + (acn + bdn)αn]
= (ac1 + bd1)β1 + · · · + (acn + bdn)βn; by the definition of T, as aci + bdi ∈ F
= a(c1β1 + · · · + cnβn) + b(d1β1 + · · · + dnβn) = aT(α) + bT(β).
Therefore T is linear.
Step 3: Uniqueness: To prove the uniqueness of T, let us assume another linear mapping
U : V → W such that U(αi) = βi, for i = 1, 2, . . . , n. Thus
U(α) = U(c1α1 + · · · + cnαn) = c1U(α1) + · · · + cnU(αn); as U is linear
= c1β1 + · · · + cnβn = T(α); ∀ α ∈ V
⇒ U = T, i.e., T is unique.

Hence the theorem is proved. This shows that the linear transformation T with T(αi) = βi
is unique. This theorem tells us that a linear mapping is completely determined by its
values on the elements of a basis.
Corollary: Let V and W be vector spaces and suppose that V has a finite basis
{α1, α2, . . . , αn}. If U, T : V → W are linear and U(αi) = T(αi) for i = 1, 2, . . . , n,
then U = T.
Let T : ℜ² → ℜ² be a linear transformation defined by T(x, y) = (2y − x, 3x) and suppose
that U : ℜ² → ℜ² is linear. If we know that U(1, 2) = (3, 3) and U(1, 1) = (1, 3), then U = T.
This follows from the corollary and from the fact that {(1, 2), (1, 1)} is a basis for ℜ².
Ex 5.1.23 Let T : ℜ² → ℜ be linear, where T(1, 1) = 3 and T(0, 1) = −2. Find T(x, y).
Solution: Let α = (1, 1) and β = (0, 1); {(1, 1), (0, 1)} is a basis of ℜ². Hence the linear
map T(x, y) exists and is unique. For a, b ∈ ℜ, a linear combination of the vectors α and β
is given by
aα + bβ = a(1, 1) + b(0, 1) = (a, a + b)
⇒ T(aα + bβ) = aT(α) + bT(β)
⇒ T(a, a + b) = 3a − 2b
⇒ T(x, y) = 3x − 2(y − x) = 5x − 2y,
which is the unique linear transformation.
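The formula for T can be tested directly against the two prescribed values and a linearity check. An illustrative sketch:

```python
# Illustrative check of Ex 5.1.23: the unique linear T with T(1, 1) = 3
# and T(0, 1) = -2 is T(x, y) = 5x - 2y.

def T(x, y):
    return 5 * x - 2 * y

assert T(1, 1) == 3
assert T(0, 1) == -2

# Linearity check on sample vectors and scalars.
a, b = 4, -7
u, v = (2, 5), (-1, 3)
lhs = T(a * u[0] + b * v[0], a * u[1] + b * v[1])
assert lhs == a * T(*u) + b * T(*v)
```

Since T agrees with the prescribed images on a basis and is linear, Theorem 5.1.9 guarantees it is the only such map.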
Ex 5.1.24 Describe the linear operator T on ℜ³ that maps the basis vectors (1, 0, 0),
(0, 1, 0), (0, 0, 1) to (1, 1, 1), (0, 1, −1) and (−1, −2, 0) respectively. Find T, T(1, 1, 1) and
T(2, 2, 2).
[WBUT 2003]
Solution: Let α1 = (1, 0, 0), α2 = (0, 1, 0), α3 = (0, 0, 1). Also, let α = (b1, b2, b3) be any
element of ℜ³. We have to determine the expression for T(b1, b2, b3).
Now, there exist scalars c1, c2 and c3 such that α = c1α1 + c2α2 + c3α3. That is,
(b1, b2, b3) = c1(1, 0, 0) + c2(0, 1, 0) + c3(0, 0, 1) = (c1, c2, c3).
These equations give c1 = b1, c2 = b2, c3 = b3. Therefore,
T(α) = T(c1α1 + c2α2 + c3α3) = c1T(α1) + c2T(α2) + c3T(α3)
= c1(1, 1, 1) + c2(0, 1, −1) + c3(−1, −2, 0)
= (c1 − c3, c1 + c2 − 2c3, c1 − c2).
Thus the required transformation is
T(x, y, z) = (x − z, x + y − 2z, x − y).
Therefore, T(1, 1, 1) = (0, 0, 0) and T(2, 2, 2) = (0, 0, 0).
Ex 5.1.25 Find T(x, y), where T : ℜ² → ℜ³ is defined by T(1, 2) = (3, −1, 5), T(0, 1) =
(2, 1, −1). Also, find T(1, 1).
Solution: Let α1 = (1, 2) and α2 = (0, 1). Let α = (b1, b2) be any element of ℜ². Now,
there exist scalars c1, c2 such that α = c1α1 + c2α2. That is,
(b1, b2) = c1(1, 2) + c2(0, 1) = (c1, 2c1 + c2).
Therefore, b1 = c1 and b2 = 2c1 + c2, i.e., c1 = b1, c2 = b2 − 2c1 = b2 − 2b1. Hence,
T(α) = c1T(α1) + c2T(α2) = c1(3, −1, 5) + c2(2, 1, −1)
= (3c1 + 2c2, −c1 + c2, 5c1 − c2).
That is, T(x, y) = (−x + 2y, −3x + y, 7x − y) and hence T(1, 1) = (1, −2, 6).

Isomorphism

311

5.2  Isomorphism

Let V and W be vector spaces over a field F. A linear mapping T : V → W is said to be an
isomorphism if T is both one-to-one and onto. In case there exists an isomorphism of V
and W, we say that V is isomorphic to W and we write V ≅ W.
= W.
(i) Since T is both one-to-one and onto, T is invertible and T 1 : W V is also a linear
mapping which is both one-to-one and onto.
(ii) The existence of an isomorphism T : V W implies the existence of another isomorphism T 1 : W V . In this case, V and W are said to be isomorphic.
The importance of isomorphism lies in the fact that an isomorphic image of a vector space
may be easier to study or to visualize than the original vector space. For example, <n
is easier to visualize than the general n dimensional real vector space. Note that not all
properties of a vector space are reflected in its isomorphic images, for
(i) to study a vector through its isomorphic image, we need to fix a basis of the vector
space.
(ii) there may be operations and properties which are specific to a given vector space but
are not relevant in an isomorphic image.
Foe example, the product of two elements of Pn is defined in a natural way, but has no
counterpart in <n . Similarly angles and distances by an isomorphism from <n to itself.
Ex 5.2.1 If C is the vector space of all complex numbers over the field ℜ of all real numbers,
then the mapping T : C → ℜ² defined by T(a + ib) = (a, b); ∀ (a + ib) ∈ C, is an isomorphism.
Solution: First we show that T is a linear transformation. For that, let α1, α2 ∈ C,
where α1 = a1 + ib1 and α2 = a2 + ib2; ai, bi ∈ ℜ. Now, ∀ x, y ∈ ℜ, we have
T(xα1 + yα2) = T[x(a1 + ib1) + y(a2 + ib2)]
= T[(xa1 + ya2) + i(xb1 + yb2)]
= (xa1 + ya2, xb1 + yb2); by definition
= x(a1, b1) + y(a2, b2) = xT(α1) + yT(α2); ∀ α1, α2 ∈ C.
Hence T is a linear mapping. Now,
T(α1) = T(α2) ⇒ T(a1 + ib1) = T(a2 + ib2)
⇒ (a1, b1) = (a2, b2)
⇒ a1 = a2; b1 = b2
⇒ a1 + ib1 = a2 + ib2 ⇒ α1 = α2; ∀ α1, α2 ∈ C.
Therefore T is one-to-one. Also, for each (a, b) ∈ ℜ², ∃ a complex number (a + ib) such that
T(a + ib) = (a, b). So T is onto. Hence T is an isomorphism and therefore C(ℜ) ≅ ℜ²(ℜ).
Ex 5.2.2 The linear mapping T : ℜ³ → ℜ³ maps the vectors (1, 2, 3), (3, 0, 1) and (0, 3, 1)
to (3, 0, 2), (5, 2, 2) and (4, 1, 1) respectively. Show that T is an isomorphism.
Solution: Let α = (x, y, z) be an arbitrary vector in ℜ³; then
α = (x, y, z) = c1(1, 2, 3) + c2(3, 0, 1) + c3(0, 3, 1)
= (c1 + 3c2, 2c1 + 3c3, 3c1 + c2 + c3)
⇒ c1 + 3c2 = x, 2c1 + 3c3 = y, 3c1 + c2 + c3 = z
⇒ c1 = (1/6)(−x − y + 3z), c2 = (1/18)(7x + y − 3z), c3 = (1/9)(x + 4y − 3z).
Thus the linear transformation becomes
T(x, y, z) = c1(3, 0, 2) + c2(5, 2, 2) + c3(4, 1, 1)
= ((3x + y − 6z)/3, (2x − y)/3, (x + 2y − 3z)/3).
Now, we show that T is one-to-one; for this we show that kerT = {θ}. Now, T(α) = θ gives
(3x + y − 6z)/3 = 0, (2x − y)/3 = 0, (x + 2y − 3z)/3 = 0
⇒ x = y = z = 0,
so that kerT = {θ} and so T is one-to-one. Also, since dim kerT = 0, the dimension theorem
gives dim ImT = 3; hence ImT = ℜ³ and therefore T is onto, and consequently T is an
isomorphism.
Theorem 5.2.1 Let V be a finite dimensional vector space of dimension n over a field
F. Then every n-dimensional vector space V(F) is isomorphic to the n-tuple space Fⁿ(F).
Proof: Since dimV = n, let {α1, α2, . . . , αn} be a basis of the finite dimensional vector
space V(F). Then, for each α ∈ V, ∃ a unique ordered set {c1, c2, . . . , cn} of scalars such
that α = c1α1 + c2α2 + · · · + cnαn. Let us consider the mapping T : V → Fⁿ defined by
T(α) = (c1, c2, . . . , cn); where α = c1α1 + c2α2 + · · · + cnαn.

If α = c1α1 + · · · + cnαn and β = d1α1 + · · · + dnαn are any two elements of V, then
∀ x, y ∈ F, we have
T(xα + yβ) = T[x(c1α1 + · · · + cnαn) + y(d1α1 + · · · + dnαn)]
= T[(xc1 + yd1)α1 + · · · + (xcn + ydn)αn]
= (xc1 + yd1, xc2 + yd2, . . . , xcn + ydn)
= x(c1, c2, . . . , cn) + y(d1, d2, . . . , dn)
= xT(α) + yT(β); ∀ α, β ∈ V.
Hence T is a homomorphism. Also,
T(α) = T(β) ⇒ T(c1α1 + · · · + cnαn) = T(d1α1 + · · · + dnαn)
⇒ (c1, c2, . . . , cn) = (d1, d2, . . . , dn)
⇒ ci = di; i = 1, 2, . . . , n
⇒ c1α1 + · · · + cnαn = d1α1 + · · · + dnαn ⇒ α = β.
This shows that T is one-one. Again, corresponding to each vector (c1, c2, . . . , cn) ∈ Fⁿ,
∃ a vector c1α1 + · · · + cnαn in V such that T(c1α1 + · · · + cnαn) = (c1, c2, . . . , cn).
Therefore, T is onto. Thus

T is an isomorphism and hence V


= F n (F ). Thus the mapping []S , S be any basis
of a vector space V (of dim n), which maps each V into the co-ordinate vector []S is
an isomorphism between V and F m . Note the following facts:
(i) The isomorphism T depends on the choice of the ordered basis (α1, α2, . . . , αn). Different choices of ordered basis give different isomorphisms.
(ii) Since T : V → Fⁿ is an isomorphism, both V and Fⁿ have the same structure as a vector space except for the names of their elements. Therefore, Fⁿ serves as a prototype of a vector space V over F of dimension n.
(iii) A real vector space V of dimension n and the vector space ℝⁿ of n-tuples are isomorphic and therefore they have the same structure as vector spaces.
(iv) Let (i1, i2, . . . , in) be a fixed permutation of (1, 2, . . . , n). Then the mapping T defined by T(x1, x2, . . . , xn) = (xi1, xi2, . . . , xin) is an isomorphism from Fⁿ to itself. This transformation is called a permutation transformation.
All the concepts and properties defined only through vector addition and scalar multiplication are valid in any isomorphic image of a vector space.
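The co-ordinate map [·]_S above can be sketched numerically: writing a vector in a basis amounts to solving a linear system, and the map is linear. The basis below is an illustrative assumption:

```python
import numpy as np

# Coordinate map [.]_S : R^3 -> R^3 for the illustrative basis
# S = {(0,1,1), (1,0,1), (1,1,0)}; columns of B are the basis vectors.
B = np.column_stack([(0., 1., 1.), (1., 0., 1.), (1., 1., 0.)])

def coords(v):
    # [v]_S solves B c = v, i.e. v = c1*b1 + c2*b2 + c3*b3.
    return np.linalg.solve(B, v)

u = np.array([1., 2., 3.])
w = np.array([0., 1., -1.])
# Linearity of the coordinate map: [2u + 3w]_S = 2[u]_S + 3[w]_S,
# which is exactly why it is an isomorphism V -> F^n.
print(np.allclose(coords(2*u + 3*w), 2*coords(u) + 3*coords(w)))  # True
```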
Theorem 5.2.2 If W1 and W2 be complementary subspaces of a vector space V(F), then the correspondence that assigns to each vector β ∈ W2 the coset (W1 + β) is an isomorphism between W2 and (V/W1).
Proof: Since W1, W2 are complementary subspaces of the vector space V(F), we have
V = W1 ⊕ W2; where V = W1 + W2 and W1 ∩ W2 = {θ}.
Let us consider the mapping T : W2 → (V/W1), defined by T(β) = W1 + β; ∀ β ∈ W2.
Now, ∀ α, β ∈ W2 and ∀ a, b ∈ F, we have
T(aα + bβ) = W1 + (aα + bβ) = (W1 + aα) + (W1 + bβ)
= a(W1 + α) + b(W1 + β) = aT(α) + bT(β).
This shows that T is a linear transformation. Also, for α, β ∈ W2 we have,
T(α) = T(β) ⇒ W1 + α = W1 + β
⇒ α - β ∈ W1
⇒ α - β ∈ W1 ∩ W2; as α, β ∈ W2 ⇒ α - β ∈ W2
⇒ α - β = θ; as W1 ∩ W2 = {θ}
⇒ α = β.
So T is one-to-one. Now, let (W1 + γ) be an arbitrary element of V/W1 so that γ ∈ V. Thus γ = α + β, for some α ∈ W1, β ∈ W2. Hence
W1 + γ = W1 + (α + β) = (W1 + α) + (W1 + β)
= W1 + β = T(β); as α ∈ W1 ⇒ W1 + α = W1.
Thus, corresponding to each member W1 + γ of V/W1, ∃ β ∈ W2 such that T(β) = W1 + γ. So T is onto. Hence T is an isomorphism and hence W2 ≅ (V/W1).
Theorem 5.2.3 Two finite dimensional vector spaces over the same field are isomorphic if
and only if they are of the same dimension.

Proof: First, let V and W be two finite dimensional vector spaces over the same field F such that dim V = dim W = n. Let S1 = {α1, α2, . . . , αn} and S2 = {β1, β2, . . . , βn} be bases of V and W respectively. Then ∀ α ∈ V, ∃ a unique ordered set {c1, c2, . . . , cn} of scalars such that α = Σi ciαi. Let us consider the mapping T : V → W, defined by
T(α) = Σi ciβi; α = Σi ciαi ∈ V.
The map is well defined. Now, if α = Σi ciαi and β = Σi diαi be any two vectors in V, then ∀ a, b ∈ F, we have,
T(aα + bβ) = T[Σi (aci + bdi)αi] = Σi (aci + bdi)βi
= a Σi ciβi + b Σi diβi = aT(α) + bT(β).
This shows that T is a linear transformation or homomorphism. Also, if α = Σi ciαi and β = Σi diαi be such that
T(α) = T(β) ⇒ T(Σi ciαi) = T(Σi diαi)
⇒ Σi ciβi = Σi diβi, i.e., Σi (ci - di)βi = θ
⇒ ci - di = 0; ∀ i, as S2 is LI
⇒ α = Σi ciαi = Σi diαi = β.
Therefore T is one-one. Again, if β ∈ W, then, for some scalars ci's we have,
β = Σi ciβi; as S2 generates W
= T(Σi ciαi), where Σi ciαi ∈ V.
Thus, for each β ∈ W, ∃ Σi ciαi ∈ V such that T(Σi ciαi) = β. This shows that T is onto and hence V ≅ W.
Conversely, let V ≅ W and let T be the corresponding isomorphism. Let S1 = {α1, α2, . . . , αn} be a basis of V. Then, we claim that S2 = {T(α1), T(α2), . . . , T(αn)} is a basis of W. S2 is linearly independent, since
c1T(α1) + c2T(α2) + · · · + cnT(αn) = θ
⇒ T(c1α1 + c2α2 + · · · + cnαn) = T(θ); since T is LT
⇒ c1α1 + c2α2 + · · · + cnαn = θ; since T is one-one
⇒ c1 = c2 = · · · = cn = 0; as S1 is LI.
Now, in order to show that W = L({T(α1), T(α2), . . . , T(αn)}), let β be an arbitrary element of W; then, T being onto, ∃ α ∈ V such that T(α) = β. Thus, for some scalars ci, we have,
β = T(α) = T(Σi ciαi); as S1 generates V
= Σi ciT(αi); as T is linear.
Thus every vector in W is a linear combination of elements of S2 and so, S2 generates W. Hence S2 is a basis of W, and so, dim V = dim W.
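The forward construction of Theorem 5.2.3 (send Σ ciαi to Σ ciβi, basis to basis) can be sketched numerically; the two bases below are illustrative assumptions:

```python
import numpy as np

# Illustrative bases (as columns) of two 3-dimensional real spaces.
S1 = np.column_stack([(1., 0., 0.), (1., 1., 0.), (1., 1., 1.)])  # basis of V
S2 = np.column_stack([(0., 1., 1.), (1., 0., 1.), (1., 1., 0.)])  # basis of W

def T(alpha):
    # alpha = sum c_i a_i  |->  sum c_i b_i: reuse the S1-coordinates on S2.
    c = np.linalg.solve(S1, alpha)
    return S2 @ c

# As a matrix, T is S2 @ inv(S1); its determinant is nonzero,
# so T is invertible, i.e. an isomorphism.
M = S2 @ np.linalg.inv(S1)
print(abs(np.linalg.det(M)) > 0)                                  # True
print(np.allclose(T(np.array([1., 2., 3.])), M @ np.array([1., 2., 3.])))  # True
```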
Theorem 5.2.4 Let V and W be finite dimensional vector spaces over a field F and φ : V → W be an isomorphism. Then for a set of vectors S in V, S is linearly independent in V if and only if φ(S) is linearly independent in W.
Proof: Let S = {α1, α2, . . . , αn} be a linearly independent set in V. For some scalars c1, c2, . . . , cn ∈ F, let us consider the relation c1φ(α1) + c2φ(α2) + · · · + cnφ(αn) = θW, which implies,
φ(c1α1 + c2α2 + · · · + cnαn) = θW
⇒ c1α1 + c2α2 + · · · + cnαn = θV, as φ is an isomorphism
⇒ c1 = c2 = · · · = cn = 0,
as {α1, α2, . . . , αn} is a linearly independent set in V. Therefore, {φ(α1), φ(α2), . . . , φ(αn)} is linearly independent in W. Conversely, let {β1, β2, . . . , βr} be a linearly independent set in W. Since φ is an isomorphism, ∃ unique elements α1, α2, . . . , αr in V such that
φ(αi) = βi, i = 1, 2, . . . , r,
and therefore, S = {α1, α2, . . . , αr}. For some scalars c1, c2, . . . , cr ∈ F, let us consider the relation c1α1 + c2α2 + · · · + crαr = θV, which implies,
φ(c1α1 + c2α2 + · · · + crαr) = θW
⇒ c1φ(α1) + c2φ(α2) + · · · + crφ(αr) = θW, as φ is linear
⇒ c1β1 + c2β2 + · · · + crβr = θW
⇒ c1 = c2 = · · · = cr = 0,
as {β1, β2, . . . , βr} is a linearly independent set in W. Therefore, {α1, α2, . . . , αr} is a linearly independent set in V.
Theorem 5.2.5 Any homomorphism of a finite dimensional vector space onto itself is an isomorphism.
Proof: Let T be a homomorphism of a finite dimensional vector space V(F) onto itself. Let dim V = n and S = {α1, α2, . . . , αn} be a basis of V. We are to show that S1 = {T(α1), T(α2), . . . , T(αn)} is also a basis of V. Let β ∈ V. Since T is onto, ∃ α ∈ V such that T(α) = β. Let,
α = c1α1 + c2α2 + · · · + cnαn; ci ∈ F
⇒ β = T(α) = T(c1α1 + c2α2 + · · · + cnαn)
= c1T(α1) + c2T(α2) + · · · + cnT(αn); T is linear.
Thus, each β ∈ V is expressible as a linear combination of elements of S1, i.e., S1 generates V. Now, dim V = n and S1 is a set of n vectors of V generating V, so S1 is a linearly independent set. Now we shall show that T is one-one. For this, let α = Σi ciαi and β = Σi diαi be two elements of V such that T(α) = T(β). Then
T(α) = T(β) ⇒ T(Σi ciαi) = T(Σi diαi)
⇒ Σi ciT(αi) = Σi diT(αi); T is linear
⇒ Σi (ci - di)T(αi) = θ
⇒ ci - di = 0; ∀ i, as S1 is LI
⇒ α = Σi ciαi = Σi diαi = β; α, β ∈ V.
Hence T is also one-one and consequently it is an isomorphism.

5.3 Vector Space of Linear Transformation

In this section, we are to give algebraic operations on the set of all linear mappings.
Theorem 5.3.1 (Algebraic operations on the set of all linear mappings): Let V and W be two vector spaces over the same field F and let T : V → W, S : V → W be two linear mappings. Then the set L(V, W) of all linear transformations from V into W is a vector space with respect to the sum T + S : V → W and the scalar multiplication cT : V → W, defined by
(i) (T + S)(α) = T(α) + S(α); ∀ α ∈ V.
(ii) (cT)(α) = cT(α); ∀ α ∈ V and c ∈ F.
Proof: First we are to show that if T and S are linear then T + S and cT are also linear transformations. For this, let α, β ∈ V; then
(T + S)(α) = T(α) + S(α); (T + S)(β) = T(β) + S(β)
⇒ (T + S)(α + β) = T(α + β) + S(α + β); by definition
= T(α) + T(β) + S(α) + S(β); T, S linear
= [T(α) + S(α)] + [T(β) + S(β)]
= (T + S)(α) + (T + S)(β).
For any scalar k ∈ F, we have,
(T + S)(kα) = T(kα) + S(kα); by definition
= kT(α) + kS(α); T, S linear
= k[T(α) + S(α)] = k(T + S)(α).
Hence T + S is a linear mapping. Again,
(cT)(α + β) = cT(α + β); by definition
= c[T(α) + T(β)]; T is linear
= cT(α) + cT(β) = (cT)(α) + (cT)(β),
and (cT)(kα) = c[T(kα)]; by definition, ∀ k ∈ F
= c[kT(α)]; T is linear
= ckT(α) = k(cT)(α).
Therefore, cT is also a linear transformation. Thus the scalar multiplication composition is well defined. It is easy to prove that ⟨L(V, W), +⟩ is an abelian group. However, it may be noted that the mapping
O : V → W : O(α) = θ; ∀ α ∈ V
is a linear transformation and is the zero of L(V, W). Also, for each T ∈ L(V, W), the mapping (-T), defined by
(-T)(α) = -T(α); ∀ α ∈ V,
is a linear transformation, which is the additive inverse of T. Also, the two compositions satisfy the following properties:
(i) [k(T + S)](α) = k[(T + S)(α)] = k[T(α) + S(α)]; k ∈ F
= kT(α) + kS(α)
= (kT)(α) + (kS)(α) = (kT + kS)(α); ∀ α ∈ V
⇒ k(T + S) = kT + kS.
(ii) [(m + n)T](α) = (m + n)T(α); m, n ∈ F
= mT(α) + nT(α) = (mT)(α) + (nT)(α)
= (mT + nT)(α); ∀ α ∈ V
⇒ (m + n)T = mT + nT.
(iii) [(mn)T](α) = (mn)T(α); m, n ∈ F
= m[nT(α)] = m[(nT)(α)]
= [m(nT)](α); ∀ α ∈ V
⇒ (mn)T = m(nT).
(iv) (1T)(α) = 1T(α) = T(α) ⇒ 1T = T,
1 being the identity element in F. Therefore, L(V, W) is a vector space. This linear space L(V, W) of all linear mappings has domain V and co-domain W; as the linear mapping T : V → W is also a homomorphism of V into W, the linear space L(V, W) is also denoted by Hom(V, W). Two particular cases of the linear space L(V, W) are of profound interest. The first one is L(V, V) and the second one is L(V, F), where F is a vector space over F itself. The linear space L(V, F) is said to be the dual space of V.
Theorem 5.3.2 Let V and W be two finite dimensional vector spaces over the same field F. Let dim V = m and dim W = n; then the vector space L(V, W) of all linear transformations from V into W is of dimension mn.
Proof: Let B1 = {α1, α2, . . . , αm} and B2 = {β1, β2, . . . , βn} be the ordered bases of V and W respectively. Then, for each pair of positive integers (x, y), where 1 ≤ x ≤ n and 1 ≤ y ≤ m, there exists a linear transformation fxy : V → W such that,
fxy(αi) = δiy βx; δiy = Kronecker's delta,
i.e., fxy(αy) = βx and fxy(αi) = θ for i ≠ y. We shall show that the set S = {fxy : 1 ≤ x ≤ n; 1 ≤ y ≤ m} is a basis of L(V, W). It is clear that S contains mn elements. Now, for some axy's, we have
Σx Σy axy fxy = O; O is the zero of L(V, W)
⇒ (Σx Σy axy fxy)(αi) = O(αi) = θ
⇒ Σx Σy axy fxy(αi) = Σx Σy axy δiy βx = Σx axi βx = θ; ∀ i, (1 ≤ i ≤ m)
⇒ a1i = 0, a2i = 0, . . . , ani = 0; ∀ i, as B2 is LI
⇒ axy = 0, 1 ≤ x ≤ n; 1 ≤ y ≤ m.
This shows that S is LI. Now, let f be an arbitrary element of L(V, W). Then f(αi) ∈ W for each i = 1, 2, . . . , m. Let f(αi) = a1iβ1 + a2iβ2 + · · · + aniβn. Now,
(Σx Σy axy fxy)(αi) = Σx Σy axy fxy(αi) = Σx Σy axy δiy βx = Σx axi βx = f(αi)
⇒ f = Σx Σy axy fxy.
Thus, every element of L(V, W) is a linear combination of elements of S, i.e., S generates L(V, W). So, S is a basis of L(V, W). Hence, dim L(V, W) = mn.
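The dimension count mn can be illustrated with matrices, since the basis maps fxy correspond to the "matrix units" with a single 1 entry; a sketch with the assumed dimensions dim V = m = 2 and dim W = n = 3:

```python
import numpy as np

m, n = 2, 3  # dim V = 2, dim W = 3; every linear map V -> W is a 3x2 matrix
units = []
for x in range(n):          # index of the target basis vector in W
    for y in range(m):      # index of the source basis vector in V
        E = np.zeros((n, m))
        E[x, y] = 1.0       # analogue of f_xy: sends alpha_y to beta_x, rest to 0
        units.append(E.ravel())

# The mn matrix units, flattened, are linearly independent and span
# all 3x2 matrices, so dim L(V, W) = mn = 6.
U = np.stack(units)
print(U.shape, np.linalg.matrix_rank(U))  # (6, 6) 6
```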

5.3.1 Product of Linear Mappings
Let U, V and W be three vector spaces over the same field F (not necessarily of the same dimension). Let T : U → V and S : V → W be two linear mappings. Then their product, denoted by S ∘ T : U → W, is the composite mapping (composite of S with T) defined by
(S ∘ T)(α) = S(T(α)); ∀ α ∈ U.
Generally S ∘ T is denoted by ST and is also said to be the product mapping ST. In general, ST ≠ TS.
Ex 5.3.1 Let T : P2 → P2 and S : P2 → P2 be defined by T(ax² + bx + c) = 2ax + b and S(ax² + bx + c) = 2ax² + bx. Compute TS and ST.
Solution: To compute TS and ST we have,
(TS)(ax² + bx + c) = T(S(ax² + bx + c)) = T(2ax² + bx) = 4ax + b.
(ST)(ax² + bx + c) = S(T(ax² + bx + c)) = S(2ax + b) = 2ax.
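The two composites can be checked by representing ax² + bx + c as a coefficient triple (a, b, c); a small sketch consistent with the stated definitions of T and S:

```python
# Represent p(x) = a*x**2 + b*x + c by the tuple (a, b, c).
def T(p):
    a, b, c = p
    return (0, 2*a, b)        # T(ax^2 + bx + c) = 2ax + b

def S(p):
    a, b, c = p
    return (2*a, b, 0)        # S(ax^2 + bx + c) = 2ax^2 + bx

p = (3, 5, 7)                 # 3x^2 + 5x + 7
print(T(S(p)))                # TS: (0, 12, 5), i.e. 12x + 5 = 4a*x + b
print(S(T(p)))                # ST: (0, 6, 0),  i.e. 6x     = 2a*x
```

The two outputs differ, which makes the remark ST ≠ TS concrete.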
Theorem 5.3.3 The product of two linear transformations is linear.
Proof: Let U, V and W be three vector spaces over the same field F. Let T : U → V and S : V → W be two linear mappings. We are to show that ST : U → W is linear. Let α, β ∈ U and a, b ∈ F. Then we have,
(ST)(aα + bβ) = S[T(aα + bβ)]; by definition
= S[aT(α) + bT(β)]; T is linear
= aS[T(α)] + bS[T(β)]; S is linear
= a(ST)(α) + b(ST)(β).
This proves that the composite mapping ST : U → W is linear.
Theorem 5.3.4 The product of linear mappings is associative.
Proof: Let T1 : V1 → V2, T2 : V2 → V3, T3 : V3 → V4 be three linear transformations such that T3T2T1 is well defined. Let α ∈ V1. Then
[T3(T2T1)](α) = T3[(T2T1)(α)] = T3[T2(T1(α))]
= (T3T2)[T1(α)] = [(T3T2)T1](α)
⇒ T3(T2T1) = (T3T2)T1.
Hence the product of mappings is associative.
Theorem 5.3.5 Let U, V and W be finite dimensional vector spaces over the same field F. If S and T are linear transformations from U into V and from V into W respectively, then rank(TS) ≤ rank(T) and rank(TS) ≤ rank(S).
Proof: Here, S : U → V and T : V → W are two linear mappings. Since S(U) ⊆ V, we have,
S(U) ⊆ V ⇒ T[S(U)] ⊆ T(V)
⇒ (TS)(U) ⊆ T(V)
⇒ range(TS) ⊆ range(T)
⇒ dim[range(TS)] ≤ dim[range(T)].
Hence, rank(TS) ≤ rank(T). Again, since a linear map cannot increase dimension,
dim[T{S(U)}] ≤ dim S(U)
⇒ rank(TS) = dim[(TS)(U)] = dim[T{S(U)}] ≤ dim S(U) = rank(S).
Ex 5.3.2 Let S and T be linear mappings of ℝ² to ℝ² defined by S(x, y) = (x + y, y), ∀ (x, y) ∈ ℝ² and T(x, y) = (x, x + y), ∀ (x, y) ∈ ℝ². Determine TS, ST, (TS - ST)², (ST - TS)².
Solution: Both linear mappings S and T are defined on ℝ². The composite mapping ST : ℝ² → ℝ² is given by
ST(x, y) = S(T(x, y)) = S(x, x + y) = (2x + y, x + y); ∀ (x, y) ∈ ℝ².
TS(x, y) = T(S(x, y)) = T(x + y, y) = (x + y, x + 2y); ∀ (x, y) ∈ ℝ².
To evaluate (TS - ST)² and (ST - TS)², we first calculate TS - ST and ST - TS, which are given by
(TS - ST)(x, y) = (-x, y); ∀ (x, y) ∈ ℝ²
(ST - TS)(x, y) = (x, -y); ∀ (x, y) ∈ ℝ²
⇒ (TS - ST)²(x, y) = (TS - ST)(TS - ST)(x, y) = (TS - ST)(-x, y) = (x, y); ∀ (x, y) ∈ ℝ²
and (ST - TS)²(x, y) = (ST - TS)(ST - TS)(x, y) = (ST - TS)(x, -y) = (x, y); ∀ (x, y) ∈ ℝ².
Now, ker(TS - ST)² = ker(ST - TS)² = {θ}. Thus, we see that (TS - ST)² = Iℝ² = (ST - TS)².
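The commutator computation can be checked with the standard-basis matrices of S and T; a quick sketch:

```python
import numpy as np

# Matrices of S(x,y) = (x+y, y) and T(x,y) = (x, x+y) in the standard basis.
S = np.array([[1., 1.], [0., 1.]])
T = np.array([[1., 0.], [1., 1.]])

# The composite TS means "apply S, then T", i.e. the matrix product T @ S.
C = T @ S - S @ T      # matrix of TS - ST
print(C)               # [[-1. 0.] [0. 1.]], i.e. (TS-ST)(x,y) = (-x, y)
print(np.allclose(C @ C, np.eye(2)))  # True: (TS - ST)^2 = I
```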
Ex 5.3.3 Let V be the linear space of all real polynomials p(x). Let D and T be linear mappings on V defined by D(p(x)) = (d/dx)p(x), ∀ p(x) ∈ V and T(p(x)) = xp(x), ∀ p(x) ∈ V. Show that, DT - TD = IV and DT² - T²D = 2T.
Solution: The linear mappings D and T on V are defined by D(p(x)) = (d/dx)p(x), ∀ p(x) ∈ V and T(p(x)) = xp(x), ∀ p(x) ∈ V. Now,
(DT - TD)(p(x)) = D(T(p(x))) - T(D(p(x)))
= D(xp(x)) - T((d/dx)p(x)) = (d/dx)(xp(x)) - x(d/dx)p(x)
= x(d/dx)p(x) + p(x) - x(d/dx)p(x) = p(x).
Therefore, DT - TD = IV. Now,
DT²(p(x)) = DT(T(p(x))) = DT(xp(x)) = D(T(xp(x))) = D(x²p(x)) = 2xp(x) + x²(d/dx)p(x),
T²D(p(x)) = T²(D(p(x))) = T²((d/dx)p(x)) = T{T((d/dx)p(x))} = T{x(d/dx)p(x)} = x²(d/dx)p(x)
⇒ (DT² - T²D)(p(x)) = 2xp(x) = 2T(p(x))
⇒ DT² - T²D = 2T.
Ex 5.3.4 Let T1 : ℝ³ → ℝ³ and T2 : ℝ³ → ℝ³ be defined by T1(x, y, z) = (2x, y - z, 0) and T2(x, y, z) = (x + y, 2x, 2z). Find the formulae to define the mappings
(a) T1 + T2, (b) 3T1 - 2T2, (c) T1T2, (d) T2T1².
Solution: Using the given transformations,
(a) (T1 + T2)(x, y, z) = T1(x, y, z) + T2(x, y, z) = (2x, y - z, 0) + (x + y, 2x, 2z) = (3x + y, 2x + y - z, 2z).
(b) (3T1 - 2T2)(x, y, z) = 3T1(x, y, z) - 2T2(x, y, z) = 3(2x, y - z, 0) - 2(x + y, 2x, 2z) = (4x - 2y, -4x + 3y - 3z, -4z).
(c) T1T2(x, y, z) = T1(x + y, 2x, 2z) = (2x + 2y, 2x - 2z, 0).
(d) T2T1²(x, y, z) = T2T1(T1(x, y, z)) = T2T1(2x, y - z, 0) = T2(4x, y - z, 0) = (4x + y - z, 8x, 0).

Ex 5.3.5 Give an example of a linear transformation T : ℝ² → ℝ² such that T²(α) = -α for all α ∈ ℝ².    [IIT-JAM'10]
Solution: The linear transformation T : ℝ² → ℝ² is such that T²(α) = -α for all α ∈ ℝ², i.e., T² = -I. Let T = [ a b ; c d ]. Then
T² = [ a² + bc  b(a + d) ; c(a + d)  bc + d² ] = [ -1 0 ; 0 -1 ]
⇒ a² + bc = -1, b(a + d) = 0, c(a + d) = 0, bc + d² = -1.
Let b = c = 0; then a² = -1, i.e., a = ±i, and d² = -1, i.e., d = ±i. Therefore
T = [ i 0 ; 0 i ] and [ -i 0 ; 0 -i ]
satisfy T² = -I, but these are not real. Also, when a + d = 0, then a = -d and bc = -1 - a²; taking a = 1, d = -1, b = 2, c = -1, such a matrix is written as
T = [ 1 2 ; -1 -1 ], for which T² = [ -1 0 ; 0 -1 ].
Therefore, T over the real field is given as T = [ 1 2 ; -1 -1 ].

5.3.2 Invertible Mapping

Let V and W be vector spaces over a field F. A linear mapping T : V → W is said to be invertible if ∃ a mapping S : W → V such that ST = IV and TS = IW, where IV and IW are the identity operators on V and W respectively. In this case, S is said to be an inverse of T and is denoted by T⁻¹.
Theorem 5.3.6 Let V and W be vector spaces over the same field F. If T : V → W be invertible then T has a unique inverse.
Proof: To prove the uniqueness, let T1 : W → V and T2 : W → V be two inverses of T. Then by definition,
T1T = T2T = IV; TT1 = TT2 = IW.
Now by using the associative law, we get,
T1(TT2) = (T1T)T2 ⇒ T1IW = IVT2 ⇒ T1 = T2.
This proves that the inverse of T is unique, and the inverse of the invertible linear mapping T is denoted by T⁻¹. Also, T⁻¹ : W → V is a linear transformation and (T⁻¹)⁻¹ = T.
Theorem 5.3.7 Let V and W be vector spaces over the same field F. A linear mapping T : V → W is invertible if and only if T is one-to-one and onto.
Proof: Let the linear mapping T : V → W be invertible; then ∃ a mapping S : W → V such that ST = IV and TS = IW. First, we are to show that T is one-to-one and onto. For this let α, β ∈ V such that T(α) = T(β); then,
T(α) = T(β) ⇒ ST(α) = ST(β)
⇒ IV(α) = IV(β); since ST = IV
⇒ α = β.
Therefore, T is one-to-one. To prove that T is onto, let β ∈ W. As TS = IW, so TS(β) = β, i.e., T{S(β)} = β, which shows that S(β) is a pre-image of β under T. Therefore T is onto. Thus, when T is invertible, it is both one-to-one and onto.
Conversely, let T : V → W be both one-to-one and onto. Let α ∈ V be such that T(α) = β ∈ W. As T is one-to-one, β is the unique image of α under T. Since T is onto, each β ∈ W has a pre-image in V. Let us define a mapping S : W → V by S(β) = α. Then
ST(α) = S(β) = α; ∀ α ∈ V and TS(β) = T(α) = β; ∀ β ∈ W
⇒ ST = IV and TS = IW.
Hence T is invertible and this completes the proof.
Theorem 5.3.8 Let V and W be vector spaces over the same field F. If a linear mapping T : V → W is invertible, then the inverse mapping T⁻¹ : W → V is linear.
Proof: Here T⁻¹ : W → V is the inverse mapping of the linear mapping T : V → W, so TT⁻¹ = IW and T⁻¹T = IV. Let α, β ∈ W be such that T⁻¹(α) = γ ∈ V and T⁻¹(β) = δ ∈ V. Thus, T(γ) = α and T(δ) = β. Since T is linear, we have
T(aγ + bδ) = aT(γ) + bT(δ); ∀ a, b ∈ F
⇒ T(aγ + bδ) = aα + bβ
⇒ T⁻¹[T(aγ + bδ)] = T⁻¹(aα + bβ)
⇒ IV(aγ + bδ) = T⁻¹(aα + bβ)
⇒ aγ + bδ = T⁻¹(aα + bβ); as aγ + bδ ∈ V
⇒ aT⁻¹(α) + bT⁻¹(β) = T⁻¹(aα + bβ).
This proves that T⁻¹ is linear. Hence, if T : V → V be an invertible linear mapping on V, then the linear mapping T⁻¹ : V → V has the property that T⁻¹T = TT⁻¹ = IV. Both T and T⁻¹ are automorphisms. However, as TT⁻¹ = IW, T⁻¹T = IV and inverses are unique, we conclude that (T⁻¹)⁻¹ = T.
Ex 5.3.6 Let S and T be linear mappings of ℝ³ to ℝ³ defined by S(x, y, z) = (z, y, x), ∀ (x, y, z) ∈ ℝ³ and T(x, y, z) = (x + y + z, y + z, z), ∀ (x, y, z) ∈ ℝ³.
(i) Determine TS and ST. (ii) Prove that both S and T are invertible. Verify that (ST)⁻¹ = T⁻¹S⁻¹.
Solution: Using the definition, we get,
TS(x, y, z) = T(S(x, y, z)) = T(z, y, x) = (x + y + z, x + y, x); ∀ (x, y, z) ∈ ℝ³
ST(x, y, z) = S(T(x, y, z)) = S(x + y + z, y + z, z) = (z, y + z, x + y + z); ∀ (x, y, z) ∈ ℝ³.
Now, ker S = ker T = {θ}, so both S and T are one-to-one; being operators on the finite dimensional space ℝ³, they are also onto. Hence both S and T are invertible. Now let,
(ST)(x, y, z) = (z, y + z, x + y + z) = (a, b, c)
⇒ z = a, y + z = b, x + y + z = c, i.e., x = c - b, y = b - a, z = a
⇒ (ST)⁻¹(a, b, c) = (c - b, b - a, a)
⇒ (ST)⁻¹(x, y, z) = (z - y, y - x, x).
Now, we are to evaluate T⁻¹ and S⁻¹.
T(x, y, z) = (x + y + z, y + z, z) = (a1, b1, c1)
⇒ x + y + z = a1, y + z = b1, z = c1
⇒ x = a1 - b1, y = b1 - c1, z = c1
⇒ T⁻¹(a1, b1, c1) = (a1 - b1, b1 - c1, c1), i.e., T⁻¹(x, y, z) = (x - y, y - z, z).
Again,
S(x, y, z) = (z, y, x) = (a2, b2, c2)
⇒ x = c2, y = b2, z = a2
⇒ S⁻¹(a2, b2, c2) = (c2, b2, a2), i.e., S⁻¹(x, y, z) = (z, y, x).
Hence, T⁻¹S⁻¹ is given by,
T⁻¹S⁻¹(x, y, z) = T⁻¹(S⁻¹(x, y, z)) = T⁻¹(z, y, x) = (z - y, y - x, x).
Hence (ST)⁻¹ = T⁻¹S⁻¹ is verified.
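The verification carries over to the standard-basis matrices of S and T, where it is the familiar matrix identity (ST)⁻¹ = T⁻¹S⁻¹; a quick sketch:

```python
import numpy as np

# Matrices of S(x,y,z) = (z, y, x) and T(x,y,z) = (x+y+z, y+z, z).
S = np.array([[0., 0., 1.], [0., 1., 0.], [1., 0., 0.]])
T = np.array([[1., 1., 1.], [0., 1., 1.], [0., 0., 1.]])

ST = S @ T                       # matrix of the composite ST
lhs = np.linalg.inv(ST)
rhs = np.linalg.inv(T) @ np.linalg.inv(S)
print(np.allclose(lhs, rhs))     # True: (ST)^-1 = T^-1 S^-1
```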
Ex 5.3.7 Let S : ℝ³ → ℝ³ and T : ℝ³ → ℝ³ be two linear mappings defined by S(x, y, z) = (z, y, x) and T(x, y, z) = (x + y + z, y + z, z), ∀ (x, y, z) ∈ ℝ³. Prove that both S and T are invertible. Verify that (ST)⁻¹ = T⁻¹S⁻¹.
Solution: Let S(x, y, z) = (0, 0, 0). Then (z, y, x) = (0, 0, 0) implies x = y = z = 0, so ker(S) = {θ} and S is one-to-one. Also, the domain and co-domain of S are of equal dimension 3; therefore, S is onto. Hence S is invertible. Now,
S(x, y, z) = (z, y, x), i.e., (x, y, z) = S⁻¹(z, y, x) or, S⁻¹(x, y, z) = (z, y, x).
For the mapping T, let T(x, y, z) = (0, 0, 0). Then (x + y + z, y + z, z) = (0, 0, 0). This gives,
x + y + z = 0, y + z = 0, z = 0, i.e., x = y = z = 0.
Therefore ker(T) = {θ}. Thus T is one-to-one. Also, it is onto. Hence T is bijective and invertible. Now,
T(x, y, z) = (x + y + z, y + z, z) = (u, v, w), (say)
where u = x + y + z, v = y + z, w = z, i.e., z = w, y = v - w and x = u - v. Therefore,
T⁻¹(u, v, w) = (x, y, z) = (u - v, v - w, w)
or, T⁻¹(x, y, z) = (x - y, y - z, z).
Last part: Now,
ST(x, y, z) = S(x + y + z, y + z, z) = (z, y + z, x + y + z).
Therefore (ST)⁻¹(z, y + z, x + y + z) = (x, y, z)
or, (ST)⁻¹(x, y, z) = (z - y, y - x, x).
Again, T⁻¹S⁻¹(x, y, z) = T⁻¹(z, y, x) = (z - y, y - x, x).
Thus (ST)⁻¹ = T⁻¹S⁻¹, verified.
Ex 5.3.8 Let T be a linear operator on ℝ³ defined by T(x, y, z) = (2x, 4x - y, 2x + 3y - z). Show that T is invertible and find a formula for T⁻¹.
Solution: We are to find ker T by setting T(α) = θ, where α = (x, y, z). Therefore T(x, y, z) = θ gives
(2x, 4x - y, 2x + 3y - z) = (0, 0, 0)
⇒ 2x = 0, 4x - y = 0, 2x + 3y - z = 0
⇒ x = y = z = 0.
This system has only the trivial solution (0, 0, 0) and so ker T = {θ}. Therefore, T is one-one. Also, V = ℝ³, W = ℝ³, so that dim V = dim W and T is onto. Hence by theorem, T is non-singular and so it is invertible. Let β = (a, b, c) be the image of (x, y, z) under T. Then (x, y, z) is the image of (a, b, c) under T⁻¹. Thus,
T(x, y, z) = (2x, 4x - y, 2x + 3y - z) = (a, b, c)
⇒ 2x = a, 4x - y = b, 2x + 3y - z = c
⇒ x = a/2, y = 2a - b, z = 7a - 3b - c
⇒ T⁻¹(a, b, c) = (a/2, 2a - b, 7a - 3b - c).
Thus the formula for T⁻¹ : ℝ³ → ℝ³ is given by T⁻¹(x, y, z) = (x/2, 2x - y, 7x - 3y - z).
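The formula for T⁻¹ can be cross-checked against the matrix inverse; a quick numerical sketch:

```python
import numpy as np

# Matrix of T(x,y,z) = (2x, 4x - y, 2x + 3y - z) in the standard basis.
T = np.array([[2., 0., 0.],
              [4., -1., 0.],
              [2., 3., -1.]])

# Claimed inverse formula: T^-1(x,y,z) = (x/2, 2x - y, 7x - 3y - z).
Tinv = np.array([[0.5, 0., 0.],
                 [2., -1., 0.],
                 [7., -3., -1.]])

print(np.allclose(T @ Tinv, np.eye(3)))     # True
print(np.allclose(np.linalg.inv(T), Tinv))  # True
```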
Ex 5.3.9 If T : V → V be a linear operator on V such that T² - T + I = O, then show that T is invertible.
Solution: Here it is given that T² - T + I = O. Thus, for any α ∈ V,
T²(α) - T(α) + α = θ.    (∗)
First, if T(α) = θ, then also T²(α) = T(T(α)) = θ, and (∗) gives α = θ; so ker T = {θ} and T is one-to-one. Again, (∗) can be written as α = T(α) - T²(α) = T(α - T(α)), so for each α ∈ V, ∃ β = α - T(α) ∈ V such that T(β) = α. Hence T is onto. Thus T is invertible; in fact, T⁻¹ = I - T.
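A concrete operator satisfying the hypothesis makes the conclusion T⁻¹ = I - T visible; the matrix below is an illustrative assumption (the companion matrix of x² - x + 1), not taken from the text:

```python
import numpy as np

# An operator with T^2 - T + I = 0 (companion matrix of x^2 - x + 1).
T = np.array([[0., -1.],
              [1., 1.]])
I = np.eye(2)

print(np.allclose(T @ T - T + I, np.zeros((2, 2))))  # satisfies the identity
print(np.allclose(np.linalg.inv(T), I - T))          # so T^-1 = I - T
```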

5.4 Singular and Non-singular Transformation
Let V, W be two vector spaces over the same field F. A linear transformation T : V → W is said to be singular if ∃ α (≠ θ) ∈ V such that T(α) = θW, and non-singular if
T(α) = θW ⇒ α = θV, i.e., ker T = {θ}.    (5.5)
Non-singular linear mappings may also be characterized as those mappings that carry independent sets into independent sets. Thus the linear transformation T : V → W is non-singular if it is invertible, i.e., T⁻¹ exists.
Theorem 5.4.1 Let V and W be two vector spaces over the same field F. Then a linear transformation T : V → W is non-singular if and only if T maps every LI subset of V onto a LI subset of W.
Proof: Let the linear transformation T : V → W be non-singular and let S = {α1, α2, . . . , αn} be a LI subset of V. Let, for any scalars c1, c2, . . . , cn ∈ F,
c1T(α1) + c2T(α2) + · · · + cnT(αn) = θW
⇒ T(c1α1 + c2α2 + · · · + cnαn) = θW; T is linear
⇒ c1α1 + c2α2 + · · · + cnαn = θV; T is non-singular
⇒ c1 = c2 = · · · = cn = 0.
Thus S1 = {T(α1), T(α2), . . . , T(αn)} is LI. Conversely, let the image under T of every LI subset of V be a LI subset of W. Now, if α be a non-zero element of V, then {α} is LI and so by the given hypothesis {T(α)} is LI. Consequently, T(α) ≠ θW. Then
T(α) = θW ⇒ α = θV.
Hence T is non-singular. From this theorem, we have the following consequences:
(i) A linear transformation T : V → V is non-singular iff the image of a basis S of V is again a basis of V under the mapping T.
(ii) A linear transformation T : V → V on a finite dimensional vector space V is invertible iff T is non-singular.
(iii) Let V and W be vector spaces over the same field F and T : V → W be a linear transformation. If T is non-singular, then dim V ≤ dim W. For example, since dim ℝ³ is less than dim ℝ⁴, dim(Im T) is less than the dimension of the domain for any T : ℝ⁴ → ℝ³; accordingly, no linear mapping T : ℝ⁴ → ℝ³ can be non-singular.
(iv) A basis S of Vn(F) can be changed to another basis by a non-singular transformation.
Theorem 5.4.2 A linear transformation T : V → W is an isomorphism if and only if T is non-singular.
Proof: Let the linear transformation T : V → W be an isomorphism. Then it is clearly one-to-one. So θV is the only vector such that T(θV) = θW. Consequently,
T(α) = θW ⇒ α = θV.
This shows that T is non-singular. Conversely, let T be non-singular. Then,
T(α) = T(β) ⇒ T(α) - T(β) = θW
⇒ T(α - β) = θW; T is linear
⇒ α - β = θV; T is non-singular
⇒ α = β.
Hence T is one-to-one. Further, if S = {α1, α2, . . . , αn} be a basis of V, then S1 = {T(α1), T(α2), . . . , T(αn)} is a LI subset of W. But W being finite-dimensional, it follows that {T(α1), T(α2), . . . , T(αn)} is a basis of W. Now an arbitrary element β ∈ W can be expressed as
β = c1T(α1) + c2T(α2) + · · · + cnT(αn); ci ∈ F
= T(c1α1 + c2α2 + · · · + cnαn) ∈ Range T; T is linear.
Thus, every element of W is in the range of T, i.e., W ⊆ R(T). So W = R(T), as R(T) ⊆ W, and so T is onto. Hence T is an isomorphism.

5.5 Linear Operator
So far we have discussed some properties of linear mappings of V to W, where V and W are vector spaces over a field F. Let T : V → W and S : V → W be two linear mappings over the same field F. The sum T + S and the scalar product cT, c ∈ F, as mappings from V to W, are defined as
(i) (T + S)(α) = T(α) + S(α); ∀ α ∈ V.
(ii) (cT)(α) = cT(α); ∀ α ∈ V and c ∈ F.
We denote the vector space of all LTs from V into W by L(V, W). Now we shall consider the special case when W = V. A linear mapping T : V → V is called a linear mapping on V; T is also called a linear operator on V. The set of all linear operators on a vector space over a field F forms, in its own right, a linear space over F, denoted by L(V, V).
Deduction 5.5.1 The important feature is that we can define another binary composition, called multiplication, on this set. Let T and S be two linear operators on V; then the composite mappings T ∘ S and S ∘ T are both linear operators on V. If we define ST by S ∘ T, then ST : V → V is defined by ST(α) = S[T(α)] for all α ∈ V. Since the composition of linear mappings is associative, multiplication is associative, i.e., (ST)U = S(TU) for all S, T, U ∈ L(V, V). The mapping IV : V → V defined by IV(α) = α for all α ∈ V is the identity operator. Also, multiplication is distributive with respect to addition: T(S + U) = TS + TU and (S + U)T = ST + UT, for,
[T(S + U)](α) = T[(S + U)(α)] = T[S(α) + U(α)]
= T(S(α)) + T(U(α)); T is linear
= (TS + TU)(α), for all α ∈ V.
Therefore T(S + U) = TS + TU and similarly, (S + U)T = ST + UT. Thus the linear space L(V, V) is a ring under addition and multiplication. It is a non-commutative ring with unity, IV being the unity in the ring.
Theorem 5.5.1 Let T : V → V be a linear operator on a finite dimensional vector space over a field F. Then the following five statements are equivalent:
(i) T is non-singular, i.e., ker T = {θ};
(ii) T is one-to-one;
(iii) T is an onto mapping;
(iv) T maps a linearly independent set of V to another linearly independent set;
(v) T maps a basis of V to another basis.
Proof: Let (i) hold, i.e., the linear operator T is non-singular; then T is invertible. Therefore T is bijective, i.e., T is one-to-one and onto. Hence (ii) holds. Moreover T is linear, so the mapping T : V → V is an isomorphism.
Let (ii) hold. Then, for a one-one mapping, dim Ker T = 0. Sylvester's law dim Ker T + dim Im T = dim V gives dim Im T = n. But Im T ⊆ V and dim V = n. Hence Im T = V and this proves that T is an onto mapping. Hence (iii) holds.
Let (iii) hold and let {α1, α2, . . . , αn} be a linearly independent set of V. Since T is onto, dim Im T = dim V. Therefore dim Ker T = 0 and consequently, Ker T = {θ}. Since Ker T = {θ}, the images of linearly independent sets in V are linearly independent; hence (iv) holds. In particular, the set {T(α1), T(α2), . . . , T(αn)}, being a linearly independent set of n elements, is a basis of V. Hence (v) holds.
Let (v) hold. Let {α1, α2, . . . , αn} be a basis of V. Then {T(α1), T(α2), . . . , T(αn)} is another basis of V. Let α ∈ Ker T and let α = c1α1 + c2α2 + · · · + cnαn. Since T is linear, c1T(α1) + c2T(α2) + · · · + cnT(αn) = θ. This implies c1 = c2 = · · · = cn = 0, since {T(α1), T(α2), . . . , T(αn)} is a linearly independent set. Therefore α ∈ Ker T ⇒ α = θ and consequently, Ker T = {θ}. Therefore T is one-to-one. dim Ker T + dim Im T = dim V gives dim Im T = dim V. But Im T ⊆ V, so Im T = V and this implies that T is onto. T being both one-to-one and onto, T is invertible and therefore non-singular. Hence (i) holds.
Thus all five conditions are equivalent.
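For operators on a finite dimensional space, all five conditions reduce to "full rank" of the representing matrix, which makes the equivalence easy to observe numerically; a small sketch with an arbitrary 4×4 matrix:

```python
import numpy as np

# Any square matrix represents an operator on a finite dimensional space.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))

rank = np.linalg.matrix_rank(A)
nullity = 4 - rank              # Sylvester's law: dim Ker + dim Im = dim V
one_to_one = (nullity == 0)     # condition (ii)
onto = (rank == 4)              # condition (iii)
print(one_to_one == onto)       # True: for operators, injective iff surjective
```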

5.6 Matrix Representation of Linear Transformation

In the previous section we have studied linear transformations by examining their ranges and null spaces. In this section, we develop a one-to-one correspondence between matrices and linear transformations that allows us to utilize properties of one to study properties of the other.
Let V and W be two finite dimensional vector spaces over a field F with dim V = n (≠ 0), dim W = m (≠ 0) and T : V → W be a linear mapping. Let P = (α1, α2, . . . , αn) and Q = (β1, β2, . . . , βm) be ordered bases of V and W respectively. T is completely determined by the images T(α1), T(α2), . . . , T(αn). Since (β1, β2, . . . , βm) is ordered, each T(αi) ∈ W is a linear combination of the vectors {β1, β2, . . . , βm} in a unique manner as
T(α1) = c11β1 + c21β2 + · · · + cm1βm
T(α2) = c12β1 + c22β2 + · · · + cm2βm
. . .
T(αn) = c1nβ1 + c2nβ2 + · · · + cmnβm,
where a unique co-ordinate set c1j, c2j, . . . , cmj ∈ F is determined by the ordered basis (β1, β2, . . . , βm). Let α = Σi xiαi be an arbitrary vector of V and let T(α) = Σj yjβj; xi, yj ∈ F. Now,
T(α) = T(x1α1 + x2α2 + · · · + xnαn)
= x1T(α1) + x2T(α2) + · · · + xnT(αn)
= x1(c11β1 + c21β2 + · · · + cm1βm) + x2(c12β1 + c22β2 + · · · + cm2βm)
+ · · · + xn(c1nβ1 + c2nβ2 + · · · + cmnβm).
As {β1, β2, . . . , βm} is linearly independent, we have,
y1 = c11x1 + c12x2 + · · · + c1nxn
y2 = c21x1 + c22x2 + · · · + c2nxn
. . .
ym = cm1x1 + cm2x2 + · · · + cmnxn
or, [y1 y2 . . . ym]ᵀ = [cij]m×n [x1 x2 . . . xn]ᵀ
or, Y = AX; i.e., [T(α)]Q = A[α]P    (5.6)
where Y, i.e., [T(α)]Q, is the co-ordinate column vector of T(α) with respect to the basis Q in W; X, i.e., [α]P, is the co-ordinate column vector of α in V; and A = [cij]m×n is called the matrix associated with T, or the matrix representation of T, or the matrix of T relative to the ordered bases P = (α1, α2, . . . , αn) and Q = (β1, β2, . . . , βm). Graphically,
α --T--> T(α)
[α]P --A--> [T(α)]Q = A[α]P.
The top horizontal arrow represents the linear transformation T from the n-dimensional vector space V into the m-dimensional vector space W and takes the vector α in V to the vector T(α) in W. The bottom horizontal arrow represents the matrix A. Then [T(α)]Q, a co-ordinate vector in the m-dimensional space, is obtained simply by multiplying [α]P, a co-ordinate vector in the n-dimensional space, by the matrix A. The following are some facts:
(i) If V = W and P = Q, then we write [T(α)] = A[α].
(ii) A can be written as [T] or m(T). Also, the co-ordinate vector of T(αj) relative to the ordered basis (β1, β2, . . . , βm) is given by the j-th column of A.
(iii) For given bases (α1, α2, . . . , αn) and (β1, β2, . . . , βm), the matrix A is unique.
(iv) If the matrix A is given, then the linear mapping T is unique.
(v) Let α = (a1, a2, . . . , an) ∈ V(F); then α can be written as α = Σi aiei, where {e1, e2, . . . , en} is the standard basis. For given A, relative to the standard bases, we have
T(α) = (Σi c1iai, Σi c2iai, . . . , Σi cmiai).
Physicists and others who deal at great length with linear transformations perform most of their computations with the matrices of the linear transformations.
Ex 5.6.1 A linear mapping T : ℝ³ → ℝ³ is defined by
T(x1, x2, x3) = (2x1 + x2 − x3, x2 + 4x3, x1 − x2 + 3x3), (x1, x2, x3) ∈ ℝ³.
Find the matrix of T relative to the ordered basis ((0, 1, 1), (1, 0, 1), (1, 1, 0)) of ℝ³.
Solution: Using the definition of T, we have
T(0, 1, 1) = (0, 5, 2) = 0(1, 0, 0) + 5(0, 1, 0) + 2(0, 0, 1)
T(1, 0, 1) = (1, 4, 4) = 1(1, 0, 0) + 4(0, 1, 0) + 4(0, 0, 1)
T(1, 1, 0) = (3, 1, 0) = 3(1, 0, 0) + 1(0, 1, 0) + 0(0, 0, 1).
Here each image has been expressed in the standard basis of the codomain. Thus the matrix representation of T, relative to the ordered basis ((0, 1, 1), (1, 0, 1), (1, 1, 0)) of the domain and the standard basis of the codomain, is

[T] = m(T) = [ 0 1 3
               5 4 1
               2 4 0 ].
Ex 5.6.2 Let T : ℝ² → ℝ³ be a linear transformation defined by
T(x, y) = (x + 3y, 0, 2x − 4y).
Find the matrix representation relative to the bases β = {e1, e2} and γ′ = {e3, e2, e1}.

Matrix Representation of Linear Transformation

329

Solution: Let β = {e1, e2} and γ = {e1, e2, e3} be the standard bases for ℝ² and ℝ³, respectively. Now
T(1, 0) = (1, 0, 2) = 1e1 + 0e2 + 2e3
and T(0, 1) = (3, 0, −4) = 3e1 + 0e2 − 4e3.
Hence

[T] = [ 1  3
        0  0
        2 −4 ].

If γ′ = {e3, e2, e1} ≠ γ, then with respect to the ordered bases β and γ′ we get

[T]′ = [ 2 −4
         0  0
         1  3 ].
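Reordering the codomain basis simply permutes the rows of the representation. A quick check of Ex 5.6.2 (my own sketch, assuming NumPy):

```python
import numpy as np

def T(v):
    x, y = v
    return np.array([x + 3.0 * y, 0.0, 2.0 * x - 4.0 * y])

# Matrix w.r.t. the standard codomain basis (e1, e2, e3): columns are T(e1), T(e2).
M = np.column_stack([T(np.array([1.0, 0.0])), T(np.array([0.0, 1.0]))])
# Matrix w.r.t. the reversed basis (e3, e2, e1): same columns, rows flipped.
M_rev = M[::-1]

assert np.allclose(M, [[1, 3], [0, 0], [2, -4]])
assert np.allclose(M_rev, [[2, -4], [0, 0], [1, 3]])
```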
Ex 5.6.3 In T : ℝ² → ℝ², T maps (1, 1) to (3, 3) and (−1, 1) to (−5, 7). Determine the matrix of T relative to the ordered basis ((1, 0), (0, 1)).
Solution: We see that {(1, 1), (−1, 1)} and {(3, 3), (−5, 7)} are bases for ℝ². Let scalars c1, c2 and d1, d2 ∈ ℝ be such that
(1, 0) = c1(1, 1) + c2(−1, 1) and (0, 1) = d1(1, 1) + d2(−1, 1).
Solving these linear equations in c1, c2, d1, d2 gives c1 = 1/2, c2 = −1/2, d1 = d2 = 1/2, and so
(1, 0) = ½(1, 1) − ½(−1, 1) and (0, 1) = ½(1, 1) + ½(−1, 1).
Since T is a linear transformation, we have
T(1, 0) = ½T(1, 1) − ½T(−1, 1) = ½(3, 3) − ½(−5, 7) = (4, −2) = 4(1, 0) − 2(0, 1)
T(0, 1) = ½T(1, 1) + ½T(−1, 1) = ½(3, 3) + ½(−5, 7) = (−1, 5) = −1(1, 0) + 5(0, 1).

Hence the matrix of T is

[  4 −1
  −2  5 ].

Let (x, y) ∈ ℝ²; then (x, y) = x(1, 0) + y(0, 1), so that
T(x, y) = xT(1, 0) + yT(0, 1), as T is linear,
= x(4, −2) + y(−1, 5) = (4x − y, −2x + 5y),
which is the required linear transformation.
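The two solves in Ex 5.6.3 amount to one matrix equation: if B holds the known input vectors as columns and G their images, the standard matrix of T is G B⁻¹. A sketch (my own, assuming NumPy):

```python
import numpy as np

B = np.column_stack([[1.0, 1.0], [-1.0, 1.0]])   # known inputs (1,1) and (-1,1)
G = np.column_stack([[3.0, 3.0], [-5.0, 7.0]])   # their images under T
M = G @ np.linalg.inv(B)                          # standard matrix of T

assert np.allclose(M, [[4.0, -1.0], [-2.0, 5.0]])
# T(x, y) = (4x - y, -2x + 5y), matching the computation above.
```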



Ex 5.6.4 The matrix of a linear transformation T : ℝ³ → ℝ² is

A = [ 1 2 −3
      4 2 −1 ].

Determine the transformation relative to the ordered bases ((1, 2, 1), (2, 0, 1), (0, 3, 4)) of ℝ³ and ((2, 1), (0, 5)) of ℝ². Find the matrix of T relative to the ordered bases ((1, 1, 0), (1, 0, 1), (0, 1, 1)) of ℝ³ and ((1, 0), (0, 1)) of ℝ².

Solution: Here, by definition,

T(1, 2, 1) = 1(2, 1) + 4(0, 5) = (2, 21)
T(2, 0, 1) = 2(2, 1) + 2(0, 5) = (4, 12)
T(0, 3, 4) = −3(2, 1) − 1(0, 5) = (−6, −8).

Let (a, b, c) ∈ ℝ³ and let, for some scalars ci ∈ ℝ,

(a, b, c) = c1(1, 2, 1) + c2(2, 0, 1) + c3(0, 3, 4)
= (c1 + 2c2, 2c1 + 3c3, c1 + c2 + 4c3)
⇒ c1 + 2c2 = a, 2c1 + 3c3 = b, c1 + c2 + 4c3 = c
⇒ c1 = (1/13)(3a + 8b − 6c), c2 = (1/13)(5a − 4b + 3c), c3 = (1/13)(−2a − b + 4c).

Thus, the linear transformation T : ℝ³ → ℝ² is given by

T(a, b, c) = c1T(1, 2, 1) + c2T(2, 0, 1) + c3T(0, 3, 4), T being linear,
= c1(2, 21) + c2(4, 12) + c3(−6, −8)
= (2c1 + 4c2 − 6c3, 21c1 + 12c2 − 8c3)
= ( (38a + 6b − 24c)/13 , (139a + 128b − 122c)/13 ), (a, b, c) ∈ ℝ³.

Now, using the above linear transformation T : ℝ³ → ℝ², we get

T(1, 1, 0) = (44/13, 267/13) = (44/13)(1, 0) + (267/13)(0, 1)
T(1, 0, 1) = (14/13, 17/13) = (14/13)(1, 0) + (17/13)(0, 1)
T(0, 1, 1) = (−18/13, 6/13) = (−18/13)(1, 0) + (6/13)(0, 1).

Therefore, the matrix of T is given by

[T] = [  44/13  14/13 −18/13
        267/13  17/13   6/13 ].

Ex 5.6.5 Let {α1, α2, α3} and {β1, β2} be ordered bases of the real vector spaces V and W, and let a linear mapping T : V → W map the basis vectors as
T(α1) = β1 + β2, T(α2) = 2β1 − β2, T(α3) = β1 + 3β2.
Find the matrix of T relative to the ordered bases {α1, α2, α3} of V and {β1, β2} of W.

Solution: For the given linear mapping, the matrix of T is

[T] = [ 1  2  1
        1 −1  3 ].

Now we calculate the matrix of T relative to the ordered bases (α1 + α2, α2, α3) of V and (β1, β1 + β2) of W. For this,

T(α1 + α2) = T(α1) + T(α2) = 3β1 = 3β1 + 0(β1 + β2)
T(α2) = 2β1 − β2 = 3β1 − 1·(β1 + β2)
T(α3) = β1 + 3β2 = −2β1 + 3·(β1 + β2).

Therefore the matrix of T is

[T] = [ 3  3 −2
        0 −1  3 ].
Ex 5.6.6 Let N be the vector space of all real polynomials of degree at most 3. Define S : N → N by (Sp)(x) = p(x + 1), p ∈ N. Then find the matrix of S in the basis {1, x, x², x³}, considered as column vectors. [NET(June)12]

Solution: We have S : N → N defined by (Sp)(x) = p(x + 1), p ∈ N. Then

S(1) = 1 = 1·1 + 0·x + 0·x² + 0·x³
S(x) = x + 1 = 1·1 + 1·x + 0·x² + 0·x³
S(x²) = (x + 1)² = 1 + 2x + x² = 1·1 + 2·x + 1·x² + 0·x³
S(x³) = (x + 1)³ = 1 + 3x + 3x² + x³ = 1·1 + 3·x + 3·x² + 1·x³.

Therefore, the matrix of S with respect to the basis {1, x, x², x³} is

[ 1 1 1 1
  0 1 2 3
  0 0 1 3
  0 0 0 1 ].
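The matrix in Ex 5.6.6 is Pascal's triangle read column-wise, since (x+1)^j = Σ_k C(j, k) x^k. A short check (my own sketch, assuming NumPy):

```python
import numpy as np
from math import comb

# Column j holds the coordinates of S(x^j) = (x+1)^j in the basis {1, x, x^2, x^3}.
S = np.array([[comb(j, i) for j in range(4)] for i in range(4)])

assert np.array_equal(S, np.array([[1, 1, 1, 1],
                                   [0, 1, 2, 3],
                                   [0, 0, 1, 3],
                                   [0, 0, 0, 1]]))

# Sanity check: applying S to the coefficients of p(x) = x^3 gives those of (x+1)^3.
p = np.array([0, 0, 0, 1])
assert np.array_equal(S @ p, np.array([1, 3, 3, 1]))
```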
Theorem 5.6.1 Let V, W be two finite dimensional vector spaces over a field F and let T : V → W be a linear transformation. Then rank of T = rank of the matrix of T.

Proof: Let {α1, α2, . . . , αn} and {β1, β2, . . . , βm} be ordered bases of the vector spaces V and W respectively, so that dim V = n and dim W = m. Let the matrix of T relative to the chosen bases be m(T), given by

m(T) = [ c11 c12 ⋯ c1n
         c21 c22 ⋯ c2n
          ⋮    ⋮     ⋮
         cm1 cm2 ⋯ cmn ]

where the scalars cij ∈ F are uniquely determined by the basis {β1, β2, . . . , βm}. Therefore,
T(αj) = c1jβ1 + c2jβ2 + ⋯ + cmjβm, j = 1, 2, . . . , n.
Since (α1, α2, . . . , αn) is an ordered basis of V, the vectors T(α1), T(α2), . . . , T(αn) generate Im T. Let rank of T = r; then dim Im T = r. Without loss of generality, let {T(α1), T(α2), . . . , T(αr)} be a basis of Im T. Then each of T(αr+1), T(αr+2), . . . , T(αn) belongs to L{T(α1), T(α2), . . . , T(αr)}. Let us consider the isomorphism ψ : W → F^m defined by
ψ(c1β1 + c2β2 + ⋯ + cmβm) = (c1, c2, . . . , cm)^T.
Then we have
ψ(T(α1)) = (c11, c21, . . . , cm1)^T; ψ(T(α2)) = (c12, c22, . . . , cm2)^T; . . . ; ψ(T(αn)) = (c1n, c2n, . . . , cmn)^T.
Since {T(α1), T(α2), . . . , T(αr)} is a LI set and ψ is an isomorphism, {ψ(T(α1)), ψ(T(α2)), . . . , ψ(T(αr))} is a LI set of m-tuples of F^m. Since each of T(αr+1), T(αr+2), . . . , T(αn) belongs to L{T(α1), T(α2), . . . , T(αr)} and ψ is an isomorphism, each of ψ(T(αr+1)), ψ(T(αr+2)), . . . , ψ(T(αn)) belongs to L{ψ(T(α1)), ψ(T(α2)), . . . , ψ(T(αr))}. Therefore the first r column vectors of m(T) are LI and each of the remaining n − r column vectors belongs to the linear span of the first r column vectors. Consequently, the column rank of m(T) = r and therefore the rank of m(T) = r. Hence the theorem.
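Theorem 5.6.1 can be probed numerically: a map whose image is spanned by fewer than n vectors yields a matrix of the same deficient rank. A sketch (my own, assuming NumPy; the map is chosen for illustration):

```python
import numpy as np

# T : R^3 -> R^3 with T(x, y, z) = (x + y, y + z, x + 2y + z); its image is a plane.
M = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 2.0, 1.0]])

# The rank of m(T) ...
assert np.linalg.matrix_rank(M) == 2
# ... equals dim Im T: the images of the basis vectors span a 2-dimensional space.
images = np.column_stack([M @ e for e in np.eye(3)])   # T(e1), T(e2), T(e3)
assert np.linalg.matrix_rank(images) == 2
```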
Theorem 5.6.2 (Matrix representation of composite mapping) Let T : V → U and S : U → W be linear mappings, where V, U, W are finite dimensional vector spaces over a field F. Then, relative to a choice of ordered bases, m(ST) = m(S)·m(T), where m(T) is the matrix of T relative to the chosen bases.

Proof: Let (α1, α2, . . . , αn) be an ordered basis of V, (β1, β2, . . . , βp) an ordered basis of U and (γ1, γ2, . . . , γm) an ordered basis of W, so that dim V = n, dim U = p and dim W = m. Relative to the ordered bases, let the matrices of the transformations T, S and ST be

m(T) = (aij)p×n = [ a11 ⋯ a1n       m(S) = (bij)m×p = [ b11 ⋯ b1p       m(ST) = (cij)m×n = [ c11 ⋯ c1n
                     ⋮       ⋮                            ⋮       ⋮                            ⋮       ⋮
                    ap1 ⋯ apn ],                        bm1 ⋯ bmp ],                         cm1 ⋯ cmn ]

respectively. Then, by definition,

T(αj) = a1jβ1 + a2jβ2 + ⋯ + apjβp
S(βj) = b1jγ1 + b2jγ2 + ⋯ + bmjγm
ST(αj) = c1jγ1 + c2jγ2 + ⋯ + cmjγm,

where j = 1, 2, . . . , n. As S and T are linear, ST(αj) = S[T(αj)], and therefore

S[a1jβ1 + a2jβ2 + ⋯ + apjβp] = a1jS(β1) + a2jS(β2) + ⋯ + apjS(βp)
= a1j[b11γ1 + ⋯ + bm1γm] + a2j[b12γ1 + ⋯ + bm2γm] + ⋯ + apj[b1pγ1 + ⋯ + bmpγm]
= ( Σ_{k=1}^{p} b1k akj )γ1 + ( Σ_{k=1}^{p} b2k akj )γ2 + ⋯ + ( Σ_{k=1}^{p} bmk akj )γm.

Therefore c1j = Σ_{k=1}^{p} b1k akj, c2j = Σ_{k=1}^{p} b2k akj, . . . , cmj = Σ_{k=1}^{p} bmk akj, and consequently

(cij)m×n = (bij)m×p · (aij)p×n.

That is, m(ST) = m(S)·m(T), and hence the theorem.


Ex 5.6.7 Let U, V, W be vector spaces of dimensions 3, 2, 4 respectively, with ordered bases {(1, 0, 0), (1, 1, 0), (1, 1, 1)}, {(1, −1), (0, 1)} and {(0, 0, 1, 0), (−1, 0, 0, 1), (1, 1, 1, −1), (0, 1, 2, 1)} respectively. The linear mappings T : U → V and S : V → W are defined by T(x, y, z) = (2x − 4y + 9z, 5x + 3y − 2z) and S(x, y) = (3x + 4y, 5x − 2y, x + 7y, 4x). Show that [ST] = [S][T].

Solution: First, the mapping T is defined by T(x, y, z) = (2x − 4y + 9z, 5x + 3y − 2z). Therefore,

T(1, 0, 0) = (2, 5) = 2(1, −1) + 7(0, 1)
T(1, 1, 0) = (−2, 8) = −2(1, −1) + 6(0, 1)
T(1, 1, 1) = (7, 6) = 7(1, −1) + 13(0, 1).

Therefore, the matrix representation of T is

[T] = [ 2 −2  7
        7  6 13 ].

The mapping S is defined by S(x, y) = (3x + 4y, 5x − 2y, x + 7y, 4x). Therefore,

S(1, −1) = (−1, 7, −6, 4) = −16(0, 0, 1, 0) + 5(−1, 0, 0, 1) + 4(1, 1, 1, −1) + 3(0, 1, 2, 1)
S(0, 1) = (4, −2, 7, 0) = 5(0, 0, 1, 0) − 10(−1, 0, 0, 1) − 6(1, 1, 1, −1) + 4(0, 1, 2, 1).

Therefore, the matrix representation of S is

[S] = [ −16   5
          5 −10
          4  −6
          3   4 ].

Using the definition of the composite mapping,

ST(1, 0, 0) = 2S(1, −1) + 7S(0, 1) = 3(0, 0, 1, 0) − 60(−1, 0, 0, 1) − 34(1, 1, 1, −1) + 34(0, 1, 2, 1),
ST(1, 1, 0) = −2S(1, −1) + 6S(0, 1) = 62(0, 0, 1, 0) − 70(−1, 0, 0, 1) − 44(1, 1, 1, −1) + 18(0, 1, 2, 1),
ST(1, 1, 1) = 7S(1, −1) + 13S(0, 1) = −47(0, 0, 1, 0) − 95(−1, 0, 0, 1) − 50(1, 1, 1, −1) + 73(0, 1, 2, 1).

Therefore the matrix representation of the composite mapping ST is given by

[ST] = [   3  62 −47
         −60 −70 −95
         −34 −44 −50
          34  18  73 ] = [S][T].
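With the matrices of Ex 5.6.7 in hand, m(ST) = m(S)m(T) is one matrix product, and the result can also be cross-checked against the underlying maps. A sketch with the example's numbers (assuming NumPy):

```python
import numpy as np

T_mat = np.array([[2.0, -2.0, 7.0],
                  [7.0,  6.0, 13.0]])          # m(T), 2x3
S_mat = np.array([[-16.0,   5.0],
                  [  5.0, -10.0],
                  [  4.0,  -6.0],
                  [  3.0,   4.0]])             # m(S), 4x2

ST_mat = S_mat @ T_mat                          # m(ST) = m(S) m(T), 4x3
assert np.allclose(ST_mat, [[  3.0,  62.0, -47.0],
                            [-60.0, -70.0, -95.0],
                            [-34.0, -44.0, -50.0],
                            [ 34.0,  18.0,  73.0]])

# Cross-check column 1 against the maps themselves: ST(1,0,0) = S(T(1,0,0)) = S(2,5).
W = np.column_stack([[0.0, 0, 1, 0], [-1.0, 0, 0, 1], [1.0, 1, 1, -1], [0.0, 1, 2, 1]])
assert np.allclose(W @ ST_mat[:, 0], [26.0, 0.0, 37.0, 8.0])   # = S(2, 5) in R^4
```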
Theorem 5.6.3 Let V and W be finite dimensional vector spaces over a field F and let T : V → W be a linear mapping. Then T is invertible (non-singular) if and only if the matrix of T relative to some chosen bases is non-singular.

Proof: First, let T : V → W be invertible; then, by definition, T is one-to-one and onto. Since T is one-to-one, dim Ker T = 0, and as T is onto, Im T = W. Since dim Ker T + dim Im T = dim V, it follows that dim V = dim W = n (say). Then the matrix of T, i.e., m(T), is an n × n matrix. Also rank of T = rank of m(T). Therefore m(T), being an n × n matrix of rank n, is non-singular.

Conversely, let the matrix m(T) be non-singular. Then m(T) is a square matrix of order n, say, so the rank of m(T) is n. Since the order of m(T) is n, dim V = dim W = n. Since rank of T = rank of m(T), the rank of T is n. Therefore Im T = W, and this implies T is onto. Also, dim Ker T + dim Im T = dim V gives dim Ker T = 0. Hence T is one-to-one. T being both one-to-one and onto, T is invertible. This completes the proof.
Theorem 5.6.4 (Matrix of the inverse mapping) Let V and W be two finite dimensional vector spaces of the same dimension over a field F and let T : V → W be an invertible mapping. Then, relative to chosen ordered bases, the matrix of the inverse mapping T′ : W → V is given by m(T′) = [m(T)]⁻¹.

Proof: Let dim V = dim W = n and let (α1, α2, . . . , αn), (β1, β2, . . . , βn) be ordered bases of V and W respectively. Relative to the chosen bases, let the matrices of T and T′ be

[ a11 a12 ⋯ a1n       [ b11 b12 ⋯ b1n
  a21 a22 ⋯ a2n         b21 b22 ⋯ b2n
   ⋮    ⋮     ⋮    ,      ⋮    ⋮     ⋮
  an1 an2 ⋯ ann ]       bn1 bn2 ⋯ bnn ]

respectively. Then

T(αj) = a1jβ1 + a2jβ2 + ⋯ + anjβn
T′(βj) = b1jα1 + b2jα2 + ⋯ + bnjαn,

for j = 1, 2, . . . , n. Since the mapping T′ : W → V is the inverse of T, we have T′T = I_V and TT′ = I_W. Therefore,

T′T(αj) = T′[a1jβ1 + a2jβ2 + ⋯ + anjβn]
= a1jT′(β1) + a2jT′(β2) + ⋯ + anjT′(βn), as T′ is linear,
= a1j[b11α1 + b21α2 + ⋯ + bn1αn] + a2j[b12α1 + b22α2 + ⋯ + bn2αn] + ⋯ + anj[b1nα1 + b2nα2 + ⋯ + bnnαn]
= (b11a1j + b12a2j + ⋯ + b1nanj)α1 + (b21a1j + b22a2j + ⋯ + b2nanj)α2 + ⋯ + (bn1a1j + bn2a2j + ⋯ + bnnanj)αn.

But T′T = I_V gives T′T(αj) = αj. Therefore,

bi1a1j + bi2a2j + ⋯ + binanj = 1 if i = j, and = 0 if i ≠ j.

It follows that m(T′)·m(T) = I_n. Similarly, TT′ = I_W gives TT′(βj) = βj, so that

ai1b1j + ai2b2j + ⋯ + ainbnj = 1 if i = j, and = 0 if i ≠ j.

It follows that m(T)·m(T′) = I_n = m(T′)·m(T). Hence, by the definition of the inverse of a matrix, we get m(T′) = [m(T)]⁻¹.
Ex 5.6.8 Let (α1, α2, α3), (β1, β2, β3) be ordered bases of the real vector spaces V and W respectively. A linear mapping T : V → W maps the basis vectors as T(α1) = β1, T(α2) = β1 + β2, T(α3) = β1 + β2 + β3. Find the matrix of T relative to the ordered bases (α1, α2, α3) of V and (β1, β2, β3) of W. Find the matrix of T⁻¹ relative to the same chosen ordered bases.

Solution: The linear mapping T : V → W which maps the basis vectors as T(α1) = β1, T(α2) = β1 + β2, T(α3) = β1 + β2 + β3 can be written as

T(α1) = 1β1 + 0β2 + 0β3; T(α2) = 1β1 + 1β2 + 0β3; T(α3) = 1β1 + 1β2 + 1β3.

Therefore, the matrix representation is

m(T) = [ 1 1 1
         0 1 1
         0 0 1 ].

We see that m(T) is non-singular and therefore T is non-singular, so T⁻¹ exists and T⁻¹ is linear. Thus the inverses are given by T⁻¹(β1) = α1; T⁻¹(β1 + β2) = α2; T⁻¹(β1 + β2 + β3) = α3, i.e., since T⁻¹ is linear, T⁻¹(β1) = α1; T⁻¹(β1) + T⁻¹(β2) = α2; T⁻¹(β1) + T⁻¹(β2) + T⁻¹(β3) = α3. These can be written as

T⁻¹(β1) = α1 = 1α1 + 0α2 + 0α3;
T⁻¹(β2) = α2 − α1 = −1α1 + 1α2 + 0α3;
T⁻¹(β3) = α3 − α2 = 0α1 − 1α2 + 1α3.

Therefore, the matrix representation is

m(T⁻¹) = [ 1 −1  0
           0  1 −1
           0  0  1 ].

We see that

m(T)m(T⁻¹) = [ 1 1 1   [ 1 −1  0   = [ 1 0 0
               0 1 1     0  1 −1       0 1 0   = m(T⁻¹)m(T).
               0 0 1 ]   0  0  1 ]     0 0 1 ]

Therefore, [m(T)]⁻¹ exists and [m(T)]⁻¹ = m(T⁻¹).
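A quick numerical confirmation of Theorem 5.6.4 with the matrices of Ex 5.6.8 (assuming NumPy):

```python
import numpy as np

M = np.array([[1.0, 1.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])        # m(T)
M_inv = np.array([[1.0, -1.0,  0.0],
                  [0.0,  1.0, -1.0],
                  [0.0,  0.0,  1.0]])  # m(T^{-1}) found above

# m(T^{-1}) is literally the matrix inverse of m(T).
assert np.allclose(M @ M_inv, np.eye(3))
assert np.allclose(np.linalg.inv(M), M_inv)
```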
Theorem 5.6.5 (Isomorphism between linear mappings and matrices) Let V and W be finite dimensional vector spaces over a field F with dim V = n and dim W = m. Then the linear space over F of all linear mappings of V to W, i.e., L(V, W), and the vector space of all m × n matrices over F, i.e., M_{m,n}, are isomorphic.

Proof: Let (α1, α2, . . . , αn) and (β1, β2, . . . , βm) be ordered bases of V and W respectively. Let us define a mapping m : L(V, W) → M_{m,n} by
m(T) = (aij)m×n for T ∈ L(V, W),
(aij)m×n being the matrix of T relative to the ordered bases (α1, . . . , αn) of V and (β1, . . . , βm) of W. Let T ∈ L(V, W), S ∈ L(V, W); then T + S ∈ L(V, W). Let m(T) = (aij)m×n, m(S) = (bij)m×n, m(T + S) = (cij)m×n. Then

T(αj) = a1jβ1 + a2jβ2 + ⋯ + amjβm
S(αj) = b1jβ1 + b2jβ2 + ⋯ + bmjβm
(T + S)(αj) = c1jβ1 + c2jβ2 + ⋯ + cmjβm,

for j = 1, 2, . . . , n. Since T and S are linear, (T + S)(αj) = T(αj) + S(αj), so

c1jβ1 + c2jβ2 + ⋯ + cmjβm = (a1j + b1j)β1 + (a2j + b2j)β2 + ⋯ + (amj + bmj)βm.

As {β1, β2, . . . , βm} is linearly independent, we have cij = aij + bij for i = 1, 2, . . . , m; j = 1, 2, . . . , n. Hence m(T + S) = m(T) + m(S). Let k ∈ F; then kT ∈ L(V, W). Let m(kT) = (dij)m×n; then

(kT)(αj) = d1jβ1 + d2jβ2 + ⋯ + dmjβm for j = 1, 2, . . . , n.

Again, (kT)(αj) = kT(αj) by the definition of kT; therefore

d1jβ1 + d2jβ2 + ⋯ + dmjβm = k[a1jβ1 + a2jβ2 + ⋯ + amjβm] = (ka1j)β1 + (ka2j)β2 + ⋯ + (kamj)βm.

Since {β1, β2, . . . , βm} is linearly independent, dij = kaij for i = 1, 2, . . . , m; j = 1, 2, . . . , n. It follows that m(kT) = k m(T), and so m is a homomorphism. To prove that m is an isomorphism, let m(T) = m(S) for some T, S ∈ L(V, W). Let m(T) = (aij)m×n, m(S) = (bij)m×n; then

T(αj) = a1jβ1 + a2jβ2 + ⋯ + amjβm for j = 1, 2, . . . , n;
S(αj) = b1jβ1 + b2jβ2 + ⋯ + bmjβm for j = 1, 2, . . . , n.

m(T) = m(S) gives aij = bij for all i, j. Hence T(αj) = S(αj) for j = 1, 2, . . . , n. Let α be an arbitrary vector in V. Since T(αj) = S(αj) for all basis vectors αj, T(α) = S(α) for all α ∈ V, and this implies T = S. Therefore m(T) = m(S) ⇒ T = S, proving that m is one-to-one. To prove that m is onto, let (aij)m×n ∈ M_{m,n}. Then there exists a unique linear mapping T : V → W whose matrix is (aij), because if we prescribe the j-th column of (aij) as the co-ordinates of T(αj) relative to (β1, . . . , βm), i.e., T(αj) = a1jβ1 + a2jβ2 + ⋯ + amjβm, then T is determined uniquely with (aij) as the associated matrix. Thus m is an isomorphism, and therefore the linear space L(V, W) and M_{m,n} are isomorphic.
Theorem 5.6.6 Let V and W be two finite dimensional vector spaces over a field F and let T : V → W be a linear mapping. Let A be the matrix of T relative to a pair of ordered bases of V and W and C be the matrix of T relative to a different pair of ordered bases of V and W. Then the matrix C is equivalent to A, i.e., there exist non-singular matrices P and Q such that C = P⁻¹AQ.

Proof: Let V and W be two finite dimensional vector spaces over a field F with dim V = n and dim W = m, say. Let A be the matrix of T relative to the ordered bases (α1, . . . , αn) of V and (β1, . . . , βm) of W. Let C be the matrix of T relative to the ordered bases (α′1, . . . , α′n) of V and (β′1, . . . , β′m) of W. Let A = (aij)m×n, C = (cij)m×n. Then

T(αj) = a1jβ1 + a2jβ2 + ⋯ + amjβm;
T(α′j) = c1jβ′1 + c2jβ′2 + ⋯ + cmjβ′m,

for j = 1, 2, . . . , n. Let T1 : V → V be such that T1(αi) = α′i for i = 1, 2, . . . , n, and let Q be the matrix of T1 relative to the ordered basis (α1, . . . , αn) of V. Since T1 maps a basis of V onto another basis, T1 is non-singular and therefore Q is non-singular. Let Q = (qij)n×n; then

α′j = T1(αj) = q1jα1 + q2jα2 + ⋯ + qnjαn, for j = 1, 2, . . . , n.

Let T2 : W → W be such that T2(βi) = β′i for i = 1, 2, . . . , m, and let P be the matrix of T2 relative to the ordered basis (β1, . . . , βm) of W. Since T2 maps a basis of W onto another basis, T2 is non-singular and therefore P is non-singular. Let P = (pij)m×m; then

β′j = T2(βj) = p1jβ1 + p2jβ2 + ⋯ + pmjβm, for j = 1, 2, . . . , m.

Now

T(α′j) = c1jβ′1 + c2jβ′2 + ⋯ + cmjβ′m
= c1j[p11β1 + p21β2 + ⋯ + pm1βm] + c2j[p12β1 + p22β2 + ⋯ + pm2βm] + ⋯ + cmj[p1mβ1 + p2mβ2 + ⋯ + pmmβm]
= ( Σ_{k=1}^{m} p1k ckj )β1 + ⋯ + ( Σ_{k=1}^{m} pmk ckj )βm,

and

T(α′j) = T[q1jα1 + q2jα2 + ⋯ + qnjαn]
= q1jT(α1) + q2jT(α2) + ⋯ + qnjT(αn)
= q1j[a11β1 + a21β2 + ⋯ + am1βm] + q2j[a12β1 + a22β2 + ⋯ + am2βm] + ⋯ + qnj[a1nβ1 + a2nβ2 + ⋯ + amnβm]
= ( Σ_{k=1}^{n} a1k qkj )β1 + ⋯ + ( Σ_{k=1}^{n} amk qkj )βm.

Since (β1, . . . , βm) is a basis, we have

Σ_{k=1}^{m} pik ckj = Σ_{k=1}^{n} aik qkj for i = 1, 2, . . . , m; j = 1, 2, . . . , n.

This gives PC = AQ. Since P is non-singular, C = P⁻¹AQ. Thus a linear mapping T ∈ L(V, W) has different matrices relative to different pairs of ordered bases of V and W, but all such matrices are equivalent matrices. Here Q is known as the transition matrix, or the matrix of the change of basis, from the ordered basis (α1, . . . , αn) of V to (α′1, . . . , α′n); similarly, P is the transition matrix from (β1, . . . , βm) to (β′1, . . . , β′m) in W.
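Theorem 5.6.6 says a change of bases turns A into P⁻¹AQ. For maps between coordinate spaces this is easy to exercise: if the new bases are the columns of Q (domain) and P (codomain), the new matrix is P⁻¹AQ. A sketch (my own, assuming NumPy; the matrices are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(2, 3)).astype(float)   # matrix of T in the standard bases

Q = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])   # new basis of V as columns, non-singular
P = np.array([[2.0, 1.0],
              [1.0, 1.0]])        # new basis of W as columns, non-singular

C = np.linalg.inv(P) @ A @ Q      # matrix of T relative to the new bases

# Check: for any coordinate vector c in the new domain basis, mapping the actual
# vector and re-expressing the image in the new codomain basis agrees with C @ c.
c = np.array([1.0, -2.0, 3.0])
v = Q @ c                          # the actual vector in V
w = A @ v                          # its image in W, standard coordinates
assert np.allclose(np.linalg.solve(P, w), C @ c)
```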
Theorem 5.6.7 Let V and W be two finite dimensional vector spaces over a field F with dim V = n and dim W = m, and let A, C be m × n matrices over F such that C = P⁻¹AQ for some non-singular matrices P, Q. Then there exists a linear mapping T ∈ L(V, W) such that A and C are matrices of T relative to different pairs of ordered bases of V and W.

Proof: Let us consider A = (aij)m×n; C = (cij)m×n; P = (pij)m×m; Q = (qij)n×n. Let (α1, . . . , αn), (β1, . . . , βm) be a pair of ordered bases of V and W respectively, and let the mapping T : V → W be defined by

T(αj) = a1jβ1 + a2jβ2 + ⋯ + amjβm, j = 1, 2, . . . , n.

Then T is a uniquely determined linear mapping and the matrix of T relative to the ordered bases (α1, . . . , αn) and (β1, . . . , βm) is A. Let a mapping T1 : V → V be defined by T1(αj) = q1jα1 + q2jα2 + ⋯ + qnjαn. Then T1 is a uniquely determined mapping on V. Since Q is a non-singular matrix, {T1(α1), . . . , T1(αn)} is a basis of V. Let T1(αi) = α′i, i = 1, 2, . . . , n. Let the mapping T2 : W → W be defined by T2(βj) = p1jβ1 + p2jβ2 + ⋯ + pmjβm. Then T2 is a uniquely determined mapping on W. Since P is a non-singular matrix, {T2(β1), . . . , T2(βm)} is a basis of W. Let T2(βi) = β′i, i = 1, 2, . . . , m. Let T′ be the linear mapping in L(V, W) whose matrix relative to the ordered bases (α′1, . . . , α′n) of V and (β′1, . . . , β′m) of W is C. Then

T′(α′j) = c1jβ′1 + c2jβ′2 + ⋯ + cmjβ′m
= c1j[p11β1 + p21β2 + ⋯ + pm1βm] + c2j[p12β1 + ⋯ + pm2βm] + ⋯ + cmj[p1mβ1 + ⋯ + pmmβm]
= ( Σ_{k=1}^{m} p1k ckj )β1 + ⋯ + ( Σ_{k=1}^{m} pmk ckj )βm,

and

T(α′j) = T[q1jα1 + q2jα2 + ⋯ + qnjαn]
= q1jT(α1) + q2jT(α2) + ⋯ + qnjT(αn)
= q1j[a11β1 + ⋯ + am1βm] + q2j[a12β1 + ⋯ + am2βm] + ⋯ + qnj[a1nβ1 + ⋯ + amnβm]
= ( Σ_{k=1}^{n} a1k qkj )β1 + ⋯ + ( Σ_{k=1}^{n} amk qkj )βm.

Now C = P⁻¹AQ, i.e., PC = AQ, which gives

Σ_{k=1}^{m} pik ckj = Σ_{k=1}^{n} aik qkj for i = 1, 2, . . . , m; j = 1, 2, . . . , n.

So T′(α′j) = T(α′j) for j = 1, 2, . . . , n, and this proves T = T′. Thus A is the matrix of T relative to the ordered bases (α1, . . . , αn) of V and (β1, . . . , βm) of W; C is the matrix of T relative to the ordered bases (α′1, . . . , α′n) of V and (β′1, . . . , β′m) of W. P is the matrix of the non-singular mapping T2 ∈ L(W, W) that maps βi to β′i; Q is the matrix of the non-singular mapping T1 ∈ L(V, V) that maps αi to α′i. Thus, there exists a linear mapping T ∈ L(V, W) such that A and C are matrices of T relative to different pairs of ordered bases of V and W. This is the converse of the previous theorem.
Theorem 5.6.8 Let V and W be two finite dimensional vector spaces over a field F with dim V = n and dim W = m, and let A be an m × n matrix over F. Relative to two different pairs of ordered bases of V and W, let A determine two linear maps T1 and T2, so that m(T1) = A and m(T2) = A. Then there exist non-singular linear mappings S ∈ L(V, V) and R ∈ L(W, W) such that T1 = R⁻¹T2S.

Proof: Let (α1, . . . , αn) and (α′1, . . . , α′n) be a pair of ordered bases of V and let the mapping S : V → V be defined by S(αi) = α′i, i = 1, 2, . . . , n. Let (β1, . . . , βm), (β′1, . . . , β′m) be a pair of ordered bases of W and let R : W → W be defined by R(βi) = β′i, i = 1, 2, . . . , m. Then S and R are non-singular. Let A = (aij)m×n and consider

T1(αj) = a1jβ1 + a2jβ2 + ⋯ + amjβm;
T2(α′j) = a1jβ′1 + a2jβ′2 + ⋯ + amjβ′m, j = 1, 2, . . . , n.

Then T2S(αj) = T2(α′j) = a1jβ′1 + a2jβ′2 + ⋯ + amjβ′m, and

RT1(αj) = R(a1jβ1 + a2jβ2 + ⋯ + amjβm)
= a1jR(β1) + a2jR(β2) + ⋯ + amjR(βm)
= a1jβ′1 + a2jβ′2 + ⋯ + amjβ′m.

Since T2S(αj) = RT1(αj) for j = 1, 2, . . . , n, we have T2S = RT1 and therefore T1 = R⁻¹T2S.


Theorem 5.6.9 Let V and W be finite dimensional vector spaces over a field F with dim V = n and dim W = m, and let T1, T2 be two linear maps in L(V, W) such that T1 = R⁻¹T2S, where R is a non-singular mapping in L(W, W) and S is a non-singular mapping in L(V, V). Then m(T1) = m(T2) relative to different pairs of ordered bases of V and W.

Proof: Let (α1, . . . , αn), (β1, . . . , βm) be a pair of ordered bases of V and W respectively. Let S(αi) = α′i and R(βi) = β′i. Since R and S are non-singular, (S(α1), . . . , S(αn)) is a basis of V and (R(β1), . . . , R(βm)) is a basis of W. Let A = (aij)m×n be the matrix of T1 relative to the ordered bases (α1, . . . , αn) of V and (β1, . . . , βm) of W. Then

T1(αj) = a1jβ1 + a2jβ2 + ⋯ + amjβm,
RT1(αj) = R(a1jβ1 + a2jβ2 + ⋯ + amjβm) = a1jβ′1 + a2jβ′2 + ⋯ + amjβ′m.

Since RT1 = T2S, we have

T2S(αj) = a1jβ′1 + a2jβ′2 + ⋯ + amjβ′m
or, T2(α′j) = a1jβ′1 + a2jβ′2 + ⋯ + amjβ′m.

This shows that A is the matrix of T2 relative to the ordered bases (α′1, . . . , α′n) of V and (β′1, . . . , β′m) of W. This is the converse of the previous theorem.
Deduction 5.6.1 (Matrix representation of a linear operator) Let V be a vector space of dimension n over a field F and let T : V → V be a linear operator on V. Let (α1, α2, . . . , αn) be an ordered basis of V; then, as shown earlier, T is completely determined by the images T(α1), T(α2), . . . , T(αn), and so each T(αi) is a linear combination of the vectors α1, α2, . . . , αn in the basis, say,

T(α1) = a11α1 + a21α2 + ⋯ + an1αn
T(α2) = a12α1 + a22α2 + ⋯ + an2αn
. . .
T(αn) = a1nα1 + a2nα2 + ⋯ + annαn,

where the aij are unique scalars in F determined by the ordered basis (α1, . . . , αn). Proceeding by the same arguments as in the case of linear mappings, the matrix representation of the linear operator T is Y = AX, where X is the co-ordinate vector of an arbitrary element α of V relative to the ordered basis (α1, . . . , αn) and Y is the co-ordinate vector of T(α) relative to the same basis. Relative to the ordered basis (α1, . . . , αn), the co-ordinate vectors of T(α1), T(α2), . . . , T(αn) are

(a11, a21, . . . , an1)^T, (a12, a22, . . . , an2)^T, . . . , (a1n, a2n, . . . , ann)^T

respectively, and m(T) = [T] = [[T(α1)], [T(α2)], . . . , [T(αn)]]. The matrix

A = [ a11 a12 ⋯ a1n
      a21 a22 ⋯ a2n
       ⋮    ⋮     ⋮
      an1 an2 ⋯ ann ]

is said to be the matrix of T relative to the basis (α1, α2, . . . , αn) and is denoted by m(T).
Ex 5.6.9 A linear mapping T : ℝ³ → ℝ³ is defined by T(x1, x2, x3) = (2x1 + x2 − x3, x2 + 4x3, x1 − x2 + 3x3), (x1, x2, x3) ∈ ℝ³. Find the matrix of T relative to the ordered bases
(i) ((1, 0, 0), (0, 1, 0), (0, 0, 1)) of ℝ³;
(ii) ((0, 1, 1), (1, 0, 1), (1, 1, 0)) of ℝ³.

Solution: (i) First, we write T(1, 0, 0), T(0, 1, 0) and T(0, 0, 1) as linear combinations of the basis vectors (1, 0, 0), (0, 1, 0), (0, 0, 1) of ℝ³ as

T(1, 0, 0) = (2, 0, 1) = 2(1, 0, 0) + 0(0, 1, 0) + 1(0, 0, 1)
T(0, 1, 0) = (1, 1, −1) = 1(1, 0, 0) + 1(0, 1, 0) − 1(0, 0, 1)
T(0, 0, 1) = (−1, 4, 3) = −1(1, 0, 0) + 4(0, 1, 0) + 3(0, 0, 1).

Therefore the matrix of T is

[T] = [ 2  1 −1
        0  1  4
        1 −1  3 ].

(ii) First, we write T(0, 1, 1), T(1, 0, 1) and T(1, 1, 0) as linear combinations of the basis vectors (0, 1, 1), (1, 0, 1), (1, 1, 0) of ℝ³ as

T(0, 1, 1) = (0, 5, 2) = (7/2)(0, 1, 1) − (3/2)(1, 0, 1) + (3/2)(1, 1, 0)
T(1, 0, 1) = (1, 4, 4) = (7/2)(0, 1, 1) + (1/2)(1, 0, 1) + (1/2)(1, 1, 0)
T(1, 1, 0) = (3, 1, 0) = −1(0, 1, 1) + 1(1, 0, 1) + 2(1, 1, 0).

Therefore the matrix of T is

[  7/2  7/2 −1
  −3/2  1/2  1
   3/2  1/2  2 ].

Theorem 5.6.10 Let V be a finite dimensional vector space of dimension n over a field F. Let T ∈ L(V, V) and let m(T) be the matrix of T with respect to an ordered basis {α1, α2, . . . , αn} of V. Then m(T1T2) = m(T1)·m(T2) for T1, T2 ∈ L(V, V).

Proof: Let m(T1) = (aij)n×n, m(T2) = (bij)n×n; aij, bij ∈ F. Then

T1(αj) = a1jα1 + a2jα2 + ⋯ + anjαn for j = 1, 2, . . . , n;
T2(αj) = b1jα1 + b2jα2 + ⋯ + bnjαn for j = 1, 2, . . . , n.

T1T2(αj) = T1(T2(αj)) = T1(b1jα1 + b2jα2 + ⋯ + bnjαn)
= b1jT1(α1) + b2jT1(α2) + ⋯ + bnjT1(αn)
= b1j(a11α1 + ⋯ + an1αn) + ⋯ + bnj(a1nα1 + ⋯ + annαn)
= ( Σ_{k=1}^{n} a1k bkj )α1 + ⋯ + ( Σ_{k=1}^{n} ank bkj )αn.

This shows that

m(T1T2) = [ Σk a1k bk1  Σk a1k bk2  ⋯  Σk a1k bkn
            Σk a2k bk1  Σk a2k bk2  ⋯  Σk a2k bkn
                ⋮            ⋮               ⋮
            Σk ank bk1  Σk ank bk2  ⋯  Σk ank bkn ]

= (aij)n×n · (bij)n×n = m(T1)·m(T2).


Ex 5.6.10 The matrix of a linear mapping T : ℝ³ → ℝ³ relative to the ordered basis ((−1, 1, 1), (1, −1, 1), (1, 1, −1)) of ℝ³ is

[ 1 2 2
  2 1 3
  3 3 1 ].

Find the matrix of T relative to the ordered basis ((0, 1, 1), (1, 0, 1), (1, 1, 0)).

Solution: Since the matrix m(T) of the linear mapping T is given,

T(−1, 1, 1) = 1(−1, 1, 1) + 2(1, −1, 1) + 3(1, 1, −1) = (4, 2, 0)
T(1, −1, 1) = 2(−1, 1, 1) + 1(1, −1, 1) + 3(1, 1, −1) = (2, 4, 0)
T(1, 1, −1) = 2(−1, 1, 1) + 3(1, −1, 1) + 1(1, 1, −1) = (2, 0, 4).

Let (x, y, z) ∈ ℝ³ and ci ∈ ℝ, so

(x, y, z) = c1(−1, 1, 1) + c2(1, −1, 1) + c3(1, 1, −1)
⇒ −c1 + c2 + c3 = x, c1 − c2 + c3 = y, c1 + c2 − c3 = z
⇒ c1 = ½(y + z), c2 = ½(z + x), c3 = ½(x + y).

T(x, y, z) = c1T(−1, 1, 1) + c2T(1, −1, 1) + c3T(1, 1, −1)
= ½(y + z)(4, 2, 0) + ½(z + x)(2, 4, 0) + ½(x + y)(2, 0, 4)
= (2x + 3y + 3z, 2x + y + 3z, 2x + 2y).

Therefore, T(0, 1, 1) = (6, 4, 2), T(1, 0, 1) = (5, 5, 2) and T(1, 1, 0) = (5, 3, 4). Let

(6, 4, 2) = c1(0, 1, 1) + c2(1, 0, 1) + c3(1, 1, 0)
⇒ c2 + c3 = 6, c1 + c3 = 4, c1 + c2 = 2 ⇒ c1 = 0, c2 = 2, c3 = 4.

Let (5, 5, 2) = c1(0, 1, 1) + c2(1, 0, 1) + c3(1, 1, 0); then c1 = 1, c2 = 1 and c3 = 4. Lastly, let (5, 3, 4) = c1(0, 1, 1) + c2(1, 0, 1) + c3(1, 1, 0); then c1 = 1, c2 = 3 and c3 = 2. Therefore, the matrix representation of T is

[ 0 1 1
  2 1 3
  4 4 2 ].

Also, m(T) is non-singular, so T is non-singular too, and T⁻¹ exists.
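Ex 5.6.10 can be replayed in coordinates: if B1 and B2 hold the two ordered bases as columns, the standard matrix of T is B1 m(T) B1⁻¹, and the matrix in the second basis is B2⁻¹ (B1 m(T) B1⁻¹) B2. A sketch with the example's numbers (my own, assuming NumPy):

```python
import numpy as np

B1 = np.column_stack([[-1.0, 1, 1], [1.0, -1, 1], [1.0, 1, -1]])  # first ordered basis
B2 = np.column_stack([[0.0, 1, 1], [1.0, 0, 1], [1.0, 1, 0]])     # second ordered basis
M1 = np.array([[1.0, 2.0, 2.0],
               [2.0, 1.0, 3.0],
               [3.0, 3.0, 1.0]])                                   # m(T) in the first basis

T_std = B1 @ M1 @ np.linalg.inv(B1)     # T in standard coordinates
M2 = np.linalg.inv(B2) @ T_std @ B2     # m(T) in the second basis

assert np.allclose(T_std, [[2.0, 3.0, 3.0], [2.0, 1.0, 3.0], [2.0, 2.0, 0.0]])
assert np.allclose(M2, [[0.0, 1.0, 1.0], [2.0, 1.0, 3.0], [4.0, 4.0, 2.0]])
```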
Theorem 5.6.11 Let V be a finite dimensional vector space of dimension n over a field F. Then a linear mapping T ∈ L(V, V) is invertible iff the matrix of T relative to a chosen ordered basis of V, i.e., m(T), is non-singular.

Proof: Let T : V → V be an invertible mapping. Then there exists a linear mapping T′ : V → V such that T·T′ = T′·T = I, I being the identity mapping on V. So m(TT′) = m(T′T) = m(I), that is, m(T)·m(T′) = m(T′)·m(T) = I_n. This shows that m(T) is non-singular.

Conversely, let m(T) be non-singular. Then there exists a matrix P such that m(T)·P = P·m(T) = I_n. If T′ is the linear mapping such that m(T′) = P with respect to the same chosen basis of V, then m(T)m(T′) = m(T′)m(T) = I_n, that is, m(TT′) = m(T′T) = m(I_V). Since T ↦ m(T) is an isomorphism, we have TT′ = T′T = I_V. This shows that T is invertible and T′ is the inverse of T. This completes the proof.


Ex 5.6.11 Let (α1, α2, α3) be an ordered basis of a real vector space V and let a linear mapping T : V → V be defined by T(α1) = α1 + α2 + α3, T(α2) = α1 + α2, T(α3) = α1. Show that T is non-singular. Find the matrix of T⁻¹ relative to the ordered basis (α1, α2, α3).

Solution: Let m(T) be the matrix of T relative to the ordered basis (α1, α2, α3). The linear mapping T : V → V which maps the basis vectors as T(α1) = α1 + α2 + α3, T(α2) = α1 + α2, T(α3) = α1 can be written as

T(α1) = 1α1 + 1α2 + 1α3; T(α2) = 1α1 + 1α2 + 0α3; T(α3) = 1α1 + 0α2 + 0α3.

Therefore, the matrix representation is

m(T) = [ 1 1 1
         1 1 0
         1 0 0 ].

Since m(T) is non-singular, T is non-singular. Hence

T⁻¹(α1 + α2 + α3) = α1; T⁻¹(α1 + α2) = α2; T⁻¹(α1) = α3.

Since the mapping T⁻¹ is linear, we have

T⁻¹(α1) + T⁻¹(α2) + T⁻¹(α3) = α1; T⁻¹(α1) + T⁻¹(α2) = α2; T⁻¹(α1) = α3.
∴ T⁻¹(α1) = α3; T⁻¹(α2) = α2 − α3; T⁻¹(α3) = α1 − α2.

Therefore

m(T⁻¹) = [ 0  0  1
           0  1 −1
           1 −1  0 ].

We have seen that the matrix m(T) associated with a linear mapping T ∈ L(V, V) depends on the choice of an ordered basis.

5.7 Orthogonal Linear Transformation

Let V be a finite dimensional Euclidean space. A linear mapping T : V → V is said to be an orthogonal transformation on V if

⟨T(α), T(β)⟩ = ⟨α, β⟩, ∀ α, β ∈ V.    (5.7)

Thus an orthogonal transformation preserves inner products. Let V be a Euclidean space. If a linear mapping T : V → V is orthogonal on V, then for all α, β ∈ V:

(i) ⟨α, β⟩ = 0 ⇒ ⟨T(α), T(β)⟩ = 0;
(ii) ‖T(α)‖ = ‖α‖;
(iii) ‖T(α) − T(β)‖ = ‖α − β‖;
(iv) T is one-one.

Result 5.7.1 If ⟨α, β⟩ = 0 and T is an orthogonal transformation, then ⟨T(α), T(β)⟩ = ⟨α, β⟩ = 0. Thus an orthogonal transformation preserves orthogonality. But a linear transformation preserving orthogonality is not necessarily an orthogonal transformation. On the finite dimensional Euclidean space ℝ, the mapping T : ℝ → ℝ defined by T(α) = 2α is a linear transformation and ⟨T(α), T(β)⟩ = ⟨2α, 2β⟩ = 4⟨α, β⟩. Hence T is not an orthogonal transformation, though ⟨α, β⟩ = 0 ⇒ ⟨T(α), T(β)⟩ = 0, i.e., though under T the orthogonality of vectors is preserved.
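In ℝⁿ with the dot product these statements are easy to test: a rotation preserves inner products, while the scaling T(α) = 2α of Result 5.7.1 preserves orthogonality but not the inner product. A sketch (my own, assuming NumPy):

```python
import numpy as np

theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotation: an orthogonal transformation

rng = np.random.default_rng(1)
a, b = rng.standard_normal(2), rng.standard_normal(2)

# <T(a), T(b)> = <a, b> for the rotation ...
assert np.isclose((R @ a) @ (R @ b), a @ b)
# ... and hence lengths are preserved as well.
assert np.isclose(np.linalg.norm(R @ a), np.linalg.norm(a))

# T(x) = 2x preserves orthogonality but scales inner products by 4,
# so it is not an orthogonal transformation.
u, v = np.array([1.0, 0.0]), np.array([0.0, 5.0])
assert np.isclose((2 * u) @ (2 * v), 0.0)          # orthogonality preserved
assert np.isclose((2 * a) @ (2 * b), 4 * (a @ b))  # inner product scaled by 4
```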


Theorem 5.7.1 A linear mapping T : V → V is orthogonal if and only if T preserves the length of vectors, i.e., ‖T(α)‖ = ‖α‖.

Proof: First let T be orthogonal; then ∀ α ∈ V we have ⟨T(α), T(α)⟩ = ⟨α, α⟩, i.e.,

[‖T(α)‖]² = (‖α‖)² ⇒ ‖T(α)‖ = ‖α‖.

Conversely, let ‖T(α)‖ = ‖α‖, ∀ α ∈ V. Now, if α, β ∈ V, then ‖T(α − β)‖ = ‖α − β‖, i.e.,

⟨T(α − β), T(α − β)⟩ = ⟨α − β, α − β⟩
or, ⟨T(α) − T(β), T(α) − T(β)⟩ = ⟨α − β, α − β⟩, as T is linear,
or, ‖T(α)‖² − 2⟨T(α), T(β)⟩ + ‖T(β)‖² = ‖α‖² − 2⟨α, β⟩ + ‖β‖²
or, ⟨T(α), T(β)⟩ = ⟨α, β⟩, since ‖T(α)‖ = ‖α‖ and ‖T(β)‖ = ‖β‖.

Hence the theorem.
Theorem 5.7.2 A linear mapping T : V → V is orthogonal if and only if for every unit vector α ∈ V, the vector T(α) is also a unit vector.

Proof: First let the linear mapping T : V → V be orthogonal; then

‖α‖ = 1 ⇒ ‖T(α)‖ = √⟨T(α), T(α)⟩ = √⟨α, α⟩ = 1.

Conversely, let T be linear and let ‖α‖ = 1 imply ‖T(α)‖ = 1. We shall first show that ‖T(α)‖ = ‖α‖, ∀ α ∈ V. If α = θ, then T(α) = θ and hence ‖T(α)‖ = ‖α‖. Let α ≠ θ. If ‖α‖ = 1, the result holds. If ‖α‖ ≠ 1, then β = α/‖α‖ is a unit vector. Hence ‖T(β)‖ = 1 gives

‖ T( α/‖α‖ ) ‖ = 1, i.e., (1/‖α‖)‖T(α)‖ = 1, i.e., ‖T(α)‖ = ‖α‖.

Hence, by the previous theorem, T is orthogonal.


Theorem 5.7.3 Let {α₁, α₂, …, αₙ} be a basis of a Euclidean space V of dimension n. Then a linear mapping T : V → V is orthogonal on V if and only if ⟨T(αᵢ), T(αⱼ)⟩ = ⟨αᵢ, αⱼ⟩, ∀ i, j.
Proof: First, let the linear mapping T : V → V be orthogonal on V; then, by definition, ⟨T(α), T(β)⟩ = ⟨α, β⟩, ∀ α, β ∈ V. Therefore
⟨T(αᵢ), T(αⱼ)⟩ = ⟨αᵢ, αⱼ⟩, ∀ i, j.
Conversely, let ⟨T(αᵢ), T(αⱼ)⟩ = ⟨αᵢ, αⱼ⟩, ∀ i, j. Let α, β be any two elements of V,
α = a₁α₁ + a₂α₂ + ⋯ + aₙαₙ;  β = b₁α₁ + b₂α₂ + ⋯ + bₙαₙ,
where aₖ, bₖ ∈ ℝ. Since T is a linear transformation,
T(α) = Σₖ aₖT(αₖ);  T(β) = Σₖ bₖT(αₖ),
and so
⟨T(α), T(β)⟩ = ⟨Σₖ aₖT(αₖ), Σₗ bₗT(αₗ)⟩ = Σₖ Σₗ aₖbₗ ⟨T(αₖ), T(αₗ)⟩
and ⟨α, β⟩ = ⟨Σₖ aₖαₖ, Σₗ bₗαₗ⟩ = Σₖ Σₗ aₖbₗ ⟨αₖ, αₗ⟩.
Since ⟨T(αₖ), T(αₗ)⟩ = ⟨αₖ, αₗ⟩ for all k, l, it follows that ⟨T(α), T(β)⟩ = ⟨α, β⟩, ∀ α, β ∈ V; therefore T is orthogonal on V.


Theorem 5.7.4 Let V be a finite dimensional Euclidean space. Then a linear mapping T : V → V is orthogonal on V if and only if T maps an orthonormal basis of V to another orthonormal basis of V.
Proof: Let {α₁, α₂, …, αₙ} be an orthonormal basis of V; then by definition,
⟨αᵢ, αⱼ⟩ = 1, if i = j;  = 0, if i ≠ j.
First let T : V → V be an orthogonal mapping on V; then by definition ⟨T(α), T(β)⟩ = ⟨α, β⟩, ∀ α, β ∈ V, and so ⟨T(αᵢ), T(αⱼ)⟩ = ⟨αᵢ, αⱼ⟩, ∀ i, j. Therefore
⟨T(αᵢ), T(αⱼ)⟩ = 1, if i = j;  = 0, if i ≠ j.
This proves that {T(α₁), T(α₂), …, T(αₙ)} is an orthonormal set and, as it contains n vectors, it is an orthonormal basis of V.
Conversely, let {α₁, α₂, …, αₙ} be an orthonormal basis of V such that {T(α₁), T(α₂), …, T(αₙ)} is also an orthonormal set. Any α, β ∈ V can be written as
α = Σᵢ aᵢαᵢ, aᵢ ∈ ℝ;  β = Σᵢ bᵢαᵢ, bᵢ ∈ ℝ.
Since {α₁, α₂, …, αₙ} is an orthonormal basis of V, we have ⟨α, β⟩ = Σᵢ aᵢbᵢ. Since T : V → V is linear,
T(α) = Σᵢ aᵢT(αᵢ);  T(β) = Σᵢ bᵢT(αᵢ)
⟹ ⟨T(α), T(β)⟩ = Σᵢ aᵢbᵢ,
as {T(α₁), T(α₂), …, T(αₙ)} is an orthonormal set. Therefore ⟨T(α), T(β)⟩ = ⟨α, β⟩ for all α, β ∈ V, and this shows that T is an orthogonal mapping.
Theorem 5.7.5 Let V be a Euclidean space of dimension n. Let A be the matrix of the linear mapping T : V → V relative to an ordered orthonormal basis. Then T is orthogonal on V if and only if A is a real orthogonal matrix.
Proof: Let A = [aᵢⱼ]ₙₓₙ be the matrix of T relative to the ordered orthonormal basis (α₁, α₂, …, αₙ) of V; then
T(αⱼ) = Σₖ aₖⱼαₖ, for j = 1, 2, …, n
⟹ ⟨T(αᵢ), T(αⱼ)⟩ = Σₖ aₖᵢaₖⱼ, as (α₁, α₂, …, αₙ) is orthonormal.
Let the mapping T : V → V be orthogonal on V; then ⟨T(αᵢ), T(αⱼ)⟩ = ⟨αᵢ, αⱼ⟩. Since (α₁, α₂, …, αₙ) is an orthonormal set,
Σₖ aₖᵢaₖⱼ = 1, if i = j;  = 0, if i ≠ j.
Therefore AᵀA = Iₙ and this shows that A is an orthogonal matrix. Conversely, let A be an orthogonal matrix; then AᵀA = Iₙ. Therefore
Σₖ aₖᵢaₖⱼ = 1, if i = j;  = 0, if i ≠ j.
Hence ⟨T(αᵢ), T(αⱼ)⟩ = Σₖ aₖᵢaₖⱼ, as {α₁, α₂, …, αₙ} is an orthonormal set. Consequently,
⟨T(αᵢ), T(αⱼ)⟩ = 1, if i = j;  = 0, if i ≠ j.
This proves that {T(α₁), T(α₂), …, T(αₙ)} is an orthonormal set and, as it contains n vectors, it is an orthonormal basis of V. As T maps an orthonormal basis {α₁, α₂, …, αₙ} to another orthonormal basis of V, T is orthogonal.
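Theorem 5.7.5 can be illustrated with a rotation of ℝ², which is an orthogonal transformation; its matrix relative to the standard orthonormal basis should satisfy AᵀA = I. The angle π/6 below is an arbitrary choice for this sketch (not from the book).

```python
import math

# Matrix of a rotation of R^2 relative to the standard orthonormal basis.
t = math.pi / 6
A = [[math.cos(t), -math.sin(t)],
     [math.sin(t),  math.cos(t)]]

# Compute A^T A entry by entry: (A^T A)[i][j] = sum_k a_ki * a_kj.
AtA = [[sum(A[k][i] * A[k][j] for k in range(2)) for j in range(2)]
       for i in range(2)]
print(AtA)  # approximately the identity matrix [[1, 0], [0, 1]]
```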

5.8  Linear Functional

In this section, we are concerned exclusively with linear transformations from a vector space V into its field of scalars F, which is itself a vector space of dimension 1 over F. Let V(F) be a vector space. A linear functional on V is defined by a linear mapping φ : V → F such that for every α, β ∈ V and for every a, b ∈ F,

    φ(aα + bβ) = aφ(α) + bφ(β).                    (5.8)

Thus a linear functional on V is a linear mapping from V into F. For example,
(i) Let V be the vector space of continuous real valued functions on [0, 2π]. The function h : V → ℝ defined by
h(x) = (1/2π) ∫₀^{2π} x(t) g(t) dt;  g ∈ V
is a linear functional on V. In particular, if g(t) = sin nt or cos nt, then h(x) is often called the nth Fourier coefficient of x.
(ii) Let V = Mₙₓₙ(F) and define φ : V → F by φ(A) = tr(A), the trace of A. Then φ is a linear functional.
(iii) Let V be a finite dimensional vector space, and let β = {α₁, α₂, …, αₙ} be an ordered basis of V. For each i = 1, 2, …, n, define φᵢ(α) = aᵢ, where [α]_β = (a₁, a₂, …, aₙ)ᵀ is the co-ordinate vector of α relative to β. Then φᵢ is a linear functional on V, called the ith co-ordinate function with respect to the basis β. Note that φᵢ(αⱼ) = δᵢⱼ.

5.8.1  Dual Space

Let V(F) be a vector space. The set of all linear functionals on a vector space V(F) is also a vector space over F with addition and scalar multiplication defined by

    (φ + ψ)(α) = φ(α) + ψ(α) and (aφ)(α) = aφ(α),                    (5.9)

where φ and ψ are linear functionals on V and a ∈ F. This vector space L(V, F) is called the dual space of V and is denoted by V*.
Ex 5.8.1 Let φ : ℝ³ → ℝ and ψ : ℝ³ → ℝ be the linear functionals defined by φ(x, y, z) = 2x − 3y + z and ψ(x, y, z) = 4x − 2y + 3z. Find (i) φ + ψ, (ii) 3φ and (iii) 2φ − 5ψ.
Solution: Here we use the property of linearity of the functionals φ and ψ. Now
(i) (φ + ψ)(x, y, z) = φ(x, y, z) + ψ(x, y, z)
= 2x − 3y + z + 4x − 2y + 3z = 6x − 5y + 4z.
(ii) By the same property, we have
(3φ)(x, y, z) = 3φ(x, y, z) = 3(2x − 3y + z) = 6x − 9y + 3z.
(iii) (2φ − 5ψ)(x, y, z) = 2φ(x, y, z) − 5ψ(x, y, z)
= 2(2x − 3y + z) − 5(4x − 2y + 3z) = −16x + 4y − 13z.
Ex 5.8.2 Let φ be the linear functional on ℝ² defined by φ(2, 1) = 15 and φ(1, −2) = −10. Find φ(x, y) and φ(−2, 7).
Solution: Let φ(x, y) = ax + by. Using the conditions φ(2, 1) = 15 and φ(1, −2) = −10, we have
2a + b = 15 and a − 2b = −10 ⟹ a = 4, b = 7.
Thus φ(x, y) = 4x + 7y and so φ(−2, 7) = 41.
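The small system in Ex 5.8.2 can be solved by elimination; the following is a verification sketch, not part of the original text.

```python
# Solve 2a + b = 15, a - 2b = -10 by elimination:
# 2*(first) + (second) gives 5a = 2*15 + (-10), so a = 4 and b = 15 - 2a = 7.
a = (2 * 15 + (-10)) / 5
b = 15 - 2 * a
phi = lambda x, y: a * x + b * y
print(a, b)            # 4.0 7.0
print(phi(-2, 7))      # 41.0
```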
Theorem 5.8.1 Let {α₁, α₂, …, αₙ} be a basis of a finite dimensional vector space V(F). Let φ₁, φ₂, …, φₙ ∈ V* be the linear functionals defined by φᵢ(αⱼ) = δᵢⱼ, where δᵢⱼ is the Kronecker delta. Then {φᵢ; i = 1, 2, …, n} is a basis of V*.
Proof: Let φ be an arbitrary element in V* and let us suppose that
φ(α₁) = a₁, φ(α₂) = a₂, …, φ(αₙ) = aₙ.
We first show that {φᵢ : i = 1, 2, …, n} spans V*. Let ψ = Σᵢ aᵢφᵢ; then
ψ(α₁) = (a₁φ₁ + a₂φ₂ + ⋯ + aₙφₙ)(α₁)
= a₁φ₁(α₁) + a₂φ₂(α₁) + ⋯ + aₙφₙ(α₁)
= a₁.1 + a₂.0 + ⋯ + aₙ.0 = a₁ = φ(α₁).
In general, we can write ψ(αᵢ) = aᵢ = φ(αᵢ); i = 1, 2, …, n. Since φ and ψ agree on the basis vectors, φ = ψ = Σᵢ aᵢφᵢ; accordingly, {φᵢ; i = 1, 2, …, n} spans V*. Now, we are to show that {φᵢ; i = 1, 2, …, n} is LI. For this, let
c₁φ₁ + c₂φ₂ + ⋯ + cₙφₙ = θ̂, the zero functional
⟹ (c₁φ₁ + c₂φ₂ + ⋯ + cₙφₙ)(αᵢ) = θ̂(αᵢ) = 0
⟹ c₁φ₁(αᵢ) + c₂φ₂(αᵢ) + ⋯ + cₙφₙ(αᵢ) = 0
⟹ Σₖ cₖδₖᵢ = 0, for i = 1, 2, …, n,
so that c₁ = c₂ = ⋯ = cₙ = 0. Hence {φᵢ; i = 1, 2, …, n} is LI and consequently it forms a basis of V*. This basis is called the dual basis, or the basis dual to {αᵢ}.
Ex 5.8.3 Find the dual basis of the ordered basis S = {(2, 1), (3, 1)} of ℝ².
Solution: Let the dual basis of S be S* = {φ₁, φ₂}. To define a formula for φ₁ explicitly, we consider the equations
1 = φ₁(2, 1) = φ₁(2e₁ + e₂) = 2φ₁(e₁) + φ₁(e₂)
0 = φ₁(3, 1) = φ₁(3e₁ + e₂) = 3φ₁(e₁) + φ₁(e₂).
Solving these equations, we obtain φ₁(e₁) = −1 and φ₁(e₂) = 3, i.e., φ₁(x, y) = −x + 3y. Similarly, φ₂(x, y) = x − 2y.
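The defining property of a dual basis, φᵢ(αⱼ) = δᵢⱼ, is easy to verify directly; the following sketch (not from the book) checks the functionals found in Ex 5.8.3.

```python
# phi_i evaluated on the basis vectors should give the rows of the identity.
phi1 = lambda x, y: -x + 3 * y
phi2 = lambda x, y: x - 2 * y
basis = [(2, 1), (3, 1)]

for phi in (phi1, phi2):
    print([phi(*v) for v in basis])
# [1, 0]
# [0, 1]
```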
Theorem 5.8.2 (Dual basis): The dual space of an n dimensional vector space is n dimensional.
Proof: Let V(F) be an n dimensional vector space and let V* be the dual space of V. Let S = {α₁, α₂, …, αₙ} be an ordered basis of V(F). Then for each i = 1, 2, …, n there exists a unique linear functional φᵢ on V such that φᵢ(αⱼ) = δᵢⱼ. Let S′ = {φ₁, φ₂, …, φₙ}; then clearly S′ ⊂ V*. We are to show that S′ is a basis of V*. To show {φᵢ; i = 1, 2, …, n} is linearly independent, let
c₁φ₁ + c₂φ₂ + ⋯ + cₙφₙ = θ̂
⟹ (c₁φ₁ + c₂φ₂ + ⋯ + cₙφₙ)(αᵢ) = θ̂(αᵢ) = 0
⟹ c₁φ₁(αᵢ) + c₂φ₂(αᵢ) + ⋯ + cₙφₙ(αᵢ) = 0
⟹ Σₖ cₖδₖᵢ = 0, for i = 1, 2, …, n,
so that c₁ = c₂ = ⋯ = cₙ = 0. Hence S′ = {φᵢ; i = 1, 2, …, n} is LI. Further, let φ be an arbitrary element in V* and let us suppose that
φ(α₁) = a₁, φ(α₂) = a₂, …, φ(αₙ) = aₙ.
Let α be any element of V. Since S is a basis of V, α = Σᵢ cᵢαᵢ for some cᵢ's ∈ F. Therefore
φ(α) = φ(c₁α₁ + c₂α₂ + ⋯ + cₙαₙ)
= c₁φ(α₁) + c₂φ(α₂) + ⋯ + cₙφ(αₙ)
= c₁.a₁ + c₂.a₂ + ⋯ + cₙ.aₙ = Σᵢ cᵢaᵢ
= Σᵢ aᵢ (Σⱼ cⱼδᵢⱼ) = Σᵢ aᵢ Σⱼ cⱼφᵢ(αⱼ)
= Σᵢ aᵢφᵢ(Σⱼ cⱼαⱼ) = Σᵢ aᵢφᵢ(α)
= (a₁φ₁ + a₂φ₂ + ⋯ + aₙφₙ)(α).
Thus φ = a₁φ₁ + a₂φ₂ + ⋯ + aₙφₙ, i.e., each element of V* can be expressed as a linear combination of elements of S′. Thus S′ generates V* and consequently S′ is a basis of V*. Accordingly, dim V* = n = dim V.
Ex 5.8.4 Find the dual basis of the basis S = {(1, −2, 3), (1, −1, 1), (2, −4, 7)} of V₃(ℝ).
Solution: Let α₁ = (1, −2, 3), α₂ = (1, −1, 1) and α₃ = (2, −4, 7); then the basis set of V₃(ℝ) is S = {α₁, α₂, α₃}. We are to find the dual basis S* = {φ₁, φ₂, φ₃} of S. We seek the functionals
φᵢ(x, y, z) = aᵢx + bᵢy + cᵢz; i = 1, 2, 3,
where by definition of the dual basis φᵢ(αⱼ) = δᵢⱼ. Using these conditions, we have
φ₁(α₁) = φ₁(1, −2, 3) = 1; φ₂(α₁) = φ₂(1, −2, 3) = 0; φ₃(α₁) = φ₃(1, −2, 3) = 0
φ₁(α₂) = φ₁(1, −1, 1) = 0; φ₂(α₂) = φ₂(1, −1, 1) = 1; φ₃(α₂) = φ₃(1, −1, 1) = 0
φ₁(α₃) = φ₁(2, −4, 7) = 0; φ₂(α₃) = φ₂(2, −4, 7) = 0; φ₃(α₃) = φ₃(2, −4, 7) = 1.
Thus we have the following systems of equations:
a₁ − 2b₁ + 3c₁ = 1; a₁ − b₁ + c₁ = 0; 2a₁ − 4b₁ + 7c₁ = 0
a₂ − 2b₂ + 3c₂ = 0; a₂ − b₂ + c₂ = 1; 2a₂ − 4b₂ + 7c₂ = 0
a₃ − 2b₃ + 3c₃ = 0; a₃ − b₃ + c₃ = 0; 2a₃ − 4b₃ + 7c₃ = 1.
Solving, the first system yields a₁ = −3, b₁ = −5, c₁ = −2, so φ₁(x, y, z) = −3x − 5y − 2z. Similarly, φ₂(x, y, z) = 2x + y and φ₃(x, y, z) = x + 2y + z. Therefore, S* = {φ₁, φ₂, φ₃} is the dual basis of S, where φ₁, φ₂, φ₃ are defined as above.
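As in the previous example, the answer can be confirmed by evaluating each functional on each basis vector; the table of values should be the identity matrix. This is a verification sketch, not part of the original text.

```python
# Check phi_i(alpha_j) = Kronecker delta for the dual basis of Ex 5.8.4.
alphas = [(1, -2, 3), (1, -1, 1), (2, -4, 7)]
phis = [
    lambda x, y, z: -3 * x - 5 * y - 2 * z,   # phi_1
    lambda x, y, z: 2 * x + y,                # phi_2
    lambda x, y, z: x + 2 * y + z,            # phi_3
]
table = [[phi(*a) for a in alphas] for phi in phis]
print(table)   # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```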
Ex 5.8.5 Find a basis of the annihilator W⁰ of the subspace W of ℝ⁴ spanned by α₁ = (1, 2, −3, 4) and α₂ = (0, 1, 4, −1).
Solution: Here, we seek a basis of the set of linear functionals φ such that φ(α₁) = 0 and φ(α₂) = 0, where φ(x, y, z, t) = ax + by + cz + dt. Thus
φ(1, 2, −3, 4) = a + 2b − 3c + 4d = 0
φ(0, 1, 4, −1) = b + 4c − d = 0.
The system of two equations in the unknowns a, b, c and d is in echelon form with free variables c and d.
(i) Let c = 1, d = 0; then b = −4, a = 11. In this case, the linear functional is φ₁(x, y, z, t) = 11x − 4y + z.
(ii) Let c = 0, d = 1; then b = 1, a = −6. In this case, the linear functional is φ₂(x, y, z, t) = −6x + y + t.
The linear functionals φ₁ and φ₂ form a basis of the annihilator W⁰.
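Membership in the annihilator is checkable by direct evaluation: both functionals of Ex 5.8.5 must vanish on the spanning vectors of W (a verification sketch, not from the book).

```python
# Both basis functionals of W^0 vanish on the generators of W.
a1 = (1, 2, -3, 4)
a2 = (0, 1, 4, -1)
phi1 = lambda x, y, z, t: 11 * x - 4 * y + z
phi2 = lambda x, y, z, t: -6 * x + y + t
print([phi1(*a1), phi1(*a2), phi2(*a1), phi2(*a2)])   # [0, 0, 0, 0]
```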
Ex 5.8.6 Let V be the vector space of polynomials over ℝ of degree at most 2. Let φ₁, φ₂, φ₃ be the linear functionals on V defined by
φ₁(f(t)) = ∫₀¹ f(t) dt, φ₂(f(t)) = f′(1), φ₃(f(t)) = f(0).
Here f(t) = a + bt + ct² ∈ V and f′(t) denotes the derivative of f(t). Find the basis {f₁(t), f₂(t), f₃(t)} of V that is dual to {φ₁, φ₂, φ₃}.
Solution: Let fᵢ(t) = aᵢ + bᵢt + cᵢt²; i = 1, 2, 3. By definition of the dual basis, φⱼ(fᵢ) = δⱼᵢ. Thus
φ₁(f₁) = ∫₀¹ (a₁ + b₁t + c₁t²) dt = a₁ + b₁/2 + c₁/3 = 1
φ₁(f₂) = ∫₀¹ (a₂ + b₂t + c₂t²) dt = a₂ + b₂/2 + c₂/3 = 0
φ₁(f₃) = ∫₀¹ (a₃ + b₃t + c₃t²) dt = a₃ + b₃/2 + c₃/3 = 0.
Using the definition φ₂(f(t)) = f′(1), we get
φ₂(f₁) = f₁′(1) = b₁ + 2c₁ = 0
φ₂(f₂) = f₂′(1) = b₂ + 2c₂ = 1
φ₂(f₃) = f₃′(1) = b₃ + 2c₃ = 0.
Using the definition φ₃(f(t)) = f(0), we get
φ₃(f₁) = f₁(0) = a₁ = 0
φ₃(f₂) = f₂(0) = a₂ = 0
φ₃(f₃) = f₃(0) = a₃ = 1.
Solving each system yields a₁ = 0, b₁ = 3, c₁ = −3/2; a₂ = 0, b₂ = −1/2, c₂ = 3/4; and a₃ = 1, b₃ = −3, c₃ = 3/2. Thus f₁(t) = 3t − (3/2)t², f₂(t) = −(1/2)t + (3/4)t² and f₃(t) = 1 − 3t + (3/2)t². Hence {f₁(t), f₂(t), f₃(t)} given above is the basis of V that is dual to {φ₁, φ₂, φ₃}.
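The fractions in Ex 5.8.6 invite an exact check. For f(t) = a + bt + ct² the three functionals reduce to the closed forms φ₁(f) = a + b/2 + c/3, φ₂(f) = b + 2c and φ₃(f) = a, so the dual-basis property can be verified with rational arithmetic (a sketch, not from the book).

```python
from fractions import Fraction as F

def phis(a, b, c):
    # (phi_1(f), phi_2(f), phi_3(f)) for f(t) = a + b t + c t^2
    return (a + F(b, 2) + F(c, 3), b + 2 * c, a)

f1 = (F(0), F(3), F(-3, 2))        # f1(t) = 3t - (3/2)t^2
f2 = (F(0), F(-1, 2), F(3, 4))     # f2(t) = -(1/2)t + (3/4)t^2
f3 = (F(1), F(-3), F(3, 2))        # f3(t) = 1 - 3t + (3/2)t^2
print([phis(*f) for f in (f1, f2, f3)])
# rows of the identity matrix, as exact fractions
```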
Theorem 5.8.3 Let {α₁, α₂, …, αₙ} be a basis of V and let {φ₁, φ₂, …, φₙ} be the dual basis of V*. Then for any vector α ∈ V, α = φ₁(α)α₁ + φ₂(α)α₂ + ⋯ + φₙ(α)αₙ, and for any linear functional σ ∈ V*, σ = σ(α₁)φ₁ + σ(α₂)φ₂ + ⋯ + σ(αₙ)φₙ.
Proof: Let α = a₁α₁ + a₂α₂ + ⋯ + aₙαₙ, for some aᵢ ∈ F. Thus
φ₁(α) = φ₁(a₁α₁ + a₂α₂ + ⋯ + aₙαₙ)
= a₁φ₁(α₁) + a₂φ₁(α₂) + ⋯ + aₙφ₁(αₙ)
= a₁.1 + a₂.0 + ⋯ + aₙ.0 = a₁.
Thus, in general, φᵢ(α) = aᵢ; i = 1, 2, …, n, and therefore
α = φ₁(α)α₁ + φ₂(α)α₂ + ⋯ + φₙ(α)αₙ.
Applying the functional σ to both sides, we get
σ(α) = φ₁(α)σ(α₁) + φ₂(α)σ(α₂) + ⋯ + φₙ(α)σ(αₙ)
= σ(α₁)φ₁(α) + σ(α₂)φ₂(α) + ⋯ + σ(αₙ)φₙ(α)
= (σ(α₁)φ₁ + σ(α₂)φ₂ + ⋯ + σ(αₙ)φₙ)(α).
This relation holds ∀ α ∈ V, and so σ = σ(α₁)φ₁ + σ(α₂)φ₂ + ⋯ + σ(αₙ)φₙ. This theorem gives the relationship between bases and their duals.
Theorem 5.8.4 Let β = {α₁, α₂, …, αₙ} and β′ = {α₁′, α₂′, …, αₙ′} be bases of V, and let β* = {φ₁, φ₂, …, φₙ} and β′* = {φ₁′, φ₂′, …, φₙ′} be the bases of V* dual to β and β′ respectively. If T is the transition matrix from β to β′, then (T⁻¹)ᵗ is the transition matrix from β* to β′*.
Proof: The elements of β′ can be written as linear combinations of the elements of β as
αᵢ′ = Σⱼ aᵢⱼαⱼ; i = 1, 2, …, n,
so that by the given condition T = [aᵢⱼ]ₙₓₙ. Now, the elements of β′* can be written in terms of β* as
φᵢ′ = Σⱼ bᵢⱼφⱼ; i = 1, 2, …, n,
where R = [bᵢⱼ]ₙₓₙ. We shall show that R = (T⁻¹)ᵗ. Let Rᵢ = (bᵢ₁, bᵢ₂, …, bᵢₙ) be the ith row of R and Cⱼ = (aⱼ₁, aⱼ₂, …, aⱼₙ)ᵗ be the jth column of Tᵗ. Then by the definition of dual bases,
δᵢⱼ = φᵢ′(αⱼ′) = (Σₖ bᵢₖφₖ)(Σₗ aⱼₗαₗ) = Σₖ bᵢₖaⱼₖ = RᵢCⱼ.
Therefore
RTᵗ = [RᵢCⱼ]ₙₓₙ = Iₙ ⟹ R = (Tᵗ)⁻¹ = (T⁻¹)ᵗ.
Therefore, if T is the transition matrix from β to β′, then (T⁻¹)ᵗ is the transition matrix from β* to β′*.
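Theorem 5.8.4 can be tested numerically using the bases of ℝ² from Ex 5.8.3: take β to be the standard basis and β′ = {(2, 1), (3, 1)}, whose dual basis was computed there. With coefficient rows written in the same row convention as the theorem, R·Tᵗ should be the identity. This is a sketch, not from the book.

```python
# T has the coordinates of beta' (in terms of the standard basis) as rows;
# R has the coefficients of the dual basis of beta' as rows.
T = [[2, 1], [3, 1]]            # alpha_1' = 2e1 + e2, alpha_2' = 3e1 + e2
R = [[-1, 3], [1, -2]]          # phi_1' = -e1* + 3e2*, phi_2' = e1* - 2e2*

# (R T^t)[i][j] = sum_k R[i][k] * T[j][k]  (row i of R dot row j of T)
RTt = [[sum(R[i][k] * T[j][k] for k in range(2)) for j in range(2)]
       for i in range(2)]
print(RTt)    # [[1, 0], [0, 1]]
```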

5.8.2  Second Dual Space

Let V(F) be a vector space. Then its dual space V*, consisting of all the linear functionals on V, is also a vector space. Hence V* itself has a dual space V**, called the second dual of V, which consists of all linear functionals on V*.
Theorem 5.8.5 Each element of V determines a specific element of V**.
Proof: For every α ∈ V, we define a mapping α̂ : V* → F by α̂(φ) = φ(α). First, we are to show that α̂ : V* → F is linear. For this, let a, b ∈ F and φ, ψ ∈ V*; we have
α̂(aφ + bψ) = (aφ + bψ)(α) = aφ(α) + bψ(α) = aα̂(φ) + bα̂(ψ).
Therefore, the mapping α̂ : V* → F is linear, i.e., α̂ ∈ V**. Next we are to show that, if V is finite dimensional, then the mapping α ↦ α̂ of V into V** is an isomorphism of V onto V**. Let α (≠ θ) ∈ V; then there exists φ ∈ V* such that
φ(α) ≠ 0 ⟹ α̂(φ) = φ(α) ≠ 0 ⟹ α̂ ≠ θ̂.
Since α ≠ θ ⟹ α̂ ≠ θ̂, the mapping α ↦ α̂ is non-singular and so it is an isomorphism of V into V**. Since V is finite dimensional, we have
dim V = dim V* = dim V**.
Hence the mapping α ↦ α̂ is an isomorphism of V onto V**, so that each element α ∈ V determines a unique element α̂ ∈ V**. Thus the isomorphism between V and V** does not depend on any choice of bases for the two vector spaces.

(i) Let V be a finite dimensional vector space and let α ∈ V. If φ(α) = 0 for all φ ∈ V*, then α = θ.
(ii) Let {φ₁, φ₂, …, φₙ} be an ordered basis for V*. Then by this theorem, we conclude that for this ordered basis there exists a basis {α₁, α₂, …, αₙ} of V whose images {α̂₁, α̂₂, …, α̂ₙ} in V** form the dual basis of {φ₁, φ₂, …, φₙ}, i.e.,
δᵢⱼ = φᵢ(αⱼ) = α̂ⱼ(φᵢ); ∀ i, j.
Thus {α̂₁, α̂₂, …, α̂ₙ} is a dual basis of {φ₁, φ₂, …, φₙ}. Therefore, if V is a finite dimensional vector space with dual space V*, then every ordered basis for V* is the dual basis of some basis for V.

5.8.3  Annihilators

Let V(F) be a vector space and let S be a subset (not necessarily a subspace) of V. The annihilator of S, denoted by S⁰, is defined by

    S⁰ = {f ∈ V*; f(α) = 0, ∀ α ∈ S},                    (5.10)

where f is a linear functional on V. Clearly, {θ}⁰ = V* and V⁰ = {θ̂}.

Theorem 5.8.6 If S is any subset of a vector space V(F), then S⁰ is a subspace of V*.
Proof: By the definition of annihilators, the zero functional θ̂ ∈ S⁰, as θ̂(α) = 0, ∀ α ∈ S; thus S⁰ ≠ ∅. Let us suppose φ, ψ ∈ S⁰; then ∀ α ∈ S, we have φ(α) = 0 and ψ(α) = 0. Thus for any scalars a, b ∈ F, we have
(aφ + bψ)(α) = aφ(α) + bψ(α) = a.0 + b.0 = 0 ⟹ aφ + bψ ∈ S⁰.
Thus φ, ψ ∈ S⁰ implies aφ + bψ ∈ S⁰, ∀ a, b ∈ F. Hence S⁰ is a subspace of V*.


Theorem 5.8.7 Let V(F) be a finite dimensional vector space and W a subspace of V. Then dim W + dim W⁰ = dim V.
Proof: Case 1: If W = {θ}, then W⁰ = V*. Thus dim W = 0 and
dim W⁰ = dim V* = dim V.
In this case the result follows.
Case 2: Again, if W = V, then W⁰ = {θ̂}. Therefore,
dim W = dim V and dim W⁰ = 0.
So, in this case also, the result follows.
Case 3: Now let W be a proper subspace of V. Let dim V = n and dim W = r, 0 < r < n. We are to show that dim W⁰ = n − r. Let S = {α₁, α₂, …, αᵣ} be a basis of W. By the extension theorem, S can be extended to a basis of V, say S₁ = {α₁, …, αᵣ, αᵣ₊₁, …, αₙ}. Let {φ₁, φ₂, …, φₙ} be the basis of V* dual to S₁, so that
φᵢ(αⱼ) = δᵢⱼ.
By the definition of the dual basis, each of φᵣ₊₁, …, φₙ annihilates each basis vector of W, and so φᵢ ∈ W⁰; i = r + 1, …, n. We assert that {φᵣ₊₁, …, φₙ} is a basis of W⁰. It is LI, as it is a subset of a LI set. We are to show that it spans W⁰. Let σ ∈ W⁰; then by Theorem 5.8.3,
σ = σ(α₁)φ₁ + ⋯ + σ(αᵣ)φᵣ + σ(αᵣ₊₁)φᵣ₊₁ + ⋯ + σ(αₙ)φₙ
= 0.φ₁ + ⋯ + 0.φᵣ + σ(αᵣ₊₁)φᵣ₊₁ + ⋯ + σ(αₙ)φₙ
= σ(αᵣ₊₁)φᵣ₊₁ + ⋯ + σ(αₙ)φₙ.
Thus each element of W⁰ is a linear combination of the elements of {φᵣ₊₁, …, φₙ}. Hence {φᵣ₊₁, …, φₙ} spans W⁰ and so it is a basis of W⁰. Hence dim W⁰ = n − r = dim V − dim W.
Corollary: Here W is exactly the set of vectors α such that φᵢ(α) = 0; i = r + 1, r + 2, …, n. In case r = n − 1, W is the null space of φₙ. Thus if W is an r-dimensional subspace of an n-dimensional vector space V, then W is the intersection of (n − r) hyperspaces in V.
Corollary: Let W₁ and W₂ be two subspaces of a finite dimensional vector space V. If W₁ = W₂ then obviously W₁⁰ = W₂⁰. If W₁ ≠ W₂, then one of the two subspaces contains a vector which is not in the other. Let α ∈ W₂ but α ∉ W₁. By the previous corollary, there is a linear functional φ such that φ(β) = 0 for all β ∈ W₁, but φ(α) ≠ 0. Then φ is in W₁⁰ but not in W₂⁰, and W₁⁰ ≠ W₂⁰. Thus if W₁ and W₂ are subspaces of a finite dimensional vector space, then W₁ ≠ W₂ if and only if W₁⁰ ≠ W₂⁰.
Result 5.8.1 The first corollary says that, if we select some ordered basis for the space, each r-dimensional subspace can be described by specifying (n − r) homogeneous linear conditions on the coordinates relative to that basis. Now we look briefly at systems of homogeneous linear equations from the point of view of linear functionals. Consider the system of linear equations

    A₁₁x₁ + A₁₂x₂ + ⋯ + A₁ₙxₙ = 0
    A₂₁x₁ + A₂₂x₂ + ⋯ + A₂ₙxₙ = 0
       ⋮                                                  (5.11)
    Aₘ₁x₁ + Aₘ₂x₂ + ⋯ + Aₘₙxₙ = 0

for which we wish to find the solutions. If we let fᵢ; i = 1, 2, …, m, be the linear functionals on Fⁿ defined by
fᵢ(x₁, x₂, …, xₙ) = Aᵢ₁x₁ + Aᵢ₂x₂ + ⋯ + Aᵢₙxₙ,
then we are seeking the subspace annihilated by f₁, f₂, …, fₘ. Row reduction of the coefficient matrix provides us with a systematic method of finding this subspace. The n-tuple (Aᵢ₁, Aᵢ₂, …, Aᵢₙ) gives the coordinates of the linear functional fᵢ relative to the basis which is dual to the standard basis for Fⁿ. The row space of the coefficient matrix may thus be regarded as the space of linear functionals spanned by f₁, f₂, …, fₘ. The solution space is the subspace annihilated by this space of functionals.
Now we describe the system of equations from the dual point of view, i.e., suppose that we are given m vectors in Fⁿ as
αᵢ = (Aᵢ₁, Aᵢ₂, …, Aᵢₙ)
and we wish to find the annihilator of the subspace spanned by these vectors. As a typical linear functional on Fⁿ has the form
f(x₁, x₂, …, xₙ) = c₁x₁ + c₂x₂ + ⋯ + cₙxₙ,
the condition that f be in the annihilator is that
Σⱼ Aᵢⱼcⱼ = 0; i = 1, 2, …, m,
which shows that (c₁, c₂, …, cₙ) is a solution of the homogeneous system Ax = 0. Therefore, from the dual point of view, row reduction gives us a systematic method of finding the annihilator of the subspace spanned by a given finite set of vectors in Fⁿ.


Ex 5.8.7 Let W be the subspace of ℝ⁵ which is spanned by the vectors α₁ = (2, −2, 3, 4, −1), α₂ = (−1, 1, 2, 5, 2), α₃ = (0, 0, −1, −2, 3) and α₄ = (1, −1, 2, 3, 0). How does one describe W⁰, the annihilator of W?
Solution: Let us form a 4 × 5 matrix A with row vectors α₁, α₂, α₃, α₄ and find the row reduced echelon matrix which is row equivalent to A:
A = [2 −2 3 4 −1; −1 1 2 5 2; 0 0 −1 −2 3; 1 −1 2 3 0]
  → [1 −1 0 −1 0; 0 0 1 2 0; 0 0 0 0 1; 0 0 0 0 0] = R (say).
Let f be a linear functional on ℝ⁵, f(x₁, x₂, x₃, x₄, x₅) = Σⱼ cⱼxⱼ; then f is in W⁰ if and only if f(αᵢ) = 0 for i = 1, 2, 3, 4, i.e., if and only if
Σⱼ Aᵢⱼcⱼ = 0; i = 1, 2, 3, 4
⟺ Σⱼ Rᵢⱼcⱼ = 0; i = 1, 2, 3
⟺ c₁ − c₂ − c₄ = 0, c₃ + 2c₄ = 0, c₅ = 0.
We obtain all such linear functionals by assigning arbitrary values to c₂ and c₄, say c₂ = a and c₄ = b, so that c₁ = a + b, c₃ = −2b, c₅ = 0. Therefore W⁰ consists of all linear functionals of the form
f(x₁, x₂, x₃, x₄, x₅) = (a + b)x₁ + ax₂ − 2bx₃ + bx₄.
The dimension of W⁰ is 2 and a basis {f₁, f₂} for W⁰ can be found by taking a = 1, b = 0 and a = 0, b = 1:
f₁(x₁, x₂, x₃, x₄, x₅) = x₁ + x₂;  f₂(x₁, x₂, x₃, x₄, x₅) = x₁ − 2x₃ + x₄.
The general element f of W⁰ is f = af₁ + bf₂.
Ex 5.8.8 Find the subspace which {f₁, f₂, f₃} annihilate, where the three functionals on ℝ⁴ are f₁(x₁, x₂, x₃, x₄) = x₁ + 2x₂ + 2x₃ + x₄, f₂(x₁, x₂, x₃, x₄) = 2x₂ + x₄, f₃(x₁, x₂, x₃, x₄) = −2x₁ − 4x₃ + 3x₄.
Solution: The subspace which {f₁, f₂, f₃} annihilate may be found explicitly by forming a 3 × 4 matrix A with the coefficients as row vectors and by finding the row reduced echelon matrix which is row equivalent to A:
A = [1 2 2 1; 0 2 0 1; −2 0 −4 3] → [1 0 2 0; 0 1 0 0; 0 0 0 1].
Therefore, the linear functionals {g₁, g₂, g₃} given by
g₁(x₁, …, x₄) = x₁ + 2x₃, g₂(x₁, …, x₄) = x₂, g₃(x₁, …, x₄) = x₄
span the same subspace of (ℝ⁴)* and annihilate the same subspace of ℝ⁴ as do f₁, f₂, f₃. The subspace annihilated consists of the vectors with x₁ = −2x₃, x₂ = x₄ = 0.
Theorem 5.8.8 If φ ∈ V* annihilates a subset W of V, then φ annihilates the linear span L(W) of W.
Proof: Let α ∈ L(W); then there exist α₁, α₂, …, αₙ ∈ W for which
α = Σᵢ aᵢαᵢ, for some scalars aᵢ ∈ F
⟹ φ(α) = Σᵢ aᵢφ(αᵢ) = Σᵢ aᵢ.0 = 0.
Since α is an arbitrary element of L(W), φ annihilates L(W), i.e., W⁰ = (L(W))⁰.

Theorem 5.8.9 If W is a subspace of a finite dimensional vector space V(F), then W* ≅ V*/W⁰.
Proof: Here,
dim(V*/W⁰) = dim V* − dim W⁰
= dim V − dim W⁰; as dim V* = dim V
= dim V − (dim V − dim W) = dim W = dim W*.
Hence by the theorem of isomorphism, we have W* ≅ V*/W⁰.
Theorem 5.8.10 For any subsets W, W₁, W₂ of V: (i) W ⊆ W⁰⁰ and (ii) W₁ ⊆ W₂ ⟹ W₂⁰ ⊆ W₁⁰.
Proof: (i) Let α ∈ W; then for every linear functional φ ∈ W⁰, we have
α̂(φ) = φ(α) = 0 ⟹ α̂ ∈ (W⁰)⁰.
Under the identification of V and V**, α ∈ W⁰⁰. Hence W ⊆ W⁰⁰.
(ii) Let φ ∈ W₂⁰; then φ(α) = 0, ∀ α ∈ W₂. But W₁ ⊆ W₂; hence φ annihilates every element of W₁, i.e., φ ∈ W₁⁰. Hence W₂⁰ ⊆ W₁⁰.
Theorem 5.8.11 Let S and T be subspaces of a finite dimensional vector space V(F); then (S + T)⁰ = S⁰ ∩ T⁰.
Proof: Let φ ∈ (S + T)⁰, i.e., φ(α) = 0, ∀ α ∈ S + T; then φ annihilates S + T and so, in particular, φ annihilates S and T. Indeed, α ∈ S ⟹ α ∈ S + T and α ∈ T ⟹ α ∈ S + T, so φ(α) = 0, ∀ α ∈ S and φ(α) = 0, ∀ α ∈ T. Therefore,
φ ∈ (S + T)⁰ ⟹ φ ∈ S⁰ and φ ∈ T⁰ ⟹ φ ∈ S⁰ ∩ T⁰, i.e., (S + T)⁰ ⊆ S⁰ ∩ T⁰.
Again let φ ∈ S⁰ ∩ T⁰, i.e., φ ∈ S⁰ and φ ∈ T⁰; then φ annihilates both S and T. Let α ∈ S + T; then α = s + t, where s ∈ S and t ∈ T. Therefore,
φ(α) = φ(s) + φ(t) = 0 + 0 = 0.
Thus φ annihilates S + T, i.e., φ ∈ (S + T)⁰, i.e., S⁰ ∩ T⁰ ⊆ (S + T)⁰. Hence (S + T)⁰ = S⁰ ∩ T⁰. Similarly, if S and T are subspaces of a finite dimensional vector space V(F), then (S ∩ T)⁰ = S⁰ + T⁰.
Theorem 5.8.12 For any subset S of a finite dimensional vector space V(F), L(S) = S⁰⁰.
Proof: We have S⁰ = [L(S)]⁰. Now, L(S) being a subspace of V, it follows that [L(S)]⁰⁰ = L(S). Thus,
S⁰ = [L(S)]⁰ ⟹ S⁰⁰ = [L(S)]⁰⁰ ⟹ S⁰⁰ = L(S).
Therefore, for any subset S of a vector space V(F), L(S) = S⁰⁰.


Ex 5.8.9 Let W be a subspace of ℝ⁴ spanned by (1, 2, −3, 4), (1, 3, −2, 6) and (1, 4, −1, 8). Find a basis of the annihilator of W.
Solution: Let α₁ = (1, 2, −3, 4), α₂ = (1, 3, −2, 6), α₃ = (1, 4, −1, 8) and S = {α₁, α₂, α₃}. Since W = L(S), it suffices to find a basis of the set of linear functionals
φ(α) = φ(x, y, z, w) = ax + by + cz + dw
for which φ(α₁) = 0 = φ(α₂) = φ(α₃). Thus,
φ(1, 2, −3, 4) = a + 2b − 3c + 4d = 0
φ(1, 3, −2, 6) = a + 3b − 2c + 6d = 0
φ(1, 4, −1, 8) = a + 4b − c + 8d = 0.
The system of three equations in the unknowns a, b, c, d reduces to the echelon form
a + 2b − 3c + 4d = 0, b + c + 2d = 0,
with free variables c and d. Let c = 0, d = 1; then b = −2, a = 0 and hence the linear functional φ₁(x, y, z, w) = −2y + w. Let c = 1, d = 0; then b = −1, a = 5 and φ₂(x, y, z, w) = 5x − y + z. The set of linear functionals {φ₁, φ₂} is LI and so is a basis of W⁰, the annihilator of W.
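As a last check on Ex 5.8.9, both functionals must vanish on all three spanning vectors of W (a verification sketch, not part of the original text).

```python
# Both basis functionals of the annihilator W^0 vanish on the generators of W.
spans = [(1, 2, -3, 4), (1, 3, -2, 6), (1, 4, -1, 8)]
phi1 = lambda x, y, z, w: -2 * y + w
phi2 = lambda x, y, z, w: 5 * x - y + z
print([phi1(*v) for v in spans], [phi2(*v) for v in spans])
# [0, 0, 0] [0, 0, 0]
```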

5.9  Transpose of a Linear Mapping

Let U and V be two vector spaces over the same field F. Let T : U → V be an arbitrary linear mapping from the vector space U into the vector space V. Now, for any linear functional φ ∈ V*, the composite mapping φ ∘ T is linear from U to F, so that φ ∘ T ∈ U*. Then the mapping Tᵗ : V* → U*, defined by

    [Tᵗ(φ)](α) = φ[T(α)]; ∀ α ∈ U and ∀ φ ∈ V*,                    (5.12)

is called the adjoint or transpose of the linear transformation T.


Ex 5.9.1 Let φ be a linear functional on ℝ², defined by φ(x, y) = 3x − 2y. For the linear mapping T : ℝ³ → ℝ², defined by T(x, y, z) = (x + y + z, 2x − y), find [Tᵗ(φ)](x, y, z).
Solution: Using the definition of the transpose mapping, we have Tᵗ(φ) = φ ∘ T, i.e., we have
[Tᵗ(φ)](x, y, z) = φ[T(x, y, z)] = φ(x + y + z, 2x − y)
= 3(x + y + z) − 2(2x − y) = −x + 5y + 3z.
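The composition Tᵗ(φ) = φ ∘ T of Ex 5.9.1 can be computed directly; evaluating it on the standard basis vectors recovers the coefficients of the closed form −x + 5y + 3z. This is a sketch, not part of the original text.

```python
def T(x, y, z):
    return (x + y + z, 2 * x - y)

def phi(u, v):
    return 3 * u - 2 * v

def Tt_phi(x, y, z):          # (T^t(phi))(x, y, z) = phi(T(x, y, z))
    return phi(*T(x, y, z))

# Coefficients of the closed form -x + 5y + 3z:
print(Tt_phi(1, 0, 0), Tt_phi(0, 1, 0), Tt_phi(0, 0, 1))   # -1 5 3
```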

Theorem 5.9.1 The adjoint Tᵗ of a linear transformation T is also linear.
Proof: Let U and V be two vector spaces over the same field F and T : U → V be a linear transformation. Now, ∀ φ, ψ ∈ V*, a, b ∈ F and α ∈ U, we have
[Tᵗ(aφ + bψ)](α) = (aφ + bψ)(T(α))
= aφ(T(α)) + bψ(T(α))
= a[Tᵗ(φ)](α) + b[Tᵗ(ψ)](α)
= [aTᵗ(φ) + bTᵗ(ψ)](α)
⟹ Tᵗ(aφ + bψ) = aTᵗ(φ) + bTᵗ(ψ).
Thus, the adjoint Tᵗ of a linear transformation T is also linear.

Theorem 5.9.2 Let U and V be two vector spaces over the same field F and T : U → V be a linear transformation; then ker(Tᵗ) = [Im(T)]⁰.
Proof: Let Tᵗ be the adjoint linear transformation of the linear transformation T : U → V. Then
φ ∈ ker(Tᵗ) ⟺ Tᵗ(φ) = θ̂ ⟺ [Tᵗ(φ)](α) = 0, ∀ α ∈ U
⟺ φ[T(α)] = 0, ∀ α ∈ U, i.e., φ vanishes on every T(α) ∈ Im(T)
⟺ φ ∈ [Im(T)]⁰.
Hence ker(Tᵗ) = [Im(T)]⁰, and the theorem follows. Similarly, if T : U → V is linear and U is finite dimensional, then (ker T)⁰ = Im(Tᵗ). Thus the null space of Tᵗ is the annihilator of the range of T.
Theorem 5.9.3 Let U and V be two finite dimensional vector spaces over the same field F and T : U → V be a linear transformation; then rank(Tᵗ) = rank(T).
Proof: By the previous theorem, ker(Tᵗ) = [Im(T)]⁰, so
ν(Tᵗ) = dim[Im(T)]⁰ = dim V − dim[Im(T)] = dim V − ρ(T).
But Tᵗ is a linear transformation from V* into U*, so we have
dim V* = ρ(Tᵗ) + ν(Tᵗ)
⟹ dim V = ρ(Tᵗ) + dim V − ρ(T); as dim V* = dim V
⟹ ρ(Tᵗ) = ρ(T), i.e., rank(Tᵗ) = rank(T).
Let N be the null space of T and let n = dim U. Every functional φ in the range of Tᵗ is in the annihilator of N; for suppose φ = Tᵗ(ψ) for some ψ ∈ V*; then for α ∈ N, we have
φ(α) = [Tᵗ(ψ)](α) = ψ(T(α)) = ψ(θ) = 0.
Now, the range of Tᵗ is a subspace of the space N⁰, and
dim N⁰ = n − dim N = rank(T) = rank(Tᵗ),
so that the range of Tᵗ must be exactly N⁰. Therefore, the range of Tᵗ is the annihilator of the null space of T.

Exercise 5
Section-A
[Multiple Choice Questions]
1. Let T : ℝ² → ℝ³ be a linear transformation given by T(x₁, x₂) = (x₁ + x₂, x₁ − x₂, x₂); then rank T is
(a) 0    (b) 1    (c) 2    (d) 3.
2. The rank and nullity of T, where T is the linear transformation from ℝ² → ℝ³ defined by T(a, b) = (a − b, b − a, a), are respectively
(a) (1,1)    (b) (2,0)    (c) (0,2)    (d) (2,1)
3. Which of the following is not a linear transformation?
(a) T : ℝ² → ℝ² : T(x, y) = (2x − y, x)    (b) T : ℝ² → ℝ³ : T(x, y) = (x + y, y, x)
(c) T : ℝ³ → ℝ³ : T(x, y, z) = (x + y + z, 1, 1)    (d) T : ℝ → ℝ² : T(x) = (2x, x)


4. Let T : ℝ² → ℝ² be the linear transformation such that T((1, 2)) = (2, 3) and T((0, 1)) = (1, 4). Then T((5, 6)) is    [IIT-JAM10]
(a) (6,-1)    (b) (-6,1)    (c) (-1,6)    (d) (1,-6)
5. Let T₁ and T₂ be linear operators on ℝ² defined as follows: T₁(a, b) = (b, a), T₂(a, b) = (0, b). Then T₁T₂, defined by T₁T₂(a, b) = T₁(T₂(a, b)), maps (1, 2) into
(a) (2,1)    (b) (1,0)    (c) (0,2)    (d) (2,0)
6. Let T : ℝ³ → ℝ³ be the linear transformation whose matrix with respect to the standard basis {e₁, e₂, e₃} of ℝ³ is [0 0 1; 0 1 0; 1 0 0]. Then T    [IIT-JAM10]
(a) maps the subspace spanned by e₁ and e₂ onto itself    (b) has distinct eigen values    (c) has eigen vectors that span ℝ³    (d) has a non-zero null space.
7. Let T : ℝ³ → ℝ³ be the linear transformation whose matrix with respect to the standard basis of ℝ³ is [0 a b; −a 0 c; −b −c 0], where a, b, c are real numbers not all zero. Then T    [IIT-JAM10]
(a) is one-to-one    (b) is onto    (c) does not map any line through the origin onto itself    (d) has rank 1.
8. For m ≠ n, let T₁ : ℝⁿ → ℝᵐ and T₂ : ℝᵐ → ℝⁿ be linear transformations such that T₁T₂ is bijective. If R(T) is the rank of T, then    [IIT-JAM11]
(a) R(T₁) = n and R(T₂) = m    (b) R(T₁) = m and R(T₂) = n    (c) R(T₁) = n and R(T₂) = n    (d) R(T₁) = m and R(T₂) = m
9. Let W be the vector space of all real polynomials of degree at most 3. Define T : W → W by (Tp)(x) = p′(x), where p′ is the derivative of p. The matrix of T in the basis {1, x, x², x³}, considered as column vectors, is given by    [NET(June)11]
(a) [0 0 0 0; 0 1 0 0; 0 0 2 0; 0 0 0 3]    (b) [0 0 0 0; 1 0 0 0; 0 2 0 0; 0 0 3 0]
(c) [0 1 0 0; 0 0 2 0; 0 0 0 3; 0 0 0 0]    (d) [0 1 2 3; 0 0 0 0; 0 0 0 0; 0 0 0 0]
10. Let N be the vector space of all real polynomials of degree at most 3. Define S : N → N by (Sp)(x) = p(x + 1), p ∈ N. Then the matrix of S in the basis {1, x, x², x³}, considered as column vectors, is given by    [NET(June)12]
(a) [1 0 0 0; 0 2 0 0; 0 0 3 0; 0 0 0 4]    (b) [1 1 1 1; 0 1 2 3; 0 0 1 3; 0 0 0 1]
(c) [1 1 2 3; 1 1 2 3; 2 2 2 3; 3 3 3 3]    (d) [0 0 0 0; 1 0 0 0; 0 1 0 0; 0 0 1 0]
11. Let T be a linear transformation on the real vector space ℝⁿ over ℝ such that T² = λT for some λ ∈ ℝ. Then    [NET(June)11]
(a) ||Tx|| = |λ| ||x|| for all x ∈ ℝⁿ
(b) if ||Tx|| = ||x|| for some nonzero vector x ∈ ℝⁿ, then λ = 1
(c) T = λI, where I is the identity transformation on ℝⁿ
(d) if ||Tx|| > ||x|| for a nonzero vector x ∈ ℝⁿ, then T is necessarily singular.


12. For a positive integer n, let Pₙ denote the space of all polynomials p(x) with coefficients in ℝ such that deg p(x) ≤ n, and let Bₙ denote the standard basis of Pₙ given by Bₙ = {1, x, x², …, xⁿ}. If T : P₃ → P₄ is the linear transformation defined by
T(p(x)) = x²p′(x) + ∫₀ˣ p(t) dt
and A = (aᵢⱼ) is the 5 × 4 matrix of T with respect to the standard bases B₃ and B₄, then    [NET(Dec)11]
(a) a₃₂ = 3/2 and a₃₃ = 7/3    (b) a₃₂ = 3/2 and a₃₃ = 0    (c) a₃₂ = 0 and a₃₃ = 7/3    (d) a₃₂ = 0 and a₃₃ = 0.
13. Consider the linear transformation T : ℝ⁷ → ℝ⁷ defined by T(x₁, x₂, …, x₆, x₇) = (x₇, x₆, …, x₂, x₁). Which of the following statements are true?    [NET(Dec)11]
(a) The determinant of T is 1    (b) There is a basis of ℝ⁷ with respect to which T is a diagonal matrix    (c) T⁷ = I    (d) The smallest n such that Tⁿ = I is even.
14. Let M₂(ℝ) denote the set of 2 × 2 real matrices. Let A ∈ M₂(ℝ) be of trace 2 and determinant −3. Identifying M₂(ℝ) with ℝ⁴, consider the linear transformation T : M₂(ℝ) → M₂(ℝ) defined by T(B) = AB. Then which of the following statements are true?    [NET(Dec)11]
(a) T is diagonalizable    (b) 2 is an eigen value of T    (c) T is invertible    (d) T(B) = −B for some 0 ≠ B in M₂(ℝ).
15. If U and V are vector spaces of dimension 4 and 6 respectively, then dim hom(V, U) is
(a) 4    (b) 6    (c) 10    (d) 24
16. Let V = {f(x) ∈ ℝ[x] : deg f(x) ≤ 1}, where ℝ is the field of real numbers. Define e₁, e₂ : V → ℝ by

e₁[f(x)] = ∫₀¹ f(x) dx  and  e₂[f(x)] = ∫₀² f(x) dx.

Then the basis of V whose dual basis is {e₁, e₂} is
(a) {1 + x, 1 - x} (b) {2 - 2x, -1/2 + x} (c) {2 + 2x, 1/2 - x} (d) {(1-x)/2, (1+x)/2}.
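Option (b) can be verified by checking the dual-basis conditions eᵢ(fⱼ) = δᵢⱼ; a small sketch using exact rational arithmetic (helper names are ours):

```python
from fractions import Fraction as F

def integral(poly, upper):
    """Integral of a + b*x from 0 to upper, for poly = (a, b)."""
    a, b = poly
    return a * upper + F(b) * upper ** 2 / 2

e1 = lambda p: integral(p, F(1))   # e1(f) = integral of f over [0, 1]
e2 = lambda p: integral(p, F(2))   # e2(f) = integral of f over [0, 2]

f1 = (F(2), F(-2))        # 2 - 2x
f2 = (F(-1, 2), F(1))     # -1/2 + x

print(e1(f1), e2(f1), e1(f2), e2(f2))  # -> 1 0 0 1
```

The pattern 1, 0, 0, 1 is exactly the Kronecker delta condition, so {2 - 2x, -1/2 + x} is the basis dual to {e₁, e₂}.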
17. Let T be a linear transformation on a vector space V such that T² - T + I = 0. Then
(a) T is singular (b) T is non-singular (c) T is invertible (d) T is not invertible.
18. Let T be the linear transformation on ℝ³ given by T(x, y, z) = (2x, 4x - y, 2x + 3y - z). Then
(a) T is singular (b) T is non-singular (c) T is invertible (d) T⁻¹(x, y, z) = (x/2, 2x - y, 7x - 3y - z).
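The formula in option (d) can be checked by composing it with T; reading the dropped minus signs as T(x, y, z) = (2x, 4x - y, 2x + 3y - z) and T⁻¹(x, y, z) = (x/2, 2x - y, 7x - 3y - z), a quick round-trip test:

```python
def T(x, y, z):
    return (2 * x, 4 * x - y, 2 * x + 3 * y - z)

def T_inv(x, y, z):
    # candidate inverse from option (d)
    return (x / 2, 2 * x - y, 7 * x - 3 * y - z)

# T_inv undoes T (and vice versa) on sample points:
for p in [(1.0, 2.0, 3.0), (-2.0, 0.5, 4.0)]:
    assert T_inv(*T(*p)) == p and T(*T_inv(*p)) == p
print("T is invertible with the stated inverse")
```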
Section-B
[Objective Questions]
1. Test whether the following mappings are linear or not.
(a) T : ℝ² → ℝ² defined by T(x, y) = (x + y, x).
(b) T : ℝ³ → ℝ defined by T(x, y, z) = x + y + z.
(c) T : ℝ² → ℝ² defined by T(x, y) = (2x + 1, 2y - 1).
(d) T : ℝ³ → ℝ³ defined by T(x, y, z) = (3x - 2y + 3z, y, 2x + y - z).

358

Linear Transformations
(e) T : ℝ³ → ℝ² defined by T(x, y, z) = (xy, x + y).
(f) T : ℝ³ → ℝ³ defined by T(x, y, z) = (3x, 2y, z).
(g) T : ℝ⁴ → ℝ² defined by T(x, y, z, t) = (x + y, z + t).
(h) T : ℝ³ → ℝ² defined by T(x, y, z) = (|x - y|, |y|).
(i) T : ℝ³ → ℝ³ defined by T(x, y, z) = (xy, yz, zx).
(j) T : ℝ² → ℝ³ defined by T(x, y) = (x, x - y, x + y).
(k) T : V → V defined by T(u(t)) = a u″(t) + b u′(t) + c u(t), where V is the vector space of functions having derivatives of all orders and a, b, c are arbitrary scalars.
(l) T : P₄(x) → P₄(x) defined by T(p(x)) = x p′(x) + p(x), where P₄(x) is the set of all polynomials of degree ≤ 4.
(m) T : V₂₂ → P₃(x) defined by T[a b; c d] = (a - b) + (b - c)x + (c - d)x² + (d - a)x³, where V₂₂ is the space of 2 × 2 matrices.
(n) T : P₂(x) → P₃(x) defined by T(p(x)) = p(x) + 5 ∫₀¹ p(t) dt.

2. Let V be the vector space of all n × n matrices over the field F, and let B be a fixed n × n matrix. If T(A) = AB - BA, verify that T is a linear transformation from V into V.
3. Show that the translation map T : ℝ² → ℝ², defined by T(x, y) = (x + 4, y - 3), is not a linear transformation.
4. Show that a linear transformation T : U → V is injective if and only if the kernel of T is {θ}.
5. Let T : ℝ² → ℝ² be the linear transformation which rotates each vector of ℝ² by an angle π/4. Show that T has no eigenvectors.
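For Problem 5, the rotation matrix [[cos t, -sin t], [sin t, cos t]] has characteristic polynomial λ² - 2 cos(t) λ + 1, whose discriminant is negative whenever sin t ≠ 0, so there are no real eigenvalues and hence no (real) eigenvectors; a one-line check for t = π/4:

```python
import math

t = math.pi / 4
# discriminant of L^2 - 2*cos(t)*L + 1:
disc = 4 * math.cos(t) ** 2 - 4
print(disc)  # negative -> no real eigenvalues, so no eigenvectors in R^2
```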
Section-C
[Long Answer Questions]
1. (a) Show that T : ℝ² → ℝ² defined by T(x, y) = (2x + y, x) is a linear mapping.
(b) Let M_{m×n}(F) be a vector space over the field F and let [aij] be a fixed m × n matrix over F. Define T : M_{m×n} → M_{m×n} by
T[bij] = [aij] · [bij]; [bij] ∈ M_{m×n}.
Prove that T is a linear transformation.
(c) Let F³ and F² be two vector spaces over the same field F. Define T : F³ → F² by
T(a, b, c) = (a, b); (a, b, c) ∈ F³.
Prove that T is linear.
(d) Let P be a fixed m × m matrix with entries in the field F and let Q be a fixed n × n matrix over F. Prove that T : F^(m×n) → F^(m×n) defined by T(A) = PAQ is a linear transformation.
(e) Show that the translation map T : ℝ² → ℝ² defined by T(x, y) = (x + 4, y - 3) is not a linear transformation.
[BH07]
2. Let T : U → V be a linear transformation. Show that the general solution of T(x) = y is the sum of the general solution of T(x) = θ and a particular solution of T(x) = y.


3. (a) A mapping T : ℝ³ → ℝ³ is defined by T(x, y, z) = (x + 2y + 3z, 3x + 2y + z, x + y + z); (x, y, z) ∈ ℝ³. Show that T is linear. Find ker T and the dimension of ker T.
(b) Show that T : ℝ³ → ℝ³, defined by T(x, y, z) = (x + y, y + z, z + x), is a linear transformation. Determine the dimensions of ker T and Im T.
(c) Find a basis and the dimension of ker T, where the linear mapping T : ℝ³ → ℝ² is defined by T(x, y, z) = (x + y, y + z).
(d) Let V be the vector space over ℝ of all polynomials of degree at most 6. Let T : V → V be given by T(p(x)) = p′(x). Determine the rank and nullity of the mapping T.
[JU(M.Sc.)06]
4. (a) Let T : ℝ² → ℝ be linear, where T(1, 1) = 3 and T(0, 1) = 2. Find T(x, y).
(b) Let S : ℝ³ → ℝ be linear, where S(0, 1, 2) = 1, S(0, 0, 1) = 2 and S(1, 1, 1) = 3. Find S(x, y, z).
(c) If T : ℝ² → ℝ³ is defined by T(1, 2) = (3, -1, 5), T(0, 1) = (2, 1, -1), show that T(x, y) = (-x + 2y, -3x + y, 7x - y).
5. (a) Let {(1, 1, 1), (4, 1, 1), (1, 1, 2)} be a basis of ℝ³ and let T : ℝ³ → ℝ² be the linear transformation such that T(1, 1, 1) = (1, 0), T(4, 1, 1) = (0, 1), T(1, 1, 2) = (1, 1). Find T(x, y, z).
(b) Determine the linear mapping T : ℝ² → ℝ² which maps the basis vectors (1, 1), (0, 1) of ℝ² to the vectors (2, 0), (1, 0) respectively.
(c) Find a linear transformation S : ℝ³ → ℝ² which maps the vectors (1, 1, 1), (1, 1, 0), (1, 0, 0) to (2, 1), (2, 1), (2, 1) respectively.
(d) Find a linear transformation T : ℝ⁴ → ℝ³ which transforms the elementary vectors (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1) to (1, 2, 3), (1, 1, 2), (1, 2, 2) and (2, 1, 3) respectively.
(e) Show that the linear transformation T : ℝ³ → ℝ³ which transforms the vectors (3, 1, 2), (1, 1, 0), (2, 0, 2) to twice the elementary vectors 2(1, 0, 0), 2(0, 1, 0), 2(0, 0, 1) is T(x, y, z) = (x - y + z, x + y + z, x - y + 2z).
6. Determine ker T and the nullity of T when T is given by
(a) T(x, y, z) = (x - y, y - z, z - x)
(b) T(x, y, z) = x + y - z
(c) T(x, y) = (cos x, sin y).
7. Let F be a subfield of the complex numbers and let T be the function from F³ into F³ defined by
T(x, y, z) = (x - y + 2z, 2x + y, -x - 2y + 2z).
(a) Verify that T is a linear transformation.
(b) If (a, b, c) is a vector in F³, what are the conditions on a, b and c that the vector be in the range of T? What is the rank of T?
(c) What are the conditions on a, b and c that (a, b, c) be in the null space of T? What is the nullity of T?
(d) Let T be a linear operator defined on a finite dimensional vector space V. If rank(T²) = rank(T), find R(T) ∩ N(T), where R(T), N(T) denote respectively the range and the null space of T.
[Gate97]
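For parts (b) and (c) of Problem 7, the rank and nullity can be checked numerically; the sketch below (with our own `rank` helper doing exact Gaussian elimination) gives rank 2 and nullity 1:

```python
from fractions import Fraction as F

# Matrix of T(x, y, z) = (x - y + 2z, 2x + y, -x - 2y + 2z)
# in the standard basis (row i holds the i-th coordinate function):
A = [[F(1), F(-1), F(2)],
     [F(2), F(1), F(0)],
     [F(-1), F(-2), F(2)]]

def rank(M):
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

print(rank(A), 3 - rank(A))  # -> 2 1  (rank 2, nullity 1)
```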

(e) Let V be the vector space of square matrices of order n over F. Let T : V → F be the trace mapping T(A) = a₁₁ + a₂₂ + ⋯ + aₙₙ, where A = [aᵢⱼ]. Prove that T is linear on V. Find the nullity and Im(T). Also verify the dimension theorem.

8. Prove that T defined below is a linear transformation.
(a) T : ℝ³ → ℝ² defined by T(x, y, z) = (x - y, 2z)
(b) T : ℝ² → ℝ³ defined by T(x, y) = (x + y, 0, 2x - y)
(c) T : M_{2×3}(F) → M_{2×2}(F) defined by
T[a₁₁ a₁₂ a₁₃; a₂₁ a₂₂ a₂₃] = [2a₁₁ - a₁₂  a₁₃ + 2a₁₂; 0  0]
(d) T : P₂(ℝ) → P₃(ℝ) defined by T(f(x)) = x f(x) + f′(x).
In each case find the nullity and rank of T and verify the dimension theorem.
9. (a) If T : ℝ² → ℝ² is given by T(x, y) = (x, 0) for all (x, y) ∈ ℝ², show that T is a linear transformation and verify that
dim ker(T) + dim im(T) = 2.
[WBUT 2004]
(b) If T : ℝ³ → ℝ³ is defined by T(x, y, z) = (x - y, y - z, z - x), show that T is a linear transformation and verify Sylvester's law, viz., rank of T + nullity of T = 3.
10. If A ∈ ℝ^(m×m) and B ∈ ℝ^(n×n) have a common eigenvalue λ ∈ ℝ, show that the linear operator T : ℝ^(m×n) → ℝ^(m×n), defined by T(X) = AX - XB, is singular.
[Gate98]
11. Let V and W be finite dimensional vector spaces and let T : V → W be a linear transformation. Let {u₁, u₂, …, uₙ} be a subset of V such that {Tu₁, Tu₂, …, Tuₙ} is LI in W. Show that {u₁, u₂, …, uₙ} is LI in V. Deduce that, if T is onto, then dim V ≥ dim W.
[Gate99]
12. (a) Find a linear transformation T : ℝ⁴ → ℝ⁴ for which the null space is spanned by (2, 2, 1, 2), (3, 4, 3, 1) and the range space by (3, 2, 1, 0), (0, 1, 2, 3).
(b) Find a linear transformation whose kernel is spanned by (1, 2, 3, 4) and (0, 1, 1, 1).
(c) Find a linear transformation T : ℝ⁴ → ℝ³ for which the null space is spanned by (2, 1, 1, 2) and the image space by (cos θ, sin θ, 0), (-sin θ, cos θ, 0) and (0, 0, 1), where θ is an arbitrary real number.
13. (a) A linear mapping T : ℝ³ → ℝ³ maps the vectors (2, 1, 1), (1, 2, 1) and (1, 1, 2) to (1, 1, 1), (1, 1, 1) and (1, 0, 0) respectively. Show that T is not an isomorphism.
(b) A linear mapping S : ℝ³ → ℝ³ maps the vectors (0, 1, 1), (1, 0, 1) and (1, 1, 0) to (2, 1, 1), (1, 2, 1) and (1, 1, 2) respectively. Show that S is not an isomorphism.
(c) A linear mapping T : ℝ³ → ℝ³ maps the basis vectors α, β, γ to α + β, β + γ, γ respectively. Show that T is an isomorphism.
14. Let S and T be linear mappings of ℝ³ to ℝ³ defined by
S(x, y, z) = (z, y, x); (x, y, z) ∈ ℝ³ and
T(x, y, z) = (x + y + z, y + z, z); (x, y, z) ∈ ℝ³.
Determine TS and ST. Prove that both S and T are invertible, and verify that (ST)⁻¹ = T⁻¹S⁻¹.


15. Consider the basis S = {α₁, α₂, α₃} for ℝ³, where α₁ = (1, 1, 1), α₂ = (1, 1, 0) and α₃ = (1, 0, 0), and let T : ℝ³ → ℝ² be a linear transformation such that T(α₁) = (1, 0), T(α₂) = (2, 1) and T(α₃) = (4, 3). Find T(2, 3, 5).
[Gate2k]
16. Let T : V → V be a linear transformation on a vector space V over the field K satisfying the property that Tx = θ implies x = θ. If x₁, x₂, …, xₙ are linearly independent elements in V, show that Tx₁, Tx₂, …, Txₙ are also linearly independent. [Gate01]
17. Let T : ℝ³ → ℝ² be a linear transformation defined by T(x, y, z) = (x + y, y - z). Find the matrix of T with respect to the ordered bases ((1, 1, 1), (1, 1, 0), (0, 1, 0)) and ((1, 1), (1, 0)).
[Gate03]
18. Let V be the vector space of polynomials in t over ℝ. Let I : V → ℝ be a mapping defined by I[p(t)] = ∫₀¹ p(t) dt. Prove that I is a linear functional on V.
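The matrix asked for in Problem 17 can be computed by expressing each image T(uᵢ) in the basis {(1, 1), (1, 0)}; a minimal sketch (the `coords` helper is ours, not from the text):

```python
def T(x, y, z):            # T(x, y, z) = (x + y, y - z)
    return (x + y, y - z)

def coords(v):
    # coordinates of v = (v1, v2) in the basis {(1, 1), (1, 0)}:
    # v = a*(1, 1) + b*(1, 0)  =>  a = v2, b = v1 - v2
    return (v[1], v[0] - v[1])

# Columns of the matrix are the coordinate vectors of the images
# of (1,1,1), (1,1,0), (0,1,0):
cols = [coords(T(*u)) for u in [(1, 1, 1), (1, 1, 0), (0, 1, 0)]]
matrix = [[cols[j][i] for j in range(3)] for i in range(2)]
print(matrix)  # -> [[0, 1, 1], [2, 1, 0]]
```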

19. (a) Show that the mapping T : ℝ³ → ℝ³ defined by T(x₁, x₂, x₃) = (2x₁ + x₂ + 3x₃, 3x₁ - x₂ + x₃, 4x₁ + 3x₂ + x₃), where x₁, x₂, x₃ ∈ ℝ, is linear. Find the rank of T.
[CH05, 00]
(b) Let T : ℝ³ → ℝ³ be defined by T(x₁, x₂, x₃) = (x₁ + x₂, x₂ + x₃, x₃ + x₁) where x₁, x₂, x₃ ∈ ℝ. Show that T is a linear map. Determine the dimensions of ker T and Im T.
[CH99]
(c) Let T : ℝ³ → ℝ⁴ be a linear transformation defined by T(x₁, x₂, x₃) = (x₂ + x₃, x₃ + x₁, x₁ + x₂, x₁ + x₂ + x₃) where x₁, x₂, x₃ ∈ ℝ. Find ker T. What conclusion can you draw regarding the linear dependence and independence of the image set of the set of vectors {(1, 0, 0), (0, 1, 0), (0, 0, 1)}?
[CH04]
(d) Determine the linear transformation T : ℝ³ → ℝ⁴ that maps the vectors (1, 2, 3), (1, 3, 2), (2, 3, 1) of ℝ³ to the vectors (0, 1, 1, 1), (1, 0, 1, 1) and (1, 1, 0, 1) respectively. Find ker T and the rank of T.
[CH06]
(e) Let a linear transformation T : ℝ⁴ → ℝ² be defined by T(x₁, x₂, x₃, x₄) = (3x₁ - 2x₂ - x₃ - 4x₄, x₁ + x₂ - 2x₃ - 3x₄) where x₁, x₂, x₃, x₄ ∈ ℝ. Find rank T, nullity T and a basis of ker T.
[CH02]
(f) Let V be the vector space of 2 × 2 matrices over ℝ. Let T : V → V be the linear mapping defined by T(A) = AM - MA, where M = [1 2; 0 3]. Find a basis of ker T and the dimension of ker T.
[CH04]
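For part (f), since M has the distinct eigenvalues 1 and 3, the 2 × 2 matrices commuting with M form a two-dimensional space, so ker T is spanned by {I, M}; a quick membership check (helper names are ours):

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def bracket(A, M):
    """T(A) = AM - MA."""
    AM, MA = matmul(A, M), matmul(M, A)
    return [[AM[i][j] - MA[i][j] for j in range(2)] for i in range(2)]

M = [[1, 2], [0, 3]]
I = [[1, 0], [0, 1]]

# I and M commute with M, so both lie in ker T:
print(bracket(I, M), bracket(M, M))  # -> [[0, 0], [0, 0]] [[0, 0], [0, 0]]
```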
20. (a) Let T : ℝ³ → ℝ³ be defined by T(x₁, x₂, x₃) = (x₁ + x₂, x₂ + x₃, x₃ + x₁) where x₁, x₂, x₃ ∈ ℝ. Show that T is a linear map. Find the matrix associated with it with respect to the standard ordered basis of ℝ³.
[CH05, 01]
(b) Let T : ℝ³ → ℝ³ be a linear transformation defined by T(x₁, x₂, x₃) = (x₁ - x₂, x₁ + 2x₂, x₂ + 3x₃) where x₁, x₂, x₃ ∈ ℝ. Find the matrix representation of T with respect to the ordered bases {(1, 0, 0), (0, 1, 0), (0, 0, 1)} and {(1, 1, 0), (1, 0, 1), (0, 1, 1)}.
[CH05, 99]
(c) A linear transformation T : ℝ³ → ℝ³ transforms the vectors (1, 0, 0), (1, 1, 0), (1, 1, 1) to the vectors (1, 3, 2), (3, 4, 0) and (2, 1, 3) respectively. Find T and the matrix representation of T relative to the standard basis of ℝ³.
[CH07]
21. A linear mapping f : ℝ³ → ℝ³ maps the vectors (2, 1, 1), (1, 2, 1), (1, 1, 2) to (1, 1, 1), (1, 1, 1), (1, 0, 0) respectively. Examine whether f is an isomorphism.
[CH03]


22. Find the linear transformation T on ℝ³ which maps the basis (1, 0, 0), (0, 1, 0), (0, 0, 1) to (1, 1, 1), (0, 1, 1) and (1, 2, 0) respectively. Find the images of (1, 1, 1) and (2, 2, 2) under T and hence show that T is one-one.
[BH03]
23. If α₁ = (1, -1), α₂ = (2, -1), α₃ = (-3, 2) and β₁ = (1, 0), β₂ = (0, 1), β₃ = (1, 1), is there a linear transformation T from ℝ² into ℝ² such that T(αᵢ) = βᵢ for i = 1, 2, 3?
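For Problem 23, reading the dropped minus signs as in the standard version of this exercise, α₃ is a linear combination of α₁ and α₂ but β₃ is not the corresponding combination of β₁ and β₂, so no such T can exist; a sketch under that assumption:

```python
# Assumed data (minus signs appear lost in print):
a1, a2, a3 = (1, -1), (2, -1), (-3, 2)
b1, b2, b3 = (1, 0), (0, 1), (1, 1)

# a3 = -a1 - a2, so linearity would force T(a3) = -b1 - b2:
assert all(a3[i] == -a1[i] - a2[i] for i in range(2))
forced = tuple(-b1[i] - b2[i] for i in range(2))
print(forced, b3)  # -> (-1, -1) (1, 1): they differ, so no such T exists
```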
24. (a) Let a linear mapping T : ℝ³ → ℝ³ be defined by T(a, b, c) = (a + b + c, 2b + 2c, 3c). Find the matrix of T with respect to the standard basis of ℝ³. Using this, find the matrix of T with respect to the ordered basis C = {(1, 0, 0), (1, 1, 0), (3, 4, 2)}. Hence comment on the nature of the elements of C.
[BU(M.Sc.)02]
(b) Prove that the transformation T : ℝ³ → ℝ² defined by T(x₁, x₂, x₃) = (3x₁ - 2x₂ + x₃, x₁ - 3x₂ - 2x₃), where x₁, x₂, x₃ ∈ ℝ, is a linear transformation. Find the matrix of T with respect to the ordered bases {(1, 0, 0), (0, 0, 1), (0, 1, 0)} and {(0, 1), (1, 0)} of ℝ³ and ℝ² respectively.
[BH(M.Sc)99, 98]
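For Problem 24(a), the elements of C turn out to be eigenvectors of T (with eigenvalues 1, 2, 3), which is the "nature" the problem asks about; a quick check:

```python
def T(a, b, c):
    return (a + b + c, 2 * b + 2 * c, 3 * c)

C = [(1, 0, 0), (1, 1, 0), (3, 4, 2)]
# Each v in C satisfies T(v) = lam * v for lam = 1, 2, 3 respectively,
# so the matrix of T in the basis C is diagonal.
for v, lam in zip(C, (1, 2, 3)):
    assert T(*v) == tuple(lam * x for x in v)
print("each element of C is an eigenvector of T")
```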
25. The matrix representation of a linear mapping T : ℝ³ → ℝ² relative to the ordered bases {(0, 1, 1), (1, 0, 1), (1, 1, 0)} of ℝ³ and {(1, 0), (1, 1)} of ℝ² is
[1 2 4]
[2 1 0].
Find T. Also, determine the rank of T.
[CH02]
26. Find the matrix of the linear transformation T in a real vector space of dimension 2 defined by
T(x, y) = (2x - 3y, x + y)
with respect to the ordered basis {(1, 0), (0, 1)}, and also determine whether the transformation T is non-singular.
[BH04]

27. The matrix representation of a linear transformation T : ℝ³ → ℝ³ relative to the standard basis of ℝ³ is
[1 1 2]
[1 2 1]
[0 1 3].
Find the explicit representation of T and the matrix representation of T relative to the ordered basis {(1, 1, 1), (0, 1, 1), (0, 0, 1)}.
[CH07, 04, 98]
28. A linear transformation T : ℝ³ → ℝ³ is defined by T(x, y, z) = (x + 3y + 3z, 2x + y + 3z, 2x + 2y). Determine the matrix of T relative to the ordered basis (2, 1, 1), (1, 2, 1), (1, 1, 2) of ℝ³. Is T invertible? If so, determine the matrix of T⁻¹ relative to the same basis.
[CH06]
29. T : ℝ⁴ → ℝ³ is linear and is such that
T(x, y, z, w) = (x + y + z + w, 5x + 7y + z + w, 4x + 6y).
Determine the matrix of T relative to the ordered bases {(1, 1, 0, 0), (1, 0, 1, 0), (1, 1, 1, 0), (1, 1, 1, 1)} of ℝ⁴ and {(1, 2, 1), (2, 1, 1), (1, 1, 2)} of ℝ³.
[CH10]
30. (a) The matrix of a linear mapping T : ℝ³ → ℝ³ with respect to the ordered basis {(0, 1, 1), (1, 0, 1), (1, 1, 0)} of ℝ³ is given by
[0 3 0]
[2 3 2]
[2 1 2].
Determine the matrix of T relative to the ordered basis (2, 1, 1), (1, 2, 1), (1, 1, 2) of ℝ³. Is T invertible? If so, determine the matrix of T⁻¹ relative to the same basis.
[CH03]
(b) Prove that no linear transformation from ℝ³ to ℝ⁴ is invertible.
[CH: 10]


(c) Show that the mapping f : ℝ³ → ℝ³ defined by f(x, y, z) = (3x + 3y - 2z, 6y - 3z, x - y + 2z) for all (x, y, z) ∈ ℝ³ is linear. Is f non-singular? Justify your answer. Find the matrix of the above linear mapping f relative to the ordered basis (1, 0, 0), (0, 1, 0), (0, 0, 1).
[CH97]
(d) Show that the transformation T defined by T(x, y, z) = (x + y + z, x, y - z) is one-to-one, and find the left inverse of T.
(e) Let T : ℝ³ → ℝ³ be defined by T(x, y, z) = (x - y, x + 2y, y + 3z). Show that T is invertible and determine T⁻¹.
(f) Show that T : ℝ³ → ℝ³ defined by T(x, y, z) = (3x + y - 2z, x + y, 2x + 2z) is invertible and determine T⁻¹.
31. (a) The linear transformation T on ℝ³ maps the basis vectors (1, 0, 0), (0, 1, 0), (0, 0, 1) to (1, 1, 1), (0, 1, 1) and (1, 2, 0) respectively. Find T(1, 1, 1).
[BH02]
(b) The linear transformation T on ℝ³ maps the basis vectors (1, 0, 0), (0, 1, 0), (0, 0, 1) to (2, 2, 2), (0, 1, 1) and (1, 3, 0) respectively. Find T(2, 1, 1) and T(2, 2, 2).
[BH04]
32. Let V be a vector space and T a linear transformation from V into V. Prove that the following statements are equivalent:
(a) The intersection of the range of T and the null space of T is the zero subspace of V.
(b) If T(T(α)) = θ, then T(α) = θ.
33. Let V be the space of n × 1 matrices over F, let W be the space of m × 1 matrices over F, and let T be the linear transformation from V into W defined by T(X) = AX, where A is a fixed m × n matrix over F. Prove that T is the zero transformation iff A is the zero matrix.
34. Let T : ℝ³ → ℝ² be the linear transformation defined by T(x, y, z) = (x - y, 2z). Prove that N(T) = {(a, a, 0) : a ∈ ℝ} and R(T) = ℝ². Also, prove that the mapping is not injective (i.e., nullity(T) = 1).
35. Describe explicitly a linear transformation from ℝ³ into ℝ³ which has as its range the subspace spanned by (1, 0, -1) and (1, 2, 2).
36. Let V be the vector space of all real polynomials p(x). Let D and T be linear mappings on V defined by
D(p(x)) = d/dx (p(x)), p(x) ∈ V and T(p(x)) = ∫₀ˣ p(t) dt, p(x) ∈ V.
(a) Show that DT = I_V but TD ≠ I_V.
(b) Find the null space of TD.
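Problem 36(a) can be illustrated on coefficient lists, where D destroys the constant term that T cannot restore; a minimal sketch (our own encoding, p = [a0, a1, ...]):

```python
def D(p):                    # derivative of p = [a0, a1, a2, ...]
    return [i * p[i] for i in range(1, len(p))]

def T(p):                    # T(p)(x) = integral of p from 0 to x
    return [0] + [p[i] / (i + 1) for i in range(len(p))]

p = [3.0, 2.0]               # p(x) = 3 + 2x
print(D(T(p)))               # -> [3.0, 2.0]  : DT = I on p
print(T(D(p)))               # -> [0, 2.0]    : TD drops the constant term
```

The constants (here the 3 in p) are exactly the null space of TD, which answers part (b).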
37. Let V be the linear space of all real polynomials p(x). Let D and T be linear mappings on V defined by D(p(x)) = d/dx (p(x)), p(x) ∈ V and T(p(x)) = x p(x), p(x) ∈ V.
(a) Show that DT - TD = I_V.
(b) Show that DT² - T²D = 2T.
38. Let S and T be linear mappings of ℝ³ to ℝ³ defined by S(x, y, z) = (z, y, x) and T(x, y, z) = (x, x + y, x + y + z), (x, y, z) ∈ ℝ³.
(a) Determine ST and TS.
(b) Prove that both S and T are invertible. Verify that (ST)⁻¹ = T⁻¹S⁻¹.


39. A linear transformation T is given below. Find another linear transformation S such that ST = TS = I_V.
(a) T(x, y) = (y, 3x + 5y)
(b) T(x, y, z) = (x - 2y + z, x + y, x).
40. Find the composite transformation T₁T₂T₃ when T₁(x, y, z) = (x - y, 0, x + y), T₂(x, y, z) = (x + y, x + y + z), T₃(x, y, z) = (x, y - x, x - y + z).
41. Let V be a vector space over a field F and S, T be linear mappings on V. If ST - TS = I_V, prove that (a) ST² - T²S = 2T, (b) STⁿ - TⁿS = nTⁿ⁻¹, for n ≥ 2.
42. Let {α₁, α₂, α₃} and {β₁, β₂, β₃} be ordered bases of the real vector spaces V and W respectively. A linear mapping T : V → W maps the basis vectors of V as T(α₁) = β₁, T(α₂) = β₁ + β₂, T(α₃) = β₁ + β₂ + β₃. Show that T is a non-singular transformation and find the matrix of T⁻¹ relative to the above ordered bases.
[CH: 10]

Chapter 6

Inner Product Space


In the earlier chapters we have studied concepts like subspaces, dimension, linear dependence and independence, and linear transformations and their representations, which are valid over any field F. Here we shall discuss the definition and the properties of the inner product on an arbitrary vector space V; in particular we confine ourselves to vector spaces over the field F, where F is either the real field ℝ or the complex field ℂ. Fortunately, there is a single concept, known in Physics as the scalar product or dot product of two vectors, which covers both the concepts of length and angle.

6.1 Inner Product Space

In linear algebra, the scalar product is usually called the inner product.

6.1.1 Euclidean Spaces

Let V(ℝ) be a real vector space. A real inner product (or dot product or scalar product) of vectors of V is a mapping f : V × V → ℝ that assigns to each ordered pair of vectors (α, β) of V a real number f(α, β), generally denoted by α · β or ⟨α, β⟩, satisfying the following axioms:
(i) Symmetry: ⟨α, β⟩ = ⟨β, α⟩; ∀α, β ∈ V
(ii) Linearity: ⟨α, β + γ⟩ = ⟨α, β⟩ + ⟨α, γ⟩; ∀α, β, γ ∈ V
(iii) Homogeneity: ⟨cα, β⟩ = c⟨α, β⟩ = ⟨α, cβ⟩; ∀α, β ∈ V and c ∈ ℝ
(iv) Positivity: ⟨α, α⟩ > 0 if α(≠ θ) ∈ V, and ⟨α, α⟩ = 0 iff α = θ.
A real vector space with an inner product defined on it is known as a Euclidean inner product space. Thus an inner product on a Euclidean space is a positive definite symmetric bilinear form. The properties (i)-(iii) together can be written as
⟨aα + bβ, γ⟩ = a⟨α, γ⟩ + b⟨β, γ⟩.
This property states that an inner product function is linear in the first position. Similarly,
⟨α, aβ + bγ⟩ = a⟨α, β⟩ + b⟨α, γ⟩,
so the inner product function is also linear in its second position. Thus an inner product of linear combinations of vectors is equal to a linear combination of the inner products of the vectors. If V is an inner product space, then by the dimension of V we mean the dimension of V as a real vector space, and a set W is a basis for V if W is a basis for the real vector space V.

Ex 6.1.1 Show that ⟨α, β⟩ = 2x₁y₁ - x₁y₂ - x₂y₁ + 3x₂y₂ is an inner product in ℝ², where α = (x₁, x₂) and β = (y₁, y₂).
Solution: The given ⟨α, β⟩, where α = (x₁, x₂), β = (y₁, y₂), can be written as
⟨α, β⟩ = αᵀAβ, where A = [2 -1; -1 3].
Since A is a real symmetric matrix, ⟨α, β⟩ = ⟨β, α⟩ holds. Conditions (ii) and (iii) are obvious. Now,
⟨α, α⟩ = αᵀAα = 2x₁² - 2x₁x₂ + 3x₂² = 2[x₁ - x₂/2]² + (5/2)x₂² > 0 for α ≠ θ.
Therefore, all the conditions of the definition of inner product are satisfied; hence the given ⟨α, β⟩ is an inner product in ℝ². Alternatively, we see that the diagonal elements 2, 3 of A are positive and det A = 5 is positive, so that A is positive definite. Thus ⟨α, β⟩ is an inner product.
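The positive definiteness of A claimed in the solution can be confirmed from its leading principal minors; a one-line check (note det A = 5):

```python
A = [[2, -1], [-1, 3]]
m1 = A[0][0]                                      # first leading minor
m2 = A[0][0] * A[1][1] - A[0][1] * A[1][0]        # det A
print(m1, m2)  # -> 2 5  : both positive, so A is positive definite
```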
Ex 6.1.2 Find the values of k so that ⟨α, β⟩ = x₁y₁ - 3x₁y₂ - 3x₂y₁ + kx₂y₂ is an inner product in ℝ², where α = (x₁, x₂) and β = (y₁, y₂).
Solution: From the positive definite property of the definition of inner product, we must have ⟨α, α⟩ > 0 for α ≠ θ. Hence,
⟨α, α⟩ = x₁² - 6x₁x₂ + kx₂² > 0.
This relation holds only if (-6)² - 4·1·k < 0, i.e., k > 9. Alternatively, the inner product can be written in the form
⟨α, β⟩ = αᵀAβ, where A = [1 -3; -3 k].
Now, ⟨α, β⟩ is an inner product if A is positive definite, i.e.,
1 > 0 and k - 9 > 0 ⟹ k > 9.
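The necessity of k > 9 can also be seen by evaluating the quadratic form at a single point; a tiny sketch:

```python
def q(x1, x2, k):
    # the quadratic form <alpha, alpha> = x1^2 - 6*x1*x2 + k*x2^2
    return x1 * x1 - 6 * x1 * x2 + k * x2 * x2

# At (x1, x2) = (3, 1) the form equals k - 9, so k > 9 is necessary:
print(q(3, 1, 9), q(3, 1, 10))  # -> 0 1
```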

6.1.2 Unitary Space

Let V(ℂ) be a vector space over the complex field ℂ. A complex inner product is a mapping f : V × V → ℂ that assigns to each ordered pair of vectors (α, β) of V a complex number f(α, β), generally denoted by ⟨α, β⟩, satisfying the following properties:
(i) Conjugate symmetry: ⟨α, β⟩ equals the complex conjugate of ⟨β, α⟩.
(ii) Linearity: ⟨cα + dβ, γ⟩ = c⟨α, γ⟩ + d⟨β, γ⟩; ∀α, β, γ ∈ V; c, d ∈ ℂ.
(iii) Positive definiteness: ⟨α, α⟩ > 0 ∀α(≠ θ) ∈ V, and ⟨θ, θ⟩ = 0.
As ⟨α, α⟩ equals its own conjugate, ⟨α, α⟩ is a real number. A complex vector space V together with a complex inner product defined on it is said to be a complex inner product space or unitary space. Thus the inner product in a unitary space is a positive definite Hermitian form.
Deduction 6.1.1 Using properties (i) and (ii), we obtain
⟨α, cβ⟩ = conjugate of ⟨cβ, α⟩ = conjugate of c⟨β, α⟩ = c̄ ⟨α, β⟩.
The inner product is conjugate linear in the second argument, i.e.,
⟨α, cβ + dγ⟩ = c̄⟨α, β⟩ + d̄⟨α, γ⟩; ∀α, β, γ ∈ V and c, d ∈ ℂ.


Combining linearity in the first position and conjugate linearity in the second position, we obtain, by induction,

⟨Σᵢ cᵢαᵢ, Σⱼ dⱼβⱼ⟩ = Σᵢ Σⱼ cᵢ d̄ⱼ ⟨αᵢ, βⱼ⟩.   (6.1)

Note: A vector space with an associated inner product is called an inner product space or pre-Hilbert space.
Ex 6.1.3 Show that Vₙ(ℝ) is a Euclidean vector space.
Solution: Let α = (a₁, a₂, …, aₙ) and β = (b₁, b₂, …, bₙ) ∈ Vₙ(ℝ), where aᵢ, bᵢ ∈ ℝ. We define the dot or standard inner product in Vₙ(ℝ) by
⟨α, β⟩ = a₁b₁ + a₂b₂ + ⋯ + aₙbₙ.
Now,
(i) ⟨α, β⟩ = αᵀβ = a₁b₁ + a₂b₂ + ⋯ + aₙbₙ = b₁a₁ + b₂a₂ + ⋯ + bₙaₙ (since aᵢ, bᵢ ∈ ℝ)
= ⟨β, α⟩; ∀α, β ∈ Vₙ(ℝ).
(ii) Let γ = (c₁, c₂, …, cₙ) ∈ Vₙ(ℝ). Then
⟨α, β + γ⟩ = a₁(b₁ + c₁) + a₂(b₂ + c₂) + ⋯ + aₙ(bₙ + cₙ)
= (a₁b₁ + a₂b₂ + ⋯ + aₙbₙ) + (a₁c₁ + a₂c₂ + ⋯ + aₙcₙ)
= ⟨α, β⟩ + ⟨α, γ⟩; ∀α, β, γ ∈ Vₙ(ℝ).
(iii) Let k ∈ ℝ. Then
⟨kα, β⟩ = ka₁b₁ + ka₂b₂ + ⋯ + kaₙbₙ = k(a₁b₁ + a₂b₂ + ⋯ + aₙbₙ) = k⟨α, β⟩; ∀α, β ∈ Vₙ(ℝ).
Similarly, ⟨α, kβ⟩ = k⟨α, β⟩. Hence ⟨kα, β⟩ = k⟨α, β⟩ = ⟨α, kβ⟩; ∀α, β ∈ Vₙ(ℝ).
(iv) If α ≠ θ, then aᵢ ≠ 0 for at least one i, and
⟨α, α⟩ = a₁² + a₂² + ⋯ + aₙ² > 0.
Hence all four axioms of a Euclidean vector space are satisfied, and so Vₙ(ℝ) is a Euclidean vector space. The inner product defined above is known as the standard inner product in Vₙ(ℝ). Similarly, the canonical inner product on ℂⁿ is ⟨α, β⟩ = a₁b̄₁ + a₂b̄₂ + ⋯ + aₙb̄ₙ.
Result 6.1.1 The inner product defined by ⟨α, β⟩ = Σᵢ₌₁^∞ aᵢbᵢ converges absolutely for any pair of points in V. The space with this inner product is known as the l² space or Hilbert space.


Ex 6.1.4 Prove that the vector space C[a, b] of real valued continuous functions on [a, b] is an infinite dimensional Euclidean vector space, if
⟨f, g⟩ = ∫ₐᵇ f(t)g(t) dt for f, g ∈ C[a, b].
Solution: Here we are to show that the properties of the definition of inner product on V are satisfied. Let f, g, h ∈ C[a, b] and k ∈ ℝ. Then
(i) ⟨f, g⟩ = ∫ₐᵇ f(t)g(t) dt = ∫ₐᵇ g(t)f(t) dt = ⟨g, f⟩; ∀f, g ∈ C[a, b].
(ii) ⟨f, g + h⟩ = ∫ₐᵇ f(t)[g(t) + h(t)] dt = ∫ₐᵇ f(t)g(t) dt + ∫ₐᵇ f(t)h(t) dt = ⟨f, g⟩ + ⟨f, h⟩; ∀f, g, h ∈ C[a, b].
(iii) ⟨kf, g⟩ = ∫ₐᵇ kf(t)g(t) dt = k ∫ₐᵇ f(t)g(t) dt = k⟨f, g⟩; ∀k ∈ ℝ.
(iv) ⟨f, f⟩ = ∫ₐᵇ f²(t) dt > 0 if f(t) ≠ 0 for some t ∈ [a, b] (by continuity), and ⟨f, f⟩ = 0 if f(t) = 0 ∀t ∈ [a, b].
Thus all the axioms of a Euclidean vector space are satisfied. Hence C[a, b] is an infinite dimensional Euclidean vector space. The vector space P(t) of all polynomials is a subspace of C[a, b] for any interval [a, b], and hence the above is also an inner product on P(t). Similarly, if U denotes the vector space of complex continuous functions on the real interval [a, b], then the inner product in U is defined by ⟨f, g⟩ = ∫ₐᵇ f(t) ḡ(t) dt.

Ex 6.1.5 If M is the set of all m × n matrices over ℝ and
⟨A, B⟩ = Tr(BᵀA) for A, B ∈ M,
prove that M is a Euclidean vector space.
Solution: Let A = [aᵢⱼ], B = [bᵢⱼ], C = [cᵢⱼ] ∈ M, where aᵢⱼ, bᵢⱼ, cᵢⱼ ∈ ℝ, and let λ ∈ ℝ. Then,
(i) ⟨A, B⟩ = Tr(BᵀA) = Σᵢ₌₁ⁿ Σₖ₌₁ᵐ bₖᵢ aₖᵢ = Σᵢ₌₁ⁿ Σₖ₌₁ᵐ aₖᵢ bₖᵢ (since aᵢⱼ, bᵢⱼ ∈ ℝ)
= Tr(AᵀB) = ⟨B, A⟩; ∀A, B ∈ M.
(ii) ⟨A, B + C⟩ = Tr{(B + C)ᵀA} = Σᵢ₌₁ⁿ Σₖ₌₁ᵐ (bₖᵢ + cₖᵢ)aₖᵢ = Σᵢ₌₁ⁿ Σₖ₌₁ᵐ bₖᵢaₖᵢ + Σᵢ₌₁ⁿ Σₖ₌₁ᵐ cₖᵢaₖᵢ
= Tr(BᵀA) + Tr(CᵀA) = ⟨A, B⟩ + ⟨A, C⟩; ∀A, B, C ∈ M.
(iii) ⟨λA, B⟩ = Tr(Bᵀ(λA)) = Σᵢ₌₁ⁿ Σₖ₌₁ᵐ bₖᵢ(λaₖᵢ) = λ Σᵢ₌₁ⁿ Σₖ₌₁ᵐ bₖᵢaₖᵢ = λ Tr(BᵀA) = λ⟨A, B⟩; ∀A, B ∈ M and λ ∈ ℝ.
(iv) ⟨A, A⟩ = Tr(AᵀA) = Σᵢ₌₁ⁿ Σₖ₌₁ᵐ aₖᵢaₖᵢ = Σᵢ₌₁ⁿ Σₖ₌₁ᵐ aₖᵢ² > 0 when A ≠ 0, and = 0 when A = 0.
Hence M is a Euclidean space. Similarly, if U denotes the vector space of m × n matrices over ℂ, then the inner product in U is defined by
⟨A, B⟩ = Tr(B*A),
where B* is the conjugate transpose of the matrix B.

6.2 Norm

For any α ∈ V, an inner product space, the norm (or length or magnitude) of α, denoted by the non-negative value ‖α‖, is defined by

‖α‖ = √⟨α, α⟩.   (6.2)

This definition of length seems reasonable because at least we have ‖α‖ > 0 if α ≠ θ. The distance between two vectors α and β in the inner product space V is

d(α, β) = ‖α - β‖ = √⟨α - β, α - β⟩.   (6.3)

A vector space together with a norm on it is called a normed vector space or normed linear space.
Property 6.2.1 When c is a real or complex number,
‖cα‖ = √⟨cα, cα⟩ = √(|c|² ⟨α, α⟩) = |c| √⟨α, α⟩ = |c| ‖α‖.
Property 6.2.2 If α ≠ θ, then ⟨α, α⟩ > 0 and so ‖α‖ > 0; and if α = θ, then ⟨α, α⟩ = ⟨θ, θ⟩ = 0, so ‖α‖ = 0. Therefore ‖α‖ ≥ 0, and ‖α‖ = 0 if and only if α = θ.
Property 6.2.3 If α, β ∈ V, then the non-negative real number ‖α - β‖ is called the distance between α and β. If ⟨α, α⟩ = 1, i.e. ‖α‖ = 1, then α is called a unit vector, or α is said to be normalized. Any non-zero vector α ∈ V can be normalized by taking α̂ = α/‖α‖.
Ex 6.2.1 Find the norm of α, where α = (1 - 2i, 3 + i, 2 - 5i) ∈ ℂ³.
Solution: Using the usual inner product in ℂ³ we have
⟨α, α⟩ = (1 - 2i)(1 + 2i) + (3 + i)(3 - i) + (2 - 5i)(2 + 5i) = 5 + 10 + 29 = 44.
Hence ‖α‖ = √⟨α, α⟩ = √44 = 2√11.
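The value ⟨α, α⟩ = 44 can be confirmed numerically:

```python
import math

alpha = [1 - 2j, 3 + 1j, 2 - 5j]
# <alpha, alpha> = sum of z * conj(z) = sum of |z|^2:
sq = sum((z * z.conjugate()).real for z in alpha)
print(sq, 2 * math.sqrt(11))  # -> 44.0 and about 6.633 (the norm)
```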


Ex 6.2.2 Find the norm of A = [1 2; 3 4] in the space of 2 × 2 matrices over ℝ.
Solution: Let V be the vector space of 2 × 2 matrices over ℝ. The inner product in V is defined by ⟨A, B⟩ = Tr(BᵀA). Now,
⟨A, A⟩ = Tr([1 3; 2 4][1 2; 3 4]) = Tr([10 14; 14 20]) = 30.
Hence ‖A‖ = √⟨A, A⟩ = √30.


Ex 6.2.3 Consider f(t) = 3t - 5 and g(t) = t² in the polynomial space P(t) with the inner product ⟨f, g⟩ = ∫₀¹ f(t)g(t) dt. Find ⟨f, g⟩, ‖f‖, ‖g‖.
Solution: Using the definition of inner product,
⟨f, g⟩ = ∫₀¹ f(t)g(t) dt = ∫₀¹ (3t - 5)t² dt = -11/12.
According to the definition of norm,
‖f‖² = ⟨f, f⟩ = ∫₀¹ (3t - 5)² dt = 13 ⟹ ‖f‖ = √13,
‖g‖² = ⟨g, g⟩ = ∫₀¹ (t²)² dt = 1/5 ⟹ ‖g‖ = 1/√5.
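The three values in this example can be confirmed with exact rational arithmetic on coefficient lists (the helpers below are ours, not from the text):

```python
from fractions import Fraction as F

def integrate01(coeffs):
    # integral over [0, 1] of c0 + c1*t + ... is sum of c_i / (i + 1)
    return sum(F(c, 1) / (i + 1) for i, c in enumerate(coeffs))

def multiply(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

f = [-5, 3]       # f(t) = 3t - 5
g = [0, 0, 1]     # g(t) = t^2
print(integrate01(multiply(f, f)))  # -> 13      (= ||f||^2)
print(integrate01(multiply(f, g)))  # -> -11/12  (= <f, g>)
print(integrate01(multiply(g, g)))  # -> 1/5     (= ||g||^2)
```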

Cauchy-Schwarz's inequality in Euclidean space

Let α and β be any two vectors in a Euclidean space V(F). Then |⟨α, β⟩| ≤ ‖α‖·‖β‖.
Proof: Case I: Let one or both of α, β be θ. Then both sides are zero and the equality sign holds.
Case II: Let α, β be two non-null linearly dependent vectors. Then there exists c(≠ 0) ∈ ℝ such that β = cα, and hence ‖β‖ = |c|·‖α‖ and
⟨α, β⟩ = ⟨α, cα⟩ = c⟨α, α⟩ = c‖α‖²
⟹ |⟨α, β⟩| = |c|‖α‖² = ‖α‖·‖β‖.
In this case the equality sign holds.
Case III: Let α, β be linearly independent. Then α - cβ ≠ θ for all real c, so by the positivity of the inner product,
⟨α - cβ, α - cβ⟩ > 0
⟹ ⟨α, α⟩ - 2c⟨α, β⟩ + c²⟨β, β⟩ > 0.   (6.4)

Since ⟨α, α⟩, ⟨α, β⟩, ⟨β, β⟩ are all real and ⟨β, β⟩ ≠ 0, the expression ⟨α, α⟩ - 2c⟨α, β⟩ + c²⟨β, β⟩ is a real quadratic polynomial in c, positive for every real c. Hence its discriminant must be negative:
4⟨α, β⟩² - 4⟨α, α⟩⟨β, β⟩ < 0
⟹ [|⟨α, β⟩|]² < ‖α‖²·‖β‖².
Cauchy-Schwarz's inequality in unitary space

Let α and β be any two vectors in a unitary space V(ℂ). Then |⟨α, β⟩| ≤ ‖α‖·‖β‖.
Proof: Let β = θ; then ‖β‖ = 0. In this case we have
⟨α, β⟩ = ⟨α, θ⟩ = ⟨α, 0·θ⟩ = 0·⟨α, θ⟩ = 0.
Therefore |⟨α, β⟩| = 0. Thus, if β = θ, then, both sides being 0, the equality holds. Now we consider the case β ≠ θ. Then ‖β‖² is a positive real number, since ‖β‖ > 0 for β ≠ θ. Let us consider the vector
γ = α - λβ, where λ = ⟨α, β⟩/‖β‖².
Therefore, using linearity in the first argument and conjugate linearity in the second, we have
⟨γ, γ⟩ = ⟨α - λβ, α - λβ⟩
= ⟨α, α⟩ - λ̄⟨α, β⟩ - λ⟨β, α⟩ + λλ̄⟨β, β⟩
= ‖α‖² - |⟨α, β⟩|²/‖β‖² - |⟨α, β⟩|²/‖β‖² + |⟨α, β⟩|²/‖β‖²
= ‖α‖² - |⟨α, β⟩|²/‖β‖²,
since ⟨α, β⟩ and ⟨β, α⟩ are complex conjugates of each other. Again, from the definition, ⟨γ, γ⟩ = ‖γ‖² ≥ 0, so we have
‖α‖² - |⟨α, β⟩|²/‖β‖² ≥ 0
⟹ |⟨α, β⟩|² ≤ ‖α‖² ‖β‖² ⟹ |⟨α, β⟩| ≤ ‖α‖·‖β‖.
This is also known as Cauchy-Schwarz's inequality.
Result 6.2.1 Let α = (α₁, α₂, …, αₙ), β = (β₁, β₂, …, βₙ) ∈ Vₙ(ℂ), the unitary space. Then
⟨α, β⟩ = Σᵢ₌₁ⁿ αᵢβ̄ᵢ; ‖α‖² = Σᵢ₌₁ⁿ |αᵢ|²; ‖β‖² = Σᵢ₌₁ⁿ |βᵢ|².
Hence the Cauchy-Schwarz inequality for unitary space becomes
|Σᵢ₌₁ⁿ αᵢβ̄ᵢ|² ≤ (Σᵢ₌₁ⁿ |αᵢ|²)(Σᵢ₌₁ⁿ |βᵢ|²).
The equality sign holds when either (i) αᵢ = 0 or βᵢ = 0 or both αᵢ = βᵢ = 0, i = 1, 2, …, n, or (ii) αᵢ = cβᵢ for some nonzero real c, i = 1, 2, …, n.
Result 6.2.2 Let α and β be two non-null vectors in V(ℝ). If θ be the angle between them, then
cos θ = ⟨α, β⟩ / (‖α‖·‖β‖).
For unitary space, when α = (α₁, α₂, …, αₙ), β = (β₁, β₂, …, βₙ), we have
cos θ = Σᵢ αᵢβ̄ᵢ / √(Σᵢ |αᵢ|² · Σⱼ |βⱼ|²).
By the Cauchy-Schwarz inequality -1 ≤ cos θ ≤ 1, and so the angle θ exists and is unique.
Result 6.2.3 Let f and g be any real continuous functions defined on [0, 1]. Then the Cauchy-Schwarz inequality reads
⟨f, g⟩² = [∫₀¹ f(t)g(t) dt]² ≤ [∫₀¹ f²(t) dt][∫₀¹ g²(t) dt].
Ex 6.2.4 Find the angle between the vectors α = (2, 3, 5) and β = (1, -4, 3) in ℝ³.
Solution: By the definition of inner product, we have
‖α‖ = √⟨α, α⟩ = √(4 + 9 + 25) = √38,
‖β‖ = √⟨β, β⟩ = √(1 + 16 + 9) = √26,
⟨α, β⟩ = 2 - 12 + 15 = 5.
If θ be the angle between the vectors α and β, then it is given by
cos θ = ⟨α, β⟩/(‖α‖·‖β‖) = 5/(√38 √26).
Since cos θ is positive, θ is an acute angle.
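Reading the second vector as β = (1, -4, 3), which matches the arithmetic 2 - 12 + 15 = 5 in the solution, the computation can be checked directly:

```python
import math

a = (2, 3, 5)
b = (1, -4, 3)

dot = sum(x * y for x, y in zip(a, b))
na = math.sqrt(sum(x * x for x in a))   # sqrt(38)
nb = math.sqrt(sum(x * x for x in b))   # sqrt(26)
print(dot, dot / (na * nb))             # 5 and a positive cosine -> acute angle
```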


Ex 6.2.5 Let V be a vector space of polynomials with inner product given by ⟨f, g⟩ = ∫₀¹ f(t)g(t) dt. Taking f(x) = x + 2, g(x) = 2x - 3, find ⟨f, g⟩ and ‖f‖.
Solution: By the definition of inner product, we have
⟨f, g⟩ = ∫₀¹ f(t)g(t) dt = ∫₀¹ (t + 2)(2t - 3) dt = -29/6.
Also, using the definition of norm, we have
⟨f, f⟩ = ∫₀¹ f(t)f(t) dt = ∫₀¹ (t + 2)² dt = 19/3,
⟨g, g⟩ = ∫₀¹ g(t)g(t) dt = ∫₀¹ (2t - 3)² dt = 13/3.
Hence ‖f‖ = √⟨f, f⟩ = √(19/3), and so the length of f(x) is √(19/3). If θ be the angle between f(t) and g(t), then
cos θ = ⟨f, g⟩/(‖f‖·‖g‖) = (-29/6)/√((19/3)(13/3)) = -29/(2√247).
Since cos θ is negative, θ is an obtuse angle.



Ex 6.2.6 Find the cosine of the angle between α and β if α = [2 -1; 3 -1] and β = [0 1; 2 3] in the vector space of 2 × 2 matrices over ℝ.
Solution: By the definition of the inner product ⟨A, B⟩ = Tr(BᵀA), we have
⟨α, α⟩ = Tr([2 3; -1 -1][2 -1; 3 -1]) = Tr([13 -5; -5 2]) = 15,
⟨β, β⟩ = Tr([0 2; 1 3][0 1; 2 3]) = Tr([4 6; 6 10]) = 14,
⟨α, β⟩ = Tr(βᵀα) = Tr([0 2; 1 3][2 -1; 3 -1]) = Tr([6 -2; 11 -4]) = 2.
Hence the cosine of the angle between α and β is given by
cos θ = ⟨α, β⟩/(‖α‖·‖β‖) = 2/√(15 · 14) = 2/√210.
Since cos θ is positive, θ is an acute angle.
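With the matrices read as α = [2 -1; 3 -1] and β = [0 1; 2 3] (the minus signs appear dropped in print), the three traces 15, 14 and 2 can be confirmed numerically (helper names are ours):

```python
def transpose(M):
    return [list(r) for r in zip(*M)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def ip(A, B):
    """<A, B> = Tr(B^T A) for 2x2 matrices."""
    P = matmul(transpose(B), A)
    return P[0][0] + P[1][1]

a = [[2, -1], [3, -1]]
b = [[0, 1], [2, 3]]
print(ip(a, a), ip(b, b), ip(a, b))  # -> 15 14 2
```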


Ex 6.2.7 If in an inner product space ‖α + β‖ = ‖α‖ + ‖β‖ holds, prove that α and β are linearly dependent.
Solution: If α, β are any two vectors, then by Cauchy-Schwarz's inequality, we have
|⟨α, β⟩| ≤ ‖α‖‖β‖.
Using the given condition, we get
‖α + β‖² = [‖α‖ + ‖β‖]²
⟹ ⟨α + β, α + β⟩ = ‖α‖² + ‖β‖² + 2‖α‖‖β‖
⟹ ⟨α, α⟩ + ⟨α, β⟩ + ⟨β, α⟩ + ⟨β, β⟩ = ⟨α, α⟩ + ⟨β, β⟩ + 2‖α‖‖β‖
⟹ 2 Re⟨α, β⟩ = 2‖α‖‖β‖.
Since Re⟨α, β⟩ ≤ |⟨α, β⟩|, we have ‖α‖·‖β‖ ≤ |⟨α, β⟩|. From this and Cauchy-Schwarz's inequality we have ‖α‖·‖β‖ = |⟨α, β⟩|. This equality shows that α and β are linearly dependent. The converse of this example is not true. For example, let α = (1, 1, 0) and β = (-2, -2, 0) in V₃(ℝ). Here β = -2α, so that α and β are linearly dependent. But
‖α‖ = √(1 + 1 + 0) = √2; ‖β‖ = √((-2)² + (-2)² + 0) = 2√2;
‖α + β‖ = √((-1)² + (-1)² + 0) = √2.
Hence ‖α + β‖ ≠ ‖α‖ + ‖β‖.
Ex 6.2.8 If α and β are any two vectors in an inner product space V(F), then
‖α + β‖ ≤ ‖α‖ + ‖β‖.

Solution: Since α, β ∈ V(F), the inner product space, by the Cauchy-Schwarz inequality, we have,
|⟨α, β⟩| ≤ ‖α‖ ‖β‖.
Using the definition of the inner product we have,
‖α + β‖² = ⟨α + β, α + β⟩ = ⟨α, α⟩ + ⟨α, β⟩ + ⟨β, α⟩ + ⟨β, β⟩
= ‖α‖² + 2 Re⟨α, β⟩ + ‖β‖², since ⟨β, α⟩ is the conjugate of ⟨α, β⟩,
≤ ‖α‖² + 2|⟨α, β⟩| + ‖β‖²
≤ ‖α‖² + 2‖α‖ ‖β‖ + ‖β‖² = [‖α‖ + ‖β‖]².
Hence ‖α + β‖ ≤ ‖α‖ + ‖β‖. This is the well known triangle inequality. Let α and β be two adjacent sides of a triangle; then α + β is the third side of the triangle formed by α and β. Geometrically it states that the length of one side of a triangle is less than or equal to the sum of the lengths of the other two sides. In a similar manner, if {α₁, α₂, …, αₙ} is an orthogonal set of vectors, then
‖α₁ + α₂ + ⋯ + αₙ‖² = ‖α₁‖² + ‖α₂‖² + ⋯ + ‖αₙ‖²,
which is the well known Pythagoras theorem.
Ex 6.2.9 If α, β be any two vectors in an inner product space V, then
‖α + β‖² + ‖α − β‖² = 2[‖α‖² + ‖β‖²].

Solution: Since α, β are two vectors, by definition,
‖α + β‖² + ‖α − β‖² = ⟨α + β, α + β⟩ + ⟨α − β, α − β⟩
= ⟨α, α⟩ + ⟨α, β⟩ + ⟨β, α⟩ + ⟨β, β⟩ + ⟨α, α⟩ − ⟨α, β⟩ − ⟨β, α⟩ + ⟨β, β⟩
= 2‖α‖² + 2‖β‖².
This is the well known parallelogram law. If α, β are two adjacent sides of a parallelogram, then α + β and α − β represent its diagonals. Hence the geometrical significance of this law is that the sum of the squares of the diagonals of a parallelogram is equal to the sum of the squares of its sides.
To obtain the real polar form of ⟨α, β⟩, subtracting, we get,
‖α + β‖² − ‖α − β‖² = 4⟨α, β⟩
⇒ ⟨α, β⟩ = (1/4)[‖α + β‖² − ‖α − β‖²],
which shows that the inner product can be obtained from the norm function.

Ex 6.2.10 Prove that for any inner product space V,
‖aα + bβ‖² = |a|² ‖α‖² + |b|² ‖β‖² + a b̄ ⟨α, β⟩ + ā b ⟨β, α⟩.

Solution: Using the definition,
‖aα + bβ‖² = ⟨aα + bβ, aα + bβ⟩ = a⟨α, aα + bβ⟩ + b⟨β, aα + bβ⟩
= a{ā⟨α, α⟩ + b̄⟨α, β⟩} + b{ā⟨β, α⟩ + b̄⟨β, β⟩}
= |a|² ⟨α, α⟩ + a b̄ ⟨α, β⟩ + ā b ⟨β, α⟩ + |b|² ⟨β, β⟩
= |a|² ‖α‖² + a b̄ ⟨α, β⟩ + ā b ⟨β, α⟩ + |b|² ‖β‖².
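The parallelogram law, the real polarization identity and the triangle inequality can all be spot-checked numerically for arbitrary real vectors. A small self-contained sketch (our own illustration, with a fixed random seed for repeatability):

```python
import random
from math import isclose, sqrt

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return sqrt(inner(u, u))

random.seed(0)                      # any vectors work; fixed seed for repeatability
u = [random.uniform(-5, 5) for _ in range(4)]
v = [random.uniform(-5, 5) for _ in range(4)]
s = [a + b for a, b in zip(u, v)]   # u + v
d = [a - b for a, b in zip(u, v)]   # u - v

# Parallelogram law
assert isclose(norm(s) ** 2 + norm(d) ** 2, 2 * (norm(u) ** 2 + norm(v) ** 2))
# Real polarization identity
assert isclose(inner(u, v), (norm(s) ** 2 - norm(d) ** 2) / 4, abs_tol=1e-9)
# Triangle inequality
assert norm(s) <= norm(u) + norm(v) + 1e-12
print("identities verified")
```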

6.3 Orthogonality

Let V(F) be an inner product space and α, β ∈ V. Then the vector α is said to be orthogonal to the vector β, if
⟨α, β⟩ = 0.        (6.5)
Using the symmetry property ⟨α, β⟩ = ⟨β, α⟩ = 0 of the inner product, we say that if α is orthogonal to β, then β is orthogonal to α. Hence if ⟨α, β⟩ = 0, then α and β are orthogonal.
If the set of vectors S = {α₁, α₂, …, αₙ} in an inner product space V(F) be such that any two distinct vectors in S are orthogonal, i.e.,
⟨αᵢ, αⱼ⟩ = 0 for i ≠ j,        (6.6)
then the set S is called an orthogonal set. This orthogonal set plays a fundamental role in the theory of Fourier series.

Result 6.3.1 The null vector θ ∈ V is orthogonal to any non null vector β ∈ V, as
⟨θ, β⟩ = ⟨0β, β⟩ = 0⟨β, β⟩ = 0.
Also, the null vector is the only vector orthogonal to itself. For, if α (≠ θ) is orthogonal to itself, then
⟨α, α⟩ = 0 ⇒ α = θ, a contradiction.
An orthogonal set of vectors may contain the null vector θ.


Result 6.3.2 If α is orthogonal to β, then every scalar multiple of α is also orthogonal to β. Since
⟨kα, β⟩ = k⟨α, β⟩ = k · 0 = 0,
if α is perpendicular to β, then kα is also perpendicular to β.

Ex 6.3.1 Show that sin t and cos t are orthogonal functions in the vector space of continuous functions C[−π, π].

Solution: According to the definition of the inner product in the vector space C[−π, π] of continuous functions on [−π, π], we get,
⟨sin t, cos t⟩ = ∫₋π^π sin t cos t dt = [½ sin² t]₋π^π = 0.
Thus sin t and cos t are orthogonal functions in the vector space C[−π, π].

Ex 6.3.2 Find a non zero vector γ that is perpendicular to α = (1, 2, 1) and β = (2, 5, 4) in ℝ³.

Solution: Let γ = (x, y, z); then we want ⟨γ, α⟩ = 0 and ⟨γ, β⟩ = 0. This yields the homogeneous system
x + 2y + z = 0; 2x + 5y + 4z = 0
or, x + 2y + z = 0; y + 2z = 0.
Set z = 1 to obtain y = −2 and x = 3. Thus γ = (3, −2, 1) is a desired non zero vector orthogonal to α and β. Normalizing γ, we obtain the unit vector
γ/‖γ‖ = (1/√14)(3, −2, 1)
orthogonal to α and β.

6.3.1 Orthonormal Set

Let V(F) be an inner product space. Normalizing an orthogonal set S refers to the process of multiplying each vector in S by the reciprocal of its length in order to transform S into an orthonormal set of vectors. A set S = {α₁, α₂, …, αₙ} of vectors in V(F) is said to be orthonormal if
⟨αᵢ, αⱼ⟩ = 0, if i ≠ j;  = 1, if i = j.
A vector α ∈ V is said to be normalised if ‖α‖ = 1. An orthonormal set of vectors does not contain the null vector θ, as ‖θ‖ = 0. Note that, if {α₁, α₂, …, αₙ} is an orthogonal set of vectors, then {k₁α₁, k₂α₂, …, kₙαₙ} is also an orthogonal set, for any scalars k₁, k₂, …, kₙ.

Theorem 6.3.1 Every orthogonal (or orthonormal) set of non null vectors in an inner product space is linearly independent.

Proof: Let T = {α₁, α₂, …, αₙ} be an orthogonal subset of an inner product space V(F), where ⟨αᵢ, αⱼ⟩ = 0 for i ≠ j. Let us consider the relation
c₁α₁ + c₂α₂ + ⋯ + cₙαₙ = θ,
where the cᵢ's are scalars. Now, taking the inner product with αᵢ we get,
⟨c₁α₁ + c₂α₂ + ⋯ + cₙαₙ, αᵢ⟩ = ⟨θ, αᵢ⟩ = 0; i = 1, 2, …, n
or, c₁⟨α₁, αᵢ⟩ + c₂⟨α₂, αᵢ⟩ + ⋯ + cₙ⟨αₙ, αᵢ⟩ = 0,
or, cᵢ⟨αᵢ, αᵢ⟩ = 0, since ⟨αⱼ, αᵢ⟩ = 0 for j ≠ i.
Since αᵢ ≠ θ, we have ⟨αᵢ, αᵢ⟩ ≠ 0, and therefore cᵢ = 0. Thus c₁ = c₂ = ⋯ = cₙ = 0, which shows that T = {α₁, α₂, …, αₙ} is linearly independent.

6.3.2 Orthogonal Complement

Let V(F) be an inner product space, and W be any subset of V. The orthogonal complement of W, denoted by W⊥, is defined by
W⊥ = {α ∈ V : ⟨α, β⟩ = 0, ∀ β ∈ W}.
Therefore, W⊥ consists of all vectors in V that are orthogonal to every vector β ∈ W. In particular, for a given vector α ∈ V, we have,
α⊥ = {β ∈ V : ⟨β, α⟩ = 0},
i.e., α⊥ consists of all vectors in V that are orthogonal to the given vector α.

Result 6.3.3 Clearly θ ∈ W⊥. Let us consider two scalars a, b ∈ F and two elements α₁, α₂ of W⊥. Then for any β ∈ W, we have
⟨aα₁ + bα₂, β⟩ = a⟨α₁, β⟩ + b⟨α₂, β⟩ = a · 0 + b · 0 = 0.
Hence aα₁ + bα₂ ∈ W⊥ and so W⊥ is a subspace of V. From this result we conclude that if W is a subset of a vector space V, then W⊥ is a subspace of V.

Result 6.3.4 Let W = {θ}; then for every α ∈ V we have ⟨α, θ⟩ = 0. Hence W⊥ = {θ}⊥ = V.

Result 6.3.5 By definition, V⊥ = {α : ⟨α, β⟩ = 0, ∀ β ∈ V}. Also, we have
⟨α, α⟩ = 0 ⇒ α = θ, so that V⊥ = {θ}.

Result 6.3.6 Suppose W is a subspace of V; then both W and W⊥ are subspaces of V. Let α ∈ W ∩ W⊥. Then
α ∈ W and α ∈ W⊥ ⇒ ⟨α, α⟩ = 0 ⇒ α = θ
⇒ W ∩ W⊥ = {θ}.
Ex 6.3.3 Find a basis for the subspace α⊥ of ℝ³, where α = (1, 3, −4).

Solution: According to the definition, α⊥ consists of all vectors β = (x, y, z) such that
⟨β, α⟩ = 0 ⇒ x + 3y − 4z = 0.
If y = 1, z = 0, then β₁ = (−3, 1, 0), and if y = 0, z = 1, then β₂ = (4, 0, 1), where {β₁, β₂} form a basis for the solution space of the equation, and hence a basis of α⊥. Normalising these, we obtain the basis of unit vectors
{(1/√10)(−3, 1, 0), (1/√17)(4, 0, 1)}.
(Note that β₁ and β₂ are not orthogonal; an orthogonal basis could be obtained from them by the Gram-Schmidt process.)
Ex 6.3.4 Let α = (1, −2, −1, 3) be a vector in ℝ⁴. Find an orthogonal and an orthonormal basis for α⊥.

Solution: α⊥ consists of all vectors β = (x, y, z, t) ∈ ℝ⁴ such that ⟨β, α⟩ = 0. Therefore, we are to find the solutions of the linear equation x − 2y − z + 3t = 0. A non null solution of x − 2y − z + 3t = 0 is β₁ = (0, 1, 1, 1). Now, find a non null solution of the system
x − 2y − z + 3t = 0, y + z + t = 0,
say, β₂ = (5, 1, 0, −1). Lastly, find a non null solution of the linear system
x − 2y − z + 3t = 0, y + z + t = 0, 5x + y − t = 0,
say, β₃ = (1, −7, 9, −2). Thus, {(0, 1, 1, 1), (5, 1, 0, −1), (1, −7, 9, −2)} is an orthogonal basis for α⊥. The corresponding orthonormal basis is
{(1/√3)(0, 1, 1, 1), (1/√27)(5, 1, 0, −1), (1/√135)(1, −7, 9, −2)}.
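The three vectors obtained above can be checked mechanically: each must be orthogonal to α, and they must be mutually orthogonal. A short verification sketch (our own, not from the text):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

alpha = (1, -2, -1, 3)
b1, b2, b3 = (0, 1, 1, 1), (5, 1, 0, -1), (1, -7, 9, -2)

# Each basis vector lies in alpha-perp ...
assert dot(alpha, b1) == dot(alpha, b2) == dot(alpha, b3) == 0
# ... and the three are mutually orthogonal.
assert dot(b1, b2) == dot(b1, b3) == dot(b2, b3) == 0
# Squared lengths used for normalisation: 3, 27, 135.
print([dot(b, b) for b in (b1, b2, b3)])   # [3, 27, 135]
```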

6.3.3 Direct Sum

Let W be a subspace of V. If every α ∈ V can be written uniquely as α = β + γ with β ∈ W and γ ∈ W⊥, then V is called the direct sum of the two subspaces W and W⊥, denoted by
V = W ⊕ W⊥.

Property 6.3.1 Let U and W be subspaces of a finite dimensional inner product space V; then
(U + W)⊥ = U⊥ ∩ W⊥ and (U ∩ W)⊥ = U⊥ + W⊥.
Theorem 6.3.2 Let W be a subspace of a finite dimensional inner product space V. Then V is the direct sum of W and W⊥ (i.e. V = W ⊕ W⊥) and W⊥⊥ = W.

Proof: Since W is a subspace of V, it has a basis. Let S = {α₁, α₂, …, α_k} be a basis of W. By the extension theorem, S can be extended to S₁ = {α₁, α₂, …, α_k, …, αₙ}, a basis of V. Applying the Gram-Schmidt orthonormalization process to S₁ we get an orthonormal basis {β₁, β₂, …, βₙ} of V, where each of the first k vectors is a linear combination of the corresponding αⱼ's:
βᵢ = Σⱼ₌₁ⁱ aᵢⱼ αⱼ; i = 1, 2, …, k.
Hence {β₁, β₂, …, β_k} is an orthonormal basis of W, as β₁, β₂, …, β_k ∈ W. Thus we conclude that there is an orthonormal basis of W which is part of an orthonormal basis of V.
Also, as {β₁, β₂, …, βₙ} is orthonormal, β_{k+1}, …, βₙ ∈ W⊥. If α ∈ V, then
α = Σᵢ₌₁ⁿ aᵢβᵢ = Σᵢ₌₁ᵏ aᵢβᵢ + Σᵢ₌ₖ₊₁ⁿ aᵢβᵢ ∈ W + W⊥.
Hence V ⊆ W + W⊥, so that V = W + W⊥. Also if α ∈ W ∩ W⊥, then α ∈ W and α ∈ W⊥, so
⟨α, α⟩ = 0 ⇒ α = θ ⇒ W ∩ W⊥ = {θ}.
These show that V = W ⊕ W⊥, a basic result in linear algebra. We prove this for a finite dimensional vector space V, but it also holds for spaces of arbitrary dimension.
By definition, W⊥ = {β : ⟨α, β⟩ = 0, ∀ α ∈ W}. The relation ⟨α, β⟩ = 0 = ⟨β, α⟩ shows that every α ∈ W is orthogonal to every β ∈ W⊥. Therefore,
W ⊆ (W⊥)⊥ = W⊥⊥.
W⊥ is a subspace of V; therefore W⊥⊥ is also a subspace of V. Now let α ∈ W⊥⊥ ⊆ V. Since V = W ⊕ W⊥, α can be expressed in the form α = β + γ, where β ∈ W and γ ∈ W⊥. Then ⟨α, γ⟩ = 0, i.e.,
⟨β, γ⟩ + ⟨γ, γ⟩ = 0 ⇒ ⟨γ, γ⟩ = 0
⇒ γ = θ ⇒ α = β ∈ W
⇒ W⊥⊥ ⊆ W.
Therefore, W⊥⊥ = W, and hence the theorem.
Ex 6.3.5 Find a basis of the subspace W⊥ of ℝ⁴ orthogonal to α = (1, −2, 3, 4) and β = (3, −5, 7, 8).

Solution: Here dim ℝ⁴ = 4, so a basis of ℝ⁴ contains four linearly independent vectors. Since α and β are linearly independent and W = L({α, β}), {α, β} is a basis of W. Therefore, dim W = 2. We know,
ℝ⁴ = W ⊕ W⊥, dim ℝ⁴ = dim W + dim W⊥
⇒ 4 = 2 + dim W⊥ ⇒ dim W⊥ = 2.
Thus a basis of W⊥ consists of two vectors. Any γ = (x₁, x₂, x₃, x₄) ∈ W⊥ must be orthogonal to both α and β, so
x₁ − 2x₂ + 3x₃ + 4x₄ = 0 and 3x₁ − 5x₂ + 7x₃ + 8x₄ = 0.
The rank of the coefficient matrix of this homogeneous system is 2, so its solution space has dimension 2. Eliminating x₁ gives x₂ = 2x₃ + 4x₄ and then x₁ = x₃ + 4x₄. Taking x₃ = 1, x₄ = 0 and x₃ = 0, x₄ = 1 in turn, γ = (1, 2, 1, 0) and δ = (4, 4, 0, 1) satisfy both equations. Hence a basis of the subspace W⊥ of ℝ⁴ orthogonal to {α, β} is {(1, 2, 1, 0), (4, 4, 0, 1)}.
Ex 6.3.6 Let V be the vector space of 2 × 2 matrices over ℝ. (i) Show that
α = [ 1 0 ; 0 0 ], β = [ 0 1 ; 0 0 ], γ = [ 0 0 ; 1 0 ], δ = [ 0 0 ; 0 1 ]
form an orthogonal basis of V. (ii) Find a basis for the orthogonal complement of (a) the diagonal matrices, (b) the symmetric matrices.

Solution: (i) The relation c₁α + c₂β + c₃γ + c₄δ = θ holds only if c₁ = c₂ = c₃ = c₄ = 0. Therefore, the given set of vectors {α, β, γ, δ} is linearly independent, and since ⟨α, β⟩ = ⟨α, γ⟩ = ⋯ = ⟨γ, δ⟩ = 0 under the inner product ⟨A, B⟩ = tr(BᵀA), it forms an orthogonal basis of V.
(ii)(a) The diagonal matrices are spanned by α and δ. Let W₁ be the subspace of V spanned by α and δ. Hence we seek all matrices X = [ a b ; c d ] such that
⟨X, α⟩ = 0 = ⟨X, δ⟩
⇒ tr( [ 1 0 ; 0 0 ] [ a b ; c d ] ) = 0 = tr( [ 0 0 ; 0 1 ] [ a b ; c d ] )
⇒ a = 0 = d.
The free variables are b and c. First we choose b = 1, c = 0, so one solution is X₁ = [ 0 1 ; 0 0 ], and if we choose b = 0, c = 1, then the solution is X₂ = [ 0 0 ; 1 0 ]. Thus {X₁, X₂} is a basis of W₁⊥.
(b) The symmetric matrices are spanned by α, δ and β + γ. Let W₂ be this subspace of V. Hence we seek all matrices X = [ a b ; c d ] such that
⟨X, α⟩ = 0 = ⟨X, δ⟩ ⇒ a = 0 = d,
and ⟨X, β + γ⟩ = 0 ⇒ b + c = 0.
Taking the free variable b = 1, so that c = −1, the solution is X = [ 0 1 ; −1 0 ], and {X} is a basis of W₂⊥.
Ex 6.3.7 Let V be the inner product space P₃ with the inner product ⟨f, g⟩ = ∫₀¹ f(t)g(t) dt. Let W be the subspace of P₃ with basis {1, t²}. Find a basis for W⊥.

Solution: Let p(t) = at³ + bt² + ct + d be an element of W⊥. Since p(t) must be orthogonal to each of the vectors in the given basis for W, we have,
⟨p(t), 1⟩ = ∫₀¹ (at³ + bt² + ct + d) · 1 dt = a/4 + b/3 + c/2 + d = 0,
⟨p(t), t²⟩ = ∫₀¹ (at³ + bt² + ct + d) · t² dt = a/6 + b/5 + c/4 + d/3 = 0.
Solving the homogeneous system, we get a = 3l + 16m; b = −(15/4)l − 15m; c = l; d = m. Therefore,
p(t) = (3l + 16m)t³ + (−(15/4)l − 15m)t² + lt + m
= l(3t³ − (15/4)t² + t) + m(16t³ − 15t² + 1).
Now, {3t³ − (15/4)t² + t, 16t³ − 15t² + 1} is linearly independent, as the two polynomials are not multiples of each other, and W⊥ = L({3t³ − (15/4)t² + t, 16t³ − 15t² + 1}). Hence they form a basis of W⊥.

Ex 6.3.8 Find the orthogonal complement of the row space of the matrix
A = [ 1 1 2 ; 2 3 5 ; 3 4 7 ].

Solution: The orthogonal complement of the row space of A is the solution space of the system of linear homogeneous equations AX = 0, with the given matrix A as the coefficient matrix. Therefore,
x₁ + x₂ + 2x₃ = 0, 2x₁ + 3x₂ + 5x₃ = 0, 3x₁ + 4x₂ + 7x₃ = 0
⇒ x₁ + x₂ + 2x₃ = 0, x₂ + x₃ = 0.
Thus the solutions are given by {k(1, 1, −1) : k ∈ ℝ}. Thus the orthogonal complement of the row space of A is L({(1, 1, −1)}).
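The claim is easy to verify: (1, 1, −1) must be orthogonal to every row of A, i.e. Ax = 0. A one-step check (our own sketch):

```python
# Rows of A and the claimed spanning vector of the complement.
A = [[1, 1, 2], [2, 3, 5], [3, 4, 7]]
x = (1, 1, -1)

# x is orthogonal to every row, i.e. A x = 0, so x lies in the
# orthogonal complement of the row space.
residual = [sum(a * b for a, b in zip(row, x)) for row in A]
print(residual)   # [0, 0, 0]
```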

Ex 6.3.9 Find an orthogonal basis of the row space of the matrix A = [ 1 1 1 1 ; 1 2 1 0 ; 2 1 2 3 ].

Solution: Applying elementary row operations to the given matrix A, we get,
[ 1 1 1 1 ; 1 2 1 0 ; 2 1 2 3 ] → [ 1 1 1 1 ; 0 1 0 −1 ; 0 −1 0 1 ] → [ 1 1 1 1 ; 0 1 0 −1 ; 0 0 0 0 ].
Thus, we obtain a row echelon matrix whose non-zero row vectors are (1, 1, 1, 1) and (0, 1, 0, −1). Thus a basis of the row space of A is {(1, 1, 1, 1), (0, 1, 0, −1)}, and these two vectors are orthogonal, since ⟨(1, 1, 1, 1), (0, 1, 0, −1)⟩ = 0. Hence an orthonormal basis of the row space of the matrix A is {(1/2)(1, 1, 1, 1), (1/√2)(0, 1, 0, −1)}.


Ex 6.3.10 Find an orthogonal basis for α⊥ in ℂ³, where α = (1, i, 1 + i).

Solution: Here α⊥ consists of all vectors β = (x, y, z) such that
⟨β, α⟩ = β · ᾱ = x − iy + (1 − i)z = 0.
Let x = 0; then one of the solutions is β₁ = (0, 1 − i, i). Now, we are to find a solution of the system
x − iy + (1 − i)z = 0; (1 + i)y − iz = 0,
the second equation being ⟨β, β₁⟩ = 0. Let z = 2; then y = 2i/(1 + i) = 1 + i and x = iy − (1 − i)z = −3 + 3i, so β₂ = (−3 + 3i, 1 + i, 2). Thus {β₁, β₂} form an orthogonal basis for α⊥. The corresponding orthonormal basis is
{(1/√3)(0, 1 − i, i), (1/(2√6))(−3 + 3i, 1 + i, 2)}.

6.4 Projection of a Vector

Let β (≠ θ) be a fixed vector in an inner product space V. Then for a vector α (≠ θ) in V, there exists c ∈ F such that
⟨α − cβ, β⟩ = 0 ⇒ c = ⟨α, β⟩ / ⟨β, β⟩.        (6.7)
It is analogous to a coefficient in the Fourier series of a function. The unique scalar c is defined as the scalar component of α along β, or the Fourier coefficient of α with respect to β. The vector cβ is said to be the projection of α along β and is given by,
Proj(α, β) = cβ = (⟨α, β⟩ / ⟨β, β⟩) β.        (6.8)
Ex 6.4.1 Find the Fourier coefficient and projection of α = (1, −3, 1, 2) along β = (1, 2, 7, 4) in ℝ⁴.

Solution: Using the definition of the inner product, we have,
⟨α, β⟩ = 1 − 6 + 7 + 8 = 10 and ‖β‖² = ⟨β, β⟩ = 1 + 4 + 49 + 16 = 70.
Since β ≠ θ, the Fourier coefficient c is given by,
c = ⟨α, β⟩ / ⟨β, β⟩ = 10/70 = 1/7.
The projection of α = (1, −3, 1, 2) along β = (1, 2, 7, 4) in ℝ⁴ is given by,
Proj(α, β) = cβ = (1/7)(1, 2, 7, 4).
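Formula (6.8) translates directly into code. The following sketch (our own; the function name `proj` is hypothetical) computes the Fourier coefficient and the projection exactly, reproducing Ex 6.4.1:

```python
from fractions import Fraction

def proj(alpha, beta):
    """Fourier coefficient c = <alpha, beta>/<beta, beta> and the
    projection c*beta of alpha along beta (equations (6.7)-(6.8))."""
    def dot(u, v):
        return sum(Fraction(a) * b for a, b in zip(u, v))
    c = dot(alpha, beta) / dot(beta, beta)
    return c, tuple(c * b for b in beta)

c, p = proj((1, -3, 1, 2), (1, 2, 7, 4))
print(c)                     # 1/7
print([str(v) for v in p])   # ['1/7', '2/7', '1', '4/7']
```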

Ex 6.4.2 Find the Fourier coefficient and projection of α = (1 − i, 3i, 1 + i) along β = (1, 2 − i, 3 + 2i) in ℂ³.

Solution: Using the definition of the inner product, we have,
⟨α, β⟩ = ⟨(1 − i, 3i, 1 + i), (1, 2 − i, 3 + 2i)⟩
= (1 − i) · 1 + 3i(2 + i) + (1 + i)(3 − 2i) = 3(1 + 2i),
‖β‖² = ⟨β, β⟩ = 1² + (2 − i)(2 + i) + (3 + 2i)(3 − 2i) = 19.
Since β ≠ θ, the Fourier coefficient c is given by,
c = ⟨α, β⟩ / ⟨β, β⟩ = (3/19)(1 + 2i).
Thus the projection of α along β in ℂ³ is given by,
Proj(α, β) = cβ = (3/19)(1 + 2i)(1, 2 − i, 3 + 2i) = (3/19)(1 + 2i, 4 + 3i, −1 + 8i).

Ex 6.4.3 Find the Fourier coefficient and projection of α = t² along β = t + 3 in the vector space P(t), with ⟨f, g⟩ = ∫₀¹ f(t)g(t) dt.

Solution: Using the definition of the inner product, we have,
⟨α, β⟩ = ∫₀¹ t²(t + 3) dt = 5/4; ‖β‖² = ∫₀¹ (t + 3)² dt = 37/3.
Since β ≠ θ, the Fourier coefficient c is given by,
c = ⟨α, β⟩ / ⟨β, β⟩ = (5/4)(3/37) = 15/148.
The projection of α = t² along β = t + 3 in P(t) is given by,
Proj(α, β) = cβ = (15/148)(t + 3).


Ex 6.4.4 Find the Fourier coefficient and projection of α = [ 1 2 ; 3 4 ] along β = [ 1 1 ; 5 5 ] in M₂₂.

Solution: Using the definition of the inner product ⟨A, B⟩ = tr(BᵀA), we have,
⟨α, β⟩ = tr( [ 1 5 ; 1 5 ] [ 1 2 ; 3 4 ] ) = tr( [ 16 22 ; 16 22 ] ) = 38;
‖β‖² = tr( [ 1 5 ; 1 5 ] [ 1 1 ; 5 5 ] ) = tr( [ 26 26 ; 26 26 ] ) = 52.
Since β ≠ θ, the Fourier coefficient c is given by,
c = ⟨α, β⟩ / ⟨β, β⟩ = 38/52 = 19/26.
The projection of α along β is given by,
Proj(α, β) = cβ = (19/26) [ 1 1 ; 5 5 ].
Theorem 6.4.1 Let {β₁, β₂, …, β_r} form an orthogonal set of non null vectors in V, and let α be any vector in V ∖ L({β₁, β₂, …, β_r}). If γ = α − Σₖ₌₁ʳ c_k β_k, where c_k = ⟨α, β_k⟩/⟨β_k, β_k⟩ is the scalar component of α along β_k, then γ is orthogonal to β₁, β₂, …, β_r.

Proof: Given that cᵢ = ⟨α, βᵢ⟩/⟨βᵢ, βᵢ⟩ = the component (Fourier coefficient) of α along the given vector βᵢ. By the definition of the inner product, we have, for i = 1, 2, …, r,
⟨γ, βᵢ⟩ = ⟨α − Σₖ₌₁ʳ c_k β_k, βᵢ⟩ = ⟨α, βᵢ⟩ − Σₖ₌₁ʳ c_k ⟨β_k, βᵢ⟩
= ⟨α, βᵢ⟩ − c₁ · 0 − ⋯ − cᵢ⟨βᵢ, βᵢ⟩ − ⋯ − c_r · 0   (since the β_k's are orthogonal)
= ⟨α, βᵢ⟩ − (⟨α, βᵢ⟩/⟨βᵢ, βᵢ⟩)⟨βᵢ, βᵢ⟩ = 0.
This shows that γ is orthogonal to each βᵢ. Hence the theorem is proved. From this theorem, we have:
(i) Let S = {β₁, β₂, …, β_r, α} and T = {β₁, β₂, …, β_r, γ}. Given γ = α − Σₖ₌₁ʳ c_k β_k, each vector in T is a linear combination of the vectors of S. Hence L(T) ⊆ L(S). By the same argument, as α = γ + Σₖ₌₁ʳ c_k β_k, we have L(S) ⊆ L(T). Hence it follows that L(S) = L(T).
(ii) This remains valid when α ∈ L({β₁, β₂, …, β_r}). In this case, α = Σₖ₌₁ʳ c_k β_k, where c_k is the scalar component of α along β_k. Clearly γ = θ, so γ is trivially orthogonal to each βᵢ, and
L(T) = L({β₁, β₂, …, β_r, γ}) = L({β₁, β₂, …, β_r})
= L({β₁, β₂, …, β_r, α}), as α is a linear combination of {β₁, β₂, …, β_r},
= L(S).
Theorem 6.4.2 An orthogonal set of non-null vectors in a finite dimensional inner product space is either a basis or can be extended to an orthogonal basis.

Proof: Let S = {α₁, α₂, …, α_r} be an orthogonal set of non null vectors in V and dim V = n, where 1 ≤ r ≤ n. Therefore S = {α₁, α₂, …, α_r} is a linearly independent set.
Case 1: Let r = n; then S = {α₁, α₂, …, αₙ} is an orthogonal basis of V.
Case 2: Let r < n; then L(S) = L({α₁, α₂, …, α_r}) is a proper subspace of V, and so there exists a non-null vector α_{r+1} ∈ V with α_{r+1} ∉ L({α₁, α₂, …, α_r}). Let
c₁α₁ + c₂α₂ + ⋯ + c_r α_r + c_{r+1} α_{r+1} = θ,
where the cᵢ's are scalars. If c_{r+1} ≠ 0, then α_{r+1} is a linear combination of α₁, α₂, …, α_r, so that α_{r+1} ∈ L({α₁, α₂, …, α_r}), which is a contradiction. Therefore c_{r+1} must be zero, and so the set {α₁, α₂, …, α_r, α_{r+1}} is linearly independent. Let
β_{r+1} = α_{r+1} − Σᵢ₌₁ʳ dᵢαᵢ; dᵢ = ⟨α_{r+1}, αᵢ⟩ / ‖αᵢ‖²,
where dᵢ is the scalar component of α_{r+1} along αᵢ; then β_{r+1} is non-null and ⟨β_{r+1}, αᵢ⟩ = 0 for i = 1, 2, …, r. Therefore, S₁ = {α₁, α₂, …, α_r, β_{r+1}} is an orthogonal set, and the given set S has been extended to the orthogonal set S₁.
If r + 1 = n, then S₁ is an orthogonal basis of V. If r + 1 < n, then by repeated application we obtain, in a finite number of steps, an orthogonal set of n vectors S_{n−r} = {α₁, α₂, …, α_r, β_{r+1}, …, βₙ} in V. This set S_{n−r}, being an orthogonal set of non null vectors, is linearly independent and so forms a basis of V. If this extended set is normalized, then V has an orthonormal basis.
Result 6.4.1 Let S = {β₁, β₂, …, βₙ} be an orthogonal basis of V; then the βᵢ are linearly independent. Now any α ∈ V can be expressed as a linear combination of vectors of S as
α = (⟨α, β₁⟩/⟨β₁, β₁⟩)β₁ + (⟨α, β₂⟩/⟨β₂, β₂⟩)β₂ + ⋯ + (⟨α, βₙ⟩/⟨βₙ, βₙ⟩)βₙ.

Gram-Schmidt process of orthogonalization

Theorem 6.4.3 Every non-null subspace W of a finite dimensional inner product space V possesses an orthogonal basis.


Proof: Let V be an inner product space of dimension n, and dim W = r where 1 ≤ r ≤ n. Let S = {α₁, α₂, …, α_r} be a basis of W; then S is linearly independent and none of the elements of S is θ. Now we construct an orthogonal basis. Since all the basis vectors are non-null, set
β₁ = α₁ and β₂ = α₂ − c₁β₁.
If β₂ is to be orthogonal to β₁, then
0 = ⟨β₂, β₁⟩ = ⟨α₂ − c₁β₁, β₁⟩ = ⟨α₂, β₁⟩ − c₁⟨β₁, β₁⟩ ⇒ c₁ = ⟨α₂, β₁⟩/⟨β₁, β₁⟩
⇒ β₂ = α₂ − (⟨α₂, β₁⟩/⟨β₁, β₁⟩)β₁,
i.e. c₁ is the component of α₂ along β₁, and c₁β₁ is the projection of α₂ upon β₁. For this value of c₁, β₂ is orthogonal to β₁ and
L({α₁, α₂}) = L({β₁, α₂}) = L({β₁, β₂}).
Next let β₃ = α₃ − d₁β₁ − d₂β₂, where d₁β₁, d₂β₂ are the projections of α₃ upon β₁, β₂ respectively. If β₃ is to be orthogonal to β₁, β₂, then
⟨β₃, β₁⟩ = 0; ⟨β₃, β₂⟩ = 0
⇒ d₁ = ⟨α₃, β₁⟩/⟨β₁, β₁⟩; d₂ = ⟨α₃, β₂⟩/⟨β₂, β₂⟩
⇒ β₃ = α₃ − (⟨α₃, β₁⟩/⟨β₁, β₁⟩)β₁ − (⟨α₃, β₂⟩/⟨β₂, β₂⟩)β₂,
and L({α₁, α₂, α₃}) = L({β₁, β₂, α₃}) = L({β₁, β₂, β₃}).
Proceeding in this way we can construct β₁, β₂, …, β_r, where
β_r = α_r − (⟨α_r, β₁⟩/⟨β₁, β₁⟩)β₁ − (⟨α_r, β₂⟩/⟨β₂, β₂⟩)β₂ − ⋯ − (⟨α_r, β_{r−1}⟩/⟨β_{r−1}, β_{r−1}⟩)β_{r−1},
and β_r ≠ θ as S is linearly independent. Also
⟨β_r, βᵢ⟩ = 0; i = 1, 2, …, r − 1,
L({α₁, α₂, …, α_r}) = L({β₁, β₂, …, β_r}).
Hence {β₁, β₂, …, β_r} is an orthogonal basis of the subspace W. If {β₁, β₂, …, β_r} is an orthogonal basis of the subspace W, then for any α ∈ W we have
α = c₁β₁ + c₂β₂ + ⋯ + c_rβ_r,
where the cᵢ are the Fourier coefficients of α with respect to βᵢ, i = 1, 2, …, r. Since W = L({β₁, β₂, …, β_r}), where {β₁, β₂, …, β_r} form an orthogonal basis of the subspace W, then for any α ∈ V,
Proj(α, W) = c₁β₁ + c₂β₂ + ⋯ + c_rβ_r.        (6.9)
Here cᵢ is the component of α along βᵢ.
Ex 6.4.5 Apply the Gram-Schmidt process to obtain an orthogonal basis of ℝ³ using the standard inner product, given that {(1, 0, 1), (1, 0, −1), (0, 3, 4)} is a basis.

Solution: Let α₁ = (1, 0, 1), α₂ = (1, 0, −1), α₃ = (0, 3, 4). Since
det [ 1 1 0 ; 0 0 3 ; 1 −1 4 ] = 6 ≠ 0,
S = {α₁, α₂, α₃} is linearly independent. Also S contains 3 vectors and dim ℝ³ = 3. Hence S is a basis of ℝ³. Let us construct an orthogonal basis {β₁, β₂, β₃} by the Gram-Schmidt process. For this, let
β₁ = α₁ = (1, 0, 1),
β₂ = α₂ − c₁β₁ = α₂ − (⟨α₂, β₁⟩/⟨β₁, β₁⟩)β₁
= (1, 0, −1) − ((1·1 + 0·0 + (−1)·1)/(1·1 + 0·0 + 1·1))(1, 0, 1) = (1, 0, −1),
and β₃ = α₃ − (⟨α₃, β₁⟩/⟨β₁, β₁⟩)β₁ − (⟨α₃, β₂⟩/⟨β₂, β₂⟩)β₂
= (0, 3, 4) − ((0·1 + 3·0 + 4·1)/2)(1, 0, 1) − ((0·1 + 3·0 + 4·(−1))/2)(1, 0, −1)
= (0, 3, 4) − 2(1, 0, 1) + 2(1, 0, −1) = (0, 3, 0).
Thus, β₂ is orthogonal to β₁ and β₃ is orthogonal to β₁, β₂. Also,
L({α₁, α₂, α₃}) = L({β₁, β₂, β₃}).
Therefore, an orthogonal basis of ℝ³ is {(1, 0, 1), (1, 0, −1), (0, 3, 0)} and the corresponding orthonormal basis is
{(1/√2)(1, 0, 1), (1/√2)(1, 0, −1), (0, 1, 0)}.
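The recursion used in the proof of Theorem 6.4.3 and in Ex 6.4.5 can be written as a short routine. A sketch in exact rational arithmetic (our own illustration, reproducing the orthogonal basis of Ex 6.4.5; vectors are left unnormalised):

```python
from fractions import Fraction

def gram_schmidt(vectors):
    """Classical Gram-Schmidt in exact rational arithmetic.
    Input: linearly independent vectors; output: an orthogonal basis."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))
    basis = []
    for a in vectors:
        a = [Fraction(x) for x in a]
        for b in basis:
            c = dot(a, b) / dot(b, b)          # Fourier coefficient of a along b
            a = [x - c * y for x, y in zip(a, b)]
        basis.append(a)
    return basis

B = gram_schmidt([(1, 0, 1), (1, 0, -1), (0, 3, 4)])
print([[int(x) for x in v] for v in B])   # [[1, 0, 1], [1, 0, -1], [0, 3, 0]]
```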
Ex 6.4.6 Find an orthogonal basis for ℝ⁴ containing the two vectors (1, 2, 1, −1) and (0, 1, 2, −1).

Solution: Since dim ℝ⁴ = 4, a basis of ℝ⁴ contains four linearly independent vectors. Let {e₁, e₂, e₃, e₄} be the standard basis of ℝ⁴, where e₁ = (1, 0, 0, 0), e₂ = (0, 1, 0, 0), e₃ = (0, 0, 1, 0), and e₄ = (0, 0, 0, 1). Let α₁ = (1, 2, 1, −1) and α₂ = (0, 1, 2, −1). Now,
α₁ = (1, 2, 1, −1) = 1e₁ + 2e₂ + 1e₃ + (−1)e₄.
Since the coefficient of e₁ is non zero, by the replacement theorem, {α₁, e₂, e₃, e₄} is a basis. Also,
α₂ = (0, 1, 2, −1) = 0e₁ + 1e₂ + 2e₃ + (−1)e₄
= 0(α₁ − 2e₂ − e₃ + e₄) + e₂ + 2e₃ − e₄
= 0α₁ + e₂ + 2e₃ − e₄.
Since the coefficient of e₂ is non zero, {α₁, α₂, e₃, e₄} is a new basis. Now we construct an orthogonal basis {β₁, β₂, β₃, β₄} by the Gram-Schmidt process. For this, let
β₁ = α₁ = (1, 2, 1, −1),
β₂ = α₂ − (⟨α₂, β₁⟩/⟨β₁, β₁⟩)β₁
= (0, 1, 2, −1) − ((0·1 + 1·2 + 2·1 + (−1)·(−1))/(1 + 4 + 1 + 1))(1, 2, 1, −1)
= (0, 1, 2, −1) − (5/7)(1, 2, 1, −1) = (1/7)(−5, −3, 9, −2),
β₃ = e₃ − (⟨e₃, β₁⟩/⟨β₁, β₁⟩)β₁ − (⟨e₃, β₂⟩/⟨β₂, β₂⟩)β₂
= (0, 0, 1, 0) − (1/7)(1, 2, 1, −1) − ((9/7)/(119/49)) · (1/7)(−5, −3, 9, −2)
= (0, 0, 1, 0) − (1/7, 2/7, 1/7, −1/7) + (9/119)(5, 3, −9, 2) = (1/17)(4, −1, 3, 5),
β₄ = e₄ − (⟨e₄, β₁⟩/⟨β₁, β₁⟩)β₁ − (⟨e₄, β₂⟩/⟨β₂, β₂⟩)β₂ − (⟨e₄, β₃⟩/⟨β₃, β₃⟩)β₃
= (0, 0, 0, 1) − ((−1)/7)(1, 2, 1, −1) − ((−2/7)/(119/49)) · (1/7)(−5, −3, 9, −2) − ((5/17)/(3/17)) · (1/17)(4, −1, 3, 5)
= (−1/3, 1/3, 0, 1/3).
Here, β₁ is orthogonal to β₂, β₃, β₄; β₂ is orthogonal to β₃, β₄; and β₃ is orthogonal to β₄. Also,
L({β₁, β₂, β₃, β₄}) = L({α₁, α₂, e₃, e₄}) = ℝ⁴.
Therefore, {β₁, β₂, β₃, β₄} is the required orthogonal basis of ℝ⁴.
Ex 6.4.7 Given {α₁ = 1, α₂ = 1 + x, α₃ = x + x²} as a basis of P₂(x) over ℝ, define the inner product as
⟨f, g⟩ = ∫₋₁¹ f(x)g(x) dx,
where f(x), g(x) are elements of P₂(x). Construct an orthonormal basis of P₂(x) from the given set.

Solution: Let β₁ = α₁ = 1 and β₂ = α₂ − c₂₁β₁, where c₂₁ = ⟨α₂, β₁⟩/⟨β₁, β₁⟩. Now,
⟨α₂, β₁⟩ = ∫₋₁¹ (1 + x) · 1 dx = 2 and ⟨β₁, β₁⟩ = ∫₋₁¹ 1 · 1 dx = 2.
Therefore β₂ = α₂ − (2/2)β₁ = 1 + x − 1 = x. Similarly,
β₃ = α₃ − c₃₁β₁ − c₃₂β₂, where c₃₁ = ⟨α₃, β₁⟩/⟨β₁, β₁⟩, c₃₂ = ⟨α₃, β₂⟩/⟨β₂, β₂⟩.
Again, ⟨α₃, β₁⟩ = ∫₋₁¹ (x + x²) · 1 dx = 2/3, ⟨α₃, β₂⟩ = ∫₋₁¹ (x + x²) · x dx = 2/3,
and ⟨β₂, β₂⟩ = ∫₋₁¹ x² dx = 2/3.
Therefore β₃ = α₃ − ((2/3)/2)β₁ − ((2/3)/(2/3))β₂ = (x + x²) − 1/3 − x = x² − 1/3.
Thus, the set {1, x, x² − 1/3} is an orthogonal basis of P₂(x). Again,
‖β₁‖² = ⟨β₁, β₁⟩ = ∫₋₁¹ 1 dx = 2, ‖β₂‖² = ⟨β₂, β₂⟩ = ∫₋₁¹ x² dx = 2/3,
‖β₃‖² = ⟨β₃, β₃⟩ = ∫₋₁¹ (x² − 1/3)² dx = 8/45.
Therefore ‖β₁‖ = √2, ‖β₂‖ = √6/3, ‖β₃‖ = 2√10/15.
Hence an orthonormal basis of P₂(x) is
{ √2/2, (√6/2)x, (√10/4)(3x² − 1) }.
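The same Gram-Schmidt routine works for polynomials once the inner product ⟨p, q⟩ = ∫₋₁¹ p(x)q(x) dx is expressed through coefficients, using ∫₋₁¹ xᵏ dx = 2/(k+1) for even k and 0 for odd k. A sketch (our own; polynomials are coefficient lists in ascending degree) reproducing the orthogonal basis {1, x, x² − 1/3}:

```python
from fractions import Fraction

def inner(p, q):
    """<p, q> = integral over [-1, 1] of p(x) q(x) dx, polynomials as
    coefficient lists [a0, a1, a2] (a0 + a1 x + a2 x^2)."""
    prod = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            prod[i + j] += Fraction(a) * b
    # integral of x^k over [-1, 1]: 2/(k+1) for even k, 0 for odd k
    return sum(2 * c / (k + 1) for k, c in enumerate(prod) if k % 2 == 0)

def gram_schmidt(polys):
    basis = []
    for p in polys:
        p = [Fraction(c) for c in p]
        for b in basis:
            c = inner(p, b) / inner(b, b)   # Fourier coefficient
            p = [x - c * y for x, y in zip(p, b)]
        basis.append(p)
    return basis

# alpha1 = 1, alpha2 = 1 + x, alpha3 = x + x^2 (ascending coefficients)
B = gram_schmidt([[1, 0, 0], [1, 1, 0], [0, 1, 1]])
print([[str(c) for c in p] for p in B])   # [['1', '0', '0'], ['0', '1', '0'], ['-1/3', '0', '1']]
```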
Ex 6.4.8 Find an orthogonal basis for ℝ³ containing the vector (1/√2, 1/√2, 0), with the standard inner product.

Solution: We know dim ℝ³ = 3, so a basis of ℝ³ contains three linearly independent vectors. First we construct an orthogonal set of three vectors with α₁ = (1/√2, 1/√2, 0) as an element. Let the other elements be α₂ = (x₁, x₂, x₃) and α₃ = (y₁, y₂, y₃). Since α₁, α₂, α₃ are orthogonal, we have,
α₁ · α₂ = 0 ⇒ (1/√2)x₁ + (1/√2)x₂ + 0x₃ = 0 ⇒ x₁ = −x₂,
α₁ · α₃ = 0 ⇒ (1/√2)y₁ + (1/√2)y₂ + 0y₃ = 0 ⇒ y₁ = −y₂,
α₂ · α₃ = 0 ⇒ x₁y₁ + x₂y₂ + x₃y₃ = 0 ⇒ 2x₁y₁ + x₃y₃ = 0.
For simplicity, we choose y₁ = 0; then y₂ = 0 and, since y₃ ≠ 0, x₃ = 0. Thus the introduced vectors are α₂ = (x₁, −x₁, 0), α₃ = (0, 0, y₃). Hence, normalising,
α₂/‖α₂‖ = (1/√2)(1, −1, 0); α₃/‖α₃‖ = (0, 0, 1).
Thus an orthogonal basis is {(1/√2)(1, 1, 0), (1/√2)(1, −1, 0), (0, 0, 1)}.
Ex 6.4.9 Find an orthogonal basis for the space of solutions of the linear equation 3x − 2y + z = 0.

Solution: First we find a basis, not necessarily orthogonal. For instance, we give z an arbitrary value, say z = 1. Thus we have to satisfy 3x − 2y = −1. By inspection, we let
x = 1, y = 2 or x = 3, y = 5, i.e.
α₁ = (1, 2, 1) and α₂ = (3, 5, 1).
Then obviously α₁, α₂ are linearly independent. The space of solutions has dimension 2, so α₁, α₂ form a basis of that space of solutions. There are of course many bases for this space. Let
β₁ = α₁ = (1, 2, 1),
β₂ = α₂ − (⟨α₂, β₁⟩/⟨β₁, β₁⟩)β₁ = (3, 5, 1) − ((3·1 + 5·2 + 1·1)/(1·1 + 2·2 + 1·1))(1, 2, 1)
= (3, 5, 1) − (14/6)(1, 2, 1) = (1/3)(2, 1, −4).
Then {β₁, β₂} is an orthogonal basis of the given space of solutions.
Ex 6.4.10 Find an orthogonal basis for the space of solutions of the linear equations 3x − 2y + z + w = 0 = x + y + 2w.

Solution: Let W be the space of solutions in ℝ⁴. Then W is the space orthogonal to the two vectors (3, −2, 1, 1) and (1, 1, 0, 2). These are obviously linearly independent (by any number of arguments; for instance, you can prove at once that the submatrix [ −2 1 ; 1 0 ] of the coefficient matrix has rank 2). Hence dim W = 4 − 2 = 2. Next we find a basis for the space of solutions. Let us put w = 1, so that
3x − 2y + z = −1; x + y = −2
by ordinary elimination. If we put y = 0, then we get a solution with x = −2 and
z = −1 − 3x + 2y = 5.
If we put y = 1, then we get a solution with x = −3 and
z = −1 − 3x + 2y = 10.
Thus we get the two solutions α₁ = (−2, 0, 5, 1) and α₂ = (−3, 1, 10, 1). These two solutions are linearly independent, because for instance the submatrix [ −2 0 ; −3 1 ] has rank 2. Hence {α₁, α₂} is a basis for the space of solutions. To find an orthogonal basis, let
β₁ = α₁ = (−2, 0, 5, 1),
β₂ = α₂ − (⟨α₂, β₁⟩/⟨β₁, β₁⟩)β₁ = (−3, 1, 10, 1) − ((6 + 50 + 1)/(4 + 25 + 1))(−2, 0, 5, 1)
= (−3, 1, 10, 1) − (19/10)(−2, 0, 5, 1) = (1/10)(8, 10, 5, −9).
Thus {β₁, β₂} is an orthogonal basis for the space of solutions.
Ex 6.4.11 Find an orthogonal and an orthonormal basis for the subspace W of ℂ³ spanned by α₁ = (1, i, 1) and α₂ = (1 + i, 0, 2).

Solution: To find an orthogonal and an orthonormal basis for the subspace W of ℂ³, we use the Gram-Schmidt process of orthogonalization. Let
β₁ = α₁ = (1, i, 1),
β₂ = α₂ − (⟨α₂, β₁⟩/⟨β₁, β₁⟩)β₁ = (1 + i, 0, 2) − (⟨(1 + i, 0, 2), (1, i, 1)⟩/⟨(1, i, 1), (1, i, 1)⟩)(1, i, 1)
= (1 + i, 0, 2) − ((3 + i)/3)(1, i, 1) = (1/3)(2i, 1 − 3i, 3 − i).
Thus, an orthogonal basis is {(1, i, 1), (2i, 1 − 3i, 3 − i)}. Also,
‖(1, i, 1)‖² = 3, ‖(2i, 1 − 3i, 3 − i)‖² = 24.
Hence the orthonormal basis is {(1/√3)(1, i, 1), (1/(2√6))(2i, 1 − 3i, 3 − i)}.

Ex 6.4.12 Let W be the real valued solution space of d²y/dx² + 4y = 0. Find an orthogonal and an orthonormal basis for W.

Solution: Since y = φ(x) = 0 satisfies the differential equation d²y/dx² + 4y = 0, φ(x) ∈ W; consequently W is non empty, and it is easily verified that W is a subspace of V(ℝ). The solution of d²y/dx² + 4y = 0 is of the type
y = c₁ cos 2x + c₂ sin 2x, for c₁, c₂ ∈ ℝ.
Thus W = L({cos 2x, sin 2x}). If c₁ cos 2x + c₂ sin 2x = 0, for c₁, c₂ ∈ ℝ, is an identity, then c₁ = c₂ = 0. Therefore {cos 2x, sin 2x} is linearly independent and is a basis of W, so that dim W = 2. To find an orthogonal basis, we use the inner product definition
⟨f, g⟩ = ∫₀^π f(t)g(t) dt
and the Gram-Schmidt process of orthogonalization. Let the orthogonal basis be {β₁, β₂}; then
β₁ = cos 2x and β₂ = sin 2x − (⟨sin 2x, cos 2x⟩/⟨cos 2x, cos 2x⟩) cos 2x = sin 2x,
since ⟨sin 2x, cos 2x⟩ = ∫₀^π sin 2x cos 2x dx = 0. Again, using the inner product definition, we have
⟨cos 2x, cos 2x⟩ = ∫₀^π cos² 2x dx = π/2; ⟨sin 2x, sin 2x⟩ = ∫₀^π sin² 2x dx = π/2.
The corresponding orthonormal basis is {√(2/π) cos 2x, √(2/π) sin 2x}.
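The orthogonality relations used above can be confirmed by numerical quadrature; the midpoint rule is accurate enough here. A small sketch (our own illustration, not from the text):

```python
from math import cos, sin, pi, isclose

def inner(f, g, a=0.0, b=pi, n=20000):
    """Midpoint-rule approximation of the inner product: integral of
    f(x) g(x) over [a, b] using n subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) * g(a + (k + 0.5) * h) for k in range(n))

c2 = lambda x: cos(2 * x)
s2 = lambda x: sin(2 * x)

assert isclose(inner(c2, s2), 0.0, abs_tol=1e-9)      # sin 2x orthogonal to cos 2x
assert isclose(inner(c2, c2), pi / 2, rel_tol=1e-6)   # ||cos 2x||^2 = pi/2
assert isclose(inner(s2, s2), pi / 2, rel_tol=1e-6)   # ||sin 2x||^2 = pi/2
print("orthogonality relations verified")
```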
Ex 6.4.13 Find the orthonormal basis of the row space of the matrix
A = [ 1 2 0 ; 1 0 1 ; 2 2 −1 ].

Solution: Let α₁ = (1, 2, 0), α₂ = (1, 0, 1), α₃ = (2, 2, −1) be the three rows of A. The vectors α₁, α₂, α₃ are linearly independent, as
det [ 1 2 0 ; 1 0 1 ; 2 2 −1 ] = 4 ≠ 0.
Thus {α₁, α₂, α₃} is a basis of the row space of the matrix A. To find an orthogonal basis, let
β₁ = α₁ = (1, 2, 0),
β₂ = α₂ − c₂₁β₁ = α₂ − (⟨α₂, β₁⟩/⟨β₁, β₁⟩)β₁ = (1, 0, 1) − (1/5)(1, 2, 0) = (1/5)(4, −2, 5),
and β₃ = α₃ − c₃₁β₁ − c₃₂β₂ = α₃ − (⟨α₃, β₁⟩/⟨β₁, β₁⟩)β₁ − (⟨α₃, β₂⟩/⟨β₂, β₂⟩)β₂
= (2, 2, −1) − (6/5)(1, 2, 0) + (1/9) · (1/5)(4, −2, 5) = (1/9)(8, −4, −8).
Hence the orthogonal basis of the row space is {β₁, β₂, β₃}, i.e., {(1, 2, 0), (1/5)(4, −2, 5), (1/9)(8, −4, −8)}, and the corresponding orthonormal basis is {β₁/‖β₁‖, β₂/‖β₂‖, β₃/‖β₃‖}, i.e.,
{(1/√5)(1, 2, 0), (1/(3√5))(4, −2, 5), (1/3)(2, −1, −2)}.
Ex 6.4.14 Let V be the subspace of ℝ⁴ spanned by (1, 1, 1, 1), (1, −1, 2, 2), (1, 2, −3, −4). Find an orthogonal and an orthonormal basis for V. Find the projection of α = (1, 2, −3, 4) onto V.

Solution: Let α₁ = (1, 1, 1, 1), α₂ = (1, −1, 2, 2), α₃ = (1, 2, −3, −4). To find an orthogonal and an orthonormal basis for the subspace V of ℝ⁴, we use the Gram-Schmidt process of orthogonalization. Let
β₁ = α₁ = (1, 1, 1, 1),
β₂ = α₂ − (⟨α₂, β₁⟩/⟨β₁, β₁⟩)β₁ = (1, −1, 2, 2) − (4/4)(1, 1, 1, 1) = (0, −2, 1, 1),
β₃ = α₃ − (⟨α₃, β₁⟩/⟨β₁, β₁⟩)β₁ − (⟨α₃, β₂⟩/⟨β₂, β₂⟩)β₂
= (1, 2, −3, −4) − (−4/4)(1, 1, 1, 1) − (−11/6)(0, −2, 1, 1) = (1/6)(12, −4, −1, −7).
Therefore, an orthogonal basis for the subspace V of ℝ⁴ is
{(1, 1, 1, 1), (0, −2, 1, 1), (12, −4, −1, −7)}
and the corresponding orthonormal basis for the subspace V of ℝ⁴ is
{(1/2)(1, 1, 1, 1), (1/√6)(0, −2, 1, 1), (1/√210)(12, −4, −1, −7)}.
To extend the above orthogonal basis to an orthogonal basis for ℝ⁴, we adjoin to the given basis of V any vector of the fundamental system to form a basis of ℝ⁴ and then proceed to find the orthogonal basis of ℝ⁴ as before. For the projection, we need only compute the Fourier coefficients of α = (1, 2, −3, 4):
c₁ = ⟨α, β₁⟩/⟨β₁, β₁⟩ = 4/4 = 1; c₂ = ⟨α, β₂⟩/⟨β₂, β₂⟩ = −3/6 = −1/2;
c₃ = ⟨α, β₃⟩/⟨β₃, β₃⟩ = (−7/2)/(35/6) = −3/5.
Since V = L({β₁, β₂, β₃}), where {β₁, β₂, β₃} form an orthogonal basis of the subspace V,
Proj(α, V) = c₁β₁ + c₂β₂ + c₃β₃
= (1, 1, 1, 1) − (1/2)(0, −2, 1, 1) − (3/5) · (1/6)(12, −4, −1, −7) = (1/5)(−1, 12, 3, 6).
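Equation (6.9) says that the projection onto a subspace is the sum of the projections along the vectors of an orthogonal basis; scaling a basis vector does not change its term. A sketch (our own; `project_onto_span` is a hypothetical helper) reproducing the projection just computed, with β₃ taken in its unscaled form (12, −4, −1, −7):

```python
from fractions import Fraction

def project_onto_span(alpha, orth_basis):
    """Projection of alpha onto the span of an *orthogonal* basis:
    Proj(alpha, W) = sum over b of (<alpha, b>/<b, b>) b  (equation (6.9))."""
    def dot(u, v):
        return sum(Fraction(a) * b for a, b in zip(u, v))
    p = [Fraction(0)] * len(alpha)
    for b in orth_basis:
        c = dot(alpha, b) / dot(b, b)          # Fourier coefficient along b
        p = [x + c * y for x, y in zip(p, b)]
    return p

basis = [(1, 1, 1, 1), (0, -2, 1, 1), (12, -4, -1, -7)]
alpha = (1, 2, -3, 4)
print([str(v) for v in project_onto_span(alpha, basis)])   # ['-1/5', '12/5', '3/5', '6/5']
```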
Ex 6.4.15 Let S = {(1, 1, 1, 1), (1, 1, −1, −1), (1, −1, 1, −1), (1, −1, −1, 1)} ⊂ R^4. (i) Show
that S is orthogonal and a basis of R^4. (ii) Express γ = (1, 3, −5, 6) as a linear combination
of the vectors of S. (iii) Find the co-ordinates of an arbitrary vector γ = (a, b, c, d) ∈ R^4,
relative to the basis S.
Solution: Let α1 = (1, 1, 1, 1), α2 = (1, 1, −1, −1), α3 = (1, −1, 1, −1), α4 = (1, −1, −1, 1).
(i) Using the definition of inner product, we get,
⟨α1, α2⟩ = ⟨α1, α3⟩ = ⟨α1, α4⟩ = ⋯ = ⟨α3, α4⟩ = 0.
Thus, S is orthogonal and hence S is linearly independent. As any four linearly independent
vectors form a basis of R^4, S is a basis for R^4.
(ii) To express γ = (1, 3, −5, 6) as a linear combination of the vectors of S, we are to find
scalars c1, c2, c3, c4 ∈ R such that,
γ = c1α1 + c2α2 + c3α3 + c4α4
or, (1, 3, −5, 6) = c1(1, 1, 1, 1) + c2(1, 1, −1, −1) + c3(1, −1, 1, −1) + c4(1, −1, −1, 1)
or, c1 + c2 + c3 + c4 = 1; c1 + c2 − c3 − c4 = 3;
c1 − c2 + c3 − c4 = −5; c1 − c2 − c3 + c4 = 6;
⟹ c1 = 5/4; c2 = 3/4; c3 = −13/4; c4 = 9/4.
(iii) Since S is orthogonal, we need only find the Fourier coefficients of γ with respect to the
basis vectors. Therefore,
c1 = ⟨γ, α1⟩/⟨α1, α1⟩ = (1/4)(a + b + c + d); c2 = ⟨γ, α2⟩/⟨α2, α2⟩ = (1/4)(a + b − c − d);
c3 = ⟨γ, α3⟩/⟨α3, α3⟩ = (1/4)(a − b + c − d); c4 = ⟨γ, α4⟩/⟨α4, α4⟩ = (1/4)(a − b − c + d)
are the co-ordinates of γ with respect to the basis S.
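The coordinate computation for an orthogonal basis can be sketched as follows (an illustrative check, not part of the original text):

```python
from fractions import Fraction

S = [(1, 1, 1, 1), (1, 1, -1, -1), (1, -1, 1, -1), (1, -1, -1, 1)]
gamma = (1, 3, -5, 6)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# For an orthogonal basis, the i-th coordinate is <gamma, alpha_i>/<alpha_i, alpha_i>.
coords = [Fraction(dot(gamma, a), dot(a, a)) for a in S]

# Reassemble gamma from its coordinates to confirm the expansion.
recon = [sum(c * a[i] for c, a in zip(coords, S)) for i in range(4)]
```

The coefficients come out as 5/4, 3/4, −13/4, 9/4, matching part (ii).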

390

Inner Product Space

Exercise 6
Section-A
[Multiple Choice Questions]
1. The transformation T : R² → R defined by T(x, y) = x + y + λ is linear if λ equals
(a) 5  (b) 2  (c) 1  (d) 0
2. The transformation T : R² → R defined by T(x, y) = x^k + y is linear if k equals
(a) 0  (b) 1  (c) 2  (d) 3
3. If a linear transformation T : R² → R² is defined by T(x1, x2) = (x1 + x2, 0) then ker(T) is
[WBUT 2007]
(a) {(1, −1)}  (b) {(1, 0)}  (c) {(0, 0)}  (d) {(0, 1), (1, 0)}
4. The transformation T : R² → R² defined by T(x, y) = (x², y²) is
(a) linear  (b) non-linear
5. The integral operator I defined by If(x) = ∫ₐᵇ f(x) dx is
(a) linear  (b) non-linear

6. The differential operator D defined by Df(x) = df/dx is
(a) linear  (b) non-linear

7. If the transformation T : R² → R is linear then 5T is
(a) linear  (b) non-linear
8. The ker(T) for the mapping T : R³ → R³ defined by T(x, y, z) = (x + y, y + z, z + x) is
(a) {(0, 0, 1)}  (b) {(0, 0, 0)}  (c) {(1, 1, 1)}  (d) {(1, 1, 1), (1, −1, 1)}
9. The Im(T) for the mapping T : R³ → R² defined by T(x, y, z) = (x + z, y + z) is
(a) L{(1, 0), (0, 1)}  (b) L{(1, 0), (0, 1), (1, 1)}  (c) L{(1, 0)}  (d) L{(0, 1)}
10. For a bijective mapping T : R³ → R³ the rank of T is
(a) 1  (b) 2  (c) 3  (d) 4
11. If T : R³ → R³ is bijective then the nullity of T is
(a) 0  (b) 1  (c) 2  (d) 3
12. If T(x, y, z) = (x + 2y − z, x + 3y + 2z), x, y, z ∈ R, then T(1, 2, 0) is
(a) (1, 2)  (b) (5, 7)  (c) (5, 2)  (d) (2, 7)
13. Let T : V → V be a linear transformation on a finite dimensional V. If im(T) = V then ker(T) is
(a) V  (b) {(1, 0, 0), (0, 1, 0), (0, 0, 1)}  (c) {θ}  (d) none of these
14. Let T : R³ → R² and S : R³ → R² be defined by T(x, y, z) = (x + y, y + z) and
S(x, y, z) = (x − z, y); then (T + S)(x, y, z) is
(a) (x + y, y + z)  (b) (2x + y − z, 2y + z)  (c) (x + y, x + 2y)  (d) (x − y, x − z)
15. If S and T are linear operators on R² defined by S(x, y) = (y, x) and T(x, y) = (x, 0)
then ST is equal to
(a) (x, y)  (b) (x, 0)  (c) (0, x)  (d) (y, x)


16. If S and T are two linear operators on R² defined by S(x, y) = (x + y, x − y) and
T(x, y) = (y, x), then 2S + 3T is
(a) (2x + 5y, 5x − 2y)  (b) (x + 2y, 2x − y)  (c) (x + 4y, 4x)  (d) (2x, 3y)
17. If S : R² → R² is defined by S(x, y) = (x, x + y) then S² is
(a) (x, 2x + 2y)  (b) (x, 2x + y)  (c) (x + y, x)  (d) (x², (x + y)²)
18. If T1 and T2 are two operators defined by T1(x, y) = (x, y) and T2(x, y) = (0, y), then
T2T1² is
(a) (0, x)  (b) (x, 0)  (c) (0, y)  (d) (x, y)
19. Let T : R² → R² be defined by T(x, y) = (x + y, x). Then T⁻¹(x, y) is
(a) (x − y, x)  (b) (x, x + y)  (c) (x − y, x + y)  (d) (y, x − y)
20. Let T : R² → R² be defined by T(x, y) = (y, x); then T⁻¹(3, 4) is
(a) (3, 4)  (b) (4, 3)  (c) (−3, 4)  (d) (−4, 3)
21. Let S : R² → R² and T : R² → R² be two mappings defined by S(x, y) = (x + y, x)
and T(x, y) = (y, x); then T⁻¹S⁻¹(x, y) is
(a) (x, y)  (b) (y, x)  (c) (y, x − y)  (d) (x − y, y)
22. If ‖α‖ = 2 then the norm of the vector −5α is
(a) −10  (b) 10  (c) −2  (d) 2
23. If α and β are orthogonal vectors then ‖α + β‖² is
(a) ‖α‖² + ‖β‖²  (b) ‖α‖² − ‖β‖²  (c) (‖α‖ + ‖β‖)²  (d) none of these

24. If α = (4, 0, 3) is a vector of an inner product space then the normalized α is
(a) (4, 0, 3)  (b) (1/5)(4, 0, 3)  (c) (1, 0, 1)  (d) (1, 0, 0)
25. Let α and β be two vectors of a Euclidean space V; then (α + β, α − β) = 0 iff
(a) ‖α + β‖ = ‖α − β‖  (b) ‖α‖ = ‖β‖  (c) ‖α − β‖ = 0  (d) ‖α + β‖ = 0
26. Let V be the vector space of polynomials with inner product given by (f, g) = ∫₀¹ f(t)g(t) dt.
If f(t) = t² − t then ‖f(t)‖ is
(a) 1/6  (b) 1/√30  (c) 1/30  (d) 1/√6

27. If V is the vector space of all polynomials in t with inner product defined by (f, g) = ∫₀¹ f(t)g(t) dt,
and if f(t) = 2t + 1 and g(t) = t² + 1, then (f, g) is
(a) 1/6  (b) 1/10  (c) 17/6  (d) 6/17

28. If α, β are two vectors in a real inner product space such that ‖α‖ = ‖β‖, then
(α + β, α − β) is equal to
(a) 0  (b) 1  (c) −1  (d) none of these
29. Let α = (2, −3, −4) and β = (1, 1, k) be two vectors in a Euclidean space. If α and β
are orthogonal then k is equal to
(a) 0  (b) 1  (c) 1/4  (d) −1/4
30. If the vectors α = (k, 0, 0) and β = (0, k, 0) are orthogonal then k is
(a) 0  (b) 1  (c) −1  (d) for all values of k
Section-B
[Objective Questions]

392

Inner Product Space

1. Find the value(s) of k ∈ R such that ⟨α, β⟩ = x1x2 − k(x2y1 + x1y2) + y1y2, where
α = (x1, y1) and β = (x2, y2), is an inner product on R².
[Gate04]
2. Let A be a 2 × 2 orthogonal matrix of trace √2 and determinant 1. Show that the angle
between Au and u = [1, 0]ᵀ is 45°.
[Gate02]
3. Let T : R⁴ → R be a linear functional defined by T(x1, x2, x3, x4) = x2. Find the
unique vector α ∈ R⁴ such that T(β) = ⟨β, α⟩ for all β ∈ R⁴.
[Gate04]
4. In Euclidean 2-space give an orthonormal basis of which one vector is in the direction
of the vector (1, 2).
[BH98]
Section-C
[Long Answer Questions]
1. Let V be an inner product space over R and T : V → V be a LT such that ⟨Tu, v⟩ =
⟨u, Tv⟩ and ⟨Tu, u⟩ ≥ 0, for all u, v ∈ V. Prove that
|⟨Tu, v⟩|² ≤ ⟨Tu, u⟩⟨Tv, v⟩, u, v ∈ V.
[Gate99]
2. Consider the inner product ⟨α, β⟩ = αᵀAβ on R³, where A = [2 1 1; 1 1 0; 1 0 3]. Find an
orthonormal basis B of S = {(x1, x2, x3) : x1 + x2 + x3 = 0} and then extend it to an
orthonormal basis C of R³.
3. Prove that the set of vectors {(2, 3, −1), (1, −2, −4), (2, −1, 1)} is an orthogonal basis
of R³ with the usual inner product and express the vector (4, 3, 2) as a linear combination
of these basis vectors.
[BH04]
4. If u, v are two vectors of an inner product space V, then show that
‖u + v‖² + ‖u − v‖² = 2‖u‖² + 2‖v‖².
[BH05]
5. Find an orthonormal basis of the subspace of R⁴ spanned by
(2, 1, 0, 1), (6, 1, 4, 5) and (4, 1, 3, 4).
6. Let β1 = (1, 1, 1, 1), β2 = (0, 1, 1, 1), β3 = (0, 0, 1, 1) and β4 = (0, 0, 0, 1) in R⁴. Starting
from {β1, β2, β3, β4} obtain an orthonormal basis of R⁴. If you use {β4, β3, β2, β1},
what is the orthonormal basis obtained?
7. (a) Use the Gram-Schmidt process to obtain an orthogonal basis from the basis set
{(1, 2, 2), (2, 0, 1), (1, 1, 0)} of the Euclidean space R³ with standard inner product.
[VH05, 98]
(b) Use the Gram-Schmidt process to obtain an orthogonal basis of the subspace of the
Euclidean space R⁴ generated by the set {(1, 1, 0, 1), (1, 2, 0, 0), (1, 0, 1, 2)}.
[BU(M.Sc.)98]
(c) Use the Gram-Schmidt process to obtain an orthogonal basis of R³ containing the
vector (1/√2, 1/√2, 0), with the standard inner product.
[VH03]
(d) Use the Gram-Schmidt process to obtain an orthogonal basis from the basis set
{(1, 1, 1), (0, 1, 2), (2, 1, 1)} of the Euclidean space R³ with standard inner product.
[VH01]
8. Find an orthogonal basis for the subspace W of C³ spanned by
α1 = (1, i, 0) and α2 = (1, 2, 1 − i).


9. Extend {(2, 3, −1), (1, −2, −4)} to an orthogonal basis of R³ and then find the orthonormal basis.
[VH02]
10. What is the orthogonal complement of the subspace of the even polynomials in Pn(R)
with respect to the inner product ⟨p, q⟩ = ∫₋₁¹ p(t)q(t) dt?
11. Obtain an inner product on R² such that ‖(2, 3)ᵀ‖ < ‖(1, 1)ᵀ‖.
12. (a) Show that if αᵀ = (1/2, 1/2, 1/2, 1/2) and βᵀ = (1/2, −1/2, 1/2, −1/2) then
A = I − ααᵀ − ββᵀ is an orthogonal projector.
(b) Obtain the orthogonal projector (with respect to the Euclidean inner product)
onto the column space of A = [3 2 1; 1 3 2; 2 1 3].
13. (a) Obtain an orthogonal matrix with (1/2, 1/2, 1/2, 1/2)ᵀ as the first column.
(b) Obtain an orthogonal matrix of order 3 whose first row is proportional to the integer
vector (1, 2, 1).
[VH 01, 05]

Answer

Section-A [Multiple Choice Questions]
1. d  2. b  3. a  4. b  5. a  6. a  7. a  8. b  9. a  10. c
11. a  12. b  13. c  14. b  15. c  16. a  17. b  18. c  19. d  20. b
21. d  22. b  23. a  24. b  25. b  26. b  27. c  28. a  29. d  30. b


Chapter 7

Matrix Eigenfunctions
One of the most important topics in linear algebra is the determination of eigenvalues and eigenvectors. For a square matrix, the eigenfunctions (eigenvalues and eigenvectors) play a significant
role in Applied Mathematics, Applied Physics, Economics, Astronomy, Engineering and Statistics. The analysis of electrical circuits, small oscillations, frequency analysis in
digital systems, etc., can be done with the help of eigenvalues and eigenvectors. These are
useful in the study of canonical forms of a matrix under similarity transformations and in
the study of quadratic forms, especially the extrema of a quadratic form.

7.1 Matrix Polynomial

If the elements of a matrix A are polynomials in x of degree at most n, then

A = xⁿA0 + xⁿ⁻¹A1 + ⋯ + xA_{n−1} + A_n,    (7.1)

where the A_i are square matrices of the same order as that of A. Such a polynomial (7.1) is
called a matrix polynomial of degree n, provided the leading coefficient A0 ≠ 0. The symbol
x is called an indeterminate.
For example,

    [ 1+x     x²−1      1    ]        [0 1 0]       [1 0 0]     [1 −1 1]
    [ 2       x²+x+2    2    ]  = x²  [0 1 0]  + x  [0 1 0]  +  [2  2 2]
    [ x²+3    x         x²+5 ]        [1 0 1]       [0 1 0]     [3  0 5]

    = A0x² + A1x + A2,

where the coefficients A0, A1, A2 are real matrices of order 3×3, so A is a matrix polynomial
of degree 2. We say that the matrix polynomial is r-rowed if the order of each of the matrix
coefficients A_i; i = 1, 2, …, n is r. Two matrix polynomials are said to be equal if and only
if the coefficients of the like powers of the indeterminate x are the same.

7.1.1 Polynomials of Matrices

Let us consider a polynomial f(x) = c0xⁿ + c1xⁿ⁻¹ + ⋯ + c_{n−1}x + c_n over a field F. If
A is a given square matrix, then we define

f(A) = c0Aⁿ + c1Aⁿ⁻¹ + ⋯ + c_{n−1}A + c_nI,    (7.2)

where I is the unit matrix of the same order as A; f(A) is called a polynomial in the matrix A.
A polynomial f(x) is said to annihilate A if f(A) = 0, the zero matrix. Let f and g be
polynomials. For any square matrix A and scalar k,
(i) (f + g)(A) = f(A) + g(A)
(ii) (fg)(A) = f(A)g(A)
(iii) (kf)(A) = kf(A)
(iv) f(A)g(A) = g(A)f(A).
Property (iv) tells us that any two polynomials in A commute.

7.1.2 Matrices and Linear Operators

Let T : V → V be a linear operator on a vector space V. Powers of T are defined by the
composition operation, i.e.
T² = T∘T, T³ = T²∘T, ….
Also, for a polynomial f(x) = a_nxⁿ + ⋯ + a1x + a0, we define f(T), just as for matrices, by
f(T) = a_nTⁿ + ⋯ + a1T + a0I,
where I is the identity mapping. We say that T is a zero or root of f(x) if f(T) = 0, the zero
mapping. Suppose A is a matrix representation of a linear operator T. Then f(A) is the
matrix representation of f(T), and, in particular, f(T) = 0 if and only if f(A) = 0.


Ex 7.1.1 Let A = [1 −2; 4 5]. Find f(A) and g(A), where f(x) = x² − 3x + 7 and
g(x) = x² − 6x + 13.
Solution: Given that A = [1 −2; 4 5], therefore,
A² = [1 −2; 4 5][1 −2; 4 5] = [−7 −12; 24 17].
For f(x) = x² − 3x + 7 the value of f(A) = A² − 3A + 7I becomes
f(A) = [−7 −12; 24 17] − 3[1 −2; 4 5] + 7[1 0; 0 1] = [−3 −6; 12 9].
For g(x) = x² − 6x + 13 the value of g(A) = A² − 6A + 13I becomes
g(A) = [−7 −12; 24 17] − 6[1 −2; 4 5] + 13[1 0; 0 1] = [0 0; 0 0].
In the second case, A is a root of g(x).
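Evaluating a polynomial at a matrix can be sketched in code; Horner's rule avoids forming high powers explicitly. The following is an illustrative sketch (not from the text), using numpy:

```python
import numpy as np

A = np.array([[1, -2], [4, 5]])

def poly_at(coeffs, M):
    # Evaluate c0*M^n + c1*M^(n-1) + ... + cn*I by Horner's rule.
    result = np.zeros_like(M)
    for c in coeffs:
        result = result @ M + c * np.eye(len(M), dtype=int)
    return result

fA = poly_at([1, -3, 7], A)    # f(x) = x^2 - 3x + 7
gA = poly_at([1, -6, 13], A)   # g(x) = x^2 - 6x + 13, annihilates A
```

Here `gA` comes out as the zero matrix, confirming that g(x) annihilates A.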

7.2 Characteristic Polynomial

Characteristic polynomial of a matrix

If A = [aij]n×n is a given square matrix of order n over the field F, then the ordinary
polynomial in λ of the nth degree with scalar coefficients

             | a11−λ   a12    ⋯   a1n   |
Δ_A(λ) = |A − λI| = | a21     a22−λ  ⋯   a2n   |    (7.3)
             |  ⋮       ⋮          ⋮    |
             | an1     an2    ⋯   ann−λ |

where I is the unit matrix of order n, is defined as the characteristic polynomial of A. The
equation
Δ_A(λ) = |A − λI| = 0    (7.4)
is defined as the characteristic equation of the matrix A. The degree of the characteristic
equation is the same as the order of the square matrix A. Let us write it as
Δ_A(λ) = C0λⁿ + C1λⁿ⁻¹ + C2λⁿ⁻² + ⋯ + C_{n−1}λ + C_n = 0,
where the coefficients C_i are functions of the elements of A. It can be shown that
C0 = (−1)ⁿ; C1 = (−1)ⁿ⁻¹ (a11 + a22 + ⋯ + ann);
C2 = (−1)ⁿ⁻² × (sum of the principal minors of order 2),
and so on. Lastly, C_n = det A. Let A = [a11 a12; a21 a22] be a matrix of order 2; then the
characteristic polynomial of A is
Δ_A(λ) = |A − λI| = |a11−λ  a12; a21  a22−λ| = (a11 − λ)(a22 − λ) − a12a21
= λ² − (a11 + a22)λ + (a11a22 − a12a21)
= λ² − tr(A)λ + |A|,
where tr(A) denotes the trace of A, i.e., the sum of the diagonal elements of A. Similarly,
for a matrix of order 3, the characteristic polynomial is
Δ_A(λ) = |A − λI| = −λ³ + tr(A)λ² − (A11 + A22 + A33)λ + |A|,
where A11, A22, A33 are the cofactors of a11, a22, a33 respectively. In general, if A is a
square matrix of order n, then the characteristic polynomial is
Δ_A(λ) = |A − λI| = (−1)ⁿ[λⁿ − S1λⁿ⁻¹ + S2λⁿ⁻² − ⋯ + (−1)ⁿS_n],
where S_k is the sum of the principal minors of A of order k.

Ex 7.2.1 Find the characteristic polynomial of A = [1 3; 4 5].
Solution: The characteristic polynomial of A is given by
Δ_A(λ) = |A − λI| = |1−λ  3; 4  5−λ|
= (λ − 1)(λ − 5) − 12 = λ² − 6λ − 7.
Also, tr(A) = 1 + 5 = 6 and |A| = −7, so Δ_A(λ) = λ² − 6λ − 7.
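The trace/determinant shortcut for a 2×2 matrix can be checked against a numerical characteristic polynomial. A small sketch using numpy (not part of the text):

```python
import numpy as np

A = np.array([[1, 3], [4, 5]])

# For a 2x2 matrix, Delta_A(l) = l^2 - tr(A) l + |A|
# (written here as coefficients of a monic polynomial).
tr = np.trace(A)
det = round(np.linalg.det(A))
coeffs = [1, -tr, det]
```

Here `coeffs` equals [1, −6, −7], i.e. λ² − 6λ − 7, the same result as above.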

7.2.1 Eigen Value

If the matrix A is of order n, then the characteristic equation of A is an nth degree equation
in λ. The roots of
Δ_A(λ) = |A − λI| = 0    (7.5)
are defined as characteristic roots or latent roots or eigen values of the square matrix A.
The spectrum of A is the set of distinct characteristic roots of A. Thus, if A = [aij]n×n,
then the eigen values of the matrix A are obtained from the characteristic equation
(−1)ⁿ|A − λI| = λⁿ + p1λⁿ⁻¹ + p2λⁿ⁻² + ⋯ + p_n = 0,
where p1, p2, …, p_n can be expressed in terms of the elements aij of the matrix A = [aij]n×n
over F. Clearly (−1)ⁿΔ_A(λ) is a monic (i.e., the leading coefficient is 1) polynomial of degree n,
since the highest power of λ occurs in the term Π(i=1 to n)(aii − λ) in Δ_A(λ). So by the fundamental
theorem of algebra Δ_A(λ) has exactly n (not necessarily distinct) roots. We usually denote
the characteristic roots of A by λ1, λ2, …, λn, so that
Δ_A(λ) = |A − λI| = (λ1 − λ)(λ2 − λ)⋯(λn − λ).
Ex 7.2.2 The characteristic roots of a 3×3 matrix are known to be in arithmetic progression.
Determine them, given tr(A) = 15 and |A| = 80.
Solution: Let the characteristic roots of the matrix A be a − d, a, a + d. Then
(a − d) + a + (a + d) = 15 ⟹ a = 5.
Also, (a − d)a(a + d) = 80 ⟹ (a² − d²)a = 80 ⟹ d = ±3.
Therefore, the characteristic roots of the matrix are 2, 5, 8.

7.2.2 Eigen Vector

Let us consider the matrix equation

AX = λX, i.e., (A − λI)X = 0,    (7.6)

where A = [aij]n×n is a given n×n matrix, X = [x1, x2, …, xn]ᵀ is an n×1 non-null
column matrix and λ is a scalar:

    [ a11−λ   a12    ⋯   a1n   ] [x1]   [0]
    [ a21     a22−λ  ⋯   a2n   ] [x2] = [0]
    [  ⋮       ⋮          ⋮    ] [⋮ ]   [⋮]
    [ an1     an2    ⋯   ann−λ ] [xn]   [0]

Thus corresponding to each eigen value λi of the matrix A, there is a non-null solution of
(A − λiI)X = 0. If X = Xi is the corresponding non-null solution, then the column vector
Xi is defined as an eigen or invariant or characteristic or latent vector or pole. Determination
of the scalar λ and the non-null vector X, satisfying AX = λX, is known as the eigen value
problem.

Ex 7.2.3 Determine the eigen values and eigen vectors of the matrix [2 2 1; 1 3 1; 1 2 2].

Solution: The characteristic equation of the given matrix A is |A − λI| = 0, i.e.,
|2−λ  2  1; 1  3−λ  1; 1  2  2−λ| = 0
or, (2 − λ)(λ² − 5λ + 4) + 3(λ − 1) = 0
or, (λ − 1)²(λ − 5) = 0 ⟹ λ = 1, 1, 5.
Thus the eigen values of the given matrix are 1, 1, 5 and 1 is a 2-fold eigen value of the
matrix A. The spectrum of A is {1, 5}. Corresponding to λ = 1, consider the equation
(A − I)X = 0, where A is the given matrix and X = [x1, x2, x3]ᵀ. The coefficient matrix is
given by
[1 2 1; 1 2 1; 1 2 1] → [1 2 1; 0 0 0; 0 0 0].
The system of equations is equivalent to x1 + 2x2 + x3 = 0. We see that [1, 0, −1]ᵀ is one
of the non-null column solutions, which is an eigen vector corresponding to the eigen value
λ = 1. For λ = 5, the coefficient matrix is given by
[−3 2 1; 1 −2 1; 1 2 −3] → [0 −4 4; 1 −2 1; 0 4 −4] → [0 −1 1; 1 −2 1; 0 0 0].
The system of equations is equivalent to x2 − x3 = 0, x1 − 2x2 + x3 = 0, so that x3 = 1 gives
x1 = x2 = 1. Hence [1, 1, 1]ᵀ is an eigen vector corresponding to the eigen value λ = 5.
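The eigen values and eigen vectors found by hand can be verified numerically. An illustrative sketch using numpy (not part of the text):

```python
import numpy as np

# Matrix from Ex 7.2.3; its spectrum should be {1, 5} with 1 a double root.
A = np.array([[2, 2, 1], [1, 3, 1], [1, 2, 2]])
vals, vecs = np.linalg.eig(A)
spectrum = np.sort(vals.real)

# Verify the eigen vectors found by hand: (1, 0, -1)^T for 1 and (1, 1, 1)^T for 5.
ok1 = np.allclose(A @ np.array([1, 0, -1]), np.array([1, 0, -1]))
ok5 = np.allclose(A @ np.array([1, 1, 1]), 5 * np.array([1, 1, 1]))
```

`spectrum` comes out as (approximately) 1, 1, 5, matching the hand computation.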
Cayley-Hamilton theorem
Theorem 7.2.1 Every square matrix satisfies its characteristic equation.
Proof: Let A be a matrix of order n and I the unit matrix of the same order. Its
characteristic equation is |A − λI| = 0, i.e.,
λⁿ + a1λⁿ⁻¹ + a2λⁿ⁻² + ⋯ + a_n = 0; a_i scalars,
where |A − λI| = (−1)ⁿ{λⁿ + a1λⁿ⁻¹ + a2λⁿ⁻² + ⋯ + a_n}.
We are to show that Aⁿ + a1Aⁿ⁻¹ + a2Aⁿ⁻² + ⋯ + a_{n−1}A + a_nI = 0. Now, the cofactors of the
elements of the matrix A − λI are polynomials in λ of degree at most (n − 1), so adj(A − λI)
is a matrix polynomial in λ of degree (n − 1), say
adj(A − λI) = λⁿ⁻¹A1 + λⁿ⁻²A2 + ⋯ + λA_{n−1} + A_n,
where the A_i are suitable matrices of order n, each collecting the terms with the same power
of λ. Now, using the relation (A − λI) adj(A − λI) = |A − λI| I, we get
(A − λI)[λⁿ⁻¹A1 + λⁿ⁻²A2 + ⋯ + λA_{n−1} + A_n]
= (−1)ⁿ{λⁿ + a1λⁿ⁻¹ + a2λⁿ⁻² + ⋯ + a_n}I.
This relation is true for all values of λ, so, equating coefficients of the like powers λⁿ, λⁿ⁻¹, …
of λ from both sides, we get
−A1 = (−1)ⁿI
AA1 − A2 = (−1)ⁿa1I
AA2 − A3 = (−1)ⁿa2I
⋮
AA_n = (−1)ⁿa_nI.
Multiplying these relations successively by Aⁿ, Aⁿ⁻¹, …, A, I respectively and adding, the
left hand side telescopes to the zero matrix, so we get
0 = (−1)ⁿ[Aⁿ + a1Aⁿ⁻¹ + a2Aⁿ⁻² + ⋯ + a_{n−1}A + a_nI]
⟹ Aⁿ + a1Aⁿ⁻¹ + a2Aⁿ⁻² + ⋯ + a_{n−1}A + a_nI = 0.    (7.7)
Thus every matrix A is a root of its characteristic polynomial.
Deduction 7.2.1 Now, if |A| ≠ 0, then A⁻¹ exists. In this case, multiplying (7.7) by A⁻¹,
we get
Aⁿ⁻¹ + a1Aⁿ⁻² + a2Aⁿ⁻³ + ⋯ + a_{n−1}I + a_nA⁻¹ = 0
⟹ A⁻¹ = −(1/a_n)Aⁿ⁻¹ − (a1/a_n)Aⁿ⁻² − (a2/a_n)Aⁿ⁻³ − ⋯ − (a_{n−1}/a_n)I.
Therefore, Cayley-Hamilton theorem can be applied to find the inverse of a matrix.
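The deduction above can be sketched in code. The routine below builds A⁻¹ from the characteristic coefficients (an illustrative sketch, assuming numpy's `np.poly` to obtain the monic characteristic coefficients):

```python
import numpy as np

def inverse_by_cayley_hamilton(A):
    # Monic char poly l^n + a1 l^(n-1) + ... + an gives
    # A^(-1) = -(A^(n-1) + a1 A^(n-2) + ... + a_(n-1) I) / an.
    n = len(A)
    coeffs = np.poly(A)            # [1, a1, ..., an], an != 0 assumed
    an = coeffs[-1]
    acc = np.zeros((n, n))
    for a in coeffs[:-1]:          # Horner on 1, a1, ..., a_(n-1)
        acc = acc @ A + a * np.eye(n)
    return -acc / an

A = np.array([[1.0, -1.0], [1.0, 3.0]])
Ainv = inverse_by_cayley_hamilton(A)
```

For this A (char poly λ² − 4λ + 4), the routine returns (1/4)[3 1; −1 1], as derived in Ex 7.2.5 below.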
Result 7.2.1 Suppose A = [aij] is a triangular matrix. Then A − λI is a triangular matrix
with diagonal entries aii − λ, and hence the characteristic polynomial is
|A − λI| = (a11 − λ)(a22 − λ)⋯(ann − λ).
Result 7.2.2 Suppose the characteristic polynomial of an n-square matrix A is a product
of n distinct linear factors; then A is similar to the diagonal matrix D = diag(λ1, λ2, …, λn),
where λ1, λ2, …, λn are the eigen values of A.
Ex 7.2.4 What are the possible eigen values of a square matrix A (over the field R) satisfying A³ = A?
Solution: Let λ be an eigen value of A with eigen vector X ≠ θ. Then A³X = λ³X, and
since A³ = A we also have A³X = AX = λX. Hence
λ³ = λ ⟹ λ(λ² − 1) = 0 ⟹ λ = −1, 0, 1.


Ex 7.2.5 Verify Cayley-Hamilton theorem for A = [1 −1; 1 3] and hence find A⁻¹ and A⁶.
Solution: The characteristic equation of the given matrix A is
|A − λI| = |1−λ  −1; 1  3−λ| = 0 ⟹ λ² − 4λ + 4 = 0.
By Cayley-Hamilton theorem, we have A² − 4A + 4I = 0. Now,
A² − 4A + 4I = [0 −4; 4 8] − [4 −4; 4 12] + [4 0; 0 4] = [0 0; 0 0],
so the Cayley-Hamilton theorem is verified. Therefore,
A⁻¹ = (1/4)(4I − A) = (1/4)([4 0; 0 4] − [1 −1; 1 3]) = (1/4)[3 1; −1 1].
Now dividing λ⁶ by λ² − 4λ + 4, we get
λ⁶ = (λ² − 4λ + 4)(λ⁴ + 4λ³ + 12λ² + 32λ + 80) + 192λ − 320
⟹ A⁶ = 192A − 320I = [−128 −192; 192 256].
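The division trick for high powers can be sketched with numpy's polynomial division (illustrative, not from the text):

```python
import numpy as np

# Reduce l^6 modulo the characteristic polynomial l^2 - 4l + 4 of A;
# the remainder r(l) = 192 l - 320 then gives A^6 = r(A).
char_poly = np.array([1.0, -4.0, 4.0])
_, rem = np.polydiv(np.array([1.0, 0, 0, 0, 0, 0, 0]), char_poly)

A = np.array([[1, -1], [1, 3]])
A6 = rem[0] * A + rem[1] * np.eye(2)
```

The remainder reproduces A⁶ = 192A − 320I without ever forming A³, A⁴, A⁵.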


Ex 7.2.6 If A = [3 1; 1 2], express 2A⁵ − 3A⁴ + A² − 5I as a linear polynomial in A.
Solution: The characteristic equation of the given matrix A is |A − λI| = 0, i.e.,
|3−λ  1; 1  2−λ| = 0 ⟹ λ² − 5λ + 5 = 0.
By Cayley-Hamilton theorem, we have A² − 5A + 5I = 0. Now dividing 2λ⁵ − 3λ⁴ + λ² − 5 by
λ² − 5λ + 5, we get
2λ⁵ − 3λ⁴ + λ² − 5 = (λ² − 5λ + 5)(2λ³ + 7λ² + 25λ + 91) + 330λ − 460
⟹ 2A⁵ − 3A⁴ + A² − 5I = 330A − 460I.

Ex 7.2.7 Verify Cayley-Hamilton theorem for A = [0 0 1; 3 1 0; −2 1 4] and hence find A⁻¹.
Solution: The characteristic equation of the given matrix A is
|A − λI| = |−λ  0  1; 3  1−λ  0; −2  1  4−λ| = 0
or, λ³ − 5λ² + 6λ − 5 = 0.
By Cayley-Hamilton theorem, we have A³ − 5A² + 6A − 5I = 0. For verification, we have
A² = [0 0 1; 3 1 0; −2 1 4][0 0 1; 3 1 0; −2 1 4] = [−2 1 4; 3 1 3; −5 5 14]
A³ = A²A = [−2 1 4; 3 1 3; −5 5 14][0 0 1; 3 1 0; −2 1 4] = [−5 5 14; −3 4 15; −13 19 51].
Therefore, the expression A³ − 5A² + 6A − 5I becomes
[−5 5 14; −3 4 15; −13 19 51] − 5[−2 1 4; 3 1 3; −5 5 14] + 6[0 0 1; 3 1 0; −2 1 4] − 5[1 0 0; 0 1 0; 0 0 1] = 0.
Thus, Cayley-Hamilton theorem is verified. To find A⁻¹, we get
A³ − 5A² + 6A − 5I = 0
⟹ A⁻¹ = (1/5)(A² − 5A + 6I) = (1/5)[4 1 −1; −12 2 3; 5 0 0].

Ex 7.2.8 Verify Cayley-Hamilton theorem for A = [1 0 0; 1 0 1; 0 1 0] and hence find A⁻¹ and A⁵⁰.
Solution: The characteristic equation of the given matrix A is
|A − λI| = |1−λ  0  0; 1  −λ  1; 0  1  −λ| = 0
or, λ³ − λ² − λ + 1 = 0.
By Cayley-Hamilton theorem, we have A³ − A² − A + I = 0. For verification, using
A² = [1 0 0; 1 1 0; 1 0 1] and A³ = [1 0 0; 2 0 1; 1 1 0],
the expression A³ − A² − A + I becomes
[1 0 0; 2 0 1; 1 1 0] − [1 0 0; 1 1 0; 1 0 1] − [1 0 0; 1 0 1; 0 1 0] + [1 0 0; 0 1 0; 0 0 1] = 0.
Hence the Cayley-Hamilton theorem is verified. Now, using the relation A³ − A² − A + I = 0,
we have A⁻¹ = −A² + A + I, i.e.,
A⁻¹ = −[1 0 0; 1 1 0; 1 0 1] + [1 0 0; 1 0 1; 0 1 0] + [1 0 0; 0 1 0; 0 0 1] = [1 0 0; 0 0 1; −1 1 0].
From the relation A³ − A² − A + I = 0, we see that
A³ = A² + A − I ⟹ A³ − A = A² − I,
and since A(A² − I) = A³ − A = A² − I, we get Aⁿ − Aⁿ⁻² = Aⁿ⁻³(A³ − A) = A² − I.
Thus for every integer n ≥ 3 we have Aⁿ = Aⁿ⁻² + A² − I. Using this recurrence
relation, A⁴ = A² + A² − I, A⁶ = A⁴ + A² − I, i.e.,
A⁴ = [1 0 0; 2 1 0; 2 0 1]; A⁶ = [1 0 0; 3 1 0; 3 0 1].
In general, for even powers A^(2k) = I + k(A² − I), so that
A⁵⁰ = I + 25(A² − I) = [1 0 0; 25 1 0; 25 0 1].
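The closed form for even powers can be checked directly against repeated multiplication. A sketch using numpy (not part of the text):

```python
import numpy as np

A = np.array([[1, 0, 0], [1, 0, 1], [0, 1, 0]])
I = np.eye(3, dtype=int)

# From A^3 - A^2 - A + I = 0 one gets A^n = A^(n-2) + A^2 - I for n >= 3,
# hence for even powers A^(2k) = I + k (A^2 - I).
A50 = I + 25 * (A @ A - I)
```

`A50` agrees entry by entry with `np.linalg.matrix_power(A, 50)`.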

Ex 7.2.9 A matrix A has eigen values 1 and 4 with corresponding eigen vectors (1, 1)ᵀ
and (2, −1)ᵀ respectively. Find the matrix A.
[Gate97]
Solution: Let the required matrix be A = [a11 a12; a21 a22]. Then from the equation AX = λX,
we have
A(1, 1)ᵀ = 1 · (1, 1)ᵀ ⟹ a11 + a12 = 1, a21 + a22 = 1;
A(2, −1)ᵀ = 4 · (2, −1)ᵀ ⟹ 2a11 − a12 = 8, 2a21 − a22 = −4.
Solving these equations, we get a11 = 3, a12 = −2, a21 = −1 and a22 = 2. Therefore, the matrix
is A = [3 −2; −1 2].
Theorem 7.2.2 If the eigen values of A are distinct, then the eigen vectors are linearly
independent.
Proof: Let x_k be the eigen vector of an n×n square matrix A corresponding to the
eigen value λ_k, where the λ_k; k = 1, 2, …, n are distinct. Let x_k = [x_k1, x_k2, …, x_kn]ᵀ for
k = 1, 2, …, n. Thus we have Ax_k = λ_k x_k; k = 1, 2, …, n. Therefore,
A²x_k = A(Ax_k) = λ_k(Ax_k) = λ_k(λ_k x_k) = λ_k² x_k.
By the principle of mathematical induction, we conclude Aᵖx_k = λ_kᵖ x_k, for any positive
integer p. Let us consider the relation x = c1x1 + c2x2 + ⋯ + c_n x_n = θ, where the c_i are
scalars. Equating the ith component to 0, we get
c1 x_1i + c2 x_2i + ⋯ + c_n x_ni = 0,
true for i = 1, 2, …, n. Since x = θ, we have Ax = θ. Therefore,
A(c1x1 + c2x2 + ⋯ + c_n x_n) = θ
or, c1(Ax1) + c2(Ax2) + ⋯ + c_n(Ax_n) = θ
or, c1λ1x1 + c2λ2x2 + ⋯ + c_nλ_n x_n = θ.
Equating the ith component to 0, we get
c1λ1 x_1i + c2λ2 x_2i + ⋯ + c_nλ_n x_ni = 0.
Again, equating the ith component of A²x = θ to zero gives
c1λ1² x_1i + c2λ2² x_2i + ⋯ + c_nλ_n² x_ni = 0.
Continuing this process, lastly we get
c1λ1ⁿ⁻¹ x_1i + c2λ2ⁿ⁻¹ x_2i + ⋯ + c_nλ_nⁿ⁻¹ x_ni = 0.
These n equations in the n unknowns c_k x_ki have a non-null solution if and only if

    | 1        1        ⋯   1        |
    | λ1       λ2       ⋯   λn       |
    | λ1²      λ2²      ⋯   λn²      | = 0,
    |  ⋮        ⋮            ⋮       |
    | λ1ⁿ⁻¹    λ2ⁿ⁻¹    ⋯   λnⁿ⁻¹   |

which, being a Vandermonde determinant, is true if and only if some two λ's are equal,
contradicting the hypothesis. Thus the system has no non-null solution. Therefore, we must have
c1 x_1i = c2 x_2i = ⋯ = c_n x_ni = 0; i = 1, 2, …, n.
Hence c1x1 = c2x2 = ⋯ = c_n x_n = θ. But x1, x2, …, x_n are non-null vectors, since they
are eigen vectors; hence c1 = c2 = ⋯ = c_n = 0. This shows that {x1, x2, …, x_n} is linearly
independent.
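The theorem can be illustrated numerically: for a matrix with distinct eigen values, the matrix whose columns are the eigen vectors has full rank. A sketch using numpy (not from the text), reusing the matrix of Ex 7.2.9:

```python
import numpy as np

# A matrix with distinct eigen values 1 and 4 (from Ex 7.2.9).
A = np.array([[3, -2], [-1, 2]])
vals, vecs = np.linalg.eig(A)

# Columns of `vecs` are eigen vectors; distinct eigen values force
# them to be linearly independent, i.e. the matrix has full rank.
rank = np.linalg.matrix_rank(vecs)
```

Here `rank` equals 2, the order of the matrix, confirming independence.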


Properties of eigen values


Property 7.2.1 Two similar matrices have the same eigen values.
Proof: Let A and B be two similar matrices. Hence there exists a non-singular matrix P
such that B = P⁻¹AP. The characteristic polynomial of B is
|B − λI| = |P⁻¹AP − λI| = |P⁻¹AP − P⁻¹(λI)P|
= |P⁻¹(A − λI)P| = |P⁻¹||A − λI||P|
= |A − λI||P⁻¹P| = |A − λI|.
Therefore, A and B have the same characteristic polynomial and hence they have the same
eigen values. But matrices having the same eigen values may not be similar. For
example, the matrices A = [1 0; 0 1] and B = [1 2; 0 1] have the same characteristic polynomial
and hence the same eigen values 1, 1. But there is no non-singular matrix P
such that P⁻¹AP = B (since P⁻¹AP = P⁻¹IP = I ≠ B), so B is not similar to A.
Property 7.2.2 If A and B are two square invertible matrices then AB and BA have the
same eigen values.
Proof: Now, AB can be written in the form
AB = B⁻¹B(AB) = B⁻¹(BA)B.
So AB and BA are similar matrices, and therefore AB and BA have
the same eigen values. Similarly, A⁻¹B and BA⁻¹ have the same eigen values.
Property 7.2.3 The eigen values of a matrix and its transpose are the same.
Proof: Let A be a square matrix; then the eigen values of A are the roots of the equation
|A − λI| = 0. Now,
|A − λI| = |(A − λI)ᵀ| = |Aᵀ − λIᵀ| = |Aᵀ − λI|, as Iᵀ = I.
Since A and Aᵀ have the same characteristic polynomial, A and Aᵀ have the same eigen values.
Property 7.2.4 If λ1, λ2, …, λn are the eigen values of A, then kλ1, kλ2, …, kλn are the
eigen values of kA, k being an arbitrary non-zero scalar.
Proof: Let A be a square matrix of order n and let Xi be an eigen vector of the matrix A
corresponding to the eigen value λi. Then AXi = λiXi; i = 1, 2, …, n. Therefore,
k(AXi) = k(λiXi); i = 1, 2, …, n and k ≠ 0
⟹ (kA)Xi = (kλi)Xi; i = 1, 2, …, n,
showing that kλi is an eigen value of kA for i = 1, 2, …, n. Moreover the corresponding
eigen vectors of A and kA are the same. Note that there may be more than one eigen vector of A
corresponding to the same eigen value of A. On the other hand, if λ1 and λ2 are eigen values
for the same eigen vector X of A, then
AX = λ1X and AX = λ2X ⟹ (λ1 − λ2)X = θ.
Since the eigen vector X ≠ θ, we get λ1 = λ2. Thus an eigen vector X of a matrix A cannot
correspond to more than one eigen value of A.


Property 7.2.5 The product of the eigen values of A is |A|.
Proof: Let λ1, λ2, …, λn be the eigen values of A; then
|A − λI| = (−1)ⁿ(λ − λ1)(λ − λ2)⋯(λ − λn).
Putting λ = 0,
|A| = (−1)ⁿ(−1)ⁿλ1λ2⋯λn = λ1λ2⋯λn.
This shows that the product of the eigen values of A is |A|. Thus, if A is non-singular, then
|A| ≠ 0; therefore none of the eigen values of a non-singular matrix is 0.
Property 7.2.6 If A is a square matrix, then the sum of the characteristic roots of A is
equal to the trace of A.
Proof: Let A = [aij]n×n be a square matrix of order n. Then, by definition,
tr(A) = a11 + a22 + ⋯ + ann.
Let λ1, λ2, …, λn be the eigen values of A; then
|A − λI| = (−1)ⁿ(λ − λ1)(λ − λ2)⋯(λ − λn).
The coefficient of λⁿ⁻¹ on the right hand side of the above relation is
(−1)ⁿ⁺¹(λ1 + λ2 + ⋯ + λn), while the coefficient of λⁿ⁻¹ in |A − λI| is
(−1)ⁿ⁻¹(a11 + a22 + ⋯ + ann). Therefore,
(−1)ⁿ⁺¹(λ1 + ⋯ + λn) = (−1)ⁿ⁻¹(a11 + ⋯ + ann)
⟹ tr(A) = a11 + a22 + ⋯ + ann = λ1 + λ2 + ⋯ + λn.
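Properties 7.2.5 and 7.2.6 are easy to check numerically; the sketch below (illustrative, not from the text) uses the matrix of Ex 7.2.1:

```python
import numpy as np

A = np.array([[1, 3], [4, 5]])
vals = np.linalg.eigvals(A)

# Property 7.2.5: the product of the eigen values equals |A|;
# Property 7.2.6: their sum equals tr(A).
prod_ok = np.isclose(np.prod(vals), np.linalg.det(A))
sum_ok = np.isclose(np.sum(vals), np.trace(A))
```

For this matrix the eigen values are 7 and −1, with product −7 = |A| and sum 6 = tr(A).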

Property 7.2.7 If λ1, λ2, …, λn are the eigen values of a non-singular matrix A of order n,
then
(i) 1/λ1, 1/λ2, …, 1/λn are the eigen values of A⁻¹;
(ii) λ1^m, λ2^m, …, λn^m are the eigen values of A^m, m a positive integer.
Proof: Let A be an n×n non-singular matrix, so that |A| ≠ 0 (hence every λr ≠ 0). Let the
eigen vector of the non-singular matrix A corresponding to the eigen value λr be Xr. Then
AXr = λrXr ⟹ A⁻¹(AXr) = A⁻¹(λrXr)
⟹ Xr = λr(A⁻¹Xr) ⟹ A⁻¹Xr = (1/λr)Xr.
This shows that 1/λr is an eigen value of A⁻¹, r = 1, 2, …, n, with the same corresponding
eigen vector. Now,
A²Xr = A(AXr) = A(λrXr) = λr(AXr) = λr(λrXr) = λr²Xr.
Let it be true for m = k, i.e., AᵏXr = λrᵏXr; then
A(AᵏXr) = Aᵏ⁺¹Xr = A(λrᵏXr) = λrᵏ(AXr) = λrᵏ(λrXr) = λrᵏ⁺¹Xr.
The result is true for m = k + 1 if it is true for m = k. By the principle of mathematical
induction we conclude that A^mXr = λr^mXr, where m is a positive integer. This shows
that λr^m is an eigen value of A^m. Combining, we conclude that λ1^m, λ2^m, …, λn^m are
the eigen values of A^m, m a positive integer.
Property 7.2.8 If λ is an eigen value of a non-singular matrix A, then |A|/λ is an eigen
value of adj A.
Proof: Since A is non-singular, A⁻¹ exists and is given by A⁻¹ = (1/|A|) adj A. If I is the
unit matrix of order n, then AA⁻¹ = A⁻¹A = I and the characteristic polynomial becomes
|A − λI| = |A − λAA⁻¹| = |A(I − (λ/|A|) adj A)|
= |A| · |I − (λ/|A|) adj A|
= |A| · (−λ/|A|)ⁿ · |adj A − (|A|/λ)I|.
Since A is non-singular, λ ≠ 0 and |A| ≠ 0; therefore the characteristic equation becomes
|A − λI| = 0 ⟹ |adj A − (|A|/λ)I| = 0.
This shows that, if λ is an eigen value of a non-singular matrix A, then |A|/λ is an eigen value
of adj A.
Property 7.2.9 If λ is an eigen value of an orthogonal matrix A then so is 1/λ.
Proof: Let A be an orthogonal matrix so that AAᵀ = I. Now, since λ is a characteristic
root of A, |A − λI| = 0. Therefore,
|A − λAAᵀ| = 0 ⟹ |A(I − λAᵀ)| = 0
⟹ |A| · |I − λAᵀ| = 0 ⟹ |A| · (−λ)ⁿ · |Aᵀ − (1/λ)I| = 0.
Now λ ≠ 0, since A is non-singular, and also |A| ≠ 0. Therefore,
|Aᵀ − (1/λ)I| = 0,
showing that 1/λ is a characteristic root of Aᵀ. But Aᵀ and A have the same characteristic
roots. Hence 1/λ is also a characteristic root of A.
Property 7.2.10 The eigen values of an idempotent matrix are either 1 or 0.
Proof: A matrix A is idempotent if A² = A. Let λ be an eigen value of A and X its
corresponding eigen vector, so AX = λX. Multiplying this equation by A,
A²X = λAX ⟹ AX = λAX, as A² = A
or, λX = λ(λX), as AX = λX
or, (λ² − λ)X = 0 ⟹ λ² − λ = 0, as X ≠ 0.
Hence λ(λ − 1) = 0, i.e., the eigen values of A are either 1 or 0.
Property 7.2.11 The eigen values of a real symmetric matrix are all real.
Proof: Let A be a real symmetric matrix, so that Aᵀ = A. We assume that some root of
the characteristic equation in λ of A belongs to the complex field C. Let α + iβ; i = √(−1),
be a complex root of the characteristic equation of A; then |A − (α + iβ)I| = 0. Let
B = {A − (α + iβ)I}{A − (α − iβ)I} = (A − αI)² + β²I.
But |B| = |A − (α + iβ)I| · |A − (α − iβ)I|
= 0 · |A − (α − iβ)I| = 0, as |A − (α + iβ)I| = 0.
So there is a non-null real vector X such that BX = 0. Therefore,
XᵀBX = 0 ⟹ 0 = Xᵀ{(A − αI)² + β²I}X
⟹ 0 = Xᵀ(A − αI)²X + β²XᵀX
⟹ 0 = Xᵀ(A − αI)ᵀ(A − αI)X + β²XᵀX, as (A − αI)ᵀ = Aᵀ − αIᵀ = A − αI
⟹ 0 = {(A − αI)X}ᵀ(A − αI)X + β²XᵀX.    (7.8)
Now, (A − αI)X is a real column vector and X is a real non-zero column vector. Therefore,
{(A − αI)X}ᵀ(A − αI)X ≥ 0 and XᵀX > 0.
Thus the relation (7.8) is possible only when β = 0, which shows that all the roots of the
characteristic equation of A are real.
Property 7.2.12 The eigen values of a real skew-symmetric matrix are purely imaginary or
zero.
Proof: As in the previous case, we can show that (λ + λ̄)X̄ᵀX = 0, since for a real skew-symmetric
matrix Aᵀ = −A (here X̄ denotes the componentwise conjugate of X). Since
X̄ᵀX ≠ 0, λ + λ̄ = 0, or λ̄ = −λ.
Let λ = a + ib. Then λ̄ = a − ib. Therefore, from the relation λ̄ = −λ, we have
a − ib = −a − ib, or a = 0. Therefore λ = ib, i.e., λ is purely imaginary or zero.
Property 7.2.13 The eigen vectors corresponding to distinct eigen values of a real symmetric matrix A are mutually orthogonal.
Proof: Let A be a real symmetric matrix, so that Aᵀ = A. The eigen values of
the real symmetric matrix A are then all real. Let X1, X2 be two eigen vectors corresponding to the
eigen values λ1, λ2 (λ1 ≠ λ2). Then
AX1 = λ1X1 and AX2 = λ2X2
⟹ X2ᵀAX1 = λ1(X2ᵀX1) and X1ᵀAX2 = λ2(X1ᵀX2).
Taking the transpose of the first relation and noting that Aᵀ = A, we get
X1ᵀAX2 = λ1(X1ᵀX2) ⟹ λ1(X1ᵀX2) = λ2(X1ᵀX2)
⟹ (λ1 − λ2)X1ᵀX2 = 0.
But λ1 ≠ λ2, so X1ᵀX2 = 0, where X1, X2 are non-null vectors. Hence X1 and X2 are
orthogonal.
Property 7.2.14 The real eigen values of an orthogonal matrix are ±1.
Proof: Let A be an orthogonal matrix and let λi be a real eigen value with Xi as the
corresponding eigen vector. Then
AXi = λiXi ⟹ (AXi)ᵀ = (λiXi)ᵀ
⟹ (AXi)ᵀ(AXi) = λi²XiᵀXi
⟹ Xiᵀ(AᵀA)Xi = λi²XiᵀXi
⟹ XiᵀXi = λi²XiᵀXi, as AᵀA = I.
Since the eigen vector Xi is non-null, XiᵀXi ≠ 0, and so λi² = 1, i.e., λi = ±1. Hence
every real eigen value of an orthogonal matrix has unit modulus.
Property 7.2.15 For a square matrix A, the following statements are equivalent:
(i) A scalar λ is an eigen value of A.
(ii) The matrix A − λI is singular.
(iii) The scalar λ is a root of the characteristic polynomial of A.
Property 7.2.16 The eigen values of a unitary matrix are of unit modulus.
Proof: Let A be a unitary matrix, so that A′A = I, where A′ denotes the transpose
conjugate of A. Let X be the eigen vector of A corresponding to the eigen value λ; then
AX = λX ⟹ (AX)′ = (λX)′; taking transpose conjugate
⟹ X′A′ = λ̄X′ ⟹ X′A′AX = λ̄λX′X
⟹ X′IX = |λ|²X′X, i.e., (1 − |λ|²)X′X = 0.
Since X′X ≠ 0, it follows that |λ|² = 1, i.e., |λ| = 1.
Property 7.2.17 The eigen values of a Hermitian matrix are all real. The eigen vectors corresponding to distinct eigen values are orthogonal.

Proof: Let A be a Hermitian matrix. Let X be an eigen vector of A corresponding to the eigen value λ, so that AX = λX. Taking the Hermitian conjugate of this equation and noting A′ = A, we have X′A = λ̄X′. Thus,

X′(AX) = X′(λX) ⇒ X′AX = λ X′X
⇒ [X′AX]′ = [λ X′X]′; taking transpose conjugate
⇒ X′A′X = λ̄ X′X.

Since A is a Hermitian matrix, A′ = A, and so,

λ X′X = λ̄ X′X ⇒ (λ − λ̄)X′X = 0.

Since X′X ≠ 0, it follows that λ − λ̄ = 0, i.e., λ is real. Similarly, the eigen values of a skew-Hermitian matrix are purely imaginary or zero.

Let X1 and X2 be two eigenvectors of A corresponding to the distinct eigen values λ1 and λ2 respectively, so that AX1 = λ1 X1 and AX2 = λ2 X2. Taking the Hermitian conjugate of the second relation, we have X2′A = λ2 X2′, where we use the fact that λ2 is real. Therefore,

X2′AX1 = λ1 X2′X1 and X2′AX1 = λ2 X2′X1 ⇒ (λ1 − λ2) X2′X1 = 0.

Since λ1 ≠ λ2, we have X2′X1 = 0, showing that the vectors X1 and X2 are orthogonal to each other.

Characteristic Polynomial

Leverrier-Faddeev method to find eigen values
In this method, we are to generate the coefficients of the characteristic polynomial. The characteristic polynomial is

P(λ) = |λI − A| = λⁿ + p1 λⁿ⁻¹ + ⋯ + pn−1 λ + pn,   (7.9)

where λ1, λ2, …, λn is a complete set of roots of the polynomial. The method is based on Newton's formula for the sums of the powers of the roots of an algebraic equation. Let Sk = λ1ᵏ + λ2ᵏ + ⋯ + λnᵏ; k = 1, 2, …, n, so that

S1 = Σᵢ λi = Tr(A); S2 = Σᵢ λi² = Tr(A²); …; Sn = Σᵢ λiⁿ = Tr(Aⁿ).

For k ≤ n, using Newton's formula Sk + p1 Sk−1 + ⋯ + pk−1 S1 = −k pk, we have,

p1 = −S1; p2 = −(1/2)[S2 + p1 S1]; …; pn = −(1/n)[Sn + p1 Sn−1 + ⋯ + pn−1 S1].

Hence the coefficients of the characteristic polynomial p1, p2, …, pn can be easily found when S1, S2, …, Sn are known. Thus the sequence of matrices {Ai} can be found by using the following scheme:

A1 = A;          a1 = Tr A1;          B1 = A1 − a1 I
A2 = AB1;        a2 = (1/2) Tr A2;    B2 = A2 − a2 I
⋮
An = ABn−1;      an = (1/n) Tr An;    Bn = An − an I,

where Bn is a null matrix. Thus the coefficients of the characteristic polynomial are a1 = −p1; a2 = −p2; …; an = −pn. The Leverrier-Faddeev method may also be used to determine all the eigenvectors. Suppose the matrices B1, B2, …, Bn−1 and the eigenvalues λ1, λ2, …, λn are known. Then the eigenvectors x(i) can be determined using the formula

x(i) = λiⁿ⁻¹ e0 + λiⁿ⁻² e1 + λiⁿ⁻³ e2 + ⋯ + en−1,   (7.10)

where e0 is a unit vector and e1, e2, …, en−1 are column vectors of the matrices B1, B2, …, Bn−1 of the same position as e0. Using this method one can also compute the inverse of the matrix A. It is mentioned that Bn = 0; that is, An − an I = 0, or ABn−1 = an I. From this relation one can write Bn−1 = an A⁻¹. This gives,

A⁻¹ = (1/an) Bn−1 = −(1/pn) Bn−1.   (7.11)
Ex 7.2.10 Find the characteristic polynomial of the matrix A =
[1 6 1]
[1 2 0]
[0 0 3].

Solution: Here (i) A1 = A. Now, a1 = Tr(A) = 1 + 2 + 3 = 6. Hence

B1 = A1 − a1 I =
[−5  6  1]
[ 1 −4  0]
[ 0  0 −3].

(ii) A2 = AB1 =
[1 6 1][−5  6  1]   [ 1 −18 −2]
[1 2 0][ 1 −4  0] = [−3  −2  1]
[0 0 3][ 0  0 −3]   [ 0   0 −9].

Hence a2 = (1/2)[1 − 2 − 9] = −5 and B2 = A2 − a2 I =
[ 6 −18 −2]
[−3   3  1]
[ 0   0 −4].

(iii) A3 = AB2 =
[1 6 1][ 6 −18 −2]   [−12   0   0]
[1 2 0][−3   3  1] = [  0 −12   0]
[0 0 3][ 0   0 −4]   [  0   0 −12].

Thus a3 = (1/3) Tr A3 = −12 and B3 = A3 − a3 I is the null matrix. Hence the characteristic equation is λ³ − 6λ² + 5λ + 12 = 0. The eigenvalues of A are −1, 3, 4. To find the eigenvectors, let us take

e0 = [0, 0, 1]^T, e1 = [1, 0, −3]^T, e2 = [−2, 1, −4]^T,

the third columns of I, B1 and B2 respectively. From the formula (7.10), x(i) = λi² e0 + λi e1 + e2, we get the results of calculation in the following table:

λi        λi² e0        λi e1          e2            x(i)
λ1 = −1   [0, 0, 1]     [−1, 0, 3]     [−2, 1, −4]   [−3, 1, 0]
λ2 = 3    [0, 0, 9]     [3, 0, −9]     [−2, 1, −4]   [1, 1, −4]
λ3 = 4    [0, 0, 16]    [4, 0, −12]    [−2, 1, −4]   [2, 1, 0]

7.2.3 Eigen Space

If the eigen values of A are real then the eigen vectors X1 , X2 , , Xn <n . The subspace
generated by the non null vectors is known as eigen or characteristic space of the matrix
A and is denoted by E . If is an eigen value of A, then the algebraic multiplicity of is
defined to be the multiplicity of as a root of the characteristic polynomial of A, while the
geometric multiplicity of is defined to be the dimension of its eigen space, i.e., dimE . The
geometric multiplicity of an eigen value the algebraic multiplicity of the eigen value .
If the geometric multiplicity of an eigen value = the algebraic multiplicity of the eigen
value , then is said to be regular.
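As a quick numerical illustration (our example, not from the text): for A = [[2, 1], [0, 2]] the eigen value 2 has algebraic multiplicity 2, but its eigen space is only one-dimensional, so 2 is not regular. The geometric multiplicity is the nullity of A − λI:

```python
import numpy as np

A = np.array([[2., 1.],
              [0., 2.]])           # characteristic polynomial (l - 2)^2
lam, n = 2.0, A.shape[0]

# geometric multiplicity = dim E_lam = n - rank(A - lam*I)
geom = n - np.linalg.matrix_rank(A - lam * np.eye(n))
print(geom)    # 1, strictly less than the algebraic multiplicity 2
```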
Theorem 7.2.3 The eigen vectors of an n × n matrix A over a field F corresponding to an eigen value λ of A, together with the zero column vector, form a subspace of Vn(F).

Proof: Let Eλ be the set of all eigen vectors of A corresponding to the eigen value λ. Obviously, each vector of Eλ is an n × 1 column vector. Let X1, X2 ∈ Eλ and c1, c2 ∈ F. Then AX1 = λX1 and AX2 = λX2. Now,

A(c1 X1 + c2 X2) = A(c1 X1) + A(c2 X2) = c1 (AX1) + c2 (AX2)
= c1 (λX1) + c2 (λX2) = λ(c1 X1 + c2 X2).

It shows that c1 X1 + c2 X2 ∈ Eλ, if X1, X2 ∈ Eλ. Hence Eλ ∪ {θ}, where θ is the zero column vector in Vn(F), is a subspace of Vn(F). Eλ ∪ {θ} is known as the characteristic subspace corresponding to the eigen value λ, or the eigen space of λ.

Characteristic polynomial of block triangular matrices

Let A be a block triangular matrix, say,

A =
[A1  B]
[O  A2],

where A1 and A2 are square matrices. Then A − λI is also a block triangular matrix, with diagonal blocks A1 − λI and A2 − λI. Thus,

|A − λI| = |A1 − λI| |A2 − λI|.

Thus the characteristic polynomial of A is the product of the characteristic polynomials of the diagonal blocks A1 and A2. In general, let A be a block triangular matrix with diagonal blocks A1, A2, …, Ar; then the characteristic polynomial of A is

|A − λI| = |A1 − λI| |A2 − λI| ⋯ |Ar − λI|.
Ex 7.2.11 Find the characteristic polynomial of the block triangular matrix

A =
[9 −1  5  7]
[8  3  2  4]
[0  0  3  6]
[0  0 −1  8].

Solution: The given block triangular matrix A can be written in the form

A =
[A1  B]
[O  A2],
where A1 =
[9 −1]
[8  3]
and A2 =
[3  6]
[−1 8].

Now, the characteristic polynomials of A1 and A2 are

|A1 − λI| = λ² − 12λ + 35 = (λ − 5)(λ − 7),
|A2 − λI| = λ² − 11λ + 30 = (λ − 5)(λ − 6).

Accordingly, the characteristic polynomial of A is

|A − λI| = (λ − 5)(λ − 7)(λ − 5)(λ − 6) = (λ − 5)²(λ − 6)(λ − 7).
Characteristic polynomial of a linear operator

Let T : V → V be a linear operator on a vector space V(F) of finite dimension. For any polynomial f(t) = c0 + c1 t + ⋯ + cn tⁿ, let us define

f(T) = c0 I + c1 T + ⋯ + cn Tⁿ,

where I is the identity mapping and powers of T are defined by the composition operation. The characteristic polynomial of the linear operator T is defined to be the characteristic polynomial of any matrix representation of T. The Cayley-Hamilton theorem states that a linear operator T is a zero of its characteristic polynomial.

Eigen function: Let T : V → V be a linear operator on a vector space of finite dimension. A scalar λ is called an eigenvalue of T if there exists a non-null vector α such that

T(α) = λα.

Every vector satisfying this relation is called an eigen vector of T corresponding to the eigen value λ. The scalar λ is an eigen value of T if and only if T − λI is singular. The set Eλ of all eigen vectors belonging to an eigen value λ, which is the kernel of T − λI, is a subspace of V, called the eigen space of λ.

Note that if A and B are matrix representations of T, then B = P⁻¹AP, where P is a change of basis matrix. Thus A and B are similar and they have the same characteristic polynomial. Accordingly, the characteristic polynomial of T is independent of the particular basis in which the matrix representation of T is computed.
Ex 7.2.12 For the linear operator T : V → V given by T(x, y) = (3x + 3y, x + 5y), find all eigen values and a basis for each eigen space.

Solution: The matrix A that represents the linear operator T relative to the standard basis of ℝ² is

A = [T] =
[3 3]
[1 5].

The characteristic polynomial of a linear operator is equal to the characteristic polynomial of any matrix A that represents it. Therefore, the characteristic polynomial for the linear operator T is given by

|A − λI| = det
[3−λ   3 ]
[ 1   5−λ]
= λ² − 8λ + 12 = (λ − 2)(λ − 6).

Thus the eigen values of T are 2 and 6. For λ = 2, the system (A − 2I)X = 0 reduces to x + 3y = 0, so (3, −1)^T is a basis of the eigen space E2. For λ = 6, the system reduces to x − y = 0, so (1, 1)^T is a basis of the eigen space E6.
Ex 7.2.13 Let T : ℝ³ → ℝ³ be defined by

T(x, y, z) = (2x + y − 2z, 2x + 3y − 4z, x + y − z).

Find all eigen values of T and find a basis of each eigen space.

Solution: The matrix A that represents the linear operator T relative to the standard basis of ℝ³ is

A = [T] =
[2 1 −2]
[2 3 −4]
[1 1 −1].

The characteristic polynomial of a linear operator is equal to the characteristic polynomial of any matrix A that represents it. The characteristic polynomial for the linear operator T is given by

|A − λI| = det
[2−λ   1   −2 ]
[ 2   3−λ  −4 ]
[ 1    1  −1−λ]
= −(λ³ − 4λ² + 5λ − 2) = −(λ − 1)²(λ − 2).

Thus the eigen values of A are 1 and 2. Now we find linearly independent eigenvectors for each eigenvalue of A. Corresponding to λ = 1, consider the equation (A − I)X = 0, where X = [x1, x2, x3]^T. The coefficient matrix is given by

A − I =
[1 1 −2]
[2 2 −4]  ∼
[1 1 −2]
[1 1 −2]
[0 0  0]
[0 0  0],

i.e., x1 + x2 − 2x3 = 0.

We see that [1, −1, 0]^T and [2, 0, 1]^T are two linearly independent eigen vectors corresponding to the eigen value λ = 1. Similarly, for λ = 2, we obtain,

A − 2I =
[0 1 −2]
[2 1 −4]  ∼
[1 1 −3]
[1 1 −3]
[0 1 −2]
[0 0  0],

i.e., x1 + x2 − 3x3 = 0 and x2 − 2x3 = 0.

We see that [1, 2, 1]^T is a solution and so it is an eigen vector corresponding to the eigen value λ = 2.
Ex 7.2.14 For the linear operator D : V → V defined by D(f) = df/dt, where V is the space of functions with basis S = {sin t, cos t}, find the characteristic polynomial.

Solution: First we are to find the matrix A representing the differential operator D relative to the basis S. Now,

D(sin t) = cos t = 0·sin t + 1·cos t
D(cos t) = −sin t = (−1)·sin t + 0·cos t

⇒ A =
[0 −1]
[1  0].

Therefore, the characteristic polynomial for the linear operator D : V → V is given by

|A − λI| = det
[−λ −1]
[ 1 −λ]
= λ² + 1.

7.3 Diagonalization

Diagonalization of a matrix

A given n-square matrix A with eigen values λ1, λ2, …, λn is said to be diagonalisable if there exists a non-singular matrix P such that

D = P⁻¹AP = diag(λ1, λ2, …, λn)   (7.12)

is diagonal. Thus the n × n matrix A is said to be diagonalisable if A is similar to an n × n diagonal matrix. Below we derive a necessary and sufficient condition for diagonalizability of a matrix A.

Theorem 7.3.1 An n × n matrix A over the field F is diagonalisable if and only if A has n linearly independent eigen vectors.

Proof: First let A be diagonalisable. Then by definition A is similar to a diagonal matrix D = diag(λ1, λ2, …, λn), where λ1, λ2, …, λn are the eigen values of A, and there exists a non-singular matrix P of order n such that

A = PDP⁻¹, i.e., AP = PD.

Let X1, X2, …, Xn be the n column vectors of P. Then the ith column vector of AP equals the ith column vector of PD, that is, AXi = λi Xi. Since λi is an eigen value of A, this relation shows that Xi is an eigen vector of A corresponding to the eigen value λi. Therefore, the eigen vectors of A are the n column vectors of P. P is non-singular, so these vectors are linearly independent in Vn(F). Consequently, A has n linearly independent eigen vectors.

Conversely, let X1, X2, …, Xn be n linearly independent eigen vectors of A corresponding to the eigen values λ1, λ2, …, λn respectively. Then AXi = λi Xi. Let P be the n × n matrix whose ith column vector is Xi. Since the Xi's are linearly independent in Vn(F), P is non-singular. If D is the diagonal matrix diag(λ1, λ2, …, λn), then AP = PD, i.e., A = PDP⁻¹. Consequently, A is similar to D and A is diagonalisable.

Ex 7.3.1 Diagonalise the matrix A =
[1 −3 3]
[3 −5 3]
[6 −6 4], if possible.

Solution: The characteristic equation of the given matrix A is

|A − λI| = det
[1−λ  −3    3 ]
[ 3  −5−λ   3 ]
[ 6   −6   4−λ]
= 0

or, (2 + λ)²(4 − λ) = 0 ⇒ λ = −2, −2, 4.

Thus the eigen values of the given matrix are −2, −2, 4 and −2 is a 2-fold eigen value of the matrix A. Corresponding to λ = −2, consider the equation (A + 2I)X = 0, where X = [x1, x2, x3]^T. The coefficient matrix is given by

A + 2I =
[3 −3 3]
[3 −3 3]  ∼
[6 −6 6]
[1 −1 1]
[0  0 0]
[0  0 0].

The system of equations is equivalent to x1 − x2 + x3 = 0. We see that [1, 1, 0]^T and [−1, 0, 1]^T generate the eigen space of the eigen value −2 and they form a basis of the eigen space E−2 of −2. For λ = 4, the coefficient matrix is given by

A − 4I =
[−3 −3 3]
[ 3 −9 3]  ∼
[ 6 −6 0]
[1 1 −1]
[0 2 −1]
[0 0  0],

i.e., x1 + x2 − x3 = 0, 2x2 − x3 = 0,

so that x3 = 2 gives x1 = x2 = 1. Hence [1, 1, 2]^T is an eigen vector corresponding to the eigen value λ = 4. Thus [1, 1, 2]^T generates the eigen space of the eigen value 4 and forms a basis of the eigen space E4 of 4. These three vectors [1, 1, 0]^T, [−1, 0, 1]^T and [1, 1, 2]^T are linearly independent, so the given matrix A is diagonalisable and the diagonalising matrix is

P =
[1 −1 1]
[1  0 1]
[0  1 2]
so that P⁻¹ = (1/2)
[−1  3 −1]
[−2  2  0]
[ 1 −1  1]

and

P⁻¹AP = (1/2)
[−1  3 −1][1 −3 3][1 −1 1]   [−2  0 0]
[−2  2  0][3 −5 3][1  0 1] = [ 0 −2 0]
[ 1 −1  1][6 −6 4][0  1 2]   [ 0  0 4],

where the diagonal elements are eigen values of A.

Ex 7.3.2 Show that the matrix A =
[3 −5]
[2 −3]
is diagonalizable over the complex field ℂ.

Solution: The characteristic polynomial of A is

|A − λI| = λ² − (3 − 3)λ + (−9 + 10) = λ² + 1.

Now, we consider the following two subcases:
(i) If A is regarded as a matrix over the real field ℝ, then the characteristic polynomial has no real roots. Thus A has no eigen values and no eigen vectors, and so A is not diagonalizable over ℝ.
(ii) If A is regarded as a matrix over the complex field ℂ, then it has two distinct eigen values i and −i. Therefore, X1 = (5, 3 − i)^T and X2 = (5, 3 + i)^T are the linearly independent eigen vectors of A corresponding to the eigen values i and −i respectively. Thus,

P =
[ 5    5 ]
[3−i  3+i]
and P⁻¹AP =
[i  0]
[0 −i].

As expected, the diagonal entries in D are the eigen values of A. Therefore, the matrix A is diagonalizable over the complex field ℂ.
Definition 7.3.1 For an r-fold eigenvalue λ of the matrix A, r is called the algebraic multiplicity of λ. If k is the number of linearly independent eigenvectors corresponding to an eigenvalue λ, then k is the geometric multiplicity of λ. The geometric multiplicity of an eigenvalue is less than or equal to its algebraic multiplicity. If the geometric multiplicity of λ is equal to its algebraic multiplicity, then λ is said to be regular.


Ex 7.3.3 Let A =
[1 −1]
[1  3].
Find the algebraic and geometric multiplicities of the eigenvalues. Also, diagonalise A, if possible.

Solution: The characteristic equation of A is

|A − λI| = det
[1−λ  −1 ]
[ 1   3−λ]
= 0

or, λ² − 4λ + 4 = 0 ⇒ (λ − 2)² = 0.

Therefore, the eigenvalues are λ = 2, 2. Hence the algebraic multiplicity of 2 is 2. Let [x1, x2]^T be an eigenvector corresponding to 2. Then

[1−2  −1 ][x1]   [0]
[ 1   3−2][x2] = [0]

or,
[−1 −1][x1]   [0]
[ 1  1][x2] = [0],

or, x1 + x2 = 0. Let x2 = k; then x1 = −k. Thus the eigenvectors are

[−k]     [−1]
[ k] = k [ 1],

i.e., there is only one independent eigenvector corresponding to λ = 2. So the geometric multiplicity of the eigenvalue 2 is 1. Since the number of independent eigenvectors of the 2 × 2 matrix A is only 1, A is not diagonalisable.
Deduction 7.3.1 Suppose a matrix A can be diagonalized as P⁻¹AP = D, where D is diagonal. Then A has the extremely useful diagonal factorization A = PDP⁻¹. Using this factorization, the algebra of A reduces to the algebra of the diagonal matrix D, which can be easily evaluated. Suppose D = diag(λ1, λ2, …, λn); then,

Aᵐ = (PDP⁻¹)ᵐ = PDᵐP⁻¹ = P diag(λ1ᵐ, λ2ᵐ, …, λnᵐ) P⁻¹.

More generally, for a polynomial f(t),

f(A) = f(PDP⁻¹) = P f(D) P⁻¹ = P diag(f(λ1), f(λ2), …, f(λn)) P⁻¹.

Furthermore, if the diagonal entries of D are nonnegative, let

B = P diag(√λ1, √λ2, …, √λn) P⁻¹.

Then B is a nonnegative square root of A, i.e., B² = A and the eigen values of B are nonnegative.
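This deduction can be sketched in NumPy (the helper name is ours; A is assumed diagonalisable with real eigenvalues, as is the case for the matrix of Ex 7.3.4 below):

```python
import numpy as np

def apply_via_diagonalization(A, f):
    """Compute f(A) = P diag(f(l1), ..., f(ln)) P^{-1} using A = P D P^{-1}."""
    vals, P = np.linalg.eig(A)            # columns of P are eigenvectors
    return P @ np.diag(f(vals)) @ np.linalg.inv(P)

A = np.array([[3., 1.],
              [2., 2.]])                   # eigen values 1 and 4
B = apply_via_diagonalization(A, np.sqrt)  # nonnegative square root
print(np.allclose(B @ B, A))               # True: B^2 = A
A4 = apply_via_diagonalization(A, lambda t: t**4)
print(np.allclose(A4, np.linalg.matrix_power(A, 4)))  # True
```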

Ex 7.3.4 Let A =
[3 1]
[2 2].
Find A⁴, f(A), where f(t) = t³ − 5t² + 3t + 6, and a square root of A.

Solution: The characteristic polynomial of A is

|A − λI| = det
[3−λ  1 ]
[ 2  2−λ]
= (λ − 1)(λ − 4).

Thus the eigen values of A are 1 and 4. We see that X1 = (1, −2)^T and X2 = (1, 1)^T are linearly independent eigen vectors corresponding to λ1 = 1 and λ2 = 4 respectively, and hence they form a basis of ℝ². Therefore, A is diagonalisable. Let,

P =
[ 1 1]
[−2 1]
so, P⁻¹ = (1/3)
[1 −1]
[2  1]
and P⁻¹AP =
[1 0]
[0 4]
= D.

Thus the diagonal elements are the eigen values of A. Using the diagonal factorization A = PDP⁻¹, and 1⁴ = 1 and 4⁴ = 256, we get,

A⁴ = PD⁴P⁻¹ =
[ 1 1][1   0 ] (1/3)[1 −1]   [171 85]
[−2 1][0 256]      [2  1] = [170 86].

Also, f(1) = 5 and f(4) = 2; hence,

f(A) = P f(D) P⁻¹ =
[ 1 1][5 0] (1/3)[1 −1]   [ 3 −1]
[−2 1][0 2]      [2  1] = [−2  4].

Using √1 = 1 and √4 = 2, we obtain,

B = P diag(1, 2) P⁻¹ =
[ 1 1][1 0] (1/3)[1 −1]   [5/3 1/3]
[−2 1][0 2]      [2  1] = [2/3 4/3],

where B² = A and B has positive eigen values 1 and 2.




Ex 7.3.5 Let A =
[2 2]
[1 3].
(a) Find all eigen values and corresponding eigenvectors.
(b) Find a nonsingular matrix P such that D = P⁻¹AP is diagonal.
(c) Find A⁶ and f(A), where f(t) = t⁴ − 3t³ − 6t² + 7t + 3.
(d) Find a matrix B such that B³ = A and B has real eigenvalues.

Solution: (a) The characteristic polynomial of A is

|A − λI| = det
[2−λ  2 ]
[ 1  3−λ]
= (λ − 1)(λ − 4).

Thus the eigen values of A are 1 and 4. The eigen vectors corresponding to λ1 = 1 and λ2 = 4 are X1 = (2, −1)^T and X2 = (1, 1)^T respectively.

(b) Since the vectors X1 and X2 are linearly independent, A is diagonalisable. Let P be the matrix whose columns are the vectors X1 and X2 respectively; then,

P =
[ 2 1]
[−1 1]
so, P⁻¹ = (1/3)
[1 −1]
[1  2]
and D = P⁻¹AP =
[1 0]
[0 4],

where the diagonal elements are the eigen values of A.

(c) Using the diagonal factorization A = PDP⁻¹, and 1⁶ = 1 and 4⁶ = 4096, we get,

A⁶ = PD⁶P⁻¹ =
[ 2 1][1    0 ] (1/3)[1 −1]   [1366 2730]
[−1 1][0 4096]      [1  2] = [1365 2731].

Also, f(1) = 2 and f(4) = −1; hence,

f(A) = P f(D) P⁻¹ =
[ 2 1][2  0] (1/3)[1 −1]   [ 1 −2]
[−1 1][0 −1]      [1  2] = [−1  0].

(d) Here diag(1, ∛4) is a real cube root of D. Hence a real cube root of A is

B = P diag(1, ∛4) P⁻¹ = (1/3)
[2 + ∛4    −2 + 2∛4]
[−1 + ∛4    1 + 2∛4],

so that B³ = A.

7.3.1 Orthogonal Diagonalisation

A square matrix A is said to be orthogonally diagonalisable if there exists an orthogonal (hence non-singular) matrix P such that

P⁻¹AP = a diagonal matrix.

In this case, P is said to diagonalise A orthogonally.

Theorem 7.3.2 A square matrix is orthogonally diagonalisable if and only if it is real symmetric.

Proof: First let A be orthogonally diagonalisable; then there exists an orthogonal matrix P such that P⁻¹AP is a diagonal matrix, say P⁻¹AP = D. Since P is an orthogonal matrix, we have P^T = P⁻¹ and so,

A = PDP⁻¹ = PDP^T
⇒ A^T = [PDP^T]^T = PD^T P^T = PDP^T = A, as D is diagonal so D^T = D,

which shows that A is a symmetric matrix. Conversely, let A be a real symmetric matrix of order n; then A has n linearly independent eigen vectors. Using the Gram-Schmidt process, these n eigen vectors can be converted to an orthogonal set of linearly independent vectors, which can be normalized to get a set of n orthonormal eigen vectors. Let P be the n × n matrix whose column vectors are these n orthonormal eigen vectors. Clearly, P is an orthogonal matrix and P⁻¹AP is a diagonal matrix. Thus A is orthogonally diagonalisable if it is symmetric.
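Numerically, this is exactly what NumPy's `eigh` routine delivers: for a real symmetric A it returns an orthogonal matrix P of orthonormal eigenvectors, so that P^T AP is diagonal. A small sketch (our example; the matrix is random but forced to be symmetric):

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                   # an arbitrary real symmetric matrix

vals, P = np.linalg.eigh(A)         # orthonormal eigenvectors in columns
print(np.allclose(P.T @ P, np.eye(4)))          # True: P is orthogonal
print(np.allclose(P.T @ A @ P, np.diag(vals)))  # True: P^T A P = D
```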


Ex 7.3.6 Let A =
[7  3]
[3 −1].
Find an orthogonal matrix P such that D = P⁻¹AP is diagonal.

Solution: The characteristic polynomial of A is

|A − λI| = λ² − (7 − 1)λ + (−7 − 9) = (λ − 8)(λ + 2).

Thus the eigenvalues of A are −2 and 8. The eigen vectors corresponding to λ1 = −2 and λ2 = 8 are X1 = (1, −3)^T and X2 = (3, 1)^T respectively. Since A is symmetric, the eigen vectors X1 and X2 are orthogonal. Normalize X1 and X2 to obtain, respectively, the unit vectors

X̂1 = (1/√10, −3/√10)^T and X̂2 = (3/√10, 1/√10)^T.

Finally, let P be the matrix whose columns are the unit vectors X̂1 and X̂2 respectively; then,

P =
[ 1/√10  3/√10]
[−3/√10  1/√10]
and P⁻¹ = P^T =
[1/√10  −3/√10]
[3/√10   1/√10],

so that

D = P⁻¹AP =
[−2 0]
[ 0 8].

As expected, the diagonal entries in D are the eigen values of A.

Ex 7.3.7 Diagonalise the matrix A =
[ 6 −4  2]
[−4 12 −4]
[ 2 −4 13], if possible.

Solution: Here the given matrix A is a real symmetric matrix. The characteristic equation of the given matrix A is

|A − λI| = det
[6−λ  −4    2 ]
[−4  12−λ  −4 ]
[ 2   −4  13−λ]
= 0

or, (4 − λ)(λ² − 27λ + 162) = 0
or, (λ − 9)(λ − 18)(4 − λ) = 0 ⇒ λ = 4, 9, 18.

Thus the eigen values of the given matrix are 4, 9, 18. Corresponding to λ = 4, consider the equation (A − 4I)X = 0, where X = [x1, x2, x3]^T. The coefficient matrix is given by

A − 4I =
[ 2 −4  2]
[−4  8 −4]  ∼
[ 2 −4  9]
[1 −2 1]
[0  0 0]
[0  0 7],

i.e., x1 − 2x2 + x3 = 0, 7x3 = 0.

We see that [2, 1, 0]^T generates the eigen space of the eigen value 4 and forms a basis of the eigen space E4 of 4. For λ = 9, the coefficient matrix is given by

A − 9I =
[−3 −4  2]
[−4  3 −4]  ∼
[ 2 −4  4]
[1 −2  2]
[0  5 −4]
[0  0  0],

i.e., x1 − 2x2 + 2x3 = 0, 5x2 − 4x3 = 0,

so that x3 = 5 gives x2 = 4, x1 = −2. Hence [−2, 4, 5]^T is an eigen vector corresponding to the eigen value λ = 9. Thus [−2, 4, 5]^T generates the eigen space of the eigen value 9 and forms a basis of the eigen space E9 of 9. For λ = 18, the coefficient matrix is given by

A − 18I =
[−12 −4   2]
[ −4 −6  −4]  ∼
[  2 −4  −5]
[2 0 −1]
[0 1  1]
[0 0  0],

i.e., 2x1 − x3 = 0, x2 + x3 = 0,

so that x3 = 2 gives x1 = 1, x2 = −2. Hence [1, −2, 2]^T is an eigen vector corresponding to the eigen value λ = 18. Thus [1, −2, 2]^T generates the eigen space of the eigen value 18 and forms a basis of the eigen space E18 of 18. The three vectors [2, 1, 0]^T, [−2, 4, 5]^T and [1, −2, 2]^T are linearly independent and mutually orthogonal, so the given matrix A is orthogonally diagonalisable and the diagonalising orthogonal matrix is

P =
[2/√5  −2/(3√5)   1/3]
[1/√5   4/(3√5)  −2/3]
[ 0     5/(3√5)   2/3]
so that P⁻¹AP =
[4 0  0]
[0 9  0]
[0 0 18],

where the diagonal elements are eigen values of A.


Diagonalization of a linear operator

A linear operator T : V → V is said to be diagonalisable if it can be represented by a diagonal matrix D. According to the definition, the linear operator T : V → V is diagonalisable if and only if there exists a basis S = {α1, α2, …, αn} of V for which

T(α1) = λ1 α1, T(α2) = λ2 α2, …, T(αn) = λn αn.   (7.13)

In such a case, T is represented by the diagonal matrix D = diag(λ1, λ2, …, λn) relative to the basis S = {α1, α2, …, αn}.

Ex 7.3.8 Each of the following real matrices defines a linear transformation on ℝ²:

(a) A =
[5  6]
[3 −2];
(b) B =
[1 −1]
[2 −1];
(c) C =
[5 −1]
[1  3].

Find, for each matrix, all eigen values and a maximal set S of linearly independent eigen vectors. Which of these linear operators are diagonalisable?

Solution: (a) The characteristic polynomial of A is

|A − λI| = det
[5−λ   6 ]
[ 3  −2−λ]
= λ² − 3λ − 28 = (λ + 4)(λ − 7).

Therefore, the eigen values of A are −4 and 7. For λ1 = −4, if X1 = (x1, x2)^T is a non-null eigen vector, then

AX1 = −4X1 ⇒ 9x1 + 6x2 = 0, 3x1 + 2x2 = 0.

Thus X1 = (2, −3)^T is an eigen vector corresponding to λ1 = −4. Similarly, X2 = (3, 1)^T is an eigen vector corresponding to λ2 = 7.

So S = {(2, −3), (3, 1)} is a maximal set of linearly independent eigen vectors. Since S is a basis of ℝ², A is diagonalisable. Using the basis S, A can be represented by the diagonal matrix

D =
[ 2 3]⁻¹[5  6][ 2 3]   [−4 0]
[−3 1]  [3 −2][−3 1] = [ 0 7].
(b) The characteristic polynomial of B is

|B − λI| = det
[1−λ  −1 ]
[ 2  −1−λ]
= λ² + 1 = (λ + i)(λ − i).

There is no real characteristic root of B. Thus B, a real matrix representing a linear transformation on ℝ², has no eigen values and no eigen vectors. Hence, in particular, B is not diagonalisable over ℝ.

As a matrix over ℂ, the eigen values of B are i and −i. Therefore, X1 = (1, 1 − i)^T and X2 = (1, 1 + i)^T are linearly independent eigen vectors of B corresponding to the eigen values i and −i respectively. Now S = {(1, 1 − i), (1, 1 + i)} is a basis of ℂ² consisting of eigen vectors of B. Using this basis, B can be represented by the diagonal matrix

D =
[ 1    1 ]⁻¹[1 −1][ 1    1 ]   [i  0]
[1−i  1+i]  [2 −1][1−i  1+i] = [0 −i].

As expected, the diagonal entries in D are the eigen values of B. Therefore, the matrix B is diagonalizable over the complex field ℂ.
(c) The characteristic polynomial of C is

|C − λI| = det
[5−λ  −1 ]
[ 1   3−λ]
= λ² − 8λ + 16 = (λ − 4)².

Therefore, the eigen values of C are 4, 4. For λ = 4, if X1 = (x1, x2)^T is a non-null eigen vector, then

CX1 = 4X1 ⇒ x1 − x2 = 0.

The homogeneous system has only one independent solution, say (1, 1)^T, so (1, 1)^T is an eigen vector of C. Furthermore, since there are no other eigen values, the set S = {(1, 1)} is a maximal set of linearly independent eigen vectors of C. Since S is not a basis of ℝ², C is not diagonalisable.

7.4 Minimal Polynomial

It turns out that, in the case of some matrices having eigen values with multiplicity greater than unity, there may exist polynomials in A of degree less than n which equal the zero matrix.

Minimal polynomial of a matrix

Let the characteristic polynomial of a matrix A be

ψ(λ) = (λ1 − λ)^d1 (λ2 − λ)^d2 ⋯ (λl − λ)^dl;  Σᵢ di = n.

The Cayley-Hamilton theorem states that

ψ(A) = (λ1 I − A)^d1 (λ2 I − A)^d2 ⋯ (λl I − A)^dl = O.

If r1, r2, …, rl are the smallest positive integers for which

J(A) ≡ (λ1 I − A)^r1 (λ2 I − A)^r2 ⋯ (λl I − A)^rl = O,

where ri ≤ di (1 ≤ i ≤ l), then

J(λ) = (λ1 − λ)^r1 (λ2 − λ)^r2 ⋯ (λl − λ)^rl   (7.14)

is called the minimal polynomial of the matrix A. The degree of the minimal polynomial of an n × n matrix A is at most n. It follows at once that if all the eigen values of a matrix are distinct, its minimal polynomial is equal to its characteristic polynomial.

Theorem 7.4.1 The minimal polynomial m(t) of a matrix A divides every polynomial which has A as a zero. In particular, m(t) divides the characteristic polynomial of A.

Proof: Suppose f(t) is a polynomial for which f(A) = O. By the division algorithm, there exist polynomials q(t) and r(t) for which

f(t) = m(t)q(t) + r(t)   (7.15)

where either r(t) = 0 or deg r(t) < deg m(t). Substituting t = A in (7.15) and using the fact that f(A) = O and m(A) = O, we get r(A) = O. If r(t) ≠ 0, then there is a polynomial r(t) of degree less than that of m(t) such that r(A) = O, which is a contradiction, as by definition of the minimal polynomial, m(t) is a polynomial of least degree such that m(A) = O. Hence r(t) = 0 and so

f(t) = m(t)q(t), i.e., m(t) divides f(t).

As a particular case, since A satisfies its own characteristic equation by the Cayley-Hamilton theorem, m(t) divides the characteristic polynomial.
Theorem 7.4.2 Let m(t) be the minimal polynomial of an n-square matrix A. Then the characteristic polynomial of A divides (m(t))ⁿ.

Proof: Let the minimal polynomial of the n-square matrix A be

m(t) = tʳ + c1 tʳ⁻¹ + ⋯ + cr−1 t + cr.

Define the matrices Bj as follows:

B0 = I                                    so I = B0
B1 = A + c1 I                             so c1 I = B1 − AB0
B2 = A² + c1 A + c2 I                     so c2 I = B2 − AB1
⋮
Br−1 = Aʳ⁻¹ + c1 Aʳ⁻² + ⋯ + cr−1 I        so cr−1 I = Br−1 − ABr−2.

Also, we have,

−ABr−1 = cr I − (Aʳ + c1 Aʳ⁻¹ + ⋯ + cr−1 A + cr I) = cr I − m(A) = cr I.

Set B(t) = tʳ⁻¹ B0 + tʳ⁻² B1 + ⋯ + t Br−2 + Br−1; we get,

(tI − A)B(t) = (tʳ B0 + tʳ⁻¹ B1 + ⋯ + t Br−1) − (tʳ⁻¹ AB0 + tʳ⁻² AB1 + ⋯ + ABr−1)
= tʳ B0 + tʳ⁻¹ (B1 − AB0) + tʳ⁻² (B2 − AB1) + ⋯ + t(Br−1 − ABr−2) − ABr−1
= tʳ I + c1 tʳ⁻¹ I + c2 tʳ⁻² I + ⋯ + cr−1 t I + cr I = m(t)I
⇒ |tI − A||B(t)| = |m(t)I| = (m(t))ⁿ; taking determinants.

Since |B(t)| is a polynomial, |tI − A| divides (m(t))ⁿ. Hence the characteristic polynomial divides (m(t))ⁿ.
Theorem 7.4.3 The characteristic polynomial and the minimal polynomial of a matrix A have the same irreducible factors.

Proof: Let f(t) be an irreducible polynomial. If f(t) divides the minimal polynomial m(t), then, as m(t) divides the characteristic polynomial, f(t) must divide the characteristic polynomial. On the other hand, if f(t) divides the characteristic polynomial of A, then by the above theorem f(t) must divide (m(t))ⁿ. But f(t) is irreducible; hence f(t) also divides m(t). Thus m(t) and the characteristic polynomial have the same irreducible factors.

Result 7.4.1 This theorem does not say that m(t) equals the characteristic polynomial, only that any irreducible factor of one must divide the other. In particular, since a linear factor is irreducible, m(t) and the characteristic polynomial have the same linear factors, so they have the same roots. Thus we conclude: a scalar λ is an eigen value of the matrix A if and only if λ is a root of the minimal polynomial of A.
Ex 7.4.1 Find the characteristic and minimal polynomials of each of the following matrices:

A =
[ 3 −1  1]
[−2  4 −2]
[ 1 −1  3]
and B =
[ 3 −2  1]
[−3  8 −3]
[−3  6 −1].

Solution: The characteristic polynomial ΔA(λ) of A is

|A − λI| = det
[3−λ  −1    1 ]
[−2   4−λ  −2 ]
[ 1   −1   3−λ]
= 24 − 28λ + 10λ² − λ³ = (λ − 2)²(6 − λ).

The characteristic polynomial ΔB(λ) of B is

|B − λI| = det
[3−λ  −2    1 ]
[−3   8−λ  −3 ]
[−3    6  −1−λ]
= 24 − 28λ + 10λ² − λ³ = (λ − 2)²(6 − λ).

Thus the characteristic polynomials of both matrices are the same. Since the characteristic polynomial and the minimal polynomial have the same irreducible factors, it follows that both (t − 2) and (6 − t) must be factors of m(t). Also, m(t) must divide the characteristic polynomial. Hence m(t) must be one of the following:

f(t) = (t − 2)(6 − t) or g(t) = (t − 2)²(6 − t).

(i) By the Cayley-Hamilton theorem, g(A) = ΔA(A) = O, so we need only test f(t). Now,

(A − 2I)(6I − A) =
[ 1 −1  1][ 3  1 −1]
[−2  2 −2][ 2  2  2] = O.
[ 1 −1  1][−1  1  3]

Therefore, m(t) = (t − 2)(6 − t) is the minimal polynomial of A.

(ii) By the Cayley-Hamilton theorem, g(B) = ΔB(B) = O, so we need only test f(t). Now,

(B − 2I)(6I − B) =
[ 1 −2  1][3  2 −1]
[−3  6 −3][3 −2  3] = O.
[−3  6 −3][3 −6  7]

Therefore, m(t) = (t − 2)(6 − t) is the minimal polynomial of B.
Deduction 7.4.1 Consider the following two n-square matrices

J(λ, n) =
[λ 1 0 ⋯ 0 0]
[0 λ 1 ⋯ 0 0]
[⋮         ⋮]
[0 0 0 ⋯ λ 1]
[0 0 0 ⋯ 0 λ]
and A =
[λ a 0 ⋯ 0 0]
[0 λ a ⋯ 0 0]
[⋮         ⋮]
[0 0 0 ⋯ λ a]
[0 0 0 ⋯ 0 λ],

where a ≠ 0. The matrix J(λ, n), called a Jordan block, has λ's on the diagonal, 1's on the superdiagonal and 0's elsewhere. The matrix A, which is a generalization of J(λ, n), has λ's on the diagonal, a's on the superdiagonal and 0's elsewhere. Now we see that f(t) = (t − λ)ⁿ is both the characteristic and the minimal polynomial of each of J(λ, n) and A.

Ex 7.4.2 Find the minimal polynomial of the matrix A =
[2 1 0 0]
[0 2 0 0]
[0 0 2 0]
[0 0 0 5].

Solution: The characteristic polynomial Δ(t) of A is given by

|A − tI| = det
[2−t  1    0    0 ]
[ 0  2−t   0    0 ]
[ 0   0   2−t   0 ]
[ 0   0    0   5−t]
= (2 − t)³(5 − t).

Since the characteristic polynomial and the minimal polynomial have the same irreducible factors, both t − 2 and t − 5 must be factors of m(t). Also, m(t) must divide the characteristic polynomial. Hence m(t) must be one of the following three polynomials:

(i) m(t) = (t − 2)(t − 5), (ii) m(t) = (t − 2)²(t − 5), (iii) m(t) = (t − 2)³(t − 5).

For the type (i), we have,

(A − 2I)(A − 5I) =
[0 1 0 0][−3  1  0 0]   [0 −3 0 0]
[0 0 0 0][ 0 −3  0 0]   [0  0 0 0]
[0 0 0 0][ 0  0 −3 0] = [0  0 0 0] ≠ O.
[0 0 0 3][ 0  0  0 0]   [0  0 0 0]

For the type (ii), we have,

(A − 2I)²(A − 5I) = (A − 2I) · (A − 2I)(A − 5I) = O.

For the type (iii), (A − 2I)³(A − 5I) = O follows at once from the Cayley-Hamilton theorem. Since m(t) is the monic polynomial of least degree with m(A) = O, we have m(t) = (t − 2)²(t − 5).
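The candidate test above can be replayed numerically; the sketch below (ours) carries out the same two matrix products in NumPy:

```python
import numpy as np

# Candidate minimal polynomials of Ex 7.4.2: m(t) is the first
# candidate polynomial that annihilates A.
A = np.array([[2., 1., 0., 0.],
              [0., 2., 0., 0.],
              [0., 0., 2., 0.],
              [0., 0., 0., 5.]])
I = np.eye(4)
p1 = (A - 2*I) @ (A - 5*I)               # (t-2)(t-5): not the zero matrix
p2 = (A - 2*I) @ (A - 2*I) @ (A - 5*I)   # (t-2)^2 (t-5): the zero matrix
print(np.allclose(p1, 0), np.allclose(p2, 0))   # False True
```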
Deduction 7.4.2 Let us consider an arbitrary monic polynomial

f(t) = tⁿ + cn−1 tⁿ⁻¹ + ⋯ + c1 t + c0.

Let A be the n-th order square matrix with 1's on the subdiagonal, last column [−c0, −c1, …, −cn−1]^T and 0's elsewhere:

A =
[0 0 ⋯ 0 −c0  ]
[1 0 ⋯ 0 −c1  ]
[⋮          ⋮ ]
[0 0 ⋯ 1 −cn−1];

then A is called the companion matrix of the polynomial f(t). Moreover, the characteristic and minimal polynomials of the companion matrix A are both equal to the original polynomial f(t).
Ex 7.4.3 Find a matrix whose minimal polynomial is t³ − 5t² + 6t + 8.

Solution: Here the given monic polynomial is f(t) = t³ − 5t² + 6t + 8. Let A be the companion matrix of the polynomial f(t); then by definition

A =
[0 0 −8]
[1 0 −6]
[0 1  5].

The characteristic and minimal polynomials of the companion matrix A are both equal to the original given polynomial f(t).
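A sketch of this construction in NumPy (the helper name `companion` is ours; `np.poly` returns the coefficients of the characteristic polynomial det(λI − A), highest power first):

```python
import numpy as np

def companion(coeffs):
    """Companion matrix of t^n + c_{n-1} t^{n-1} + ... + c_1 t + c_0,
    given coeffs = [c0, c1, ..., c_{n-1}]: ones on the subdiagonal,
    last column -[c0, ..., c_{n-1}]^T."""
    n = len(coeffs)
    A = np.zeros((n, n))
    A[1:, :-1] = np.eye(n - 1)               # subdiagonal of 1's
    A[:, -1] = -np.asarray(coeffs, dtype=float)
    return A

# f(t) = t^3 - 5 t^2 + 6 t + 8, as in Ex 7.4.3:
A = companion([8., 6., -5.])
print(np.poly(A))     # characteristic coefficients [1, -5, 6, 8]
```

(SciPy ships a ready-made `scipy.linalg.companion` with a slightly different layout; the hand-rolled version above follows the convention of Deduction 7.4.2.)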
Minimal polynomial of linear operator
The minimal polynomial of the operator $T$ is defined independently of the theory of matrices, as the monic polynomial of lowest degree that has $T$ as a zero. However, for any polynomial $f(t)$,
\[
f(T) = 0 \ \text{ if and only if } \ f(A) = 0,
\]
where $A$ is any matrix representation of $T$. Accordingly, $T$ and $A$ have the same minimal polynomial.
(i) The minimal polynomial $m(t)$ of a linear operator $T$ divides every polynomial that has $T$ as a zero. In particular, the minimal polynomial $m(t)$ divides the characteristic polynomial of $T$.
(ii) The characteristic and minimal polynomials of a linear operator $T$ have the same irreducible factors.
(iii) A scalar $\lambda$ is an eigenvalue of a linear operator $T$ if and only if $\lambda$ is a root of the minimal polynomial $m(t)$ of $T$.
Minimal polynomial of block diagonal matrices
Let $A$ be a block diagonal matrix with diagonal blocks $A_1, A_2, \ldots, A_r$. Then the minimal polynomial of $A$ is equal to the least common multiple of the minimal polynomials of the diagonal blocks $A_i$.
Ex 7.4.4 Find the characteristic and minimal polynomial of the block diagonal matrix
\[
A = \begin{pmatrix} 2 & 5 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 4 & 2 & 0 \\ 0 & 0 & 3 & 5 & 0 \\ 0 & 0 & 0 & 0 & 7 \end{pmatrix}.
\]
Solution: The given block diagonal matrix can be written in the form
\[
A = \mathrm{diag}(A_1, A_2, A_3); \quad \text{where } A_1 = \begin{pmatrix} 2 & 5 \\ 0 & 2 \end{pmatrix},\ A_2 = \begin{pmatrix} 4 & 2 \\ 3 & 5 \end{pmatrix},\ A_3 = [7].
\]
The characteristic polynomials of $A_1, A_2, A_3$ are
\[
|A_1 - \lambda I| = (\lambda - 2)^2; \quad |A_2 - \lambda I| = (\lambda - 2)(\lambda - 7); \quad |A_3 - \lambda I| = \lambda - 7.
\]


Thus the characteristic polynomial of $A$ is
\[
|A - \lambda I| = |A_1 - \lambda I|\,|A_2 - \lambda I|\,|A_3 - \lambda I| = (\lambda-2)^2(\lambda-2)(\lambda-7)(\lambda-7) = (\lambda-2)^3(\lambda-7)^2.
\]
The minimal polynomials $m_1(t), m_2(t), m_3(t)$ of the diagonal blocks $A_1, A_2, A_3$, respectively, are equal to the characteristic polynomials, i.e.,
\[
m_1(t) = (t-2)^2; \quad m_2(t) = (t-2)(t-7); \quad m_3(t) = t-7.
\]
But $m(t)$, the minimal polynomial of $A$, is equal to the least common multiple of $m_1(t), m_2(t)$ and $m_3(t)$, i.e., $m(t) = (t-2)^2(t-7)$.
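The lcm rule can be verified directly on the block diagonal matrix of Ex 7.4.4, by checking that $(t-2)^2(t-7)$ annihilates $A$ while the smaller candidate $(t-2)(t-7)$ does not (a sketch):

```python
import numpy as np

# Block diagonal matrix of Ex 7.4.4: A = diag(A1, A2, A3).
A = np.zeros((5, 5))
A[0:2, 0:2] = [[2, 5], [0, 2]]
A[2:4, 2:4] = [[4, 2], [3, 5]]
A[4, 4] = 7
I = np.eye(5)

small = (A - 2*I) @ (A - 7*I)               # (t-2)(t-7): fails on block A1
m     = (A - 2*I) @ (A - 2*I) @ (A - 7*I)   # (t-2)^2 (t-7): the lcm

print(np.allclose(small, 0))   # False
print(np.allclose(m, 0))       # True
```

Only the candidate of degree equal to the lcm of the blocks' minimal polynomials annihilates the whole matrix.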

7.5

Bilinear Forms

Let $V$ be an $n$-dimensional Euclidean space. Then $B(\cdot,\cdot) : V \times V \to \mathbb{R}$ is said to be a bilinear form if it is linear and homogeneous with respect to both the arguments $\alpha, \beta$. If $B$ is a symmetric bilinear form then we can write
\[
B(\alpha, \alpha) = \langle B\alpha, \alpha \rangle = \alpha^T B \alpha \qquad (7.16)
\]
where $B = [b_{ij}]$ is a real symmetric $n \times n$ matrix, known as the matrix of the quadratic form, and $B(\alpha, \alpha)$ is known as a quadratic form. For example, the expression
\[
5x_1y_1 + 2x_1y_2 - 3x_1y_3 + 7x_2y_1 - 5x_2y_2 + 3x_3y_3
\]
is a bilinear form in the variables $x_1, x_2, x_3$ and $y_1, y_2, y_3$. If we change the basis vectors such that $\alpha = P\alpha'$, then
\[
\alpha'^T P^T B P \alpha' = B(P\alpha', P\alpha'). \qquad (7.17)
\]
The matrices of the two quadratic forms (7.16) and (7.17) are connected by the transformation; i.e., if we change coordinates in a quadratic form, its matrix changes to a matrix which is congruent to the matrix of the original quadratic form. In other words, if we have two quadratic forms whose matrices are congruent to each other, then they represent the same quadratic form, only with respect to two coordinate systems connected by a non-singular transformation.

7.5.1

Real Quadratic Forms

A homogeneous expression of the second degree in any number of variables, of the form
\[
Q(\alpha, \alpha) = \alpha^T B \alpha = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}x_i x_j; \quad a_{ij} = a_{ji}
\]
\[
= (x_1, x_2, \ldots, x_n) \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad (7.18)
\]
where the $a_{ij}$ are constants belonging to a field of numbers and $x_1, x_2, \ldots, x_n$ are variables belonging to a field of numbers (not necessarily the same), is defined as a quadratic form. If the variables assume real values only, the form is said to be a quadratic form in real variables. When the constants $a_{ij}$ and the variables $x_i$ are all real, the expression is said to be a real quadratic form with $B = [a_{ij}]$ as associated matrix. For example,


(i) $x_1^2 + 2x_1x_2 + 3x_2^2$ is a real quadratic form in 2 variables with associated matrix
\[
B_1 = \begin{pmatrix} 1 & 1 \\ 1 & 3 \end{pmatrix}.
\]
(ii) $x_1^2 + 3x_2^2 + 3x_3^2 - 4x_2x_3 + 4x_3x_1 - 2x_1x_2$ is a real quadratic form in 3 variables with associated matrix
\[
B_2 = \begin{pmatrix} 1 & -1 & 2 \\ -1 & 3 & -2 \\ 2 & -2 & 3 \end{pmatrix}.
\]
Now, a real quadratic form $Q(\alpha, \alpha)$ is said to be
(i) positive definite, if $Q > 0$ for all $\alpha \neq \theta$;
(ii) positive semi-definite, if $Q \geq 0$ for all $\alpha$ and $Q = 0$ for some $\alpha \neq \theta$;
(iii) negative definite, if $Q < 0$ for all $\alpha \neq \theta$;
(iv) negative semi-definite, if $Q \leq 0$ for all $\alpha$ and $Q = 0$ for some $\alpha \neq \theta$;
(v) indefinite, if $Q > 0$ for some $\alpha \neq \theta$ and $Q < 0$ for some $\alpha \neq \theta$.
These five classes of quadratic forms are called value classes.

Ex 7.5.1 Find the quadratic form that corresponds to the symmetric matrix $A = \begin{pmatrix} 5 & -3 \\ -3 & 8 \end{pmatrix}$.

Solution: The quadratic form $Q(\alpha)$ that corresponds to a symmetric matrix $A$ is $Q(\alpha) = \alpha^T A \alpha$, where $\alpha = (x_1, x_2)^T$ is the column vector of unknowns. Thus,
\[
Q(\alpha) = (x_1, x_2)\begin{pmatrix} 5 & -3 \\ -3 & 8 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 5x_1^2 - 6x_1x_2 + 8x_2^2.
\]
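The expansion of $\alpha^T A \alpha$ can be reproduced symbolically (a sketch using sympy):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
A = sp.Matrix([[5, -3], [-3, 8]])
alpha = sp.Matrix([x1, x2])

# (alpha^T A alpha) is a 1x1 matrix; take its single entry and expand it.
Q = sp.expand((alpha.T * A * alpha)[0])
print(Q)   # 5*x1**2 - 6*x1*x2 + 8*x2**2
```

Each off-diagonal entry $a_{ij}$ contributes twice, which is why the cross term is $-6x_1x_2$ rather than $-3x_1x_2$.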

Ex 7.5.2 Examine whether the quadratic form $5x^2 + y^2 + 5z^2 + 4xy - 8xz - 4yz$ is positive definite or not.

Solution: The given quadratic form can be written as
\[
Q(x, y, z) = 5x^2 + y^2 + 5z^2 + 4xy - 8xz - 4yz = (2x + y - 2z)^2 + x^2 + z^2.
\]
Since $Q > 0$ for all $(x, y, z)$ and $Q = 0$ only when $x = y = z = 0$, i.e., $\alpha = \theta$, $Q$ is positive definite. Alternatively, if $Q(\alpha, \alpha) = \alpha^T B \alpha$, then the associated matrix $B$ is given by
\[
B = \begin{pmatrix} 5 & 2 & -4 \\ 2 & 1 & -2 \\ -4 & -2 & 5 \end{pmatrix}.
\]
The leading principal minors of $B$,
\[
5, \qquad \begin{vmatrix} 5 & 2 \\ 2 & 1 \end{vmatrix} = 1, \qquad |B| = 1,
\]
are all positive; hence the given quadratic form $Q$ is positive definite.
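Both checks — the leading principal minors and the eigenvalue signs — can be carried out numerically (a sketch):

```python
import numpy as np

# Associated matrix of Q = 5x^2 + y^2 + 5z^2 + 4xy - 8xz - 4yz.
B = np.array([[ 5,  2, -4],
              [ 2,  1, -2],
              [-4, -2,  5]], dtype=float)

# Sylvester's criterion: every leading principal minor must be positive.
minors = [np.linalg.det(B[:k, :k]) for k in range(1, 4)]
print(np.round(minors, 6))                       # approximately [5, 1, 1]
print(bool(np.all(np.linalg.eigvalsh(B) > 0)))   # True: positive definite
```

Either criterion confirms that the form is positive definite.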
Ex 7.5.3 Prove that the real quadratic form $Q = ax^2 + bxy + cy^2$ is positive definite if $a > 0$ and $b^2 < 4ac$ $(a, b, c \neq 0)$.


Solution: The given quadratic form can be written as
\[
Q(x, y) = ax^2 + bxy + cy^2 = a\left(x + \frac{b}{2a}y\right)^2 + \frac{4ac - b^2}{4a}\,y^2.
\]
If $a > 0$ and $b^2 < 4ac$ $(a, b, c \neq 0)$, both terms are non-negative and vanish simultaneously only at $x = y = 0$; hence $Q$ is positive definite. Also, if $Q(\alpha, \alpha) = \alpha^T B \alpha$, then the associated matrix $B$ is given by
\[
B = \begin{pmatrix} a & \frac{b}{2} \\ \frac{b}{2} & c \end{pmatrix}.
\]
The leading principal minors of $B$ are $a$ and $\begin{vmatrix} a & b/2 \\ b/2 & c \end{vmatrix} = ac - \frac{b^2}{4}$. For $a, b, c \neq 0$, if $a > 0$ and $b^2 < 4ac$, then all leading principal minors are positive; hence the given quadratic form $Q$ is positive definite.
Ex 7.5.4 Examine whether the quadratic form $x^2 + 2y^2 + 2z^2 - 2xy + 2xz - 4yz$ is positive definite or not.

Solution: The given quadratic form can be written as
\[
Q(x, y, z) = x^2 + 2y^2 + 2z^2 - 2xy + 2xz - 4yz = (x - y + z)^2 + (y - z)^2.
\]
Since $Q \geq 0$ for all $(x, y, z)$, and if we take $x = 0,\ y = z = 1$, then $Q = 0$ with $\alpha \neq \theta$, $Q$ is positive semi-definite.
Ex 7.5.5 Examine whether the quadratic form $x^2 + y^2 - 2z^2 + 2xy - 2yz - 2xz$ is positive definite or not.

Solution: The given quadratic form can be written as
\[
Q(x, y, z) = x^2 + y^2 - 2z^2 + 2xy - 2yz - 2xz = (x + y - z)^2 - 3z^2.
\]
We see that $Q > 0$ for some $\alpha \neq \theta$ and $Q < 0$ for some $\alpha \neq \theta$. For example, if $(x, y, z) = (1, 0, 0)$ then $Q > 0$; if $(x, y, z) = (0, 0, 1)$ then $Q < 0$; if $(x, y, z) = (1, -1, 0)$ then $Q = 0$. So the given expression $Q$ is indefinite.
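The five value classes can be decided mechanically from the signs of the eigenvalues of the associated matrix; the helper below (an illustrative sketch — the function name is our own) reproduces the verdicts of Ex 7.5.4 and Ex 7.5.5:

```python
import numpy as np

def classify(B, tol=1e-10):
    """Classify the quadratic form of a real symmetric matrix B by
    counting its positive and negative eigenvalues."""
    lam = np.linalg.eigvalsh(B)
    pos = int(np.sum(lam > tol))
    neg = int(np.sum(lam < -tol))
    n = len(lam)
    if pos == n:
        return "positive definite"
    if neg == n:
        return "negative definite"
    if neg == 0:
        return "positive semi-definite"
    if pos == 0:
        return "negative semi-definite"
    return "indefinite"

# Ex 7.5.4: x^2 + 2y^2 + 2z^2 - 2xy + 2xz - 4yz
B4 = np.array([[1, -1, 1], [-1, 2, -2], [1, -2, 2]], dtype=float)
# Ex 7.5.5: x^2 + y^2 - 2z^2 + 2xy - 2yz - 2xz
B5 = np.array([[1, 1, -1], [1, 1, -1], [-1, -1, -2]], dtype=float)

print(classify(B4))   # positive semi-definite
print(classify(B5))   # indefinite
```

The tolerance guards against eigenvalues that are zero in exact arithmetic but come out as tiny nonzero numbers in floating point.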

7.6

Canonical Form

Let us consider the real quadratic form $Q(\alpha, \alpha) = \alpha^T A \alpha$, where $A$ is a real symmetric matrix of order $n$. From the spectral theorem, we know that the eigenvectors of $A$ form an orthonormal basis of $V$. Let $P$ be the $n$-square matrix whose columns are orthonormal eigenvectors of $A$; then $|P| \neq 0$ and $P^T = P^{-1}$, and hence the non-singular linear transformation $\alpha = P\alpha'$ will transform $\alpha^T A \alpha$ to
\[
\alpha'^T P^T A P \alpha' = \alpha'^T P^{-1} A P \alpha' = \alpha'^T D \alpha' = Q'(\alpha', \alpha'), \qquad (7.19)
\]
where $D = P^{-1}AP = P^T AP$ is the diagonal matrix whose diagonal elements are the eigenvalues of the matrix $A$, i.e., $D = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) = P^{-1}AP$. Now, $Q'(\alpha', \alpha')$ is a real quadratic form and it is called a linear transformation of $Q$. The matrix $D$ of $Q'$ is congruent to $A$ and $Q'$ is said to be congruent to $Q$. When expressed in terms of coordinates, the equation (7.19) has the form
\[
Q'(\alpha', \alpha') = \alpha'^T D \alpha' = \lambda_1 x_1'^2 + \lambda_2 x_2'^2 + \cdots + \lambda_n x_n'^2. \qquad (7.20)
\]


Let $\lambda_1, \lambda_2, \ldots, \lambda_p$ be the positive eigenvalues of $A$, $\lambda_{p+1}, \lambda_{p+2}, \ldots, \lambda_r$ the negative eigenvalues of $A$ and $\lambda_{r+1}, \ldots, \lambda_n$ the zero eigenvalues of $A$, where $r$ is the rank of $A$. If $A$ is an $n \times n$ real symmetric matrix of rank $r\,(\leq n)$, then there exists a non-singular matrix $P$ such that $P^T AP$, i.e., $D$, becomes diagonal with the form
\[
\begin{pmatrix} I_p & & \\ & -I_{r-p} & \\ & & 0 \end{pmatrix}; \qquad 0 \leq p \leq r.
\]
Then $p$, $r - p$, $n - r$ are defined to be the positive, the negative and the zero indices of inertia, and this is expressed by writing
\[
\mathrm{In}(A) = (p,\ r - p,\ n - r). \qquad (7.21)
\]
The quantity $p - (r - p) = s$ is defined as the signature. We can reduce the equation (7.20) to this signature form by applying the following transformation:
\[
x_i' = \frac{1}{\sqrt{|\lambda_i|}}\,x_i'',\ \ i = 1, 2, \ldots, r; \qquad x_i' = x_i'',\ \ i = r+1, \ldots, n. \qquad (7.22)
\]
The equation (7.22) transforms (7.20) into the quadratic form
\[
x_1''^2 + \cdots + x_p''^2 - x_{p+1}''^2 - \cdots - x_r''^2. \qquad (7.23)
\]
We have reduced the quadratic form (7.20) to the quadratic form (7.23), which is a sum of square terms with coefficients $+1$ and $-1$ respectively. The quadratic form (7.23) is called the canonical or normal form of $Q$. The number of positive terms in the normal form is called the index.

Deduction 7.6.1 Sylvester's law of inertia: Sylvester's law of inertia states that when a quadratic form is reduced to a normal form similar to (7.23), the rank and signature of the form remain invariant, i.e., $\mathrm{In}(A)$ is independent of the method of reducing (7.20) to the canonical form (7.23).
Deduction 7.6.2 Classification of quadratic forms: A quadratic form $Q(\alpha, \alpha) = \alpha^T A \alpha$ is said to be positive definite if $Q(\alpha, \alpha) > 0$ for all $\alpha \neq \theta$; negative definite if $Q(\alpha, \alpha) < 0$ for all $\alpha \neq \theta$; positive semi-definite if $Q(\alpha, \alpha) \geq 0$ for all $\alpha$; negative semi-definite if $Q(\alpha, \alpha) \leq 0$ for all $\alpha$; and indefinite if $Q(\alpha, \alpha)$ can take a positive value for some $\alpha \neq \theta$ as well as a negative value for some other $\alpha \neq \theta$. Thus
(i) the quadratic form $Q(\alpha, \alpha) = \alpha^T A \alpha$ is positive definite if $r = n$ and all the eigenvalues are positive, i.e., $\mathrm{In}(A) = (n, 0, 0)$. In this case, the canonical form becomes $x_1''^2 + x_2''^2 + \cdots + x_n''^2$.
(ii) the quadratic form $Q(\alpha, \alpha) = \alpha^T A \alpha$ is negative definite if all the eigenvalues are negative, i.e., $\mathrm{In}(A) = (0, n, 0)$. In this case, the canonical form becomes $-x_1''^2 - x_2''^2 - \cdots - x_n''^2$.
(iii) the quadratic form $Q(\alpha, \alpha) = \alpha^T A \alpha$ is positive semi-definite if $r < n$ and $r - p = 0$, i.e., $\mathrm{In}(A) = (p, 0, n - p)$, and negative semi-definite if $\mathrm{In}(A) = (0, r, n - r)$.
(iv) the quadratic form $Q(\alpha, \alpha) = \alpha^T A \alpha$ is indefinite if $\mathrm{In}(A) = (p, r - p, n - r)$, where $p > 0$ and $r - p > 0$.


Each quadratic form must be of one of these five types. Sylvester's criterion states that a real symmetric matrix $A$ is positive definite if and only if all the leading principal minors of $A$ are positive. The corresponding semi-definite criterion requires all principal minors (not only the leading ones) to be non-negative.
Ex 7.6.1 Reduce the quadratic form $5x_1^2 + x_2^2 + 10x_3^2 - 4x_2x_3 - 10x_3x_1$ to the normal form.

Solution: The given quadratic form can be written as
\[
Q(\alpha, \alpha) = (x_1\ x_2\ x_3)\begin{pmatrix} 5 & 0 & -5 \\ 0 & 1 & -2 \\ -5 & -2 & 10 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix},
\]
where the associated symmetric matrix is $A = \begin{pmatrix} 5 & 0 & -5 \\ 0 & 1 & -2 \\ -5 & -2 & 10 \end{pmatrix}$. Let us apply congruence operations on $A$ to reduce it to the normal form:
\[
A \xrightarrow{R_3 + R_1} \begin{pmatrix} 5 & 0 & -5 \\ 0 & 1 & -2 \\ 0 & -2 & 5 \end{pmatrix} \xrightarrow{C_3 + C_1} \begin{pmatrix} 5 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & -2 & 5 \end{pmatrix}
\]
\[
\xrightarrow{R_3 + 2R_2} \begin{pmatrix} 5 & 0 & 0 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{pmatrix} \xrightarrow{C_3 + 2C_2} \begin{pmatrix} 5 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\]
\[
\xrightarrow{\frac{1}{\sqrt 5}R_1} \begin{pmatrix} \sqrt 5 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \xrightarrow{\frac{1}{\sqrt 5}C_1} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
The rank of the quadratic form is $r = 3$ and the number of positive indices of inertia is $p = 3$, which is the index. Therefore, the signature of the quadratic form is $2p - r = 3$. Here $n = r = p = 3$, so the quadratic form is positive definite.
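By Sylvester's law of inertia, $\mathrm{In}(A)$ can also be read off from the eigenvalue signs, without performing the congruence operations (a numerical sketch):

```python
import numpy as np

# Associated matrix of 5x1^2 + x2^2 + 10x3^2 - 4x2x3 - 10x3x1.
A = np.array([[ 5,  0, -5],
              [ 0,  1, -2],
              [-5, -2, 10]], dtype=float)

lam = np.linalg.eigvalsh(A)
p = int(np.sum(lam > 1e-10))    # positive index of inertia
q = int(np.sum(lam < -1e-10))   # negative index of inertia
n = len(lam)

print((p, q, n - p - q))        # (3, 0, 0): positive definite, signature 2p - r = 3
```

Since congruence transformations preserve the inertia, this agrees with the result of the congruence reduction above.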
Ex 7.6.2 Let $Q = x^2 + 6xy - 7y^2$. Find the orthogonal substitution that diagonalizes $Q$.

Solution: The symmetric matrix $A$ that represents $Q$ is $A = \begin{pmatrix} 1 & 3 \\ 3 & -7 \end{pmatrix}$. The characteristic polynomial of $A$ is
\[
|A - \lambda I| = \lambda^2 - (1 - 7)\lambda + (-7 - 9) = (\lambda + 8)(\lambda - 2).
\]
The eigenvalues of $A$ are $-8$ and $2$. Thus, using $x_1$ and $x_2$ as new variables, a diagonal form of $Q$ is
\[
Q(x_1, x_2) = 2x_1^2 - 8x_2^2.
\]
The corresponding orthogonal substitution is obtained by finding an orthogonal set of eigenvectors of $A$. The eigenvectors corresponding to $\lambda_1 = 2$ and $\lambda_2 = -8$ are $X_1 = (3, 1)^T$ and $X_2 = (-1, 3)^T$ respectively. Since $A$ is symmetric, the eigenvectors $X_1$ and $X_2$ are orthogonal. Now, we normalize $X_1$ and $X_2$ to obtain the unit vectors
\[
X_1 = \left(\tfrac{3}{\sqrt{10}},\ \tfrac{1}{\sqrt{10}}\right)^T \quad \text{and} \quad X_2 = \left(-\tfrac{1}{\sqrt{10}},\ \tfrac{3}{\sqrt{10}}\right)^T.
\]
Finally, let $P$ be the matrix whose columns are the unit vectors $X_1$ and $X_2$ respectively; then $(x, y)^T = P(x_1, x_2)^T$ is the required orthogonal change of coordinates, i.e.,
\[
P = \frac{1}{\sqrt{10}}\begin{pmatrix} 3 & -1 \\ 1 & 3 \end{pmatrix} \quad \text{and} \quad x = \frac{1}{\sqrt{10}}(3x_1 - x_2),\ \ y = \frac{1}{\sqrt{10}}(x_1 + 3x_2).
\]


One can also express $x_1$ and $x_2$ in terms of $x$ and $y$ by using $P^{-1} = P^T$:
\[
x_1 = \frac{1}{\sqrt{10}}(3x + y); \qquad x_2 = \frac{1}{\sqrt{10}}(-x + 3y).
\]
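The orthogonal substitution is exactly what `numpy.linalg.eigh` returns for a symmetric matrix: eigenvalues in ascending order and orthonormal eigenvectors as the columns of $P$ (a sketch):

```python
import numpy as np

A = np.array([[1.,  3.],
              [3., -7.]])        # matrix of Q = x^2 + 6xy - 7y^2

lam, P = np.linalg.eigh(A)
print(lam)                                       # approximately [-8, 2]
print(np.allclose(P.T @ A @ P, np.diag(lam)))    # True: P diagonalizes Q
print(np.allclose(P.T @ P, np.eye(2)))           # True: P is orthogonal
```

The signs and ordering of the columns of $P$ may differ from the hand computation, but any such choice is a valid orthogonal substitution.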

Classification of conics
(i) The general equation of a quadratic conic in two variables $x$ and $y$ can be written in the form
\[
ax^2 + 2hxy + by^2 + gx + fy + c = 0
\]
\[
\text{or,}\quad (x\ y)\begin{pmatrix} a & h \\ h & b \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} + (g\ f)\begin{pmatrix} x \\ y \end{pmatrix} + c = 0, \quad \text{i.e.,}\quad \alpha^T A \alpha + K^T\alpha + c = 0, \qquad (7.24)
\]
where $A = \begin{pmatrix} a & h \\ h & b \end{pmatrix}$ is a real symmetric matrix and hence is orthogonally diagonalizable, $K^T = (g\ f)$ and $\alpha^T = (x, y)$. Let $\lambda_1, \lambda_2$ be the eigenvalues of the real symmetric matrix $A$ and let the corresponding eigenvectors be $\xi_1, \xi_2$ respectively, so that for $P = [\xi_1, \xi_2]$ we have $P^{-1} = P^T$, i.e., $P$ is an orthogonal matrix, and
\[
AP = PD, \quad \text{where } D = \mathrm{diag}(\lambda_1, \lambda_2) = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}.
\]
If we apply the rotation $\alpha = P\alpha'$, then equation (7.24) reduces to
\[
\alpha'^T P^T AP\alpha' + K^T P\alpha' + c = 0, \quad \text{or,}\quad \lambda_1 x'^2 + \lambda_2 y'^2 + g'x' + f'y' + c = 0, \qquad (7.25)
\]
where the rotation $\alpha = P\alpha'$ transforms the principal axes into coordinate axes. Let us now apply the translation
\[
x' = x'' + \delta, \qquad y' = y'' + \epsilon.
\]
If $\lambda_1 \neq 0$, the coefficient of $x''$ may be made zero by a suitable choice of $\delta$, and if $\lambda_2 \neq 0$, the coefficient of $y''$ may be made zero by a suitable choice of $\epsilon$. Therefore, the conic (7.25) can be transformed to one of three general forms:

1. Let $\mathrm{In}(A) = (2, 0, 0)$; then the standard form becomes
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \ (\text{ellipse}); \qquad = 0 \ (\text{a single point}).
\]
2. Let $\mathrm{In}(A) = (1, 1, 0)$; then the standard form becomes
\[
\frac{x^2}{a^2} - \frac{y^2}{b^2} = 1 \ (\text{hyperbola}); \qquad = 0 \ (\text{pair of intersecting straight lines}).
\]
3. Let $\mathrm{In}(A) = (1, 0, 1)$, i.e., rank of $A = 1$, so one of $\lambda_1, \lambda_2$ is zero; then the standard form becomes
\[
x^2 = 4ay \ (\text{parabola}); \qquad x^2 = a^2 \ (\text{pair of parallel straight lines}); \qquad x^2 = 0 \ (\text{a single straight line}).
\]


(ii) The general equation of a quadratic conic in the variables $x, y, z$ can be written in the form
\[
ax^2 + by^2 + cz^2 + 2hxy + 2gyz + 2fzx + ux + vy + wz + d = 0
\]
\[
\text{or,}\quad (x\ y\ z)\begin{pmatrix} a & h & f \\ h & b & g \\ f & g & c \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} + (u\ v\ w)\begin{pmatrix} x \\ y \\ z \end{pmatrix} + d = 0, \quad \text{i.e.,}\quad \alpha^T A\alpha + K^T\alpha + d = 0, \qquad (7.26)
\]
where $A = \begin{pmatrix} a & h & f \\ h & b & g \\ f & g & c \end{pmatrix}$ is a real symmetric matrix and hence is orthogonally diagonalizable, $K^T = (u\ v\ w)$ and $\alpha^T = (x, y, z)$. Rotating the coordinate axes to coincide with the orthogonal eigen-axes, or principal axes, and translating the origin suitably, the quadric (7.26) can be reduced to one of the following six general forms, assuming that $\lambda_1 > 0$ and that the constant on the right-hand side of the final expression, if any, is positive:

1. Let $\mathrm{In}(A) = (3, 0, 0)$, i.e., rank of $A = 3$ and none of $\lambda_1, \lambda_2, \lambda_3$ is zero. Then the standard form becomes
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1 \ (\text{ellipsoid}); \qquad = 0 \ (\text{a single point}).
\]
2. Let $\mathrm{In}(A) = (2, 1, 0)$; then the standard form becomes
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \ (\text{elliptic hyperboloid of one sheet}); \qquad = 0 \ (\text{elliptic cone}).
\]
3. Let $\mathrm{In}(A) = (1, 2, 0)$; then the standard form becomes
\[
\frac{x^2}{a^2} - \frac{y^2}{b^2} - \frac{z^2}{c^2} = 1 \ (\text{elliptic hyperboloid of two sheets}); \qquad = 0 \ (\text{elliptic cone}).
\]
4. Let $\mathrm{In}(A) = (2, 0, 1)$; then the standard form becomes
\[
\frac{x^2}{a^2} + \frac{y^2}{b^2} = z \ (\text{elliptic paraboloid}); \qquad = 1 \ (\text{elliptic cylinder}); \qquad = 0 \ (\text{a single straight line}).
\]
5. Let $\mathrm{In}(A) = (1, 1, 1)$; then the standard form becomes
\[
\frac{x^2}{a^2} - \frac{y^2}{b^2} = z \ (\text{hyperbolic paraboloid}); \qquad = 1 \ (\text{hyperbolic cylinder}); \qquad = 0 \ (\text{pair of intersecting planes}).
\]
6. Let $\mathrm{In}(A) = (1, 0, 2)$; then the standard form becomes
\[
\frac{x^2}{a^2} = y + z \ (\text{parabolic cylinder}); \qquad = 1 \ (\text{a pair of parallel planes}); \qquad = 0 \ (\text{a single plane}).
\]
Ex 7.6.3 Reduce $2y^2 - 2xy - 2yz + 2zx - x - 2y + 3z - 2 = 0$ to canonical form.

Solution: The quadric $2y^2 - 2xy - 2yz + 2zx - x - 2y + 3z - 2 = 0$ in $x, y, z$ can be written as
\[
(x\ y\ z)\begin{pmatrix} 0 & -1 & 1 \\ -1 & 2 & -1 \\ 1 & -1 & 0 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} + (-1\ \ {-2}\ \ 3)\begin{pmatrix} x \\ y \\ z \end{pmatrix} - 2 = 0, \quad \text{i.e.,}\quad X^T AX + BX - 2 = 0.
\]
The characteristic equation of $A$ is
\[
|A - \lambda I| = \begin{vmatrix} -\lambda & -1 & 1 \\ -1 & 2-\lambda & -1 \\ 1 & -1 & -\lambda \end{vmatrix} = 0 \ \Rightarrow\ \lambda^3 - 2\lambda^2 - 3\lambda = 0 \ \Rightarrow\ \lambda = -1, 0, 3.
\]
The eigenvectors corresponding to the eigenvalues $3, -1, 0$ are $k_1(1, -2, 1)$, $k_2(1, 0, -1)$ and $k_3(1, 1, 1)$. The orthonormal eigenvectors are
\[
\left(\tfrac{1}{\sqrt 6}, -\tfrac{2}{\sqrt 6}, \tfrac{1}{\sqrt 6}\right),\ \left(\tfrac{1}{\sqrt 2}, 0, -\tfrac{1}{\sqrt 2}\right)\ \text{and}\ \left(\tfrac{1}{\sqrt 3}, \tfrac{1}{\sqrt 3}, \tfrac{1}{\sqrt 3}\right)
\]
respectively. Let
\[
P = \begin{pmatrix} \tfrac{1}{\sqrt 6} & \tfrac{1}{\sqrt 2} & \tfrac{1}{\sqrt 3} \\ -\tfrac{2}{\sqrt 6} & 0 & \tfrac{1}{\sqrt 3} \\ \tfrac{1}{\sqrt 6} & -\tfrac{1}{\sqrt 2} & \tfrac{1}{\sqrt 3} \end{pmatrix}; \quad \text{then } P^T AP = \begin{pmatrix} 3 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \text{ and } BP = (\sqrt 6\ \ {-2\sqrt 2}\ \ 0).
\]
By the orthogonal transformation $X = PX'$, where $X'^T = (x'\ y'\ z')$, the equation reduces to
\[
3x'^2 - y'^2 + \sqrt 6\,x' - 2\sqrt 2\,y' - 2 = 0, \quad \text{or,}\quad 3\left(x' + \tfrac{1}{\sqrt 6}\right)^2 - \left(y' + \sqrt 2\right)^2 = \tfrac12.
\]
Applying the translation
\[
x'' = x' + \tfrac{1}{\sqrt 6}, \quad y'' = y' + \sqrt 2, \quad z'' = z',
\]
the equation finally reduces to $3x''^2 - y''^2 = \tfrac12$, which is the canonical form, and it represents a hyperbolic cylinder.
Ex 7.6.4 Reduce the quadratic form $2x^2 + 5y^2 + 10z^2 + 4xy + 6xz + 12yz$ to canonical form.

Solution: The given quadratic form in $x, y, z$ can be written in the form
\[
Q(\alpha, \alpha) = (x\ y\ z)\begin{pmatrix} 2 & 2 & 3 \\ 2 & 5 & 6 \\ 3 & 6 & 10 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = X^T AX,
\]
where the associated symmetric matrix is $A$. Let us apply congruence operations on $A$ to reduce it to the normal form:
\[
A \xrightarrow{R_2 - R_1,\ R_3 - \frac32 R_1} \begin{pmatrix} 2 & 2 & 3 \\ 0 & 3 & 3 \\ 0 & 3 & \frac{11}{2} \end{pmatrix} \xrightarrow{C_2 - C_1,\ C_3 - \frac32 C_1} \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 3 \\ 0 & 3 & \frac{11}{2} \end{pmatrix}
\]
\[
\xrightarrow{R_3 - R_2} \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 3 \\ 0 & 0 & \frac52 \end{pmatrix} \xrightarrow{C_3 - C_2} \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & \frac52 \end{pmatrix}
\]
\[
\xrightarrow{\frac{1}{\sqrt 2}R_1,\ \frac{1}{\sqrt 3}R_2,\ \sqrt{\frac25}R_3} \begin{pmatrix} \sqrt 2 & 0 & 0 \\ 0 & \sqrt 3 & 0 \\ 0 & 0 & \sqrt{\frac52} \end{pmatrix} \xrightarrow{\frac{1}{\sqrt 2}C_1,\ \frac{1}{\sqrt 3}C_2,\ \sqrt{\frac25}C_3} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
The rank of the quadratic form is $r = 3$ and the number of positive indices of inertia is $p = 3$, which is the index. Therefore, the signature of the quadratic form is $2p - r = 3$. Here $n = r = p = 3$, so the quadratic form is positive definite. The corresponding normal form is $x^2 + y^2 + z^2$.
Ex 7.6.5 Obtain a non-singular transformation that will reduce the quadratic form $x^2 + 2y^2 + 3z^2 - 2xy + 4yz$ to the normal form.

Solution: The given quadratic form in $x, y, z$ can be written as
\[
Q(\alpha, \alpha) = (x\ y\ z)\begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & 2 \\ 0 & 2 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = X^T AX,
\]
where the associated symmetric matrix is $A$. Let us apply congruence operations on $A$ to reduce it to the normal form:
\[
A \xrightarrow{R_2 + R_1} \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & 2 \\ 0 & 2 & 3 \end{pmatrix} \xrightarrow{C_2 + C_1} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 2 \\ 0 & 2 & 3 \end{pmatrix}
\]
\[
\xrightarrow{R_3 - 2R_2} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 2 \\ 0 & 0 & -1 \end{pmatrix} \xrightarrow{C_3 - 2C_2} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}.
\]
The rank of the quadratic form is $r = 3$ and the number of positive indices of inertia is $p = 2$, which is the index. Therefore, the signature of the quadratic form is $2p - r = 1$. Here $n = r = 3$ and $p = 2 < r$, so the quadratic form is indefinite. The corresponding normal form is $x^2 + y^2 - z^2$. Let $X = PX'$, where $X'^T = (x'\ y'\ z')$ and $P$ is non-singular, transform the form into the normal form $X'^T DX'$; then $D\,(= P^T AP)$ is a diagonal matrix. By the property of elementary matrices, we get
\[
E_{32}(-2)E_{21}(1)\,A\,\{E_{21}(1)\}^T\{E_{32}(-2)\}^T = D,
\]
\[
P^T = E_{32}(-2)E_{21}(1) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ -2 & -2 & 1 \end{pmatrix}.
\]
Thus the transformation $X = PX'$ becomes
\[
x = x' + y' - 2z', \qquad y = y' - 2z', \qquad z = z'.
\]
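The non-singular transformation found above can be checked directly: $P^T AP$ must equal the diagonal matrix $\mathrm{diag}(1, 1, -1)$ (a sketch):

```python
import numpy as np

A = np.array([[ 1, -1, 0],
              [-1,  2, 2],
              [ 0,  2, 3]], dtype=float)
P = np.array([[1, 1, -2],
              [0, 1, -2],
              [0, 0,  1]], dtype=float)   # x = x'+y'-2z', y = y'-2z', z = z'

print(P.T @ A @ P)   # diag(1, 1, -1), the matrix of the normal form
```

The congruence transform $P^T AP$, not the similarity transform $P^{-1}AP$, is what diagonalizes a quadratic form under a general non-singular change of variables.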


Ex 7.6.6 Show that the quadratic form $x_1x_2 + x_2x_3 + x_3x_1$ can be reduced to the canonical form $y_1^2 - y_2^2 - y_3^2$ by means of the transformation
\[
x_1 = y_1 - y_2 - y_3, \qquad x_2 = y_1 + y_2 - y_3, \qquad x_3 = y_3.
\]
Solution: The quadratic form $x_1x_2 + x_2x_3 + x_3x_1$ in $x_1, x_2, x_3$ can be written as
\[
Q(\alpha, \alpha) = (x_1\ x_2\ x_3)\begin{pmatrix} 0 & \frac12 & \frac12 \\ \frac12 & 0 & \frac12 \\ \frac12 & \frac12 & 0 \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = X^T AX,
\]
where the associated symmetric matrix is $A$. Let us apply congruence operations on $A$ to reduce it to the normal form:
\[
A \xrightarrow{R_1 + R_2} \begin{pmatrix} \frac12 & \frac12 & 1 \\ \frac12 & 0 & \frac12 \\ \frac12 & \frac12 & 0 \end{pmatrix} \xrightarrow{C_1 + C_2} \begin{pmatrix} 1 & \frac12 & 1 \\ \frac12 & 0 & \frac12 \\ 1 & \frac12 & 0 \end{pmatrix}
\]
\[
\xrightarrow{R_2 - \frac12 R_1,\ R_3 - R_1} \begin{pmatrix} 1 & \frac12 & 1 \\ 0 & -\frac14 & 0 \\ 0 & 0 & -1 \end{pmatrix} \xrightarrow{C_2 - \frac12 C_1,\ C_3 - C_1} \begin{pmatrix} 1 & 0 & 0 \\ 0 & -\frac14 & 0 \\ 0 & 0 & -1 \end{pmatrix}
\]
\[
\xrightarrow{2R_2} \begin{pmatrix} 1 & 0 & 0 \\ 0 & -\frac12 & 0 \\ 0 & 0 & -1 \end{pmatrix} \xrightarrow{2C_2} \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}.
\]
The rank of the quadratic form is $r = 3$ and the number of positive indices of inertia is $p = 1$, which is the index. The corresponding normal form is $y_1^2 - y_2^2 - y_3^2$. By the property of elementary matrices, we get
\[
E_2(2)E_{31}(-1)E_{21}(-\tfrac12)E_{12}(1)\,A\,[E_{12}(1)]^T[E_{21}(-\tfrac12)]^T[E_{31}(-1)]^T[E_2(2)]^T = D,
\]
\[
P^T = E_2(2)E_{31}(-1)E_{21}(-\tfrac12)E_{12}(1) = \begin{pmatrix} 1 & 1 & 0 \\ -1 & 1 & 0 \\ -1 & -1 & 1 \end{pmatrix}, \quad \text{so } P = \begin{pmatrix} 1 & -1 & -1 \\ 1 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix}.
\]
Let $X = PY$, where $Y^T = (y_1\ y_2\ y_3)$ and $P$ is non-singular, transform the quadratic form into the normal form $Y^T DY$; then $D\,(= P^T AP)$ is a diagonal matrix. Thus the transformation $X = PY$ becomes
\[
x_1 = y_1 - y_2 - y_3, \qquad x_2 = y_1 + y_2 - y_3, \qquad x_3 = y_3.
\]
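The stated transformation can be verified by direct symbolic substitution (a sketch):

```python
import sympy as sp

y1, y2, y3 = sp.symbols('y1 y2 y3')

# Substitute the transformation of Ex 7.6.6 into x1*x2 + x2*x3 + x3*x1.
x1 = y1 - y2 - y3
x2 = y1 + y2 - y3
x3 = y3

Q = sp.expand(x1*x2 + x2*x3 + x3*x1)
print(Q)   # y1**2 - y2**2 - y3**2
```

All cross terms cancel, leaving exactly the canonical form claimed.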
Ex 7.6.7 Reduce the quadratic form $2x_1x_3 + x_2x_3$ to diagonal form.

Solution: Since the diagonal terms are absent and the coefficient of $x_1x_3$ is non-zero, we make the change of variables $x_1 = y_1$, $x_2 = y_2$ and $x_3 = y_3 + y_1$. The quadratic form is transformed to
\[
2y_1^2 + y_1y_2 + 2y_1y_3 + y_2y_3 = 2\left(y_1 + \tfrac14 y_2 + \tfrac12 y_3\right)^2 - \tfrac18 y_2^2 - \tfrac12 y_3^2 + \tfrac12 y_2y_3.
\]
With $z_1 = y_1 + \tfrac14 y_2 + \tfrac12 y_3$, $z_2 = y_2$ and $z_3 = y_3$, the quadratic form becomes
\[
2z_1^2 - \tfrac18 z_2^2 - \tfrac12 z_3^2 + \tfrac12 z_2z_3 = 2z_1^2 - \tfrac18 (z_2 - 2z_3)^2.
\]
Finally, making the transformation $u_1 = z_1$, $u_2 = z_2 - 2z_3$ and $u_3 = z_3$, the quadratic form becomes $2u_1^2 - \tfrac18 u_2^2$.

It can be checked that the transformation from the $x$'s to the $u$'s is $x_1 = u_1 - \tfrac14 u_2 - u_3$, $x_2 = u_2 + 2u_3$ and $x_3 = u_1 - \tfrac14 u_2$. The $u$'s can be expressed in terms of the $x$'s as $u_1 = \tfrac12 x_1 + \tfrac14 x_2 + \tfrac12 x_3$, $u_2 = 2x_1 + x_2 - 2x_3$ and $u_3 = x_3 - x_1$. Thus the given quadratic form can be written as
\[
2\left(\tfrac12 x_1 + \tfrac14 x_2 + \tfrac12 x_3\right)^2 - \tfrac18\left(2x_1 + x_2 - 2x_3\right)^2,
\]
\[
\text{or,}\quad \left(\tfrac{1}{\sqrt 2}x_1 + \tfrac{1}{2\sqrt 2}x_2 + \tfrac{1}{\sqrt 2}x_3\right)^2 - \left(\tfrac{1}{\sqrt 2}x_1 + \tfrac{1}{2\sqrt 2}x_2 - \tfrac{1}{\sqrt 2}x_3\right)^2,
\]
a sum of squares with coefficients $1$, $-1$ and $0$.

7.6.1

Jordan Canonical Form

We have shown that every complex matrix is similar to an upper triangular matrix. Also, it is similar to a diagonal matrix if and only if its minimal polynomial has distinct roots. When the minimal polynomial has repeated roots, the Jordan canonical form theorem implies that it is similar to $D + N$, where $D$ is a diagonal matrix with the eigenvalues as its diagonal elements and $N$ is a nilpotent matrix of a suitably simplified form, namely the first superdiagonal elements of $N$ are either 1 or 0, with at least one 1. In other words, the matrix is similar to a block diagonal matrix, where each diagonal block is of the form
\[
\begin{pmatrix} \lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{pmatrix},
\]
whose diagonal elements are all equal to an eigenvalue $\lambda$ and whose first superdiagonal elements are all 1. This block diagonal form is known as the Jordan canonical form.
Result 7.6.1 Let $T : V \to V$ be a linear operator whose characteristic and minimal polynomials are
\[
P(\lambda) = (\lambda - \lambda_1)^{n_1}\cdots(\lambda - \lambda_r)^{n_r}; \qquad m(\lambda) = (\lambda - \lambda_1)^{m_1}\cdots(\lambda - \lambda_r)^{m_r},
\]
where the $\lambda_i$ are the distinct eigenvalues with multiplicities $n_i, m_i$ respectively and $m_i \leq n_i$. Then $T$ has a block diagonal matrix representation $J$ whose diagonal blocks are of the form
\[
J_{ij} = \begin{pmatrix} \lambda_i & 1 & 0 & \cdots & 0 \\ 0 & \lambda_i & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_i & 1 \\ 0 & 0 & \cdots & 0 & \lambda_i \end{pmatrix}.
\]
For each $\lambda_i$ the corresponding blocks $J_{ij}$ have the following properties:
(i) There is at least one $J_{ij}$ of order $m_i$; all other $J_{ij}$ with $\lambda_i$ as diagonal element are of order $\leq m_i$.
(ii) The sum of the orders of the $J_{ij}$ is $n_i$.
(iii) The number of $J_{ij}$ having diagonal element $\lambda_i$ equals the geometric multiplicity of $\lambda_i$.


(iv) The number of $J_{ij}$ of each possible order is uniquely determined.


A $k$th order Jordan submatrix referring to the number $\lambda_0$ is a matrix of order $k$, $1 \leq k \leq n$, of the form
\[
\begin{pmatrix} \lambda_0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda_0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_0 & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda_0 \end{pmatrix}.
\]
In other words, one and the same number $\lambda_0$ from the field $F$ occupies the principal diagonal, with unity along the diagonal immediately above and zeros elsewhere. Thus
\[
[\lambda_0], \qquad \begin{pmatrix} \lambda_0 & 1 \\ 0 & \lambda_0 \end{pmatrix}, \qquad \begin{pmatrix} \lambda_0 & 1 & 0 \\ 0 & \lambda_0 & 1 \\ 0 & 0 & \lambda_0 \end{pmatrix}
\]
are respectively Jordan submatrices of first, second and third order. A Jordan matrix of order $n$ is a matrix of order $n$ having the form
\[
J = \begin{pmatrix} J_1 & 0 & \cdots & 0 \\ 0 & J_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & J_s \end{pmatrix}.
\]
The elements along the principal diagonal are Jordan submatrices, or Jordan blocks, of certain orders, not necessarily distinct, referring to certain numbers (not necessarily distinct either) lying in the field $F$. Thus, a matrix is a Jordan matrix if and only if it has the form
\[
\begin{pmatrix} \lambda_1 & \epsilon_1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda_2 & \epsilon_2 & \cdots & 0 & 0 \\ \vdots & & \ddots & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & \lambda_{n-1} & \epsilon_{n-1} \\ 0 & 0 & 0 & \cdots & 0 & \lambda_n \end{pmatrix},
\]
where $\lambda_i,\ i = 1, 2, \ldots, n$ are arbitrary numbers in $F$ and every $\epsilon_j,\ j = 1, 2, \ldots, n-1$ is equal to unity or zero. Note that if $\epsilon_j = 1$, then $\lambda_j = \lambda_{j+1}$. Diagonal matrices are a special case of Jordan matrices: they are Jordan matrices whose submatrices are all of order 1.
Theorem 7.6.1 Let $J$ be a Jordan block of order $k$. Then $J$ has exactly one eigenvalue, which is equal to the scalar on the main diagonal. The corresponding eigenvectors are the non-zero scalar multiples of the $k$-dimensional unit coordinate vector $[1, 0, \ldots, 0]^T$.

Proof: Suppose that the diagonal entries of $J$ are equal to $\lambda$. A column vector $X = [x_1, x_2, \ldots, x_k]^T$ satisfies the equation $JX = \lambda X$ if and only if its components satisfy the following $k$ scalar equations:
\[
\lambda x_1 + x_2 = \lambda x_1, \quad \lambda x_2 + x_3 = \lambda x_2, \quad \ldots, \quad \lambda x_{k-1} + x_k = \lambda x_{k-1}, \quad \lambda x_k = \lambda x_k.
\]


From the first $(k-1)$ equations we obtain $x_2 = x_3 = \cdots = x_k = 0$, so $\lambda$ is an eigenvalue for $J$ and all eigenvectors have the form $x_1[1, 0, \ldots, 0]^T$ with $x_1 \neq 0$. To show that $\lambda$ is the only eigenvalue for $J$, assume that $JX = \mu X$ for some scalar $\mu \neq \lambda$. Then the components satisfy the following $k$ scalar equations:
\[
\lambda x_1 + x_2 = \mu x_1, \quad \lambda x_2 + x_3 = \mu x_2, \quad \ldots, \quad \lambda x_{k-1} + x_k = \mu x_{k-1}, \quad \lambda x_k = \mu x_k.
\]
Because $\mu \neq \lambda$, the last relation gives $x_k = 0$, and from the other equations we get $x_{k-1} = x_{k-2} = \cdots = x_2 = x_1 = 0$. Hence only the zero vector satisfies $JX = \mu X$, so no scalar different from $\lambda$ can be an eigenvalue of $J$.

This theorem describes all the eigenvalues and eigenvectors of a Jordan block.
Ex 7.6.8 Find all possible Jordan canonical forms for those matrices whose characteristic polynomial $\Delta(t)$ and minimal polynomial $m(t)$ are as follows:
1. $\Delta(t) = (t-2)^5$; $m(t) = (t-2)^2$.
2. $\Delta(t) = (t-7)^5$; $m(t) = (t-7)^2$.
3. $\Delta(t) = (t-2)^7$; $m(t) = (t-2)^3$.
4. $\Delta(t) = (t-2)^4(t-5)^3$; $m(t) = (t-2)^2(t-5)^3$.
5. $\Delta(t) = (t-2)^4(t-3)^2$; $m(t) = (t-2)^2(t-3)^2$.

Solution: (1) Since $\Delta(t)$ has degree 5, $J$ must be a $5 \times 5$ matrix, and all diagonal elements must be 2, since 2 is the only eigenvalue. Moreover, since the exponent of $t - 2$ in $m(t)$ is 2, $J$ must have one Jordan block of order 2, and the others must be of order 2 or 1. Thus there are only two possibilities:
\[
\text{(i) } J = \mathrm{diag}\left(\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, [2]\right); \qquad \text{(ii) } J = \mathrm{diag}\left(\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, [2], [2], [2]\right).
\]
(2) In a similar way, there are only two possibilities:
\[
\text{(i) } J = \mathrm{diag}\left(\begin{pmatrix} 7 & 1 \\ 0 & 7 \end{pmatrix}, \begin{pmatrix} 7 & 1 \\ 0 & 7 \end{pmatrix}, [7]\right); \qquad \text{(ii) } J = \mathrm{diag}\left(\begin{pmatrix} 7 & 1 \\ 0 & 7 \end{pmatrix}, [7], [7], [7]\right).
\]
(3) Let $M_k$ denote a Jordan block of order $k$ with diagonal element 2. Then, in a similar way, there are only four possibilities:
(i) $\mathrm{diag}(M_3, M_3, M_1)$;
(ii) $\mathrm{diag}(M_3, M_2, M_2)$;


(iii) $\mathrm{diag}(M_3, M_2, M_1, M_1)$;
(iv) $\mathrm{diag}(M_3, M_1, M_1, M_1, M_1)$.
(4) The Jordan canonical form is one of the following block diagonal matrices:
\[
\text{(i) } J = \mathrm{diag}\left(\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, \begin{pmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{pmatrix}\right); \qquad \text{(ii) } J = \mathrm{diag}\left(\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, [2], [2], \begin{pmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{pmatrix}\right).
\]
The first matrix occurs if the linear operator $T$ has two independent eigenvectors belonging to the eigenvalue 2, and the second occurs if $T$ has three independent eigenvectors belonging to 2.
(5) The Jordan canonical form is one of the following block diagonal matrices:
\[
\text{(i) } J = \mathrm{diag}\left(\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix}\right); \qquad \text{(ii) } J = \mathrm{diag}\left(\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, [2], [2], \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix}\right).
\]
Result 7.6.2 Let $V$ be an $n$-dimensional linear space with complex scalars and let $T : V \to V$ be a linear transformation of $V$ into itself. Then there is a basis for $V$ relative to which $T$ has a block diagonal matrix representation $\mathrm{diag}(J_1, J_2, \ldots, J_m)$, with each $J_k$ being a Jordan block.
Ex 7.6.9 Find all possible Jordan canonical forms for a linear operator $T : V \to V$ whose characteristic polynomial is $\Delta(t) = (t-2)^3(t-5)^2$. In each case, find the minimal polynomial $m(t)$.

Solution: Since $t - 2$ has exponent 3 in $\Delta(t)$, 2 must appear three times on the diagonal. Similarly, 5 must appear twice. Thus there are six possibilities:
\[
\text{(i) } \mathrm{diag}\left(\begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}, \begin{pmatrix} 5 & 1 \\ 0 & 5 \end{pmatrix}\right); \qquad \text{(ii) } \mathrm{diag}\left(\begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}, [5], [5]\right);
\]
\[
\text{(iii) } \mathrm{diag}\left(\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, [2], \begin{pmatrix} 5 & 1 \\ 0 & 5 \end{pmatrix}\right); \qquad \text{(iv) } \mathrm{diag}\left(\begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}, [2], [5], [5]\right);
\]
\[
\text{(v) } \mathrm{diag}\left([2], [2], [2], \begin{pmatrix} 5 & 1 \\ 0 & 5 \end{pmatrix}\right); \qquad \text{(vi) } \mathrm{diag}([2], [2], [2], [5], [5]).
\]
The exponent of each factor in the minimal polynomial $m(t)$ is equal to the size of the largest block for that eigenvalue. Thus
\[
\text{(i) } m(t) = (t-2)^3(t-5)^2; \quad \text{(ii) } m(t) = (t-2)^3(t-5); \quad \text{(iii) } m(t) = (t-2)^2(t-5)^2;
\]
\[
\text{(iv) } m(t) = (t-2)^2(t-5); \quad \text{(v) } m(t) = (t-2)(t-5)^2; \quad \text{(vi) } m(t) = (t-2)(t-5).
\]

Ex 7.6.10 Verify that the matrix $A = \begin{pmatrix} -1 & 3 & 0 \\ 0 & 2 & 0 \\ 2 & 1 & -1 \end{pmatrix}$ has eigenvalues $2, -1, -1$. Find a non-singular matrix $C$ with initial entry $C_{11} = 1$ that transforms $A$ to the Jordan canonical form
\[
C^{-1}AC = \begin{pmatrix} 2 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \end{pmatrix}.
\]


Solution: The characteristic equation of the matrix $A$ is $|A - \lambda I| = 0$, i.e.,
\[
\begin{vmatrix} -1-\lambda & 3 & 0 \\ 0 & 2-\lambda & 0 \\ 2 & 1 & -1-\lambda \end{vmatrix} = 0 \ \Rightarrow\ (2 - \lambda)(1 + \lambda)^2 = 0 \ \Rightarrow\ \lambda = -1, -1, 2.
\]
The eigenvector corresponding to $\lambda = 2$ is obtained by solving the equations
\[
-3x + 3y = 0; \quad 2x + y - 3z = 0 \ \Rightarrow\ x = y = z.
\]
The eigenvector corresponding to $\lambda = 2$ is $k(1, 1, 1)$, where $k$ is a nonzero constant. The eigenvector corresponding to $\lambda = -1$ is obtained by solving the equations
\[
3y = 0; \quad 2x + y = 0 \ \Rightarrow\ x = y = 0.
\]
The eigenvector corresponding to $\lambda = -1$ is $(0, 0, a)$, where $a$ is an arbitrary nonzero number. We construct the matrix $C$ whose first two columns are the eigenvectors corresponding to $\lambda = 2$ and $\lambda = -1$. Since $C_{11} = 1$, we must have $k = 1$. The third column is chosen in such a way that $AC = CB$, where $B$ is the Jordan canonical form. Say $C = \begin{pmatrix} 1 & 0 & b \\ 1 & 0 & c \\ 1 & a & d \end{pmatrix}$. Therefore,
\[
\begin{pmatrix} -1 & 3 & 0 \\ 0 & 2 & 0 \\ 2 & 1 & -1 \end{pmatrix}\begin{pmatrix} 1 & 0 & b \\ 1 & 0 & c \\ 1 & a & d \end{pmatrix} = \begin{pmatrix} 1 & 0 & b \\ 1 & 0 & c \\ 1 & a & d \end{pmatrix}\begin{pmatrix} 2 & 0 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \end{pmatrix}
\]
\[
\Rightarrow\ -b + 3c = -b,\ \ 2c = -c,\ \ 2b + c - d = a - d \ \Rightarrow\ c = 0,\ a = 2b.
\]
Hence $C = \begin{pmatrix} 1 & 0 & b \\ 1 & 0 & 0 \\ 1 & 2b & d \end{pmatrix}$, where $b \neq 0$ and $d$ is arbitrary.
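sympy computes the Jordan canonical form directly; the call below returns a transforming matrix and $J$ with blocks for the eigenvalues $-1$ and $2$ (a sketch — sympy's block ordering may differ from the one chosen above):

```python
import sympy as sp

A = sp.Matrix([[-1, 3,  0],
               [ 0, 2,  0],
               [ 2, 1, -1]])

C, J = A.jordan_form()   # A == C * J * C**-1
print(J)                 # one Jordan block of size 2 for -1, one [2]
print((C * J * C.inv() - A).is_zero_matrix)   # True
```

The size-2 block for $\lambda = -1$ reflects its algebraic multiplicity 2 but geometric multiplicity 1.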

7.7

Functions of Matrix

As we define and study various functions of a variable in algebra, it is possible to define and evaluate functions of a matrix. We shall study the following functions of a matrix in this chapter: integral powers (positive and negative), fractional powers (roots), and the exponential, logarithmic, trigonometric and hyperbolic functions.

There are two methods by which a function of a matrix can be evaluated. The first is a rather straightforward method based on the diagonalization of a matrix and is therefore applicable to diagonalizable matrices only. The second method is based on the existence of a minimal polynomial and can be used to evaluate functions of any matrix.

Functions of a diagonalizable matrix
Let $A$ be a diagonalizable matrix and let $P$ be a diagonalizing matrix for $A$, so that
\[
P^{-1}AP = \Lambda, \qquad A = P\Lambda P^{-1}, \qquad (7.27)
\]
where $\Lambda$ is a diagonal matrix containing the eigenvalues of $A$. Now, if $f$ is any function of a matrix, then we have
\[
f(A) = Pf(\Lambda)P^{-1}. \qquad (7.28)
\]
Thus, if we can define a function of a diagonal matrix, we can define and evaluate the function of any diagonalizable matrix. The discussion of this chapter evidently applies to square matrices only.

7.7.1

Powers of a Matrix

We have, in fact, had many occasions so far in this book to use the powers of a matrix. Thus, we define the square of a matrix by $A^2 = AA$, the cube by $A^3 = AAA$, etc. In general, if $k$ is a positive integer, we define the $k$th power of $A$ as the matrix obtained by multiplying $A$ with itself $k$ times, that is,
\[
A^k = AAA \ldots A \ (k \text{ times}). \qquad (7.29)
\]
If $A$ is nonsingular, we have defined its inverse $A^{-1}$ as the matrix whose product with $A$ gives the unit matrix. The negative powers of $A$ are then similarly defined. If $m$ is a negative integer, let $k = -m$, so that
\[
A^m = \left(A^{-1}\right)^k = A^{-1}A^{-1} \ldots A^{-1} \ (k \text{ times}). \qquad (7.30)
\]
Finally, in analogy with the functions of a variable, we define
\[
A^0 = I. \qquad (7.31)
\]
Although all the integral powers of $A$ have thus been defined in a straightforward manner, the actual evaluation may be tedious for large values of $k$. The calculation is considerably simplified by using the diagonalizability of $A$. For, taking the $k$th power of $A$ and using the second of equations (7.27), we have
\[
A^k = \left(P\Lambda P^{-1}\right)\left(P\Lambda P^{-1}\right)\ldots\left(P\Lambda P^{-1}\right) \ (k \text{ times}) = P\Lambda^k P^{-1}. \qquad (7.32)
\]
Similarly, if $m = -k$ is a negative integer and $A$ is nonsingular, then
\[
A^m = P\Lambda^m P^{-1} = P\left(\Lambda^{-1}\right)^k P^{-1}. \qquad (7.33)
\]

Ex 7.7.1 Find $A^k$, where $k$ is any integer, positive or negative, and
\[
A = \begin{pmatrix} \frac43 & -\frac{\sqrt 2}{3} \\ -\frac{\sqrt 2}{3} & \frac53 \end{pmatrix}. \qquad (7.34)
\]
Solution: The eigenvalues and the eigenvectors of $A$ are found to be
\[
\text{(i) } 1,\ \{\sqrt 2,\ 1\}; \qquad \text{(ii) } 2,\ \{1,\ -\sqrt 2\}.
\]
We therefore have
\[
P = \begin{pmatrix} \sqrt 2 & 1 \\ 1 & -\sqrt 2 \end{pmatrix}, \qquad P^{-1}AP = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}.
\]
The matrix $A$ is seen to be nonsingular. For any integer $k$, therefore, we have
\[
A^k = P\Lambda^k P^{-1} = \begin{pmatrix} \sqrt 2 & 1 \\ 1 & -\sqrt 2 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 2^k \end{pmatrix}\cdot\frac13\begin{pmatrix} \sqrt 2 & 1 \\ 1 & -\sqrt 2 \end{pmatrix} = \frac13\begin{pmatrix} 2^k + 2 & -(2^k - 1)\sqrt 2 \\ -(2^k - 1)\sqrt 2 & 2^{k+1} + 1 \end{pmatrix}.
\]
Note that, in particular, $A^0 = I$. Also,
\[
A^{50} = \frac13\begin{pmatrix} 2^{50} + 2 & -(2^{50} - 1)\sqrt 2 \\ -(2^{50} - 1)\sqrt 2 & 2^{51} + 1 \end{pmatrix}, \qquad A^{-10} = \frac13\begin{pmatrix} 2^{-10} + 2 & -(2^{-10} - 1)\sqrt 2 \\ -(2^{-10} - 1)\sqrt 2 & 2^{-9} + 1 \end{pmatrix}.
\]
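The closed form for $A^k$ can be checked against repeated multiplication (a numerical sketch):

```python
import numpy as np

s = np.sqrt(2)
A = np.array([[4, -s], [-s, 5]]) / 3   # matrix (7.34)

lam, P = np.linalg.eigh(A)             # lam ~ [1, 2], P orthogonal
k = 50
Ak = P @ np.diag(lam**k) @ P.T         # A^k via the spectral decomposition

# Closed form from the worked example.
closed = np.array([[2**k + 2,         -(2**k - 1)*s],
                   [-(2**k - 1)*s,     2**(k+1) + 1]]) / 3

print(np.allclose(Ak, closed))                          # True
print(np.allclose(Ak, np.linalg.matrix_power(A, k)))    # True
```

Since $A$ is symmetric, its eigenvector matrix is orthogonal, so $P^{-1} = P^T$ and no explicit matrix inversion is needed.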

7.7.2

Roots of a Matrix

In elementary algebra, we say that $y$ is a $k$th root of $x$ if $y^k = x$. Similarly, we shall say that a matrix $B$ is a $k$th root of a matrix $A$ if $B^k = A$. The object is to find all the matrices $B$ which satisfy this relation for a given matrix $A$.

To begin with, consider a diagonal matrix $\Lambda$ whose elements are given by $(\Lambda)_{ij} = d_i\delta_{ij}$. It is evident that $\Lambda^k$ is again a diagonal matrix whose diagonal elements are $d_i^k$, i.e., $(\Lambda^k)_{ij} = d_i^k\delta_{ij}$. Now let $p = 1/k$ and consider a diagonal matrix $D$ whose elements are given by $(D)_{ij} = d_i^p\delta_{ij}$. Clearly, the $k$th power of $D$ will equal $\Lambda$, that is, $D^k = \Lambda$. Then consider the matrix $B = PDP^{-1} = P\Lambda^p P^{-1}$. Taking the $k$th power of $B$, we find
\[
B^k = \left(P\Lambda^p P^{-1}\right)\left(P\Lambda^p P^{-1}\right)\ldots\left(P\Lambda^p P^{-1}\right) \ (k \text{ times}) = P\Lambda P^{-1} = A. \qquad (7.35)
\]
Thus, $B = P\Lambda^p P^{-1}$ is a $k$th root of $A$. The same result holds good for any fractional power. Thus, if $q$ is any fraction, we have
\[
A^q = P\Lambda^q P^{-1}. \qquad (7.36)
\]

Ex 7.7.2 Find A^{3/7}, where A is the matrix of equation (7.34).

Solution: In this case, the matrices P and Λ associated with A are given by

P = [[2, 1], [1, −1]],   P^{−1}AP = Λ = [[1, 0], [0, 2]].

Hence we have, Λ^{3/7} = [[1, 0], [0, 2^{3/7}]]. This gives

A^{3/7} = PΛ^{3/7}P^{−1} = (1/3) [[2^{3/7} + 2, −2(2^{3/7} − 1)], [−(2^{3/7} − 1), 2^{10/7} + 1]].    (7.37)

It should be realized that the k-th root of a number is not unique. In fact, there are exactly
k k-th roots of any number except zero. Similarly, the k-th root of a matrix will not be unique.
If a matrix A has m nonzero eigenvalues, there will be k^m matrices whose k-th power equals
A.
Ex 7.7.3 Find all the square roots of the matrix

A = [[3/2, 1/2], [1/2, 3/2]].    (7.38)

Solution: The eigenvalues and the eigenvectors of the given matrix are found to be

(i) λ₁ = 2, {1, 1}ᵀ;  (ii) λ₂ = 1, {1, −1}ᵀ.

We therefore have,

P = [[1, 1], [1, −1]],   P^{−1} = (1/2) [[1, 1], [1, −1]],   Λ = [[2, 0], [0, 1]].

The square roots of Λ are Λ^{1/2} = [[±√2, 0], [0, ±1]]. We can choose any of the four matrices Λ^{1/2} to
obtain A^{1/2}. Using the relation A^{1/2} = PΛ^{1/2}P^{−1}, we find that A has four square roots, given by
±B and ±C, where

B = (1/2) [[√2 + 1, √2 − 1], [√2 − 1, √2 + 1]];   C = (1/2) [[√2 − 1, √2 + 1], [√2 + 1, √2 − 1]].
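A direct multiplication confirms that all four matrices ±B and ±C square to A:

```python
import numpy as np

s = np.sqrt(2.0)
A = np.array([[1.5, 0.5],
              [0.5, 1.5]])
B = np.array([[s + 1, s - 1],
              [s - 1, s + 1]]) / 2.0
C = np.array([[s - 1, s + 1],
              [s + 1, s - 1]]) / 2.0

for R in (B, -B, C, -C):                   # the four square roots of A
    assert np.allclose(R @ R, A)
```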


7.7.3 Evaluation of Functions Using the Cayley-Hamilton Theorem

The method discussed so far for evaluating the various functions of a matrix is based on
the principle that if A = PΛP^{−1}, where Λ is diagonal, then f(A) = Pf(Λ)P^{−1}. It has,
however, the major drawback that it is applicable to diagonalizable matrices only. There is
an alternative method for evaluating the functions of a matrix which is based on the use of
the Cayley-Hamilton theorem and which is applicable to any matrix.
We know that any polynomial of whatever degree in a matrix is equal to a polynomial
of degree m − 1, where m is the degree of the minimal polynomial. The result, in fact,
holds good not only for polynomials but also for any arbitrary function of a matrix, provided
the function is sufficiently differentiable. Thus, if the degree of the minimal polynomial of a
matrix A is m, any function f(A) can be expressed as a linear combination of the m linearly
independent matrices I, A, A², ..., A^{m−1}, i.e.,

f(A) = r(A),    (7.39)

where

r(A) = α_{m−1}A^{m−1} + α_{m−2}A^{m−2} + ··· + α₁A + α₀I.    (7.40)

The scalars αᵢ are determined as follows. If λᵢ is a k-fold degenerate eigenvalue of A, the
algebraic functions f(λ) and r(λ) satisfy the k equations

f(λᵢ) = r(λᵢ),
df(λᵢ)/dλ = dr(λᵢ)/dλ,
d²f(λᵢ)/dλ² = d²r(λᵢ)/dλ²,
...
d^{k−1}f(λᵢ)/dλ^{k−1} = d^{k−1}r(λᵢ)/dλ^{k−1}.    (7.41)

Here the notation d^l f(λᵢ)/dλ^l denotes the l-th derivative of f(λ) evaluated at λ = λᵢ. We shall
now use this method to solve some of the problems discussed earlier in this chapter and also
apply it to nondiagonalizable matrices.
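When all eigenvalues are distinct, the conditions f(λᵢ) = r(λᵢ) form a Vandermonde linear system for the coefficients αᵢ. A sketch of this solve (repeated eigenvalues would additionally require the derivative conditions of (7.41), which this minimal version omits):

```python
import numpy as np

def func_of_matrix(A, f):
    """Evaluate f(A) = r(A), with deg r = m - 1, assuming distinct eigenvalues."""
    lam = np.linalg.eigvals(A)
    V = np.vander(lam, increasing=True)     # row i: 1, lam_i, lam_i^2, ...
    alpha = np.linalg.solve(V, f(lam))      # r(t) = sum_j alpha_j t^j
    return sum(a * np.linalg.matrix_power(A.astype(complex), j)
               for j, a in enumerate(alpha))

A = np.array([[4.0, -2.0],
              [-1.0, 5.0]]) / 3.0           # distinct eigenvalues 1 and 2
assert np.allclose(func_of_matrix(A, lambda t: t**3),
                   np.linalg.matrix_power(A, 3))
```

Unlike the diagonalization route, this method never needs the eigenvectors of A.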

Ex 7.7.4 Find A^p, where p is any number and A is the matrix of equation (7.34).
Solution: The matrix A is of order 2 and has distinct eigenvalues λ₁ = 1, λ₂ = 2.
The degree of the minimal polynomial is therefore m = 2. We have f(A) = A^p, so that
f(λ) = λ^p. Let r(A) = α₁A + α₀I, so that r(λ) = α₁λ + α₀. Since both the eigenvalues are
nondegenerate, we have the two conditions

f(λ₁) = r(λ₁),   f(λ₂) = r(λ₂),    (7.42)

which give 1 = α₁ + α₀, 2^p = 2α₁ + α₀. The solution is found to be α₁ = 2^p − 1, α₀ = 2 − 2^p.
Putting these back in r(A), we have,

f(A) = r(A) = α₁A + α₀I,

or, A^p = (2^p − 1) (1/3) [[4, −2], [−1, 5]] + (2 − 2^p) [[1, 0], [0, 1]]

       = (1/3) [[2^p + 2, −2(2^p − 1)], [−(2^p − 1), 2^{p+1} + 1]].

If p is nonintegral, this will not be the only matrix equal to A^p.


Ex 7.7.5 Find all the square roots of the matrix of equation (7.38).

Solution: The matrix A is of order 2 with distinct eigenvalues λ₁ = 2, λ₂ = 1; hence m = 2. We have
f(A) = A^{1/2}, f(λ) = λ^{1/2}; let r(A) = α₁A + α₀I, so that r(λ) = α₁λ + α₀. The coefficients
α₁ and α₀ satisfy the conditions

f(λ₁) = r(λ₁),   f(λ₂) = r(λ₂),

or, ±√2 = 2α₁ + α₀,   ±1 = α₁ + α₀,

where all the four sign combinations are valid. These give the four sets of solutions

α₁ = ±(√2 − 1), α₀ = ±(2 − √2);   α₁ = ±(√2 + 1), α₀ = ∓(√2 + 2),

where we take either the upper signs or the lower signs in each pair. Using these in the
equation A^{1/2} = α₁A + α₀I, we get the four square roots ±B and ±C, where B and C are
given by

B = (1/2) [[√2 + 1, √2 − 1], [√2 − 1, √2 + 1]];   C = (1/2) [[√2 − 1, √2 + 1], [√2 + 1, √2 − 1]].

7.8 Series

A series such as

S = Σ_{k=0}^∞ a_k A^k    (7.43)

in a matrix A, where the a_k are scalar coefficients, is said to converge if and only if every element
of the right-hand side converges. In that case, the series of equation (7.43) is equal to the
matrix f(A), of the same order as A, whose elements are given by

[f(A)]_{ij} = [Σ_{k=0}^∞ a_k A^k]_{ij}.    (7.44)

We shall state without proof the result that a series f(A) in a matrix A is convergent if and
only if the corresponding algebraic series f(λᵢ) is convergent for every eigenvalue λᵢ of A.
Thus, if

f(λ) = Σ_{k=0}^∞ a_k λ^k    (7.45)

exists for |λ| < R, then

f(A) = Σ_{k=0}^∞ a_k A^k    (7.46)

exists if and only if every eigenvalue λᵢ of A satisfies |λᵢ| < R. R is called the radius of
convergence of the series.

7.8.1 Exponential of a Matrix

The exponential series is defined by

e^λ = Σ_{k=0}^∞ λ^k / k!    (7.47)


and is convergent for every finite value of λ. Similarly, we shall define the exponential of a
matrix A by

e^A = Σ_{k=0}^∞ A^k / k!,    (7.48)

which will be a matrix of the same order as A and will exist for every finite square matrix
A because all of its elements (and hence eigenvalues) are finite.
To begin with, let us obtain the exponential of a diagonal matrix. Let Λ be a diagonal
matrix with elements (Λ)_{ij} = λᵢ δ_{ij}. Then

e^Λ = Σ_{k=0}^∞ Λ^k / k!.    (7.49)

The ij-element of exp(Λ) will be given by

[exp(Λ)]_{ij} = Σ_{k=0}^∞ (Λ^k)_{ij} / k! = Σ_{k=0}^∞ λᵢ^k δ_{ij} / k! = e^{λᵢ} δ_{ij}.    (7.50)

It is therefore evident that if

Λ = diag(λ₁, λ₂, ..., λₙ),    (7.51)

then

exp(Λ) = diag(exp(λ₁), exp(λ₂), ..., exp(λₙ)).    (7.52)

Now consider the series

exp(A) = I + A + A²/2! + A³/3! + ··· + A^k/k! + ···    (7.53)

Let P be a matrix which brings A to the diagonal form Λ. Multiplying equation (7.53) from
the left by P^{−1} and from the right by P, and remembering that P^{−1}A^k P = Λ^k, we have

P^{−1}(exp(A))P = I + Λ + Λ²/2! + Λ³/3! + ··· + Λ^k/k! + ··· = exp(Λ).    (7.54)

It follows immediately that

exp(A) = P(exp(Λ))P^{−1}.    (7.55)

Having defined the exponential function, it is possible to define the matrix exponent of
any number. In elementary algebra, we have a^x = exp(x ln a), where ln denotes the
natural logarithm. Similarly, we define

a^A = exp(A ln a) = P exp(Λ ln a) P^{−1},    (7.56)


where

exp(Λ ln a) = a^Λ = diag(a^{λ₁}, a^{λ₂}, ..., a^{λₙ}).    (7.57)

Ex 7.8.1 Find e^A and 4^A if A is the matrix of equation (7.38).

Solution: The matrices P, Λ and P^{−1} are given by

P = [[1, 1], [1, −1]],   P^{−1} = (1/2) [[1, 1], [1, −1]],   Λ = [[2, 0], [0, 1]].

Thus, we have,

e^Λ = [[e², 0], [0, e]],   4^Λ = [[4², 0], [0, 4]] = [[16, 0], [0, 4]].

Therefore,

e^A = P e^Λ P^{−1} = (1/2) [[e² + e, e² − e], [e² − e, e² + e]],

and 4^A = P 4^Λ P^{−1} = [[10, 6], [6, 10]].    (7.58)

7.8.2 Logarithm of a Matrix

We say that x is the natural logarithm of y if e^x = y, and write it as x = ln y. Similarly,
given a matrix A, we shall say that a matrix B is the natural logarithm of A if e^B = A.
Therefore, by definition,

B = ln A  ⟺  exp(B) = A.    (7.59)

We shall first find the logarithm of a diagonal matrix. If Λ = [λᵢ δ_{ij}] is a diagonal matrix,
then its logarithm is the diagonal matrix of the same order given by

D = ln Λ = diag(ln λ₁, ln λ₂, ..., ln λₙ).    (7.60)

To prove this, consider exp(D). Remembering that exp(ln x) = x and using Eq. (7.52), it
is easy to see that

exp(D) = exp(ln Λ) = Λ.    (7.61)

Therefore, by definition, it follows that D is the natural logarithm of Λ.


Now let A be the diagonalizable matrix of Eqs. (7.27), A = PΛP^{−1}, and let B = PDP^{−1}, where D = ln Λ. We have

e^B = I + B + B²/2! + ··· + B^k/k! + ···
    = I + PDP^{−1} + (PDP^{−1})²/2! + ··· + (PDP^{−1})^k/k! + ···
    = P(I + D + D²/2! + ··· + D^k/k! + ···)P^{−1}
    = P e^D P^{−1} = PΛP^{−1} = A.    (7.62)

Thus ln A = PDP^{−1} = P(ln Λ)P^{−1}.


Ex 7.8.2 Find the logarithm of the matrix

A = [[39, −50, −20], [15, −16, −10], [30, −50, −11]].    (7.63)

Solution: For the given matrix A, the matrices P, Λ and P^{−1} are found to be

P = [[3, 1, 2], [1, 1, 1], [2, −1, 2]],   Λ = [[9, 0, 0], [0, 9, 0], [0, 0, −6]],   P^{−1} = (1/3) [[3, −4, −1], [0, 2, −1], [−3, 5, 2]].

We have,

ln Λ = [[ln 9, 0, 0], [0, ln 9, 0], [0, 0, ln 6 + iπ]].

Therefore, ln A is given by

ln A = P(ln Λ)P^{−1}
     = (1/3) [[9a − 6b, 10(b − a), 4(b − a)], [3(a − b), 5b − 2a, 2(b − a)], [6(a − b), 10(b − a), 4b − a]],

where a = ln 9, b = ln(−6) = ln 6 + iπ.

7.9 Hyperbolic and Trigonometric Functions

The exponential function also leads to the hyperbolic and the trigonometric functions of a
matrix. Thus, for any square matrix A, we define the hyperbolic functions by

sinh A = (1/2)(e^A − e^{−A}) = Σ_{k=0}^∞ A^{2k+1} / (2k+1)!,    (7.64)

cosh A = (1/2)(e^A + e^{−A}) = Σ_{k=0}^∞ A^{2k} / (2k)!,    (7.65)

which exist for any matrix A with finite elements. Similarly, the trigonometric functions are
defined by

sin A = (1/2i)(e^{iA} − e^{−iA}) = Σ_{k=0}^∞ (−1)^k A^{2k+1} / (2k+1)!,    (7.66)

cos A = (1/2)(e^{iA} + e^{−iA}) = Σ_{k=0}^∞ (−1)^k A^{2k} / (2k)!,    (7.67)

which also exist for any matrix A with finite elements.


Ex 7.9.1 Find sin A and cos A, where

A = π [[−47/2, −53, −30], [12, 53/2, 15], [−2, −7/2, −2]].    (7.68)

Solution: The matrices P, Λ and P^{−1} associated with the given matrix A are found to be

P = [[5, −8, 1], [0, 2, −1], [−4, 3, 1]],   Λ = [[π/2, 0, 0], [0, π, 0], [0, 0, −π/2]],   P^{−1} = [[5, 11, 6], [4, 9, 5], [8, 17, 10]].

We therefore have,

sin Λ = Σ_{k=0}^∞ (−1)^k Λ^{2k+1} / (2k+1)!
      = [[sin(π/2), 0, 0], [0, sin π, 0], [0, 0, sin(−π/2)]] = [[1, 0, 0], [0, 0, 0], [0, 0, −1]].

Therefore,

sin A = P sin(Λ) P^{−1} = [[17, 38, 20], [8, 17, 10], [−28, −61, −34]].

Similarly,

cos Λ = [[0, 0, 0], [0, −1, 0], [0, 0, 0]],

so that

cos A = P cos(Λ) P^{−1} = [[32, 72, 40], [−8, −18, −10], [−12, −27, −15]].

Exercise 7
Section-A
[Multiple Choice Questions]

1. The characteristic equation of the matrix [[1, 2], [−1, 3]] is
(a) λ² − 2λ + 2 = 0  (b) λ³ − 4λ + 6 = 0  (c) λ² − 4λ + 4 = 0  (d) λ² − 4λ + 6 = 0


2. If A = [[1, 1], [2, 2]], then the value of the matrix expression A² − 3A + 3I is
(a) 0  (b) 2I  (c) I  (d) A

3. The eigenvalues of the matrix [[1, 2, 4, 2], [0, 3, 1, 5], [0, 0, 0, 1], [0, 0, 0, 2]] are
(a) {1, 3, 1, 2}  (b) {1, 3, 0, 2}  (c) {1, 2, 4, 2}  (d) {1, 0, 0, 0}

4. The sum of the eigenvalues of A = [[1, 2, 3], [0, 2, 3], [0, 0, 2]] is
(a) 5  (b) 2  (c) 1  (d) 6

5. The sum of the eigenvalues of the matrix [[1, 2, 3], [4, 5, 6], [2, 1, 1]] is  [WBUT 2007]
(a) 7  (b) 6  (c) 4  (d) 5

448

Matrix Eigenfunctions

6. The product of the eigenvalues of the matrix [[1, 2, 5], [0, 3, 0], [0, 0, −4]] is
(a) 0  (b) −12  (c) 12  (d) 10

7. If λ is an eigenvalue of the matrix A, then one eigenvalue of the matrix 5A is
(a) λ²  (b) λ  (c) 1  (d) 5λ

8. If λ is an eigenvalue of A, then one eigenvalue of A³ is
(a) λ  (b) λ²  (c) λ³  (d) kλ

9. Let A = [[1, −1], [0, 2]]. The eigenvalues of A⁵ are
(a) {1, 2}  (b) {1, −1}  (c) {1, 32}  (d) {1, 10}

10. If 2 is an eigenvalue of a non-singular matrix A, then 1/2 is an eigenvalue of
(a) A²  (b) A  (c) 2A  (d) A⁻¹

11. If a matrix A satisfies the relation A² = A, then the eigenvalues of A are
(a) {0, 1}  (b) {0, −1}  (c) {1, −1}  (d) {1, 2}

12. The eigenvalues of the matrix [[1, 2, 3], [2, 4, 5], [3, 5, 6]] are all
(a) zero  (b) real  (c) imaginary  (d) real/imaginary.


13. The eigenvectors of the matrix [[1, 1], [−1, 3]] are
(a) {(1, 1)ᵀ}  (b) {(1, 3)ᵀ}  (c) {(2, 2)ᵀ, (1, 1)ᵀ}  (d) {(1, 1)ᵀ, (1, 3)ᵀ}

14. If 1, 5 are the eigenvalues of the matrix A and if P diagonalises it, then P⁻¹AP is
(a) [[1, 5], [0, 0]]  (b) [[1, 0], [0, 5]]  (c) [[1, 0], [5, 0]]  (d) [[2, 0], [0, 10]]
15. If 2 is an eigenvalue of the matrix A of order 2 × 2, then the rank of the matrix A − 2I is
(a) 0  (b) 2  (c) 1  (d) none of these

16. The algebraic multiplicity of the eigenvalue 1 of the matrix A = [[1, 2, 0], [0, 2, 0], [0, 0, 1]] is
(a) 0  (b) 1  (c) 2  (d) 3

17. If A is a real orthogonal matrix, then the magnitude of its eigenvalues is
(a) 1  (b) 2  (c) 3  (d) 1/2

18. The characteristic roots of a real skew-symmetric matrix are
(a) all real  (b) all zeros  (c) all imaginary  (d) either all zeros or purely imaginary.


19. The characteristic roots of the matrix A = [[8, −6, 2], [−6, 7, −4], [2, −4, 3]] are
(a) 0, 3, 7  (b) 0, 5, 15  (c) 0, 3, 15  (d) 1, 3, 7

20. If the characteristic values of a square matrix of third order are 4, 2, 3, then the value
of its determinant is
(a) 6  (b) 9  (c) 24  (d) 54

21. If λ is a non-zero characteristic root of a non-singular matrix A, then a characteristic
root of A⁻¹ is
(a) |A|/λ  (b) |A|λ  (c) 1/λ  (d) |A|

22. The characteristic equation of the matrix A = [[1, 0, 2], [0, 2, 1], [2, 0, 3]] is
(a) λ³ − 6λ² + 5λ − 3 = 0  (b) −λ³ + 6λ² − 7λ − 2 = 0  (c) λ³ + 6λ² + 7λ + 3 = 0  (d) λ³ + 6λ² − 7λ − 2 = 0

23. For the matrix A = [[0, 1], [1, 0]], A⁻¹ is equal to
(a) I  (b) A  (c) 2A  (d) (1/2)A

24. Suppose the matrix A = [[40, −29, −11], [−18, 30, −12], [26, 24, −50]] has a certain complex number λ ≠ 0 as
an eigen value. Which of the following numbers must also be an eigen value of A?  [NET(Dec)11]
(a) λ + 20  (b) λ − 20  (c) 20 − λ  (d) −20 − λ
25. Let A be a 3 × 3 matrix with trace(A) = 3 and det(A) = 2. If 1 is an eigen value of A,
then the eigen values of A² − 2I are
(a) 1, 2(i − 1), −2(i + 1)  (b) −1, 2(i − 1), −2(i + 1)  (c) 1, 2(i + 1), −2(i + 1)  (d) −1, 2(i − 1), 2(i + 1)
26. Let A, B be n × n positive definite matrices and I be the n × n identity matrix. Then
which of the following are positive definite?  [NET(June)11]
(a) A + B  (b) ABA  (c) A² + I  (d) AB

27. Let N be a 3 × 3 non-zero matrix with the property N³ = 0. Which of the following
is/are true?  [NET(June)11]
(a) N is not similar to a diagonal matrix  (b) N is similar to a diagonal matrix  (c)
N has one non-zero eigen vector  (d) N has three linearly independent eigen vectors.

28. Let ω be a complex number such that ω³ = 1, but ω ≠ 1. If A = [[1, ω, ω²], [ω, ω², 1], [ω², 1, ω]], then
which of the following statements are true?  [NET(Dec)11]
(a) A is invertible  (b) rank(A) = 2  (c) 0 is an eigen value of A  (d) there exist
linearly independent vectors v, w ∈ ℂ³ such that Av = Aw = 0.


29. Let A = [[0, 0, 0, 4], [1, 0, 0, 0], [0, 1, 0, 5], [0, 0, 1, 0]]; then a Jordan canonical form of A is given by  [NET(Dec)11]

1 0 0 0
1 1 0 0
110 0
1 1 0 0
0 10 0
0 10 0
0 1 0 0
0 1 0 0

(a)
0 0 2 0 (b) 0 0 2 0 (c) 0 0 2 0 (d) 0 0 2 0
0 0 0 2
0 0 0 2
0 0 0 2
0 0 0 2
30. Which of the following matrices are positive definite?  [NET(June)12]
(a) [[2, 1], [1, 2]]  (b) [[1, 2], [2, 1]]  (c) [[4, −1], [−1, 4]]  (d) [[0, 4], [4, 0]]

31. Let A be a non-zero linear transformation on a real vector space V of dimension n.
Let the subspace V₀ ⊂ V be the image of V under A. Let k = dim V₀ < n and suppose
that for some λ ∈ ℝ, A² = λA. Then  [NET(June)12]
(a) λ = 1  (b) det A = |λ|ⁿ  (c) λ is the only eigenvalue of A  (d) there is a non-trivial
subspace V₁ ⊂ V such that Ax = 0 for all x ∈ V₁.
32. Let N be a non-zero 3 × 3 matrix with the property N² = 0. Which of the following
is/are true?  [NET(June)12]
(a) N is not similar to a diagonal matrix  (b) N is similar to a diagonal matrix  (c)
N has one non-zero eigenvector  (d) N has three linearly independent eigenvectors.
33. Let A be a 2 × 2 non-zero matrix with entries in ℂ such that A² = 0. Which of the
following statements must be true?  [NET(Dec)11]
(a) PAP⁻¹ is diagonal for some invertible 2 × 2 matrix P with entries in ℝ  (b) A
has two distinct eigen values in ℂ  (c) A has only one eigen value in ℂ with multiplicity
2  (d) Av = v for some v ∈ ℂ², v ≠ 0.
34. Let α, β be distinct eigen values of a 2 × 2 matrix A. Then which of the following
statements must be true?  [NET(Dec)11]
(a) A² has distinct eigen values.  (b) A³ = ((α³ − β³)/(α − β)) A − αβ(α + β) I.  (c) Trace of Aⁿ is
αⁿ + βⁿ for every positive integer n.  (d) Aⁿ is not a scalar multiple of identity for
any positive integer n.

35. Let A = [[0, 0, 0, 4], [1, 0, 0, 5], [0, 1, 0, 5], [0, 0, 1, 0]]; then the Jordan canonical form of A is  [NET(Dec)11]

1 0 0 0
1 1 0 0
110 0
1 1 0 0
0 10 0
0 10 0
0 1 0 0
0 1 0 0

(a)
0 0 2 0 (b) 0 0 2 0 (c) 0 0 2 0 (d) 0 0 2 0
0 0 0 2
0 0 0 2
0 0 0 2
0 0 0 2
36. Let A be a 3 3 matrix with real entries such that det(A) = 6 and the trace of A is
0. If det(A + I) = 0, then the eigen values of A are
[NET(Dec)11]
(a) -1,2,3
(b) -1,2,-3
(c) 1,2,-3
(d) -1,-2,3.
37. Let A be a 4 4 matrix with real entries such that -1,1,2,-2 are the eigen values. If
B = A4 5A2 + 5I, then which of the following statements are true? [NET(Dec)11]
(a) det(A + B) = 0 (b) det(B) = 1 (c) trace of A B is 0 (d) trace of A + B is 4.


38. Let J be the 3 × 3 matrix all of whose entries are 1. Then  [NET(Dec)11]
(a) 0 and 3 are the only eigen values of J  (b) J is positive semidefinite, i.e., ⟨Jx, x⟩ ≥
0 for all x ∈ ℝ³  (c) J is diagonalizable  (d) J is positive definite, i.e., ⟨Jx, x⟩ > 0 for
all x ∈ ℝ³ with x ≠ 0.
Section-B
[Objective Questions]

1. Show that the eigen values of a Hermitian matrix are all real.

2. Let A = R(θ) = [[cos θ, −sin θ], [sin θ, cos θ]]. Show that A does not have any real eigen value unless
A = ±I.


21
3. Verify Cayley-Hamilton Theorem for the square matrix A =
.
05


3 5
4. Using Cayley-Hamilton Theorem, find the inverse of the matrix A =
.
1 2
5. What are the possible eigen values of a square matrix A (over the field <) satisfying
A3 = A ?
6. Let V be the vector space of all real differentiable functions over the field of reals and
D : V V be the differential operator. Is every non-zero real an eigen value of D?
Support your answer.
Section-C
[Long Answer Questions]

1. Prove that the spectrum of [[1, 0, 2], [0, −1, 1], [0, −1, 0]] is {1, ω, ω²}.
2. Let A be an n × n normal matrix. Show that Ax = λx if and only if A*x = λ̄x. If A
has, in addition, n distinct real eigenvalues, show that A is Hermitian.  [Gate 96]

3. Let A be a 6 × 6 diagonal matrix with characteristic polynomial x(x + 1)²(x − 1)³.
Find the dimension of 𝒱, where 𝒱 = {B ∈ M₆(ℝ) : AB = BA}.  [Gate 97]

4. The eigen values of a 3 × 3 real matrix P are 1, −2, 3; show that
P⁻¹ = (1/6)(5I + 2P − P²).  [Gate 96]

5. Show that the characteristic polynomial of αI + βA, in terms of that of A, is (α − λ)ⁿ
or βⁿ χ_A((λ − α)/β) according as β = 0 or not.

6. Let 1 and 2 be the two distinct eigen values of a real square matrix A. If u and v are
eigen vectors of A corresponding to the eigen values 1 and 2 respectively, examine
whether u + v an eigen vector of A.
7. Find the distinct eigen values of U, where U is a 3 × 3 complex Hermitian and unitary
matrix.
[Gate96]
8. Let V be the vector space of all real differentiable functions over the field of reals and
D : V V be the differential operator. Is every non zero real an eigen value of D?
Support your answer.
[ BH06]


9. Let A be a 3 × 3 matrix with eigen values 1, −1, 0.

10. Find the cubic equation which is satisfied by the matrix A = [[1, 2, 1], [0, 1, 1], [3, 1, 1]].


11. Use the Cayley-Hamilton theorem to find A⁻¹, where A = [[2, 1], [3, 5]].  [SET 10, BH 00]
12. State the Cayley-Hamilton theorem and verify the same for the matrix
(i) [[2, 1], [1, 1]]  [BH98]  (iii) [[2, 1], [0, 3]]  [BH00]  (iv) [[3, 5], [1, 2]]  [BH06]
(v) [[1, 3, 6], [3, 5, 6], [3, 3, 4]]  [VH02]  (vi) [[1, −3, 3], [3, −5, 3], [6, −6, 4]]  [CH98]  (vii) [[0, 1, 1], [1, 0, 2], [1, 1, 0]]  [CH03]
Also express A⁻¹ as a polynomial in the matrix A, and find A⁻¹ using the Cayley-Hamilton
theorem.

13. Using the Cayley-Hamilton theorem find A⁵⁰, for A = [[1, 0, 0], [1, 0, 1], [0, 1, 0]].  [BH04]
14. Find the eigenvalues and eigenvectors of the matrix
(i) [[3, 2, 2], [1, 4, 1], [2, 4, 1]]  [BH98]  (ii) [[1, 1, 1], [1, 1, 1], [0, 0, 1]]  [BH99]  (iii) [[8, −6, 2], [−6, 7, −4], [2, −4, 3]]  [BH00]
(iv) [[1, −3, 3], [3, −5, 3], [6, −6, 4]]  [BH03]  (v) [[2, 0, 0], [0, 3, 0], [0, 0, 5]]  [VH05]  (vi) [[1, 3, 0], [3, 2, 0], [0, 0, 1]]  [BH05]
(vii) [[6, −2, 2], [−2, 3, −1], [2, −1, 3]]  [CH05]  (viii) [[3, 0, 3], [0, 3, 0], [3, 0, 3]]  [CH95]  (ix) [[0, 0, 1], [0, 1, 0], [1, 0, 0]]  [CH97]
(x) [[1, 2, 2], [2, 2, 4], [3, 3, 6]]  [CH00]  (xi) [[2, 2, 3], [1, 1, 1], [1, 3, 1]]  [CH02]
Also find the eigen values of A⁻¹.

15. Find matrices P and Q such that [[1, 1, 1], [1, 1, 1], [3, 1, 1]] is in normal form.  [CH04]

16. Let (i) A = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]  [BH98]  (ii) A = [[1, 2, 1], [2, 5, 2], [1, 2, 17]]  [CH97, VH02]
(iii) A = [[2, 1, 1], [1, 3, 1], [1, 2, 2]]  [CH99]  (iv) A = [[3, 2, 1], [2, 3, 1], [0, 0, 1]]  [CH01];
find a non-singular matrix P such that P A Pᵀ is a diagonal matrix.
1
17. Find an orthogonal matrix P such that P⁻¹AP is a diagonal matrix, where
A = (i) [[2, −2], [−2, 5]]  (ii) [[2, 2, 0], [2, 1, 2], [0, 2, 0]]  [BH 02]  (iii) [[5, −6, −6], [−1, 4, 2], [3, −6, −4]]  [JU(M.Sc.)06]

18. Obtain a non-singular transformation that will reduce the quadratic form x₁² + 2x₂² +
x₃² − 2x₁x₂ − 2x₂x₃ to the normal form.  [BU (M.Sc) 99]

19. Verify that the quadratic form 5x₁² + 26x₂² + 10x₃² + 4x₂x₃ + 14x₃x₁ + 6x₁x₂ is positive
semi-definite.  [BU (M.Sc) 03]




20. Diagonalise the matrix (i) [[2, 0], [1, 3]]  [BH99]  (ii) [[3, −1], [−1, 3]]  [BH00]
(iii) [[2, 2, 0], [2, 1, 2], [0, 2, 0]]  [BH04, 05]  (iv) [[4, −2], [3, −1]].

21. Show that the matrix [[2, 1], [0, 2]] is not diagonalisable.  [CH98]

22. Verify that the matrix A = [[11, 7, 5], [16, 11, 6], [12, 6, 7]] has eigen values 1, 3, 3. Find a non-singular
matrix C with initial entry C₁₁ = 1 that transforms A to the Jordan canonical form
C⁻¹AC = [[1, 0, 0], [0, 3, 1], [0, 0, 3]].
23. Find an orthogonal transformation x = Py to reduce the quadratic form q = xᵀAx =
x₁² + 4x₁x₂ + x₂² − x₃² on ℝ³ to a diagonal form yᵀDy, where the diagonal elements of
D are the eigenvalues of A. Hence find the signature and determine the nature of the
definiteness of q.  [BU (M.Sc) 02]

24. Let T : V → V be a linear operator on a finite dimensional vector space V over the
field K and let p(t) be the minimal polynomial of T. If T is diagonalizable, show that
p(t) = (t − λ₁)(t − λ₂) ··· (t − λᵣ) for some distinct scalars λ₁, λ₂, ..., λᵣ.  [Gate 02]

25. Let Jₙ be the n × n matrix each of whose entries equals 1. Find the nullity and the
characteristic polynomial of Jₙ.  [Gate 03]
26. Determine all possible canonical forms J for the matrix of order 5 whose minimal
polynomial is m(t) = (t − 2)².

27. If A is a complex 5 × 5 matrix with characteristic polynomial f(t) = (t − 2)³(t + 7)²
and minimal polynomial m(t) = (t − 2)²(t + 7), write the Jordan canonical form of
A.  [BU (M.Sc) 03]

28. Find a matrix whose minimal polynomial is (x − 1)(x − 2).

29. Find the quadratic form to which x₁² + 2x₂² − x₃² + 2x₁x₂ + x₂x₃ transforms by the change
of variables y₁ = x₁ − x₃, y₂ = x₂ − x₃, y₃ = x₃ by actual substitution. Verify that
the matrix of the resulting quadratic form is congruent to the matrix of the original
quadratic form.

30. Show that the quadratic form x₁x₂ + x₂x₃ + x₃x₁ can be reduced to the canonical form
y₁² − y₂² − y₃² by means of the transformation x₁ = y₁ − y₂ − y₃, x₂ = y₁ + y₂ − y₃, x₃ = y₃.
[BU (M.Sc) 98]

31. Reduce the quadratic form 2x₁x₃ + x₂x₃ to diagonal form.


32. Find the Jordan normal form of the matrix A over the field of real numbers, where
A = [[4, 1, 1], [4, 0, 2], [2, 1, 3]].  [BU (M.Sc) 99]

Answers
Section-A [Multiple Choice Questions]
1. d  2. c  3. b  4. a  5. a  6. b  7. d  8. c  9. c  10. d  11. a  12. b  13. a  14. b  15. c  16. c  17. a

Chapter 8

Boolean Algebra

In this chapter we shall adopt the definition of a modern abstract mathematical structure
known as Boolean algebra, introduced by the famous mathematician George Boole; we shall
use the axiomatic definition given by Huntington [1904]. This algebra has become an
essential tool for the analysis and design of electronic computers, dial telephones, switching
systems and many kinds of electronic control devices.

8.1 Operation

To describe modern concepts like the algebra of sets, we shall first define what is called an
operation.

8.1.1 Unary Operation

A unary operation on a set of elements is a rule which assigns to every element in the set
another element from the set. For example, if S is the set of positive real numbers, the
function square root is a unary operation which assigns to each a ∈ S the element √a.

8.1.2 Binary Operation

A binary operation on a set of elements is a rule which assigns a unique element from
the set to each pair of elements from the set. For example, ordinary addition, subtraction
and multiplication over the set of real numbers are examples of binary operations.

8.2 Boolean Algebra

Boolean algebra can be defined either as an algebraic system or as a special lattice.

8.2.1 Boolean Algebra as a Lattice

A Boolean algebra B is a complemented distributive lattice. Equivalently, a lattice B is a
Boolean algebra if
(i) B is bounded, with bounds 0 and 1;
(ii) every element a ∈ B has a complement a′ satisfying a ∨ a′ = 1 and a ∧ a′ = 0;
(iii) a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c) and a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c) for all a, b, c ∈ B.
8.2.2 Boolean Algebra as an Algebraic System

A non-empty set B of elements a, b, c, ..., on which two binary operators + (called addition) and · (called multiplication) and one unary operator ′ (called complementation) are
defined, is said to be a Boolean algebra {B, +, ·, ′} if the following postulates are satisfied:
P1: Closure Property:
(i) Closure with respect to +, i.e., for all a, b ∈ B we have a + b ∈ B.
(ii) Closure with respect to ·, i.e., for all a, b ∈ B we have a · b ∈ B.
P2: Commutative law:
(i) The operator + is commutative, i.e., a + b = b + a for all a, b ∈ B.
(ii) The operator · is commutative, i.e., a · b = b · a for all a, b ∈ B.
P3: Existence of identity:
(i) For every a ∈ B, there exists an identity element 0 ∈ B such that a + 0 = 0 + a = a.
(ii) For every a ∈ B, there exists an identity element 1 ∈ B such that a · 1 = 1 · a = a.
P4: Distributive law:
Each of the operations + and · is distributive over the other, i.e.,
a · (b + c) = a · b + a · c; (·) over (+), for all a, b, c ∈ B;
a + (b · c) = (a + b) · (a + c); (+) over (·), for all a, b, c ∈ B.
P5: Existence of complement:
For every element a ∈ B, there exists an element a′ ∈ B (called the complement of a) such that
a + a′ = a′ + a = 1, the identity element for ·;
a · a′ = a′ · a = 0, the identity element for +.
These axioms were given by Huntington [1904]. A Boolean algebra is generally denoted by
a 6-tuple {B, +, ·, ′, 0, 1} or simply by {B, +, ·, ′}.
(i) Notice that 0′ = 1 and 1′ = 0, for, by P3 and P4, we have 1 + 0 = 1 and 1 · 0 = 0. Thus
the identity elements are complementary to each other.
(ii) A trivial Boolean algebra is given by {{0}, +, ·, ′}. Here 0 + 0 = 0, 0 · 0 = 0, 0′ = 0 and
1 = 0. All the axioms P1, ..., P5 hold trivially with both sides 0.
Ex 8.2.1 Let B = {1, 2, 3, 5, 6, 10, 15, 30}, the set of all positive divisors of 30. For a, b ∈ B,
let the binary and unary operations on B be defined as
(i) a + b = the LCM of a and b;
(ii) a · b = the GCD of a and b;
(iii) a′ = 30/a.
Prove that {B, +, ·, ′} is a Boolean algebra.

Solution: We have the following composition tables for + (LCM) and · (GCD), together with the complements a′ = 30/a:

+  | 1  2  3  5  6  10 15 30
1  | 1  2  3  5  6  10 15 30
2  | 2  2  6  10 6  10 30 30
3  | 3  6  3  15 6  30 15 30
5  | 5  10 15 5  30 10 15 30
6  | 6  6  6  30 6  30 30 30
10 | 10 10 30 10 30 10 30 30
15 | 15 30 15 15 30 30 15 30
30 | 30 30 30 30 30 30 30 30

·  | 1  2  3  5  6  10 15 30
1  | 1  1  1  1  1  1  1  1
2  | 1  2  1  1  2  2  1  2
3  | 1  1  3  1  3  1  3  3
5  | 1  1  1  5  1  5  5  5
6  | 1  2  3  1  6  2  3  6
10 | 1  2  1  5  2  10 5  10
15 | 1  1  3  5  3  5  15 15
30 | 1  2  3  5  6  10 15 30

a  | 1  2  3  5  6  10 15 30
a′ | 30 15 10 6  5  3  2  1

P1: From the composition tables, we see that B is closed under +, · and ′.
P2: Both the binary operations are commutative, since from the tables we have
a + b = b + a and a · b = b · a,
as the LCM of a and b is the LCM of b and a, and the GCD of a and b is the GCD of b and a.
P3: Each operation is distributive over the other, since, with the help of the properties of
LCM and GCD, we get
a · (b + c) = a · b + a · c and a + b · c = (a + b) · (a + c) for all a, b, c ∈ B.
P4: 1 is the identity element for (+), since
a + 1 = 1 + a = a for all a ∈ B.
Similarly, 30 is the identity element for (·), since
a · 30 = 30 · a = a for all a ∈ B.
P5: The complement of the element 3 is 3′ = 30/3 = 10, and

3 + 3′ = 3 + 10 = 30, the identity element for ·;
3 · 3′ = 3 · 10 = 1, the identity element for +.

Thus for any element a ∈ B, there exists a′ ∈ B such that a + a′ = 30, the identity element for (·), and
a · a′ = 1, the identity element for (+).
Hence {B, +, ·, ′} is a Boolean algebra, in which 1 is the zero element and 30 is the unit
element.
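The verification can also be done by brute force; a sketch using Python's math.gcd (here + is LCM, · is GCD, and a′ = 30/a):

```python
from math import gcd

B = [1, 2, 3, 5, 6, 10, 15, 30]
lcm = lambda a, b: a * b // gcd(a, b)
comp = lambda a: 30 // a

# complement laws: a + a' = 30 (the unit element), a . a' = 1 (the zero element)
assert all(lcm(a, comp(a)) == 30 and gcd(a, comp(a)) == 1 for a in B)

# each operation distributes over the other
assert all(gcd(a, lcm(b, c)) == lcm(gcd(a, b), gcd(a, c))
           for a in B for b in B for c in B)
assert all(lcm(a, gcd(b, c)) == gcd(lcm(a, b), lcm(a, c))
           for a in B for b in B for c in B)
```

The complement law hinges on 30 = 2 · 3 · 5 being square-free, which is why the next example fails.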
Ex 8.2.2 Let B be the set of all positive divisors of 48. For a, b ∈ B, let the binary and
unary operations on B be defined as
(i) a + b = the LCM of a and b;
(ii) a · b = the GCD of a and b;
(iii) a′ = 48/a.
Prove that {B, +, ·, ′} is not a Boolean algebra.

Solution: Here B = {1, 2, 3, 4, 6, 8, 12, 16, 24, 48}, and the composition tables (LCM for +, GCD for ·) are:

+  | 1  2  3  4  6  8  12 16 24 48
1  | 1  2  3  4  6  8  12 16 24 48
2  | 2  2  6  4  6  8  12 16 24 48
3  | 3  6  3  12 6  24 12 48 24 48
4  | 4  4  12 4  12 8  12 16 24 48
6  | 6  6  6  12 6  24 12 48 24 48
8  | 8  8  24 8  24 8  24 16 24 48
12 | 12 12 12 12 12 24 12 48 24 48
16 | 16 16 48 16 48 16 48 16 48 48
24 | 24 24 24 24 24 24 24 48 24 48
48 | 48 48 48 48 48 48 48 48 48 48

·  | 1  2  3  4  6  8  12 16 24 48
1  | 1  1  1  1  1  1  1  1  1  1
2  | 1  2  1  2  2  2  2  2  2  2
3  | 1  1  3  1  3  1  3  1  3  3
4  | 1  2  1  4  2  4  4  4  4  4
6  | 1  2  3  2  6  2  6  2  6  6
8  | 1  2  1  4  2  8  4  8  8  8
12 | 1  2  3  4  6  4  12 4  12 12
16 | 1  2  1  4  2  8  4  16 8  16
24 | 1  2  3  4  6  8  12 8  24 24
48 | 1  2  3  4  6  8  12 16 24 48

a  | 1  2  3  4  6  8  12 16 24 48
a′ | 48 24 16 12 8  6  4  3  2  1

We know that 1 is the zero element and 48 is the unit element, as in the previous example. Also,
8′ = 6, and

8 + 8′ = the LCM of 8 and 6 = 24 ≠ 48, the unit element;
8 · 8′ = the GCD of 8 and 6 = 2 ≠ 1, the zero element.

Hence {B, +, ·, ′} is not a Boolean algebra. We see that B contains elements like 16
which are divisible by a square integer greater than 1.
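The failure is easy to exhibit programmatically; the complement law breaks at precisely those divisors a for which a and 48/a share a common factor (48 = 2⁴ · 3 is not square-free):

```python
from math import gcd

B = [1, 2, 3, 4, 6, 8, 12, 16, 24, 48]
lcm = lambda a, b: a * b // gcd(a, b)
comp = lambda a: 48 // a

# the counterexample from the text: 8' = 6
assert lcm(8, comp(8)) == 24 and gcd(8, comp(8)) == 2

# every divisor sharing a factor with its complement violates the complement law
failures = [a for a in B if lcm(a, comp(a)) != 48 or gcd(a, comp(a)) != 1]
assert failures == [2, 4, 6, 8, 12, 24]
```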
Ex 8.2.3 Let S be a given non-empty set and let P(S) be the power set of S. The binary and
unary operations on P(S) are defined as
(i) A + B = A ∪ B, the union of the subsets A, B ∈ P(S);
(ii) A · B = A ∩ B, the intersection of the subsets A, B ∈ P(S);
(iii) A′ = the complement of the subset A in S.
Prove that {P(S), +, ·, ′} is a Boolean algebra.
Solution: Let A, B and C be any three subsets of S. Then Huntington's postulates for a
Boolean algebra are satisfied by the following properties of sets:
(i) A ∪ B = B ∪ A, A ∩ B = B ∩ A;
(ii) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C), A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C);
(iii) A ∪ ∅ = A, A ∩ S = A;
(iv) A ∪ A′ = S, A ∩ A′ = ∅.
Thus {P(S), +, ·, ′} is a Boolean algebra, i.e., P(S) is a Boolean algebra under the set-theoretical
operations of union, intersection and complementation. The null set ∅ ∈ P(S) is the zero
element and S is the unit element in this Boolean algebra {P(S), +, ·, ′}.
Ex 8.2.4 Consider the set B = {a, b} and the binary operations (+) and (·) defined on the
elements of B as

+ | a  b        · | a  b
a | a  b        a | a  a
b | b  b        b | a  b

Prove that {B, +, ·, ′} is a Boolean algebra.

Boolean Algebra

459

Solution: We are to show that the postulates for a Boolean algebra are satisfied.
P1: From the composition tables, we see that both the operations obey the closure
axiom.
P2: Both the operations are commutative, since from the tables we have
a + b = b + a = b and a · b = b · a = a.
P3: Each operation is distributive over the other, since
a · (a + b) = a · b = a and a · a + a · b = a + a = a.
Again, b · (a + b) = b · b = b and b · a + b · b = a + b = b.
Similarly, a + (a · b) = a + a = a and (a + a) · (a + b) = a · b = a.
P4: a is the identity element for (+), since
a + a = a and b + a = a + b = b.
Similarly, b is the identity element for (·), since
a · b = a and b · b = b.
P5: The complement of a is b, since a + b = b, the identity element for (·), and a · b = a,
the identity element for (+); likewise, the complement of b is a.
Ex 8.2.5 Prove that there does not exist a Boolean algebra containing only three elements.
Solution: Every Boolean algebra {B, +, ·, ′} contains at least two distinct elements, the
zero element 0 and the unit element 1, satisfying
a + 0 = a and a · 1 = a for all a ∈ B,
and {0, 1} by itself is the two-point algebra. Suppose a Boolean algebra B contains exactly one
element a other than 0 and 1, i.e., B = {0, 1, a}, where a ≠ 0 and a ≠ 1. Then the complement
a′ ∈ B satisfies
a + a′ = 1 and a · a′ = 0.
We show that a′ ≠ a, a′ ≠ 0, a′ ≠ 1. First let a′ = a; then
a · a′ = a · a ⟹ 0 = a,
as a · a′ = 0 and a · a = a. We arrive at a contradiction, so a′ ≠ a. Let a′ = 0; then
a + a′ = a + 0 ⟹ 1 = a,
as a + a′ = 1 and a + 0 = a. We arrive at a contradiction, and consequently a′ ≠ 0. Lastly,
let a′ = 1; then
a · a′ = a · 1 ⟹ 0 = a,
as a · a′ = 0 and a · 1 = a. In this case also we arrive at a contradiction, and therefore a′ ≠ 1.
Therefore a′ is distinct from a, 0 and 1. This shows that a Boolean algebra B cannot
consist of only the three elements a, 0 and 1.
Deduction 8.2.1 Difference between Boolean algebra and the algebra of real numbers:
Comparing Boolean algebra with arithmetic and ordinary algebra (the field of real
numbers), we note the following differences:
(i) The commutative and associative laws are true in both the algebras, but Huntington's
postulates do not include the associative law.
(ii) The distributive law of + over ·, namely a + b · c = (a + b) · (a + c), is not valid in
ordinary algebra.
(iii) Boolean algebra does not have additive or multiplicative inverses; therefore no
cancellations are allowed (i.e., there are no subtraction and division operations).
(iv) The operation of complementation (′) is not available in ordinary algebra.
(v) The idempotent laws a + a = a and a · a = a hold in Boolean algebra but do not
hold in the algebra of real numbers.
(vi) Boolean algebra is linear in character but the algebra of real numbers is not: in the
former, a + a = a and a · a = a, while in the latter a + a = 2a and a · a = a².
(vii) Boolean algebra is more symmetric in its properties, and hence the principle of duality
holds in it. No such symmetry is true in the algebra of real numbers.

8.2.3 Boolean Algebra Rules

Below are some important rules of Boolean algebra which are used to simplify Boolean
expressions.
(i) 0 + x = x
(ii) 1 + x = 1
(iii) x + x = x
(iv) 0.x = 0
(v) 1.x = x
(vi) x.x = x
(vii) x.x' = 0
(viii) x + x' = 1
(ix) (x')' = x
(x) x + y = y + x
(xi) x.y = y.x
(xii) x + (y + z) = (x + y) + z
(xiii) x.(y.z) = (x.y).z
(xiv) x.(y + z) = x.y + x.z
(xv) x + x.z = x
(xvi) x.(x + y) = x
(xvii) (x + y).(x + z) = x + y.z
(xviii) x + x'.y = x + y
(xix) x.y + x'.z + y.z = x.y + x'.z
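In the two-element Boolean algebra B = {0, 1} each of these identities can be verified exhaustively, since there are only finitely many assignments. A minimal Python sketch (the rule selection and the helper `b_not` are ours, not part of the text):

```python
from itertools import product

# Boolean operations on B = {0, 1}: + is OR (|), . is AND (&), ' is 1 - x.
def b_not(x):
    return 1 - x

# (rule name, predicate) pairs for a selection of the rules above.
rules = [
    ("x + x'y = x + y",          lambda x, y, z: (x | (b_not(x) & y)) == (x | y)),
    ("x + xz = x",               lambda x, y, z: (x | (x & z)) == x),
    ("(x + y)(x + z) = x + yz",  lambda x, y, z: ((x | y) & (x | z)) == (x | (y & z))),
    ("xy + x'z + yz = xy + x'z", lambda x, y, z: ((x & y) | (b_not(x) & z) | (y & z))
                                                 == ((x & y) | (b_not(x) & z))),
]

for name, law in rules:
    # Check the law for all 2^3 = 8 assignments of 0/1 to x, y, z.
    assert all(law(x, y, z) for x, y, z in product((0, 1), repeat=3)), name
print("all rules verified on {0, 1}")
```

Because every Boolean identity in n variables is decided by its 2ⁿ truth-value assignments, this exhaustive check is a genuine proof for the two-element algebra.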

Ex 8.2.6 In a Boolean algebra B, prove the following:
(i) x + x'.y = x + y and x.(x' + y) = x.y;
(ii) x.y + x'.z + y.z = x.y + x'.z and (x + y).(x' + z).(y + z) = (x + y).(x' + z);
(iii) (x + y).(y + z).(z + x) = x.y + y.z + z.x.
Solution: (i) Using the Boolean algebra rules, we get
x + x'.y = (x + x').(x + y) = 1.(x + y) = x + y.
Therefore, x + x'.y = x + y. The dual of this is x.(x' + y) = x.y.
(ii) Using the Boolean algebra rules, we get,


x.y + x'.z + y.z = x.y + x'.z + y.z.1 = x.y + x'.z + y.z.(x + x')
= x.y + x'.z + x.y.z + x'.y.z = x.y + x.y.z + x'.z + x'.y.z
= x.y.(1 + z) + x'.z.(1 + y) = x.y.1 + x'.z.1 = x.y + x'.z.
The dual of the above is
(x + y).(x' + z).(y + z) = (x + y).(x' + z).
(iii) Using the Boolean algebra rules, we get
LHS = (x + y).(y + z).(z + x) = (y + x).(y + z).(z + x)
= (y + x.z).(z + x) = (y + x.z).z + (y + x.z).x
= (y.z + x.z.z) + (y.x + x.z.x) = y.z + x.z + x.y + x.z
= x.y + y.z + z.x = RHS.
Ex 8.2.7 Show that, in a Boolean algebra, a.b' = 0 ⇒ a + b = b and a.b = a, where a, b ∈ B.
[BH: 87]
Solution: Let a.b' = 0. Using the Boolean algebra rules, we get
a + b = (a + b).1 = (a + b).(b + b')
= b + a.b' = b + 0 = b,
and a.b = a.b + 0 = a.b + a.b'
= a.(b + b') = a.1 = a.
Ex 8.2.8 Show that, in a Boolean algebra,
(x + y).(x' + z).(y + z) = x.z + x'.y + y.z,
where x, y, z ∈ B.
[BH: 86, 94]
Solution: Using the Boolean algebra rules, we get
LHS = (x + y).(x' + z).(y + z) = (x.x' + x.z + y.x' + y.z).(y + z)
= (x.z + x'.y + y.z).(y + z), as x.x' = 0
= x.y.z + x'.y + y.z + x.z + x'.y.z + y.z, as y.y = y and z.z = z
= (x + x').y.z + x'.y + x.z + y.z, as a + a = a
= 1.y.z + x'.y + x.z + y.z = x.z + x'.y + y.z, as a + a = a.
Definition 8.2.1 (i) By a proposition in a Boolean algebra we mean either a statement or
an algebraic identity in the Boolean algebra. For example, the statement "In a Boolean
algebra, 0 is unique" is a proposition.
(ii) A Boolean algebra is said to be degenerate if it contains only one element; in this
case 0 = 1.

8.2.4 Duality

By the dual of a proposition A in a Boolean algebra we mean the proposition obtained from
A by replacing + with ., . with +, 1 with 0 and 0 with 1. For example, the dual of the
proposition x + y = y + x is the proposition x.y = y.x, and vice versa. We derive the
following two properties of a Boolean algebra {B, +, ., '} directly from Huntington's postulates.
For each a ∈ B,
(i) a + 1 = 1, (ii) a.0 = 0,


where 1 and 0 represent the identity elements with respect to . and +, respectively. The second relation
is obtained from the first by changing + to . and 1 to 0, and is called the dual of the first property.
The same is observed in each pair of Huntington's postulates, where each postulate of a pair
can be obtained from the other by interchanging + and . and, consequently, 0 and 1.
Duality theorem: Starting with a Boolean relation, we can derive another Boolean
relation by
(i) changing each + to a . sign,
(ii) changing each . to a + sign,
(iii) and interchanging 0 and 1.
If a proposition A is derivable from the axioms of a Boolean algebra, then the dual of A is
also derivable from those axioms.
Duality property or dual expression: An algebraic expression or property P', the
counterpart of an algebraic expression or property P, is called the dual of P.

Boolean relations                  Duals
a + b = b + a                      a.b = b.a
(a + b) + c = a + (b + c)          (a.b).c = a.(b.c)
a.(b + c) = a.b + a.c              a + b.c = (a + b).(a + c)
a + 0 = a                          a.1 = a
a + 1 = 1                          a.0 = 0
a + a = a                          a.a = a
a + a' = 1                         a.a' = 0
(a')' = a                          (a')' = a
(a + b)' = a'.b'                   (a.b)' = a' + b'
a + a.b = a                        a.(a + b) = a
a + a'.b = a + b                   a.(a' + b) = a.b

This duality works for every statement and every theorem in a Boolean algebra. The principle
of duality states that every true theorem about a Boolean algebra whose statement
involves only the three operations +, . and ' remains true if + and . and the identity elements 0 and
1 are interchanged throughout.
Properties of Boolean algebra: We derive the following properties of the Boolean algebra
{B, +, ., '} directly from Huntington's postulates:
Property 8.2.1 In a Boolean algebra B the two identity elements, 0 for + and 1 for ., are
separately unique.
Proof: If possible, let there be two identity elements 0 and 0₁ for the binary operation +.
Hence a + 0 = a and a + 0₁ = a for all a ∈ B. Now
0 + 0₁ = 0, since 0₁ is an identity element for +,
and 0 = 0 + 0₁ = 0₁ + 0 = 0₁, by the commutative property and since 0 is an identity element for +.
Hence 0 = 0₁, i.e., the identity element 0 for + is unique. Again, if possible, let there be two identities 1
and 1₁ for the operation '.'. Hence a.1 = a and a.1₁ = a for all a ∈ B. Now
1.1₁ = 1, since 1₁ is an identity element,
and 1 = 1.1₁ = 1₁.1 = 1₁, by the commutative property and since 1 is an identity element.
Hence 1 = 1₁, i.e., the identity element 1 for . is unique.


Property 8.2.2 In a Boolean algebra B the complement of each element is unique.

Proof: Let a be an arbitrary element in B. Then there exists a' ∈ B such that
a + a' = 1, the identity for ., and a.a' = 0, the identity for +.
Let us suppose that a'' in B is also a complement of a. Then a + a'' = 1 and a.a'' = 0.
Now,
a' = a'.1 = a'.(a + a'') = (a'.a) + (a'.a'')
= 0 + (a'.a'') = (a.a'') + (a'.a'')
= (a''.a) + (a''.a'), by the commutative law
= a''.(a + a') = a''.1 = a''.
Hence the complement of a is unique. Therefore, in a Boolean algebra each a ∈ B has a
unique complement in B.
Property 8.2.3 For every a, b ∈ B; (a + b)' = a'.b' and (a.b)' = a' + b'.
Proof: Using the definition, we have
(a + b) + a'.b' = {(a + b) + a'}.{(a + b) + b'}, distributive law
= {a' + (a + b)}.{(a + b) + b'}, commutative law
= {(a' + a) + b}.{a + (b + b')}, associative law
= (1 + b).(a + 1), as a + a' = 1
= 1.1 = 1, as a + 1 = 1 and 1.1 = 1.
Therefore, (a + b) + a'.b' = 1. Again,
(a + b).(a'.b') = a.(a'.b') + b.(a'.b'), distributive law
= a.(a'.b') + (a'.b').b, commutative law
= (a.a').b' + a'.(b'.b), associative law
= 0.b' + a'.0, as a.a' = 0
= 0 + 0 = 0, as a.0 = 0 and 0 + 0 = 0.
Therefore, a'.b' satisfies all the necessary properties for being the complement of (a + b).
Since the complement is unique, we have (a + b)' = a'.b' for all a, b ∈ B. Similarly,
(a.b) + (a' + b') = 1 and (a.b).(a' + b') = 0,
from which we have (a.b)' = a' + b'. These are the well-known De Morgan's laws.
Property 8.2.4 For any a ∈ B; a + a = a and a.a = a.
Proof: Using the definition, we have
LHS = a + a = (a + a).1, existence of identity
= (a + a).(a + a'), as a + a' = 1
= a + a.a', distributive law
= a, as a.a' = 0.


Therefore, for any Boolean algebra B, we have a + a = a. Now,

LHS = a.a = a.a + 0, existence of identity
= a.a + a.a', as a.a' = 0
= a.(a + a'), distributive law
= a.1 = a, as a + a' = 1, the identity element for ..
Therefore, for any Boolean algebra B, we have a.a = a. These laws are known as the idempotent
laws.
Property 8.2.5 For all a, b ∈ B; a + a.b = a and a.(a + b) = a.
Proof: Using the definition of Boolean algebra, we have
LHS = a + a.b = a.1 + a.b, 1 being the identity element for .
= a.(1 + b), distributive law
= a.(b + 1), commutative law
= a.1 = a, as b + 1 = 1.
Therefore, for any Boolean algebra B, we have a + a.b = a. Also,
LHS = a.(a + b) = (a + 0).(a + b), 0 being the identity element for +
= a + 0.b, distributive law
= a + b.0, commutative law
= a + 0 = a, as b.0 = 0.
Therefore, for any Boolean algebra B, we have a.(a + b) = a. These laws are known as the laws
of absorption.
Property 8.2.6 For all a, b, c ∈ B;
(a + b) + c = a + (b + c), (a.b).c = a.(b.c).
Proof: Let x = a + (b + c) and y = (a + b) + c. Then,
a.x = a.[a + (b + c)] = a.a + a.(b + c), distributive law
= a + a.(b + c) = a, idempotent and absorption laws,
and a.y = a.[(a + b) + c] = a.(a + b) + a.c, distributive law
= a + a.c = a, idempotent and absorption laws.
Therefore, a.x = a.y. Also,
a'.x = a'.[a + (b + c)] = a'.a + a'.(b + c), distributive law
= 0 + a'.(b + c) = a'.(b + c), since a.a' = 0 and 0 + a = a,
and a'.y = a'.[(a + b) + c] = a'.(a + b) + a'.c, distributive law
= a'.a + a'.b + a'.c, distributive law
= 0 + a'.b + a'.c, since a.a' = 0
= a'.b + a'.c = a'.(b + c), as 0 + a = a, by the distributive law.
Therefore, a'.x = a'.y. From these two results, by the cancellation property (Property 8.2.9 below), we get x = y, i.e., (a + b) + c = a + (b + c) for all a, b, c ∈ B. Similarly, taking z = a.(b.c) and t = (a.b).c, we can prove that (a.b).c = a.(b.c) for all a, b, c ∈ B. These laws are known as the associative laws.


Property 8.2.7 For every a ∈ B, (a')' = a (involution).

Proof: For each a ∈ B, there exists a unique element a' ∈ B such that a.a' = 0 and
a + a' = 1. Hence,
a' + a = 1 and a'.a = 0, by the commutative law.
These imply that a is the complement of a', i.e., (a')' = a.
Property 8.2.8 In every Boolean algebra, 0' = 1 and 1' = 0.
Proof: For every a ∈ B, we have a.1 = a and 0 + a = a. Replacing a by 0 and 1,
respectively, we get
0.1 = 0 and 0 + 1 = 1.
This shows that 1 is the complement of 0 in B. Hence 0' = 1. By duality, 1' = 0.
Property 8.2.9 For all a, x, y ∈ B; a + x = a + y and a' + x = a' + y ⇒ x = y; a.x = a.y
and a'.x = a'.y ⇒ x = y.
Proof: Using the hypotheses, we have
(a + x).(a' + x) = (a + y).(a' + y)
⇒ (x + a).(x + a') = (y + a).(y + a'), commutative law
⇒ x + a.a' = y + a.a', distributive law
⇒ x + 0 = y + 0, as a.a' = 0
⇒ x = y, since a + 0 = a.

Thus, in a Boolean algebra, for all a, x, y ∈ B; a + x = a + y and a' + x = a' + y ⇒ x = y.
Similarly,
a.x + a'.x = a.y + a'.y
⇒ (a + a').x = (a + a').y, distributive law
⇒ 1.x = 1.y, as a + a' = 1
⇒ x.1 = y.1, commutative law
⇒ x = y, since a.1 = a.

Thus, in a Boolean algebra, for all a, x, y ∈ B; a.x = a.y and a'.x = a'.y ⇒ x = y.


Property 8.2.10 For each a ∈ B; a + 1 = 1 and a.0 = 0 (universal bounds).
Proof: Using the axioms and certain properties, we have
a + 1 = (a + 1).1 = 1.(a + 1)
= (a + a').(a + 1)
= a + (a'.1) = a + a' = 1.
By duality, we have a.0 = 0.
Ex 8.2.9 Show that, in a Boolean algebra, for x, y ∈ B,
(i) (x' + x.y)' = x.y' and (ii) [(x' + y)'.(x + y')]' = x' + y.
[BH: 90]


Solution: (i) Using the Boolean algebra rules, we get

LHS = (x' + x.y)' = (x')'.(x.y)', by De Morgan's law
= x.(x.y)' = x.(x' + y'), by De Morgan's law
= x.x' + x.y' = 0 + x.y' = x.y'.
(ii) Using the Boolean algebra rules, we get
LHS = [(x' + y)'.(x + y')]' = [(x' + y)']' + (x + y')', by De Morgan's law
= x' + y + x'.(y')' = x' + y + x'.y
= x'.(1 + y) + y = x' + y.
Definition 8.2.2 A non-empty subset S of a Boolean algebra B is said to be a subalgebra
of B if S is also a Boolean algebra under the same binary operations of B. Consider the non-empty
subsets S1 = {1, 30}, S2 = {1, 2, 15, 30}, S3 = {1, 5, 6, 30} and S4 = {1, 3, 10, 30}
of S = {1, 2, 3, 5, 6, 10, 15, 30}, the Boolean algebra of the divisors of 30 with
a + b = lcm(a, b), a.b = gcd(a, b) and a' = 30/a. We see that each of these subsets is closed under
+, . and ' and hence is a subalgebra of {S, +, ., '}.
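The closure checks can be automated. A small Python sketch, assuming the divisors-of-30 operations stated above (a + b = lcm, a.b = gcd, a' = 30/a); the function names are ours:

```python
from math import gcd

# Boolean algebra on the divisors of 30: + = lcm, . = gcd, complement a' = 30/a.
def join(a, b):  # a + b
    return a * b // gcd(a, b)

def meet(a, b):  # a . b
    return gcd(a, b)

def comp(a):     # a'
    return 30 // a

def is_subalgebra(s):
    """True if s is closed under join, meet and complement."""
    return all(join(a, b) in s and meet(a, b) in s for a in s for b in s) \
        and all(comp(a) in s for a in s)

for s in [{1, 30}, {1, 2, 15, 30}, {1, 5, 6, 30}, {1, 3, 10, 30}]:
    assert is_subalgebra(s)
# A subset that is not closed, e.g. {1, 2, 3, 30}: 2 + 3 = lcm(2, 3) = 6 is missing.
assert not is_subalgebra({1, 2, 3, 30})
print("subalgebra checks passed")
```

Note how the counterexample fails closure under +, illustrating why not every subset containing 0 = 1 and 1 = 30 is a subalgebra.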

8.2.5 Partial Order Relation

Let B be a Boolean algebra and x, y ∈ B. It is defined that x is related to y if and only
if x.y = x, and this relation is denoted by ≤, i.e., x ≤ y. According to this definition and
using the properties of Boolean algebra, we have
(i) x ≤ y and x ≤ z ⇒ x ≤ y.z.
(ii) x ≤ y ⇒ x ≤ y + z.
(iii) x ≤ y ⇒ y' ≤ x' and y' ≤ x' ⇒ x ≤ y.
Theorem 8.2.1 In a Boolean algebra, for x, y ∈ B, x ≤ y if and only if x + y = y.
Proof: Let x ≤ y, i.e., x.y = x. Then
x + y = x.y + y = y, by absorption.
Conversely, let x + y = y. Then
x.y = x.(x + y) = x, by absorption, i.e., x ≤ y.

Theorem 8.2.2 The relation ≤ in a Boolean algebra is a partial order relation.

Proof: The relation ≤ in a Boolean algebra is defined by: x is related to y if and only
if x.y = x. Here we are to show that the relation is reflexive, transitive and antisymmetric.
(i) We know x.x = x for all x ∈ B, so x ≤ x and ≤ is reflexive.
(ii) Let x.y = x and y.z = y for x, y, z ∈ B. Then
x.z = (x.y).z = x.(y.z) = x.y = x,
i.e., x ≤ y and y ≤ z ⇒ x ≤ z.
Therefore, ≤ is transitive.
(iii) Let x.y = x and y.x = y for x, y ∈ B. Then
x = x.y = y.x = y,
i.e., x ≤ y and y ≤ x ⇒ x = y.
Therefore, ≤ is antisymmetric. Consequently, ≤ is a partial order relation.
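Both theorems can be checked concretely in the divisors-of-30 algebra, where x.y = x (i.e. gcd(x, y) = x) says exactly that x divides y. A minimal Python sketch (the name `leq` is ours):

```python
from math import gcd
from itertools import product

D30 = [1, 2, 3, 5, 6, 10, 15, 30]

def leq(x, y):
    # x <= y  iff  x.y = x; here the product is gcd, so this says x divides y.
    return gcd(x, y) == x

# Reflexive, antisymmetric and transitive, checked over all of D30.
assert all(leq(x, x) for x in D30)
assert all(not (leq(x, y) and leq(y, x)) or x == y for x, y in product(D30, repeat=2))
assert all(not (leq(x, y) and leq(y, z)) or leq(x, z)
           for x, y, z in product(D30, repeat=3))

# Theorem 8.2.1: x.y = x  iff  x + y = y (here: lcm(x, y) = y).
assert all(leq(x, y) == (x * y // gcd(x, y) == y) for x, y in product(D30, repeat=2))
print("partial order verified on the divisors of 30")
```

The same checks would succeed in any finite Boolean algebra, since the proofs above use only the algebraic laws.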

8.3 Boolean Function

Consider the set B = {1, 0} and the binary operations (+) and (.) defined on the elements
of B by the tables

  +  1  0        .  1  0        x  x'
  1  1  1        1  1  0        1  0
  0  1  0        0  0  0        0  1

P1: The closure axiom is obvious from the tables, since the result of each operation is either
1 or 0, and 1, 0 ∈ B.
P2: Both the operations are commutative, as follows from the symmetry of the binary operator
tables.
P3: From the tables we see that
0 + 0 = 0 and 0 + 1 = 1 + 0 = 1,
1.1 = 1 and 1.0 = 0.1 = 0,
which establishes the two identity elements, 0 for + and 1 for ., as defined in postulate P3.
P4: The distributive law a.(b + c) = a.b + a.c can be shown to hold from the operator
tables by forming a truth table of all possible values of a, b and c.
a  b  c  a.(b + c)  a.b + a.c
1  1  1      1          1
1  1  0      1          1
1  0  1      1          1
1  0  0      0          0
0  1  1      0          0
0  1  0      0          0
0  0  1      0          0
0  0  0      0          0

The distributive law of + over . can be shown to hold true by means of a truth table similar
to the one above.
P5: From the complement table, it is easily shown that
0 + 0' = 0 + 1 = 1 and 1 + 1' = 1 + 0 = 1,
0.0' = 0.1 = 0 and 1.1' = 1.0 = 0,
i.e., a + a' = 1 and a.a' = 0.
Therefore, the set B = {0, 1} together with the Boolean sum +, Boolean product . and Boolean
complement ' is called the two-element Boolean algebra.
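The postulate checks carried out above by hand can also be done by brute-force enumeration over B = {0, 1}. A small Python sketch (the function names `b_add`, `b_mul`, `b_comp` are illustrative):

```python
from itertools import product

B = (0, 1)

def b_add(a, b):   # Boolean sum (+): OR
    return a | b

def b_mul(a, b):   # Boolean product (.): AND
    return a & b

def b_comp(a):     # Boolean complement (')
    return 1 - a

# P2 (commutativity) and P4 (both distributive laws), checked over all triples.
for a, b, c in product(B, repeat=3):
    assert b_add(a, b) == b_add(b, a) and b_mul(a, b) == b_mul(b, a)
    assert b_mul(a, b_add(b, c)) == b_add(b_mul(a, b), b_mul(a, c))
    assert b_add(a, b_mul(b, c)) == b_mul(b_add(a, b), b_add(a, c))

# P3 (identities) and P5 (complements).
for a in B:
    assert b_add(a, 0) == a and b_mul(a, 1) == a
    assert b_add(a, b_comp(a)) == 1 and b_mul(a, b_comp(a)) == 0

print("two-element Boolean algebra: postulates verified")
```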

8.3.1 Constant
A symbol representing a specified element of a Boolean algebra will be called a constant. 0
and 1 are examples of constants.

8.3.2 Literal
A literal is a primed or unprimed (complemented or uncomplemented) variable. Thus the two literals x and x' correspond to the variable x. The expression x + x'.y has three literals, x, x' and y.
A single literal or a product of two or more literals is known as a product term. The
expression x + x'.y has two product terms. A single literal or a sum of two or more literals
is known as a sum term. For example, f = y'.(x + z).(y' + z') contains three sum terms.

8.3.3 Variable
Any literal symbol like x, y, z, x1, x2, ..., representing an arbitrary element of a Boolean
algebra, will be called a variable. A variable represents an arbitrary or unspecified element
of B. A Boolean variable assumes only two values, 0 and 1, i.e., it takes values from Z2, where
Z2 is the Boolean algebra {0, 1}. Two Boolean variables are said to be independent if they
assume values independently of each other. Note that x and x' are not independent variables.
Let x, y, z be Boolean variables; then
(i) x + y = y + x and x.y = y.x, commutative laws.
(ii) (x + y) + z = x + (y + z) and (x.y).z = x.(y.z), associative laws.
(iii) x.(y + z) = (x.y) + (x.z) and x + (y.z) = (x + y).(x + z), distributive laws.
(iv) x + x = x and x.x = x, idempotent laws.
(v) x + 0 = x and x.1 = x, identity laws.
(vi) x + x' = 1 and x.x' = 0, inverse laws.
(vii) x + 1 = 1 and x.0 = 0, dominance laws.
(viii) x + x.y = x and x.(x + y) = x, absorption laws.
(ix) (x + y)' = x'.y' and (x.y)' = x' + y', De Morgan's laws.
(x) (x')' = x, double complement law.

8.3.4 Monomial
In a Boolean algebra, a single element (primed or unprimed) or a product of such elements connected by
the operation (.) is said to be a monomial. x, y', x.y.z', etc. are examples of monomials.

8.3.5 Polynomial
In a Boolean algebra, an expression consisting of some monomials connected by the operation
(+) is called a polynomial. x + x.y' + x'.y.z is an example of a polynomial.
Each monomial in a polynomial is called a term of the polynomial.

8.3.6 Factor
If an expression consists of some elements and polynomials connected by the operation (.), then
each of the elements and polynomials is called a factor of this expression. A factor may
or may not be linear. The factors of the expression x.(x + y').(x' + y + z) are
x, (x + y') and (x' + y + z).

8.3.7 Boolean Function
An expression which represents the combination of a finite number of constants and variables
by the operations +, . or ' is said to be a Boolean function. In the expression (a + b').x + a'.y +
0; 0, a and b are constants, and x and y are variables. It is a Boolean function if a, b, 0, x, y are
elements of a Boolean algebra. x + x', x.y' + a, x.y.z' + x'.y.z + y'.z + 1 are functions of one, two and
three variables, respectively. If f(x, y) = x + y', then f(0, 0) = 1, f(0, 1) = 0, f(1, 0) = 1
and f(1, 1) = 1. Let f, g, h be Boolean expressions; then


(i) f + g = g + f and f.g = g.f, commutative laws.
(ii) (f + g) + h = f + (g + h) and (f.g).h = f.(g.h), associative laws.
(iii) f.(g + h) = (f.g) + (f.h) and f + (g.h) = (f + g).(f + h), distributive laws.
(iv) f + f = f and f.f = f, idempotent laws.
(v) f + 0 = f and f.1 = f, identity laws.
(vi) f + f' = 1 and f.f' = 0, inverse laws.
(vii) f + 1 = 1 and f.0 = 0, dominance laws.
(viii) f + f.g = f and f.(f + g) = f, absorption laws.
(ix) (f + g)' = f'.g' and (f.g)' = f' + g', De Morgan's laws.
(x) (f')' = f, double complement law.

8.4 Truth Table
A Boolean function f, which is a combination of a finite number of Boolean variables connected
by the operations + (OR) and/or . (AND), will assume a value (either 1 or 0) when the
variables involved in it are assigned their truth values. This value of the function f is
called its truth value corresponding to that particular set of values of the variables.
A convenient way of expressing the truth values of a function f
for all possible combinations of the truth values of the independent variables which appear
in the expression of f is in the form of a table. Such a table is called the truth table for the
function f.
Definition 8.4.1 A Boolean expression f in the variables x1, x2, ..., xn is called a maxterm
if
f = x̃1 + x̃2 + ... + x̃n,
where each x̃i denotes either xi or xi'. For example, x + y + z', x' + y' + z and x' + y' + z'
are maxterms in the variables x, y, z.
Definition 8.4.2 A Boolean expression f in the variables x1, x2, ..., xn is called a minterm
if
f = x̃1.x̃2. ... .x̃n,
where each x̃i denotes either xi or xi'. For example, x.y.z', x'.y'.z and x'.y'.z' are minterms in
the variables x, y, z.
Ex 8.4.1 Obtain the truth table for the Boolean function f(x1, x2, x3) = x1 + (x2.x3').
Solution: Here f involves three independent
variables x1, x2 and x3. Each of these variables can have the value 1 or 0, so the total number
of possible combinations of their truth values is 2³ = 8.


x1  x2  x3  x3'  x2.x3'  f
 1   1   1   0     0     1
 1   1   0   1     1     1
 1   0   1   0     0     1
 1   0   0   1     0     1
 0   1   1   0     0     0
 0   1   0   1     1     1
 0   0   1   0     0     0
 0   0   0   1     0     0

8.5 Disjunctive Normal Form

A Boolean function is said to be in disjunctive normal form (DNF) in n variables x1, x2, ..., xn,
for n > 0, if each term of the function is a monomial of the type f1(x1).f2(x2). ... .fn(xn),
where fi(xi) is xi or xi' for each i = 1, 2, ..., n, and no two terms are identical. x.y' + x.y and
x.y.z + x'.y.z + x'.y'.z are Boolean functions in DNF.

8.5.1 Complete DNF
The disjunctive normal form of a Boolean function in n variables is said to be the complete
disjunctive normal form if it contains 2ⁿ terms. For example, xy + x'y + xy' + x'y' is the
complete disjunctive normal form in the two variables x and y, and xyz + x'yz + xy'z + xyz' +
x'y'z + x'yz' + xy'z' + x'y'z' is the complete disjunctive normal form in the three variables x, y, z.
Note: Each term of a complete disjunctive normal form in n variables x1, x2, ..., xn contains
xi, in either the form xi or the form xi', for every i. Thus the complete disjunctive normal form consists of 2ⁿ
terms.
Note: Every complete disjunctive normal form is identically 1, and conversely the unit
function can be expressed in complete disjunctive normal form. For example,
xy + x'y + xy' + x'y' = (x + x')y + (x + x')y'
= y + y' = 1.
Note: An incomplete disjunctive normal form is not unique, unless it is in its reduced form with the
minimum number of variables. For example,
f = xy = xy.(z + z'), as z + z' = 1
= xyz + xyz',
and f = xyz + xyz' + x'y'z + xy'z
= xy.(z + z') + (x' + x).y'z = xy + y'z.
Note: The complement of an incomplete disjunctive normal form consists of precisely those terms required to
make it a complete one. For example, the complement of xyz + xy'z + xy'z' + x'y'z' is
x'yz + xyz' + x'y'z + x'yz'.
Note: Two Boolean functions are equal if and only if their respective disjunctive normal forms have the
same terms.
Note: Since a disjunctive normal form is a sum of a number of minterms, none of which is identically 0, the zero function (i.e., 0) cannot be
expressed in disjunctive normal form.
Ex 8.5.1 Express the Boolean function f = x + (x'.y' + x'.z)' in disjunctive normal form.


Solution: Using the properties, we get

f = x + (x'.y' + x'.z)' = x + (x'.y')'.(x'.z)', De Morgan's law
= x + (x + y).(x + z'), De Morgan's law
= x + x + y.z', distributive law
= x + y.z' = x.(y + y').(z + z') + y.z'.(x + x')
= x.(y.z + y.z' + y'.z + y'.z') + x.y.z' + x'.y.z'
= x.y.z + x.y.z' + x.y'.z + x.y'.z' + x.y.z' + x'.y.z'
= x.y.z + x.y.z' + x.y'.z + x.y'.z' + x'.y.z',
which is in the full disjunctive normal form.
Ex 8.5.2 Express the Boolean function f = (x + y + z).(xy + xz); x, y, z ∈ B, in full disjunctive
normal form.
Solution: Using the properties, we get
f = (x + y + z).(xy + xz) = xxy + xxz + xyy + xyz + xyz + xzz
= xy + xz + xy + xyz + xyz + xz, as x.x = x
= xy + xz + xyz = xy.(z + z') + xz.(y + y') + xyz, as y + y' = 1 = z + z'
= xyz + xyz' + xyz + xy'z + xyz = xyz + xyz' + xy'z,
which is in the full disjunctive normal form.
Ex 8.5.3 Express the Boolean function f = (x + y).(x + y').(x' + z); x, y, z ∈ B, in full
disjunctive normal form.
[CH: 99]
Solution: Using the properties, we get
f = (x + y).(x + y').(x' + z) = (x + y.y').(x' + z)
= x.(x' + z), as y.y' = 0
= x.x' + x.z = x.z, as x.x' = 0
= x.z.(y + y') = x.y.z + x.y'.z, as y + y' = 1,
which is in the full disjunctive normal form.
Ex 8.5.4 Express the Boolean function f = (x + y + z).(xy + x'.z)' in disjunctive normal
form in the variables x, y, z.
Solution: Let us construct the truth table of the expression f = (x + y + z).(xy + x'.z)' for
all possible assignments of values 1 or 0 to x, y and z. There are 2³ = 8 possible assignments
of values 1 or 0 to x, y, z in f. Hence,
x  y  z  f
1  1  1  0
1  1  0  0
1  0  1  1
1  0  0  1
0  1  1  0
0  1  0  1
0  0  1  0
0  0  0  0


Now, we consider only those rows in which the value of f is 1. Here these rows are the 3rd, 4th
and 6th. For each of these rows we construct the minterms x.y'.z, x.y'.z' and x'.y.z'. Hence f =
(x + y + z).(xy + x'.z)' = x.y'.z + x.y'.z' + x'.y.z' in disjunctive normal form in the variables
x, y, z.
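The row-selection procedure of this example is mechanical and can be sketched in Python (the row ordering follows the textbook's tables; the helper `f` encodes the given function):

```python
from itertools import product

def f(x, y, z):
    # f = (x + y + z).(x.y + x'.z)', with values in {0, 1}
    return (x | y | z) & (1 - ((x & y) | ((1 - x) & z)))

minterms = []
for x, y, z in product((1, 0), repeat=3):        # rows in textbook order
    if f(x, y, z):                                # keep only the rows where f = 1
        term = "".join(v + ("" if val else "'")   # prime the literal where the variable is 0
                       for v, val in zip("xyz", (x, y, z)))
        minterms.append(term)
print(" + ".join(minterms))                       # prints: xy'z + xy'z' + x'yz'
```

The printed sum agrees with the DNF obtained above from the truth table.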

8.6 Conjunctive Normal Form

A Boolean function is said to be in conjunctive normal form (CNF) in n variables x1, x2, ..., xn,
for n > 0, if the function is a product of linear factors of the type f1(x1) + f2(x2) + ... + fn(xn),
where fi(xi) is xi or xi' for each i = 1, 2, ..., n, and no two factors are identical. For example,
(x + y).(x' + y') and (x + y + z).(x + y' + z).(x' + y + z) are Boolean functions in conjunctive
normal form.

8.6.1 Complete CNF
The conjunctive normal form of a Boolean function in n variables is said to be the complete
conjunctive normal form if it contains 2ⁿ factors. For example, (x + y).(x' + y).(x + y').(x' + y')
is the complete conjunctive normal form in the two variables x and y.
Note: Each factor of a complete conjunctive normal form in n variables x1, x2, ..., xn
contains xi, in either the form xi or the form xi', for every i. Thus the complete conjunctive normal form
consists of 2ⁿ factors.
Note: Every complete conjunctive normal form is identically 0, and conversely the zero function
can be expressed in complete conjunctive normal form. For example,
(x + y).(x' + y).(x + y').(x' + y') = (y + x.x').(y' + x.x')
= (y + 0).(y' + 0) = y.y' = 0.
For three variables, the complete conjunctive normal form gives
(x + y + z).(x' + y + z).(x + y' + z).(x + y + z')
.(x' + y' + z).(x' + y + z').(x + y' + z').(x' + y' + z')
= (y + z + x.x').(y' + z + x.x').(y + z' + x.x').(y' + z' + x.x')
= (y + z).(y' + z).(y + z').(y' + z')
= (z + y.y').(z' + y.y')
= z.z' = 0.
Note: Two Boolean functions, each expressed in conjunctive normal form, are equal if and only if
they contain identical factors.
Note: An incomplete conjunctive normal form is not unique, unless it is in its reduced form with the
minimum number of variables. For example, f = y = (y + x).(y + x') = (y + x + z).(y + x +
z').(y + x' + z).(y + x' + z').
Note: The complement of an incomplete conjunctive normal form consists of precisely those factors required to make
it a complete one. For example, the complement of (x + y).(x' + y') is (x' + y).(x + y').
Note: The unit function cannot be expressed in conjunctive normal form.
Ex 8.6.1 Express the Boolean function f = (x + y + z).(x.y + x.z) in full conjunctive normal
form.
Solution: Using the properties, we get
f = (x + y + z).(x.y + x.z) = (x + y + z).x.(y + z), distributive law


= (x + y + z).(x + y.y' + z.z').(y + z + x.x'), since a.a' = 0

= (x + y + z).(x + y.y' + z).(x + y.y' + z').(y + z + x.x'), distributive law
= (x + y + z).(x + z + y.y').(x + z' + y.y').(y + z + x.x'), commutative law
= (x + y + z).(x + z + y).(x + z + y').(x + z' + y).(x + z' + y').(y + z + x).(y + z + x')
= (x + y + z).(x + y + z).(x + y' + z).(x + y + z').(x + y' + z').(x + y + z).(x' + y + z)
= (x + y + z).(x + y' + z).(x + y + z').(x + y' + z').(x' + y + z),
which is in the full conjunctive normal form.
Ex 8.6.2 Express the Boolean function f = xyz + (x + y).(y + z); x, y, z ∈ B, in its conjunctive
normal form.
Solution: Using the properties, we get
f = xyz + (x + y).(y + z) = (xyz + x + y).(xyz + y + z)
= (x + y).(x + y + z).(y + z).(x + y + z), as a + b.c.d = (a + b).(a + c).(a + d) and a + a = a
= (x + y + z).(x + y + z.z').(y + z + x.x'), as z.z' = x.x' = 0
= (x + y + z).(x + y + z).(x + y + z').(y + z + x).(y + z + x')
= (x + y + z).(x + y + z').(x' + y + z),
which is in the conjunctive normal form.
Ex 8.6.3 Express the Boolean function f = (x + y + z).(xy + x'.z)' in conjunctive normal
form in the variables x, y, z.
Solution: Let us construct the truth table of the expression f = (x + y + z).(xy + x'.z)' for
all possible assignments of values 1 or 0 to x, y and z. There are 2³ = 8 possible assignments
of values 1 or 0 to x, y, z in f. Hence,
x  y  z  f
1  1  1  0
1  1  0  0
1  0  1  1
1  0  0  1
0  1  1  0
0  1  0  1
0  0  1  0
0  0  0  0

Now, we consider only those rows in which the value of f is 0. Here these rows are the 1st, 2nd, 5th,
7th and 8th. For each of these rows we construct the maxterms x' + y' + z', x' + y' + z, x + y' + z', x + y + z'
and x + y + z. Hence f = (x + y + z).(xy + x'.z)' = (x' + y' + z').(x' + y' + z).(x + y' + z').(x +
y + z').(x + y + z) in conjunctive normal form in the variables x, y, z.
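The maxterm extraction is the mirror image of the minterm procedure: zero rows of the truth table are kept, and each literal is primed where the variable is 1. A Python sketch (same illustrative helper `f` as before):

```python
from itertools import product

def f(x, y, z):
    # f = (x + y + z).(x.y + x'.z)', with values in {0, 1}
    return (x | y | z) & (1 - ((x & y) | ((1 - x) & z)))

maxterms = []
for x, y, z in product((1, 0), repeat=3):         # rows in textbook order
    if not f(x, y, z):                             # rows where f = 0 give maxterms
        # In a maxterm the literal is primed where the variable is 1.
        term = " + ".join(v + ("'" if val else "")
                          for v, val in zip("xyz", (x, y, z)))
        maxterms.append("(" + term + ")")
print("".join(maxterms))
```

Running this reproduces the five factors listed above, in the same order.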
Deduction 8.6.1 Conversion of one normal form into the other: This is done
by double complementation. Let f = xyz + x'y'z + xyz' + xy'z; then the complement of f
is f', given by
f' = x'y'z' + xy'z' + x'yz' + x'yz.
Now, the complement of f' is (f')', which is given by
(f')' = (x'y'z' + xy'z' + x'yz' + x'yz)'
= (x + y + z).(x' + y + z).(x + y' + z).(x + y' + z').
This is the conjunctive normal form of f.


Ex 8.6.4 Change the Boolean function f = xy + x'y + x'y' from its DNF to its CNF.
Solution: Using the properties of Boolean algebra, we get
f = xy + x'y + x'y' = [(xy + x'y + x'y')']'
= [(xy)'.(x'y)'.(x'y')']' = [(x' + y').(x + y').(x + y)]', by De Morgan's law
= x' + y, by the method of the complete CNF.
Ex 8.6.5 Change the Boolean function f = (x + y + z).(x + y + z').(x + y' + z).(x' + y + z')
from its CNF to its DNF.
Solution: Using the properties of Boolean algebra, we get
f = (x + y + z).(x + y + z').(x + y' + z).(x' + y + z')
= [{(x + y + z).(x + y + z').(x + y' + z).(x' + y + z')}']'
= [x'y'z' + x'y'z + x'yz' + xy'z]', by De Morgan's law
= xyz + xyz' + xy'z' + x'yz, by the method of the complete DNF.
Ex 8.6.6 Let f(x, y, z) = x + (y.z') be a Boolean function. Express f(x, y, z) in CNF. What
is its DNF?
Solution: Here f(x, y, z) contains three independent variables x, y and z, and hence there
are 2³ = 8 possible combinations of 1 and 0 as truth values of x, y and z. According to the
definition, the truth table is
x  y  z  f
1  1  1  1
1  1  0  1
1  0  1  1
1  0  0  1
0  1  1  0
0  1  0  1
0  0  1  0
0  0  0  0

Since we have 0 in the last column of the 5th, 7th and 8th rows, we construct the corresponding maxterms, which are, respectively, (x + y' + z'), (x + y + z') and (x + y + z). Consequently,
the required Boolean expression in CNF is given by
fc = (x + y + z).(x + y + z').(x + y' + z').
Since we have 1 in the last column of the 1st, 2nd, 3rd, 4th and 6th rows, we construct the
corresponding minterms, which are, respectively, x.y.z, x.y.z', x.y'.z, x.y'.z' and x'.y.z'.
Consequently, the required Boolean expression in DNF is given by
fd = x.y.z + x.y.z' + x.y'.z + x.y'.z' + x'.y.z'.
Ex 8.6.7 Let f(x, y, z) be a Boolean function such that f(x, y, z) = 1 if and only if at least
two of the variables take the value 1. Express f(x, y, z) in CNF. What is its DNF?


Solution: Here f(x, y, z) contains three independent variables x, y and z, and hence there
are 2³ = 8 possible combinations of 1 and 0 as truth values of x, y and z. According to the
definition, the truth table is
x  y  z  f
1  1  1  1
1  1  0  1
1  0  1  1
1  0  0  0
0  1  1  1
0  1  0  0
0  0  1  0
0  0  0  0

Since we have 0 in the last column of the 4th, 6th, 7th and 8th rows, we construct the
corresponding maxterms, which are, respectively, (x' + y + z), (x + y' + z), (x + y + z') and
(x + y + z). Consequently, the required Boolean expression in CNF is given by
fc = (x' + y + z).(x + y' + z).(x + y + z').(x + y + z).
Since we have 1 in the last column of the 1st, 2nd, 3rd and 5th rows, we construct the corresponding minterms, which are, respectively, x.y.z, x.y.z', x.y'.z and x'.y.z. Consequently,
the required Boolean expression in DNF is given by
fd = x.y.z + x.y.z' + x.y'.z + x'.y.z.

8.7 Switching Circuit
In this section, we present an application of Boolean algebra to the design of electrical
switching circuits. Here, the two-element algebra plays an important role.
Definition 8.7.1 An electrical switch is a mechanical device, attached to a point
in a wire, having only two possible states, ON or OFF, i.e., closed or open. The switch
allows current to flow through the point when it is in the ON state, and no current can flow
through the point when it is in the OFF state.
Here we consider switches which are bi-stable, either ON or OFF, i.e., closed or open. We
say that the Boolean expression represents the circuit and that the circuit realizes the Boolean
expression.
Ex 8.7.1 A committee of three persons A, B, C decides proposals by a majority of votes. B
has a voting weight 1 and each of A and C has voting weight 2. Each can press a button to
cast his vote. Design a simple circuit so that the light will glow when a majority of votes is cast
in favour of the proposal.
[CH: 03, 07]
Solution: Let x, y, z be the switches pressed by A, B, C, respectively, and let v be the total number
of votes cast. By the given condition, A contributes 2 votes when x = 1 and 0 votes when x = 0;
B contributes 1 vote when y = 1 and 0 votes when y = 0; and C contributes 2 votes when z = 1
and 0 votes when z = 0.


If f be the Boolean function of x, y, z. The light will glow when f (x, y, z) = 1. Now,
f (x, y, z) = 1, when v 3 and f (x, y, z) = 0 for v < 3. The table for the function f is given
below:
x
1
1
1
1
0
0
0
0

y
1
1
0
0
1
1
0
0

z
1
0
1
0
1
0
1
0

v
5
3
4
2
3
1
2
0

f
1
1
1
0
1
0
0
0

Using the properties of Boolean algebra, we can simplify f as follows:

f(x, y, z) = x.y.z + x.y.z′ + x.y′.z + x′.y.z = x.y.(z + z′) + x.y′.z + x′.y.z
= x.y + x.y′.z + x′.y.z, as z + z′ = 1 and a.1 = a
= x.(y + y′.z) + x′.y.z, by the distributive law
= x.(y + y′).(y + z) + x′.y.z, by the distributive law
= x.1.(y + z) + x′.y.z, as a + a′ = 1
= x.(y + z) + x′.y.z, as a.1 = a
= x.y + x.z + x′.y.z = x.y + (x + x′.y).z, by the distributive law
= x.y + (x + x′).(x + y).z, by the distributive law
= x.y + 1.(x + y).z = x.y + (x + y).z = x.y + y.z + z.x.
The simplified form is realized by the following circuit.

Figure 8.1: Switching circuit realizing f = x.y + y.z + z.x.
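The long chain of algebraic steps above can be double-checked by brute force: the DNF read off from the table and the simplified majority form x.y + y.z + z.x must agree on all eight assignments. A minimal sketch (Python, added here for illustration only):

```python
from itertools import product

def c(v):
    # complement in the two-element Boolean algebra
    return 1 - v

for x, y, z in product((0, 1), repeat=3):
    dnf = (x & y & z) | (x & y & c(z)) | (x & c(y) & z) | (c(x) & y & z)
    majority = (x & y) | (y & z) | (z & x)
    assert dnf == majority
print("f = x.y + y.z + z.x on all 8 rows")
```

Checking every assignment like this is a useful safety net after a long hand simplification.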

Ex 8.7.2 A committee of three approves a proposal by majority vote. Each member can vote for the proposal by pressing a button at the side of his chair. The three buttons are connected to a light bulb, which is turned on whenever a majority of votes is cast in favor. Design a circuit so that the current passes and the light bulb is turned on only when the proposal is approved.
Solution: Let x, y, z denote the three switches. First we construct a Boolean expression f in the independent variables x, y and z for the required circuit. Let f(x, y, z) be a Boolean function such that f(x, y, z) = 1 if and only if at least two of the variables take the value 1, i.e., the light bulb is in the ON state whenever a majority of votes is cast. The truth table

is given below:
x y z f
1 1 1 1
1 1 0 1
1 0 1 1
1 0 0 0
0 1 1 1
0 1 0 0
0 0 1 0
0 0 0 0
Using the properties of Boolean algebra, we can simplify f as follows:

f(x, y, z) = x.y.z + x.y.z′ + x.y′.z + x′.y.z = x.y + y.z + z.x.
Ex 8.7.3 A committee consists of the President, Vice-President and Secretary. A proposal is approved if and only if it receives a majority vote or the vote of the President plus one other member. Each member approves the proposal by pressing a button attached to his chair. Design a switching circuit controlled by the buttons which allows current to pass if and only if a proposal is approved.
Solution: Let x, y, z denote the three switches controlled by the President, Vice-President and Secretary respectively. First we construct a Boolean expression f in the independent variables x, y and z for the required circuit. Let f(x, y, z) be a Boolean function. The truth table is given below:
x y z v f
1 1 1 3 1
1 1 0 2 1
1 0 1 2 1
1 0 0 1 0
0 1 1 2 1
0 1 0 1 0
0 0 1 1 0
0 0 0 0 0
Using the properties of Boolean algebra, we can simplify f as follows:

f(x, y, z) = x.y.z + x.y.z′ + x.y′.z + x′.y.z = x.y + y.z + z.x.
Ex 8.7.4 A light bulb in a room is controlled independently by three wall switches at three
entrances of a room in such a way that the state of the light bulb will change by flicking any
one of the switches (irrespective of its previous state). Design a simple circuit connecting
these three wall switches and the light bulb.
[CH05]
Solution: Let x, y, z denote the three wall switches. First we construct a Boolean expression
f in the independent variables x, y and z for the required circuit. Let f (x, y, z) be a Boolean


function. The truth table is given below:

x y z f
1 1 1 1
1 1 0 0
1 0 1 0
1 0 0 1
0 1 1 0
0 1 0 1
0 0 1 1
0 0 0 0
Using the properties of Boolean algebra, we can simplify f as follows:

f(x, y, z) = x.y.z + x.y′.z′ + x′.y.z′ + x′.y′.z = x.(y.z + y′.z′) + x′.(y.z′ + y′.z).
The corresponding simplified form is given in the following circuit.

Figure 8.2: Switching circuit realizing f = x.(y.z + y′.z′) + x′.(y.z′ + y′.z).
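The defining property of this circuit, that flicking any single switch toggles the bulb, can be checked directly: f is the odd-parity function of the three switches, so changing any one input changes the output. A small sketch (Python, added for illustration only):

```python
from itertools import product

def f(x, y, z):
    # f = x.(y.z + y'.z') + x'.(y.z' + y'.z): the bulb is on exactly
    # when an odd number of the three switches is on
    n = lambda v: 1 - v
    return (x & ((y & z) | (n(y) & n(z)))) | (n(x) & ((y & n(z)) | (n(y) & z)))

for x, y, z in product((0, 1), repeat=3):
    # flicking any one switch must change the state of the bulb
    assert f(1 - x, y, z) != f(x, y, z)
    assert f(x, 1 - y, z) != f(x, y, z)
    assert f(x, y, 1 - z) != f(x, y, z)
print("each single flick toggles the bulb")
```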
Ex 8.7.5 A light bulb in a room is controlled independently by two wall switches at the two entrances of the room in such a way that the state of the light bulb will change by flicking either of the switches (irrespective of its previous state). Design a simple circuit connecting these two wall switches and the light bulb.
[CH98, 06]
Solution: Let x, y denote the two wall switches and f be the Boolean function. First we construct a Boolean expression f in the independent variables x and y for the required circuit. The light will glow when f(x, y) = 1. By the given condition, the light is in the OFF state when both the switches are in the OFF state, so f(x, y) = 0 when x = 0, y = 0, and flicking either switch must change the value of f. The truth table for the problem is:
x y f
1 1 0
1 0 1
0 1 1
0 0 0
Using the properties of Boolean algebra, we can simplify f as f(x, y) = x′.y + x.y′. The corresponding simplified form is given in the following circuit.
Figure 8.3: Switching circuit realizing f = x′.y + x.y′.
Ex 8.7.6 Simplify the Boolean function realized by the circuit of Figure 8.4.
Solution: The circuit is represented by the expression
f(x, y, z) = z.(x + y′) + z′.x + (z + y′).z′
= x.z + y′.z + z′.x + z.z′ + y′.z′
= x.z + y′.z + x.z′ + y′.z′ = x.(z + z′) + y′.(z + z′) = x + y′.
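As a sanity check, the original circuit expression and its simplified form x + y′ can be compared on every assignment (a Python sketch, not part of the original text):

```python
from itertools import product

def c(v):
    # complement of a switch state
    return 1 - v

for x, y, z in product((0, 1), repeat=3):
    original = (z & (x | c(y))) | (c(z) & x) | ((z | c(y)) & c(z))
    assert original == (x | c(y))
print("the circuit reduces to x + y'")
```

Note that z has dropped out entirely: the simplified circuit needs only the two switches x and y′ in parallel.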
The corresponding simplified form is given in the following circuit.

Figure 8.4: The given circuit.

Figure 8.5: The simplified circuit realizing f = x + y′.

Exercise 8
Section-A
[Multiple Choice Questions]
1. The principle of duality states that
(a) ≤ is replaced by ≥
(b) LUB becomes GLB
(c) all properties are not altered when ≤ is replaced by ≥
(d) all properties are not altered when ≤ is replaced by ≥, other than the 0 and 1 elements.
2. What values of A, B, C and D satisfy the following simultaneous Boolean equations?
A + AB = 0, AB = AC, AB + AC + CD = CD.
(a) A = 1, B = 0, C = 0, D = 1 (b) A = 1, B = 1, C = 0, D = 0
(c) A = 1, B = 0, C = 1, D = 1 (d) A = 1, B = 0, C = 0, D = 0.
3. The absorption law is defined as
(a) a + (a.b) = b (b) |C| = 2 (c) a.(a + b) = b.b (d) C does not exist.
4. The Boolean expression A + BC equals
(a) (A + B)(A + C) (b) (A + B)(A + C) (c) (A + B)(A + C) (d) None of the above.
5. Simplifying the Boolean expression ABCD + ABCD, we get
(a) ABC (b) ABC (c) A + BCD (d) AB + CD.
6. Minimizing the Boolean expression AC + AB + ABC + BC, we get
(a) A B + C (b) AB + C (c) AB + BC (d) None of the above.
7. How many truth tables can be made from one function table?
(a) 1 (b) 2 (c) 3 (d) 8.
8. The term 'sum of products' in Boolean algebra means
(a) AND function of several OR functions
(b) OR function of several AND functions
(c) AND function of several AND functions
(d) OR function of several OR functions.

Section-B
[Objective Questions]

1. Show that in any Boolean algebra (x′)′ = x, where x′ is the complement of x.


2. Write the dual of each Boolean expression:
(a) a.(a′ + b) = a.b.
(b) (a + 1)(a + 0) = a.
(c) (a + b)(b + c) = a.c + b.
Section-C
[Long Answer Questions]
1. Show that the power set P(X) of a non-empty set X is a poset with respect to the set inclusion relation ⊆. Show further that ⟨P(X), ⊆⟩ is a linearly ordered set if and only if X is a singleton set.
2. Establish that the set of all real numbers does not form a Boolean algebra with respect to usual addition and multiplication. [CH10]
3. Show that the set S = {a, b, c, d} with the operations + and . defined below

+ | a b c d        . | a b c d
a | a b c d        a | a a a a
b | b b d d        b | a b a b
c | c d c d        c | a a c c
d | d d d d        d | a b c d

forms a Boolean algebra.


4. Let U be a given universal set. A non-empty class S of subsets of U is said to be a field of sets if S is closed under the set-theoretical operations of union, intersection and complementation, i.e., such that
(a) A ∈ S, B ∈ S ⇒ A ∪ B ∈ S;
(b) A ∈ S, B ∈ S ⇒ A ∩ B ∈ S;
(c) A ∈ S ⇒ A′ (the complement of A in U) ∈ S.
The universal set U is called the space of the field. Show that every field of sets forms a Boolean algebra.
5. Let U be the family of all finite subsets of ℝ together with their respective complements in ℝ. Show that U forms a Boolean algebra under the usual set-theoretical operations of union, intersection and complementation.
6. Prove that the Boolean algebra (B, +, ., ′) becomes a poset with respect to the relation ≤ defined by a ≤ b if and only if a + b = b, for a, b ∈ B.
7. Prove that a Boolean algebra of three elements {0, 1, a} cannot exist.
8. In a Boolean algebra B, for any a, b and c, prove the following:
(a) (a + b′)(a′ + b′)(a + b)(a′ + b) = 0.
(b) (a + b)(a′ + b′) = b.a′ + a.b′. [CH06]
(c) (a + b)(b + c)(c + a) = a.b + b.c + c.a.
(d) a.b + a′.b + a.b′ + a′.b′ = 1. [BH83, 96, 99]
(e) a + b = a + c and a.b = a.c ⇒ b = c. [CH10]
(f) a + a.b = a, for a, b ∈ B. [CH08]
(g) b + a = c + a and b + a′ = c + a′ ⇒ b = c. [CH07]
9. Prove that the following properties are equivalent in a Boolean algebra B:
(i) a + b = a, (ii) a + b′ = 1, and (iii) a.b = b, for a, b ∈ B.
10. Express the following Boolean functions in both DNF and CNF:
(a) f(x, y, z) = (x + y + z)(x.y + x′.z)′, [CH06]
(b) f(x, y, z) = x.y + y.z + z.x′,
(c) f(x, y, z) = (x + y)(x + y′)(x′ + z),
(d) f(x, y, z) = (x′ + y′ + z)(x + y′ + z′)(x′ + y + z′).
11. Express the following CNF into an expression in DNF:
(a) (x + y′ + z)(x + y + z′)(x + y′ + z′)(x′ + y + z)(x′ + y + z′)(x′ + y′ + z), [CH08]
(b) (x + y)(y + z)(x′ + y′ + z′). [CH10]
12. What is a truth table? Construct the truth table for the function f = x.y′.z + x′.z′ + y.
13. Using the truth table, find the full conjunctive normal form of the following Boolean expression: x′.y′.z + x.y′.z′ + x.y′.z + x.y.z′ + x.y.z.
14. A Boolean function f (x, y, z) is such that f (1, 1, 0) = f (0, 1, 1) = f (1, 0, 1) = f (1, 1, 1) =
1 and f (x, y, z) = 0 for all other cases. Find the function f (x, y, z) in minimized form.
15. f1 and f2 are Boolean functions of three variables, as given in the following truth table:

x y z f1 f2
1 1 1 0  1
1 1 0 1  1
1 0 1 0  1
1 0 0 0  0
0 1 1 1  1
0 1 0 1  0
0 0 1 0  0
0 0 0 0  0

Find the simplified expressions of f1 and f2 and then obtain switching circuits realizing the output functions.
16. Find the Boolean function which represents the circuit of Figure 8.6 and simplify the function if possible.
17. State De Morgan's laws and verify them using truth tables.
18. Express the following in CNF in the smallest possible number of variables:
(a) x′.y + x.y.z′ + x.y′.z + x′.y′.z′.t + t′. [CH09]

Figure 8.6: The circuit for Problem 16, built from switches x, y and z.
(b) (x + y)(x + y′)(x′ + z).
(c) x.y′ + x.z + x.y.
(d) (x′ + y′).z + (x + z)(x′ + z′).
19. Construct the truth table and draw the switching circuit diagram of the following Boolean functions:
(a) f = x.y + y.z + z.x,
(b) f = (x.y + x.z + x′.z′).z′.(x + y + z),
(c) f = x.y.z′ + x.y′.z + x′.y′.z′,
(d) f = x + y.[z + x′.(y′ + z′)].
20. A committee consists of the President, Vice-President, Secretary and Treasurer. A proposal is approved if and only if it receives a majority vote or the vote of the President plus one other member. Each member approves the proposal by pressing a button attached to his chair. Design a switching circuit controlled by the buttons which allows current to pass if and only if a proposal is approved. [CH04]
21. Draw the switching circuit representing the switches x, y, x′ and y′ such that the light bulb in the circuit glows only if x is ON and y is OFF, or y is ON and x is OFF. [BH90]
22. Construct the switching circuit representing a.b + a.b′ + a′.b′ and show that the circuit is equivalent to the switching circuit a + b′. [BH87]
23. Let f(x, y, z) be a Boolean function which assumes the value 1 if and only if exactly one of the variables takes the value 1. Construct a truth table of f and hence write f in CNF. Draw a switching circuit corresponding to the DNF. [CH10]

